1 Introduction

Many people now prefer to shop online because e-commerce websites save time and offer a wider selection at their convenience. Reviews written by other customers help a buyer filter products quickly. However, many reviews contain few details about the product itself and many sentences that are of no use to a potential buyer. Hence, we need to extract only the information the customer requires and trim the rest, so that the result can be displayed on compact devices such as mobile phones. In this work, sentiment analysis is proposed for this purpose.

Sentiment analysis is one of the stages of opinion mining. In sentiment analysis, we classify each opinion word as positive, negative, or neutral in order to predict the emotion of the speaker or reviewer toward the product. Reviews reflect the customers’ judgment of the product and their experience after using it. In effect, the sentiment analysis of the reviews written by a set of customers is used by another customer to assess the product and decide whether to buy it.

Opinion mining uses natural language processing to identify the important keywords in given sentences. It consists of three stages: opinion retrieval, opinion classification, and opinion summarization. Opinion retrieval extracts the keywords that carry opinions or comments on a particular subject of interest to the user. Opinion classification then labels the extracted keywords as positive, negative, or neutral based on an existing dictionary; this is also referred to as polarization of the words. The final stage, opinion summarization, produces summaries from the extracted, polarized keywords.

This paper aims to gather important opinion words from product reviews written by existing customers, taken from the Amazon review dataset; to find their orientation, i.e., positive, negative, or neutral; and finally to summarize public opinion for a potential user, supporting the decision on whether to choose the product. All of this has been done using the R programming language.

2 Related Work

Min Wang and Hanxiao Shi [1] proposed an approach to polarity analysis of new words that implements quantitative computation of sentiment words and automatic expansion of a polarity lexicon. Their experimental results showed the feasibility and effectiveness of the approach. Their future work includes enabling fine-grained, attribute-level sentiment analysis with an automatically built polarity lexicon.

Basant Agarwal et al. [2] worked on sentiment-rich phrases obtained using POS-based rules and dependency relations, which capture contextual and syntactic information from the document. Their experiments indicate that using POS patterns for phrases improves the performance of sentiment analysis. In future, they intend to explore more POS-tagging patterns to obtain better results.

Rui Yao and Jianhua Chen [3] applied sentiment analysis and machine-learning concepts to study the relationship between the online reviews of a movie and its box office collection. Their study takes into account only positive and negative reviews, leaving out the neutral ones. Further experiments with larger datasets can be carried out to train and test the model, and comparisons among different movies can also be considered.

Kai Gao et al. [4] used an SVM-based algorithm built on an alternative structural formulation of the SVM optimization problem for classification. Two datasets, one from microblogging and one from e-commerce, were used to evaluate performance. Their experiments show that the proposed approach, which combines feature extraction and selection with SVM, is effective in both microblogging multiclass sentiment classification and e-commerce sentiment classification.

In their paper, Pooja Kherwa et al. [5] gather opinions and review data from e-commerce websites, social networks, popular portals, and blogs to find out what people are talking about and the sentiment they express. The scoring algorithm they use scans every line of data and gives a summary and, if required, a graphical representation. The efficiency of this algorithm could be improved by implementing a self-learning system.

Giovanni Acampora and Georgina Cosma [6] introduce a framework of methods to analyze the sentiments of customer reviews efficiently and compute their corresponding numerical data so that companies can plan their future projects. The dimension and imprecision ratings of the data are calculated. In conclusion, they propose a system that reduces the uncertainty among reviews and validates them to retain the useful ones, thereby producing a more accurate system.

In their work, Siddharth Aravindan and Asif Ekbal [7] put forward a system that obtains product features automatically from the reviews and divides them into positive and negative. It performs feature extraction followed by polarity classification using association rules and supervised machine learning. Their future work is to investigate more features for opinion mining and to use classifiers that would enhance their results.

Yan Wan et al. [8] conducted a fine-grained sentiment analysis of customer reviews, using general methods to crawl reviews and finding implicit features based on POS rules. They believe that it can help producers identify needed improvements and discover niche markets, and can help consumers understand the advantages and disadvantages of the target product and hence make a wise selection.

3 Methodology

The modules of the system designed are illustrated in Fig. 1 and explained in the subsequent sections.

Fig. 1 System architecture

3.1 POS Tagging

Part-of-speech tagging (POS tagging) [9] is the process of labeling every word of a file (corpus) with its corresponding part of speech, based on its definition and its relation to adjacent words and phrases. The outcome of this process is the full set of words, each with its POS tag, from which the words can be identified as nouns, adjectives, pronouns, verbs, etc.

This process is essential because, in the reviews, the product features are often described in the form of nouns or noun phrases and the sentiment regarding those nouns is in the form of adjectives. Therefore, extracting the noun with its corresponding adjective allows us to identify a feature of the product and the customer’s emotion toward it.

POS tagging begins by converting the text to Unicode Transformation Format (UTF-8), so that all character vectors are encoded in 8-bit code units and complications with byte order marks are avoided. The text is then tokenized, that is, split into individual tokens. The next step is to remove stop words, which are the most common words in a language. The program is written so that the user can add words of their choice to the existing stop-word list. The final step is to apply sentence token annotations and word token annotations to the customer reviews.
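The preprocessing steps above can be sketched as follows. This is a minimal Python illustration rather than the R code used in this work; the stop-word list, the function name `preprocess`, and the tokenization pattern are assumptions made for the example.

```python
import re

# Hypothetical default stop-word list; a real list would be much larger.
DEFAULT_STOP_WORDS = {"the", "is", "and", "are", "a", "an", "of", "this", "its"}

def preprocess(review, extra_stop_words=()):
    """Decode to text, split into lowercase tokens, and drop stop words."""
    if isinstance(review, bytes):
        # "utf-8-sig" decodes UTF-8 and strips a leading byte order mark if present.
        review = review.decode("utf-8-sig")
    # The user may extend the stop-word list, as described above.
    stop_words = DEFAULT_STOP_WORDS | {w.lower() for w in extra_stop_words}
    # Keep hyphenated or apostrophized words such as "macro-mode" as one token.
    tokens = re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", review.lower())
    return [t for t in tokens if t not in stop_words]

tokens = preprocess("The macro-mode is exceptional and the pictures are clear")
```

Passing `extra_stop_words=["camera"]`, for instance, would also filter out the word camera, mirroring the user-extensible stop-word list.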

From the analysis of the review dataset for a particular digital camera model, one of the tagged reviews obtained was:

the/DT macro-mode/NN is/VBZ exceptional/JJ and/VBP the/DT pictures/NNS are/VBP clear/JJ

The nouns, i.e., macro-mode and pictures, are identified and tagged NN and NNS respectively, signifying a singular and a plural noun. The adjectives, i.e., exceptional and clear, are identified and tagged JJ, signifying that they are adjectives. The extracted nouns can be used for frequent feature detection, whereas the adjectives can be used to identify the polarity of the review.
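As an illustration of how nouns and adjectives can be pulled out of tagged text, the following Python sketch parses the `word/TAG` format shown above. The function names are hypothetical, and matching on the tag prefixes `NN` and `JJ` is an assumption that groups plural nouns and comparative or superlative adjectives with their base categories.

```python
def split_tagged(tagged_sentence):
    """Parse whitespace-separated 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in tagged_sentence.split():
        # rpartition splits on the last "/", so words containing "/" stay intact.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs

def nouns_and_adjectives(pairs):
    """Select nouns (NN, NNS, ...) and adjectives (JJ, JJR, JJS) by tag prefix."""
    nouns = [word for word, tag in pairs if tag.startswith("NN")]
    adjectives = [word for word, tag in pairs if tag.startswith("JJ")]
    return nouns, adjectives

tagged = ("the/DT macro-mode/NN is/VBZ exceptional/JJ and/VBP "
          "the/DT pictures/NNS are/VBP clear/JJ")
nouns, adjectives = nouns_and_adjectives(split_tagged(tagged))
```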

3.2 Frequent Feature Identification

As discussed above, the features of a product are described in the form of nouns. Therefore, once the nouns and adjectives have been extracted, the nouns together with their respective adjectives can be used to find the most frequently mentioned positive features (Table 1) and negative features (Table 2) that customers have reviewed. Analyzing the entire review dataset yields the following lists of frequent features.

Table 1 Positive features
Table 2 Negative features
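A minimal Python sketch of frequent feature identification is given below. The mini-lexicon and the (noun, adjective) pairs are invented for illustration and are not taken from the dataset or from Tables 1 and 2; the function name `frequent_features` is likewise hypothetical.

```python
from collections import Counter

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"great", "clear", "exceptional"}
NEGATIVE = {"bad", "poor"}

def frequent_features(pairs, top_n=3):
    """Count how often each noun is paired with a positive or negative adjective."""
    positive, negative = Counter(), Counter()
    for noun, adjective in pairs:
        if adjective in POSITIVE:
            positive[noun] += 1
        elif adjective in NEGATIVE:
            negative[noun] += 1
    return positive.most_common(top_n), negative.most_common(top_n)

# Made-up (noun, adjective) pairs standing in for the POS tagging output.
pairs = [("pictures", "clear"), ("zoom", "bad"), ("pictures", "great"),
         ("battery", "poor"), ("zoom", "poor"), ("macro-mode", "exceptional")]
top_positive, top_negative = frequent_features(pairs)
```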

3.3 Opinion Word Extraction

POS tagging is followed by opinion word extraction, which extracts all types of adjectives (including comparative and superlative forms) in order to find the customer’s emotion toward the product, i.e., positive or negative. Whether a word is positive or negative is determined by comparing each extracted adjective against a dictionary containing lists of positive and negative words. For example, in the review:

awesome camera with great print quality but bad zoom.

The words awesome and great are identified as having positive sentiment and are therefore counted as positive words, whereas bad is identified as having negative sentiment and is counted as a negative word.
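The dictionary lookup can be sketched in Python as follows. The two word sets are a tiny hypothetical stand-in for the dictionary of positive and negative words, and the function names are assumptions made for the example.

```python
# Hypothetical stand-in for the dictionary of positive and negative words.
POSITIVE_WORDS = {"awesome", "great", "excellent", "amazing"}
NEGATIVE_WORDS = {"bad", "poor", "exorbitant"}

def polarity(word):
    """Look a single word up in the dictionary."""
    word = word.lower()
    if word in POSITIVE_WORDS:
        return "positive"
    if word in NEGATIVE_WORDS:
        return "negative"
    return "neutral"

def opinion_words(adjectives):
    """Keep only the adjectives found in the dictionary, grouped by polarity."""
    found = {"positive": [], "negative": []}
    for adjective in adjectives:
        label = polarity(adjective)
        if label != "neutral":
            found[label].append(adjective)
    return found

# Adjectives from the example review above.
result = opinion_words(["awesome", "great", "bad"])
```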

3.4 Sentiment Orientation Identification

The next step is to calculate the sentiment score of each review, which allows reviews to be classified as positive, negative, or neutral. A score of +1 is assigned to a positive word, whereas −1 is assigned to a negative word. The total review score is calculated by summing the individual scores of all the adjectives in a review. In this paper, reviews with a score greater than 0 are classified as positive, reviews with a score less than 0 are classified as negative, and reviews with a score of 0 are classified as neutral. For example, consider the following reviews:

excellent compact digital camera!

In the above sentence, the words excellent and compact are given the score of +1 each giving it an overall score of +2, therefore making it a positive review.

the main drawback of this camera is its lens.

In this sentence, the word drawback is given a score of −1, making the total sentence score −1 and therefore making it a negative review.

amazing camera but comes at an exorbitant price

In the above sentence, the word amazing is given a score of +1, whereas the word exorbitant is given a score of −1, giving a total score of 0 and making it a neutral review.
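The scoring rule can be reproduced with a short Python sketch. The lexicon `WORD_SCORES` contains only the words from the three examples above and is an assumption; for simplicity the sketch scores every lexicon word in the review instead of first restricting to POS-tagged adjectives.

```python
import re

# Hypothetical lexicon limited to the words used in the examples above.
WORD_SCORES = {"excellent": 1, "compact": 1, "amazing": 1,
               "drawback": -1, "exorbitant": -1}

def review_score(review):
    """Sum +1 for each positive word and -1 for each negative word."""
    words = re.findall(r"[a-z]+", review.lower())
    return sum(WORD_SCORES.get(word, 0) for word in words)

def classify(score):
    """Map a review score to positive, negative, or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

reviews = [
    "excellent compact digital camera!",
    "the main drawback of this camera is its lens.",
    "amazing camera but comes at an exorbitant price",
]
labels = [classify(review_score(review)) for review in reviews]
```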

3.5 Classification

Classification is applied so that existing data can be analyzed to predict future trends in the data. Many classification algorithms are available for this task. Two of them, Naïve Bayes Classification [10] and SVM Classification [11], implemented in the R programming language, are used in this paper. To apply the classification algorithms, the entire dataset is divided into a training set and a testing set. The training set is used to fit the classifier, while the testing set is used to validate its results. In our experiments, we have used 33% of the dataset as the testing set and the remainder as the training set.
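A split of this kind can be sketched in Python as below. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the source does not state how the 33% testing set was selected.

```python
import random

def train_test_split(items, test_fraction=0.33, seed=42):
    """Shuffle the items and hold out a fraction of them as the testing set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# 100 placeholder review ids stand in for the real dataset.
train, test = train_test_split(range(100))
```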

A. Naïve Bayes

The Naïve Bayes method applies Bayes’ theorem of probability to predict the class of the data. A Naïve Bayes classifier [12] assumes that, given the class, the presence of one feature is unrelated to the presence of any other feature. It is a highly scalable method, requiring a number of parameters linear in the number of variables in a learning problem. Naïve Bayes is used because it is easy to apply, fast, and gives accurate results.
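To make the independence assumption concrete, the following Python sketch implements a small multinomial Naïve Bayes classifier over word tokens with add-one smoothing. It is an illustration only: the class name, the smoothing choice, and the four training documents are assumptions, and it is not the R implementation used in the experiments.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over word tokens with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_log_prob = None, -math.inf
        for label, count in self.class_counts.items():
            log_prob = math.log(count / total_docs)  # log prior P(class)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                # Independence assumption: per-word likelihoods simply multiply,
                # so their logarithms add.
                likelihood = ((self.word_counts[label][token] + 1)
                              / (total_words + vocab_size))
                log_prob += math.log(likelihood)
            if log_prob > best_log_prob:
                best_label, best_log_prob = label, log_prob
        return best_label

# Made-up training data, for illustration only.
train_docs = [["great", "pictures"], ["excellent", "camera"],
              ["bad", "zoom"], ["poor", "battery"]]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
```

The number of stored counts grows linearly with the vocabulary size per class, which reflects the linear parameter count mentioned above.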

B. SVM Classification

Support Vector Classification [13] is a machine-learning algorithm mainly used for classification and regression analysis. The goal of this algorithm is to find a decision boundary between the two classes that lies at the maximum distance from the nearest points in the training data. SVM [14] is used mainly because introducing a kernel gives flexibility in the shape of the decision boundary.
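For reference, the maximum-margin idea can be written as the standard hard-margin optimization problem below. The notation is an assumption introduced here and does not appear in the original: training points $x_i$ with labels $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, and an optional feature map $\phi$ induced by the kernel $K$.

```latex
\begin{aligned}
\min_{w,\,b} \quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to} \quad & y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1,
  \qquad i = 1, \dots, n, \\[4pt]
\text{margin width} \;=\; & \frac{2}{\lVert w \rVert},
\qquad K(x_i, x_j) = \phi(x_i)^{\top} \phi(x_j).
\end{aligned}
```

Minimizing $\lVert w \rVert$ maximizes the margin $2 / \lVert w \rVert$, and choosing a nonlinear kernel $K$ yields a nonlinear decision boundary in the original input space.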

3.6 Accuracy Calculation and Comparison

Once the results of both algorithms are obtained, they are compared to determine which algorithm performs better. From our analysis, we conclude that SVM Classification yields better results than Naïve Bayes, as it achieves a higher accuracy.

3.7 Result Visualization

The final result of classification can be represented in formats such as bar graphs, histograms, trees, and tables. A histogram is used to show the results of sentiment classification, and a Receiver Operating Characteristic plot (ROC plot) is used to show the results of our analysis. An ROC plot is a graphical plot of classifier performance in which the true positive rate is plotted against the false positive rate.

4 Experiment Results

A. About the Dataset

The dataset used for this experiment is the review dataset of a particular model of popular digital camera from Amazon [15] collected over a couple of years. This collected data has been segregated into positive and negative reviews using R Programming modules.

B. Experimental Result Analysis

Every review in the dataset has been given a score based on its adjectives, as explained in the previous sections of the paper. Each review receives a score ranging between −2 and 6, depending on the positive or negative polarity of its words. In the graph below (Fig. 2), frequency is taken on the x-axis and the score of the review, denoted by analysis$score, is plotted on the y-axis. The length of each bar of the histogram denotes the frequency of reviews with that particular score in the database.

Fig. 2 Histogram of analysis scores

The ROC curve (Fig. 3) plots the hit rate, or true positive rate, on the y-axis against the false alarm rate, or false positive rate, on the x-axis. The hit rate is the fraction of actual positives that have been identified correctly. The false alarm rate is the fraction of actual negatives that have been incorrectly identified as positive. The area under the curve summarizes how well the predictions separate the two classes.

Fig. 3 ROC plot
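The construction of an ROC curve and its area can be sketched in Python as follows. The classifier scores and labels are invented for illustration and are not the experimental data; the sketch assumes binary labels (1 for positive, 0 for negative) with at least one review of each class.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points, best score first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last review sharing this score (handles ties).
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up classifier scores and true labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)
```

An area of 1 corresponds to a perfect ranking of positives above negatives, while 0.5 corresponds to random guessing.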

Table 3 compares the experimental results of SVM and Naïve Bayes classification in terms of precision and recall, computed using R modules. Precision is the fraction of extracted instances that are correct, whereas recall is the fraction of correct instances that have been extracted. Comparing the accuracy values shows that SVM has a higher accuracy and hence is a better classification method than Naïve Bayes.

Table 3 Result comparison summary
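The three measures can be computed as in the Python sketch below. The predicted and actual labels are invented to illustrate the arithmetic and are not the values reported in Table 3; the function name is hypothetical.

```python
def precision_recall_accuracy(predicted, actual, positive="positive"):
    """Compute precision, recall, and accuracy for one class of interest."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    correct = sum(p == a for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct among extracted
    recall = tp / (tp + fn) if tp + fn else 0.0     # extracted among correct
    accuracy = correct / len(pairs)
    return precision, recall, accuracy

# Made-up labels, for illustration only.
actual = ["positive", "positive", "positive", "negative", "negative"]
predicted = ["positive", "positive", "negative", "positive", "negative"]
precision, recall, accuracy = precision_recall_accuracy(predicted, actual)
```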

5 Future Work

As part of our future work, we plan to enhance and refine the existing techniques so that sentiment analysis covers neutral sentiments along with positive and negative ones, using multiclass Naïve Bayes and multiclass SVM Classification to obtain better results. We also plan to include several other classification algorithms. Implied sentiments, which are not expressed directly in words, are another source of user views on a product. In future, we plan to understand and classify these implied sentiments accurately.