A comprehensive analysis of adverb types for mining user sentiments on amazon product reviews

Chauhan, Ummara Ahmed; Afzal, Muhammad Tanvir; Shahid, Abdul; Abdar, Moloud; Basiri, Mohammad Ehsan; Zhou, Xujuan

doi:10.1007/s11280-020-00785-z


Published: 20 February 2020

Volume 23, pages 1811–1829, (2020)

World Wide Web


Ummara Ahmed Chauhan¹,
Muhammad Tanvir Afzal¹,
Abdul Shahid²,
Moloud Abdar³,
Mohammad Ehsan Basiri⁴ &
…
Xujuan Zhou ORCID: orcid.org/0000-0002-1736-739X⁵


Abstract

Online shopping websites like Amazon provide a platform where users can share their opinions about different products. It has recently been reported that, prior to purchasing, 81% of users explore different online platforms to assess the reliability of the product they intend to buy. Reviews are expressed in natural language and help a user make an informed decision. Over the past few years, the scientific community has paid attention to automatically determining the meaning of a review through sentiment analysis. Sentiment analysis is a gradually evolving research area that helps users uncover the sentiment hidden in a review. To date, many sentiment analysis studies have been reported in the literature. For sentiment classification, the core ingredient is the exploitation of polarity-bearing words present in the reviews, e.g. adjectives, verbs, and adverbs. Different studies suggest the importance of different forms of adverbs in the sentiment classification task: some report that general adverbs strongly help to classify sentiments with better accuracy, whereas others suggest that degree adverbs are important. There are ten distinct forms of adverbs: general adverbs, general superlative adverbs, general comparative adverbs, general-wh adverbs, degree adverbs, degree superlative adverbs, degree comparative adverbs, degree-wh adverbs, time adverbs and locative adverbs. In this paper, we address the question: what is the impact of the different forms of adverbs on the classification of sentiments? To answer it, the impact of all these forms has been evaluated on 51,005 Amazon reviews of two product categories, office products and musical DVDs. The outcomes of the study revealed that two forms, general superlative adverbs and degree-wh adverbs, have more impact than the other forms of adverbs. General superlative adverbs attained an F-measure of 0.86 and degree-wh adverbs attained an F-measure of 0.80.


1 Introduction

The advent of the internet has opened up possibilities on a wide scale. Through it, people from all around the world are able to communicate and share their experiences. Along with the rapid increase in its usage, people prefer to explore online reviews prior to purchasing any product. Various websites such as amazon.com, ebay.com, neweggs.com and alibaba.com provide a platform for users to share their opinions about different products. According to [22], 81% of users perform an online search before purchasing products. Reviews are usually expressed in natural language or as a star rating. They help users assess the reliability of the intended product and make an informed decision. All of the popular e-shopping platforms (amazon, ebay, neweggs, alibaba, etc.) provide users with complete product specifications including brand, price, and features or characteristics. Furthermore, such platforms also let users share their experiences with purchased products in the form of feedback. This feedback is usually available in two forms.

1. Users can provide their feedback in the form of a star rating, where customers summarize their experience on a scale of 1 (Unsatisfactory) to 5 (Excellent).
2. Users can also provide their feedback in the form of detailed reviews, where they describe their experience with the product in their own words.

These reviews can be an influential factor in purchasing a particular product or in altering the plan to purchase it. Anyone who intends to buy a product can analyze these reviews and decide accordingly. Therefore, online customer feedback is considered a significant source of information, useful for both prospective customers and product manufacturers. The rapid increase in the number of users also increases the number of online reviews. Going through such a large body of reviews requires a lot of cognitive effort, so a user will eventually end up reading only a few of them.

To overcome this issue, data scientists use tools like natural language processing and text analysis to extract the essence of all the available product reviews. The term sentiment analysis encompasses the development process behind these tools [30].

Different studies have been conducted in the literature to perform sentiment classification. For sentiment classification, the core ingredient is the exploitation of polarity-bearing words present in the reviews, e.g. adjectives, verbs and adverbs. Parts of speech (POS) are commonly used for the evaluation of sentiments [13]. Sentiment analysis approaches can be classified into two categories, symbolic and sub-symbolic [8]. Our work belongs to the first category, as we use SentiWordNet (a lexical resource for opinion mining). We have performed a comprehensive analysis of the role of the aforementioned parts of speech, and hence identified an important research question: how do the various forms of adverbs impact sentiment classification? This study has exploited all ten distinct forms of adverbs: general adverbs, general superlative adverbs, general comparative adverbs, general-wh adverbs, degree adverbs, degree superlative adverbs, degree comparative adverbs, degree-wh adverbs, time adverbs and locative adverbs. Furthermore, a comprehensive dataset was acquired from Amazon, containing 51,005 reviews of two product categories, office products and musical DVDs.

Along with the reviews expressed in natural language, we have also extracted the corresponding star ratings. These star ratings are used as the benchmark in this study. The outcomes of the study revealed that general superlative adverbs and degree-wh adverbs outperformed the other forms of adverbs, attaining F-measures of 0.86 and 0.80 respectively. The proposed study will be beneficial for sentiment classification systems that exploit adverbs as polarity-bearing features.

2 Literature review

Millions of product reviews are available online. According to a survey conducted in the USA, 81% of Internet users do online research prior to purchasing products online [22]. Therefore, many approaches and techniques are used to extract useful information from online reviews, and sentiment analysis is one of them.

Sentiments are feelings generally expressed as opinions, attitudes, emotions, etc., that is, subjective impressions rather than facts [33]. Social media is regarded as a platform for sharing these feelings. Since each user has a unique way of expressing an opinion, the opinions expressed in these reviews generally conflict, such as bad and good, or better and worst [18].

Hence, the sentiments expressed in reviews can be classified as positive and negative, or into an n-point scale, e.g., very good, good, satisfactory, bad, very bad. Similar to our study, most of the studies have considered the star ratings as benchmark to evaluate research efforts [12, 17, 26].

Many researchers have contributed to the analysis of sentiments using different approaches. The CLAWS tagger is used to classify words as noun, verb, adjective, adverb, etc. It is important to identify which of these polarity-bearing words are useful in classifying sentiments [1]. Furthermore, a sentiment lexicon such as SentiWordNet is used to find the polarity of the tagged words. Subsequently, the polarity features (verbs, adjectives, adverbs, etc.) are used to classify sentiments. Some researchers have evaluated these features individually, while others have formed different combinations of them.

2.1 Role of adjective

Classifying reviews as negative or positive based on a single feature is a difficult task; however, Boiy et al. conducted experiments on 550 negative and 222 positive sentences and extracted adjectives as the major carriers of opinion [5]. In addition, researchers have noted that the first step is to analyze which feature can be used as the opinionated feature. Boiy et al. (2007) identified adjectives as the subjective content of a document and reported a result of 74.8%.

Dray et al. (2009) extracted adjectives from movie reviews and calculated precision and recall. They concluded that adjectives can be used to classify a review as positive or negative [10], recording an F-score of 0.71 for the positive class and 0.62 for the negative class. Furthermore, Moghaddam and Popowich conducted experiments on extracting adjectives along with their comparative and superlative forms from reviews [21]. Based on the extracted features, the reviews were classified as positive or negative. Moghaddam and Popowich reported a result of 73% for their experiments.

After the forms of adjectives had been considered, some researchers turned to the linguistic forms of adjectives and their specific role in reviews. Kumar and Suresha [16] classified reviews using data taken from [36] and [19], and found adjectives to be the main carriers of opinion. They collected adjectives (JJ), superlative adjectives (JJS), and comparative adjectives (JJR) for different experiments and compared their results with those reported in 2002 and 2005 respectively. The analysis by Kumar and Suresha was overall 68.8% better than the others.

In addition, Rill et al. state that adjectives (JJ), along with their two forms, superlative adjectives (JJS) and comparative adjectives (JJR), are helpful in obtaining opinions from reviews [29]. The result obtained by this approach is 0.78 using the SentiWordNet library. These researchers also noted that reviews are “J-shaped”, i.e. typically asymmetric, with the title containing positive words more often than the rest of the review. Similarly, Das and Balabantaray conducted a study and found that adjectives along with their different forms produce 76.6% better results [7]. Padmaja et al. extended the POS tagging work and extracted features such as nouns, adjectives, verbs and adverbs. They exploited the combination of adjectives and nouns, from which they obtained a result of 72.5% [25].

A critical review of the papers discussed in this section is given in Table 1. It is evident from this table that most of the researchers used SentiWordNet as a lexical resource.

Table 1 Literature review for adjectives


2.2 Role of verbs

From the literature, it was found that adjectives are considered opinion phrases that carry subjective meaning, and that the polarity of these adjectives may be negative or positive. A question therefore arises: are adjectives the only carriers of opinion? To answer this question, further literature was consulted. Chesley et al. state that there are two verb classes, (1) subjective verbs and (2) objective verbs, within which verbs are classified as negative or positive [6]. These verb classes are therefore responsible for classifying a review as positive or negative. Their approach produced a result of 67.8%.

The role of verbs is further discussed by Neviarouskaya et al. in their approach to classifying opinions at the level of individual sentences. They extracted 1947 verbs, which were annotated for the experiment [24]. The researchers state that a sentence must contain a verb because it is the part of speech on which the action depends. Therefore, a sentence without a verb might not be classifiable as positive or negative. Neviarouskaya et al. report an F-score of 0.71 for their research.

A critical review of the papers discussed in this section is given in Table 2. It is evident from this table that most of the researchers used SentiWordNet as a lexical resource. Different datasets were used, and only a few research efforts handled negation.

Table 2 Literature review for verbs


2.3 Role of adverbs

The objective of sentiment analysis is to find the opinion words and sentences that chiefly express the voice of the customer, as stated by Zhang et al. [40]. They discussed that the customer voice can be determined by means of a graph that captures the relation between opinion-bearing words and other aspects. The graph is built on the degree superlative (RBS) and degree comparative (RBR) forms of adverbs. The result of their approach was 72.8%.

After studying the forms of adverbs in a manner similar to adjectives, Vinodhini and Chandrasekaran concluded in a survey that the negative impact of any sentence in a document can be determined by general (WH-RBQ) and degree (WH-RRQ) interrogative adverbs [37]. Therefore, if interrogative sentences are part of opinions, a negative opinion can be extracted whenever general (WH-RBQ) and degree (WH-RRQ) adverbs occur. Their study achieved an F-measure of 0.74. Further, Wang et al. state that degree adverbs strengthen the sentiments of reviews and blogs and help classify them as positive or negative [38]. They conducted experiments on 1375 reviews from three domains, electronic devices, hotels and e-journals, and extracted degree adverbs. They achieved an F-measure of 0.80.

Similarly, Dragut and Fellbaum state that adjectives are the opinion words of any review or document and that sentiment can be evaluated from them, while adverbs support those opinions and present a clearer picture or voice [9]. They grouped adverbs into classes: strong positive adverbs, weak adverbs, strong negative adverbs and doubtful adverbs. They also stated that a general adverb like “awfully” scores negative, but when negation occurs its class changes from negative to positive and the sentiment of the review is reversed as well. Their research produced an F-measure of 0.76.

A critical review of the papers discussed in this section is given in Table 3.

Table 3 Literature review for adverbs


2.4 Hybrid approaches

As the name suggests, hybrid approaches combine various parts of speech and analyze their joint impact on sentiment analysis. For instance, Benamara et al. and Khan and Baharudin [3, 14] have shown that combining adverbs with adjectives provides better results in locating opinions.

In a similar fashion, verbs and adverbs have been combined to analyze their impact on the sentiment analysis task. For example, Bjorkelun et al. [4] and Mudinas et al. [23] have shown that combining modifier features, such as comparative adverbs and verbs, is helpful in the analysis of opinions.

Various other studies in the literature have combined more than two features for sentiment analysis. Kaushik et al. [32] found that adjectives, adverbs and verbs are more important than other parts of speech because opinion is mostly extracted from them. Similarly, Patel and Soni [34] have shown that by merging verbs as a feature with adjectives and adverbs, i.e. JJRB (adjective + adverb) and RBVB (adverb + verb), better results can be obtained using SentiWordNet in unsupervised learning.

In this study, we have analyzed various state-of-the-art approaches and the role of each polarity-bearing feature. Here we summarize, for all of these papers, which particular type of polarity-bearing feature was utilized and what success rate was achieved. The studies in Table 4 show that whenever adjectives alone were used, researchers achieved an F-measure of 0.79 or less. Note that some researchers did not report the F-measure in their papers; such papers have been ignored. The table indicates that adjectives and adverbs have individually obtained a maximum F-measure of 0.79. However, when they have been combined with other polarity-bearing features such as nouns and verbs, no significant improvement is reported.

Table 4 A comparative study of all existing approaches


This highlights two important findings from the literature:

1. Adjectives and adverbs hold more potential than other polarity-bearing features to classify sentiments with high accuracy.
2. Combining other features with adjectives or adverbs does not enhance the value of the F-measure.

From this discussion, it is evident that adjectives and adverbs are the most important polarity-bearing features. Moreover, different types of adjectives have already been studied [29, 31]. Two types of adverbs, general adverbs [9] and degree adverbs [38], have also been utilized. Those studies concluded that adverbs hold sufficient potential to measure the intensity of sentiments, and named as future work the analysis of the role of adverbs on a larger dataset and the exploitation of other types of adverbs. Therefore, this research comprehensively investigates the role of all forms of adverbs on a larger dataset of more than 50,000 product reviews.

This research is not only useful for researchers but also helpful for developers. The developers can focus on the best identified adverb types to build accurate sentiment classification tools. Similarly, the researchers can exploit the identified best adverb types in the future when combining with other polarity features for better accuracy.

3 Proposed methodology

As discussed briefly above, different researchers have utilized different features to mine sentiments, such as nouns, adjectives, verbs, adverbs and their combinations. Various studies have shown that the role of adverbs is significant for the classification of sentiments. To the best of our knowledge, the few existing studies have evaluated only one or two types of adverbs. Since different forms of adverbs may contribute to the identification of a sentiment, all of them should be taken into account when performing sentiment classification. In this regard, this paper investigates the various forms of adverbs to measure their impact on sentiment classification.

To analyze the impact of adverbs and their different forms, and to identify which forms or combinations of forms play a consequential role in extracting sentiments, we propose a comprehensive methodology.

The architecture of the proposed methodology is shown in Figure 1.

To explore the impact of adverbs and their forms, a comprehensive dataset is collected from Amazon. The reviews in this dataset are pre-processed and then POS tagged. From the tagged parts of speech, the adverbs and their respective forms are extracted. There are ten distinct forms, which are distinguished on linguistic grounds [35]. The tags are assigned using the CLAWS C7 tagger [28]. After the forms of adverbs have been acquired, different combinations of them are scored using the SentiWordNet library [2]. The reviews are then classified into two classes, positive and negative, according to their scores. Along with the review text written in natural language, the star rating assigned by the user to each review is also extracted. Finally, the resulting classification is compared with this benchmark for evaluation.

3.1 Dataset collection

The dataset used in this study was crawled from Amazon using a .NET crawler. The crawler fetches the reviews of two product categories that are distinct in nature. The extracted records consist of the review text and the star rating; the forms of adverbs were added later, after POS tagging.

The crawler is based on XPath expressions. XPath is recommended by the World Wide Web Consortium (W3C) for locating elements and attributes in an XML document. It is based on a tree representation of the XML document and provides the ability to navigate around the tree, selecting nodes by a variety of criteria. In our case, as the product reviews are in HTML pages, we had to load each page as XML before extracting the required product information (Table 5).
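As a minimal sketch of XPath-style extraction, the following Python snippet pulls a rating and a review text out of a small, well-formed page. The crawler described here was written in .NET and its XPath expressions are not given, so the markup, the class names (`review`, `rating`, `text`) and the function `extract_reviews` are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a downloaded review page.
PAGE = """<html><body>
  <div class="review">
    <span class="rating">5</span>
    <p class="text">Works very well.</p>
  </div>
  <div class="review">
    <span class="rating">2</span>
    <p class="text">Installation was hardly smooth.</p>
  </div>
</body></html>"""


def extract_reviews(page: str) -> list[tuple[int, str]]:
    """Return (star rating, review text) pairs found in the page."""
    root = ET.fromstring(page)  # build the tree representation
    reviews = []
    # ElementTree supports a limited XPath subset, including attribute predicates.
    for node in root.findall(".//div[@class='review']"):
        rating = node.find("./span[@class='rating']").text
        text = node.find("./p[@class='text']").text
        reviews.append((int(rating), text.strip()))
    return reviews


print(extract_reviews(PAGE))
```

Real product pages are rarely well-formed XML, which is why a page has to be cleaned or converted before it can be loaded this way.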

Table 5 Dataset details of Microsoft products


The extracted dataset contains reviews of two product categories. The first is office products, which includes Microsoft Word, Microsoft PowerPoint, Microsoft Excel and Microsoft Access Database. The second is musical DVDs, which contains two main albums, pop tracks and slow tracks. The details of this dataset are shown in Table 6.

Table 6 Dataset details of Musical DVDs


These two product categories were selected because (1) they have a greater number of reviews than other products, (2) the reviews come from diverse locations, and (3) because the reviews come from different locations, language compulsiveness is another factor.

3.2 Data pre-processing

In the pre-processing step, we first verified sentence boundaries and then tokenized the text. Stop words, extra white space, HTML tags, new lines, redundant characters, emoticons and special symbols are removed.
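The pre-processing steps can be sketched roughly as follows in Python. The stop-word list, the emoticon pattern and the sentence-boundary rule are all simplified assumptions; the text does not specify which lists or rules were actually used, and the function names are hypothetical.

```python
import html
import re

# Hypothetical, deliberately tiny stop-word list (the actual list is not given).
# Adverbs of interest such as "very" must not appear in it.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to"}


def split_sentences(text):
    """Naive boundary rule: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def preprocess(review):
    """Return a list of sentences, each a list of cleaned lowercase tokens."""
    text = re.sub(r"<[^>]+>", " ", review)       # drop HTML tags
    text = html.unescape(text)                   # decode entities such as &amp;
    text = re.sub(r"[:;]-?[)(DP]", " ", text)    # drop a few simple emoticons
    text = re.sub(r"\s+", " ", text)             # collapse new lines and extra spaces
    sentences = []
    for sentence in split_sentences(text):
        # Keeping only word characters also removes special symbols.
        tokens = re.findall(r"[A-Za-z0-9']+", sentence.lower())
        tokens = [t for t in tokens if t not in STOP_WORDS]
        if tokens:
            sentences.append(tokens)
    return sentences


print(preprocess("<p>The office suite is very good.</p>\nIt   works well! :)"))
```

One design point worth noting: many off-the-shelf stop-word lists contain adverbs, so a list used for this kind of study would have to be filtered to keep them.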

3.3 Part-of-speech tagging (POS tagging)

The reviews comprise different parts of speech such as nouns, adjectives, verbs and adverbs. These are tagged using the CLAWS C7 tagset. Since our study focuses on adverbs, all forms of adverbs are extracted from the reviews using the Constituent Likelihood Automatic Word-tagging System (CLAWS) C7 tagger [28]. CLAWS has been continuously developed since the early 1980s and has consistently achieved 96% to 97% accuracy. Several tagsets have been used in CLAWS; 132 basic tags were used in its initial version. The current standard tagset is C7, which consists of 160 tags. In our research, we used the free CLAWS WWW tagger service to tag our dataset. The adverb forms tagged by CLAWS C7 are:

General adverbs (RR): This type of adverb modifies the verb, e.g. “carelessly” and “easily”.
General WH adverbs (WH-RR): This type of adverb modifies verbs in interrogation, e.g. “when”, “where” and “what”.
General Comparative adverbs (RRQ): This type of adverb is used to make a comparison, e.g. “better”, “longer” and “easier”.
General Superlative adverbs (RRT): This type of adverb expresses the superlative form of a general adverb, e.g. “best”, “longest” and “easiest”.
Adverbs of time (RT): This type of adverb tells us when an action happened, and also for how long and how often, e.g. “now”, “yesterday”, “tomorrow”, “all day”, “rarely” and “seldom”.
Degree adverbs (RG): This type of adverb tells us about the degree of an action, an adjective or another adverb, e.g. “so”, “very” and “much”.
Degree WH adverbs (RG-WH): This type of adverb covers a set of words beginning with wh-, e.g. “how”, “however”, “whatever”, “when” and “why”.
Degree Comparative adverbs (RGT): This type of adverb modifies a verb or another adverb with a comparison, e.g. “more”, “less” and “few”.
Degree Superlative adverbs (RGQ): This type of adverb modifies a verb or another adverb with a superlative form, e.g. “most”, “least” and “worst”.
Locative adverbs (RL): This type of adverb describes location, e.g. “alongside”, “forward” and “middle”.
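Once text has been tagged, picking out the ten forms amounts to filtering tokens by tag. The sketch below assumes the tagger output is available as space-separated `word_TAG` tokens and uses the tag labels exactly as listed above; both the input format and the helper name `adverbs_by_form` are assumptions for illustration, and the labels have not been checked here against the official C7 tagset documentation.

```python
from collections import defaultdict

# Tag labels as listed in this section, mapped to a readable form name.
ADVERB_TAGS = {
    "RR": "general",
    "WH-RR": "general wh",
    "RRQ": "general comparative",
    "RRT": "general superlative",
    "RT": "time",
    "RG": "degree",
    "RG-WH": "degree wh",
    "RGT": "degree comparative",
    "RGQ": "degree superlative",
    "RL": "locative",
}


def adverbs_by_form(tagged_text):
    """Group adverbs by tag from text of the form 'word_TAG word_TAG ...'."""
    found = defaultdict(list)
    for token in tagged_text.split():
        # Split on the last underscore so words containing '_' stay intact.
        word, sep, tag = token.rpartition("_")
        if sep and tag in ADVERB_TAGS:
            found[tag].append(word.lower())
    return dict(found)


sample = "The_AT office_NN1 is_VBZ very_RG good_JJ and_CC works_VVZ easily_RR now_RT"
print(adverbs_by_form(sample))
```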

The forms of adverbs identified in both datasets (office products and musical DVDs) after POS tagging are listed in Table 7.

Table 7 List of all forms of adverb identified in both the datasets


Let us consider an example that shows how adverbs can be part of a review and how they combine within the sentences of a review or document to tell its story.

In Figure 2, the adverbs that appear in a review are highlighted. The problem is to understand how these adverbs narrate the user's story and whether the sentiment will be classified as positive or negative. Of the adverbs in Figure 2, “annually, only, as-well, already, professionally” are general adverbs (RR), “around” is a locative adverb (RL), “most” is a degree superlative adverb (RGQ), and “soon, as” are degree adverbs (RG). The following questions now arise:

What is the contribution of these adverbs, or forms of adverbs, to the sentiment classification of a review?
How do these adverbs or their different forms impact sentiment classification?

For this reason, the proposed methodology includes experiments to understand the contribution and impact of these adverbs.

3.4 Scoring features

SentiWordNet 3.0 is a lexical resource explicitly devised for supporting sentiment classification and opinion mining applications. It is an improved version of SentiWordNet 1.0 and is publicly available for research purposes. It is currently licensed to more than 300 research groups and used in a variety of research projects worldwide. SentiWordNet assigns positivity, negativity and objectivity scores to each synset of WordNet. It is therefore a knowledge base that can be used for assigning scores. The total number of positive words present in WordNet is 30,76,708 and of negative words 1,51,044. Every feature present in a document, review or text is assigned positive and negative scores. In this study, all the adverbs are scored from SentiWordNet in order to calculate the total score of a review. The positive score of a word is calculated as the average of the positive scores of all the synonyms of that word present in the dictionary. Similarly, the negative score is calculated as the average of the negative scores of all its synonyms present in the dictionary. Two levels of scoring are used in this research: (1) sentence-level and (2) review-level scoring.
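The averaging step can be illustrated with a small Python sketch. The in-memory `LEXICON` below is a hypothetical stand-in for SentiWordNet: the entries and their values are invented for illustration and are not real SentiWordNet scores, and `word_scores` is an assumed helper name.

```python
from statistics import mean

# Hypothetical (positive, negative) score pairs, one pair per sense of the word.
LEXICON = {
    "very": [(0.25, 0.0), (0.5, 0.0)],
    "badly": [(0.0, 0.5), (0.0, 0.75), (0.125, 0.25)],
}


def word_scores(word):
    """Average the positive and the negative scores over all senses of a word."""
    senses = LEXICON.get(word.lower())
    if not senses:
        return (0.0, 0.0)  # assumption: unknown words contribute nothing
    positive = mean(p for p, _ in senses)
    negative = mean(n for _, n in senses)
    return (positive, negative)


print(word_scores("very"))   # averaged over two senses
print(word_scores("badly"))  # averaged over three senses
```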

3.4.1 Sentence scoring

The sentence score is calculated from the scores of the words present in the sentence.

$$ senScore(S)= \frac{1}{n}\sum\limits_{i=1}^{n}(P_{i}) $$

(1)

Where:

$$ \begin{array}{@{}rcl@{}} senScore(S) &=& \text{score of a sentence in a document or review}, \\ n &=& \text{total number of words present in the sentence}, \\ P_{i} &=& \text{polarity score of the } i\text{-th word in the sentence}. \end{array} $$

Let us consider an example for calculating the sentence level scores.

Sentence 1: “The Microsoft version 2013 office is very good and many things are enhanced especially the new style.”

Explanation: The word “very” is a degree adverb and the word “especially” is a general adverb. These two adverbs get their scores from the SentiWordNet library, and the average is calculated for this sentence.

The sentence score is positive because both adverbs have a positive polarity score in the polarity lexicon SentiWordNet. Let us consider another example where negation occurs.

Sentence 2: “The Access is not that good as compared to SQL but others like Excel, Word is much better than before”.

Explanation: This sentence contains “not”, a general adverb; “as”, a degree adverb; “much”, a degree adverb; and “better”, a general comparative adverb. These four adverbs get their scores. To find the polarity of a sentence in which negation occurs, the negativity is first calculated by the formula:

$$ NegScore = 1 - (positiveScores + negativeScores) $$

(2)

Afterwards, the scores are combined to determine the sentiment of the sentence. In this way, all the sentences are scored, and the sentence scores are finally averaged to score the review of a product.
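Equations (1) and (2) can be written down directly in Python, as in the sketch below. Two points are assumptions: n is taken to be the number of scored adverbs in the sentence (the reading suggested by Sentence 1, where two adverbs are averaged), and the polarity values passed in are invented rather than looked up in SentiWordNet. How the result of (2) is folded back into the sentence score is not spelled out in the text, so the sketch only computes it.

```python
def sen_score(polarities):
    """Equation (1): mean polarity of the scored words in one sentence."""
    if not polarities:
        return 0.0  # assumption: a sentence with no scored adverb is neutral
    return sum(polarities) / len(polarities)


def neg_score(positive_scores, negative_scores):
    """Equation (2): 1 minus the sum of the positive and negative scores."""
    return 1.0 - (positive_scores + negative_scores)


# Hypothetical polarities for "very" and "especially" in Sentence 1.
print(sen_score([0.25, 0.5]))
# Hypothetical positive and negative scores for a negated sentence.
print(neg_score(0.25, 0.125))
```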

3.4.2 Review scoring

The review score is calculated from the scores of the sentences present in a review.

$$ revScore(R)= \frac{1}{n}\sum\limits_{i=1}^{n}(S_{i}) $$

(3)

Where:

$$ \begin{array}{@{}rcl@{}} revScore(R) &=& \text{score of a document or review}, \\ n &=& \text{total number of sentences in the review}, \\ S_{i} &=& \text{score of the } i\text{-th sentence in the review}. \end{array} $$

Let us consider the example shown in Figure 3. In this example, the user has given the review 5 stars, which is positive. To classify the review using adverbs and their different forms, the review is first tagged and the different forms of adverbs are extracted. These forms are then combined and scored using SentiWordNet. Scores are assigned first at the sentence level and then at the review level. The final review score is thus obtained, and the review is classified into either the positive or the negative class.

Review Class: The result for this review is positive, as its score falls in the 5-star range, i.e. from 0.51 to 1.0.
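A minimal Python sketch of review-level scoring follows. `rev_score` implements Equation (3) as stated; the `classify` helper and its cut-off of 0.5 are assumptions added for illustration, since the exact score ranges per class come from a table that is not reproduced in the text. The sentence scores in the example are invented.

```python
def rev_score(sentence_scores):
    """Equation (3): mean of the sentence scores of one review."""
    if not sentence_scores:
        return 0.0  # assumption: an empty review gets a score of zero
    return sum(sentence_scores) / len(sentence_scores)


def classify(score, threshold=0.5):
    """Hypothetical two-class decision; the threshold is an assumed cut-off."""
    return "positive" if score > threshold else "negative"


scores = [0.75, 0.5, 0.625]  # invented sentence scores for one review
review = rev_score(scores)
print(review, classify(review))
```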

3.5 Star rating

Every review has a star rating, assigned by the user on the basis of her experience with a particular product; Amazon likewise records a star rating whenever a customer shares an opinion. To evaluate the 5-star rating of a review, the first step is to determine the ranges from the highest to the lowest rating. To calculate these star ratings, the range 0 to 1 is considered, as in the work of Pappas and Popescu-Belis [27], Lak and Turetken [17], Jang Jong [12] and Lee and Pang [26], where the highly positive and highly negative ends of the range are 1.0 and 0.1 respectively. Table 8 shows the star ratings along with their polarity values and classification, as taken from the literature [15, 20, 40].

Table 8 Star ratings and polarity values


The distribution of star ratings differed between the two datasets. For Microsoft products, around 63% of the reviews were rated 4 or 5 stars, around 9% were 3 stars, and around 28% were 1 or 2 stars. For the musical DVDs, around 75% of the reviews were rated 4 or 5 stars, around 8% were 3 stars, and around 18% were 1 or 2 stars. Accumulated over both datasets, the positive, neutral, and negative reviews numbered around 33,220, 4,475, and 13,310 respectively.
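The grouping of star ratings into benchmark classes described above (4 or 5 stars positive, 3 stars neutral, 1 or 2 stars negative) can be sketched in Python as follows. The function name `star_to_class` and the sample ratings are illustrative assumptions.

```python
from collections import Counter


def star_to_class(stars):
    """Map a 1-5 star rating to the benchmark class used for evaluation."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"star rating must be between 1 and 5, got {stars!r}")
    if stars >= 4:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


# Invented ratings, only to show the counting step.
ratings = [5, 4, 3, 1, 2, 5]
print(Counter(star_to_class(s) for s in ratings))
```

As a consistency check on the reported figures, the three class totals add up to the 51,005 reviews in the dataset.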

3.6 Impact of adverbs and its evaluation

The impact of the following ten forms of adverbs is studied in this research to understand their behavior.

general adverb
general superlative adverb
general comparative adverb
general-wh adverb
degree adverb
degree superlative adverb
degree comparative adverb
degree-wh adverb
time adverb
locative adverb

In order to evaluate the proposed methodology and the classification of reviews into the positive and negative classes, the standard precision and recall formulas are used.

$$ Precision = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} $$

(4)

Precision, computed as shown in (4), is the fraction of the reviews assigned to a class that truly belong to it. Recall is computed as

$$ \text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{5} $$

Recall, computed as shown in (5), is the fraction of the reviews truly belonging to a class that are assigned to it.

Furthermore, the F-measure is calculated for each of the three classes, combining precision and recall into a single score as shown in (6).

$$ \text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6} $$
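The three measures can be computed per class as sketched below. This is an illustrative example rather than the authors' implementation: it uses the standard definitions, with recall taken as true positives over true positives plus false negatives, and the helper names and the six gold and predicted labels are hypothetical.

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); returns 0.0 when nothing was predicted for the class."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN); returns 0.0 when the class never occurs in the gold data."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def per_class_scores(gold, predicted, label):
    """Return (precision, recall, F-measure) for one class label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    p_score = precision(tp, fp)
    r_score = recall(tp, fn)
    return p_score, r_score, f_measure(p_score, r_score)


# Hypothetical labels: gold comes from star ratings, predicted from the classifier.
gold = ["positive", "positive", "negative", "neutral", "positive", "negative"]
predicted = ["positive", "negative", "negative", "positive", "positive", "negative"]

pos_p, pos_r, pos_f = per_class_scores(gold, predicted, "positive")
neg_p, neg_r, neg_f = per_class_scores(gold, predicted, "negative")
print(pos_p, pos_r, pos_f)
print(neg_p, neg_r, neg_f)
```

Scoring each class separately mirrors the way the results are reported for the positive and negative classes.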

The final step was to measure the scores using feature combinations, from single features up to all ten distinct features.
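One way to enumerate feature combinations from a single feature up to all ten is sketched below. Whether every possible subset was actually evaluated is not stated in the text, so exhaustive enumeration is an assumption here. The tag abbreviations follow those used in this paper, and `feature_combinations` is a hypothetical helper name.

```python
from itertools import combinations

# Adverb feature tags as abbreviated in this paper.
ADVERB_TAGS = ["RR", "RRT", "RRQ", "RR-Wh", "RG", "RGQ", "RGT", "RG-Wh", "RT", "RL"]


def feature_combinations(tags):
    """Yield every non-empty subset of tags, smallest subsets first."""
    for size in range(1, len(tags) + 1):
        for combo in combinations(tags, size):
            yield combo


all_combos = list(feature_combinations(ADVERB_TAGS))
singles = [c for c in all_combos if len(c) == 1]

print(len(singles))     # number of single-feature runs
print(len(all_combos))  # number of non-empty subsets of ten features
```

With ten features there are 2^10 - 1 = 1023 non-empty subsets, so an exhaustive sweep remains small enough to run in full.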

4 Result analysis

4.1 Single feature analysis

This section discusses the results for the various forms of adverbs, which are shown in Figure 4.

One Feature: Positive Analysis

Figure 4 shows how each feature (general adverbs (RR), general-wh adverbs (RR-Wh), general comparative adverbs (RRQ), general superlative adverbs (RRT), time adverbs (RT), degree adverbs (RG), degree-wh adverbs (RG-Wh), degree comparative adverbs (RGT), degree superlative adverbs (RGQ), and locative adverbs (RL)) performed in classifying reviews into the positive class. The precision (blue bar), recall (brown bar), and F-measure (green bar) values computed for each form of adverb are shown in Figure 4. For example, the general adverbs (RR) feature classified reviews into the positive class with a precision, recall, and F-measure of 0.89, 0.84, and 0.86 respectively.
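As a quick check of the example values, substituting the reported precision of 0.89 and recall of 0.84 into the F-measure formula reproduces the reported F-measure:

$$ \text{F-measure} = \frac{2 \times 0.89 \times 0.84}{0.89 + 0.84} = \frac{1.4952}{1.73} \approx 0.86 $$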

Looking closely at the F-measure, RRT (general superlative adverbs) performed best, with an F-measure of 0.86. The following forms also achieved an F-measure of at least 0.75: (1) RR (general adverbs) and (2) RGQ (degree superlative adverbs). However, RR-Wh (general-wh adverbs) and RG-Wh (degree-wh adverbs) obtained the lowest F-measure of 0.59.

Findings: When a single feature is used, RRT (general superlative adverbs) obtains the best result, with an F-measure of 0.86.

One Feature: Negative Analysis

Figure 5 shows how each of the evaluated features (general adverbs (RR), general-wh adverbs (RR-Wh), general comparative adverbs (RRQ), general superlative adverbs (RRT), time adverbs (RT), degree adverbs (RG), degree-wh adverbs (RG-Wh), degree comparative adverbs (RGT), degree superlative adverbs (RGQ), and locative adverbs (RL)) performed in classifying reviews into the negative class. As in the previous results, precision, recall, and F-measure are shown with blue, brown, and green bars respectively, and the values are computed independently for each form of adverb.

Looking closely at the F-measure, RG-Wh (degree-wh adverbs) performed best, with an F-measure of 0.80. The following forms also achieved an F-measure of at least 0.75: (1) RR-Wh (general-wh adverbs), (2) RRT (general superlative adverbs), (3) RT (time adverbs), and (4) RL (locative adverbs). However, RRQ (general comparative adverbs) and RGT (degree comparative adverbs) obtained the lowest F-measure of 0.69.

Findings: When a single feature is used, RG-Wh (degree-wh adverbs) obtains the best result, with an F-measure of 0.80.

Explanation of Results:

When each of the ten forms of adverbs was evaluated individually for sentiment classification, the general superlative (RRT) adverb class outperformed the other classes for the positive class, while the degree-wh (RG-Wh) adverb class performed best for the negative class. This section explains why they performed better than the others. An in-depth analysis indicates that most of the words associated with these adverb types (RRT and RG-Wh) are strongly polarity-bearing words that users can employ on their own to state polarity.

A few such words are presented in Table 9, which lists some words from the RRT, RR-Wh, RR, and RG classes. The words associated with RRT and RG-Wh (easiest, hardest, whether, whereas) have a clear meaning for predicting the sentiment class, whereas the words associated with RG and RR are ambiguous. This is why RRT (general superlative adverbs) and RG-Wh (degree-wh adverbs) performed better than the other eight types. Negation can, however, be used with RRT and RG-Wh words to change their meaning, for example, “not easiest”. We have therefore also handled negation in this study; otherwise, wherever words such as “easiest” or “hardest” are used, they carry a clear positive or negative sentiment class.
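The effect of negation on a polarity-bearing adverb can be illustrated with the sketch below. It is not the negation handling used in this study, whose details are not given here. It assumes a simple rule that flips the polarity of a word when the token directly before it is a negator. The lexicon values, the negator list, and the function name `adverb_polarity` are all hypothetical; only the words “easiest” and “hardest” come from the text.

```python
# Hypothetical mini-lexicon; only the words "easiest" and "hardest" come from the text,
# and the +1.0 / -1.0 scores are placeholders, not values from the study.
ADVERB_POLARITY = {"easiest": 1.0, "hardest": -1.0}
NEGATORS = {"not", "never", "no"}  # assumed negation cues


def adverb_polarity(tokens):
    """Sum lexicon polarities, flipping the sign of a word directly preceded by a negator."""
    score = 0.0
    for i, token in enumerate(tokens):
        word = token.lower()
        if word in ADVERB_POLARITY:
            value = ADVERB_POLARITY[word]
            if i > 0 and tokens[i - 1].lower() in NEGATORS:
                value = -value
            score += value
    return score


plain = adverb_polarity("this is the easiest setup".split())
negated = adverb_polarity("this is not easiest to use".split())
print(plain, negated)
```

A one-token window is the simplest possible choice and would miss phrases such as “not the easiest”; a wider negation scope would be needed to cover those.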

Table 9 Examples of various form of adverbs and its impact

In Figure 6, the “Positive” scores of the proposed work are compared with those of Haider et al. [11] and Zafar et al. [39]. For this comparison, the five best-performing adverb types are selected from the studies of Haider et al. and Zafar et al.

The results show that the proposed work outperforms previous studies in this domain. There are several reasons for the better performance: (1) the proposed model was trained on a large corpus of 9,555 + 41,450 reviews, compared with 5,513 tweets; (2) an individual review is far longer than a tweet, which is limited to 140 characters, so a reasonably sized review has more descriptive power; (3) the gold standard dataset of Haider et al. is fairly small compared with the proposed dataset; and (4) the adverb types “RT” and “RL” were not identified by Haider et al., whereas in the proposed model enough evidence was found for these two types, giving the proposed model an edge. In Figure 7, the “Negative” scores are compared with those of Haider et al. [11] and Zafar et al. [39]. The results show that adverbs play a significant role in determining the polarity scores for sentiment analysis. Together, Figures 6 and 7 show that using adverbs to classify the sentiment of longer texts is viable and can be adopted by future research studies.

5 Conclusions

In this paper, we have performed sentiment classification of product reviews extracted from Amazon. We critically analyzed the state of the art in the domain and observed that contemporary research studies have not examined the impact of all forms of adverbs on sentiment classification of product reviews. To address this, we have examined 10 different types of adverbs: general adverbs (RR), general superlative adverbs (RRT), general comparative adverbs (RRQ), general-wh adverbs (RR-Wh), degree adverbs (RG), degree superlative adverbs (RGQ), degree comparative adverbs (RGT), degree-wh adverbs (RG-Wh), time adverbs (RT), and locative adverbs (RL). To conduct this study, a diversified dataset comprising 51,005 reviews of two product categories, office products and musical DVDs, was extracted from Amazon. To evaluate the results, we compared the classification results (or polarity scores) with a benchmark dataset containing the star ratings of the same reviews. The outcomes of the study revealed that general superlative adverbs secured the highest F-measure, i.e., 0.86, for the positive class. In the future, the best polarity-bearing adverb features can be combined with verbs, adjectives, and their forms for sentiment classification. The approach can also be applied in domains other than product reviews, e.g. short text messages (tweets), scientific documents, blogs, and news articles.

References

Alrababah, S.A.A., Gan, K.H., Tan, T.-P.: Mining opinionated product features using WordNet lexicographer files. J. Inform. Sci. 43(6), 769–785 (2017)
Baccianella, S., Esuli, A., Sebastiani, F.: Sentiwordnet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Lrec (2010)
Benamara, F., et al.: Sentiment analysis: adjectives and adverbs are better than adjectives alone. In: ICWSM. Citeseer (2007)
Bjorkelund, E., Burnett, T.H., Norvag, K.: A study of opinion mining and visualization of hotel reviews. In: Proceedings of the 14th International Conference on Information Integration and Web-based Applications and Services. ACM (2012)
Boiy, E., et al.: Automatic sentiment analysis in on-line text. In: ELPUB (2007)
Chesley, P., et al.: Using verbs and adjectives to automatically classify blog sentiment. Training 580(263), 233 (2006)
Das, O., Balabantaray, R.C.: Sentiment analysis of movie reviews using POS tags and term frequencies. Int. J. Comput. Appl. (IJCA) 96(25), 36–41 (2014)
Dragoni, M., Poria, S., Cambria, E.: OntoSenticNet: a commonsense ontology for sentiment analysis. IEEE Intell. Sys. 33(3) (2018)
Dragut, E., Fellbaum, C.: The role of adverbs in sentiment analysis. In: Proceedings of Frame Semantics in NLP: a Workshop in Honor of Chuck Fillmore (1929-2014) (2014)
Dray, G., Plantié, M., Harb, A., Poncelet, P., Roche, M., Trousset, F.: Opinion mining from blogs. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 1, 205–213 (2009)
Haider, S., Tanvir Afzal, M., Asif, M., Maurer, H., Ahmad, A., Abuarqoub, A.: Impact analysis of adverbs for sentiment classification on Twitter product reviews. Concurrency and Computation: Practice and Experience, e4956 (2018)
Jong, J.: Predicting rating with sentiment analysis. Citeseer (2011)
Kalarani, P., Brunda, S.: Sentiment analysis by POS and joint sentiment topic features using SVM and ANN. Soft. Comput 23(1), 1–13 (2018)
Khan, K., et al.: Mining opinion components from unstructured reviews: a review. Journal of King Saud University-Computer and Information Sciences 26(3), 258–275 (2014)
Kincl, T., Novák, M., Pribil, J.: Getting inside the minds of the customers: automated sentiment analysis. In: European Conference on Management, Leadership and Governance. Academic Conferences International Limited (2013)
K.M, A.K., Suresha: Analyzing Web user’ opinion from phrases and emoticons. International Journal of Computer Applications (IJCA) Special Issue on Computational Science - New Dimensions & Perspectives NCCSE 4, 133–139 (2011)
Lak, P., Turetken, O.: Star ratings versus sentiment analysis–a comparison of explicit and implicit measures of opinions. In: 2014 47th Hawaii International Conference on System Sciences (HICSS). IEEE (2014)
Liu, B.: Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies 5(1), 1–167 (2012)
Liu, B., Hu, M., Cheng, J.: Opinion observer: analyzing and comparing opinions on the web. In: Proceedings of the 14th International Conference on World Wide Web, pp 342–351. ACM (2005)
Mai, F., Wang, X.S., Curry, D., Chiang, R.H.L.: Mining product reviews: market structure analysis using deep learning and evolutionary clustering. SSRN Electron. J. https://doi.org/10.2139/ssrn.2724124 (2016)
Moghaddam, S., Popowich, F.: Opinion polarity identification through adjectives. arXiv:1011.4623 (2010)
Morrison, K.: 81% of shoppers conduct online research before buying AdWeek (2014)
Mudinas, A., Zhang, D., Levene, M.: Combining lexicon and learning based approaches for concept-level sentiment analysis. In: Proceedings of the First International Workshop on Issues of Sentiment Discovery and Opinion Mining. ACM (2012)
Neviarouskaya, A., Prendinger, H., Ishizuka, M.: Semantically distinct verb classes involved in sentiment analysis. In: IADIS AC (1) (2009)
Padmaja, S., Fatima, S.S., Bandu, S.: Evaluating sentiment analysis methods and identifying scope of negation in newspaper articles. Int. J. Adv. Res. Artificial Intell. 3(11) (2014)
Pang, B., Lee, L.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics (2005)
Pappas, N., Popescu-Belis, A.: Explaining the stars weighted multiple-instance learning for aspect-based sentiment analysis. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014)
Rayson, P., Garside, R.: The claws Web tagger. ICAME J. 22, 121–123 (1998)
Rill, S., et al.: A generic approach to generate opinion lists of phrases for opinion mining applications. In: Proceedings of the First International Workshop on Issues of Sentiment Discovery and Opinion Mining. ACM (2012)
Rothe, S., Ebert, S., Schütze, H.: Ultradense word embeddings by orthogonal transformation. arXiv:1602.07572 (2016)
Rout, J.K., et al.: A model for sentiment and emotion analysis of unstructured social media text. Electron. Commer. Res. 18(1), 181–199 (2018)
Shaw, R., et al.: Building a scalable database-driven reverse dictionary. IEEE Trans. Knowl. Data Eng. 25(3), 528–540 (2013)
Somprasertsri, G., Lalitrojwong, P.: Mining feature-opinion in online customer reviews for opinion summarization. J. UCS 16(6), 938–955 (2010)
Soni, V., Patel, M.R.: Unsupervised opinion mining from text reviews using sentiwordnet. Int. J. Comput. Trends Technol (IJCTT) 11, 0 (2014)
Thomson, A.J., Martinet, A.V., Draycott, E.: A practical English grammar (1986)
Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp 417–424 (2002)
Vinodhini, G., Chandrasekaran, R.: Sentiment analysis and opinion mining: a survey. Int. J. 2(6), 282–292 (2012)
Wang, H., et al.: Feature-based sentiment analysis approach for product reviews. J. Softw. 9(2), 274–279 (2014)
Zafar, L., Ahmed, I., Aleem, M., Islam, M.A., Iqbal, M.A.: Analyzing adverbs impact for sentiment analysis using hadoop. In: 2017 13th International Conference on Emerging Technologies (ICET). IEEE (2017)
Zhang, K., Narayanan, R., Choudhary, A.N.: Voice of the customers: mining online customer reviews for product feature-based ranking. WOSN 10, 11–11 (2010)

Author information

Authors and Affiliations

Department of Computer Science, Capital University of Science and Technology, Islamabad, Pakistan
Ummara Ahmed Chauhan & Muhammad Tanvir Afzal
Institute of Computing, Kohat University of Science and Technology, Kohat, Pakistan
Abdul Shahid
Department of Computer Science, University of Quebec in Montreal, Montreal, Quebec, Canada
Moloud Abdar
Department of Computer Engineering, Shahrekord University, Shahrekord, Iran
Mohammad Ehsan Basiri
School of Management and Enterprise, University of Southern Queensland, Queensland, Australia
Xujuan Zhou

Corresponding author

Correspondence to Xujuan Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Computational Social Science as the Ultimate Web Intelligence

Guest Editors: Xiaohui Tao, Juan D. Velasquez, Jiming Liu, and Ning Zhong

About this article

Cite this article

Chauhan, U.A., Afzal, M.T., Shahid, A. et al. A comprehensive analysis of adverb types for mining user sentiments on amazon product reviews. World Wide Web 23, 1811–1829 (2020). https://doi.org/10.1007/s11280-020-00785-z

Received: 28 February 2019
Revised: 12 November 2019
Accepted: 13 January 2020
Published: 20 February 2020
Issue Date: May 2020
DOI: https://doi.org/10.1007/s11280-020-00785-z

1 Introduction