1 Introduction and related work

Sentiment analysis uses machine learning and natural language processing tools to determine whether a given document is positive or negative in polarity. In this work we use sentiment analysis to determine whether a movie review is positively or negatively oriented. Sentiment analysis operates at two levels: the document level and the aspect level (Singh et al. 2013; Parkhe and Biswas 2014). The first level uses lexicon-based methods or machine learning approaches for document classification. Pang et al. (2002) suggested a method of sentiment classification using Naive Bayes, SVM and Maximum Entropy classifiers. They experimented with different features, such as unigrams, unigrams plus bigrams, adjectives and top unigrams, and compared the results. Kang et al. (2012) proposed a method for mitigating the error caused when the accuracies of the positive and negative classes are expressed as average values; for this they proposed an improved Naive Bayes algorithm that reduced the accuracy gap. The accuracy obtained by machine learning algorithms alone is sometimes low; to address this problem, Basari et al. (2012) coupled Support Vector Machines with Particle Swarm Optimization, raising the accuracy from 71.87 to 77 %.

The second level deals with each individual aspect of the movie. A movie has many different aspects, such as direction, screenplay, acting and story, and a reviewer tends to express an opinion on each of them. Better analysis of a review is possible if the individual aspect polarities are taken into consideration: reviewers often hold different opinions about different movie aspects, so detailed analysis of a review calls for aspect-based methods. Many researchers have worked on aspect-based sentiment analysis. Thet et al. (2010) proposed a method for fine-grained analysis of the sentiment orientation and sentiment strength of the reviewer towards the various aspects of the movie. It uses domain-specific and generic opinion lexicons to score words and, with the help of a dependency tree, identifies inter-word dependencies so that word scores propagate over the entire document. Singh et al. (2013) gave a new feature-based heuristic for aspect-level sentiment analysis. In their scheme they analyse the review text and assign a sentiment label to each aspect of the review. Each aspect text is scored using SentiWordNet (2015) with feature selection comprising adjectives, adverbs, verbs and n-gram features, and the overall document is then scored as the aggregate of the aspect scores. Yu et al. (2011) proposed a method for identifying important aspects from online consumer reviews, based on the observations that such aspects are commented on most frequently and that consumer opinion on them greatly influences the overall product opinion. Their algorithm models the aspect value distribution as a multivariate Gaussian distribution. In this paper, we aim to find the movie aspects that most strongly dictate the polarity of a review. For this we assign different weights, called driving factors, to the individual movie aspects; the overall score is the sum of the individual aspect scores weighted by their driving factors. The approach of Yu et al. (2011) differs from ours in how the aspect values are assigned: they use a multivariate Gaussian distribution, whereas we use a randomized approach to assign values to the driving factors and choose the driving factors that give the maximum accuracy as the best ones. The rest of the paper is organized as follows: Sect. 2 describes the proposed method; Sect. 3 gives the dataset, experimental results and performance; Sect. 4 gives the conclusion and future work; and the last section gives the Compliance with Ethical Standards and references.

Fig. 1 Diagram for the proposed method

Table 1 Lexicon used for the aspect-based text separator

2 Proposed method

This section describes our technique for aspect-based sentiment analysis of movie reviews (Parkhe and Biswas 2014); Fig. 1 shows the method flow. The first step is pre-processing: reviews are collected from different sources and pre-processed to make them suitable for use in the method. Pre-processing includes formatting the reviews so that they are aligned in the required format; for this, HTML and other tags were removed and the reviews were converted to plain text. The next step is to separate the review text by aspect, which is done by the Aspect Based Text Separator (ABTS). The movie aspects we used are screenplay, music, acting, plot, movie and direction. An aspect-specific lexicon was used to separate the review aspect-wise; Table 1 shows some of the words used to separate the sentences (Thet et al. 2010). Each word in the lexicon is associated with its part of speech. While searching a sentence for a lexicon word, we first tag the sentence with the Stanford Part-of-Speech Tagger (2015) and then match the lexicon word within the sentence only when the parts of speech agree, as sketched below.
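A minimal sketch of the ABTS step, assuming NLTK's tagger as a stand-in for the Stanford Part-of-Speech Tagger and illustrative lexicon entries rather than the actual lexicon of Table 1:

```python
# Sketch of the Aspect Based Text Separator (ABTS).
# Requires the NLTK data packages 'punkt' and 'averaged_perceptron_tagger'.
from collections import defaultdict
import nltk

# Hypothetical (word, POS-tag) lexicon entries; see Table 1 for the real ones.
ASPECT_LEXICON = {
    "acting":    {("actor", "NN"), ("performance", "NN"), ("cast", "NN")},
    "plot":      {("plot", "NN"), ("story", "NN")},
    "direction": {("director", "NN"), ("direction", "NN")},
    "music":     {("music", "NN"), ("soundtrack", "NN")},
}

def separate_by_aspect(review_text):
    """Assign each sentence to the aspects whose lexicon words occur in it
    with a matching part-of-speech tag."""
    aspect_text = defaultdict(list)
    for sentence in nltk.sent_tokenize(review_text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        tagged = {(word.lower(), tag) for word, tag in tagged}
        for aspect, lexicon in ASPECT_LEXICON.items():
            if tagged & lexicon:  # word and tag must both match
                aspect_text[aspect].append(sentence)
    return {aspect: " ".join(s) for aspect, s in aspect_text.items()}
```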

In the next step, the separated aspect texts are forwarded to aspect-specific classifiers. A Naive Bayes classifier (Pang et al. 2002) was used for this purpose: it calculates the probability that a word, or indeed an entire sentence, belongs to the positive or the negative class of reviews. The outputs were obtained using the traditional training and testing method and were either \(-1\) or 1, denoting that the input text was negatively or positively oriented, respectively. Instead of Naive Bayes, any classifier able to clearly separate the text into two classes, such as an SVM, could be used, provided the input data are processed to meet that classifier's data format requirements. Each aspect-based classifier output is then multiplied by the driving factor of the corresponding aspect, which encodes that aspect's weight within the movie.
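A minimal sketch of one aspect-specific classifier, assuming scikit-learn's multinomial Naive Bayes over bag-of-words counts; the function names are illustrative rather than the authors' implementation:

```python
# Sketch of an aspect-specific Naive Bayes classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def train_aspect_classifier(aspect_texts, labels):
    """aspect_texts: ABTS output for one aspect across the training reviews;
    labels: +1 (positive) or -1 (negative) review orientation."""
    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(aspect_texts, labels)
    return clf

def classify_aspect(clf, aspect_text):
    """Return +1 or -1 for non-empty aspect text, and 0 if the aspect is
    absent from the review (cf. Sect. 3)."""
    if not aspect_text:
        return 0
    return int(clf.predict([aspect_text])[0])
```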

Table 2 Performance measures
Table 3 Performance measures (DF \(=\) driving factor)

The higher the value of the driving factor of an aspect, the greater its importance in the review. The driving factors satisfy the relationship

$$\begin{aligned} \sum \alpha _i = 1, \end{aligned}$$

where \(\alpha _i\) is the \(i\)th driving factor. The net output is the sum of all the classifier outputs, each multiplied by its respective driving factor:

$$\begin{aligned} \omega (d) = \sum _i \alpha _i X_i,\quad X_i \in \{-1,1\}, \end{aligned}$$

where \(\alpha _i\) is the driving factor of the \(i\)th aspect, \(X_i\) is the output of the \(i\)th classifier and \(d\) is the document under consideration. Now if

$$\begin{aligned} \omega (d)&\le 0 \rightarrow \text {negative classification of review } d\\ \omega (d)&> 0 \rightarrow \text {positive classification of review } d \end{aligned}$$

Thus a threshold score of zero is used for the classification of the document, as sketched below.
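A minimal sketch of this combination step; the aspect names follow the paper, while the function and variable names are illustrative:

```python
# Sketch of the weighted combination of aspect classifier outputs.
ASPECTS = ["screenplay", "music", "acting", "plot", "movie", "direction"]

def score_document(classifier_outputs, driving_factors):
    """classifier_outputs: dict aspect -> -1/0/+1;
    driving_factors: dict aspect -> alpha_i, with sum(alpha_i) == 1.
    Returns the classification of the document."""
    omega = sum(driving_factors[a] * classifier_outputs.get(a, 0)
                for a in ASPECTS)
    return "positive" if omega > 0 else "negative"
```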

3 Dataset, experimental results and performance

The dataset was acquired from the Large Movie Review Dataset site of the Stanford AI Lab (Maas et al. 2011; Large Movie Review Dataset 2015; Parkhe and Biswas 2014). It consists of 25,000 positive and 25,000 negative reviews collected from IMDB. Although there is no specific time span for the review collection, it was ensured that no more than 30 reviews from any single movie were included in the final dataset. Because the numbers of positive and negative reviews are equal, the chance-level accuracy of the experiment is 50 %. The dataset contains only highly positive and highly negative reviews: its authors included a review as negative only if it scored at most 4 out of 10, and as positive only if it scored at least 7 out of 10, on a benchmark set by them (Maas et al. 2011); neutral reviews were omitted. ABTS separated each review into aspects with an unequal text distribution, because each reviewer commented on each aspect in an unequal number of sentences; in some reviews not all aspects were commented on at all, and the score of such absent aspects was set to 0. As mentioned in the previous section, a Naive Bayes classifier was used for classifying the separated aspect-based text. The individual classifiers received the aspect-based text in a 70:30 ratio for training and testing, respectively. The experiment ran for 1000 iterations, and during each iteration random values between 0 and 1 were assigned to the driving factors. For the dataset under consideration, the driving factors giving the highest accuracy were chosen as the best driving factors (Table 2). The results of the experiment are depicted in Table 3, and the search procedure is sketched below.
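A minimal sketch of the randomized search, assuming the per-aspect classifier outputs (values \(-1\), 0 or \(+1\)) and the true labels have been precomputed for the test reviews; normalizing the random values so that they sum to 1 is our assumption:

```python
# Sketch of the 1000-iteration randomized search for the best driving factors.
import random

def search_driving_factors(X, y, aspects, iterations=1000, seed=0):
    """X: list of dicts, one per test review, mapping aspect -> -1/0/+1;
    y: true labels in {-1, +1}. Returns the best factors and their accuracy."""
    rng = random.Random(seed)
    best_acc, best_factors = 0.0, None
    for _ in range(iterations):
        raw = [rng.random() for _ in aspects]            # values in (0, 1)
        total = sum(raw)
        factors = {a: r / total for a, r in zip(aspects, raw)}  # sum to 1
        preds = [1 if sum(factors[a] * x.get(a, 0) for a in aspects) > 0
                 else -1 for x in X]
        acc = sum(p == t for p, t in zip(preds, y)) / len(y)
        if acc > best_acc:
            best_acc, best_factors = acc, factors
    return best_factors, best_acc
```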

Table 4 Experimental results for action genre
Table 5 Experimental results for adventure genre
Table 6 Experimental results for animation genre

The results in Table 3 depict the relationship between accuracy and the driving factors used. The highest accuracy obtained was 0.79372, i.e. 79.372 %, with the corresponding driving factors Screenplay 0.07877, Music 0.11756, Acting 0.28147, Plot 0.16390, Movie 0.31225 and Direction 0.108133.

Thus by using the mentioned driving factors we obtain an accuracy of 79.372 %, the highest obtained using this method. It is also worth noting that giving equal importance to all factors, i.e. a value of 0.165 each, resulted in a lower accuracy of 78.268 %. The effect of changing the driving factors can thus be seen in the accuracy of the overall classification. In the case of 79.372 % accuracy, the most importance was given to the movie, acting and plot aspects. We can therefore interpret from the results that the reviewers in this dataset gave more importance to these factors while writing their reviews. It also means that if a reviewer gives a positive opinion towards these aspects, then, due to their high importance, the overall review will tend to be positive even if he/she gives a negative opinion towards the other aspects.

Giving more importance to certain factors has an added advantage: it tends to suppress the user opinion about the other factors. Suppose we have a review X containing user opinions about two factors F1 and F2, and the overall orientation of the review is positive. The user has given a positive opinion about F1 and a negative one about F2, and the amount of text for aspect F1 is smaller than for aspect F2. With any non-aspect-based sentiment analysis method, since the text for F2 is larger and negative in orientation, the overall document score will tend to be skewed towards negativity. On the other hand, if driving factors are used and F1 is given more importance, the document score will better reflect the positivity of the review.

Since each aspect of a movie is analysed separately in this method, we can track the effect each aspect has on the overall score of the document. This individual aspect-based tracking could be used in a fine-grained aspect-based recommendation system, which recommends movies based on their various aspects instead of the overall rating of the movie. The method can also be applied to a product review dataset, revealing the opinion each user holds on the various aspects of the product and thus helping in the development of a proper product placement strategy. Such in-depth knowledge is very difficult to acquire from a dataset using non-aspect-based methods.

Table 7 Experimental results for comedy genre
Table 8 Experimental results for crime genre
Table 9 Experimental results for documentary genre
Table 10 Experimental results for drama genre

We also wanted to see how the above method performs on reviews of specific movie genres. We therefore applied it to movie reviews of the action, adventure, animation, comedy, crime, documentary, drama and horror genres; the results are shown in Tables 4, 5, 6, 7, 8, 9 and 10. For experimental simplification, the sum of the driving factors is taken to be 2 instead of 1 as defined previously. As the tables show, we obtained an accuracy of 63.8 % for the action genre, 63.33 % for adventure, 81.48 % for animation, 77 % for comedy, 87.3 % for crime, 84.82 % for documentary, 76.64 % for drama and 83.33 % for horror.

The various performance measures used were (Singh et al. 2013)

$$\begin{aligned}&\text {Accuracy} = \frac{\text {Total correctly classified documents}}{\text {Total number of documents}} \\&\text {Precision} = \frac{tp}{tp + fp} \\&\text {Specificity} = \frac{tn}{tn + fp} \\&\text {Recall} = \frac{tp}{tp + fn}, \end{aligned}$$

where \(tp\), \(fp\), \(tn\) and \(fn\) are the true positives, false positives, true negatives and false negatives obtained during the classification; \(tn + fp\) and \(tp + fn\) are the totals of negatively and positively oriented documents, respectively. The results obtained by applying the various performance measures can be seen in the given tables.
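A minimal sketch of these measures, computed from predicted and true labels in \(\{-1, +1\}\):

```python
# Sketch of the performance measures used above.
def performance_measures(preds, truths):
    pairs = list(zip(preds, truths))
    tp = sum(p == 1 and t == 1 for p, t in pairs)    # true positives
    fp = sum(p == 1 and t == -1 for p, t in pairs)   # false positives
    tn = sum(p == -1 and t == -1 for p, t in pairs)  # true negatives
    fn = sum(p == -1 and t == 1 for p, t in pairs)   # false negatives
    return {
        "accuracy":    (tp + tn) / len(pairs),
        "precision":   tp / (tp + fp),
        "specificity": tn / (tn + fp),  # tn + fp: all negatively oriented docs
        "recall":      tp / (tp + fn),  # tp + fn: all positively oriented docs
    }
```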

As can be seen from Fig. 2, the most important driving factors were: direction, plot and screenplay for the action genre; direction, acting and screenplay for adventure; direction, screenplay and acting for animation; direction, music and movie for comedy; movie, screenplay and plot for crime; music, screenplay and direction for documentary; movie, music and acting for drama; and acting, movie and direction for horror. Only the highest accuracy within each genre was considered for obtaining these results. The graph shows the percentage distribution of the driving factors across the genres; the total value of the factors is 2, as stated previously. This shows that each genre has its own uniquely important driving factors, and if the reviewer comments positively on these aspects, the overall accuracy of the classification increases (Table 11).

Fig. 2 Bar graph showing the distribution of driving factors across all genres

Table 11 Experimental results for horror genre

4 Conclusion and future work

The experiment was conducted to find, using driving factors, which movie aspects most influence the orientation of a review. It concluded with the movie, acting and plot aspects receiving the highest driving factors overall, resulting in an accuracy of 79.372 % on the dataset under consideration. The relative importance of these aspects may or may not change for other data, but since the experiments were conducted on a large dataset, this is quite unlikely.

As the results for genre-specific reviews show, the method gave high accuracy for some genres and lower accuracy for others. It is thus evident that a method tuned for mixed review classification does not work as well for reviews of certain genres. A new approach for genre-specific classification of reviews therefore needs to be developed, since reviews of different genres tend to incorporate genre-specific words or sentences that can take on different meanings depending on the context in which they are used. For instance, the word "funny" is used in a good sense for a comedy movie but may be used in a bad sense for a genre like horror. Such context-specific words and sentences resulted in the uneven accuracy depicted in the results.

The current method classifies text with a Naive Bayes classifier, which uses a bag-of-words approach. This approach considers neither inter-word dependencies nor the context (genre) in which a word is used. To address the latter, we intend to develop a scoring method using a context-specific lexicon, in which each word has different positive and negative scores depending on the context (genre) in which it is used. To incorporate inter-word dependencies, we intend to use clause-based scoring of sentences, in which each clause of a sentence is scored individually and the overall sentence score is the sum of the individual clause scores. By coupling this improved scoring method with genre-specific driving factors, we expect to obtain more refined scores for movie reviews.