1 Introduction

Many people read online customer reviews and ratings. Studies indicate that consumers consult, and trust, online reviews and comments from strangers before purchasing a product or service, and numerous statistical surveys have been conducted in this field. According to one study, 39% of customers read about eight reviews, while only 12% read 16 or more reviews before purchasing a product; 98% of customers admit that reviews from previous buyers influence their purchasing decision. The statistics also suggest that potential buyers are willing to spend 31% more on a product or service that has received positive feedback.

Many reviews are lengthy, making it difficult for a potential customer to read them and decide whether or not to purchase the product. The large number of reviews also makes it difficult for product manufacturers to track customer sentiments and opinions about their products and services.

A summary of the reviews is therefore needed. Reviews can be summarized using sentiment analysis [3]. Sentiment analysis applies natural language processing to extract subjective information from source materials. Its main task is to determine whether a stated opinion is positive or negative [4].

Because customers rarely express their opinions in simple terms, judging a stated opinion can be difficult. Some opinions are comparative, while others are direct. By condensing these opinions into two general categories, positive and negative, sentiment analysis helps shoppers gauge customer satisfaction while making purchases [5]. Feedback is used largely to help customers make online purchases and to reveal current product market trends, which helps retailers create market strategies.

2 Problem Statement

  • An opinion word that is regarded as positive in one context may be regarded as negative in another.

  • We can significantly increase the precision and capability of sentiment analysis with the use of machine learning.

2.1 Existing System

Considerable work has been done in the field of sentiment analysis at different granularities. Document-level approaches classify an entire review according to the reviewer's subjective judgment. Sentence-level studies focus on determining the polarity of a sentence (e.g., positive, neutral, or negative) using semantic information drawn from its textual content. Several recent studies also address sentiment analysis at the phrase level, where the emphasis is on phrases, that is, groups of words that frequently carry a distinct idiomatic meaning. Sentiment analysis at the aspect level, however, is still developing and needs additional study.

Sentiment analysis has been applied in a variety of industries, including the travel and entertainment sectors. One work employed a combination of machine learning features and lexical features, while another article employs perceptron neural networks. Research has also been carried out on data derived from social media, such as mapping the attitudes expressed on Twitter using observations and quantifiable data. The study argued that tracking customer opinion online can serve as dynamic feedback for any firm. The study classified the sentiment of tweets into three classes (positive, negative, and neutral) using a tree kernel-based model. The same approach can be used to track how the general public feels about a specific incident, piece of news, and so on.

| Method | Year of proposal | Classification | Text level | Prediction accuracy | Pros | Cons |
|---|---|---|---|---|---|---|
| OPINE | 2005 | Unsupervised rule-based approach | Word | 87% | Domain independent | Difficulty in availing OPINE system, thus rare to get applied in real life |
| Sentiment analysis: Adjectives and adverbs are better than adjectives alone | 2006 | Linguistic approach | Document | Pearson correlation of 0.47 | Adjectives are given more priority (adjectives express human sentiments better than adverbs alone) | None |
| Opinion digger | 2010 | Unsupervised machine learning method | Sentence | 51% | Rates product at aspect level | Requires rating guidelines to rate. Works only on known data |
| Sentiment classification using lexical contextual sentence structure | 2011 | Rule-based approach | Sentence | 86% | Said to be domain independent | Depends solely on WordNet |
| Interdependent latent Dirichlet allocation | 2011 | Probabilistic graphical model | Document | 73% | Faster in comparing and correlating sentiment and rating | Correlation between identified clusters and feature or ratings are not explicit always [6] |
| A joint model of feature mining and sentiment analysis for product review rating | 2011 | Machine learning | Document | 71% (in 3 categories); 46.9% (in 5 categories) | Automatic calculation of feature vector | Use of WordNet |

2.2 Proposed System

Figure 1 shows the architecture of the proposed system. The main goal is to process the data using NLP and then apply VADER analysis to obtain the polarity of the user's opinion.

Fig. 1 Flowchart. The flowchart starts with inputting a dataset, which is then preprocessed using NLP, followed by feature extraction using a CNN and training using a transformation network, and ends with classifying opinions and computing their polarity using the trained transformation network.

3 Literature Survey

| Sl. No. | Title | Author | Methodology | Limitations |
|---|---|---|---|---|
| Paper-1 | Survey of Deep Learning Techniques for Aspect-Based Sentiment Analysis | Ishani Chatterjee, Haoyue Liu | ABSA is treated as a multiclassification problem by traditional machine learning and deep learning techniques | Data preprocessing is underrated process. People focus more on methodology and give less attention to preprocessing of data |
| Paper-2 | A Sentiment Analysis Survey | Preeti Routray, Smita Prava Mishra | Here, various aspects of text document sentiment analysis are reviewed | Need to improve the quality of system such as accuracy |
| Paper-3 | Deep Learning Sentiment Analysis | Shilpa P C, Rissa Shereen, Vinod P | Twitter message sentiment analysis system. The tweets that we take into account for the analysis are a mix of various and emotions | Further analysis is required to obtain personality of the user from their tweets |
| Paper-4 | Survey on Sentiment Analysis and Opinion Mining | G. Vinodhini, RM. Chandrasekaran | In this study, issues in the subject of sentiment analysis are discussed along with methodologies and methods | Major obstacles include the use of several languages, opinions based on features, and phrase complexity |
| Paper-5 | Sentiment Analysis Algorithm and Application | Walaa Medhat, Ahmed Hassan, Hoda Korashy | This study primarily focuses on providing a concise overview of SA techniques and the connected topics | More work is needed for sentiment analysis to analyze a context-based SA |
| Paper-6 | Deep Learning and Machine Learning for Sentiment Analysis | Yogesh Chnadra, Antoreep Jana | Different techniques for sentiment analysis have been considered. Sentiment analysis is done using machine learning classifiers | One of the difficulties with sentiment analysis is accuracy |

4 Methodology

The implementation of the project consists of four steps, described below [7–12]:

  1. Data Preprocessing

  2. Filtering

  3. Compute Polarity of Opinion

  4. Classify Opinion

Data Preprocessing

A review contains elements that the classification model does not need, such as hyperlinks, emoji, special characters, double quotation marks, punctuation, and extra white space. Data preprocessing removes them. Stop-words such as ‘is’, ‘are’, and ‘the’, which carry little meaning on their own, are filtered out using a built-in Python module. Stemming and lemmatization are also applied, using NLP, to normalize the text before it is passed to the model [13–16].
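As an illustration, the cleaning steps described above could be sketched in Python as follows. This is a minimal sketch using only the standard library, not the project's actual code: the function name `preprocess`, the tiny `STOP_WORDS` set, and the choice to drop all non-ASCII characters are assumptions made for the example. Stemming and lemmatization are omitted because they would normally rely on an NLP library.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list; a real pipeline would
# normally load a fuller list from an NLP library.
STOP_WORDS = {"is", "are", "the", "a", "an", "and", "of", "to", "it", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Assumption: reviews are English-only, so every non-ASCII character
# (emoji, curly quotes, but also accented letters) is discarded.
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(review: str) -> list[str]:
    """Return the cleaned, lower-cased tokens of a single review."""
    text = URL_RE.sub(" ", review)       # remove hyperlinks
    text = NON_ASCII_RE.sub(" ", text)   # remove emoji and special symbols
    text = text.translate(PUNCT_TABLE)   # remove punctuation and quotation marks
    tokens = text.lower().split()        # split() also collapses extra white space
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess('The battery is "great"!!  See https://example.com 😀'))
```

The URL pattern is applied before punctuation is stripped, since removing punctuation first would break a link into fragments that no longer match.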

Data Scraping

Data scraping is performed to extract information from the preprocessed data. In this project, the step extracts the features required to analyze each sentence. Once scraping is complete, the extracted features are used to classify the polarity of the user's opinion.

Compute Polarity of Opinions

Once the features have been extracted, the VADER analysis tool uses them to compute the polarity of the sentence. VADER relies on a list of features/words, each of which has been labeled as either positive or negative.
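The idea of lexicon-based polarity scoring can be sketched as follows. This is a simplified illustration and not the VADER tool itself: the `LEXICON` entries and their valence values are hypothetical, and VADER's additional rules (negation, intensifiers, punctuation, capitalization) are left out. The normalization x / sqrt(x² + α) with α = 15 follows the form used in the VADER reference implementation, as far as we are aware.

```python
import math

# Hypothetical mini-lexicon: positive words carry positive valence,
# negative words carry negative valence. Values are illustrative only.
LEXICON = {
    "good": 1.9,
    "great": 3.1,
    "love": 3.2,
    "bad": -2.5,
    "poor": -2.1,
    "terrible": -2.1,
}


def compound_score(tokens, alpha=15.0):
    """Sum the valences of known words and squash the sum into (-1, 1)."""
    total = sum(LEXICON.get(token, 0.0) for token in tokens)
    return total / math.sqrt(total * total + alpha)


if __name__ == "__main__":
    for sentence in (["battery", "great"], ["screen", "terrible", "bad"], ["arrived", "today"]):
        print(sentence, round(compound_score(sentence), 4))
```

Words missing from the lexicon contribute nothing, so a sentence with no opinion words scores exactly 0.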

Classify Opinions

After the polarity calculation, the statement is categorized as positive, negative, or neutral depending on its compound score [17, 18]. The compound score aggregates the individual ratings into a single value ranging from −1 (most negative) to +1 (most positive).

The thresholds are as follows:

  • Positive sentiment: compound score ≥ 0.05

  • Neutral sentiment: −0.05 < compound score < 0.05

  • Negative sentiment: compound score ≤ −0.05
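The classification rule can be written as a short function. The sketch below assumes the conventional VADER cut-offs of ±0.05 on the compound score; the function name `classify` and the `threshold` parameter are introduced here for illustration only.

```python
def classify(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    # Scores strictly between -threshold and +threshold are neutral.
    return "neutral"


if __name__ == "__main__":
    for score in (0.62, 0.05, 0.0, -0.049, -0.05, -0.8):
        print(score, classify(score))
```

Both boundary values belong to the polar classes, so a score of exactly 0.05 is positive and exactly −0.05 is negative.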

5 Results

Snapshot 1. The home page offers three analysis options (Products, Document, and Text) and includes sentiment analysis with positive, negative, and neutral categories.
Snapshot 2. The product analysis tab prompts the user to enter a product URL. A Submit button is below.
Snapshot 3. The user is prompted to enter a text for analysis. A Submit button is below.
Snapshot 4. A Choose File button is provided to upload a .pdf or .txt file for document analysis. A Submit button is provided below.
Snapshot 5. The sentiment distribution is represented using a pie chart. The chart displays the distribution of sentiments: positive with 70%, negative with 2%, and neutral with 28%. Values are approximate.
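A sentiment distribution of the kind shown in the pie chart can be derived from the per-review labels by simple counting. The following sketch is illustrative only: the function name `sentiment_distribution` is an assumption, and the sample labels are made up for the example and are not the data behind the reported chart.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def sentiment_distribution(labels):
    """Return the percentage of reviews that fall into each sentiment class."""
    total = len(labels)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    counts = Counter(labels)
    return {label: 100.0 * counts.get(label, 0) / total for label in LABELS}


if __name__ == "__main__":
    # Made-up labels for ten reviews, used only to exercise the function.
    sample = ["positive"] * 7 + ["neutral"] * 2 + ["negative"]
    print(sentiment_distribution(sample))
```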