1 Introduction

Stock market forecasting remains a challenging task in economics because of the highly stochastic character of the market. Forecasting and analysing stock market movements have attracted considerable attention, as changes in the market may have a profound influence on the economy. Political, social, environmental, economic, and public health factors all have an impact on stock market movement (Chou, Park, and Chou, 2021; Shang et al. 2021), causing markets to oscillate and become complex and uncertain (Chaudhuri, Mukherjee, Chowdhury, Sadhukhan, and Goswami, 2018; Wagner 2020). The stock market's volatility is well known to investors, who constantly monitor market movements to manage micro-investments and maximize profits while minimizing risk. Predicting stock market movement is a difficult task that requires extensive data analysis. Appropriate statistical models and artificial intelligence algorithms are required to address these issues and find an adequate solution. Numerous machine learning and deep learning algorithms can produce reliable forecasts with small errors (Mukherjee, Sadhukhan, Sarkar, Roy, and De, 2021).

Stock market movement can be studied using fundamental analysis (which considers economic factors) or technical analysis (which considers historical data) (Valle-Cruz et al. 2021). Investors' opinions, traders' feelings, general public views, and news items form another category of factors that influence the stock market (Biswas et al. 2020). These factors may collectively be studied under the well-known field of research known as sentiment analysis. Sentiment analysis uses statistics, natural language processing, and machine learning to ascertain the emotional content of communications (Hajhmida and Oueslati 2021; Hussein 2018).

COVID-19 was found for the first time in India in January 2020 and went on to cause a severe pandemic. From March 2020, all workplaces, including offices, shops, and markets, were shut down indefinitely. All commercial activities were halted, resulting in economic collapses around the world. People were forced to work from home due to the total lockdown. During this hard time, social media platforms were heavily used to share feelings, opinions regarding economic issues, and dilemmas about stock market investments. Opinions and feelings are posted on many social media platforms, and financial news and articles appear in several languages from various Indian states. Natural language processing assists in processing them, and sentiment analysis extracts the sentiments they express (Rajput 2020).

Sentiment analysis can be conducted through an assortment of approaches and tools, and it is currently receiving much attention for predicting stock market movements. This study focuses on the sentiment analysis of tweets, Facebook comments, news headlines, and online financial news articles. The sentiment scores generated in this manner are paired with stock data to investigate the repercussions of the COVID-19 pandemic. The motivation behind this work is to introduce a model in which sentiment scores produced by multiple sentiment analysis techniques are integrated with stock market data to quantify and compare the prediction performances. Seven sentiment analysis tools are utilized in this article to construct sentiment scores from four different sources of web scraped data. The data for the Nifty-50 stock market index were obtained from Yahoo Finance. OHLC (open, high, low, and close) features were extracted from the stock data.

The rest of this research work is organized as follows: Section 2 discusses related works done in this field of research. Section 3 describes the background studies involved in this work. Section 4 presents the main system model proposed in this research work. Section 5 illustrates the experimental analysis and implementation. Section 6 discusses the results and their analysis. Section 7 compares the proposed work with existing works. Finally, Sect. 8 precisely concludes the work with some future work proposals.

2 Related work

The recent rise in the availability of textual data has prompted a surge in interest in sentiment analysis. Opinion mining and opinion summarizing are the two main subfields of sentiment analysis. The former is often concerned with forecasting whether the text reflects a positive or negative value based on what we are attempting to predict, whereas the latter is typically concerned with summarizing what has been stated (Derakhshan and Beigy 2019). Sentiment analysis may be performed at various levels of abstraction. This section focuses on in-depth reviews of various relevant research articles. The primary focus in this case is to examine stock market movement prediction and sentiment analysis of web scraped data.

Numerous researchers have collected and analysed Facebook comments to use them in various operations and decision-making processes (Akter and Aziz 2016; Hajhmida and Oueslati 2021; Marengo et al. 2021; Rase 2020). Hajhmida et al. proposed using Facebook data for the prediction of mobile application breakout. They used the Facebook graph API to evaluate the sentiment polarity of user comments and then built a breakout prediction model using machine learning techniques (Hajhmida and Oueslati 2021). Akter et al. established market prices by employing sentiment analysis of data acquired from FOODBANK's social media posts, which is a very popular Facebook group in Bangladesh, using the lexicon approach (Akter and Aziz 2016). Marengo et al. used a language modelling approach to explore connections between language stated on Facebook and self-reported quality of life (physical, psychological, social) (Marengo et al. 2021). Deep learning technologies such as convolutional neural networks and long short-term memory have been utilized to understand people's feelings and opinions by producing sentiment analysis of Afaan Oromoo social networking site information such as Facebook posts and comments (Rase 2020).

Twitter sentiment analysis also supports numerous decisions. Chou et al. utilized an LSTM model that includes investor feelings, stock price time series data, and an attention mechanism to provide an accurate forecast of stock prices (Chou et al. 2021). Investors' emotions are taken into account, and tweets from investors are collected and sorted using a sentiment index to determine whether the investor plans to purchase or sell. Hassan et al. analysed the sentiments stated in tweets about new research publications to assess how influential they are early in the research cycle. According to the findings, a positive association between tweet emotions and citation counts was shown to be useful in predicting the early impact of literature (O. A.-H. Hassan, Ramaswamy, and Miller, 2009). Lu et al. performed sentiment analysis on a large dataset of tweets related to cruise tourism during the COVID-19 pandemic. The study highlights the significance of sentiment analysis and reaffirms a recent request for sentiment analysis to be a critical component of tourism research (Lu and Zheng 2021).

Public sentiment may be connected with stock price behaviour. Kordonis et al. used machine learning techniques to determine the correlation between tweets and stock market price behaviour (Kordonis, Symeonidis, and Arampatzis, 2016). Forecasting election results also makes use of sentiment analysis, which analyses public opinion on social media to make accurate predictions about voter support (Chauhan et al. 2021).

Newspaper articles and headlines are another source of text for sentiment analysis. Ghasiya et al. used the nonnegative matrix factorization (NMF) topic modelling technique on Middle East-related articles from three Japanese newspapers. After the identification of critical themes, they employed typical supervised machine learning techniques to extract overall and topic-specific sentiments from the acquired headlines (Ghasiya and Okamura 2021). Users' sentiments obtained from news headlines have a significant impact on traders' buying and selling behaviours, since they are quickly influenced by what they read. Gite et al. utilized LSTM-based deep learning in conjunction with machine learning techniques to anticipate stock prices with a high degree of accuracy (Gite et al. 2021). Mehta et al. developed and deployed a technique for predicting the accuracy of stock prices that takes public opinion into account in addition to other characteristics. To estimate future stock prices, the suggested algorithm takes into account public sentiment, opinions, news, and past stock prices (Mehta et al. 2021).

Online financial news and other news articles are crucial tools for making many decisions, which may be used in a variety of research areas through sentiment analysis. A novel sentiment analysis system based on a deep neural network was developed in (Shi et al. 2021). The novel technique improved sentiment categorization by 9% when compared to the logistic regression method. Additionally, the sentiment information calculated by the analysis system was applied to the stock movement prediction job and significantly enhanced performance when compared to techniques that used simply trading data as input. Ly and Nguyen aimed to mitigate investor risk by developing a revolutionary framework that uses sentiment analysis to anticipate the first three, five, ten, twenty, and thirty days of an IPO's price movement by evaluating its prospectus (Ly and Nguyen 2020). Wu et al. calculated the investors’ sentiment index using a sentiment analysis approach based on convolutional neural networks using nontraditional data. They integrated sentiment index, technical indicators, and historical stock transaction data as the stock price prediction feature set and used a long short-term memory network to forecast the China Shanghai A-share market (Wu et al. 2021). When forecasting the daily price trend of the OMXS30 stock market index, researchers found that adding sentiment characteristics extracted from financial news to a numerical dataset based on past prices improved classification performance (Elena 2021). Arif et al. examined the performance of learning classifier systems (LCSs), which are rule-based machine learning approaches, in sentiment analysis of tweets and movie reviews, as well as spam identification using SMS and email datasets. (Arif et al. 2018). The existing LCS approach is expanded by incorporating a unique encoding scheme for classifier rules to account for feature vector sparsity. 
The collected findings indicate that the suggested encoding strategy accelerated the learning process and consistently produced high-quality outcomes across all studies. Turner et al. emphasized stock price prediction using a sentiment vocabulary constructed from financial conference call records. They provided a technique for automatically generating an emotion lexicon based on an established probabilistic methodology. The research further demonstrates that when forecasting stock price change, domain-specific sentiment lexicons outperform general sentiment lexicons (Turner, Labille, and Gauch, 2021). Huang and Tanaka designed a modularized multiagent reinforcement learning system with the goal of introducing scalability, reusability, and depth of information intake to financial portfolio management using web news sentiment data (Z. Huang and Tanaka 2021). They demonstrated that their technique qualifies as a stepping stone for inspiring further innovative financial portfolio management system designs by its originality and superiority over current benchmarks. Another recent study aims to forecast the erratic price movement of cryptocurrencies by studying social media sentiment and determining their association (X. Huang et al. 2021). The research presented a method for determining the sentiment of messages on China's most popular social media network, Sina Weibo. In this research, Weibo posts were captured, a crypto-specific sentiment lexicon was created, and a long short-term memory (LSTM)-based recurrent neural network was used to forecast the price trend for future time frames using the past cryptocurrency price movement. Table 1 shows a brief summary of related work in this domain.

Table 1 Summary of related work

According to the review study, significant research has previously been done on sentiment analysis in stock market movement prediction using web scraped data from various sources, such as Twitter, Facebook, and news headlines. However, significant additional work is required to correctly estimate the influence of public sentiment on stock market movement.

3 Methodologies

This section covers the main theoretical notions used in the current research endeavour. Sentiment analysis is a well-known approach in data science. Almost every active research area employs data science approaches, since data science brings together many algorithms, machine learning theories, and tools to unearth buried knowledge from raw data (Budiharto 2021). Currently, stock market movement prediction and analysis is one of the most popular domains where data science is used extensively. The movement of market prices is impacted by a variety of online factors, including social media comments, financial news, stock-related news, and many more. Natural language processing is a method of dealing with these sorts of unstructured online data by turning them into a structured format that a computer can combine with stock data to determine their influence on market prediction (Biswas et al. 2020; Hajhmida and Oueslati 2021; Hussein 2018).

3.1 Natural language processing, sentiment analysis, and web scraping

Natural language processing (NLP) is a subfield of artificial intelligence (AI) that analyses text data to uncover underlying knowledge (Okon et al. 2020). Stock market fluctuations may have a significant economic impact on the economy and individual customers. On the other hand, public events, inflation, and the news media all have an effect on stock price movement (Shi et al. 2021). Sentiment analysis is a classification technique that addresses public data for opinion mining. It employs natural language processing techniques to determine the polarity of an opinion, emotion, or feeling in terms of positive, negative, or neutral sentiments (Elena 2021). Web data are necessary in a wide variety of fields, including research, academia, business, marketing, and governance. These data are available in a variety of formats. Manually downloading web data is a tedious task. Web scraping is a data extraction technology offered as a software application that automatically extracts data from different websites and stores it in a common type of database, allowing for easier processing, analysis, and visualization of data (De S Sirisuriya, 2015; Patel 2020).

3.2 Sentiment analysis tools

Each of the seven sentiment analysis tools utilized in this work is described below. The positions of these tools in the sentiment analysis classification are depicted in Fig. 1. Although logistic regression (LR) is named as a regression technique, it is a supervised learning method that performs binary classification. Logistic regression and support vector classifiers (SVC) are two of the most extensively used classification techniques. Logistic regression is a statistical technique that uses a linear combination of any number of independent variables to predict a binary outcome (Ly and Nguyen 2020). SVC, on the other hand, finds an optimal separating hyperplane that maximizes the separation between classes (Mehta et al. 2021).

Fig. 1 Sentiment analysis tools’ (used in this work) positions in sentiment analysis classification

NLTK (Natural Language Toolkit) is a leading open-source natural language processing (NLP) platform for Python, featuring over 50 corpora. SentiWordNet is a lexical resource and text processing tool for classification, tokenization, and semantic reasoning. TextBlob is a Python library that reuses NLTK corpora to assign polarity and subjectivity scores to the text data after processing (Bonta et al. 2019). The Valence Aware Dictionary and sEntiment Reasoner, abbreviated VADER, is a free, open-source, lexicon-based and rule-based sentiment analysis tool that classifies polarity (positive, negative, neutral) as well as the degree of polarity word by word (Singh et al. 2021). It works particularly well with social media data and uses the polarity_scores() function to calculate the polarity of words (Bonta et al. 2019). The Loughran–McDonald tool is equipped with a sentiment dictionary with six sentiment dimensions based on the financial industry and is best suited for financial text classification (Elena 2021). Positive, negative, and neutral polarity classifications are generated using this manually created dictionary. If C-1 and C-2 denote the positive and negative word counts, respectively, then C-1/sentence and C-2/sentence are used to denote the sentence's polarity as positive (1) or negative (-1) (Turner et al. 2021). Henry is another dictionary-based sentiment analyser. As with Loughran–McDonald, this tool is focused on positive and negative words associated with finance. The loadDictionaryHE() function is used to access the words during this analysis process (Turner et al. 2021). Stanford CoreNLP is a sentiment analysis tool that is based on a recursive neural network. It computes the sentiment score as polarity by examining the meaning of the text (Lin et al. 2018).

3.3 Long short-term memory (LSTM)

Stock market data are time-series data that can be processed efficiently by LSTM, an improved version of the RNN, to forecast future price movements (Mehta et al. 2021). Figure 2 shows the comprehensive LSTM architecture (Van Houdt et al. 2020). With three gates, an input gate, an output gate, and a forget gate, LSTM overcomes the RNN's inability to remember long-term dependencies by preserving relevant information and erasing irrelevant information (Gers and Schmidhuber 2001). The forget gate preserves relevant long-term data using Eq. (1), the input gate updates information using σ as the activation function as in Eq. (2), and finally, the output gate provides output with Eq. (3). Equation (4) generates the output vector \({h}_{t}\).

$$f_{t} = \sigma \left( {w_{xf} x_{t} + w_{hf} h_{t - 1} + w_{cf} c_{t - 1} + b_{f} } \right)$$
(1)
$$i_{t} = \sigma \left( {w_{xi} x_{t} + w_{hi} h_{t - 1} + w_{ci} c_{t - 1} + b_{i} } \right)$$
(2)
$$o_{t} = \sigma \left( {w_{xo} x_{t} + w_{ho} h_{t - 1} + w_{co} c_{t} + b_{o} } \right)$$
(3)
$$h_{t} = o_{t} \tanh \left( {c_{t} } \right)$$
(4)

where \(\sigma\) is the activation function, \(w\) is the weight matrix, \(b_{f}\), \(b_{i}\), and \(b_{o}\) are bias vectors, \(x_{t}\) is the input vector, \(h_{t - 1}\) is the previous output vector, \(c_{t - 1}\) is the old cell state, \(c_{t}\) is the updated cell state, and \(h_{t}\) is the output vector.
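The gate equations can be illustrated with a minimal scalar sketch of one LSTM time step. It assumes that the forget and input gates look at the old cell state \(c_{t - 1}\) and the output gate at the updated cell state \(c_{t}\), as in the usual peephole formulation. The cell-state update line, the weight names `xc` and `hc`, the bias `c`, and the function name `lstm_step` are assumptions introduced here for illustration; they are not given in the text and are not the implementation used in this work.

```python
import math


def sigmoid(z):
    """Logistic activation, written as sigma in Eqs. (1)-(3)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w, b):
    """One scalar LSTM step with peephole connections (illustrative only).

    w holds the weights keyed as in the equations ('xf', 'hf', 'cf', ...);
    b holds the biases keyed 'f', 'i', 'o', 'c'.
    """
    # Eq. (1): forget gate, using the old cell state
    f_t = sigmoid(w["xf"] * x_t + w["hf"] * h_prev + w["cf"] * c_prev + b["f"])
    # Eq. (2): input gate, using the old cell state
    i_t = sigmoid(w["xi"] * x_t + w["hi"] * h_prev + w["ci"] * c_prev + b["i"])
    # Assumed standard cell-state update (not stated in the text)
    candidate = math.tanh(w["xc"] * x_t + w["hc"] * h_prev + b["c"])
    c_t = f_t * c_prev + i_t * candidate
    # Eq. (3): output gate, using the updated cell state
    o_t = sigmoid(w["xo"] * x_t + w["ho"] * h_prev + w["co"] * c_t + b["o"])
    # Eq. (4): output vector
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# With all weights and biases at zero, every gate evaluates to 0.5
zero_w = {k: 0.0 for k in
          ["xf", "hf", "cf", "xi", "hi", "ci", "xo", "ho", "co", "xc", "hc"]}
zero_b = {k: 0.0 for k in ["f", "i", "o", "c"]}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=1.0, w=zero_w, b=zero_b)
```

In practice the weights are matrices and the states are vectors; the scalar form is used only to keep each term of the equations visible.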

Fig. 2 LSTM architecture

4 Proposed model

This section describes in detail the proposed model and algorithms used in this research work by a systematic three-phase working structure.

4.1 System model

Figure 3 displays the system model of this research, which is a three-phase architectural model.

Fig. 3 System model of the current research work

The entire processing is done through three phases:

Phase-1: This phase is in charge of scraping public opinions, online news, and articles from related web pages. Several sources are scraped in this phase, such as social media, online news articles, and financial news headlines.

Phase-2: This phase carries out data preprocessing and sentiment analysis of the data scraped in Phase-1. Several sentiment analysis tools calculate sentiment scores after the scraped data have been preprocessed. Preprocessing is a vital step before sentiment analysis to obtain better accuracy.

Phase-3: This is the final phase of the proposed system model. It combines the sentiment scores calculated in Phase-2 with the stock data to predict stock market movement with the help of the LSTM deep learning model. The experiments are performed in two modes, regression and classification. In both modes, two categories of results are generated for analysis. One identifies the data-tool combination that produces the best stock market movement prediction performance, and the other determines the prediction performance under the combined effect of the sentiment scores. The calculated results are analysed and compared to draw conclusions.

4.2 Algorithm

In this subsection, the algorithms for each of the three phases are described. Algorithm 1 depicts the web scraping mechanism performed in Phase-1, Algorithm 2 illustrates the sentiment analysis in Phase-2, and Algorithm 2.1 outlines preprocessing, which is the fundamental task before performing sentiment analysis. Finally, Algorithm 3 performs the final step of this research: it predicts how the stock market will move based on the sentiment scores calculated in Phase-2.

4.2.1 Algorithm 1: web scraping

Web scraping is performed for a specific duration from ‘N’ different sources to gather ‘N’ sets of raw text data used in sentiment analysis in Phase-2. Table 3 provides the implementation details of the web scraping in this work in Sect. 5.

figure a

4.2.2 Algorithm 2: data preprocessing and sentiment analysis

‘N’ sets of web scraped raw data are required to be preprocessed before feeding into the ‘M’ number of sentiment analysis tools. Data need to be preprocessed to achieve higher accuracy while performing sentiment analysis.

The data preprocessing function of the sentiment analysis algorithm is illustrated in Algorithm 2.1, following Algorithm 2. A sentiment score of 1 is assigned for positive sentiment, -1 for negative sentiment and 0 for neutral sentiment. Then, the day-wise average sentiment score is calculated. Each set of data (Set-1 to Set-N) contains the day-wise average sentiment score calculated from each sentiment analysis tool (Tool-1 to Tool-M). Table 4 depicts the data format used in this study prior to and during the execution of the sentiment analysis algorithms.
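The scoring and averaging step can be sketched as follows. This is a minimal illustration rather than the algorithm as implemented in this work; the function name `daily_average_scores`, the record layout (date string, label), and the sample records are hypothetical.

```python
from collections import defaultdict

# Score assigned to each sentiment label, as described in the text
LABEL_TO_SCORE = {"positive": 1, "negative": -1, "neutral": 0}


def daily_average_scores(records):
    """Return {date: mean sentiment score} for (date, label) records."""
    totals = defaultdict(lambda: [0, 0])  # date -> [sum of scores, count]
    for day, label in records:
        totals[day][0] += LABEL_TO_SCORE[label]
        totals[day][1] += 1
    return {day: s / n for day, (s, n) in sorted(totals.items())}


# Hypothetical labelled items from one source and one tool
sample = [
    ("2020-03-02", "positive"),
    ("2020-03-02", "negative"),
    ("2020-03-02", "positive"),
    ("2020-03-03", "neutral"),
    ("2020-03-03", "negative"),
]
averages = daily_average_scores(sample)
```

Running the same function once per source and per tool would give the N × M sets of daily scores described above.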

figure b
figure c

4.2.3 Algorithm 3: stock market movement prediction using LSTM model

The Phase-3 algorithm combines the day-wise average sentiment scores produced in Phase-2 ('N' × 'M' sets) with Nifty50 stock data collected from Yahoo Finance and then feeds them into the LSTM-based stock market movement prediction process. This phase analyses and compares the influence of the 'N' × 'M' sentiment scores on stock market movement forecasts in terms of the regression metrics MSE, MAE and R-squared and the classification metrics accuracy, recall and F1 score. Table 5 specifies the implementation details.

figure d

5 Experimental analysis and implementation

This section focuses on the experimental set-up and implementation process to achieve the intended outcome.

5.1 Experimental set-up

Table 2 portrays the details of the experimental set-up to implement and execute the obligatory experiments in this research work.

Table 2 Details of the experimental set-up

5.2 Implementation

This subsection illustrates the implementation process of the three phases of the proposed system model, as depicted in Fig. 3.

5.2.1 Web scraping

Web scraping is the first phase of the system model depicted in Fig. 3. In this study, four sources of online data ('N' = 4 in Algorithm 1) are scraped to feed into phase 2 of the system model for preprocessing and sentiment analysis. Table 3 provides a description of the web scraping that has been performed along with the scraped raw data size. Scraped data are stored in a csv file. Data from four different online sources are shown in Table 8. Scraped raw data are represented as DS-1 through DS-4, where N = 4 (number of online data sources).

Table 3 Details of web scraping implementation

5.2.2 Data preprocessing and sentiment analysis

The second phase of the system model is devoted to sentiment analysis. Web scraped raw data from Phase-1 need to be preprocessed before performing sentiment analysis. Four independent sets of scraped raw data (DS-1, DS-2, DS-3, and DS-4) are preprocessed using the methods described in Algorithm 2.1 in this study.

The preprocessed data in the .csv file are then supplied to seven distinct sentiment analysis tools ('M' = 7 in Algorithm 2). The seven tools are listed in Table 2. These seven tools analyse the sentiment of each of the four data sets, and each tool generates a sentiment score for each opinion, news item, or article gathered per day from each of the four data sources. Therefore, a total of 28 ('N' × 'M' = 4 × 7) distinct sets of sentiment scores are created for use in Phase-3 of the system model. The implementation of the seven tools is discussed as follows:

5.2.2.1 Logistic regression and linear SVC:

The preprocessed training and testing data in csv files are ready to be fed into the logistic regression model or the linear SVC model. Pickle, seaborn, nltk, sklearn, matplotlib, and a number of additional packages must be imported. The Twitter sentiment analysis is described here as an example. The model is trained using a training.csv file containing 79,908 tweets. The file includes the tweets together with their sentiment labels, date, user, flag, and ID; training requires only the text and the sentiment labels. The processed text is then vectorized. The trained models may be saved in a pickle file using the pickle package. The test data are then converted into a list that the LR model or linear SVC model can read. The results are saved in a separate .csv file with two columns, tweets and sentiments, so that each tweet has its own sentiment.

5.2.2.2 Stanford CoreNLP:

Stanford CoreNLP is an open-source Java library, so the required NLP libraries are added as a dependency in the pom.xml file. The preprocessed data (DS-1 through DS-4) are passed one by one into a document, and the documents are iterated over. In the next step, the sentiment method is used to obtain the sentiment scores, which are returned and saved in an .xlsx file. The Excel file contains the negative, positive, and neutral scores for the dataset.

5.2.2.3 Loughran–McDonald sentiment and Henry sentiment

To evaluate whether a statement is positive or negative, both approaches make use of dictionaries to classify words as positive or negative and then count how many times each positive or negative word appears in a given phrase. If the number of positive words exceeds the number of negative words, the statement is positive. The statement is negative if the number of negative words exceeds the number of positive words; otherwise, it is neutral. A data frame is created by assembling all of the scraped data (DS-1 to DS-4 one by one) line by line. Each word in each line is verified, and the positive counter is incremented if the word is positive. The negative counter is incremented if the term is negative. The counters were then compared to determine whether the line under experiment was positive, negative, or neutral. The sole distinction between these two dictionaries is the classification of terms as positive or negative.
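The counting rule shared by the two dictionary-based tools can be sketched as below. The word lists are tiny hypothetical stand-ins, not the actual Loughran–McDonald or Henry dictionaries, and the function name `dictionary_polarity` and the simple regular-expression tokenizer are assumptions made for this illustration.

```python
import re

# Hypothetical toy word lists; the real finance dictionaries are far larger
POSITIVE_WORDS = {"gain", "growth", "profit", "strong"}
NEGATIVE_WORDS = {"loss", "decline", "weak", "risk"}


def dictionary_polarity(line, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Return 1, -1 or 0 by comparing positive and negative word counts."""
    words = re.findall(r"[a-z]+", line.lower())
    pos_count = sum(1 for word in words if word in positive)
    neg_count = sum(1 for word in words if word in negative)
    if pos_count > neg_count:
        return 1   # positive
    if neg_count > pos_count:
        return -1  # negative
    return 0       # neutral: equal counts, including no matches at all


labels = [
    dictionary_polarity("Strong profit growth despite risk"),
    dictionary_polarity("Weak demand leads to loss"),
    dictionary_polarity("Markets open today"),
]
```

Swapping in a different pair of word lists changes the classification without changing the rule, which mirrors the sole distinction between the two dictionaries noted above.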

5.2.2.4 VADER

The SentimentIntensityAnalyzer class is first imported from the nltk.sentiment.vader module and then initialized. All scraped data (DS-1 to DS-4 sequentially) are examined line by line to calculate four sentiment scores (positive, negative, neutral, and compound), which are saved in a data frame. The positive, negative, and neutral scores give the proportions of the text falling into each category, and the compound score is a normalized aggregate that serves as the "Overall Sentiment." The polarity of the compound score indicates the polarity of the line's "Overall Sentiment."

If the compound score is between −0.05 and 0.05, it is considered neutral; if the compound score is more than 0.05, it is considered positive; otherwise, it is considered negative. By using the above logic and comparing it to previous results, it was found that while the percentage of negative feelings remained constant, the percentage of positive sentiments grew. As a consequence of normalization for the purpose of calculating the compound score, the percentage of neutral feelings was reduced.
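The thresholding rule can be written as a small function. This sketch takes an already computed compound score as input and does not call VADER itself; the function name `label_from_compound` is hypothetical, and treating scores of exactly ±0.05 as neutral is an assumption, since the text does not state how the boundaries are handled.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound > threshold:
        return "positive"
    if compound < -threshold:
        return "negative"
    # Assumption: boundary values of exactly +/- threshold count as neutral
    return "neutral"


# Hypothetical compound scores for four lines of text
example_labels = [label_from_compound(s) for s in (0.6, -0.3, 0.0, 0.05)]
```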

5.2.2.5 TextBlob

Python has a library named "textblob" that provides an interface for basic natural language processing tasks and is used here to perform sentiment analysis. TextBlob assigns each data entry a float polarity score in the range from -1.0 (most negative) to 1.0 (most positive). A score of 0 is assigned when none of the words in the entry match the words in the pre-set lexicon.

Table 4 provides the notation used for the data after web scraping and sentiment analysis. Table 5 shows the data representation following sentiment analysis using the seven sentiment analysis tools. Before sentiment analysis, the data are represented as DS-1 through DS-4, and after sentiment analysis, they are represented as ADS-1 through ADS-4, as average sentiment scores are calculated per day.

Table 4 Details of data representation after web scraping and sentiment analysis implementation
Table 5 Details of data representation after toolwise sentiment analysis implementation

5.2.3 Stock market movement prediction using LSTM model

In Phase-3 of the proposed model, the tool-specific average sentiment scores derived in Phase-2 are sequentially integrated with Nifty50 stock market data acquired from Yahoo Finance. Each combination is entered into the LSTM model, which is used to forecast stock market movement. Accuracy of prediction is expressed as a percentage.
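The integration of daily sentiment scores with stock data can be sketched as a join on the trading date. The function name `merge_stock_and_sentiment`, the dictionary layout, and the numbers are hypothetical and chosen for illustration (they are not real Nifty50 values); dropping days that lack a sentiment score is likewise an assumption, since the text does not say how such days are handled.

```python
def merge_stock_and_sentiment(ohlc_by_date, sentiment_by_date):
    """Join OHLC rows and daily sentiment scores on the date key.

    ohlc_by_date: {date: (open, high, low, close)}
    sentiment_by_date: {date: average sentiment score}
    Returns [(date, [open, high, low, close, sentiment]), ...] in date order.
    """
    rows = []
    for day in sorted(ohlc_by_date):  # ISO dates sort chronologically
        if day in sentiment_by_date:  # assumption: skip days without a score
            features = list(ohlc_by_date[day]) + [sentiment_by_date[day]]
            rows.append((day, features))
    return rows


# Made-up values for illustration only
ohlc = {
    "2020-03-03": (102.0, 106.0, 101.0, 105.0),
    "2020-03-02": (100.0, 103.0, 99.0, 102.0),
    "2020-03-04": (105.0, 107.0, 103.0, 104.0),
}
sentiment = {"2020-03-02": 0.25, "2020-03-03": -0.5}
merged = merge_stock_and_sentiment(ohlc, sentiment)
```

Each merged row is one candidate input feature vector; repeating the join for every source-tool pair yields the combinations that are entered into the LSTM model.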

Many machine learning evaluation metrics have been used to estimate the performance of the proposed model (Batra and Daudpota 2018; Eck, Germani, Sharma, Seitz, and Ramdasi, 2021; Mokhtari et al. 2021). As the proposed model is implemented using LSTM, both regression and classification are implemented here to evaluate the model. Regression is evaluated in terms of R2, MSE and MAE. The confusion matrix (Fig. 4) helped generate the accuracy, recall and F1 score of the classification implementation of the proposed model. The confusion matrix is a highly recommended metric to evaluate any machine learning prediction model in terms of TP, TN, FP and FN, where

  1. TP (true positive): positive instances correctly classified as positive.

  2. TN (true negative): negative instances correctly classified as negative.

  3. FP (false positive): negative instances falsely classified as positive.

  4. FN (false negative): positive instances falsely classified as negative.

Fig. 4 Confusion matrix

In terms of the confusion matrix, precision is computed from TP and FP, recall (sensitivity) from TP and FN, and specificity from TN and FP. The metrics used in this research are recall, F1 score and accuracy. Equations (5), (6), and (7) are used to calculate the accuracy (percentage of correct predictions), recall (how many actual positives are predicted correctly) and F1 score (the harmonic mean of recall and precision) of the proposed model:

$${\text{Percentage Accuracy}} = \frac{TP + TN}{{TP + TN + FP + FN}}*100$$
(5)
$${\text{Recall}} = \frac{TP}{{TP + FN}}$$
(6)
$$F1 - {\text{score}} = \frac{{\text{2*Recall*Precision}}}{{\text{Recall + Precision}}}$$
(7)

where Precision (how many positive predictions are correct) is expressed in Eq. (8).

$${\text{Precision}} = \frac{TP}{{TP + FP}}$$
(8)
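As a worked illustration, the four classification metrics can be computed directly from the confusion-matrix counts, using the standard definition of accuracy as (TP + TN) divided by the total number of samples. The function name `classification_metrics` and the counts below are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute percentage accuracy, recall, precision and F1 from counts."""
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)   # percentage accuracy
    recall = tp / (tp + fn)                              # Eq. (6)
    precision = tp / (tp + fp)                           # Eq. (8)
    f1 = 2 * recall * precision / (recall + precision)   # Eq. (7)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Made-up counts for illustration only
metrics = classification_metrics(tp=40, tn=30, fp=10, fn=20)
```

With these counts the accuracy is 70%, the recall is 40/60, the precision is 40/50, and the F1 score is 80/110. The sketch does not guard against zero denominators, which would need handling if a class is never predicted or never occurs.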

The pattern of data combinations used in this research is depicted in Table 6. The specifications for the LSTM Model are listed in Table 7. The mean square error (MSE) cost function is determined using Eq. (9):

$${\text{MSE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {{\text{real}}_{i} - {\text{predict}}_{i} } \right)^{2}$$
(9)
Table 6 Details of Data Combination for input to the LSTM Model
Table 7 Details of the LSTM model specification

The mean absolute error (MAE) and R2 are also calculated with Eqs. (10) and (11), respectively:

$${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {\left| {{\text{real}}_{i} - {\text{predict}}_{i} } \right|} \right)$$
(10)
$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {{\text{real}}_{i} - {\text{predict}}_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {{\text{real}}_{i} - {\text{mean}}} \right)^{2} }}$$
(11)

In Eqs. (9), (10) and (11), n represents the number of samples, and in Eq. (11), mean is the mean of the real values.
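The regression metrics can be sketched in a few lines of plain Python. The function name `regression_metrics` is a hypothetical helper, and R2 is computed here in the standard coefficient-of-determination form (one minus the residual sum of squares over the total sum of squares), which is an assumption about the intended definition.

```python
def regression_metrics(real, predict):
    """Return (MSE, MAE, R2) for two equal-length sequences of numbers.
    Hypothetical helper for illustration only."""
    n = len(real)
    if n == 0 or n != len(predict):
        raise ValueError("real and predict must be non-empty and equal in length")
    errors = [r - p for r, p in zip(real, predict)]
    # Eq. (9): mean of squared errors.
    mse = sum(e * e for e in errors) / n
    # Eq. (10): mean of absolute errors.
    mae = sum(abs(e) for e in errors) / n
    # Eq. (11): 1 - residual sum of squares / total sum of squares.
    mean_real = sum(real) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return mse, mae, r2
```

When all real values are identical the total sum of squares is zero, so the sketch returns NaN for R2 rather than dividing by zero.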

6 Results and analysis

This section presents the experimental results obtained during the research and their analysis.

6.1 Web scraping result

Table 8 shows a sample of the web-scraped data from each of the four online data sources.

Table 8 Web-scraped data from four online sources: DS-1, DS-2, DS-3 and DS-4

6.2 Data preprocessing and sentiment analysis

Web-scraped raw data from Phase-1 of the system model are passed into Phase-2 for preprocessing and subsequent sentiment analysis. Table 9 shows a sample of the average sentiment scores per day obtained using three of the tools: VADER, linear SVC and Henry.

Table 9 Average sentiment scores per day from different tools
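The per-day averaging behind Table 9 can be illustrated with a short sketch. It assumes each scraped item has already been reduced to a (date, score) pair by one sentiment tool; the function name `daily_average` and the input layout are hypothetical, and the paper's own aggregation may differ in detail.

```python
from collections import defaultdict


def daily_average(scored_posts):
    """Average sentiment scores per day.

    scored_posts: iterable of (date_string, score) pairs, where each score
    comes from a single sentiment tool (assumed input layout).
    Returns a dict mapping each date to its mean score, ordered by date.
    """
    totals = defaultdict(lambda: [0.0, 0])  # date -> [running sum, count]
    for day, score in scored_posts:
        totals[day][0] += score
        totals[day][1] += 1
    return {day: s / c for day, (s, c) in sorted(totals.items())}
```

For example, two posts on the same day with scores 0.5 and -0.1 would contribute a single daily value of about 0.2 for that tool and source.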

6.3 Stock market movement prediction using LSTM model

This section reports two categories of experiments, regression and classification, which measure the stock market movement prediction performance of the proposed LSTM deep learning model in terms of loss functions and percentage accuracy. Sentiment scores calculated in Phase-2 are coupled with stock data in this phase. Table 10 contains a sample of the stock data, which is plotted in Fig. 5.

Table 10 Sample Nifty50 stock data
Fig. 5 Stock price history

In this study, seven sentiment analysis tools are applied to four web-scraped data sources, yielding 28 sets of sentiment scores. Using the LSTM model, these sentiment scores are paired with stock data to forecast market movement. The experimental findings are organized and analysed in two categories.
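The pairing of the 28 score sets with stock data can be sketched as follows. The column-naming scheme, the row layout (a dict with a `date` key) and the choice to drop trading days that have no sentiment score are all illustrative assumptions, not details confirmed by the paper.

```python
from itertools import product

# Tool and source labels as named in this paper.
TOOLS = ["logistic regression", "linear SVC", "VADER", "Stanford CoreNLP",
         "TextBlob", "Henry", "Loughran-McDonald"]
SOURCES = ["ADS-1", "ADS-2", "ADS-3", "ADS-4"]


def score_set_names():
    """Enumerate every source-tool combination (4 x 7 = 28 names)."""
    return [f"{source}_{tool}" for source, tool in product(SOURCES, TOOLS)]


def attach_sentiment(stock_rows, daily_scores, column):
    """Return copies of the stock rows with one sentiment column added.

    stock_rows: list of dicts that each carry a "date" key (assumed layout).
    daily_scores: dict mapping date -> average sentiment score for that day.
    Rows whose date has no score are dropped (an assumption of this sketch).
    """
    merged = []
    for row in stock_rows:
        if row["date"] in daily_scores:
            merged.append({**row, column: daily_scores[row["date"]]})
    return merged
```

Each of the 28 merged datasets would then be fed to the LSTM model separately in the first category of experiments.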

6.3.1 Determining the data-tool combination to produce the best stock market movement prediction performance

Stock data are combined with each average sentiment score per day, computed by seven different tools on four different scraped datasets, to run predictions in two experimental categories: regression and classification. The regression metrics are R-squared, MSE and MAE, and the classification metrics are percentage accuracy, recall and F1 score; together they are used to inspect tool-level performance and the data-source-level effect of public sentiment on stock market movement. Table 11 displays the experimental results in terms of the regression metrics, with the stock data coupled with each sentiment score (seven tools on four data sources) in columns and the regression metrics in rows. Table 12 shows the experimental results in terms of the classification metrics. From Table 11, we find that linear SVC, VADER and Loughran–McDonald yield their best results on "Facebook Comments" (ADS-4) when each score is paired with stock data independently. The next best results come from sentiment scores derived from financial news of the "Economic Times" (ADS-3) by logistic regression and Henry. After that, TextBlob and Stanford perform best on tweets from Twitter (ADS-2). Finally, logistic regression performs best on stock-related article headlines from the "Economic Times" (ADS-1). Among these best-performing data-tool combinations in each group of experiments (prediction with sentiment scores from each data source and tool), the sentiment scores of "Facebook Comments" analysed by linear SVC, when combined with stock data, clearly generate the best accuracy, 98.11%.

Table 11 First category experimental results in terms of cost functions
Table 12 First category experimental results in terms of accuracy, recall, and F1 score

Table 12 shows the same pattern: VADER, linear SVC and Loughran–McDonald achieve their best accuracy with ADS-4, logistic regression and Henry with ADS-3, TextBlob and Stanford with ADS-2 and, finally, logistic regression with ADS-1. Among these results, the linear SVC-generated sentiment score from ADS-4 has the best accuracy, followed by logistic regression from ADS-1 and VADER from ADS-4.

6.3.2 Determining stock market movement prediction performance with the combined effect of sentiment scores

Stock market movement prediction is performed by combining stock data with the sentiment scores that each of the seven tools calculates from the four sources, added one after another, and checking the combined effect on stock market movement prediction. As in the first category of experiments, Table 13 reports the experimental results in terms of the regression metrics R-squared, MSE and MAE, and Table 14 shows the experimental results in terms of the classification metrics accuracy, recall and F1 score. When each of the four sentiment scores is gradually combined with the stock data, five out of seven experiments show a significant increase in percentage accuracy. Figure 6 compares, for each of the seven tools, the prediction performance obtained from stock data with the combined datasets, i.e., ADS-1, ADS-2, ADS-3 and ADS-4.

Table 13 Second category experimental results in terms of cost functions
Table 14 Second category experimental results in terms of accuracy, recall, and F1 score
Fig. 6 Comparative performance from each tool with combined datasets

According to Fig. 6, all four linear SVC sentiment scores have a significant impact on stock market movement prediction. In this case, the best prediction performance in percentage accuracy is 98.32 per cent for linear SVC, followed by 97.67 per cent for logistic regression and 96.85 per cent for VADER. Since linear SVC performs best, some of its experimental details are given in Tables 15 and 16. Table 15 depicts a snapshot of the stock dataset with all four sentiment scores. Table 16 shows all of the findings from stock market prediction using the four sentiment scores derived by linear SVC. Here, accuracy, mean absolute error (MAE) and mean square error (MSE) are first assessed for stock data alone, without any sentiment score. The four sentiment scores are then added incrementally, and the results change at each step: accuracy gradually improves while error drops. Table 16 shows one exception, a spike when stock data are combined with ADS-1_SVC; all other cases follow the incremental pattern.

Table 15 Sample Nifty50 stock data with all four linear SVC sentiment scores
Table 16 Stock market prediction results with linear SVC’s four sentiment scores
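The incremental design of the second experiment category can be sketched by building the cumulative feature sets that are evaluated one after another. The stock column names and the suffix `_SVC` on all four sentiment columns are assumptions for illustration (only `ADS-1_SVC` is named in the text), and `incremental_feature_sets` is a hypothetical helper.

```python
def incremental_feature_sets(stock_columns, sentiment_columns):
    """Return the list of feature sets used in an incremental experiment:
    stock data alone first, then one more sentiment column at each step."""
    sets = [list(stock_columns)]
    for i in range(1, len(sentiment_columns) + 1):
        sets.append(list(stock_columns) + list(sentiment_columns[:i]))
    return sets


# Hypothetical column names, ordered as the sources are added in Figs. 7 to 11.
STOCK = ["open", "high", "low", "close"]
SENTIMENT = ["ADS-1_SVC", "ADS-2_SVC", "ADS-3_SVC", "ADS-4_SVC"]

for features in incremental_feature_sets(STOCK, SENTIMENT):
    # In the experiments, a model would be trained and scored on each set.
    print(len(features), features[len(STOCK):])
```

With four sentiment columns this yields five runs per tool: the stock-only baseline plus one run for each added source.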

Additionally, Table 15 illustrates that public opinion and news articles/headlines have an effect on stock market movement forecast performance. The cumulative influence of these features improves prediction accuracy while lowering the cost. The following figures show the prediction graphs and the accompanying cost versus epoch graphs. The cost function and prediction graph in Fig. 7a and b are shown without any sentiment score. The subsequent figures (Figs. 8, 9, 10 and 11, each with (a) and (b) counterparts) demonstrate that when sentiment scores are integrated sequentially, accuracy increases and cost decreases. Figure 11b illustrates the optimal prediction, obtained when all four sentiment scores are added together.

Fig. 7 Stock market movement prediction without any sentiment score
Fig. 8 Stock market movement prediction with one sentiment score (news headlines) from linear SVC
Fig. 9 Stock market movement prediction with two sentiment scores (news headlines and Twitter) from linear SVC
Fig. 10 Stock market movement prediction with three sentiment scores (news headlines, Twitter and news articles) from linear SVC
Fig. 11 Stock market movement prediction with four sentiment scores (news headlines, Twitter, news articles and Facebook comments) from linear SVC

7 Comparison with existing works

Table 17 compares the proposed work with two recently published related works. As indicated in the table, the current work scrapes four online sources and performs sentiment analysis using seven tools. Prediction is carried out using the LSTM deep learning model. When the four sentiment scores from the seven sentiment tools are merged with stock data, the experimental setup described in Table 2 results in a higher prediction accuracy.

Table 17 Comparison with two existing works

Dutta et al. (2021) applied the VADER sentiment analysis tool to news articles from the "Economic Times" and implemented an LSTM deep learning model on the BSE stock index. Because that work is closely related to the proposed model, which addresses the NSE stock index and uses six more sentiment analysis tools on four different data sources, we compare against it to show the performance gained by increasing the number of features.

Wang et al. (2021) considered six stock prices (the extra features included in that paper are adjusted close and volume) together with sentiment scores of news headlines. They used the vaderSentiment library, which is also one of our sentiment analysis tools, and tested a total of six machine learning approaches: SVM, neural networks, a naïve Bayes-based method, random forest, logistic regression and XGBoost. They mentioned using a deep learning approach as future work and also noted the limited number of news articles collected. Because that work is related, and because we tried both to overcome its limitation and to incorporate its future work, we compare its performance with that of the proposed work to indicate the improvement.

Table 18 is also included to provide the results of combining all four sentiment scores with stock data for each of the seven sentiment analysis tools, for both classification and regression. We found that linear SVC provides the best performance on all metrics.

Table 18 Comparison with two existing works

8 Conclusion and future work

The purpose of this research is to predict and analyse stock market movement during the lockdown caused by the COVID-19 outbreak using sentiment scores. Data are scraped from four internet sources: "Stock Market Related News Headlines", "Twitter's Tweets", "Financial News Articles" and "Facebook Comments". Seven sentiment analysis tools are used to determine the sentiment scores of the four web-scraped datasets: logistic regression, linear support vector classifier, VADER, Stanford's Core-NLP, TextBlob, Henry and Loughran–McDonald. Together, the seven tools applied to the four sources provide 28 sets of daily average sentiment scores. These scores are paired with stock data in two categories of stock market movement prediction experiments, each evaluated as regression and as classification.

The first category of the experiment associates stock data with individual sentiment scores and forecasts the stock market using an LSTM deep learning network. The accuracy of a prediction is expressed as a percentage. In the second category of the experiment, sentiment scores from four online sources for each tool are gradually integrated with stock data to assess the combined influence of all four sentiment scores on prediction performance. The following conclusions may be drawn from the outcomes of these tests:

  • The highest percentage accuracy is reached when the average sentiment ratings of Facebook comments derived using linear SVC are paired with stock data. Here, the accuracy is 98.11 percent.

  • When the average sentiment scores from the four sources are incrementally integrated with stock data for each tool, five of the seven tools perform best when all four sentiment scores are combined. The best performance, 98.32 percent accuracy, is achieved by the combined linear SVC-generated sentiment scores.

  • The overall analysis of the results reveals that using public mood and news headlines/articles in combination with stock data improves forecast accuracy. As a result, increasing the number of sentiments used during the lockdown period improves forecast accuracy.

  • Linear SVC-generated sentiment scores from “Facebook comments” produce the best performances.

This study may be extended in the future to include additional stock indexes and the use of deep learning as another sentiment analysis tool to monitor changes in forecast accuracy. Stock market technical indicators are another parameter that can be used in conjunction with stock data to assess the accuracy of stock market movement forecasts.