Abstract
Stock price prediction is one of the most important aspects of business investment planning and has been an attractive research topic for both researchers and financial analysts. Many previous studies have indicated the effectiveness of social media sentiment in stock price prediction through time series modelling. However, the time series information hidden in consecutive trading days has not been fully explored. In this paper, we build a stock price prediction model based on an attention-based Long Short-Term Memory (ALSTM) network using price data, technical indicators and sentiment information from social media. We employ a novel method to feed the deep network with long time series data to learn the deep sequential information of stock price movements. A fine-tuned BERT sentiment classification model and a sentiment lexicon are proposed to extract the deep sentiment tendency of social media posts. We conducted experiments on 28 stocks over a three-year transaction period, and the results show that: (1) evaluated by the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE) and accuracy, our proposed method outperforms the baseline models on both validation and test data sets; (2) models incorporating stock prices, technical indicators and sentiment features perform better than models that use only part of the data sources; (3) the fine-tuned BERT model performs better on the sentiment classification task, and the sentiment features computed with the BERT model also lead to higher prediction accuracy than the features calculated using the sentiment lexicon; and (4) setting the input window length to 5 days achieves the best average prediction accuracy.
1 Introduction
Stock price prediction is an important task in the planning of investment activities. However, it remains a challenging problem to build an effective stock price prediction model, considering that stock prices are affected by multiple factors. In addition to historical prices and a series of technical indicators, the current stock price is also affected by social sentiment. The overall social mood toward a company may be one of the most significant variables affecting its stock price. Nowadays, with the rapid development of social media, an increasing number of investor posts are released on social media, making large amounts of sentiment data available.
Many prior studies have confirmed the validity of investors' sentiment in stock market predictions [4, 55, 61, 63], even in the Bitcoin exchange market [87]. However, social media information comprises text in a loose, unrestricted format that grows dynamically. Therefore, this study attempts to integrate and make use of as much content as possible in the social environment of the stock market to develop an effective stock prediction method that fully utilizes time series information.
Other drawbacks of previous studies include using only snapshots of the dataset at time point t to predict another time point in the future [12, 83] and using models not tailored for deep sequential information [55]. This ignores the time series relationships among consecutive trading days before time point t, which are also significant information hidden in the historical time series. The LSTM network [29] is designed to learn sequential information and has been verified to be superior to other models at extracting effective information from complex financial time series data [35, 58]. Therefore, we believe it will help improve the performance of our prediction method.
To address these questions, we employ four approaches: 1) we propose a fine-tuned BERT sentiment classification model and a sentiment lexicon to conduct sentiment analysis, 2) we convert sentiment information into a novel representation feature as model input, 3) we build an ALSTM-based architecture to learn the deep sequential information via varying input window lengths, and 4) we conduct experiments on a large collection of social media posts concerning 28 stocks over a period of three years.
This study makes four contributions: (1) we introduce an ALSTM-based architecture for stock price prediction using stock price data, technical indicators and sentiment information, which performs better than the baseline models on both validation and test data sets under three different evaluation metrics; (2) we compare model performance using different data sources, demonstrating the real effectiveness of sentiment analysis in stock prediction; (3) we propose a fine-tuned BERT sentiment classification model that shows good performance on the sentiment classification task, and the sentiment feature computed with the BERT model also leads to higher prediction accuracy than the feature calculated using the sentiment lexicon; and finally, (4) we compare prediction accuracy using different input window lengths and find that setting the time window to 5 days improves the average prediction performance for all proposed models. The highest average prediction accuracy over the 28 stocks is achieved when using the sentiment feature calculated by the fine-tuned BERT model.
The rest of the paper is organized as follows. Section 2 introduces related work on stock prediction based on price data and technical indicators, prediction combining sentiment analysis, and prediction using long input window lengths. Section 3 describes our proposed methodology. Section 4 presents the detailed experimental process and assesses the experimental results. Section 5 presents the discussion and implications. Finally, the last section concludes our contributions and proposes future work.
2 Related work
This section summarizes studies on (1) Domain 1: stock prediction based on price data and technical indicators, (2) Domain 2: stock prediction based on sentiment analysis, and (3) Domain 3: stock prediction based on long input window lengths. Several research gaps are identified through this summary.
2.1 Stock predictions based on price data and technical indicators
Stock market prediction has been an important task in both academia and business. Based on the Efficient Market Hypothesis (EMH) [18], some early studies propose that it is impossible, given the risks involved, to achieve above-market returns over the long term; therefore, the prediction accuracy for the stock market should not exceed 50% [71]. However, the EMH has been questioned ever since [31, 62], especially with the rapid development of machine learning models [5, 21, 64, 85]. A prediction accuracy of 56% is generally considered a satisfactory result [73, 77].
Despite Fama’s hypothesis, there are two different trading philosophies for stock market prediction [8]: fundamental analysis and technical analysis. The former analyzes macroeconomic factors and a company’s financial condition, while the latter assumes that future performance is related to certain historical patterns [75] such as time-series prices. Several technical indicators are defined to represent these patterns, including the moving average (MA) [24], exponential moving average (EMA) [37], momentum [43], Bollinger bands [23], etc.
Some researchers tried to make stock predictions based on historical prices only [93, 94] or predicted using a small dataset [22]. Because such test sets contain few instances, the results may be inconclusive. Stock markets generate large-scale trading data every day, providing large amounts of training data for deep neural networks [47]. Fischer and Krauss [20] applied an LSTM-based model to financial time series prediction, and the results show that the LSTM network performs better than memory-free classification models, i.e., a random forest, a logistic regression classifier, and a deep neural net.
Studies in Table 1 cover 4 main aspects of work: (a) stock market selection; (b) feature selection; (c) input window length; and (d) predicting method adoption. Each column corresponds to one aspect. As for selection of stock market, these studies choose a continuous period of time for training and testing. As for feature selection, it can be classified as price data (e.g. [28, 86]), technical indicators (e.g. [93]), or both (e.g. [54, 59]). Input window length is the length of the input vector (e.g., 3d represents a 3-day time window). Some abbreviations are used for this field: ‘m’ is minutes and ‘d’ is days. A null value means no relevant information was mentioned. As for predicting method adoption, it can be classified as (1) reduced-form models, such as ARIMA (e.g. [85]) and GARCH (e.g. [25]); (2) machine learning models, including Bayesian networks (e.g. [94]), SVM (e.g. [5]) and SVR (e.g. [88]); or (3) deep learning models, such as ANN (e.g. [9]), RNN (e.g. [3]) and LSTM (e.g. [41, 58, 92]).
2.2 Stock predictions based on sentiment analysis
Sentiment analysis, which is mainly designed to understand what others are thinking [57], has been proved effective in many applications including movie reviews [39, 40, 80], product reviews [38] and public opinions [70, 81]. Nowadays, sentiment information extracted from social media for stock market prediction has also been proved to be effective [46, 60]. There are two main sources for the researchers to merge the information extracted from the text content into their financial models. In previous studies, the main source was the news [45, 67, 68], and in recent studies, social media sources [48]. Bollen, et al. [6] conducted the most influential study to gauge specific dimensions of Twitter sentiments in predicting Dow-Jones index and achieved higher predicting accuracy. Since this seminal study, sentiment extracted from Twitter [52, 82], Yahoo! Finance [56], Sina Weibo [83], GuBa [48], etc. has been proven to be highly correlated with the stock market. Xing, et al. [84] mentioned that it is insufficient for investors to make investments only based on public sentiment and other factors must also be considered in prediction models.
There are two main perspectives on sentiment analysis of text contents: sentiment lexicon [15, 30] and natural language processing [1, 32]. Picasso, et al. [61] extracted two distinct sets of sentiment features from sentiment texts based on the dictionary of Loughran and Mcdonald [50] and AffectiveSpace2 [7] separately. The former is a specific dictionary for financial applications while the latter is a vector space model which is designed to extract sentiments from structured content. Their results show that combining sentiments with price technical indicators outperforms using price data only. The employment of AffectiveSpace feature as input achieved higher accuracy, while the use of the features calculated by Loughran and McDonald dictionary achieves higher returns.
As shown in Table 2, these studies include 5 main aspects of work: (a) stock market selection; (b) feature selection; (c) input window length; (d) sentiment analysis method adoption; and (e) predicting method adoption. As for selection of stock market, these studies also focus on a continuous period of time. As for feature selection, these studies add sentiment information into the feature set in the form of (1) polar sentiments (e.g. [45]), (2) sub-categorical sentiments (e.g. [61, 69]), or (3) a sentiment index (e.g. [31]). As for input window length, these studies also focus on a fixed input window length (e.g. [6, 48]). As for sentiment analysis method adoption, it can be classified as sentiment lexicon (e.g. [82]) or natural language processing (e.g. [56]). As for predicting method adoption, (1) machine learning models, including SVM (e.g. [47]) and SVR (e.g. [52]), and (2) deep learning models, such as LSTM (e.g. [12]) and RNN (e.g. [83]), are commonly used.
2.3 Stock predictions based on long input window length
Stock prediction can be viewed as a time series problem when using long input window length for model training. Given a univariate or a multivariate time-series, one may treat the entire time-series as a sample. There has been a lot of interest in predicting through long input window length, and it remains an active research area [15, 91].
Nguyen, et al. [55] extract information from two consecutive days for stock movement prediction. In their study, the features of each day are treated as parallel and used to train an SVM. Shynkevich, et al. [72] employ technical indicators to describe information about the past trend of the stock price. In their research, indicators are regarded as a snapshot of the current situation which also reflects past behaviour over a certain period of time. Several machine learning algorithms are proposed to train on these input features, which are calculated from price data over different time spans. With the rapid development of computer engineering, deep learning algorithms have been widely used in financial time series modelling tasks. Instead of using indicators calculated from different input window lengths, these studies consider higher-dimensional input data [17, 34], allowing deep learning networks to learn the hidden sequential information.
As shown in Table 3, the 5 main aspects in these studies include: (a) stock market selection; (b) feature selection; (c) input window length; (d) input data form; and (e) predicting method adoption. Stock market selection and feature selection are trivial. As for input window length, these studies use a relatively long time period (e.g. [49]), or several optional lengths for comparison (e.g. [53, 89]). As for input data form, it can be categorized as one-dimensional vector (e.g. [55, 72]) or high dimension vector (e.g. [42]). As for predicting method adoption, LSTM (e.g. [66, 79]) is most commonly used.
2.4 Summary
By summarizing and comparing previous research in the above three domains, we identified three issues that warrant further investigation, as follows.
The first issue is that many previous studies make predictions using only stock price data and several technical indicators. The booming development of social media accelerates the dissemination of users’ opinions and sentiments [44]. Investors tend to seek emotional support [19], making the impact of sentiment opinions more significant than usual. Hence, sentiment analysis of social media posts carries greater significance in stock prediction tasks.
The second issue is that existing sentiment analysis approaches lack an in-depth understanding of the sentiment text content. Some semantics-based methods use a sentiment lexicon to analyze sentiment. However, since the sentiment of the whole content is judged by a limited set of keywords, the deep sentiment in the text may be neglected due to the imperfection of the sentiment lexicon. To extract the deep sentiment, a more efficient method is needed. Therefore, we utilize BERT [16] in our sentiment analysis process, inasmuch as it has yielded better results for many NLP tasks including sentiment classification.
The last issue is that previous studies fail to explore the impact of long input window lengths on prediction performance. Although many previous studies consider long input windows for models to learn from, the length is usually fixed [45, 90], or the input data form lacks time series information [55]. Changing the input window length may also cause variation in prediction performance, but this is seldom considered. Hence, it is of vital importance to examine the differences between input window lengths in prediction.
To settle these three issues, this study builds a prediction model based on ALSTM networks using three data sources as input: price data, technical indicators and a sentiment feature. The sentiment feature is extracted from social media posts through two different sentiment analysis methods for comparison. The first is a manually predefined sentiment polarity lexicon for the financial field, and the second is a fine-tuned BERT sentiment classification model. Input windows of different lengths are organized to feed the ALSTM networks for comparison. To our knowledge, this paper is one of the earliest attempts to reveal the impact of sentiment analysis via different window lengths on stock price prediction.
3 Methodology
An overview of the research framework is shown in Fig. 1. First, the sentiment posts are analysed and a sentiment indicator for each transaction day is calculated. Then the sentiment indicators, combined with the time series stock prices and technical indicators, are organized as model input. By learning the past N days’ features, the closing price of day N + 1 is predicted. Details of each part are explained in the following subsections.
3.1 Price and technical indicators
In this study, 6 stock price indicators and 8 technical indicators are selected to construct the indicator set.
The stock price data comprise the open, close, high and low prices, the turnover rate and the trading volume. Technical indicators are widely used for market state analysis [3]. Therefore, besides historical prices, we also employ several technical indicators, shown in Table 4, as extra inputs for the ALSTM networks. These indicators can reflect stock trends from multiple aspects, providing rich stock market signals for the ALSTM networks to learn. However, these technical indicators may not have exact values on every trading day due to their different time configurations. Therefore, transaction days with missing values are removed to ensure the integrity of the time series data.
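The indicator computation and missing-value handling described above can be sketched as follows; the 5-day window and the two indicators here are illustrative placeholders, not the paper's full set from Table 4:

```python
import pandas as pd

def add_indicators(close: pd.Series) -> pd.DataFrame:
    """Attach two example technical indicators to a closing-price series."""
    df = pd.DataFrame({"close": close})
    df["ma5"] = close.rolling(window=5).mean()           # simple moving average
    df["ema5"] = close.ewm(span=5, adjust=False).mean()  # exponential moving average
    # Indicators with look-back windows are undefined on the earliest days,
    # so transaction days with missing values are dropped.
    return df.dropna()

prices = pd.Series([10.0, 10.2, 10.1, 10.4, 10.3, 10.5, 10.6])
features = add_indicators(prices)  # first 4 rows dropped by the 5-day window
```

Dropping the rows with undefined indicator values keeps every remaining transaction day fully populated across all features.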
3.2 Sentiment analysis
The sentiment analysis module in Fig. 1 classifies sentiment posts into three categories, positive, negative, and neutral, according to the beliefs or expectations expressed: a positive post means that the mentioned stock price is expected to rise in the near future, or shows the poster’s tendency to buy this stock; a negative post indicates an expectation of a price fall or a tendency to sell this stock; and a neutral post shows no obvious expectation or recommendation and no tendency to trade. These user-generated text contents are processed by two sentiment analysis methods in this study for comparison: a manually constructed sentiment lexicon and a fine-tuned BERT model for sentiment classification.
3.2.1 Sentiment lexicon
Sentiment dictionaries have been widely used to transform sentimental content into representations. In this experiment, the National Taiwan University Sentiment Dictionary was used as the basic lexicon and extra finance-related terms were manually added. These terms are rise/fall-relevant terms summarized from online posts and relevant studies to make up for the lack of relevance between the original lexicon and the stock market. The new lexicon contains two polar sentiments: positive and negative. Words that do not exist in our lexicon are assigned to the third sentiment dimension, neutral. Three natural language processing steps are employed to process these online posts. The first step is Chinese word segmentation and unwanted word removal. Unwanted words such as stop words and special characters (@, #, $, etc.) play no role in the classification process. Through this step, the text sequence for each post is obtained. The second step is sentiment word matching: the text sequences are matched against our sentiment lexicon, marking words with the tags “positive”, “negative” and “neutral”. The third step is post sentiment calculation. The sentiment polarity of post j is calculated through Eqs. (1)–(4).
Here j indexes the posts and i indexes the ith word in the text sequence. Pos(i, j) and Neg(i, j) indicate whether the ith word is positive or negative, respectively. When a word appears in the positive part of our lexicon, it is counted toward PosCountj, the total number of positive words. When a word appears in the negative part of our lexicon, it is counted toward NegCountj, the total number of negative words. In this study, PosCountj and NegCountj represent the extent of expectation of a rise and a fall.
Through Eqs. (3) and (4), the magnitudes of PosCountj and NegCountj are compared. When PosCountj is larger than NegCountj, the post expresses more expectation of the stock price rising, and vice versa. Dj is calculated from PosCountj and NegCountj to classify the polarity of post j as positive, negative or neutral. These sentiment polarity marks are used to construct the sentiment indicators.
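A minimal sketch of the lexicon-based polarity calculation of Eqs. (1)–(4); the word lists below are hypothetical placeholders, not actual lexicon entries:

```python
def post_polarity(tokens, pos_words, neg_words):
    """Classify one post by lexicon hits, mirroring Eqs. (1)-(4):
    PosCount_j / NegCount_j count positive / negative lexicon words,
    and D_j is +1 (positive), -1 (negative) or 0 (neutral)."""
    pos_count = sum(1 for t in tokens if t in pos_words)  # PosCount_j
    neg_count = sum(1 for t in tokens if t in neg_words)  # NegCount_j
    if pos_count > neg_count:
        return 1   # more expectation of a price rise
    if neg_count > pos_count:
        return -1  # more expectation of a price fall
    return 0       # neutral

# Hypothetical lexicon entries, for illustration only.
pos_words = {"rise", "buy", "bullish"}
neg_words = {"fall", "sell", "bearish"}
d_j = post_polarity(["buy", "rise", "fall"], pos_words, neg_words)
```

Here two positive hits outweigh one negative hit, so the post is marked positive.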
3.2.2 BERT-based sentiment classifier
Besides the sentiment lexicon, we also employ BERT, a pre-trained language model based on deep bidirectional Transformers [78], to perform the sentiment classification task. We take advantage of fine-tuned BERT for sentence-level sentiment classification, as it has produced state-of-the-art results for many NLP tasks [26]. The output of this multi-class, single-label sentiment classifier is the predicted probability of each class, and the final predicted category (positive, negative or neutral) is taken from the output probabilities.
A natural idea for fine-tuning is to further pre-train BERT with target domain data [74], since BERT was trained on a general domain. In this study, we directly fine-tune the pre-trained BERT model with a task-specific dataset, which is constructed from randomly selected data in the GuBa dataset. The sentiment polarity of each text is manually labelled as follows. First, we unified the sentiment annotation guideline for the financial field. Second, a group of five coders completed the first round of sentiment annotation. Then another group of five coders completed the second round of sentiment annotation on the same text contents. Inconsistencies in annotation were resolved by a five-coder verification team in a final discussion. Finally, the labelled dataset was used in the fine-tuning process for the specific task. In this way, we relax the limitations on model performance and endow the model with rich sentiment knowledge.
3.2.3 Construction of sentiment indicators
Sentiment indicators are constructed through the sentiment indicator construction method in Fig. 1 based on the sentiment classification results. Following [2, 10, 33], we adopt the bullishness indicator, which is defined in Eq. (5),
where \( {M}_t^c={\sum}_{i\in D(t)}{w}_i{x}_i^c \) is the weighted sum of messages of type c ∈ {pos, neg, neu} in the time interval D(t). \( {x}_i^c \) is equal to one when post i is type c and zero otherwise, and wi is the weight of the post. Antweiler and Frank [2] reveal that the alternative weighting schemes make no difference to conclusions and employ the equal weighting. Therefore, we also regard \( {M}_t^c \) as the number of posts of different categories. Antweiler and Frank [2] propose another bullishness indicator, which is shown in Eq. (6):
In order to reflect the number of investors expressing a certain sentiment, they provide an alternative method of calculation, as shown in Eq. (7):
The second measurement of \( {B}_t^{\ast } \) outperforms the other one in their research. However, neutral posts are not considered in these bullishness indicators. The neutral posts can also reflect the investors’ attention on a particular stock even if they do not contain obvious expectations or beliefs. Considering a more comprehensive investor attention, we propose the following investor sentiment indicator \( {B}_t^{all} \), as is shown in Eq. (8),
where Mt is the total number of posts at time interval D(t). Mt changes with the investors’ attention and is not influenced by the sentiment classification methods.
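The three indicators can be sketched as below. Since the displayed equations are not reproduced in this excerpt, the concrete forms are assumptions consistent with the surrounding description: the classic Antweiler and Frank ratio for Eq. (5), a log-scaled variant for Eqs. (6)–(7), and normalization by the total post count Mt (so that neutral posts enter) for the proposed \( {B}_t^{all} \) of Eq. (8):

```python
import math

def bullishness(m_pos: int, m_neg: int, m_neu: int):
    """Return (B_t, B*_t, B_t^all) for one time interval D(t)."""
    m_t = m_pos + m_neg + m_neu                    # total posts, including neutral
    b = (m_pos - m_neg) / (m_pos + m_neg) if (m_pos + m_neg) else 0.0
    b_star = math.log((1 + m_pos) / (1 + m_neg))   # log-scaled variant
    b_all = (m_pos - m_neg) / m_t if m_t else 0.0  # attention-aware indicator
    return b, b_star, b_all

b, b_star, b_all = bullishness(m_pos=6, m_neg=2, m_neu=2)
```

Note how the two neutral posts shrink `b_all` relative to `b`: heavy neutral discussion dilutes the bullish signal while still registering investor attention.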
3.3 Attention-based LSTM networks
In this study, attention-based LSTM networks are chosen as the prediction model. The LSTM has a similar architecture to the Recurrent Neural Network (RNN). A recurrent neural network is able to learn temporal patterns from sequential data through internal loops. Its weights are learned by backpropagation, which has difficulty retaining long-term information and may confront the problem of vanishing (or exploding) gradients. LSTM models were proposed to solve these problems [29], and the biggest difference is that an LSTM contains three additional gates.
These gates determine whether data can pass through and enable LSTM networks to learn long-term dependencies. The three gates are the input gate, forget gate, and output gate. The input gate indicates whether new information can be added to the LSTM memory. The forget gate decides what information should be abandoned. The output gate controls whether to output the state. The calculations of the integral process are performed with the following formulas:
where Wf, Wi, Wc, Wo are weight matrices, bf, bi, bc, bo are bias vectors, ht is the memory cell value at time t, σ calculates how much data to keep, ft is the value of the forget gate layer, it gives the values of the input gate, \( {\overset{\sim }{C}}_t \) is the total data reserved at time t, Ct indicates the current cell state, and ot is the output gate layer. The LSTM model comprises these memory blocks and is capable of learning longer temporal patterns.
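For concreteness, one LSTM cell step following the gate equations above can be sketched in NumPy; the shapes and the concatenated-input parameterization are illustrative assumptions, not the exact implementation used in the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step over the concatenated input z = [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde      # updated cell state C_t
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # new hidden state
    return h_t, c_t

# Tiny example: hidden size 2, input size 3.
rng = np.random.default_rng(0)
hid, inp = 2, 3
W = {k: 0.1 * rng.standard_normal((hid, hid + inp)) for k in "fico"}
b = {k: np.zeros(hid) for k in "fico"}
h, c = lstm_step(np.ones(inp), np.zeros(hid), np.zeros(hid), W, b)
```

Because the output gate and tanh both saturate, each hidden unit stays in (−1, 1), which is what lets the cell state carry long-range information without exploding.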
Attention mechanism is introduced to the LSTM networks, which will adaptively assign different attention weights to different features. After forming the feature vector H = {h1, h2, …hT} through the hidden layer, the attention mechanism will look for the attention weight αi of hi, and the attention mechanism formula is as follows:
where Wh is the weight matrix of hi. The output of the attention mechanism can be obtained as:
where the ∗ operation above is componentwise multiplication, that is, \( {h}_j^{\ast }={h}_j\ast {\alpha}_j,j=1,2,\dots, T \).
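The attention computation can be sketched as follows, assuming a tanh scoring function followed by a softmax (the exact scoring form of the paper's formula may differ):

```python
import numpy as np

def attention(H, w_h):
    """Score each hidden state, softmax into weights alpha, and rescale
    componentwise: h_j* = h_j * alpha_j."""
    scores = np.tanh(H @ w_h)            # one scalar score per time step
    e = np.exp(scores - scores.max())
    alpha = e / e.sum()                  # softmax -> attention weights
    return H * alpha[:, None], alpha

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # T = 3 hidden states
H_star, alpha = attention(H, np.array([0.5, 0.5]))
```

The weights sum to one, so the mechanism adaptively redistributes emphasis across time steps rather than scaling the sequence overall.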
4 Experiments
4.1 Dataset
Two datasets are employed in the stock price prediction process. The first is the stock prices and technical indicators dataset, and the second is the sentiment information dataset. Stock prices and technical indicators come from the RESSET Financial Database (www.resset.com), while the sentiment information comes from GuBa (http://guba.eastmoney.com).
4.1.1 Stock price and technical indicator dataset
All 28 pharmaceutical stocks in the CSI 300 are chosen for the experiments. Stock historical prices and technical indicators are collected for a period of three years (from November 18, 2016 to November 18, 2019). Stock codes and company names are shown in Table 5.
There are three reasons for choosing the 28 pharmaceutical stocks in the CSI 300:

1. CSI 300 stocks have higher market capitalization compared with others in the whole A-share market, which means there are more discussions on GuBa.

2. Negative news about pharmaceutical and biological companies continues to emerge and has drawn increasing attention from Chinese society, such as the fraud case of DEEJ and the expired honey case of Tongrentang Chinese Medicine.

3. Choosing stocks in the same industry can reduce the negative impact of industry factors on stock price prediction.
4.1.2 GuBa dataset
To construct sentiment indicators, expectations and beliefs need to be extracted from online posts. Text contents for the 28 stocks are collected from GuBa during the same three-year period to build our sentiment information dataset. GuBa is the most representative internet stock message board in China, where investors usually share company news, stock price movement predictions, facts, and comments (often with strong emotional tendencies) on specific company events. Each stock has its own GuBa page where the stock-related posts can be easily accessed. Two examples of GuBa posts published by investors during the three-year period are shown in Fig. 2. The first post shows obviously negative sentiment, and the other shows strong optimism about the stock’s future price trend.
The stock market is closed on weekends and holidays. Posts published from 2:40 pm of the previous transaction date to 2:40 pm of the current transaction date are assigned to the current transaction date. Intervals spanning more than 24 hours are divided by the number of days they cover. Each stock has transaction dates over the three-year period in our dataset.
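The 2:40 pm assignment rule can be sketched as follows; the even division of longer intervals across the days they cover is omitted, and the function and variable names are our own:

```python
from datetime import date, datetime, timedelta

def assign_trading_day(post_time, trading_days):
    """Map a post timestamp to its transaction date: posts before the
    2:40 pm cutoff belong to the current day, later posts to the next
    day, rolling forward over weekends and holidays."""
    cutoff = post_time.replace(hour=14, minute=40, second=0, microsecond=0)
    day = post_time.date() if post_time < cutoff else post_time.date() + timedelta(days=1)
    for d in trading_days:   # trading_days is sorted and market-open only
        if d >= day:
            return d
    return None

days = [date(2019, 11, 15), date(2019, 11, 18)]  # a Friday and the next Monday
```

A post at 3:00 pm on the Friday, or any time on the weekend, is thus attributed to Monday's transaction date.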
However, as with other sentiment information sources, posts on GuBa are messy. Post content usually varies in length and is riddled with spelling mistakes, uncommon expressions, redundant HTML links and irrelevant information. Table 6 tabulates the min, median, mean, max and total number of posts per transaction date for each stock after a clean-up pre-processing. Over this three-year period, we accumulated a total of 1,451,272 posts.
4.2 Baseline setup
In the experiment, Support Vector Regression (SVR) and recurrent neural networks (RNN) are used as baselines.
4.2.1 Support vector regression
First designed by Cortes and Vapnik [14] as a classifier, SVR captures nonlinear relationships and has a global optimum. Previous studies have reported the effectiveness of SVR for financial time series forecasting problems [27, 64].
In a regression task, given a time-series data set \( F={\left\{\left({\mathbf{x}}_k,{y}_k\right)\right\}}_{k=1}^n \) derived from an unknown function y = g(x), we need to determine a function y = f(x) based on F that minimizes the difference between f and the unknown function g. The main idea of SVR is to build a mapping x → ϕ(x) to a new feature space X′ according to the mapping scheme. The nonlinear relationship is then transformed into a linear relationship between the new feature ϕ(x) and the label y in the newly created space. The SVR model can be written as
where xk are the support vectors in data set F and yk are the corresponding labels. K(xk, x) = ϕ(xk) · ϕ(x) is the kernel function, and “·” is the inner product in feature space X′. The learning process on the given data set F finds the support vectors and determines the parameters α and b. There is no need to explicitly calculate the new feature ϕ(x), since a kernel function is employed in training and forecasting. The most widely used kernel is the radial basis function (RBF) with width σ, as shown in Eq. (19):
A grid-search and cross-validation process is employed to get the optimal model, and the parameter grid consists of penalty C = {0.1, 1, 2, 5, 10} and kernel parameter gamma = {0.01, 0.1, 0.2, 0.5, 0.8}.
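The RBF kernel of Eq. (19) and the SVR decision function of Eq. (18) can be sketched as below; the exact normalization of σ in the paper's Eq. (19) is assumed here:

```python
import numpy as np

def rbf_kernel(x1, x2, sigma):
    """RBF kernel: K(x1, x2) = exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def svr_predict(x, support_vectors, alphas, b, sigma):
    """SVR decision function: f(x) = sum_k alpha_k * K(x_k, x) + b."""
    return sum(a * rbf_kernel(xk, x, sigma)
               for a, xk in zip(alphas, support_vectors)) + b
```

Because the kernel replaces the explicit feature map ϕ(x), prediction only ever touches kernel evaluations against the support vectors, which is the point made in the paragraph above.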
4.2.2 Recurrent neural networks
Recurrent neural networks (RNN) [51] are widely employed in stock market predictions [11]. An RNN is a type of neural network in which the connections between the calculating units form a directed cycle. The same task is performed for every element in a sequence, and the output depends on the previous calculations.
In our RNN model, the input value of the tth day xt = (xt, 1, ⋯, xt, m) is iterated over the following equations,
where ht is the hidden state which is calculated based on the previous hidden state ht − 1 and the input xt at the current time step. ot is the predicted output value which refers to the closing price in this study. U, W and V are the input-to-hidden, hidden-to-hidden and hidden-to-output parameters respectively.
A grid-search and cross-validation process is also employed, and the parameter grid consists of dropout rate d = {0.1, 0.35, 0.5} and batch size b = {10, 100, 200, 400}.
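One step of the recurrence above can be sketched as follows; tanh is assumed as the hidden activation, with a linear output for the regression target:

```python
import numpy as np

def rnn_step(x_t, h_prev, U, W, V, b_h, b_o):
    """One recurrence step: h_t from h_{t-1} and x_t, o_t as the
    predicted closing price."""
    h_t = np.tanh(U @ x_t + W @ h_prev + b_h)  # hidden state update
    o_t = V @ h_t + b_o                        # linear output for regression
    return h_t, o_t

# Zero-initialized parameters give a deterministic toy example
# (input size 3, hidden size 2, scalar output).
U, W, V = np.zeros((2, 3)), np.zeros((2, 2)), np.zeros((1, 2))
h1, o1 = rnn_step(np.ones(3), np.zeros(2), U, W, V, np.zeros(2), np.zeros(1))
```

Unrolling this step over a window of trading days and backpropagating through the chain of h_t is what exposes the vanishing-gradient problem that motivates the LSTM in Section 3.3.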
4.3 ALSTM setup
In the experiment, three advanced methods are applied for ALSTM training. First, we use root mean square propagation (RMSprop) [76], a mini-batch version of rprop, as the optimizer, since it is “usually a good choice for recurrent neural networks” [13]. The initial learning rate is set to 0.001, as recommended in the default settings. A higher initial learning rate can reduce the time required for model optimization at an early stage, but it makes reaching the optimum more difficult and restricts model performance. Accordingly, a lower initial learning rate leads to more training epochs but a better optimum. Therefore, a decay mechanism is adopted that halves the learning rate when the loss does not decrease for 5 consecutive iterations, in order to obtain the optimal model.
Second, an early-stop mechanism is employed to stop the training process automatically and further reduce the risk of overfitting. The maximum number of training epochs is set to 1000. When the training loss cannot be reduced after several rounds of iterations, further training is no longer necessary; when the loss does not decrease for 20 consecutive epochs, the model with the lowest loss is saved and assumed to have the best generalization ability.
Third, a grid-search and cross-validation process is also employed. The grid consists of two hyperparameters, each with several candidate values:
-
Dropout rate = {0.1, 0.35, 0.5}: The dropout rate of dropout layers.
-
Batch size = {10, 100, 200, 400}: The number of samples selected for training at a time.
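The learning-rate decay and early-stopping rules described above can be paraphrased as a plain control loop (the paper’s setup corresponds to Keras-style `ReduceLROnPlateau` and `EarlyStopping` callbacks; this sketch only reproduces the logic, not the actual training):

```python
def train_schedule(losses, lr=0.001, decay_patience=5, stop_patience=20):
    """Given a per-epoch loss sequence, apply the two rules from the text:
    halve lr after every 5 non-improving epochs, stop after 20 consecutive
    non-improving epochs, and keep the best epoch seen so far."""
    best, best_epoch, since_best = float("inf"), -1, 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, best_epoch, since_best = loss, epoch, 0
        else:
            since_best += 1
            if since_best % decay_patience == 0:
                lr *= 0.5                    # learning-rate decay on plateau
            if since_best >= stop_patience:  # early stop, restore best model
                break
    return best_epoch, best, lr

# a loss curve that improves for 3 epochs and then plateaus
losses = [1.0, 0.8, 0.6] + [0.7] * 30
epoch, best, lr = train_schedule(losses)
```

With this curve, training halts after 20 plateau epochs, having halved the learning rate four times, and the epoch-2 model (loss 0.6) is kept as the best.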
4.4 BERT setup
In this study, the pre-trained language model BERT-base, which contains 12 Transformer blocks, 12 self-attention heads and a hidden size of 768, is employed as the encoder. BERT maps the input sequence to a sequence representation. A special token [CLS], which contains the classification embedding, is always placed at the beginning of the sentence. In sentiment classification tasks, the whole sequence is represented by the final hidden state h of this first token. A softmax layer is employed to predict the output probability of label c:
where W means the task-specific parameter matrix. Parameters are fine-tuned by maximizing the probability of the correct label.
The task-specific parameters are randomly initialized; most hyperparameters remain the same as in pre-training, except for the batch size and learning rate. To avoid overfitting, a dropout rate of 0.1 is always applied to the dense layer. For model training, we use the Adam [36] optimizer, the number of epochs is set to 3, and the maximum sequence length is set to 32. Since the optimal parameter values are usually task-specific, we employ a grid search to find them. The following candidate values are found to work well across all tasks:
-
Batch size = {16, 32}
-
Learning rate = {5e-5, 3e-5, 2e-5}
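The classification head in the equation above — the [CLS] hidden state h projected by a task-specific matrix W and passed through softmax — can be sketched in NumPy. Shapes follow BERT-base (hidden size 768) and the three sentiment classes; the weights here are random placeholders, not fine-tuned parameters:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(h, W):
    """Return p(c | h) = softmax(W h) over the sentiment labels."""
    return softmax(W @ h)

rng = np.random.default_rng(2)
h = rng.normal(size=768)             # final hidden state of the [CLS] token
W = rng.normal(scale=0.02, size=(3, 768))  # task-specific parameter matrix
p = classify(h, W)                   # probabilities over 3 sentiment classes
```

Fine-tuning maximizes the probability p[c] of the correct label c, i.e. minimizes the cross-entropy −log p[c].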
In this study, 100,000 GuBa posts are selected for fine-tuning the model: 90% are used for fine-tuning to find the best parameter set, and the remaining 10% for evaluation.
4.5 Experiment setup
We conduct a large number of comparative experiments on 28 selected stocks based on the ALSTM networks to evaluate predictive performance; SVR and RNN are used as baseline models. The dataset spans 18 November 2016 to 18 November 2019. The data from 18 November 2016 to 1 June 2019 (about 85% of the data) are used for training, with cross-validation to select the optimal hyperparameters, and the data from 1 June 2019 to 18 November 2019 (the last 15%) are used for testing to evaluate out-of-sample performance.
Following Ratto et al. [65], we adopt the “walk forward testing” method in the cross-validation process. To make maximal use of the available data, an increasing window is used to run a 5-fold time-split cross-validation: the first k folds of the time series are used for training and the (k + 1)th fold for validation. The cross-validation process is shown in Fig. 3.
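This expanding-window 5-fold split corresponds to scikit-learn’s `TimeSeriesSplit` (our mapping of the description in Fig. 3; the 30-day array is illustrative):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(30).reshape(-1, 1)      # 30 ordered trading days
tscv = TimeSeriesSplit(n_splits=5)
folds = list(tscv.split(X))

# each validation fold follows all of its training data in time,
# and the training window grows ("walk forward") from fold to fold
for train_idx, val_idx in folds:
    assert train_idx.max() < val_idx.min()
```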
To analyse the performance of each model, RMSE, MAE and accuracy are used as evaluation metrics. The RMSE and MAE provide standard error measures and are widely used in model evaluation. Accuracy is employed to evaluate whether the direction of the predicted price movement is consistent with the real one.
Given a set of time series observation values and the corresponding predictions, RMSE and MAE are defined as follows,
where rt + 1 and \( {\hat{r}}_{t+1} \) denote the actual and predicted closing prices at time t + 1, respectively. RMSE is used as the evaluation metric when searching for the best parameter set of each model. Each transaction date is assigned a label (up or down) by comparing the closing prices of two consecutive days. Accuracy is calculated by comparing the real trend with the predicted one, defined as follows,
where:
-
tu: the number of samples correctly classified as uptrend.
-
td: the number of samples correctly classified as downtrend.
-
fu: the number of samples incorrectly classified as uptrend.
-
fd: the number of samples incorrectly classified as downtrend.
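The three metrics can be written out in NumPy. The directional labels are derived by comparing consecutive closing prices as stated above; comparing the predicted close against the previous day’s real close is one common convention and is an assumption here:

```python
import numpy as np

def rmse(y, yhat):
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs(y - yhat)))

def direction_accuracy(actual, predicted):
    """Accuracy = (tu + td) / (tu + td + fu + fd): fraction of days on
    which the predicted close moves the same way as the real close,
    relative to the previous day's real close (assumed convention)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    real_move = np.sign(np.diff(actual))
    pred_move = np.sign(predicted[1:] - actual[:-1])
    return float(np.mean(real_move == pred_move))
```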
The purpose of this study is to employ the stock prices, technical indicators and GuBa sentiments of day t to predict the closing price of day t + 1. For the RNN and ALSTM models, we also combine the past N days’ features for training, where N is 3, 5, 7, 10, 15 or 30. This series of comparative experiments is designed to learn the sequential information and discover the best input window length for stock price prediction. We represent the input data as a matrix of feature vectors, defined as:
This matrix means that each training input contains N days of stock data, and each day consists of n features. The sequence information of the historical N trading days is thereby modelled and fed to the model. As shown in Fig. 4, a sliding time window is applied to obtain the features and labels; the window moves forward one step at a time until the end of the time series. Finally, by learning the historical data of the previous N days, the closing price of day N + 1 is predicted.
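The sliding-window construction of (N-day feature matrix, next-day close) training pairs can be sketched as follows (shapes and the toy arrays are illustrative):

```python
import numpy as np

def make_windows(features, closes, N):
    """features: (T, n) daily feature matrix; closes: (T,) closing prices.
    Returns X of shape (T-N, N, n) and y of shape (T-N,): each sample is
    N consecutive days of features, labelled with day N+1's close."""
    features, closes = np.asarray(features), np.asarray(closes)
    X = np.stack([features[t:t + N] for t in range(len(features) - N)])
    y = closes[N:]                 # window slides forward one day at a time
    return X, y

feats = np.arange(20, dtype=float).reshape(10, 2)   # 10 days, 2 features
close = np.arange(10, dtype=float)
X, y = make_windows(feats, close, N=5)
```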
4.6 Experiment results
The comparison of sentiment classification accuracy between the sentiment lexicon and the fine-tuned BERT is shown in Table 7, where we report the overall accuracy and the accuracy in predicting positive, negative and neutral posts. Our BERT-based sentiment classification method achieves better performance in predicting all three sentiment tendencies on the test set. Its classification accuracy reaches 85.9% on the test set, 22.0% higher than that of the sentiment lexicon method.
Table 8 tabulates the cross-validation results based on the different sentiment classification methods. The smallest RMSE score is marked in bold font. The results show that the ALSTM model has the best performance in most cases under both sentiment classification methods. Among all 28 stocks, RNN obtains the best fitting results for only 1 stock using fine-tuned BERT sentiment classification and for 9 stocks using the sentiment lexicon. SVR performs the worst among the three models.
The results of the test set are shown in Table 9 for fine-tuned BERT and Table 10 for sentiment lexicon respectively. The smallest MAE and RMSE scores for each stock are marked in bold font, and the highest accuracy score is underlined.
Based on fine-tuned BERT (Table 9), the ALSTM model outperforms the baselines on 21 stocks under the MAE, 20 stocks under the RMSE and 23 stocks under the accuracy. The RNN has the best performance on 7 stocks under the MAE, 8 stocks under the RMSE and 4 stocks under the accuracy. The SVR has the best performance on 1 stock under the accuracy. It is clear that the ALSTM model outperforms the RNN and the SVR (64:15:1). Based on the sentiment lexicon (Table 10), the ALSTM outperforms the baselines on 20 stocks under the MAE, 19 stocks under the RMSE and 21 stocks under the accuracy. The RNN has the best performance on 8 stocks under the MAE, 9 stocks under the RMSE and 1 stock under the accuracy. The SVR has the best performance on 6 stocks under the accuracy. In summary, the ALSTM outperforms the RNN and the SVR (60:18:6). Comparing the results across the two sentiment classification methods, the ALSTM obtains the best performance, the RNN the second best, and the SVR the worst.
The average accuracies of the 28 stocks using different input window lengths are given in Table 11 for easy comparison. When the input window length is set to 5 days, the ALSTM model using the fine-tuned BERT sentiment classification method achieves the highest accuracy: the average accuracy over the 28 stocks reaches 61.24%.
4.7 Discussions on experimental results
4.7.1 The effectiveness of integrating sentiments
We use Δs to represent the change in accuracy between the results with and without the sentiment features, in order to assess the effectiveness of integrating sentiments into stock prediction. Δs is calculated by,
where Accall represents the accuracy of the ALSTM model using both prices and sentiments, and Accp the accuracy using price data only. The improvements under the two sentiment classification methods are shown in Fig. 5. Combining price data and sentiments clearly outperforms using price data alone for most stocks. On further comparison, most of the improvements brought by the sentiment lexicon are under 15%, whereas the fine-tuned BERT method improves prediction accuracy to a greater extent, with some improvements exceeding 15%.
4.7.2 The effectiveness of using multiple information sources
To verify whether multiple information sources improve predictive performance, or whether the sentiment information alone is sufficient and the additional statistical measures are unnecessary, we use Δp to evaluate the difference in accuracy between the ALSTM models with and without price data. Δp is calculated by,
where Accs represents the predicting accuracy based on sentiment feature only. The results of Δp are shown in Table 12. It is clear that using multiple information sources outperforms using sentiment source only in all cases.
4.7.3 The effectiveness of using long input window length
To investigate whether increasing the input window length helps the models extract more time series information and improve predictive performance, we employ ΔT to represent the change in accuracy between N time steps and 1 time step, where N is 3, 5, 7, 10, 15 or 30. ΔT is calculated by,
where AccN is the average accuracy over the 28 stocks when the input window length is set to N, and Acc1 is the average accuracy when N = 1. The changes are shown in Table 13. It can be observed that using 5-day time series data as model input improves the average accuracy of all proposed models.
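The three Δ metrics (Δs, Δp, ΔT) share one form: the change in accuracy between two model variants. As a sketch, we express them as relative changes in percent — an assumption on our part, since the equations are not reproduced here, and absolute differences would be coded the same way; the input accuracies below are hypothetical:

```python
def delta(acc_with, acc_without):
    """Relative accuracy change between two model variants, in percent
    (assumed definition; covers Δs, Δp and ΔT alike)."""
    return 100.0 * (acc_with - acc_without) / acc_without

delta_s = delta(0.6124, 0.55)   # hypothetical Acc_all vs Acc_p
```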
5 Conclusions and future work
Stock price prediction is an important aspect of formulating a low-risk, high-return investment. This study focuses on an increasingly significant aspect of financial market research: how to integrate investor sentiments from social media and make the model better able to learn time series information. To address this problem, we use the GuBa dataset of 28 stocks from November 18, 2016 to November 18, 2019 for stock price movement prediction with SVR, RNN and ALSTM models. We propose a fine-tuned BERT sentiment classification model for sentiment analysis, with a sentiment lexicon based on NTUSD for comparison. MAE, RMSE and accuracy are employed to evaluate predictive performance. Furthermore, we evaluate the improvements brought by different input window lengths. The results show that:
-
1.
Based on multiple information sources, the ALSTM model performs better than the SVR and the RNN under the MAE, RMSE and accuracy.
-
2.
Based on the ALSTM, using multiple information sources improves prediction accuracy compared with using either stock price data or sentiments alone.
-
3.
The fine-tuned BERT model achieves higher accuracy in the sentiment classification task, and the sentiment features it computes also lead to better predictive performance.
-
4.
Combining the 5-day features into a long sequential input for the models achieves the best prediction accuracy.
Furthermore, several future avenues are available for this study. Social media sentiment is the only sentiment resource considered here, yet news data are also widely used in stock price prediction, as news is an important source of information about wider economic conditions. Moreover, only historical prices, technical indicators and social media sentiments are employed in this study. Considering the complex and volatile stock market environment, future prediction models could extract information from other useful sources, for example a company’s financial condition as reflected in its financial statements and balance sheet, to make more comprehensive predictions. Finally, a more advanced hyper-parameter selection scheme could be employed in future experiments.
Data availability
The datasets analysed during the current study are not publicly available due to data privacy policy but are available from the corresponding author on reasonable request.
References
Anjaria M, Guddeti RMR (2014) A novel sentiment analysis of social networks using supervised learning. Soc Netw Anal Min 4(1):181. https://doi.org/10.1007/s13278-014-0181-9
Antweiler W, Frank M (2004) Is all that talk just noise? The information content of internet stock message boards. J Financ 59:1259–1294. https://doi.org/10.2139/ssrn.282320
Baek Y, Kim HY (2018) ModAugNet: A new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module. Expert Syst Appl 113:457–480. https://doi.org/10.1016/j.eswa.2018.07.019
Baker M, Wurgler J (2006) Investor sentiment and the cross-section of stock returns. J Financ 61(4):1645–1680. https://doi.org/10.1111/j.1540-6261.2006.00885.x
Ballings M, Van den Poel D, Hespeels N, Gryp R (2015) Evaluating multiple classifiers for stock price direction prediction. Expert Syst Appl 42(20):7046–7056. https://doi.org/10.1016/j.eswa.2015.05.013
Bollen J, Mao H, Zeng X (2011) Twitter mood predicts the stock market. J Comput Sci 2(1):1–8. https://doi.org/10.1016/j.jocs.2010.12.007
Cambria E, Fu J, Bisio F, Poria S (2015) AffectiveSpace 2: enabling affective intuition for concept-level sentiment analysis. Proc AAAI 29:508–514
Cavalcante RC, Brasileiro RC, Souza VLF, Nobrega JP, Oliveira ALI (2016) Computational intelligence and financial markets: A survey and future directions. Expert Syst Appl 55:194–211. https://doi.org/10.1016/j.eswa.2016.02.006
Chandra R, Chand S (2016) Evaluation of co-evolutionary neural network architectures for time series prediction with mobile application in finance. Appl Soft Comput 49:462–473. https://doi.org/10.1016/j.asoc.2016.08.029
Checkley MS, Higón DA, Alles H (2017) The hasty wisdom of the mob: how market sentiment predicts stock market behavior. Expert Syst Appl 77:256–263. https://doi.org/10.1016/j.eswa.2017.01.029
Chen W, Yeo CK, Lau CT, Lee BS (2018) Leveraging social media news to predict stock index movement using RNN-boost. Data Knowl Eng 118:14–24. https://doi.org/10.1016/j.datak.2018.08.003
Chen M-Y, Liao C-H, Hsieh R-P (2019) Modeling public mood and emotion: stock market trend prediction with anticipatory computing approach. Comput Hum Behav 101:402–408. https://doi.org/10.1016/j.chb.2019.03.021
Chollet F (2016) Keras. https://github.com/keras-team/keras. Accessed 13 Feb 2023
Cortes C, Vapnik V (1995) Support vector network. Mach Learn 20:273–297. https://doi.org/10.1007/BF00994018
Oliveira FA, Zárate LE, de Azebedo Reis M, Nobre CN (2011) The use of artificial neural networks in the analysis and prediction of stock prices. 2011 IEEE international conference on systems, man, and cybernetics, pp 2151–215., https://doi.org/10.1109/ICSMC.2011.6083990
Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. https://doi.org/10.48550/arXiv.1810.04805
Eapen J, Bein D, Verma A (2019) Novel deep learning model with CNN and bi-directional LSTM for improved stock market index prediction. 2019 IEEE 9th annual computing and communication workshop and conference (CCWC), 0264-0270, https://doi.org/10.1109/CCWC.2019.8666592
Fama EF (1991) Efficient capital markets: II. J Financ 46(5):1575–1617. https://doi.org/10.1111/j.1540-6261.1991.tb04636.x
Faraji-Rad A, Pham M (2016) Uncertainty increases the reliance on affect in decisions. SSRN Electron J 44. https://doi.org/10.2139/ssrn.2715333
Fischer T, Krauss C (2018) Deep learning with long short-term memory networks for financial market predictions. Eur J Oper Res 270(2):654–669. https://doi.org/10.1016/j.ejor.2017.11.054
Gerlein EA, McGinnity M, Belatreche A, Coleman S (2016) Evaluating machine learning classification for financial trading: an empirical approach. Expert Syst Appl 54:193–207. https://doi.org/10.1016/j.eswa.2016.01.018
Giles C, Lawrence S (2001) Noisy time series prediction using recurrent neural networks and grammatical inference. Mach Learn 44:161–183. https://doi.org/10.1023/A:1010884214864
Gradojevic N, Lento C, Wright C (2007) Investment information content in Bollinger bands? Appl Financ Econ Lett 3:263–267. https://doi.org/10.1080/17446540701206576
Gunasekarage A, Power DM (2001) The profitability of moving average trading rules in south Asian stock markets. Emerg Mark Rev 2(1):17–33. https://doi.org/10.1016/S1566-0141(00)00017-0
Güreşen E, Kayakutlu G, Daim T (2011) Using artificial neural network models in stock market index prediction. Expert Syst Appl 38:10389–10397. https://doi.org/10.1016/j.eswa.2011.02.068
Harb JGD, Ebeling R, Becker K (2020) A framework to analyze the emotional reactions to mass violent events on twitter and influential factors. Inf Process Manag 57(6):102372. https://doi.org/10.1016/j.ipm.2020.102372
Henrique BM, Sobreiro VA, Kimura H (2018) Stock price prediction using support vector regression on daily and up to the minute prices. J Finance Data Sci 4(3):183–201. https://doi.org/10.1016/j.jfds.2018.04.003
Hiransha M, Gopalakrishnan EA, Menon VK, Soman KP (2018) NSE stock market prediction using deep-learning models. Procedia Comput Sci 132:1351–1362. https://doi.org/10.1016/j.procs.2018.05.050
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Hung C (2017) Word of mouth quality classification based on contextual sentiment lexicons. Inf Process Manag 53(4):751–763. https://doi.org/10.1016/j.ipm.2017.02.007
Junqué de Fortuny E, De Smedt T, Martens D, Daelemans W (2014) Evaluating and understanding text-based stock price prediction models. Inf Process Manag 50(2):426–441. https://doi.org/10.1016/j.ipm.2013.12.002
Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 137–146. https://doi.org/10.1145/956750.956769
Kim S-H, Kim D (2014) Investor sentiment from internet message postings and the predictability of stock returns. J Econ Behav Organ 107:708–729. https://doi.org/10.1016/j.jebo.2014.04.015
Kim T, Kim H (2019) Forecasting stock prices with a feature fusion LSTM-CNN model using different representations of the same data. PLoS One 14:e0212320. https://doi.org/10.1371/journal.pone.0212320
Kim HY, Won CH (2018) Forecasting the volatility of stock price index: A hybrid model integrating LSTM with multiple GARCH-type models. Expert Syst Appl 103:25–37. https://doi.org/10.1016/j.eswa.2018.03.002
Kingma D, Ba J (2014) Adam: A method for stochastic optimization. International Conference on Learning Representations
Klinker F (2011) Exponential moving average versus moving exponential average. Math Semesterber 58:97–107. https://doi.org/10.1007/s00591-010-0080-8
Kumar S, Kumar K (2018) IRSC: integrated automated review mining system using virtual Machines in Cloud environment. 2018 Conference on Information and Communication Technology (CICT), 1–6, https://doi.org/10.1109/INFOCOMTECH.2018.8722387
Kumar K, Kurhekar M (2017) Sentimentalizer: Docker container utility over cloud. 2017 ninth international conference on advances in pattern recognition (ICAPR), 1–6, https://doi.org/10.1109/ICAPR.2017.8593104
Kumar K, Bamrara R, Gupta P, Singh N (2020) M2P2: Movie’s trailer reviews based movie popularity prediction system. In: Soft Computing: Theories and Applications, pp 671–681. https://doi.org/10.1007/978-981-15-0751-9_62
Kumar A, Purohit K, Kumar K (2021) Stock Price prediction using recurrent neural network and Long short-term memory. Conference proceedings of ICDLAIR2019, 153-160
Lee C, Soo V (2017) Predict stock Price with financial news based on recurrent convolutional neural networks. 2017 Conference on Technologies and Applications of Artificial Intelligence (TAAI), 160–165, https://doi.org/10.1109/TAAI.2017.27
Lee C, Swaminathan B (1999) Price momentum and trading volume. J Financ 55. https://doi.org/10.2139/ssrn.92589
Lee S, Ha T, Lee D, Kim JH (2018) Understanding the majority opinion formation process in online environments: an exploratory approach to Facebook. Inf Process Manag 54(6):1115–1128. https://doi.org/10.1016/j.ipm.2018.08.002
Li X, Xie H, Chen L, Wang J, Deng X (2014) News impact on stock price return via sentiment analysis. Knowl-Based Syst 69:14–23. https://doi.org/10.1016/j.knosys.2014.04.022
Li B, Chan KCC, Ou C, Ruifeng S (2017) Discovering public sentiment in social media for predicting stock movement of publicly listed companies. Inf Syst 69:81–92. https://doi.org/10.1016/j.is.2016.10.001
Li X, Wu P, Wang W (2020) Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong. Inf Process Manag 57(5):102212. https://doi.org/10.1016/j.ipm.2020.102212
Li Y, Bu H, Li J, Wu J (2020) The role of text-extracted investor sentiment in Chinese stock price prediction with the enhancement of deep learning. Int J Forecast 36(4):1541–1562. https://doi.org/10.1016/j.ijforecast.2020.05.001
Long J, Chen Z, He W, Wu T, Ren J (2020) An integrated framework of deep learning and knowledge graph for prediction of stock price trend: an application in Chinese stock exchange market. Appl Soft Comput 91:106205. https://doi.org/10.1016/j.asoc.2020.106205
Loughran TIM, McDonald B (2011) When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J Financ 66(1):35–65. https://doi.org/10.1111/j.1540-6261.2010.01625.x
Mandic D, Chambers J (2001) Recurrent neural networks for Prediction: Learning Algorithms,Architectures and Stability. https://doi.org/10.1002/047084535X
Maqsood H, Mehmood I, Maqsood M, Yasir M, Afzal S, Aadil F, Selim MM, Muhammad K (2020) A local and global event sentiment based efficient stock exchange forecasting using deep learning. Int J Inf Manag 50:432–451. https://doi.org/10.1016/j.ijinfomgt.2019.07.011
Mourelatos M, Alexakos C, Amorgianiotis T, Likothanassis S (2018) Financial indices modelling and trading utilizing deep learning techniques: the ATHENS SE FTSE/ASE large cap use case. 2018 Innovations in Intelligent Systems and Applications (INISTA), 1–7, https://doi.org/10.1109/INISTA.2018.8466286
Nelson DMQ, Pereira ACM, Oliveira RAD (2017) Stock market's price movement prediction with LSTM neural networks. International Joint Conference on Neural Networks.
Nguyen TH, Shirai K, Velcin J (2015) Sentiment analysis on social media for stock movement prediction. Expert Syst Appl 42(24):9603–9611. https://doi.org/10.1016/j.eswa.2015.07.052
Oh C, Sheng O (2011) Investigating predictive Power of stock Micro blog sentiment in forecasting future stock Price directional movement. Proceedings of the international conference on information systems, ICIS 2011, Shanghai, China, December 4–7, 2011
Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2:1–135. https://doi.org/10.1561/1500000011
Pang X, Zhou Y, Wang P, Lin W, Chang V (2018) An innovative neural network approach for stock market prediction. J Supercomput 76:2098–2118. https://doi.org/10.1007/s11227-017-2228-y
Patel J, Shah S, Thakkar P, Kotecha K (2015) Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Syst Appl 42(1):259–268. https://doi.org/10.1016/j.eswa.2014.07.040
Peng Y, Jiang H (2015) Leverage financial news to predict stock Price movements using word Embeddings and deep neural networks
Picasso A, Merello S, Ma Y, Oneto L, Cambria E (2019) Technical analysis and sentiment embeddings for market trend prediction. Expert Syst Appl 135:60–70. https://doi.org/10.1016/j.eswa.2019.06.014
Qian B, Rasheed K (2007) Stock market prediction with multiple classifiers. Appl Intell 26:25–33. https://doi.org/10.1007/s10489-006-0001-7
Qian Y, Li Z, Yuan H (2020) On exploring the impact of users’ bullish-bearish tendencies in online community on the stock market. Inf Process Manag 57(5):102209. https://doi.org/10.1016/j.ipm.2020.102209
Qu H, Zhang Y (2016) A new kernel of support vector regression for forecasting high-frequency stock returns. Math Probl Eng 2016:1–9. https://doi.org/10.1155/2016/4907654
Ratto AP, Merello S, Oneto L, Ma Y, Cambria E (2018) Ensemble of Technical Analysis and Machine Learning for market trend prediction. 2018 IEEE symposium series on computational intelligence (SSCI)
Rezaei H, Faaljou H, Mansourfar G (2020) Stock price prediction using deep learning and frequency decomposition. Expert Syst Appl 114332:114332. https://doi.org/10.1016/j.eswa.2020.114332
Schumaker RP, Chen H (2009) A quantitative stock prediction system based on financial news. Inf Process Manag 45(5):571–583. https://doi.org/10.1016/j.ipm.2009.05.001
Schumaker RP, Chen H (2009) Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM J Trans Inf Syst 27(2):12. https://doi.org/10.1145/1462198.1462204
Sehgal V, Song C (2007) SOPS: stock prediction using web sentiment. Seventh IEEE international conference on data mining workshops (ICDMW 2007), 21–26, https://doi.org/10.1109/ICDMW.2007.100
Sharma S, Kumar P, Kumar K (2017) LEXER: LEXicon based emotion AnalyzeR. Pattern recognition and machine intelligence, pp 373–379. https://doi.org/10.1007/978-3-319-69900-4_47
Sharpe M, Walczak S (2001) An empirical analysis of data requirements for financial forecasting with neural networks. J Manag Inf Syst 17
Shynkevich Y, McGinnity TM, Coleman S, Belatreche A, Li Y (2017) Forecasting Price movements using technical Indicators: Investigating the Impact of Varying Input Window Length. Neurocomputing 264:71–88. https://doi.org/10.1016/j.neucom.2016.11.095
Si J, Mukherjee A, Liu B, Li Q, Deng X (2013) Exploiting Topic based Twitter Sentiment for Stock Prediction. ACL 2013
Sun C, Qiu X, Xu Y, Huang X (2019) How to fine-tune BERT for text classification? China National Conference on Chinese Computational Linguistics
Taylor M, Allen H (1992) The use of technical analysis in the foreign exchange market. J Int Money Financ 11:304–314. https://doi.org/10.1016/0261-5606(92)90048-3
Tieleman T, Hinton GE, Srivastava N, Swersky K (2012) Lecture 6.5-rmsprop: divide the gradient by a running average of its recent magnitude. Neural Networks for Machine Learning, COURSERA
Tsibouris G, Zeidenberg M (1995) Testing the efficient market hypothesis with gradient descent algorithms. Neural Netw Capital Markets 8:127–136
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Proceedings of the 31st international conference on neural information processing systems, 6000–6010
Verma I, Dey L, Meisheri H (2017) Detecting, Quantifying and Accessing Impact of News Events on Indian Stock Indices. https://doi.org/10.1145/3106426.3106482
Vijayvergia A, Kumar K (2018) STAR: rating of reviewS by exploiting variation in emoTions using trAnsfer leaRning framework. 2018 Conference on Information and Communication Technology (CICT), 1–6, https://doi.org/10.1109/INFOCOMTECH.2018.8722356
Vijayvergia A, Kumar K (2021) Selective shallow models strength integration for emotion detection using GloVe and LSTM. Multimed Tools Appl 80(18):28349–28363. https://doi.org/10.1007/s11042-021-10997-8
Vu TIT, Chang S (2012) An experiment in integrating sentiment features for tech stock prediction in twitter. Workshop on Information Extraction & Entity Analytics on Social Media Data
Wang Q, Xu W, Zheng H (2018) Combining the wisdom of crowds and technical analysis for financial market prediction using deep random subspace ensembles. Neurocomputing 299:51–61. https://doi.org/10.1016/j.neucom.2018.02.095
Xing F, Cambria E, Welsch R (2018) Intelligent asset allocation via market sentiment views. IEEE Comput Intell Mag 13:25–34. https://doi.org/10.1109/MCI.2018.2866727
Yeh C-Y, Huang C-W, Lee S-J (2011) A multiple-kernel support vector regression approach for stock market price forecasting. Expert Syst Appl 38(3):2177–2186. https://doi.org/10.1016/j.eswa.2010.08.004
Yong BX, Abdul Rahim MR, Abdullah AS (2017) A stock market trading system using deep neural network. In: Modeling, Design and Simulation of Systems, pp 356–364. https://doi.org/10.1007/978-981-10-6463-0_31
Yu JH, Kang J, Park S (2019) Information availability and return volatility in the bitcoin market: analyzing differences of user opinion and interest. Inf Process Manag 56(3):721–732. https://doi.org/10.1016/j.ipm.2018.12.002
Zhang X, Tan Y (2018) Deep stock ranker: A LSTM neural network model for stock selection. In (pp. 614-623). https://doi.org/10.1007/978-3-319-93803-5_58
Zhang L, Aggarwal C, Qi G-J (2017) Stock Price prediction via discovering multi-frequency trading patterns. Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, 2141–2149. https://doi.org/10.1145/3097983.3098117
Zhang X, Zhang Y, Wang S, Yao Y, Fang B, Yu PS (2018) Improving stock market prediction via heterogeneous information fusion. Knowl-Based Syst 143:236–247. https://doi.org/10.1016/j.knosys.2017.12.025
Zhang Y, Chu G, Shen D (2020) The role of investor attention in predicting stock prices: the long short-term memory networks perspective. Financ Res Lett 101484. https://doi.org/10.1016/j.frl.2020.101484
Zhang YA, Yan B, Aasma M (2020) A novel deep learning framework: prediction and analysis of financial time series using CEEMD and LSTM. Expert Syst Appl 159:113609. https://doi.org/10.1016/j.eswa.2020.113609
Zuo Y, Kita E (2012) Stock price forecast using Bayesian network. Expert Syst Appl 39(8):6729–6737. https://doi.org/10.1016/j.eswa.2011.12.035
Zuo Y, Kita E (2012) Up/down analysis of stock index by using Bayesian network. Eng Manag Res 1. https://doi.org/10.5539/emr.v1n2p46
Acknowledgements
This paper was supported by the National Natural Science Foundation of China (project numbers 72274096, 72174087, 71774084 and 71874082), the National Social Science Fund of China (project number 17ZDA291), and the Program for Jiangsu Excellent Scientific and Technological Innovation Team (project number [2020]10).
Author information
Authors and Affiliations
Contributions
Zhongtian Ji: Conceptualization, Methodology, Investigation, Writing - original draft. Peng Wu: Project administration, Supervision, Writing - review & editing, Funding acquisition. Chen Ling: Formal analysis, Writing - review & editing, Data curation. Peng Zhu: Writing - review & editing.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ji, Z., Wu, P., Ling, C. et al. Exploring the impact of investor’s sentiment tendency in varying input window length for stock price prediction. Multimed Tools Appl 82, 27415–27449 (2023). https://doi.org/10.1007/s11042-023-14587-8