1 Introduction

The stock market allows organizations to raise additional funds for their business by selling shares of ownership in the company. By issuing shares and bonds, public companies can raise idle funds to relieve temporary capital scarcity. For private investors, stocks are among the most easily available financial products; by holding suitable stocks and managing a portfolio appropriately, personal assets can appreciate to some degree. Stock trading is also the leading business for securities companies and investment banks. The price movements of the shares in a market are captured in price indices called stock market indices, e.g., the FTSE and the S&P. An equally weighted price index is calculated by summing the prices of a set of companies and dividing by a divisor:

$$ \text{Equally weighted price index} = \frac{\text{Sum of the prices of } N \text{ companies}}{\text{divisor}} $$
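The index formula can be illustrated with a minimal sketch. The function name, the sample prices, and the choice of the divisor are assumptions made for illustration only; real indices adjust the divisor over time for events such as stock splits.

```python
def equally_weighted_price_index(prices, divisor):
    """Sum of the constituent prices divided by the index divisor."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return sum(prices) / divisor


# Hypothetical closing prices for N = 4 companies (illustrative values only).
prices = [120.0, 80.0, 45.5, 154.5]

# With the divisor set to N, the index reduces to the plain average price.
index_value = equally_weighted_price_index(prices, divisor=len(prices))
print(index_value)
```

Setting the divisor equal to the number of companies gives the simple average described in the text; any other positive divisor only rescales the index.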

Stock price prediction has been an exciting and busy research domain for a long time. New technologies play an important part and make prediction more exciting and more accurate. Even though many studies have attempted to predict stock prices, no reasonably perfect model has been developed to predict the stock market. Nowadays, individuals and groups alike look to the stock market to gain the maximum return while minimising risk and financial loss.

Many things affect the stock market, but the most important factors fall into three categories: fundamental factors, technical factors, and market sentiment. Social media posts, tweets, comments, and news on the Internet regarding an organisation can have an impact on its stock price. Opinion mining, or sentiment analysis, is used to extract information from these sentiments, which helps researchers forecast the stock market to gain higher returns [1].

In the last few decades, various tools and technologies have been used to gauge the mood of the market before any movement in stock prices. Investment banks, hedge funds, and insurance companies spend considerable time and effort predicting changes in stocks. Facebook and Twitter are social media platforms used around the globe, and both are used for sentiment analysis. Sentiment analysis is the process of finding the mood of a sentence, which can be positive, negative, or neutral. It is used to derive the opinion or sentiment of the speaker and, more generally, to determine how a person feels about a particular subject or topic.
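As a rough illustration of labelling a sentence as positive, negative, or neutral, the sketch below counts hits against a tiny hand-made word list. The lexicon and the function name `sentence_mood` are assumptions for illustration; practical systems rely on much larger lexicons or trained classifiers.

```python
import re

# Tiny illustrative lexicon (an assumption, not a real sentiment resource).
POSITIVE = {"gain", "gains", "profit", "growth", "strong", "beat", "up"}
NEGATIVE = {"loss", "losses", "weak", "fall", "falls", "miss", "down"}


def sentence_mood(sentence):
    """Label a sentence positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentence_mood("Strong quarterly profit and growth"))
print(sentence_mood("Shares fall after weak guidance"))
```

A sentence with no lexicon words, or with equally many positive and negative words, is labelled neutral.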

The efficient market hypothesis (EMH) is an investment theory stating that it is difficult to "beat the market", because market efficiency causes current stock prices to incorporate and reflect all relevant information available in the market. Social media sentiment is one such source of information, which is why it is studied to understand the stock market better. The EMH exists in three forms [2]:

  • Weak EMH: only past data is taken into account.

  • Semi-Strong EMH: all publicly available information is used.

  • Strong EMH: both publicly and privately available information is used.

Random walk theory assumes that stock prices cannot be predicted because they do not depend on past stock prices. It also holds that stock prices fluctuate so strongly that prices in the stock market cannot be forecast.
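The random walk assumption can be sketched with a toy simulation in which each price change is an independent up or down step. The step size, seed, and function names are illustrative assumptions; under this model the most recent price is the best available forecast of the next one.

```python
import random


def simulate_random_walk(start, n_steps, step=1.0, seed=0):
    """Generate a price path whose changes are independent +/- step moves."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-step, step)))
    return path


def naive_forecast(path):
    """Under a random walk, the forecast for the next price is the last price."""
    return path[-1]


path = simulate_random_walk(start=100.0, n_steps=50)
print(path[-5:], naive_forecast(path))
```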

2 Related Work

Chen and Lazer [3] investigated the connection between stock market movements and Twitter feeds. They used both classification and regression strategies to predict the movement of the stock market and, after investigation, identified several features affecting the sentiment values. Zhang et al. [4] examined the relationship between Twitter sentiment and stock prices, analysing the results of three machine learning methods and their effectiveness on the correlation. Adebiyi Ayodele et al. [5] proposed a hybridized approach to prediction that combines fundamental and technical analysis.

Bollen et al. [6] also analysed the behaviour of the stock market on the basis of the mood of people on Twitter. Twitter text was analysed over a specific period to find changes in mood. They found that the accuracy of Dow Jones Industrial Average (DJIA) predictions improved when specific public mood states were used.

Rechenthin et al. [7] showed that daily price direction can be predicted by combining the sentiment of a popular stock message board with historical price movements. Various authors have used a variety of supervised learning algorithms to collect sentiment information and artificial neural networks (ANN) to show that the markets were predictable.

3 Stock Market Prediction Approaches

Stock market prediction has two conventional approaches: fundamental analysis and technical analysis.

3.1 Fundamental Analysis

The performance and profitability of an organization can be analysed to measure its intrinsic value. The intrinsic value can be determined by studying the organization's tangible characteristics, such as its sales, staffing quality, infrastructure, and return on investment. Fundamental analysis uses factors such as sales revenues, return on equity, profit margins, and future growth. It is well suited to long-term investment and growth. This approach is advantageous because it is systematic and can anticipate changes before they show on the charts.

3.2 Technical Analysis

Technical analysis is another method used to evaluate stocks, by analysing statistics generated by market activity, such as past prices and volume. It covers significant aspects of stock price movement such as trends, patterns, peak price, and lowest price [8]. Forecasting a stock price often depends on the past behaviour of the stock and its correlated variables. Technical analysis looks for patterns and indicators on stock charts to determine the future performance of the stock. However, this approach is sometimes criticized for its highly subjective nature (Table 1).
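One common family of technical indicators is the moving average. The sketch below is only an illustration of deriving a signal from past prices: the window lengths, the sample prices, and the buy/sell/hold rule are assumptions, not a recommended trading strategy.

```python
def simple_moving_average(prices, window):
    """Return the moving-average series; entries before a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / window)
    return averages


def trend_signals(prices, short=2, long=3):
    """Signal 'buy' when the short average is above the long one, 'sell' when below."""
    short_ma = simple_moving_average(prices, short)
    long_ma = simple_moving_average(prices, long)
    signals = []
    for s, l in zip(short_ma, long_ma):
        if s is None or l is None or s == l:
            signals.append("hold")
        elif s > l:
            signals.append("buy")
        else:
            signals.append("sell")
    return signals


# Hypothetical daily closing prices.
closes = [10, 11, 12, 13, 12, 10]
print(trend_signals(closes))
```

The short average reacts faster to recent prices than the long average, so the comparison between the two is a simple proxy for the current trend direction.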

Table 1 Analysis between technical and fundamental approaches based on different parameters [9, 10]

Table 1 compares fundamental and technical analysis. The results show that technical analysis approaches are most effective for short- to medium-term trades, whereas the fundamental analysis approach is better suited to long-term investments (Fig. 1).

Fig. 1 Stock market prediction using machine learning algorithms

4 Machine Learning Algorithms Used

4.1 Regression Analysis

Regression analysis is used for stock market prediction, including non-linear methods for situations where the relationship between the outcome and the predictor variables is not linear. Equations are fitted to the market variables, and these equations are then used as the predictive model to estimate the dependent variable during the forecast period.
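As a minimal sketch of a non-linear regression, the example below fits an exponential curve y = a * exp(b * x) by ordinary least squares on log(y). The choice of an exponential form, the function names, and the synthetic data are assumptions for illustration; the surveyed studies may use other functional forms.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on the log-transformed targets."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    logs = [math.log(y) for y in ys]  # requires strictly positive y values
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


def predict(a, b, x):
    """Evaluate the fitted curve at x."""
    return a * math.exp(b * x)


# Synthetic prices growing by roughly 5% per period (illustrative data only).
days = [0, 1, 2, 3, 4]
closes = [100 * math.exp(0.05 * d) for d in days]
a, b = fit_exponential(days, closes)
print(a, b, predict(a, b, 5))
```

The fitted equation is then evaluated at a future value of the predictor, which corresponds to using the regression equation as the predictive model over the forecast period.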

4.2 Hidden Markov Model

The hidden Markov model (HMM) is another model used for predicting the future of the stock market. It analyses hidden state variables to predict future outputs and future states.
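The idea of inferring hidden states and then predicting future states and outputs can be sketched with the forward (filtering) recursion of a discrete HMM. The two hidden states ("bull", "bear"), the observation symbols ("up", "down"), and all probabilities below are hypothetical values chosen for illustration.

```python
def filter_states(observations, states, start_p, trans_p, emit_p):
    """Posterior over the current hidden state given the observations so far."""
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[p] * trans_p[p][s] for p in states)
            for s in states
        }
    total = sum(alpha.values())
    return {s: value / total for s, value in alpha.items()}


def predict_next(posterior, states, trans_p, emit_p, symbols):
    """One-step-ahead distributions over the next hidden state and next output."""
    next_state = {s: sum(posterior[p] * trans_p[p][s] for p in states) for s in states}
    next_obs = {o: sum(next_state[s] * emit_p[s][o] for s in states) for o in symbols}
    return next_state, next_obs


# Hypothetical model parameters (assumptions, not estimated from data).
STATES = ("bull", "bear")
SYMBOLS = ("up", "down")
START = {"bull": 0.5, "bear": 0.5}
TRANS = {"bull": {"bull": 0.8, "bear": 0.2}, "bear": {"bull": 0.3, "bear": 0.7}}
EMIT = {"bull": {"up": 0.7, "down": 0.3}, "bear": {"up": 0.2, "down": 0.8}}

posterior = filter_states(["up"], STATES, START, TRANS, EMIT)
next_state, next_obs = predict_next(posterior, STATES, TRANS, EMIT, SYMBOLS)
print(posterior, next_state, next_obs)
```

In practice the transition and emission probabilities would be estimated from historical data rather than set by hand.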

4.3 Artificial Neural Network

An ANN is a data-driven model. It is a network of computational entities (neurons) that uses parallel distributed computing to predict prices. An ANN model can be trained on correct data and applied to tasks such as regression and classification [11]. Neural networks can tackle a problem without prior knowledge of the relationship between input and output, so they are also called self-adjusting methods. A generic ANN with input, processing, and output units is shown in the figure below (Fig. 2).

Fig. 2 Seven-input one-output ANN representation (Smith 1997) [12]
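A forward pass through a seven-input, one-output network can be sketched as follows. The single hidden layer of three sigmoid units, the linear output unit, and the weight values are assumptions for illustration, since the layout of the processing units is not specified in the text.

```python
import math


def sigmoid(z):
    """Logistic activation used by the hidden (processing) units."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Compute the single output of a one-hidden-layer network."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Linear output unit, as is common for regression targets such as prices.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Seven hypothetical input features and hand-picked (untrained) weights.
features = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2, 0.1]
hidden_weights = [[0.1] * 7, [-0.2] * 7, [0.05] * 7]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [0.5, -0.3, 0.8]
print(forward(features, hidden_weights, hidden_biases, output_weights, 0.0))
```

Training would adjust the weights and biases to reduce prediction error on labelled data; only the prediction step is shown here.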

4.4 Naive Bayesian Classifier

Naïve Bayes is a classification technique which generates Bayesian networks for a given dataset based on Bayes' theorem. The naïve Bayesian classifier is a supervised learning algorithm, that is, a kind of machine learning in which supervision comes from labelled examples in the training dataset [13]. It is also known as learning with the help of a teacher, because the class labels are already defined. The naïve Bayesian classifier is one of the most widely used techniques for data classification. Classification is a two-step process. The first step is learning, where the training set is analyzed to construct the model. The second is the classification step, in which class labels are predicted for the dataset according to the model.
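The two steps, learning and classification, can be sketched for categorical features as follows. The feature names, the toy training rows, and the use of add-one (Laplace) smoothing are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Learning step: count classes and per-class feature values."""
    class_counts = Counter(labels)
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def classify_naive_bayes(model, row):
    """Classification step: pick the class with the highest log posterior."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[label][i][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy labelled examples: (sentiment, recent trend) -> next-day direction.
ROWS = [("positive", "rising"), ("positive", "falling"), ("negative", "falling"),
        ("negative", "rising"), ("positive", "rising"), ("negative", "falling")]
LABELS = ["up", "up", "down", "down", "up", "down"]

model = train_naive_bayes(ROWS, LABELS)
print(classify_naive_bayes(model, ("positive", "rising")))
```

The "naïve" part is the assumption that features are independent given the class, which is why the per-feature log probabilities are simply added.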

4.5 Decision Tree Classifier

Decision trees are one of the simplest ways to perform classification. In this classifier, each terminal node denotes a class label, each internal node denotes a test on an attribute, and each branch represents an outcome of that test. These classifiers are very popular because they can handle multidimensional data and do not require any domain knowledge.
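The structure described here can be sketched with a small hand-built tree. The attributes, outcomes, and class labels are hypothetical, and the tree is written by hand rather than learned from data.

```python
# Internal node: {"attribute": name, "branches": {outcome: subtree}}.
# Terminal node: a plain string holding the class label.
TREE = {
    "attribute": "sentiment",
    "branches": {
        "positive": {
            "attribute": "trend",
            "branches": {"rising": "buy", "falling": "hold"},
        },
        "neutral": "hold",
        "negative": "sell",
    },
}


def classify_with_tree(tree, sample):
    """Follow the branch matching each attribute test until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        outcome = sample[node["attribute"]]
        node = node["branches"][outcome]
    return node


print(classify_with_tree(TREE, {"sentiment": "positive", "trend": "rising"}))
```

A learning algorithm would choose the attribute tested at each internal node automatically, for example by an impurity measure; only the classification walk is shown here.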

4.6 Random Forest

The random forest algorithm is a supervised machine learning algorithm. It is very flexible and easy to use because it can be applied to both regression and classification tasks. The forest it builds is an ensemble of decision trees, most of the time trained with the bagging method. In this algorithm, the trees do not need to be pruned, and the ensemble is not very sensitive to outliers in the training data.
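Bagging can be sketched with a deliberately simplified ensemble. Here each "tree" is a depth-one decision stump trained on a bootstrap sample, and predictions are combined by majority vote. This is a simplification: a full random forest grows deeper trees and also samples a random subset of features at each split. The data and function names are assumptions for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Majority vote over the individual stump predictions."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data: the first feature separates the classes, the second is noise.
ROWS = [(1.0, 5.0), (2.0, 4.0), (3.0, 6.0), (8.0, 5.0), (9.0, 4.0), (10.0, 6.0)]
LABELS = ["down", "down", "down", "up", "up", "up"]

forest = fit_forest(ROWS, LABELS)
print(predict_forest(forest, (0.5, 5.0)), predict_forest(forest, (9.5, 5.0)))
```

Because each stump sees a different resample of the data, an individual stump may be poor, but the vote across the ensemble is more stable.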

5 Stock Market Analysis

After considering all the algorithms and methods involved in the prediction of stock prices, the efficiency and accuracy of prediction are analysed. We found that the Long Short-Term Memory (LSTM) neural network consistently produced better results than other methods such as K-Nearest Neighbor (KNN), the Naive Bayesian classifier, regression, and Support Vector Machines (SVM).

5.1 Comparative Analysis of Various Prediction Techniques

In Table 2, we compare the different analysis methods used to predict the stock market on criteria such as dataset, training methods, tools, and algorithm used. Models proposed by various authors, based on different algorithms, are also analysed with respect to dataset and accuracy level (Tables 2, 3 and Figs. 3, 4).

Table 2 Comparative analysis between technical analysis, fundamental analysis, time series analysis, and machine learning techniques on different criteria
Table 3 Proposed models by authors with respective datasets and accuracy results obtained
Fig. 3 Authors’ results and accuracy graph

Fig. 4 Algorithms and accuracy results graph

6 Conclusion

The prediction of the stock market is a complex and challenging task. Many authors have suggested different prediction approaches for the stock market, such as non-linear regression, the Naïve Bayes classifier, the random forest method, artificial neural networks, and support vector machines. We studied fundamental and technical analysis and compared them. We also carried out a comparative analysis of the various algorithms used for predicting future stock market prices and found that the Long Short-Term Memory neural network (LSTM NN) produces better results than the other techniques. In the future, this research should help to improve the efficiency and accuracy of prediction. Deep learning classifiers will also be analysed in future work to predict the stock market and gain the maximum benefit from it.