Introduction

Often referred to as a cryptocurrency, Bitcoin is a decentralized digital currency. It was first presented in a 2008 whitepaper by an unidentified individual or group under the pseudonym Satoshi Nakamoto, and it was released as open-source software in 2009. Bitcoin allows money to be transferred over a peer-to-peer network without the involvement of a middleman such as a bank. Rather than relying on a trusted third party, it relies on cryptographic techniques to ensure the safety and privacy of its users. Markets play a significant role in the economic development of any country. Hence, volatility in the value of a cryptocurrency can cause considerable stress both to investors and to people who want to use it as money. As Bitcoin is increasingly used in trade, predicting its value has become an important task. However, prediction is difficult because the price depends on a large number of factors, many of which are still not well understood. Many researchers are therefore actively working on this problem. Market prediction methods span many machine learning and deep learning techniques, including pattern recognition, statistical approaches, and hybrid models.

Statistical techniques, which use past data to identify patterns and trends, are crucial in forecasting the price of Bitcoin. Time series analysis is a statistical technique that is frequently used to forecast future values by analyzing past price movements and related factors. Many researchers have used machine learning algorithms that include regression [1]. Data on Bitcoin prices are regularly analyzed using a variety of statistical models, including the auto-regressive integrated moving average (ARIMA) [2]. To generate accurate forecasts, these models consider variables such as trading volume, market sentiment, and past price swings. Furthermore, statistical indicators like Bollinger Bands, the Relative Strength Index (RSI), and moving averages are frequently used to pinpoint possible entry or exit opportunities. The price of Bitcoin can also be predicted in part with pattern recognition techniques, which use algorithms to find recurrent patterns and trends in the price history. Pattern recognition for Bitcoin price prediction is a popular use case for machine learning techniques, especially those in the supervised learning category. These techniques look at various factors, such as past price movements, trading volume, and market sentiment, to identify trends that might presage price changes.

In this paper, we build a model based on a Bidirectional Long Short-Term Memory (BiLSTM) network that predicts the price of Bitcoin for the upcoming hour. Accurate and timely information is essential for traders and investors to make informed decisions in the fast-paced and dynamic world of cryptocurrency trading. A BiLSTM model processes each input sequence in both the forward and backward directions, which enables a more sophisticated analysis of dependencies in the price data. This ability may reveal subtle patterns and trends that could signal short-term price fluctuations. Given the large swings in the price of Bitcoin and its vulnerability to sudden changes in the market, traders can benefit greatly from precise forecasts for the upcoming hour, which can help them execute buy or sell orders at the right time. Moreover, sentiment analysis is significant because it sheds light on the emotional and psychological state of market participants. It is critical to understand the general sentiment surrounding Bitcoin in the erratic and speculative realm of cryptocurrency, where market moves are frequently influenced by opinions and feelings. Positive attitudes, which express confidence and optimism, might indicate future upward trends, particularly if they are supported by favorable regulatory developments or growing institutional involvement. On the other hand, unfavorable attitudes stemming from worries about security or unclear regulations may serve as precursors to future declines. Sentiment analysis enables traders and investors to assess market sentiment in real time and to adjust quickly to shifting market conditions.

Here, the dataset we used contained almost 6 million records, which made it difficult to read in one pass. Hence, we first divided the data into chunks of 1 million records and read the CSV file chunk by chunk. Next, we removed all duplicate records and all unnecessary details present in the data. After data cleaning, we calculated the sentiment score because, when given as an input, it improves the accuracy of the model and reduces its loss. After pre-processing, we passed the data to the network for further processing.

The main contributions of our proposed methods are as follows:

  1. A novel hourly sentiment analysis module for the social media platform X (formerly known as Twitter) has been designed.

  2. A bidirectional Long Short-Term Memory (LSTM) network has been designed that incorporates sentiment analysis and Bitcoin price history.

  3. Extensive experiments conducted on actual Bitcoin prices suggest extremely close predictions of Bitcoin prices.

The rest of the paper is organized as follows: Sect. “Literature Review” discusses recent state-of-the-art work, Sect. “Data Preprocessing” describes the data preprocessing methods, Sect. “Methodology” covers the adopted methodology including the model design, Sect. “Simulation and Result” analyzes the simulated results together with ablation studies, and finally Sect. “Conclusion” concludes the work (Fig. 1).

Fig. 1 Various methods for prediction of Bitcoin

Literature Review

This section reviews previous work and contributions in this area. Many machine learning algorithms have been applied to this problem and have produced significant results. V. S. Pagolu et al. [3] suggest a method in which public opinions in tweets are analyzed using two distinct textual representations: Word2vec and N-gram. The authors examined the relationship between a company’s stock market movements and the sentiments in tweets by applying supervised machine learning methods and sentiment analysis to tweets retrieved from Twitter. A. Roslan et al. [4] use sentiment analysis software that assigns a score to each tweet. MATLAB’s Artificial Neural Network serves as the central component of their prediction approach, allowing them to train on the data and forecast the stock price.

To predict the closing price of stocks, K. Zhang and G. Zhong [5] suggest a Generative Adversarial Network (GAN) architecture that uses a Long Short-Term Memory (LSTM) network as the generator and a Multi-Layer Perceptron (MLP) as the discriminator. While the MLP discriminator seeks to distinguish between generated and actual stock data, the LSTM generator learns the distribution of the provided stock market data and generates data from the same distribution. H. M, G. E.A., and V. K. Menon [6] employ four different types of deep learning architectures: Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN). S. Liu et al. [7] designed an RNN/LSTM-based model because of its strong ability to forecast market activity. M. H. Bin Mohd Sabri et al. [8] built a deep learning LSTM model for training and predicting the future price of Bitcoin. Sentiment analysis proves to be beneficial in establishing a relationship between the Bitcoin price and the sentiments of investors.

Many new techniques have since emerged for this prediction task. G. Serafini et al. [9] examine the predictive ability of network sentiment, as well as statistical and deep learning techniques. They contrast two models used to forecast Bitcoin time series: the Auto-Regressive Integrated Moving Average with eXogenous input (ARIMAX) and the Recurrent Neural Network (RNN). S. Oikonomopoulos et al. [10] proposed methods using sentiment analysis of social media. They performed the analysis using methods such as VADER and then examined the time series using the ADF, KPSS, and Granger causality tests. Many researchers, such as B. Sonare et al. [11], also use various machine learning algorithms for prediction, including linear regression [12], random forest, gradient boosting [13], Support Vector Regression (SVR), and Ridge regression [14]. K. Chakravarty et al. [15] proposed a method based on data analysis, in which trends and behavioral patterns are predicted using predictive analysis. The predictive model is used to investigate how a comparable unit obtained from several samples performs uniquely. N. Srivastava et al. [16] used Natural Language Processing (NLP) libraries, namely VADER, FinBERT, and TextBlob, to generate various scores and used a few characteristics of Bitcoin, such as Aroon indicators, for their prediction, while others correlate the Bitcoin price with Google search patterns [17]. S. Ranjan et al. [18] made predictions at a five-minute interval. They used both essential elements of trade and a collection of high-dimensional features, and later applied a linear regression method to measure the accuracy of the model.

J. V. Critien et al. [19] leverage both the volume of tweets and the sentiment gleaned from them. To determine the ideal time frame at which the expressed sentiment becomes a trustworthy predictor of price change, they report the findings of experiments investigating the relationship between sentiment and the future price at various temporal granularities. Rajabi et al. [20] also provided a dataset including market statistics for Bitcoin, Blockchain, and Google. Based on the Prediction Hardship Factor (PHF), a recently proposed criterion to characterize the level of prediction difficulty, evaluations have demonstrated that their method outperforms well-known techniques like Support Vector Regression and ARIMA in obtaining the minimum error under typical conditions. B. Sonare et al. [11] perform stock price prediction using LSTM, followed by algorithms such as Linear Regression, Decision Tree, Recurrent Neural Network, and Support Vector Machine.

Moritz Wilksch et al. [21] collected a dataset of 10,000 tweets and designed and deployed a machine learning model that outperformed most of the other models. Their pipeline followed basic steps such as data collection, cleaning, and preprocessing, followed by machine learning and deep learning modeling on the data. A. J. Toriola et al. [22] proposed a model in which they compared the accuracy of ARIMA and LSTM. Intensity and polarity were measured using VADER sentiments. Results showed that LSTM performed better than ARIMA. Z. Chen et al. [23] also used various statistical and machine learning methods, such as regression and linear discriminant analysis, for daily price prediction, and their model reached an accuracy of up to 65.5 percent.

Data Preprocessing

The formal conversion and flow of data are briefly described in this section. The steps followed are displayed in Fig. 2.

Fig. 2 Data Preprocessing

Data Reading

The dataset contains more than 6 million records. Because a file of this size is impractical to load in a single pass, the data were read in smaller chunks. The tweets about Bitcoin are read from a CSV file in parts, each chunk containing 100,000 rows. These chunks are then concatenated into a single DataFrame, denoted f. The main objective of the following steps is to clean and preprocess the dataset. First, the drop_duplicates method is used to eliminate duplicate rows from f. By ensuring that every tweet in the dataset is distinct, this step prevents redundancy in the subsequent analysis. Next, the to_datetime function is used to convert the ‘date’ column to datetime format, which is essential for handling temporal data meaningfully and for enabling a time-based analysis of the tweets. Finally, rows with invalid datetime values, possibly caused by parsing errors during conversion, are removed with the dropna method, and the index is reset.
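The reading step can be sketched as follows. This is a minimal illustration, assuming pandas; the function name load_tweets and the small in-memory sample are hypothetical and are not taken from the original code.

```python
import io
import pandas as pd

def load_tweets(csv_source, chunk_size=100_000):
    """Read a large tweet CSV in chunks and return one cleaned DataFrame."""
    # read_csv with chunksize yields DataFrames of at most chunk_size rows.
    chunks = pd.read_csv(csv_source, chunksize=chunk_size)
    f = pd.concat(chunks, ignore_index=True)
    # Keep each tweet only once.
    f = f.drop_duplicates()
    # Unparseable dates become NaT instead of raising an error.
    f["date"] = pd.to_datetime(f["date"], errors="coerce")
    # Drop rows whose date could not be parsed, then renumber the rows.
    return f.dropna(subset=["date"]).reset_index(drop=True)

# Tiny illustrative sample: one duplicate row and one row with a broken date.
sample_csv = io.StringIO(
    "date,text\n"
    "2022-01-01 10:05:00,Bitcoin is up\n"
    "2022-01-01 10:05:00,Bitcoin is up\n"
    "not a date,broken row\n"
    "2022-01-01 11:30:00,Selling my btc\n"
)
tweets = load_tweets(sample_csv, chunk_size=2)
print(tweets)
```

With the sample above, the duplicate and the row with the unparseable date are both discarded, leaving two tweets.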

Data Cleaning

To preprocess and clean the textual data from tweets, a function clean_tweet is defined. The function applies several regular-expression substitutions to an input tweet. In particular, it replaces the many hashtag variants of Bitcoin, such as “#bitcoin”, “#Bitcoin”, and “#btc”, with a consistent “bitcoin” representation. It also removes all other hashtags, newline characters, hyperlinks (URLs), non-alphanumeric characters, and mentions (usernames that begin with ‘@’). This cleaning procedure standardizes the representation of Bitcoin-related terms and removes unnecessary information so that the analysis is more targeted. The function is then applied, using the apply method, to the ‘text’ column of the DataFrame f that contains the tweets about Bitcoin. The cleaned column can then be used for natural language processing tasks such as sentiment analysis or topic modeling, where clear and consistent text is essential for meaningful results.
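A cleaning function of this kind might look like the sketch below. The exact regular expressions of the original code are not given in the text, so the patterns here are assumptions chosen to match the description.

```python
import re
import pandas as pd

def clean_tweet(tweet: str) -> str:
    """Normalize Bitcoin hashtags and strip noise from a tweet (illustrative patterns)."""
    # Map Bitcoin hashtag variants (#bitcoin, #Bitcoin, #btc, ...) to one token.
    tweet = re.sub(r"#(bitcoin|btc)\b", "bitcoin", tweet, flags=re.IGNORECASE)
    # Remove hyperlinks.
    tweet = re.sub(r"https?://\S+|www\.\S+", " ", tweet)
    # Remove mentions such as @user.
    tweet = re.sub(r"@\w+", " ", tweet)
    # Remove all remaining hashtags.
    tweet = re.sub(r"#\w+", " ", tweet)
    # Remove characters that are neither alphanumeric nor whitespace.
    tweet = re.sub(r"[^A-Za-z0-9\s]", " ", tweet)
    # Collapse newlines and repeated spaces.
    return re.sub(r"\s+", " ", tweet).strip()

f = pd.DataFrame({"text": ["#Bitcoin to the moon! @elon https://t.co/abc #crypto\n#BTC"]})
f["text"] = f["text"].apply(clean_tweet)
print(f["text"].iloc[0])  # bitcoin to the moon bitcoin
```

The Bitcoin hashtags are replaced before the generic hashtag rule runs; otherwise they would be deleted along with every other hashtag.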

Calculation of Sentimental Score

This step computes a sentiment score for each tweet in the DataFrame of Bitcoin-related tweets by applying a lambda function. The score is the ‘compound’ polarity returned by the polarity_scores method. The ‘compound’ score, which ranges from -1 (most negative) to 1 (most positive), represents the overall sentiment of a given text. It gives a numerical representation of each tweet’s sentiment that supports the subsequent analysis.
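The scoring step can be illustrated as below. The description resembles the interface of lexicon-based analyzers such as VADER, but the text does not name the library, so the polarity_scores function here is a toy stand-in with a hypothetical six-word lexicon and an illustrative squashing constant; only the lambda-based application to the DataFrame mirrors the described procedure.

```python
import math
import pandas as pd

# Hypothetical toy lexicon; a real analyzer ships a far larger one.
TOY_LEXICON = {"good": 1.9, "great": 3.1, "bullish": 2.0,
               "bad": -2.5, "crash": -3.0, "scam": -3.2}

def polarity_scores(text: str) -> dict:
    """Toy stand-in returning a dict with a 'compound' score in (-1, 1)."""
    total = sum(TOY_LEXICON.get(word, 0.0) for word in text.lower().split())
    # Squash the raw sum into (-1, 1); the constant 15 is an illustrative choice.
    compound = total / math.sqrt(total * total + 15)
    return {"compound": round(compound, 4)}

f = pd.DataFrame({"text": ["bitcoin looks great and bullish",
                           "bitcoin crash is bad",
                           "bitcoin price today"]})
# One compound score per tweet, computed with a lambda as described in the text.
f["sentiment"] = f["text"].apply(lambda t: polarity_scores(t)["compound"])
print(f)
```

In practice the toy function would be replaced by the analyzer's own scoring call, leaving the apply line unchanged.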

Hourly Sentiment Analysis

This step aggregates the sentiment scores calculated for Bitcoin-related tweets at an hourly granularity. The tweet timestamps are first reduced to hourly resolution in a new ‘Datetime’ column, and the sentiment scores are grouped by hour using this column. The mean method is then used to determine the mean sentiment score for each hour after grouping the DataFrame on the ‘Datetime’ column. The final step resets the index of the resulting DataFrame, which yields a condensed table of the average sentiment score for every hourly interval. This procedure makes it possible to examine, in a consolidated manner, how sentiment in the dataset changes from hour to hour.
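The hourly aggregation can be sketched as follows, assuming pandas. Flooring the timestamp to the hour is one plausible way to build the ‘Datetime’ column; the text does not state how the column was derived, and the sample values are illustrative.

```python
import pandas as pd

# Illustrative tweet-level sentiment scores.
f = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01 10:05", "2022-01-01 10:40", "2022-01-01 12:15"]),
    "sentiment": [0.5, -0.1, 0.3],
})

# Reduce each timestamp to the start of its hour (assumed construction of 'Datetime').
f["Datetime"] = f["date"].dt.floor("h")

# Mean sentiment per hour, with 'Datetime' restored as an ordinary column.
hourly_sentiment = f.groupby("Datetime")["sentiment"].mean().reset_index()
print(hourly_sentiment)
```

Hours without any tweet, such as 11:00 in this sample, do not appear in the result, which is why a separate step for missing data is needed.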

Fig. 3 Visualization of data after pre-processing

Handling Missing Data

The goal of this step is to produce a complete and uninterrupted hourly timeline of sentiment scores obtained from tweets on Bitcoin. First, the minimum and maximum values of the ‘Datetime’ column in the hourly_sentiment DataFrame are used to produce the entire date range, which is constructed at hourly frequency (freq=‘H’). This date range is then stored in a single column called ‘Datetime’ of a new DataFrame called hourly_sentiment_complete. That DataFrame is merged with the original hourly_sentiment DataFrame, using the ‘Datetime’ column as the common key, so that any hours missing from the original dataset appear as rows with missing sentiment values.

Using the interpolate method, linear interpolation fills in the missing sentiment values in the hourly_sentiment_complete DataFrame. This guarantees a smooth transition and a consistent sentiment score for all hours. The consolidated, continuous series is then saved to a CSV file for further analysis or visualization, and a print statement confirms that the Bitcoin tweet sentiment data were saved successfully.
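The gap-filling procedure can be sketched as follows, assuming pandas. The sample values are illustrative, and an in-memory buffer stands in for the output CSV file. Recent pandas versions prefer the lowercase alias "h" for hourly frequency, while older versions use "H" as quoted in the text.

```python
import io
import pandas as pd

# Illustrative hourly scores with 12:00 and 13:00 missing.
hourly_sentiment = pd.DataFrame({
    "Datetime": pd.to_datetime(["2022-01-01 10:00", "2022-01-01 11:00", "2022-01-01 14:00"]),
    "sentiment": [0.2, 0.4, 1.0],
})

# Full hourly range between the first and last observed hour.
full_range = pd.date_range(hourly_sentiment["Datetime"].min(),
                           hourly_sentiment["Datetime"].max(), freq="h")
hourly_sentiment_complete = pd.DataFrame({"Datetime": full_range})

# A left merge keeps every hour; hours without tweets get NaN sentiment.
hourly_sentiment_complete = hourly_sentiment_complete.merge(
    hourly_sentiment, on="Datetime", how="left")

# Linear interpolation fills the gaps between known neighbours.
hourly_sentiment_complete["sentiment"] = (
    hourly_sentiment_complete["sentiment"].interpolate(method="linear"))

# Save the continuous series (a buffer is used here instead of a file on disk).
buffer = io.StringIO()
hourly_sentiment_complete.to_csv(buffer, index=False)
print("Saved", len(hourly_sentiment_complete), "hourly rows")
```

In this sample the two missing hours receive the values 0.6 and 0.8, which lie on the straight line between the neighbouring scores 0.4 and 1.0.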

Getting Bitcoin Price Based on Tweet Sentiments

In this step, we established the time range for retrieving historical prices from the sentiment analysis of Bitcoin tweets. The time frame was selected to capture the historical prices of Bitcoin for the latest ninety days of sentiment data. A function is then defined that calls an API to obtain historical Bitcoin price information. The parameters of the API request are the currency, the start and end timestamps (‘from’ and ‘to’) generated from the calculated dates, and optional parameters such as the preferred interval for the historical data (‘interval’). The data after pre-processing are visualized in Fig. 3.
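The construction of the request parameters can be sketched as below. The text does not identify the API, so no endpoint is called here; the function name build_price_request, the ‘currency’ key, and the default values are hypothetical, while ‘from’, ‘to’, and ‘interval’ follow the parameter names given above.

```python
from datetime import datetime, timedelta, timezone

def build_price_request(last_sentiment_time, currency="usd", days=90, interval="hourly"):
    """Build query parameters for a historical-price request (hypothetical names)."""
    # The window ends at the last sentiment timestamp and spans the given number of days.
    start = last_sentiment_time - timedelta(days=days)
    return {
        "currency": currency,
        "from": int(start.timestamp()),              # Unix time in seconds
        "to": int(last_sentiment_time.timestamp()),  # Unix time in seconds
        "interval": interval,
    }

# Illustrative end of the sentiment data.
latest = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
params = build_price_request(latest)
print(params)
```

The resulting dictionary would be passed as the query string of whichever price API is used.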

Methodology

Fig. 4 Deep Learning-based modified Bidirectional LSTM model for Bitcoin price prediction

Sentiment Analysis Algorithm

The following steps were performed during data preprocessing:

Algorithm 1 Algorithm for Twitter Sentiment Analysis

Network Architecture

Input LSTM Layer

The input to the network is processed by an LSTM layer (Fig. 4). If we denote \({\textbf{a}}_t\) as input at time step t, LSTM updates its hidden state \({\textbf{h}}_t\) and cell state \({\textbf{c}}_t\) as follows:

$$\begin{aligned} \begin{aligned}&{\textbf{f}}_t = \sigma ({\mathcal {W}}_f \cdot [{\textbf{h}}_{t-1}, {\textbf{a}}_t] + {\textbf{b}}_f) \\&{\textbf{i}}_t = \sigma ({\mathcal {W}}_i \cdot [{\textbf{h}}_{t-1}, {\textbf{a}}_t] + {\textbf{b}}_i) \\&\tilde{{\textbf{c}}}_t = \tanh ({\mathcal {W}}_c \cdot [{\textbf{h}}_{t-1}, {\textbf{a}}_t] + {\textbf{b}}_c) \\&{\textbf{o}}_t = \sigma ({\mathcal {W}}_o \cdot [{\textbf{h}}_{t-1}, {\textbf{a}}_t] + {\textbf{b}}_o) \\&{\textbf{c}}_t = {\textbf{f}}_t \odot {\textbf{c}}_{t-1} + {\textbf{i}}_t \odot \tilde{{\textbf{c}}}_t \\&{\textbf{h}}_t = {\textbf{o}}_t \odot \tanh ({\textbf{c}}_t) \end{aligned} \end{aligned}$$
(1)

where \(\sigma\) is the sigmoid function, \(\odot\) denotes element-wise multiplication, \({\mathcal {W}}\) and \({\textbf{b}}\) are the weight matrices and bias vectors, and \([{\textbf{h}}_{t-1}, {\textbf{a}}_t]\) denotes the concatenation of \({\textbf{h}}_{t-1}\) and \({\textbf{a}}_t\).
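Equation (1) can be written out as a single update step in NumPy. This is a minimal sketch: the hidden size, the two input features (taken here to be price and sentiment), and the random weights are illustrative assumptions, not the trained parameters of the model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(a_t, h_prev, c_prev, W, b):
    """One LSTM update following Eq. (1); W and b are dicts keyed by gate name."""
    z = np.concatenate([h_prev, a_t])        # [h_{t-1}, a_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    c_t = f_t * c_prev + i_t * c_tilde       # new cell state
    h_t = o_t * np.tanh(c_t)                 # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 2, 4                        # illustrative sizes
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}

# Run the cell over a short random sequence of five time steps.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for a_t in rng.normal(size=(5, n_in)):
    h, c = lstm_step(a_t, h, c, W, b)
print(h)
```

Each entry of the hidden state is the product of a sigmoid and a tanh value, so it always lies strictly between -1 and 1.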

Dropout Layer

After the LSTM layer, a Dropout layer is applied. This layer randomly sets a fraction p of input units to 0 at each update during training time, which helps prevent overfitting.
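The effect of the Dropout layer can be illustrated with a few lines of NumPy. The sketch uses the common "inverted dropout" convention, in which the surviving units are rescaled by 1/(1 - p) during training; the text does not state which convention or which value of p was used, so both are assumptions.

```python
import numpy as np

def dropout(x, p, training, rng):
    """Zero a fraction p of the inputs during training and rescale the rest."""
    if not training or p == 0.0:
        return x                       # dropout is inactive at prediction time
    keep_mask = rng.random(x.shape) >= p
    return x * keep_mask / (1.0 - p)   # rescaling keeps the expected value unchanged

rng = np.random.default_rng(0)
x = np.ones(1000)
train_out = dropout(x, p=0.2, training=True, rng=rng)
test_out = dropout(x, p=0.2, training=False, rng=rng)
print((train_out == 0).mean())         # fraction of dropped units, close to 0.2
```

At prediction time the layer passes its input through unchanged, so the regularization only affects training.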

Bidirectional LSTM

The output from the Dropout layer is then passed to a Bidirectional LSTM layer. This layer processes the input sequence in both forward and backward directions. The outputs from both directions are concatenated, resulting in a shape of (None, 59, 236). The forward and backward LSTM equations are the same as those for the Input LSTM but with different weight matrices and bias vectors.
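The bidirectional pass and the resulting output shape can be sketched in NumPy as follows. The 59 time steps and the output width of 236 come from the stated shape; the split into 118 units per direction is inferred from 236 / 2, and the input width of 8 and the random weights are purely illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_lstm(seq, W, b):
    """Run an LSTM over seq (T, n_in) and return all hidden states (T, n_hidden).
    W stacks the four gate matrices row-wise in the order f, i, c, o."""
    n_hidden = b.shape[0] // 4
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    outputs = []
    for a_t in seq:
        gates = W @ np.concatenate([h, a_t]) + b
        f, i, g, o = np.split(gates, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def init_params(n_in, n_hidden, rng):
    W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_in))
    return W, np.zeros(4 * n_hidden)

rng = np.random.default_rng(0)
T, n_in, n_hidden = 59, 8, 118         # n_in and n_hidden are assumptions
seq = rng.normal(size=(T, n_in))

# Separate parameters for the forward and backward directions.
W_fwd, b_fwd = init_params(n_in, n_hidden, rng)
W_bwd, b_bwd = init_params(n_in, n_hidden, rng)

forward = run_lstm(seq, W_fwd, b_fwd)
# Process the reversed sequence, then flip the result back into time order.
backward = run_lstm(seq[::-1], W_bwd, b_bwd)[::-1]

bi_output = np.concatenate([forward, backward], axis=-1)
print(bi_output.shape)  # (59, 236)
```

The leading None in the stated shape is the batch dimension, which is omitted here because a single sequence is processed.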

Dense Layer

The concatenated output is then passed to a Dense layer, which performs a linear operation on the layer’s input. If we denote the input to the Dense layer as \({\textbf{z}}\), the output \({\mathcal {Y}}\) is given by:

$$\begin{aligned} {\mathcal {Y}} = {\mathcal {W}} \cdot {\textbf{z}} + {\textbf{b}} \end{aligned}$$
(2)

where \({\mathcal {W}}\) is the weight matrix and \({\textbf{b}}\) is the bias vector.

Activation

Finally, an Activation layer is used to apply an activation function to the output of the Dense layer. Depending on the specific task the network is designed to perform, this could be a function like sigmoid or softmax. For example, if the sigmoid function is used, the final predicted bitcoin price \({\mathcal{B}}_{P}\) is given by:

$${\mathcal{B}}_{P} = \frac{1}{{1 + e^{{ - {\mathcal{Y}}}}}}$$
(3)

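where e is the base of the natural logarithm. Because the sigmoid output lies in the interval (0, 1), this form presumes that prices are scaled to that range.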
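Equations (2) and (3) can be combined in a short NumPy sketch. The input width of 236 matches the bidirectional output described earlier, while the single output unit and the random weights are illustrative assumptions.

```python
import numpy as np

def dense(z, W, b):
    """Linear operation of Eq. (2)."""
    return W @ z + b

def sigmoid(y):
    """Activation of Eq. (3)."""
    return 1.0 / (1.0 + np.exp(-y))

rng = np.random.default_rng(0)
z = rng.normal(size=236)                    # features from the previous layer
W = rng.normal(scale=0.05, size=(1, 236))   # one output unit (assumed)
b = np.zeros(1)

Y = dense(z, W, b)
B_P = sigmoid(Y)
print(Y, B_P)
```

The value B_P is the network output on the normalized scale; converting it to a currency amount would require inverting whatever scaling was applied to the prices.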

Loss Function

Selecting an appropriate loss function is vital for this research because it guides the training of the machine learning model. The loss function is a mathematical construct that quantifies the disparity between predicted and actual values, and its choice significantly influences the optimization landscape.

Mean Square Error

A widely used loss function in machine learning tasks, such as predicting the price of Bitcoin, is the Mean Squared Error (MSE). When incorporating sentiments along with price in the forecast, the MSE loss function can be formulated as follows:

$${\text{MSE}} = \frac{1}{n}\sum\limits_{{i = 1}}^{n} {\left( {{\hat{\mathcal{B}}}_{{P_{i} }} - {\mathcal{B}}_{{P_{i} }} } \right)^{2} }$$
(4)

Root Mean Square Error

In the prediction of Bitcoin prices, the Root Mean Square Error (RMSE) serves as a valuable metric for evaluating the accuracy of predictive models:

$${\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{{i = 1}}^{n} {\left( {{\hat{\mathcal{B}}}_{{P_{i} }} - {\mathcal{B}}_{{P_{i} }} } \right)^{2} } }$$
(5)

Mean Absolute Error

The Mean Absolute Error (MAE) formula represents the average absolute difference between the actual values and the predicted values:

$${\text{MAE}} = \frac{1}{n}\sum\limits_{{i = 1}}^{n} {\left| {{\hat{\mathcal{B}}}_{{P_{i} }} - {\mathcal{B}}_{{P_{i} }} } \right|}$$
(6)
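The three error measures of Eqs. (4) to (6) can be computed directly with NumPy. The sample values below are illustrative numbers on a normalized scale, not results from the experiments.

```python
import numpy as np

def mse(predicted, actual):
    """Mean squared error, Eq. (4)."""
    return float(np.mean((predicted - actual) ** 2))

def rmse(predicted, actual):
    """Root mean square error, Eq. (5)."""
    return float(np.sqrt(mse(predicted, actual)))

def mae(predicted, actual):
    """Mean absolute error, Eq. (6)."""
    return float(np.mean(np.abs(predicted - actual)))

actual = np.array([0.50, 0.52, 0.55, 0.53])
predicted = np.array([0.48, 0.53, 0.58, 0.53])

print(mse(predicted, actual), rmse(predicted, actual), mae(predicted, actual))
```

Because RMSE takes the square root of MSE, it is expressed in the same units as the price, which makes it easier to interpret than MSE.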

Simulation and Result

The key results and ablation studies are presented as follows:

Training & Prediction Accuracy

Our work uses a Bidirectional Long Short-Term Memory (Bi-LSTM) network together with sentiment analysis to predict hourly Bitcoin prices, and the simulations produced encouraging results. Because the Bi-LSTM, a type of recurrent neural network, processes each input window in both directions, the model can capture both past and future context within that window. This makes it well suited to identifying complex patterns and relationships in the time series data. The Root Mean Square Error (RMSE), a key measure of prediction accuracy, reached a low value of 0.041. Adding the sentiment score to the input improves the model’s grasp of market sentiment and produces more accurate and reliable price forecasts.

Figure 5 shows a significant discrepancy between the expected and actual values when Bitcoin prices are forecasted only using historical price data using the bidirectional LSTM model. The absence of market emotions as an input feature leads to a model that finds it difficult to capture the subtle dynamics affecting fluctuations in bitcoin prices. The lack of overlap between the expected and actual price curves highlights this weakness and shows how little the model can adjust to the sentiment-driven and dynamic character of the cryptocurrency market.

In contrast, Fig. 6 shows the same forecast when sentiments are included as input features; the resulting graph shows the predicted and actual values converging. This overlap shows how well the model, which uses sentiment analysis to better understand and react to changes in market sentiment, can anticipate future prices. Sentiment analysis plays a major role in bringing the predicted and actual Bitcoin values into line, giving a more realistic picture of the complex processes at work in the cryptocurrency market.

Fig. 5 Prediction made by Bidirectional LSTM model given only price

Fig. 6 Prediction made by Bidirectional model given both price and sentiments

Comparison with Other Works

The graphs thus make it evident that the proposed model performs well when given sentiments and the Bitcoin price as input data. Furthermore, a comparison of the proposed approach with other similar works is reported in Table 1. The comparison shows that the proposed approach outperforms each of the compared models by a substantial margin.

Table 1 Comparison of the Proposed Approach with other works

Ablation Study

This section explores the contribution of the different components of the proposed approach and presents the simulation outcomes in the following subsections:

Effect of Sentiment Analysis Module on Price Prediction

Looking at the output graphs from our prediction models, we can see a clear difference between the cases when sentiment data and the price of Bitcoin are used for predicting, and the ones where just the price of Bitcoin is used. If we only look at the past prices of Bitcoin, we can see that the projected values deviate from the actual prices and take the form of a different trajectory.

Fig. 7 Given historical data prediction without sentiments

In Fig. 7, historical data of the hourly change in price are provided. Since only price data are given as input, the predicted price does not accurately match the actual price and shows considerable deviation. This disparity illustrates the shortcomings of making precise forecasts based only on price dynamics.

The predicted result of our model, shown in Fig. 8, exhibits a close convergence of predicted and actual prices. By adding the sentiment score, the model’s predictive power has increased. Combining sentiment analysis, historical data, and actual prices gives a more comprehensive picture, which helps to create a prediction model that closely follows the complex subtleties of the Bitcoin market.

Fig. 8 Given historical data prediction with sentiments

Effect of Sentiment Analysis Module on Loss Function

Figure 9a shows that the selection of input characteristics has a major impact on how successful predictive models are when it comes to predicting Bitcoin values. The resultant loss function is typically more vulnerable to both intrinsic volatility and market movements when the previous price of Bitcoin is the only factor taken into account for prediction. The model’s capacity to capture the intricate dynamics of the bitcoin market is restricted by the lack of further contextual data, such as sentiment analysis of tweets or outside influences. As a result, when the forecast is made just using the price of Bitcoin, the losses are frequently larger.

Fig. 9 Comparison of loss with and without sentiment score

However, Fig. 9b represents how the addition of sentiment analysis improves the input characteristics by measuring the overall sentiment conveyed in news articles, social media posts, and other textual data. With this extra data, market mood and patterns may be better understood, which can result in more accurate forecasts and possibly fewer losses. Sentiment analysis and Bitcoin prices together provide a complete input for prediction models, which enhances forecasting accuracy and strengthens the assessment of the model’s effectiveness.

Effectiveness of LSTM Architectures and Other Variants

LSTM can identify and evaluate intricate, non-linear patterns in time-series data such as bitcoin trends. LSTM models offer a complete and all-encompassing perspective of the variables impacting Bitcoin pricing since they can combine a wide range of input characteristics, including sentiment research from social media, macroeconomic data, and global market trends.

Table 2 Comparison of the bidirectional LSTM with other LSTM variants considering both price and sentiments

In this work, we have compared different variants of the LSTM architecture: stacked LSTM, 2-Path LSTM, sequence-to-sequence LSTM, sequence-to-sequence VAE LSTM, bidirectional sequence-to-sequence LSTM, and bidirectional LSTM with GRU. The outcomes of the comparison of these architectures are reported in Fig. 10 and Table 2. Table 2 shows that the bidirectional LSTM performed better than all of the remaining variants.

As shown in Fig. 5, the hourly forecast produced by the model using price as the only input varies greatly from the real price change during that hour. The graph illustrates how the LSTM model predicted the hourly prices of bitcoin, although the results were not adequate when compared to the real values.

Fig. 10 Predictions made by different LSTM models given both Sentiment Score and Price

Conclusion

In conclusion, our efforts to forecast Bitcoin values hourly through the use of sentiment analysis have produced encouraging outcomes, demonstrating a notable increase in the prediction accuracy of the model when we use the bidirectional LSTM model. When sentiments are taken into account as input characteristics, the visual inspection of the prediction graph shows a notable alignment between the predicted and actual values.