1 Introduction

Bitcoin is a digital currency built on a peer-to-peer network and on the blockchain, a public ledger where all transactions are recorded and made available to all nodes. Unlike traditional banking transactions, which rely on trust in the counterparty, Bitcoin relies on cryptography and on a consensus protocol for the network. The entire system is built on open-source software created in 2009 by a computer scientist known under the pseudonym Satoshi Nakamoto, whose identity is still unknown. Hence, Bitcoin is an independent digital currency, not subject to the control of any central authority; furthermore, transactions in the network are pseudonymous and irreversible.

Bitcoin, altcoins and the underlying blockchain technology have gained much attention in the last few years. Research on Bitcoin often deals with cyber-security and legitimacy issues, such as the analysis of double-spending possibilities and other cyber-threats; recently, high returns and volatility have drawn research toward the analysis of Bitcoin price efficiency, as in Almudhaf (2018); Urquhart (2016); Nadarajah and Chu (2017), as well as of its price dynamics. Within the latter branch of research, a non-exhaustive list is Kristoufek (2013, 2015); Bukovina and Martiček (2016); Dyhrberg (2016); Ciaian et al. (2016); Katsiampa (2017); Cretarola et al. (2018); Blau (2017). Among the cited papers, many contributions claim that the Bitcoin price is driven by attention to or sentiment about the Bitcoin system itself. Possible driving factors are the volume of Google searches and of Wikipedia requests (Kristoufek 2013) or more traditional indicators such as the volume of transactions (Kristoufek 2015). In Bukovina and Martiček (2016), sentiment data are obtained from http://sentdex.com/, an online platform specializing in natural language processing algorithms that deliver a positive, neutral or negative feeling about a specific topic. The dependence of the Bitcoin price on investors’ attention is also investigated in Ciaian et al. (2016), where the authors jointly analyze the dependence of the Bitcoin price on several market forces: supply and demand for Bitcoins, some variables related to global macroeconomic and financial development such as stock market indices and the oil price, and several attractiveness factors. Specifically, they measure the attractiveness of Bitcoin by means of the number of Wikipedia inquiries on the topic, the number of new users and the number of posts in the online forum https://bitcointalk.org/. By estimating vector autoregressive and vector error correction models, they find that such variables are significant in explaining Bitcoin prices.
As for more traditional attention measures, note that in Blau (2017) a time series model is introduced in order to identify the dynamic relation between speculation activity and price: Bitcoin returns are regressed against a demeaned measure of trading activity, following the idea in Llorente et al. (2002), and regression errors are modeled as a standard GARCH(1,1) process to account for heteroscedasticity. Models within the GARCH family have also been applied to describe the dynamics of Bitcoin returns and volatility in Dyhrberg (2016); Katsiampa (2017), but neither attention nor sentiment is taken into account in those settings. Differently from previous contributions, we investigate whether and to what extent market attention influences the dynamics of Bitcoin, either in the mean returns or in their volatility. To this end, we measure Bitcoin attractiveness both by a classical measure of attention, the total trading volume in the market, and, as suggested in Da et al. (2011), by the Search Volume Index (SVI) provided by Google; the latter is particularly suitable in this framework since Bitcoin is an Internet-based digital currency and Internet users commonly collect information through a search engine such as Google. With a different goal, Google Trends data are also used by Yelowitz and Wilson (2015) to identify the characteristics of Bitcoin users. It is worth noting that Urquhart (2018) investigates the relationship between Bitcoin returns, its trading volume and the SVI Google index, with an approach complementary to ours. In Da et al. (2011), the authors also find strong evidence that SVI captures the attention of retail investors: “the search volume is likely to be representative of the Internet search behavior of the general population and more critically, search is a revealed attention measure: if you search for a stock in Google, you are undoubtedly paying attention to it. Therefore, aggregate search frequency in Google is a direct and unambiguous measure of attention.” Indeed, we believe that many of the retail investors in Bitcoin, especially after its steady increase in value, enter the market for speculation purposes and that their positions in the Bitcoin and cryptocurrency market depend heavily on the news in the media and on tweets by well-known investors or experts; their information on Bitcoin characteristics may be based on and fed by Internet searches, as argued in Da et al. (2011). Such investors may be responsible for the noisy behavior of Bitcoin and have strongly contributed to increasing its volatility over time.

In order to test for the impact of attention on Bitcoin returns, we estimate several time series models where the trading volume and the SVI Google index (suitably transformed) are taken as explanatory factors. Overall, we find evidence that Bitcoin returns are affected by market attention and, within this framework, we identify the best candidate models for the analyzed datasets by means of the so-called Box–Jenkins procedure. Outcomes show that the trading volume affects both the mean of Bitcoin returns and their volatility, while the SVI Google index is significant in the conditional variance of returns and, in a few cases, weakly significant in the mean. An out-of-sample analysis is also carried out in order to test the forecasting performance of the selected models and to choose the best alternative. In this respect, we find that the inclusion of attention measures in the model specification makes forecasts more accurate. The rest of the paper is structured as follows. In Sect. 2, we describe the data and introduce the family of alternative models for the Bitcoin price dynamics; in Sect. 3, we present the model selection methodology; and in Sect. 4, we sum up the empirical findings. Section 5 is devoted to concluding remarks and to some directions for future investigations.

2 Bitcoin price modeling

2.1 Data

We consider daily data for the average price of Bitcoin across the main exchanges, obtained from https://blockchain.info/, from January 1, 2012, to December 31, 2017, and, in order to account for time variability, two non-overlapping subsamples. In Fig. 1, we plot the evolution of the mean daily trading volume computed on rolling windows of length \(n =\lbrace 180; 270; 360 \rbrace \) days; in order to have smoother graphs, we consider a time span of \(m=15\) days between consecutive samples. Note that, in all cases, there is a change in the mean volume trend, from a sharp decrease to a sudden increase, around the end of 2014 and the beginning of 2015. Hence, we decide to split our dataset at the beginning of this new phase of Bitcoin history. By choosing to split the data exactly on January 1, 2015, we also end up with subsamples of equal length, three years each: from January 1, 2012, to December 31, 2014, and from January 1, 2015, to December 31, 2017.
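The rolling-window mean described above can be sketched in a few lines of Python. This is only an illustration: the function name `rolling_mean` and the toy series are assumptions made here, not part of the original analysis, which does not specify its implementation.

```python
def rolling_mean(values, window, step):
    """Mean of `values` over windows of length `window`,
    moving the window start forward by `step` observations each time."""
    means = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        means.append(sum(chunk) / window)
    return means


# Toy series standing in for daily trading volume (illustrative, not real data).
toy_volume = [float(v) for v in range(1, 11)]
toy_means = rolling_mean(toy_volume, window=4, step=3)
```

With daily data, `window` would play the role of \(n\) (180, 270 or 360) and `step` the role of \(m=15\).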

Fig. 1 The evolution of the mean daily trading volume computed on rolling windows of length \(n =\lbrace 180; 270; 360 \rbrace \) days and considering a time span of \(m=15\) days between consecutive overlapping samples

In Fig. 2, we plot Bitcoin prices and returns and in Table 1 the corresponding descriptive statistics for the three periods.

Fig. 2 Bitcoin prices and logarithmic returns data from January 1, 2012, to December 31, 2017 (top), from January 1, 2012, to December 31, 2014 (bottom left), and from January 1, 2015, to December 31, 2017 (bottom right)

Table 1 Summary statistics of daily returns

Both the descriptive statistics and the Jarque–Bera test p values reported in Table 1 indicate strong non-normality of returns over the whole time series as well as in the two subsamples.
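For reference, the Jarque–Bera statistic combines sample skewness \(S\) and kurtosis \(K\) as \(\mathrm{JB}=\frac{n}{6}\left( S^2+\frac{(K-3)^2}{4}\right) \) and is asymptotically chi-square distributed with two degrees of freedom under normality. The following Python sketch computes it from scratch; the function name and the toy input are assumptions made for illustration, and statistical packages may apply small-sample corrections that are not reproduced here.

```python
import math


def jarque_bera(x):
    """Return (JB statistic, skewness, kurtosis, asymptotic p value)."""
    n = len(x)
    mean = sum(x) / n
    # Central moments with the 1/n normalisation.
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 degrees of freedom is exp(-x/2).
    p_value = math.exp(-jb / 2.0)
    return jb, skew, kurt, p_value


# Toy symmetric sample (illustrative only).
jb, skew, kurt, p_value = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

A small p value leads to rejecting normality.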

In Fig. 3, we plot the autocorrelation and partial autocorrelation functions of the Bitcoin logarithmic returns for the three time series; a significant serial dependence structure is evident both in the whole time series and in the first period under investigation, while it is weaker in the last time interval.
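As a reminder of what the autocorrelation plots display, the sample autocorrelation at lag \(k\) can be computed as below. This is a minimal sketch with a hypothetical function name; under the white-noise hypothesis, values outside roughly \(\pm 1.96/\sqrt{n}\) are usually flagged as significant.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
        acf.append(ck / c0)
    return acf


# Toy alternating series (illustrative): strong negative dependence at lag 1.
toy_series = [1.0, -1.0] * 5
toy_acf = sample_acf(toy_series, max_lag=2)
```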

Fig. 3 Autocorrelation function (on the left) and partial autocorrelation function (on the right) for Bitcoin logarithmic returns: 2012–2017 (top), 2012–2014 (center), 2015–2017 (bottom)

2.2 Models

The serial dependence evidenced in Fig. 3 suggests that the dynamics of Bitcoin returns may be described within the class of autoregressive moving average (ARMA) models. Since we are interested in the effect of market attention on Bitcoin returns, we augment the model specification with an explanatory process representing a proxy for attention. We consider all the possible constrained and unconstrained ARMA(p,q) specifications for \(p, q\in \lbrace {1,2,\ldots ,7}\rbrace \), though in the numerical exercise we found no evidence to go beyond an ARMA(2,2) specification. The usual notation for such an augmented model is ARMA(p,q)-X, which we use throughout the paper.

Figure 2 clearly shows heteroscedasticity of returns in all the analyzed time periods. In order to take this feature into account, the error process \(\epsilon =\lbrace {\epsilon _t, t \ge 0\rbrace }\) in Eq. (1) is modeled within the GARCH family. Again, we include an explanatory variable representing market attention, this time in the conditional variance equation; the resulting model is denoted by GARCH-X. Several model specifications are available within this framework; we believe that outcomes will not differ substantially from a qualitative viewpoint, so we focus on two examples, the standard GARCH and the exponential GARCH (EGARCH) models; for the sake of parameter parsimony, we consider the simple GARCH(1,1)-X and EGARCH(1,1)-X as possible specifications for the conditional variance.

Summing up, we describe Bitcoin returns with

$$\begin{aligned} R_t=a_0+\sum _{i=1}^p a_i R_{t-i}+\sum _{j=1}^q b_j \epsilon _{t-j} + c X_{t-1} + \epsilon _{t} \end{aligned}$$
(1)

where \(\epsilon =\lbrace {\epsilon _t, t \ge 0\rbrace }\) is the error process, \(X=\lbrace {X_t, t\ge 0\rbrace }\) is the attention-related explanatory variable and \(a_0,a_i,b_j,c\), for \(i, j=1,2,\ldots ,7\) are model parameters. Nested models are also fitted, by setting some of the parameters to 0. The simple linear regression (LR) corresponds to the choice \(a_i=b_j=0\), for \(i, j=1,2,\ldots ,7\). Notably, the case \(c=0\) will be a benchmark when analyzing the relevance of the attention factor in Eq. (1).

The error term in Eq. (1) is given by \(\epsilon _t=\sqrt{h_t}\eta _t\), where \(\eta =\lbrace {\eta _t, t \ge 0\rbrace }\) is a Gaussian noise and \(h_t\) is the so-called conditional variance modeled as

$$\begin{aligned} h_t=\alpha _0+\alpha _1 \epsilon _{t-1}^2+\beta _1 h_{t-1} + \gamma X_{t-1} \end{aligned}$$
(2)

or

$$\begin{aligned} \log {h_t}=\alpha _0+\alpha _1 \eta _{t-1} +\beta _1 \log {h_{t-1}} + \lambda \left( {\left| \eta _{t-1}\right| }-{\mathbb {E}}\left[ {\left| \eta _{t-1}\right| }\right] \right) + \gamma X_{t-1} \end{aligned}$$
(3)

for the GARCH and EGARCH models, respectively. Again, nested models are also estimated by constraining some of the parameters to zero. Specifically, the benchmark models will be the ones where \(\gamma =0\), when analyzing the effect of the attention variables in Eqs. (2) and (3), respectively.

When discussing the empirical results, we will refer to Eq. (1) as the mean equation and to Eq. (2) (or (3)) as the variance equation describing Bitcoin returns.
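To make the recursions in Eqs. (1)–(3) concrete, the following Python sketch implements one step of each variance equation and simulates Eq. (1) with \(p=q=1\) and GARCH(1,1)-X errors. It is an illustration under stated assumptions: all function names, parameter values, start-up values and the toy attention series are hypothetical, and the attention proxy is kept non-negative so that \(h_t\) in Eq. (2) stays positive. For Gaussian \(\eta _t\), the centring term in Eq. (3) is \({\mathbb {E}}\left[ \left| \eta _t\right| \right] =\sqrt{2/\pi }\).

```python
import math
import random

# E|eta| for a standard Gaussian eta, the centring term in Eq. (3).
E_ABS_ETA = math.sqrt(2.0 / math.pi)


def garch_x_variance(eps_prev, h_prev, x_prev, alpha0, alpha1, beta1, gamma):
    """One step of Eq. (2): returns h_t."""
    return alpha0 + alpha1 * eps_prev ** 2 + beta1 * h_prev + gamma * x_prev


def egarch_x_variance(eta_prev, h_prev, x_prev, alpha0, alpha1, beta1, lam, gamma):
    """One step of Eq. (3): returns h_t, positive by construction."""
    log_h = (alpha0 + alpha1 * eta_prev + beta1 * math.log(h_prev)
             + lam * (abs(eta_prev) - E_ABS_ETA) + gamma * x_prev)
    return math.exp(log_h)


def simulate_arma11_x_garch11_x(x, mean_params, var_params, h0, seed=0):
    """Simulate R_t from Eq. (1) with p = q = 1 and errors following Eq. (2)."""
    a0, a1, b1, c = mean_params
    alpha0, alpha1, beta1, gamma = var_params
    rng = random.Random(seed)
    r_prev, eps_prev, h_prev = a0, 0.0, h0  # arbitrary start-up values
    returns = []
    for t in range(1, len(x)):
        h_t = garch_x_variance(eps_prev, h_prev, x[t - 1],
                               alpha0, alpha1, beta1, gamma)
        eps_t = math.sqrt(h_t) * rng.gauss(0.0, 1.0)  # eps_t = sqrt(h_t) * eta_t
        r_t = a0 + a1 * r_prev + b1 * eps_prev + c * x[t - 1] + eps_t
        returns.append(r_t)
        r_prev, eps_prev, h_prev = r_t, eps_t, h_t
    return returns


# Toy non-negative attention proxy (illustrative values, not data from the paper).
toy_x = [0.1 * (t % 5) for t in range(200)]
toy_returns = simulate_arma11_x_garch11_x(
    toy_x, mean_params=(0.0, 0.2, 0.1, 0.01),
    var_params=(0.05, 0.1, 0.8, 0.02), h0=0.5)
```

Setting `c` or `gamma` to zero recovers the benchmark models without attention in the mean or in the variance equation.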

3 Methodology

3.1 Market attention variables

The explanatory variables representing market attention are based on two sources of data: the total volume of transactions in Bitcoins, provided by https://blockchain.info/, and the adjusted volume of Internet searches, the SVI Google index, delivered by https://trends.google.com/trends/. Denoting by A the available time series, we consider the alternative transformed variables \(X_1:=\log (A)\), \(X_2:=\varDelta \log (A)\) and \(X_3:={\left| X_2\right| }\). Since the traded volume and the search volume take very high values with respect to returns, the logarithmic transformation is applied; though it is expressed in logarithmic scale, we will refer to the variable \(X_1\) as the level of attention throughout the paper. The difference variable \(X_2\) is considered to understand whether the variation affects Bitcoin more significantly than the attention level; finally, the third variable \(X_3\) is included in order to investigate whether the magnitude or the sign of changes is more likely to affect Bitcoin returns.
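The three transformations can be written compactly as follows. This is a minimal Python sketch with a hypothetical function name; it assumes strictly positive observations, since the logarithm is undefined otherwise.

```python
import math


def attention_transforms(a):
    """Return (X1, X2, X3) = (log A, first difference of log A, |X2|)."""
    x1 = [math.log(v) for v in a]
    x2 = [x1[t] - x1[t - 1] for t in range(1, len(x1))]
    x3 = [abs(v) for v in x2]
    return x1, x2, x3


# Toy raw attention series (illustrative only).
x1, x2, x3 = attention_transforms([1.0, math.e, 1.0])
```

Note that \(X_2\) and \(X_3\) are one observation shorter than \(X_1\), so the series must be aligned before being used as regressors.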

In Fig. 4, we plot the trading volume and the SVI Google index, both in logarithmic scale, and in Table 2, we sum up the corresponding descriptive statistics for the three samples. In order to check for stationarity, we perform the augmented Dickey–Fuller test for both attention measures; the p values of the tests are reported in the last row of Table 2. It is worth noting that the volume of transactions is stationary with a nonzero mean for all the time series considered; the SVI Google index is stationary in the whole sample and in the first subsample, while it is stationary around a deterministic trend in the second subsample. In order to avoid any issues in the estimation procedure, we detrend the SVI Google index when necessary.
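The text does not state how the deterministic trend is removed. One common choice, assumed here purely for illustration, is to subtract a linear time trend fitted by ordinary least squares; the function name below is hypothetical.

```python
def linear_detrend(y):
    """Remove an OLS-fitted linear time trend from y.
    Returns (residuals, intercept, slope), with time indexed 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [y[t] - (intercept + slope * t) for t in range(n)]
    return residuals, intercept, slope


# Toy series with an exact linear trend (illustrative only).
residuals, intercept, slope = linear_detrend([2.0 + 3.0 * t for t in range(6)])
```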

Fig. 4 Bitcoin trading volume (top) and Google Search Volume Index (bottom) observed from January 1, 2012, to December 31, 2017 (top), from January 1, 2012, to December 31, 2014 (bottom left), and from January 1, 2015, to December 31, 2017 (bottom right), in logarithmic scale

Table 2 Summary statistics of attention variables

3.2 Model selection procedure

In what follows, we apply the so-called Box–Jenkins model selection procedure (see Rachev et al. 2007) to choose among the model specifications nested in Eqs. (1), (2) and (3), where the attention proxy \(X_t\) is replaced alternatively by \(X_{1t},X_{2t}\) and \(X_{3t}\). First, the candidate specifications are estimated. Second, models are ranked according to the Akaike and the Bayesian information criteria (AIC and BIC). Finally, the null hypothesis of uncorrelated residuals is tested by means of the Ljung–Box Q test (see Ljung and Box 1978), and the null of homoscedasticity is verified via Engle’s ARCH test (see Engle 1982); we also apply the Ljung–Box Q test to the squared residuals, in order to detect possible nonlinear serial dependence in the residuals. Tests are performed for several choices of the maximum number of lags.
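For a series of length \(n\) and maximum lag \(m\), the Ljung–Box statistic is \(Q=n(n+2)\sum _{k=1}^{m}\hat{\rho }_k^2/(n-k)\), where \(\hat{\rho }_k\) is the sample autocorrelation at lag \(k\). A minimal Python sketch follows; the function name is hypothetical and the p value, which requires a chi-square distribution function, is left out.

```python
def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic of the series x for lags 1, ..., max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    total = 0.0
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
        rho_k = ck / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy residuals with obvious lag-1 dependence (illustrative only).
toy_resid = [1.0, -1.0] * 5
q_resid = ljung_box_q(toy_resid, max_lag=1)
# The same function applied to squared residuals checks for nonlinear dependence.
q_squared = ljung_box_q([0.5, 2.0, 0.1, 1.5, 0.3, 2.5, 0.2, 1.0], max_lag=2)
```

Under the null, \(Q\) is compared with a chi-square distribution with \(m\) degrees of freedom; when the test is applied to residuals of a fitted ARMA(p,q) model, the degrees of freedom are usually reduced by \(p+q\).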

As a preliminary analysis, we verify that a simple linear model such as ARMA-X is not suitable to describe Bitcoin returns. Indeed, the residuals of all ARMA specifications, considered for each of the transformed measures of attention, still exhibit heteroscedasticity. The empirical results of this preliminary exercise are not reported in this paper but are available upon request.

We estimate the ARMA(p,q)-X specification in Eq. (1) with either GARCH(1,1)-X or EGARCH(1,1)-X errors (Eqs. (2) or (3)) by including the same attention variables in both the mean and the variance equations and select the best performing models in terms of the AIC and BIC values. Then, in order to gain further insight into the relevance of the various transformations in either the mean or the variance equation, we fit all models with different transformed variables included in the mean and in the variance equations.
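Given the maximized log-likelihood \(\log L\), the number of estimated parameters \(k\) and the sample size \(n\), the two criteria are \(\mathrm{AIC}=2k-2\log L\) and \(\mathrm{BIC}=k\log n-2\log L\); lower values are preferred. The Python sketch below ranks two hypothetical candidates; the labels and log-likelihood values are invented for illustration and are not results from the paper. Some software reports both criteria divided by \(n\), which does not change the ranking within a fixed sample.

```python
import math


def aic(loglik, k):
    """Akaike information criterion."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion."""
    return k * math.log(n) - 2 * loglik


# Hypothetical candidates: label -> (maximized log-likelihood, number of parameters).
candidates = {"small model": (-1000.0, 5), "large model": (-995.0, 9)}
n_obs = 1000

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

In this toy case, the two criteria disagree because the BIC penalizes the additional parameters more heavily.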

Note that the estimation of the mean and variance equations is carried out jointly: specifically, we use the function ugarchfit provided by the rugarch package for R, where estimates are obtained with the solnp solver under the assumption of normally distributed standardized errors (see Ghalanos 2014).
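Joint estimation means that the parameters of the mean and of the variance equations enter a single Gaussian likelihood. The Python sketch below writes the negative log-likelihood for the simplest nested case, Eq. (1) with \(p=q=0\) and Eq. (2). It is not the rugarch implementation: the function name, the initialization of \(h\) with the sample variance and the toy inputs are assumptions made for illustration, and the numerical optimization step is omitted.

```python
import math


def neg_loglik_arma00x_garch11x(params, r, x):
    """Negative Gaussian log-likelihood of R_t = a0 + c X_{t-1} + eps_t with
    h_t = alpha0 + alpha1 eps_{t-1}^2 + beta1 h_{t-1} + gamma X_{t-1}."""
    a0, c, alpha0, alpha1, beta1, gamma = params
    n = len(r)
    mean_r = sum(r) / n
    h = sum((v - mean_r) ** 2 for v in r) / n  # start-up value (an assumption)
    eps_prev = 0.0
    nll = 0.0
    for t in range(1, n):
        h = alpha0 + alpha1 * eps_prev ** 2 + beta1 * h + gamma * x[t - 1]
        if h <= 0.0:
            return math.inf  # reject parameter values giving a non-positive variance
        eps = r[t] - a0 - c * x[t - 1]
        nll += 0.5 * (math.log(2.0 * math.pi) + math.log(h) + eps ** 2 / h)
        eps_prev = eps
    return nll


# Toy evaluation (illustrative): unit variance, zero returns.
toy_nll = neg_loglik_arma00x_garch11x((0.0, 0.0, 1.0, 0.0, 0.0, 0.0),
                                      [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
```

A numerical optimizer would minimize this function over all six parameters at once, which is the sense in which the two equations are estimated jointly.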

As a further selection tool among the best-ranked models according to the AIC and BIC values and the Box–Jenkins procedure, we apply an out-of-sample forecasting performance analysis.

4 Empirical results

In this section, we present the empirical results for three time series of Bitcoin daily returns: the whole series spanning from January 2012 to December 2017, the first subperiod from January 2012 to December 2014, and the second subperiod from January 2015 to December 2017. The numerical exercise is carried out by measuring the attention variable \(X_{t-1}\) included in Eqs. (1), (2) and (3) with suitably transformed values of: (1) the trading volume only, (2) the SVI Google index only and (3) the vector of both.

The main qualitative results are the following:

  • The trading volume is significant both in the mean and in the variance of Bitcoin returns: The best choice as explanatory variable for the mean equation is the level of the trading volume in the first series and the trading volume change in the second series; the best explanatory variable for the variance equation is the trading volume change.

  • The SVI Google index mainly affects the variance of Bitcoin returns: For the mean equation, the best choice as explanatory variable is the attention level; the most significant transformation in the variance equation is the SVI Google index change, for the whole sample and the first subperiod, and the level for the second one.

  • When both attention proxies are considered, the mean return is especially affected by the level of the trading volume, while the SVI usually has no influence; conversely, the variance is weakly affected by the trading volume and strongly affected by the SVI Google index.

  • While the best ARMA specification varies with both the considered sample and the selection criterion adopted, model selection for the variance equation leads toward the EGARCH(1,1)-X dynamics in all considered cases except for the second subperiod, for which the GARCH(1,1)-X is best ranked.

A possible economic interpretation of the above outcomes is that, while the trading volume level may also affect the mean return of Bitcoin, the search for information, measured in our exercise through the SVI Google index change, mainly influences its volatility. This finding is consistent with the view that the Bitcoin market has attracted investments from so-called noise traders, who gather information on the web, especially in the more recent period, where the SVI level also contributes to explaining the variance of Bitcoin returns.

An alternative specification to Eqs. (1)–(3) is to consider market attention and Bitcoin returns observed at the same t. Indeed, by repeating the same analysis within this framework, we obtain analogous qualitative results, though best selections slightly change.

However, considering the lagged attention variable in the model equations makes it possible to carry out a forecasting performance analysis of the competing models as a further selection tool, which we tackle in Sect. 4.1.

In Table 3, we list the best models selected for the three considered time series, according to the Akaike information and the Bayesian information criteria; attention is measured, respectively, by the trading volume (Panel a), the SVI Google index (Panel b) and the vector of both variables (Panel c), highlighted in bold. Note that the two information criteria always agree in Panel b, when the Google SVI is considered.

Table 3 AIC and BIC model selection analysis
Table 4 Diagnostics for competing nonlinear models

In Table 4, the p values of diagnostic tests on model residuals are reported for several values of the maximum number of lags: the Ljung–Box Q test (Ljung and Box 1978), applied to both residuals and squared residuals, and Engle’s ARCH test for heteroscedasticity (Engle 1982).

The outcomes are gathered in three separate panels, by considering the trading volume, the SVI Google index or their vector, respectively, as alternative attention measures.

By looking at Table 4, we can conclude that:

  • When attention is measured by the trading volume only, Panel a, the candidate models are suitable to explain both the serial dependence and the heteroscedasticity evidenced in the first subsample and in the whole sample (for the BIC best model). Yet, for the second subsample some residual serial dependence is displayed.

  • When attention is measured by the SVI Google index, Panel b, the candidate models are appropriate for the two subsamples but are not able to describe the serial dependence found on the whole dataset.

  • When the vector of both attention variables is considered, Panel c, candidate models are able to explain both the serial dependence and the heteroscedasticity in the three time series.

Candidate models which are not able to explain serial characteristics in the data are excluded from our further investigation on out-of-sample performance: namely the ARMA(2,2)-\(X_3^{\mathrm{vol}}\) GARCH(1,1)-\(X_3^{\mathrm{vol}}\), ARMA(0,0)-\(X_3^{\mathrm{vol}}\) GARCH(1,1)-\(X_3^{\mathrm{vol}}\), ARMA(2,2)-\(\mathbf {X_1}\) EGARCH(1,1)-\(\mathbf {X_1}\) for the second subsample and the ARMA(2,2)-\(X_1^{\mathrm{vol}}\) EGARCH(1,1)-\(X_2^{\mathrm{vol}}\) and ARMA(0,6) EGARCH(1,1)-\(X_2^{\mathrm{svi}}\) for the whole sample.

4.1 Forecasting performance

In order to evaluate the forecasting power of alternative model specifications, we compute the daily difference between the model point forecast and the corresponding observed value, up to one month ahead. We apply the function ugarchforecast, available in the rugarch package for R (Ghalanos 2014), for computing h-steps-ahead forecasts.

A usual approach to comparing the overall out-of-sample performance of candidate models is to rank one-day-ahead point forecasts as well as the root-mean-square error (RMSE) of forecasts computed for \(h=7,15\) and 30 days ahead, defined by:

$$\begin{aligned} \mathrm{RMSE}=\sqrt{\frac{1}{h}\sum _{i=1}^{h} \left( \widehat{R_i}-R_i\right) ^2}. \end{aligned}$$
(4)

In Table 5, the computed values are reported for the competing models selected by the previous Box–Jenkins procedure.
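Equation (4) translates directly into code. The following Python sketch uses a hypothetical function name and invented toy forecasts, not values from Table 5; it evaluates the RMSE over several horizons by truncating the forecast path.

```python
import math


def rmse(forecasts, realized):
    """Root-mean-square error of Eq. (4) over h = len(forecasts) days."""
    if len(forecasts) != len(realized):
        raise ValueError("forecasts and realized values must have equal length")
    h = len(forecasts)
    return math.sqrt(sum((f - r) ** 2 for f, r in zip(forecasts, realized)) / h)


# Toy 30-day forecast path and realized returns (illustrative only).
toy_forecasts = [0.0] * 30
toy_realized = [0.01 if t % 2 == 0 else -0.01 for t in range(30)]
rmse_by_horizon = {h: rmse(toy_forecasts[:h], toy_realized[:h]) for h in (7, 15, 30)}
```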

Table 5 Forecasting RMSE
Fig. 5 Bitcoin forecasted return and 95% confidence intervals from January 1 to January 30, 2015, for different competing models estimated on the first subsample: realized return (asterisk line); ARMA(0,1) EGARCH(1,1) without market attention variables (solid line); ARMA(0,1)-\(X_1^{\mathrm{vol}}\) EGARCH(1,1)-\(\mathbf {X_2}\) (dashed line); ARMA(0,1)-\(\mathbf {X_1}\) EGARCH(1,1)-\(\mathbf {X_2}\) (dotted line); ARMA(0,1)-\(X_1^{\mathrm{svi}}\) EGARCH(1,1)-\(X_2^{\mathrm{svi}}\) (dash-dot line); ARMA(0,1)-\(X_2^{\mathrm{vol}}\) EGARCH(1,1)-\(X_2^{\mathrm{vol}}\) (plus sign line)

Fig. 6 Bitcoin forecasted return and 95% confidence intervals from January 1 to January 30, 2018, for different competing models estimated on the second subsample: realized return (asterisk line); ARMA(0,0) GARCH(1,1) without market attention variables (solid line); ARMA(0,0)-\(\mathbf {X_1}\) GARCH(1,1)-\(\mathbf {X_1}\) (dashed line); ARMA(0,0)-\(X_1^{\mathrm{svi}}\) GARCH(1,1)-\(X_1^{\mathrm{svi}}\) (dotted line)

Outcomes in Table 5 show a similar forecasting ability across the competing models; hence, point forecast performance cannot further distinguish among the selected models.

In Figs. 5, 6 and 7, we plot point and interval forecasts for the candidate models as well as the future realized observations. In order to disentangle the relevance of attention measures, we also plot the point and interval forecasts obtained for the corresponding model specification where no attention variable is considered. Interval forecasts are narrower in all the considered time series and for all competing specifications when the SVI Google index is included in the market attention measure. This is consistent with the finding of a strongly significant impact of the SVI Google index in the variance equation, as evidenced in Table 3.

Fig. 7 Bitcoin forecasted return and 95% confidence intervals from January 1 to January 30, 2018, for different competing models estimated on the whole sample: realized return (asterisk line); ARMA(0,6) EGARCH(1,1) without market attention variables (solid line); ARMA(0,6)-\(X_1^{\mathrm{vol}}\) EGARCH(1,1)-\(\mathbf {X_2}\) (dashed line); ARMA(0,6)-\(X_1^{\mathrm{vol}}\) EGARCH(1,1)-\(X_2^{\mathrm{vol}}\) (dotted line)

It is worth noting that all interval forecasts include most of the out-of-sample observations from one day to one month ahead; interestingly, some of the model specifications have narrower interval forecasts, evidencing the ability to provide the same forecasts with higher precision.
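The text does not detail how the interval forecasts are built. Assuming Gaussian errors, as in the estimation step, a 95% interval can be obtained as the point forecast plus or minus the 0.975 Gaussian quantile times the forecasted conditional standard deviation. The Python sketch below, with hypothetical function names and toy inputs, computes such intervals together with their empirical coverage and mean width, the two quantities compared informally above.

```python
from statistics import NormalDist


def forecast_intervals(point_forecasts, variance_forecasts, level=0.95):
    """Gaussian intervals: point forecast +/- z * sqrt(forecasted variance)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return [(m - z * v ** 0.5, m + z * v ** 0.5)
            for m, v in zip(point_forecasts, variance_forecasts)]


def coverage_and_width(intervals, realized):
    """Share of realized values inside their interval, and mean interval width."""
    hits = sum(1 for (low, high), r in zip(intervals, realized) if low <= r <= high)
    mean_width = sum(high - low for low, high in intervals) / len(intervals)
    return hits / len(intervals), mean_width


# Toy point and variance forecasts with realized values (illustrative only).
toy_intervals = forecast_intervals([0.0, 0.0], [1.0, 4.0])
toy_coverage, toy_width = coverage_and_width(toy_intervals, [0.5, 5.0])
```

Between two models with similar coverage, the one with the smaller mean width gives the more precise interval forecasts.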

Finally, combining the results of the whole Box–Jenkins procedure with the above findings, we can conclude that the overall best models are:

  • ARMA(0,1)-\({X_1}^{\mathrm{vol}}\) EGARCH(1,1)-\(\mathbf {X_2}\) for the first subsample;

  • ARMA(0,0)-\({X_1}^{\mathrm{svi}}\) GARCH(1,1)-\({X_1}^{\mathrm{svi}}\) for the second subsample;

  • ARMA(0,6)-\({X_1}^{\mathrm{vol}}\) EGARCH(1,1)-\(\mathbf {X_2}\) for the whole period.

Finally, Table 6 reports the parameter estimates for the models selected above, together with their standard errors, t test statistics and p values. Notably, nearly all the coefficients of the trading volume and/or SVI Google index are strongly significant, confirming that market attention affects Bitcoin returns and volatility.

Table 6 Parameter estimates of the best models

In order to investigate the stability of parameter values over time, the best selected models are finally estimated on two-year rolling windows for the two subsamples and on five-year rolling windows for the whole period; in both cases, we end up with one year of parameter estimates, which are plotted in Figs. 8, 9 and 10.

Fig. 8 Parameter estimates of the ARMA(0,1)-\({X_1}^{\mathrm{vol}}\) EGARCH(1,1)-\(\mathbf {X_2}\) model on 2-year rolling windows: first subsample

Fig. 9 Parameter estimates of the ARMA(0,0)-\({X_1}^{\mathrm{svi}}\) GARCH(1,1)-\({X_1}^{\mathrm{svi}}\) model on 2-year rolling windows: second subsample

Fig. 10 Parameter estimates of the ARMA(0,6)-\({X_1}^{\mathrm{vol}}\) EGARCH(1,1)-\(\mathbf {X_2}\) model on 5-year rolling windows: whole sample

From Figs. 8, 9 and 10, we can conclude that parameter estimates are quite stable across the considered time span.

5 Concluding remarks

The increasing trend experienced by Bitcoin prices and their bubble-like behavior have spurred interest in the modeling of Bitcoin returns. In Katsiampa (2017), the author compares several GARCH specifications to describe Bitcoin returns and volatility; in Dyhrberg (2016), a similar analysis is performed by adding some financial risk factors to the mean equation, such as stock market indexes, fiat currency exchange rates and gold spot and futures prices. In this paper, we give further insights within the strand of the literature which relates Bitcoin price and returns to market attention (see Kristoufek 2013, 2015) and sentiment (see Bukovina and Martiček 2016) by investigating whether attention factors influence Bitcoin price dynamics; more precisely, we select the best models within a family of ARMA(p,q)-X (E)GARCH(1,1)-X nonlinear models where an attention-related explanatory variable is also included. Following the suggestions in Kristoufek (2013, 2015), we use either the SVI Google index or the trading volume of transactions to measure market attention, and we compute several related variables by applying suitable transformations, such as the logarithm and the first difference. Trading volume and SVI Google attention variables are alternatively or jointly introduced as regressors in the model specification.

Model selection is performed by applying the Box–Jenkins procedure (Rachev et al. 2007) and by carrying out a forecasting performance analysis of the best candidates.

The analysis is conducted for the time series of Bitcoin returns observed daily from January 1, 2012, to December 31, 2017, and for two subsamples obtained by splitting the whole dataset on January 1, 2015.

The overall picture that can be drawn from our results is that attention measures significantly affect both the conditional mean and the conditional variance of Bitcoin returns, and their inclusion as explanatory factors within the considered models improves both the AIC and the BIC. In particular, when attention is measured by the trading volume, the level is significant in the mean equation, whereas the difference variable is significant in the variance term. For the SVI Google index, the attention change is strongly significant in the variance term, while its contribution to the mean return is weakly significant only in a few cases. This is consistent with our initial conjecture that investors whose attention is represented by the SVI Google index contribute significantly to an increase in Bitcoin volatility rather than in its returns. In the case where the trading volume and the SVI Google index are considered jointly, the volatility of Bitcoin returns is strongly influenced by changes in the trading volume and in the SVI Google index, while the mean return is mainly affected by the level of the trading volume. This means that these attention measures are not redundant and add explanatory power if both are included in the model; this is also confirmed by the improvement of the information criteria in the joint case with respect to the single-variable cases. The author in Urquhart (2018) applies vector autoregressive techniques and shows, as a by-product, that the log SVI does not significantly affect Bitcoin log returns and realized volatility for any of the three time series under investigation.
Our results are consistent with the above findings for the whole time series and the first period, while in the second subsample we find a positive dependence between the log SVI (\(X_1\) in our notation) and Bitcoin returns; however, the methodologies and the models fitted in this paper are quite different from those in Urquhart (2018), and a clear-cut comparison is not feasible. We intend to investigate, in the near future, the potential causality or reverse causality effect of market attention on Bitcoin returns and volatility by applying a sound multivariate setting that extends our approach. In order to select the overall best models for the three samples under study, we also performed an out-of-sample forecasting analysis of the alternative specifications provided by the Box–Jenkins procedure. Interestingly, we noted that adding attention-based explanatory variables makes model forecasts more accurate.

Further research will address possible financial implications of our findings: among others, we would like to investigate whether the inclusion of attention measures in the above model specifications improves the performance of capital allocation risk measures, such as value at risk.