1 Introduction

Bitcoin was first introduced in 2009 by a computer scientist known under the pseudonym Satoshi Nakamoto as a peer-to-peer electronic payment system; it is based on open-source software that generates a peer-to-peer network, see (Nakamoto 2008). Unlike traditional transactions, which are based on trust in financial intermediaries, this system relies on the network, on fixed rules and on cryptography. Bitcoins can be purchased with fiat currencies on online trading platforms (so-called exchanges). Further, payments can be made in Bitcoins for several online services and goods, and this use is increasing. It is also possible to send the cryptocurrency internationally at very low cost. Concerns about the use of Bitcoin include reputation issues due to anonymous transactions, such as money laundering or the possible financing of criminal activities, and cyber-security issues, since it can only be deposited in digital wallets, which are vulnerable to hacking attacks. In spite of these criticisms, Bitcoin has experienced rapid growth both in value and in the number of transactions. From an economic viewpoint, one of the main questions about Bitcoin is whether it should be considered a currency, a commodity or a stock. The conclusion in Yermack (2015) is that Bitcoin behaves as a high-volatility stock and that most transactions in Bitcoins are aimed at speculative investments. Recently, the literature on Bitcoin has paid much attention to Bitcoin price dynamics and, in particular, to the identification of possible price drivers. In Kristoufek (2013, 2015), Kim et al. (2015) and Figà-Talamanca and Patacca (2019) it is shown that Bitcoin price and volatility are affected by the volume or number of transactions, by the number of Google searches on the topic, and by Wikipedia inquiries on Bitcoin. Alternatively, in Bukovina and Martiček (2016) Bitcoin price is related to a sentiment measure obtained from the Web site Sentdex.com.

It is also worth noting that a market for derivatives on Bitcoin has recently arisen on dedicated Web sites such as https://coinut.com and https://deribit.com, which trade European Calls and Puts, and others are likely to come soon according to The Wall Street Journal (2017b). Binary options are also traded on some platforms, supporting the idea in Yermack (2015) that Bitcoins are traded for speculative purposes. More importantly, the Chicago Board Options Exchange (CBOE) launched standardized Futures contracts on the cryptocurrency in December 2017; this has opened a new era for Bitcoin trading and has probably paved the way for other standardized derivatives, see (The Wall Street Journal 2017a). Motivated by the evidence in the papers quoted above and by the increasing interest in Bitcoin derivatives, we propose a bivariate model in continuous time to describe the dynamics of both price and market attention for Bitcoin. As an additional novel feature, we allow for a possible delay between the attention factor and its effect on Bitcoin price. We assume that the price and the attention factor are two observable variables, each described by a continuous-time diffusion process; in addition, we assume that the attention factor directly influences the drift and the diffusion functions describing the price. In the numerical exercise, we use some proxies for the attention factor, such as the trading volume and the volume of internet searches. An alternative specification which takes market attention into account is suggested in Kou (2002), where the author assumes that the dynamics of the stock price is a jump diffusion and that sudden jumps in returns may be induced by outside news. However, the jump process is not directly related to any attention factor or news indicator.

In this paper, we make several contributions: first, the market model is defined and it is proven that a strong solution exists; then, mild conditions are given for the model to be arbitrage-free and its statistical properties are investigated. Further, the likelihood for a discrete sample of the model is computed and an approximate closed formula is derived, so that maximum likelihood estimates can be obtained for the model parameters. Precisely, we apply the profile likelihood method described in Davison (2003) and Pawitan (2001) to fit the model to Bitcoin market data from January 15, 2015 to March 31, 2017; market attention is measured either through the traditional trading volume, see (Barber and Odean 2007; Gervais et al. 2001; Hou et al. 2009), or by the Google search volume index (SVI), as suggested in Da et al. (2011). Finally, based on risk-neutral evaluation, a quasi-closed formula is derived for any European-style derivative on Bitcoin, which makes it possible to price derivatives when their values are not known or to calibrate model parameters when derivative prices are given by market trades or quotations.

The paper is organized as follows. In Sect. 2 we describe the model for the Bitcoin price dynamics, discuss its statistical properties and show that the market is arbitrage-free. In Sect. 3, we compute the joint distribution of the discretely sampled model as well as a closed-form approximation, which is useful for introducing a parameter estimation procedure. Section 4 is devoted to testing possible proxies for the attention factor and to estimating model parameters on historical market data obtained from http://blockchain.info. In Sect. 5, we prove a quasi-closed formula for European-style derivatives with detailed computations for Plain Vanilla and Binary option prices, while in Sect. 6.2 we give some further insights on numerical applications of the pricing formula. Some concluding remarks are given in Sect. 7, together with directions for future investigations. For the sake of readability, side results as well as most technical proofs are collected in the Appendices.

2 The Bitcoin market model

We fix a probability space \((\Omega ,\mathcal {F},\mathbf {P})\) endowed with a filtration \(\mathbb {F}= \{\mathcal {F}_t,\ t \ge 0\}\) that satisfies the usual conditions of right-continuity and completeness. On the given probability space, we consider a main market in which heterogeneous agents buy or sell Bitcoins and denote by \(S = \{S_t,\ t \ge 0\}\) the price process of the cryptocurrency. Following an idea suggested in Cretarola et al. (2018), we assume that the Bitcoin price dynamics is described by the following equation:

$$\begin{aligned} \mathrm {d}S_t = \mu _S P_{t-\tau } S_t \mathrm {d}t+\sigma _S \sqrt{P_{t-\tau }}S_t\mathrm {d}W_t,\quad S_0=s_0 \in \mathbb {R}_+, \end{aligned}$$
(2.1)

where \(\mu _{S} \in \mathbb {R}{\setminus } \{0\}\), \(\sigma _{S} \in \mathbb {R}_+,\ \tau \in \mathbb {R}_+\) represent model parameters; \(W = \{W_t,\ t \ge 0\}\) is a standard \(\mathbb {F}\)-Brownian motion on \((\Omega ,\mathcal {F},\mathbf {P})\) and \(P = \{P_t,\ t \ge 0\}\) is a stochastic factor, representing the attention index in the Bitcoin market, and satisfying

$$\begin{aligned} \mathrm {d}P_t =\mu _P P_t\mathrm {d}t+\sigma _P P_t\mathrm {d}Z_t, \quad P_t = \phi (t),\ t \in [-L,0]. \end{aligned}$$
(2.2)

Here, \(\mu _P \in \mathbb {R}{\setminus } \{0\}\), \(\sigma _P \in \mathbb {R}_+\), \(L \in \mathbb {R}_+\) are constant parameters, \(Z = \{Z_t,\ t \ge 0\}\) is a standard \(\mathbb {F}\)-Brownian motion on \((\Omega ,\mathcal {F},\mathbf {P})\), which is \(\mathbf {P}\)-independent of W, and \(\phi :[-L,0] \rightarrow [0,+\infty )\) is a continuous (deterministic) initial function. Note that the nonnegativity of the function \(\phi \) corresponds to requiring that the minimum level for attention is zero.

We assume that the reference filtration \(\mathbb {F}=\{\mathcal {F}_t,\ t \ge 0\}\), describing the information on the Bitcoin market, is of the form

$$\begin{aligned} \mathcal {F}_t=\mathcal {F}_t^W \vee \mathcal {F}_t^Z, \quad t \ge 0, \end{aligned}$$

where \(\mathcal {F}_t^W\) and \(\mathcal {F}_t^Z\) denote the \(\sigma \)-algebras generated by W and Z, respectively, up to time \(t \ge 0\). Note that \(\mathcal {F}_t^Z=\mathcal {F}_t^P\), for each \(t \ge 0\), with \(\mathcal {F}_t^P\) being the \(\sigma \)-algebra generated by P up to time \(t \ge 0\). Since at any time t the Bitcoin price dynamics is affected by the attention index only up to time \(t-\tau \), we consider the filtration \({\widetilde{\mathbb {F}}}=\{{\widetilde{\mathcal {F}}}_t,\ t \ge 0\}\), defined by

$$\begin{aligned} {\widetilde{\mathcal {F}}}_t = {\mathcal {F}}_{t}^W \vee \mathcal {F}_{t-\tau }^P, \quad t \ge 0, \end{aligned}$$

to describe the traders' information on the digital market. This assumption plays an important role in the derivation of the pricing formula in Sect. 5. We also remark that all filtrations satisfy the usual conditions of completeness and right-continuity (see e.g., Protter 2005).

In (2.1), the attention factor P explicitly affects both the drift and the diffusion of the Bitcoin price \(S_t\), up to a certain preceding time \(t-\tau \). We further assume that \(\tau <L\) and that the process P is observed within the period \([-L,0]\), to make the bivariate model jointly feasible. It is worth noting that the instantaneous variance of the Bitcoin price process increases with the delayed process P; this may appear counter-intuitive if P is interpreted only as a positive attention indicator. However, in our perspective, the factor P is mathematically a nonnegative process but does not necessarily represent positive attention. Possible proxies for P are the volume or the number of daily transactions as well as the number of internet searches.

It is well known that the solution of (2.2) is available in closed form and that \(P_t\) has a log-normal distribution for each \(t > 0\), see (Black and Scholes 1973). Further, we also prove that the system given by equations (2.1) and (2.2) admits a unique strong solution in \(\mathbb {R}_+\) which is given in explicit form, see Theorem B.2 in “Appendix B”.
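For reference, and assuming \(\phi (0)>0\), the closed-form solution of (2.2) on \(t \ge 0\) is the standard geometric Brownian motion formula started at \(P_0=\phi (0)\):

$$\begin{aligned} P_t = \phi (0)\exp \left\{ \left( \mu _P-\frac{\sigma _P^2}{2}\right) t+\sigma _P Z_t\right\} , \quad t \ge 0, \end{aligned}$$

so that \(\log P_t\) is normally distributed with mean \(\log \phi (0)+\left( \mu _P-\frac{\sigma _P^2}{2}\right) t\) and variance \(\sigma _P^2 t\), and the expected value of \(P_t\) is \(\phi (0)e^{\mu _P t}\).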

In order to visualize the dynamics implied by the model in equations (2.1) and (2.2), we plot in Fig. 1 a possible simulated path of daily observations for the attention factor P and the corresponding Bitcoin prices S over a one-year horizon for different values of \(\tau \); as expected, the market reaction to attention is delayed when \(\tau \) increases.

Fig. 1 An example of Bitcoin price dynamics given the evolution of the attention index (red): \(\tau =1\) day (black), \(\tau =10\) days (blue). Model parameters are set to \(\mu _P=0.03,\sigma _P=0.35\), \(\mu _S=10^{-5},\sigma _S=0.04\)
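A path such as the one in Fig. 1 can be generated along the following lines. This is only a minimal sketch and not the authors' simulation code: the function name `simulate_paths`, the constant initial function \(\phi \equiv \) `phi0`, the value of `phi0`, the time unit (years, with daily step `dt = 1/365`) and the log-Euler step for S that freezes P over each interval are all assumptions made for illustration.

```python
import math
import random

def simulate_paths(mu_P, sigma_P, mu_S, sigma_S, tau_steps, n_steps,
                   dt=1.0, phi0=1.0, s0=1.0, seed=0):
    """Simulate (P, S) from (2.1)-(2.2) on a grid of step dt.

    Assumptions of this sketch: phi is constant and equal to phi0,
    the delay is tau = tau_steps * dt, and P is frozen on each step
    when S is advanced.
    """
    rng = random.Random(seed)
    # P is a geometric Brownian motion, so each step is sampled exactly.
    P = [phi0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        P.append(P[-1] * math.exp((mu_P - 0.5 * sigma_P**2) * dt
                                  + sigma_P * math.sqrt(dt) * z))

    def delayed(k):
        # P at time k*dt - tau; before time 0 the constant history is used.
        j = k - tau_steps
        return P[j] if j >= 0 else phi0

    log_S = [math.log(s0)]
    for k in range(n_steps):
        p = delayed(k)
        w = rng.gauss(0.0, 1.0)
        log_S.append(log_S[-1] + (mu_S - 0.5 * sigma_S**2) * p * dt
                     + sigma_S * math.sqrt(p * dt) * w)
    return P, [math.exp(x) for x in log_S]

# Parameters from the caption of Fig. 1; dt, phi0 and s0 are hypothetical.
P, S = simulate_paths(mu_P=0.03, sigma_P=0.35, mu_S=1e-5, sigma_S=0.04,
                      tau_steps=10, n_steps=365, dt=1 / 365, phi0=100.0)
```

Changing `tau_steps` while keeping the same `seed` reuses the same attention path, which makes the effect of the delay on S easier to compare.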

Let us introduce a key definition for the rest of the paper, the integrated attention process \(X^\tau =\{X_{t}^\tau ,\ t \ge 0\}\), associated to the factor P, given by:

$$\begin{aligned} X_t^\tau := \left\{ \begin{array}{ll} \int _0^t P_{u-\tau } \mathrm {d}u =\int _{-\tau }^{0} \phi (u) \mathrm {d}u + \int _0^{t-\tau } P_u \mathrm {d}u= X_\tau ^\tau + \int _0^{t-\tau } P_u \mathrm {d}u, &{} \quad 0 \le \tau \le t,\\ \int _{-\tau }^{t-\tau } \phi (u) \mathrm {d}u, &{} \quad 0 \le t \le \tau . \end{array} \right. \end{aligned}$$
(2.3)

Note that, for \(t \in [0,\tau ]\), we have \(X_t^\tau =\int _{-\tau }^{t-\tau } \phi (u) \mathrm {d}u\) which is deterministic. In addition, for a finite time horizon \(T>0\), the corresponding change over the interval [t, T], for \(t\le T\), is defined as \(X_{t,T}^\tau :=X_T^\tau -X_t^\tau \). Obviously, \(X_{T,T}^\tau =0\); moreover, for \(t<T\),

$$\begin{aligned} X_{t,T}^\tau := \left\{ \begin{array}{ll} \int _{t-\tau }^{T-\tau } P_u \mathrm {d}u &{} \quad \text{ if } \ 0 \le \tau \le t< T, \\ \int _{t-\tau }^{0} \phi (u) \mathrm {d}u + \int _0^{T-\tau } P_u \mathrm {d}u &{} \quad \text{ if } \ 0 \le t \le \tau< T, \\ \int _{t-\tau }^{T-\tau } \phi (u) \mathrm {d}u &{} \quad \text{ if } \ 0\le t < T \le \tau . \\ \end{array} \right. \end{aligned}$$

Note that for \(T\le \tau \), we get \(X_{t,T}^\tau =\int _{t-\tau }^{T-\tau } \phi (u) \mathrm {d}u\) which is deterministic. Basic statistical properties for the integrated attention process and for its changes are given in Lemma B.1 in “Appendix B”.
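On a discrete grid, the three cases above collapse into a single rule: read \(\phi \) whenever the delayed time index is negative, and P otherwise. The sketch below illustrates this with a left-point Riemann sum; the function name, the argument layout and the choice of quadrature rule are assumptions made here for illustration.

```python
def integrated_attention_change(P, phi, t_idx, T_idx, tau_steps, dt):
    """Left-point Riemann sum for X_{t,T}^tau = int_t^T P_{u - tau} du.

    P[k] holds P at time k*dt for k >= 0, and phi(u) is the initial
    function on [-L, 0]. Times are t = t_idx*dt and T = T_idx*dt.
    """
    total = 0.0
    for k in range(t_idx, T_idx):
        j = k - tau_steps
        # Before time 0 the attention is given by the deterministic history.
        value = P[j] if j >= 0 else phi(j * dt)
        total += value * dt
    return total

# Toy input: P constant at 2, history constant at 1, delay of 3 steps.
P = [2.0] * 10
phi = lambda u: 1.0
x_0_5 = integrated_attention_change(P, phi, 0, 5, tau_steps=3, dt=1.0)
```

In the toy input, the first three steps fall in the deterministic history and the last two in the stochastic part, which mirrors the middle case \(0 \le t \le \tau < T\).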

3 Statistical properties of discretely observed quantities and parameter estimation

In this section, we derive statistical properties for a sample of discretely observed prices and suggest a possible closed-form approximation for the joint probability density of the discrete sample. Let us fix a discrete observation step \(\Delta \) and consider the discrete time process \(\{S_i,\ i \in \mathbb {N}\}\), where \(S_i:=S_{i\Delta }\). Define the corresponding logarithmic returns process \(\{R_i,\ i\in \mathbb {N}\}\) as

$$\begin{aligned} R_i=\log (S_{i})-\log (S_{i-1}), \end{aligned}$$

where

$$\begin{aligned} \log (S_i)=\log (S_0)+ \left( \mu _S-\frac{{\sigma _S^{2}}}{2}\right) \int _{0}^{i\Delta } P_{u-\tau } \mathrm {d}u + {\sigma _{S}} \int _{0}^{i\Delta } \sqrt{P_{u-\tau }} \mathrm {d}W_u, \end{aligned}$$

see Theorem B.2 in “Appendix B”. Define

$$\begin{aligned} M_i:=\int _{(i-1)\Delta }^{i\Delta } \sqrt{P_{u-\tau }} \mathrm {d}W_u,\,\,\, \forall i \in \mathbb {N}, \end{aligned}$$

so that we can write,

$$\begin{aligned} R_i= \left( \mu _S-\frac{\sigma _S^2}{2}\right) A_i^\tau + \sigma _S M_i, \quad i \in \mathbb {N}, \end{aligned}$$

where \(A_i^\tau :=X_{(i-1)\Delta ,\; i\Delta }^\tau \), with \(X_{t,T}^\tau \) being the variation of the integrated attention process introduced in (2.3); since \(\tau \) is fixed we omit hereafter the dependence on it and, without loss of generality we assume \(\tau < \Delta \) so that \(A_1=X_\tau ^\tau + \int _0^{\Delta -\tau } P_u \mathrm {d}u\).

Note that if \(j\Delta \le \tau <(j+1)\Delta \) the quantities \(A_1,\ldots ,A_j\) are deterministic and the outcomes in what follows still hold if \(A_1\) is replaced by the first non-deterministic value \(A_{j+1}\).

Let us consider a finite time horizon \(T=n\Delta \); under model assumptions the conditional probability distribution of the vector \({\mathbf {M}}=\left( M_1,M_2,\dots ,M_n\right) \), given the vector \({\mathbf {A}}=\left( A_1,A_2,\dots ,A_n\right) \), is a multivariate normal with covariance matrix \(Diag({\mathbf {A}})\).

Hence, the vector of discretely observed logarithmic returns \({\mathbf {R}}=\left( R_1,R_2,\dots ,R_n\right) \), conditionally on \({\mathbf {A}}\), is jointly normal with mean \(\left( \mu _S-\frac{\sigma _S^2}{2}\right) {\mathbf {A}}\) and covariance matrix \(\Sigma =\sigma _S^2 Diag({\mathbf {A}})\) where we have omitted the superscript \(\tau \) for the ease of reading.

The application of Bayes’ rule allows us to write the unconditional joint probability distribution of \(\left( {\mathbf {R}},{\mathbf {A}}\right) \), i.e., the density function \(f_{\left( {\mathbf {R}},{\mathbf {A}}\right) }:\mathbb {R}^n \times \mathbb {R}_+^n \longrightarrow \mathbb {R}\) as

$$\begin{aligned} f_{\left( {\mathbf {R}},{\mathbf {A}}\right) }({\mathbf {r}},{\mathbf {a}})=f_{A_1}(a_1)\prod _{i=2}^n f_{\left( A_i|A_{i-1}\right) }(a_i) \prod _{i=1}^n \frac{1}{\sqrt{2\pi \sigma _S^2 a_i}} e^{-\frac{1}{2}\frac{\left( r_i-\left( \mu _S-\frac{\sigma _S^2}{2}\right) a_i\right) ^2}{\sigma _S^2 a_i }}. \end{aligned}$$
(3.1)

with \({\mathbf {r}}=\left( r_1,r_2,\dots ,r_n\right) \in \mathbb {R}^n\) and \({\mathbf {a}}=\left( a_1,a_2,\dots ,a_n\right) \in \mathbb {R}_+^n\).

The probability density functions \(f_{A_1}(.)\) and \(f_{A_i|A_{i-1}}(.) \text{ for } i=2,3,\dots ,n\) are not available in closed form; however, several approximations exist, among which those introduced in Levy (1992) and Milevsky and Posner (1998). Of course, any approximation available for such densities can be applied in order to find a closed formula approximating the joint density \(f_{\left( {\mathbf {R}},{\mathbf {A}}\right) }\left( {\mathbf {r}},{\mathbf {a}}\right) \); in what follows we adopt the one suggested in Levy (1992), see “Appendix A” for further details. Note that the inverse gamma approach suggested in Milevsky and Posner (1998) holds in the limit when T tends to infinity, a condition which is not consistent with the applications we have in mind; further discussion on the approximating distribution is beyond the scope of our paper.

3.1 The approximated likelihood

One of the pillars of statistical inference is the maximum likelihood (in short, ML) estimation approach, where model parameters are estimated so as to maximize the probability of observing the realized sample; the likelihood function shares the same mathematical expression as the probability density function, but it is computed “ex-post”, when a realization of the random variables involved is available, and assuming the underlying model parameters to be unknown. It is well known that ML estimates are consistent and asymptotically normal and that they achieve efficiency, i.e., they have the lowest variance among estimators sharing the same asymptotic properties (see Davison 2003).

By applying the approximation of Levy (1992), we prove the following Lemma.

Lemma 3.1

Let \(\phi (t) > 0\), for each \(t \in [-L,0]\), in (2.2) and \(\tau <\Delta \). Then, in the market model outlined in Sect. 2, we have

  (i) the distribution of \(A_1-X_\tau ^\tau \) is approximated by a log-normal with mean \(\alpha _1\) and variance \(\nu _1^2\) given by

    $$\begin{aligned} \alpha _1 =&\log \phi (0) + 2\log \frac{e^{\mu _P(\Delta -\tau )}-1}{\mu _P} \\&\quad -\frac{1}{2}\log \left( \frac{2}{\mu _P+\sigma _P^2} \left[ \frac{e^{(2\mu _P+\sigma _P^2)(\Delta -\tau )}-1}{2\mu _P+\sigma _P^2}-\frac{e^{\mu _P(\Delta -\tau )}-1}{\mu _P} \right] \right) \\ \nu _1^2=&\log \left( \frac{2}{\mu _P+\sigma _P^2} \left[ \frac{e^{(2\mu _P+\sigma _P^2)(\Delta -\tau )}-1}{2\mu _P+\sigma _P^2}-\frac{e^{\mu _P(\Delta -\tau )}-1}{\mu _P} \right] \right) \\&\quad -2\log \left( \frac{e^{\mu _P(\Delta -\tau )}-1}{\mu _P} \right) \end{aligned}$$
  (ii) the distribution of \(A_i\) given \(A_{i-1}\) (in short, \(A_i|A_{i-1}\)), for \(i=2,\dots ,n\), is approximated by a log-normal with means \(\alpha _i\) and variances \(\nu _i^2\) given by

    $$\begin{aligned} \alpha _i=\log \left( A_{i-1}\right) + \left( \mu _P-\frac{\sigma _P^2}{2}\right) \Delta , \quad \text{ for } i=2,\dots ,n,\\ \nu _i^2=\sigma _P^2\Delta , \quad \text{ for } i=2,\dots ,n. \end{aligned}$$

The proof is postponed to “Appendix B”.
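The expressions in part (i) of Lemma 3.1 have the form of a two-moment matching: \(\alpha _1\) and \(\nu _1^2\) are the parameters of the log-normal law whose first two moments equal those of \(\int _0^{\Delta -\tau } P_u \mathrm {d}u\). The sketch below evaluates them and checks this property numerically. The function name is hypothetical, and the parameter values in the usage line are illustrative only (the \(\mu _P\), \(\sigma _P\) values are borrowed from Fig. 1; `phi0` and `horizon` are made up).

```python
import math

def levy_lognormal_params(phi0, mu_P, sigma_P, horizon):
    """Return (alpha_1, nu_1^2) from Lemma 3.1 (i), with horizon = Delta - tau.

    m1 and m2 are the first and second moments of int_0^horizon P_u du
    divided by phi0 and phi0**2, respectively. Requires mu_P != 0.
    """
    m1 = (math.exp(mu_P * horizon) - 1.0) / mu_P
    k = 2.0 * mu_P + sigma_P**2
    m2 = 2.0 / (mu_P + sigma_P**2) * ((math.exp(k * horizon) - 1.0) / k - m1)
    alpha_1 = math.log(phi0) + 2.0 * math.log(m1) - 0.5 * math.log(m2)
    nu_1_sq = math.log(m2) - 2.0 * math.log(m1)
    return alpha_1, nu_1_sq

alpha_1, nu_1_sq = levy_lognormal_params(phi0=100.0, mu_P=0.03,
                                         sigma_P=0.35, horizon=1.0)
# Mean of the matched log-normal, exp(alpha + nu^2 / 2).
matched_mean = math.exp(alpha_1 + 0.5 * nu_1_sq)
```

For very small values of `horizon` the difference inside `m2` is prone to cancellation error, so a series expansion may be preferable there.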

Now, we are in the position to state the following theorem.

Theorem 3.2

Under the same assumptions as in Lemma 3.1, given the realized sample \(\left( \bar{{\mathbf {r}}},\bar{{\mathbf {a}}}\right) \), the log-likelihood function \(\log {\mathcal {L}_{\mathbf {R,A}}(\mu _P,\mu _S,\sigma _P,\sigma _S)}:\mathbb {R}^2 \times \mathbb {R}_+^2 \longrightarrow \mathbb {R}\) can be approximated by

$$\begin{aligned} \begin{aligned} \log {\mathcal {L}_{\mathbf {R,A}}(\mu _P,\mu _S,\sigma _P,\sigma _S)}&=\sum _{i=1}^n\left[ \log \left( \frac{1}{\sqrt{2\pi \sigma _S^2 a_i}}\right) -\frac{1}{2}\frac{\left( r_i-\left( \mu _S-\frac{\sigma _S^2}{2}\right) a_i\right) ^2}{\sigma _S^2 a_i}\right] \\&\quad +\sum _{i=1}^n\left[ \log \left( \frac{1}{a_i\nu _i\sqrt{2\pi }}\right) -\frac{\left( \log (a_i)-\alpha _i \right) ^2}{2\nu _i^2}\right] , \end{aligned} \end{aligned}$$
(3.2)

where upper case letters are used for random variables and lowercase for the corresponding realizations.

Proof

First, recall that the likelihood function of a parameter corresponds to the probability density function where random variables are replaced by their realizations and parameters are unknown. Then, by simply applying the logarithmic function to (3.1) we get

$$\begin{aligned} \log {f_{({\mathbf {R,A}})}({\mathbf {r}},{\mathbf {a}})} =&-\frac{1}{2}\sum _{i=1}^n\left[ \log \left( 2\pi \sigma _S^2 a_i \right) +\frac{\left( r_i-\left( \mu _S-\frac{\sigma _S^2}{2}\right) a_i\right) ^2}{\sigma _S^2 a_i}\right] \nonumber \\&\quad +\log \left( f_{A_1}(a_1)\right) +\sum _{i=2}^n\log \left( f_{A_i|A_{i-1}}(a_i)\right) . \end{aligned}$$
(3.3)

Replacing the unknown densities in (3.3) according to Lemma 3.1 gives the desired result. \(\square \)

Maximum likelihood estimates for the model can be obtained by maximizing the log-likelihood approximation in (3.2), i.e.,

$$\begin{aligned} (\widehat{\mu }_P,\widehat{\mu }_S,\widehat{\sigma }_P,\widehat{\sigma }_S)=\arg \max _{\begin{array}{c} \mu _P,\mu _S \\ \sigma _P,\sigma _S \end{array}}\log {\mathcal {L}_{\mathbf {R,A}}(\mu _P,\mu _S,\sigma _P,\sigma _S)}. \end{aligned}$$

In this case, the methodology is referred to as Quasi-Maximum Likelihood since the exact expression of the likelihood is not available; under suitable conditions, quasi-maximum likelihood estimates are asymptotically equivalent to the Maximum Likelihood estimates, see e.g., (White 1982; Gourieroux et al. 1984). We also performed a simulation study to assess the finite sample behavior of the estimates which is summed up in “Appendix C”.
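One feature of (3.2) that can be exploited numerically is that the first sum depends only on \((\mu _S,\sigma _S)\), while the second depends on \((\mu _P,\sigma _P)\) through \(\alpha _i\) and \(\nu _i\). For fixed \(\tau \), the first sum is a Gaussian log-likelihood with mean and variance both proportional to \(a_i\), and setting its derivatives to zero gives an explicit maximizer. The sketch below implements this observation; it is not necessarily how the estimates in the paper were computed, and the function names and the toy sample are hypothetical.

```python
import math

def returns_loglik(r, a, mu_S, sigma_S):
    """First sum in (3.2): log-density of the returns r given attention a."""
    m = mu_S - 0.5 * sigma_S**2
    v = sigma_S**2
    return sum(-0.5 * math.log(2.0 * math.pi * v * ai)
               - 0.5 * (ri - m * ai)**2 / (v * ai)
               for ri, ai in zip(r, a))

def returns_qml(r, a):
    """Maximizer of returns_loglik over (mu_S, sigma_S), in closed form.

    With m = mu_S - sigma_S^2 / 2 and v = sigma_S^2, the first-order
    conditions give m = sum(r) / sum(a) and
    v = mean((r_i - m a_i)^2 / a_i).
    """
    n = len(r)
    m_hat = sum(r) / sum(a)
    v_hat = sum((ri - m_hat * ai)**2 / ai for ri, ai in zip(r, a)) / n
    return m_hat + 0.5 * v_hat, math.sqrt(v_hat)

# Toy sample (made-up numbers, for illustration only).
r = [0.02, -0.01, 0.03, 0.005]
a = [1.0, 2.0, 1.5, 0.8]
mu_S_hat, sigma_S_hat = returns_qml(r, a)
```

The remaining parameters \((\mu _P,\sigma _P)\) would still have to be obtained from the second sum in (3.2), for instance by numerical optimization.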

It is worth stressing that the above estimation method does not require the process P to be observed, as long as \(X_\tau ^\tau \) and \(A_i\), \(i \ge 1\), are known (note that \(A_i\) is the integral of P over the time interval \([(i-1)\Delta -\tau ,i\Delta -\tau ]\)).

3.2 Estimation of the delay parameter

The delay parameter \(\tau \) directly affects the definition of the discrete process \(A_i\). Hence, in order to proceed with its estimation we need to observe the process P at a finer observation step \(\delta \) with respect to the log-returns. In what follows we set \(\Delta =\delta r\) and we adopt a two step estimation procedure known as Profile Likelihood in order to estimate the delay. We briefly describe the Profile Likelihood approach to estimation and its application in our specific case; interested readers are referred to Davison (2003), Pawitan (2001) for details on the profile likelihood. The basic idea of this approach is to split the parameter vector which has to be estimated, say \(\theta \), in two sub-vectors, one representing the parameter of interest and the other the so-called nuisance parameter, i.e., \(\theta =(\beta ,\lambda )\); to estimate \(\beta \) and \(\lambda \) jointly we should maximize at once the likelihood, i.e.,

$$\begin{aligned} \max _{\begin{array}{c} \beta ,\lambda \end{array}}\log {\mathcal {L}\left( \beta ,\lambda \right) }. \end{aligned}$$

When this is not feasible and provided the likelihood computed with respect to the nuisance parameter vector \(\lambda \) is available and easy to maximize, we can apply a two step procedure by maximizing, for each \(\beta \) in its parametric space,

$$\begin{aligned} {\mathcal {L}}_p(\beta )=\max _{\begin{array}{c} \lambda \end{array}}{\mathcal {L}}(\beta ,\lambda )={\mathcal {L}}(\beta ,\widehat{\lambda }_\beta ),\,\,\, \end{aligned}$$

where p is the length of parameter \(\beta \) and \(\widehat{\lambda }_\beta \) is the maximum likelihood estimate of \(\lambda \) for fixed \(\beta \). Then, the best estimate for \(\beta \) is obtained as

$$\begin{aligned} \widehat{\beta }=\arg \max _{\begin{array}{c} \beta \end{array}}\log {{\mathcal {L}}_p(\beta )}. \end{aligned}$$

Classical confidence intervals cannot be defined in this setting; however, it is possible to obtain a confidence region for \(\beta \) using the likelihood ratio statistic (see Davison 2003), defined as

$$\begin{aligned} W_p(\beta _0)=2\left\{ {\mathcal {L}}(\widehat{\beta },\widehat{\lambda }) - {\mathcal {L}}(\beta _0,\widehat{\lambda }_{\beta _0}) \right\} , \end{aligned}$$

where

$$\begin{aligned} W_p(\beta _0) \xrightarrow { \;D\; } \chi _p^2 \end{aligned}$$

and \(\beta _0\) is an assigned value for \(\beta \). These results imply that the confidence region for \(\beta \) is the set

$$\begin{aligned} \left\{ \beta : {\mathcal {L}}_p(\beta )\ge {\mathcal {L}}_p(\widehat{\beta })-\frac{1}{2} c_p(1-2\alpha ) \right\} , \end{aligned}$$
(3.4)

where \(c_p(\alpha )\) is the \(\alpha \)-quantile of the \(\chi _p^2\) distribution.

In our exercise we split \(\theta :=(\mu _P,\mu _S,\sigma _P,\sigma _S,\tau )\) into \(\theta =(\tau ,\lambda )\) where \(\tau \) is the parameter on which we are focusing and \(\lambda =(\mu _P,\mu _S,\sigma _P,\sigma _S)\) is the nuisance parameter vector. The Profile Likelihood approach is feasible in our case since a closed approximating expression for the likelihood with respect to the nuisance parameter is indeed available. The parametric space for \(\beta :=\tau \) is the interval [0, L] in this case but, for practical purposes, \(\tau \) is chosen on a grid, i.e., \(\tau \in \lbrace \tau _0,\tau _1,\tau _2,\ldots ,\tau _k\rbrace \); the maximization of the likelihood \(\log {\mathcal {L}_{\mathbf {R,A}}({\mathbf {r}},{\mathbf {a}})}\) is then performed with respect to \(\lambda \) for each value \(\tau _j\) in the grid, obtaining \({\mathcal {L}}_p(\tau _j)\) for \(j=0,1,\dots ,k\). An estimate for \(\tau \) is then obtained as \(\widehat{\tau }=\arg \max _j {\mathcal {L}}_p(\tau _j)\). Finally we get \(\widehat{\theta }=\left( \widehat{\tau },\widehat{\lambda }_{\widehat{\tau }} \right) \). Of course, the estimation error decreases as the grid is refined, provided that the grid adequately spans the parametric set for \(\tau \).
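The grid search and the confidence region (3.4) can be sketched as follows for a scalar parameter of interest (\(p=1\)). The sketch assumes that the values \({\mathcal {L}}_p(\tau _j)\) have already been computed and that they are on the log scale, which is the scale on which the \(\chi ^2\) approximation is usually stated; `level` plays the role of \(1-2\alpha \) in (3.4). The function names and the numbers in the usage example are made up for illustration.

```python
import math

def chi2_1_quantile(prob):
    """Quantile of the chi-square law with 1 degree of freedom.

    Found by bisection on its CDF, which equals erf(sqrt(x / 2)).
    """
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.erf(math.sqrt(mid / 2.0)) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def profile_estimate(profile_loglik, level=0.95):
    """Return (tau_hat, confidence region) from a dict tau_j -> profile log-lik."""
    tau_hat = max(profile_loglik, key=profile_loglik.get)
    cutoff = profile_loglik[tau_hat] - 0.5 * chi2_1_quantile(level)
    region = sorted(t for t, ll in profile_loglik.items() if ll >= cutoff)
    return tau_hat, region

# Hypothetical profile log-likelihood values on a grid of delays.
profile = {0: -9.0, 1: -5.2, 2: -4.0, 3: -6.5, 4: -8.0}
tau_hat, region = profile_estimate(profile)
```

By construction, the maximizer always belongs to the region, and a flatter profile produces a wider region.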

4 Model fitting on historical data

In order to fit the model described in (2.1)–(2.2) on real data we need samples for both the Bitcoin price and the attention indicator. Recently, in Da et al. (2011) the authors suggest a new and direct measure of investor attention using cumulative internet search frequency. This measure is particularly consistent with our approach since Bitcoin is an internet-based digital currency and internet users commonly collect information through a search engine such as Google. Besides, “the search volume is likely to be representative of the internet search behavior of the general population and more critically, search is a revealed attention measure: if you search for a stock in Google, you are undoubtedly paying attention to it. Therefore, aggregate search frequency in Google is a direct and unambiguous measure of attention”, quoting (Da et al. 2011). The authors in Da et al. (2011) also find strong evidence that the search volume index (SVI), provided by Google, captures the attention of individual/retail investors, i.e., of uninformed investors, that we named followers in the Introduction.

Further, the number and volume of transactions, which are examples of traditional measures of investor attention, as well as the SVI volume of Google searches and Wikipedia requests, have been shown to affect Bitcoin prices in Kristoufek (2013, 2015).

Hence, once the consistency of the above measures of attention with the dynamics assumed in (2.2) has been investigated, we fit the model to both the attention factor and Bitcoin price data. Daily observations for Bitcoin prices, volume and number of transactions are obtained through the Web site http://blockchain.info, which provides an average price among the main exchanges trading Bitcoin and the total exchanged volume. Daily data for Wikipedia requests on the term “bitcoin” are obtained through the Web site http://tools.wmflabs.org/pageviews. Google computes the search volume index (SVI) for a search term as the number of searches for that term scaled by its time series average and makes this measure available through the Web site Google-trends http://www.google.com/trends, from which we obtained weekly data for the number of Google searches on the term “bitcoin” (daily data are not available at the time of writing).

4.1 Proxies for attention

The univariate process P is a geometric Brownian motion; it is well known that the corresponding discrete process of logarithmic changes, with time step \(\delta \), is a sequence of independent and identically distributed normal random variables with mean \(\left( \mu _P-\frac{\sigma _P^2}{2}\right) \delta \) and variance \(\sigma _P^2 \delta \). Simple tests, such as the augmented Dickey–Fuller test, see (Tsay 2005), and the one-sample Kolmogorov-Smirnov test, see (Massey 1951), are applied in what follows to investigate such property on candidate proxies for the process P.
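As an illustration of the normality check, the sketch below computes the one-sample Kolmogorov-Smirnov distance between the standardized log-changes of a candidate proxy and the standard normal distribution. It is only indicative: standardizing with the sample mean and standard deviation is an assumption of this sketch (the text does not specify how the reference distribution is fixed), and with estimated parameters the textbook KS critical values are conservative. The function name and the synthetic series are hypothetical, and the augmented Dickey–Fuller test is not covered here.

```python
import math
import random

def ks_statistic_log_changes(series):
    """KS distance between standardized log-changes and the N(0, 1) CDF."""
    x = [math.log(b / a) for a, b in zip(series, series[1:])]
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((xi - mean)**2 for xi in x) / (n - 1))
    z = sorted((xi - mean) / std for xi in x)
    d = 0.0
    for i, zi in enumerate(z):
        cdf = 0.5 * (1.0 + math.erf(zi / math.sqrt(2.0)))
        # Compare the model CDF with the empirical CDF just after and
        # just before the jump at zi.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Synthetic geometric Brownian motion sample, for which the distance
# should be small.
rng = random.Random(1)
series = [100.0]
for _ in range(500):
    series.append(series[-1] * math.exp(0.001 + 0.05 * rng.gauss(0.0, 1.0)))
d = ks_statistic_log_changes(series)
```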

We consider the number and the volume of transactions as possible traditional attention measures and the number of Google searches and Wikipedia requests as examples of web-based attention indicators. The first three series were investigated from 01/01/2015 to 31/03/2017 while Wikipedia requests are considered from 01/07/2015 to 31/03/2017.

Non-stationarity is rejected for all proxies, whereas log-normality is not rejected only for the trading volume and the SVI index. For this reason, we use these two proxies for our empirical estimation.

4.2 Estimation results

According to the outcomes of the previous subsection, we consider the daily time series of the volume of transactions and the weekly time series of the Google searches from 01/01/2015 to 31/03/2017 as suitable proxies for the process P. Hence, we fit the model described in (2.1) and (2.2) by applying the procedure described in Sect. 3, which we refer to as Profile Quasi-Maximum Likelihood (PQML). Bitcoin prices are considered from 15/01/2015 in order to allow for a time delay of at most two weeks with respect to the attention measures. Note that Google-trends provides a scaled time series for the number of searches, so that the maximum value is 100; in order to compare outcomes we do the same for the trading volume time series. In what follows we assume that \(\Delta \) = 1 week is the observation step for Bitcoin log-returns.

4.2.1 Attention measured by volume

Given daily observations \( \lbrace P_i \rbrace _i\) of the volume of transactions we are able to compute the cumulative weekly attention \(\lbrace A_i \rbrace _i\); for \(\tau =0\), \(A_i\) is simply the mean volume during the preceding week multiplied by \(\Delta \), i.e.,

$$\begin{aligned} A_i= & {} \int _{\left( i-1\right) \Delta }^{i\Delta }P_{u} \mathrm {d}u =\sum _{j=1}^{7}\int _{\left( i-1\right) \Delta +\left( j-1\right) \delta }^{\left( i-1\right) \Delta +j\delta } P_{u} \mathrm {d}u =\sum _{j=1}^{7} P_{\left( i-1\right) \Delta +j\delta } \delta \\= & {} \frac{\sum _{j=1}^{7} P_{\left( i-1\right) \Delta +j\delta }}{7} \Delta . \end{aligned}$$

The generalization to \(\tau >0\) is straightforward as soon as we assume \(\tau =r\delta \) for some positive integer r; in which case \(A_i\) would be the mean volume of the 7 days preceding time \(i\Delta -r\delta \).
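The aggregation just described, including the shift by a delay of r days, can be sketched as follows. The data layout is an assumption of this sketch: `daily_P[history_days + k]` is taken to be the attention observed on day k, where day 0 is the start of the first return week, so that `history_days` observations before day 0 are available to accommodate the delay. The function name and the toy input are hypothetical.

```python
def weekly_attention(daily_P, history_days, n_weeks, delay_days=0,
                     days_per_week=7, Delta=1.0):
    """Return [A_1, ..., A_n]: mean attention over the 7 days ending at
    day i*7 - delay_days, multiplied by the week length Delta."""
    if delay_days > history_days:
        raise ValueError("not enough history for this delay")
    A = []
    for i in range(1, n_weeks + 1):
        # Index of the last day entering A_i, shifted back by the delay.
        last = history_days + i * days_per_week - delay_days
        window = daily_P[last - days_per_week + 1:last + 1]
        A.append(sum(window) / days_per_week * Delta)
    return A

# Toy input: attention equal to the position in the list, 5 days of history.
daily_P = list(range(30))
A_no_delay = weekly_attention(daily_P, history_days=5, n_weeks=2)
A_delay_5 = weekly_attention(daily_P, history_days=5, n_weeks=2, delay_days=5)
```

Running the function once for each delay on the grid yields the inputs \(A_i\) needed to evaluate the profile likelihood at each candidate \(\tau \).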

By applying the PQML method we obtain \(\tau =5\) days; the estimated value of other parameters is summed up in Table 1.

Table 1 Parameter fit with \(\tau =5\)

In order to assess the significance of the parameters, the usual t-statistic is also computed and shown in the table, together with the p-value of the t test. It is clear from Table 1 that \(\mu _P\) is not statistically significant and that \(\mu _S\) is weakly significant. Finally, we evaluate the confidence region for \(\tau \) using (3.4) and \(\tau \in \lbrace 0,1,2,\ldots ,10 \rbrace \) days. We find that the confidence region is given by \({\mathcal {T}}=\lbrace 2,3,4\rbrace \) days. Estimates of the other parameters are indeed very similar for any \(\tau \in {\mathcal {T}}\) and analogous comments apply. We stress that the value 0 is not within the confidence region, meaning that the introduction of the delay parameter is relevant in the model specification.

4.2.2 Attention measured by Google searches

Assume now that Google searches are representative of attention on Bitcoin. Since we have weekly data, we should aggregate both Google searches and Bitcoin returns to a coarser observation step; however, this would reduce the time series length dramatically and corresponding estimates might be unreliable. Hence, we assume that the available observations correspond to the cumulative attention time series \(A_i\) and we assume there exists a nonnegative integer c such that \(\tau =c\Delta \); in this case, \(\tau \) is on a weekly scale and not on a daily scale as in the previous case.

By applying the PQML procedure we obtain \(\tau =1\) week; the estimated values of the other parameters are displayed in Table 2.

Table 2 Parameter fit with \(\tau =1\)

The t-statistic and its p-value are also reported for all parameters, as in Sect. 4.2.1. Again, \(\mu _P\) is not statistically significant and \(\mu _S\) is weakly significant. Note that the confidence region for \(\tau \) only includes \(\tau =1\) week.

5 Risk-neutral evaluation of European-type contingent claims

In this section we discuss arbitrage-free conditions and pricing for European-style derivatives. Precisely, we derive a closed-form pricing formula for European-style derivatives and, once parameters are estimated via the PQML method described above, we compare model prices with real quotations reported on the trading platform www.deribit.com, where standard and binary options on Bitcoin are issued regularly. Though Bitcoin is traded on different exchanges at different prices, the derivative contracts on the platform assume that the underlying price is given by the value of the Bitcoin Index available at www.blockchain.info, computed as a weighted average of prices on major exchanges. Here, we use the same approach and consider a single price for one Bitcoin.

Let us fix a finite time horizon \(T>0\) and assume the existence of a riskless asset, say the money market account, whose value process \(B=\{B_t,\ t \in [0,T]\}\) is given by

$$\begin{aligned} B_t=e^{\int _0^t r(s)\mathrm {d}s},\quad t \in [0,T], \end{aligned}$$

where \(r:[0,T] \rightarrow \mathbb {R}\) is a bounded, deterministic function representing the instantaneous risk-free interest rate. To exclude arbitrage opportunities, we need to check that the set of all equivalent martingale measures for the Bitcoin price process S is non-empty. More precisely, it will contain more than a single element, since P does not represent the price of any tradable asset, and therefore the underlying market model is incomplete, as shown in Lemma B.3 in “Appendix B”.

Simple examples of candidate equivalent martingale measures are obtained under the assumption of a constant price \(\gamma \) of attention risk. Specifically these are defined as probability measures \(\mathbf {Q}^\gamma \) with the following density

$$\begin{aligned} \frac{\mathrm {d}\mathbf {Q}^\gamma }{\mathrm {d}\mathbf {P}}\bigg |_{\mathcal {F}_T}=:L_T^{\mathbf {Q}^\gamma }, \quad \mathbf {P}-\text{ a.s. }, \end{aligned}$$

where \(L_T^{\mathbf {Q}^\gamma }\) is the terminal value of the \((\mathbb {F},\mathbf {P})\)-martingale \(L^{\mathbf {Q}^\gamma }=\{L_t^{\mathbf {Q}^\gamma },\ t \in [0,T]\}\) given by

$$\begin{aligned} L_t^{\mathbf {Q}^\gamma } :=\mathcal {E}\left( -\int _0^\cdot \frac{\mu _S P_{s-\tau }-r(s)}{\sigma _S\sqrt{P_{s-\tau }}} \mathrm {d}W_s - \int _0^\cdot \gamma \mathrm {d}Z_s\right) _t, \quad t \in [0,T], \end{aligned}$$

for a constant parameter \(\gamma \).

The dynamics of the (non-discounted) Bitcoin price and of the attention factor under the martingale measure \( \mathbf {Q}^\gamma \) are given by

$$\begin{aligned} \left\{ \begin{array}{ll} \mathrm {d}S_t = r(t) S_t\mathrm {d}t + \sigma _S \sqrt{P_{t-\tau }} S_t\mathrm {d}{\widehat{W}}_t, \quad S_0=s_0 \in \mathbb {R}_+, \\ \mathrm {d}P_t =(\mu _P-\gamma \,\sigma _P) P_t\mathrm {d}t+\sigma _P P_t\mathrm {d}{\widehat{Z}}_t, \quad P_t = \phi (t),\ t \in [-L,0], \\ \end{array} \right. \end{aligned}$$

where \({\widehat{W}}=\{{\widehat{W}}_t,\ t \in [0,T]\}\) and \({\widehat{Z}}=\{{\widehat{Z}}_t,\ t \in [0,T]\}\) are \((\mathbb {F},\mathbf {Q}^\gamma )\)-Brownian motions defined, respectively, by

$$\begin{aligned} {\widehat{W}}_t&:= W_t + \int _0^t\frac{\mu _S P_{s-\tau }-r(s)}{\sigma _S\sqrt{P_{s-\tau }}} \mathrm {d}s,\quad t \in [0,T],\\ {\widehat{Z}}_t&:= Z_t+\gamma \, t, \quad t \in [0,T]. \end{aligned}$$

A very special case within this family is the so-called minimal martingale measure (see e.g., Föllmer and Schweizer 1991, 2010), obtained by setting \(\gamma =0\), and denoted by \({\widehat{\mathbf {P}}}\). Note that, in this case, neither the Brownian motion driving the attention factor nor its dynamics are affected by the change of measure, i.e., \({\widehat{Z}}_t =Z_t\). The interested reader may find further details in “Appendix B”.
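To make the risk-neutral dynamics concrete, the following sketch simulates the terminal price under \(\mathbf {Q}^\gamma \). It is an illustration under stated assumptions, not the procedure used in the paper: the interest rate is constant, the initial segment is flat (\(\phi \equiv P_0\)), the Brownian motions \({\widehat{W}}\) and \({\widehat{Z}}\) are taken to be independent, time is measured in years, and the name `simulate_paths` is hypothetical. The attention factor is advanced with the exact log-normal step, while the integrals of the delayed attention are approximated on a uniform grid.

```python
import numpy as np

def simulate_paths(s0, p0, mu_p, sigma_p, sigma_s, gamma, r, tau, T,
                   n_steps=100, n_paths=10000, seed=0):
    """Simulate S_T and X = int_0^T P_{u-tau} du under Q^gamma.

    Assumptions (not from the paper): constant rate r, P_t = p0 for t <= 0,
    independent Brownian motions for price and attention.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    lag = int(round(tau / dt))                      # delay expressed in grid steps

    # Attention factor: exact GBM step with risk-neutral drift mu_p - gamma * sigma_p
    dz = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    drift = (mu_p - gamma * sigma_p - 0.5 * sigma_p ** 2) * dt
    log_p = np.log(p0) + np.cumsum(drift + sigma_p * dz, axis=1)
    p = np.hstack([np.full((n_paths, 1), p0), np.exp(log_p)])   # P at t_0, ..., t_n

    # Delayed attention P_{t_k - tau} at the left endpoints k = 0, ..., n-1
    idx = np.arange(n_steps) - lag
    p_delayed = np.where(idx[None, :] >= 0, p[:, np.clip(idx, 0, None)], p0)

    dw = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    x = (p_delayed * dt).sum(axis=1)                # X = int_0^T P_{u-tau} du
    y = (np.sqrt(p_delayed) * dw).sum(axis=1)       # Y = int_0^T sqrt(P_{u-tau}) dW_u
    s_T = s0 * np.exp(r * T - 0.5 * sigma_s ** 2 * x + sigma_s * y)
    return s_T, x
```

A quick sanity check on such a simulation is that the discounted terminal price averages to the initial price, as required for a martingale measure.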

In what follows, let us consider a European-type contingent claim, expiring at time T and traded on the underlying market. We assume that its final payoff is described by a \({\widetilde{\mathcal {F}}}_T\)-measurable random variable \(H=\varphi (S_T)\), with \(\varphi : \mathbb {R}\rightarrow \mathbb {R}\) being a Borel-measurable function such that H is integrable under \(\mathbf {Q}^\gamma \).

Recall that \(X_{t,T}^\tau =X_T^\tau -X_t^\tau \), for each \(t \in [0,T)\), refers to the variation of the process \(X^\tau \) defined in (2.3), over the interval [t, T]. Moreover, denote by \(\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \cdot \Big |{\widetilde{\mathcal {F}}}_t\right] \) the conditional expectation with respect to \({\widetilde{\mathcal {F}}}_t\) under the probability measure \(\mathbf {Q}^\gamma \); analogous notation is used for other conditioning \(\sigma \)-algebras and measures.

Theorem 5.1

Let \(H=\varphi (S_T)\) be the payoff of a European-type contingent claim with date of maturity T. Then, the risk-neutral price \(\Phi _t(H)\) at time t of H is given by

$$\begin{aligned} \Phi _t(H) = \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \psi (t,S_t,X_{t,T}^\tau )\Bigg | S_t\right] , \quad t \in [0,T), \end{aligned}$$

where \(\psi : [0,T) \times \mathbb {R}_+ \times \mathbb {R}_+ \longrightarrow \mathbb {R}\) is a Borel-measurable function such that

$$\begin{aligned} \psi (t,S_t,X_{t,T}^\tau )=B_t\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \frac{1}{B_T}G\left( t,S_t,X_{t,T}^\tau ,Y_{t,T}\right) \Bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] , \end{aligned}$$
(5.1)

for a suitable function G depending on the contract such that \(G\left( t,S_t,X_{t,T}^\tau ,Y_{t,T}\right) \) is \(\mathbf {Q}^\gamma \)-integrable.

Proof

For the sake of simplicity suppose that \(\tau <T\) and set \(Y_{t,T}:= \int _t^T\sqrt{P_{u-\tau }}\mathrm {d}{\widehat{W}}_u\), for each \(t \in [0,T)\). Then, the risk-neutral price \(\Phi _t(H)\) at time t of a European-type contingent claim with payoff \(H=\varphi (S_T)\) is given by

$$\begin{aligned} \Phi _t(H)&= B_t \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \frac{\varphi (S_T)}{B_T}\bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= B_t \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \frac{\varphi \left( S_te^{\int _t^T r(u)\mathrm {d}u-\frac{\sigma _S^2}{2}X_{t,T}^\tau + \sigma _S Y_{t,T}}\right) }{B_T}\left| \phantom {\frac{\varphi \left( S_te^{\int _t^T r(u)\mathrm {d}u-\frac{\sigma _S^2}{2}X_{t,T}^\tau + \sigma _S Y_{t,T}}\right) }{B_T}}\right. \mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] \left| \phantom {\frac{\varphi \left( S_te^{\int _t^T r(u)\mathrm {d}u-\frac{\sigma _S^2}{2}X_{t,T}^\tau + \sigma _S Y_{t,T}}\right) }{B_T}}\right. {\widetilde{\mathcal {F}}}_t\right] , \end{aligned}$$
(5.2)

where \(\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \cdot \Big |{\widetilde{\mathcal {F}}}_t\right] \) denotes the conditional expectation with respect to \({\widetilde{\mathcal {F}}}_t\) under the equivalent martingale measure \( \mathbf {Q}^\gamma \). More generally, (5.2) can be written as

$$\begin{aligned} \Phi _t(H) = B_t \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \frac{G(t,S_t,X_{t,T}^\tau , Y_{t,T})}{B_T}\Bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] \Bigg |{\widetilde{\mathcal {F}}}_t\right] , \end{aligned}$$
(5.3)

for a suitable function G depending on the contract function \(\varphi \). We can apply the same arguments used in point (ii) of the proof of Theorem B.2, to get that, for each \(t \in [0,T)\), the random variable \(Y_{t,T}\) conditioned on \(\mathcal {F}_{T-\tau }^P\) is Normally distributed with mean 0 and variance \(X_{t,T}^\tau \). Then, we can write (in law) that \(Y_{t,T}= \sqrt{X_{t,T}^\tau } \epsilon \), where \(\epsilon \) is a standard Normal random variable, and this allows us to find a function \(\psi \) such that (5.1) holds, which means that the conditional expectation with respect to \(\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\) in (5.3) only depends on \(S_t\) and \(X_{t,T}^\tau \), for every \(t \in [0,T)\). Consequently, the risk-neutral price \(\Phi _t(H)\) can be written as

$$\begin{aligned} \Phi _t(H)&= \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \psi (t,S_t,X_{t,T}^\tau )\bigg |{\widetilde{\mathcal {F}}}_t\right] = \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \psi (t,S_t,X_{t,T}^\tau )\Bigg | S_t\right] , \end{aligned}$$
(5.4)

where the last equality holds since S is \({\widetilde{\mathbb {F}}}\)-adapted and \(X_{t,T}^\tau \) is independent of \({\widetilde{\mathcal {F}}}_t\), for each \(t \in [0,T)\), see e.g., (Pascucci 2011, Lemma A.108). More precisely, we have

$$\begin{aligned} \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \psi (t,S_t,X_{t,T}^\tau ) \bigg |{\widetilde{\mathcal {F}}}_t\right] =\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \psi (t,S_t,X_{t,T}^\tau ) \bigg | S_t\right] =g(S_t), \end{aligned}$$

where

$$\begin{aligned} g(s)=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \psi (t,s,X_{t,T}^\tau ) \bigg | S_t=s\right] ,\quad s \in \mathbb {R}_+. \end{aligned}$$

\(\square \)

Since the martingale measure is fixed, the risk-neutral price obtained above agrees with the arbitrage-free price for those payoffs which can be replicated by investing on the underlying market. Forward prices at settlement can also be obtained, by imposing an initial zero cash-flow between counterparts in the pricing formula.

Remark 5.2

It is worth remarking that \(\psi (t,S_t,x)\), with \(x \in \mathbb {R}_+\), represents the risk-neutral price at time \(t \in [0,T)\) of the contract \(H=\varphi (S_T)\) in a Black & Scholes framework, where the constant volatility parameter \(\sigma ^{BS}\) is defined by

$$\begin{aligned} \sigma ^{BS}:=\sigma _S\sqrt{\frac{x}{T-t}}. \end{aligned}$$

This is proved explicitly in Corollary 5.4 for the special case of a Plain Vanilla European Call option.
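The content of Remark 5.2 can be checked numerically. The sketch below, which assumes a constant interest rate r and uses hypothetical function names, evaluates \(\psi (t,s,x)\) for a generic payoff by integrating over the standard Normal variable \(\epsilon \) in \(Y_{t,T}=\sqrt{x}\,\epsilon \) on a fine uniform grid, and compares the result for a Call payoff with the standard Black & Scholes price computed with volatility \(\sigma ^{BS}\).

```python
import math
import numpy as np

def sigma_bs(sigma_s, x, ttm):
    """Equivalent constant Black & Scholes volatility of Remark 5.2 (ttm = T - t)."""
    return sigma_s * math.sqrt(x / ttm)

def psi(payoff, s, x, r, ttm, sigma_s, n=20001, z_max=10.0):
    """Price of payoff(S_T) given S_t = s and integrated attention x.

    Uses S_T = s * exp(r*ttm - sigma_s^2 * x / 2 + sigma_s * sqrt(x) * eps),
    eps ~ N(0, 1), and a Riemann sum over eps on [-z_max, z_max].
    Constant rate r is an assumption made for this sketch.
    """
    z = np.linspace(-z_max, z_max, n)
    dz = z[1] - z[0]
    s_T = s * np.exp(r * ttm - 0.5 * sigma_s ** 2 * x + sigma_s * math.sqrt(x) * z)
    density = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return math.exp(-r * ttm) * float(np.sum(payoff(s_T) * density) * dz)

def black_scholes_call(s, k, r, ttm, vol):
    """Standard Black & Scholes Call price with constant volatility vol."""
    cdf = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    d1 = (math.log(s / k) + (r + 0.5 * vol ** 2) * ttm) / (vol * math.sqrt(ttm))
    d2 = d1 - vol * math.sqrt(ttm)
    return s * cdf(d1) - k * math.exp(-r * ttm) * cdf(d2)
```

For instance, with \(\sigma _S=0.02\), \(x=100\) and one year to maturity, the equivalent volatility is 0.2 and the two prices agree up to discretization error.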

Remark 5.3

A pricing formula analogous to (5.4) is conjectured in Hull and White (1987) for a special example of the model suggested here (corresponding to \(\tau =0\) and \(\sigma _S=1\)). However, the authors assumed from the very beginning to be within a risk-neutral setting, without defining the dynamics under the physical measure and without the need for a proof of the existence of any equivalent martingale measure. Theorem 5.1 extends their results to the more general case and gives a rigorous proof.

5.1 A Black & Scholes-type option pricing formula

We consider here the special case of a European Call option with strike price K and maturity T. Define the function \(C^{BS}:[0,T) \times \mathbb {R}_+ \times \mathbb {R}_+ \longrightarrow \mathbb {R}\) as follows

$$\begin{aligned} C^{BS}(t,s,x):=s\mathcal N(d_1(t,s,x)) - Ke^{-\int _t^T r(u) \mathrm {d}u}\mathcal N(d_2(t,s,x)), \end{aligned}$$
(5.5)

where

$$\begin{aligned} d_1(t,s,x)=\frac{\log \left( \frac{s}{K}\right) + \int _t^T r(u)\mathrm {d}u + \frac{\sigma _S^2}{2}x}{\sigma _S \sqrt{x}} \end{aligned}$$

and \(d_2(t,s,x)=d_1(t,s,x)-\sigma _S \sqrt{x}\), or more explicitly

$$\begin{aligned} d_2(t,s,x)=\frac{\log \left( \frac{s}{K}\right) + \int _t^T r(u)\mathrm {d}u - \frac{\sigma _S^2}{2}x}{\sigma _S \sqrt{x}}. \end{aligned}$$

Here, \(\mathcal N\) stands for the standard Gaussian cumulative distribution function

$$\begin{aligned} \mathcal N(y)=\frac{1}{\sqrt{2\pi }} \int _{-\infty }^{y} e^{-\frac{z^2}{2}}\mathrm {d}z, \quad \forall \ y \in \mathbb {R}. \end{aligned}$$
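As a minimal sketch, the function \(C^{BS}\) can be coded directly in terms of the integrated attention x. The code assumes a constant interest rate r and discounts over the remaining life \(T-t\) of the option, as in the derivation of Corollary 5.4; the names `norm_cdf` and `c_bs` are illustrative.

```python
import math

def norm_cdf(y):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def c_bs(s, k, r, ttm, x, sigma_s):
    """C^BS(t, s, x) for a constant rate r, with ttm = T - t and x > 0.

    x plays the role of the integrated attention X_{t,T}^tau, so that
    sigma_s * sqrt(x) is the total standard deviation of the log-return.
    """
    sd = sigma_s * math.sqrt(x)
    d1 = (math.log(s / k) + r * ttm + 0.5 * sigma_s ** 2 * x) / sd
    d2 = d1 - sd
    return s * norm_cdf(d1) - k * math.exp(-r * ttm) * norm_cdf(d2)
```

Setting \(x=T-t\) recovers the usual Black & Scholes price with volatility \(\sigma _S\), which gives a convenient check of an implementation.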

Corollary 5.4

The risk-neutral price \(C_t\) at time t of a European Call option written on the Bitcoin with price S expiring in T and with strike price K is given by the formula

$$\begin{aligned} C_t=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ C^{BS}(t,S_t,X_{t,T}^\tau )\bigg | S_t\right] , \quad t \in [0,T), \end{aligned}$$
(5.6)

where the function \(C^{BS}\) is defined in (5.5).

Proof

As in the proof of Theorem 5.1, let us assume that \(\tau <T\). Under the martingale measure \(\mathbf {Q}^\gamma \), the risk-neutral price \(C_t\) at time \(t \in [0,T)\) of a European Call option written on the Bitcoin with price S expiring in T and with strike price K, is given by

$$\begin{aligned} C_t&= B_t \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \frac{\max \left( S_T-K,0\right) }{B_T}\bigg |{\widetilde{\mathcal {F}}}_t\right] \\&= B_t\mathbb {E}^{\mathbf {Q}^\gamma }\left[ {\widetilde{S}}_T{\mathbf {1}}_{\{S_T>K\}}\Big |{\widetilde{\mathcal {F}}}_t\right] -K e^{-\int _t^T r(u) \mathrm {d}u}\mathbb {E}^{\mathbf {Q}^\gamma }\left[ {\mathbf {1}}_{\{S_T>K\}}\Big |{\widetilde{\mathcal {F}}}_t\right] \\&= B_t J_1 - Ke^{-\int _t^T r(u) \mathrm {d}u}J_2, \end{aligned}$$

where we have set \(J_1:=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ {\widetilde{S}}_T{\mathbf {1}}_{\{S_T>K\}}\Big |{\widetilde{\mathcal {F}}}_t\right] \) and \(J_2:=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ {\mathbf {1}}_{\{S_T>K\}}\Big |{\widetilde{\mathcal {F}}}_t\right] \). Setting \(Y_{t,T}:= \int _t^T\sqrt{P_{u-\tau }}\mathrm {d}{\widehat{W}}_u\), for every \(t \in [0,T)\), the term \(J_2\) can be written as

$$\begin{aligned} J_2&= \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathbb {E}^{\mathbf {Q}^\gamma }\left[ {\mathbf {1}}_{\{S_T>K\}}\Big |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] \Big |{\widetilde{\mathcal {F}}}_t\right] \\&= \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathbf {Q}^\gamma \left( S_te^{\int _t^T r(u)\mathrm {d}u-\frac{\sigma _S^2}{2}X_{t,T}^\tau +\sigma _S Y_{t,T}}>K \bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] \\&= \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathbf {Q}^\gamma \left( -\frac{Y_{t,T}}{ \sqrt{X_{t,T}^\tau }} < \frac{\log \left( \frac{S_t}{K}\right) + \int _t^T r(u)\mathrm {d}u - \frac{\sigma _S^2}{2}X_{t,T}^\tau }{ \sigma _S \sqrt{X_{t,T}^\tau }} \bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right) \bigg |{\widetilde{\mathcal {F}}}_t\right] \\&= \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_2(t,S_t,X_{t,T}^\tau ) \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] , \end{aligned}$$

as for each \(t \in [0,T)\), the random variable \(\displaystyle -\frac{Y_{t,T}}{\sqrt{X_{t,T}^\tau }}\), given \(\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\), has a standard Gaussian law \(\mathcal N(0,1)\) under \(\mathbf {Q}^\gamma \). Concerning \(J_1\), let us introduce the auxiliary probability measure \({\bar{\mathbf {Q}}}\) on \((\Omega ,\mathcal {F}_T)\) defined as follows:

$$\begin{aligned} \frac{\mathrm {d}{\bar{\mathbf {Q}}}}{\mathrm {d}\mathbf {Q}^\gamma } := e^{-\frac{\sigma _S^2}{2} \int _0^T P_{u-\tau }\mathrm {d}u + \sigma _S \int _0^T \sqrt{P_{u-\tau }}\mathrm {d}{\widehat{W}}_u}, \quad \mathbf {Q}^\gamma -\text{ a.s. } \end{aligned}$$

By Girsanov’s Theorem, we get that the process \({\bar{W}}=\{{\bar{W}}_t,\ t \in [0,T]\}\), given by

$$\begin{aligned} {\bar{W}}_t := {\widehat{W}}_t - \sigma _S\int _0^t\sqrt{P_{u-\tau }}\mathrm {d}u, \quad t \in [0,T], \end{aligned}$$

is a standard \((\mathbb {F},{\bar{\mathbf {Q}}})\)-Brownian motion. In addition, using (B.16) in “Appendix B”, we obtain

$$\begin{aligned} {\widetilde{S}}_T = {\widetilde{S}}_t e^{\sigma _S\int _t^T\sqrt{P_{u-\tau }}\mathrm {d}{\bar{W}}_u + \frac{\sigma _S^2}{2}\int _t^T P_{u-\tau } \mathrm {d}u}, \end{aligned}$$

for every \(t \in [0,T]\). Since S is \({\widetilde{\mathbb {F}}}\)-adapted, by (B.16) and the Bayes formula on the change of probability measure for conditional expectation, for every \(t \in [0,T)\) we get

$$\begin{aligned} J_1&= \mathbb {E}^{\mathbf {Q}^\gamma }\left[ {\widetilde{S}}_T {\mathbf {1}}_{\{S_T>K\}}\Big |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= {\widetilde{S}}_t \frac{\mathbb {E}^{\mathbf {Q}^\gamma }\left[ e^{-\frac{\sigma _S^2}{2} X_{T}^\tau + \sigma _S Y_{0,T}}{\mathbf {1}}_{\{S_T>K\}}\bigg |{\widetilde{\mathcal {F}}}_t\right] }{e^{-\frac{\sigma _S^2}{2} X_{t}^\tau + \sigma _S Y_{0,t}}}\nonumber \\&= {\widetilde{S}}_t \mathbb {E}^{\bar{\mathbf {Q}}}\left[ {\mathbf {1}}_{\left\{ {\widetilde{S}}_T> K B_T^{-1}\right\} }\bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= {\widetilde{S}}_t \mathbb {E}^{\bar{\mathbf {Q}}}\left[ \mathbb {E}^{\bar{\mathbf {Q}}}\left[ {\mathbf {1}}_{\left\{ \sigma _S {\bar{Y}}_{t,T} > \log \left( \frac{K}{S_t}\right) - \int _t^T r(u)\mathrm {d}u - \frac{\sigma _S^2}{2}X_{t,T}^\tau \right\} }\bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] \bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= {\widetilde{S}}_t \mathbb {E}^{\bar{\mathbf {Q}}}\left[ {\bar{\mathbf {Q}}} \left( -\frac{{\bar{Y}}_{t,T}}{\sqrt{X_{t,T}^\tau }} < \frac{\log \left( \frac{S_t}{K}\right) + \int _t^T r(u)\mathrm {d}u + \frac{\sigma _S^2}{2}X_{t,T}^\tau }{\sigma _S \sqrt{X_{t,T}^\tau }} \bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right) \bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= {\widetilde{S}}_t\mathbb {E}^{\bar{\mathbf {Q}}}\left[ \mathcal N\left( d_1(t,S_t,X_{t,T}^\tau ) \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] , \end{aligned}$$
(5.7)

with

$$\begin{aligned} d_1(t,S_t,X_{t,T}^\tau ) = d_2(t,S_t,X_{t,T}^\tau ) + \sigma _S\sqrt{X_{t,T}^\tau }. \end{aligned}$$

In the above computations, analogously to before, we have set \({\bar{Y}}_{t,T}:= \int _t^T\sqrt{P_{u-\tau }}\mathrm {d}{\bar{W}}_u\), for each \(t \in [0,T)\). Consequently, we have that \({\bar{Y}}_{t,T}\) conditional on \(\mathcal {F}_{T-\tau }^P\), is a Normally distributed random variable with mean 0 and variance \(X_{t,T}^\tau \), for each \(t \in [0,T)\), since Z is not affected by the change of measure from \(\mathbf {Q}^\gamma \) to \({\bar{\mathbf {Q}}}\). Indeed, by the change of numéraire theorem, we have that the probability measure \({\bar{\mathbf {Q}}}\) turns out to be the minimal martingale measure corresponding to the choice of the Bitcoin price process as benchmark. Further, by applying again the Bayes formula on the change of probability measure for conditional expectation, we get

$$\begin{aligned} J_1&= {\widetilde{S}}_t\mathbb {E}^{\bar{\mathbf {Q}}}\left[ \mathcal N\left( d_1(t,S_t,X_{t,T}^\tau ) \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= {\widetilde{S}}_t \frac{\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal {N}\left( d_1(t,S_t,X_{t,T}^\tau ) \right) e^{-\frac{\sigma _S^2}{2} \int _0^T P_{u-\tau }\mathrm {d}u + \sigma _S \int _0^T \sqrt{P_{u-\tau }}\mathrm {d}{\widehat{W}}_u}\bigg |{\widetilde{\mathcal {F}}}_t\right] }{e^{-\frac{\sigma _S^2}{2} \int _0^t P_{u-\tau }\mathrm {d}u + \sigma _S \int _0^t \sqrt{P_{u-\tau }}\mathrm {d}{\widehat{W}}_u}}\nonumber \\&= {\widetilde{S}}_t \mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal {N}\left( d_1(t,S_t,X_{t,T}^\tau ) \right) e^{-\frac{\sigma _S^2}{2} X_{t,T}^\tau }\mathbb {E}^{\mathbf {Q}^\gamma }\left[ e^{\sigma _S Y_{t,T}} \bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] \bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&= {\widetilde{S}}_t\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_1(t,S_t,X_{t,T}^\tau ) \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] , \end{aligned}$$
(5.8)

since the conditional Gaussian distribution of \(Y_{t,T}\) gives

$$\begin{aligned} \mathbb {E}^{\mathbf {Q}^\gamma }\left[ e^{\sigma _S Y_{t,T}}\bigg |\mathcal {F}_t^W \vee \mathcal {F}_{T-\tau }^P\right] = e^{\frac{\sigma _S^2}{2} X_{t,T}^\tau }. \end{aligned}$$

Finally, combining the expression (5.8) for \(J_1\) with the one obtained above for \(J_2\), and recalling that \(B_t{\widetilde{S}}_t=S_t\), for every \(t \in [0,T)\) we obtain

$$\begin{aligned} C_t&= S_t\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_1(t,S_t,X_{t,T}^\tau ) \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] -K e^{-\int _t^T r(u) \mathrm {d}u}\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_2(t,S_t,X_{t,T}^\tau ) \right) \bigg |{\widetilde{\mathcal {F}}}_t\right] \nonumber \\&=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ C^{BS}(t,S_t,X_{t,T}^\tau )\bigg |{\widetilde{\mathcal {F}}}_t\right] = \mathbb {E}^{\mathbf {Q}^\gamma }\left[ C^{BS}(t,S_t,X_{t,T}^\tau )\bigg | S_t\right] , \end{aligned}$$

where the last equality follows again from (Pascucci 2011, Lemma A.108), since for each \(t \in [0,T)\), \(X_{t,T}^\tau \) is independent of \({\widetilde{\mathcal {F}}}_t\) and \(S_t\) is \({\widetilde{\mathcal {F}}}_t\)-measurable. \(\square \)

It is worth noticing that the option pricing formula (5.6) only depends on the distribution of \(X_{t,T}^\tau \), which is the same under both measures \(\mathbf {Q}^\gamma \) and \({\bar{\mathbf {Q}}}\). As observed in Remark 5.2, the function \(C^{BS}(t,S_t,x)\) in (5.5) corresponds to the Black & Scholes price at time \(t \in [0,T)\) of a European Call option written on S, with strike price K and maturity T, in a market where the volatility parameter is given by \(\sigma _S \sqrt{\frac{x}{T-t}}\). Then, for every \(t \in [0,T)\), the price (5.6) may be written as:

$$\begin{aligned} C_t=\int _0^{+\infty } C^{BS}(t,S_t,x) f_{X_{t,T}^\tau }(x) \mathrm {d}x, \end{aligned}$$

where \(f_{X_{t,T}^\tau }\) denotes the density function of \(X_{t,T}^\tau \), for each \(t \in [0,T)\), under measure \(\mathbf {Q}^\gamma \). The price at time t of a Plain Vanilla European Call option may also be written as a Black & Scholes style price:

$$\begin{aligned} C_t=S_t Q_1-K e^{-\int _t^T r(u) \mathrm {d}u} Q_2, \end{aligned}$$

where

$$\begin{aligned} Q_1:=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_1(t,S_t,X_{t,T}^\tau ) \right) \bigg |S_t\right]&= \int _0^{+\infty } \mathcal N\left( d_1(t,S_t,x) \right) f_{X_{t,T}^\tau }(x) \mathrm {d}x, \end{aligned}$$

and

$$\begin{aligned} Q_2:=\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_2(t,S_t,X_{t,T}^\tau ) \right) \bigg |S_t\right]&= \int _0^{+\infty } \mathcal N\left( d_2(t,S_t,x) \right) f_{X_{t,T}^\tau }(x) \mathrm {d}x. \end{aligned}$$

Similar formulas can be computed for other European-style derivatives, such as binary options, which are indeed quoted in Bitcoin markets. For the case of a Cash or Nothing Call, which is essentially a bet of A on the exercise event, the risk-neutral pricing formula is given by

$$\begin{aligned} C_t^{Bin}&=Ae^{-\int _t^T r(u) \mathrm {d}u}\mathbb {E}^{\mathbf {Q}^\gamma }\left[ \mathcal N\left( d_2(t,S_t,X_{t,T}^\tau ) \right) \bigg | S_t\right] \\&= Ae^{-\int _t^T r(u) \mathrm {d}u} \int _0^{+\infty } \mathcal N\left( d_2(t,S_t,x) \right) f_{X_{t,T}^\tau }(x) \mathrm {d}x,\quad t \in [0,T). \end{aligned}$$

To compute derivative prices by applying the above formulas, we need an explicit form for the distribution of \(X_{t,T}^\tau \). One possible approach is to approximate this distribution following Levy (1992), where the author suggests a log-normal distribution as a proper choice. Specifically, the Call option pricing formula at time \(t=0\) reads:

$$\begin{aligned} C_0 = \int _0^{+\infty } C^{BS}(0,S_0,x) {\mathcal {LN}} pdf_{\alpha (T-\tau ),\nu ^2(T-\tau )} \left( x\right) \mathrm {d}x, \end{aligned}$$
(5.9)

which can be computed numerically once the parameters \(\alpha (T-\tau )\) and \(\nu ^2(T-\tau )\), defined in “Appendix B”, are obtained. Similarly, the price at \(t=0\) of a Binary option with terminal value A when in the money is given by:

$$\begin{aligned} C_0^{Bin}=A e^{-\int _0^T r(u) \mathrm {d}u} \int _0^{+\infty } \mathcal N\left( d_2(0,S_0,x) \right) {\mathcal {LN}} pdf_{\alpha (T-\tau ),\nu ^2(T-\tau )} \left( x\right) \mathrm {d}x. \end{aligned}$$
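The two integrals against the log-normal density can be evaluated with elementary quadrature. The sketch below is illustrative only: it takes the log-normal parameters `alpha` and `nu2` as given inputs (their definitions are in “Appendix B” and are not reproduced here), assumes a constant interest rate r, and uses the substitution \(x=e^{\alpha +\nu z}\) so that the integral in x becomes a Gaussian integral in z. The function name `mixed_prices` is hypothetical.

```python
import math
import numpy as np

def norm_cdf(y):
    """Standard Gaussian CDF, applied elementwise."""
    y = np.asarray(y, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(y / math.sqrt(2.0)))

def mixed_prices(s0, k, r, T, sigma_s, alpha, nu2, payout=1.0, n=4001, z_max=8.0):
    """Call and Cash or Nothing prices at t = 0 when X ~ LogNormal(alpha, nu2).

    alpha, nu2 : mean and variance of log X (inputs, assumed already computed)
    payout     : amount A paid by the binary option when S_T > K
    """
    nu = math.sqrt(nu2)
    z = np.linspace(-z_max, z_max, n)
    dz = z[1] - z[0]
    weights = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi) * dz   # Gaussian weights
    x = np.exp(alpha + nu * z)                                        # log-normal nodes
    sd = sigma_s * np.sqrt(x)
    d1 = (math.log(s0 / k) + r * T + 0.5 * sigma_s ** 2 * x) / sd
    d2 = d1 - sd
    disc = math.exp(-r * T)
    call = float(np.sum((s0 * norm_cdf(d1) - k * disc * norm_cdf(d2)) * weights))
    binary = float(np.sum(payout * disc * norm_cdf(d2) * weights))
    return call, binary
```

When `nu2` is close to zero the distribution of X degenerates to the point \(e^{\alpha }\) and both prices collapse to their Black & Scholes counterparts, which is a useful consistency check.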

6 Numerical applications

6.1 Sensitivity analysis of the pricing formula

In this subsection, we compute European Plain Vanilla and Binary option prices assuming that model parameters are known and considering several strike prices and expiration dates. Since the main features in the suggested model are the introduction of an attention process and a delay in the dependence of the price upon such stochastic factor, we believe it is important to further understand their contribution to the option price formation. To this end, we compute option prices by letting both the initial attention and the delay values change, while other parameters are set to \(\mu _P=0.03,\sigma _P=0.35, \sigma _S=0.04\), \(\gamma =0\) and \(r=0.01\); we assume, without loss of generality, that the Bitcoin price at time \(t=0\) is \(S_0=450\).
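A sensitivity exercise of this kind can also be sketched by Monte Carlo, averaging \(C^{BS}(0,S_0,X)\) over simulated values of the integrated attention. This is a different numerical route from the log-normal approximation in (5.9) and is not meant to reproduce the tables. The sketch assumes that time is measured in years, that the initial segment of the attention factor is flat (\(P_u=P_0\) for \(u \le 0\)) and that r is constant; all function names are hypothetical.

```python
import math
import numpy as np

def norm_cdf(y):
    """Standard Gaussian CDF, applied elementwise."""
    y = np.asarray(y, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(y / math.sqrt(2.0)))

def integrated_attention(p0, mu_p, sigma_p, gamma, tau, T,
                         n_steps=200, n_paths=5000, seed=1):
    """Samples of X = int_0^T P_{u-tau} du, assuming P_u = p0 for u <= 0."""
    rng = np.random.default_rng(seed)
    horizon = max(T - tau, 0.0)          # stochastic part lives on [0, T - tau]
    x_det = p0 * min(tau, T)             # contribution of the flat initial segment
    if horizon == 0.0:
        return np.full(n_paths, x_det)
    dt = horizon / n_steps
    incr = ((mu_p - gamma * sigma_p - 0.5 * sigma_p ** 2) * dt
            + sigma_p * math.sqrt(dt) * rng.standard_normal((n_paths, n_steps)))
    p = p0 * np.exp(np.cumsum(incr, axis=1))
    p = np.hstack([np.full((n_paths, 1), p0), p])
    # Trapezoidal rule for int_0^{T - tau} P_v dv along each path
    x_sto = dt * (p[:, 1:-1].sum(axis=1) + 0.5 * (p[:, 0] + p[:, -1]))
    return x_det + x_sto

def mc_call(s0, k, r, T, sigma_s, x):
    """Average of C^BS(0, s0, x) over the simulated values x."""
    sd = sigma_s * np.sqrt(x)
    d1 = (math.log(s0 / k) + r * T + 0.5 * sigma_s ** 2 * x) / sd
    d2 = d1 - sd
    return float(np.mean(s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)))
```

Reusing the same seed when \(P_0\) or \(\tau \) is changed keeps the comparison between scenarios free of sampling noise, since only the scaling of the simulated paths differs.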

In Table 3, Call option prices are reported for \(T=3\) months, \(\tau \)= 5 days. Rows correspond to different values of \(P_0\) while columns to different values for the strike price. As expected, Call option prices are increasing with respect to initial attention for the Bitcoin and decreasing with respect to strike price.

Table 3 Call option prices against different strikes K and for different values of \(P_0\): \(S_0=450,r=0.01,\mu _P=0.03,\sigma _P=0.35, \sigma _S=0.04\), \(T=3\) months, \(\tau \) = 1 week (5 days)

In Table 4, Call option prices are summed up, for \(P_0=100\), by letting the expiration date T and the delay \(\tau \) vary. Again as expected, for Plain Vanilla Calls the price increases with time to maturity. Increasing the delay reduces option prices; of course, the spread is inversely related to the time to maturity of the option.

Table 4 Call option prices against different Strikes K and for different values of T and \(\tau \): \(S_0=450,r=0.01,\mu _P=0.03,\sigma _P=0.35, \sigma _S=0.04\) and \(P_0=100\)

In Tables 5 and 6, analogous results are reported for Binary options with outcome \(A=100\); Table 5 sums up Binary Cash or Nothing prices for \(S_0=450\), \(r=0.01\), \(\mu _P=0.03\), \(\sigma _P=0.35\), \(\sigma _S=0.04\), \(T=3\) months, \(\tau = 1\) week (5 working days) against several strikes (in columns). Rows correspond to different values \(P_0\) of the initial attention on Bitcoin. As expected, prices are decreasing with respect to the strike price. Here, the values of in the money (ITM) options are decreasing with respect to \(P_0\), while those of out of the money (OTM) options are increasing. The difference between ITM and OTM prices is large for low values of \(P_0\), while it is very small for a high level of the initial attention factor. This may be justified by the fact that, when attention on Bitcoin is strong, all bets are worth taking, even the OTM ones, since the underlying value is expected to blow up. Binary Call prices decrease with respect to time to maturity for ITM options and increase for OTM options, which become more likely to be exercised. The influence of the delay is tiny, as for vanilla options, and is larger for short times to maturity.

Table 5 Digital Cash or Nothing prices against different Strikes K and for different values of \(P_0\) on Bitcoins
Table 6 Digital Cash or Nothing prices against different Strikes K and for different values of T and \(\tau \)

6.2 Pricing performance on market option prices

In this subsection, we compute option prices in the case where model parameters are not simply assigned but are estimated on market data for Bitcoin prices. Again, we assume that the attention price of risk is \(\gamma =0\), for two main reasons. The first, as already remarked, is that we do not expect the dynamics of the attention factor to change much from a risk-averse to a risk-neutral setting; the second is that this parameter does not appear in the dynamics of the system under the physical measure \(\mathbf {P}\), so it cannot be estimated on historical data for Bitcoin price and market attention. Of course, its value might be obtained by calibration, i.e., by minimizing a suitable measure of the difference between model and market option prices on a given date, such as the Mean Squared Error (MSE) or its square root (RMSE). However, this approach takes option prices as given data; this is opposite to our purpose of computing market option prices by using data on the underlying. Nevertheless, the closed formula (5.6) makes it possible to calibrate parameters within our framework if option prices are considered as given. An example is given in the following subsection.

Once model prices for options are computed, these values are compared to the corresponding market prices. Data for market prices of options on Bitcoin may be retrieved from online platforms where it is possible to trade Plain Vanilla and Binary options on Bitcoin. A relevant platform where bid-ask quotes are publicly available is www.deribit.com, where the underlying is the Bitcoin average price available from blockchain.info; we consider the mid-value of the bid-ask range on this platform as a benchmark market price for assessing the performance of our pricing formula. We discard options for which there were no transactions. Every day two different expiration dates are available, corresponding to one-month and two-month maturities at issue. We are aware of possible synchronicity problems but, as a first evaluation of the suggested pricing formula, we intentionally neglect this friction. We compute the price of each of the traded options in our sample (from July 20 to July 31, 2017) according to formula (5.9), by plugging in proper values for the strike price and the expiration date and for the model parameters, estimated according to the profile likelihood method described in Sect. 3. In order to assess pricing performance, we compute the Root Mean Squared Error (RMSE) of model prices with respect to market prices across the whole sample of options and across suitably chosen subsamples. The same is done when prices are computed with the benchmark Black & Scholes model, where market attention is not accounted for. Of course, the latter is estimated on the same time series of Bitcoin prices; note that only the volatility estimation matters since we set \(r=0\) for both models. It is worth noticing that parameters are given by the estimation procedure on historical data; hence the RMSE value is the output of our analysis, not the objective function to be minimized, as is done when a model is calibrated to market option prices in order to derive the underlying parameters. An example in this direction will be given in the next subsection.
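The performance measure described above amounts to a few array operations. The following sketch uses hypothetical helper names and expects the user to supply bid, ask and model prices; the grouping labels (for example moneyness or maturity classes) are likewise an assumption about how subsamples would be encoded.

```python
import numpy as np

def mid_price(bid, ask):
    """Benchmark market price: mid-value of the bid-ask range."""
    return 0.5 * (np.asarray(bid, dtype=float) + np.asarray(ask, dtype=float))

def rmse(model, market):
    """Root Mean Squared Error of model prices with respect to market prices."""
    model = np.asarray(model, dtype=float)
    market = np.asarray(market, dtype=float)
    return float(np.sqrt(np.mean((model - market) ** 2)))

def rmse_by_group(model, market, labels):
    """RMSE computed separately on each subsample identified by labels."""
    model = np.asarray(model, dtype=float)
    market = np.asarray(market, dtype=float)
    labels = np.asarray(labels)
    return {str(g): rmse(model[labels == g], market[labels == g])
            for g in np.unique(labels)}
```

The same functions can be applied to the Black & Scholes prices, so that the two models are compared on identical subsamples.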

In Tables 7 and 8 we report market as well as model option prices for July 28, 2017. Black & Scholes prices are also reported and the RMSE is displayed in the last row. On the sample days, available maturities were July 27 (options from July 20 to July 27), August 25 (options from July 28 to July 31) and September 29 (all options). Since the bid-ask range on the trading platform is quite large, the choice of the best model is subjective and depends on which criteria are prioritized. Tables 7 and 8 show that the Black & Scholes price is often outside the bid-ask range, while the proposed model performs better.

Table 7 Option Price with t=28 July and T=25 August
Table 8 Option price with t = 28 July and T = 29 September

In Tables 9 and 10 we report the RMSE for the complete sample of options as well as for specific subsamples obtained according to moneyness and to maturity across all trading days. RMSE values are also computed and reported in the tables for the benchmark Black & Scholes model. Overall, our pricing formula does much better than the benchmark in most cases. Highlighted in the tables (in bold) are the few cases where the Black & Scholes model does better than the model we suggest, according to the RMSE.

Indeed, the pricing performance of the model depends on the choice of the attention measure, which introduces another source of model risk; however, it is worth noticing that when attention is measured by Google searches the model tends to overprice long-term options, while it performs best for shorter-term options, as if this attention indicator were driven by enthusiasm giving a sudden impulse to options. This evidence can guide the choice of the attention proxy depending on the expiration date of the options to be priced. It would be interesting to consider a weighted mean of these two factors as a proxy of market attention rather than focusing on only one of them; this will be the subject of future research.

Table 9 RMSE Option Price with Volume of Transactions
Table 10 RMSE Option Price with Google Searches

6.3 Calibration of parameters

The availability of a closed formula for option prices makes it possible to derive parameter values from market option prices rather than from historical prices of the underlying: this procedure is usually referred to as calibration. Alternatively, different subsets of parameters may be derived from market option data and from underlying data, respectively.

Calibration is based on the minimization, with respect to model parameters, of a suitably defined distance between model prices and corresponding market prices for a family of financial derivatives for which a market price, resulting from either a trade or a market maker quotation, is observed.

There has been a long debate in the literature on whether calibration is a sound estimation method; it is beyond the scope of this paper to go through the pros and cons of the procedure. We just mention that calibration is usually performed when the market prices of derivatives are observed in large and trusted derivative exchanges; this is not the case in our setting, where, at the time of writing, financial derivatives on Bitcoin are traded on online platforms which are not always trustworthy (see Footnote 8). Nevertheless, if we want to consider a non-zero price \(\gamma \) for attention risk we need to resort, at least for this parameter, to calibration.

For the sake of completeness, we give a numerical example of how the parameter \(\gamma \) can be calibrated on the options considered in the previous subsection, under the assumption that their price is given by the mid-value of the bid-ask market range. As the distance between model and market prices we use the Mean Squared Error.

More precisely, denote by \(C_t=C_t(\mu _P,\sigma _P,\gamma ,\mu _S,\sigma _S,\tau ,K,T)\) the pricing function at time t of a Call option expiring at time T with strike price K, where we have stressed the dependence on model parameters. Besides, denote by \(C_t^*(K,T)\) the market price of a Call option with the same characteristics.

Because the market where Bitcoin options are traded is very small, we believe that the PQML estimation procedure should be preferred whenever possible, and we resort to calibration only for \(\gamma \). Hence, we estimate \((\mu _P,\sigma _P,\mu _S,\sigma _S,\tau )\) with the PQML method introduced in Sect. 4 and obtain \(\gamma \) as:

$$\begin{aligned} {\bar{\gamma }}=\arg \min \sum _{j=1}^J \left( C_t(\hat{\mu }_P,\hat{\sigma }_P,\gamma ,\hat{\mu }_S,\hat{\sigma }_S,\hat{\tau },K_j,T_j)-C_t^*(K_j,T_j)\right) ^2, \end{aligned}$$
(6.1)

where J is the cardinality of the set of options for which a market value is available, and \((K_j,T_j)\) for \(j=1,2,\ldots ,J\) are the characteristics of these options. By considering options traded on July 28, 2017 and assuming that their market price is the mid-price of the bid-ask range displayed in Tables 7 and 8, we obtain, according to (6.1), \({\bar{\gamma }}= -0.90\) and \({\bar{\gamma }}= 2.05\) for the trading volume and the SVI index, respectively, leading to minimum values for the RMSE of 0.0183 and 0.0121 in the two cases. The difference in sign indicates that in a risk-neutral world the increasing yield of the trading volume would be higher than in the real world, while the opposite happens for the internet searches volume.
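
A one-parameter calibration in the spirit of (6.1) can be sketched as follows. Since the closed-form pricing function \(C_t\) is not reproduced in this subsection, the sketch uses a stand-in: a Black & Scholes call with zero interest rate whose only free parameter is the volatility. In the setting of the paper, `price_fn` would instead be \(C_t\) with the PQML estimates plugged in and \(\gamma \) left free. The optimizer is a plain golden-section search, which assumes that the objective is unimodal on the chosen bracket; all function names, the bracket and the synthetic quotes are assumptions made for illustration.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio, about 0.618


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def sum_squared_errors(theta, price_fn, quotes):
    """Squared distance between model prices and bid-ask mid-prices."""
    return sum((price_fn(theta, K, T) - 0.5 * (bid + ask)) ** 2
               for K, T, bid, ask in quotes)


def calibrate(price_fn, quotes, lo, hi):
    """Value of the free parameter minimizing the squared pricing error."""
    return golden_section_min(
        lambda theta: sum_squared_errors(theta, price_fn, quotes), lo, hi)


def bs_call(sigma, K, T, spot=1.0):
    """Stand-in pricing function (NOT the model formula): Black & Scholes, r = 0."""
    d1 = (math.log(spot / K) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return spot * cdf(d1) - K * cdf(d2)


# Synthetic quotes (strike, maturity in years, bid, ask) built from a known value.
true_sigma = 0.9
quotes = []
for K, T in [(0.9, 0.08), (1.0, 0.08), (1.1, 0.17)]:
    mid = bs_call(true_sigma, K, T)
    quotes.append((K, T, mid - 0.01, mid + 0.01))

sigma_hat = calibrate(bs_call, quotes, 0.05, 3.0)
print(sigma_hat)
```

Because the synthetic mid-prices are generated by the stand-in model itself, the search should recover the value used to build them; with real quotes the minimum of the objective is positive and its square root, after division by J, gives the RMSE.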

If we wanted to calibrate all of the parameters, their values would be derived as:

$$\begin{aligned} ({\bar{\mu }}_P,{\bar{\sigma }}_P,{\bar{\gamma }},{\bar{\mu }}_S,{\bar{\sigma }}_S,{\bar{\tau }})=\arg \min \sum _{j=1}^J \left( C_t(\mu _P,\sigma _P,\gamma ,\mu _S,\sigma _S,\tau ,K_j,T_j)-C_t^*(K_j,T_j)\right) ^2 \end{aligned}$$

but in this case the number J of options in the sample should be very large to make the estimation consistent, and, as already noticed, the exchange trading the options should be a reference for the whole market.

7 Concluding remarks

In this paper, we assume that Bitcoin prices are driven by investors’ attention, as suggested in recent literature; the main references in this area are Kristoufek (2013, 2015), Kim et al. (2015) and Figà-Talamanca and Patacca (2019). To account for such behavior, we develop a stochastic model in continuous time describing the dynamics of two factors, one representing the attention index on the Bitcoin system and the other representing the Bitcoin price itself, which is directly affected by the first factor; we also take into account a delay between the attention index and its effect on the Bitcoin price. We investigate the statistical properties of the proposed model and show that it is arbitrage-free. Further, under suitable model assumptions, we derive a closed-form approximation for the joint density of the discretely observed process and propose a statistical estimation procedure to fit the model to real data. By applying classical risk-neutral evaluation, we derive a quasi-closed formula for European-style derivatives on Bitcoin, with special attention to Plain Vanilla and Binary options, for which a market already exists (e.g., https://deribit.com, https://coinut.com). Of course, attention to Bitcoin or, more generally, to cryptocurrencies is not directly observed, but several variables may be considered as indicators. Here, we analyzed the trading volume, see (Kristoufek 2015), and more unconventional attention indicators such as the number of Google searches on the word “bitcoin”, as suggested by Da et al. (2011). First, we investigated whether these proxies were consistent with the suggested model and showed that both the volume of transactions and the number of Google searches give a good fit of the dynamics described in the model. Finally, we fit the model using real data on the Bitcoin price with the volume of transactions and the Google searches index, respectively. The ability of our pricing formula to capture market option prices is also assessed on a sample of options traded on www.deribit.com; overall, it substantially outperforms the benchmark Black & Scholes pricing formula.

A calibration example is also provided in order to estimate the price of attention risk from available option prices. There is still room for improvement in the suggested model to take into account the stylized facts of observed prices documented in Catania and Grassi (2017), Guo and Li (2017), Chu et al. (2015). Indeed, even if empirical evidence suggests that market attention/sentiment drives Bitcoin price behavior, other sources of randomness should be added to properly model its stochastic volatility. However, the specification given in Eqs. (2.1)–(2.2) allows us to obtain several useful results, such as the quasi-closed pricing formula for European-style derivatives on Bitcoin; the effects of including jumps in the price dynamics, generalizing the approach in Kou (2002), will be investigated in future work. In a companion paper, Cretarola and Figà-Talamanca (2019), we consider a non-zero correlation between the two factors, focusing on the case of no delay; high correlation is shown to be linked to bubbles in the price dynamics, which have also been documented for Bitcoin, see e.g., (Fry and Cheah 2015; Malhotra and Maloo 2014; Donier and Bouchaud 2015; Corbet et al. 2018; Bistarelli et al. 2019b).

Finally, we believe it is important to stress that another specific feature of Bitcoin is the existence of different prices for trades on different online platforms; hence, even considering the bid-ask mid-price, several prices are available for the same asset at the same time. This characteristic gives rise to arbitrage opportunities across exchanges, since the law of one price is not fulfilled. In the present paper we ignore this issue by assigning to Bitcoin the average price computed and provided by the Web site https://blockchain.info; this choice is consistent with most of the literature on Bitcoin price behavior, but we plan to give a multivariate specification of the suggested model. Preliminary results in this direction, accounting for a multi-exchange framework and possible cross-exchange arbitrage opportunities, are discussed in Bistarelli et al. (2018, 2019a).