1 Introduction

The recent observation of gravitational waves [1] confirmed their existence, predicted a hundred years earlier. In the early years of general relativity (GR) alternative models of gravitation were considered; for a long time these alternatives were little more than a curiosity, as the observations of that time did not necessitate anything else. Many of these modified theories of gravity were ruled out for theoretical reasons, but others remained viable because observational tests were not yet feasible in most cases.

Observations of the cosmic microwave background [2] and of supernovae [3, 4] led to the discovery of the accelerating expansion of the Universe. The accelerated expansion can be explained with the cosmological constant, but there are some fundamental problems with the cosmological constant [5] and the \(\varLambda \)CDM or concordance model [6]. Therefore, the modified gravity theories, which received little interest for decades, have become relevant once again.

The f(R) theories (see e.g. [7, 8] for reviews), or fourth order theories, which generalize the Einstein–Hilbert Lagrangian to a function of the curvature scalar, have received considerable attention in the 21st century. In [9] it was shown that the accelerating expansion could be explained with an f(R) modification; since then, more viable models have been proposed (e.g. [10,11,12,13,14,15]).

In standard general relativity the graviton, which mediates the gravitational force, has zero mass. General relativity is a metric theory whose postulates include field equations that are linear in the second order derivatives, the correct Newtonian weak field limit and no dependence on any prior geometry. In order to have a massive graviton, some generalization is needed. The path of least resistance is fixing the background metric.

It is possible to add a term to the Einstein–Hilbert action that gives the graviton a mass [16, 17]. There are a number of different terms that produce a massive graviton, but most of these fail to reach the correct Newtonian limit [18, 19]. However, while in general relativity the graviton naturally has zero mass, this is not the case for f(R) gravity [7].

In f(R) gravity the graviton has a priori a non-zero mass. As the f(R) theories are explicitly higher order theories, this is not in contradiction with the demands of constructing a massive graviton for general relativity. The higher order contribution in the field equations adds up to an effective graviton mass term. This link between graviton mass and model dependence can be converted into bounds on viable f(R) models.

Solar system observations have set several bounds on the mass of the graviton. As the dynamics of the solar system are found to follow general relativity extremely closely, these bounds are rather stringent. If the Newtonian potential is modified to include a massive graviton, then the Kepler laws produce a limit for the Compton wavelength of the graviton [20, 21]. The bound on graviton mass follows from the Compton wavelength and mass relation via \(\lambda _g=h/m_g c\) [22], where \(m_g\) is the graviton mass.

Inspiraling binaries are a known source of gravitational waves and provide a possibility for measuring the graviton mass [20, 21, 23]. Before the LIGO observations, the graviton mass had been bounded using binary pulsars [17] rather than black hole binaries. Similar studies have been done in the context of f(R) gravity [24]. A graviton with non-zero mass \(m_g\) would cause the gravitational potential to take the Yukawa form \(r^{-1}e^{-m_grc/h}\). The exponential dependence would cause a cut-off of the gravitational interaction at large distances, namely those larger than the Compton wavelength. Such a cut-off has not been observed in the solar system [20] or in galaxy clusters [25]. Therefore, these observations set an upper limit for the mass of the graviton \(m_g\).

The galaxy cluster limits for the graviton mass are rather stringent, with \(m_gc^2<2\times 10^{-29}\,\text {eV}\) [25], but are model dependent regarding e.g. dark matter assumptions. These are not directly applicable to f(R) theories as they modify the effects and the need for dark matter [26,27,28,29].

Black hole superradiance is another source of constraints on the graviton mass. In the context of bimetric gravity the mass is bounded by \(m_gc^2<5\times 10^{-23}\,\text {eV}\) [30]. The gravitational-wave based bounds arise from the dynamics of gravitation and as such are model-independent. Currently, the best model-independent dynamical bounds for the graviton mass are those from the recent LIGO observations, \(m_gc^2<1.2\times 10^{-22}\,\text {eV}\) [22]. If a supermassive black hole binary is detected in the future, it could introduce a limit that is more stringent by several orders of magnitude [23].
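To make the quoted mass bounds easier to compare, the following minimal Python sketch converts each of them into a Compton wavelength via \(\lambda _g=h/m_g c\) and evaluates the Yukawa factor \(e^{-r/\lambda _g}\) at a solar system distance. The numerical value of \(hc\), the astronomical unit and the function names are assumptions introduced here for illustration; they do not appear in the original analysis.

```python
import math

# Assumed physical constants (not quoted in the text):
H_C_EV_M = 1.239841984e-6      # h*c in eV*m
AU_M = 1.495978707e11          # one astronomical unit in metres


def compton_wavelength_m(mass_energy_ev):
    """Compton wavelength lambda_g = h / (m_g c) = h c / (m_g c^2), in metres."""
    return H_C_EV_M / mass_energy_ev


def yukawa_suppression(r_m, mass_energy_ev):
    """Factor exp(-r / lambda_g) multiplying the Newtonian 1/r potential."""
    return math.exp(-r_m / compton_wavelength_m(mass_energy_ev))


# Upper bounds on m_g c^2 in eV as quoted in the text.
bounds_ev = {
    "galaxy clusters [25]": 2e-29,
    "superradiance, bimetric [30]": 5e-23,
    "LIGO [22]": 1.2e-22,
}

for label, m_ev in bounds_ev.items():
    lam = compton_wavelength_m(m_ev)
    print(f"{label}: lambda_g > {lam:.2e} m, "
          f"Yukawa factor at 1 AU = {yukawa_suppression(AU_M, m_ev):.8f}")
```

A smaller mass bound corresponds to a longer minimum Compton wavelength, which is why the exponential cut-off is invisible on solar system scales for all three bounds.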

In the following we will examine the naturally occurring graviton mass in f(R) gravity [31]. Several studies attempt to constrain f(R) theories by both theoretical and observational means (e.g. [8, 32,33,34,35,36,37]). Previous studies also investigate f(R) gravity in the context of binaries and the related graviton mass [38, 39]. Using the recent LIGO upper limit on the graviton mass we further constrain the model parameters of some viable f(R) theories such as the Hu–Sawicki model [10].

2 Equations of motion

In the following we derive the equations of motion describing gravitational waves and graviton mass arising from the f(R) contribution. We examine an f(R)-modified gravitational action

$$\begin{aligned} \mathcal {A} =\frac{1}{2\chi }\int d^4x\sqrt{-g}\Big (f(R)+2\chi \mathcal {L}_m\Big ), \end{aligned}$$
(1)

where \(\chi =\frac{8\pi G}{c^4}\) is the coupling of gravitational equations and \(\mathcal {L}_m\) is the minimally coupled matter Lagrangian. Following standard metric variational techniques, we find the field equations and the trace equation

$$\begin{aligned} f'(R)R_{\mu \nu }-\frac{1}{2}f(R)g_{\mu \nu }-\nabla _\mu \nabla _\nu f'(R)+g_{\mu \nu }\square f'(R)&=\chi T_{\mu \nu } \end{aligned}$$
(2)
$$\begin{aligned} 3\square f'(R) + f'(R)R-2f(R)&=\chi T , \end{aligned}$$
(3)

respectively, where the energy-momentum tensor \(T_{\mu \nu }=-\frac{2}{\sqrt{-g}}\frac{\delta \sqrt{-g}\mathcal {L}_m}{\delta g^{\mu \nu }}\) and \(T=T^\alpha _\alpha \). The prime is used to denote the derivatives with respect to R. We study the linear perturbations \(h_{\mu \nu }\) and write

$$\begin{aligned} g_{\mu \nu }=\tilde{g}_{\mu \nu }+h_{\mu \nu }, \end{aligned}$$
(4)

where \(\tilde{g}_{\mu \nu }\) is the background metric. In general we use tilde to denote the quantities calculated with the background metric. The Ricci tensor and scalar can be expanded respective to the background as

$$\begin{aligned} R_{\mu \nu }&\simeq \tilde{R}_{\mu \nu }+\delta R_{\mu \nu }+\mathcal {O}(h^2), \end{aligned}$$
(5)
$$\begin{aligned} R&\simeq \tilde{R}+\delta R+\mathcal {O}(h^2). \end{aligned}$$
(6)

As the first derivative of f(R) appears in the equations of motion, we need an expansion for this function as well, i.e. \(f'(R)\simeq f'(\tilde{R})+f''(\tilde{R})\delta R+\mathcal {O}(h^2)\). This expansion is substituted into (3), which yields

$$\begin{aligned} f''(\tilde{R})(3\square \delta R+\tilde{R}\delta R)-f'(\tilde{R})\delta R=0 . \end{aligned}$$
(7)
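The step from (3) to (7) can be spelled out as follows. This is a sketch that assumes vacuum (\(T=0\)) and a constant background curvature \(\tilde{R}\), as is the case for the de Sitter background considered in this section, so that \(\square f'(\tilde{R})=0\) and \(f''(\tilde{R})\) commutes with \(\square \). Expanding also \(f(R)\simeq f(\tilde{R})+f'(\tilde{R})\delta R\), the left-hand side of (3) becomes, to first order in \(h\),

$$\begin{aligned} 3\square f'(R)+f'(R)R-2f(R)\simeq&\ \Big [f'(\tilde{R})\tilde{R}-2f(\tilde{R})\Big ]+3f''(\tilde{R})\square \delta R\\&+f''(\tilde{R})\tilde{R}\,\delta R+f'(\tilde{R})\delta R-2f'(\tilde{R})\delta R . \end{aligned}$$

The bracketed zeroth-order term vanishes by the background trace equation, and collecting the remaining first-order terms reproduces (7).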

As we are primarily interested in the propagation of gravitational waves in empty space, we set \(T_{\mu \nu }=0\). The variations of the Ricci tensor and scalar can be written in terms of the metric perturbation \(h_{\mu \nu }\) (e.g. [40]) such that

$$\begin{aligned} \delta R_{\mu \nu }&=\frac{1}{2}\Big (\nabla _\mu \nabla _\nu h-\nabla _\mu \nabla ^\lambda h_{\lambda \nu }-\nabla _\nu \nabla ^\lambda h_{\mu \lambda }+\square h_{\mu \nu }\Big ), \end{aligned}$$
(8)
$$\begin{aligned} \delta R&=\delta (g^{\mu \nu }R_{\mu \nu })=\square h-\nabla ^\mu \nabla ^\nu h_{\mu \nu }-\tilde{R}_{\mu \nu }h^{\mu \nu }. \end{aligned}$$
(9)

As the theory is gauge invariant, we are free to fix the gauge and choose the harmonic gauge with

$$\begin{aligned} \nabla _\mu h^\mu _\lambda =\frac{1}{2}\nabla _\lambda h, \end{aligned}$$
(10)

which further implies \(\nabla ^\mu \nabla ^\nu h_{\mu \nu }=\frac{1}{2}\square h\).

In order to provide the correct expansion of the Universe, a viable f(R) should have a de Sitter solution. This requires the background equations, (2) and (3) for empty space, to have solutions, i.e. \(f'(\tilde{R})\tilde{R}=2f(\tilde{R})\) and \(\tilde{R}_{\mu \nu }=\tilde{g}_{\mu \nu }\frac{f( \tilde{R})}{2f'(\tilde{R})}\) must hold true. Using these equalities and the harmonic gauge we find

$$\begin{aligned} 3f''(\tilde{R})\square ^2 h-\left( \frac{f(\tilde{R}) f''(\tilde{R})}{f'(\tilde{R})}+f'(\tilde{R})\right) \square h+\left( f(\tilde{R})-\frac{2f^2(\tilde{R})f''(\tilde{R})}{f'^2(\tilde{R})}\right) h=0. \end{aligned}$$
(11)
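As an intermediate check of (11), the substitution can be written out explicitly; the only inputs are (7), (9), the harmonic gauge (10) and the two de Sitter relations stated above. In the harmonic gauge on this background the perturbed curvature scalar (9) reduces to

$$\begin{aligned} \delta R=\square h-\frac{1}{2}\square h-\frac{f(\tilde{R})}{2f'(\tilde{R})}\tilde{g}_{\mu \nu }h^{\mu \nu }=\frac{1}{2}\square h-\frac{f(\tilde{R})}{2f'(\tilde{R})}h . \end{aligned}$$

Inserting this into (7), multiplying by 2 and using \(\tilde{R}=2f(\tilde{R})/f'(\tilde{R})\) gives

$$\begin{aligned} f''(\tilde{R})\left( 3\square +\frac{2f(\tilde{R})}{f'(\tilde{R})}\right) \left( \square h-\frac{f(\tilde{R})}{f'(\tilde{R})}h\right) -f'(\tilde{R})\left( \square h-\frac{f(\tilde{R})}{f'(\tilde{R})}h\right) =0, \end{aligned}$$

which, after expanding the products and collecting powers of \(\square \), is (11).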

The graviton dispersion relation \(k^2=-m_g^2\) reveals that the plane wave solution \(h\sim e^{ik \cdot x}\) fulfills \(\square h=m_g^2h\). Therefore, we can write

$$\begin{aligned} 3f''(\tilde{R})m^4_g-\left( \frac{f(\tilde{R}) f''(\tilde{R})}{f'(\tilde{R})}+f'(\tilde{R})\right) m^2_g+\left( f(\tilde{R})-\frac{2f^2(\tilde{R})f''(\tilde{R})}{f'^2(\tilde{R})}\right) =0, \end{aligned}$$
(12)

for non-zero perturbations. Thus we obtain two solutions for \(m_g^2\),

$$\begin{aligned} m_1^2&=\frac{f'^2(\tilde{R})-2f(\tilde{R})f''(\tilde{R})}{3f'(\tilde{R})f''(\tilde{R})}, \end{aligned}$$
(13)
$$\begin{aligned} m_2^2&=\frac{1}{2}\tilde{R}, \end{aligned}$$
(14)

which tell us the perturbations of the metric can be written as a linear combination

$$\begin{aligned} h_{\mu \nu }=h^{(1)}_{\mu \nu }e^{ik^{(1)}_\lambda x^\lambda }+h^{(2)}_{\mu \nu }e^{ik^{(2)}_\lambda x^\lambda }, \end{aligned}$$
(15)

where the quantities \(h^{(i)}_{\mu \nu }\) and \(k^{(i)}_\mu \) are the metric perturbation and four-momentum related to the corresponding solution \(m_i\).
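The two roots (13) and (14) can be verified symbolically. The sketch below uses SymPy and treats \(f(\tilde{R})\), \(f'(\tilde{R})\) and \(f''(\tilde{R})\) as independent symbols `f0`, `f1`, `f2` (names chosen here for illustration); the de Sitter condition \(\tilde{R}f'(\tilde{R})=2f(\tilde{R})\) is used to write \(\tilde{R}/2=f(\tilde{R})/f'(\tilde{R})\).

```python
import sympy as sp

# f0, f1, f2 stand for f, f', f'' evaluated at the background curvature.
# M stands for the squared graviton mass m_g^2.
f0, f1, f2, M = sp.symbols("f0 f1 f2 M")

# Left-hand side of the quadratic (12) in M = m_g^2.
quadratic = 3*f2*M**2 - (f0*f2/f1 + f1)*M + (f0 - 2*f0**2*f2/f1**2)

# Candidate roots: Eq. (13), and Eq. (14) rewritten with R~/2 = f0/f1.
m1_sq = (f1**2 - 2*f0*f2) / (3*f1*f2)
m2_sq = f0 / f1

# Substitute each candidate root and reduce the result.
residual_1 = sp.simplify(quadratic.subs(M, m1_sq))
residual_2 = sp.simplify(quadratic.subs(M, m2_sq))

print(residual_1, residual_2)
```

Both residuals reduce to zero, so the quadratic factorizes as \(3f''(\tilde{R})(m_g^2-m_1^2)(m_g^2-m_2^2)\).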

We have found two physically viable solutions for a non-zero graviton mass. The first solution (13) resembles the stability criterion of [41, 42]. Basically this criterion tells us that the square of the graviton mass must be non-negative. The mass is often derived with the well-known f(R) scalar-tensor theory equivalence [43,44,45]. This solution is not available when \(f''(R)=0\), such as in the case of GR.

The second solution (14) does not depend on \(f''(R)\) and holds even for GR. This solution is related to having \(\delta R=0\) in (7). In the case of empty space GR we have \(\tilde{R}=0\) and \(m_2=0\) as expected. Clearly, a well-behaved GR limit exists for the second solution as \(f''(R)\rightarrow 0\). Since for this solution \(\delta R=0\), in the situation \(\tilde{R}=0\), the perturbation of the metric would simply be

$$\begin{aligned} \delta R\sim h^{(1)}_{\mu \nu }e^{ik^{(1)}_\lambda x^\lambda } \end{aligned}$$
(16)

and only the scalar modes would manifest. Therefore, \(m_2\) solutions do not affect scalar perturbations, while the tensor perturbations are affected by both of the solutions.

The GR limit of the first solution is problematic as it diverges as \(f''(R)\rightarrow 0\). This reveals an interesting fact that even though f(R) models have to closely resemble GR, they cannot be infinitely close. This is comparable to the result of the forbidden Higuchi mass range of the graviton [46,47,48]. The emergence of these massive modes in f(R) gravity is discussed in detail in [31].

The second solution is extremely small, \(m_2\sim \sqrt{\varLambda }\), and easily passes all constraints on the graviton mass. Therefore, we focus on the first solution, which can be constrained. The exact mass state of a graviton emitted by two inspiraling black holes is unknown; however, the combination of the two mass states is bounded by observation. Mergers in f(R) gravity need further study in order to distinguish between these two states. To our knowledge, such studies have not yet been conducted.

Another, often overlooked, fact is that for GR with \(\varLambda \) (i.e. \(f(R)=R+\varLambda \)) we would have a non-zero graviton mass, \(m_2^2=2\varLambda \). This is due to relaxing the assumptions of GR [16]. Even though this is mathematically clear, the physical consequences are debatable, see e.g. [49] and references therein for a discussion.

For the case of f(R) gravity, there is the extra scalar degree of freedom, like with the cosmological constant. A massive graviton always implies extra degrees of freedom; this leads to gravitational waves with \(\varLambda \) or f(R) different to those caused by plain GR. However, this does not affect the relation to observations.

The LIGO observations provide a lower limit for the Compton wavelength of the graviton [1]. A finite Compton wavelength in general translates to a massive theory and therefore, extra degrees of freedom. The measurements detect perturbations of the metric \(h_{\mu \nu }\), which can be written as a linear combination of the modes associated with masses (13) and (14). The ratio of these two modes caused by the black holes is unknown but the total contribution is constrained.

In the following, we shall take a closer look at specific models and use the Hu–Sawicki model as a case study to demonstrate the procedure.

3 Viable f(R) models and graviton mass constraints

Rigorous constraints on the parameters of the f(R) function are obtained from the most stringent bound on \(f'(\tilde{R})\), which Refs. [10, 33, 50] find to be

$$\begin{aligned} |f'(\tilde{R})-1|<4\times 10^{-7}. \end{aligned}$$
(17)

Here, and for the rest of the paper, we assume natural units. With the graviton mass we can find another bound for these parameters. In what follows we will demonstrate this for a particular model.

The popular Hu–Sawicki model is constructed to pass the solar system tests and produce the observed late-time cosmology [10]. A truly viable model needs to fulfil the high curvature regime constraints as well as provide the accelerated expansion of the Universe, which appears at low curvature regimes. The Hu–Sawicki model is of the form

$$\begin{aligned} f(R)=R-\mu R_c \frac{\Big (\frac{R}{R_c}\Big )^{2n}}{b \Big (\frac{R}{R_c}\Big )^{2n}+1}, \end{aligned}$$
(18)

with \(\mu \), \(R_c\), b positive constants and \(n\in \mathbb N\). Inserting this into the de Sitter criterion, \(\tilde{R}f'(\tilde{R})-2f(\tilde{R})=0\), we can solve for b

$$\begin{aligned} b_\pm =-1+\mu \pm \sqrt{\mu (\mu -2n)}. \end{aligned}$$
(19)

As the action must be real, b must have a real value as well. This leads to the constraint \(\mu >2n\). The constant \(R_c\) is a free-scaling parameter and for simplicity we have chosen \(R_c=\tilde{R}\). The bound (17) translates to
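The solution (19) can be checked with a short SymPy sketch: it builds the Hu–Sawicki function (18), forms the de Sitter criterion, sets \(R_c=\tilde{R}\) and substitutes both candidate values of b. The variable names are illustrative, and clearing the \((1+b)^2\) denominator before substituting is a convenience chosen here, not a step taken from the original derivation.

```python
import sympy as sp

R, Rc, mu, b, n = sp.symbols("R R_c mu b n", positive=True)

# Hu-Sawicki model, Eq. (18).
y = (R / Rc) ** (2 * n)
f = R - mu * Rc * y / (b * y + 1)

# de Sitter criterion R f'(R) - 2 f(R), evaluated with R_c = R~.
de_sitter = (R * sp.diff(f, R) - 2 * f).subs(Rc, R)

# Clear the (1 + b)^2 denominator so that a polynomial in b remains.
numerator = sp.numer(sp.together(de_sitter))

# Candidate solutions from Eq. (19).
root = sp.sqrt(mu * (mu - 2 * n))
b_plus = -1 + mu + root
b_minus = -1 + mu - root

residuals = [sp.expand(numerator.subs(b, cand)) for cand in (b_plus, b_minus)]
print(residuals)
```

Both residuals vanish identically, and the square root is real only for \(\mu \ge 2n\), in line with the constraint stated above.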

$$\begin{aligned} |f'(\tilde{R})-1|=\frac{2n\mu }{(1+b_\pm )^2}<4\times 10^{-7}. \end{aligned}$$
(20)

For \(b_-\) we have

$$\begin{aligned} |f'(\tilde{R})-1|=\frac{2n\mu }{(\mu -\sqrt{\mu (\mu -2n)})^2}=\frac{2 n}{\mu \Big (1-\sqrt{1-\frac{2n}{\mu }}\Big )^2}<4\times 10^{-7}. \end{aligned}$$
(21)

With the condition \(\mu >2n\) the square root can be expanded as a series. This results in \(|f'(\tilde{R})-1|\sim \mu <10^{-7}\) which is in clear contradiction with \(\mu >2n\). Therefore we must choose \(b=b_+\), for which we find

$$\begin{aligned} \frac{2n\mu }{(\mu +\sqrt{\mu (\mu -2n)})^2}\sim \frac{2 n\mu }{4\mu ^2}=\frac{n}{2\mu }<4\times 10^{-7} \end{aligned}$$
(22)

when \(\mu \gg 1\). This further translates to a lower bound \(\mu >10^6\), where we have assumed \(n\sim 1\). As higher n have been found to mimic \(\varLambda \)CDM behaviour more closely [10], our assumption of n is reasonable for viable f(R) models [8]. Therefore \(n\sim 1\) translates to the most conservative bound. With \(\mu \gg 1\), we can write the square of the graviton mass as a series in \(x=1/\mu \)

$$\begin{aligned} m_g^2=\tilde{R}\left( \frac{2}{3n(1+2n)x}-\frac{2n(5+2n)}{3(1+2n)^2}\right) +\mathcal {O}(x). \end{aligned}$$
(23)
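The leading term of this expansion can be checked numerically against the exact expression (13). The sketch below assumes \(R_c=\tilde{R}\) and \(b=b_+\) as in the text; the closed form used for \(\tilde{R}f''(\tilde{R})\) was obtained here by differentiating (18) twice and is not quoted from the original derivation, and the function names are illustrative.

```python
import math


def m1_sq_over_R(mu, n=1):
    """Exact m_1^2 / R~ from Eq. (13) for Hu-Sawicki with R_c = R~, b = b_+."""
    one_plus_b = mu + math.sqrt(mu * (mu - 2 * n))     # 1 + b_+ from Eq. (19)
    b = one_plus_b - 1.0
    f_over_R = 1.0 - mu / one_plus_b                   # f(R~) / R~
    fp = 1.0 - 2 * n * mu / one_plus_b**2              # f'(R~)
    # R~ f''(R~), from differentiating Eq. (18) twice (derived for this sketch)
    fpp_R = 2 * n * mu * ((2 * n + 1) * b - (2 * n - 1)) / one_plus_b**3
    return (fp**2 - 2 * f_over_R * fpp_R) / (3 * fp * fpp_R)


def leading_term(mu, n=1):
    """Leading 1/x term of the series: 2 mu / (3 n (1 + 2 n))."""
    return 2 * mu / (3 * n * (1 + 2 * n))


for mu in (1e3, 1e6, 1e9):
    exact = m1_sq_over_R(mu)
    print(f"mu = {mu:.0e}: exact = {exact:.6e}, leading = {leading_term(mu):.6e}")
```

For large \(\mu \) the two values agree to high relative accuracy, which is what justifies keeping only the leading term when bounding \(\mu \).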

Therefore, to leading order we have \(m_g^2/\tilde{R}\simeq 2\mu /(3n(1+2n))\), i.e. \(n m_g^2/\tilde{R}\sim \mu \). As the gravitational wave observations set an upper limit for the graviton mass, we find an upper bound for \(\mu \) as well. We can write the relation of the background curvature to the cosmological constant as \(\tilde{R}=4\varLambda \). Using the density parameter \(\varOmega _\varLambda \) we can also write

$$\begin{aligned} \varLambda =3 H_0^2\varOmega _\varLambda , \end{aligned}$$
(24)

where \(H_0\) is the Hubble parameter. Using the Planck collaboration results [51] and the LIGO results [1], we can now provide an upper bound for the parameter \(\mu \) in Hu–Sawicki models (again assuming \(n\sim 1\)), while (17) provides the lower bound:

$$\begin{aligned} 10^{20}>\mu >10^6. \end{aligned}$$
(25)
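The lower end of this range follows directly from (17), (20) and (22). The minimal sketch below evaluates the large-\(\mu \) estimate \(n/(2\mu )<4\times 10^{-7}\) and compares it with the exact expression (20) for \(b=b_+\); the function names are illustrative and \(n=1\) is the assumption made in the text.

```python
import math

EPS = 4e-7   # bound (17) on |f'(R~) - 1|


def fprime_deviation(mu, n=1):
    """|f'(R~) - 1| for b = b_+, i.e. Eq. (20) combined with Eq. (19)."""
    return 2 * n * mu / (mu + math.sqrt(mu * (mu - 2 * n)))**2


def mu_lower_bound(n=1, eps=EPS):
    """Large-mu estimate from Eq. (22): n / (2 mu) < eps."""
    return n / (2 * eps)


mu_min = mu_lower_bound()
print(f"mu > {mu_min:.3e}")

# The exact expression crosses the bound close to the estimated value.
for mu in (1e6, 1e7):
    status = "allowed" if fprime_deviation(mu) < EPS else "excluded"
    print(f"mu = {mu:.0e}: |f' - 1| = {fprime_deviation(mu):.2e} ({status})")
```

With \(n=1\) the estimate is \(\mu \gtrsim 1.25\times 10^{6}\), consistent with the order-of-magnitude lower bound in (25).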

While we have constrained the viable parameter space to a certain range, this range is too wide to make claims on the viability of the theory. However, with more accurate measurements it will be possible to narrow the range. As the gravitational wave constraints are independent of e.g. solar system constraints, they offer valuable independent limits on f(R) and scalar-tensor gravity as well.

It is also interesting to note that the galaxy cluster limit for the graviton mass is 7 orders of magnitude tighter than the LIGO limit. If we could apply this limit, the upper limit would be of the same order as the lower limit, causing severe fine-tuning issues. This is due to many model parameters being arbitrary and not directly linked to physical quantities. One should consider whether it is a desirable feature in a theory to include several unphysical strictly constrained parameters. However, we stress that the model-dependent galaxy cluster result cannot be used directly with f(R) theories, as mentioned in the introduction.

Other f(R) models can be subjected to a similar procedure, such as the Starobinsky model [11], which is described by

$$\begin{aligned} f(R)=R+\lambda R_0\Big (\big (1+\frac{R^2}{R_0^2}\big )^{-n}-1\Big ) \end{aligned}$$
(26)

with \(\lambda \) and \(R_0\) positive constants and \(n\in \mathbb N\). For the Starobinsky model, we can follow similar procedures to find \(10^{-20}<\lambda <10^{-8}\) with the same assumption \(n\sim 1\). In a similar manner constraints can be made on any other viable model as well.

4 Discussion

We have studied f(R) theories and the naturally emerging massive graviton. With bounds on the graviton mass produced by the gravitational wave observations it is possible to constrain f(R) theories. As a case study, we concentrated on the Hu–Sawicki model. For this model we find an upper limit for the free parameter in addition to the lower limit previously presented in the literature. While the allowed range is still wide, further observations are likely to narrow the range. As the massive graviton is characteristic of f(R) theories and massive Brans–Dicke theories, the viability of these models is under increasing scrutiny.

Other f(R) theories can be subjected to the same procedure as well. As there is a known connection between f(R) gravity and scalar-tensor gravity (e.g. [52]), these theories are also a possible target for application.

The LIGO measurement accuracy is expected to rise in the future with the construction of additional detectors [1, 53]. As these are likely to lower the upper limit for the graviton mass, the range found for the free parameter for the Hu–Sawicki model is bound to narrow down even further. The effect of further gravitational wave observations on the graviton mass is discussed in [54].

Space-based detection of gravitational waves in the future with eLISA or similar programs is expected to give constraints on the graviton mass [55,56,57]. Single observations with the space-based devices are expected to reach measurements two orders of magnitude more precise than LIGO. Multiple events during the mission, however, are expected to increase the total accuracy by three orders of magnitude. This will lead to a considerably tighter range for viable f(R) models.

Detection of a non-zero graviton mass would have far-reaching consequences for f(R) theories and naturally GR itself. As the f(R) models predict a massive graviton, the detected mass would further constrain the possible parameter space. This could not be explained by standard GR and would therefore emphasize the need for modified gravity.

Another possibility is the so-far model-dependent graviton mass constraints from galaxy clusters. In order to achieve this, the effects of modified gravity on dynamics and dark matter assumptions have to be carefully considered. As these model-dependent limits are far tighter than the LIGO limits, they could provide far more stringent constraints and even rule out theories currently considered viable.