Abstract
Determination of cosmological parameters is a major goal in cosmology at present. The availability of improved data sets necessitates the development of novel statistical tools to interpret the inference from a cosmological model. In this paper, we combine principal component analysis (PCA) and the Markov Chain Monte Carlo (MCMC) method to infer the parameters of cosmological models. We use the No U-Turn Sampler (NUTS) to run the MCMC chains in the model parameter space. After determining the observable by PCA, we replace the observational and error parts of the likelihood analysis with the PCA-reconstructed observable and find the most preferred model parameter set. To demonstrate our methodology, we assume a polynomial expansion as the parametrization of the dark energy equation of state and plug it into the reconstruction algorithm as our model. After testing our methodology with simulated data, we apply it to the observed data sets: the Hubble parameter data, the Supernova Type Ia data, and the Baryon acoustic oscillation data. This method effectively constrains cosmological parameters from data, including sparse data sets.
1 Introduction
Observational evidence of the acceleration of the Universe marked the beginning of a new era in Cosmology. It is well established that the current expansion of the Universe is accelerating, and the acceleration is commonly explained by introducing a dark energy (DE) term in the Einstein equation. Dark energy is described by its equation of state (EoS) parameter \(w=P^{\prime } / \rho ^{\prime }\), where \(\rho ^{\prime }\) is the energy density and \(P^{\prime }\) is its pressure contribution. It is still unknown whether dark energy is a cosmological constant (Carroll et al. 1992; Turner & White 1997; Carroll 2001; Padmanabhan 2003) or a time-evolving entity (Peebles & Ratra 2003; Copeland et al. 2006). The \(\Lambda \)CDM (cosmological constant and cold dark matter) model corresponds to the dark energy equation of state value \(w=-1\), whereas in the case of time-evolving dark energy, the dark energy EoS parameter varies with time and can assume different values of w (Weinberg 1989; Carroll et al. 1992; Coble et al. 1997; Caldwell et al. 1998; Sahni & Starobinsky 2000; Ellis 2003; Padmanabhan 2003; Peebles & Ratra 2003; Albrecht et al. 2006; Frieman et al. 2008; Linder 2008a; Stern et al. 2010; Arjona & Nesseris 2020). Various models based on scalar, canonical, and non-canonical fields have been proposed to overcome different problems of the \(\Lambda \)CDM model (Ratra & Peebles 1988; Copeland et al. 1998; Zlatev et al. 1999; Chevallier & Polarski 2001; Padmanabhan 2002; Bagla et al. 2003; Caldwell & Linder 2005; Linder 2006; Huterer & Peiris 2007; Linder 2008b; Tsujikawa 2013; Rajvanshi & Bagla 2019; Singh et al. 2019). The discrepancies in \(H_0\) measurements and their implication for cosmological model selection are discussed in Banerjee et al. (2021) and Lee et al. (2022). The last two decades have also marked the era of precision Cosmology.
Cosmological parameters are measured to high precision utilizing the availability of new data sets (Chevallier & Polarski 2001; Planck Collaboration et al. 2018; Sangwan et al. 2018).
Maximum likelihood estimation (MLE) analysis is the most commonly used technique in cosmological parameter estimation (Nesseris & Perivolaropoulos 2004, 2005, 2007; Jassal 2009; Sangwan et al. 2018; Singh et al. 2019). The increasing availability of observational data sets has tightened the constraints on the parameters of theoretical models (Chevallier & Polarski 2001; Linder 2003a; Jassal et al. 2005; Gong & Wang 2007; Verde et al. 2013; Mukherjee 2016; Di Valentino et al. 2017; Vagnozzi et al. 2018; Bellomo et al. 2020; Bernal et al. 2020). Though it is crucial to determine the theory parameters, these methods have the observational data dependencies at their core, and new data sets reject or accept a particular model with quantified precision. Methods like principal component analysis (PCA) enable us to determine the functional form of the observable of a data set in a model-independent, non-parametric manner (Huterer & Starkman 2003; Huterer & Cooray 2005; Crittenden et al. 2009; Clarkson & Zunckel 2010; Ishida & de Souza 2011; Hojjati et al. 2012; Nesseris & García-Bellido 2013; Nair & Jhingan 2013; Zheng & Li 2017; Miranda & Dvorkin 2018; Hart & Chluba 2019; Sharma et al. 2020; Hart & Chluba 2022a). PCA is a multivariate analysis that gives the form of cosmological quantities as a function of redshift (Huterer & Starkman 2003; Huterer & Cooray 2005; Clarkson & Zunckel 2010; Zheng & Li 2017; Sharma et al. 2020). In a previous work (Sharma et al. 2020), we combined PCA and correlation coefficient calculation to give the analytical, functional form of the observable quantity when observational data sets are given as input. The method is efficient in fitting the observable; the caveat, however, is that the derived cosmological parameters, like the dark energy EoS parameter, are not determined very efficiently.
The problem arises due to the non-linear dependency of the dark energy parameter on the observational quantity at hand, for instance, the Hubble parameter and the distance modulus. To circumvent this problem, we incorporate the Markov Chain Monte Carlo method with PCA reconstruction to derive the EoS parameters for dark energy and other cosmological parameters. The EoS parameter is derived by searching for the model that best describes the functional form of the observable determined by the observational data. For the Monte Carlo method, we utilize the No U-Turn Sampler, a variant of the Hamiltonian Monte Carlo method. In this analysis, we demonstrate that the constraints on the parameters of the dark energy EoS are in line with those obtained from other methods.
This paper is structured as follows. Section 2 briefly reviews background cosmology, describes the reconstruction algorithm, and the No U-Turn sampling. In Section 3, we describe the results of our algorithm. We describe the distinguishing features of our methodology in Section 4. In Section 5, we summarize this paper’s main results.
2 Reconstruction methodology
In this section, we first discuss the methodology of the principal component analysis reconstruction (Sharma et al. 2020) and the modification to the algorithm.
2.1 Reconstruction of the functional form of Hubble parameter, distance modulus, and angular scale in terms of redshift
For a spatially flat Universe composed of dark energy and non-relativistic matter, the Hubble parameter is given by,
\[H(z) = H_0 \left[ \Omega _m (1+z)^3 + \Omega _{\textrm{DE}} \exp \left( 3 \int _0^z \frac{1+w(z^{\prime })}{1+z^{\prime }}\, dz^{\prime } \right) \right]^{1/2}. \tag{1}\]
The dark energy EoS parameter \(w(z) = P^{\prime }/\rho ^{\prime }\) can be written as:
\[w(z) = \sum _{i=0}^{m-1} \alpha _i \left( \frac{z}{1+z} \right)^{i}, \tag{2}\]
where \(H_0\) denotes the present-day value of the Hubble parameter and \(\Omega _m\), \(\Omega _{\textrm{DE}}\) are the density parameters for matter and dark energy, respectively. In Equation (2), \(m=2\) corresponds to the Chevallier–Polarski–Linder (CPL) parameterization (Linder 2003a) given by \(w(z) = w_0 + w' z /(1+z)\), \(w_0\) and \(w'\) being the present-day values of the equation of state parameter and its derivative, respectively. The equation gives the Taylor series expression of the dark energy EoS parameter in terms of \((1 - a)\), where a is the scale factor.
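As a concrete illustration, the polynomial EoS can be evaluated numerically. The sketch below (NumPy, with illustrative coefficient values, not values from the paper) encodes w(z) as an expansion in \(1-a = z/(1+z)\); for \(m=2\) it reduces to the CPL form.

```python
import numpy as np

def w_de(z, alphas):
    """Dark energy EoS as a polynomial in (1 - a) = z/(1 + z).

    `alphas` holds the m Taylor coefficients; m = 2 recovers the CPL
    form w(z) = w_0 + w' z/(1 + z).
    """
    x = z / (1.0 + z)
    return sum(a * x**i for i, a in enumerate(alphas))

# m = 2 (CPL) with illustrative values w_0 = -1, w' = 0.3
z = np.linspace(0.0, 2.0, 5)
print(w_de(z, (-1.0, 0.3)))  # w(0) = -1, rising towards w_0 + w' at high z
```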
From the functional form of the Hubble parameter, we can reconstruct the dark energy EoS parameter w(z). Differentiating Equation (1) we get,
\[w(z) = \frac{2(1+z)\, h\, h^{\prime } - 3 h^2}{3 \left[ h^2 - \Omega _m h_0^2 (1+z)^3 \right]}, \tag{3}\]
where h is the reduced Hubble parameter, given by H(z)/100 km \(\hbox {s}^{-1}\) \(\hbox {Mpc}^{-1}\), and \(h^{\prime } = dh/dz\).
The luminosity distance \(d_L(z)\) is given by,
\[d_L(z) = (1+z) \int _0^z d_H(z^{\prime })\, dz^{\prime }, \tag{4}\]
where \(d_H\), from Equation (1), is,
\[d_H(z) = \frac{c}{H(z)}, \tag{5}\]
and is related to the distance modulus as:
\[\mu (z) = 5 \log _{10}\left[ \frac{d_L(z)}{\textrm{Mpc}} \right] + 25. \tag{6}\]
We use the same expression of Equation (2) for the EoS parameter to express \(\mu (z)\).
Since \(D(z) = (H_0 / c) (1+z)^{-1} d_L(z) \), the EoS parameter in terms of distance is given by,
\[w(z) = \frac{2(1+z) D^{\prime \prime } + 3 D^{\prime }}{3 D^{\prime } \left[ \Omega _m (1+z)^3 D^{\prime 2} - 1 \right]}. \tag{7}\]
The Baryon Acoustic Oscillation (BAO) angular scale \(\theta _{b}\) is defined in terms of the angular diameter distance \(D_A\) as,
\[\theta _b(z) = \frac{r_{\textrm{drag}}}{(1+z) D_A(z)}. \tag{8}\]
Here, \(r_{\textrm{drag}}\) is the sound horizon at the drag epoch.
Following the reconstruction method of Sharma et al. (2020), we start by calculating the functional form of the reduced Hubble parameter h(z) and distance modulus \(\mu (z)\) directly from the data set, using principal component analysis. The observable of the given data set is expressed as a polynomial over an initial basis function, which creates a coefficient space. The dimension of the coefficient space is the same as the number of terms in the initial basis function. We select different patches in the coefficient space and do a \(\chi ^2\) calculation on each patch. For each patch, we get a minimum value of \(\chi ^2\). We create the PCA data matrix (\({\mathcal {D}}\)) from these minimum-\(\chi ^2\) values of each patch. We then calculate the covariance matrix \({\mathcal {C}}\) of \({\mathcal {D}}\), from which the eigenvector matrix \({\mathcal {E}}\) is calculated. \({\mathcal {E}}\) is used to diagonalize \({\mathcal {C}}\) and remove the linear correlations of the data matrix. It also creates a new set of basis functions. The observables are finally expressed in terms of the final basis functions. With the help of these new basis functions, we create the new data matrix \({\mathcal {D}}'\). To select the value of the final basis number M, we compare the correlation matrices of \({\mathcal {D}}\) and \({\mathcal {D}}'\). A comparison of the correlation matrices also aids in selecting the best initial basis variable.
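The linear-algebra core of these steps can be sketched in a few lines of NumPy. The data matrix below is synthetic (random numbers standing in for the per-patch minimum-\(\chi ^2\) coefficients); the operations of forming the covariance matrix, extracting the eigenvector matrix, rotating the data, and reading off the per-component errors \(\sigma (\alpha _i) = \lambda _i^{1/2}\) are the ones described in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the PCA data matrix D: each row holds the N
# polynomial coefficients minimizing chi^2 on one patch of coefficient
# space (real values would come from the fits described above).
D = rng.normal(size=(200, 5))
D[:, 1] += 0.8 * D[:, 0]                 # inject a linear correlation

C = np.cov(D, rowvar=False)              # covariance matrix of D
lam, E = np.linalg.eigh(C)               # eigenvalues / eigenvector matrix
order = np.argsort(lam)[::-1]            # order PCs by decreasing variance
lam, E = lam[order], E[:, order]

D_new = (D - D.mean(axis=0)) @ E         # rotated data matrix D'
C_new = np.cov(D_new, rowvar=False)
off = C_new - np.diag(np.diag(C_new))    # linear correlations now removed

sigma_alpha = np.sqrt(lam)               # sigma(alpha_i) = lambda_i^(1/2)
print(np.max(np.abs(off)), sigma_alpha[:2])
```

Truncating to the first M principal components then corresponds to keeping the first M columns of `D_new` and the first M entries of `sigma_alpha`.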
If the initial basis function is given by,
\[G = \left( f_1(z), f_2(z), \ldots , f_N(z) \right),\]
with \(f_i(z) = f(z)^{(i-1)}\), the initial expression of the observable \(\xi \) in terms of the independent variable z is given by,
\[\xi _{ini}(z) = \sum _{i=1}^{N} b_i\, f_i(z). \tag{9}\]
The value of N is the number of terms in the polynomial expression of \(\xi _{ini}(z)\); it is also the dimension of the coefficient space \(\vec {b}\). The correlation coefficient calculation determines the value of N (Kendall 1938; Sharma et al. 2020). This value must be large enough that the function can capture most of the features of the observed data set. To select the value of N, we calculate the Pearson, Spearman, and Kendall correlation coefficients for the data matrix \({\mathcal {D}}\) (Kendall 1938; Kreyszig et al. 2011). The Pearson correlation coefficient gives the linear correlation in the data set, whereas the Spearman and Kendall correlation coefficients capture the non-linear correlations of the data set. For the Spearman correlation coefficient, we calculate the rank of the data set: we arrange the ranks according to numerical value, giving rank 1 to the highest numerical value of the PCA data set, rank 2 to the second highest, and so on. The Spearman correlation coefficient is the Pearson correlation coefficient of the rank variable of the data set. The Spearman correlation indicates whether there is a monotonic relationship between the dependent and independent variables, showing whether they tend to increase or decrease together. For the Kendall correlation coefficient, we find the concordant and discordant pairs; it gives the ordinal association between the variables (Kendall 1938; Kreyszig et al. 2011).
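A minimal sketch of these correlation diagnostics, using `scipy.stats` on synthetic data: for a monotonic but non-linear relation, the rank-based Spearman and Kendall coefficients exceed the Pearson coefficient, which is the kind of signal used above to decide whether N is large enough.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0, size=100)
y = np.exp(x) + rng.normal(scale=0.05, size=x.size)  # monotonic, non-linear

r_pearson = stats.pearsonr(x, y)[0]      # linear association
r_spearman = stats.spearmanr(x, y)[0]    # rank (monotonic) association
tau_kendall = stats.kendalltau(x, y)[0]  # concordant vs. discordant pairs

# A rank coefficient exceeding the Pearson coefficient signals residual
# non-linear structure, i.e. that the polynomial order is not yet adequate.
print(r_pearson, r_spearman, tau_kendall)
```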
We choose the smallest value of N from the set for which the PCA data matrix gives a higher value of the Pearson correlation coefficient compared to the Spearman and Kendall correlation coefficients. If the expression of the observable \(\xi (z)\) in terms of the polynomial were exact, there would be no correlation between the coefficients of the polynomial expression. Our motive is to break the correlation of the coefficients and obtain a polynomial expression for \(\xi _{ini}(z)\) as close as possible to the actual \(\xi (z)\). After the reduction of the higher-order principal components (PCs), the number of terms in the polynomial of \(\xi _{ini}(z)\) is M. The final functional form of the observable is,
\[\xi _{fin}(z) = \sum _{i=1}^{M} \kappa _i\, u_i(z), \tag{10}\]
where \((u_1(z), u_2(z),\ldots , u_M(z))\) are the components of \(U = G {\mathcal {E}}\). After applying PCA, the dimension of the coefficient space \(\vec {\kappa }\) is M.
In earlier work, we showed that a derived approach, in which PCA first obtains the observable and the dark energy EoS parameter is then reconstructed from it, is a more efficient way to reconstruct the dark energy model than attempting to reconstruct it directly. Also, while we can reconstruct the Hubble parameter h(z) very well with PCA, the presence of a derivative term in Equation (3), which relates the EoS with h(z), increases the errors in the reconstruction of w(z).
We address this problem by suggesting a modified approach that bypasses the differentiation in calculating the EoS from the PCA-reconstructed Hubble parameter, distance modulus, and angular scale of BAO. This is done by combining PCA with the maximum likelihood estimation (MLE) technique, using Markov Chain Monte Carlo (MCMC) to search for the best-fit dark energy model to the PCA-reconstructed Hubble parameter, distance modulus, and angular scale. We replace the observational part of the MLE calculation with the best-fit curves of h(z), \(\mu (z)\), and \(\theta _b(z)\) as functions of redshift obtained via PCA. This removes the dependence on the number of observational data points. The analysis gives us the machinery to produce the most probable values of the model parameters by constraining the theory with the reconstructed PCA data. The errors are functions created from the covariance matrix of the PCA data matrix (Huterer & Starkman 2003; Clarkson & Zunckel 2010; Sharma et al. 2020). The error comprises the eigenvalues and eigenfunctions of the covariance matrix (Huterer & Starkman 2003; Clarkson & Zunckel 2010; Sharma et al. 2020). The eigenvalues of the covariance matrix quantify the error in the reconstruction of the observable \(\xi (z)\). If \(\lambda _i\) are the eigenvalues of the covariance matrix \({\mathcal {C}}\), then the error associated with each of the components is \(\sigma (\alpha _i) = \lambda _i ^ {1/2}\). For M final terms, we have the final error as,
\[\sigma ^2(z) = \sum _{i=1}^{M} \sigma ^2(\alpha _i)\, u_i^2(z). \tag{11}\]
Equation (11) gives the error function for a particular reconstructed curve, and we have the error as a function of redshift (Huterer & Starkman 2003; Clarkson & Zunckel 2010).
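Schematically, the modified likelihood replaces the observed points with the PCA-reconstructed curve evaluated on an \(n_d\)-point grid, with the PCA error function as the uncertainty. The toy sketch below (NumPy; the curves, the error level, and the function names are hypothetical stand-ins, not the paper's actual pipeline) shows the structure of this \(\chi ^2\):

```python
import numpy as np

# The "observation" entering chi^2 is the PCA-reconstructed curve xi_rec(z)
# sampled at n_d grid points, with the PCA error function sigma(z) as the
# uncertainty; the curves below are hypothetical stand-ins.
def chi2_pca(theta, model, xi_rec, sigma, z_grid):
    resid = (model(z_grid, theta) - xi_rec(z_grid)) / sigma(z_grid)
    return np.sum(resid**2)

xi_rec = lambda z: 0.685 * np.sqrt(0.3 * (1 + z)**3 + 0.7)   # toy h(z) curve
sigma = lambda z: 0.02 * np.ones_like(z)                     # toy error function
model = lambda z, th: th[1] * np.sqrt(th[0] * (1 + z)**3 + 1 - th[0])

z_grid = np.linspace(0.0, 2.0, 600)   # n_d = 600 evaluation points
print(chi2_pca((0.3, 0.685), model, xi_rec, sigma, z_grid))
print(chi2_pca((0.35, 0.70), model, xi_rec, sigma, z_grid))
```

An MCMC sampler then explores `theta` with this \(\chi ^2\) defining the likelihood, independently of how many observed data points originally entered the PCA step.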
2.2 No-U-turn sampler
To implement the MCMC search, we use the No-U-turn sampler (NUTS), which effectively chooses the best parameter region. The No-U-turn sampler modifies Hamiltonian Monte Carlo (HMC) so that the algorithm intrinsically selects the leapfrog steps (Gelman & Rubin 1992; Hoffman & Gelman 2011; Salvatier et al. 2016). The selection of leapfrog steps is crucial in solving the Hamiltonian differential equations of HMC. At every step, NUTS proceeds by creating a binary tree. In this binary tree, two particles representing progress in the forward and backward directions are created. If these two are represented as \((\mathbf {q_n^+}, \mathbf {p_n^+})\) and \((\mathbf {q_n^-}, \mathbf {p_n^-})\), then the NUTS conditions can be given by,
\[(\mathbf {q_n^+} - \mathbf {q_n^-}) \cdot \mathbf {p_n^-} \ge 0 \quad \textrm{and} \quad (\mathbf {q_n^+} - \mathbf {q_n^-}) \cdot \mathbf {p_n^+} \ge 0.\]
In HMC, we move along an elliptical path in the phase space of \({\textbf{q}}\) and \({\textbf{p}}\) (Gelman & Rubin 1992; Hoffman & Gelman 2011). The momentum variable \({\textbf{p}}\) is introduced to ensure the exploration of a wider area of the parameter space; the elliptical contour is obtained by solving the dynamical Hamiltonian equations. In NUTS, once we have traversed half of the elliptical path, the signs of the momentum and position variables change, and we stop. This makes NUTS more efficient than HMC, in which there is no way to ascertain whether we are revisiting a region of parameter space that has already been explored.
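The U-turn criterion can be illustrated with a toy leapfrog integration for a standard normal target. This is only a sketch of the stopping rule, not the full recursive tree-building of NUTS:

```python
import numpy as np

def leapfrog(q, p, grad_logp, step):
    """One leapfrog step of the Hamiltonian dynamics for log-density log p(q)."""
    p = p + 0.5 * step * grad_logp(q)
    q = q + step * p
    p = p + 0.5 * step * grad_logp(q)
    return q, p

def no_u_turn(q_minus, q_plus, p_minus, p_plus):
    """NUTS criterion: keep going only while the trajectory is not doubling back."""
    dq = q_plus - q_minus
    return (dq @ p_minus >= 0) and (dq @ p_plus >= 0)

# Standard normal target, so grad log p(q) = -q and trajectories are ellipses
# in (q, p); the criterion halts the run once the path turns back on itself.
grad = lambda q: -q
q0, p0 = np.array([1.0]), np.array([1.0])
q, p = q0, p0
steps = 0
while no_u_turn(q0, q, p0, p) and steps < 1000:
    q, p = leapfrog(q, p, grad, 0.1)
    steps += 1
print(steps)  # a finite, automatically chosen number of steps
```

The trajectory length is thus determined by the geometry of the target distribution rather than by a hand-tuned leapfrog count, which is what makes NUTS self-tuning relative to plain HMC.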
We choose the value of the total number of sample points \(M_s\) by checking the convergence limit using the Gelman–Rubin statistic (Salvatier et al. 2016). The Gelman–Rubin statistic for convergence is based on the notion that multiple converged chains should look similar; if they do not, they have not converged. It is standard practice to run multiple MCMC chains to test for convergence. The scale reduction factor \(\hat{r_o}\) is used to check Gelman–Rubin convergence. There are two main ways in which sequences of MCMC iterations fail to converge. In one case, the chains run in different parts of the target distribution that have drastically different posterior probability densities. In the other, the chains fail to attain stationarity. We change the value of \(M_s\) until we get \(\hat{r_o} = 1\), which confirms convergence.
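A minimal implementation of the Gelman–Rubin diagnostic (the split-chain refinements used by modern samplers are omitted) makes the two failure modes concrete:

```python
import numpy as np

def gelman_rubin(chains):
    """Scale reduction factor r_hat for an array of chains, shape (m, n)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(2)
converged = rng.normal(size=(4, 2000))           # four chains on one target
stuck = converged + 2.0 * np.arange(4)[:, None]  # chains in different regions
print(gelman_rubin(converged))  # close to 1
print(gelman_rubin(stuck))      # well above 1
```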
3 Results
We do the analysis described above for the Hubble parameter data, the Cepheid-calibrated SNIa data, and the BAO data set. We present the results of both the simulated and real data sets for the Hubble parameter. The simulated data set is created using the parameter values fixed by Planck Collaboration et al. (2018). For the simulated \(\Lambda \)CDM data set, we have fixed the values of the cosmological parameters as \(\Omega _m = 0.3\) and \(h_0 = 0.685\). We test the validity of our method and check if the analysis picks up these values. We then apply the method to the real data sets, namely the Cosmic-Chronometer data set (Simon et al. 2005; Moresco et al. 2012, 2020; Moresco 2015; Ratsimbazafy et al. 2017; Jiao et al. 2023; Jimenez et al. 2023) as well as the SNIa data set (Deng & Wei 2018; Riess et al. 2021; Scolnic et al. 2022; Uddin et al. 2023), and then compare with the usual likelihood analysis results. We also use the transverse BAO data set from Carvalho et al. (2016), Alcaniz et al. (2017), de Carvalho et al. (2018) and Carvalho et al. (2020), which consists of 15 transverse BAO measurements (Nunes et al. 2020) that are calculated using the public data releases of the Sloan Digital Sky Survey (SDSS) (York et al. 2000), without assuming a fiducial cosmological model (Sánchez et al. 2011; Carnero et al. 2012).
To get the reconstructed curves of the reduced Hubble parameter, distance modulus, and angular scale, we use \(f(z) = {z}/{(1 + z)}\) as the basis variable for the simulated as well as the observational data sets. This initial basis function gives the best reconstruction, as shown in Sharma et al. (2020). Here, we can choose the value of \(n_d\), which is the number of data points in the observed part of the MLE. We run the Markov Chain Monte Carlo (MCMC) chain to search for the minimum \(\chi ^2\), which gives us the likelihood of the PCA data set. In the MCMC analysis, for the Hubble parameter data set, we use normal priors \({\mathcal {N}}(0.70, 0.2)\) and \({\mathcal {N}}(0.35, 0.1)\) for the reduced Hubble constant \(h_0\) and \(\Omega _m\), respectively. For the DE parameters \(\vec {\alpha }\), we take \({\mathcal {N}}(0, 3)\). Here, \({\mathcal {N}}(x_{\textrm{mean}}, x_{\textrm{std}})\) represents the normal probability density function with mean \(x_{\textrm{mean}}\) and standard deviation \(x_{\textrm{std}}\). For the Cepheid-calibrated SNIa data, we use the data archives given in Riess et al. (2021), Uddin et al. (2023) and Deng & Wei (2018). In the MLE part, we take a half-normal prior with a standard deviation of 0.4 for \(\Omega _m\), and for \(\vec {\alpha }\) we take \({\mathcal {N}}(-2, 1.5)\). In the case of the BAO data set, we use the same priors as for the Hubble parameter data set.
We choose the largest possible value for \(n_d\), limited by the available computing power. We then check the results for different values of \(n_d\) and \(M_s\), and find the mean, median, and mode of the posterior distribution. For \(m = 3\) in Equation (2), we analyze different values of \(n_d\) and \(M_s\); \(m=3\) is the CPL parameterization with the next-order term included (Linder 2003a). We vary \(n_d\) in the range 100–800 and \(M_s\) in the range 1000–800000, and determine the mean, median, and mode, as well as the 1\(\sigma \) and 2\(\sigma \) ranges, of \(\Omega _m\), \(h_0\), and \(\vec {\alpha }\).
In Figures 1 and 3, we show results for \(n_d = 600\), where we fix the number of sample points at \(M_s = 800000\). This particular choice of \(n_d\) and \(M_s\) gives us the closest approximation of the model parameters for the simulated Hubble parameter data. We also see that, around these values of \(n_d\) and \(M_s\), the \(1 \sigma \) and \(2 \sigma \) ranges of the model parameters vary least as these two quantities are changed. In particular, between \((n_d, M_s) = (600, 800000)\) and (1000, 50000), the differences in the \(1 \sigma \) and \(2 \sigma \) ranges are of the order of \({\mathcal {O}}(10^{-1})\) for \(\vec {\alpha }\) and \({\mathcal {O}}(10^{-2})\) or less for \(\Omega _m\) and \(h_0\). For the simulated Hubble parameter data set, the means of the posteriors of \(h_0\) and \(\Omega _m\) from the algorithm are \(h_0 = 0.68\) and \(\Omega _m = 0.34\), which are very close to the values assumed in producing the simulated data set, \(h_0 = 0.685\) and \(\Omega _m = 0.3\). Table 1 shows the \(1\sigma \) and \(2\sigma \) ranges for the parameters, along with their best-fit values. The means of the posteriors of \(h_0\) and \(\Omega _m\) for the real Hubble parameter data are \(h_0 = 0.71\) and \(\Omega _m = 0.35\), respectively. From the mode plots of the posteriors of the model likelihood in Figures 2 and 4, we can see the difference between the old cosmic chronometer data set (Simon et al. 2005; Moresco et al. 2012; Zhang et al. 2014; Moresco 2015; Ratsimbazafy et al. 2017) and the new cosmic chronometer data set (Moresco et al. 2020; Jimenez et al. 2023; Jiao et al. 2023). For the particular cosmological model of Equation (2), PCA reconstruction with the NUTS algorithm brings w(z) closer to \(w(z) = -1\) for the new cosmic chronometer data set than for the old one. We also present our results for the SNIa data set (Deng & Wei 2018; Riess et al. 2021; Scolnic et al. 2022; Uddin et al. 2023) as well as the BAO data set (Carvalho et al. 2016; Alcaniz et al. 2017; de Carvalho et al. 2018; Nunes et al. 2020). For the SNIa data set, we show our results for \(w_{\textrm{CDM}}\) and \(w_{\textrm{CPL}}\) in Figure 5. For the model of Equation (2) with \(m=3\), we show our results in Figure 6. We present our results for \((n_d, M_s) = (100, 80000)\) for the SNIa data set. Table 1 gives a comparison; this table can be extended to different data sets and models.
In Figure 7, we present results for the BAO data set, using the DE model with \(m=3\) of Equation (2). We see, from both Figure 7 and the combined joint analysis plot in Figure 8, that BAO gives tighter constraints in comparison to the Hubble parameter and SNIa data sets. For Figures 7 and 8, we use \((n_d, M_s)=(50, 10000)\); this choice is dictated by the available computational power. From Figure 7, we can see that PCA \(+\) MCMC gives a very good reconstruction even with a small number of sample points \(M_s\). The 1\(\sigma \) range of \(r_{\textrm{drag}}\) from our method is [146.1, 148.2].
It is also evident from Figures 2–7 that \(w(z) = -1 \) is well within the \(1\sigma \) range of the w(z) parameters (\(\vec {\alpha }\)). The plots of w(z) and \(\rho (z)/\rho _0\) are similar for the real and simulated data sets. The maximum differences in the w(z) and \(\rho (z) / \rho _0\) curves between the simulated and real Hubble parameter data are 0.445 and 0.026, respectively. Here, \(\rho (z)\) and \(\rho _0\) are the total energy densities at redshift z and at present, respectively. The dark energy density plots, \(\rho ^\prime (z) / \rho ^\prime _0\) vs. z, for the simulated and real Hubble parameter data sets are also similar, and the maximum difference between them is 0.31.
We restrict ourselves to the \(m=3\) cut-off in Equation (2) for \((n_d, M_s)=(600, 800000)\), largely due to the computational power available to us at present. In Table 2 of the supplementary information, for \((n_d, M_s)=(100, 100)\), we use the algorithm up to \(m=10\). We find that, for the Hubble parameter data set, obtaining better constraints on the parameter space for \(m \ge 4\) requires \((n_d, M_s) \ge (600, 800000)\). In follow-up work, we will optimize the algorithm to constrain the parameter space with large enough values of m and draw the physical conclusions. The parametrizations of w(z) and \(({\rho ^{\prime }_{de}})/({\rho ^{\prime }_{0}})\) have the same physical implication, as shown in Figures 2 and 4. The dark energy density can be derived analytically for these parameterizations, and fixing the dark energy EoS parameter determines the evolution of the energy density as a function of time.
In the MCMC runs for both the real and simulated Hubble parameter data sets, with \((n_d, M_s) = (600, 800000)\), the value of the Gelman–Rubin convergence factor \(\hat{r_o}\) is 1. For the SNIa MCMC run, \((n_d, M_s) = (100, 80000)\) gives \(\hat{r_o} = 1\). To check the convergence, we not only check the \(\hat{r_o}\) factor and eliminate those iterations that do not satisfy the \(\hat{r_o} \approx 1\) criterion, but also check the trace plots, rank bar plots, and rank vertical line plots of the posterior sampling for visual confirmation (Gelman & Rubin 1992; Cowles & Carlin 1996; Brooks & Gelman 1998).
The error bars of the parameters in Table 1 are derived with the error function from PCA taken into account. Hence, the \(1\sigma \) and \(2\sigma \) ranges are affected by the error functions we introduce in the MLE. For \(w_{cpl}\), with the Pantheon data set, when we consider a half-normal probability distribution for the error part of the MLE, the range of \(h_0\) changes to [0.6005, 0.6584], which is almost three times smaller than the range obtained when the PCA error function is considered. The PCA error function is created solely from the data structure we provide in the first step of PCA. As the error bars on the original data points improve, the parameter ranges will shrink significantly.
We also analyze the classical Metropolis–Hastings (MH) and Hamiltonian Monte Carlo (HMC) samplers. For comparison with MH and HMC, we do the analysis with \((n_d, M_s) = (600, 800000)\) and \((n_d, M_s) = (100, 80000)\) for the Hubble parameter and Supernovae data, respectively. Our analysis shows that the NUTS and HMC samplers perform better than the MH sampler. For more than six continuous parameters and with the same CPU power, NUTS completes the analysis faster by a factor of 2.4. Details of the time taken by MH and NUTS are given in Table 2 of the supplementary information. The plots for the real Hubble parameter data set, with MH, are shown in Figure 15 of the supplementary information. NUTS also improves on HMC: after picking the leapfrog steps, NUTS stops automatically once the NUTS conditions are satisfied. It has been explicitly shown in Hoffman & Gelman (2011) that the NUTS algorithm is more efficient. Convergence plots for the NUTS, HMC, and MH samplers, in the case of the Hubble parameter data set, are shown in Figures 9–18 (see supplementary information).
4 A comparison with other methods
The reconstruction of H(z), \(\mu (z)\), and \(\theta _b(z)\) from the PCA algorithm, which is described in Section 2 and in Sharma et al. (2020), is qualitatively different from the other PCA techniques employed in the literature (Huterer & Starkman 2003; Huterer & Cooray 2005; Clarkson & Zunckel 2010; Ishida & de Souza 2011; Zheng & Li 2017). The starting assumption is that the functions h(z), \(\mu (z)\), and \(\theta _b(z)\) vary smoothly with redshift z. This is a reasonable choice, as described by the current data sets (Simon et al. 2005; Moresco et al. 2012, 2020; Moresco 2015; Carvalho et al. 2016, 2020; Alcaniz et al. 2017; Ratsimbazafy et al. 2017; Deng & Wei 2018; de Carvalho et al. 2018; Riess et al. 2021; Scolnic et al. 2022; Jiao et al. 2023; Jimenez et al. 2023; Uddin et al. 2023). Different variants of PCA techniques have been adopted in Liu et al. (2019) and Liu et al. (2016): Liu et al. (2019) uses an error model before creating different sets of simulated Hubble data to construct the covariance matrix, while Liu et al. (2016) combines the weighted least-squares method with PCA.
We apply MLE to the observed data sets using the reconstructed functional form from PCA. A cosmological model enters only in the final part of our methodology, where we use the MLE technique. An MCMC chain with NUTS, run over the model-independent reconstruction of PCA, gives unbiased constraints on the model parameters. Fisher matrix computation is one of the major approaches to PCA reconstruction (Huterer & Starkman 2003; Huterer & Cooray 2005; Crittenden et al. 2009; Clarkson & Zunckel 2010; Ishida & de Souza 2011; Hojjati et al. 2012; Nair & Jhingan 2013; Nesseris & García-Bellido 2013; Zheng & Li 2017; Miranda & Dvorkin 2018; Hart & Chluba 2019, 2022a). Our methodology calculates the covariance matrix, which quantifies the correlations and uncertainties, directly from the PCA data matrix described in Section 2. Reduction of dimension, a distinctive feature of the PCA reconstruction, removes the noise part from the PCA data matrix. Therefore, constraining the parameters by replacing the observational part with the PCA-reconstructed part constrains them more reliably. One important feature of the (PCA \(+\) MCMC) methodology is that it also works for sparse data sets. In comparison to classical techniques, it can be easily generalized to a higher-dimensional parameter space with little expense of computational time, as described in Section 2 and shown quantitatively in the Appendix (see supplementary information).
5 Conclusions
This paper combines the principal component analysis reconstruction with the Markov Chain Monte Carlo tool to determine cosmological parameters. We assume the Taylor series expansion of the EoS parameter in terms of the scale factor as the parameterization of the dark energy EoS. When the method of PCA is combined with the correlation coefficient calculation and the MCMC tool, we can determine the number of points in the observational part using the maximum likelihood method. We use the No-U-turn sampler for this analysis.
First, we test the method on simulated data and check whether the values assumed for the cosmological parameters are reconstructed effectively. We see that the predictions for the model parameters are consistent with the assumed values. The parameter estimation does not depend strongly on the prior probability assumption, and the idea can be generalized to other data sets and different sampling techniques. The relation between the Hubble parameter and the EoS of dark energy contains the first derivative of the Hubble parameter, which introduces an unwanted error in the EoS predictions. Similarly, for the SNIa and BAO data sets, the relations of the distance modulus and the angular scale to the EoS of dark energy contain first- and second-order derivatives.
The present method eliminates the error that arises from the first- and higher-order derivatives of the observable in inferring the value and ranges of the EoS of dark energy. In this work, we only use the error function that comes directly from the PCA algorithm; one can use different error functions in the error part of the MLE as well. It is clear from the results that, for the simple model of dark energy we take, the allowed range of cosmological parameters is consistent with other analyses, and the cosmological constant model is well within the allowed range of models for both the Hubble parameter and distance modulus data sets. This analysis can be extended to other dark energy models. The advantage here is that the complete functional form of the observable of the data set is obtained; in the present work, the Hubble parameter and distance modulus as functions of redshift are determined. The second step is deriving the dark energy EoS parameter. The method is suitable for different types of data and merits future analysis. It depends only on the model-independent reconstruction of the data set and its associated error. Improvement of even a single data point leads to an increase in the constraining power of our method. With the upcoming improved Hubble parameter, SNIa, and BAO data sets, the application of the method will lead to better constraints and a much easier distinction between different dark energy models. The method discussed here can also be used as a model selection tool for data sets with fewer data points.
6 Supplementary information
Appendices A and B and Figures 9–18 have been given as supplementary information. The online version contains supplementary material available at https://doi.org/10.1007/s12036-024-10009-9 and at https://www.ias.ac.in/listing/articles/joaa/.
References
Albrecht A., Bernstein G., Cahn R. et al. 2006, arXiv e-prints, astro
Alcaniz J. S., Carvalho G. C., Bernui A., Carvalho J. C., Benetti M. 2017, Fundam. Theor. Phys., 187, 11
Arjona R., Nesseris S. 2020, arXiv e-prints, arXiv:2012.12202
Bagla J. S., Jassal H. K., Padmanabhan T. 2003, Phys. Rev. D, 67, 063504
Banerjee A., Cai H., Heisenberg L. et al. 2021, Phys. Rev. D, 103, L081305
Bellomo N., Bernal J. L., Scelfo G., Raccanelli A., Verde L. 2020, J. Cosmology Astropart. Phys., 2020, 016
Bernal J. L., Bellomo N., Raccanelli A., Verde L. 2020, J. Cosmology Astropart. Phys., 2020, 017
Brooks S. P., Gelman A. 1998, Journal of Computational and Graphical Statistics, 7, 434
Caldwell R. R., Dave R., Steinhardt P. J. 1998, Phys. Rev. Lett., 80, 1582
Caldwell R. R., Linder E. V. 2005, Phys. Rev. Lett., 95, 141301
Carnero A., Sánchez E., Crocce M., Cabré A., Gaztañaga E. 2012, MNRAS, 419, 1689
Carroll S. M. 2001, Living Rev. Rel., 4, 1
Carroll S. M., Press W. H., Turner E. L. 1992, Annual Review of Astronomy and Astrophysics, 30, 499
Carvalho G. C., Bernui A., Benetti M., Carvalho J. C., Alcaniz J. S. 2016, Phys. Rev. D, 93, 023530
Carvalho G. C., Bernui A., Benetti M. et al. 2020, Astroparticle Physics, 119, 102432
Chevallier M., Polarski D. 2001, Int. J. Mod. Phys. D, 10, 213
Clarkson C., Zunckel C. 2010, Phys. Rev. Lett., 104, 211301
Coble K., Dodelson S., Frieman J. A. 1997, Phys. Rev. D, 55, 1851
Copeland E. J., Liddle A. R., Wands D. 1998, Phys. Rev. D, 57, 4686
Copeland E. J., Sami M., Tsujikawa S. 2006, Int. J. Mod. Phys. D, 15, 1753
Cowles M. K., Carlin B. P. 1996, Journal of the American Statistical Association, 91, 883
Crittenden R. G., Pogosian L., Zhao G.-B. 2009, JCAP, 0912, 025
de Carvalho E., Bernui A., Carvalho G. C., Novaes C. P., Xavier H. S. 2018, J. Cosmology Astropart. Phys., 2018, 064
Deng H.-K., Wei H. 2018, European Physical Journal C, 78, 755
Di Valentino E., Melchiorri A., Linder E. V., Silk J. 2017, Phys. Rev. D, 96, 023523
Ellis J. 2003, Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 361, 2607
Frieman J. A., Turner M. S., Huterer D. 2008, Annual Review of Astronomy & Astrophysics, 46, 385
Gelman A., Rubin D. B. 1992, Statistical Science, 7, 457
Gong Y.-G., Wang A. 2007, Phys. Rev. D, 75, 043520
Hart L., Chluba J. 2019, arXiv e-prints, arXiv:1912.04682
Hart L., Chluba J. 2022, MNRAS, 510, 2206
Hoffman M. D., Gelman A. 2011, arXiv e-prints, arXiv:1111.4246
Hojjati A., Zhao G.-B., Pogosian L. et al. 2012, Phys. Rev. D, 85, 043508
Huterer D., Cooray A. 2005, Phys. Rev. D, 71, 023506
Huterer D., Peiris H. V. 2007, Phys. Rev. D, 75, 083503
Huterer D., Starkman G. 2003, Phys. Rev. Lett., 90, 031301
Ishida E. E. O., de Souza R. S. 2011, A&A, 527, A49
Jassal H. K. 2009, Phys. Rev. D, 79, 127301
Jassal H. K., Bagla J. S. Padmanabhan T. 2005, MNRAS, 356, L11
Jiao K., Borghi N., Moresco M., Zhang T.-J. 2023, ApJS, 265, 48
Jimenez R., Moresco M., Verde L., Wandelt B. D. 2023, arXiv e-prints, arXiv:2306.11425
Kendall M. G. 1938, Biometrika, 30, 81
Kreyszig E., Kreyszig H., Norminton E. J. 2011, Advanced engineering mathematics, 10th edn (Hoboken, N.J.: Wiley)
Lee B.-H., Lee W., Ó Colgáin E., Sheikh-Jabbari M. M., Thakur S. 2022, J. Cosmology Astropart. Phys., 2022, 004
Linder E. V. 2003, Phys. Rev. Lett., 90, 091301
Linder E. V. 2006, Phys. Rev. D, 73, 063010
Linder E. V. 2008a, Reports on Progress in Physics, 71, 056901
Linder E. V. 2008b, Gen. Rel. Grav., 40, 329
Liu Z.-E., Qin H.-F., Zhang J., Zhang T.-J., Yu H.-R. 2019, Physics of the Dark Universe, 26, 100379
Liu Z.-E., Yu H.-R., Zhang T.-J., Tang Y.-K. 2016, Phys. Dark Univ., 14, 21
Miranda V., Dvorkin C. 2018, Phys. Rev. D, 98, 043537
Moresco M. 2015, MNRAS, 450, L16
Moresco M., Jimenez R., Verde L., Cimatti A., Pozzetti L. 2020, ApJ, 898, 82
Moresco M., Verde L., Pozzetti L., Jimenez R., Cimatti A. 2012, JCAP, 1207, 053
Mukherjee A. 2016, Mon. Not. Roy. Astron. Soc., 460, 273
Nair R., Jhingan S. 2013, J. Cosmology Astropart. Phys., 2013, 049
Nesseris S., García-Bellido J. 2013, Phys. Rev. D, 88, 063521
Nesseris S., Perivolaropoulos L. 2004, Phys. Rev. D, 70, 043531
Nesseris S., Perivolaropoulos L. 2005, Phys. Rev. D, 72, 123519
Nesseris S., Perivolaropoulos L. 2007, J. Cosmology Astropart. Phys., 2007, 018
Nunes R. C., Yadav S. K., Jesus J. F., Bernui A. 2020, MNRAS, 497, 2133
Padmanabhan T. 2002, Phys. Rev. D, 66, 021301
Padmanabhan T. 2003, Phys. Rept., 380, 235
Peebles P. J. E., Ratra B. 2003, Rev. Mod. Phys., 75, 559
Planck Collaboration, Aghanim N., Akrami Y. et al. 2018, arXiv e-prints, arXiv:1807.06209
R Core Team 2013, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria
Rajvanshi M. P., Bagla J. S. 2019, Journal of Astrophysics and Astronomy, 40, 44
Ratra B., Peebles P. J. E. 1988, Phys. Rev. D, 37, 3406
Ratsimbazafy A. L., Loubser S. I., Crawford S. et al. 2017, Mon. Not. Roy. Astron. Soc., 467, 3239
Riess A. G., Casertano S., Yuan W. et al. 2021, ApJ, 908, L6
Sahni V., Starobinsky A. 2000, International Journal of Modern Physics D, 9, 373
Salvatier J., Wiecki T. V., Fonnesbeck C. 2016, PeerJ Computer Science, 2, e55
Sánchez E., Carnero A., García-Bellido J. et al. 2011, MNRAS, 411, 277
Sangwan A., Tripathi A., Jassal H. K. 2018, arXiv e-prints, arXiv:1804.09350
Scolnic D. et al. 2022, Astrophys. J., 938, 113
Sharma R., Mukherjee A., Jassal H. K. 2020, arXiv e-prints, arXiv:2004.01393
Simon J., Verde L., Jimenez R. 2005, Phys. Rev. D, 71, 123001
Singh A., Sangwan A., Jassal H. K. 2019, JCAP, 1904, 047
Stern D., Jimenez R., Verde L., Kamionkowski M., Stanford S. A. 2010, J. Cosmology Astropart. Phys., 2010, 008
Tsujikawa S. 2013, Classical and Quantum Gravity, 30, 214003
Turner M. S., White M. 1997, Phys. Rev. D, 56, R4439
Uddin S. A., Burns C. R., Phillips M. M., et al. 2023, arXiv e-prints, arXiv:2308.01875
Vagnozzi S., Dhawan S., Gerbino M. et al. 2018, Phys. Rev. D, 98,
Vehtari A., Gelman A., Simpson D., Carpenter B., Bürkner P.-C. 2019, arXiv e-prints, arXiv:1903.08008
Verde L., Protopapas P., Jimenez R. 2013, Physics of the Dark Universe, 2, 166
Weinberg S. 1989, Rev. Mod. Phys., 61, 1
York D. G., Adelman J., Anderson J. E. et al. 2000, AJ, 120, 1579
Zhang C., Zhang H., Yuan S. et al. 2014, Research in Astronomy & Astrophysics, 14, 1221
Zheng W., Li H. 2017, Astropart. Phys., 86, 1
Zlatev I., Wang L.-M., Steinhardt P. J. 1999, Phys. Rev. Lett., 82, 896
Data availability
The observational data sets used in the analysis are publicly available and duly cited in the text (Simon et al. 2005; Moresco et al. 2012; Zhang et al. 2014; Moresco 2015; Ratsimbazafy et al. 2017; Scolnic et al. 2022). The simulated data set can be generated from the standard \(\Lambda \)CDM model using Equations (1)–(6) of the text.
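A minimal sketch of such a simulated data set, assuming illustrative fiducial values, mock redshifts, and a uniform 5% error level (none of which are the paper's exact choices):

```python
import numpy as np

# Mock H(z) data from flat LCDM; redshift coverage and errors are assumptions.
rng = np.random.default_rng(42)
H0, Om = 70.0, 0.3
z = np.sort(rng.uniform(0.07, 2.0, 30))           # mock cosmic-chronometer redshifts
H_true = H0 * np.sqrt(Om * (1 + z)**3 + 1 - Om)   # fiducial LCDM Hubble parameter
sigma = 0.05 * H_true                             # assumed 5% measurement errors
H_obs = H_true + sigma * rng.standard_normal(z.size)
```

Feeding such a mock set through the pipeline and checking that the fiducial parameters are recovered is the consistency test described in the text.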
Cite this article
SHARMA, R., JASSAL, H.K. Inference of cosmological models with principal component analysis. J Astrophys Astron 45, 24 (2024). https://doi.org/10.1007/s12036-024-10009-9