1 Introduction

The Ionospheric Connection Explorer (ICON) is a NASA Heliophysics mission that was launched on 10 October 2019 to explore the strong variabilities of the Earth’s thermosphere and ionosphere in response to solar forcing and forcing from the lower atmosphere (Immel et al. 2018, 2023). The Far Ultraviolet (FUV) imager, one of the four scientific instruments on the ICON mission, was designed to observe the ultraviolet airglow emission of atomic oxygen (OI) at 135.6 nm and the N2 Lyman-Birge-Hopfield emission around 157 nm (Frey et al. 2023; Mende et al. 2017; Wilkins et al. 2017). On the nightside, limb scans of the OI 135.6 nm emission can be used to determine the ionospheric parameters through radiative transfer modeling and regularized inversion (Kamalabadi et al. 2018; Qin et al. 2015, 2016). Before the unfortunate loss of communication with the spacecraft on 25 November 2022, ICON completed its prime mission and collected more than 1000 days of observations (Immel et al. 2023). The FUV observations acquired by ICON can provide long-term and global-scale information of the nighttime ionosphere for the studies of the equatorial anomalies and plasma density disturbances, plasma bubbles and blobs, and conjugate photoelectron energy spectra (e.g., Park et al. 2022; Urco et al. 2021; Wautelet et al. 2019). Those data can also be incorporated into physics-based and assimilative models to provide realistic specifications of ionospheric parameters for space weather monitoring, nowcasting, and forecasting (e.g., Bust and Immel 2020; Huba et al. 2017).

For the aforementioned studies, accurate determination of the nighttime ionospheric parameters from the ICON FUV observations is highly desirable. Currently, ICON uses the radiative transfer model developed by Qin et al. (2015) and Tikhonov regularization (Aster et al. 2013; Menke 2012) for the inversion of the nighttime OI 135.6 nm emission to retrieve the ionospheric parameters (Kamalabadi et al. 2018). Recent analyses revealed a systematic bias between the ICON retrievals and external radio measurements (Wautelet et al. 2021, 2023). Specifically, Wautelet et al. (2023) found that the ICON retrievals are on average ∼6%-10% higher in \(n_{\mathrm{m}}F_{2}\) and ∼7 km higher in \(h_{\mathrm{m}}F_{2}\) than those measured by ionosondes and satellite radio occultation. While the model of Qin et al. (2015) has been validated by comparison with the Monte Carlo radiative transfer model of Qin and Harding (2020) and by analyzing the nighttime OI 135.6 nm emission observed by the Global-Scale Observations of the Limb and Disk (GOLD) mission (Qin et al. 2023), Tikhonov regularization is known to bias the solutions to discrete inverse problems due to the use of a deterministic penalty function that tends to smooth the solutions by penalizing high-frequency components (Aster et al. 2013; Menke 2012). Despite awareness of this bias, Tikhonov regularization was used for the ICON FUV data inversion because it is a conceptually simple and easily implementable method that had been demonstrated with synthetic observations to be able to meet the mission requirements (i.e., less than 10% precision in the estimated \(n_{\mathrm{m}}F_{2}\) and a maximum 20-km error in the estimated \(h_{\mathrm{m}}F_{2}\)) prior to the launch of the mission (Qin et al. 2016; Kamalabadi et al. 2018). Nonetheless, with the completion of the prime mission, development of new methods for more accurate data inversion becomes desirable for a better understanding of the nighttime ionospheric variations.

In this study, we develop a Bayesian framework for accurate determination of the ionospheric parameters from the ICON FUV observations. The Bayesian method regularizes the solutions statistically by incorporating a prior distribution for the parameters to be determined (Aster et al. 2013; Kamalabadi et al. 1999). This method is particularly powerful for solving inverse problems in the case when prior information is abundantly available, which is true for the Earth’s ionosphere that has been measured and studied for decades with various means. This paper is structured as follows: In Sect. 2 we describe the ICON FUV observations and the external radio measurements that are analyzed in this study. In Sect. 3 we describe the forward model and the inversion methods that are used for our data analysis. In Sect. 4 we first analyze synthetic observations to show that Tikhonov regularization can introduce a systematic bias in the ionospheric retrievals. We then describe the details of the Bayesian method and demonstrate that it can retrieve the ionospheric parameters with a negligible bias. The improved ionospheric retrievals with the Bayesian method can be used to better understand the global-scale variations of the nighttime ionosphere.

2 Observations

The ICON spacecraft entered a circular orbit of ∼590 km with an inclination of ∼\(27^{\circ}\) in October 2019 to observe the Earth’s thermosphere and ionosphere above mid-to-low latitudes (Immel et al. 2023). The FUV imager was directed perpendicular to the satellite motion with an \(18^{\circ} \times 24^{\circ}\) (horizontal×vertical) field-of-view (FOV) and with the center ray pointing \(20^{\circ}\) below the local horizontal. On the nightside, the FUV imager was nominally steered every orbit to look along the magnetic field line to capture the OI 135.6 nm emission with a horizontal resolution of \(3^{\circ}\) and an altitude resolution of ∼4 km, resulting in 6 × 256 pixels for each image (Mende et al. 2017; Wilkins et al. 2017). The successive images were recorded at a cadence of 12 s, each of which was processed individually to form six limb radiance profiles (referred to as limb scans), constituting the ICON FUV Level 1 (L1) data product (Frey et al. 2023). In the ICON FUV Level 2.5 (L2.5) data product, the nighttime ionospheric parameters were retrieved from the individual limb scans by using the radiative transfer model of Qin et al. (2015) and Tikhonov regularization (Kamalabadi et al. 2018). In this study, we examined all the ICON FUV L1 and L2.5 data to select the observations that have coincident radio measurements for our analysis. Details of the selection process are explained as follows.

The ionosonde is one of the most reliable instruments for measuring the ionospheric parameters above the facility location (Bibl and Reinisch 1978; Huang and Reinisch 2001), and its measurements can be viewed as “ground truth” for the validation of the ionospheric retrievals from the ICON observations. We examined the available ionosonde measurements from the National Oceanic and Atmospheric Administration (NOAA) archive to search for coincident measurements with the ICON FUV observations. Those archived measurements are processed from ionograms using the Automatic Real Time Ionogram Scaler with True Height-5 (ARTIST-5) (Galkin et al. 2008). A total of 14 ionosonde stations, the same as those used by Wautelet et al. (2021), are included in our examination. Two measurements are considered geographically coincident if the distance between the ICON and ionosonde measurements is within 500 km. Note that the geolocation of each FUV observation is defined as the latitude and longitude of the tangent point of the line-of-sight (LOS) that is associated with the peak brightness of the radiance profile. For geographically coincident observations, temporal coincidence further requires that their observation times differ by less than 15 min.

The COSMIC-2 mission was launched into a low-Earth orbit of ∼550 km with an inclination of \(24^{\circ}\) on 25 June 2019 for radio occultation of the Earth’s atmosphere and ionosphere (Schreiner et al. 2020). COSMIC-2 carries a Global Navigation Satellite System (GNSS) receiver to measure the amplitude and phase of the GNSS radio signals that are occulted by Earth’s ionosphere to calculate the total electron content (TEC) along the ray path (Yue et al. 2014). By assuming local symmetry of the electron density with respect to the tangent point of the ray path between COSMIC-2 and the GNSS satellites, the ionospheric electron density profiles can be derived from the TEC measurements. Depending on the occultation geometry, each COSMIC-2 profile covers a spatial extent, called smear, which ranges from ∼100 km to more than 5000 km. This smearing effect is one major source of uncertainty for the COSMIC-2 measurements. In this study, we used only those measurements with smears smaller than 3200 km. The geolocation of each COSMIC-2 electron density profile is defined as the latitude and longitude of the tangent point of the ray path that is associated with the \(h_{\mathrm{m}}F_{2}\) of the profile. The coincidence of the ICON and COSMIC-2 measurements is determined by using the same criteria as those used for the ICON and ionosonde measurements, namely that the distance is within 500 km and the time difference is within 15 min between the two measurements.
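
The coincidence criteria described above (distance within 500 km, time difference within 15 min) can be sketched as follows. This is only an illustration: the great-circle (haversine) distance on a spherical Earth, the function names, and the tuple layout of an observation are assumptions made here, since the text does not state how the distance was computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical Earth is an assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance (km) between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_coincident(obs_a, obs_b, max_km=500.0, max_minutes=15.0):
    """Each observation is a hypothetical tuple (lat_deg, lon_deg, time_minutes)."""
    dist = great_circle_km(obs_a[0], obs_a[1], obs_b[0], obs_b[1])
    dt = abs(obs_a[2] - obs_b[2])
    return dist <= max_km and dt <= max_minutes


# Hypothetical ICON tangent point vs. a hypothetical radio measurement.
icon_scan = (10.0, -60.0, 0.0)
radio_obs = (12.0, -58.0, 9.0)
print(is_coincident(icon_scan, radio_obs))
```

For the FUV scans the location passed in would be the tangent point at peak brightness, and for COSMIC-2 the tangent point at \(h_{\mathrm{m}}F_{2}\), as defined in the text.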

During three years of operation, ICON recorded a total of ∼13.4 million individual limb scans of the nighttime OI 135.6 nm emission in its L1 data product, among which ∼4.0 million limb scans are reported to have no quality issues for the retrieval of the ionospheric parameters in its L2.5 version 05 (v05) data product. In the present analysis we examined only those ∼4.0 million limb scans. During our analysis we found that ∼0.6 million limb scans that were observed at latitudes higher than ∼\(20^{\circ}\)N by the FUV imager, facing north, are contaminated by aurora, manifesting as radiance profiles with tangent altitudes of the peak brightness below ∼220 km. Those ∼0.6 million limb scans are excluded from our analysis. A flag for aurora contamination will be added to the upcoming version 06 data product. Moreover, some of the remaining ∼3.4 million limb scans may be affected by conjugate photoelectrons. In the current data version, a flag has been used to indicate possible contamination by conjugate photoelectrons, which reduces the quality from 1.0 to 0.5 if more than 10% of the conjugate raypath is sunlit. In our analysis, we imposed an additional constraint to ensure that only the limb scans that are not affected by conjugate photoelectrons are included. Specifically, we calculated the conjugate location of each scan by using the International Geomagnetic Reference Field model (Alken et al. 2021). The scans that have conjugate locations with a solar zenith angle (SZA) less than \(110^{\circ}\) (Wautelet et al. 2023) are excluded from our analysis. By examination of the geolocations and times of the remaining ∼2.0 million limb scans, we found that ∼52,000 and ∼296,000 scans have coincident ionosonde and COSMIC-2 measurements, respectively. Note that during our search for coincident observations we included only the ionosonde and COSMIC-2 measurements with no quality issues. In Sect. 4, we will focus on the analysis of these coincident measurements.

3 Models

Accurate determination of the ionospheric parameters from the ICON FUV observations requires accurate modeling of the production and transport of the OI 135.6 nm photons in the nighttime ionosphere. For this purpose we used the first-principles model of Qin et al. (2015), which solves the radiative transfer equation by direct discretization of its integral form in a non-isothermal atmosphere with complete frequency redistribution. This model properly accounts for radiative recombination, mutual neutralization, multiple scattering, and pure absorption by using the rate coefficients and absorption cross sections documented in a series of publications (e.g., Meier 1991; Meier et al. 2015; Melendez-Alvira et al. 1999). A recent application of the model to the analysis of the GOLD disk observations showed that the retrieved \(n_{\mathrm{m}}F_{2}\) values have negligible systematic differences (less than 3%) when compared to external radio measurements (Qin et al. 2023). Moreover, the use of this model to analyze the FUV dayglow observations by the Global Ultraviolet Imager (GUVI) on the Thermosphere Ionosphere Mesosphere Energetics and Dynamics (TIMED) mission also led to excellent agreement (better than ∼5%) between the model and the observations (Qin 2024). Thus, we consider that the model of Qin et al. (2015) can be used with confidence to analyze the ICON FUV observations.

However, accurate determination of the ionospheric parameters from the ICON FUV observations also requires accurate reconstruction of the plasma densities from integrated LOS measurements, which is an ill-conditioned inverse problem (Kamalabadi et al. 2002, 2018). Previous studies have shown that regularization is needed for minimizing the estimation errors in the ICON FUV retrievals introduced by measurement noise (Qin et al. 2016; Kamalabadi et al. 2018). ICON currently uses the second-order Tikhonov regularization (Tikhonov 1963) to enforce global smoothness of the solution, \(\mathbf{\hat {x}}_{\mathrm{Tik}}\):

$$ {\mathbf{\hat{x}}}_{\mathrm{Tik}} = {\mathbf{A^{\dagger}}}{\mathbf{y}} $$
(1)

where \({\mathbf{A^{\dagger}}} = ({\mathbf{A}}^{\mathrm{T}}{\mathbf{A}}+\alpha ^{2}{\mathbf{L}}^{\mathrm{T}}{ \mathbf{L}})^{-1}{\mathbf{A}}^{\mathrm{T}}\), \({\mathbf{A}}\) is the forward model matrix, \({\mathbf{y}}\) is the vector of measurements, \({\mathbf{L}}\) is the second derivative operator acting as a low-pass filter to penalize the high-frequency components, and \(\alpha \) is the regularization parameter (see more details in Kamalabadi et al. 2018). The use of Tikhonov regularization, on the one hand, prevents overfitting and ensures solution stability, and on the other hand, reduces model resolution and introduces a bias. According to Aster et al. (2013), a bound on the norm of the bias can be formulated as:
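
As an illustration of Eq. (1), the following minimal sketch solves the second-order Tikhonov problem with NumPy. The toy forward matrix (an upper-triangular matrix of ones standing in for a limb path-length geometry), the profile shape, and the function names are assumptions for demonstration only; they are not the ICON forward model or the operational code.

```python
import numpy as np


def second_difference(n):
    """(n-2) x n discrete second-derivative operator L."""
    L = np.zeros((n - 2, n))
    for i in range(n - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    return L


def tikhonov_solve(A, y, alpha):
    """Eq. (1): x = (A^T A + alpha^2 L^T L)^{-1} A^T y."""
    L = second_difference(A.shape[1])
    lhs = A.T @ A + alpha**2 * (L.T @ L)
    return np.linalg.solve(lhs, A.T @ y)


# Toy problem (assumed): each measurement sums the profile above one altitude.
n = 20
A = np.triu(np.ones((n, n)))
x_true = np.exp(-0.5 * ((np.arange(n) - 10) / 2.0) ** 2)  # peaked layer
y = A @ x_true  # noiseless measurements

for alpha in (0.0, 1.0, 10.0):
    x_hat = tikhonov_solve(A, y, alpha)
    print(alpha, float(np.max(np.abs(x_hat - x_true))))
```

Even with noiseless data, a non-zero \(\alpha \) changes the retrieved profile whenever the true profile has curvature, which is the smoothing bias discussed in the text.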

$$ ||E[{\mathbf{\hat{x}}}_{\mathrm{Tik}}]-{\mathbf{x}}_{\mathrm{true}}||\le ||{\mathbf{R}}_{\mathrm{m}}-{ \mathbf{I}}||||{\mathbf{x}}_{\mathrm{true}}|| $$
(2)

where \(E[{\mathbf{\hat{x}}}_{\mathrm{Tik}}]={\mathbf{R}}_{\mathrm{m}}{\mathbf{x}}_{\mathrm{true}}\) is the expected value of \({\mathbf{\hat{x}}}_{\mathrm{Tik}}\), \({\mathbf{x}}_{\mathrm{true}}\) is the true solution, \({\mathbf{R}}_{\mathrm{m}}\) is the model resolution matrix, \({\mathbf{I}}\) is the identity matrix, and \(||{\mathbf{R}}_{\mathrm{m}}-{\mathbf{I}}||\) characterizes the bias of the solution. For the second-order Tikhonov regularization, the model resolution matrix is:

$$ {\mathbf{R}}_{\mathrm{m}} = {\mathbf{A^{\dagger}}}{\mathbf{A}} = {\mathbf{X}}^{-{\mathrm{T}}}{\mathbf{F}} { \mathbf{X}}^{\mathrm{T}} $$
(3)

where \({\mathbf{X}}\) is a nonsingular square matrix obtained from generalized singular value decomposition (GSVD) of the matrices \({\mathbf{A}}\) and \({\mathbf{L}}\), \({\mathbf{F}}\) is a diagonal matrix of the GSVD filter factors, \(f_{i} = \gamma _{i}^{2}/(\gamma _{i}^{2}+\alpha ^{2})\), and \(\gamma _{i}\) is the \(i^{\mathrm{th}}\) generalized singular value (Aster et al. 2013). In the case when the solution is not regularized (i.e., \(\alpha =0\) thus \({\mathbf{R}}_{\mathrm{m}}={\mathbf{I}}\)), the solution, \({\mathbf{\hat{x}}}_{\mathrm{Tik}}\), is unbiased. However, in such a case the solution is usually unstable in the presence of measurement noise. With the use of a non-zero regularization parameter to stabilize the solution (i.e., \(\alpha \ne 0\) thus \({\mathbf{R}}_{\mathrm{m}}\ne {\mathbf{I}}\)), the model resolution is reduced and bias is introduced. The bias increases with the use of a larger regularization parameter that leads to a larger difference between \({\mathbf{R}}_{\mathrm{m}}\) and \({\mathbf{I}}\).
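
The bias bound of Eq. (2) can be checked numerically. The sketch below forms \({\mathbf{R}}_{\mathrm{m}} = {\mathbf{A^{\dagger}}}{\mathbf{A}}\) directly from its definition rather than through the GSVD of Eq. (3), which is a simplification made here; the toy forward matrix and profile are assumed for illustration and are not the ICON forward model.

```python
import numpy as np


def second_difference(n):
    """(n-2) x n discrete second-derivative operator L."""
    L = np.zeros((n - 2, n))
    for i in range(n - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    return L


def resolution_matrix(A, alpha):
    """R_m = A_dagger A with A_dagger = (A^T A + alpha^2 L^T L)^{-1} A^T."""
    L = second_difference(A.shape[1])
    A_dagger = np.linalg.solve(A.T @ A + alpha**2 * (L.T @ L), A.T)
    return A_dagger @ A


def bias_bound(A, alpha, x_true):
    """Return (||R_m x - x||, ||R_m - I|| ||x||), the two sides of Eq. (2)."""
    R_m = resolution_matrix(A, alpha)
    identity = np.eye(A.shape[1])
    actual = np.linalg.norm(R_m @ x_true - x_true)
    bound = np.linalg.norm(R_m - identity, 2) * np.linalg.norm(x_true)
    return actual, bound


# Toy forward matrix and peaked profile (assumed).
n = 20
A = np.triu(np.ones((n, n)))
x_true = np.exp(-0.5 * ((np.arange(n) - 10) / 2.0) ** 2)
for alpha in (0.0, 1.0, 5.0, 25.0):
    print(alpha, bias_bound(A, alpha, x_true))
```

The printed pairs allow a direct comparison of the actual bias norm with its upper bound for each \(\alpha \).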

Bayesian inference is an alternative approach to solving the above ill-conditioned inverse problem, which can combine a prior distribution for the ionospheric parameters with the ICON FUV observations through Bayes’ theorem to produce a posterior distribution (i.e., a regularized solution). Assuming a Gaussian prior and Gaussian noise, a desirable solution can be selected from the posterior distribution based on the Maximum A Posteriori (MAP) estimation by solving the following optimization problem (Aster et al. 2013; Menke 2012):

$$ {\mathbf{\hat{x}}}_{\mathrm{MAP}} = \arg\min_{\mathbf{x}}{\left \Arrowvert \begin{bmatrix} {\mathbf{R}}^{-\frac{1}{2}}_{\mathbf{w}}{\mathbf{A}} \\ {\mathbf{R}}^{-\frac{1}{2}}_{\mathbf{x}} \end{bmatrix} {\mathbf{x}}- \begin{bmatrix} {\mathbf{R}}^{-\frac{1}{2}}_{\mathbf{w}}{\mathbf{y}} \\ {\mathbf{R}}^{-\frac{1}{2}}_{\mathbf{x}}{\mathbf{x}}_{\mathrm{prior}} \end{bmatrix} \right \Arrowvert}^{2}_{2} $$
(4)

where \({\mathbf{R_{w}}}\) is the data covariance matrix, \({\mathbf{x}}_{\mathrm{prior}}\) is the mean of the prior distribution, and \({\mathbf{R_{x}}}\) is the covariance matrix for the prior distribution. The MAP solution, \({\mathbf{\hat{x}}}_{\mathrm{MAP}}\), can be viewed as a weighted average between our prior knowledge about the ionospheric parameters and the information contained in the new measurements. Note that for ICON FUV observations the measurement noise is dominated by photon counting noise with a Poisson distribution, which can be well approximated by a Gaussian distribution in the case when the photon counts are large. The sensitivity of the FUV imager over the nominal 12 s exposure is 0.48 counts/Rayleigh/science pixel (Mende et al. 2017). We examined the limb scans of the nighttime 135.6-nm emission and found that 99.6% of them have peak brightness in the range of 20-1500 Rayleigh, corresponding to large photon counts of ∼10-750. For the other 0.4% of the observations with smaller photon counts, an uncertainty can be introduced in the MAP solution by the Gaussian approximation.
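
Equation (4) is an ordinary least-squares problem once the data and prior blocks are whitened and stacked. A minimal NumPy sketch is given below; the function names and the toy inputs are assumptions made for illustration, and the symmetric inverse square root via an eigendecomposition is only one of several valid whitening choices.

```python
import numpy as np


def inv_sqrt_spd(R):
    """Symmetric inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(R)
    return (V / np.sqrt(w)) @ V.T


def map_estimate(A, y, R_w, x_prior, R_x):
    """Solve Eq. (4) by stacking the whitened data and prior blocks."""
    W_w = inv_sqrt_spd(R_w)
    W_x = inv_sqrt_spd(R_x)
    G = np.vstack([W_w @ A, W_x])
    d = np.concatenate([W_w @ y, W_x @ x_prior])
    x_hat, *_ = np.linalg.lstsq(G, d, rcond=None)
    return x_hat


# Toy case (assumed): identity forward model with equal data and prior
# covariances, so the estimate is the midpoint of the data and the prior mean.
A = np.eye(3)
y = np.array([2.0, 4.0, 6.0])
x_prior = np.zeros(3)
print(map_estimate(A, y, np.eye(3), x_prior, np.eye(3)))
```

In this toy case the result illustrates the "weighted average" interpretation: shrinking \({\mathbf{R_{x}}}\) pulls the estimate toward \({\mathbf{x}}_{\mathrm{prior}}\), while shrinking \({\mathbf{R_{w}}}\) pulls it toward the data.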

The flexibility provided by the Bayesian method to incorporate prior knowledge is particularly advantageous in the case when a large amount of prior information is available and little information is contained in the measurements (e.g., due to low signal-to-noise ratios). The prior distribution, specifically \({\mathbf{x}}_{\mathrm{prior}}\) and \({\mathbf{R_{x}}}\), can be constructed based on any reliable prior knowledge of the ionosphere, such as existing radio measurements and climatological models. In Sect. 4, we use the Bayesian method with prior distributions that are constructed based on the NRLMSIS 2.0 model (Emmert et al. 2021) and the IRI 2016 model (Bilitza et al. 2017) to resolve the bias in the ICON FUV L2.5 v05 data product.

4 Results and Discussion

4.1 Investigation of the Systematic Bias in the Ionospheric Retrievals Introduced by Tikhonov Regularization

In this section, we perform an inversion analysis of synthetic observations to demonstrate that Tikhonov regularization can indeed introduce a systematic bias in the ionospheric retrievals. To generate the synthetic observations, we used the NRLMSIS 2.0 model (Emmert et al. 2021) and the IRI 2016 model (Bilitza et al. 2017) to specify the atmospheric and ionospheric conditions at the geolocations and times of those ∼52,000 individual FUV limb scans that have coincident ionosonde measurements (see Sect. 2 for the determination of the coincidence). With these empirical model outputs, we used the radiative transfer model of Qin et al. (2015) to simulate ∼52,000 noiseless radiance profiles. We then included photon counting noise that follows a Poisson distribution to generate ∼52,000 synthetic observations by assuming a 12 s integration time and an instrument sensitivity of 87.3 counts/rescell/s/kR to mimic the ICON FUV observations (Qin et al. 2016). Finally, we retrieved the ionospheric parameters from the synthetic observations based on the second-order Tikhonov regularization. The accuracy and precision of the retrieved \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\) are analyzed in Fig. 1. Figure 1a-b show the distribution of the true \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\) that are specified by the IRI 2016 model and the retrievals based on Tikhonov regularization. Figure 1c-d show the distributions of the errors in the retrieved \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\), with mean (accuracy) and standard deviation (precision) of −0.5 ± 5.8% and 10.3 ± 11.3 km, respectively. These results show that Tikhonov regularization introduces a negligible bias in the retrieved \(n_{\mathrm{m}}F_{2}\) and a systematic bias of ∼10.3 km in the retrieved \(h_{\mathrm{m}}F_{2}\), under the assumption of the ICON FUV instrument sensitivity.
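
The noise step described above can be sketched as follows, assuming that brightness is given in Rayleigh and that the quoted sensitivity of 87.3 counts/rescell/s/kR with a 12 s integration converts it to expected counts. The function name, the `sensitivity_scale` parameter, and the flat test profile are hypothetical and serve only to illustrate the procedure.

```python
import numpy as np

SENSITIVITY = 87.3    # counts / rescell / s / kR, as quoted in the text
INTEGRATION_S = 12.0  # integration time in seconds


def add_photon_noise(brightness_R, rng, sensitivity_scale=1.0):
    """Return a noisy brightness profile (Rayleigh) with Poisson counting noise."""
    counts_per_R = SENSITIVITY * sensitivity_scale * INTEGRATION_S / 1000.0
    expected_counts = np.asarray(brightness_R, dtype=float) * counts_per_R
    noisy_counts = rng.poisson(expected_counts)
    return noisy_counts / counts_per_R


rng = np.random.default_rng(0)
profile = np.full(256, 100.0)  # hypothetical flat 100 R radiance profile
nominal = add_photon_noise(profile, rng)
boosted = add_photon_noise(profile, rng, sensitivity_scale=10.0)
print(nominal.std(), boosted.std())
```

Setting `sensitivity_scale=10.0` mimics the case of 10 times the ICON FUV instrument sensitivity, for which the relative noise is smaller.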

Fig. 1

Examination of the systematic bias introduced by Tikhonov regularized inversion of ∼52,000 synthetic observations. (a-b) Blue: Distributions of the true \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\) that are specified using the IRI 2016 model with a Chapman-like ionosphere. Brown: Distributions of the retrieved \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\). (c-d) Retrieval errors in the \(n_{\mathrm{m}}F_{2}\) (%) and \(h_{\mathrm{m}}F_{2}\) (km) shown in (a-b). The mean and standard deviation of the retrieval errors given in (c-d) are −0.5 ± 5.8% and 10.3 ± 11.3 km, respectively. (e) Three synthetic observations that are generated by assuming different instrument sensitivities under otherwise identical conditions. (f) The optimal regularization parameters, \(\alpha \), that are determined based on the L-curve criterion for the three synthetic observations. (e-f) Black: Noiseless. Brown: Noise included with 10 times the ICON FUV instrument sensitivity. Blue: Noise included with the ICON FUV instrument sensitivity. (g-i) Solid black: The true electron density profile that is used in the radiative transfer model to obtain the three radiance profiles shown in (e). Dashed red: Electron density profiles retrieved from the three modeled radiance profiles based on Tikhonov regularization with the parameters given in (f). Solid blue: Electron density profiles retrieved without regularization (i.e., with \(\alpha = 0\))

As discussed in Sect. 3, the bias introduced by Tikhonov regularization increases with a larger regularization parameter. To demonstrate this effect, we repeated the inversions shown in Fig. 1a-d using the same ∼52,000 noiseless radiance profiles with different noise levels. Figure 1e shows examples for a specific noiseless profile, which include the noiseless profile (black), a profile with noise included by assuming an instrument sensitivity that is 10 times the ICON FUV instrument sensitivity (brown), and a profile with the ICON FUV instrument sensitivity (blue). The optimal regularization parameters that are determined based on the L-curve criterion are shown in Fig. 1f, which are (black) 57.3, (brown) 357.4, and (blue) 1106.2, respectively. We see that as the noise level increases, a larger regularization parameter is needed to stabilize the solution. Inversion of the ∼52,000 noiseless profiles with typical \(\alpha \simeq \) 50 leads to negligible bias in both \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\), with errors of −0.4 ± 0.5% and 1.4 ± 1.1 km, respectively. For the cases of 10 times the ICON FUV instrument sensitivity, the regularization parameters are several hundred and the retrieval errors are 0.2 ± 2.2% and 4.9 ± 7.1 km, respectively. When compared with the retrieval errors discussed earlier for the case of the ICON FUV instrument sensitivity, we find that the bias in \(h_{\mathrm{m}}F_{2}\) introduced by Tikhonov regularization increases with a larger regularization parameter, which is required for stabilizing the solution in the case of a smaller signal-to-noise ratio. As a demonstration of the smoothing effect of Tikhonov regularization, the inversion results of the three radiance profiles shown in Fig. 1e with and without regularization are compared in Fig. 1g-i.

4.2 Development of a Bayesian Framework for Accurate Determination of the Ionospheric Parameters

In this section, we develop a Bayesian framework for accurate determination of the ionospheric parameters from the ICON FUV observations based on the MAP estimation. The key to an accurate MAP estimation by using Eq. (4) is to properly construct the \({\mathbf{x}}_{\mathrm{prior}}\) vector and the \({\mathbf{R_{x}}}\) matrix using prior knowledge of the ionosphere. It should be noted that various sources of prior knowledge could be used by data analysts to construct prior distributions that could differ from each other. Even the same source of prior knowledge could be used in different ways by data analysts to construct prior distributions. The use of different prior distributions for the MAP estimation could lead to different retrieval accuracies. In this section, we construct prior distributions based on the NRLMSIS 2.0 model and the IRI 2016 model and demonstrate that the MAP estimation can retrieve the ionospheric parameters with a negligible bias through analysis of the same ∼52,000 synthetic observations as Fig. 1.

In the inversion scheme of Qin et al. (2015), regularized inversion is performed first to obtain the altitude-dependent source volume emission rates of the OI 135.6 nm emission, which are then converted to ionospheric plasma densities by solving a cubic function (see their Sect. 5). Therefore, the prior distribution needs to be constructed for the source volume emission rate instead of for the plasma density. The source volume emission rates, \(4\pi \varepsilon _{135.6}\), are calculated as (Qin et al. 2023; Yin et al. 2023):

$$ 4\pi \varepsilon _{135.6} = \alpha _{135.6}(T_{\mathrm{e}})n_{\mathrm{e}}n_{ \mathrm{O^{+}}}+\beta _{135.6}k_{1}k_{2} \frac{n_{\mathrm{e}}n_{\mathrm{O}}n_{\mathrm{O^{+}}}}{k_{2}n_{\mathrm{O^{+}}}+k_{3}n_{\mathrm{O}}} $$
(5)

where \(T_{\mathrm{e}}\), \(n_{\mathrm{e}}\), \(n_{\mathrm{O}}\), and \(n_{\mathrm{O^{+}}}\) represent the electron temperature, electron density, and O and O\(^{+}\) densities, respectively. The rate coefficients, \(\alpha _{135.6} = 7.3\times 10^{-13}(1160/T_{\mathrm{e}})^{0.5}\) cm\(^{3}\) s\(^{-1}\), \(\beta _{135.6} = 0.54\), \(k_{1} = 1.3\times 10^{-15}\) cm\(^{3}\) s\(^{-1}\), \(k_{2} = 1.0\times 10^{-7}\) cm\(^{3}\) s\(^{-1}\), and \(k_{3} = 1.4\times 10^{-10}\) cm\(^{3}\) s\(^{-1}\), are taken from Melendez-Alvira et al. (1999). For construction of the priors, the altitude profiles of \(T_{\mathrm{e}}\), \(n_{\mathrm{e}}\), \(n_{\mathrm{O}}\), and \(n_{\mathrm{O^{+}}}\) are specified using the two empirical models. During the inversion, \(4\pi \varepsilon _{135.6}\) is first obtained from the MAP estimation. The cubic function is then solved to obtain \(n_{\mathrm{e}}\) and \(n_{\mathrm{O^{+}}}\) by assuming that \(n_{\mathrm{e}}=n_{\mathrm{O^{+}}}\) and specifying \(T_{\mathrm{e}}\) and \(n_{\mathrm{O}}\) with the empirical models.
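
To make the conversion explicit, setting \(n_{\mathrm{e}}=n_{\mathrm{O^{+}}}=n\) in Eq. (5) and clearing the denominator gives the cubic \(\alpha _{135.6}k_{2}n^{3}+(\alpha _{135.6}k_{3}+\beta _{135.6}k_{1}k_{2})n_{\mathrm{O}}n^{2}-4\pi \varepsilon _{135.6}k_{2}n-4\pi \varepsilon _{135.6}k_{3}n_{\mathrm{O}}=0\), which has a single positive root because its coefficients change sign only once. The sketch below evaluates Eq. (5) and solves this cubic with `numpy.roots`; the function names and the sample input values are assumptions for illustration, and the actual solver used in the inversion scheme of Qin et al. (2015) may differ.

```python
import numpy as np

# Rate coefficients quoted in the text (cm^3 s^-1, except the unitless BETA).
BETA, K1, K2, K3 = 0.54, 1.3e-15, 1.0e-7, 1.4e-10


def alpha_1356(Te):
    """Radiative recombination rate coefficient for electron temperature Te (K)."""
    return 7.3e-13 * (1160.0 / Te) ** 0.5


def emission_rate(ne, n_o, Te, n_oplus=None):
    """Eq. (5): source volume emission rate; densities in cm^-3."""
    if n_oplus is None:
        n_oplus = ne  # assume ne = n(O+), as in the inversion
    recombination = alpha_1356(Te) * ne * n_oplus
    neutralization = BETA * K1 * K2 * ne * n_o * n_oplus / (K2 * n_oplus + K3 * n_o)
    return recombination + neutralization


def electron_density(ver, n_o, Te):
    """Invert Eq. (5) for ne (= n(O+)) by solving the cubic given above."""
    a = alpha_1356(Te)
    coeffs = [a * K2, (a * K3 + BETA * K1 * K2) * n_o, -ver * K2, -ver * K3 * n_o]
    roots = np.roots(coeffs)
    is_real = np.abs(roots.imag) <= 1e-6 * np.abs(roots.real)
    positive = roots[is_real & (roots.real > 0)]
    return float(positive.real[0])


# Hypothetical F-region values: ne = 1e6 cm^-3, n(O) = 5e8 cm^-3, Te = 1000 K.
ver = emission_rate(1.0e6, 5.0e8, 1000.0)
print(ver, electron_density(ver, 5.0e8, 1000.0))
```

The round trip from density to emission rate and back provides a quick consistency check of the cubic coefficients.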

One of the simplest ways to construct a prior distribution is to calculate a large number (e.g., 10,000) of volume emission rate profiles that are associated with random geolocations and times, from which \({\mathbf{x}}_{\mathrm{prior}}\) and \({\mathbf{R_{x}}}\) can be readily obtained. We found that the use of such a simple prior distribution for the MAP estimation leads to a systematic bias of ∼5.7 km in the \(h_{\mathrm{m}}F_{2}\) retrieved from the same ∼52,000 synthetic observations as Fig. 1, which is reduced by ∼4.6 km when compared to the bias of ∼10.3 km introduced by Tikhonov regularization. To further improve the retrieval accuracy, we constructed a series of prior distributions that are associated with different \(h_{\mathrm{m}}F_{2}\). Specifically, we divided the \(h_{\mathrm{m}}F_{2}\) range between ∼200 and 450 km equally into small bins (e.g., 290–300 and 300–310 km). The reason for using 200 km and 450 km as the lower and upper limits of the \(h_{\mathrm{m}}F_{2}\) bins to construct the prior distributions is that the IRI 2016 model generates few cases with \(h_{\mathrm{m}}F_{2}\) outside this range. For each of the \(h_{\mathrm{m}}F_{2}\) bins, we constructed a prior distribution using a large number (hundreds to thousands) of volume emission rate profiles at random geolocations that are associated with the \(h_{\mathrm{m}}F_{2}\) within the bin. The total number of prior distributions in the series depends on the bin size. To retrieve the ionospheric parameters from each individual observation, we used one of the prior distributions for the MAP estimation, which is selected from the series based on the peak height, \(h_{\mathrm{max}}\), of the radiance profile (see later discussion for the selection method). During our investigation, we found that the retrieval accuracy increases with a finer division of the \(h_{\mathrm{m}}F_{2}\) bins. We achieved a negligible bias in the retrievals by using a bin size of 1 km (e.g., 299.5–300.5 and 300.5–301.5 km). The results are shown in Fig. 2.
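
The construction of a series of priors binned by \(h_{\mathrm{m}}F_{2}\) can be sketched as below. The ensemble here is a set of made-up Gaussian-shaped profiles rather than output of the empirical models, and the function name, the dictionary layout, and the `min_members` threshold are assumptions introduced for illustration.

```python
import numpy as np


def build_binned_priors(profiles, hmf2, bin_km=1.0, h_lo=200.0, h_hi=450.0,
                        min_members=2):
    """Return {bin_center_km: (x_prior, R_x)} from an ensemble of profiles.

    profiles: (n_profiles, n_altitudes) volume emission rate profiles
    hmf2:     (n_profiles,) hmF2 in km associated with each profile
    """
    profiles = np.asarray(profiles, dtype=float)
    hmf2 = np.asarray(hmf2, dtype=float)
    priors = {}
    for center in np.arange(h_lo, h_hi + bin_km / 2.0, bin_km):
        in_bin = (hmf2 >= center - bin_km / 2.0) & (hmf2 < center + bin_km / 2.0)
        if in_bin.sum() < min_members:
            continue  # too few profiles to estimate a covariance
        members = profiles[in_bin]
        x_prior = members.mean(axis=0)       # prior mean
        R_x = np.cov(members, rowvar=False)  # prior covariance
        priors[float(center)] = (x_prior, R_x)
    return priors


# Made-up ensemble: Gaussian layers peaking near 250 km plus small scatter.
rng = np.random.default_rng(0)
z = np.linspace(150.0, 450.0, 16)
hmf2 = rng.uniform(248.0, 252.0, 4000)
profiles = np.exp(-0.5 * ((z[None, :] - hmf2[:, None]) / 40.0) ** 2)
profiles += 0.01 * rng.normal(size=profiles.shape)
priors = build_binned_priors(profiles, hmf2)
print(sorted(priors))
```

With narrow bins and few members the sample covariance can be poorly conditioned, so in practice each bin needs enough profiles for \({\mathbf{R_{x}}}\) to be invertible, consistent with the hundreds to thousands of profiles per bin mentioned in the text.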

Fig. 2

Examination of the retrieval accuracy of the MAP estimation by inverting the same ∼52,000 synthetic observations as Fig. 1. (a) The covariance matrix, \({\mathbf{R_{x}}}\), for a prior distribution that is constructed using the empirical models with no constraints on \(h_{\mathrm{m}}F_{2}\). (b-c) Same as (a), except for \(h_{\mathrm{m}}F_{2}=\) 249.5–250.5 km and 399.5–400.5 km, respectively. (d) Solid black: The \({\mathbf{x}}_{\mathrm{prior}}\) vector of the prior distribution associated with the \({\mathbf{R_{x}}}\) matrix shown in (a). Dashed red and dotted blue: \({\mathbf{x}}_{\mathrm{prior}}\) associated with \({\mathbf{R_{x}}}\) shown in (b, c), respectively. (e) Distribution of the differences between the ∼52,000 peak heights, \(h_{\mathrm{max}}\), of the modeled noiseless radiance profiles and the true \(h_{\mathrm{m}}F_{2}\), which have a mean and standard deviation of 29.8 ± 7.3 km. (f) Distributions of (blue) the true \(n_{\mathrm{m}}F_{2}\) and (brown) the \(n_{\mathrm{m}}F_{2}\) retrieved from the MAP estimation. (g) The true and the retrieved \(h_{\mathrm{m}}F_{2}\). (h-i) The retrieval errors in the \(n_{\mathrm{m}}F_{2}\) and \(h_{\mathrm{m}}F_{2}\), with mean and standard deviation of −0.4 ± 4.3% and −0.6 ± 9.1 km, respectively

Figure 2a shows the \({\mathbf{R_{x}}}\) matrix that is constructed without any constraints on the \(h_{\mathrm{m}}F_{2}\) (i.e., the volume emission rate profiles associated with all possible \(h_{\mathrm{m}}F_{2}\) are used). Figures 2b-c show two examples of the \({\mathbf{R_{x}}}\) matrices that are constructed with \(h_{\mathrm{m}}F_{2}\) constrained within 249.5–250.5 and 399.5–400.5 km, respectively. Figure 2d shows the \({\mathbf{x}}_{\mathrm{prior}}\) vectors corresponding to the \({\mathbf{R_{x}}}\) matrices shown in Fig. 2a-c. As mentioned earlier, the use of the simple prior distribution (i.e., the \({\mathbf{R_{x}}}\) matrix shown in Fig. 2a and the \({\mathbf{x}}_{\mathrm{prior}}\) vector shown by the solid black line in Fig. 2d) leads to a systematic bias in the retrieved \(h_{\mathrm{m}}F_{2}\), whose accuracy and precision are ∼5.7 ± 11.1 km. To use a series of prior distributions for the MAP estimation, we need a selection method to determine the proper prior distribution for each individual observation. For this purpose, we used synthetic observations to find an empirical relationship between the altitude of the peak brightness, \(h_{\mathrm{max}}\), and the altitude of the peak electron density, \(h_{\mathrm{m}}F_{2}\), which are closely related to each other. Specifically, we calculated the differences between the \(h_{\mathrm{max}}\) of the ∼52,000 noiseless radiance profiles and the corresponding true \(h_{\mathrm{m}}F_{2}\), whose distribution is shown in Fig. 2e. The mean and standard deviation of the differences are 29.8 ± 7.3 km. For the inversion of each synthetic observation, we first find the \(h_{\mathrm{max}}\) of its radiance profile and then use the prior distribution corresponding to the \(h_{\mathrm{m}}F_{2} = h_{\mathrm{max}}+30\) km to perform the MAP estimation. The inversion results are shown in Fig. 2f-i. The retrieval accuracy and precision are −0.4 ± 4.3% for \(n_{\mathrm{m}}F_{2}\) and −0.6 ± 9.1 km for \(h_{\mathrm{m}}F_{2}\). The bias in the retrievals is negligible.
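
The selection rule \(h_{\mathrm{m}}F_{2} = h_{\mathrm{max}}+30\) km can be written as a short routine. This sketch assumes that the prior series is indexed by bin-center altitudes and that the nearest center is chosen; the function name and the sample radiance values are hypothetical.

```python
def select_prior_center(bin_centers_km, radiance, tangent_alt_km, offset_km=30.0):
    """Pick the hmF2 bin center for an observed limb radiance profile.

    Returns (selected_bin_center_km, h_max_km).
    """
    # Altitude of peak brightness, h_max.
    i_peak = max(range(len(radiance)), key=lambda i: radiance[i])
    h_max = tangent_alt_km[i_peak]
    # Empirical relation from the synthetic observations: hmF2 = h_max + 30 km.
    target = h_max + offset_km
    center = min(bin_centers_km, key=lambda c: abs(c - target))
    return center, h_max


# Hypothetical profile sampled at six tangent altitudes.
tangent_alt_km = [200.0, 220.0, 240.0, 260.0, 280.0, 300.0]
radiance = [10.0, 30.0, 55.0, 80.0, 60.0, 20.0]
centers = [float(c) for c in range(200, 451)]  # 1-km bins from 200 to 450 km
print(select_prior_center(centers, radiance, tangent_alt_km))
```

For real, noisy profiles the location of the brightest sample can jitter, so some smoothing or fitting before locating \(h_{\mathrm{max}}\) may be advisable; the text does not specify whether such a step was applied.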

In the above analysis, the prior distributions are constructed using a large number of volume emission rate profiles that are associated with random geolocations (i.e., not associated with the geolocations of the ∼52,000 synthetic observations). The negligible bias in the retrievals indicates good performance of the MAP estimation using such prior distributions. To further demonstrate the performance of our method, we conducted \(k\)-fold cross validation with \(k = 10\) using the ∼52,000 synthetic observations. Specifically, we split the synthetic data into 10 groups, with each containing ∼5200 observations. For each group, we used the other 9 groups as a training data set to construct a series of prior distributions that are associated with different \(h_{\mathrm{m}}F_{2}\) bins of 1-km size. The prior distributions are then used to retrieve the plasma densities from the observations in that group. The retrieval accuracy and precision of the 10 groups are shown in Fig. 3. The fluctuations are less than ∼1% for \(n_{\mathrm{m}}F_{2}\) and ∼4 km for \(h_{\mathrm{m}}F_{2}\). The averages of the accuracy and precision are −0.5 ± 3.8% for \(n_{\mathrm{m}}F_{2}\) and −0.2 ± 8.6 km for \(h_{\mathrm{m}}F_{2}\), which are close to the ones shown in Fig. 2h-i, indicating no overfitting and good performance of our method.
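
The splitting step of the 10-fold cross validation can be sketched as follows. How the groups were actually formed is not stated in the text; the random shuffle, the fixed seed, and the function name used here are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal groups
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# With ~52,000 synthetic observations each held-out group has ~5200 members.
for train, test in k_fold_indices(52000, k=10):
    # Hypothetical workflow: build the binned priors from `train`, then run
    # the MAP retrieval on `test` and record the accuracy and precision.
    pass
print(len(train), len(test))
```

Each observation appears in exactly one held-out group, so the priors used for its retrieval never include its own profile.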

Fig. 3

Retrieval accuracy and precision for each of the 10 groups in the 10-fold cross validation. (a) \(n_{\mathrm{m}}F_{2}\). (b) \(h_{\mathrm{m}}F_{2}\). The averages of the accuracy and precision are −0.5 ± 3.8% and −0.2 ± 8.6 km, respectively

4.3 Implementation of the MAP Estimation to the ICON FUV Observations and Verification of the Ionospheric Retrievals

In this section, we apply the MAP estimation to retrieve the ionospheric parameters from the ICON FUV observations and verify the retrievals by comparison with coincident radio measurements. In Fig. 4, we analyze the retrievals from the ∼52,000 limb scans that have coincident ionosonde measurements. Figures 4a-b compare the distributions of the \(n_{\mathrm{m}}F_{2}\) measured by ionosondes and those retrieved based on Tikhonov regularized inversion and MAP estimation. Figure 4c shows the percentage differences between the retrieved \(n_{\mathrm{m}}F_{2}\) and the coincident ionosonde measurements. The mean and standard deviation of the differences are −2.3 ± 25.3% and 0.1 ± 24.6% for the Tikhonov regularized inversion and the MAP estimation, respectively. Both methods lead to a negligible or small bias in the retrieved \(n_{\mathrm{m}}F_{2}\). Figures 4d-e compare the distributions of the \(h_{\mathrm{m}}F_{2}\) measured by ionosondes and those retrieved from the ICON FUV observations. In Fig. 4f, the mean and standard deviation of the differences are 11.6 ± 22.1 km and 0.6 ± 23.7 km for the Tikhonov regularized inversion and the MAP estimation, respectively. These results verify that the MAP estimation can indeed significantly reduce the bias in the retrieved \(h_{\mathrm{m}}F_{2}\) when compared to Tikhonov regularized inversion. Note that while the mean differences shown in Fig. 4c, f may be viewed as the retrieval accuracies, the standard deviations should not be interpreted as the precisions of the retrievals because at least part of the deviations is due to imperfect data matching between the ICON and the ionosonde measurements. Also note that in the comparisons shown in Fig. 4 we excluded the ionospheric retrievals associated with \(h_{\mathrm{m}}F_{2}\) below 200 km or above 450 km, which are the lower and upper limits that we used for the construction of our prior distributions (see Sect. 4.2). These extreme cases, which are not expected by the IRI 2016 model, account for only ∼0.6% of the ∼52,000 observations and in practice should be inverted using Tikhonov regularization.
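The statistics quoted above (mean ± standard deviation of the differences, after excluding out-of-range cases) can be computed as in the following sketch. Two choices here are assumptions rather than statements from the text: the percentage difference is normalized by the reference (ionosonde) value, and the 200–450 km exclusion is applied to the retrieved \(h_{\mathrm{m}}F_{2}\). The function name `retrieval_bias` is hypothetical.

```python
import numpy as np


def retrieval_bias(nmf2_ret, nmf2_ref, hmf2_ret, hmf2_ref, lo_km=200.0, hi_km=450.0):
    """Return ((mean, std) of nmF2 percent differences, (mean, std) of hmF2 differences in km).

    Retrievals with hmF2 outside [lo_km, hi_km] are excluded before averaging.
    The standard deviation is the population value (NumPy default, ddof=0).
    """
    keep = (hmf2_ret >= lo_km) & (hmf2_ret <= hi_km)
    pct = 100.0 * (nmf2_ret[keep] - nmf2_ref[keep]) / nmf2_ref[keep]
    dh = hmf2_ret[keep] - hmf2_ref[keep]
    return (pct.mean(), pct.std()), (dh.mean(), dh.std())
```

As noted in the text, only the mean should be read as an accuracy; the standard deviation also contains the scatter caused by imperfect spatial and temporal matching of the two data sets.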

Fig. 4

Verification of the MAP solutions to the ∼52,000 ICON FUV limb scans by comparison with coincident ionosonde measurements. (a-b) Blue: The ionosonde \(n_{\mathrm{m}}F_{2}\). Brown: The \(n_{\mathrm{m}}F_{2}\) retrieved by Tikhonov regularization, taken directly from the ICON FUV L2.5 v05 data product. Cyan: The \(n_{\mathrm{m}}F_{2}\) retrieved by the MAP estimation. (c) The percentage differences between the retrieved \(n_{\mathrm{m}}F_{2}\) and the coincident ionosonde measurements. Brown: Tikhonov. Cyan: MAP. The mean and standard deviation of the differences are −2.3 ± 25.3% and 0.1 ± 24.6% for the Tikhonov regularized inversion and the MAP estimation, respectively. (d-f) Same as (a-c), except for \(h_{\mathrm{m}}F_{2}\). (f) Tikhonov: 11.6 ± 22.1 km. MAP: 0.6 ± 23.7 km

Figure 5 compares the ionospheric retrievals from the ∼296,000 ICON observations that have coincident COSMIC-2 measurements in the same manner as Fig. 4. The retrievals associated with \(h_{\mathrm{m}}F_{2}\) below 200 km or above 450 km, which account for ∼1.1% of all cases, are also excluded. Figures 5a-c show the distributions of the \(n_{\mathrm{m}}F_{2}\) retrieved from the ICON observations and their percentage differences with the coincident COSMIC-2 measurements. The mean and standard deviation are 7.6 ± 22.0% for the Tikhonov regularized inversion and 8.3 ± 21.7% for the MAP estimation. It appears that the ICON \(n_{\mathrm{m}}F_{2}\) retrievals from both inversion methods are systematically ∼8% higher than the COSMIC-2 measurements. This bias suggests that COSMIC-2 may underestimate the \(n_{\mathrm{m}}F_{2}\) systematically by ∼8% (e.g., due to the smearing effect), since the ICON \(n_{\mathrm{m}}F_{2}\) retrievals agree well with the ionosonde measurements, which can be viewed as “ground truth” (see Fig. 4). Indeed, we compared the COSMIC-2 and the ionosonde measurements obtained during 2019–2022. By using the same data matching criteria as those described in Sect. 2, we found ∼19,000 coincident COSMIC-2 and ionosonde measurements with differences of 8.8 ± 24.6% in their \(n_{\mathrm{m}}F_{2}\), which agrees with the accuracy assessment of the COSMIC-2 measurements by Cherniak et al. (2021). We see that the comparisons among the three data sets are consistent, which indicates that the MAP estimation leads to a negligible bias in the \(n_{\mathrm{m}}F_{2}\) retrievals.

Fig. 5

Same as Fig. 4, except for the use of the ionospheric retrievals from the ∼296,000 ICON observations that have coincident COSMIC-2 measurements. (c) Tikhonov: 7.6 ± 22.0%. MAP: 8.3 ± 21.7%. (f) Tikhonov: 7.8 ± 21.1 km. MAP: −1.4 ± 22.5 km

Figures 5d-f show the distributions of the \(h_{\mathrm{m}}F_{2}\) and the differences between the ICON retrievals and the COSMIC-2 measurements. The mean and standard deviation of the differences are 7.8 ± 21.1 km for the Tikhonov regularized inversion and −1.4 ± 22.5 km for the MAP estimation. Similar to the results shown in Fig. 4d-f, the MAP estimation can significantly reduce the bias in the retrieved \(h_{\mathrm{m}}F_{2}\). Another implication of the results shown in Fig. 5f is that COSMIC-2 seems to overestimate the \(h_{\mathrm{m}}F_{2}\) systematically by ∼2 km, since the differences between the ICON retrievals and the ionosonde measurements are 0.6 ± 23.7 km for the MAP estimation (see Fig. 4f). Indeed, from the same comparison of the COSMIC-2 and the ionosonde measurements discussed earlier, we found differences of 2.7 ± 20.3 km in their \(h_{\mathrm{m}}F_{2}\), which also agrees with the assessment by Cherniak et al. (2021). Again, the comparisons among the three data sets are consistent, indicating that the MAP estimation leads to a negligible bias in the \(h_{\mathrm{m}}F_{2}\) retrievals.
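The three-way consistency argument can be written out as simple arithmetic on the mean differences quoted in the text. This is a first-order check only: it assumes that mean differences computed over different sets of coincidences can be combined directly and, for \(n_{\mathrm{m}}F_{2}\), that percentage differences with slightly different denominators are approximately additive. The shorthand \(\Delta_{X-Y}\) for the mean difference between data sets \(X\) and \(Y\) (with C2 denoting COSMIC-2) is introduced here for illustration.

\[
\Delta h_{\mathrm{C2-iono}} \approx \Delta h_{\mathrm{ICON-iono}} - \Delta h_{\mathrm{ICON-C2}} = 0.6 - (-1.4) = 2.0\ \mathrm{km}
\]

\[
\Delta n_{\mathrm{C2-iono}} \approx \Delta n_{\mathrm{ICON-iono}} - \Delta n_{\mathrm{ICON-C2}} = 0.1\% - 8.3\% = -8.2\%
\]

The inferred values, 2.0 km and ∼8% in magnitude, are close to the 2.7 km and 8.8% found in the direct comparison of the COSMIC-2 and ionosonde measurements.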

As shown in Sect. 4.1, the bias in \(h_{\mathrm{m}}F_{2}\) introduced by Tikhonov regularization increases as the signal-to-noise ratio decreases, because a lower signal-to-noise ratio requires a larger regularization parameter to stabilize the solution. As a result, it is not possible to retrieve the \(h_{\mathrm{m}}F_{2}\) accurately by using a constant offset to correct the Tikhonov regularized solutions. To further demonstrate this, Fig. 6 presents scatter plots of the \(h_{\mathrm{m}}F_{2}\) retrieved from the ICON FUV observations versus the ionosonde and the COSMIC-2 measurements. As shown in Fig. 6a, d, the systematic bias tends to be larger for a lower \(h_{\mathrm{m}}F_{2}\). Indeed, in Fig. 6a the bias introduced by Tikhonov regularization is 19.1 and 7.0 km for \(h_{\mathrm{m}}F_{2}\) that are lower and higher than 300 km, respectively. In Fig. 6d, the bias is 17.9 and 2.1 km for \(h_{\mathrm{m}}F_{2}\) that are lower and higher than 300 km, respectively. In comparison, the MAP estimations shown in Fig. 6b, e exhibit no clear differences in the retrieval accuracy over the entire range of \(h_{\mathrm{m}}F_{2}\). Figures 6c, f compare the Tikhonov regularized solutions and the MAP estimations. The two linear fits (i.e., the cyan dashed lines) have slopes of 1.05 and 1.07, respectively, indicating that a simple correction by subtracting a constant offset from the Tikhonov regularized solutions cannot match the MAP estimations. We thus conclude that the MAP estimation is better suited for the inversion of the ICON FUV observations when the \(h_{\mathrm{m}}F_{2}\) lies within ∼200-450 km, i.e., for the typical ionosphere that is well represented by the IRI 2016 model.
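Why a fitted slope different from 1 rules out a constant-offset correction can be seen with a short numerical sketch. The data below are synthetic and purely illustrative: only the slope of 1.05 is taken from the text, while the intercept, the sample altitudes, and the function name `offset_correction_residual` are assumptions. The sketch is agnostic about which retrieval is plotted on which axis.

```python
import numpy as np


def offset_correction_residual(x, y):
    """Fit y = slope * x + intercept, subtract the best constant offset, return what remains.

    The best constant offset is the mean of (y - x); the residual is (y - offset) - x.
    """
    slope, intercept = np.polyfit(x, y, 1)
    offset = np.mean(y - x)
    residual = (y - offset) - x
    return slope, offset, residual


# Illustrative noiseless data: slope 1.05 (from the text), arbitrary intercept of -5 km.
x = np.linspace(200.0, 450.0, 6)
y = 1.05 * x - 5.0
slope, offset, residual = offset_correction_residual(x, y)
```

After the constant offset is removed, the residual still varies linearly with altitude as \((\text{slope} - 1)(x - \bar{x})\), so in this example it runs from −6.25 km at 200 km to +6.25 km at 450 km instead of vanishing.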

Fig. 6

(a) Scatter plot of the \(h_{\mathrm{m}}F_{2}\) retrieved from the ICON FUV observations based on Tikhonov regularization versus coincident ionosonde measurements. (b) The ICON FUV retrievals based on MAP estimation versus coincident ionosonde measurements. (c) The ICON FUV retrievals based on Tikhonov regularization versus those based on MAP estimation. (d-f) Same as (a-c), except that the ICON retrievals are compared with coincident COSMIC-2 measurements. The cyan dashed lines in (c) and (f), which are linear fits to the retrievals, have slopes of 1.05 and 1.07, respectively. The slopes of the solid black lines are all 1.0

In the above analysis, we did not consider the impact of geomagnetic storms because the ICON observations were made mostly during quiet times. Indeed, we find that only ∼900 out of the ∼52,000 (1.7%) ICON limb scans that have coincident ionosonde measurements and ∼3800 out of the ∼296,000 (1.3%) ICON limb scans that have coincident COSMIC-2 measurements were acquired during moderately disturbed times with Ap index in the range of ∼40-60. By comparing those ∼900 and ∼3800 coincident measurements, we find differences of 6.0 ± 18.4% in \(n_{\mathrm{m}}F_{2}\) and −1.4 ± 27.0 km in \(h_{\mathrm{m}}F_{2}\) between the ICON MAP retrievals and the ionosonde measurements, and differences of 9.3 ± 20.3% in \(n_{\mathrm{m}}F_{2}\) and −6.1 ± 21.1 km in \(h_{\mathrm{m}}F_{2}\) between the ICON MAP retrievals and the COSMIC-2 measurements. These differences are slightly larger than those shown in Figs. 4-5, indicating that the ICON retrievals may be less accurate for disturbed times. However, a definite conclusion cannot be drawn because of the small sample size of the disturbed observations.

5 Conclusions

We developed a Bayesian framework, based on the MAP estimation, for accurate determination of the nighttime ionospheric parameters from the OI 135.6 nm emission observed by the FUV imager on the ICON mission, in order to address the systematic bias that currently exists in the ICON L2.5 v05 data product. We first demonstrated through analysis of synthetic observations that the bias is due to the use of Tikhonov regularization for the ICON FUV L1 data inversion, which penalizes high-frequency components of the solution more significantly than lower-frequency components. To address the bias, we constructed a series of prior distributions for the MAP estimation by dividing the ionospheric \(h_{\mathrm{m}}F_{2}\) equally into small bins of 1-km size. The retrieval accuracy of the MAP estimation is demonstrated through analysis of synthetic observations, which shows a negligible bias in the ionospheric retrievals (∼1% in the retrieved \(n_{\mathrm{m}}F_{2}\) and ∼1 km in the retrieved \(h_{\mathrm{m}}F_{2}\)). The MAP estimation is then applied to the analysis of the ICON FUV observations. Comparisons of the ICON retrievals with coincident ionosonde and COSMIC-2 measurements verify that the MAP estimation can indeed achieve the aforementioned high retrieval accuracies. Our results provide an improved data set for the study of the long-term and global-scale variations of the nighttime ionosphere and a novel method for the analysis of space-based FUV remote sensing observations.