1 Introduction

The log-normal distribution is widely used in biomedical research to analyze positively skewed data. In bioequivalence trials, the relative potency of a new drug to that of a standard one is expressed in terms of the ratio of means. It is important to construct a confidence interval for this ratio or to test the null hypothesis that the ratio is one (Berger and Hsu 1996; Chow and Liu 2000). For positively skewed data, a log transformation is often applied in order to normalize the distribution. In general, a confidence interval for the ratio of means is obtained by constructing a confidence interval for the difference of the means of the log-transformed variables and then back-transforming the resulting interval. Similarly, p values for the null hypothesis based on the original outcomes are calculated by testing procedures for the difference of the means of the log-transformed variables. However, Zhou et al. (1997) showed that the null hypothesis based on the log-transformed outcomes is not equivalent to the one based on the original outcomes when the variances of the log-transformed outcome variables are unequal.

Zhou et al. (1997) proposed a Z-score test and a nonparametric bootstrap approach for the ratio of two independent log-normal means. The Z-score test does not perform well in small-sample settings. Therefore, Wu et al. (2002) considered two methods based on the signed log-likelihood ratio statistic and the modified signed log-likelihood ratio statistic; the latter provides essentially exact coverage probabilities and test results for all designs considered. Krishnamoorthy and Mathew (2003) used the ideas of generalized p values and generalized confidence intervals and showed that the distribution of the Z-score statistic is valid only when both samples are large and \((n_1,\mu _1,\sigma _1^2)\) is approximately equal to \((n_2,\mu _2,\sigma _2^2)\); otherwise, the distribution of the Z-score statistic appears to be highly skewed. In addition, the Z-score test is either too conservative or too liberal when the sample sizes are unequal, whereas the test based on the generalized p value is applicable regardless of the sample size.

It is well known that Bayesian inference assigns probabilities to both hypotheses and data, while frequentist inference uses only conditional distributions of data given specific hypotheses. One difficulty of Bayesian analysis arises from eliciting the prior, especially when the model has multiple parameters. A popular approach under the objective Bayesian framework is to use reference priors (Bernardo 1979; Berger and Bernardo 1989, 1992). Alternatives include matching priors, which match the posterior probabilities of one-sided credible intervals with their frequentist coverage probabilities up to a certain order (Tibshirani 1989; Mukerjee and Ghosh 1997; Datta and Mukerjee 2004). Note that reference priors often meet first order probability matching criteria. However, matching priors for the ratio of two log-normal means are not easy to obtain, and reference priors are usually not matching priors. Recently, Kim et al. (2017) proposed a matching prior for the common mean of multiple log-normal distributions based on the modified profile likelihood.

Let X be a random variable from a log-normal population with parameters \(\mu _1\) and \(\sigma _1^2\) and Y be a random variable from a log-normal population with parameters \(\mu _2\) and \(\sigma _2^2\). That is, \(\log X\) and \(\log Y\) are independently and normally distributed with means \(\mu _1\) and \(\mu _2\) and variances \(\sigma _1^2\) and \(\sigma _2^2\), respectively. Then, the ratio of the two log-normal means is

$$\begin{aligned} {E(X)\over E(Y)} ={{\exp \left( \mu _1+{1\over 2}\sigma _1^2\right) } \over {\exp \left( \mu _2+{1\over 2}\sigma _2^2\right) }}=\exp (\theta ), \end{aligned}$$

where \(\theta =\mu _1-\mu _2+{1\over 2}(\sigma _1^2-\sigma _2^2)\) is the parameter of interest. Note that inference for the ratio of the two means is equivalent to inference for \(\theta \), so the matching priors below are developed in terms of \(\theta \).
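This identity can be checked numerically. The following sketch (with arbitrary illustrative parameter values, not taken from any data set in this paper) compares a Monte Carlo estimate of \(E(X)/E(Y)\) with \(\exp (\theta )\):

```python
import math
import random

# Arbitrary illustrative parameter values (not from the paper).
mu1, sigma1 = 0.3, 0.4
mu2, sigma2 = 0.1, 0.5
theta = mu1 - mu2 + 0.5 * (sigma1 ** 2 - sigma2 ** 2)

rng = random.Random(42)
N = 200_000

# Monte Carlo estimates of E(X) and E(Y) for log-normal X and Y.
mean_x = sum(rng.lognormvariate(mu1, sigma1) for _ in range(N)) / N
mean_y = sum(rng.lognormvariate(mu2, sigma2) for _ in range(N)) / N

print(mean_x / mean_y)   # close to exp(theta)
print(math.exp(theta))
```

With a sample of this size the two printed values agree to roughly two decimal places.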

Furthermore, the marginal posterior of \(\theta \) requires integration over the nuisance parameters. Numerical integration and approximation techniques could be used, but they are often difficult to apply in general. See, for example, Ventura and Racugno (2011) for a discussion of the Laplace approximation of the marginal posterior distribution and the corresponding tail area probabilities.

For models with nuisance parameters, including the ratio of two log-normal means problem, Ventura et al. (2009) proposed the use of an appropriate pseudo-likelihood, such as one of the various modifications of the profile likelihood, to eliminate the nuisance parameter. A matching prior on \(\theta \) is then combined with the pseudo-likelihood to obtain the posterior of \(\theta \), which avoids both the elicitation of a prior for the entire parameter and integration over the nuisance parameter. Moreover, second order matching priors are also presented. Recently, Min and Sun (2013) derived a matching prior based on the modified profile likelihood in a generalized Weibull stress-strength model following the method of Ventura et al. (2009).

The remainder of this paper is divided into four sections. In the following section, the method proposed by Ventura et al. (2009) for handling models with nuisance parameters is briefly introduced, and the matching priors for the ratio of two log-normal means are derived. Section 3 is devoted to showing the propriety of the posterior distribution based on the derived matching prior. The proposed method is illustrated by simulation studies under several configurations and by two real examples in Sect. 4. Simulated frequentist coverage probabilities under the proposed priors are presented. To compare the confidence intervals for the ratio of the log-normal means, we compute the confidence intervals based on the Z-score statistic and the modified signed log-likelihood ratio statistic, as well as Bayesian credible intervals. Finally, Sect. 5 closes with a conclusion.

2 Modified profile likelihood and matching prior for the ratio of two log-normal means

Several methods have been proposed to deal with a nuisance parameter in a model. The simplest is to replace the nuisance parameter in the likelihood by its maximum likelihood estimator (MLE) or by the restricted MLE; the latter is usually referred to as the profile likelihood. Alternatively, various pseudo-likelihoods, which are modifications of the profile likelihood, can be used. Finally, Bayesian procedures, as mentioned in the introduction, can be applied to general models with nuisance parameters. Ventura et al. (2009) introduced a method that handles models with nuisance parameters through a combination of frequentist and Bayesian approaches (see also Min and Sun 2013). A pseudo-likelihood is used in place of the full likelihood, as in the frequentist approach, and a prior is then applied to the pseudo-likelihood, as in the Bayesian approach, to achieve the matching property. Here we consider only two modified profile likelihoods, namely \(L_{mp}(\theta )\) of Barndorff-Nielsen (1983) and \(\bar{L}_{M}(\theta )\) of Severini (1998).

Let \(\mathbf{X}=(X_{1},\ldots ,X_{n})\) be a random sample of size n from a log-normal population with parameters \(\mu _1\) and \(\sigma _1^2\) and \(\mathbf{Y}=(Y_{1},\ldots ,Y_{m})\) be a random sample of size m from a log-normal population with parameters \(\mu _2\) and \(\sigma _2^2\). The parameter \(\theta =\mu _1-\mu _2+{1\over 2}(\sigma _1^2-\sigma _2^2)\) is of interest. In addition, the nuisance parameter is \(\lambda =(\mu _2,\sigma _1,\sigma _2)\). Then, the likelihood under the new parameterization is

$$\begin{aligned} L(\theta ,\lambda )= & {} (2\pi )^{-{n+m\over 2}} \left( \prod _{i=1}^n x_i^{-1}\prod _{i=1}^m y_i^{-1}\right) \sigma _1^{-n} \sigma _2^{-m}\nonumber \\&\times \exp \left\{ - \sum _{i=1}^{n} {\left( \log x_{i}-\theta -\mu _2+{\sigma _1^2-\sigma _2^2\over 2}\right) ^2\over 2\sigma _1^2} - {\sum _{i=1}^{m} (\log y_{i}-\mu _2)^2 \over 2\sigma _2^2}\right\} .\nonumber \\ \end{aligned}$$
(1)

The modified profile likelihood \(L_{mp}(\theta )\) is defined as

$$\begin{aligned} L_{mp}(\theta )=L_p(\theta ){|j_{\lambda \lambda } (\theta ,\hat{\lambda }_{\theta })|^{1\over 2} \over |l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })|}, \end{aligned}$$
(2)

where \(L_p(\theta )=L(\theta ,\hat{\lambda }_{\theta })\) is the profile likelihood of \(\theta \), \(j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta }) =-\partial ^2l(\theta ,\lambda )/\partial \lambda \partial \lambda ^T\) is the observed Fisher information for \(\lambda \), and \(l_{\lambda ;\hat{\lambda }}(\theta ,\lambda )\) is the matrix of sample space derivatives,

$$\begin{aligned} l_{\lambda ;\hat{\lambda }}(\theta ,\lambda )= {\partial ^2l(\theta ,\lambda \vert \hat{\theta },\hat{\lambda },a)\over \partial \lambda \partial \hat{\lambda }}. \end{aligned}$$

We now derive the matching prior corresponding to the modified profile likelihood \(L_{mp}(\theta )\). From the likelihood (1), we obtain the Fisher information matrix as follows.

$$\begin{aligned} I(\theta ,\lambda )= & {} \begin{pmatrix} i_{\theta \theta }(\theta ,\lambda )&{} i_{\theta \lambda }(\theta ,\lambda )\\ i_{\lambda \theta }(\theta ,\lambda )&{} i_{\lambda \lambda }(\theta ,\lambda ) \end{pmatrix}\nonumber \\= & {} \left( \begin{array}{cccc} {n\over \sigma _1^2} &{} {n\over \sigma _1^2} &{} -{n\over \sigma _1} &{} {n\sigma _2\over \sigma _1^2} \\ {n\over \sigma _1^2} &{} {n\over \sigma _1^2}+{m\over \sigma _2^2} &{} -{n\over \sigma _1} &{} {n\sigma _2\over \sigma _1^2}\\ -{n\over \sigma _1} &{} -{n\over \sigma _1} &{} n+{2n\over \sigma _1^2} &{} -{n\sigma _2\over \sigma _1}\\ {n\sigma _2\over \sigma _1^2} &{} {n\sigma _2\over \sigma _1^2} &{} -{n\sigma _2\over \sigma _1} &{} {n\sigma _2^2\over \sigma _1^2}+{2m\over \sigma _2^2} \end{array}\right) . \end{aligned}$$
(3)

Then, we obtain

$$\begin{aligned} i_{\theta \theta .\lambda }(\theta ,\lambda )= & {} i_{\theta \theta }(\theta ,\lambda ) -i_{\theta \lambda }(\theta ,\lambda ) i_{\lambda \lambda }(\theta ,\lambda )^{-1} i_{\lambda \theta }(\theta ,\lambda )\\= & {} {2nm\over m\sigma _1^2(\sigma _1^2+2)+n\sigma _2^2(\sigma _2^2+2)}. \end{aligned}$$

Next, the restricted maximum likelihood estimator \(\hat{\lambda }_{\theta }=(\hat{\mu }_{2\theta },\hat{\sigma }_{1\theta },\hat{\sigma }_{2\theta })\) is defined by the following recursive equations.

$$\begin{aligned} \hat{\mu }_{2\theta }= & {} \left[ {n\left( \bar{x} -\theta +{\hat{\sigma }_{1\theta }^2-\hat{\sigma }_{2\theta }^2\over 2}\right) \over \hat{\sigma }_{1\theta }^2} +{m\bar{y}\over \hat{\sigma }_{2\theta }^2}\right] \left( {n\over \hat{\sigma }_{1\theta }^2}+{m\over \hat{\sigma }_{2\theta }^2}\right) ^{-1},\\ \hat{\sigma }_{1\theta }^2= & {} -2+\left( 4+{4n^{-1}S_x^2} +\left[ \hat{\sigma }_{2\theta }^2+2(\theta +\hat{\mu }_{2\theta }-\bar{x})\right] ^2\right) ^{1\over 2},\\ \hat{\sigma }_{2\theta }^2= & {} \hat{\sigma }_{1\theta }^2[S_y^2+m(\bar{y}-\hat{\mu }_{2\theta })^2] \left[ m\hat{\sigma }_{1\theta }^2-n\hat{\sigma }_{2\theta }^2 \left( \bar{x}-\theta -\hat{\mu }_{2\theta }+{\hat{\sigma }_{1\theta }^2-\hat{\sigma }_{2\theta }^2\over 2}\right) \right] ^{-1}, \end{aligned}$$

where \(\bar{x}=\sum _{i=1}^n\log x_i/n\), \(\bar{y}=\sum _{i=1}^m\log y_i/m\). To find the restricted maximum likelihood estimator \(\hat{\lambda }_{\theta }\), we use the Gauss–Seidel iteration. Since the matching prior for \(\theta \) associated with \(L_{mp}(\theta )\) given in Ventura et al. (2009) is of the form
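A minimal sketch of this Gauss–Seidel iteration is given below (the function name, starting values, and stopping rule are our own choices, not from the paper; no safeguards are included for extreme values of \(\theta \), where the third recursion can have a non-positive denominator):

```python
import math
import statistics

def restricted_mle(theta, logx, logy, tol=1e-10, max_iter=500):
    """Gauss-Seidel iteration for the restricted MLE (mu2, sigma1^2, sigma2^2)
    at a fixed value of theta; logx and logy are the log-transformed samples."""
    n, m = len(logx), len(logy)
    xbar, ybar = statistics.fmean(logx), statistics.fmean(logy)
    Sx2 = sum((v - xbar) ** 2 for v in logx)
    Sy2 = sum((v - ybar) ** 2 for v in logy)
    # Start from the unrestricted MLEs; s1 and s2 hold the *variances*.
    mu2, s1, s2 = ybar, Sx2 / n, Sy2 / m
    for _ in range(max_iter):
        old = (mu2, s1, s2)
        # Cycle through the three recursions, using updated values as we go.
        w1, w2 = n / s1, m / s2
        mu2 = (w1 * (xbar - theta + (s1 - s2) / 2) + w2 * ybar) / (w1 + w2)
        s1 = -2 + math.sqrt(4 + 4 * Sx2 / n + (s2 + 2 * (theta + mu2 - xbar)) ** 2)
        s2 = s1 * (Sy2 + m * (ybar - mu2) ** 2) / (
            m * s1 - n * s2 * (xbar - theta - mu2 + (s1 - s2) / 2))
        if max(abs(a - b) for a, b in zip(old, (mu2, s1, s2))) < tol:
            break
    return mu2, s1, s2
```

As a sanity check, at \(\theta =\hat{\theta }\) (the unrestricted MLE of \(\theta \)) the iteration returns the unrestricted estimates \((\bar{y},\hat{\sigma }_1^2,\hat{\sigma }_2^2)\).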

$$\begin{aligned} \pi (\theta ) \propto \left( i_{\theta \theta .\lambda }(\theta ,\hat{\lambda }_{\theta })\right) ^{1\over 2}, \end{aligned}$$
(4)

the matching prior for \(\theta \) is

$$\begin{aligned} \pi _m(\theta )\propto \left[ m\hat{\sigma }_{1\theta }^2 (\hat{\sigma }_{1\theta }^2+2)+n\hat{\sigma }_{2\theta }^2 (\hat{\sigma }_{2\theta }^2+2)\right] ^{-{1\over 2}}. \end{aligned}$$
(5)

Remark 1

Based on the determinant of Fisher information (3), Jeffreys’ prior is

$$\begin{aligned} \pi _J(\theta ,\mu _2,\sigma _1,\sigma _2) \propto \sigma _1^{-2}\sigma _2^{-2}. \end{aligned}$$
(6)

3 The posterior distribution for the ratio of two log-normal means

Based on the derived matching prior (5), we here compute the corresponding posterior distribution of \(\theta \). The modified profile likelihood \(L_{mp}(\theta )\) for the ratio of two log-normal means is derived from the profile likelihood \(L_{p}(\theta )\) obtained by simply plugging in \(\hat{\lambda }_{\theta }\) in (1). See Wu et al. (2002) for a more detailed discussion on likelihood analysis for the ratio of two log-normal means.

Since the \((\lambda ,\lambda )\)-block of the observed Fisher information matrix consists of the following elements

$$\begin{aligned} j_{\mu _2\mu _2}= & {} {n\over \sigma _1^2}+{m\over \sigma _2^2},\\ j_{\mu _2\sigma _1}= & {} -{n\over \sigma _1}-{2n\left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}-\bar{x}\right) \over \sigma _1^3},\\ j_{\mu _2\sigma _2}= & {} {n\sigma _2\over \sigma _1^2}-{2m(\mu _2-\bar{y})\over \sigma _2^3},\\ j_{\sigma _1\sigma _1}= & {} n-{n\over \sigma _1^2} +{3n\left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}-\bar{x}\right) \over \sigma _1^2} +{3\sum _{i=1}^n \left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}-\log x_i\right) ^2\over \sigma _1^4},\\ j_{\sigma _1\sigma _2}= & {} -{n\sigma _2\over \sigma _1} -{2n\sigma _2\left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}-\bar{x} \right) \over \sigma _1^3},\\ j_{\sigma _2\sigma _2}= & {} -{m\over \sigma _2^2}+{n\sigma _2^2\over \sigma _1^2} +{n\left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}-\bar{x} \right) \over \sigma _1^2} +{3\sum _{i=1}^m(\mu _2-\log y_i)^2\over \sigma _2^4}, \end{aligned}$$

the determinant of \(j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })\) is given by

$$\begin{aligned} |j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })|= & {} {n\over \hat{\sigma }_{1\theta }^8\hat{\sigma }_{2\theta }^6} |nm\hat{\sigma }_{1\theta }^2\hat{\sigma }_{2\theta }^4g_1(\theta ) + \hat{\sigma }_{2\theta }^2g_2(\theta )+g_3(\theta )|, \end{aligned}$$

where

$$\begin{aligned} g_1(\theta )= & {} (2{\bar{y}}-2\hat{\mu }_{2\theta }-\hat{\sigma }_{2\theta }^2) [\hat{\sigma }_{2\theta }^2+2(\theta -\bar{x}+\hat{\mu }_{2\theta })]^2,\\ g_2(\theta )= & {} [2m\hat{\sigma }_{1\theta }^2+n\hat{\sigma }_{2\theta }^2 (\hat{\sigma }_{2\theta }^2-2\hat{\sigma }_{1\theta }^2 +2[\hat{\sigma }_{2\theta }^2+2(\theta -\bar{x}+\hat{\mu }_{2\theta })] )]\\&\times [m\hat{\sigma }_{1\theta }^4(\hat{\sigma }_{1\theta }^2+2)+n\hat{\sigma }_{2\theta }^2 (2\hat{\sigma }_{1\theta }^2+\hat{\sigma }_{1\theta }^4 -[\hat{\sigma }_{2\theta }^2+2(\theta -\bar{x}+\hat{\mu }_{2\theta })]^2)],\\ g_3(\theta )= & {} [n\hat{\sigma }_{2\theta }^4+2m \hat{\sigma }_{1\theta }^2(\bar{y}-\hat{\mu }_{2\theta })]\\&\times [2m\hat{\sigma }_{1\theta }^4(\hat{\sigma }_{1\theta }^2+2) (\hat{\mu }_{2\theta }-\bar{y}) -n\hat{\sigma }_{2\theta }^4(2\hat{\sigma }_{1\theta }^2 +\hat{\sigma }_{1\theta }^4-[\hat{\sigma }_{2\theta }^2 +2(\theta -\bar{x}+\hat{\mu }_{2\theta })]^2)]. \end{aligned}$$

The log-likelihood (1), expressed in terms of the MLE \((\hat{\theta },\hat{\lambda })\) and an ancillary statistic a, is

$$\begin{aligned} l(\theta ,\lambda \vert \hat{\theta },\hat{\lambda },a)= & {} -n\log \sigma _1-m\log \sigma _2 -{n\hat{\sigma }_1^2+n\left( \hat{\theta }+\hat{\mu }_2 -{\hat{\sigma }_1^2-\hat{\sigma }_2^2\over 2}\right) ^2 \over 2\sigma _1^2} -{m\hat{\sigma }_2^2+m\hat{\mu }_2^2 \over 2\sigma _2^2}\nonumber \\&+ {n\left( \hat{\theta }+\hat{\mu }_2-{\hat{\sigma }_1^2-\hat{\sigma }_2^2\over 2}\right) \left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}\right) \over \sigma _1^2} +{m\hat{\mu }_2\mu _2 \over \sigma _2^2}\nonumber \\&-{n\left( \theta +\mu _2-{\sigma _1^2-\sigma _2^2\over 2}\right) ^2\over 2\sigma _1^2} -{m\mu _2^2 \over 2\sigma _2^2}, \end{aligned}$$
(7)

where the maximum likelihood estimator \(\hat{\eta }=(\hat{\theta },\hat{\mu }_2,\hat{\sigma }_1,\hat{\sigma }_2)\) is \(\hat{\mu }_2=\bar{y}\), \(\hat{\sigma }_1^2=\sum _{i=1}^n (\log x_i-\bar{x})^2/n\), \(\hat{\sigma }_2^2=\sum _{i=1}^m (\log y_i-\bar{y})^2/m\) and \(\hat{\theta }=\bar{x}-\hat{\mu }_2+{1\over 2}(\hat{\sigma }_1^2-\hat{\sigma }_2^2)\). Since the sample space derivatives of the log-likelihood (7) are:

$$\begin{aligned} l_{\mu _2;\hat{\mu }_2}= & {} {\partial ^2l\over \partial \mu _2\partial \hat{\mu }_2} ={n\over \sigma _1^{2}}+{m\over \sigma _2^{2}},\\ l_{\mu _2;\hat{\sigma }_1}= & {} {\partial ^2l\over \partial \mu _2\partial \hat{\sigma }_1} =-{n\hat{\sigma }_1\over \sigma _1^2},\\ l_{\mu _2;\hat{\sigma }_2}= & {} {\partial ^2l\over \partial \mu _2\partial \hat{\sigma }_2} ={n\hat{\sigma }_2\over \sigma _1^2},\\ l_{\sigma _1;\hat{\mu }_2}= & {} {\partial ^2l\over \partial \sigma _1\partial \hat{\mu }_2} ={2nd(\eta ,\hat{\eta })\over \sigma _1^3}-{n\over \sigma _1},\\ l_{\sigma _1;\hat{\sigma }_1}= & {} {\partial ^2l\over \partial \sigma _1\partial \hat{\sigma }_1} ={{2n\hat{\sigma }_1-2n\hat{\sigma }_1d(\eta ,\hat{\eta })} \over \sigma _1^3}+{n\hat{\sigma }_1\over \sigma _1},\\ l_{\sigma _1;\hat{\sigma }_2}= & {} {\partial ^2l\over \partial \sigma _1\partial \hat{\sigma }_2} ={{2n\hat{\sigma }_2 d(\eta ,\hat{\eta })}\over \sigma _1^3}-{n\hat{\sigma }_2\over \sigma _1},\\ l_{\sigma _2;\hat{\mu }_2}= & {} {\partial ^2l\over \partial \sigma _2\partial \hat{\mu }_2} ={n\sigma _2\over \sigma _1^2}+{2m(\hat{\mu }_2-\mu _2)\over \sigma _2^3},\\ l_{\sigma _2;\hat{\sigma }_1}= & {} {\partial ^2l\over \partial \sigma _2\partial \hat{\sigma }_1} =-{n\hat{\sigma }_1\sigma _2\over \sigma _1^2},\\ l_{\sigma _2;\hat{\sigma }_2}= & {} {\partial ^2l\over \partial \sigma _2\partial \hat{\sigma }_2} ={n\hat{\sigma }_2\sigma _2\over \sigma _1^2}+{2m\hat{\sigma }_2\over \sigma _2^3}, \end{aligned}$$

where \(d({\eta ,\hat{\eta }})=\hat{\theta }+\hat{\mu }_2-{\hat{\sigma }_1^2-\hat{\sigma }_2^2\over 2} -\theta -\mu _2+{\sigma _1^2-\sigma _2^2\over 2}\), the elements of the matrix \(l_{\lambda ;\hat{\lambda }}(\theta ,\lambda )\) are given as follows.

$$\begin{aligned} l_{\lambda ;\hat{\lambda }}(\theta ,\lambda ) = \left( \begin{array}{ccc} {n\over \sigma _1^{2}}+{m\over \sigma _2^{2}} &{} -{n\hat{\sigma }_1\over \sigma _1^2} &{} {n\hat{\sigma }_2\over \sigma _1^2} \\ {2nd(\eta ,\hat{\eta })\over \sigma _1^3}-{n\over \sigma _1} &{} {{2n\hat{\sigma }_1-2n\hat{\sigma }_1d(\eta ,\hat{\eta })}\over \sigma _1^3} +{n\hat{\sigma }_1\over \sigma _1} &{} {{2n\hat{\sigma }_2 d(\eta ,\hat{\eta })}\over \sigma _1^3}-{n\hat{\sigma }_2\over \sigma _1}\\ {n\sigma _2\over \sigma _1^2}+{2m(\hat{\mu }_2-\mu _2)\over \sigma _2^3} &{} -{n\hat{\sigma }_1\sigma _2\over \sigma _1^2} &{} {n\hat{\sigma }_2\sigma _2\over \sigma _1^2}+{2m\hat{\sigma }_2\over \sigma _2^3} \end{array}\right) , \end{aligned}$$

where \(d({\eta ,\hat{\eta }})\) is as defined above. The determinant of \(l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })\) is then

$$\begin{aligned} |l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })|= & {} {2nm \hat{\sigma }_{1}\hat{\sigma }_{2}\over \hat{\sigma }_{1\theta }^5\hat{\sigma }_{2\theta }^5} [(m\hat{\sigma }_{1\theta }^2+n\hat{\sigma }_{2\theta }^2) (\hat{\sigma }_{2\theta }^2+2(\hat{\mu }_{2\theta } -\hat{\mu }_{2})+2)\\&+\, m\hat{\sigma }_{1\theta }^2(2(\theta -\hat{\theta }) +\hat{\sigma }_{1}^2-\hat{\sigma }_{2}^2)], \end{aligned}$$

where \(\hat{\eta }=(\hat{\theta },\hat{\mu }_2,\hat{\sigma }_1,\hat{\sigma }_2)\) is the maximum likelihood estimator given above.

Therefore, the posterior distribution of \(\theta \) based on the derived matching prior (5) is

$$\begin{aligned}&\pi (\theta \vert \mathbf{X},\mathbf{Y})\propto L_p(\theta ) {|j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })|^{1\over 2} \over |l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })|} \left( i_{\theta \theta .\lambda }(\theta ,\hat{\lambda }_{\theta })\right) ^{1\over 2}\\&\quad \propto \hat{\sigma }_{1\theta }^{-n+1}\hat{\sigma }_{2\theta }^{-m+2} [m\hat{\sigma }_{1\theta }^2(\hat{\sigma }_{1\theta }^2+2)+n\hat{\sigma }_{2\theta }^2 (\hat{\sigma }_{2\theta }^2+2)]^{-{1\over 2}}\\&\quad \times {|nm\hat{\sigma }_{1\theta }^2\hat{\sigma }_{2\theta }^4g_1(\theta ) + \hat{\sigma }_{2\theta }^2g_2(\theta )+g_3(\theta )|^{1\over 2} \over |(m\hat{\sigma }_{1\theta }^2+n\hat{\sigma }_{2\theta }^2) (\hat{\sigma }_{2\theta }^2+2(\hat{\mu }_{2\theta } -\hat{\mu }_{2})+2)+ m\hat{\sigma }_{1\theta }^2(2(\theta -\hat{\theta })+\hat{\sigma }_{1}^2-\hat{\sigma }_{2}^2)|}\\&\quad \times \exp \left\{ - \sum _{i=1}^{n} {\left[ \log x_{i}-\theta -\hat{\mu }_{2\theta } +{\hat{\sigma }_{1\theta }^2-\hat{\sigma }_{2\theta }^2\over 2}\right] ^2 \over 2\hat{\sigma }_{1\theta }^2} - {\sum _{i=1}^{m} (\log y_{i}-\hat{\mu }_{2\theta })^2 \over 2\hat{\sigma }_{2\theta }^2}\right\} . \end{aligned}$$

The calculation of \(L_{mp}(\theta )\) might not be straightforward in many cases. For example, in the work on the Weibull stress-strength model by Min and Sun (2013), there is no analytic form for the MLE, which makes the sample space derivatives difficult to obtain. For such cases, Ventura et al. (2009) noted that their method still applies with other modifications of the profile likelihood, because all the available adjustments are equivalent to second order and all reduce the score bias to \(O(n^{-1})\). See Severini (2000, Chap. 9), Barndorff-Nielsen and Cox (1994, Chap. 8), and Pace and Salvan (2006) for a review.

The modified profile likelihood \(\bar{L}_{M}(\theta )\) is defined as

$$\begin{aligned} \bar{L}_{M}(\theta )=L_{p}(\theta ) {|j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })|^{1\over 2} \over |I(\theta ,\hat{\lambda }_{\theta };\hat{\eta })|}, \end{aligned}$$

where \(\hat{\eta }=(\hat{\theta },\hat{\lambda })\), \(I(\theta ,\lambda ;\eta ^{0})=E_{\eta ^0}[l_{\lambda } (\theta ,\lambda )l_{\lambda }(\theta ^0,\lambda ^0)^T]\) with \(\eta ^0=(\theta ^0,\lambda ^0)\) and \(l_{\lambda }(\theta ,\lambda )=\partial l(\theta ,\lambda )/\partial \lambda \). The corresponding posterior is

$$\begin{aligned} \pi (\theta \vert \mathbf{X},\mathbf{Y})\propto L_p(\theta ) {|j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })|^{1\over 2} \over |I(\theta ,\hat{\lambda }_{\theta };\hat{\eta })|} \left( i_{\theta \theta .\lambda }(\theta ,\hat{\lambda }_{\theta })\right) ^{1\over 2}. \end{aligned}$$
(8)

Applications of (2) and (8) have been illustrated in Ventura et al. (2009) and in Ventura and Racugno (2011). We therefore derive the matching prior for the ratio of two log-normal means from (4), and the corresponding posteriors from (2) and (8), respectively.

Remark 2

From derivatives of the log-likelihood (1), we compute the elements of the matrix \(I(\theta ,\lambda ;\eta ^0)\). It is also shown that the determinants of \(l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })\) and \(I(\theta ,\hat{\lambda }_{\theta };\hat{\eta })\) are the same (see Appendix for more details).

Remark 3

The posterior distribution of \(\theta \) based on Jeffreys’ prior (6) is given by

$$\begin{aligned}&\pi (\theta \vert \mathbf{X, Y}) \propto \int _{0}^{\infty }\int _{0}^{\infty } \sigma _1^{-n-1}\sigma _2^{-m-1}[m\sigma _1^2+n\sigma _2^2]^{-{1\over 2}}\nonumber \\&\quad \times \exp \left\{ -{S_x^2\over 2\sigma _1^2}-{S_y^2\over 2\sigma _2^2} -{nm\left[ \bar{x} -\bar{y} -\theta +{\sigma _{1}^2-\sigma _{2}^2\over 2}\right] ^2 \over 2(m\sigma _1^2+n\sigma _2^2)} \right\} d\sigma _1d\sigma _2. \end{aligned}$$
(9)
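The double integral in (9) has no closed form, but for small samples it can be approximated by brute-force quadrature. The sketch below is our own illustration (the truncation point `s_max` and the grid sizes are ad hoc choices, not tuned values from the paper) of evaluating the unnormalized posterior (9) on a grid of \(\theta \) values:

```python
import math
import statistics

def jeffreys_posterior(theta_grid, logx, logy, s_max=4.0, steps=100):
    """Unnormalized posterior (9) of theta under Jeffreys' prior, computed by
    a midpoint Riemann sum over (sigma1, sigma2) on (0, s_max]^2."""
    n, m = len(logx), len(logy)
    xbar, ybar = statistics.fmean(logx), statistics.fmean(logy)
    Sx2 = sum((v - xbar) ** 2 for v in logx)
    Sy2 = sum((v - ybar) ** 2 for v in logy)
    h = s_max / steps
    sigmas = [h * (i + 0.5) for i in range(steps)]
    post = []
    for theta in theta_grid:
        total = 0.0
        for s1 in sigmas:
            for s2 in sigmas:
                q = m * s1 ** 2 + n * s2 ** 2
                dev = xbar - ybar - theta + 0.5 * (s1 ** 2 - s2 ** 2)
                # Log of the integrand in (9), for numerical stability.
                logf = (-(n + 1) * math.log(s1) - (m + 1) * math.log(s2)
                        - 0.5 * math.log(q)
                        - Sx2 / (2 * s1 ** 2) - Sy2 / (2 * s2 ** 2)
                        - n * m * dev ** 2 / (2 * q))
                total += math.exp(logf)
        post.append(total * h * h)
    return post
```

The returned values can be normalized over the \(\theta \) grid to give credible intervals; in line with Remark 4, the computed posterior decays in both tails.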

Now we check the propriety of the posterior distribution under the derived matching prior through the following theorem.

Theorem 3.1

The posterior distribution of \(\theta \) is proper if \(n+m-5>0\).

Proof

First, note that

$$\begin{aligned} \exp \left\{ - \sum _{i=1}^{n} {\left[ \log x_{i}-\theta -\hat{\mu }_{2\theta }+{\hat{\sigma }_{1\theta }^2 -\hat{\sigma }_{2\theta }^2\over 2}\right] ^2 \over 2\hat{\sigma }_{1\theta }^2} - {\sum _{i=1}^{m} (\log y_{i}-\hat{\mu }_{2\theta })^2 \over 2\hat{\sigma }_{2\theta }^2}\right\} <\infty . \end{aligned}$$

Thus, we have

$$\begin{aligned}&L_p(\theta ) \propto \hat{\sigma }_{1\theta }^{-n}\hat{\sigma }_{2\theta }^{-m} \exp \left\{ - \sum _{i=1}^{n} {\left[ \log x_{i}-\theta -\hat{\mu }_{2\theta }+{\hat{\sigma }_{1\theta }^2 -\hat{\sigma }_{2\theta }^2\over 2}\right] ^2 \over 2\hat{\sigma }_{1\theta }^2} - {\sum _{i=1}^{m} (\log y_{i}-\hat{\mu }_{2\theta })^2 \over 2\hat{\sigma }_{2\theta }^2}\right\} \\&\quad \le c_1 \hat{\sigma }_{1\theta }^{-n}\hat{\sigma }_{2\theta }^{-m}. \end{aligned}$$

Next, since

$$\begin{aligned} { |nm\hat{\sigma }_{1\theta }^2\hat{\sigma }_{2\theta }^4g_1(\theta ) + \hat{\sigma }_{2\theta }^2g_2(\theta )+g_3(\theta )|^{1\over 2} \over \hat{\sigma }_{2\theta }^2|(m\hat{\sigma }_{1\theta }^2+n \hat{\sigma }_{2\theta }^2)(\hat{\sigma }_{2\theta }^2+2(\hat{\mu }_{2\theta } -\hat{\mu }_{2})+2)+ m\hat{\sigma }_{1\theta }^2(2(\theta -\hat{\theta }) +\hat{\sigma }_{1}^2-\hat{\sigma }_{2}^2)|} <\infty , \end{aligned}$$

we have

$$\begin{aligned}&{|j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })|^{1\over 2} \over |l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })|}\\&\quad \propto {n^{1\over 2} \over 2nm\hat{\sigma }_1\hat{\sigma }_2} { \hat{\sigma }_{1\theta }^5\hat{\sigma }_{2\theta }^5|nm \hat{\sigma }_{1\theta }^2\hat{\sigma }_{2\theta }^4g_1(\theta ) + \hat{\sigma }_{2\theta }^2g_2(\theta )+g_3(\theta )|^{1\over 2} \over \hat{\sigma }_{1\theta }^4\hat{\sigma }_{2\theta }^3| (m\hat{\sigma }_{1\theta }^2+n\hat{\sigma }_{2\theta }^2) (\hat{\sigma }_{2\theta }^2+2(\hat{\mu }_{2\theta } -\hat{\mu }_{2})+2)+m\hat{\sigma }_{1\theta }^2(2(\theta -\hat{\theta })+\hat{\sigma }_{1}^2-\hat{\sigma }_{2}^2)|}\\&\quad \le c_2 \hat{\sigma }_{1\theta }\hat{\sigma }_{2\theta }^4. \end{aligned}$$

Lastly, since

$$\begin{aligned} {[}m\hat{\sigma }_{1\theta }^2\hat{\sigma }_{2\theta }^{-4}(\hat{\sigma }_{1\theta }^2+2) +n(2\hat{\sigma }_{2\theta }^{-2}+1)]^{-{1\over 2}}<\infty , \end{aligned}$$

we have

$$\begin{aligned}&\left( i_{\theta \theta .\lambda }(\theta ,\hat{\lambda }_{\theta })\right) ^{1\over 2} \propto [m\hat{\sigma }_{1\theta }^2(\hat{\sigma }_{1\theta }^2+2) +n\hat{\sigma }_{2\theta }^2(\hat{\sigma }_{2\theta }^2+2)]^{-{1\over 2}}\\&\quad \propto \hat{\sigma }_{2\theta }^{-2} [m\hat{\sigma }_{1\theta }^2\hat{\sigma }_{2\theta }^{-4}(\hat{\sigma }_{1\theta }^2+2) +n(2\hat{\sigma }_{2\theta }^{-2}+1)]^{-{1\over 2}}\\&\quad \le c_3\hat{\sigma }_{2\theta }^{-2}. \end{aligned}$$

Here, \(c_1, c_2, c_3\), and \(c_4\) are constants. Therefore,

$$\begin{aligned}&\int _{-\infty }^{\infty } \pi (\theta \vert \mathbf{X},\mathbf{Y})d\theta \propto \int _{-\infty }^{\infty } L_p(\theta ) {|j_{\lambda \lambda }(\theta ,\hat{\lambda }_{\theta })|^{1\over 2} \over |l_{\lambda ;\hat{\lambda }}(\theta ,\hat{\lambda }_{\theta })|} \left( i_{\theta \theta .\lambda } (\theta ,\hat{\lambda }_{\theta })\right) ^{1\over 2}d\theta \\&\quad \le \int _{-\infty }^{\infty } c_4 \hat{\sigma }_{1\theta }^{-n+1} \hat{\sigma }_{2\theta }^{-m+2}d\theta <\infty , \end{aligned}$$

if \(n+m-5>0\). This completes the proof. \(\square \)

Remark 4

By direct computation, it can be shown that the posterior distribution (9) of \(\theta \) based on Jeffreys' prior is proper if \(n>0\) and \(m>0\).

Table 1 Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles and 90% (95%) credible interval of \(\theta \)

4 Numerical study

4.1 Simulation study

For various configurations (\(\mu _1,\mu _2,\sigma _1,\sigma _2\)) and (n, m), we investigate the credible intervals of the marginal posterior density of \(\theta \) under the matching prior given in the previous section. The frequentist coverage probability of a \((1-\alpha )\)th posterior quantile should be close to \(1-\alpha \). Table 1 provides numerical values of the frequentist coverage probabilities of the 0.05 (0.95) posterior quantiles, which are based on the following algorithm for any fixed true (\(\mu _1,\mu _2,\sigma _1,\sigma _2\)) and any prespecified probability value \(\alpha \).

Since \(F(\theta ^{\pi }(\alpha \vert \mathbf{X},\mathbf{Y}) \vert \mathbf{X},\mathbf{Y})=\alpha \), where \(F(\cdot \vert \mathbf{X},\mathbf{Y})\) is the marginal posterior distribution of \(\theta \) and \(\theta ^{\pi }(\alpha \vert \mathbf{X},\mathbf{Y})\) is the posterior \(\alpha \)-quantile of \(\theta \) given \(\mathbf{X}\) and \(\mathbf{Y}\), the frequentist coverage probability of this one sided credible interval of \(\theta \) is

$$\begin{aligned} P_{( \mu _1,\mu _2,\sigma _1,\sigma _2)}(\alpha ;\theta )= P_{(\mu _1,\mu _2,\sigma _1,\sigma _2)}(0< \theta \le \theta ^{\pi } (\alpha \vert \mathbf{X},\mathbf{Y})). \end{aligned}$$

The values of \(P_{(\mu _1,\mu _2,\sigma _1,\sigma _2)}(\alpha ;\theta )\) are computed for \(\alpha =0.05\) (0.95) and are summarized in Table 1. For fixed n, m and \((\mu _1,\mu _2,\sigma _1,\sigma _2)\), 10,000 independent random samples of \(\mathbf{X}=(X_1,\ldots ,X_{n})\) and \(\mathbf{Y}=(Y_1,\ldots ,Y_{m})\) are generated from the corresponding log-normal distributions.
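This coverage-checking scheme can be sketched in a few lines. For illustration only, the sketch below substitutes the closed-form Z-interval for the posterior quantile interval (which would require the full posterior computation); the standard error formula is the usual large-sample approximation for \(\hat{\theta }\) and is our reconstruction, not the code behind Table 1:

```python
import math
import random
import statistics

def z_coverage(mu1, mu2, sig1, sig2, n, m, alpha=0.10, reps=2000, seed=1):
    """Monte Carlo frequentist coverage of the two-sided large-sample
    Z-interval for theta = mu1 - mu2 + (sig1^2 - sig2^2)/2."""
    theta = mu1 - mu2 + 0.5 * (sig1 ** 2 - sig2 ** 2)
    # Standard normal quantile (only alpha = 0.10 or 0.05 handled here).
    z = 1.6449 if alpha == 0.10 else 1.96
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu1, sig1) for _ in range(n)]  # log-scale samples
        y = [rng.gauss(mu2, sig2) for _ in range(m)]
        v1, v2 = statistics.variance(x), statistics.variance(y)
        est = statistics.fmean(x) - statistics.fmean(y) + 0.5 * (v1 - v2)
        # Approximate standard error of the estimator of theta.
        se = math.sqrt(v1 / n + v2 / m
                       + v1 ** 2 / (2 * (n - 1)) + v2 ** 2 / (2 * (m - 1)))
        hits += abs(est - theta) <= z * se
    return hits / reps
```

The empirical coverage returned by this function can then be compared with the nominal level \(1-\alpha \), exactly as the tabulated quantile coverages are compared with \(\alpha \) and \(1-\alpha \).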

It turns out that the matching prior performs better than Jeffreys' prior in meeting the target coverage probabilities, although when the two variances are equal or only slightly different, Jeffreys' prior also shows good coverage probabilities. In addition, the matching prior meets the target coverage probabilities well even for small sample sizes. Moreover, Table 1 shows that the outcomes are not considerably sensitive to changes in the values of the parameters (\(\mu _1,\mu _2,\sigma _1,\sigma _2\)).

4.2 Real data

We compute the confidence interval based on the Z-score statistic (Zhou et al. 1997), the confidence interval based on the modified signed log-likelihood ratio statistic, known as the \(r^*\)-formula (Wu et al. 2002), and Bayesian credible intervals for the ratio of two log-normal means based on the matching prior and Jeffreys' prior. Two real examples are given as follows.

Example 1

The first example is a bioavailability study where a randomized, parallel-group experiment was conducted with 20 subjects to compare a new test formulation with a reference formulation of a drug product with a long half-life (Wu et al. 2002). In determining if the two formulations have different bioavailability, it is crucial to construct a confidence interval for the ratio of means of maximum plasma concentration (\(C_{\max }\)) and test the equality of the means of \(C_{\max }\) of the two formulations. The data are given below:

A new test formulation: 732.89, 1371.97, 614.62, 557.24, 821.39, 363.94, 430.95, 401.42, 436.16, 951.46.

A reference formulation: 1053.63, 1351.54, 197.95, 1204.72, 447.20, 3357.66, 567.36, 668.48, 842.19, 284.86.

After log-transformation, the sample means are 6.417 \((n_1=10)\) and 6.601 \((n_2=10)\), and the sample standard deviations are 0.429 and 0.817, respectively. The Shapiro–Wilk tests for normality on the log-transformed data give a p value of 0.595 for the test formulation group and a p value of 0.983 for the reference formulation group. The F test for equal variances of the log-transformed data between the two groups yields a p value of 0.034, and therefore the log transformation does not stabilize the variances (see Wu et al. 2002).
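These summary statistics are easy to reproduce from the listed data (using the sample standard deviation with divisor \(n-1\)); the variance-ratio statistic of the F test can be computed the same way, although its p value requires an F distribution:

```python
import math
import statistics

# Cmax data for the two formulations, as listed above.
test = [732.89, 1371.97, 614.62, 557.24, 821.39,
        363.94, 430.95, 401.42, 436.16, 951.46]
ref = [1053.63, 1351.54, 197.95, 1204.72, 447.20,
       3357.66, 567.36, 668.48, 842.19, 284.86]

log_test = [math.log(v) for v in test]
log_ref = [math.log(v) for v in ref]

print(round(statistics.fmean(log_test), 3),
      round(statistics.stdev(log_test), 3))   # -> 6.417 0.429
print(round(statistics.fmean(log_ref), 3),
      round(statistics.stdev(log_ref), 3))    # -> 6.601 0.817

# Variance-ratio statistic for the F test of equal log-scale variances.
F = statistics.variance(log_ref) / statistics.variance(log_test)
print(round(F, 2))                            # -> 3.62
```

Comparing F with the \(F_{9,9}\) distribution gives the two-sided p value of 0.034 quoted above.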

Table 2 The 95% confidence interval and length for the ratio of two log-normal means

For the ratio of the means of \(C_{\max }\) between the two formulations, the 95% Z-interval, \(r^*\)-interval, and Bayesian credible intervals, together with their lengths, are summarized in Table 2. The Z-interval differs from the other confidence intervals; when both samples are small, the Z-score method is either too liberal or too conservative (Wu et al. 2002; Krishnamoorthy and Mathew 2003). The lower limits of the \(r^*\) and \(\pi _J \) intervals differ, and Jeffreys' prior yields the shorter interval. Further, the interval of \(\pi _m\) includes the interval of \(r^*\). Overall, the confidence intervals based on \(r^*, \pi _J \), and \(\pi _m\) are very similar.

Example 2

This example was studied by Krishnamoorthy and Mathew (2003). The data can be found in the "Data and Story Library" (http://lib.stat.cmu.edu/DASL/). An oil refinery located northeast of San Francisco conducted a series of 31 daily measurements of the carbon monoxide levels arising from one of its stacks between April 16 and May 16, 1993. These measurements were reported to the Bay Area Air Quality Management District (BAAQMD) as evidence for establishing a baseline. The BAAQMD personnel also made 9 independent measurements of the carbon monoxide concentration from the same stack over the period from September 11, 1990, to March 30, 1993. The Data and Story Library notes that the refinery tends to overestimate carbon monoxide emissions (to set the baseline at a higher level), a claim that is tested in this analysis. The data are given below:

Carbon monoxide measurements by refinery (in ppm): 45, 30, 38, 42, 63, 43, 102, 86, 99, 63, 58, 34, 37, 55, 58, 153, 75, 58, 36, 59, 43, 102, 52, 30, 21, 40, 141, 85, 161, 86, 71.

Carbon monoxide measurements by the BAAQMD (in ppm): 12.5, 20, 4, 20, 25, 170, 15, 20, 15.

After log-transformation, the sample means and sample standard deviations were 4.0743 and 0.5021 for the measurements taken by the refinery, and 2.963 and 0.974 for the measurements collected by the BAAQMD. The sample sizes were \(n_1=31\) and \(n_2=9\), respectively. Krishnamoorthy and Mathew (2003) showed that a log-normal model fits both sets of measurements well, whereas the normal model does not at any practical level of significance.
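Under the same Z-score construction of Zhou et al. (1997), a one-sided test of whether the refinery's mean carbon monoxide level exceeds the BAAQMD's can be sketched from the rounded summary statistics; the construction and these inputs are assumptions for illustration. As noted above, the Z approach can be too liberal with small, unbalanced samples, which is worth keeping in mind when reading the result.

```python
from scipy import stats

# Summary statistics on the log scale (Krishnamoorthy and Mathew 2003)
m1, s1, n1 = 4.0743, 0.5021, 31   # refinery
m2, s2, n2 = 2.963, 0.974, 9      # BAAQMD

# theta = mu1 - mu2 + (sigma1^2 - sigma2^2)/2, the log of the ratio
# of the two log-normal means
theta_hat = m1 - m2 + 0.5 * (s1**2 - s2**2)

# Estimated standard error (Z-score construction of Zhou et al. 1997)
se = (s1**2 / n1 + s2**2 / n2
      + s1**4 / (2 * (n1 - 1)) + s2**4 / (2 * (n2 - 1))) ** 0.5

z = theta_hat / se                 # Z statistic for H0: theta = 0
p_one_sided = stats.norm.sf(z)     # H1: refinery mean is larger
```

The one-sided Z test rejects at the 5% level, whereas the generalized lower confidence limit discussed below includes zero, consistent with the Z method being liberal in this unbalanced setting.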

Table 3 The 95% confidence interval and length for the ratio of two log-normal means

Table 3 provides the 95% Z-interval, the \(r^*\)-interval, and the credible intervals, together with their lengths, for the ratio of the means of the two sets of measurements. As in Example 1, the Z-interval differs from the other confidence intervals. The 95% generalized lower confidence limit (Krishnamoorthy and Mathew 2003) for \(\theta \left( =\mu _1-\mu _2+{1\over 2}(\sigma _1^2-\sigma _2^2)\right) \) is \(-0.40\), while the 95% lower confidence limits for \(\theta \) based on \(r^*\), \(\pi _J\), and \(\pi _m\) are \(-0.32, -0.17\), and \(-0.47\), respectively. Thus the generalized lower confidence limit and the lower confidence limits based on \(r^*\) and \(\pi _m\) are close to one another.

5 Concluding remarks

The log-normal distribution is widely used to analyze positively skewed data in biomedical research. In bioequivalence trials, the relative potency of a new drug to that of a standard one is expressed in terms of the ratio of means. However, it is not easy to derive matching priors for the ratio of log-normal means owing to the presence of nuisance parameters. In addition, multidimensional numerical integration over the nuisance parameters, or Markov chain Monte Carlo integration, is needed to obtain the marginal posterior distribution of the parameter of interest. We therefore consider an alternative method that uses the modified profile likelihood to eliminate the nuisance parameters and derive the corresponding matching prior, which leads to a proper posterior distribution. As illustrated in the numerical study and the two data examples, the derived matching prior performs better than Jeffreys' prior with respect to the asymptotic frequentist coverage property. In addition, the matching prior attains the target coverage probability well even for small sample sizes.