1 Introduction

Zika virus (ZIKV) was first isolated in the Zika Forest of Uganda from a rhesus monkey in 1947 and then in humans in 1952 (Dick et al. 1952). Following the isolation of the first human case, several cases occurred in a number of countries in Africa and Asia (Faye and Freire 2014) in the 1970s and 1980s. The first major outbreak of Zika was recorded on the Island of Yap, with 185 suspected cases in 2007 (Kindhauser et al. 2016). Since 2015 the geographical distribution of the Zika virus has continued to expand, with reported cases from Brazil, Puerto Rico and most recently Miami, Florida. It appears that the nature of Zika virus infections has been changing as the virus moves from Africa to the Americas (Kindhauser et al. 2016). It was initially classified as an obscure mosquito-borne infection causing mild illness across equatorial Africa and Asia, but since 2007 it has been causing large outbreaks and has become a global health emergency (WHO http://www.who.int/emergencies/zika-virus/articles/one-year-outbreak/en/index1.html). Zika outbreaks have been linked to neurological disorders, including Guillain–Barré syndrome and microcephaly in newborns born to mothers infected with Zika, across the Pacific region and the Americas (WHO http://www.who.int/emergencies/zika-virus/articles/one-year-outbreak/en/index1.html; Perkins et al. 2016).

The main route of transmission for Zika virus is the bite of an infected mosquito, mainly of the Aedes species (Ae. aegypti and Ae. albopictus). These are the same vectors that transmit dengue, chikungunya and yellow fever. CDC estimates that these mosquito species can be found in about half of the area of the continental US, mostly in the southern states but also in the northeast (CDC 2016). Local transmission of Zika in the continental US has been detected only in Florida and Texas (footnote 1). Though very similar to dengue and chikungunya, Zika has other routes of transmission: sexual contact, vertical transmission and blood transfusion (see footnote 1). The virus can survive in semen longer than in blood and can sporadically be found in vaginal fluids (Osuna and Lim 2016). A pregnant woman can pass Zika virus to her fetus during pregnancy. Vertical transmission of the Zika virus causes microcephaly and other severe fetal brain defects (Mlakar et al. 2016).

As of March 2017, the USA has had symptomatic cases of Zika, 215 of them locally acquired in Florida (see footnote 1). The local transmission of Zika in Florida began in June 2016 and was declared resolved by the governor in December 2016. Zika has been endemic in Puerto Rico. These cases show that Zika is a significant public health problem in the USA as well as in the rest of the Americas.

Zika has been investigated through mathematical models. Kucharski et al. (2016) use a standard vector–host ODE model to understand Zika transmission during the 2013–2014 outbreak in French Polynesia. The authors estimate a reproduction number of 2.6–4.8, with an estimated 11.5% of the cases reported. The article further estimates that 94% of the population was infected, mostly asymptomatically. Sexual transmission alongside vector-borne transmission of Zika was first modeled in Gao et al. (2016). Gao et al. use data from Brazil, Colombia and El Salvador to estimate the reproduction number at 2.055. The wide confidence interval for \(\mathcal R_0\) (0.523–6.3) suggests that some of the parameters that comprise the reproduction number are not identifiable. The article further estimates that about 3% of the transmissions are sexual; however, the wide CI again suggests that the parameters related to sexual transmission are not identifiable. We address this question explicitly and show that without data on sexual transmission, parameters related to sexual transmission are not identifiable. On the other hand, obtaining data on sexual transmission in the context of local transmission is very difficult, since it is hard to distinguish which cases are mosquito generated and which have resulted from sexual transmission (personal communication with Florida Department of Health). Sexual and vector-borne transmissions are also investigated in Baca-Carrasco and Velasco-Hernandez (2016), where the authors evaluate the impact of sexual transmission as well as importation of cases and find that sexual transmission impacts the magnitude of the outbreak, while migration generates outbreaks over time, possibly with lower magnitude. Chowell et al. (2016) use the generalized Richards model to project new cases and estimate the burden of Zika, using data from Antioquia, Colombia. One of the most serious impacts of Zika is on babies born to women infected with Zika. Perkins et al. (2016) use mathematical models to project Zika virus infections in childbearing women in the Americas.

The goal of this article is twofold: (1) to use a number of models to estimate \(\mathcal R_0\) for the locally acquired cases of the 2016 Zika outbreak in Florida and (2) to develop identifiable models of Zika, including models that explicitly account for pregnant women. In the next section, we introduce six models of Zika, starting from a generic vector–host model and incorporating one by one the distinct features of Zika, such as asymptomatic infections, sexual transmission and a separate class for pregnant women. In Sect. 3, we discuss the structural identifiability of the models. In Sect. 4, we fit the models to the data, estimate \(\mathcal R_0\) and discuss the practical identifiability of the models. In Sect. 5, we derive basic analytical results for the Zika models. Section 6 summarizes our conclusions. All the MATLAB code used in this study is available at https://github.com/NecibeTuncer/ZikaODEModels.

2 Epidemiological Models of the Zika Virus Infection

Zika is an epidemiologically complex disease. Though in many respects similar to other arboviral diseases, such as dengue and chikungunya (Lanciotti et al. 2016), it also has many distinct features. In this section, we introduce a number of epidemiological models, starting from a very simple and generic vector-borne model and gradually including the more distinctive features of Zika. Since Zika rarely leads to death (Petersen et al. 2016), we neglect the disease-induced death rate in all models. Nonetheless, we use standard incidence. Prior research has used outbreak models for the human population (Gao et al. 2016). We use endemic models because the Zika epidemic has continued for nearly 2 years, estimates of reproduction numbers suggest that the virus is endemic, and WHO has put the disease on the list of continued threat diseases (footnote 2). Model 1 is a general vector-borne model, much like the ones developed by Ross and Macdonald for malaria (Smith et al. 2012). Although Zika infection with a given strain (lineage) is believed to offer life-long protection (Dudley et al. 2016), infections with other strains may be possible. To capture that possibility, we include loss of immunity in the models. The dependent variables in the models are listed in Table 1. The meanings of the parameters of the various models are listed in Table 2.

Table 1 Definition of the variables in the between host models

Zika model with vector transmission only:

$$\begin{aligned} \mathrm{Model\, 1 \, (M_1)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S_v}{\mathrm{d} t } &{}= \displaystyle \Lambda _v - \frac{\beta I S_v}{N} -\mu _v S_v ,\\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } &{}=\displaystyle \frac{\beta I S_v}{N} - \mu _v I_v ,\\ \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } &{}= \displaystyle \Lambda - \frac{\beta _v I_v S}{N} - \mu S +\omega R ,\\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} &{}= \displaystyle \frac{\beta _v I_v S}{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} &{}= \gamma I - (\mu +\omega ) R . \\ \end{array}\right. } \end{aligned}$$
(1)

The total human population \(N(t) = S(t) +I(t) +R(t)\) satisfies the following differential equation,

$$\begin{aligned} N' = \Lambda - \mu N, \quad N(0) = S(0) +I(0) +R(0). \end{aligned}$$
(2)

Similarly, the total mosquito population \(N_v(t) = S_v(t) +I_v(t)\) can be determined from the following differential equation,

$$\begin{aligned} N_v' = \Lambda _v - \mu _v N_v, \quad N_v(0) = S_v(0) +I_v(0). \end{aligned}$$
(3)
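As an illustration of system (1), the following is a minimal Python sketch that integrates Model 1 with a fixed-step fourth-order Runge–Kutta scheme and checks that the totals behave as Eqs. (2) and (3) predict. The values of \(\mu \), \(\Lambda \), \(\mu _v\) and \(\Lambda _v\) follow the choices made in Sect. 4.1; the values of \(\beta \), \(\beta _v\), \(\gamma \) and \(\omega \), the initial condition and the step size are assumptions made only for this sketch and are not the fitted values. The function names are hypothetical and this is not the authors' MATLAB code.

```python
# Minimal sketch of Model 1 (M_1), integrated with a fixed-step RK4 scheme.
# MU, LAMBDA, MU_V and LAMBDA_V follow Sect. 4.1 of the paper.
# BETA, BETA_V, GAMMA and OMEGA are hypothetical placeholders, not fitted values.

MU = 1.0 / (79 * 365)        # human natural death rate (1/day)
LAMBDA = 20_000_000 * MU     # human recruitment rate (people/day)
MU_V = 1.0 / 10              # mosquito death rate (1/day)
LAMBDA_V = MU_V              # mosquito recruitment, so N_v is a proportion
BETA = 0.3                   # human-to-mosquito transmission rate (assumed)
BETA_V = 0.5                 # mosquito-to-human transmission rate (assumed)
GAMMA = 1.0 / 7              # human recovery rate (assumed)
OMEGA = 1.0 / 365            # rate of loss of immunity (assumed)


def m1_rhs(state):
    """Right-hand side of system (1); state = (S_v, I_v, S, I, R)."""
    s_v, i_v, s, i, r = state
    n = s + i + r
    new_vec = BETA * i * s_v / n      # new mosquito infections per day
    new_hum = BETA_V * i_v * s / n    # new human infections per day
    return (
        LAMBDA_V - new_vec - MU_V * s_v,
        new_vec - MU_V * i_v,
        LAMBDA - new_hum - MU * s + OMEGA * r,
        new_hum - (MU + GAMMA) * i,
        GAMMA * i - (MU + OMEGA) * r,
    )


def rk4_step(f, state, h):
    """Advance the state by one classical Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(x + 0.5 * h * k for x, k in zip(state, k1)))
    k3 = f(tuple(x + 0.5 * h * k for x, k in zip(state, k2)))
    k4 = f(tuple(x + h * k for x, k in zip(state, k3)))
    return tuple(
        x + h / 6.0 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state0, days, h=0.1):
    """Integrate Model 1 from state0 over the given number of days."""
    state = state0
    for _ in range(int(round(days / h))):
        state = rk4_step(m1_rhs, state, h)
    return state


# Start at the demographic equilibrium N = Lambda/mu with one infected human.
n0 = LAMBDA / MU
final = simulate((1.0, 0.0, n0 - 1.0, 1.0, 0.0), days=72)
print("N_v =", final[0] + final[1], " N =", sum(final[2:]))
```

Because the infection terms cancel when the equations are summed, a run started at \(N(0)=\Lambda /\mu \) and \(N_v(0)=\Lambda _v/\mu _v\) should keep both totals constant up to rounding, which gives a quick sanity check on any implementation of the right-hand side.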

Zika infections are often asymptomatic, with an estimated 80% of the cases being without symptoms (Petersen et al. 2016). In symptomatic individuals, clinical manifestations are mild. Symptoms in non-pregnant individuals last from several days to a week (footnote 3). To account for the asymptomatic infections, we consider a version of the above model with an asymptomatic class A:

Zika model with vector transmission and asymptomatic class:

$$\begin{aligned} \mathrm{Model\, 2 \, (M_2)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S_v}{\mathrm{d} t } &{}= \displaystyle \Lambda _v - \frac{\beta (I+q A) S_v}{N} -\mu _v S_v ,\\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } &{}= \displaystyle \frac{\beta (I+qA) S_v}{N} - \mu _v I_v ,\\ \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } &{}= \displaystyle \Lambda - \frac{\beta _v I_v S}{N} - \mu S +\omega R ,\\ \displaystyle \frac{\mathrm{d} A}{\mathrm{d} t} &{}= \displaystyle \frac{(1-\phi )\beta _v I_v S}{N} - (\mu +\gamma _A)A , \\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} &{}= \displaystyle \frac{\phi \beta _v I_v S}{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} &{}= \gamma I+\gamma _A A - (\mu +\omega ) R, \\ \end{array}\right. } \end{aligned}$$
(4)

where q is the reduction in infectivity of asymptomatic individuals, \(\phi \) is the fraction of the new infections that are symptomatic and \(N=S+A+I+R\). Models with an asymptomatic class have been investigated before (Chitnis et al. 2013).

As a vector-borne disease, Zika is transmitted predominantly by mosquitoes. A distinctive feature of Zika is that it can also be transmitted through sexual contact (footnote 4). More recent data suggest that sexual transmissions may not be as rare as originally thought. To account for sexual transmission, we include a direct transmission term in Model 1:

Zika model with vector and direct transmissions:

$$\begin{aligned} \mathrm{Model\, 3 \, (M_3)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S_v}{\mathrm{d} t } &{}=\displaystyle \Lambda _v - \frac{\beta I S_v}{N} -\mu _v S_v ,\\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } &{}=\displaystyle \frac{\beta I S_v}{N} - \mu _v I_v ,\\ \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } &{}= \displaystyle \Lambda - \frac{\beta _v I_vS + \beta _d IS}{N} - \mu S +\omega R ,\\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} &{}= \displaystyle \frac{\beta _v I_v S + \beta _d SI}{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} &{}= \gamma I - (\mu +\omega ) R . \end{array}\right. } \end{aligned}$$
(5)

Here, \(N=S+I+R\). Models of vector-borne diseases with direct transmission are not new and have been considered before (Velasco-Hernandez 1994; Wei et al. 2008).

We include the asymptomatic infectious class in the model with vector and direct transmission.

Zika model with vector/direct transmission and asymptomatic class:

$$\begin{aligned} \mathrm{Model\, 4 \, (M_4)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S_v}{\mathrm{d} t } &{}= \displaystyle \Lambda _v - \frac{\beta (I+qA) S_v}{N} -\mu _v S_v ,\\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } &{}=\displaystyle \frac{\beta (I+qA) S_v}{N} - \mu _v I_v ,\\ \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } &{}= \displaystyle \Lambda - \frac{\beta _v I_v S + \beta _d S (I+q_AA)}{N} - \mu S +\omega R ,\\ \displaystyle \frac{\mathrm{d} A}{\mathrm{d} t} &{}= \displaystyle \frac{(1-\phi ) \beta _v I_v S + (1-\phi ) \beta _d S(I+q_AA)}{N} - (\mu +\gamma _A)A , \\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} &{}= \displaystyle \frac{\phi \beta _v I_v S + \phi \beta _d S(I+q_A A)}{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} &{}=\displaystyle \gamma I+\gamma _A A - (\mu +\omega ) R, \end{array}\right. } \end{aligned}$$
(6)

where \(N=S+I+A+R\).

Although Zika is a mild infection in most individuals, it may be serious for pregnant women and their unborn children. It is now well established that Zika can cross the placenta and infect the fetus, particularly the brain, causing birth defects (Wu and Zuo 2016; Mlakar et al. 2016; footnote 5). CDC reports the number of pregnant women infected with Zika (footnote 6), which allows this class to be modeled separately. Below we include a model of Zika that tracks pregnant women separately. We do not include infected births because we assume that the contribution of newly born infected babies to new infections is minimal.

Table 2 Definition of the parameters in models (1), (4), (5), (6), (7)

Zika model with vector/direct transmission and pregnant women:

$$\begin{aligned} \mathrm{Model\, 5 \, (M_5)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S_v}{\mathrm{d} t } &{}= \displaystyle \Lambda _v - \frac{\beta (I + I_p ) S_v}{N} -\mu _v S_v ,\\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } &{}=\displaystyle \frac{\beta (I + I_p ) S_v}{N} - \mu _v I_v ,\\ \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } &{}= \displaystyle \Lambda +\xi _p S_p - \frac{\beta _v I_v S + \beta _d(I + I_p)S}{N} - (\mu +\xi ) S +\omega R ,\\ \displaystyle \frac{\mathrm{d} S_p}{\mathrm{d} t} &{}=\displaystyle \xi S - \frac{\beta _{v_p} I_v S_p + \beta _{d_p} I S_p}{N} - (\mu +\xi _p) S_p ,\\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} &{}= \displaystyle \frac{\beta _v I_v S + \beta _{d}(I + I_p) S}{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} I_p}{\mathrm{d} t} &{}= \displaystyle \frac{\beta _{v_p} I_v S_p + \beta _{d_p} I S_p}{N} - (\mu +\gamma _p)I_p , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} &{}= \gamma I +\gamma _p I_p - (\mu +\omega ) R. \end{array}\right. } \end{aligned}$$
(7)

Here, \(N=S+S_p+I+I_p+R\). Research in monkeys (Dudley et al. 2016) suggests that pregnant individuals are infected for longer, but the infection persists at a lower level in the serum, making them on average perhaps less infective to mosquitoes. To account for that difference, we assume distinct transmission rates for pregnant women to mosquitoes. However, for direct transmission we assume the same transmission rate, because direct transmission is rarer and a difference in the transmission rates would have a small impact on the disease dynamics.

Zika model with vector transmission and pregnant women class:

$$\begin{aligned} \mathrm{Model\, 6 \, (M_6)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S_v}{\mathrm{d} t } &{}= \displaystyle \Lambda _v - \frac{\beta (I + I_p ) S_v}{N} -\mu _v S_v ,\\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } &{}=\displaystyle \frac{\beta (I + I_p ) S_v}{N} - \mu _v I_v ,\\ \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } &{}= \displaystyle \Lambda +\xi _p S_p - \frac{\beta _v I_v S }{N} - (\mu +\xi ) S +\omega R ,\\ \displaystyle \frac{\mathrm{d} S_p}{\mathrm{d} t} &{}=\displaystyle \xi S - \frac{\beta _{v_p} I_v S_p }{N} - (\mu +\xi _p)S_p ,\\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} &{}= \displaystyle \frac{\beta _v I_v S }{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} I_p}{\mathrm{d} t} &{}= \displaystyle \frac{\beta _{v_p} I_v S_p }{N} - (\mu +\gamma _p)I_p , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} &{}= \gamma I +\gamma _p I_p - (\mu +\omega ) R. \end{array}\right. } \end{aligned}$$
(8)

Here, \(N=S+S_p+I+I_p+R\). All the models and their characteristic differences are summarized in Table 3.

Table 3 Summary of Zika models presented in the paper

3 Structural Identifiability Analysis of the Epidemiological Models of Zika Virus Infection

As in many applications, the parameters of the models in this study cannot be measured directly by clinical studies, but can only be determined by indirect approaches such as parameter estimation methods using time-varying incidence reports provided by health organizations. However, it is first necessary to answer the fundamental question of whether the mathematical model is structured so that its parameters can be identified from the given observations. For a well-posed parameter estimation problem, we need to know whether there exists a unique set of parameters that could have produced the data. We therefore first study the well-posedness of the parameter estimation problem for the given observations, namely cumulative incidences and infected pregnant women. A model is said to be structurally identifiable if, given large enough data sets that are free of errors, it is theoretically possible to uniquely determine the parameter values that generated these data points. If an infinite number of parameter sets lead to the same observations, then the model is called non-identifiable. If two or more (finite and isolated) parameter sets lead to the same observational output, the model is called locally identifiable.

If the model is not structurally identifiable, then we cannot estimate the “true values” of the parameters. As a first step in determining the parameters, we investigate the identifiability of the candidate models \(M_1\)-\(M_6\). There are multiple ways to test for the identifiability of a model: Taylor’s or generating series approaches, identifiability tableaus, the differential algebra approach, direct methods, the implicit function approach, profile likelihood, output sensitivities and differential geometry approaches (Chis et al. 2011; Miao et al. 2011; Raue et al. 2009; Stigter and Molenaar 2015; Villaverde et al. 2016; Meshkat et al. 2014; footnote 7). No single one of these methods is applicable to all mathematical models. For a comparison of these methods in terms of applicability, computational complexity and information provided, we refer the reader to Chis et al. (2011).

To set up the problem, without loss of generality we express the models \(M_1\) through \(M_6\) in the following compact form

$$\begin{aligned} {\varvec{x}}' = f({\varvec{x}}(t), {\varvec{p}}) \quad {\varvec{x}}(0)= {\varvec{x}}_0, \end{aligned}$$
(9)

where \({\varvec{p}}\) denotes the parameters of the system, \({\varvec{x}}(t)\) denotes the state variables and \({\varvec{x}}_0\) denotes the initial values. The observations, namely the cumulative number of cases and the number of pregnant women infected with Zika, are given by the output function \(g({\varvec{x}}(t),{\varvec{p}})\). We use the definition of structural identifiability given in Miao et al. (2011).

Definition 3.1

A parameter set \( {\varvec{p}}\) is called structurally globally (or uniquely) identifiable if, for every \({\varvec{q}}\) in the parameter space, the following equivalence holds:

$$\begin{aligned} g({\varvec{x}} (t) ,{\varvec{p}}) = g({\varvec{x}} (t) ,{\varvec{q}}) \iff {\varvec{p}} = {\varvec{q}} \,. \end{aligned}$$

That is, if \({\varvec{p}} \ne {\varvec{q}} \), then \(g({\varvec{x}} (t) ,{\varvec{p}}) \ne g({\varvec{x}} (t) ,{\varvec{q}})\), and hence the corresponding noise-free data are distinct as well. In other words, if any observation of the mathematical model can be produced by only one set of parameters, then the model is said to be globally (structurally) identifiable. Local identifiability is defined in Miao et al. (2011) as follows.

Definition 3.2

The parameter set \( {\varvec{p}}\) is called locally identifiable if there exists an open neighborhood \(\mathcal {N}({\varvec{p}})\) of \({\varvec{p}}\) such that, for every \({\varvec{q}} \in \mathcal {N}({\varvec{p}})\), the following equivalence holds:

$$\begin{aligned} g({\varvec{x}} (t) ,{\varvec{p}}) = g({\varvec{x}} (t) ,{\varvec{q}}) \iff {\varvec{p}} = {\varvec{q}} \,. \end{aligned}$$

Among all the methods to test for structural identifiability, the differential algebra approach stands out because it not only distinguishes between local and global identifiability, but also reveals the parameter correlations that lead to non-identifiability. So, if the model is not identifiable, the parameter combinations obtained by the differential algebra approach can be used to rescale the model into a structurally identifiable one (Tuncer et al. 2016). Another advantage of the differential algebra approach is the existence of a software package, “Differential Algebra for Identifiability of SYstems (DAISY)”, implemented in REDUCE and introduced by Bellu et al. (2007). However, the software does not perform well for large dynamical systems, as pointed out in Chis et al. (2011). We ran DAISY for the model \(M_1\) with a mass action incidence term. DAISY could not finish the computations needed to produce an input–output equation where the output is the cumulative number of cases, and reported computational errors due to lack of memory. The computation of the input–output equation does not depend on which parameters are fixed. We believe that the model \(M_1\), with only 5 state variables, should not be considered a large dynamical system. In an era of connecting epidemiological models with time-series data, it is essential to develop computer packages (implemented in MATHEMATICA or MAPLE) that carry out structural identifiability analysis of epidemiological models using the differential algebra approach.

In this study, we use the Identifiability Analysis package in MATHEMATICA to test for the local identifiability of the epidemiological models of Zika, \(M_1\) through \(M_6\). This implementation is based on a probabilistic numerical method of computing the rank of the identifiability (Jacobian) matrix (11), in which the model parameters and initial state variables are specialized to random integers. We briefly describe the method here; for more detailed information, we refer to Karlson et al. (2012). Let \(y(t) = g({\varvec{x}}(t),{\varvec{p}})\) denote the observations that generated the data and \({\varvec{p}}\) be the set of model parameters. The power series expansion of the observations y(t) at the initial time \(t=0\) is given as

$$\begin{aligned} y(t) = y(0) + y'(0)t + y''(0) \displaystyle \frac{t^2}{2} + \ldots + y^{\nu }(0)\displaystyle \frac{t^\nu }{\nu !} + \mathcal {O}(t^{\nu +1}), \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} y (0)&= g({\varvec{x}}(0),{\varvec{p}}) \\ y'(0)&= \displaystyle \frac{\partial g({\varvec{x}}(0),{\varvec{p}})}{\partial {\varvec{x}}}\displaystyle \frac{\partial {\varvec{x}}}{\partial t} = \mathcal {L}^1_f g({\varvec{x}}(0),{\varvec{p}})\\ y''(0)&= \mathcal {L}^2_f g({\varvec{x}}(0),{\varvec{p}})\\ \vdots \\ y^{\nu }(0)&= \mathcal {L}^\nu _f g({\varvec{x}}(0),{\varvec{p}}) \end{aligned} \end{aligned}$$
(10)

with \(\mathcal {L}_f\) denoting the Lie-derivative along the vector field f. Setting \(\mathcal {Y} = (y(0), y'(0), y''(0), \cdots ,y^\nu (0))^T\), (10) can be written in compact form, \(\mathcal {Y} = \mathcal {Y}({\varvec{x}}(0), {\varvec{p}})\). The inverse function theorem states that the equation \(\mathcal {Y} = \mathcal {Y}({\varvec{x}}(0), {\varvec{p}})\) can be locally uniquely solved for \({\varvec{x}}(0)\) and \({\varvec{p}}\) if and only if the Jacobian matrix

$$\begin{aligned} J({\varvec{x}}(0), {\varvec{p}}) = \displaystyle \frac{\partial \mathcal {Y}({\varvec{x}}(0), {\varvec{p}})}{\partial ({\varvec{x}}(0), {\varvec{p}})} \end{aligned}$$
(11)

has full rank. The Identifiability Analysis package determines the rank of the matrix (11) by assigning random integers to model parameters and initial state variables. The method rests on two assumptions. First, if a model is locally identifiable at times close to 0, then the identifiability carries over to all times. Second, the probability of obtaining the actual rank of the matrix (11) by assigning random integers to initial state variables and model parameters is high. We are mainly interested in estimating epidemiologically important parameters such as transmission and recovery rates in models \(M_1\) through \(M_6\). Thus, parameters such as recruitment rates and natural death rates are fixed when estimating the other parameters. The full list of fixed parameters is given in Table 4. Identifiability Analysis states that the parameters of the models \(M_1\) through \(M_4\) are locally identifiable from the cumulative incidence observations, but the parameters of the model \(M_5\) cannot be obtained from the cumulative incidence observations only. For the models with pregnant classes, we therefore use two data sets, cumulative incidences and infected pregnant women, and then the Identifiability Analysis states that the models \(M_5\) and \(M _6\) are locally identifiable. Structural identifiability is necessary but not sufficient for the identifiability of the parameter estimation problem. A model that is structurally identifiable may not be identifiable in practice when real data with noise are considered. Moreover, the Identifiability Analysis tool is a numerical algorithm that determines identifiability at the initial time and computes the rank of the Jacobian matrix at randomly chosen values. Hence, we further investigate the identifiability of the models \(M_1\) through \(M_6\) by Monte Carlo simulations.
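To make the rank test concrete, consider a small worked example. The system below is a toy problem chosen only for illustration and is not one of the models \(M_1\) through \(M_6\): let \(x' = -abx\), \(x(0)=x_0\), with output \(y = x\) and unknowns \((x_0, a, b)\). The Lie derivatives give \(y^{(k)}(0) = (-ab)^k x_0\), so for \(\nu = 2\) the Jacobian matrix (11) is

$$\begin{aligned} J(x_0, a, b) = \begin{pmatrix} 1 &{} 0 &{} 0 \\ -ab &{} -b x_0 &{} -a x_0 \\ a^2 b^2 &{} 2 a b^2 x_0 &{} 2 a^2 b x_0 \end{pmatrix}. \end{aligned}$$

The third column equals \(a/b\) times the second column, so the rank is 2 rather than 3, and the toy model is not even locally identifiable. Every additional derivative contributes a row whose last two entries are in the same ratio \(a/b\), so increasing \(\nu \) does not help. The rank deficiency reflects the fact that only \(x_0\) and the product ab can be recovered from y(t), which is the kind of parameter correlation the analysis is meant to detect.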

4 Fitting the Epidemiological Models of Zika to Data

The observations (cumulative incidences and infected pregnant women) \(\{y_i\}_{i=1}^n\) are obtained at discrete time points \(t_1\,,t_2\,,\ldots t_n\,\) of the output function \(g({\varvec{x}}(t),{\varvec{p}})\). We define the statistical model by following the definition in Banks et al. (2014) as,

$$\begin{aligned} y_i = g({\varvec{x}}(t_i),\hat{{\varvec{p}}}) + E_i, \end{aligned}$$
(12)

where \(\hat{{\varvec{p}}}\) denotes the true parameters that generate the observations \(\{y_i\}_{i=1}^n\) and \(E_i\) are random variables that represent the observation or measurement error, which causes the observations not to fall exactly on the points \(g({\varvec{x}}(t_i),\hat{{\varvec{p}}})\) of the smooth path \(g({\varvec{x}}(t),\hat{{\varvec{p}}}).\) In a general setting, the measurement errors are assumed to have the following form,

$$\begin{aligned} E_i = g({\varvec{x}}(t_i),\hat{{\varvec{p}}})^{\xi } \epsilon _i, \end{aligned}$$
(13)

where \(\xi \ge 0\) and \(\epsilon _i\) are independent and identically distributed with mean zero and constant variance \(\sigma _0^2\). The random variables \(y_i\) have mean \(\mathbb {E}(y_i) =g({\varvec{x}}(t_i),\hat{{\varvec{p}}})\) and variances \(Var(y_i) = g({\varvec{x}}(t_i),\hat{{\varvec{p}}})^{2 \xi } \sigma _0^2\). Varying \(\xi \) allows for varying error scales in the measurements. We use the relative error model, that is, \(\xi =1\) in (13), and use ordinary least squares in the parameter estimation problem. For the parameter estimation problem, we suppose that the Zika outbreak in Florida is exactly described by one of the deterministic models \(M_1\) through \(M_6\), that is, there is no modeling error and the expected value of the random variables \(\epsilon _i\) is zero, hence \( \mathbb {E}(\epsilon _i) =0.\)

The parameter estimation problem in the sense of least squares is to estimate the “true” parameter \(\hat{{\varvec{p}}}\) by solving the following optimization problem

$$\begin{aligned} \hat{{\varvec{p}}} = \displaystyle \mathop {\mathrm{arg\,min}}_{{\varvec{p}}} \sum _{i=1}^n\left( y_i - g({\varvec{x}}(t_i),{\varvec{p}})\right) ^2\,. \end{aligned}$$
(14)
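As a worked special case of (14), suppose the output is linear in a single parameter, \(g({\varvec{x}}(t_i),p) = p\,h_i\) with known values \(h_i\). This is an assumption made purely for illustration; the Zika models are nonlinear in their parameters and require numerical optimization. Setting the derivative of the objective with respect to p to zero gives

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d} p} \sum _{i=1}^n \left( y_i - p\,h_i\right) ^2 = -2 \sum _{i=1}^n h_i \left( y_i - p\,h_i\right) = 0 \quad \Longrightarrow \quad \hat{p} = \frac{\sum _{i=1}^n h_i y_i}{\sum _{i=1}^n h_i^2}\,. \end{aligned}$$

In this special case the minimizer is unique whenever at least one \(h_i\) is nonzero. For nonlinear models no such closed form is available, and the minimizer found numerically may depend on the initial guess and on the bounds imposed on the parameters.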

4.1 Data and Parameter Values

As of November 23, 2016, there were a total of 4444 cases of Zika in the US, 182 of which were locally acquired. All of the locally acquired cases were acquired in Florida, where the vectors transmitting Zika, Aedes aegypti and Aedes albopictus, can be found (see footnote 1). We obtained time-series data of cumulative incidences from the Florida Department of Health Daily Zika Update Reports (footnote 8). The first locally acquired Zika case was observed on July 19, 2016 (see footnote 8). We use the time-series data of locally acquired Zika cases from July 19, 2016, to September 29, 2016. The Daily Zika Update Reports do not include locally acquired cases among pregnant women; the reports only consider travel-related cases among pregnant women. Through email communications with the Florida Department of Health, we obtained the number of pregnant women who acquired Zika in Florida. Based on the identifiability results of the previous section, we fit the models in the framework to the data with the following goals: (1) to select the best model representing the data in Florida; (2) to estimate the reproduction number of Zika in Florida; (3) to make short-term projections about the epidemic in Florida and its impact on pregnant women.

Florida’s population is currently about 20 million people, and life expectancy in the USA is 79 years. We take \(\mu =1/(79*365)\) days\(^{-1}\) and \(\Lambda = 20000000*\mu \) people per day. Female mosquitoes, which bite and transmit the disease, live in captivity up to 30 days, but in the wild they often do not survive longer than 2 weeks (footnote 9). We take \(\mu _v =1/10\) days\(^{-1}\) and we set \(\Lambda _v=\mu _v\) to work with proportions of mosquitoes, rather than mosquito numbers. On average, about 2% of the entire population of the USA is pregnant at any time (footnote 10), which gives a value for \(\xi = \displaystyle \frac{0.02}{0.98}\mu \) (footnote 11). Various sources report that Zika symptoms last for 2–7 days (footnote 12). That is also the typical duration of viremia (Dudley et al. 2016) in non-pregnant individuals. Viremia in pregnant individuals can last 40–50 days and up to 10 weeks (Driggers et al. 2016).
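The arithmetic behind these fixed demographic parameters can be checked with a few lines of Python. This is only a sketch of the calculation described above; the variable names are chosen for this illustration.

```python
# Arithmetic behind the fixed demographic parameters quoted in the text.
LIFE_EXPECTANCY_YEARS = 79
POPULATION = 20_000_000
PREGNANT_FRACTION = 0.02

mu = 1.0 / (LIFE_EXPECTANCY_YEARS * 365)   # human natural death rate, per day
Lambda = POPULATION * mu                    # human recruitment, people per day
mu_v = 1.0 / 10                             # mosquito death rate, per day
Lambda_v = mu_v                             # gives N_v = Lambda_v / mu_v = 1
xi = PREGNANT_FRACTION / (1.0 - PREGNANT_FRACTION) * mu   # per day

print(f"mu       = {mu:.4e} per day")
print(f"Lambda   = {Lambda:.1f} people per day")
print(f"xi       = {xi:.4e} per day")
print(f"Lambda/mu = {Lambda / mu:.0f} people, Lambda_v/mu_v = {Lambda_v / mu_v:.0f}")
```

With these choices the disease-free human population \(\Lambda /\mu \) equals the assumed 20 million, and the disease-free mosquito population \(\Lambda _v/\mu _v\) equals 1, so the mosquito variables can be read as proportions.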

Viremia studies in monkeys suggest that viremia levels in symptomatic individuals are not that different from those in asymptomatic individuals, so we assume that \(q\approx 1\). We also surmise that in direct transmission asymptomatic individuals may be more infectious than symptomatic ones, as they may not know that they are sick.

Table 4 shows a list of fixed parameters and their ranges.

Table 4 Fixed parameters in models (1), (4), (5), (6), (7), (8)

4.2 Fitting the Models to the Data

We fit models (1), (4), (5), (6), the models with no pregnant classes, to the cumulative local Zika infections in Florida from July 19 to September 29, 2016. For the optimization, we use MATLAB's fminsearchbnd with both lower and upper bounds on the fitted parameters. We fit repeatedly until the error no longer decreases and the algorithm terminates because the optimization tolerances have been reached. We observe that the recovery rates typically fit at the lower bound. We fit model (7), the model with pregnant classes, to the cumulative number of local cases and the cumulative number of local pregnant Zika cases. We use a similar fitting approach as with the other models; however, we only impose lower bounds (that is, we assume the upper bound for all parameters is infinity). In this case, the recovery rates also fit at the lower bound. To avoid fitting \(\gamma _p\) at a very low value, we take a lower bound of 1/50. Only epidemiological parameters such as transmission and recovery rates are estimated in all models, and the list of fitted parameters with their values is given in Table 5. The fitted parameters are of the same order of magnitude across models when the models are grouped according to whether or not they have an asymptomatic class.

Table 5 Fitted parameters in models (1), (4), (5), (6), (7), (8). Models \(M_2\) and \(M_4\) have asymptomatic classes, whereas the models \(M_1\), \(M_3\), \(M_5\) and \(M_6\) do not have asymptomatic classes

4.3 Practical Identifiability Analysis of Zika Models

To further analyze the identifiability of the models, we perform Monte Carlo simulations which have been widely used for practical identifiability of ODE models (Miao et al. 2011). We generate 1000 synthetic data sets using the true parameter set \(\hat{{\varvec{p}}}\) and adding noise at increasing levels. The true parameter set \(\hat{{\varvec{p}}}\) for each model is obtained through fitting, and the results are given in Table 5. We outline the Monte Carlo simulations in the following steps.

  1. (1.)

    Solve the epidemiological model (\(M_1\) through \(M_6\)) numerically with the true parameters \(\hat{{\varvec{p}}}\) and obtain the output vector \({\varvec{g}} ({\varvec{x}}(t), \hat{{\varvec{p}}})\) at the discrete data time points \(\{t_i\}_{i=1}^n\,.\)

  2. (2.)

    Generate \(M = 1000\) data sets from the statistical model (12) with a given measurement error. Data sets are drawn from a normal distribution whose mean is the output vector obtained in step (1.) and whose standard deviation is \(\sigma _0 \%\) of the mean. That is, we set \(\xi =1\) in the error structure given in (12):

    $$\begin{aligned} y_i = g({\varvec{x}}(t_i),\hat{{\varvec{p}}}) + g({\varvec{x}}(t_i),\hat{{\varvec{p}}}) \epsilon _i \quad i=1,2,\ldots ,n, \end{aligned}$$

    where \(\mathbb {E}(\epsilon _i) =0\) and \(\text {Var}(\epsilon _i) =\sigma _0^2\). Hence, the random variables \(y_i\) have mean \(\mathbb {E}(y_i) =g({\varvec{x}}(t_i),\hat{{\varvec{p}}})\) and variances \(\text {Var}(y_i) = g({\varvec{x}}(t_i),\hat{{\varvec{p}}})^{2} \sigma _0^2\).

  3. (3.)

    Fit the epidemiological model \( {\varvec{x}}' = f({\varvec{x}}, t, {\varvec{p}}), \quad {\varvec{x}}(0)= {\varvec{x}}_0\) to each of the M simulated data sets to estimate the parameter set \({{\varvec{p}}}_j\) for \(j=1,2,\ldots , M\). That is,

    $$\begin{aligned} {\varvec{p}}_j = \displaystyle \min _{{\varvec{p}}} \sum _{i=1}^n\left( y_i - g({\varvec{x}}(t_i),{\varvec{p}})\right) ^2\,, \quad j=1,2,\ldots ,M\,. \end{aligned}$$
  4. (4.)

    Calculate the average relative estimation error for each parameter in the set \({\varvec{p}}\), following Miao et al. (2011):

    $$\begin{aligned} ARE(p^{(k)}) = 100\% \displaystyle \frac{1}{M} \displaystyle \sum _{j=1}^M \displaystyle \frac{| \hat{p}^{(k)} - p_j^{(k)}| }{\hat{p}^{(k)}}, \end{aligned}$$

    where \(p^{(k)}\) is the \(k^{th}\) parameter in the set \({\varvec{p}}\), \(\hat{p}^{(k)}\) is the \(k^{th}\) parameter in the true parameter set \(\hat{{\varvec{p}}}\) and \(p_j^{(k)}\) is the \(k^{th}\) parameter in the set \({\varvec{p}}_j\).

  5. (5.)

    Repeat steps 1 through 4 with increasing levels of noise, that is, take \(\sigma _0 = 0, 1, 5, 10, 20, 30\%\).

We perform Monte Carlo simulations by generating 1000 random data sets for each measurement error level and fitting the epidemiological model to each data set. We then compute the average relative estimation error (ARE) for each parameter in the epidemiological model, which gives insight into the practical identifiability of the parameters. When \(\sigma _0=0\), that is, when there is no noise in the data, the ARE of the parameters of a structurally (globally) identifiable model should be 0 or very close to 0. As the noise level in the data increases, the ARE of the model parameters increases as well. Some parameters are very sensitive to noise in the data, so that even a reasonable level of measurement error results in a significantly high ARE. To be specific, if the ARE of a parameter is higher than the measurement error \(\sigma _0\), then we say that the parameter is practically unidentifiable.
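The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used to produce Table 6: the one-parameter exponential curve `model_output`, the golden-section fit `fit_rate`, the initial value `c0`, the observation days and the true rate 0.05 are all hypothetical choices made so that the example is self-contained, and a scalar search stands in for the fminsearchbnd fits of the full ODE models.

```python
import math
import random


def model_output(rate, times, c0=10.0):
    """Toy stand-in for g(x(t), p): cumulative cases c0 * exp(rate * t)."""
    return [c0 * math.exp(rate * t) for t in times]


def fit_rate(data, times, lo=0.0, hi=1.0, tol=1e-9):
    """Least-squares estimate of the rate by golden-section search on [lo, hi].

    Assumes the sum of squared errors is unimodal on the search interval.
    """
    def sse(rate):
        return sum((y - g) ** 2 for y, g in zip(data, model_output(rate, times)))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


def average_relative_error(true_rate, times, sigma0, n_sets=100, seed=0):
    """ARE (in percent) of the fitted rate over n_sets noisy data sets."""
    rng = random.Random(seed)
    clean = model_output(true_rate, times)            # step (1.)
    total = 0.0
    for _ in range(n_sets):
        # step (2.): y_i = g_i + g_i * eps_i with eps_i ~ N(0, sigma0^2)
        noisy = [g + g * rng.gauss(0.0, sigma0) for g in clean]
        fitted = fit_rate(noisy, times)                # step (3.)
        total += abs(true_rate - fitted) / true_rate
    return 100.0 * total / n_sets                      # step (4.)


times = list(range(0, 73, 3))  # hypothetical observation days
for sigma0 in (0.0, 0.01, 0.05, 0.10, 0.20, 0.30):     # step (5.)
    print(sigma0, average_relative_error(0.05, times, sigma0))
```

Comparing each printed ARE with the corresponding noise level (in percent) applies the criterion stated above to this toy parameter.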

Table 6 Monte Carlo simulations: average relative estimation error (ARE) for parameters of the models \(M_1\) through \(M_6\)

The average relative errors computed from the first Monte Carlo simulations are presented in Table 6. As we see from Table 6, only the transmission rate from an infected mosquito to a susceptible human (\(\beta _v\)) is practically identifiable in the models with vector transmission only, that is, in models \(M_1\) and \(M_2\). When direct transmission is added to the models with vector transmission, the identifiability of \(\beta _v\) is lost, and none of the parameters in models \(M_3\) and \(M_4\) are identifiable. The average relative errors of the direct transmission rate (\(\beta _d\)) in the models with direct transmission (\(M_3\), \(M_4\) and \(M_5\)) are significantly higher than those of all other parameters. We therefore conclude that the direct transmission rate is not practically identifiable from time-series data of cumulative incidences, and the uncertainties in its estimates are very high in models \(M_3\), \(M_4\) and \(M_5\) (see Table 6). The uncertainties in the parameter estimates decrease if data related to other state variables are used in the fitting. Even though model \(M_5\) includes direct transmission, the AREs of its parameters are lower than those of model \(M_3\) because we fit \(M_5\) to both cumulative incidences and Zika-infected pregnant cases. (Both models have direct and vector transmission; they differ only in the pregnant classes.) The direct transmission rates \(\beta _d\) and \(\beta _{dp}\) in model \(M_5\) are nevertheless not identifiable. Comparing all the models, we see that the parameters of model \(M_6\) have the lowest AREs. For model \(M_6\), \(\beta \) has AREs up to \(50\%\) higher than \(\sigma _0\), but this is small in comparison with an unidentifiable parameter such as \(\beta _d\) in model \(M_5\), whose ARE reaches \(2.9\times 10^5\). We conclude that the ARE of \(\beta \) is in a reasonable range and that all other parameters of model \(M_6\) are practically identifiable from cumulative incidence and infected pregnant cases (see Table 6).

Table 7 Monte Carlo simulations: average relative estimation error (ARE) for parameters of the models \(M_1\) through \(M_6\)

A standard approach to improving the identifiability of parameters, when the parameters of a model are not identifiable, is to fix some parameters to previously known values, especially when no other state variables are measurable. Since the recovery rate of Zika infections can be obtained from other sources, we perform a second set of Monte Carlo simulations with the recovery rates fixed to the fitted values. The AREs of the parameters computed in this second Monte Carlo experiment are presented in Table 7. Fixing the recovery rates decreases the AREs of all parameters in models \(M_1, M_2, M_3\) and \(M_4\), that is, in the fittings where we could only fit cumulative incidences. It also decreases the AREs of the unidentifiable parameters, i.e., the direct transmission rates, in all models. However, the direct transmission rates remain unidentifiable even after fixing the recovery rates.

The ultimate goal in estimating the parameters of an epidemiological model is to estimate the basic reproduction number of the infection. Since we observe that fixing recovery rates decreases the uncertainties in parameter estimation, we would like to see whether this is still true in estimating the basic reproduction number. We therefore perform the following Monte Carlo simulations.

  1. (1.)

    Generate \(M=2000\) recovery rates \((\gamma ,\gamma _A)\) from a normal distribution whose mean is the fitted value of the recovery rate, that is, \(\mu _{\gamma } = 0.1\), and standard deviation is \(\sigma _{\gamma } = 0.1\), that is,

    $$\begin{aligned} \gamma _j \sim \mathcal {N}(\mu _{\gamma }, \sigma _{\gamma }) = \mathcal {N}(0.1, 0.1) \quad j =1, 2, \ldots , M. \end{aligned}$$

    Since the recovery period ranges between 2 and 15 days, we move to the next step only if \(\gamma _j \ge 0.07\) (approximately 1/15) for each j. If a randomly chosen recovery rate is less than 0.07, then another recovery rate is drawn from the normal distribution \(\mathcal {N}(0.1, 0.1)\).

    Similarly, generate \(M=2000\) recovery rates \((\gamma _p)\) from a normal distribution whose mean is the fitted value of the recovery rate, that is, \(\mu _{\gamma _p} = 0.02\), and standard deviation is \(\sigma _{\gamma _p} = 0.005\), that is,

    $$\begin{aligned} \gamma _{pj} \sim \mathcal {N}(\mu _{\gamma _p}, \sigma _{\gamma _p}) = \mathcal {N}(0.02, 0.005) \quad j =1, 2, \ldots , M. \end{aligned}$$

    A recovery rate for pregnant women drawn from \(\mathcal {N}(0.02, 0.005)\) puts the recovery period roughly in the range of 40–70 days.

  2. (2.)

    Fix recovery rate(s) in the epidemiological model

    $$\begin{aligned} {\varvec{x}}' = f({\varvec{x}}, t, {\varvec{p}}) \quad {\varvec{x}}(0)= {\varvec{x}}_0 \end{aligned}$$

    to the randomly chosen value \(\gamma _j\) from step (1.), \(j=1, 2, \ldots , M\).

  3. (3.)

    For models \(M_1,\, M_2,\,M_3\) and \(M_4\) fit the epidemiological model \( {\varvec{x}}' = f({\varvec{x}}, t, {\varvec{p}}), \; {\varvec{x}}(0)= {\varvec{x}}_0\) to the observed cumulative Florida cases to estimate the rest of the parameters in \({{\varvec{p}}}_j\). That is,

    $$\begin{aligned} {{\varvec{p}}}_j = \displaystyle \min _{{\varvec{p}}} \sum _{i=1}^n\left( y_i - g({\varvec{x}}(t_i),{\varvec{p}})\right) ^2\,\quad j =1, 2, \ldots , M\,. \end{aligned}$$

    Estimate the parameters of the models \(M_5\) and \(M_6\) by fitting to cumulative Florida cases and pregnant infections.

  4. (4.)

    Compute the basic reproduction number, \(\mathcal R_{0i}^j\) using \(\gamma _j\) and \({{\varvec{p}}}_j\) for each j.

  5. (5.)

    Compute the average relative error in basic reproduction number,

    $$\begin{aligned} ARE(\mathcal R_{0i}) = 100\% \displaystyle \frac{1}{M} \displaystyle \sum _{j=1}^M \displaystyle \frac{| \mathcal R_{0i}^j - \mathcal R_{0i}| }{\mathcal R_{0i}}, \end{aligned}$$

    where \(\mathcal R_{0i}\) is the fitted value obtained in Table 5.
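Two ingredients of this procedure, the redraw rule in step (1.) and the error formula in step (5.), can be sketched as follows. This is only an illustrative fragment: the function names `sample_recovery_rate` and `are_percent` and the seed are hypothetical, and the refitting in steps (2.) through (4.) is not reproduced here.

```python
import random


def sample_recovery_rate(rng, mean=0.1, sd=0.1, lower=0.07):
    """Draw from N(mean, sd) and redraw until the value is at least `lower`."""
    while True:
        gamma = rng.gauss(mean, sd)
        if gamma >= lower:
            return gamma


def are_percent(values, reference):
    """Average relative error (in percent) of `values` about `reference`."""
    return 100.0 * sum(abs(v - reference) for v in values) / (len(values) * reference)


rng = random.Random(1)  # hypothetical seed, for reproducibility only
gammas = [sample_recovery_rate(rng) for _ in range(2000)]
print(min(gammas), sum(gammas) / len(gammas))
```

Because draws below 0.07 are discarded, the accepted values follow a truncated normal distribution, so their sample mean lies above the nominal mean 0.1.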

Performing this third set of Monte Carlo simulations for model \(M_1\), we obtain the following average relative errors (see Table 8):

$$\begin{aligned} ARE(\mathcal R_{01}) = 6.7\% \quad ARE(\beta ) = 92.8\% \quad ARE(\beta _v) = 16.0\%\,. \end{aligned}$$

That is, even though the average relative error of the transmission rate \(\beta \) is very high (the estimates for \(\beta \) range from 2000 to 20000), the average relative error in the computation of the basic reproduction number \(\mathcal R_{01}\) is low. The computed basic reproduction number \(\mathcal R_{01}\) ranges between 1.25 and 1.6. Based on these Monte Carlo simulation results, we conclude that fixing the recovery period to any value in the range 2–15 days results in large variations in the estimates of the transmission rate \(\beta \) from an infected individual to a susceptible mosquito, but the computed reproduction number does not vary greatly.

Table 8 Monte Carlo simulations: average relative estimation error (ARE) for basic reproduction numbers and the parameters of the models \(M_1{-}M_6\) when recovery rates are randomly chosen from the range given in the literature

4.4 Elasticity of the Reproduction Numbers

In this section, we investigate the elasticities of the reproduction numbers. The formulas for the reproduction numbers are computed in the next section. Elasticities of \(\mathcal R_{0i}\) are shown in Fig. 1.

The elasticity of quantity Q with respect to parameter p is given by

$$\begin{aligned} \mathcal E_p^{Q} = \frac{\partial Q}{\partial p} \frac{p}{Q}. \end{aligned}$$

The elasticity gives the percentage change in the quantity Q in response to a 1% increase in the parameter p. When \(\mathcal E_p^{Q}>0\), Q increases with p; when \(\mathcal E_p^{Q}<0\), Q decreases as p increases. Because elasticities are scaled by the sizes of the quantity and the parameter, they allow us to compare the sensitivity of the quantity to parameters of different magnitudes.

Looking through the panels in Fig. 1, we can observe common trends. (1) All reproduction numbers are most sensitive to the parameters \(\beta \) (transmission from infected humans to susceptible mosquitoes), \(\mu _v\) (death rate of mosquitoes) and m (ratio of mosquitoes to humans). The elasticities of \(\mathcal R_{0i}\) with respect to these parameters are approximately 1% in magnitude, that is, a 1% change in the parameter results in a 1% change in \(\mathcal R_{0i}\). This observation suggests that control measures targeted toward decreasing the mosquito/human ratio, decreasing the mosquito life span and decreasing the biting rate are most effective in reducing the reproduction number of Zika. (2) All reproduction numbers that depend on direct transmission \(\beta _d\) (\(\beta _{d_p}\)) show very small sensitivities to the direct transmission parameters. For instance, the elasticity of \(\mathcal R_{03}\) with respect to \(\beta _d\) is \(6.4\times 10^{-6}\)%, which is negligible. We surmise that the low sensitivity of the reproduction numbers to the direct transmission parameters is due to the very small values of these parameters. On the other hand, these low elasticities explain why, even though we cannot identify the direct transmission parameters from the given data, the estimates of the reproduction numbers are still quite reliable. The small sensitivities of the reproduction numbers with respect to the direct transmission parameters imply that control measures targeted at direct transmission have little population-level effect.
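As a worked check of these trends in the simplest case, one can apply the definition of elasticity to the expression \(\mathcal R_{01} = \beta \beta _v m/(\mu _v(\mu +\gamma ))\) derived in Sect. 5.1, treating m as an independent parameter. Differentiation gives

$$\begin{aligned} \mathcal E_{\beta }^{\mathcal R_{01}} = \mathcal E_{\beta _v}^{\mathcal R_{01}} = \mathcal E_{m}^{\mathcal R_{01}} = 1, \qquad \mathcal E_{\mu _v}^{\mathcal R_{01}} = -1, \qquad \mathcal E_{\gamma }^{\mathcal R_{01}} = -\frac{\gamma }{\mu +\gamma }. \end{aligned}$$

Similarly, for \(\mathcal R_{03} = \beta _d/(\mu +\gamma ) + \beta \beta _v m/(\mu _v(\mu +\gamma ))\) one obtains

$$\begin{aligned} \mathcal E_{\beta _d}^{\mathcal R_{03}} = \frac{1}{\mathcal R_{03}}\,\frac{\beta _d}{\mu +\gamma }, \qquad \mathcal E_{\beta }^{\mathcal R_{03}} = 1 - \mathcal E_{\beta _d}^{\mathcal R_{03}}, \end{aligned}$$

so the elasticity with respect to \(\beta _d\) is exactly the fraction of \(\mathcal R_{03}\) contributed by direct transmission, which is small whenever \(\beta _d\) is small relative to the vector-borne term.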

Figure 1 also suggests that the elasticities of \(\mathcal R_{01}\) and \(\mathcal R_{03}\) are quite similar; the elasticities of \(\mathcal R_{02}\) and \(\mathcal R_{04}\) are quite similar and the elasticities of \(\mathcal R_{05}\) and \(\mathcal R_{06}\) are quite similar. We continue by discussing more carefully the elasticities of \(\mathcal R_{04}\) and \(\mathcal R_{05}\).

Figure 1 panel (d) shows the elasticities of \(\mathcal R_{04}\). One immediate observation is that \(\mathcal R_{04}\) is most sensitive to \(\beta \), \(\beta _v\), \(\mu _v\), m and q. These are all parameters that govern the vector-borne transmission of Zika. Reducing \(\beta \) or \(\beta _v\) by \(1\%\) will reduce \(\mathcal R_{04}\) by approximately \(0.9975\%\). The effect of the vector mortality rate and of the ratio of mosquitoes to humans is similar: increasing \(\mu _v\) or reducing m by 1% will decrease \(\mathcal R_{04}\) by approximately 0.9975%. This suggests that control measures that reduce the vector life span or the vector-to-human ratio are among the most efficient. On the other hand, the reproduction number depends very little on the direct transmission coefficient \(\beta _d\). Reducing \(\beta _d\) by \(1\%\) will reduce \(\mathcal R_{04}\) by only \( 0.00245\%\). One surprising observation is that the proportion of symptomatic/asymptomatic individuals \(\phi \) has little impact on the reproduction number \(\mathcal R_{04}\). That effect may be a result of the fact that \(\gamma \approx \gamma _A\), causing the terms multiplied by \(\phi \) to cancel out. Conversely, since \((1-\phi ) \gg \phi \), the sensitivity to \(\gamma _A\) is much larger than the sensitivity to \(\gamma \). This suggests that when potential treatment is available, treating asymptomatic individuals is at least as important as treating symptomatic individuals.
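Elasticities of this kind can also be approximated numerically. The sketch below estimates them for \(\mathcal R_{04}\), as given in Sect. 5.4, by central finite differences. The parameter values in `params` are hypothetical placeholders chosen for illustration only; they are not the fitted values of Table 5, so the printed numbers will differ from those quoted in the text.

```python
def r04(p):
    """Reproduction number R_04 (formula from Sect. 5.4)."""
    symptomatic = p["phi"] / (p["mu"] + p["gamma"])
    vector = p["beta"] * p["m"] * p["beta_v"] / p["mu_v"] * (
        symptomatic + (1 - p["phi"]) * p["q"] / (p["mu"] + p["gamma_A"]))
    direct = p["beta_d"] * (
        symptomatic + (1 - p["phi"]) * p["q_A"] / (p["mu"] + p["gamma_A"]))
    return vector + direct


def elasticity(func, params, name, rel_step=1e-6):
    """Central-difference estimate of (dQ/dp) * p / Q for parameter `name`."""
    up, down = dict(params), dict(params)
    h = params[name] * rel_step
    up[name] += h
    down[name] -= h
    return (func(up) - func(down)) / (2 * h) * params[name] / func(params)


# Hypothetical parameter values (assumptions, not taken from the paper).
params = {
    "mu": 1 / (75 * 365), "gamma": 0.1, "gamma_A": 0.1, "phi": 0.2,
    "q": 0.5, "q_A": 0.5, "beta": 0.3, "beta_v": 0.2, "m": 2.0,
    "mu_v": 0.1, "beta_d": 0.001,
}
for name in params:
    print(name, elasticity(r04, params, name))
```

Since \(\mathcal R_{04}\) is a sum of a term proportional to \(\beta \) and a term proportional to \(\beta _d\), the two corresponding elasticities add up to 1, which is consistent with the values 0.9975% and 0.00245% reported above.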

Fig. 1 (Color figure online) Elasticities of basic reproduction numbers of models \(M_1\) through \(M_6\). a Elasticities of \(\mathcal R_{01}\), b elasticities of \(\mathcal R_{02}\), c elasticities of \(\mathcal R_{03}\), d elasticities of \(\mathcal R_{04}\), e elasticities of \(\mathcal R_{05}\), f elasticities of \(\mathcal R_{06}\)

Figure 1 panel (e) shows the elasticities of \(\mathcal R_{05}\). \(\mathcal R_{05}\) is most sensitive to \(\beta \), \(\mu _v\) and m, where a \(1\%\) increase in the parameter leads to a \(1\%\) change in \(\mathcal R_{05}\). As before, parameters related to vector-borne transmission are the most influential, suggesting control measures targeted toward reducing the mosquito life span and the ratio of mosquitoes to humans. Using personal protection to prevent bites is also an important control strategy. The elasticities of \(\mathcal R_{05}\) with respect to the transmission parameters from vector to human are \(0.29\%\) with respect to \(\beta _{v_p}\) and \(0.706\%\) with respect to \(\beta _v\). The elasticities of \(\mathcal R_{05}\) with respect to the recovery rates are \(0.29\%\) with respect to \(\gamma _p\) and \(0.705\%\) with respect to \(\gamma \).

Even though pregnant women are fewer than the general population, transmission to them has a significant impact on the reproduction number. Moreover, the duration of infectiousness of pregnant women, governed by \(\gamma _p\), is a very notable factor influencing the reproduction number. This suggests that there is an urgent need for treatment strategies that reduce the duration of infectiousness in pregnant women; such strategies will not only protect unborn children but also contribute to the reduction in population-level transmission of Zika. Regarding sexual transmission, the elasticity of \(\mathcal R_{05}\) with respect to \(\beta _d\) is \(1.9\times 10^{-10}\%\) and the elasticity of \(\mathcal R_{05}\) with respect to \(\beta _{d_p}\) is \(1.5\times 10^{-7}\%\). The impact of the direct transmission parameters on \(\mathcal R_{05}\) is minimal, perhaps because, although sexual transmission can occur, it is much rarer.Footnote 13

5 Analysis of the Epidemiological Models of Zika Virus Infection

In this section, we provide basic analysis of the models in Sect. 2.

5.1 Analysis of Model (1)

Model \(M_1\) has disease-free equilibrium \({\epsilon }_{M_1}^0 = (S_v^0,0,S^0,0,0) = (\displaystyle \frac{\Lambda _v}{\mu _v},0,\displaystyle \frac{\Lambda }{\mu },0,0)\) which is locally and globally asymptotically stable when \(\mathcal {R}_{01} <1\), where \(\mathcal {R}_{01}\) is the reproduction number given by

$$\begin{aligned} \mathcal {R}_{01} = \displaystyle \frac{\beta \beta _v m}{ \mu _v(\mu +\gamma )}, \end{aligned}$$

where \(m = N_v/N\) is the ratio of mosquitoes to humans and \(N_v = \displaystyle \frac{\Lambda _v}{\mu _v}\), \(N =\displaystyle \frac{\Lambda }{\mu }\). This result is not hard to establish, and we omit the proof. When \( \mathcal {R}_{01} > 1\), the model \(M_1\) has a unique locally stable endemic equilibrium \({\epsilon }_{M_1}^* = (S_v^*, I_v^*, S^*, I^*, R^*)\) given by

$$\begin{aligned} \begin{array}{l} I_v^* = \displaystyle \frac{\beta N_v i^*}{(\beta i^* +\mu _v)}\\ S^* = \displaystyle \frac{\Lambda }{\mu }\left( 1 - \displaystyle \frac{(\mu +\omega +\gamma )i^*}{\mu +\omega }\right) \\ i^* = \displaystyle \frac{\mu _v(\mathcal R_{01}-1)}{\beta K} \qquad K = 1+ \frac{\beta _v m}{(\mu +\gamma )}\frac{\mu +\omega +\gamma }{\mu +\omega }, \end{array} \end{aligned}$$
(15)

where \(i^* = I^*/N\). Substituting \(I^*\) into \(S^*\) and simplifying shows that \(S^*>0\). This equilibrium is also locally and globally stable (Yang et al. 2010), at least in the case \(\omega =0\).
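The formulas in (15) can be checked numerically by substituting them into the right-hand side of the model. The sketch below does this under two explicit assumptions: the equations in `rhs_m1` are an assumed form of model \(M_1\), inferred by dropping the pregnant classes and direct transmission from the recast model in Sect. 5.5 (the original statement of model (1) is not reproduced in this section), and all parameter values are hypothetical.

```python
def endemic_equilibrium_m1(p):
    """Endemic equilibrium of M_1 from Eq. (15), with R* = gamma I* / (mu + omega)."""
    N = p["Lambda"] / p["mu"]
    N_v = p["Lambda_v"] / p["mu_v"]
    m = N_v / N
    r0 = p["beta"] * p["beta_v"] * m / (p["mu_v"] * (p["mu"] + p["gamma"]))
    ratio = (p["mu"] + p["omega"] + p["gamma"]) / (p["mu"] + p["omega"])
    K = 1 + p["beta_v"] * m / (p["mu"] + p["gamma"]) * ratio
    i = p["mu_v"] * (r0 - 1) / (p["beta"] * K)
    return {
        "S": N * (1 - ratio * i),
        "I": i * N,
        "R": p["gamma"] * i * N / (p["mu"] + p["omega"]),
        "I_v": p["beta"] * N_v * i / (p["beta"] * i + p["mu_v"]),
        "R0": r0,
    }


def rhs_m1(state, p):
    """Assumed right-hand side of M_1 for (S, I, R, I_v); S_v = N_v - I_v."""
    N = p["Lambda"] / p["mu"]
    N_v = p["Lambda_v"] / p["mu_v"]
    S, I, R, I_v = state["S"], state["I"], state["R"], state["I_v"]
    force = p["beta_v"] * I_v * S / N
    dS = p["Lambda"] - force - p["mu"] * S + p["omega"] * R
    dI = force - (p["mu"] + p["gamma"]) * I
    dR = p["gamma"] * I - (p["mu"] + p["omega"]) * R
    dIv = p["beta"] * I / N * (N_v - I_v) - p["mu_v"] * I_v
    return dS, dI, dR, dIv


# Hypothetical parameter values (assumptions, not the fitted values).
mu = 1 / (75 * 365)
params = {"mu": mu, "Lambda": 1000 * mu, "mu_v": 0.1, "Lambda_v": 200.0,
          "beta": 0.3, "beta_v": 0.1, "gamma": 0.1, "omega": 0.01}
equilibrium = endemic_equilibrium_m1(params)
residuals = rhs_m1(equilibrium, params)
print(equilibrium, residuals)
```

If the assumed equations match model (1), the residuals vanish up to rounding error, and \(S^*\) is positive for these values.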

5.2 Analysis of Model (4)

Model \(M_2\) has a disease-free equilibrium \({\epsilon }_{M_2}^0 = (S_v^0,0,S^0,0,0,0) = (\displaystyle \frac{\Lambda _v}{\mu _v},0,\displaystyle \frac{\Lambda }{\mu },0,0,0)\) which is locally and globally asymptotically stable when \(\mathcal {R}_{02} <1\), where \(\mathcal {R}_{02}\) is the reproduction number given by

$$\begin{aligned} \mathcal {R}_{02} = \displaystyle \frac{\phi \beta \beta _v m}{ \mu _v(\mu +\gamma )} + \displaystyle \frac{(1-\phi )q\beta \beta _v m}{ \mu _v(\mu +\gamma _A)}. \end{aligned}$$

The reproduction number is the sum of two terms: the first gives the secondary infections generated by symptomatic individuals, and the second gives those generated by asymptomatic individuals. This result is also not hard to establish using the Jacobian approach. In the case \( \mathcal {R}_{02} > 1\), the model \(M_2\) has a unique endemic equilibrium \({\epsilon }_{M_2}^* = (S_v^*, I_v^*, S^*, A^*,I^*, R^*)\). The components of this equilibrium, or their fractions in the total human population s, i, a and r, respectively, are given by:

$$\begin{aligned} \begin{array}{l} A^* = \displaystyle \frac{(1-\phi )(\mu +\gamma )}{(\mu +\gamma _A)\phi } I^* = \kappa I^*,\\ I_v^* = \displaystyle \frac{\beta K_1 N_v i^*}{ (\beta K_1 i^* +\mu _v)},\\ S^* = \displaystyle \frac{\Lambda }{\mu }\left( 1 - K_2 i^*\right) ,\\ i^* = \displaystyle \frac{\mu _v(\mathcal R_{02}-1)}{\beta K_1+ \mu _v \mathcal R_{02}K_2},\\ R^* = \displaystyle \frac{\gamma I^*+\gamma _A A^*}{\mu +\omega }, \end{array} \end{aligned}$$
(16)

where the constants \(K_1\) and \(K_2\) are defined as follows:

$$\begin{aligned} \begin{array}{l} \displaystyle K_1 = 1+\kappa q,\\ \displaystyle K_2 = \frac{\mu +\omega +\gamma }{\mu +\omega }+\kappa \frac{\mu +\omega +\gamma _A}{\mu +\omega }. \end{array} \end{aligned}$$
(17)

Substituting \(I^*\) and \(A^*\) into \(S^*\) and simplifying shows that \(S^*>0\).

5.3 Analysis of Model (5)

Models with vector-borne and direct transmission have been investigated before (Wei et al. 2008). Model \(M_3\) has disease-free equilibrium \({\epsilon }^0_{M_3} = (S_v^0,0,S^0,0,0) = (\displaystyle \frac{\Lambda _v}{\mu _v},0,\displaystyle \frac{\Lambda }{\mu },0,0)\) which is locally and globally asymptotically stable when \(\mathcal {R}_{03} <1\), where

$$\begin{aligned} \mathcal {R}_{03} = \displaystyle \frac{\beta _d}{\mu +\gamma } + \displaystyle \frac{\beta \beta _v m}{\mu _v(\mu +\gamma )}. \end{aligned}$$

When \( \mathcal {R}_{03} > 1\), the model \(M_3\) has a unique locally stable (Wei et al. 2008) endemic equilibrium \({\epsilon }_{M_3}^* = (S_v^*, I_v^*, S^*, I^*, R^*)\) given by,

$$\begin{aligned} I_v^*= & {} \displaystyle \frac{\beta \Lambda _v i^*}{\mu _v (\beta i^* +\mu _v)},\\ S^*= & {} \displaystyle \frac{\Lambda }{\mu } - \displaystyle \frac{(\mu +\omega +\gamma )I^*}{\mu +\omega }. \end{aligned}$$

The equilibrium \(i^*\) can be determined uniquely from

$$\begin{aligned} \left( \displaystyle 1 - \displaystyle \frac{(\mu +\omega +\gamma )i^*}{\mu +\omega }\right) \left( \displaystyle \frac{\beta _d }{\mu +\gamma } + \displaystyle \frac{\beta _v \beta \Lambda _v}{N\mu _v (\mu +\gamma )(\beta i^* + \mu _v)} \right) -1 = 0. \end{aligned}$$

Note that when \(i^* =0\), the left-hand side equals \( \mathcal {R}_{03} -1 >0.\) Once \(i^*\) is obtained, we have \(I^*=\frac{\Lambda }{\mu } i^*\).

5.4 Analysis of Model (6)

Model \(M_4\) has a disease-free equilibrium \({\epsilon }^0_{M_4} = (S_v^0,0,S^0,0,0,0) = (\displaystyle \frac{\Lambda _v}{\mu _v},0,\displaystyle \frac{\Lambda }{\mu },0,0,0)\). The system has three transmitting infectious classes, which leads to a full next-generation matrix. We therefore use the Jacobian instead. The Jacobian has one eigenvalue \(\lambda _1=-(\mu +\omega )\). The remaining eigenvalues are roots of the characteristic polynomial:

$$\begin{aligned} \begin{array}{l} \displaystyle 0= (\mu _v+\lambda )(\lambda +\mu +\gamma )(\lambda +\mu +\gamma _A) -\phi \beta _d (\mu _v+\lambda )(\lambda +\mu +\gamma _A)\\ \displaystyle \qquad -\,(1-\phi )q_A \beta _d (\mu _v+\lambda )(\lambda +\mu +\gamma )- \beta m \phi \beta _v (\lambda +\mu +\gamma _A)\\ \displaystyle \qquad -\,\beta m (1-\phi )q \beta _v (\lambda +\mu +\gamma ). \end{array} \end{aligned}$$
(18)

This polynomial has a positive leading term; whence, if the constant term \(c_0<0\), the equation has a positive root and the disease-free equilibrium is unstable. The condition \(c_0<0\) is satisfied if and only if the reproduction number \(\mathcal R_{04} >1\), where

$$\begin{aligned} \mathcal R_{04} = \frac{\beta m \phi \beta _v}{(\mu +\gamma )\mu _v} + \frac{\beta m (1-\phi )q \beta _v}{(\mu +\gamma _A)\mu _v} + \frac{\phi \beta _d }{\mu +\gamma }+ \frac{(1-\phi )q_A \beta _d }{\mu +\gamma _A}. \end{aligned}$$

That implies that if \(\mathcal R_{04} >1\) the disease-free equilibrium is unstable. If \(\mathcal R_{04} <1\), then we can rewrite the characteristic equation in the form \(\mathcal H(\lambda )=1\) where

$$\begin{aligned} \mathcal H(\lambda )= & {} \frac{\beta m \phi \beta _v}{(\mu +\gamma +\lambda )(\mu _v+\lambda )} + \frac{\beta m (1-\phi )q \beta _v}{(\mu +\gamma _A+\lambda )(\mu _v+\lambda )} \\&+\, \frac{\phi \beta _d }{\mu +\gamma +\lambda }+ \frac{(1-\phi )q_A \beta _d }{\mu +\gamma _A+\lambda }.\end{aligned}$$

Then, for \(\lambda \) with \(\mathfrak {R}\lambda \ge 0\) we have \(|\mathcal H(\lambda )|\le \mathcal H(0)=\mathcal R_{04}<1\). We conclude that if \(\mathcal R_{04} <1\) the disease-free equilibrium is locally asymptotically stable. In interpreting the reproduction number, we notice that the first term gives the secondary infections obtained by vector transmission generated by a single symptomatic individual; the second term gives the secondary infections obtained by vector transmission generated by a single asymptomatic individual; the third term gives the secondary infections obtained by direct transmission generated by a single symptomatic individual; and the last term gives the secondary infections obtained by direct transmission generated by a single asymptomatic individual.

When \( \mathcal {R}_{04} > 1\), the model \(M_4\) has a unique endemic equilibrium \({\epsilon }_{M_4}^* = (S_v^*, I_v^*, S^*, A^*,I^*, R^*)\). Dividing the last two equations, we obtain \(A^*\) in terms of \(I^*\). From the first equation, we obtain \(I_v^*\) in terms of \(i^*\), where \(i^*=I^*/N\), and we express \(S^*\) in terms of \(I^*\) from the equation for the total population size:

$$\begin{aligned} \begin{array}{l} \displaystyle A^* = \kappa I^* = \frac{(1-\phi ) (\mu +\gamma )}{\phi (\mu +\gamma _A)} I^*,\\ \displaystyle I_v^* = \frac{\beta (1+q\kappa ) i^* N_v}{\beta (1+q\kappa ) i^*+\mu _v},\\ \displaystyle S^* =\frac{\Lambda }{\mu }-K_3 I^*, \end{array} \end{aligned}$$
(19)

where

$$\begin{aligned} K_3 = \frac{\omega +\mu +\gamma }{\omega +\mu } + \kappa \frac{\omega +\mu +\gamma _A}{\omega +\mu }. \end{aligned}$$

\(I^*\) is then obtained from the following equation for \(i^*\):

$$\begin{aligned} \frac{\phi }{\mu +\gamma }\left( \frac{\beta m \beta _v(1+q\kappa ) }{\beta (1+q\kappa ) i^*+\mu _v} + \beta _d(1+q_A\kappa )\right) \left( 1-K_3 i^*\right) =1. \end{aligned}$$
(20)

It is not hard to see that if \(F(i^*)\) denotes the left-hand side of the above equation, then \(F(0)=\mathcal R_{04}>1\), while \(F(1/K_3)=0<1\). Hence, the equation has a positive solution in the interval \((0, 1/K_3)\). On the other hand, \(F(i^*)\) is a decreasing function on this interval, and therefore the solution is unique.
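Because the left-hand side of (20) is decreasing and crosses 1 exactly once between \(i^*=0\) and \(i^*=1/K_3\), the endemic fraction can be located by bisection. The sketch below illustrates this; the function names and all parameter values are hypothetical and serve only to make the example runnable.

```python
def endemic_equation(p):
    """Return F, the left-hand side of Eq. (20), together with K_3."""
    kappa = ((1 - p["phi"]) * (p["mu"] + p["gamma"])
             / (p["phi"] * (p["mu"] + p["gamma_A"])))
    k3 = ((p["omega"] + p["mu"] + p["gamma"])
          + kappa * (p["omega"] + p["mu"] + p["gamma_A"])) / (p["omega"] + p["mu"])
    a = p["beta"] * (1 + p["q"] * kappa)

    def F(i):
        vector = p["m"] * p["beta_v"] * a / (a * i + p["mu_v"])
        direct = p["beta_d"] * (1 + p["q_A"] * kappa)
        return p["phi"] / (p["mu"] + p["gamma"]) * (vector + direct) * (1 - k3 * i)

    return F, k3


def solve_by_bisection(F, upper, tol=1e-12):
    """Solve F(i) = 1 on (0, upper) for decreasing F with F(0) > 1 > F(upper)."""
    if F(0.0) <= 1.0:
        raise ValueError("F(0) = R_04 must exceed 1")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) > 1.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies to the left of mid
    return 0.5 * (lo + hi)


# Hypothetical parameter values (assumptions, not the fitted values).
params = {
    "mu": 1 / (75 * 365), "omega": 0.01, "gamma": 0.1, "gamma_A": 0.1,
    "phi": 0.2, "q": 0.5, "q_A": 0.5, "beta": 0.3, "beta_v": 0.2,
    "m": 2.0, "mu_v": 0.1, "beta_d": 0.001,
}
F, k3 = endemic_equation(params)
i_star = solve_by_bisection(F, 1.0 / k3)
print(F(0.0), i_star, F(i_star))
```

The first printed value is \(F(0)=\mathcal R_{04}\) for the chosen parameters; the remaining equilibrium components then follow from (19).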

5.5 Analysis of Model (7)

We recast the model in the following form, which has the same dynamics:

$$\begin{aligned} \mathrm{Model\, 5 \, (M_5)}: {\left\{ \begin{array}{ll} \displaystyle \frac{\mathrm{d} S}{\mathrm{d} t } = \displaystyle \Lambda -\frac{ \beta _v I_v S + \beta _d(I + I_p)S}{N} - (\mu +\xi ) S +\omega R + \xi _p S_p, \\ \displaystyle \frac{\mathrm{d} S_p}{\mathrm{d} t} = \displaystyle \xi S - \frac{\beta _{v_p} I_v S_p + \beta _{d_p} I S_p}{N} -(\mu + \xi _p) S_p ,\\ \displaystyle \frac{\mathrm{d} I}{\mathrm{d} t} = \displaystyle \frac{\beta _v I_v S + \beta _{d}(I + I_p) S}{N} - (\mu +\gamma )I , \\ \displaystyle \frac{\mathrm{d} I_p}{\mathrm{d} t} = \displaystyle \frac{\beta _{v_p} I_v S_p + \beta _{d_p} I S_p}{N} - (\mu +\gamma _p)I_p , \\ \displaystyle \frac{\mathrm{d} R}{\mathrm{d} t} = \displaystyle \gamma I +\gamma _p I_p - (\mu +\omega ) R, \\ \displaystyle \frac{\mathrm{d} I_v}{\mathrm{d} t } =\displaystyle \frac{\beta (I + I_p )}{N} (\frac{\Lambda _v}{\mu _v}-I_v) - \mu _v I_v. \end{array}\right. } \end{aligned}$$
(21)

This model also has a disease-free equilibrium which always exists

$$\begin{aligned} {\epsilon }^0_{M_5} = (S_v^0,0,S^0,S^0_p,0,0,0) = \left( \displaystyle \frac{\Lambda _v}{\mu _v},0,\displaystyle \frac{\Lambda (\mu + \xi _p)}{\mu (\mu + \xi + \xi _p)},\displaystyle \frac{\xi }{\mu + \xi _p} S^0,0 ,0,0\right) . \end{aligned}$$

We will denote by \(s^0\) and \(s_p^0\) the fractions \(S^0/N\) and \(S_p^0/N\) given by

$$\begin{aligned} s^0 = \frac{\mu + \xi _p}{\mu + \xi + \xi _p} \qquad s_p^0 = \frac{\xi }{\mu + \xi + \xi _p}. \end{aligned}$$

Computing the reproduction number via the next-generation approach does not result in a compact closed-form expression because the next-generation matrix is a full three-dimensional matrix. We compute the reproduction number using the Jacobian. Arranging the variables as \((S,S_p,I,I_p,R,I_v)\), the Jacobian of the system has two eigenvalues, \(\lambda _1, \lambda _2\) with negative real parts and \(\lambda _3=-(\mu +\omega )\). The remaining three eigenvalues satisfy the following characteristic equation:

$$\begin{aligned} 0= & {} (\lambda +\mu _v)(\lambda +\mu +\gamma _p)(\lambda +\mu +\gamma ) - \beta _d s^0(\lambda +\mu _v)(\lambda +\mu +\gamma _p)\nonumber \\&-\,\beta _{d_p} s^0_p \beta m\beta _v s^0 -\beta _{d_p} s_p^0\beta _d s^0(\lambda +\mu _v)\nonumber \\&-\,\beta _{v_p}s_p^0\beta m (\lambda +\mu +\gamma ) -\beta _v s^0\beta m(\lambda +\mu +\gamma _p). \end{aligned}$$
(22)

Since the leading term of this equation is positive, the equation has a positive root if the constant term \(c_0<0\). This inequality holds if and only if the reproduction number \(\mathcal R_{05}>1\), where

$$\begin{aligned} \mathcal R_{05}=\frac{\beta _ds^0}{\mu +\gamma } +\frac{ \beta _v s^0\beta m}{\mu _v(\mu +\gamma )}+\frac{ \beta _{v_p} s_p^0\beta m}{\mu _v(\mu +\gamma _p)}+ \frac{\beta _{d_p}s_p^0}{\mu +\gamma }\left( \frac{\beta _d s^0}{\mu +\gamma _p}+\frac{\beta _vs^0\beta m}{\mu _v(\mu +\gamma _p)}\right) \end{aligned}$$

with \(m=N_v/N\). On the other hand, one can show that if \(\mathcal R_{05}<1\), then the characteristic equation (22) does not have roots with nonnegative real part (Martcheva 2015). In interpreting \(\mathcal R_{05}\), notice that the first two terms give secondary infections generated by one infected individual. The first term accounts for the secondary infections generated by direct transmission, while the second accounts for secondary infections generated through the vector transmission pathway. The third and fourth terms are the secondary infections of pregnant individuals generated by one pregnant individual. The third term accounts for secondary infections generated through vector transmission by pregnant women. The last term is the most difficult to interpret. We obtain this term because pregnant individuals do not directly infect other pregnant individuals. Thus, for one infected pregnant woman to generate a secondary infected pregnant woman through direct transmission, one of two routes must be taken: (1) the pregnant woman infects a non-pregnant individual through direct transmission, who in turn infects another pregnant individual; or (2) the pregnant woman infects a vector, which in turn infects a non-pregnant individual, who transmits through direct transmission to a pregnant woman. The number of secondary cases generated through the first scenario is given by the first term in the parentheses, while the second scenario is given by the second term in the parentheses. More specifically, in the second scenario one pregnant individual infects \(\beta m/(\mu +\gamma _p)\) vectors during her infectious period, which in turn infect \(\beta _v s^0/\mu _v\) individuals, who in turn infect \(\beta _{d_p} s_p^0/(\mu +\gamma )\) pregnant women through direct transmission.
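The link between the sign of the constant term and the threshold \(\mathcal R_{05}=1\) can be made explicit. As a worked check, set \(\lambda =0\) in the characteristic equation, with the fourth term taken as the product of the two direct transmission contributions \(\beta _{d_p}s_p^0\) and \(\beta _d s^0\), and factor out \(\mu _v(\mu +\gamma )(\mu +\gamma _p)\):

$$\begin{aligned} c_0= & {} \mu _v(\mu +\gamma )(\mu +\gamma _p) - \beta _d s^0\mu _v(\mu +\gamma _p) - \beta _{d_p} s_p^0\,\beta m\,\beta _v s^0 - \beta _{d_p} s_p^0\,\beta _d s^0\mu _v \\&-\,\beta _{v_p}s_p^0\,\beta m (\mu +\gamma ) -\beta _v s^0\,\beta m(\mu +\gamma _p) = \mu _v(\mu +\gamma )(\mu +\gamma _p)\left( 1-\mathcal R_{05}\right) . \end{aligned}$$

Since the prefactor is positive, \(c_0<0\) exactly when \(\mathcal R_{05}>1\).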

Theorem 1

Assume \(\xi _p = 0\) and \(\mathcal R_{05}>1\). Then, model (7) has an endemic equilibrium. If \(\omega =0\), then that equilibrium is unique.

The proof of the theorem is relegated to the Appendix.

Fig. 2 (Color figure online) Backward bifurcation in model (7) in the case of no direct transmission. Parameter values are as follows: \(\mu = 1/(75*365)\), \(\Lambda = \mu /3\), \(\Lambda _v = 100000000000\), \(\mu _v = 1/10\), \(\xi = 0.5\), \(\gamma _p = 1/40\), \(\gamma = 1/5\), \(\omega \) is variable, \(\beta _v = 0.0000005\), \(\beta _{v_p} =0.000000001, \xi _p = 0\)

Theorem 2

Assume there is no direct transmission, that is, \(\beta _d=\beta _{d_p}=0\) and \(\xi _p = 0\). Then, model (7) exhibits backward bifurcation if and only if

$$\begin{aligned} \begin{array}{l} \displaystyle \left( \frac{\beta _v m}{\mu +\gamma } +\frac{\beta _{v_p} m}{\mu +\gamma _p}\frac{\xi }{\mu }\right) \frac{\omega }{\mu +\omega } \left( \frac{\gamma }{\mu +\gamma }\frac{\beta _v s^0}{\mu +\xi } +\frac{\gamma _p}{\mu +\gamma _p} \frac{\beta _{v_p} s_p^0}{\mu +\xi }\right) \\ \displaystyle \qquad -\left( \frac{\beta _v m}{\mu +\gamma } +\frac{\beta _{v_p} m}{\mu +\gamma _p}\frac{\xi }{\mu }\right) \frac{\beta _v s^0}{\mu +\xi } \displaystyle -\frac{\beta _{v_p} m}{\mu +\gamma _p}\frac{\beta _{v_p} s_p^0}{\mu } - \frac{\beta _{v_p} s_p^0}{\mu +\gamma _p} - \frac{\beta _v s^0}{\mu +\gamma }>0, \end{array} \end{aligned}$$
(23)

where \(m = N_v/N\) and \(N_v =\frac{\Lambda _v}{\mu _v}\), \(N =\frac{\Lambda }{\mu }\).

The proof of the theorem is also relegated to the Appendix. Figure 2 shows the backward bifurcation. Several conclusions can be drawn from the figure and condition (23). As Theorem 1 suggests, if \(\omega =0\) there is no backward bifurcation and there are no multiple equilibria. Closer examination of (23) reveals that if \(\gamma _p=0\) or \(\beta _{v_p}=0\), there is no backward bifurcation. Figure 2 suggests that \(\omega \) controls the depth of the backward bifurcation: the larger \(\omega \), the deeper the backward bifurcation. Furthermore, a very large value of m seems to be needed to produce an example of the backward bifurcation. We note that if \(\beta _d,\beta _{d_p}\ne 0\), backward bifurcation may still occur, but the necessary and sufficient condition for this to happen is more cumbersome to derive.
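The special cases discussed above can be checked numerically by evaluating the left-hand side of condition (23). The following is a minimal sketch rather than the computation used for Fig. 2. The function name is ours, the parameter values are taken from the caption of Fig. 2, and the disease-free proportions \(s^0\) and \(s_p^0\), which are not restated in this section, are replaced by placeholder values chosen purely for illustration.

```python
def backward_bifurcation_lhs(beta_v, beta_vp, m, mu, gamma, gamma_p,
                             xi, omega, s0, sp0):
    """Left-hand side of condition (23); (23) holds iff the value is > 0."""
    # Common factor multiplying the first two products in (23).
    a = beta_v * m / (mu + gamma) + beta_vp * m / (mu + gamma_p) * xi / mu
    # Bracket weighted by gamma/(mu+gamma) and gamma_p/(mu+gamma_p).
    b = (gamma / (mu + gamma) * beta_v * s0 / (mu + xi)
         + gamma_p / (mu + gamma_p) * beta_vp * sp0 / (mu + xi))
    return (a * omega / (mu + omega) * b
            - a * beta_v * s0 / (mu + xi)
            - beta_vp * m / (mu + gamma_p) * beta_vp * sp0 / mu
            - beta_vp * sp0 / (mu + gamma_p)
            - beta_v * s0 / (mu + gamma))


# Parameter values from the caption of Fig. 2.
mu = 1 / (75 * 365)
Lambda, Lambda_v, mu_v = mu / 3, 1e11, 1 / 10
caption_params = dict(
    beta_v=5e-7, beta_vp=1e-9,
    m=(Lambda_v / mu_v) / (Lambda / mu),  # m = N_v / N
    mu=mu, gamma=1 / 5, gamma_p=1 / 40, xi=0.5,
    s0=0.9, sp0=0.1,  # ASSUMPTION: placeholders, not values from the paper
)

# Cases in which the text states that (23) cannot hold.
print(backward_bifurcation_lhs(omega=0.0, **caption_params))
print(backward_bifurcation_lhs(omega=1.0, **dict(caption_params, gamma_p=0.0)))
print(backward_bifurcation_lhs(omega=1.0, **dict(caption_params, beta_vp=0.0)))
```

All three printed values should be negative, in line with the statement that backward bifurcation cannot occur when \(\omega =0\), \(\gamma _p=0\) or \(\beta _{v_p}=0\). Whether (23) holds for other values of \(\omega \) depends on the actual \(s^0\) and \(s_p^0\) of model (7).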

6 Discussion

Investigating the necessary public health measures requires mathematical models that are developed with available data in mind and have identifiable parameters. In this study, we develop six ODE models of Zika, encompassing various features of the disease, and we test them against two data sets: the cumulative number of local cases in Florida and the cumulative number of local cases in pregnant women in Florida. Our main objectives were (1) to compute the reproduction number of Zika in Florida and (2) to develop identifiable models of Zika and to estimate some of the critical parameters associated with Zika transmission.

We found that M\(_1\)–M\(_4\) are structurally locally identifiable from data on cumulative number of local cases only, while M\(_5\) and M\(_6\) are not. We found that M\(_5\) and M\(_6\) are structurally locally identifiable from data on cumulative number of local cases and cumulative number of local cases of pregnant women in Florida (data that we obtained courtesy of the Florida Department of Public Health). We fit all the models to the relevant data sets using MATLAB’s fminsearchbnd routine and obtained the fitted parameters. Models M\(_1\) through M\(_4\) give a value of \(\mathcal R_{0i}=1.46\). Models M\(_5\) and M\(_6\) give reproduction number \(\mathcal R_{0i} =1.24\), \(i=5,6\). We conclude that the reproduction number depends on the data used but not on the model (given the same methodology of computation).

After fitting the models and estimating the parameters, we performed practical identifiability analysis using Monte Carlo simulations. We fit the transmission rates and the recovery rates. We find that for models M\(_1\) through M\(_4\) only the transmission rate from humans to mosquitoes is practically identifiable, while the direct transmission rates have average relative errors (AREs) of \(10^6\) for all noise levels except zero noise. Practical identifiability of the parameters improves dramatically in M\(_5\) and M\(_6\), where we use two data sets. In model M\(_5\) only the direct transmission rates are not practically identifiable, with AREs of \(10^5\) for noise levels above \(20\%\). The parameters of model M\(_6\) are all practically identifiable.
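The Monte Carlo procedure can be illustrated on a toy problem. The sketch below is not one of the models M\(_1\)–M\(_6\): it uses a hypothetical one-parameter model \(y=at\) with a closed-form least-squares fit, and it assumes that noisy data are drawn from a normal distribution whose mean is the model output and whose standard deviation is the noise level times that mean. The ARE is taken here to be the mean of \(|a-\hat{a}_j|/|a|\) over the simulated data sets, expressed in percent. The function names, the noise levels and the number of runs are illustrative choices.

```python
import random


def fit_slope(times, data):
    """Least-squares estimate of a in the toy model y = a * t."""
    return sum(t * y for t, y in zip(times, data)) / sum(t * t for t in times)


def average_relative_error(true_a, times, noise_level, n_runs=1000, seed=0):
    """ARE (in percent) of the fitted slope over n_runs noisy data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        # ASSUMPTION: normal noise, mean = model output, sd = noise_level * mean.
        data = [rng.gauss(true_a * t, noise_level * true_a * t) for t in times]
        total += abs(true_a - fit_slope(times, data)) / abs(true_a)
    return 100.0 * total / n_runs


times = list(range(1, 21))  # hypothetical observation times
for level in (0.0, 0.01, 0.05, 0.10, 0.20, 0.30):
    print(f"noise {level:.0%}: ARE = {average_relative_error(2.0, times, level):.3f}%")
```

In this toy setting the ARE stays small and grows with the noise level, the behavior expected of a practically identifiable parameter. An ARE that is orders of magnitude larger than the noise level, as reported above for the direct transmission rates, signals a lack of practical identifiability.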

In this study, we have not considered a model with all possible cases, that is, a model that includes vector and direct transmission as well as asymptomatic and pregnant classes. This was intentional. The goal of this paper is to study the identifiability of Zika models. We have shown that adding an asymptomatic class increases the AREs of the model parameters. The same is true for direct transmission. So, a model with all possible cases would certainly be unidentifiable with the currently available data. If data were available for the asymptomatic and direct transmission cases, we would have studied the model with all possible cases.

Next, we fixed the recovery rates at their fitted values and performed Monte Carlo simulations to understand the identifiability of the transmission rates. In this case, the identifiability of all transmission rates in all models improves, but the direct transmission rates are still not identifiable in M\(_3\) and M\(_4\). In model M\(_5\), the AREs of the direct transmission coefficients are only \(2\times 10^2\) at a noise level of \(30\%\). Since data on direct transmission are hard to obtain in a place of local transmission of Zika, such as Florida, inference about direct transmission can best be made from model M\(_5\) with fixed recovery rates.

Finally, we examine how the choice of the recovery rates within their plausible intervals affects the value of the reproduction number and the estimates of the transmission rates. We perform Monte Carlo simulations by randomly choosing the recovery rates from their biologically realistic intervals, fitting the transmission rates and computing the reproduction number of each model and its AREs. Notably, the AREs of \(\mathcal R_{0i}\) for \(i=1,\dots ,4\) are around 6.5%, while the AREs of \(\mathcal R_{0i}\) for \(i=5,6\) are 10%. This leads to a very small range for the reproduction number. For instance, \(\mathcal R_{01}\) ranges between 1.25 and 1.6. There seems to be little doubt that the reproduction number of Zika in Florida at the start of the epidemic was above one.

Elasticities of the reproduction numbers suggest that the reproduction numbers are most sensitive to the transmission rate from humans to vectors, the ratio of vectors to humans and the vector mortality rate. This means that public health measures should focus on two main directions: (1) insecticide spraying to reduce the mosquito life span and the mosquito population; (2) education of the public on how to protect themselves against mosquito bites. Furthermore, the reproduction numbers are not sensitive at all to the direct transmission rates. That means that control measures targeting direct transmission have little population-level impact. The low sensitivity of the reproduction number to the direct transmission rates explains why its value has not been impacted by the presence of direct transmission in the model.
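For reference, the elasticity of a reproduction number \(\mathcal R_0\) with respect to a parameter \(p\) is \((\partial \mathcal R_0/\partial p)(p/\mathcal R_0)\), the percentage change in \(\mathcal R_0\) per one percent change in \(p\). The sketch below estimates it by central differences. The function `toy_r0` is a hypothetical vector-only expression used for illustration, not one of the reproduction numbers \(\mathcal R_{0i}\) derived in this paper, and the parameter values are placeholders.

```python
def elasticity(r0, params, name, rel_step=1e-6):
    """Central-difference estimate of (dR0/dp) * (p / R0) for parameter `name`."""
    p = params[name]
    h = rel_step * abs(p)
    up = dict(params, **{name: p + h})
    down = dict(params, **{name: p - h})
    derivative = (r0(**up) - r0(**down)) / (2 * h)
    return derivative * p / r0(**params)


def toy_r0(beta_h, beta_v, m, mu_v, mu, gamma):
    # ASSUMPTION: hypothetical vector-only expression, not the paper's formula.
    return beta_h * m / (mu + gamma) * beta_v / mu_v


# Placeholder parameter values chosen for illustration only.
params = dict(beta_h=0.3, beta_v=0.2, m=2.0, mu_v=0.1,
              mu=1 / (75 * 365), gamma=0.2)

for name in params:
    print(f"{name}: {elasticity(toy_r0, params, name):+.4f}")
```

For this toy expression the elasticities with respect to the transmission rates and the vector-to-human ratio are \(+1\), and the elasticity with respect to the vector mortality rate is \(-1\). A parameter that barely enters the reproduction number, as the direct transmission rates do in the fitted models, would have an elasticity close to zero.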