
1 Introduction

It is a common belief that the spread of COVID-19 across the world is through humans traveling over an open and interwoven network of population flow. Recent studies confirmed that human mobility is a major factor driving the spatial spread of COVID-19 [1,2,3]. Similarly, [4] revealed that mobility patterns strongly correlated with COVID-19 growth rates, based on data from the most affected counties in the U.S. However, it is challenging to predict the extent and speed of virus transmission with credible accuracy. Further and deeper investigation on the spatiotemporal spreading patterns of COVID-19 utilizing a population mobility network beyond the traditional Susceptible-Exposed-Infectious-Removed (SEIR) model [5] needs to be undertaken. Such an investigation is not only essential for understanding the mechanism of the spread in a population mobility network involving intercity traveling, but also beneficial for devising measures for spread prevention and control on a global basis.

The art, and also the science, of mathematical modeling is to construct the simplest model that captures the salient features of the system. Virus transmission is a dynamic process that involves many stochastic components. Consequently, it is necessary to incorporate population mobility and stochastic processes into the classical SEIR model to predict the spatiotemporal effect of virus transmission. Additionally, COVID-19 deviates from the typical progression of infectious disease transmission in that it is also infectious during the incubation period [6], which requires a modification of the classical SEIR model.

To the best of our knowledge, no mainstream network topology structure exists for simulating COVID-19 spread associated with population mobility. Also, some inadequacies in the deterministic epidemic models remain unaddressed. First, previous network models with clustering are context-specific and hence not scalable [7, 8]. Consequently, results generated from such models may not be generalizable [9]. Second, we propose an initial high-risk population to accurately represent the size of the initial susceptible population, which is pivotal for epidemic modeling [10]. Prior studies suggested using the city population base as the initial susceptible population [6, 11]. However, such methods overestimate the infection cases at the early stage of epidemic transmission, which varies with the demographic background of different cities. Thus, the infection rate is at risk of being underestimated, because the size of the infected population is the product of the initial susceptible population and the infection rate. Third, epidemic models are usually not parameter-free. Besides the essential initial susceptible population, initial parameters such as the infection rate and recovery rate also need to be provided [12]. However, because a comprehensive review of historical incidence data is lacking, it is hard to derive specific parameters for different survey sites.

In this paper, we use fine-grained spatiotemporal population mobility data, in conjunction with epidemic data, to construct an urban network epidemic framework that is closer to real-world scenarios. The framework offers a scalable and credible solution compared with the traditional SEIR model and strong baseline models such as the Metapopulation SIR model [13] and the Susceptible-Undiagnosed-Infected-Removed (SUIR) model [14]. Additionally, it better addresses some challenges of the deterministic epidemic model. Extensive experiments were conducted to assess the performance of our approach, using a real-world COVID-19 epidemic dataset of 12 cities in Hubei Province, China, and 12 cities in Italy.

2 The Proposed Approach

We propose an urban network epidemic framework (M-Urb-SEIR), a novel approach that incorporates population mobility and stochasticity for accurate COVID-19 confirmed case forecasting.

2.1 SEIR Model (Single-Network)

We adopt the SEIR model [5] to describe the dynamic process of epidemic propagation. The population is divided into four compartments. Susceptible (S) individuals can contract the disease. Exposed (E) individuals were previously susceptible and have been exposed to the virus but may not be infected. Infected (I) individuals are capable of transmitting the disease; this state includes a non-symptomatic infectious period. Recovered (R) individuals were previously infected but have become immune. If the immune period is limited, R can be converted into S again. The total population is denoted by N in Eqs. (1)-(5). The relations among these variables are as follows:

$$\begin{aligned} \frac{\textrm{d}S}{\textrm{d}t}&= -\alpha I(t)S(t) / N, \end{aligned}$$
(1)
$$\begin{aligned} \frac{\textrm{d}E}{\textrm{d}t}&= \alpha I(t)S(t) / N - \mu E(t), \end{aligned}$$
(2)
$$\begin{aligned} \frac{\textrm{d}I}{\textrm{d}t}&= \mu E(t) - \gamma I(t), \end{aligned}$$
(3)
$$\begin{aligned} \frac{\textrm{d}R}{\textrm{d}t}&= \gamma I(t), \end{aligned}$$
(4)
$$\begin{aligned} S(t) + E(t) + I(t) + R(t)&= N, \end{aligned}$$
(5)

where \(\alpha \) represents the infection rate, i.e., the rate at which susceptible individuals become exposed through contact with infected individuals; \(\mu \) represents the rate at which exposed individuals leave the incubation period and become infected; and \(\gamma \) represents the recovery rate.
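
As a minimal sketch, Eqs. (1)-(5) can be integrated numerically with a forward-Euler scheme. The function name `simulate_seir`, the step size, and all parameter values below are illustrative assumptions and are not taken from this study.

```python
def simulate_seir(s0, e0, i0, r0, alpha, mu, gamma, days, dt=0.1):
    """Integrate Eqs. (1)-(4) with a forward-Euler scheme.

    Returns a list of (S, E, I, R) tuples, one per whole day.
    """
    n = s0 + e0 + i0 + r0                      # total population N, Eq. (5)
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = alpha * i * s / n * dt   # S -> E, Eqs. (1)-(2)
            new_infected = mu * e * dt             # E -> I, Eqs. (2)-(3)
            new_recovered = gamma * i * dt         # I -> R, Eqs. (3)-(4)
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameter values, chosen only to produce a visible outbreak.
trajectory = simulate_seir(s0=9990, e0=0, i0=10, r0=0,
                           alpha=0.5, mu=0.2, gamma=0.1, days=60)
final_s, final_e, final_i, final_r = trajectory[-1]
print(f"day 60: S={final_s:.0f} E={final_e:.0f} I={final_i:.0f} R={final_r:.0f}")
```

Every individual that leaves one compartment enters the next, so the sum S + E + I + R stays equal to N up to floating-point error, which mirrors Eq. (5).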

2.2 M-Urb-SEIR (Urban Network Epidemic Framework)

The traditional SEIR model assumes a single infection network among individuals and only models epidemic propagation in a single dimension. We extend the traditional SEIR model to the scenario of urban networks. We assume that there are N cities in the city network of interest; here N denotes the number of cities rather than the total population. As before, individuals are divided into four states: S, E, I, and R. For city n, \(P_{n}\) denotes the city population base, and the numbers of individuals in states S, E, I, and R at time t are \(S_{n}(t)\), \(E_{n}(t)\), \(I_{n}(t)\), and \(R_{n}(t)\). We assume a mobility intensity (\(W_{nm}\)) between cities n and m, representing the average number of visitors from city n to city m. Recent evidence suggests that cases in the latent period are infectious [6]. We therefore model both the infection rate of infected individuals and the infection rate of latent individuals (infectious with delayed onset). \(\alpha _{1}\) represents the infection rate of infected individuals; \(\alpha _{2}\) represents the infection rate of latent individuals; \(\mu \) represents the rate at which individuals in the incubation period become patients; and \(\gamma \) represents the recovery rate of patients.

The transmission of and recovery from infection are intrinsically stochastic processes, and the deterministic epidemic model does not account for fluctuations [15]. To tackle this issue, we assume that the process is Markovian on the relevant time scales. The dynamics of the four populations at time t are summarized as follows:

$$\begin{aligned}
\frac{\textrm{d} S_{n}(t)}{\textrm{d} t} ={}& -\alpha _{1} S_{n}(t)\sum ^{N}_{m=1} \left( \frac{W_{mn}}{P_{m}} +\frac{W_{nm}}{P_{n}} \right) I_{m}(t)/P_{n} \\
& -\alpha _{2} S_{n}(t)\sum ^{N}_{m=1} \left( \frac{W_{mn}}{P_{m}} +\frac{W_{nm}}{P_{n}} \right) E_{m}(t)/P_{n} \\
& +\sqrt{\alpha _{1} S_{n}(t)I_{n}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n}) \\
& +\sqrt{\alpha _{2} S_{n}(t)E_{n}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n}) \\
& +\sqrt{\alpha _{1} S_{n}(t)\sum ^{N}_{m\ne n} \left( \frac{W_{nm}}{P_{n}} \right) I_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n}) \\
& +\sqrt{\alpha _{2} S_{n}(t)\sum ^{N}_{m\ne n} \left( \frac{W_{nm}}{P_{n}} \right) E_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n}) \\
& +\sqrt{\sum ^{N}_{m\ne n} \alpha _{1} S_{n}(t)\left( \frac{W_{nm}}{P_{n}} \right) I_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{m}) \\
& +\sqrt{\sum ^{N}_{m\ne n} \alpha _{2} S_{n}(t)\left( \frac{W_{nm}}{P_{n}} \right) E_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{m}).
\end{aligned}$$
(6)
$$\begin{aligned}
\frac{\textrm{d} E_{n}(t)}{\textrm{d} t} ={}& \alpha _{1} S_{n}(t)\sum ^{N}_{m=1} \left( \frac{W_{mn}}{P_{m}} +\frac{W_{nm}}{P_{n}} \right) I_{m}(t)/P_{n} \\
& +\alpha _{2} S_{n}(t)\sum ^{N}_{m=1} \left( \frac{W_{mn}}{P_{m}} +\frac{W_{nm}}{P_{n}} \right) E_{m}(t)/P_{n} \\
& -\mu \cdot E_{n}(t) \\
& -\sqrt{\alpha _{1} S_{n}(t)I_{n}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n}) \\
& -\sqrt{\alpha _{2} S_{n}(t)E_{n}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n}) \\
& -\sqrt{\alpha _{1} S_{n}(t)\sum ^{N}_{m\ne n} \left( \frac{W_{nm}}{P_{n}} \right) I_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n}) \\
& -\sqrt{\alpha _{2} S_{n}(t)\sum ^{N}_{m\ne n} \left( \frac{W_{nm}}{P_{n}} \right) E_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n}) \\
& -\sqrt{\sum ^{N}_{m\ne n} \alpha _{1} S_{n}(t)\left( \frac{W_{nm}}{P_{n}} \right) I_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{m}) \\
& -\sqrt{\sum ^{N}_{m\ne n} \alpha _{2} S_{n}(t)\left( \frac{W_{nm}}{P_{n}} \right) E_{m}(t)/P_{n}} \cdot P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{m}) \\
& +\sqrt{\mu E_{n}(t)} \cdot P_{t}(E_{n}{\mathop {\longrightarrow }\limits ^{\mu }} I_{n}).
\end{aligned}$$
(7)
$$\begin{aligned}
\frac{\textrm{d} I_{n}(t)}{\textrm{d} t} ={}& \mu \cdot E_{n}(t)-\gamma \cdot I_{n}(t) \\
& -\sqrt{\mu E_{n}(t)} \cdot P_{t}(E_{n}{\mathop {\longrightarrow }\limits ^{\mu }} I_{n}) \\
& +\sqrt{\gamma I_{n}(t)} \cdot P_{t}(I_{n}{\mathop {\longrightarrow }\limits ^{\gamma }} R_{n}).
\end{aligned}$$
(8)
$$\begin{aligned} \frac{\textrm{d} R_{n} (t)}{\textrm{d} t} =\gamma \cdot I_{n} (t)-\sqrt{\gamma I_{n} (t)} \cdot P_{t} (I_{n}{\mathop {\longrightarrow }\limits ^{\gamma }} R_{n} ). \end{aligned}$$
(9)

Here, \(P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n})\) is the probability that individuals in state S (city n) are transformed into state E (city n) at time t through contact between S (city n) and I (city n). \(P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n})\) is the probability that individuals in state S (city n) are transformed into state E (city n) at time t through contact between S (city n) and E (city n). \(P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{m})\) is the probability that individuals in state S (city n) are transformed into state E (city m) at time t through contact between S (city n) and I (city m). \(P_{t}(S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{m})\) is the probability that individuals in state S (city n) are transformed into state E (city m) at time t through contact between S (city n) and E (city m). \(P_{t}(E_{n}{\mathop {\longrightarrow }\limits ^{\mu }} I_{n})\) is the probability that individuals in state E (city n) are transformed into state I (city n) at time t. \(P_{t}(I_{n}{\mathop {\longrightarrow }\limits ^{\gamma }} R_{n})\) is the probability that individuals in state I (city n) are transformed into state R (city n) at time t.
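
To illustrate how the fluctuation terms enter the dynamics, the following sketch performs one Euler-Maruyama step of Eqs. (8)-(9) for a single city. It rests on two assumptions that are not stated in the text: the factors \(P_{t}(\cdot )\) are treated as independent standard normal draws, and the E-to-I fluctuation is given the sign opposite to its deterministic flux, as in Eqs. (7) and (9). The function name and all numerical values are hypothetical.

```python
import math
import random


def step_infected_recovered(e, i, r, mu, gamma, dt, xi_onset, xi_recovery):
    """One Euler-Maruyama step of Eqs. (8)-(9) for a single city.

    xi_onset and xi_recovery stand in for P_t(E_n -> I_n) and
    P_t(I_n -> R_n). Treating them as standard normal draws is an
    assumption of this sketch, not a statement of the paper's method.
    """
    onset_noise = math.sqrt(mu * e) * xi_onset * math.sqrt(dt)
    recovery_noise = math.sqrt(gamma * i) * xi_recovery * math.sqrt(dt)
    d_i = (mu * e - gamma * i) * dt - onset_noise + recovery_noise  # Eq. (8)
    d_r = gamma * i * dt - recovery_noise                           # Eq. (9)
    return i + d_i, r + d_r


rng = random.Random(0)  # fixed seed so the example is reproducible
i_next, r_next = step_infected_recovered(
    e=200.0, i=50.0, r=10.0, mu=0.2, gamma=0.1, dt=0.1,
    xi_onset=rng.gauss(0.0, 1.0), xi_recovery=rng.gauss(0.0, 1.0))
print(f"I(t+dt)={i_next:.3f}  R(t+dt)={r_next:.3f}")
```

Because the recovery fluctuation appears with opposite signs in Eqs. (8) and (9), it cancels in the sum I + R; only the onset fluctuation changes how many individuals have left the exposed state.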

In an urban network, there are three ways in which susceptible individuals in city n can enter the incubation period.

(1) Internal transmission of city n:

$$\begin{aligned} \begin{array}{l} \alpha _{1} \cdot S_{n} (t)\cdot I_{n} (t)/P_{n} +\alpha _{2} \cdot S_{n} (t)\cdot E_{n} (t)/P_{n}\\ -\,\sqrt{\alpha _{1} S_{n} (t)I_{n} (t)/P_{n}} \cdot P_{t} (S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n} )\\ -\,\sqrt{\alpha _{2} S_{n} (t)E_{n} (t)/P_{n}} \cdot P_{t} (S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n} ) \end{array} \end{aligned}$$
(10)

(2) Transmission caused by the flow of infected and exposed individuals from city m to n:

$$\begin{aligned} \begin{array}{l} \alpha _{1} \cdot S_{n} (t)\cdot \sum ^{N}_{m\ne n} (\frac{W_{mn}}{P_{m}} )\cdot I_{m} (t)/P_{n} \\ +\,\alpha _{2} \cdot S_{n} (t)\cdot \sum ^{N}_{m\ne n} (\frac{W_{mn}}{P_{m}} )\cdot E_{m} (t)/P_{n}\\ -\,\sqrt{\alpha _{1} S_{n} (t)\sum ^{N}_{m\ne n} (\frac{W_{nm}}{P_{n}} )I_{m} (t)/P_{n}} \cdot P_{t} (S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{n} )\\ -\,\sqrt{\alpha _{2} S_{n} (t)\sum ^{N}_{m\ne n} (\frac{W_{nm}}{P_{n}} )E_{m} (t)/P_{n}} \cdot P_{t} (S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{n} ) \end{array} \end{aligned}$$
(11)

(3) Transmission caused by susceptible individuals from city n traveling to city m and becoming infected there:

$$\begin{aligned} \begin{array}{l} \sum ^{N}_{m\ne n} \alpha _{1} \cdot S_{n} (t)(\frac{W_{nm}}{P_{n}} )\cdot I_{m} (t)/P_{n} \\ +\,\sum ^{N}_{m\ne n} \alpha _{2} \cdot S_{n} (t)(\frac{W_{nm}}{P_{n}} )\cdot E_{m} (t)/P_{n}\\ -\,\sqrt{\sum ^{N}_{m\ne n} \alpha _{1} S_{n} (t)(\frac{W_{nm}}{P_{n}} )I_{m} (t)/P_{n} \cdot P^{2}_{t} (S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{1}}} E_{m} )}\\ -\,\sqrt{\sum ^{N}_{m\ne n} \alpha _{2} S_{n} (t)(\frac{W_{nm}}{P_{n}} )E_{m} (t)/P_{n} \cdot P^{2}_{t} (S_{n}{\mathop {\longrightarrow }\limits ^{\alpha _{2}}} E_{m} )} \end{array} \end{aligned}$$
(12)
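
The deterministic parts of the three pathways in Eqs. (10)-(12) can be sketched as a single helper that returns them separately. The function name, the list-based data layout, and the two-city example values are hypothetical; the fluctuation terms are omitted here.

```python
def exposure_components(n, s, e, i, pop, w, alpha1, alpha2):
    """Deterministic parts of Eqs. (10)-(12) for city index n.

    s, e, i, pop are per-city lists; w[a][b] is the average number of
    visitors travelling from city a to city b (W_ab in the text).
    """
    others = [m for m in range(len(pop)) if m != n]
    # Eq. (10): contacts inside city n.
    internal = (alpha1 * i[n] + alpha2 * e[n]) * s[n] / pop[n]
    # Eq. (11): infected and exposed visitors arriving from city m.
    inflow = sum((alpha1 * i[m] + alpha2 * e[m]) * w[m][n] / pop[m]
                 for m in others) * s[n] / pop[n]
    # Eq. (12): susceptible residents of n infected while visiting m.
    outflow = sum((alpha1 * i[m] + alpha2 * e[m]) * w[n][m] / pop[n]
                  for m in others) * s[n] / pop[n]
    return internal, inflow, outflow


# Hypothetical two-city network; none of these numbers come from the datasets.
pop = [1000, 2000]
s, e, i = [900, 1990], [50, 5], [50, 5]
w = [[0, 100], [40, 0]]
internal, inflow, outflow = exposure_components(
    0, s, e, i, pop, w, alpha1=0.4, alpha2=0.2)
print(internal, inflow, outflow)
```

Keeping the three contributions separate makes it straightforward to check how much of a city's new exposure is driven by local contacts versus mobility in either direction.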

The proposed framework is implemented by an overall algorithm as follows:

figure a

2.3 Addressing the Challenges of a Deterministic Epidemic Model

(1) Our proposed framework is scalable. Once the origin and target domains are located and marked, the actual cross-domain propagation of COVID-19 can be simulated. (2) We propose a backward derivation algorithm to derive the initial susceptible population \(\mathcal {E}_0\). Specifically, we first used the method of [16] to obtain \(\mathcal {R}_0\) sequences from time-series data of confirmed cases. We then established a basic Susceptible-Infected-Recovered (SIR) model [11], as shown in Eq. (13).

$$\begin{aligned} \mathcal {R}_0 = \frac{\alpha \cdot P}{\gamma }, \end{aligned}$$
(13)

where \(\alpha \) represents the infection rate, P represents the total population in the area, and \(\gamma \) represents the recovery rate. Based on Eq. (13), we first initialized the infection rate (\(\alpha \)) and recovery rate (\(\gamma \)). We then adopted the Nelder-Mead simplex optimization method to optimize \(\alpha \) and \(\gamma \) so that the total number of infected individuals (I) and recovered individuals (R) is as close as possible to the real number of confirmed cases. Last, the total number of infected individuals (I) and recovered individuals (R) was used as \(\mathcal {E}_0\) of the urban network epidemic framework. Moreover, the optimal \(\alpha \) and \(\gamma \) were used as the initial hyper-parameters of the urban network epidemic framework. Note that we used \(\mathcal {E}_0\) instead of the city population base in the urban network epidemic framework, and the Markov process used the difference between the city population base and \(\mathcal {E}_0\) to incorporate stochasticity. (3) We used the Nelder-Mead simplex optimization method again to optimize the \(\alpha \), \(\gamma \), and \(\mu \) (the rate of transformation of the incubation period to patients) of the urban network epidemic framework.
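
The fitting step in (2) can be illustrated with a small, self-contained sketch. It assumes a mass-action SIR model (so that Eq. (13) holds), a one-day Euler step, a squared-error loss on I + R, a fixed hypothetical population, and synthetic "observed" data generated by the same model; none of these choices is confirmed by the text. The compact Nelder-Mead routine is a textbook version written out to keep the example dependency-free; in practice a library routine such as `scipy.optimize.minimize` with `method="Nelder-Mead"` would normally be used.

```python
def sir_cumulative(alpha, gamma, population, i0, days):
    """Daily I(t) + R(t) from a mass-action SIR model (Euler, 1-day step)."""
    s, i, r = population - i0, float(i0), 0.0
    series = [i + r]
    for _ in range(days):
        new_inf = min(alpha * s * i, s)   # cannot infect more than S
        new_rec = min(gamma * i, i)       # cannot recover more than I
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        series.append(i + r)
    return series


def nelder_mead(f, x0, step=0.1, iters=200):
    """Compact textbook Nelder-Mead: reflect, expand, contract, shrink."""
    dim = len(x0)
    simplex = [list(x0)]
    for k in range(dim):
        vertex = list(x0)
        vertex[k] += step * (abs(vertex[k]) or 1.0)
        simplex.append(vertex)
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(v[k] for v in simplex[:-1]) / dim for k in range(dim)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            expanded = [c + 2 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex toward the best one
                simplex = [best] + [[b + 0.5 * (v - b) for b, v in zip(best, vtx)]
                                    for vtx in simplex[1:]]
    return min(simplex, key=f)


POPULATION, I0 = 5000, 5                                     # hypothetical values
observed = sir_cumulative(6e-5, 0.1, POPULATION, I0, 30)     # synthetic data


def loss(params):
    """Squared error between simulated and observed I + R."""
    alpha, gamma = params
    if alpha <= 0 or gamma <= 0:
        return float("inf")           # reject non-physical rates
    predicted = sir_cumulative(alpha, gamma, POPULATION, I0, len(observed) - 1)
    return sum((p - o) * (p - o) for p, o in zip(predicted, observed))


start = [3e-5, 0.2]
fitted = nelder_mead(loss, start)
basic_r0 = fitted[0] * POPULATION / fitted[1]                # Eq. (13)
print(f"alpha={fitted[0]:.2e} gamma={fitted[1]:.3f} R0={basic_r0:.2f}")
```

The starting point is part of the initial simplex and the best vertex is never replaced by a worse one, so the fitted loss cannot exceed the loss at the starting point.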

3 Experimental Settings

We present datasets, competitors, and evaluation metrics for our experiments.

3.1 Datasets

We adopted the epidemic data from the National Health Commission of the People’s Republic of China (daily confirmed new cases for each city between January 23, 2020 and February 29, 2020) and Italian epidemic data (daily confirmed new cases for each city between February 24, 2020 and April 15, 2020). The statistical data include the cumulative numbers of infected, recovered, and death cases. Chinese migration data were obtained with consent from Baidu migration, and the most recent data are available on the website (https://qianxi.baidu.com). The dataset includes the immigration scale index, the emigration scale index, and intracity travel intensity. The migration scale index is converted from the absolute number of individuals who move in or out, reflecting the population scale of the cities. The intensity of intracity travel is an index derived from the ratio of the number of individuals who have traveled within the city to the city’s resident population, reflecting the intracity mobility scale. Similarly, Italian migration data were obtained from [17].

3.2 Competitors

For a fair comparison, we evaluate our approach against the following deterministic epidemic models.

  1. SEIR is an epidemiological model used to simulate the spread of infectious disease.

  2. Metapopulation SIR model [13] (SIGKDD 2018) extends the SIR model to a metapopulation SIR model that allows transmission by visitors between any two sub-populations.

  3. SUIR model [14] (SIGKDD 2020) adds a unique ‘undiagnosed’ state of COVID-19 to the SIR model.

  4. Urb-SEIR (without Markov process) is one variant of the proposed framework.

3.3 Evaluation Metrics

Let t denote the date on which the forecast is made and T the prediction horizon in days. The relative error is defined as:

$$\begin{aligned} e(t, T) = \frac{|C(t+T) - \hat{C}(t+T)|}{\hat{C}(t+T)}, \end{aligned}$$
(14)

where C(t) represents the cumulative confirmed cases predicted by the baselines and our proposed methods, and \(\hat{C}(t)\) represents the ground-truth cumulative confirmed cases from the dataset.
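
As a small worked example, Eq. (14) can be computed as follows. The function name and the sample numbers are hypothetical; `predicted` plays the role of C(t+T) and `observed` the role of \(\hat{C}(t+T)\).

```python
def relative_error(predicted, observed):
    """Eq. (14): |C - C_hat| / C_hat for one city and one horizon."""
    if observed <= 0:
        raise ValueError("observed cumulative cases must be positive")
    return abs(predicted - observed) / observed


# Hypothetical cumulative counts for horizons of 7, 14, and 28 days.
predicted_cases = {7: 110, 14: 260, 28: 540}
observed_cases = {7: 100, 14: 250, 28: 600}
errors = {horizon: relative_error(predicted_cases[horizon], observed_cases[horizon])
          for horizon in predicted_cases}
print(errors)
```

Normalizing by the observed count makes errors comparable across cities with very different outbreak sizes.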

4 Experimental Results

Most prior work uses the city population base as the initial susceptible population. However, Table 1 shows that the initial high-risk population deduced by the backward derivation algorithm is much smaller than the city population and is more reasonable. Furthermore, the optimal hyper-parameters identified by the optimization are summarized in Table 2. Table 3 shows the inference formulation by the Markov stochastic process.

Table 1. Number of the initial high-risk population obtained from backward derivation algorithm
Table 2. Optimal hyper-parameters of outbreak cities in China and Italy
Table 3. Inference formulation by Markov stochastic process

Figures 1, 2, 3 and 4 show the prediction error bars of the models for Chinese and Italian cities. Our M-Urb-SEIR performs well over the 28-day forecast period for the 12 cities in China compared with the baseline models. Moreover, M-Urb-SEIR outperforms Urb-SEIR, which indicates that incorporating random effects into epidemic models is critical for improving prediction accuracy. The experimental results for Italian cities yield three findings. First, the Metapopulation SIR model performs best on most days of the first week, followed by the traditional SEIR model. Second, by contrast, Urb-SEIR and M-Urb-SEIR perform better with a longer prediction horizon; they usually outperform the other approaches when the horizon is longer than 1 or 2 weeks. Third, Urb-SEIR performs better than M-Urb-SEIR. Compared with China, where ‘Chunyun’ (the largest periodic human migration in China) leads to dramatic population migration, population movement in Italy is much milder. Therefore, M-Urb-SEIR, which places more weight on the stochastic effect, performs worse than Urb-SEIR there. These results suggest that the prediction of COVID-19 should be customized and that contextual information should be considered: the model should be extensible and modular across application scenarios. Specifically, under highly random conditions such as the ‘Chunyun’ event, models should take the Markov stochastic process into account, whereas under regular population movement (Italian cities), Urb-SEIR is the more suitable predictive tool.

Fig. 1. Prediction error bars of models after T days.

With regard to the research methods, some limitations need to be acknowledged. The principal limitation of this analysis is the variance in the estimate of \(R_{0}\) [18]. Another major source of uncertainty is the backward derivation algorithm used to calculate the initial susceptible population. The latent infectious population and other external factors were not accounted for in the derivation process. Despite these limitations, the backward derivation helps the model approach the real-world transmission state more closely. An additional uncontrolled factor is the effect of ‘Chunyun’ [19], which is hard to measure in the prediction process. Furthermore, the summary of error results is subject to inevitable fluctuation. This fluctuation is intrinsically one of the challenges of the deterministic epidemic model: the precision of a deterministic epidemic model is likely to vary across the ‘life cycle’ of an epidemic outbreak when it is analyzed with a fixed set of parameters. This clearly influences the results over a long forecast period; therefore, we provided a 28-day prediction horizon, which is much longer than in most prior studies. Future studies that adopt a ‘stage’ forecast over the life cycle of an epidemic are warranted. Despite its limitations, this study indicates the effectiveness of incorporating population mobility and random effects into the epidemic simulation.

Fig. 2. Prediction error bars of models after T days.

Fig. 3. Prediction error bars of models after T days.

Fig. 4. Prediction error bars of models after T days.

5 Conclusion

In this paper, we propose a novel urban network epidemic framework to study the spread pattern of COVID-19 in different cities. We applied the framework to simulate and predict the COVID-19 confirmed cases in the ‘epicenter’ Wuhan and 11 other cities in Hubei Province, China, and in 12 cities in Italy with severe epidemic situations, where it outperforms other deterministic state-of-the-art models in the COVID-19 spreading prediction task. We also demonstrated that incorporating population mobility and random effects into epidemic models is necessary. Our findings provide new scientific evidence for further epidemic model design and offer a foundation for other research studies, such as assessing the long-term social and economic effects of COVID-19.