1 Introduction

Control of epidemic spreading has long attracted interest in mathematical biology. However, many models describing the spread of an infection assume a randomly mixed population, in which contact between any two individuals is equally likely (Ball et al. 2013; Chris 2005). This assumption is strong and in many cases unrealistic. Epidemics on contact networks provide a more realistic alternative. In general, a contact network represents the possible connections through which an infection can transmit. Classical dynamic models of epidemics consider the impact of disease factors on the spread of an infectious disease. By contrast, much research on infectious disease transmission in complex populations has focused on understanding the implications of network structure for the epidemic process. Recently, Kiss et al. (2017) presented a large pool of epidemic models on networks, ranging from exact and stochastic models to approximate differential equation models.

Recent works have shown that community structure is a universal feature of contact networks. The community structure describes the division of the network nodes into groups with dense connections among the group members, but sparser connections among the groups. A community may consist of classmates, friends, co-workers or club members. Research on the community structures of networks has yielded rich results. Such research has focused either on the features and detection of communities, or on epidemic propagation in a network. Present results on the former are focused on community detection (Newman and Girvan 2004; Traud et al. 2009; Wayne 1977) and community growth models (Jin et al. 2001). Orman et al. (2013) considered the relationship between community structure and clustering coefficient. They discovered that even a network with near-zero transitivity can form a clearly defined community structure. Research on the latter has focused on the effect of community structure on epidemic propagation. Various simulation studies (Huang and Li 2007; Liu and Hu 2005; Sun and Gao 2007) have revealed that networks without community structure are more robust than those with community structure. For details, see Huang and Li (2007), Wu and Liu (2008), Yan et al. (2007). Rowthorn et al. (2009) and Salathé and Jones (2010) discuss the control of diseases in metapopulations and in networks with community structure, respectively. Furthermore, the latter show that community structure will lower the final size and peak prevalence under certain parameter conditions.

Disease spread through complex networks can be modeled by various approaches, such as mean field equations (Gross et al. 2006; Luo et al. 2014; Wang et al. 2010), the moment closure model (Eames 2008; House and Keeling 2010), a branching process approximation (Ball 1983; Ball and Neal 2008; Neal 2007), the effective degree model (Ma et al. 2013), bond percolation theory (Gleeson 2009; Gleeson et al. 2010; Miller 2009a; Newman 2003b; Wang et al. 2012) and a generating function-based model (Miller 2009b, 2011; Volz 2008). House and Keeling (2011) introduced the generating function into the approximation formula and obtained a low-dimensional model with a small number of dynamical variables. Some of these methods have been extended to disease spread through complex networks with community structure.

The mean field equations and moment closure model were extended to a two-community network by Peng et al. (2013), Tunc and Shaw (2014) and Zhou and Liu (2009). Peng et al. (2013) and Zhou and Liu (2009) studied a susceptible–infected–susceptible (SIS) epidemiological model in a two-community network with a static and mobile agent, respectively. Both studies concluded that epidemic spreading in a community can be sustained through contact with another community. Tunc and Shaw (2014) studied the effects of community structure on epidemic spreading in an adaptive network. They showed that an epidemic can change the community structure. The household structure, as a special kind of community structure, has also been studied. Using mean field equations, Liu et al. (2004) concluded that some simple geometrical quantities of networks crucially affect the infection property of infectious diseases. Ball et al. (1997) reported that household structure exerts an amplification effect that permits a population-scale outbreak at very low levels of global transmission. Ball et al. (2009, 2010) derived the disease threshold parameters by branching-process approximations. However, these models do not describe the disease dynamics. In an effective degree SIR model, Ma et al. (2013) derived expressions for the disease threshold parameters. They showed that households can accelerate or decelerate the disease dynamics depending on the variance of the inter-household degree distribution.

All of the above epidemic models in a network with community structure assume homogeneous mixing. Koch et al. (2013) extended the Miller network SIR model (Miller 2011). They modeled the spread of disease in complex two-community networks in the special case of independence between the inter- and intra-community degrees, obtaining an 8-dimensional model. They proved that random edge removal, from either group or between the groups, always decreases the basic reproduction number. In the same year, Miller and Volz (2013) extended the Miller network SIR model (Miller 2011) to model disease spread in complex networks with n communities and an arbitrary joint degree distribution (a 2-community network yielded a 4-dimensional model). They proved that, for the same distribution of within- and between-group partnerships in two populations, varying the correlations of the within- and between-group partnerships will alter the course of an epidemic.

The present paper extends the Volz network SIR model (Volz 2008) to disease spreading in two-community complex networks with an arbitrary joint degree distribution. We obtain a 12-dimensional model. As the Volz (2008) and Miller (2011) models are consistent, our model is consistent with the model of Miller and Volz (2013) with \(n=2\). Although the model of Miller and Volz (2013) has fewer dimensions than our model, the basic reproduction number of their model requires finding the root of a quartic equation, which greatly complicates the disease threshold analysis. By contrast, our model yields the disease threshold conditions analytically, and these conditions involve the high-order moments of the degree distribution.

Furthermore, although the model of Miller and Volz (2013) assumes the same distribution of within- and between-group partnerships in both populations, which implies the same community structure in the two populations, the total degree distribution is also changed in this process. Consequently, their work ignores the impact of community structure on the epidemic dynamics. In Koch et al. (2013), the random edge removal alters not only the community structure, but also the total degree distribution. Therefore, they cannot discern whether their dynamical behaviors are altered by changing the total degree distribution or by changing the community structure. In this paper, we investigate the effect of community structure on the propagation of an SIR epidemic in a network with a specific degree distribution. The total degree distribution is fixed, and the community structure is altered by adjusting the number ratio of the internal and external edges. Two main conclusions emerged from the study. First, by strengthening the community structure (i.e., fixing the total degree distribution while reducing the number ratio of the external edges), we can increase or decrease the final cumulative epidemic incidence depending on the transmissibility of the virus between humans and the community structure of the network at that point. Second, disease transmission is most obviously affected by the community structure when the human-to-human transmissibility of the virus is near the threshold.

The remainder of this paper is organized as follows. Section 2 introduces the definitions and measurements of community structures and the generation of complex networks with community structure. Section 3 develops our low-dimensional model of SIR-epidemic percolation in a two-community network, and obtains the epidemic threshold beyond which the disease breaks out in both communities. The main conclusions of the study are demonstrated in an example, and the accuracy of the model and theorems is confirmed in a simulation study. In Sect. 4, the effects of community structure on disease spread are explored in additional simulations. Section 5 summarizes our present findings and attempts to explain why community structure can affect disease dynamics in a complicated way.

2 Complex networks with community structure

2.1 Definitions and measurement of the community structure

Consider a closed population of N individuals. The population is described as a simple network \(G = (V,E)\), where \(V=\{v_1,v_2,\cdots ,v_N\}\) is the set of nodes representing the individuals and E is the set of edges representing the connections between the individuals. The network is simple, meaning that at most one edge connects two individuals. We also assume that contacts are symmetric, that is, if an edge \((v_1, v_2)\in E\) connects \(v_1\) to \(v_2\), then an edge also connects \(v_2\) to \(v_1\). Although the network is undirected (i.e. any two neighboring vertices can infect each other), we wish to keep track of who infects whom. Therefore, for each edge \((v_1, v_2)\in E\), we define two arcs as the ordered pairs \((v_1, v_2)\) and \((v_2, v_1)\). The first and second elements in the ordered pair \((v_1, v_2)\) are frequently called the ego and the alter, respectively (Volz 2008).

The N network nodes are divided into two groups, one with \(N_1\) nodes, the other with \(N_2\) nodes, where \(N_1+N_2=N\). We call the first and second groups the first and second community, respectively. Let \(P_l(k,j)\) (\(l=1,2\)) be the probability that a node in the lth community has k intra-community neighbors and j inter-community neighbors. Then the probability generating functions (PGFs) of the joint degree distribution of the first and second community are given by

$$\begin{aligned} \sum \limits _{k,j}P_1(k,j)x^k_1y^j_1 \ \ \ \ \ \text{ and } \ \ \ \ \ \sum \limits _{k,j}P_2(k,j)x^k_2y^j_2 \end{aligned}$$

respectively, where \(\sum \nolimits _{k,j}P_l(k,j)=1\), \(l=1,2\). For convenience, let

$$\begin{aligned} G(x_1,y_1,x_2,y_2)=\frac{N_1}{N} \sum \limits _{k,j}P_1(k,j)x^k_1y^j_1 +\frac{N_2}{N} \sum \limits _{k,j}P_2(k,j)x^k_2y^j_2. \end{aligned}$$
(1)

As the two communities share common external edges, the balance condition is given by

$$\begin{aligned} G^1(1,1)=G^2(1,1). \end{aligned}$$
(2)

We denote by square brackets \([\cdot ]^{in}_l\) and \([\cdot ]^{out}_l\) the expected numbers of any particular internal and external contacts with the lth community, respectively, and use the following notations:

$$\begin{aligned} G_l(x_l,y_l)=\frac{\partial G(x_1,y_1,x_2,y_2)}{\partial x_l},G^l(x_l,y_l)=\frac{\partial G(x_1,y_1,x_2,y_2)}{\partial y_l} \end{aligned}$$

and

$$\begin{aligned} G_l^l(x_l,y_l)= & {} \frac{\partial ^2G(x_1,y_1,x_2,y_2)}{\partial x_l\partial y_l},\\ G_{ll}(x_l,y_l)= & {} \frac{\partial ^2G(x_1,y_1,x_2,y_2)}{\partial x_l^2},\\ G^{ll}(x_l,y_l)= & {} \frac{\partial ^2 G(x_1,y_1,x_2,y_2)}{\partial y_l^2}. \end{aligned}$$

For convenience, we apply the following notations

$$\begin{aligned} G_l:= & {} G_l(1,1), G^l:=G^l(1,1);\nonumber \\ G_l^l:= & {} G_l^l(1,1), G_{ll}:=G_{ll}(1,1), G^{ll}:=G^{ll}(1,1), \end{aligned}$$
(3)

for \(l=1,2\). Thus, the expected numbers of internal and external degrees of the lth community are

$$\begin{aligned}{}[k]^{in}_l=\frac{N}{N_l}G_l \text{ and } [j]^{out}_l=\frac{N}{N_l}G^l, \end{aligned}$$
(4)

respectively. The internal, external and mixed second moments of the lth community are respectively given by

$$\begin{aligned}{}[k^2]_l^{in}:=\frac{N}{N_l}G_{ll}, [j^2]_l^{out}:=\frac{N}{N_l}G^{ll} \text{ and } [kj]_l^{mix}:=\frac{N}{N_l}G_l^l. \end{aligned}$$
(5)

Furthermore, the average internal and external redundancies along an internal edge in the lth community are

$$\begin{aligned} \frac{[k^2]_l^{in}}{[k]^{in}_l}=\frac{G_{ll}}{G_l} \ \ \ \text{ and } \ \ \ \ \frac{[kj]_l^{mix}}{[k]^{in}_l}=\frac{G_{l}^l}{G_l}\end{aligned}$$
(6)

respectively. Similarly, the average internal and external redundancies along an external edge in the lth community are

$$\begin{aligned} \frac{[kj]_l^{mix}}{[j]^{out}_l}=\frac{G_{l}^l}{G^l} \ \ \ \text{ and } \ \ \ \ \frac{[j^2]_l^{out}}{[j]^{out}_l}=\frac{G^{ll}}{G^l}, \end{aligned}$$
(7)

respectively. Furthermore, the PGF of the total degree distribution of the whole network is

$$\begin{aligned} g(x)=G(x,x,x,x). \end{aligned}$$

We now introduce the quality function Q proposed by Newman (2003a), which measures the strength of the community structure in a complex network:

$$\begin{aligned} Q=\sum \limits _{i}(e_{ii}-a^2_i), \end{aligned}$$
(8)

where \(e_{ij}\) is the fraction of links in the network that connect nodes in communities i and j, and \(a_i=\sum \nolimits _{j}e_{ij}\) represents the fraction of edges connected to vertices in community i. Q is close to zero in a random network, and approaches one in a network with a strong community structure. In real-world networks, Q typically falls within the range 0.3–0.7 (Newman 2003a). Higher values are rare. As an example, Fig. 1 shows a two-community network with \(Q=0.314\).

From Eqs. (3), (4) and (8), we have

$$\begin{aligned} e_{ii}=\frac{G_{i}}{G_1+G^1+G_2+G^2}, a_i=\frac{G_{i}+G^i}{G_1+G^1+G_2+G^2}. \end{aligned}$$
(9)

Substituting equation (9) into (8), we obtain

$$\begin{aligned} Q=\frac{2G_1G_2-2{(G^1)}^2}{(G_1+2G^1+G_2)^2}. \end{aligned}$$
(10)

Thus the strength Q of the community structure in a complex network depends on the parameters \(G_l,G^l\), \(l=1,2\).
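As a quick numerical check of Eqs. (8)–(10), the following minimal Python sketch evaluates Q both from the definition and from the closed form, using the parameters quoted in the caption of Fig. 1. The function and variable names are illustrative and not part of the original work; \(G_l\) and \(G^l\) are obtained from the average degrees through Eq. (4).

```python
def modularity_from_definition(g_in1, g_in2, g_out):
    """Q from Eqs. (8)-(9); g_out is the common value G^1 = G^2 (Eq. (2))."""
    total = g_in1 + g_in2 + 2.0 * g_out
    q = 0.0
    for g_in in (g_in1, g_in2):
        e_ii = g_in / total            # fraction of links inside community i
        a_i = (g_in + g_out) / total   # fraction of edge ends attached to community i
        q += e_ii - a_i ** 2
    return q


def modularity_closed_form(g_in1, g_in2, g_out):
    """Q from Eq. (10)."""
    return (2.0 * g_in1 * g_in2 - 2.0 * g_out ** 2) / (g_in1 + 2.0 * g_out + g_in2) ** 2


# Parameters of Fig. 1: N_1 = N_2 = 100, [k]^in_1 = 4, [k]^in_2 = 5, [j]^out_l = 1.
n1, n2 = 100, 100
n = n1 + n2
g_in1 = n1 / n * 4.0   # G_1 = (N_1 / N) [k]^in_1, from Eq. (4)
g_in2 = n2 / n * 5.0   # G_2
g_out = n1 / n * 1.0   # G^1 (= G^2 by the balance condition)

q_def = modularity_from_definition(g_in1, g_in2, g_out)
q_closed = modularity_closed_form(g_in1, g_in2, g_out)
print(round(q_def, 3), round(q_closed, 3))  # both print 0.314
```

Both routes give the value \(Q=0.314\) reported for Fig. 1.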

Fig. 1

A random network with two communities. Community A consists of 100 nodes (black circles) with an average intra-community degree \([k]^{in}_1=4\). Community B consists of 100 nodes (red circles) with \([k]^{in}_2=5\). The average inter-community degrees of Communities A and B are \( [j]^{out}_1=1\) and \( [j]^{out}_2=1\), respectively (color figure online)

2.2 Generation of complex networks with community structure

Stochastic simulations were performed on a two-community contacting network with a specified joint degree distribution. The network was generated by the Configuration Model (Molloy and Reed 1995). This process, a variation of the CM model proposed by Volz (2008), is briefly described below:

1. Each node is assigned to a single community. The nodes are apportioned according to the sizes of the two communities.

2. To each node \(v_l\) in community l (\(l=1,2\)), assign a joint degree \((\delta (v_l), \zeta (v_l))\) from the joint degree distribution \(P_l(k,j)\).

3. For all nodes in each community \(l=1,2\), generate new sets \(X_l\) of “half-edges” with \(\delta (v_l)\) copies of node \(v_l\) and sets \(X^l\) of “half-edges” with \(\zeta (v_l)\) copies of node \(v_l\).

4. While \(X_l\) is not empty, randomly and uniformly draw two elements \(v(l), w(l)\in X_l\) and create an edge (v(l), w(l)) (an internal connection in community l).

5. While both \(X^1\) and \(X^2\) are non-empty, randomly and uniformly draw two elements v(1) and v(2) from \(X^1\) and \(X^2\) respectively and create an edge (v(1), v(2)) (an external connection between the two communities).

Note that this procedure does not allow multiple edges between the same pair of nodes or loops from a node to itself.
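The generation procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the simulations in this paper. The function name and its arguments are hypothetical, and the joint degrees are passed in rather than sampled from \(P_l(k,j)\). Shuffling the half-edges and pairing consecutive entries stands in for the repeated uniform draws. Self-loops and repeated edges are simply discarded, which is one possible way to enforce the exclusion noted above, and unmatched half-edges are dropped.

```python
import random


def generate_two_community_network(degrees1, degrees2, seed=None):
    """Return a set of undirected edges (a, b) with a < b.

    degrees_l is a list of (internal_degree, external_degree) pairs, one per
    node of community l. Community 1 uses node ids 0..N1-1 and community 2
    uses ids N1..N1+N2-1 (an illustrative labelling).
    """
    rng = random.Random(seed)
    offsets = (0, len(degrees1))
    edges = set()              # a set silently drops repeated edges (assumption)
    external_stubs = []
    for offset, degrees in zip(offsets, (degrees1, degrees2)):
        internal, external = [], []
        for idx, (k, j) in enumerate(degrees):
            node = offset + idx
            internal.extend([node] * k)   # half-edges of the set X_l
            external.extend([node] * j)   # half-edges of the set X^l
        # A uniform shuffle followed by pairing consecutive stubs is used here
        # in place of drawing two stubs at a time.
        rng.shuffle(internal)
        for a, b in zip(internal[0::2], internal[1::2]):
            if a != b:                    # discard self-loops (assumption)
                edges.add((min(a, b), max(a, b)))
        rng.shuffle(external)
        external_stubs.append(external)
    # Pair external stubs until one of the two lists is exhausted.
    for a, b in zip(external_stubs[0], external_stubs[1]):
        edges.add((a, b))                 # a < b because community 1 ids are smaller
    return edges


# Example: 10 nodes per community, each with 2 internal and 1 external half-edge.
edges = generate_two_community_network([(2, 1)] * 10, [(2, 1)] * 10, seed=1)
print(len(edges))
```

Because discarded pairs are not redrawn, the realized degrees can fall slightly below the assigned ones; for large sparse networks this effect is expected to be small.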

3 SIR models in random networks

This section introduces a low-dimensional system that models the percolation of an SIR-epidemic in a two-community network with arbitrary joint degrees. It also obtains the disease thresholds.

3.1 Our model

When a disease spreads through a network, the nodes can be in any of three exclusive states: susceptible (S), infectious (I), and recovered (R). The dynamics are as follows. An infectious node will independently infect each of its neighbors at a constant rate \(\gamma \). Infectious nodes recover at a constant rate \(\mu \), whereupon they no longer infect any neighbor. These processes will be formulated in the next section. Following the notations in the literature (Volz 2008), we let s, i and r represent the fractions of susceptible, infectious and recovered nodes, respectively.

More generally, we denote the state of a node by A or B, where A and B may each be S, I or R. For ease of reference, we summarize this notation in Table 1.

Table 1 Symbols

The model is formulated as follows:

$$\begin{aligned} \left\{ \begin{array}{lll} \frac{d\theta _{ll}}{dt}&{}=&{}-\gamma \theta _{ll}p^{I_l}_{S_l},\\ \frac{d\theta _{ln}}{dt}&{}=&{}-\gamma \theta _{ln}p^{I_n}_{S_l}, \\ \frac{dp^{I_l}_{S_l}}{dt}&{}=&{}\frac{\gamma \theta _{ll}p^{I_l}_{S_l}p^{S_l}_{S_l}G_{ll}{(\theta _{ll},\theta _{ln})}}{G_l{(\theta _{ll},\theta _{ln})}} +\frac{\gamma \theta _{ln}p^{I_n}_{S_l}p^{S_l}_{S_l}G_l^l{(\theta _{ll},\theta _{ln})}}{G_l{(\theta _{ll},\theta _{ln})}}-\gamma p^{I_l}_{S_l}(1-p^{I_l}_{S_l})-\mu p^{I_l}_{S_l},\\ \frac{dp^{I_n}_{S_l}}{dt}&{}=&{}\frac{\gamma (\theta _{nl})^2p^{I_l}_{S_n}p^{S_l}_{S_n}G^{nn}{(\theta _{nn},\theta _{nl})}}{\theta _{ln}G^l{(\theta _{ll},\theta _{ln})}} +\frac{\gamma \theta _{nn}\theta _{nl}p^{I_n}_{S_n}p^{S_l}_{S_n}G_n^n{(\theta _{nn},\theta _{nl})}}{\theta _{ln}G^l{(\theta _{ll},\theta _{ln})}} -\gamma p^{I_n}_{S_l}(1-p^{I_n}_{S_l})\\ &{}&{}-\mu p^{I_n}_{S_l},\\ \frac{dp^{S_l}_{S_l}}{dt}&{}=&{}\frac{-\gamma p^{I_l}_{S_l}p^{S_l}_{S_l}\theta _{ll}G_{ll}{(\theta _{ll},\theta _{ln})}}{G_l{(\theta _{ll},\theta _{ln})}}-\frac{\gamma p^{I_n}_{S_l}p^{S_l}_{S_l}\theta _{ln}G_l^l{(\theta _{ll},\theta _{ln})}}{G_l{(\theta _{ll},\theta _{ln})}}+\gamma p^{S_l}_{S_l}p^{I_l}_{S_l},\\ \frac{dp^{S_n}_{S_l}}{dt}&{}=&{}\frac{-\gamma p^{I_n}_{S_n}p^{S_l}_{S_n}\theta _{nn}\theta _{nl}G_n^n{(\theta _{nn},\theta _{nl})}}{G^l{(\theta _{ll},\theta _{ln})}\theta _{ln}} -\frac{\gamma p^{I_l}_{S_n}p^{S_l}_{S_n}(\theta _{nl})^2G^{nn}{(\theta _{nn},\theta _{nl})}}{G^l{(\theta _{ll},\theta _{ln})\theta _{ln}}}+\gamma p^{S_n}_{S_l}p^{I_n}_{S_l}, \end{array}\right. \quad \end{aligned}$$
(11)

where \(n\ne l\) and \(l,n=1,2\). This model extends the network SIR model of Volz (2008) to disease spreading in a complex two-community network with an arbitrary joint degree distribution. The model is derived in detail in “Appendix A”.

The fraction of nodes that remain susceptible at time t in community l (\(l=1, 2\)) is given by

$$\begin{aligned} \left\{ \begin{array}{ll} s_1=\sum \limits _{k,j}P_1(k,j)\theta _{11}^k\theta _{12}^j=\frac{N}{N_1}G(\theta _{11}, \theta _{12}, 0, 0), \ \ \ \ \, \ &{}\ \ \\ s_2=\sum \limits _{k,j}P_2(k,j)\theta _{22}^k\theta _{21}^j=\frac{N}{N_2}G(0, 0, \theta _{22}, \theta _{21}), \ \ \ \ \, \ &{} \ \ \end{array}\right. \end{aligned}$$
(12)

and the fraction of nodes that remain susceptible in the whole network at time t is

$$\begin{aligned} s=\sum \limits _{l,n=1}^2\sum \limits _{n\ne l}\sum \limits _{k,j}\frac{N_l}{N}P_l(k,j)\theta _{ll}^k\theta _{ln}^j=G(\theta _{11},\theta _{12},\theta _{22},\theta _{21}). \end{aligned}$$
(13)
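To make Eqs. (12) and (13) concrete, the short Python sketch below evaluates the susceptible fractions directly from the sums over the joint degree distribution. The joint distributions, community sizes and \(\theta \) values are made-up illustrative numbers (chosen so that the external edge counts of the two communities balance), and the function name is hypothetical.

```python
def susceptible_fraction(pmf, theta_in, theta_out):
    """Sum of P_l(k, j) * theta_ll^k * theta_ln^j, as in Eq. (12).

    pmf maps a joint degree (k, j) to its probability P_l(k, j).
    """
    return sum(p * theta_in ** k * theta_out ** j for (k, j), p in pmf.items())


# Illustrative joint degree distributions (assumed, not taken from the paper).
n1, n2 = 200, 100
pmf1 = {(2, 1): 0.5, (4, 0): 0.5}   # mean external degree 0.5
pmf2 = {(3, 1): 1.0}                # mean external degree 1.0, so N1*0.5 = N2*1.0

s1 = susceptible_fraction(pmf1, theta_in=0.9, theta_out=0.8)     # theta_11, theta_12
s2 = susceptible_fraction(pmf2, theta_in=0.95, theta_out=0.85)   # theta_22, theta_21
s = (n1 * s1 + n2 * s2) / (n1 + n2)  # size-weighted sum, as in Eq. (13)
print(s1, s2, s)
```

At \(\theta _{ll}=\theta _{ln}=1\) every sum reduces to the normalization of \(P_l(k,j)\), so the whole population is susceptible, as expected.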

To consider the number of infected individuals, we need the dynamics of the infectious nodes. The infectious class increases at rate \(-\dot{s}\) and decreases at rate \(\mu i\). Therefore, we have

$$\begin{aligned} \frac{di}{dt}= & {} \gamma p^{I_1}_{S_1}\theta _{11}G_1{(\theta _{11},\theta _{12})}+\gamma p^{I_2}_{S_1}\theta _{12}G^1{(\theta _{11},\theta _{12})} +\gamma p^{I_2}_{S_2}\theta _{22}G_2{(\theta _{22},\theta _{21})}\nonumber \\&+\,\gamma p^{I_1}_{S_2}\theta _{21}G^2{(\theta _{22},\theta _{21})}-\mu i. \end{aligned}$$
(14)

3.2 Initial conditions

We randomly and uniformly select small but strictly positive fractions \(\epsilon _1\) and \(\epsilon _2\) of the nodes in Community 1 and Community 2, respectively, as initially infected nodes. Then we obtain the following condition.

$$\begin{aligned} M^{in}_{I_l}= & {} M^{out}_{I_l}=\epsilon _l, M^{in}_{S_l}=M^{out}_{S_l}=1-\epsilon _l, \nonumber \\ M^{I_l}_{S_l}\approx & {} M^{in}_{I_l}=\epsilon _l,M^{S_l}_{S_l}=M^{in}_{S_l}-M^{I_l}_{S_l}=1-2\epsilon _l,\nonumber \\ M^{I_n}_{S_l}\approx & {} M^{out}_{I_n}=\epsilon _n, M^{S_n}_{S_l}=M^{out}_{S_l}-M^{I_n}_{S_l}\approx 1-\epsilon _1-\epsilon _2, l\ne n, \end{aligned}$$
(15)

where \(l, n=1,2\). To summarize,

$$\begin{aligned} \begin{array}{lll} 1.\ \ \ \ \ \ \theta _{ll}(t=0)\approx M^{in}_{S_l}=1-\epsilon _l, \theta _{ln}(t=0)\approx M^{out}_{S_l}=1-\epsilon _l,\\ 2.\ \ \ \ \ \ p^{I_l}_{S_l}(t=0)=\frac{M^{I_l}_{S_l}}{M^{in}_{S_l}}\approx \frac{\epsilon _l}{1-\epsilon _l}, p^{I_n}_{S_l}(t=0)=\frac{M^{I_n}_{S_l}}{M^{out}_{S_l}}\approx \frac{\epsilon _n}{1-\epsilon _l},\\ 3.\ \ \ \ \ \ p^{S_l}_{S_l}(t=0)=\frac{M^{S_l}_{S_l}}{{M^{in}_{S_l}}}\approx \frac{1-2\epsilon _l}{1-\epsilon _l}, p^{S_n}_{S_l}(t=0)=\frac{M^{S_n}_{S_l}}{{M^{out}_{S_l}}}\approx \frac{1-\epsilon _1-\epsilon _2}{1-\epsilon _l}, \end{array}\end{aligned}$$
(16)

where \(n\ne l\) and \(l,n=1,2\).
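The initial values in Eq. (16) can be collected in a small helper, sketched below in Python. The function name and the dictionary keys are illustrative choices, not notation from the paper.

```python
def initial_conditions(eps1, eps2):
    """Initial values of the model variables following Eq. (16).

    Returns {l: {...}} for l = 1, 2, where n denotes the other community.
    """
    eps = {1: eps1, 2: eps2}
    state = {}
    for l, n in ((1, 2), (2, 1)):
        m_s = 1.0 - eps[l]   # M^{in}_{S_l} = M^{out}_{S_l}
        state[l] = {
            "theta_ll": m_s,
            "theta_ln": m_s,
            "p_Il_Sl": eps[l] / m_s,
            "p_In_Sl": eps[n] / m_s,
            "p_Sl_Sl": (1.0 - 2.0 * eps[l]) / m_s,
            "p_Sn_Sl": (1.0 - eps1 - eps2) / m_s,
        }
    return state


init = initial_conditions(1e-3, 2e-3)
print(init[1]["p_Il_Sl"], init[1]["p_Sl_Sl"])
```

Since no node has recovered at \(t=0\), the values satisfy \(p^{I_l}_{S_l}+p^{S_l}_{S_l}=1\) and \(p^{I_n}_{S_l}+p^{S_n}_{S_l}=1\), which provides a simple consistency check.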

3.3 Epidemic thresholds

As noted in the literature (Volz 2008), the number of new infections in a small time interval is proportional to \(p_S^I\), the probability that a susceptible node is connected to an infectious node. If \(\dot{p}_S^I(t=0)<0\), an epidemic will necessarily die out without reaching a fraction of the population.

According to the total probability formula, we obtain

$$\begin{aligned} p_S^I=\frac{s_1N_1 p_{S_1}^I}{s_1N_1+s_2N_2}+\frac{s_2N_2 p_{S_2}^I}{s_2N_2+s_1N_1}. \end{aligned}$$
(17)

So

$$\begin{aligned} \dot{p}_S^I= & {} \frac{s_1N_1\dot{p}_{S_1}^I+s_2N_2\dot{p}_{S_2}^I}{s_1N_1+s_2N_2}+\frac{(\dot{s_1}N_1 p_{S_1}^I +\dot{s_2}N_2 p_{S_2}^I) }{{(s_1N_1+s_2N_2)}}\nonumber \\&-\frac{(s_1N_1 p_{S_1}^I+s_2N_2 p_{S_2}^I)(\dot{s_1}N_1+\dot{s_2}N_2)}{{(s_1N_1+s_2N_2)^2}}. \end{aligned}$$
(18)

Assuming independent transmission from infectious nodes to their common susceptible neighbors, we have

$$\begin{aligned} p_{S_l}^I=p^{I_1}_{S_l}+p^{I_2}_{S_l}-{p^{I_1}_{S_l}}{p^{I_2}_{S_l}}, \ \ \ \ l=1,2. \end{aligned}$$
(19)

According to Eq. (11) and the initial conditions (16), we obtain

$$\begin{aligned} \dot{p}^{I_1}_{S_1}(t=0)= & {} \gamma \left( \frac{1-2\epsilon _1}{1-\epsilon _1}\frac{G_{11}{(1-\epsilon _1,1-\epsilon _1)}}{G_1{(1-\epsilon _1,1-\epsilon _1)}} - \frac{1-2\epsilon _1}{(1-\epsilon _1)^2}\right) \epsilon _1\nonumber \\&+\,\gamma \frac{G_1^1{(1-\epsilon _1,1-\epsilon _1)(1-2\epsilon _1)}}{G_1{(1-\epsilon _1,1-\epsilon _1)}(1-\epsilon _1)}\epsilon _2-\mu \frac{\epsilon _1}{1-\epsilon _1}. \end{aligned}$$
(20)

Noting that \(\epsilon _1,\epsilon _2\ll 1\) we can ignore the higher-order terms of \(\epsilon _1\) and \(\epsilon _2\) to obtain

$$\begin{aligned} \dot{p}^{I_1}_{S_1}(t=0)\approx \frac{\gamma }{G_1}(G_{11}\epsilon _1+G_1^1\epsilon _2)-(\gamma +\mu ) \epsilon _1. \end{aligned}$$
(21)

Similarly, we can obtain

$$\begin{aligned} \dot{p}^{I_2}_{S_1}(t=0)\approx & {} \frac{\gamma }{G^1}(G^{22}\epsilon _1+G_2^2\epsilon _2)-(\gamma +\mu )\epsilon _2,\nonumber \\ \dot{p}^{I_1}_{S_2}(t=0)\approx & {} \frac{\gamma }{ G^2}(G^{11}\epsilon _2+G_1^1\epsilon _1)-(\gamma +\mu )\epsilon _1,\nonumber \\ \dot{p}^{I_2}_{S_2}(t=0)\approx & {} \frac{\gamma }{G_2}(G_2^2\epsilon _1+G_{22}\epsilon _2)-(\gamma +\mu ) \epsilon _2. \end{aligned}$$
(22)

Applying Eqs. (21) and (22) to (19), we have

$$\begin{aligned} \dot{p}_{S_l}^I(t=0)\approx & {} \left( \gamma \frac{G_{ll}}{G_l} +\gamma \frac{G^{nn}}{G^l}-\gamma -\mu \right) \epsilon _l +\left( \gamma \frac{G_l^l}{G_l}+\gamma \frac{G_n^n}{G^l}-\gamma -\mu \right) \epsilon _n\nonumber \\&-\left( \gamma \frac{G_{ll}}{G_l}+\gamma \frac{G_n^n}{G^l}-2\gamma -2\mu \right) \epsilon _l\epsilon _n-\gamma \frac{G^{nn}}{G^l}({\epsilon }_l)^2 -\gamma \frac{G_l^l}{G_l}({\epsilon }_n)^2,\nonumber \\ \end{aligned}$$
(23)

where \(l\ne n\) and \(l,n=1,2\). Ignoring the higher-order terms of \(\epsilon _1\) and \(\epsilon _2\), we have

$$\begin{aligned} \dot{p}_{S_l}^I(t=0)\approx & {} \left( \gamma \frac{G_{ll}}{G_l}+\gamma \frac{G^{nn}}{G^l}-\gamma -\mu \right) \epsilon _l +\left( \gamma \frac{G_l^l}{G_l}+\gamma \frac{G_n^n}{G^l}-\gamma -\mu \right) \epsilon _n,\nonumber \\ \end{aligned}$$
(24)

where \(l\ne n\) and \(l,n=1,2\).

Now, applying Eqs. (2), (11), (12) and (18)–(24) and noting that \(\epsilon _1,\epsilon _2\ll 1\), we have

$$\begin{aligned} \dot{p}_S^I(t=0)\approx & {} \left[ \frac{\gamma N_1}{N_1+N_2}\left( \frac{G_{11}}{G_1}+\frac{G^{22}}{G^1}-1\right) +\frac{\gamma N_2}{N_1+N_2}\left( \frac{G_2^2}{G_2}+\frac{G_1^1}{G^2}-1\right) -\mu \right] \epsilon _1\nonumber \\&+\left[ \frac{\gamma N_1}{N_1+N_2}\left( \frac{G_1^1}{G_1}+\frac{G_2^2}{G^1}-1\right) +\frac{\gamma N_2}{N_1+N_2}\left( \frac{G_{22}}{G_2}+\frac{G^{11}}{G^2}-1\right) -\mu \right] \epsilon _2.\nonumber \\ \end{aligned}$$
(25)

After rearranging this expression, we obtain two critical values of the ratio \(\gamma /\mu \) in terms of the PGF:

$$\begin{aligned} (\gamma /\mu )^*_1= & {} \frac{1}{\frac{N_1}{N_1+N_2}\left( \frac{G_{11}}{G_1}+\frac{G^{22}}{G^2}\right) +\frac{N_2}{N_1+N_2}\left( \frac{G_2^2}{G_2}+\frac{G_1^1}{G^1}\right) -1},\nonumber \\ (\gamma /\mu )^*_2= & {} \frac{1}{\frac{N_1}{N_1+N_2}\left( \frac{G_1^1}{G_1}+\frac{G_2^2}{G^2}\right) +\frac{N_2}{N_1+N_2}\left( \frac{G_{22}}{G_2}+\frac{G^{11}}{G^1}\right) -1}. \end{aligned}$$
(26)

If \((\gamma /\mu )^*_1>0\) and \((\gamma /\mu )^*_2>0\), we obtain the following theorem, which is the main result of this section.

Theorem 1

For the SIR model described in (11), the following are true.

  (I) If \( \frac{\gamma }{\mu }> \max \{(\frac{\gamma }{\mu })^*_1,(\frac{\gamma }{\mu })^*_2\}\), the epidemic occurs in both communities;

  (II) If \( \frac{\gamma }{\mu }\le \min \{(\frac{\gamma }{\mu })^*_1,(\frac{\gamma }{\mu })^*_2\}\), the epidemic dies out in both communities;

  (III) If \(\min \{(\frac{\gamma }{\mu })^*_1,(\frac{\gamma }{\mu })^*_2\}<\frac{\gamma }{\mu } \le \max \{(\frac{\gamma }{\mu })^*_1,(\frac{\gamma }{\mu })^*_2\}\), the epidemic may occur in both communities, die out in both communities or occur in one community and die out in the other community.

From (6) and (7), we know that \(\frac{G_{ll}}{G_l}\) and \(\frac{G^{nn}}{G^n} (n\ne l)\) are the average numbers of transmissible neighbors in Community l of a node in communities l and n, respectively, which was newly infected by a node in Community l. Meanwhile, \(\frac{G_l^l}{G^l} \) and \(\frac{G_n^n}{G_n}\) \((n\ne l)\) are the average numbers of transmissible neighbors in Community l of a node in communities n and l, respectively, which was newly infected by a node in Community n. Therefore, the average number of transmissible neighbors in Community l is \(\frac{N_l}{N_l+N_n}(\frac{G_{ll}}{G_l}+\frac{G^{nn}}{G^n})+\frac{N_n}{N_l+N_n}(\frac{G_n^n}{G_n}+\frac{G_l^l}{G^l})\). The form of the thresholds \((\gamma /\mu )^*_1\) and \((\gamma /\mu )^*_2\) is then consistent with the critical transmissibility of the SIR epidemic model in a network with no community structure obtained by Newman (2002), i.e.

$$\begin{aligned} T_c=\frac{1}{\frac{[k^2]}{[k]}-1}. \end{aligned}$$
(27)

When \(\gamma /\mu >(\gamma /\mu )^*_l\), the disease expands in Community l, and when \(\gamma /\mu <(\gamma /\mu )^*_l\), the disease shrinks in Community l. Hence, \(\gamma /\mu >\max \{(\gamma /\mu )^*_1, (\gamma /\mu )^*_2\}\) means that the disease expands in both communities, and the epidemic spreads through the whole network. Conversely, \(\gamma /\mu <\min \{(\gamma /\mu )^*_1, (\gamma /\mu )^*_2\}\) means that the disease shrinks in both communities, and the epidemic dies out throughout the whole network. However, when \(\min \{(\gamma /\mu )^*_1, (\gamma /\mu )^*_2\}<\gamma /\mu <\max \{(\gamma /\mu )^*_1, (\gamma /\mu )^*_2\}\), the disease expands in one community and shrinks in the other. The disease can shift between the two communities, sometimes expanding and sometimes shrinking, so its final trend is difficult to predict.
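The thresholds in Eq. (26) depend only on the PGF derivatives at (1, 1), so they are straightforward to evaluate numerically. The Python sketch below is a minimal illustration; the function names and the dictionary layout are assumptions made for the example. The test input uses independent Poisson internal and external degrees, with the mean degrees taken from the caption of Fig. 1 purely as sample values.

```python
def critical_ratios(n1, n2, moments):
    """Return ((gamma/mu)*_1, (gamma/mu)*_2) following Eq. (26).

    moments[l] holds the derivatives of G for community l at (1, 1):
    "in" = G_l, "out" = G^l, "in_in" = G_ll, "out_out" = G^ll, "mix" = G_l^l.
    """
    weight = {1: n1 / (n1 + n2), 2: n2 / (n1 + n2)}
    ratios = []
    for l, n in ((1, 2), (2, 1)):
        m_l, m_n = moments[l], moments[n]
        denom = (
            weight[l] * (m_l["in_in"] / m_l["in"] + m_n["out_out"] / m_n["out"])
            + weight[n] * (m_n["mix"] / m_n["in"] + m_l["mix"] / m_l["out"])
            - 1.0
        )
        if denom <= 0.0:
            # Eq. (26) is only used when both thresholds are positive.
            raise ValueError("no positive threshold for community %d" % l)
        ratios.append(1.0 / denom)
    return tuple(ratios)


def poisson_moments(weight, lam_in, lam_out):
    """Derivatives at (1, 1) for independent Poisson degrees (illustrative input)."""
    return {
        "in": weight * lam_in,
        "out": weight * lam_out,
        "in_in": weight * lam_in ** 2,
        "out_out": weight * lam_out ** 2,
        "mix": weight * lam_in * lam_out,
    }


n1, n2 = 100, 100
moments = {
    1: poisson_moments(n1 / (n1 + n2), lam_in=4.0, lam_out=1.0),
    2: poisson_moments(n2 / (n1 + n2), lam_in=5.0, lam_out=1.0),
}
t1, t2 = critical_ratios(n1, n2, moments)
print(t1, t2)  # approximately 0.25 and 0.2
```

For these inputs, Theorem 1 predicts an epidemic in both communities when \(\gamma /\mu \) exceeds 0.25 and extinction in both when \(\gamma /\mu \) is at most 0.2.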

The disease thresholds involve both the first-order and second-order moments of the network. Therefore, changing the strength Q of the community structure in a complex network changes the first-order moments \(G_l\) and \(G^l\) and also the second-order moments \(G_{ll}\), \(G^{ll}\) and \(G_l^l\), \((l=1,2)\). For a general G, the impact of Q on the spread of disease is not easily determined. In the following section, we illustrate the impact of community structure on disease spread using a specific distribution function.

3.4 An example

In this subsection, we demonstrate Theorem 1 by an example. The joint degree distribution of community l is assumed to be the following Poisson distribution:

$$\begin{aligned} P_l(k,j)=\frac{\lambda _{ll}^k\lambda _{ln}^je^{-(\lambda _{ll}+\lambda _{ln})}}{k!j!}, l,n=1,2,l\ne n, \end{aligned}$$
(28)

where \(\lambda _{ln}\) (\(l,n=1,2\)) are positive numbers, and the expected external degrees \(\lambda _{12}\) and \(\lambda _{21}\) satisfy the balance condition

$$\begin{aligned} N_1\lambda _{12}=N_2\lambda _{21}. \end{aligned}$$
(29)

The corresponding PGF is then given by

$$\begin{aligned} G(x_1,y_1,x_2,y_2)= & {} \frac{N_1}{N_1+N_2}e^{\lambda _{11}(x_1-1)}e^{\lambda _{12}(y_1-1)}\nonumber \\&+\frac{N_2}{N_1+N_2}e^{\lambda _{22}(x_2-1)}e^{\lambda _{21}(y_2-1)}. \end{aligned}$$
(30)

From Eqs. (3) and (30), we have

$$\begin{aligned} G_1=\frac{N_1}{N_1+N_2}\lambda _{11}, G^1=\frac{N_1}{N_1+N_2}\lambda _{12}, G_2=\frac{N_2}{N_1+N_2}\lambda _{22}, G^2=\frac{N_2}{N_1+N_2}\lambda _{21},\nonumber \\ \end{aligned}$$
(31)

and

$$\begin{aligned} G_{11}= & {} \frac{N_1}{N_1+N_2}{\lambda _{11}}^2, G_1^1=\frac{N_1}{N_1+N_2}{\lambda _{11}}\lambda _{12}, G^{11}=\frac{N_1}{N_1+N_2}{\lambda _{12}}^2, \nonumber \\ G_{22}= & {} \frac{N_2}{N_1+N_2}{\lambda _{22}}^2, G_2^2=\frac{N_2}{N_1+N_2}{\lambda _{22}}\lambda _{21}, G^{22}=\frac{N_2}{N_1+N_2}{\lambda _{21}}^2. \end{aligned}$$
(32)

3.4.1 Measurement of the community structure

According to Eqs. (31), (32) and the definition Eq. (10), the strength of the community structure in the above complex network is

$$\begin{aligned} Q=\frac{2N_1N_2\lambda _{11}\lambda _{22}-2N_1^2\lambda _{12}^2}{(N_1\lambda _{11}+2N_1\lambda _{12}+N_2\lambda _{22})^2}. \end{aligned}$$
(33)

We next study the relationship between the external degree and community structure. To this end, we first denote the average total degrees of Community 1 and 2 as \(\lambda _1\) and \(\lambda _2\), respectively. These quantities are defined as follows

$$\begin{aligned} \lambda _1=\lambda _{11}+\lambda _{12}, \lambda _2=\lambda _{21}+\lambda _{22}. \end{aligned}$$
(34)

Given the scales (\(N_1\) and \(N_2\)) and the average total degrees (\(\lambda _1\) and \(\lambda _2\)) of the communities, Q and \(\lambda _{12}\) are related as follows:

$$\begin{aligned} Q(\lambda _{12})=\frac{2N_1[N_2\lambda _1\lambda _2-(N_1\lambda _1+N_2\lambda _2)\lambda _{12}]}{(N_1\lambda _1+N_2\lambda _2)^2}. \end{aligned}$$
(35)

According to this formula, increasing the number of edges between the communities weakens the community structure, consistent with our understanding.
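The equivalence of Eqs. (33) and (35), and the monotone decrease of Q with \(\lambda _{12}\), can be checked numerically. The Python sketch below is illustrative only; the function names and the parameter values (\(N_1=N_2=100\), \(\lambda _1=5\), \(\lambda _2=6\)) are assumptions chosen so that \(\lambda _{12}=1\) reproduces the setting of Fig. 1.

```python
def q_from_lambdas(n1, n2, lam11, lam12, lam22):
    """Q from Eq. (33)."""
    num = 2.0 * n1 * n2 * lam11 * lam22 - 2.0 * n1 ** 2 * lam12 ** 2
    return num / (n1 * lam11 + 2.0 * n1 * lam12 + n2 * lam22) ** 2


def q_from_external_degree(n1, n2, lam1, lam2, lam12):
    """Q as a function of lambda_12 at fixed total degrees, Eq. (35)."""
    total = n1 * lam1 + n2 * lam2
    return 2.0 * n1 * (n2 * lam1 * lam2 - total * lam12) / total ** 2


n1, n2, lam1, lam2 = 100, 100, 5.0, 6.0
q_values = []
for lam12 in (0.5, 1.0, 2.0):
    lam21 = n1 * lam12 / n2    # balance condition, Eq. (29)
    lam11 = lam1 - lam12       # Eq. (34)
    lam22 = lam2 - lam21
    q33 = q_from_lambdas(n1, n2, lam11, lam12, lam22)
    q35 = q_from_external_degree(n1, n2, lam1, lam2, lam12)
    q_values.append((lam12, q33, q35))
    print(lam12, round(q33, 3), round(q35, 3))
```

The two expressions agree for every \(\lambda _{12}\), and Q falls as \(\lambda _{12}\) grows, in line with the statement above.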

3.4.2 Disease thresholds

We now derive the disease thresholds in the presented example.

From Eqs. (31) and (32), we have

$$\begin{aligned} \left( \frac{\gamma }{\mu }\right) _1^*=\frac{1}{\lambda _{11}+\lambda _{21}-1}, \left( \frac{\gamma }{\mu }\right) _2^*=\frac{1}{\lambda _{22}+\lambda _{12}-1}. \end{aligned}$$
(36)

Assuming that

$$\begin{aligned} \lambda _{11}>1, \lambda _{22}>1, \end{aligned}$$
(37)

we have \((\frac{\gamma }{\mu })_1^*>0\) and \((\frac{\gamma }{\mu })_2^*>0\). Theorem 1 then takes the following form.

Theorem 2

For the SIR model in (11), the following are true.

  1. (I)

    If \(\frac{\gamma }{\mu }> \max \{\frac{1}{\lambda _{11}+\lambda _{21}-1},\frac{1}{\lambda _{22}+\lambda _{12}-1}\}\), the epidemic occurs in both communities;

  2. (II)

    If \( \frac{\gamma }{\mu }\le \min \{\frac{1}{\lambda _{11}+\lambda _{21}-1},\frac{1}{\lambda _{22}+\lambda _{12}-1}\}\), the epidemic dies out in both communities;

  3. (III)

    If \(\min \{\frac{1}{\lambda _{11}+\lambda _{21}-1},\frac{1}{\lambda _{22}+\lambda _{12}-1}\}<\frac{\gamma }{\mu } \le \max \{\frac{1}{\lambda _{11}+\lambda _{21}-1},\frac{1}{\lambda _{22}+\lambda _{12}-1}\}\), the epidemic may occur in both communities, die out in both communities or occur in one community and die out in the other.

To study the effect of community structure on the disease threshold, we first study how the associations between communities affect the disease threshold in the network. Because the external degrees \(\lambda _{12}\) and \(\lambda _{21}\) meet the balance condition \(N_1\lambda _{12}=N_2\lambda _{21}\), we need only study the effect of parameter \(\lambda _{12}\). Applying Eqs. (29), (34) and (36), we have

$$\begin{aligned} \left( \frac{\gamma }{\mu }\right) _1^*=\frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}, \left( \frac{\gamma }{\mu }\right) _2^*=\frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}. \end{aligned}$$
(38)
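
The substitution behind Eq. (38) can be written out explicitly. Using Eq. (34) and the balance condition in the form \(\lambda _{21}=\frac{N_1}{N_2}\lambda _{12}\), the denominators of Eq. (36) become

$$\begin{aligned} \lambda _{11}+\lambda _{21}-1&=(\lambda _1-\lambda _{12})+\frac{N_1}{N_2}\lambda _{12}-1=\lambda _1-1+\left( \frac{N_1}{N_2}-1\right) \lambda _{12}, \\ \lambda _{22}+\lambda _{12}-1&=\left( \lambda _2-\frac{N_1}{N_2}\lambda _{12}\right) +\lambda _{12}-1=\lambda _2-1-\left( \frac{N_1}{N_2}-1\right) \lambda _{12}. \end{aligned}$$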

Given the scales and average total degrees of the communities, we now investigate the effect of the external degree \(\lambda _{12}\) on the disease threshold in three cases: \(N_1=N_2\), \(N_1>N_2\) and \(N_1<N_2\).

Case 1: \(N_1=N_2\). Theorem 1 is equivalent to the following theorem.

Theorem 3

In the SIR model described in (11), the following are true.

  1. (I)

    If \(\frac{\gamma }{\mu }> \max \{\frac{1}{\lambda _1-1},\frac{1}{\lambda _2-1}\}\), the epidemic occurs in both communities;

  2. (II)

    If \( \frac{\gamma }{\mu }\le \min \{\frac{1}{\lambda _1-1},\frac{1}{\lambda _2-1}\}\), the epidemic dies out in both communities;

  3. (III)

    If \(\min \{\frac{1}{\lambda _1-1},\frac{1}{\lambda _2-1}\}<\frac{\gamma }{\mu } \le \max \{\frac{1}{\lambda _1-1},\frac{1}{\lambda _2-1}\}\), the epidemic may occur in both communities, die out in both communities or occur in one community and die out in the other.

In this case, neither the external degree \(\lambda _{12}\) nor the community strength Q influences the disease threshold.

Case 2: \(N_1>N_2\). Let

$$\begin{aligned} \begin{array}{ll} \underline{\lambda }_{12}^*=-\frac{\lambda _1-1}{\frac{N_1}{N_2}-1}, &{} {\lambda }_{12}^*=\frac{\lambda _2-\lambda _1}{2(\frac{N_1}{N_2}-1)}, \\ \bar{{\lambda }}_{12}^*=\min \{\lambda _1-1, \frac{N_2}{N_1}(\lambda _2-1)\}, &{} \tilde{\lambda }_{12}^*=\frac{\lambda _2-1}{\frac{N_1}{N_2}-1}. \end{array}\end{aligned}$$
(39)

Clearly, Eq. (37) implies that

$$\begin{aligned} \underline{\lambda }_{12}^*<0, \tilde{\lambda }_{12}^*>\bar{{\lambda }}_{12}^*>0. \end{aligned}$$

From the practical meaning of \(\lambda _{12}\) and Eq. (37), we can deduce that \(0\le \lambda _{12}<\bar{\lambda }_{12}^*\). Otherwise, we have \(\lambda _{12}\ge \bar{\lambda }_{12}^*\), meaning that \(\lambda _{22}\le 1\) or \(\lambda _{11}\le 1\), which contradicts \(\lambda _{22}>1\), \(\lambda _{11}>1\).
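
The quantities in Eq. (39) are straightforward to evaluate. The Python sketch below computes them and the admissible range of \(\lambda _{12}\) for two parameter sets with \(N_1>N_2\), taken from the settings listed for Figs. 8 and 9. The helper name `critical_external_degrees` and the dictionary keys are hypothetical labels chosen for illustration.

```python
def critical_external_degrees(n1, n2, lam1, lam2):
    """Evaluate the four quantities of Eq. (39); requires N1 > N2."""
    if n1 <= n2:
        raise ValueError("Eq. (39) is stated for N1 > N2")
    a = n1 / n2 - 1  # common factor (N1/N2 - 1) > 0
    return {
        "lower": -(lam1 - 1) / a,                          # underline{lambda}_12^*
        "cross": (lam2 - lam1) / (2 * a),                  # lambda_12^*
        "upper": min(lam1 - 1, (n2 / n1) * (lam2 - 1)),    # bar{lambda}_12^*
        "tilde": (lam2 - 1) / a,                           # tilde{lambda}_12^*
    }


# N1 = 3000, N2 = 1000 with (lam1, lam2) = (20, 10) and (10, 20)
case_d = critical_external_degrees(3000, 1000, 20, 10)
case_e = critical_external_degrees(3000, 1000, 10, 20)
for name, vals in (("lam1 > lam2", case_d), ("lam1 < lam2", case_e)):
    print(name, vals)
```

In both parameter sets the ordering \(\underline{\lambda }_{12}^*<0<\bar{\lambda }_{12}^*<\tilde{\lambda }_{12}^*\) holds, and the sign of \(\lambda _{12}^*\) follows the sign of \(\lambda _2-\lambda _1\).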

Fig. 2

Phase diagram of the disease threshold \((\frac{\gamma }{\mu })^*\) versus external degree in Community 1. The blue and purple lines represent \((\frac{\gamma }{\mu })^*_2\) and \((\frac{\gamma }{\mu })^*_1\), respectively. If the parameter values fall in the red or green areas, the epidemic spreads or dies out in both communities, respectively. If the parameter values fall in the yellow area, the trend of the disease is uncertain. The parameters are \(N_1>N_2\), \(\lambda _1>\lambda _2\) in (a) and \(N_1>N_2\), \(\lambda _1<\lambda _2\) in (b) (color figure online)

When \(\lambda _2<\lambda _1\), we have \({\lambda }_{12}^*<0\). Since \(\lambda _{12}\in [0,\bar{\lambda }_{12}^*)\), it follows that \((\frac{\gamma }{\mu })_2^*>(\frac{\gamma }{\mu })_1^*\). Then, from Theorem 1 and Eq. (38), we have the following theorem.

Theorem 4

If \(N_1>N_2\) and \(\lambda _1>\lambda _2\), the following are true.

  1. (I)

    If \(\frac{\gamma }{\mu }> \frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic occurs in both communities;

  2. (II)

    If \( \frac{\gamma }{\mu }\le \frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic dies out in both communities;

  3. (III)

    If \( \frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}<\frac{\gamma }{\mu } \le \frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic may occur in both communities, die out in both communities or occur in one community and die out in the other.

The results of Theorem 4 are demonstrated in Fig. 2a.

When \(\lambda _2>\lambda _1\), we have \({\lambda }_{12}^*>0\).

For \(\lambda _{12}\in [0, \bar{\lambda }_{12}^*)\) and \(\lambda _{12}^*< \bar{\lambda }_{12}^*\), we obtain \((\frac{\gamma }{\mu })_1^*>(\frac{\gamma }{\mu })_2^*\) for \(\lambda _{12}\in [0,\lambda _{12}^*)\), \((\frac{\gamma }{\mu })_1^*=(\frac{\gamma }{\mu })_2^*\) for \(\lambda _{12}=\lambda _{12}^*\), and \((\frac{\gamma }{\mu })_1^*<(\frac{\gamma }{\mu })_2^*\) for \(\lambda _{12}\in (\lambda _{12}^*,\bar{\lambda }_{12}^*)\).

Applying Theorem 1, Eq. (38) and the above facts, we obtain the following theorems.

Theorem 5

If \(N_1>N_2\), \(\lambda _2>\lambda _1\) and \(\lambda _{12}\in [0,\min \{\lambda _{12}^*, \bar{\lambda }_{12}^*\})\), the following are true.

  1. (I)

    If \(\frac{\gamma }{\mu }> \frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic occurs in both communities;

  2. (II)

    If \( \frac{\gamma }{\mu }\le \frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic dies out in both communities;

  3. (III)

    If \(\frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}<\frac{\gamma }{\mu } \le \frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic may occur in both communities, die out in both communities or occur in one community and die out in the other.

Theorem 6

If \(N_1>N_2\), \(\lambda _2>\lambda _1\), \(\lambda _{12}^*< \bar{\lambda }_{12}^*\) and \(\lambda _{12}=\lambda _{12}^*\), the following are true.

  1. (I)

    If \(\frac{\gamma }{\mu }> \frac{2}{\lambda _1+\lambda _2-2}\), the epidemic occurs in both communities;

  2. (II)

    If \( \frac{\gamma }{\mu }\le \frac{2}{\lambda _1+\lambda _2-2}\), the epidemic dies out in both communities.
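
The single threshold in Theorem 6 follows because the two thresholds of Eq. (38) coincide at the crossing point. Substituting \(\lambda _{12}^*\) from Eq. (39) gives

$$\begin{aligned} \left( \frac{N_1}{N_2}-1\right) \lambda _{12}^*=\frac{\lambda _2-\lambda _1}{2}, \quad \lambda _1-1+\frac{\lambda _2-\lambda _1}{2}=\lambda _2-1-\frac{\lambda _2-\lambda _1}{2}=\frac{\lambda _1+\lambda _2-2}{2}, \end{aligned}$$

so that \((\frac{\gamma }{\mu })_1^*=(\frac{\gamma }{\mu })_2^*=\frac{2}{\lambda _1+\lambda _2-2}\), and the intermediate regime (III) of Theorem 2 is empty.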

Theorem 7

If \(N_1>N_2\), \(\lambda _2>\lambda _1\), \(\lambda _{12}^*< \bar{\lambda }_{12}^*\) and \(\lambda _{12}\in (\lambda _{12}^*,\bar{\lambda }_{12}^*)\), the following are true.

  1. (I)

    If \(\frac{\gamma }{\mu }> \frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic occurs in both communities;

  2. (II)

    If \( \frac{\gamma }{\mu }\le \frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic dies out in both communities;

  3. (III)

    If \(\frac{1}{\lambda _1-1+(\frac{N_1}{N_2}-1)\lambda _{12}}<\frac{\gamma }{\mu } \le \frac{1}{\lambda _2-1-(\frac{N_1}{N_2}-1)\lambda _{12}}\), the epidemic may occur in both communities, die out in both communities or occur in one community and die out in the other.

The results of Theorems 5, 6 and 7 are demonstrated in Fig. 2b.

Case 3: \(N_1<N_2\). This case is analyzed similarly to the above cases, so the analysis is omitted.

3.4.3 Simulations

This section confirms the accuracy of our model (11) and the theorems given in Sect. 3.4.2.

To describe the scale of the disease, we adopt the cumulative epidemic incidence, defined as the fraction of infectious or recovered nodes (Volz 2008). The cumulative incidences in the whole network, Community 1, and Community 2 are denoted as J, \(J_1\), and \(J_2\), respectively. From Eqs. (12) and (13) and noting that the hazard is identical for all nodes, we have

$$\begin{aligned} J=1-G(\theta _{11},\theta _{12},\theta _{22},\theta _{21}), \end{aligned}$$
(40)

and

$$\begin{aligned} J_1= & {} 1-\frac{N}{N_1}G(\theta _{11},\theta _{12},0,0),\nonumber \\ J_2= & {} 1-\frac{N}{N_2}G(0,0,\theta _{22},\theta _{21}). \end{aligned}$$
(41)

First, the accuracy of model (11) is verified in numerical simulations. Figure 3 plots the temporal evolution of the cumulative epidemic incidence under model (11). The random simulations are based on the network given in Sect. 2.2, with joint degree distributions \(P_l(k,j)=\frac{\lambda _{ll}^k\lambda _{ln}^je^{-(\lambda _{ll}+\lambda _{ln})}}{k!j!}\), \(l,n=1,2,l\ne n\). The random simulation dynamics are described on page 306 of Volz (2008). The numerical results agree well with the random simulation results, confirming the accuracy of model (11).
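
Since \(P_l(k,j)\) factorizes into two independent Poisson distributions, the degree pairs of the nodes can be drawn independently. The following Python sketch illustrates only this sampling step, using the Fig. 3b settings; the wiring of stubs into edges (Sect. 2.2) and the stochastic SIR dynamics of Volz (2008) are not reproduced. The function names, the seed and the use of Knuth's Poisson sampler are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_degrees(n, lam_in, lam_out, rng):
    """Return n pairs (k, j): internal degree k and external degree j."""
    return [(poisson(rng, lam_in), poisson(rng, lam_out)) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Fig. 3b settings: N1 = 1000, N2 = 2000, lam11 = 5, lam12 = 2, lam21 = 1, lam22 = 7
deg1 = sample_degrees(1000, 5, 2, rng)
deg2 = sample_degrees(2000, 7, 1, rng)

mean_k1 = sum(k for k, _ in deg1) / len(deg1)  # should be close to lam11
stubs_12 = sum(j for _, j in deg1)             # external stubs leaving Community 1
stubs_21 = sum(j for _, j in deg2)             # external stubs leaving Community 2
print(mean_k1, stubs_12, stubs_21)
```

The two external stub totals are equal only in expectation (\(N_1\lambda _{12}=N_2\lambda _{21}\)), so any concrete wiring procedure has to handle the small mismatch between them.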

Fig. 3

Random and numerical simulations of the cumulative incidence in Community 1 (\(J_1\)) and Community 2 (\(J_2\)). The dotted lines correspond to random simulations of the SIR model based on the Monte Carlo method. The curve is the average of 100 stochastic simulation runs on the same contact network with the same disease parameters. If the final epidemic size was below 20, the trial was discarded. The solid lines correspond to numerical simulations of the cumulative incidences under system (11) with the following parameter settings: a \(N_1=1000\), \(N_2=3000\), \(\lambda _{11}=10\), \(\lambda _{12}=3\), \(\lambda _{21}=1\), \(\lambda _{22}=10\), \(\gamma =0.03\), \(\mu =0.1\); b \(N_1=1000\), \(N_2=2000\), \(\lambda _{11}=5\), \(\lambda _{12}=2\), \(\lambda _{21}=1\), \(\lambda _{22}=7\), \(\gamma =0.006\), \(\mu =0.1\); c \(N_1=100\), \(N_2=1000\), \(\lambda _{11}=3\), \(\lambda _{12}=0.1\), \(\lambda _{21}=0.01\), \(\lambda _{22}=10\), \(\gamma =0.03\), \(\mu =0.1\)

Second, we test the correctness of Theorem 2 in examples.

Fig. 4

Numerical simulations of cumulative incidence in Community 1 (\(J_1\)) and Community 2 (\(J_2\)) under model (11) with the following parameter settings: a \(N_1=100\), \(N_2=1000\), \(\lambda _{11}=3\), \(\lambda _{12}=0.1\), \(\lambda _{21}=0.01\), \(\lambda _{22}=10\), \(\gamma =0.08\), \(\mu =0.1\); b \(N_1=100\), \(N_2=1000\), \(\lambda _{11}=3\), \(\lambda _{12}=0.1\), \(\lambda _{21}=0.01\), \(\lambda _{22}=10\), \(\gamma =0.008\), \(\mu =0.1\)

In panels (a) and (b) of Fig. 4, the epidemic spread through both communities and died out in both communities, respectively.

To show the uncertainty of the epidemic dynamics under the condition \(\frac{\gamma }{\mu } \in (\min \{(\frac{\gamma }{\mu })^*_1, (\frac{\gamma }{\mu })^*_2\}, \max \{(\frac{\gamma }{\mu })^*_1,(\frac{\gamma }{\mu })^*_2\})\), we varied the parameter settings in model (11). The simulation results are shown in Fig. 5. In this case, the epidemic may occur in Community 2 and die out in Community 1 (Fig. 5a), occur in both communities (Fig. 5b), or die out in both communities (Fig. 5c).

Fig. 5

Numerical simulations of cumulative incidence in Community 1 (\(J_1\)) and Community 2 (\(J_2\)) under model (11) with the following parameter settings: a \(N_1=100\), \(N_2=1000\), \(\lambda _{11}=3\), \(\lambda _{12}=0.1\), \(\lambda _{21}=0.01\), \(\lambda _{22}=10\), \(\gamma =0.03\), \(\mu =0.1\); b \(N_1=1000\), \(N_2=2000\), \(\lambda _{11}=5\), \(\lambda _{12}=2\), \(\lambda _{21}=1\), \(\lambda _{22}=7\), \(\gamma =0.1\), \(\mu =0.6\); c \(N_1=1000\), \(N_2=2000\), \(\lambda _{11}=5\), \(\lambda _{12}=2\), \(\lambda _{21}=1\), \(\lambda _{22}=7\), \(\gamma =0.1\), \(\mu =0.7\)
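
The regime to which each parameter set in Figs. 4 and 5 belongs can be checked directly from Eq. (36) and Theorem 2. The following Python sketch does so; the function names and the regime labels are illustrative choices, and the sketch only evaluates the threshold conditions, not the dynamics of model (11).

```python
def thresholds(lam11, lam12, lam21, lam22):
    """Eq. (36): critical values of gamma/mu for Communities 1 and 2."""
    if lam11 <= 1 or lam22 <= 1:
        raise ValueError("Eq. (37) requires lam11 > 1 and lam22 > 1")
    return 1 / (lam11 + lam21 - 1), 1 / (lam22 + lam12 - 1)


def classify(gamma, mu, lam11, lam12, lam21, lam22):
    """Return the case of Theorem 2 that applies to gamma/mu."""
    t1, t2 = thresholds(lam11, lam12, lam21, lam22)
    ratio = gamma / mu
    if ratio > max(t1, t2):
        return "I: epidemic in both communities"
    if ratio <= min(t1, t2):
        return "II: dies out in both communities"
    return "III: outcome not determined by the thresholds"


# (gamma, mu, lam11, lam12, lam21, lam22) as listed in the captions
settings = {
    "Fig. 4a": (0.08, 0.1, 3, 0.1, 0.01, 10),
    "Fig. 4b": (0.008, 0.1, 3, 0.1, 0.01, 10),
    "Fig. 5a": (0.03, 0.1, 3, 0.1, 0.01, 10),
    "Fig. 5b": (0.1, 0.6, 5, 2, 1, 7),
    "Fig. 5c": (0.1, 0.7, 5, 2, 1, 7),
}
regimes = {name: classify(*params) for name, params in settings.items()}
for name, regime in regimes.items():
    print(name, "->", regime)
```

Under these settings Fig. 4a falls in case (I), Fig. 4b in case (II), and all three panels of Fig. 5 in case (III), which is consistent with the different outcomes observed there.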

4 Effect of community structure on disease spread

As mentioned above, the impact of Q on the spread of disease in a network is difficult to investigate when G is nonspecific. Using the example in Sect. 3.4, we instead show how the community structure affects the dynamics of model (11). We first study how the associations between the two communities affect the transmission of disease through the network, given \(N_1\), \(N_2\), \(\lambda _1\) and \(\lambda _2\).

Considering the symmetry of the conclusions, the influence of community structure on disease transmission is investigated in five cases, namely, (a) \(N_1=N_2\) and \(\lambda _1=\lambda _2\), (b) \(N_1>N_2\) and \(\lambda _1=\lambda _2\), (c) \(N_1=N_2\) and \(\lambda _1>\lambda _2\), (d) \(N_1>N_2\) and \(\lambda _1>\lambda _2\) and (e) \(N_1>N_2\) and \(\lambda _1<\lambda _2\).

We first study epidemic spreading under the influence of \(\lambda _{12}\); by Eq. (35), Q decreases linearly as \(\lambda _{12}\) increases. The temporal evolutions of the cumulative epidemic incidence for various values of \(\lambda _{12}\) in the five cases are shown in Figs. 6, 7, 8 and 9. From these figures, we can observe two interesting points. First, when the human-to-human transmissibility of the virus (i.e. \(\frac{\gamma }{\mu }\)) is larger, the cumulative incidence plots are more ‘wobbly’ than for standard epidemics. Second, the influence of community structure on the cumulative epidemic incidence depends on the value of \(\frac{\gamma }{\mu }\). To examine these phenomena more closely, we show both prevalence (fraction of the population infected) plots and the relationship between the variation in final cumulative incidence and the human-to-human transmissibility of the virus.

Fig. 6

a Numerical simulations of cumulative incidence in the whole network (J) in Case (a) of model (11) with the following parameter settings: \(N_1=1000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=10\), \(\gamma =0.02\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.1, 1, 5, 8; b numerical simulations of cumulative incidence in the whole network (J) in Case (b) of model (11) with the following parameter settings: \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=10\), \(\gamma =0.013\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.05, 0.1, 1, 3

Fig. 7

Numerical simulations of cumulative incidence in the whole network (J) in Case (c) of model (11) with the following parameter settings: \(N_1=1000\), \(N_2=1000\), \(\lambda _{1}=20\), \(\lambda _{2}=10\), a \(\gamma =0.02\), \(\mu =0.1\) and \(\lambda _{12}=0.01\), 0.1, 1, 5, 8; b \(\gamma =0.0075\), \(\mu =0.1\) and \(\lambda _{12}=0.01\), 0.1, 1, 2, 3, 4, 8; c \(\gamma =0.006\), \(\mu =0.1\) and \(\lambda _{12}=0.01\), 0.1, 1, 3, 5, 8

Fig. 8

Numerical simulations of cumulative incidence in the whole network (J) in Case (d) of model (11) with the following parameter settings: \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=20\), \(\lambda _{2}=10\), a \(\gamma =0.015\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.05, 0.1, 0.5, 1, 2, 3; b \(\gamma =0.008\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.1, 1, 2, 2.5, 3; c \(\gamma =0.006\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.05, 0.1, 0.3, 0.5, 0.7, 1, 2, 3

Fig. 9

Numerical simulations of cumulative incidence in the whole network (J) in Case (e) of model (11) with the following parameter settings: \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=20\), a \(\gamma =0.013\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.05, 0.1, 0.5, 1, 3, 5, 6, 6.3; b \(\gamma =0.009\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.1, 0.3, 0.5, 0.7, 1, 3, 5, 6, 6.3; c \(\gamma =0.0065\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.1, 0.5, 0.8, 1, 3, 5, 6.3

Fig. 10

Numerical simulations of the prevalence in the whole network of model (11) with the following parameter settings: a \(N_1=1000\), \(N_2=1000\), \(\lambda _{1}=20\), \(\lambda _{2}=10\), \(\gamma =0.02\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.04, 0.1, 0.5, 1; b \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=20\), \(\lambda _{2}=10\), \(\gamma =0.015\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.04, 0.1, 0.5, 1; c \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=20\), \(\gamma =0.013\), \(\mu =0.1\), and \(\lambda _{12}=0.01\), 0.04, 0.1, 0.5, 1

For the first point, we now show the prevalence plots in Fig. 10. Since the ‘wobbly’ phenomenon is obvious in Figs. 7a, 8a and 9a, we give the prevalence plots under these three sets of parameters. Figure 10 shows that two peaks of prevalence appear when \(\lambda _{12}\) is small. As \(\lambda _{12}\) increases, the second peak moves towards the first peak until the two merge. At the same time, the first peak rises, the second peak arrives earlier and the duration of the epidemic shortens.

For the second point, we first plot the final cumulative incidences as functions of modularity Q. The plots are presented in Figs. 11, 12, 13 and 14.

As shown in Fig. 11, the community structure does not affect the final epidemic cumulative incidence when \(\lambda _1=\lambda _2\). For larger \(\frac{\gamma }{\mu }\), increasing Q reduces the final epidemic cumulative incidence (Figs. 12, 13, 14), and the decrease accelerates as Q grows. When \(\frac{\gamma }{\mu }\) is near the critical point, the final epidemic cumulative incidence first increases with Q, with ever smaller increments until the increase stops; it then decreases with increasing Q, with ever larger decrements. However, when \(\frac{\gamma }{\mu }\) is markedly below the critical point, the final epidemic cumulative incidence increases with increasing Q.

Fig. 11

Final cumulative incidences as functions of modularity Q in Cases (a) and (b) of model (11) with the following parameter settings: a \(N_1=1000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=10\), \(\gamma =0.02\), \(\mu =0.1\); b \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=10\), \(\gamma =0.013\), \(\mu =0.1\)

Fig. 12

Final cumulative incidences as functions of modularity Q in Case (c) of model (11) with the following parameter settings: \(N_1=1000\), \(N_2=1000\), \(\lambda _{1}=20\), \(\lambda _{2}=10\), a \(\gamma =0.02\), \(\mu =0.1\); b \(\gamma =0.0075\), \(\mu =0.1\); c \(\gamma =0.006\), \(\mu =0.1\)

Fig. 13

Final cumulative incidences as functions of modularity Q in Case (d) of model (11) with the following parameter settings: \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=20\), \(\lambda _{2}=10\), a \(\gamma =0.015\), \(\mu =0.1\); b \(\gamma =0.008\), \(\mu =0.1\); c \(\gamma =0.006\), \(\mu =0.1\)

Fig. 14

Final cumulative incidences as functions of modularity Q in Case (e) of model (11) with the following parameter settings: \(N_1=3000\), \(N_2=1000\), \(\lambda _{1}=10\), \(\lambda _{2}=20\), a \(\gamma =0.013\), \(\mu =0.1\); b \(\gamma =0.009\), \(\mu =0.1\); c \(\gamma =0.0065\), \(\mu =0.1\)

According to the above results, the impact of community structure on disease spread (especially on the final epidemic cumulative incidence) depends on the value of \(\frac{\gamma }{\mu }\). To better capture the information in Figs. 12, 13 and 14, we denote by \(\varDelta J^+\) and \(\varDelta J^-\) the increment and decrement of the final epidemic cumulative incidence, respectively, as Q increases from its lower limit \(Q(\bar{\lambda }_{12}^*)\) to Q(0.01). [The relationship between Q and \(\lambda _{12}\) is given by Eq. (35)].
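
To make the range of Q concrete, the following Python sketch evaluates Eq. (35) at \(\bar{\lambda }_{12}^*\) and at \(\lambda _{12}=0.01\) for the parameter sets of Cases (c), (d) and (e). It assumes that the bound \(\bar{\lambda }_{12}^*\) of Eq. (39), which follows from \(\lambda _{11}>1\) and \(\lambda _{22}>1\), can also be used when \(N_1=N_2\); the function and variable names are hypothetical.

```python
def modularity(n1, n2, lam1, lam2, lam12):
    """Eq. (35): Q as a function of the external degree lam12."""
    s = n1 * lam1 + n2 * lam2
    return 2 * n1 * (n2 * lam1 * lam2 - s * lam12) / s**2


def upper_external_degree(n1, n2, lam1, lam2):
    """bar{lambda}_12^* of Eq. (39), implied by lam11 > 1 and lam22 > 1."""
    return min(lam1 - 1, (n2 / n1) * (lam2 - 1))


# (N1, N2, lam1, lam2) for Cases (c), (d) and (e)
cases = {
    "c": (1000, 1000, 20, 10),
    "d": (3000, 1000, 20, 10),
    "e": (3000, 1000, 10, 20),
}
q_range = {}
for name, (n1, n2, lam1, lam2) in cases.items():
    lam_max = upper_external_degree(n1, n2, lam1, lam2)
    q_low = modularity(n1, n2, lam1, lam2, lam_max)  # weakest community structure
    q_high = modularity(n1, n2, lam1, lam2, 0.01)    # strongest structure considered
    q_range[name] = (q_low, q_high)
    print(name, lam_max, q_low, q_high)
```

Because Q is decreasing in \(\lambda _{12}\), `q_low` is always smaller than `q_high`, so sweeping \(\lambda _{12}\) downwards from \(\bar{\lambda }_{12}^*\) to 0.01 sweeps Q upwards over this interval.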

Figure 15 shows the variation in final epidemic cumulative incidence with the human-to-human transmissibility of the virus (\(\frac{\gamma }{\mu }\)). These results corroborate Figs. 12, 13 and 14. When \(\frac{\gamma }{\mu }\) is large, we obtain \(\varDelta J^+=0\) and \(\varDelta J^->0\), which indicates that strengthening the community structure is an effective way to reduce the size of the epidemic. This method is most effective at the value of \(\frac{\gamma }{\mu }\) where \(\varDelta J^-\) reaches its maximum. However, when \(\frac{\gamma }{\mu }\) is near the critical point, we have \(\varDelta J^+>0\) and \(\varDelta J^->0\), indicating that strengthening the community structure may either enlarge or reduce the size of the epidemic, depending on the current community strength. Comparing panels (b) and (c) in Fig. 15, we find that the community structure more strongly affects the final cumulative incidence when \(N_1>N_2\) and \(\lambda _1<\lambda _2\) than when \(N_1>N_2\) and \(\lambda _1>\lambda _2\).

Fig. 15

Final cumulative incidences as functions of \(\gamma /\mu \) in model (11) with the following parameter settings: a \(\lambda _{1}=20\), \(\lambda _{2}=10\), \(N_1=1000\), \(N_2=1000\), and \(\lambda _{12}\) decreasing from 8 to 0.01; b \(\lambda _{1}=20\), \(\lambda _{2}=10\), \(N_1=3000\), \(N_2=1000\) and \(\lambda _{12}\) decreasing from 3 to 0.01; c \(\lambda _{1}=10\), \(\lambda _{2}=20\), \(N_1=3000\), \(N_2=1000\) and \(\lambda _{12}\) decreasing from 6.3 to 0.01

5 Conclusions

In this study, we modeled the spread of an SIR epidemic in a complex network with two communities. By representing the network as a probability generating function, we could easily reduce the dimensionality of the model system. We first generated the two-community complex network and introduced the quality function Q, which measures the strength of the community structure in complex networks. We then introduced the states of the nodes into the generating function of the degree distribution of the network and obtained the generating function of the excess degree distribution, thereby establishing the SIR epidemic model. Third, we obtained sufficient conditions for the outbreak and extinction of the disease. Finally, we exemplified the SIR epidemic model in a network with a Poisson joint degree distribution.

The accuracy of our main results was confirmed in simulations on this network. Fixing the total degrees and scales of the two communities, we studied the epidemic dynamics in networks with various community structures, and obtained the following results:

  1. (1)

    Increasing the external degree of the communities reduced the strength of the communities (expressed in terms of the modularity Q).

  2. (2)

    When \(N_1>N_2\), the community structure exerted a stronger effect on the final cumulative incidence when \(\lambda _1<\lambda _2\) than when \(\lambda _1>\lambda _2\).

  3. (3)

    In a network with strong community structure, the disease tends to be kept within the community and it may die out before spreading to the other community. However, when the human-to-human transmissibility of the virus is large enough to cause the disease to break out in both communities, strengthening the community structure leads to the appearance of a second peak in prevalence. The stronger the community structure, the easier it is for the disease to spread within a community and the harder it is for it to reach the other community. Strengthening the community structure therefore lowers the second peak, delays its arrival and prolongs the epidemic.

  4. (4)

    In two communities of different sizes, enhancing the community structure reduced and enhanced the final cumulative incidence of the epidemic under strong and weak transmissibility of the virus between humans (i.e. \(\frac{\gamma }{\mu }\)), respectively. The reduction (increase) in cumulative incidence was attenuated (amplified) as \(\frac{\gamma }{\mu }\) increased. When the transmissibility was near the critical point of the outbreak, strengthening the community structure either increased or decreased the final cumulative epidemic incidence (the actual effect was uncertain). Furthermore, the increment (decrement) was attenuated (amplified) with increasing \(\frac{\gamma }{\mu }\).

How can we explain the above results? We suggest that when the transmissibility of the virus between humans is strong, a disease outbreak in the community is likely. However, whereas a strong community structure will probably retain the disease within the community, the dense network connections in a strongly connected community contain many redundant edges. Consequently, the stronger the community structure, the smaller the final cumulative epidemic incidence. Conversely, when the transmissibility of the virus between humans is weak, a disease outbreak is unlikely. In this case, strengthening the community structure (i.e. increasing the number of network connections in the community) increases the chance of a disease outbreak within that community. Therefore, the stronger the community structure, the larger the final cumulative epidemic incidence. When the transmissibility is near the critical point of the outbreak, the situation becomes more complicated. Initially, strengthening the community structure encourages the disease outbreak by adding new links within the community, thus increasing the final cumulative epidemic incidence. Once the disease outbreak has peaked, the disease-transmission efficiency of the newly added links steadily reduces. This occurs because the community structure gains redundant internal links while losing effective external links. Consequently, the final cumulative epidemic incidence decreases.

In summary, the size of a menacing disease can be effectively reduced by strengthening the community structure. Furthermore, the effect becomes more obvious in stronger community structures.

In this paper, we demonstrated the SIR model in a network with a Poisson joint degree distribution as an example, and studied the detailed effects of community structure on disease spread. In future work, we will extend our SIR model to networks with power-law or other joint degree distributions.