Introduction

Synthetic populations are widely used in the agent-based community to represent a baseline population of interest whose dynamics and evolution are simulated and studied using microsimulations. Using a synthetic population typically involves two steps. The first is the generation of a synthetic population that is statistically as similar as possible to the population of interest. This problem has been extensively studied since the seminal work of [1], and many different methods are available in the literature. Selecting the right one depends on the data available for the generation process [2,3,4,5]. We refer the reader to [5, 6] and [7] for a review of existing approaches.

The second step is the dynamic evolution of the synthetic population to forecast the future population. This is done by feeding the microsimulation with the baseline synthetic population generated in the previous step and applying a set of models and rules to its agents in order to simulate the dynamics of the population. Recent large microsimulations based on this approach include ILUTE [8], MOBLOC [9], VirtualBelgium [10] and its extension VirtualBelgium in Health [11], and TransMob [12].

Usually, the simulation of a population’s evolution is driven by a large number of models defining the interactions of the agents with one another and/or with their environment. Even though each model can have its own time scale, the conventional approach to simulating the evolution of a population is to use a global time step, e.g. one year, and to evaluate all the models in a given predefined sequence. This situation is depicted in Fig. 19.1.

Fig. 19.1

Conventional approach to evolving a synthetic population in TransMob. In each simulated year \({t}_{i}\), the following sequence of models is applied to obtain the synthetic population at \({t}_{i+1}\): ageing, dying, divorcing, wedding, births

Despite having produced satisfactory results in many different applications, this approach is not ideal. Indeed, the generated population is sensitive to the ordering of the models used in the evolution, i.e. different sequences of models will result in significantly different populations. To mitigate this issue, a calendar-based approach has recently been proposed [13], but it still relies on a fixed time step. In addition, the time step used is typically large, which usually makes it impossible to simulate processes evolving on short time scales.
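The sensitivity to model ordering can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the two deterministic rules (`ageing`, and `dying` with a hypothetical `max_age` cutoff) and the helper `evolve_fixed_step` are invented for this example and are not the models used in TransMob or in any of the cited simulators.

```python
def ageing(ages):
    """Toy model: every individual gets one year older."""
    return [a + 1 for a in ages]


def dying(ages, max_age=80):
    """Toy deterministic model: individuals aged max_age or more are removed.

    The cutoff rule is a hypothetical stand-in for a real mortality model.
    """
    return [a for a in ages if a < max_age]


def evolve_fixed_step(ages, models, n_years):
    """Apply every model once per simulated year, in the given order."""
    for _ in range(n_years):
        for model in models:
            ages = model(ages)
    return ages


baseline = [78, 79]
ageing_first = evolve_fixed_step(baseline, [ageing, dying], 1)
dying_first = evolve_fixed_step(baseline, [dying, ageing], 1)
print(ageing_first)  # [79]
print(dying_first)   # [79, 80]
```

After a single yearly step, the same baseline already yields two different populations, depending only on the order in which the two rules are evaluated.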

The goal of this research is to propose a framework for evolving a synthetic population that solves both aforementioned issues, i.e. one without a fixed order for the models and with a dynamic time step. The proposed evolution scheme relies on the Gillespie algorithm (Gillespie 1977), originally designed to stochastically simulate coupled chemical reactions, and is briefly described below.

Continuous Evolution Scheme

Let us denote by \(P=\left\{{d}_{1},\dots ,{d}_{K}\right\}\) the synthetic population of size \(K\), and \(M=\left\{{m}_{1},\dots ,{m}_{l}\right\}\) the set of \(l\) models used to evolve \(P\) until a given time horizon \({t}_{f}\) is reached. The main steps of the proposed algorithm are:

  1. Initialization: initialize the baseline population \(P\) at time \(t={t}_{0}\).

  2. Monte-Carlo step: determine the most probable \({m}_{s}\in M\) as well as \(\tau \), the most probable time step at which \({m}_{s}\) will occur.

  3. Update: \({m}_{s}\) is applied to \(P\) and \(t\leftarrow t+\tau \). The transition probabilities of every \({m}_{i}\in M\) are also updated.

  4. Iterate: go back to 2 while \(t<{t}_{f}\).

This evolution scheme is illustrated in Fig. 19.2.

Fig. 19.2

Schematic representation of a continuous time evolution. At each iteration, the most probable time step and model are selected. In this example, the following sequence of models is applied: ageing and death, ageing and birth, ageing and death
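The four steps of the scheme can be sketched in a few lines of Python. This is a minimal sketch under explicit assumptions, not the implementation evaluated here: it restricts the model set to birth and death, uses hypothetical constant per-capita yearly rates `birth_rate` and `death_rate`, and draws the time step and the model at random as in Gillespie's direct method (an exponential waiting time, then a model chosen with probability proportional to its rate), which is one reading of "most probable" in step 2. Ageing is applied by adding the elapsed time \(\tau \) to every individual.

```python
import random


def gillespie_evolve(ages, birth_rate, death_rate, t0, t_f, seed=0):
    """Evolve a list of individual ages (in years) from t0 to t_f.

    birth_rate and death_rate are hypothetical per-capita yearly rates.
    Returns the final time, the final ages and the list of (time, model) events.
    """
    rng = random.Random(seed)
    ages = list(ages)  # 1. initialization of the baseline population
    t = t0
    events = []
    while ages:
        # 2. Monte-Carlo step: rate of each model for the current population
        rates = {
            "birth": birth_rate * len(ages),
            "death": death_rate * len(ages),
        }
        total = sum(rates.values())
        # waiting time until the next event (infinite if nothing can happen)
        tau = rng.expovariate(total) if total > 0 else float("inf")
        if t + tau >= t_f:
            # no further event before the horizon: age everyone up to t_f
            ages = [a + (t_f - t) for a in ages]
            t = t_f
            break
        model = rng.choices(list(rates), weights=list(rates.values()))[0]
        # 3. update: ageing by tau, then the selected model; rates are
        #    recomputed from the new population size at the next iteration
        ages = [a + tau for a in ages]
        if model == "birth":
            ages.append(0.0)
        else:
            ages.pop(rng.randrange(len(ages)))
        t += tau
        events.append((t, model))
        # 4. iterate while t < t_f (the loop also stops if nobody is left)
    return t, ages, events
```

A call such as `gillespie_evolve([30.0] * 100, 0.02, 0.01, 0.0, 5.0)` returns the event log alongside the final population, so the order in which births and deaths occurred is an output of the simulation instead of an input.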

The first step in assessing the potential of this new methodology is to compare it against validated ones. We thus simulate the evolution of a small synthetic population of 15,000 individuals with a limited set of models (ageing, birth, death), and compare the proposed scheme with the recent calendar-based approach as well as with a conventional one relying on a fixed (discrete) time step.

Initial results indicate that the approaches produce comparable results. For instance, Fig. 19.3 shows that the average population size and the average age of the individuals evolve similarly over time.

Fig. 19.3

Evolution of the average age of the individuals per gender (left panel) and population size per gender (right panel) for different evolution algorithms. It can be seen that the algorithms produce similar evolution curves

The proposed approach also allows the models to use probabilities that are not constant over time, in order to take seasonality effects into account. For instance, let us assume that the natality rate can follow one of the two probability distributions represented in Fig. 19.4, i.e., either uniform or non-uniform. The outcomes of those two distributions on the number of births over time in the population are illustrated in Fig. 19.5, where the seasonality induced by the non-uniform distribution can be clearly seen.

Fig. 19.4

Birth probability distribution functions: uniform (blue), non-uniform (red)

Fig. 19.5

Number of births per month assuming a constant uniform probability distribution over the year (left panel) and a non-uniform probability distribution (right panel)
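One common way to draw event times from a rate that varies over the year is thinning: candidate times are generated at a constant upper-bound rate, and each candidate is kept with probability equal to the ratio of the actual rate to that bound. The sketch below assumes this technique, which is not necessarily how seasonality is handled in the experiments reported here. The sinusoidal `seasonal_rate`, its parameters `base_rate` and `amplitude`, and the helper `sample_event_times` are hypothetical and do not reproduce the distributions of Fig. 19.4.

```python
import math
import random


def seasonal_rate(t, base_rate, amplitude):
    """Hypothetical yearly-periodic birth rate (events per year), t in years."""
    return base_rate * (1.0 + amplitude * math.sin(2.0 * math.pi * t))


def sample_event_times(rate, rate_max, t0, t_f, rng):
    """Sample event times in [t0, t_f) by thinning.

    rate is a function of time and must satisfy 0 <= rate(t) <= rate_max.
    """
    times = []
    t = t0
    while True:
        t += rng.expovariate(rate_max)  # next candidate at the bounding rate
        if t >= t_f:
            return times
        if rng.random() * rate_max <= rate(t):  # keep with prob. rate(t) / rate_max
            times.append(t)


rng = random.Random(0)
births = sample_event_times(
    lambda t: seasonal_rate(t, 100.0, 0.8), 180.0, 0.0, 10.0, rng
)
# Count births per calendar month, pooled over the ten simulated years
per_month = [0] * 12
for t in births:
    per_month[int((t % 1.0) * 12)] += 1
```

Setting `amplitude` to zero recovers a constant rate, so the same sampler covers both the uniform and the non-uniform case.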

These early experiments show that the proposed approach has the potential to simulate realistic synthetic population evolution, as it assumes neither an a priori sequence of models to apply nor a fixed time step.

Nonetheless, this method is computationally intensive and not well suited to large populations. Indeed, as the simulated population grows, \(\tau \) decreases and can become very small, thus increasing the number of steps needed to reach \({t}_{f}\). Improving the scalability of this approach will therefore be investigated.
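The scaling can be made explicit under simplifying assumptions that are not stated in the experiments above: each model \({m}_{i}\) applies to every individual at a constant per-capita rate \({r}_{i}\) (a hypothetical symbol), the time step \(\tau \) is exponentially distributed as in Gillespie's direct method, and the population size \(K\) stays roughly constant over the simulated period. With \({a}_{0}\) denoting the total event rate:

```latex
a_0 = K \sum_{i=1}^{l} r_i, \qquad
\mathbb{E}[\tau] = \frac{1}{a_0} = \frac{1}{K \sum_{i=1}^{l} r_i}, \qquad
\mathbb{E}[\text{number of steps}] \approx a_0 \,(t_f - t_0) = K \,(t_f - t_0) \sum_{i=1}^{l} r_i .
```

Under these assumptions the expected time step shrinks as \(1/K\) and the number of iterations grows linearly with the population size.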

Finally, and more importantly, future development will also focus on adapting this approach to synthetic populations made of individuals gathered in households.