1 Introduction

People and society, as well as their past, present and future, have always been of fundamental interest to both academic social scientists and to planners and policy-makers in both government and commercial organisations. New research methods enabled by the capabilities of modern computers may radically transform human ability to reason systematically about complex social systems. This has become increasingly important as our world today confronts rapid and potentially profound transitions driven by social, economic, environmental, and technological changes.

Indeed, “one of demography’s main contributions to societal planning is to provide projections of the future population” (Rees 2009). Despite the simplification and uncertainty of the modelled population, population projections still play an indispensable role in our society today, as population evolution affects all levels of planning and policy making. MoSeS (Modelling and Simulation for e-Social Science) is a research node of the UK National Centre for e-Social Science. MoSeS aims to develop a national demographic model and simulation of the UK population specified at the level of individuals and households. Our interest is not only in the construction of more sophisticated and effective models, but also in how simulations might be transferred into a real-world environment. The approach builds up from cities and regions, with an aim to create simulation models of interactions between individuals, groups, or neighbourhoods within large metropolitan areas. Such simulations can form the basis of a wide range of applications in both e-Research and public policy analysis.

This chapter describes the hybrid approach used in the MoSeS Dynamic Model, which combines the strengths of microsimulation models (MSMs) and agent-based models (ABMs) to enable the modelling of complex social systems. The results generated by the model and our experiments with different sub-populations are also discussed.

2 Background

Modelling populations and their societies has always been challenging, due to the complex nature of such systems. They normally consist of a large number of individual components that interact in a strongly nonlinear fashion. Population models concern collections of discrete entities with stochastic behaviours, as well as complex population dynamics caused by temporal population changes. Such changes are bounded and result from something other than a monotonic approach to a stable equilibrium. The dynamics of these systems are typically irregular and may include oscillations and chaos that are produced by some combination of exogenous and endogenous stochastic and/or deterministic factors. Such a complex system requires the capability to handle real data and phenomena that are not necessarily tractable mathematically. Therefore modelling such a system often requires heavy use of simulations and other computational tools to reproduce the picture of the dynamics and behaviours within such systems (Turchin 2003).

There are two main types of population simulation models: microsimulation models (MSMs) and macrosimulation models (see Birkin and Wu 2012, for further details). In MSMs each individual is described with its particular attributes and behaviour, while a macrosimulation model is based on stochastic differential equations, where the population is described in aggregated terms by the number of individuals in different states (Gustafsson and Sternad 2007). Traditional macroscopic population models project populations by age and sex only. Aggregated probabilities are applied to populations grouped by age and sex in the fundamental demographic transitions such as mortality, fertility and migration to project the components of change into the future. Due to their aggregate nature, it is difficult to capture individual characteristics in such models. MSMs, on the other hand, can model the impact on individual decision units of changes in strategic planning or government policies. With the increasing complexity of such social and economic programmes, MSMs, with a tried and tested record, have become an essential part of the decision making process, allowing identification of the magnitude of gains and losses from policy changes in such areas as taxation, social security, pensions and social services (Bourguignon and Spadaro 2006; Fredriksen et al. 2007). As MSMs can model the population and their past, present and future at the individual level (Wolf 2001), their usage has quickly spanned the whole spectrum of the social sciences to model complex social systems where individual characteristics are important.
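
The contrast between the two model types can be made concrete with a minimal sketch. It is an illustration only, not the MoSeS implementation: the function names, the single "group" key standing in for an age and sex cell, and the use of mortality as the only transition are all assumptions made here for brevity.

```python
import random


def macro_project(counts, death_prob):
    """Macrosimulation: apply one aggregate probability to each group count."""
    return {group: n * (1.0 - death_prob[group]) for group, n in counts.items()}


def micro_project(people, death_prob, rng):
    """Microsimulation: every individual record gets its own random draw."""
    return [p for p in people if rng.random() >= death_prob[p["group"]]]


if __name__ == "__main__":
    probs = {"M70": 0.05}                      # hypothetical rate for one group
    print(macro_project({"M70": 1000}, probs))  # expected survivors per group
    people = [{"group": "M70", "id": i} for i in range(1000)]
    survivors = micro_project(people, probs, random.Random(42))
    print(len(survivors))                       # one stochastic realisation
```

The aggregate version returns only an expected count per group, whereas the individual version keeps whole records, so any further attribute carried by a record remains available to later transitions.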

However, MSMs require realistic micro-data and can be more difficult to apply in situations where appropriate data are not available. In many applications of MSMs to problems involving taxation, housing or the public finances, real individual records can be manipulated under alternative scenarios to evaluate the impact of policy alternatives. In a population MSM, by contrast, the individual attributes and transition processes are typically driven by probabilities from aggregate data or samples, and this can make it less flexible in modelling interactions and behaviours of various population groups that occur in the real world. Agent-based models (ABMs) can strengthen our understanding of the interactions between people and their behaviour by modelling demographic processes through agents’ interactions with other agents and/or the environment that they live in, with agents acting according to their unique built-in rules of behaviour. The flexibility of being able to construct heterogeneous agents and their rules makes ABMs particularly helpful when there is a knowledge gap or data are unavailable (Axtell 2000; Epstein 1999). Crooks and Heppenstall (2012) present a useful overview of ABMs.

The MoSeS hybrid model attempts to combine the strength of a dynamic spatial MSM and an ABM to simulate discrete demographic processes at the level of an electoral ward to project the individuals into the future from the year 2001 to 2031.

3 Methodology

3.1 Microsimulation Models (MSMs)

MSMs can model the impact on individual decision units of changes in strategic planning or government policies. Microsimulation refers to the probabilistic generation of artificial data on an individual basis, of the kind that would be observed in real life, and MSM techniques have several advantages over the macrosimulation approach (van Imhoff and Post 1998). They allow more factors to be taken into account in demographic processes, and make it possible to model the interactions between individuals and to incorporate continuous covariates, neither of which is possible in the macrosimulation approach (Siegel 2002). They can also produce much richer outputs and yield probabilistic results, allowing confidence intervals to be created for the resulting projections. MSMs are able to generate individual event histories that are fully consistent with a set of transition intensities (probabilities) and produce estimates of the full distribution of an outcome, in addition to the expected value that is produced analytically by most models. They are also particularly useful where the projections are produced to answer a “what if” question such as the effect of a policy on a demographic characteristic (Wittenberg et al. 1998).
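
How repeated stochastic runs yield a distribution of outcomes, rather than a single expected value, can be sketched as follows. This is a hedged illustration under simplifying assumptions: one homogeneous group, a single survival transition, and an empirical percentile interval across runs standing in for a confidence interval. The function names and parameter values are hypothetical.

```python
import random


def simulate_survivors(n_people, death_prob, rng):
    """One Monte Carlo run: count individuals who survive a single transition."""
    return sum(1 for _ in range(n_people) if rng.random() >= death_prob)


def projection_interval(n_people, death_prob, n_runs=200, seed=1, level=0.95):
    """Repeat the run and summarise the outcomes as (lower, mean, upper)."""
    rng = random.Random(seed)
    outcomes = sorted(
        simulate_survivors(n_people, death_prob, rng) for _ in range(n_runs)
    )
    tail = (1.0 - level) / 2.0
    lower = outcomes[int(tail * (n_runs - 1))]          # empirical lower percentile
    upper = outcomes[int((1.0 - tail) * (n_runs - 1))]  # empirical upper percentile
    mean = sum(outcomes) / n_runs
    return lower, mean, upper


if __name__ == "__main__":
    print(projection_interval(1000, 0.1))
```

The sorted list of outcomes is itself an estimate of the full distribution, so other summaries (for example, the probability of falling below a planning threshold) can be read from it in the same way.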

However, social simulation models are normally targeted to analyze complex social outcomes, i.e. macro outcomes that strongly depend on systemic processes of interactions between individuals that are co-located within a given environment. Microsimulation and macrosimulation are also recognized in previous research as alternative methods for making similar statements about the future. Despite its power, the MSM approach is also complex with onerous data requirements. It relies on quality microdata to produce realistic results, especially for important demographic transitions. When suitable data are not available, it will struggle. MSMs are also more prone to error than macrosimulation because of additional sources of randomness: the extra error introduced by the rich attributes of the individuals, the inherent randomness of the Monte Carlo simulation process, and the randomness in the initial population samples (van Imhoff and Post 1998; Wilson and Rees 2005). Some researchers continue to pursue macro-micro linkages, either by linking general equilibrium macro-economic models to MSMs (Aaberge et al. 2007) or by allowing behavioural responses in MSMs. Among them, the MicMac project aims to offer a bridge between aggregate projections of cohorts (Mac) and individual cohort members (Mic) (Willekens 2005). Mac and Mic both use the same set of aggregated transition rates, but extra attributes in addition to age and sex have been introduced into Mic to address demographic events and other life transitions at the individual level (Gampe et al. 2009). Mac focuses on transitions among functional states by age and sex to produce cohort biographies, while Mic uses a multistate model to produce individual biographies (Willekens 2005).

However, incorporating behavioural responses into a pure MSM proves difficult in practice, and modellers have faced a fundamental dilemma between behavioural and probabilistic models (O’Donoghue 2001).

3.2 Agent-Based Models (ABMs)

More recently, the ABM approach has been used in various areas of social science, as agents in an ABM seem able to mimic human populations naturally. A definition is provided by Axtell (2000): “An ABM consists of individual agents with states and rules of behaviour. Running such a model is simply creating a population of such agents and letting agents interact, and monitoring what happens”. Typically, individual agents can move around and interact with each other and the environment that they live in according to their built-in rules. They can also store information gathered through such movements and interactions and make decisions or act upon it.

One of the most useful features of ABMs, as suggested by Epstein (1999), is that they demonstrate that a sufficient microspecification can generate a macrostructure of interest. Through the repeated application of interaction rules for individual agents at the microscopic level, a target macrostructure can be effectively attained. As ABM tries to bridge the micro and macro by identifying microspecifications that are sufficient to generate the macroscopic picture in a robust and replicable manner, it also allows us to study the micro-to-macro mapping.

Furthermore, each individual/society is multi-dimensional and such dimensions can often not be considered separately in the real world, unlike the artificial division of different disciplines of academic social science. The ABM approach reflects the complexity of social systems and provides powerful new forms of hybrid theoretical computational work, especially in studies of non-equilibrium systems. In fact, ABM “invites the interpretation of society as a distributed computational device, and in turn the interpretation of social dynamics as a type of computation. This interpretation raises important foundational issues in social science – some related to intractability, and some to undecidability proper”. For instance, “it is not obvious how we can formulate (or solve) the equations to represent large populations of discrete heterogeneous agents coevolving on a separate space, with which they interact” (Epstein 1999).

In demography, there is a poor level of precision in theoretical constructions. Quite often there is not only insufficient theory to support the application of statistical models and data collection, but also insufficient accounting for the observability of important quantities used in the theory. Therefore ABM is very helpful for problems where “writing down equations is not a useful activity” (Billari et al. 2003). Despite such useful features that can benefit demographic studies, especially for those who are interested in understanding demographic behaviour, limited use of ABMs has been made in the area of demography. However, there is a stream of researchers who believe that the use of ABM in demography is of interest to all scientists interested in studying demographic behaviour, as well as to computer scientists and modellers who are looking for a promising field of application. In contrast to the approach based on statistical analysis of behavioural data, which aims to understand why specific rules are applied by humans, ABMs pre-suppose (realistic) rules of behaviour and try to challenge the validity of these rules by showing whether they can or cannot explain macroscopic regularities. These researchers argue that agent-based approaches are particularly useful for studying human populations from various theoretical perspectives and, as a promising stream of research, can improve our understanding of demographic behaviour (Billari et al. 2003).

3.3 The MoSeS Hybrid Approach

A hybrid modelling approach that combines the strengths of both MSMs and ABMs has been adopted to capture the individual characteristics in the simulation model. Microsimulation in the hybrid model provides the capability to work with large-scale data through its list-processing power; a theoretical basis for dealing with real microdata and empirical data; a means of examining the macro consequences of particular behaviours identified in in-depth studies; and the analysis of the impact of policies through a predefined group of transition rates, which drive the simulation.

However, with a pure MSM it is difficult to model the movements, interactions and behaviours of individuals or sub-populations, due to its statistical nature and centrally controlled structure, as well as a lack of appropriate data on important transitions. Therefore a hybrid approach combining MSM and ABM techniques is tested to provide a flexible and practical solution for modelling the subtlety of population changes and the heterogeneous behaviour of individuals among a large population with rich details. This aspect is less well studied and lacks an appropriate theoretical basis in previous MSMs. The use of ABMs also provides us with the ability to introduce heterogeneous agents in the model whose distinctive behaviours are not necessarily mathematically tractable (Billari et al. 2003).

ABMs mimic the human population naturally in the way that individuals can move around and interact with each other and/or the environment where they live. They can also have some intelligence according to their built-in rule bases where memories/changes can be stored for future decisions/actions. Within an ABM, individuals in the hybrid model are not socially isolated. On the contrary, they demonstrate autonomy and interdependency at the same time. Thus this hybrid approach allows us to better understand social/demographic processes such as migration and marriage, where movements, interactions and behaviours play an important role. The features discussed above also provide a way to experimentally test hypotheses on decision processes and behaviours at an individual level.

4 Creating Agents from a Microsimulated Population

4.1 MoSeS Model Description

MoSeS is a dynamic spatial MSM that simulates discrete demographic processes at a ward level and projects the individuals into the future from the year 2001 to 2031. At the core of the MSM is the population model consisting of six modules, which model the following demographic processes: ageing, mortality, fertility, health change, marriage and migration. It models the demographic lifecycle at an annual interval. As a dynamic MSM, this model uses dynamic ageing and the simulation is driven by probabilities applied to each individual in each demographic transition at each simulation step. These transition probabilities are underpinned by a variety of data sources, most notably the ‘British Household Panel Survey’ (BHPS), a longitudinal survey of individuals and households which is enormously rich in its demographic, social, economic and behavioural content. As a spatial MSM, all probabilities used in the MSM are not only age and gender specific, but also spatially disaggregated at the ward level. Such probabilities have been calculated to reflect individual and local differences, using a combination of both individual-level sample records and aggregate small area data from the UK census. The probabilities also reflect other vital factors regarding specific demographic processes. The transition of population status, and the movements and interactions of individuals, are the focus of the model.
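
The annual, probability-driven step described above can be sketched in a few lines. This is a simplified illustration and not the MoSeS code: only three of the six modules (ageing, mortality, fertility) are shown, the `Person` record, the `(ward, sex, age)` lookup key and all function names are assumptions, and the probability tables would in practice come from the survey and census sources cited in the text.

```python
import random
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    sex: str          # "F" or "M"
    ward: str
    alive: bool = True


def lookup(table, person):
    """Probability for this person's (ward, sex, age) cell; 0.0 if absent."""
    return table.get((person.ward, person.sex, person.age), 0.0)


def annual_step(population, mortality, fertility, rng):
    """Advance every living person by one simulated year, in place."""
    newborns = []
    for person in population:
        if not person.alive:
            continue
        person.age += 1                                  # ageing
        if rng.random() < lookup(mortality, person):     # mortality
            person.alive = False
            continue
        if person.sex == "F" and rng.random() < lookup(fertility, person):
            # fertility: the child starts in the mother's ward
            newborns.append(Person(0, rng.choice("FM"), person.ward))
    population.extend(newborns)
    return len(newborns)


def project(population, mortality, fertility, start=2001, end=2031, seed=0):
    """Run one annual step per year from start up to (not including) end."""
    rng = random.Random(seed)
    for _year in range(start, end):
        annual_step(population, mortality, fertility, rng)
    return population
```

Keying the tables by ward as well as by age and sex is what makes the sketch "spatial" in the sense used above; health change, marriage and migration would be further transitions inside the same loop.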

MoSeS models the individuals within the households, where they interact with the rest of the world through interactions with other people and the environment that they live in. Although the studied population is modelled as individuals, there is an interdependency between the household, individual and environment. The attributes of individuals, households and their environment can change due to various interactions between the individuals and: (a) other individuals; (b) households; or (c) the area that they live in. For example, during the process of marriage, the formation of a new household will result in changes in at least one individual’s location; for households, this will result in changes in both existing and new households; for the areas that they used to/are going to live in, it will see changes in both local housing and the local population. Similar changes will be experienced in migration processes, too. Due to this interdependency, the operation of these demographic processes of individuals also leads to the formation and dissolution of households during the simulation.

MoSeS uses an ABM for some demographic processes to allow more flexibility and practicality where individual movements, interactions and behaviours are playing an important role. The MoSeS ABM naturally complements its spatial dynamic MSM components, as MoSeS agents reflect four of the most important features highlighted by Epstein (1999): heterogeneity, autonomy, explicit inclusion of space and bounded rationality:

Firstly, MoSeS agents have a rich portfolio of attributes from various data sources of human population samples and such characteristics change over time. Secondly, there is no central control over behaviours of individual MoSeS agents, and each agent acts autonomously according to their own rules. Thirdly, all events happen in an explicit space in MoSeS and local interactions are modelled typically through agents’ interactions with others or their local spatial environments. Finally, MoSeS agents only make decisions/take actions according to simple rules that are based on local information. They do not have global information or infinite computational power. Such features of the MoSeS agents are “especially powerful in representing spatially distributed systems of heterogeneous autonomous actors with bounded information and computing capacity who interact locally” (Epstein 1999). Such a description of the system reflects the characteristics of a dynamic spatial population model extremely well.

Agents have been created from the microsimulated population to model heterogeneous migration behaviours and the impact of migration histories in two experiments. Student migration and the impact of personal migration history on mortality are described in the following sections. We use Leeds (Fig. 16.1), a city in northern England with a population of approximately 760,000, as an example, but the MSM is generalizable between local areas across the UK.

Fig. 16.1 Map of the wards of Leeds (Source: Generated by the authors using 2001 Census statistics and ward boundaries. Crown Copyright 2003)

4.2 Student Migration

Migration is a complex demographic process where individual movements, interactions and behaviours are of obvious importance (Champion et al. 2002). Using an ABM, individual activities and diversity of migration decisions leading to observed, complex migration patterns can be simulated in detail. Some attempts have produced fruitful outcomes (Espindola et al. 2006; Makowsky et al. 2006). In this chapter we investigate the usefulness of a hybrid modelling approach in a series of experiments where the MSM is combined with an ABM. We experimented on three important properties of agents: their ability to interact with their environment; to interact with others; and to carry their own personal history around and use it as a reference for current/future decisions/actions.

We chose student migrants as the sub-population in our first experiment due to their distinctive migration patterns, as well as the fact that in a large UK city such as Leeds, the frequent local migrations of approximately 30,000 university students make an important impact on local population structures. Previous studies have recognized the difficulty of modelling student migration in small areas (Champion et al. 2002). Students not only move frequently, but also exhibit a distinctive pattern from the rest of the migrants, i.e. they tend to reside only in certain areas, mostly around universities, during the period of their study. Most then leave while other new students move in. Due to the replenishment of the student population each year, the population in such wards stays younger than in other wards. As an ABM is very flexible in terms of constructing heterogeneous agents with different rules, we experiment with an ABM for the student migration process using hypothetical rules in a similar way to Schelling’s model on individual decisions to move and the composition of neighbourhoods (Schelling 1971).

In the MoSeS ABM, we identify four types of agents/students: first year undergraduates, year 2 and 3 undergraduates, masters students and doctoral students. Based on the above assumptions, we then apply some general rules to the students in terms of the areas they stay in and the duration of their stay. Specific rules for individual groups vary according to their specific characteristics. For example, a year one undergraduate student will choose to stay in available accommodation on campus and then move into off-campus accommodation where they can stay in the city for two more years. They may then study for a masters degree for one more year or leave. A masters student can then stay in the area for one year and leave or continue with doctoral study for three more years. The typical interaction between the agents/students themselves would be to look for fellow students to live with in a certain area, and the interaction with the environment would be checking if there is a vacancy in that area. An agent/student keeps searching in areas where their fellow students live until they find a vacancy. The agent without success at the end of the search stays in the current area.
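
The search rule in the last three sentences can be written as a short sketch. It is hedged in several respects: the dictionary-based ward tables, the function names and the choice to visit the most student-populated wards first are assumptions made for illustration, since the text does not specify a search order, and the ward labels are placeholders rather than Leeds wards.

```python
def choose_ward(current_ward, student_counts, vacancies):
    """Return the first ward with fellow students and a vacancy, else stay put."""
    candidates = sorted(
        (w for w, n in student_counts.items() if n > 0 and w != current_ward),
        key=lambda w: student_counts[w],
        reverse=True,                      # assumed order: most students first
    )
    for ward in candidates:
        if vacancies.get(ward, 0) > 0:     # interaction with the environment
            return ward
    return current_ward                    # unsuccessful search: no move


def move_student(student, student_counts, vacancies):
    """Apply the search rule and update the ward tables in place."""
    origin = student["ward"]
    target = choose_ward(origin, student_counts, vacancies)
    if target != origin:
        vacancies[target] -= 1
        vacancies[origin] = vacancies.get(origin, 0) + 1
        student_counts[target] += 1
        student_counts[origin] = student_counts.get(origin, 0) - 1
        student["ward"] = target
    return student["ward"]


if __name__ == "__main__":
    counts = {"A": 50, "B": 30, "C": 1}
    rooms = {"A": 0, "B": 2, "C": 10}
    print(move_student({"ward": "C"}, counts, rooms))
```

Because each agent consults only the counts and vacancies of candidate wards, the rule uses local information and no central allocation, which is the property the hybrid model relies on for this process.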

The simulation results of the student population using the pure MSM and using the ABM approach have been compared to the observed distribution of the student population. Through the implementation of simple rules at the individual level, the heterogeneity of student migration is captured in the model. Instead of students almost evenly scattering around the whole city in the MSM, the hybrid model provides a much better reflection of the observed student population concentration around the city centre, close to a university. More importantly, as new students come into the area each year, most existing students leave upon completion of their study in the hybrid model. Students are no longer ageing with the rest of the population in the suburban areas as in the pure MSM (Fig. 16.2). The number of students in each ward is indicated by the shade of colour, with darker shades indicating more students. Further details and discussion of these model results can be found elsewhere (Wu et al. 2008).

Fig. 16.2 Student migration in small areas 2001 and 2031: observed, pure MSM and hybrid results. Darker areas indicate higher migration (Source: Generated by the authors using 2001 Census statistics and ward boundaries. Crown copyright 2003)

4.3 Mortality

Simulating geographically identified populations can demonstrate the local (environmental, economic, etc.) impact on the individuals to a degree. However, sometimes not only the current locations, but also where individuals came from or used to live, contribute to individual heterogeneities. For instance, if a person has worked as a miner all his/her life, his/her mortality/morbidity rates should not suddenly change a great deal just because he/she retired to a pleasant residential area. An ABM can complement the MSM by retrieving personal histories with great ease. In this experiment, we explored three scenarios of mortality projections based on the impact of: the current residence location, the first residence location in the system/birth place, and personal migration history. In the first scenario, all individuals are simulated in the MSM. Their survival is determined against an age, sex and location specific mortality probability generated on the basis of local information about the current location of residence. An ABM approach is used in the second and third scenarios, where agents carry their own histories along with them and have to check on such histories to determine their mortality probabilities. In the second scenario, survival is determined on the basis of the mortality rates of each individual’s first residence location/birth place. In the third scenario, we tried to model the impact of personal migration history on mortality. Mortality is projected on the basis of the mortality rates of the area where the individual has stayed the longest to date.
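
The three scenarios differ only in which ward's mortality rate an individual is matched against, which the following sketch makes explicit. The representation of a personal history as a chronological list of `(ward, years_lived)` pairs, the scenario labels and the function names are assumptions made for illustration, and the rates in the usage example are invented placeholders, not model inputs.

```python
from collections import defaultdict


def reference_ward(history, scenario):
    """Pick the ward whose mortality rate applies under the given scenario.

    history: chronological list of (ward, years_lived) pairs.
    """
    if scenario == "current":
        return history[-1][0]
    if scenario == "first":
        return history[0][0]
    if scenario == "longest":
        totals = defaultdict(int)
        for ward, years in history:
            totals[ward] += years          # repeated stays in a ward accumulate
        return max(totals, key=totals.get)
    raise ValueError(f"unknown scenario: {scenario}")


def mortality_probability(rates, age, sex, history, scenario):
    """Look up the age, sex and location specific rate for the chosen ward."""
    return rates[(reference_ward(history, scenario), sex, age)]


if __name__ == "__main__":
    history = [("mining_ward", 40), ("suburb_ward", 5)]
    rates = {("mining_ward", "M", 66): 0.03, ("suburb_ward", "M", 66): 0.02}
    for scenario in ("current", "first", "longest"):
        print(scenario, mortality_probability(rates, 66, "M", history, scenario))
```

Only the "current" scenario can be evaluated without a stored history, which is why the first scenario runs in the pure MSM while the other two need agents that carry their histories with them.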

In the experiments, the whole population of Leeds has been simulated under the three assumptions for 30 years separately and the results of the year 2031 are then compared spatially to assess the difference in the mortality distribution within the city. Although the distribution pattern of mortality is similar on the whole, the experiments still reveal some interesting variations in small areas. Unsurprisingly, the mortality projection based on origins in the second scenario is more different from the projection based on current locations in the first scenario, while the projection based on individual migration history in the third scenario demonstrates more similarity with the first scenario. However, interestingly, we can see from the map that there tends to be a higher mortality in the more established suburban wards in the northern area of the city in the first residence based projection compared to the current residence. This reflects the fact that new migrants may take time to absorb the benefits of favourable demographic conditions in the robustly healthy and affluent northern suburban areas. In comparison, the majority residence based projection indicates a reduction of mortality compared with the current residence based projection in the northern suburban areas, but indicates an increase of mortality in the traditionally less affluent areas in the south and eastern areas around the city centre. Such findings demonstrate that personal history could have an important impact on mortality (see Fig. 16.3).

Fig. 16.3 Mortality projections in small areas 2001 and 2031: by current, first and majority residence. Darker areas indicate higher mortality (Source: Generated by the authors using 2001 Census statistics and ward boundaries. Crown copyright 2003)

Empirical research on the relationship between limiting long-term illness and migration established that the illness status of migrants is mid-way between that of their origin and destination locations (Norman et al. 2004). If this finding also applies to mortality, then a combination of all three scenarios may be needed to represent the mortality chances of migrants properly. We will continue to improve the mortality projection in the light of such evidence. Needless to say, the migration histories of international migrants may also have a considerable impact on their health and mortality compared to the local people. Although the mortality experiments discussed here are purely based on hypothesis (as a result of lack of data), they demonstrate that there are many more aspects of the population MSM that can be strengthened through the use of personal history. Important elements of the model such as marriage behaviour, fertility patterns and change in health status might all benefit in a similar way (e.g. the recent trends of adopting the Total Fertility Rate (TFR) approach, marriage behaviour dependency on previous marital status and health on previous health history). Such explorations are not just interesting experiments, but can potentially play a vital role in facilitating decision making where the impact of personal history should be taken into account.

5 Conclusions

In this chapter, we introduced the hybrid approach used in an ambitious social model (MoSeS), which combines the strengths of both an MSM and an ABM to model heterogeneity in a complex social system. The MSM in the hybrid model provides the capability to work with large-scale data through its list-processing power; a theoretical basis for dealing with real datasets; a means of examining the macro consequences of particular behaviours identified in in-depth studies; and the analysis of the impact of policies through a predefined group of transition rates, which drive the simulation. With the four important features of heterogeneity, autonomy, explicit inclusion of space and bounded rationality, the MoSeS agents in the hybrid model naturally mimic human populations and complement the spatial dynamic MSM components. The hybrid model provides a powerful way to model the local interactions in spatially distributed systems of heterogeneous autonomous actors with bounded information and computing capacity.

In the two experiments we described above, we deliberately chose some straightforward processes and applied ABM with considerable simplification. However, the modelling of interaction, behaviour and personal history impact can be significantly more complex with more intelligent agents. For example, people can have much more complex interactions with others and their environment to make a decision or take an action, assisted by information from their personal histories or requirements during the household formation process. The hybrid approach can also present opportunities for recent demographic modelling trends such as using the total fertility rate (TFR) instead of age-specific fertility rates (ASFRs), as “maternal” agents can easily track their history throughout their childbearing lives. The examples discussed in this chapter are simply used to demonstrate the potential of the hybrid model.

As a demographic planning tool, MoSeS can monitor the evolution of population structures and various demographic changes on a fine geographical scale. This provides vital information for demographic planning/policy making (especially location-based policies). MoSeS can also benefit other public policy making or public service planning. For instance, the ageing trends in certain suburban areas may prompt changes in health service and public transportation provision in order to enable easy access to such services for the older population in the area. The rich attributes captured in the system are also very useful for various policy analyses or research purposes.

The MoSeS hybrid model has provided a framework to enable the effective modelling of heterogeneous decision making units on a large scale, as well as adding the flexibility to introduce different modelling techniques to strengthen various aspects of the model. The model itself provides a useful tool in assisting decision making, exploring various “what if” situations and testing different hypotheses. As we have discussed above, the hybrid modelling approach demonstrates great potential in demographic modelling, and we will continue to improve various aspects of this model using this approach to provide better groundwork for more wide-ranging social science studies.