1 Introduction

Since 1982, there have been at least 81 public mass shootings across the USA, with the killings occurring in 33 states from Massachusetts to Hawaii [1]. In response to this alarming trend, emergency evacuation of buildings has been identified as an important topic of research. Optimizing pedestrian flow can potentially reduce the time spent along non-optimal paths and hence the damage related to panic situations.

The existing literature has a rich body of work on modeling pedestrian movement and pedestrian destination choice. The current state of the art can be broadly divided into microscopic, macroscopic and experimental models. In microscopic modeling, collective phenomena such as bottlenecking and oscillations are observed from detailed modeling of the dynamics at the microscopic or node level. The microscopic category includes the social force model [2], cellular automaton models of pedestrian movement [3], the lattice gas method [4], and decision tree based modeling [5]. Macroscopic modeling describes the flow of pedestrians as analogous to fluid flow and derives the flow equations necessary to understand and control the crowd movement [6].

In this work, we study the effect of opinion/information propagation [7] in a crowd of evacuating individuals. Our work incorporates a sophisticated movement and decision model into a spatially bounded opinion sharing model to study the effect of knowledge level and the presence of leaders in the crowd. The next section provides details about our hybrid model.

2 Simulation Model

The building setup consists of two separate rooms that open onto a common hallway that wraps around to two different exits. The rooms were populated with people from different age and gender groups, who were given walking speeds accordingly. The exits were placed such that the building has one shortest path (route 1 in Fig. 1(a)), two paths of equal length (routes 2 and 4 in Fig. 1(a)), and a longest path (route 3 in Fig. 1(a)).

Fig. 1.

(a) A finite state automaton showing states and actions overlaid on the building layout, and (b) illustration of interaction with the spatially bounded confidence model (Color figure online)

2.1 Decision Model

The underlying decision logic for individuals is modeled as a Markov decision process. A Markov decision process is defined by \(M=\{S,A,P,\gamma ,R\}\) where:

  • S is the set of all possible decision states,

  • A is the set of all available decision/actions,

  • P is the transition probability \(P(s,a,s')\). It gives the probability an individual assigns for successful physical transition to state \(s'\) from state s after deciding to take action a,

  • R is the set of rewards or payoffs assigned to the various decisions by an individual. The individual’s overall route choice depends on the reward structure,

  • \(\gamma \) is the discount factor \(\in [0,1)\), which makes the computation of accumulated rewards mathematically tractable.

Each individual has exits \(1, \ 2, \ 3, \ 4,\) and 5, marked as \(E_1\), \(E_2\), \(E_3\), \(E_4\), \(E_5\), and the trails connecting the exits, marked as \(T_{ij}\) (see Fig. 1(a)), as available decision states. \(T_{ij}\) denotes the corridor connecting the \(i^{th}\) exit to the \(j^{th}\) exit. Every individual can decide to move towards one of the immediately available exit points, and will land in the state corresponding to their current position. The set of available actions consists of the decisions to move towards exits and the action of exiting, labeled \(e_1, \ e_2, \ e_3, \ e_4, \ e_5\), and e respectively in Fig. 1(a).

Initially, the transition probability \(P(s,a,s')\) for all state and action pairs is set to 0.9, and \(P(s,a,s) = 1-P(s,a,s')\), to take environmental uncertainties into account. The transition probability for action e, \(P(s,e,s')\), is reduced as time progresses to account for impatience: \(P(s,e,s')=P(s,e,s')\times \exp (-\alpha \times t_{diff})\), where \(t_{diff}=\,\)Time spent in state \(T_{ij}\) − Estimated travel time to exit \(E_j\). We experimented with three different impatience growth rates \(\alpha \) to simulate different crowd behaviors.
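The impatience decay can be illustrated with a minimal Python sketch. The function name and the example times are hypothetical, and clamping \(t_{diff}\) at zero (so that no decay is applied before the estimated travel time has elapsed) is an assumption that the text does not state.

```python
import math


def decayed_exit_probability(p_base, alpha, time_in_trail, estimated_travel_time):
    """Return P(s, e, s') after applying the impatience factor exp(-alpha * t_diff)."""
    t_diff = time_in_trail - estimated_travel_time
    # Assumption (not stated in the text): no decay before the estimated
    # travel time has elapsed, so the result never exceeds p_base.
    t_diff = max(t_diff, 0.0)
    return p_base * math.exp(-alpha * t_diff)


# Hypothetical example: 30 s spent in a trail whose estimated travel time is 20 s.
p_move = decayed_exit_probability(0.9, 0.05, time_in_trail=30.0, estimated_travel_time=20.0)
p_stay = 1.0 - p_move  # complementary probability of remaining in the same state
```

With a larger \(\alpha \), the same delay lowers the probability faster, which is how a more impatient crowd would be represented in this sketch.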

The exits are given decreasing rewards from outward to inward (\(E_4,\ E_5>E_2,\ E_3> E_1\)). The trail state rewards are inversely proportional to the trail length and are upper bounded by the minimum reward for all the exits. Individuals will typically choose the shortest path. However, if the lanes are crowded, they tend to move towards the next best available route to reach either exit 4 or 5 as quickly as possible. This reward structure enables the decision maker to seek the decision state that leads to the shortest path towards the exit, but the framework allows individuals to change their decision if they are unable to reach their desired exit within a reasonable time frame.
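As a rough illustration of this reward structure, the sketch below caps an inverse-length trail reward at the smallest exit reward. The numeric rewards, the `scale` constant and the function name are all hypothetical; the text only specifies the ordering of the exit rewards and the inverse proportionality.

```python
def trail_reward(length_ft, scale, min_exit_reward):
    """Reward inversely proportional to trail length, capped by the smallest exit reward."""
    if length_ft <= 0:
        raise ValueError("trail length must be positive")
    return min(scale / length_ft, min_exit_reward)


# Hypothetical exit rewards that respect E4, E5 > E2, E3 > E1.
exit_rewards = {"E1": 10.0, "E2": 20.0, "E3": 20.0, "E4": 40.0, "E5": 40.0}
cap = min(exit_rewards.values())

short_trail = trail_reward(length_ft=20.0, scale=100.0, min_exit_reward=cap)
long_trail = trail_reward(length_ft=80.0, scale=100.0, min_exit_reward=cap)
very_short = trail_reward(length_ft=5.0, scale=100.0, min_exit_reward=cap)  # hits the cap
```

The cap keeps any trail from looking more attractive than an exit, so exits remain the goal of every plan.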

Individuals are assigned a decision timer (\(\tau _i\)) drawn from a normal distribution. Each individual performs a planning routine whenever their decision timer expires. To plan their route, individuals compute the value of the available states (exits and trails), compare the values, and decide to move along the trail with the highest value. The value of a state is the expected cumulative reward that can be obtained from that state. The discount factor is used in the summation to weigh the immediate reward more than the future rewards. Formally, a value iteration algorithm is used to find the value of states.
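The planning routine can be sketched as a generic value iteration over a toy state graph. This is a minimal illustration under stated assumptions, not the authors' implementation: the state names (`room`, `T_short`, `T_long`, `E`), the action names, and the reward values are hypothetical, and rewards are assumed to be attached to states.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Compute state values for V(s) = R(s) + gamma * max_a sum_s' P(s,a,s') V(s')."""
    values = {s: 0.0 for s in rewards}
    for _ in range(max_iters):
        delta = 0.0
        for s in rewards:
            actions = transitions.get(s, {})
            if actions:
                best = max(
                    sum(p * values[s2] for p, s2 in outcomes)
                    for outcomes in actions.values()
                )
            else:
                best = 0.0  # terminal state: no future reward
            new_value = rewards[s] + gamma * best
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value
        if delta < tol:
            break
    return values


def greedy_action(state, transitions, values):
    """Pick the action whose expected next-state value is highest."""
    actions = transitions[state]
    return max(actions, key=lambda a: sum(p * values[s2] for p, s2 in actions[a]))


# Hypothetical toy layout: a room, a short and a long trail, and one exit.
# Each action succeeds with probability 0.9 and otherwise leaves the state unchanged.
transitions = {
    "room": {
        "go_short": [(0.9, "T_short"), (0.1, "room")],
        "go_long": [(0.9, "T_long"), (0.1, "room")],
    },
    "T_short": {"exit": [(0.9, "E"), (0.1, "T_short")]},
    "T_long": {"exit": [(0.9, "E"), (0.1, "T_long")]},
    "E": {},
}
rewards = {"room": 0.0, "T_short": 5.0, "T_long": 1.25, "E": 10.0}

values = value_iteration(transitions, rewards)
choice = greedy_action("room", transitions, values)
```

In this toy layout the shorter trail carries the larger reward, so the greedy choice from `room` is `go_short`; lowering the success probability of its `exit` action, as impatience does over time, would reduce its value relative to the alternative.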

The values of states found with the value iteration algorithm satisfy the Bellman optimality condition [8]. The Bellman optimality condition states that the action taken at a state has to result in landing at the best possible next state with respect to the calculated values. Thus each individual optimizes his/her route at every decision cycle.
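For reference, one common way of writing this condition, assuming that rewards are attached to states as in the definition of \(M\) above, is

\[ V^{*}(s) = R(s) + \gamma \max _{a \in A} \sum _{s' \in S} P(s,a,s')\, V^{*}(s'), \]

where \(V^{*}(s)\) denotes the optimal value of state s (a symbol introduced here for illustration). Value iteration repeatedly applies the right-hand side as an update until the values stop changing, and the chosen action at s is the one that attains the maximum.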

2.2 Opinion Sharing Framework

Humans have a tendency to herd, which is captured in this paper with a spatially bounded confidence model. The bounded confidence model [9] is modified to suit the egress dynamics by using the distance between individuals as the confidence boundary metric. After completing a value iteration cycle, each individual interacts with the individuals within their herding range (r) and modifies their perceived value of states according to \(V_{self} = (1-\mu )\times V_{self} + \mu \times average~of~V_{others~within~r}\), where \(\mu \) is the herding level, i.e., how much weight individuals give to the herd’s opinion. The value function is normalized for each individual to ensure that the herding effect is uniform.
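The update rule can be sketched in a few lines of Python. This is an illustrative sketch only: the max-absolute-value normalization is an assumption (the text does not specify the normalization), the function and variable names are hypothetical, and walls are ignored.

```python
import math


def normalize(values):
    """Scale a state-value dict so its largest absolute value is 1 (assumed scheme)."""
    peak = max(abs(v) for v in values.values())
    return {s: (v / peak if peak > 0 else 0.0) for s, v in values.items()}


def herding_update(index, positions, values, r, mu):
    """Blend one agent's normalized values with the mean of neighbours within range r."""
    x0, y0 = positions[index]
    neighbours = [
        normalize(values[j])
        for j, (x, y) in enumerate(positions)
        if j != index and math.hypot(x - x0, y - y0) <= r
    ]
    own = normalize(values[index])
    if not neighbours:
        return own  # nobody in range: opinion unchanged
    return {
        s: (1 - mu) * own[s] + mu * sum(n[s] for n in neighbours) / len(neighbours)
        for s in own
    }


# Hypothetical example: agent 0 prefers E4, a neighbour 5 ft away prefers E5,
# and a third agent 30 ft away is outside the herding range r = 10 ft.
positions = [(0.0, 0.0), (3.0, 4.0), (30.0, 0.0)]
values = [
    {"E4": 10.0, "E5": 5.0},
    {"E4": 1.0, "E5": 8.0},
    {"E4": 0.0, "E5": 1.0},
]
updated = herding_update(0, positions, values, r=10.0, mu=0.4)
```

In this example the blended values for agent 0 come out slightly in favour of \(E_5\), showing how a single nearby neighbour can flip a weakly held preference at \(\mu = 0.4\).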

An interaction process for an individual (blue) is depicted in Fig. 1(b). The boundary of the interaction/herding zone is shown with the green circle. Agents within the zone and not separated by walls are allowed to share opinions (green).

2.3 Movement Model

The positions of the individuals are updated every second. Each individual attempts to move towards their respective exit choice. Every individual occupies a circle of one foot radius, and an additional one foot radius is designated as personal space. Every individual attempts to move while respecting others’ personal space and avoiding collisions with walls and people. Several scenarios were investigated with this hybrid model. The results are presented and discussed in the following section.
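A single movement update might look like the following sketch. The blocking rule (an agent waits if its body would enter another agent's personal space), the walking speed, and all names are assumptions made for illustration; the text does not specify the collision-avoidance rule, and walls are omitted here.

```python
import math

BODY_RADIUS_FT = 1.0     # from the text: each person occupies a circle of one foot radius
PERSONAL_SPACE_FT = 1.0  # from the text: an additional one foot of personal space


def try_step(index, positions, target, speed_ft_per_s, dt=1.0):
    """Move one agent straight towards its target for dt seconds unless blocked."""
    x, y = positions[index]
    dx, dy = target[0] - x, target[1] - y
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (x, y)  # already at the target
    step = min(speed_ft_per_s * dt, dist)
    candidate = (x + dx / dist * step, y + dy / dist * step)
    # Assumed rule: the mover's body must stay outside the other agent's
    # body-plus-personal-space circle, i.e. centres at least 3 ft apart.
    min_gap = 2 * BODY_RADIUS_FT + PERSONAL_SPACE_FT
    for j, (ox, oy) in enumerate(positions):
        if j != index and math.hypot(candidate[0] - ox, candidate[1] - oy) < min_gap:
            return (x, y)  # blocked: wait during this update
    return candidate


# Hypothetical examples with a 4 ft/s walking speed and an exit at (20, 0).
blocked = try_step(0, [(0.0, 0.0), (5.0, 0.0)], target=(20.0, 0.0), speed_ft_per_s=4.0)
free = try_step(0, [(0.0, 0.0), (0.0, 10.0)], target=(20.0, 0.0), speed_ft_per_s=4.0)
```

A fuller model would try alternative directions instead of waiting, but even this simple rule produces queues at narrow openings.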

3 Results and Discussion

3.1 Shortest Path Decision Makers

For this set of simulations, each individual’s decision model was assigned a reward/reinforcement function that preferred the shortest route. As evident from the congestion map (Fig. 2(a)), route 1 (left, then down) was the most utilized path and route 4 (right, then up) was the second most utilized path. Route 1 was the natural choice for the crowd since it is the shortest path to safety. As every individual tried to go through route 1, it became crowded and impatience grew, so part of the crowd started to move along route 4. The highest congestion occurred at the room exits, followed by the corridor just outside the rooms.

Fig. 2.

(a) Heat map indicating congestion along the routes and (b) effect of different herding levels (\(\mu \)) on the average time taken by individuals to exit the building with the shortest path and familiar path reward functions (Common parameters: \(N = 300\), \(r = 10~ft\), \(\alpha = 0.05\), and \(\tau = 4s\))

3.2 Familiar Path Decision Makers

For this set of simulations, the crowd was initialized with a familiar path reinforcement function. The crowd was randomly and evenly divided into four groups, and each group was given a reward function that made one of the four available paths the familiar route for the individuals in the group. At all herding levels, the shortest path crowd fared better than the familiar path crowd (Fig. 2(b)). Quicker evacuation was observed when the crowd consisted of more receptive individuals. Cooperation was better when evacuees did not have complete unbiased knowledge of their environment.

3.3 Shortest Path Decision Maker with Familiar Path Leaders

The next set of simulations was conducted to study the effect of leaders with a biased route choice on the crowd’s egress dynamics. The leaders are characterized by a strong bias and stick to their opinion, i.e., their exit choice is affected only by the environment and not by other individuals. The crowd is composed of a few leaders and many shortest path seeking individuals.

Fig. 3.

(a) \(\alpha = 0.05\), \(N = 120\), and leader with route choice 4 - Effect of number of strong opinion holders on the average time to evacuate, and (b) Number of strong opinion holders, \(\lambda = 10\) - Effect of different route choice of leaders on the average time taken by individuals to exit the building (Common parameters: \(\tau = 4\,\mathrm{s}\), \(r = 10\,\mathrm{ft}\), and \(\mu = 0.4\))

Effect of number of leaders (\(\lambda \)): The results with the specific simulation parameters are shown in Fig. 3(a). The average time to exit the building decreased with more leaders in the crowd. The crowd moved with the leaders and avoided congestion at route 1 and reached safety faster. Route 4 was chosen in particular because it was the second best choice among the available routes taking into account distance to travel and the potential congestion in the corridors.

Effect of route choice of leaders under different impatience levels: The final set of simulations concerned the route choice of the leaders. The simulations were conducted with a fixed number of leaders (\(\lambda = 10\)) in a crowd of 110 people. The effect of leaders was diminished (Fig. 3(b)) when the crowd consisted of individuals with faster impatience growth (\(\alpha = 0.1\)). One logical explanation is that, in a crowd of highly impatient individuals, the leaders’ ability to sway and hold the opinion of other individuals for a long time is diminished. With a less impatient crowd (\(\alpha = 0.05\)), every leader route bias helped get the crowd to safety quicker, except for route 2, which puts additional pressure on an already crowded lane.

4 Conclusion and Future Work

This work combines a naturalistic movement model and a decision making model with explicit opinion sharing dynamics (the hybrid model) to study the effect of opinion sharing and several other factors on the crowd evacuation metrics for a given building structure. Factors such as how receptive the crowd is to opinion sharing, how fast individuals tend to change their exit choice when confronted with crowded lanes/congestion, and the frequency of decision making affect the crowd’s evacuation time. Ideally, a communicative crowd seeking the shortest path with well informed leaders is well suited for a quick evacuation of the given building. Herding is not in itself detrimental to evacuation. However, over-herding can lead to under-utilization of the available routes, leading to an increase in the evacuation time. People with strong opinions can contribute to faster egress out of the building if their strong opinion aligns with the under-utilized route(s).