Examining military medical evacuation dispatching policies utilizing a Markov decision process model of a controlled queueing system

Jenkins, Phillip R.; Robbins, Matthew J.; Lunday, Brian J.

doi:10.1007/s10479-018-2760-z

Original Research
Published: 01 February 2018

Volume 271, pages 641–678, (2018)

Annals of Operations Research

Phillip R. Jenkins¹,
Matthew J. Robbins ORCID: orcid.org/0000-0002-1718-6839¹ &
Brian J. Lunday¹

Abstract

Military medical planners must develop dispatching policies that dictate how aerial medical evacuation (MEDEVAC) units are utilized during major combat operations. The objective of this research is to determine how to optimally dispatch MEDEVAC units in response to 9-line MEDEVAC requests to maximize MEDEVAC system performance. A discounted, infinite horizon Markov decision process (MDP) model is developed to examine the MEDEVAC dispatching problem. The MDP model allows the dispatching authority to accept, reject, or queue incoming requests based on a request’s classification (i.e., zone and precedence level) and the state of the MEDEVAC system. A representative planning scenario based on contingency operations in southern Afghanistan is utilized to investigate the differences between the optimal dispatching policy and three practitioner-friendly myopic policies. Two computational experiments are conducted to examine the impact of selected MEDEVAC problem features on the optimal policy and the system performance measure. Several excursions are examined to identify how the 9-line MEDEVAC request arrival rate and the MEDEVAC flight speeds impact the optimal dispatching policy. Results indicate that dispatching MEDEVAC units considering the precedence level of requests and the locations of busy MEDEVAC units increases the performance of the MEDEVAC system. These results inform the development and implementation of MEDEVAC tactics, techniques, and procedures by military medical planners. Moreover, an analysis of solution approaches for the MEDEVAC dispatching problem reveals that the policy iteration algorithm substantially outperforms the linear programming algorithms executed by CPLEX 12.6 with regard to computational effort. This result supports the claim that policy iteration remains the superlative solution algorithm for exactly solving computationally tractable Markov decision problems.

1 Introduction

The primary objective of a deployed military emergency medical services (EMS) system is to successfully evacuate casualties from the battlefield in a timely manner. Casualty evacuation (CASEVAC) and medical evacuation (MEDEVAC) are the two main options available for transporting combat casualties to a medical treatment facility (MTF). CASEVAC refers to the transport of casualties to an MTF via non-medical vehicles or aircraft without en route medical care by onboard medical professionals. Casualties transported via CASEVAC may not receive the necessary medical care nor be transported to an appropriate MTF. MEDEVAC refers to the transport of casualties to an appropriate MTF via standardized medical evacuation platforms with onboard medical professionals who are equipped to provide en route medical care and emergency medical intervention (Department of the Army 2016). As such, MEDEVAC is the preferred and primary method of transporting combat casualties.

Whereas MEDEVAC operations utilize several different types of evacuation platforms, this paper focuses on the aerial aspect of MEDEVAC operations (i.e., aeromedical helicopter operations). Helicopters have the capability and flexibility to fly directly to a predetermined casualty collection point (CCP), meet battlefield casualties when they are at their most vulnerable and critical stages, and either land in an area where no other platform (e.g., ground vehicle or fixed-wing aircraft) can or utilize a rescue hoist to lift casualties to the helicopter. After securing the casualties, helicopters can fly directly to dedicated trauma centers or hospitals, unencumbered by roads, at speeds often exceeding 150 miles per hour, all while providing definitive en route care via well trained and highly skilled medics (O’Shea 2011). These helicopter capabilities greatly contribute to recent increases in casualty survivability rates.

Helicopter ambulances were first introduced in the military during the Korean conflict and immediately became a high-visibility asset of the MEDEVAC system. By the end of the Vietnam War, the capabilities of helicopters (i.e., speed and versatility) in austere conditions far exceeded the capabilities of ground platforms. The ability to travel across terrain in remote areas not accessible to ground vehicles makes helicopters well suited for MEDEVAC operations (De Lorenzo 2003; Clarke and Davis 2012). The United States Army operates HH-60M helicopters specifically designed for the MEDEVAC mission. HH-60M helicopters are equipped with the necessary resources (e.g., oxygen generator, integrated EKG machine, electronically controlled litters, built-in external hoist, and an infrared system that can locate patients by their body heat) to provide medical personnel the ability to simultaneously treat and transport casualties from a CCP to an appropriate MTF. The urgency of the MEDEVAC mission is critical to the survivability of battlefield casualties and the HH-60M helicopter has proved to be advantageous to the Army with its ability to lift off on a mission within 7 min of notification (O’Shea 2011). The United States (U.S.) military recognizes the unique capabilities of MEDEVAC helicopters and utilizes them as the primary evacuation platform for battlefield casualties. For example, during the Afghanistan conflict between September 11, 2001 and March 31, 2014, the U.S. military incurred 21,089 casualties, of which 19,148 were transported via MEDEVAC helicopter (Kotwal et al. 2016). Eastridge et al. (2012) report that the survivability of combat casualties has continued to increase over time since World War II (WWII). Approximately 80% of casualties occurring on the battlefield survived in WWII, whereas 84% survived during the Vietnam War. An increase to 90% casualty survivability was observed in the continuous decade of United States’ conflicts between 2001 and 2011. The improved casualty survivability rates are attributed to improvements in the versatility and speed of MEDEVAC helicopters and the resulting decrease in the time required for casualties to receive proper medical care (De Lorenzo 2003).

Military medical planners are responsible for designing deployed MEDEVAC systems. An effective and efficient MEDEVAC system boosts the esprit de corps of deployed military personnel, who understand that rapid and quality care will be provided if they are injured in combat (Department of the Army 2016). Important decisions include determining where to locate MEDEVAC units and MTFs, identifying a MEDEVAC dispatching policy, and recognizing when redeployment of aeromedical helicopters is necessary and possible. The location of MEDEVAC units is usually determined while considering two objectives: maximizing coverage and minimizing response time subject to logistical, resource, and force protection constraints. Deciding which MEDEVAC unit to dispatch to a given service request is a vital aspect of any EMS, including a MEDEVAC system, and is the primary focus of this paper. The military often defaults to a myopic dispatching policy wherein the closest available MEDEVAC unit is dispatched to retrieve combat casualties from a CCP regardless of the request’s evacuation precedence category (e.g., Priority I—Urgent, Priority II—Priority, and Priority III—Routine). Redeployment of MEDEVAC units prior to returning to their originating base is possible but poses challenges due to the numerous resource and availability requirements (e.g., refueling, resupply, and armed escort). These reasons also render temporary relocation of idle MEDEVAC units uncommon within a theater of operations (Rettke et al. 2016).

This paper examines the MEDEVAC dispatching problem wherein a dispatching authority must decide which MEDEVAC unit to dispatch to a particular 9-line MEDEVAC request. The locations of MTFs and MEDEVAC assets are known and all MEDEVAC helicopters are assumed to have the capability to meet the mission requirements of any 9-line MEDEVAC request. Redeployment is not considered. The reported dispatch policy is based on the location and status of MEDEVAC units, the location of the casualty event, and the evacuation precedence category of the casualty event.

An infinite horizon, discounted Markov decision process (MDP) model is formulated to determine how to optimally dispatch MEDEVAC helicopters to casualty events occurring in combat to maximize the expected total discounted reward attained by the system. A computational example is applied to a MEDEVAC system in Afghanistan in support of combat operations. Comparisons are made between myopic policies that are typically utilized in practice and the optimal policy derived from the formulated MDP model. Herein, we consider three specific myopic policies that, while adopting the rule of dispatching the closest available MEDEVAC unit to service a request, if all MEDEVAC units are busy, respectively queue any MEDEVAC requests, queue only urgent MEDEVAC requests, or reject (i.e., do not queue) any MEDEVAC requests.
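The three myopic baselines differ only in what they do when every unit is busy, which a short sketch can make concrete. The following Python is illustrative only and is not taken from the paper: the names `QueueRule` and `myopic_action`, the precedence strings, and the representation of proximity as a mapping from unit name to flight time are all assumptions.

```python
from enum import Enum
from typing import Dict, Optional, Set, Tuple


class QueueRule(Enum):
    """What a myopic policy does when every MEDEVAC unit is busy."""
    QUEUE_ALL = "queue any request"
    QUEUE_URGENT_ONLY = "queue only urgent requests"
    REJECT_ALL = "reject any request"


def myopic_action(
    precedence: str,
    flight_time_to_request: Dict[str, float],
    busy_units: Set[str],
    rule: QueueRule,
) -> Tuple[str, Optional[str]]:
    """Return (action, unit) for one incoming request.

    flight_time_to_request maps each unit name to a hypothetical flight
    time (minutes) from its staging area to the request's zone.
    """
    available = {
        unit: minutes
        for unit, minutes in flight_time_to_request.items()
        if unit not in busy_units
    }
    if available:
        # Myopic rule: send the closest available unit, ignoring precedence.
        closest = min(available, key=available.get)
        return ("dispatch", closest)
    # All units are busy: the three baseline policies differ only here.
    if rule is QueueRule.QUEUE_ALL:
        return ("queue", None)
    if rule is QueueRule.QUEUE_URGENT_ONLY and precedence == "urgent":
        return ("queue", None)
    return ("reject", None)
```

Under these assumptions, a routine request arriving while all units are busy is queued under `QUEUE_ALL` but rejected under the other two rules, whereas an urgent request is rejected only under `REJECT_ALL`.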

An important difference between this paper and other papers in this research area is the incorporation of admission control and queueing. Consideration of admission control and queueing can greatly improve the performance of existing and proposed systems in many contexts, including manufacturing, distributed computing, and communications (Stidham 1985; Shenker and Weinrib 1989; Stidham and Weber 1993; Stidham 2002) and has yet to be examined in the context of MEDEVAC dispatching. Admission control allows the dispatching authority to observe the current state of the MEDEVAC system before making the decision to accept or reject an incoming request. This provides the dispatching authority the power to reject incoming requests, thereby reserving MEDEVAC units for higher precedence requests instead of satisfying all requests for service. The rejected requests are not simply discarded; rather, they are redirected to another supporting agency to be serviced (i.e., CASEVAC). If the dispatch authority allows a request to enter the MEDEVAC system but all MEDEVAC units are currently servicing other requests, the entering request will be allocated to a queue based on its precedence level and location as categorized by geographic zone. Once a request has entered the system, it will be serviced; however, the dispatching authority dictates which available MEDEVAC unit will service each request in the system regardless of when the request entered the system. For example, an urgent request will be serviced before a routine request regardless of the order in which they entered the system. It is important to note that MEDEVAC units will not interrupt service to a request in the case of a higher precedence request arriving. Once a MEDEVAC unit is assigned a specific request, it will be considered unavailable until it completes the service of that request.

The remainder of this paper is organized as follows. Section 2 provides a review of research relating to MEDEVAC systems. Section 3 presents a description of the MEDEVAC dispatching problem. Section 4 describes the MDP formulation developed to determine an optimal MEDEVAC dispatch policy. Section 5 examines an application of the formulated MDP model based on a representative scenario in southern Afghanistan. Section 6 concludes the paper and proposes several directions for future research.

2 Literature review

For nearly half a century, research has been conducted on optimizing both civilian and military emergency medical services (EMS) response systems. The main features of this research include determining the location of servers; dictating the number of servers per location, the server dispatch policy, and the size and number of response zones (if a partitioning strategy for the service area is implemented); identifying which performance measure to focus on as the objective: response time thresholds (RTTs) or patient survivability rates; and recognizing if and when server relocation is necessary due to either a service completion or an incoming service request. Another complicating feature concerns the location of hospitals. In research examining civilian EMS systems, the locations of hospitals are usually given as fixed; however, in some military planning contexts the medical treatment facility (MTF) locations are not given. Military medical planners must decide where to best place MTF locations when designing a military medical evacuation (MEDEVAC) system (Rettke et al. 2016). Operations research (OR) methods provide rigorous, defensible, and quantitative insights to researchers examining EMS systems. Applied OR methods include stochastic modeling, queueing, discrete optimization, and simulation modeling (Green and Kolesar 2004).

The research presented in this paper examines the optimal dispatch of military EMS vehicles (i.e., HH-60M MEDEVAC helicopters) to prioritized requests for service. Consideration of the precedence category (e.g., Priority I (Urgent), Priority II (Priority), and Priority III (Routine)) is important. A substantial amount of research seeks to improve the overall performance of EMS systems, but most research endeavors do not account for the precedence of the call (Bandara et al. 2014). When the precedence of the call is not considered, the default dispatching rule sends the closest available emergency response vehicle to satisfy required service requests with no regard as to how that specific vehicle’s absence impacts the overall EMS system. Sending the closest available vehicle to a service request regardless of other factors (e.g., precedence or severity) is commonly referred to as a myopic policy. Many researchers (Carter et al. 1972; Nicholl et al. 1999; Kuisma et al. 2004) show that myopic policies tend to be suboptimal. Incorporating precedence categories into the construction of dispatching policies can ultimately lead to more lives being saved on the battlefield.

EMS research exists that focuses specifically on military MEDEVAC systems. Zeto et al. (2006) develop a goal programming model that seeks to maximize the aggregate expected demands covered and minimize the spare capacities of air ambulances. The authors leverage the work of Alsalloum and Rand (2006) to examine both the problems of resource allocation and coverage in a three-phased approach. In the first phase, they characterize the demand for MEDEVAC missions using a multivariate hierarchical cluster analysis. In the second phase, they estimate the parameters of the model via a Monte Carlo simulation. In the third phase, they utilize a bi-criteria model to emplace the minimum number of required aircraft at each location to maximize the probability of meeting the MEDEVAC demand in the Afghanistan theater. Bastian et al. (2012) investigate the capabilities required for MEDEVAC aircraft platforms to successfully perform the necessary duties and provide coverage within a brigade operating space. The authors develop a decision support tool that military medical planners can utilize to analyze the risk associated with different MEDEVAC strategies. Fulton et al. (2009) evaluate the planning factors and rules of allocation associated with Army air ambulance companies. Military medical planners typically use the rules of allocation, which are based on strategic planning documents, to estimate the number of MEDEVAC units required for tactical and operational scenarios. The authors quantitatively analyze different rules through a Monte Carlo simulation and record the impact that they respectively have on major combat operations. The results indicate that 0.4 aircraft per admission would be a reasonable planning factor. Sundstrom et al. (1996) incorporate linear programming techniques to develop a model based on the probabilistic location set-covering problem that provides the required numbers of MEDEVAC assets needed as well as the optimal positioning of those assets to ensure orderly transport of battlefield casualties to an appropriate medical facility.

The allocation of MEDEVAC units during steady-state combat operations is studied by Fulton et al. (2010) and Bastian (2010). Fulton et al. (2010) formulate a stochastic optimization model that manages the locations of deployable military hospitals, hospital beds, and both aerial and ground MEDEVAC units prior to the reception of a 9-line MEDEVAC request. Their model uses an objective of minimizing the total travel time, weighted by the urgency level of the casualty, from the point of injury (POI) to an appropriate MTF. The weights associated with the urgency levels of casualties are derived from historical data of patient injury severity scores collected from Operation Iraqi Freedom (OIF) combat operations. Bastian (2010) formulates a stochastic optimization goal programming model to meet three separate objectives: maximize the coverage of theater-wide casualty demand in Afghanistan, minimize the spare capacity of MEDEVAC units, and minimize the maximal MTF evacuation site vulnerability to enemy attack. The aforementioned research endeavors alternatively focus on optimizing the location, allocation, or reallocation of MEDEVAC assets. Although such problems are important to consider, this research assumes the locations of MEDEVAC staging areas and MTFs to be fixed and the allocation of MEDEVAC helicopters to be both known and fixed throughout steady-state combat operations. These assumptions are reasonable and are adopted in other works focusing solely on MEDEVAC dispatching policies (e.g., see Keneally et al. (2016) and Rettke et al. (2016)).

Keneally et al. (2016) examine MEDEVAC dispatch policies in the Afghanistan theater via a Markov decision process (MDP) model. The authors assume that each service call arrives sequentially and the locations of each service center are predetermined. Their work classifies each service call into one of three evacuation precedence categories: urgent, priority, and routine. Moreover, they consider the possibility that an armed escort may be required to accompany the MEDEVAC unit. The authors utilize a reward function based on an RTT and conduct computational experiments wherein MEDEVAC units operate in support of Operation Enduring Freedom (OEF). The results highlight that the myopic policy (i.e., the default policy in practice) is not always the optimal dispatching policy. This work herein extends research by Keneally et al. (2016) via the consideration of admission control and queueing. Moreover, this research measures performance via a survivability function rather than an RTT since survival probability more accurately represents casualty outcomes (Bandara et al. 2014). Grannan et al. (2015) develop a binary linear programming (BLP) model to determine where to locate and how to dispatch multiple types of military MEDEVAC air assets. A spatial queueing approximation model provides inputs to the BLP model. The BLP model incorporates the precedence of each service call to maintain a high likelihood of survival for the most urgent casualties. The overall objective is to maximize the proportion of high-precedence calls responded to within a pre-determined RTT.

Rettke et al. (2016) formulate an MDP model to examine the MEDEVAC dispatching problem. The problem instance size in their study is too large for an exact dynamic programming solution approach, so the authors employ approximate dynamic programming (ADP) techniques to determine a high-quality dispatch policy. The computational experiments in this study indicate that the authors’ ADP-generated policy is nearly 31% better than the myopic policy. Military medical planners can use these results to improve existing MEDEVAC tactics and techniques. The problem instances in this research can be solved via exact dynamic programming methods and, therefore, do not utilize ADP techniques to generate MEDEVAC dispatching policies. Moreover, Rettke et al. (2016) assume that all incoming requests must be serviced if there are any MEDEVAC units available and queue incoming requests if all MEDEVAC units are busy. This paper relaxes the assumption that all incoming requests must be serviced and gives the dispatching authority the option to reject incoming requests based on the MEDEVAC system state. Lejeune and Margot (2016) propose a MEDEVAC model that considers endogenous uncertainty in the delivery times of casualties. The objective of their model is to provide prompt medical treatment and evacuation to soldiers injured in combat. The model determines where to locate MEDEVAC units and MTFs. Moreover, it helps the dispatch authority to determine which helicopters to dispatch and to which MTF each call should be transported to. Results indicate a reduction in battlefield deaths due to an increase in timely treatment of combat casualties when compared to a myopic policy. The dispatching policies generated by Lejeune and Margot (2016) assign response districts to each MEDEVAC staging area regardless of the system state. 
In contrast, the research herein utilizes the benefits of MDP models to determine the optimal dispatching decision for every feasible state in which the MEDEVAC system can be, considering all MEDEVAC assets in the enterprise.

3 Problem description

One of the primary missions of the Army Health System is to provide medical evacuation (MEDEVAC) across a wide range of military operations. The dedicated Army helicopters (i.e., rotary-wing aircraft or air ambulances) utilized in MEDEVAC missions are under the command of the general support aviation battalion (GSAB). Any use of air ambulances must first be coordinated with the supporting GSAB to synchronize evacuation procedures. The GSAB manages all activities related to the execution of aerial operations and serves as the primary decision-making authority for the military MEDEVAC system (Department of the Army 2016). An Army aeromedical evacuation officer (AEO) who works within the GSAB acts as the MEDEVAC dispatching authority in a deployed military emergency medical service (EMS) system (Fish 2014). AEOs direct the use of medical aircraft, personnel, and equipment in support of operational and strategic medical evacuations within a theater of operations.

When a casualty event occurs and a 9-line MEDEVAC request is submitted, the AEO must decide quickly which MEDEVAC unit (if any) to dispatch. The casualty survivability rate will decrease if there are delays in decision making. To complicate matters further, there are many situations in which MEDEVAC units require a team of armed helicopters to escort them to the casualty site due to high threat-level conditions (e.g., enemy troops in the area). Armed escort requirements can potentially increase the overall response time, which ultimately decreases the chances of casualties surviving. Therefore, it is vital that the GSAB implements a dispatching policy resulting in rapid and high-quality transport of life-threatening battlefield casualties from a pre-determined casualty collection point (CCP) to the nearest, most appropriate medical treatment facility (MTF). The procedures outlined in the Army’s Medical Evacuation Field Manual (Department of the Army 2016) and the graphical representations that Keneally et al. (2016) and Rettke et al. (2016) offer in their problem descriptions are utilized as a basis for the MEDEVAC mission timeline depicted in Fig. 1.

A 9-line MEDEVAC request is transmitted in a standardized message format with a prescribed amount of information that helps expedite the process of transporting casualties. When a 9-line MEDEVAC request is determined to be necessary, it should be transmitted over a secure communication system via a dedicated frequency. However, a 9-line MEDEVAC request can still be transmitted without such precautions if no secure communication systems are available. In wartime conditions, the information required in a 9-line MEDEVAC request is reported in the following order: the location of the pickup site (i.e., CCP), radio frequency and call sign, number of casualties by precedence, special equipment required, number of casualties by type, security of pickup site, method of marking pickup site, casualty nationality and status, and any chemical, biological, radiological, and nuclear contamination. The United States Army utilizes a three-category casualty triage rubric that governs the evacuation precedence of 9-line MEDEVAC requests. Priority I (i.e., urgent) and Priority II (i.e., priority) requests are life-threatening and must be serviced within 60 min and 4 h, respectively. Priority III (i.e., routine) requests are not life-threatening but still must be serviced within 24 h (Department of the Army 2016). Either the senior military member or the senior medical person (if available) at the scene identifies the evacuation precedence category of each casualty and determines whether a 9-line MEDEVAC request is necessary. The tactical situation and the condition of each casualty are taken into consideration when making this decision. The overall precedence of a 9-line MEDEVAC request is based on the most time sensitive precedence among the casualties. Correct casualty event category identification is vital and cannot be overemphasized because mistakes may burden the evacuation system. Aerial ambulances are a low-asset, high-demand resource that must be managed accordingly.
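The precedence rule described above (a request inherits the most time-sensitive precedence among its casualties) and the stated service limits can be sketched as follows. The names `Precedence`, `SERVICE_LIMIT_MIN`, and `request_precedence` are hypothetical, and the integer encoding of the categories is an assumption made for illustration.

```python
from enum import IntEnum
from typing import Iterable


class Precedence(IntEnum):
    """Evacuation precedence; a lower value is more time sensitive."""
    URGENT = 1    # Priority I
    PRIORITY = 2  # Priority II
    ROUTINE = 3   # Priority III


# Service limits in minutes, as stated in the text (60 min, 4 h, 24 h).
SERVICE_LIMIT_MIN = {
    Precedence.URGENT: 60,
    Precedence.PRIORITY: 4 * 60,
    Precedence.ROUTINE: 24 * 60,
}


def request_precedence(casualties: Iterable[Precedence]) -> Precedence:
    """Overall request precedence: the most time-sensitive casualty wins."""
    casualty_list = list(casualties)
    if not casualty_list:
        # Assumption: a request always reports at least one casualty.
        raise ValueError("a 9-line request must report at least one casualty")
    return min(casualty_list)
```

For example, a request reporting one routine and one urgent casualty is classified as urgent and so carries the 60 min limit.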

In a combat situation, requests for MEDEVAC units are typically made at the point-of-injury (POI) once enemy fire has been suppressed. MEDEVAC requests are transmitted through several layers of command before reaching an AEO working within the GSAB headquarters. The specific information flow depends on the communication infrastructure within the command, the communication equipment available to the requesting unit, and the command and control organization of the MEDEVAC system (Rettke et al. 2016). Once the request has been made, casualties are transported to a CCP, which is a predesignated point along the evacuation route for collecting the wounded (Department of the Army 2000). The time at which the MEDEVAC request reaches the AEO is denoted by $T_1$.

Once the GSAB receives the 9-line MEDEVAC request, the AEO must then decide whether to immediately assign a MEDEVAC unit to the request, depending on any pre-existing requests in the MEDEVAC system, the location of the pick-up site, the number and precedence of the casualties, and the status of the MEDEVAC units. If the MEDEVAC system is burdened with a high number of requests, the AEO may reject the incoming request from entering the system and redirect the request to be handled by casualty evacuation (CASEVAC). Assuming the request enters the system, the AEO will wait for a suitable MEDEVAC unit to become available. At time $T_2$, the AEO assigns the MEDEVAC unit to service the request with an armed escort, if required.

The amount of time between an AEO receiving the 9-line MEDEVAC request, $T_1$, and the assignment of the MEDEVAC unit, $T_2$, is the total wait time for the request in the MEDEVAC system. As stated earlier, once a 9-line MEDEVAC request is received by the GSAB, the AEO must decide whether the request should enter the MEDEVAC system or the request should be serviced by another organization (i.e., CASEVAC). If the AEO allows the request to enter the MEDEVAC system and at least one suitable MEDEVAC unit is available to service the request, another decision must be made regarding whether the request should be assigned immediately or the request should be placed in a queue based on its precedence category and location (i.e., zone). If the AEO allows a request to enter the MEDEVAC system and no suitable MEDEVAC units are available to service the request, then the request is placed in its respective zone-precedence queue. Figure 2 depicts the multiple-server, multiple-buffer queueing model employed in this paper. The MEDEVAC queueing system represented in Fig. 2 visually depicts the wait time between points $T_1$ and $T_2$ in Fig. 1.

Decision epochs occur when a 9-line MEDEVAC request is received by the GSAB or when a MEDEVAC unit completes a service request and becomes available. When a 9-line request is submitted and received by the GSAB, the AEO’s decision consists of sending the just-arrived 9-line MEDEVAC request to its respective zone-precedence queue (if the queue is not full), immediately assigning an available MEDEVAC unit to service the request, or rejecting the request from ever entering the system. Once a MEDEVAC unit reaches service completion and at least one of the zone-precedence queues is not empty, the AEO must make a decision. The AEO’s decision consists of either assigning a queued 9-line MEDEVAC request to one of the idle MEDEVAC units or waiting for either another (possibly higher precedence) request to enter the system or another MEDEVAC unit to reach service completion.
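The options open to the AEO at each type of decision epoch can be enumerated in a short sketch. This is a simplified illustration and not the paper's MDP formulation: `SystemState`, `feasible_actions`, the `(zone, precedence)` tuple encoding, and the single shared queue capacity are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

QueueKey = Tuple[int, str]  # (zone, precedence); hypothetical encoding


@dataclass
class SystemState:
    idle_units: List[str]        # units available at their staging areas
    queues: Dict[QueueKey, int]  # waiting requests per zone-precedence queue
    queue_capacity: int          # assumed common capacity of every queue


def feasible_actions(state: SystemState, arrival: Optional[QueueKey]) -> List[tuple]:
    """List the AEO's options at a decision epoch.

    arrival is the (zone, precedence) of a just-received request, or None
    when the epoch was triggered by a service completion.
    """
    actions: List[tuple] = []
    if arrival is not None:
        # Request arrival: reject, queue (if room), or dispatch an idle unit.
        actions.append(("reject",))
        if state.queues.get(arrival, 0) < state.queue_capacity:
            actions.append(("queue", arrival))
        actions.extend(("dispatch", unit, arrival) for unit in state.idle_units)
        return actions
    # Service completion: assign a queued request to an idle unit, or wait.
    actions.append(("wait",))
    for key, waiting in state.queues.items():
        if waiting > 0:
            actions.extend(("dispatch", unit, key) for unit in state.idle_units)
    return actions
```

In this sketch, rejection is always available on arrival, which mirrors the admission control described above, while queueing is offered only when the matching zone-precedence queue is not full.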

The information from the 9-line MEDEVAC request is transmitted to the assigned MEDEVAC unit through the command’s communication system. $T_3$ denotes the time at which the assigned MEDEVAC unit departs its station for the CCP. The amount of time between the MEDEVAC unit being assigned the 9-line MEDEVAC request, $T_2$, and the MEDEVAC unit departure, $T_3$, is the total mission preparation time, which includes preparing the medical equipment, medical personnel, and helicopters for the MEDEVAC mission. Typically, if an armed escort is required, it will take off with the MEDEVAC unit at the staging area, but there are situations in which the MEDEVAC unit must meet an armed escort at a predetermined rally point en route to the CCP. The MEDEVAC unit cannot land at a high-threat level CCP site without an armed escort, and the additional coordination for an armed escort can increase total response time.

$T_4$ denotes the time at which the MEDEVAC unit lands at the CCP site. Upon arrival at the CCP site, the MEDEVAC unit immediately loads casualties and begins initial medical treatment. $T_5$ denotes the time at which the MEDEVAC unit departs the CCP site and proceeds towards an MTF. The destination MTF is selected in a deterministic manner based on sufficiency of medical capability to treat casualties and proximity to the CCP site. The sufficiently capable MTF that is located closest to the CCP site is the one that the MEDEVAC unit departs to at time $T_5$.

The MEDEVAC unit arrives at the MTF site at time $T_6$. After arriving, the MEDEVAC unit immediately begins to unload casualties and transfers the responsibility of subsequent care of the casualties to the medical staff at the MTF. After all casualties have been unloaded, the MEDEVAC unit departs the MTF and travels back to its own staging area. Once a MEDEVAC unit has finished unloading and transferring the subsequent care of casualties to the MTF medical staff, it must return to its own staging area before being tasked to service another 9-line MEDEVAC request. This requirement comes from concerns about low fuel levels, crew bed down limitations, on-board equipment configurations, and other logistical issues (Rettke et al. 2016). Typically, MEDEVAC units must return to their home staging areas to refuel before being dispatched for another mission. $T_7$ denotes the time at which the MEDEVAC unit departs the MTF.

The MEDEVAC unit arrives back at its staging area at time $T_8$. Once the MEDEVAC unit arrives back at its staging area, the mission is considered complete. The MEDEVAC unit then becomes available for dispatch to another 9-line MEDEVAC request.

It is important to note that battlefield conditions (e.g., enemy disposition, required equipment being transported, weather conditions, and the air density due to flight altitude) are expected to affect the travel times from the MEDEVAC staging area to the CCP site, from the CCP site to the selected MTF location, and from the MTF location back to the MEDEVAC staging area.

Military medical planners must consider the measurement of MEDEVAC system performance when examining dispatch policies. In civilian operations, the efficacy of EMS systems has been difficult to evaluate due to the multitude of variables present (MacFarlane and Benn 2003). The search for a reliable measure of performance remains a topic of interest in the EMS field (e.g., see McLay and Mayorga 2010). Practitioners and researchers employ various means of assessment. The most common method for evaluating EMS systems utilizes ambulance response times. EMS systems commonly define the response time as the time required to reach the patient after receiving the emergency call. Since EMS systems are evaluated on response time, one of their primary focuses is the rapid response to cardiac arrest situations. This emphasis exists because the ability to provide effective treatment to patients undergoing cardiac arrest is time-sensitive. Moreover, if the EMS system can respond quickly to cardiac arrest patients, then it is more likely to be able to service similar life-threatening medical situations. Therefore, defining the response time for a civilian EMS system as the time between receiving the emergency call and the arrival of the first emergency response vehicle on scene is quite intuitive.

Nonetheless, MEDEVAC system performance cannot be measured using the same evaluation criteria as the civilian EMS system. Several additional factors complicate the medical evacuation of a casualty from a battlefield. The travel times, load times, and unload times can be much greater and vary more in military EMS systems when compared to a civilian EMS system. Moreover, the primary cause of death for battlefield casualties is blood loss, not cardiac arrest. Garrett (2013) indicates that blood loss is the primary cause of death for nearly 85% of soldiers killed in action. For this reason, some MEDEVAC units have recently been equipped with in-flight blood transfusion capabilities; however, the majority are not, and there is a lack of data to confirm whether this addition improves the ability to handle casualties with severe blood loss (Malsby III et al. 2013). Without sufficient data to determine the effectiveness of in-flight transfusion, there has not been a change in the MEDEVAC system’s evaluation measure. Therefore, unlike in civilian EMS systems, it is vital to stabilize and transport battlefield casualties to an appropriate MTF (e.g., one that has the capability and resources to perform necessary care such as blood transfusions) and into surgery rather than simply providing medical aid at the CCP. So, while civilian EMS systems measure performance by response time (i.e., the time it takes to reach the patient after receiving the emergency call), military EMS systems are evaluated in terms of how long it takes to transport the casualties from the CCP to an MTF. Therefore, it is appropriate to define the response time for a MEDEVAC unit as $T_7-T_2$. Moreover, the service time for a MEDEVAC unit is defined as $T_8-T_2$, which is commonly regarded as the time expended to service a request.

The primary objective of the MEDEVAC system presented in this paper is to dispatch MEDEVAC units in a manner that maximizes the expected total discounted reward attained by the system. The dispatch authority (i.e., AEO) must make sequential decisions under uncertainty regarding which available MEDEVAC unit to dispatch to service a 9-line MEDEVAC request. The system earns rewards based on the response times associated with servicing 9-line MEDEVAC requests. It is impossible to know exactly when and where casualty events will occur, which prevents the dispatch authority from having a priori information on subsequent 9-line MEDEVAC requests. The knowledge and details of any 9-line MEDEVAC request only become known to the MEDEVAC system upon receipt of the request. Once the GSAB receives the request and the AEO selects a MEDEVAC unit to dispatch, the assigned MEDEVAC unit must initiate mission protocols immediately. The mission protocols of a MEDEVAC unit include preparing medical personnel and equipment prior to departure, traveling to the CCP to pick up casualties, providing appropriate en route medical care, and transporting casualties to the nearest MTF in a rapid and efficient manner. Delaying any mission tasks negatively impacts the total response time and ultimately decreases the survivability rates of casualties awaiting service.

Both a dynamic and stochastic approach are needed when analyzing the dispatch of either civilian or military emergency response vehicles. The stochastic aspect of this problem derives from the uncertainty concerning the manifestation of casualty events. Moreover, the dispatch, travel, and service times vary for each request and cannot be predicted precisely. When examining civilian EMS systems, the data relating to dispatch, travel, and service times are easily accessible and can be leveraged to parameterize decision models. Unfortunately, as noted earlier, one of the implicit challenges for military medical planners is having to develop and identify a dispatching policy prior to commencement of combat operations. No casualty event data exists for such a situation. Therefore, this paper utilizes a rubric that emulates the judgment and expertise of military planners with regard to the future interactions of enemy and friendly forces to identify the locations and arrivals of casualty events.

4 Methodology

This section presents the Markov decision process (MDP) model of the military’s medical evacuation (MEDEVAC) dispatching problem. One of the key benefits of formulating an MDP model is that it provides a framework in which dynamic programming algorithms can be utilized to compute exact optimal policies. In most cases, MDP formulations have clear definitions for the state space, action space, rewards, transition probabilities, and optimality equations.

The objective of the MDP model formulated in this paper is to determine which available MEDEVAC unit to dispatch in response to a 9-line MEDEVAC request with the purpose of maximizing the expected total discounted reward over an infinite horizon.

The MDP model assumes that 9-line MEDEVAC requests arrive according to a Poisson process with parameter $\lambda $ that is denoted by $PP(\lambda )$. Recall that a Poisson process possesses independent and stationary increments. The assumption of independent increments is reasonable in the context of MEDEVAC request arrivals because a large number of small, widely dispersed units perform combat operations that result in localized casualty events that are unrelated to one another, and therefore the numbers of arrivals that occur in disjoint time intervals are independent. The assumption of stationary increments is reasonable due to the underlying presumption that the implicit sizes, locations, and dispositions of friendly and adversary forces remain fixed with respect to time, and therefore the number of arrivals that occur in any interval of time depends only on the length of the time interval. Military medical planners must ensure the MEDEVAC system is tailored to effectively support friendly forces within an assigned area of operations (AO) (Department of the Army 2016). In large-scale combat operations, military medical planners should examine the expected conditions of the operation and carefully select an appropriate $\lambda $-value based on these conditions to investigate system performance during the peak hours of operation. Each casualty event that leads to a 9-line MEDEVAC request submission is categorized by its precedence level, which is determined by the senior military member and/or medical personnel present at the injury site.

The Army utilizes three casualty event precedence categories (i.e., urgent, priority, and routine) when submitting a 9-line MEDEVAC request (Department of the Army 2016). A routine evacuation precedence level is assigned to casualties that are triaged as minimally injured (i.e., non-life-threatening), and typically results in standard ground or waterborne assets responding within 24 h of the initial event (De Lorenzo 2003). Since the focus of this paper is on the aerial aspect of MEDEVAC operations and routine 9-line requests typically do not utilize dedicated air evacuation assets, this paper only considers 9-line MEDEVAC requests that have a precedence level of either urgent or priority.

The arrival of urgent and priority 9-line MEDEVAC requests from different zones is modeled utilizing a splitting technique. Splitting consists of generating two or more counting processes out of a single Poisson process (Kulkarni 2009). Let the original counting process $\{N(t'): t'\ge 0\}$ denote the $PP(\lambda )$ that counts the number of 9-line MEDEVAC request arrivals to the general support aviation battalion (GSAB) that have occurred during the time interval $(0,t']$. The original counting process can be split into counting processes that are categorized by both the zone $z \in \mathcal {Z} = \{1,2,\ldots ,|\mathcal {Z}|\}$ and the precedence level $k \in \mathcal {K} = \{1,2,\ldots ,|\mathcal {K}|\}$ of the request. The sets $\mathcal {Z}$ and $\mathcal {K}$ represent the set of zones and the set of precedence levels in the system, respectively. Let $\mathcal {R} = \{(z,k): (z,k) \in \mathcal {Z}\times \mathcal {K}\}$ be the set of request categories. There is a total of $|\mathcal {R}| = |\mathcal {Z}||\mathcal {K}|$ request categories. The original process $\{N(t'): t'\ge 0\}$ is split into $|\mathcal {R}|$ independent processes $\{N_{zk}(t'): t' \ge 0\}, \forall \; (z,k) \in \mathcal {R}$. It is clear that

$$\begin{aligned} N(t') = \sum \limits _{(z,k)\in \mathcal {R}} N_{zk}(t') \end{aligned}$$

(1)

since each request belongs to one and only one category. The nature of the split processes $\{N_{zk}(t'): t' \ge 0\}, \forall \; (z,k) \in \mathcal {R}$ depends on how the requests are categorized. The process of categorizing each request is called the splitting mechanism. The Bernoulli splitting mechanism generates the split processes $\{N_{zk}(t'): t' \ge 0\}, \forall \; (z,k) \in \mathcal {R}$, given parameters $p_{zk} > 0, \; \forall \; (z,k) \in \mathcal {R}$ such that $\sum \nolimits _{(z,k)\in \mathcal {R}}p_{zk} = 1$. Each request is independently categorized by its zone z and precedence level k combination with probability $p_{zk}$ independent of any other considerations. The splitting mechanism allows the characterization of each split process $\{N_{zk}(t'): t' \ge 0\}, (z,k)\in \mathcal {R}$ as a Poisson process with parameter $\lambda p_{zk}$, which is denoted by $PP(\lambda p_{zk})$.
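As a minimal sketch of the Bernoulli splitting mechanism, the following Python fragment computes the split rates $\lambda p_{zk}$ and simulates the independent categorization of arrivals from a single $PP(\lambda )$ over a fixed horizon. The function names, the arrival rate, and the category probabilities are illustrative assumptions and are not taken from the planning scenario.

```python
import random

def split_rates(lam, p):
    """Return the rate lam * p[(z, k)] of each split Poisson process."""
    if abs(sum(p.values()) - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to 1")
    return {cat: lam * prob for cat, prob in p.items()}

def simulate_split_counts(lam, p, horizon, rng):
    """Simulate PP(lam) on (0, horizon] and categorize each arrival
    independently with probability p[(z, k)] (Bernoulli splitting)."""
    categories = list(p)
    weights = [p[cat] for cat in categories]
    counts = {cat: 0 for cat in categories}
    t = rng.expovariate(lam)  # time of the first arrival
    while t <= horizon:
        cat = rng.choices(categories, weights=weights)[0]
        counts[cat] += 1
        t += rng.expovariate(lam)  # exponential inter-arrival time
    return counts

# Hypothetical instance: two zones, two precedence levels (1 = urgent, 2 = priority).
p = {(1, 1): 0.1, (1, 2): 0.4, (2, 1): 0.2, (2, 2): 0.3}
rates = split_rates(lam=2.0, p=p)
counts = simulate_split_counts(2.0, p, horizon=1000.0, rng=random.Random(0))
```

Summing `counts` over all categories recovers the total number of simulated arrivals, mirroring Eq. (1), and each `counts[(z, k)] / horizon` should be close to the corresponding entry of `rates` for a long horizon.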

There may be times when a 9-line MEDEVAC request is admitted into the system, but all MEDEVAC units are currently servicing other requests. When this occurs, the submitted 9-line MEDEVAC request is placed in its respective zone-precedence queue to be serviced at a later time. Moreover, there may be system states wherein an idle MEDEVAC is available for assignment, but placing the submitted request in its respective zone-precedence queue rather than assigning the idle MEDEVAC to the request could prove more advantageous in the long run. For example, the decision not to assign an available MEDEVAC unit immediately could prove beneficial if a lower precedence request enters the system while many MEDEVAC units are busy. In such a situation, waiting for another MEDEVAC unit to become available before servicing the lower precedence request allows the idle MEDEVAC unit to remain available for a possibly higher precedence request, yet to arrive.

The service time for a MEDEVAC unit consists of the time between the initial assignment notification and the return to the staging area. This paper assumes that the service times of the MEDEVAC units are exponentially distributed. Kotwal et al. (2016) report real-world summary statistics concerning MEDEVAC service times that support this assumption. Moreover, the exponential distribution is commonly used to represent random, real-world phenomena because it provides a reasonable, simplifying approximation of the actual empirical distribution and enables the construction of a tractable mathematical model due to its ease of use and favorable properties (e.g., the memoryless property). Indeed, this simplifying assumption is often utilized (and investigated) in related literature. For example, Jarvis (1985) performs several computational experiments, and the results suggest that the shape of the service-time distribution has little impact on the overall behavior of the system. Similarly, research by Gross and Harris (1998) also indicates the insensitivity of system performance to the service time distribution. McLay and Mayorga (2013) perform simulation analyses utilizing different types of service time distributions to study the impact of modeling the system with exponential service times versus more realistic service times. Results indicate that the assumption of exponential service times does not significantly impact the optimal policies. This suggests that the optimal policies determined utilizing the MDP model from this paper provide military medical planners relevant insight regarding how to dispatch MEDEVAC units despite the simplifying assumption of exponentially distributed service times.

Having introduced the characteristics of the arrival process and the nature of the service times, formulation of the MDP model can now proceed. The development of the MDP model components leverages Maxwell et al. (2010), Keneally et al. (2016), and Rettke et al. (2016). The decision epochs, state space, action space, transition probabilities, rewards, objective, and optimality equation are described in detail below.

The decision epochs of the MEDEVAC system are the points in time that require a decision. The set of decision epochs is denoted as $\mathcal {T} = \{1,2,\ldots \}$. Two event types in the MEDEVAC system constitute all decision epochs. The first event type is the submission of a 9-line MEDEVAC request. The second event type is the change in the status of a MEDEVAC unit from busy to available upon completing a mission.

The MEDEVAC system MDP model follows the properties of semi-Markov decision processes (SMDPs). SMDPs generalize MDPs by requiring the decision-maker to select a feasible action whenever the system changes, allowing the time spent in a specific state to follow an arbitrary probability distribution, and modeling the system evolution in continuous time (Puterman 1994). The MEDEVAC system MDP model is viewed as a continuous time MDP (CTMDP), which is a special case of an SMDP wherein the inter-transition times are exponentially distributed and decisions are made at every transition. There are several different ways that CTMDPs can be analyzed, but the primary method utilized in this paper is uniformization. Uniformization is applied to the CTMDP model to obtain an equivalent discrete-time discounted model with constant transition rates (Puterman 1994). The transformation allows the results and algorithms for discrete-time MDP models to be applied directly.

The state $S_t \in \mathcal {S}$ describes the status of the entire MEDEVAC system at decision epoch $t \in \mathcal {T}$. The MEDEVAC system state is represented by the tuple $S_t = \left( M_t, Q_t, \hat{R}_t\right) $ wherein $M_t$ represents the MEDEVAC status tuple at epoch t, $Q_t$ represents the queue status tuple at epoch t, and $\hat{R}_t$ represents the request arrival status tuple at epoch t.

The MEDEVAC status tuple $M_t$ describes the status of every MEDEVAC unit in the system at epoch t. The tuple $M_t$ can be written as

$$\begin{aligned} M_t =\left( M_{tm}\right) _{m \in \mathcal {M}}, \end{aligned}$$

(2)

where $\mathcal {M} = \{1,2,\ldots ,|\mathcal {M}|\}$ represents the set of MEDEVAC units in the system. The state variable $M_{tm}\in \{0\} \cup \mathcal {Z}$ contains the information pertaining to MEDEVAC unit $m \in \mathcal {M}$ at epoch t. Each MEDEVAC unit can either be idle or servicing a request in one of the zones in the system. When $M_{tm} = 0$, MEDEVAC unit m is idle. When $M_{tm} = z$, MEDEVAC unit m is servicing a request from zone $z \in \mathcal {Z}$.

The queue status tuple $Q_t$ describes the status of every zone-precedence queue in the system at epoch t. The tuple $Q_t$ can be written as

$$\begin{aligned} Q_t =\left( Q_{tzk}\right) _{z \in \mathcal {Z}, k \in \mathcal {K}}. \end{aligned}$$

(3)

The state variable $Q_{tzk} \in \{0, 1, \ldots , q^{max}\}$ contains the information pertaining to the $(z,k) \in \mathcal {R}$ zone-precedence queue at epoch t. Each zone-precedence queue can hold no more than $q^{max}$ requests at any point in time.

The request arrival status tuple $\hat{R}_t$ indicates whether there is a request arrival awaiting an admission decision at epoch t; it also provides the zone and precedence level of the request arrival, if one is present at epoch t. Let $\hat{R}_t = (0,0)$ when there is not a request arrival at the GSAB at epoch t. Otherwise, let

$$\begin{aligned} \hat{R}_t =\left( \hat{Z}_t,\hat{K}_t\right) _{\hat{Z}_t \in \mathcal {Z},\hat{K}_t \in \mathcal {K}}. \end{aligned}$$

(4)

The random variable $\hat{Z}_t$ represents the zone of the request arrival at epoch t, and the random variable $\hat{K}_t$ represents the precedence level of the request arrival at epoch t. At epoch t, the information in $\hat{Z}_t$ and $\hat{K}_t$ has just been realized and is no longer uncertain. However, $\hat{Z}_t$ and $\hat{K}_t$ are random variables at epochs $1,2,\ldots ,t-1$ because the information they contain is still uncertain at those epochs.

The size of the state space $\mathcal {S}$ depends on $|\mathcal {M}|, |\mathcal {Z}|, |\mathcal {K}|,$ and $q^{max}$. The following expression indicates the cardinality of the state space for the MEDEVAC system:

$$\begin{aligned} \left| \mathcal {S}\right| = \left( 1+|\mathcal {Z}|\right) ^{|\mathcal {M}|}\left( 1 + q^{max}\right) ^{|\mathcal {Z}| |\mathcal {K}|}\left( 1 + |\mathcal {Z}| |\mathcal {K}|\right) . \end{aligned}$$

(5)

The size of the state space grows exponentially with respect to the number of state variables. This is commonly referred to as the curse of dimensionality and renders dynamic programming intractable for analyzing practical scenarios (i.e., large-scale problem instances). The purpose of formulating and analyzing small-scale problem instances is to examine the general efficacy of currently practiced (myopic) policies, identify possible structural properties of high-quality solutions, and inform the subsequent development of approximate solution approaches for application to the analysis of large-scale problems.
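To illustrate how quickly Eq. (5) grows, the following Python sketch evaluates the cardinality of the state space for two instance sizes. The function name and both instance sizes are hypothetical and chosen only for illustration.

```python
def state_space_size(num_units, num_zones, num_precedences, q_max):
    """Cardinality of the state space per Eq. (5)."""
    # Each unit is idle (0) or busy in one of the zones.
    unit_states = (1 + num_zones) ** num_units
    # Each zone-precedence queue holds between 0 and q_max requests.
    queue_states = (1 + q_max) ** (num_zones * num_precedences)
    # Either no arrival, i.e., (0, 0), or one of the |Z||K| request categories.
    arrival_states = 1 + num_zones * num_precedences
    return unit_states * queue_states * arrival_states

# Hypothetical small instance: 2 units, 2 zones, 2 precedence levels, q_max = 1.
small = state_space_size(2, 2, 2, 1)   # 720
# Hypothetical larger instance: 4 units, 4 zones, 2 precedence levels, q_max = 3.
large = state_space_size(4, 4, 2, 3)   # 368640000
```

Doubling the number of units and zones and raising the queue capacity from 1 to 3 increases the number of states by more than five orders of magnitude in this illustration.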

Events are triggered when a 9-line MEDEVAC request is submitted to the system or when a busy MEDEVAC unit completes a service request and becomes available. An admission control decision only occurs when a 9-line MEDEVAC request is submitted to the system. A dispatching decision may be necessary when either of these two event types occurs.

The MEDEVAC system employs an inter-zone policy regarding airspace access that allows any MEDEVAC unit to service any 9-line MEDEVAC request, regardless of the zone from which the request originated. Once a MEDEVAC unit is tasked, it is considered unavailable until the task is completed and the MEDEVAC unit has returned to its own staging area. Although rerouting a MEDEVAC unit mid-flight can be accomplished, potential delays and communication difficulties can create issues in the MEDEVAC system that may ultimately cost casualties their lives. Furthermore, most military operations do not utilize a MEDEVAC unit rerouting strategy during combat operations (Rettke et al. 2016). For these reasons, rerouting MEDEVAC units mid-flight is not incorporated in this MDP model.

When a 9-line MEDEVAC request is submitted, the AEO must take into account the current state of the system and make an admission control and possibly a dispatching decision. There are three possible alternatives: allowing the request to enter its respective zone-precedence queue; assigning an available MEDEVAC unit to service the request immediately; or rejecting the request from entering the system, which forces the request to be serviced by an outside agency (i.e., CASEVAC). If a request arrival is present at epoch t and its queue is not full, i.e., $\hat{R}_t = \left( \hat{Z}_t, \hat{K}_t\right) $ and $Q_{t\hat{Z}_t\hat{K}_t} < q^{max}$, $\hat{Z}_t \in \mathcal {Z}$, $\hat{K}_t \in \mathcal {K}$, then the AEO can either accept or reject the request from entering the system. If the request is accepted, it can either be placed in its respective zone-precedence queue or an available MEDEVAC unit can be tasked to service the request immediately. Moreover, if a request arrival is present at epoch t and its queue is full, i.e., $\hat{R}_t = \left( \hat{Z}_t, \hat{K}_t\right) $ and $Q_{t\hat{Z}_t\hat{K}_t} = q^{max}$, $\hat{Z}_t \in \mathcal {Z}$, $\hat{K}_t \in \mathcal {K}$, then the AEO must reject the request from entering the system. Practically speaking, $q^{max}$ should be set high enough so that requests are not routinely rejected due to a full queue.

Let the decision variable $x_t^{reject} \in \{\varDelta , 0,1\}$ denote the admission control decision at epoch t. If an arrival request is not present at epoch t, i.e., $\hat{R}_t = (0,0)$, the only available decision is $x_t^{reject} = \varDelta $, which indicates the system will continue to transition without any impact from $x_t^{reject}$. When $x_t^{reject} = 0$, the arrival request at epoch t is admitted to the MEDEVAC system, whereas when $x_t^{reject} = 1$, the arrival request at epoch t is rejected from entering the MEDEVAC system.

Dispatching decisions may be required when either a 9-line request is submitted or a busy MEDEVAC unit completes a service request and becomes available. Let $\mathcal {I}(S_t) = \{m: m\in \mathcal {M}, M_{tm} = 0\}$ denote the set of idle MEDEVAC units available for dispatching when the state of the system is $S_t$ at epoch t. Let $\mathcal {W}(S_t) = \{(z,k): (z,k) \in \mathcal {R}, Q_{tzk} > 0\}$ denote the set of zone-precedence queues that have at least one casualty event awaiting service when the state of the system is $S_t$ at epoch t. The dispatching decision is represented by the tuple $x_t^{d} = \left( x_{t}^{ar},x_{t}^{qr}\right) $ wherein $x_t^{ar}$ represents the arrival request dispatch decision tuple and $x_t^{qr}$ represents the queued requests dispatch decision tuple at epoch t.

The arrival request dispatch decision tuple $x_t^{ar}$ describes the AEO’s dispatching decision with regard to arrival requests at epoch t. The tuple $x_t^{ar}$ can be written as

$$\begin{aligned} x_t^{ar} =\left( x_{tm}^{ar}\right) _{m \in \mathcal {I}(S_t)}. \end{aligned}$$

(6)

The decision variable $x_{tm}^{ar}= 1$ if MEDEVAC unit $m \in \mathcal {I}(S_t)$ is dispatched to service the arrival request $\hat{R}_t = \left( \hat{Z}_t,\hat{K}_t\right) $, where $\hat{Z}_t \in \mathcal {Z}$ and $\hat{K}_t \in \mathcal {K}$, at epoch t, and 0 otherwise.

The queued requests dispatch decision tuple, $x_t^{qr}$, describes the AEO’s dispatching decision with regard to queued requests at epoch t. The tuple $x_t^{qr}$ can be written as

$$\begin{aligned} x_t^{qr} =\left( x_{tmzk}^{qr}\right) _{m \in \mathcal {I}(S_t), (z,k) \in \mathcal {W}(S_t)}. \end{aligned}$$

(7)

The decision variable $x_{tmzk}^{qr} = 1$ if MEDEVAC unit $m \in \mathcal {I}(S_t)$ is dispatched to service a queued request from the (z, k) zone-precedence queue, where $(z,k) \in \mathcal {W}(S_t)$, at epoch t, and 0 otherwise.

Let $x_t = \left( x_t^{reject},x_t^{d}\right) $ denote a compact representation of the decision variables at epoch t. Several constraints bound the decisions being made at epoch t. The first constraint,

$$\begin{aligned} I_{\{\hat{R}_t \ne (0,0)\}}\sum \limits _{m \in \mathcal {I}(S_t)}x_{tm}^{ar} + \sum \limits _{m \in \mathcal {I}(S_t)}\sum \limits _{(z,k) \in \mathcal {W}(S_t)} x_{tmzk}^{qr} \le 1, \end{aligned}$$

(8)

requires that there is at most one MEDEVAC unit dispatched at epoch t. The next constraint,

$$\begin{aligned} x_t^{reject} \le 1 - \sum \limits _{m\in \mathcal {I}(S_t)} x_{tm}^{ar}, \end{aligned}$$

(9)

indicates that, if an arrival request is present at epoch t and a MEDEVAC unit is tasked to service the arrival request at epoch t, as indicated by $x_{tm}^{ar}= 1$ for some $m \in \mathcal {I}(S_t)$, then the arrival request must enter the system, as indicated by $x_t^{reject} = 0$. Otherwise, $x_{tm}^{ar}= 0$ for all $m \in \mathcal {I}(S_t)$, and the arrival request is either queued (i.e., $x_t^{reject} = 0$) or rejected (i.e., $x_t^{reject} = 1$) from the system at epoch t. The set of available actions when a decision is required is denoted as follows

(10)

where Constraints (8) and (9) must be satisfied. The first two cases in Eq. (10) represent all feasible actions when the decision epoch occurs due to a MEDEVAC unit completing a service request and becoming available, whereas the last five cases represent all feasible actions when the decision epoch occurs due to a 9-line MEDEVAC request submission.
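Constraints (8) and (9) can be checked mechanically for a candidate decision. The Python sketch below is one possible encoding under stated assumptions: $\varDelta $ is represented by `None`, the tuples $x_t^{ar}$ and $x_t^{qr}$ are represented by dictionaries keyed by idle unit and by (unit, zone, precedence), and the function name is hypothetical. Constraint (9) is only evaluated when an arrival is present, since $x_t^{reject} = \varDelta $ otherwise.

```python
def satisfies_constraints(arrival_present, x_reject, x_ar, x_qr):
    """Check a candidate decision against Constraints (8) and (9).

    x_reject: None (standing in for Delta), 0 (admit), or 1 (reject).
    x_ar:     dict mapping idle unit m -> 0/1.
    x_qr:     dict mapping (m, z, k) -> 0/1 for idle units and nonempty queues.
    """
    ar_total = sum(x_ar.values())
    qr_total = sum(x_qr.values())

    # With no arrival present, the only admission decision is Delta.
    if not arrival_present and x_reject is not None:
        return False

    # Constraint (8): at most one MEDEVAC unit is dispatched at this epoch.
    dispatched = (ar_total if arrival_present else 0) + qr_total
    if dispatched > 1:
        return False

    # Constraint (9): a request that is dispatched to cannot also be rejected.
    if x_reject is not None and x_reject > 1 - ar_total:
        return False

    return True
```

For example, admitting an arrival and dispatching idle unit 1 to it, `satisfies_constraints(True, 0, {1: 1, 2: 0}, {})`, is feasible, whereas rejecting the same arrival while dispatching unit 1 to it violates Constraint (9).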

State transitions are Markovian with two possible events dictating the transition. The first event type is the submission of a 9-line MEDEVAC request. Recall that 9-line MEDEVAC requests arrive according to a $PP(\lambda )$. The second event type is the change in the status of a MEDEVAC unit from busy to available upon completing a mission. Let $\mu _{mz}$ denote the service rate of MEDEVAC unit $m \in \mathcal {M}$ when servicing a 9-line MEDEVAC request in zone $z \in \mathcal {Z}$. Let $\mathcal {B}(S_t) = \{m: m\in \mathcal {M}, M_{tm} \ne 0\}$ denote the set of busy MEDEVAC units when the state of the system is $S_t$ at epoch t. If the MEDEVAC system is in pre-decision state $S_t$ and action $x_t$ is taken, the system will immediately transition to a post-decision state $S_t^x$. The sojourn time in $S_t^x$ (i.e., the time the system remains in post-decision state $S_t^x$ before transitioning to the next pre-decision state $S_{t+1}$) follows an exponential distribution with parameter $\beta (S_t,x_t)$. Simple calculations reveal that

(11)

If $\mathcal {B}(S_t) = \varnothing $, $x_{tm}^{ar} = 0 \;\forall \; m \in \mathcal {I}(S_t)$, and $x_{tmzk}^{qr} = 0 \;\forall \; m \in \mathcal {I}(S_t), (z,k) \in \mathcal {W}(S_t)$, then $\beta (S_t,x_t)$ is the rate parameter of the sojourn time for the state-action pairs for which the next decision epoch occurs upon the arrival of a 9-line MEDEVAC request. Otherwise, $\beta (S_t,x_t)$ is the rate parameter of the sojourn time for the state-action pairs for which the next decision epoch occurs after either a 9-line MEDEVAC request arrives to the GSAB or one of the busy MEDEVAC units completes a service request and becomes available. Let $T_a$ denote the time until the next 9-line MEDEVAC request arrival. Let $T_s$ denote the time until the next service completion. The time until the next decision epoch $T_e$ satisfies $T_e = \min \{T_a,T_s\}$. Since both $T_a$ and $T_s$ follow an exponential distribution, standard calculations show that $T_e$ follows an exponential distribution with parameter $\beta (S_t,x_t)$.
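The standard calculation can be sketched as follows. Write $\mu _s$ for the total completion rate of the MEDEVAC units that are busy after the decision; this shorthand is introduced here only for illustration and is not part of the model notation. Assuming $T_a$ and $T_s$ are independent and exponentially distributed with rates $\lambda $ and $\mu _s$, respectively,

$$\begin{aligned} P(T_e> u) = P(T_a> u)\,P(T_s > u) = e^{-\lambda u}e^{-\mu _s u} = e^{-(\lambda + \mu _s)u}, \quad u \ge 0, \end{aligned}$$

so $T_e$ is exponentially distributed with rate $\lambda + \mu _s$. Under this shorthand, the description above corresponds to a rate of $\lambda + \mu _s$ when at least one unit is busy after the decision and a rate of $\lambda $ when none is.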

The probabilistic behavior of the process is summarized in terms of its infinitesimal generator. The infinitesimal generator is an $|\mathcal {S}| \times |\mathcal {S}|$ matrix G with components:

$$\begin{aligned} G(S_{t+1}|S_t,x_t) = {\left\{ \begin{array}{ll} -[1-p(S_t|S_t,x_t)]\beta (S_t,x_t), &{} \text {if } S_{t+1} = S_t\\ p(S_{t+1}|S_t,x_t)\beta (S_t,x_t), &{} \text {if }S_{t+1} \ne S_t \end{array}\right. } \end{aligned}$$

(12)

wherein

(13)

denotes the probability that the system transitions to state $S_{t+1}$ given that it is currently in state $S_t$ and decision $x_t$ is made. The post-decision state variable $M_{tm}^x \in \{0\} \cup \mathcal {Z}$ contains the information pertaining to MEDEVAC unit $m \in \mathcal {M}$ when decision $x_t$ is made at epoch t. Note that $p(S_t|S_t,x_t) = 0$, which means that the system will transition to a different state at the end of a sojourn in state $S_t^x$.

Puterman (1994) argues that converting CTMDPs to equivalent discrete-time MDPs via the uniformization approach makes subsequent analysis easier to perform. To uniformize the system, the maximum rate of transition must be determined and is calculated by

$$\begin{aligned} \nu = \lambda + \sum \limits _{m\in \mathcal {M}} \tau _m, \end{aligned}$$

(14)

wherein

$$\begin{aligned} \tau _m = \max _{z \in \mathcal {Z}} \mu _{mz},\;\forall \; m\in \mathcal {M}. \end{aligned}$$

(15)

The restriction that there are no self-transitions from a state to itself is removed when uniformization is applied to the process. Applying uniformization yields the following transition probabilities:

$$\begin{aligned} \tilde{p}(S_{t+1}|S_t,x_t) = {\left\{ \begin{array}{ll} 1-\frac{[1-p(S_t|S_t,x_t)]\beta (S_t,x_t)}{\nu }, &{} \text {if }S_{t+1} = S_t\\ \frac{p(S_{t+1}|S_t,x_t)\beta (S_t,x_t)}{\nu }, &{} \text {if }S_{t+1} \ne S_t. \end{array}\right. } \end{aligned}$$

(16)

This transformation may be viewed as inducing extra (i.e., “notional”) transition opportunities from a state to itself. This modified process has the same probabilistic structure as the CTMDP.
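The uniformization step in Eqs. (14)-(16) can be sketched in Python for a fixed action in each state. The dictionary-based data layout, the function names, and the two-state numerical example are assumptions made for illustration; they do not correspond to a specific instance in this paper.

```python
def uniformization_rate(lam, mu):
    """Eqs. (14)-(15): nu = lam + sum over units of the largest service rate.

    mu[m][z] is the service rate of unit m when servicing zone z.
    """
    return lam + sum(max(rates.values()) for rates in mu.values())

def uniformize(p, beta, nu):
    """Eq. (16): convert embedded transition probabilities p[s][s2] and
    sojourn rates beta[s] (for a fixed action per state) into uniformized
    transition probabilities."""
    p_tilde = {}
    for s, row in p.items():
        if beta[s] > nu:
            raise ValueError("nu must be at least as large as every beta")
        stay = row.get(s, 0.0)
        new_row = {s2: prob * beta[s] / nu for s2, prob in row.items() if s2 != s}
        # Fictitious self-transition absorbs the remaining probability mass.
        new_row[s] = 1.0 - (1.0 - stay) * beta[s] / nu
        p_tilde[s] = new_row
    return p_tilde

# Hypothetical example: one unit, one zone, lam = 1.0, mu = 0.5, so nu = 1.5.
nu = uniformization_rate(1.0, {1: {1: 0.5}})
beta = {0: 1.0, 1: 1.5}             # state 0: no busy unit; state 1: unit busy
p = {0: {1: 1.0}, 1: {0: 1.0}}      # embedded chain has no self-transitions
p_tilde = uniformize(p, beta, nu)
```

In this example, state 0 has a sojourn rate below $\nu $ and therefore receives a self-transition probability of one third, while state 1 already transitions at rate $\nu $ and receives none.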

The decision epochs in CTMDPs follow each state transition, and the times between decision epochs are exponentially distributed. Several factors impact the amount of reward gained from making a decision to service a 9-line MEDEVAC request. These factors include the zone and precedence level of the 9-line MEDEVAC request as well as the staging area of the servicing MEDEVAC unit. Let $c(S_t,x_t) = \psi _{mzk}$ denote the immediate expected reward (i.e., contribution) if MEDEVAC unit $m \in \mathcal {M}$ is dispatched to service a zone $z \in \mathcal {Z}$, precedence level $k \in \mathcal {K}$ 9-line MEDEVAC request (i.e., $x_{tm}^{ar} = 1$ or $x_{tmzk}^{qr} = 1$). The immediate expected reward is computed as follows:

$$\begin{aligned} \psi _{mzk} = {\left\{ \begin{array}{ll} \delta e^{\frac{-\zeta _{mz}}{60}}, &{} \text {if } k=1 \text { (i.e., urgent)}\\ e^{\frac{-\zeta _{mz}}{240}}, &{} \text {if } k=2 \text { (i.e., priority)}\\ 0, &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

(17)

wherein $\zeta _{mz}$ is the expected response time when MEDEVAC $m \in \mathcal {M}$ is dispatched to service a request in zone $z\in \mathcal {Z}$, and $\delta \ge 1$ is a tradeoff parameter utilized to vary the urgent-to-priority immediate expected reward ratio. If a MEDEVAC unit is not dispatched to service a 9-line MEDEVAC request at epoch t, then $c(S_t,x_t) = 0$.
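A direct transcription of Eq. (17) into Python is shown below as a minimal sketch. It assumes that $\zeta _{mz}$ is expressed in minutes, which is suggested by the 60 and 240 in the exponents but is an assumption here; the function name and the numerical values are hypothetical.

```python
import math

def immediate_reward(zeta_mz, k, delta=1.0):
    """Eq. (17): immediate expected reward for dispatching a unit.

    zeta_mz: expected response time (assumed to be in minutes).
    k:       precedence level, 1 = urgent, 2 = priority.
    delta:   urgent-to-priority tradeoff parameter.
    """
    if k == 1:
        return delta * math.exp(-zeta_mz / 60.0)
    if k == 2:
        return math.exp(-zeta_mz / 240.0)
    return 0.0

# Hypothetical comparison for a 60-minute expected response time, delta = 10.
urgent_reward = immediate_reward(60.0, k=1, delta=10.0)
priority_reward = immediate_reward(60.0, k=2, delta=10.0)
```

The urgent reward decays four times faster in the response time than the priority reward, so the ratio between the two depends on both $\delta $ and $\zeta _{mz}$.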

Let $h(S_t, x_t)$ denote the holding cost accumulated when the MEDEVAC system is in state $S_t$ and decision $x_t$ is selected. The MEDEVAC system incurs a holding cost for queued 9-line MEDEVAC requests based on the time requirements outlined in the Army’s Medical Evacuation Field Manual (Department of the Army 2016). The MEDEVAC system seeks to service urgent and priority 9-line MEDEVAC requests within 60 and 240 min from notification, respectively. Let $\phi _k$ denote the holding cost rate for holding a single precedence-k request in its queue between decision epochs. The holding cost rate $\phi _k$ is defined as

$$\begin{aligned} \phi _k=\xi \frac{\sum \nolimits _{m \in \mathcal {M}}\sum \nolimits _{z \in \mathcal {Z}}\psi _{mzk}}{|\mathcal {M}||\mathcal {Z}|}, \forall k\in \mathcal {K}, \end{aligned}$$

(18)

where $\xi \in [0,1]$ is a parameter that scales the holding cost rate for a precedence-k request based on the average immediate expected reward over all possible MEDEVAC-zone combinations. Summing the holding costs over all zone-precedence queues yields the following expression

$$\begin{aligned} h(S_t,x_t) = \sum \limits _{z \in \mathcal {Z}}\sum \limits _{k \in \mathcal {K}} \phi _k Q_{tzk}. \end{aligned}$$

(19)
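Equations (18) and (19) can be sketched in a few lines of Python. The dictionary layouts (`zeta` keyed by MEDEVAC-zone pairs, `queue` keyed by zone-precedence pairs) and all function names are assumptions made for illustration, and the sketch assumes `zeta` holds an entry for every MEDEVAC-zone combination so that its length equals $|\mathcal {M}||\mathcal {Z}|$.

```python
import math


def psi(zeta_mz, k, delta):
    """Immediate expected reward of Eq. (17)."""
    if k == 1:
        return delta * math.exp(-zeta_mz / 60.0)
    if k == 2:
        return math.exp(-zeta_mz / 240.0)
    return 0.0


def holding_rates(zeta, delta, xi, precedences=(1, 2)):
    """Eq. (18): phi_k = xi * (average of psi_mzk over all (m, z) pairs).

    zeta maps (m, z) -> expected response time in minutes and is assumed
    to contain every MEDEVAC-zone combination.
    """
    n_pairs = len(zeta)
    return {
        k: xi * sum(psi(t, k, delta) for t in zeta.values()) / n_pairs
        for k in precedences
    }


def holding_cost(queue, phi):
    """Eq. (19): sum of phi_k * Q_tzk over all zone-precedence queues.

    queue maps (z, k) -> number of queued requests.
    """
    return sum(phi[k] * count for (_z, k), count in queue.items())
```

With an empty `queue` the sum is zero, which matches the observation that no holding cost is incurred when no requests are queued.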

Simple calculations show that, if $\mathcal {W}(S_t) = \varnothing $, then $h(S_t, x_t) = 0$. That is, if no requests are queued, then no holding cost is incurred. Since the system does not change in the time between decision epochs, the expected discounted reward is

$$\begin{aligned} r(S_t,x_t) = c(S_t,x_t) - \frac{h(S_t,x_t)}{\alpha + \beta (S_t,x_t)}, \end{aligned}$$

(20)

where $\alpha >0$ denotes the continuous time discounting rate. Applying uniformization gives

$$\begin{aligned} \tilde{r}(S_t,x_t) \equiv r(S_t,x_t)\frac{\alpha + \beta (S_t,x_t)}{\alpha + \nu }. \end{aligned}$$

(21)

Note that the uniformized rewards agree with the rewards in the CTMDP.
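As a short check that is not stated explicitly in the text, substituting Eq. (20) into Eq. (21) separates the uniformized reward into a scaled contribution term and a holding cost term:

$$\begin{aligned} \tilde{r}(S_t,x_t) = c(S_t,x_t)\frac{\alpha + \beta (S_t,x_t)}{\alpha + \nu } - \frac{h(S_t,x_t)}{\alpha + \nu }. \end{aligned}$$

In this form the holding cost term no longer depends on the state-dependent rate $\beta (S_t,x_t)$, and in the special case $\beta (S_t,x_t) = \nu $ the uniformized reward reduces to $r(S_t,x_t)$.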

Let $X^{\pi }(S_t)$ be a policy (i.e., decision function) that prescribes AEO dispatch decisions for each state $S_t \in \mathcal {S}$. That is, $x = X^{\pi }(S_t)$ is the dispatching decision returned when utilizing policy $\pi $. The optimal policy $\pi ^*$ is sought from the class of policies $(X^{\pi }(S_t))_{\pi \in \varPi }$ to maximize the expected total discounted reward earned by the MEDEVAC system. The objective is expressed as

$$\begin{aligned} \max \limits _{\pi \in \varPi }\mathbb {E}^{\pi }\Big \{\sum \limits _{t=1}^{\infty }\gamma ^{t-1} \tilde{r}(S_t,X^{\pi }(S_t))\Big \}, \end{aligned}$$

(22)

where $\gamma = \frac{\nu }{\nu + \alpha }$ is the uniformized discount factor. The optimal policy is found by solving the Bellman equation

$$\begin{aligned} J(S_t) = \max \limits _{x_t\in \mathcal {X}(S_t)}\Big \{\tilde{r}(S_t,x_t) + \gamma \sum \limits _{S_{t+1} \in \mathcal {S}} \tilde{p}(S_{t+1}|S_t,x_t)J(S_{t+1})\Big \}. \end{aligned}$$

(23)

The policy iteration algorithm is implemented in MATLAB to solve Eq. (23) exactly. Policy iteration starts with an initial policy and then iteratively performs two steps: policy evaluation, which computes the expected total discounted reward of each state given the current policy, and policy improvement, which updates the current policy if any improvements are available (Puterman 1994). The policy iteration algorithm terminates after the policy converges.
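The two steps of policy iteration can be illustrated with a minimal, standard-library Python sketch. It is not the MATLAB implementation used in this paper: the data layout, the function name, and the toy two-state problem are all hypothetical, and policy evaluation is done here by successive approximation rather than by solving the linear system exactly, which is a simplification chosen to keep the example short.

```python
def policy_iteration(states, actions, reward, trans, gamma, tol=1e-10):
    """Policy iteration for a finite discounted MDP (illustrative sketch).

    states  : list of states
    actions : dict state -> list of feasible decisions
    reward  : dict (state, decision) -> one-step reward
    trans   : dict (state, decision) -> dict next_state -> probability
    gamma   : discount factor in (0, 1)
    """
    def q_value(s, x, J):
        return reward[s, x] + gamma * sum(
            p * J[s2] for s2, p in trans[s, x].items())

    policy = {s: actions[s][0] for s in states}  # arbitrary initial policy
    J = {s: 0.0 for s in states}
    while True:
        # Policy evaluation (assumption: iterative, not an exact solve).
        while True:
            change = 0.0
            for s in states:
                v = q_value(s, policy[s], J)
                change = max(change, abs(v - J[s]))
                J[s] = v
            if change < tol:
                break
        # Policy improvement: switch only on a strict improvement.
        stable = True
        for s in states:
            best = max(actions[s], key=lambda x: q_value(s, x, J))
            if q_value(s, best, J) > q_value(s, policy[s], J) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, J


# Hypothetical two-state toy problem, unrelated to the MEDEVAC data.
toy_states = [0, 1]
toy_actions = {0: ["b", "a"], 1: ["a"]}
toy_reward = {(0, "a"): 1.0, (0, "b"): 0.0, (1, "a"): 0.0}
toy_trans = {(0, "a"): {0: 1.0}, (0, "b"): {1: 1.0}, (1, "a"): {1: 1.0}}
toy_policy, toy_J = policy_iteration(
    toy_states, toy_actions, toy_reward, toy_trans, gamma=0.9)
```

In the toy problem the initial policy selects decision `"b"` in state 0, and a single improvement step switches it to `"a"`, after which the policy is stable and the algorithm terminates.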

For comparison purposes, a linear programming (LP) model of the Markov decision problem is also constructed. Constructing an LP model of a Markov decision problem is beneficial because it eases the inclusion of constraints and provides a better mechanism with which to conduct sensitivity analyses. However, Puterman (1994) notes that LP has not proven to be an efficient method for solving large discounted Markov decision problems. Yet, recent advancements in LP algorithms have increased the computational efficiency of LP approaches (e.g., as indicated by the performance testing of CPLEX and Gurobi in Bixby (2012)) and make LP a more viable solution method for solving MDPs.
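For reference, the textbook primal LP for a discounted Markov decision problem (see Puterman 1994) can be written in the notation of Eq. (23) as shown below. This is a sketch of the standard formulation and is not necessarily the exact model constructed by the authors; any strictly positive weights on the objective terms yield the same optimal solution.

$$\begin{aligned} \min \limits _{J}\ &\sum \limits _{S_t \in \mathcal {S}} J(S_t) \\ \text {s.t.}\ &J(S_t) \ge \tilde{r}(S_t,x_t) + \gamma \sum \limits _{S_{t+1} \in \mathcal {S}} \tilde{p}(S_{t+1}|S_t,x_t)J(S_{t+1}), \quad \forall S_t \in \mathcal {S},\ x_t \in \mathcal {X}(S_t). \end{aligned}$$

The formulation has one variable per state and one constraint per state-decision pair, and the decisions whose constraints are binding at the optimum define an optimal policy.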

5 Testing, analysis, and results

This section presents a representative military medical evacuation (MEDEVAC) planning scenario utilized both to demonstrate the applicability of the Markov decision process (MDP) model and to examine the behavior of the optimal dispatching policy. A series of sensitivity analyses and computational excursions identify the model parameters that significantly impact the optimal dispatching policy. Military medical planners should focus on these parameters when developing MEDEVAC dispatching policies. Moreover, this section compares the computational efficiency of policy iteration via MATLAB versus linear programming via CPLEX 12.6. The paper utilizes a dual Intel Xeon E5-2650v2 workstation having 128 GB of RAM and MATLAB’s Parallel Computing Toolbox to conduct the computational experiments and analyses presented herein.

5.1 Representative scenario

As of 2017, the United States (U.S.) continues to conduct military operations in Afghanistan. U.S. military operations in Afghanistan began with the initiation of Operation Enduring Freedom (OEF) on October 7, 2001, in response to the terrorist attacks on New York’s World Trade Center and the Pentagon on September 11, 2001. OEF lasted a little over 13 years and officially ended when U.S. combat operations in Afghanistan were terminated on December 31, 2014. However, as part of Operation Freedom’s Sentinel, U.S. military forces remain in Afghanistan to participate in a coalition mission to train and assist the Afghan military and to conduct counter-terrorism operations against Al Qaeda (Department of Defense 2016). While official U.S. combat operations are currently not being conducted in Afghanistan, military medical planners still prepare and plan for potential combat scenarios in the event that a sudden change requires U.S. combat operations.

The computational examples in Bandara et al. (2012), Keneally et al. (2016), and Rettke et al. (2016) inform the development of the representative scenario examined herein. This paper considers a notional planning scenario in which a coalition of allied countries executes combat operations in response to an increase in insurgency operations by remnants of Al-Qaeda militants in southern Afghanistan. For simplicity, this notional scenario (hereafter referred to as the $2\times 2$ case) assumes a MEDEVAC system with two demand zones (i.e., the zones at which 9-line MEDEVAC requests originate) and two MEDEVAC unit staging areas (i.e., the locations in which the MEDEVAC units are stationed) with one medical treatment facility (MTF) co-located at each staging area. Both MTFs are equally capable of treating any casualty and each MTF has an unlimited capacity to treat incoming casualties (i.e., no queueing at the MTF), so only the proximity of an MTF to a casualty collection point (CCP) is utilized to determine where a MEDEVAC will transport casualties.

The $2 \times 2$ case assumes that southern Afghanistan is the area of operations (AO) and is divided into two separate demand zones: Helmand province (Zone 1) and Kandahar province (Zone 2). Two MEDEVAC units are considered with one being staged in Zone 1 (i.e., MEDEVAC 1) and the other being staged in Zone 2 (i.e., MEDEVAC 2). The placement of the staging areas and co-located MTFs represents a general realism based on the historical trends in enemy activity in southern Afghanistan. Helmand and Kandahar are the two provinces that have produced the most war-related fatalities in Afghanistan since the start of OEF with 956 and 558 coalition service members killed in action, respectively (White 2016). While these numbers do not account for every type of casualty (e.g., military wounded in action and civilian casualties), they do provide a representative sample that is utilized as an approximation of the threat level present in each zone. Moreover, these numbers are utilized to determine the proportion of 9-line MEDEVAC requests from each zone; the proportion of requests coming from Zone 1 is $p_{z_1} = 0.6314$ and the proportion of requests coming from Zone 2 is $p_{z_2} = 1- p_{z_1} = 0.3686$.

Each 9-line MEDEVAC request is independently categorized by its zone z (e.g., Helmand and Kandahar) and precedence level k (e.g., urgent, priority, and routine) combination. Fulton et al. (2010) report that the probability of a casualty event being classified with a precedence level of urgent, priority, or routine is 11, 12, and 77%, respectively, based on historical MEDEVAC data from U.S. operations in Iraq. Recall that routine requests are assumed to be serviced by non-MEDEVAC units (i.e., casualty evacuation (CASEVAC)). The $2 \times 2$ case assumes that the proportion of requests classified with an urgent precedence level is approximately $p_{k_1} =0.5$ and the proportion of requests classified with a priority precedence level is $p_{k_2} = 1 - p_{k_1} = 0.5$. The proportion of each request categorization $p_{zk}$ is found by multiplying the zone proportion by the precedence level proportion (e.g., $p_{11} = p_{z_1}p_{k_1}$).
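The zone and precedence proportions stated for the $2 \times 2$ case can be reproduced with a short Python sketch. The fatality counts and the 0.5 precedence split come from the text; the variable names are hypothetical.

```python
# Coalition service members killed in action (White 2016), by zone.
kia = {1: 956, 2: 558}  # Zone 1 = Helmand, Zone 2 = Kandahar

total = sum(kia.values())
p_zone = {z: n / total for z, n in kia.items()}

# Precedence proportions assumed in the 2 x 2 case (1 = urgent, 2 = priority).
p_prec = {1: 0.5, 2: 0.5}

# Proportion of each zone-precedence categorization, p_zk = p_z * p_k.
p_zk = {(z, k): p_zone[z] * p_prec[k] for z in p_zone for k in p_prec}
```

Rounded to four decimal places, `p_zone` gives 0.6314 and 0.3686, matching the values reported for $p_{z_1}$ and $p_{z_2}$, and the four entries of `p_zk` sum to one.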

Military medical planners estimate the arrival rate of 9-line MEDEVAC requests by estimating when and where future tactical level engagements will occur, along with the likelihood and severity of corresponding casualty events. The reward obtained for servicing a 9-line MEDEVAC request depends on the location of the request, the servicing MEDEVAC unit, and the closest MTF. The response and service times described in Sect. 3 are generated by leveraging the procedure set forth by Keneally et al. (2016).

The procedure utilized to model future 9-line MEDEVAC requests avoids using current data from southern Afghanistan to maintain operational security. Indeed, actual data for current MEDEVAC unit, casualty event, and MTF locations are restricted. Instead, the spatial distribution of future 9-line MEDEVAC requests is modeled with a Monte Carlo simulation via a Poisson cluster process. Casualty cluster centers are selected by leveraging data from the International Council on Security and Development (ICOS) (2008) pertaining to insurgent attacks in southern Afghanistan resulting in death in 2007. It is assumed that all casualty events generated from the casualty cluster centers result in 9-line MEDEVAC requests. Moreover, the distance of each 9-line MEDEVAC request location from its casualty cluster center follows a uniform distribution. Military medical planners must keep in mind that the data will change with each conflict. Furthermore, the dispatching policy generated depends on the input data and therefore must relate to the scenario being modeled to obtain meaningful results.
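The idea of a Poisson cluster process can be sketched in standard-library Python as follows. This is only an illustration under stated assumptions: the cluster centers, the mean number of requests per center, the maximum radius, and the uniformly distributed bearing are all hypothetical choices, and none of them are taken from the ICOS data or from the authors' simulation.

```python
import math
import random


def sample_poisson(rng, mean):
    """Poisson sample via Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_requests(centers, mean_per_center, max_radius, seed=0):
    """Generate request locations around casualty cluster centers.

    The number of requests per center is Poisson distributed; the distance
    from the center is uniform on [0, max_radius]; the bearing is assumed
    uniform on [0, 2*pi).
    """
    rng = random.Random(seed)
    points = []
    for cx, cy in centers:
        for _ in range(sample_poisson(rng, mean_per_center)):
            r = rng.uniform(0.0, max_radius)
            theta = rng.uniform(0.0, 2.0 * math.pi)
            points.append((cx + r * math.cos(theta),
                           cy + r * math.sin(theta)))
    return points


# Hypothetical planar coordinates (km) for two cluster centers.
demo_centers = [(0.0, 0.0), (100.0, 0.0)]
demo_points = simulate_requests(demo_centers, mean_per_center=5.0,
                                max_radius=10.0, seed=1)
```

Because the distance, rather than the location, is drawn uniformly, the generated requests are denser near each cluster center than a uniform-over-area draw would be.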

Figure 3 depicts the two zones (i.e., Helmand and Kandahar) in southern Afghanistan utilized to generate the data, as well as the MEDEVAC and MTF locations. Recall that the MEDEVAC and MTF locations are co-located for the $2 \times 2$ case. The co-located MEDEVAC and MTF locations in each zone are represented by blue stars. The casualty cluster centers in each zone are represented by red diamonds.

The data generated for the MEDEVAC mission task times that comprise the response time vary with each mission and, therefore, are represented as random variables. The response time variables representing mission preparation time, travel time to CCP, service time at CCP, travel time to MTF, and service time at the MTF are defined in Sect. 3 and described in detail in the following four paragraphs.

The mission preparation time is exponentially distributed with a mean of 10 min. The 2008 MEDEVAC after action report (AAR) estimates mission prep time to be 20 min (Bastian 2010). This AAR, along with personal experiences, influences Bastian (2010) to model mission preparation time with a mean of 20 min and standard deviation of 5 min. However, a more recent interview with a MEDEVAC pilot in O’Shea (2011) reports that with proper pre-planning procedures the mission preparation time is often less than 10 min.

The armed escort delay is exponentially distributed with a mean of 10 min. Garrett (2013) reports that there is a 31% chance that a MEDEVAC mission requires an armed escort. Moreover, among the missions requiring an armed escort, approximately 4% are delayed due to issues caused primarily by the escort aircraft. These percentages are included in the computation of the expected response times and the corresponding expected rewards. The delay induced by armed escorts is an important feature of the MEDEVAC problem. This paper applies the same armed escort assumptions found in Keneally et al. (2016); the interested reader is referred to that work for a more in-depth description of how armed escorts impact this MDP model.

The flight speed, which accounts for the travel time to the CCP and the travel time to the MTF, is uniformly distributed between 120 and 193 knots with a mean of 156.5 knots. This flight speed is based on currently fielded MEDEVAC helicopters (i.e., HH-60Ms) and on subject matter expertise (Bastian 2010).

The service time at the CCP and the service time at the MTF are exponentially distributed with a mean of 10 and 5 min, respectively. These times are determined by leveraging the data provided by in-theater MEDEVAC pilots and other subject matter experts described in Bastian (2010) and Keneally et al. (2016).

The just-described response time random variables, casualty cluster centers, and MEDEVAC staging areas are utilized in a Monte Carlo simulation to obtain a synthetic, but realistic, spatial distribution of future 9-line MEDEVAC requests and response time data. The means of the response times are computed and presented in Table 1.
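A minimal Python sketch of sampling one mission from the distributions described above is given next. The distributional parameters come from the text, but several choices are assumptions made only for illustration: the distances are hypothetical inputs, a single flight speed is drawn per mission and applied to both legs, the escort delay is applied independently with probability 0.31 × 0.04, and the sampled time sums the components through arrival at the MTF. The precise definition of response time is given in Sect. 3 and may differ from this sum.

```python
import random

KNOT_TO_KM_PER_MIN = 1.852 / 60.0  # 1 knot = 1.852 km/h


def sample_mission_time(rng, dist_to_ccp_km, dist_ccp_to_mtf_km):
    """Sample one mission time in minutes (illustrative assumptions only)."""
    prep = rng.expovariate(1.0 / 10.0)  # mission preparation, mean 10 min
    escort = 0.0
    # 31% of missions need an armed escort; about 4% of those are delayed.
    if rng.random() < 0.31 and rng.random() < 0.04:
        escort = rng.expovariate(1.0 / 10.0)  # escort delay, mean 10 min
    speed = rng.uniform(120.0, 193.0) * KNOT_TO_KM_PER_MIN  # km per minute
    to_ccp = dist_to_ccp_km / speed
    at_ccp = rng.expovariate(1.0 / 10.0)  # service time at CCP, mean 10 min
    to_mtf = dist_ccp_to_mtf_km / speed
    return prep + escort + to_ccp + at_ccp + to_mtf


def mean_mission_time(dist_to_ccp_km, dist_ccp_to_mtf_km, n=20000, seed=42):
    """Monte Carlo estimate of the expected mission time."""
    rng = random.Random(seed)
    total = sum(sample_mission_time(rng, dist_to_ccp_km, dist_ccp_to_mtf_km)
                for _ in range(n))
    return total / n
```

Under these assumptions, the non-flight components contribute an expected 10 + 0.31 × 0.04 × 10 + 10 = 20.124 min, so the differences between MEDEVAC-zone pairs in such a simulation arise entirely from the flight legs.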

Table 1 Expected response times (min)
