1 Introduction

From the first department store in 1852, the retail revolution has never stopped. Retail business has evolved rapidly over the past decades with the boom of the internet, mobile technologies and, most importantly, e-commerce. More recently, new technologies have made even more forms of retail business possible. Although the fundamental problems of supply chain management for retail business remain the same as those seen decades ago, as Forrester (1958) illustrated, the tools and methodologies have changed dramatically over the years. In this paper, we highlight several key components of supply chain management that are particularly relevant to retail business, especially technology-driven retail business, and provide an overview of the evolution of supply chain management methodologies and practices.

There are several key factors that differentiate a retail supply chain from a manufacturer supply chain. Firstly, retailers generally deal with significantly more partners. A large-scale retailer could have tens of thousands of vendors supplying its inventory, compared with a much more manageable number for manufacturers. As a result, a retail supply chain has to scale more efficiently and focus more on supply chain coordination. Secondly, retailers serve end consumers numbering in the hundreds of millions, compared with a limited number of wholesalers for manufacturers. Retailers need to invest more in understanding customer demand in order to plan and adjust accordingly. Thirdly, the cost structure of a retailer differs from that of a manufacturer. The majority of a retailer’s cost is inventory cost, while a manufacturer spends significantly more on equipment and product lines. As a result, inventory management is more important for retailers, whereas manufacturers tend to focus more on resource scheduling and planning. Lastly, with ever-growing customer expectations of service quality and delivery speed, retailers have to take fulfillment into consideration when building their supply chain networks. Although some methodologies can also be applied to other supply chain systems, in this review we primarily focus on the components that are more relevant to retail supply chain systems.

In retail, supply chain systems normally consist of many different components and have different setups. In this paper, we only consider the components on which the retailer, as a decision maker, has direct impact. Traditionally, a typical supply chain in retail consists of four main components: wholesalers (vendors), warehouses (distribution centers), stores and customers, as shown in Fig. 1. Other parties further upstream in the supply chain system, such as manufacturers and their suppliers, are normally beyond the scope of a retailer’s perspective. In a more modern setup, however, the supply chain system visible to the retailer becomes much more complex (as shown in Fig. 2). In addition to the increasing size of the warehouse and store network, retailers now have more options to satisfy their customers’ demand (e.g. from stores, from warehouses, or directly from the suppliers). Further, more and more retailers are forming a deeper integration with manufacturers and even raw material suppliers through the introduction of store brands or private labels (Chen 2018a). Retailers today still face the same or similar supply chain management challenges, such as the bullwhip effect (Lee et al. 1997), but the approaches and methodologies used by retailers are continuously evolving in order to drive for lower cost, higher efficiency and ultimately better customer experience.

Fig. 1 Traditional retail supply chain

Fig. 2 Modern retail supply chain

One of the most noticeable changes over time in supply chain management methodologies is the shift from “experience-driven” to “data-driven” approaches. An “experience-driven” supply chain management system relies heavily on human experience to make critical decisions such as inventory placement and fulfillment network design. Although data are still used in the process, assumptions based on experience and human judgment are usually pivotal to the success of such models. For example, traditional inventory replenishment models rely on assumptions about the shape of the demand distribution as well as vendor lead times in order to make an optimal decision (Covert and Philip 1973; Brahimi et al. 2006). Product assortment solutions (Kök et al. 2008) also depend heavily on product similarities, which are mostly rated by experienced category managers. On the other hand, “data-driven” models rely more on the data itself than on human experience. Due to greatly increased data availability and advances in data mining and machine learning technologies, it is now possible for supply chain management systems to make decisions based on available data alone. For example, learning-based replenishment algorithms (Zhang et al. 2017) utilize observed demand data to improve inventory decisions over time. Machine-learning-based inventory algorithms (Shi et al. 2018; O’Neil et al. 2016) can directly take in historical sales and purchase order (PO) information to produce optimal decisions without placing assumptions on demand estimation. Embedding techniques (Shi 2018; Barkan and Koenigstein 2016) can also effectively learn product similarities from historical customer order data without resorting to domain knowledge.
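To make the contrast concrete, the following minimal Python sketch compares an order quantity derived from an assumed demand distribution with one read directly from the observed data. It is an illustration only: the single-period (newsvendor-style) setting, the normal-distribution assumption, the function names and the cost parameters are all assumptions introduced here and are not taken from the cited works.

```python
import math
from statistics import NormalDist, mean, stdev


def parametric_order_qty(history, underage_cost, overage_cost):
    """Order quantity under an ASSUMED normal demand distribution."""
    ratio = underage_cost / (underage_cost + overage_cost)  # critical ratio
    fitted = NormalDist(mean(history), stdev(history))
    return fitted.inv_cdf(ratio)


def empirical_order_qty(history, underage_cost, overage_cost):
    """Order quantity read directly from the observed demand sample."""
    ratio = underage_cost / (underage_cost + overage_cost)
    ordered = sorted(history)
    # Smallest observed demand whose empirical CDF reaches the critical ratio.
    k = max(math.ceil(ratio * len(ordered)), 1)
    return ordered[k - 1]


# Hypothetical daily demand with one promotion-driven outlier (40 units).
demand = [8, 9, 10, 10, 11, 12, 12, 13, 15, 40]
print(parametric_order_qty(demand, underage_cost=4, overage_cost=1))
print(empirical_order_qty(demand, underage_cost=4, overage_cost=1))
```

In this hypothetical sample the single outlier inflates the fitted standard deviation, so the distribution-based quantity ends up well above the empirical quantile. Which of the two is preferable depends on how well the assumed distribution matches reality, which is the question the data-driven literature addresses.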

The growing interest in data-driven algorithms is shared across both industry and academia. At the same time, we observe promising outcomes from newer approaches, which show an ability to solve practical problems with better empirical performance. With the ever-growing complexity of modern supply chain networks, data-driven algorithms are playing increasingly important roles.

In this paper, we primarily discuss several key components of retail supply chain management systems and present their evolution into the data era. Besides research literature, the review also covers industrial applications and demonstrates how theories are applied in the real world. The paper is organized as follows: Section 2 discusses retailers’ coordination and interaction with vendors; Section 3 discusses topics related to demand forecasting, one of the most critical components in supply chain management; Section 4 reviews inventory management systems; Section 5 covers the fulfillment perspective of the supply chain system; Section 6 summarizes future research directions.

2 Vendor management

In a supply chain, vendor management is critical for a firm since it has a direct impact on product quality, service level and company profits. Effective vendor management requires information sharing between the retailer and vendors, such as sharing demand forecasts and inventory information through a vendor managed inventory (VMI) system in a marketplace with stationary and non-stationary demand. In this section, we review recent developments in vendor management from both theoretical research and industrial practice perspectives.

2.1 Introduction

Recent advances in information technology, particularly in the e-business arena, are enabling firms to rethink their vendor management strategies and explore new avenues for cooperation with vendors through information sharing. Sharing demand-related information among supply chain members has had a large impact in practice. As pointed out by Stein and Sweat (1998), by exchanging information such as point-of-sale (POS) data, forecasting data, inventory levels and sales trends, many companies are reducing their cycle times, fulfilling orders more quickly, cutting out excess inventory, and improving customer service.

Many companies not only share information with their supply chain partners, but also jointly make decisions to improve supply chain performance by using Collaborative Planning, Forecasting and Replenishment (CPFR). According to Panahifar et al. (2015), CPFR is a technological innovation tool that was first registered as a trademark by the Voluntary Inter-industry Commerce Standards (VICS) in 1998 and is defined by VICS as a collection of new business practices that leverage the Internet and EDI (electronic data interchange) in order to achieve two goals: radically reduce inventories and expenses while improving customer service.

Verity (1996) reported in Business Week that Wal-Mart and Warner Lambert attained significant improvements in in-stock positions while reducing inventory through CPFR. CPFR is one of a series of supply chain initiatives, like JIT (Just-In-Time), ECR (Efficient Consumer Response) and VMI (Sheffi 2002), driven by organizations to make their supply chains more responsive and keep all the supply chain members in tune with end customer demand, both in terms of the product and its volumes.

End-to-end information sharing reduces the occurrence of the bullwhip effect, thus lowering inventory levels across the chain. It also allows the partners in a supply chain to see the bigger picture of the entire supply chain rather than only their individual enterprise.

As partner collaboration runs from the planning stage through to the replenishment stage, the supply chain can respond better to exceptional circumstances, making it a proactive system rather than a reactive one. On a more abstract level, CPFR aims at creating an environment of trust between trading partners in which the benefits of sharing information are realized. The role of CPFR in the various stages of supply chain activity is represented in Fig. 3. Within an efficient and integrated CPFR environment, where firms share promotion plans, sales data and retail analytics and vendors share inventory and shipment information, the supply chain is more responsive to external business changes and all the partners can benefit from the information sharing.

Fig. 3 The CPFR in a supply chain

In vendor management, however, an incomplete understanding of the value of information sharing and physical flow coordination may hinder efforts to promote the efficiency and responsiveness of a supply chain. We attempt to better understand information sharing and flow coordination by reviewing and categorizing the recent research and practice in this area.

2.2 Literature review

Relevant literature consistently recognizes that inventory reduction can be achieved by implementing initiatives such as information sharing, continuous replenishment programs and VMI. The related literature on these initiatives falls into three streams: (1) inter-organizational systems (IOS); (2) quantitative models of information sharing; and (3) implementation of information sharing through VMI and CPFR (Yao and Dresner 2008).

The first stream of research has revealed the business value of IOS by studying the supply chain management initiatives, such as information sharing, continuous replenishment programs and VMI. Research has shown that IOS, as a link between suppliers and buyers, can improve a firm’s performance and bring it competitive advantages (Sethi et al. 1993; Palmer and Markus 2000; Premkumar 2000; Srinivasan et al. 1994).

The second stream of research has quantitatively examined the value of information sharing in supply chains. The consequences of the bullwhip effect, for example, can be minimized through information sharing (Lee et al. 1997; Lee and Whang 1999; Lee et al. 2000; Cachon and Fisher 2000; Chen et al. 2000).

Some research has found that policies such as VMI can decrease the bullwhip effect, thereby improving supply chain efficiency, for example by lowering inventory levels and reducing cycle time (Cachon and Zipkin 1999; Kulp et al. 2004; Mishra and Raghunathan 2004). Angulo et al. (2004) use simulation to find that demand information sharing is a significant part of VMI implementation and can improve the fill rate by up to 42%. Çetinkaya and Lee (2000) develop an analytical model for coordinating inventory and transportation decisions in VMI systems. Lee and Whang (1999) present empirical evidence to confirm the value of supply chain coordination. Using data collected from 31 grocery retail chains, these authors show that inventory turns and stock-outs improved after the implementation of continuous replenishment programs.
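The bullwhip effect discussed in this stream can be illustrated with a small simulation. The sketch below is a simplified illustration under stated assumptions, not a reproduction of any cited model: it assumes i.i.d. normally distributed customer demand, a moving-average forecast, an order-up-to level equal to the lead time multiplied by the forecast (no safety stock), and hypothetical parameter values. It reports the ratio of order variance to demand variance, where a ratio above one indicates amplification.

```python
import random
from statistics import mean, pvariance


def simulate_bullwhip(num_periods=5000, window=5, lead_time=2, seed=0):
    """Return Var(orders) / Var(demand) for a simple order-up-to policy."""
    rng = random.Random(seed)
    # Hypothetical i.i.d. customer demand (mean 100, standard deviation 10).
    demand = [rng.gauss(100, 10) for _ in range(num_periods)]
    orders = []
    prev_target = None
    for t in range(window, num_periods):
        # Moving-average forecast built from the last `window` observations.
        forecast = mean(demand[t - window:t])
        target = lead_time * forecast  # order-up-to level, no safety stock
        if prev_target is not None:
            # Replace last period's demand and adjust for the change in target.
            orders.append(demand[t - 1] + target - prev_target)
        prev_target = target
    return pvariance(orders) / pvariance(demand)


print(simulate_bullwhip())
```

Under these assumptions each order equals (1 + L/p) times the most recent demand minus (L/p) times the demand that dropped out of the forecast window, where L is the lead time and p the window length, so the ratio should be close to (1 + L/p)² + (L/p)², about 2.12 for the default values. A longer forecast window or a shorter lead time lowers the ratio in this sketch.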

The third stream of research has widely discussed collaboration and integration in the context of supply chains in recent years, especially in the area of CPFR. Panahifar et al. (2015) review the scope and value of CPFR using a taxonomy devised to classify selected references related to CPFR. Based on a total of 93 papers on CPFR published from 1998 to 2013, the authors seek to identify the main constructs and an efficient framework for successful implementation of CPFR. The key finding of the paper is that there are four main constructs for successful implementation of CPFR: 1. CPFR enablers; 2. CPFR barriers; 3. trading partner selection; 4. incentive alignment. The findings indicate that there is a need for better understanding of the amount and level of information sharing as an important function of CPFR implementation.

Panahifar et al. (2015) also categorize the CPFR implementation benefits for companies into three main dimensions: information, service and finance. The information dimension encompasses improving forecasting accuracy, reducing the amount of exchanged information and reducing the bullwhip effect. The service dimension involves more criteria, including increased responsiveness, enhanced customer service quality, improved inventory management, improved product offering, operational efficiency, product availability assurance, decreased replenishment lead time and increased customization capability. The financial dimension is the most important objective for firms implementing CPFR. It covers several criteria reported in previous studies, such as increased revenues and earnings, increased margins, increased EVA (Economic Value Added), increased shareholder wealth, decreased cost of production, planning and deployment, maximum efficiency of members, reduced inventory in the supply chain, decreased working capital, reduced production and inventory costs, reduced overall costs, increased product sales and fewer stock-outs.

2.3 Industry practice of information sharing

Potential economic benefits of information sharing through CPFR are well-recognized and have been publicized in practice by successful retail businesses such as Wal-Mart.

In its CPFR partnership with P&G, Wal-Mart’s marketing information is integrated with P&G’s manufacturing systems to make better consumer-based decisions across firm-level boundaries. For example, Wal-Mart’s POS data show transaction-level information about consumers’ choices, thus providing actual demand information on what is selling and at what price. P&G’s products are then developed, manufactured and delivered to meet those customer needs in a timely manner. The CPFR pilot with P&G provided a structured contractual platform for joint forecasting and planning activities between Wal-Mart and its vendors that ultimately drive the replenishment process through the entire supply chain. As pointed out by Andraski and Haedicke (2003), by 2003 Wal-Mart had established over 600 trading partners through CPFR to reduce its operating expenses to the lowest in the industry. Successful collaboration with CPFR partners allowed Wal-Mart to price its products 10% below most of its competitors.

Kim and Mahoney (2010) provided a detailed case study of the CPFR arrangement between Wal-Mart and P&G. The case study reveals that the successful implementation of CPFR depends not only on extensive information sharing but also on mutual learning and on commitments to dedicated partners built through repeated interactions. It grows out of first gaining an awareness of the contractual partners’ business needs by asking: 1. what is the competitive advantage of your partners; 2. what is the competitive advantage to you if you combine theirs with yours; 3. what kind of business relationship does that create. Thus, successful implementation of CPFR requires higher levels of communication, including the exchange of strategies and objectives between partners at the beginning of a planning period.

As the e-commerce retail industry in China has been booming recently, information sharing through CPFR has been widely implemented by Chinese retailers. To build an efficient and responsive supply chain that provides fast delivery to customers, JD.com, China’s largest retailer, online or offline, has invested in collaboration with its vendors through CPFR by continuously applying advanced information technology to supply chain innovation. Below we illustrate several successful cases from JD.com in which the value of information sharing in vendor management was realized.

More efficient vendor management with Vendor A

One practical example of information sharing is the EDI system coordination between JD.com and Vendor A, a leading Chinese electrical appliance manufacturer. Vendor A is one of the top vendors for JD.com, providing electrical appliances such as small kitchen appliances, laundry appliances, large cooking appliances, and refrigeration appliances.

Originally, information such as replenishment purchase orders was exchanged between JD.com and Vendor A simply by email and telephone, which resulted in information delays and inaccuracies and in orders being untraceable. To overcome this challenge, JD.com and the vendor cooperated in building an EDI system that makes the information exchange real-time, transparent and traceable.

The information shared between JD.com and Vendor A includes JD.com’s sales plan, monthly demand forecast by region, purchase orders, and vendor confirmations responding to the orders. The business benefits of this information sharing through EDI and the integrated demand forecasting are significant: 1. JD.com has a better estimate of Vendor A’s inventory and production capacity, increasing the prediction time window from one month to three months. 2. With JD.com sharing demand predictions for longer periods, Vendor A can better arrange raw material procurement, production capacity and worker schedules. The production plan of Vendor A is 95% aligned with JD.com’s purchase plan. 3. The time delay between JD.com’s purchase orders and Vendor A’s response and confirmation has been reduced by 70%.

Supply lead time reduction with Vendor B

Vendor B is a major beverage and food supplier for JD.com. In 2014, JD.com set a goal to provide fresh products and high-quality customer service in the beverage market. Knowing the potential benefits of supply chain coordination, the executive management teams of both JD.com and Vendor B reached a strategic agreement to strengthen their collaboration in information sharing. Since then, three key steps have been implemented: 1. Demand forecast and promotion plan sharing; 2. Vendor B direct inventory replenishment to JD.com warehouses; 3. Using Vendor B’s warehouses to fulfill JD.com customer orders. In 2016, with the CPFR practice and information sharing efforts, the supply lead time from Vendor B decreased from 8 days to 3 days, the in-stock rate increased from 73% to 96%, and GMV increased by 5%.

Higher in-stock rate for peak promotion with Vendor C

During important promotional and holiday events, such as the Thanksgiving Black Friday event in the US and the 618 and Singles’ Day events in China, sales can account for 30% or more of annual sales, and retailers usually need to prepare inventory 1-2 months in advance to meet the peak demand, especially for top-selling products. Through CPFR, firms can reserve the vendors’ inventory and improve the in-stock rate during promotional events. Another successful case of information sharing is the implementation of CPFR between JD.com and Vendor C. During the 618 promotional event in 2017, Vendor C fulfilled 99% of JD.com’s replenishment orders. The sharing of promotional information improved the in-stock rate of Vendor C’s products by 3% and reduced lead time by 20%.

3 Demand forecasting

Demand forecasting is one of the most important components in supply chain management. In fact, demand forecasting results are key inputs for many decision-making processes in retail such as inventory management, network planning, pricing and revenue management, marketing, etc. Over the recent years, many retail companies, especially e-commerce companies have significantly increased their investments in improving demand forecasting performance. Further, with the increased data availability and the development of more sophisticated machine learning algorithms, we observe many new developments in this space in recent years. We will review both theoretical research and industrial practices of demand forecasting in this section.

3.1 Literature review

Demand forecasting is critical to the success of a retail company. Particularly in supply chain management, an accurate and practical demand forecasting system can be a significant source of competitive advantage by improving customer service levels and by reducing costs related to supply-demand mismatches (Snyder and Shen 2011). As a result, the forecasting problem has a long research history in the field of statistics and, more recently, in machine learning. Demand forecasting is a practical domain of time series modeling and forecasting (Hamilton 1995). Many important models have been proposed in the literature for improving the accuracy and efficiency of time series modeling and forecasting. In the sections below, we describe three important classes of time series models that are commonly used in the practice of demand forecasting, i.e. stochastic time series, machine learning and deep learning models, together with their inherent forecasting strengths and weaknesses. We also point out several specific forecasting scenarios that are closely associated with the retail industry.

3.1.1 Stochastic time series models

Stochastic time series models are the most popular methods used in demand forecasting. There are two main classes of stochastic time series models: linear and non-linear models. Linear models include the two most widely used stochastic time series approaches, i.e. the Holt-Winters method (Holt 2004; Winters 1960) and the Autoregressive Integrated Moving Average (ARIMA) model (Box et al. 2015). The ARIMA model has several subclasses, such as the Autoregressive (AR), Moving Average (MA), Autoregressive Moving Average (ARMA), and Seasonal ARIMA (SARIMA) (Hamzaçebi 2008) models. Although these models can capture trend and seasonality, they are ineffective in modeling highly non-linear time series. To overcome this drawback, various non-linear stochastic models have been proposed in the literature, such as the Autoregressive Conditional Heteroskedasticity (ARCH) (Engle 1982) and Generalized ARCH (GARCH) (Bollerslev 1986) models; however, their implementations are not as simple and straightforward as those of the linear models.
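As a minimal illustration of the exponential smoothing family, the sketch below implements Holt’s linear-trend method (double exponential smoothing), which is the non-seasonal building block of the Holt-Winters method; the seasonal component is deliberately omitted to keep the example short. The smoothing parameters, the simple initialization and the function name are assumptions chosen for illustration.

```python
def holt_linear_forecast(series, alpha=0.5, beta=0.3, horizon=3):
    """Forecast `horizon` steps ahead with Holt's linear-trend smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]  # simple initialization (an assumption)
    for value in series[1:]:
        prev_level = level
        # Blend the new observation with the previous one-step-ahead forecast.
        level = alpha * value + (1 - alpha) * (prev_level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Extrapolate the final level along the final trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# Hypothetical weekly sales growing by two units per week.
print(holt_linear_forecast([10, 12, 14, 16, 18]))
```

On a perfectly linear series such as this one, the method simply continues the line; on noisy data, `alpha` and `beta` control how quickly the level and trend estimates react to new observations.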

The strengths of stochastic time series models are their interpretability and computational efficiency. They are well received by practitioners because they are easy to implement and produce intuitive results. However, the models make relatively strong assumptions about the underlying stochastic processes, which normally oversimplify the practical situation. Recently, with the advancements in big data and machine learning, researchers have found that machine learning and deep learning algorithms can provide better forecasting performance.

3.1.2 Forecasting with machine learning techniques

In the last few decades, machine learning techniques have been widely used in the field of forecasting, such as Decision Tree (Breiman 2017), K-Nearest Neighbor Regression (KNN) (Hastie and Tibshirani 1996), Support Vector Regression (SVR) (Drucker et al. 1997), and Gaussian Processes (GP) (Williams and Rasmussen 1996). Compared with stochastic time series models, which are more model-driven, machine learning methods are more data-driven; in general, these methods are used to improve time series predictions by minimizing a loss function. Ahmed et al. (2010) compare the accuracy and time consumption of these machine learning models.
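To give a flavor of how such data-driven methods operate on a time series, the following sketch applies a basic nearest-neighbor regression to lagged windows: it searches the history for the windows most similar to the most recent one and averages the values that followed them. This is a generic textbook-style illustration, not the enhanced method of any cited paper; the function name, window length and number of neighbors are assumptions.

```python
def knn_forecast(series, window=3, k=2):
    """One-step-ahead forecast from the k most similar past windows."""
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    query = series[-window:]  # the most recent pattern
    candidates = []
    # Every earlier window that has a known next value is a candidate.
    for start in range(len(series) - window):
        past = series[start:start + window]
        next_value = series[start + window]
        distance = sum((a - b) ** 2 for a, b in zip(past, query)) ** 0.5
        candidates.append((distance, next_value))
    candidates.sort(key=lambda item: item[0])
    nearest = candidates[:k]
    # Average the values that followed the nearest windows.
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical demand with a repeating three-period pattern.
history = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
print(knn_forecast(history, window=2, k=2))
```

No distributional form is assumed here; the forecast is determined entirely by which historical patterns resemble the current one, which is the sense in which such methods are data-driven.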

Zheng and Su (2014) proposed a two-step enhanced KNN method that consistently improved accuracy in short-term forecasting. Recurrent least-square SVR (Suykens and Vandewalle 2000) and dynamic least-square SVR (Fan et al. 2006) are two popular SVR models for time series forecasting. Girard et al. (2003) proposed using the non-parametric Gaussian process model for multi-step-ahead time series prediction, so that the uncertainty about intermediate regressor values can be incorporated and the uncertainty on the current prediction can be updated. Decision Tree provides a foundation for various tree-based ensemble algorithms, including the two most widely used machine learning techniques, i.e., Random Forest (RF) (Kam 1995) and Gradient Boosting Decision Trees (GBDT) (Chen and Guestrin 2016; Ke et al. 2017; Prokhorenkova et al. 2017). RF has been used for electricity load forecasting (Nedellec et al. 2014; Dudek 2015), where the RF model was shown to provide forecasts as accurate as those of artificial neural networks (ANN) and to outperform the ARIMA and Decision Tree models (Dudek 2015). RF has also been shown to be competent in one-step-ahead time series forecasting, and a low number of recent lagged variables was found to perform better, highlighting the importance of the training set’s length (Tyralis and Papacharalampous 2017). Similar to RF, GBDT has exhibited competent performance compared with other stochastic time series and machine learning methods on time series forecasting (Kusiak et al. 2009). However, tree-based models have a limitation: they cannot extrapolate, i.e. they cannot predict values larger or smaller than those in the training set, so they are not suitable for time series with a trend. The solution is feature engineering, which removes the trend first by constructing many time series features (e.g. day of week, holiday, season, moving average, and lag) (Kusiak et al. 2009). Another weakness of machine learning models is that they require more computation than stochastic time series models, which increases the running time of the solutions.
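The feature engineering step mentioned above can be sketched as follows. The example builds day-of-week, lag and moving-average features from a daily sales series and uses the day-over-day change as the prediction target, so that a tree-based model is not asked to predict levels outside the range seen in training. The choice of a differenced target, the feature names and the default lags are assumptions made for illustration; holiday and season features are omitted for brevity.

```python
from datetime import date, timedelta


def build_features(dates, sales, lags=(1, 7), window=7):
    """Turn a daily sales series into feature rows for a tree-based model."""
    rows = []
    first = max(max(lags), window)  # earliest index with full history
    for t in range(first, len(sales)):
        row = {"day_of_week": dates[t].weekday()}  # Monday is 0
        for lag in lags:
            row[f"lag_{lag}"] = sales[t - lag]
        recent = sales[t - window:t]
        row[f"ma_{window}"] = sum(recent) / window
        # Detrended target: change relative to the previous day.
        row["target_diff"] = sales[t] - sales[t - 1]
        rows.append(row)
    return rows


# Hypothetical trending sales over ten days starting on a Monday.
days = [date(2024, 1, 1) + timedelta(days=i) for i in range(10)]
units = list(range(100, 110))
for feature_row in build_features(days, units):
    print(feature_row)
```

A forecast of the sales level is then recovered by adding the predicted change to the last observed value.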

3.1.3 Forecasting with deep learning techniques

Recently, artificial neural networks (ANNs) have attracted increasing attention in the domain of time series forecasting (Zhang 2003; Kihoro et al. 2004; Kamruzzaman and Sarker 2006). The key strength of ANNs, when applied to time series forecasting problems, is their inherent capability for non-linear modeling without any presumption about the statistical distribution followed by the observations.

Recurrent Neural Networks (RNNs) are a special type of ANN that contains self-connections. Unlike in feedforward NNs, the hidden states of RNNs serve as memory that helps map both current inputs and previous internal states to new desired outputs. This allows RNNs to capture temporal information in sequence data. Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber 1997) is an important class of RNN variants, which has additional memory-control gates and memory cells to selectively store historical information and retain long-term information. RNNs and LSTMs have been successfully applied to time series forecasting problems. Längkvist et al. (2014) reviewed recent research in unsupervised feature learning and sequence modeling with deep learning methods for various time series data such as video frames, speech signals, and stock market prices. Bianchi et al. (2017) compared different RNN variants and showed that LSTMs outperformed the others on highly non-linear sequences with sharp spikes, thanks to the quick memory cell modification mechanism. Cinar et al. (2017) proposed using an LSTM encoder-decoder with a position-based attention model to capture patterns of pseudo-periods in sequence data. They applied the attention mechanism (Bahdanau et al. 2014) to explore similar local patterns in historical data for future prediction. However, it is impractical to look into the full history of a time series, and the selection of which part of the history to attend to relies on human knowledge. Taieb and Atiya (2016) analyzed the performance of different multi-horizon forecasting strategies on synthetic datasets with different factors, such as the length of the time series and the number of horizons. Flunkert et al. (2017) proposed a technique called DeepAR to make probabilistic forecasts by assuming an underlying distribution for time series data. DeepAR can produce probability density functions for target variables by estimating the distribution parameters at each time point with multi-layer perceptrons (MLPs). However, the distributional assumption is often too strong to apply to real-world datasets.
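The memory role of the hidden state described at the start of this subsection can be shown with a deliberately tiny example. The sketch below runs a basic (Elman-style) recurrent cell with a single scalar hidden unit; the weights are hypothetical fixed numbers rather than trained values, and practical models use vector-valued states and learned parameters.

```python
import math


def rnn_forward(inputs, w_in, w_rec, bias=0.0, h0=0.0):
    """Run a one-unit recurrent cell over a sequence and return its states."""
    hidden = h0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        states.append(hidden)
    return states


# A single spike followed by zeros: the spike keeps echoing in the state.
with_memory = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.5)
# With the recurrent weight set to zero the cell forgets immediately.
without_memory = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.0)
print(with_memory)
print(without_memory)
```

With a non-zero recurrent weight, the states after the spike remain positive even though the later inputs are zero, which is the mechanism that lets recurrent models carry temporal information forward.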

Compared with stochastic time series and machine learning models, the strengths of deep learning models are their capability to incorporate a variety of information, their lack of need for feature engineering, and their generally higher forecast accuracy. Their weaknesses are that they require an enormous amount of computation when the training set is large, and that their interpretability is the lowest among the three classes.

3.1.4 Retail specific forecasting scenarios

Retail demand forecasting has its own unique challenges that make the problem more complex. Firstly, retail demand forecasting normally has to deal with censored demand, where observed demand data are affected by out-of-stock inventory (Jain et al. 2014). Empirically, out-of-stock events are highly correlated with promotion activities in retail business, so the forecasting algorithm needs to deal with the missing data very carefully in order to produce accurate results. The expectation-maximization (EM) algorithm (Dempster et al. 1977) is one of the approaches that is widely adopted for this need (Anupindi et al. 1998; Talluri and Van Ryzin 2004; Vulcano et al. 2010; Conlon and Mortimer 2013). Secondly, with the ever-increasing product selection, new product forecasting is becoming an increasingly important topic. Since the publication of the Bass model (Bass 1969), research has sought to make diffusion models theoretically sounder and practically more effective for new product forecasting (Ismail and Abu 2013; Lee et al. 2014; Kahn 2014). Thirdly, product complementarity and substitutability play an important role in the actual sales of products, and the demand for a product can depend directly and indirectly on other products in different categories (Shocker et al. 2004; Duan et al. 2015). Fourthly, a retailer, especially an e-commerce retailer, generally has a large selection of highly diversified products with different characteristics, such as fashion and fresh groceries. For fashion products, demand uncertainty, lack of historical data and seasonal trends usually coexist and make demand forecasting more challenging (Nenni et al. 2013). For fresh groceries, successful demand forecasting is critical because of the short shelf life of fresh products and the importance of product quality (Doganis et al. 2006; Shukla and Jharkharia 2013). For long-tail products, the problem has proved to be more challenging due to the sparsity of the sales data, which limits the degree to which traditional analytics can be deployed. Pitkin et al. (2018) developed a Bayesian hierarchical model to forecast this type of demand.
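The censored-demand issue raised at the start of this subsection can be illustrated with a small EM routine. The sketch below is a simplified illustration, not the procedure of any cited paper: it assumes that daily demand follows a Poisson distribution, that sales equal demand on days without a stock-out, and that on stock-out days the true demand was at least the recorded sales. The function names and the data are hypothetical, and the tail computation is numerically naive.

```python
import math


def poisson_tail(lam, s):
    """P(D >= s) for D ~ Poisson(lam), by direct summation."""
    if s <= 0:
        return 1.0
    pmf = math.exp(-lam)  # P(D = 0)
    cdf = 0.0
    for k in range(s):
        cdf += pmf
        pmf *= lam / (k + 1)  # move from P(D = k) to P(D = k + 1)
    return max(1.0 - cdf, 1e-300)  # guard against division by zero


def em_censored_poisson(sales, stocked_out, iterations=200):
    """Estimate mean demand when stock-out days only give a lower bound."""
    lam = sum(sales) / len(sales)  # start from the naive mean of sales
    for _ in range(iterations):
        imputed = []
        for s, censored in zip(sales, stocked_out):
            if censored:
                # E-step: E[D | D >= s] = lam * P(D >= s - 1) / P(D >= s)
                ratio = poisson_tail(lam, s - 1) / poisson_tail(lam, s)
                imputed.append(lam * ratio)
            else:
                imputed.append(float(s))
        # M-step: the Poisson mean is the average of the imputed demands.
        lam = sum(imputed) / len(imputed)
    return lam


# Hypothetical week in which the item sold out on the three days with 5 sales.
daily_sales = [3, 5, 4, 5, 2, 5]
sold_out = [False, True, False, True, False, True]
print(em_censored_poisson(daily_sales, sold_out))
```

Because each stock-out day is replaced by an expected demand above its recorded sales, the estimate exceeds the naive average of observed sales, which is the bias that censoring-aware methods aim to correct.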

3.2 Industry practice

As mentioned in the earlier section, demand forecasting is a heavily invested domain in the industry, especially in retail. There are many existing commercial solutions on the market. At the same time, many companies have established dedicated demand forecasting teams to keep pushing the boundary of forecasting accuracy.

3.2.1 Commercial forecasting tools

Demand forecasting is a key part of traditional supply chain management software. Leading companies in this space include JDA, IBM, SAS, Oracle and more. The prediction methods they use are mostly stochastic time series algorithms, and their software functions emphasize planning needs, such as what-if simulations and manual adjustments to the forecast demand. Although these companies lack sophisticated predictive algorithms, they have a high market share among large and medium-sized enterprises because of their long history, wide product coverage, and good integration with ERP (Enterprise Resource Planning) systems.

With the growing popularity of machine learning in time series prediction, many products have appeared in recent years with machine learning and prediction accuracy as their main selling points. Such products can use machine learning algorithms to forecast demand more accurately than stochastic time series methods, leveraging internal and external data such as product features, store locations, weather, geography and economic indicators. Representative companies in this field include LLamasoft and Blue Yonder. It is worth noting that JDA acquired Blue Yonder in 2018, and then launched the machine-learning-based cloud prediction product Luminate Demand Edge in October of that year, marking JDA’s transition to cloud services and artificial intelligence (JDA Software 2018). In addition, Amazon recently launched its time series forecasting cloud service “Amazon Forecast”, based on machine learning and deep learning techniques (e.g. DeepAR (Flunkert et al. 2017) and MQ-RNN (Wen et al. 2017)); no machine learning experience is required to use this service.

In the last two years, a number of general-purpose machine learning platforms have emerged, such as H2O, DataRobot and Ali PAI. This type of product is targeted at data scientists, allowing them to perform modeling through visual operations without programming. Within this trend, one branch is general-purpose predictive software that does not require users to build models themselves. The most representative example is H2O Driverless AI, which has many advanced algorithms built in and can automatically search for the best-performing algorithm for any given data set.

3.2.2 Probabilistic demand forecasting system

Many practical decision-making scenarios require richer information provided by probabilistic forecasting that returns the full conditional distribution, rather than point forecasting that predicts the conditional mean. For real-valued time series, this is traditionally achieved by assuming an error distribution on the residual series. However, an exact parametric distribution is often not directly relevant in applications as mentioned above. Instead, particular quantiles of the forecast distribution are useful in making optimal decisions, both to quantify risks and minimize losses (e.g. risk management, operation optimization). Probabilistic forecasting is a key enabler for forecasting demand and optimizing business processes in retail businesses.
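
The traditional approach described above can be sketched in a few lines: a point forecast is combined with a distribution fitted to past residuals, and the required quantile is read off that distribution. The sketch assumes zero-mean, normally distributed residuals; the point forecast and residual values are hypothetical.

```python
from statistics import NormalDist, stdev

def gaussian_quantile_forecast(point_forecast, residuals, tau):
    """Quantile forecast at level tau, assuming zero-mean normal forecast errors."""
    sigma = stdev(residuals)  # sample standard deviation of past forecast errors
    return point_forecast + NormalDist(0.0, sigma).inv_cdf(tau)

# Hypothetical inputs: a point forecast of 120 units and eight past residuals.
point = 120.0
residuals = [-4.0, 3.0, -1.0, 2.0, 0.0, -3.0, 5.0, -2.0]
q10 = gaussian_quantile_forecast(point, residuals, 0.10)
q50 = gaussian_quantile_forecast(point, residuals, 0.50)
q90 = gaussian_quantile_forecast(point, residuals, 0.90)
```

When the normality assumption is doubtful, as is common for intermittent retail demand, the quantiles can instead be estimated directly, which motivates the quantile regression methods discussed next.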

Much research focuses on generating quantile estimates for target variables by formulating forecasting problems as quantile regressions (Koenker and Bassett 1978). Zheng (2010) proposed to minimize objective functions for quantile regressions with high dimensional predictors by gradient boosting (Friedman 2001). Xu et al. (2016) proposed a quantile autoregressive model which can output multi-horizon quantile predictions by sequentially feeding the predictions of previous steps into the same neural network (NN) for the current prediction. Their NN-based autoregressive model differs from a recurrent neural network (RNN) in that prediction values, instead of hidden states, are fed recursively. For the above methods, separate models have to be trained for different quantiles in multi-quantile forecasting tasks, which is inefficient in practice.
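
The objective minimized in quantile regression is the quantile (pinball) loss of Koenker and Bassett (1978). The sketch below implements that loss and shows, on a small made-up demand sample, that the constant forecast minimizing it is the corresponding empirical quantile. The helper names and the data are illustrative assumptions.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average quantile (pinball) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts are weighted by tau, over-forecasts by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

def best_constant_quantile(y, tau):
    """Observed value that minimizes the pinball loss when used as a constant forecast."""
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, [c] * len(y), tau))

# Hypothetical daily demand observations.
demand = [3, 5, 8, 10, 12, 15, 20, 24, 30, 41, 55]
q50 = best_constant_quantile(demand, 0.5)
q90 = best_constant_quantile(demand, 0.9)
```

Because the loss depends on tau, a model trained for one quantile level is not optimal for another, which is why the methods above need one model per quantile.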

Recent forecasting research from e-commerce companies shows strong interest in probabilistic forecasting, such as DeepAR (Flunkert et al. 2017) and MQ-RNN (Wen et al. 2017) from Amazon and the multi-horizon forecasting algorithm (Fan et al. 2019) developed by JD.com. MQ-RNN (Wen et al. 2017) generates multiple quantile forecasts for multiple time horizons. It uses an LSTM to encode the history of the time series into one hidden vector, and uses two MLPs to summarize this hidden vector, together with all future inputs, into one global context feature and horizon-specific context features for all horizons. However, this global context feature may be too general to capture short-term patterns.

Fan et al. (2019) presented an end-to-end deep learning framework for multi-horizon probabilistic demand forecasting. This LSTM-based system better accounts for temporal relationships within the forecast horizon by allowing information to propagate both forward and backward in the LSTM decoder (shown in Fig. 4). This approach outperforms current state-of-the-art models on the GOC2018 Sales Forecasting dataset (see footnote 1) (Yuan and Jing 2018).

Fig. 4 Structure of deep learning probabilistic forecasting framework

4 Inventory management

Inventory management is concerned with making informed decisions about the quantity and placement of stocked goods. As a complex multi-layer system, it is required at various locations/facilities across the inventory network. It covers the entire inventory control process, including the monitoring of goods moved into and out of warehouses/distribution centers and the reconciling of inventory balances. In this section, we discuss the state of the art regarding three key aspects of inventory management, namely, replenishment, transshipment, and placement.

4.1 Literature review

Inventory management is an extensively studied field in the literature. Due to space limitations, we only cover the main methodologies and invite readers to refer to the cited references for more comprehensive introductions and reviews. In the following subsections, three key aspects of inventory management are covered, i.e. replenishment, transshipment and placement. These are the foundations of many practical inventory management systems and continue to see new developments.

4.1.1 Replenishment

If customer demand and vendor lead-times could be predicted accurately, one would be able to vary stock accordingly to accommodate them. However, both demand and lead-times are difficult to forecast to a satisfactory accuracy level due to the stochastic nature of these quantities (e.g., variability in demand and lead-time). In addition, items may be required faster than the supply can provide them. Among the different types of inventory costs incurred, stock-out cost is usually the most significant; stock-outs may result from fluctuating customer demand, forecast inaccuracy, variability in lead-times, etc. To compensate for these, one can hold sufficient extra stock to cope with unexpected or excess demand so as to prevent stock-outs. The minimum stock level maintained is referred to as safety stock, and this inventory management strategy is called the safety stock policy. Safety stock can also compensate for the uncertainty in the vendor lead-time prediction. Arguably, safety stock is the basis for the broader problem of inventory replenishment.

The safety stock idea has been used to devise a few classic replenishment models such as the continuous review inventory policy (s,Q) and the periodic review inventory policies (R,s,S) and (R,S) (Snyder and Shen 2011). The former uses a replenishment quantity Q, which is usually determined using the idea of economic order quantity (EOQ) (Harris 1915), while the latter typically assume a more realistic stochastic demand and take a form similar to the safety stock described above.
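
As a rough illustration of the (s,Q) policy, the sketch below combines the textbook EOQ formula Q* = sqrt(2DK/h) with a reorder point s equal to expected lead-time demand plus a safety stock of z·σ·sqrt(L). It assumes independent, normally distributed daily demand and a fixed lead time; all parameter values are hypothetical.

```python
import math
from statistics import NormalDist

def economic_order_quantity(annual_demand, order_cost, unit_holding_cost):
    """Classic EOQ: Q* = sqrt(2 * D * K / h)."""
    return math.sqrt(2.0 * annual_demand * order_cost / unit_holding_cost)

def safety_stock(service_level, daily_demand_std, lead_time_days):
    """z * sigma * sqrt(L), assuming i.i.d. normal daily demand and a fixed lead time."""
    z = NormalDist().inv_cdf(service_level)
    return z * daily_demand_std * math.sqrt(lead_time_days)

def reorder_point(daily_demand_mean, lead_time_days, ss):
    """Expected demand over the lead time plus safety stock."""
    return daily_demand_mean * lead_time_days + ss

def order_quantity(inventory_position, s, q):
    """(s,Q) rule: order the fixed quantity Q once the inventory position is at or below s."""
    return q if inventory_position <= s else 0.0

# Hypothetical parameters for a single SKU.
Q = economic_order_quantity(annual_demand=3650, order_cost=50.0, unit_holding_cost=2.0)
ss = safety_stock(service_level=0.95, daily_demand_std=4.0, lead_time_days=9)
s = reorder_point(daily_demand_mean=10.0, lead_time_days=9, ss=ss)
```

With these numbers Q is roughly 427 units and s roughly 110 units, so an order of Q is triggered whenever the inventory position falls to about 110 or below.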

The strategic safety stocks described above can help to manage the risk of stock-out, which usually outweighs other relevant costs. A comprehensive review of safety stock techniques is available in Guide and Srivastava (2000). Various safety stock strategies have been developed, for instance, Salameh (1997) and Brandolese and Cigolini (1999). Estes (1973) developed a re-ordering point inventory model that accounts for demand and lead-time variability by assuming that the demand during the lead-time follows a normal distribution. Later on, Ruiz-Torres and Mahmoodi (2010) determined safety stock from historical data without making any particular distributional assumptions about demand and lead-time. The optimal placement of safety stocks has also been extensively studied, with seminal work including Clark and Scarf (1960) and Lagodimos and Anderson (1993). Graves and Willems (2000) formulated the safety stock placement problem as a network optimization problem and developed a dynamic programming algorithm as the solution procedure. Lesnaia (2004) formulated the safety stock problem in supply chains as a general network problem, which is shown to be NP-hard.

The inventory system in practice may consist of multiple layers of warehouses, such as regional distribution centers (RDC) and forward distribution centers (FDC). At each level of the supply chain, stock levels of various SKUs need to be determined. Multi-echelon inventory management deals with multiple layers of inventory simultaneously. Specifically, a multi-echelon system can be categorized into a few types: a serial chain, a divergent system and a convergent system. A divergent system consists of a single central stage and several successors. A convergent system has a single end stage with several predecessors. A serial chain is a special case of a divergent system in which each stage has one single successor.

Many real-world supply chains are multi-echelon systems consisting of several stages of procurement, manufacturing, and transportation. The theory of multi-echelon inventory management was initiated by Clark and Scarf (1960), who considered a basic model for a supply chain consisting of multiple stages with a serial structure. The authors proved that the optimal base stock levels can be obtained by minimizing one-dimensional convex cost functions. Another seminal work on this topic is Sherbrooke (1968). Later on, the work of Clark and Scarf (1960) was extended in many directions, such as systems with a pure assembly/convergent structure, fixed batch sizes or fixed replenishment intervals, and advanced demand information. For instance, the optimality of base stock policies and the decomposition result (Clark and Scarf 1960) were based on a stochastic dynamic program in a finite-horizon setting, which was then extended to the infinite-horizon case (Federgruen and Zipkin 1984). The optimal inventory policy for a serial system assuming Markov-modulated demand was derived by Chen and Song (2001).

Various solution procedures have been proposed in the literature for the inventory problem of multi-echelon systems, including heuristics, approximate methods, and exact methods. The optimal policy and optimal costs of a multi-echelon, serial system have been computed by both an approximate method based on two-moment fits and an exact method in van Houtum and Zijm (1991). Shang and Song (2003) proposed a simple heuristic-based inventory policy for the multi-echelon, serial system with linear costs and stationary random demands through the lower and upper bound subsystems.

4.1.2 Transshipment

The transshipment problem deals with moving goods between distribution centers to facilitate inventory re-balancing, bulk inventory receiving, and customer order fulfillment (Herer et al. 2002). Unlike the replenishment problem, the transshipment problem handles the necessary movements of goods between warehouses within an echelon, e.g. within the retailer’s network. Through transshipment, the overall inventory level can be reduced whilst maintaining the required service levels.

This topic has been studied extensively since the late 1950s (Allen 1958, 1961), with a focus on varying numbers of items, warehouse echelons, warehouses, ordering policies, and cost analysis. Two main types of transshipment exist, i.e. proactive and reactive transshipment, distinguished by the timing of the decision-making process with respect to demand realization. Proactive transshipment redistributes inventory between warehouses at predetermined moments in advance of any potential customer orders, while reactive transshipment, as the name suggests, responds to realized demand. Proactive transshipment models, e.g. Allen (1958) and Karmarkar (1981), range from simple single-period, single-transshipment models with no network inventory replenishment to more complex multi-period, multi-transshipment models with network inventory replenishment. Research on reactive transshipment can be split between periodic review (Krishnan and Rao 1965; Tagaras 1989) and continuous review (Lee 1987; Wong et al. 2006) models, which can be further divided into single-echelon, multi-echelon and decentralized systems. A comprehensive review of lateral transshipment models can be found in Paterson et al. (2011). It is worth noting that with e-commerce retailers, there is a growing emphasis on reactive transshipment not only to improve product availability but also to reduce order splits (Zhang et al. 2018). As last mile delivery is one of the largest components of fulfillment cost (Xu et al. 2009), an efficient reactive transshipment algorithm can significantly reduce overall fulfillment cost.

Recent research further increases the intricacy of the system, but primarily in focused areas. For instance, a two-item, two-warehouse periodic-review inventory model with transshipment was examined in Ramakrishna et al. (2015). Noham and Tzur (2014) incorporated a fixed transshipment cost to improve on previous methods of modeling single- and multi-item transshipment problems. Torabi et al. (2015) modeled the transshipment problem in e-commerce as a mixed integer linear program (MILP) and solved it to optimality to minimize the cost of logistics, although without considering the effects of shipment and lead-time.

4.1.3 Inventory placement

Inventory placement is the problem of determining the choice of fulfillment centers in which to place each SKU. The inventory placement problem is specifically relevant in the era of e-commerce due to the greatly increased item selection available to the retailers. The retailer has to strategically store the items in order to better balance shipping speed and inventory cost. As an optimization problem (e.g., mixed integer linear programming), it aims to find a placement plan for a given planning horizon that minimizes the total costs (i.e., shipping, overhead) and satisfies the expected geographical demand and respects fulfillment center capacity constraints. It covers topics on facility (e.g., warehouses) location and product placement.

The inventory placement problem can be viewed as an extension of the facility location problem and a special case of the multi-commodity capacitated fixed-charge network flow problem. The problem varies by the number of commodities (i.e., single-commodity, multi-commodity), capacity (i.e., uncapacitated, node-capacitated or arc-capacitated), fixed charge (i.e., charge on nodes, edges or both), and network structure (e.g., two-echelon, multi-echelon).

There has been extensive research on topics related to inventory placement. Earlier work includes the single-commodity uncapacitated facility location problem (simple plant location problem or SPLP) (Jakob and Pruzan 1983; Cornuejols et al. 1977), and the multi-commodity location problem (Warszawski and Peer 1973; Karkazis and Boffey 1981; Revelle and Laporte 1996), which extends the SPLP. Capacity constraints, on nodes, edges or both, impose additional challenges for the solution procedures. Various techniques have been used for solving the problem, such as greedy heuristics, the dual ascent method, and Benders decomposition. Sridharan (1995) provides a survey of these solution techniques. Melkote and Daskin (2001) proposed the capacitated facility location/network design problem to bridge the gap between the problems of facility location and network design. The multi-commodity capacitated fixed-charge network flow problem is a more general formulation than the aforementioned approaches. Gendron (2011), Gendron and Larose (2014), and Melo et al. (2009) provide comprehensive reviews of the exact solution approaches, including cutting plane methods, Benders decomposition, and Lagrangian relaxation, and of recent advances on this problem. To capture other practical considerations, researchers have been making efforts to incorporate multi-period dynamics (Melo et al. 2006), stochasticity of demand or costs (Snyder 2006), and nonlinear cost structures.

Another related topic of broad and current interest is the placement of specific types of goods, known as product placement. It determines the set of SKUs (i.e., products) to store in a specific location (i.e., distribution center) at a given time so as to better serve customers. It can also be viewed as an assortment problem. The assortment planning problem has been extensively studied and remains an active research area. Reviews of the state of the art on assortment planning are provided in Mahajan and van Ryzin (1999) and Kök et al. (2008). The latter also summarizes the current approaches used in industry practice for inventory placement. A typical approach in the literature is to formulate the assortment problem as an optimization problem that selects a subset of products maximizing the revenue captured at a single store/location. To account for the interaction between assortment and demand, a demand model, such as the multinomial logit model, exogenous demand, or locational choice models, is used to capture consumer behavior. For uncapacitated assortment problems, algorithms that rank products by margin under the multinomial logit (MNL) discrete choice model are available in Talluri and Van Ryzin (2004), Gallego et al. (2004), and Liu and Van Ryzin (2008). Later, Schön (2010) and Wang (2012) studied the assortment problem with a capacity constraint imposed. In contrast to the static view of the assortment planning problem, Rusmevichientong et al. (2010) and Caro and Gallien (2007) addressed the dynamic assortment problem, in which the assortment selection is revised as time elapses.
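
The ranking idea for the uncapacitated MNL problem can be sketched as follows: products are sorted by per-unit revenue (or margin), only the nested "top-k" assortments are evaluated, and the best one is kept. The expected revenue of an assortment S is taken as sum_{i in S} r_i v_i / (v_0 + sum_{i in S} v_i), where v_i are preference weights and v_0 is the no-purchase weight. The product names, revenues and weights below are made up, and the brute-force routine is included only to check the ranking result on this small instance.

```python
from itertools import combinations

def mnl_expected_revenue(assortment, revenue, weight, v0=1.0):
    """Expected revenue per customer under MNL for the offered assortment."""
    denom = v0 + sum(weight[i] for i in assortment)
    return sum(revenue[i] * weight[i] for i in assortment) / denom

def revenue_ordered_assortment(revenue, weight, v0=1.0):
    """Evaluate the nested assortments {top-1}, {top-2}, ... and keep the best."""
    ranked = sorted(revenue, key=revenue.get, reverse=True)
    best, best_rev = [], 0.0
    for k in range(1, len(ranked) + 1):
        rev = mnl_expected_revenue(ranked[:k], revenue, weight, v0)
        if rev > best_rev:
            best, best_rev = ranked[:k], rev
    return best, best_rev

def brute_force_assortment(revenue, weight, v0=1.0):
    """Enumerate every non-empty subset; only viable for a handful of products."""
    products = list(revenue)
    subsets = (c for k in range(1, len(products) + 1) for c in combinations(products, k))
    best = max(subsets, key=lambda s: mnl_expected_revenue(s, revenue, weight, v0))
    return sorted(best), mnl_expected_revenue(best, revenue, weight, v0)

# Hypothetical products: per-unit revenue and MNL preference weight.
revenue = {"A": 10.0, "B": 8.0, "C": 6.0, "D": 3.0}
weight = {"A": 0.5, "B": 1.0, "C": 1.5, "D": 2.0}
chosen, chosen_rev = revenue_ordered_assortment(revenue, weight)
```

On this instance the popular but low-revenue product D is left out, since adding it would divert purchases from higher-revenue products.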

4.2 Industry practice

Inventory management has long been the centerpiece for retailers. In this section, we present industry approaches to the inventory management problem regarding the following aspects. As will be seen, these companies take significantly different approaches and emphasize different aspects of the inventory management problem.

4.2.1 Replenishment

The idea of safety stock has been popular for decades in industry as an inventory replenishment method. In 1994, IBM started to develop the Asset Management Tool (AMT), aiming to achieve quick responsiveness to customers with minimal inventory (Lin et al. 2000). In addition, IBM has developed a software tool named the IBM Supply Chain Simulator (SCS) to help with making strategic business decisions about the design and operation of its supply chain, which includes a function for determining safety stocks for the modeling of distribution (Bagchi et al. 1998).

Intel uses a periodic review model in its multi-echelon inventory optimization (MEIO) system, where inventory targets are reviewed weekly with up-to-date sales and inventory information. The inventory targets are set based on a weeks-of-inventory (WOI) policy by product family. The inventory target is then fed into an Intel-developed advanced-planning and scheduling (APS) optimizer that minimizes production costs, lost-sales costs, and costs for deviating from the inventory targets (Manary and Willems 2008).

Walmart has been collaborating with P&G to deploy a continuous review replenishment policy in which they continuously monitor inventory levels using RFID technology and automatically replenish their inventory when levels go below the safety stock level (Kosasi et al. 2014). Similarly, Amazon.com tracks its inventory position in real time based on warehouse receipts and shipments, and places purchase orders to vendors for the forecasted amount needed minus the current on-hand inventory in the warehouse (Zeppieri 2004). Chiles and Dau (2005) provide a detailed description of the replenishment process at Walmart and Amazon.

Multi-echelon inventory management is increasingly incorporated into the practice of supply chain management. For instance, ideas along this line have been successfully applied at companies such as IBM (Lin et al. 2000), Eastman Kodak (Graves and Willems 2000), and Philips Electronics (de Kok et al. 2005).

At JD.com, a sophisticated inventory control system is implemented with a system of inventory replenishment models. The system selects the appropriate models based on product characteristics as well as business requirements. For example, products with low daily sales may be treated using a dedicated model to account for their higher relative variability, while high-volume items are generally driven by continuous or periodic review models. In general, the replenishment models are formulated as optimization problems minimizing total cost, including inventory holding cost and stock-out cost.

JD.com utilizes a fulfillment network structure that resembles a multi-echelon inventory system with an RDC-FDC design. Recently, approximation algorithms have been proposed for jointly solving the replenishment (to RDCs) and allocation (from RDCs to FDCs) problems (Qi et al. 2018). This work extends the seminal work of Levi et al. (2007), which proposed the dual-balancing policies. The algorithm is validated with actual JD.com operation data and provides high computational efficiency by leveraging two novel techniques: marginal cost accounting and cost balancing.

4.2.2 Transshipment

Transshipment is a vital and growing part of inventory management in practice. Transshipment allows for more efficient use of network resources and reduces costs of fulfillment and replenishment throughout the entire inventory network. Amazon uses a transshipment method called lateral transshipment, which deals with the movement of goods within the same echelon of the supply chain. Three transshipment types exist at Amazon, i.e. Customer Order (reactive), Network Inventory Re-balancing (proactive) and Domestic Cross-dock (proactive). Reactive transshipment is primarily used to reduce fulfillment cost, while proactive transshipment is used for cost savings on replenishment to fulfillment centers. Young (2016) details the key steps of the transshipment process at Amazon, which include inventory, transshipment pick, merge/palletize, outbound dock, transportation, inbound dock, receive, and stow. The same process covers both reactive and transfer (proactive) transshipment.

JD.com is well known for its super-fast delivery, thanks to its smart inventory allocation mechanism. It features same- and next-day delivery as a standard, allowing customers to receive same-day delivery for orders placed before 11am and next-day delivery by 3pm for those placed before 11pm. More than 90% of orders on JD.com are delivered same-day or next-day. Due to the fast delivery speed requirement, reactive transshipment, although it still exists in JD.com's network, accounts for a smaller portion of the overall transshipment volume. In contrast to Amazon’s lateral transshipment, JD.com owns a more complex distribution network and uses a multi-step inventory allocation algorithm. Specifically, they calculate the target inventory level for each product and then select a subset of products to transship via optimization. This innovative allocation strategy has led to visible improvement in the stock-out rate at the distribution centers and in the proportion of orders meeting the same- and next-day delivery standards.

4.2.3 Inventory placement

Compared to the more operational and detail-oriented approaches in academia, industry uses a more strategic and holistic approach for assortment planning.

As reviewed by Kök et al. (2008), in the retailing industry, one common approach to assortment planning is for corporate headquarters to decide on a single common assortment that is carried by all stores of the chain (e.g., Best Buy), except that in smaller stores, the breadth of the assortment may be reduced by removing some of the least important SKUs. In contrast, a few retailers (e.g., Bed Bath & Beyond) allow their store managers to manage the assortment on their own. One example that follows a hierarchical approach to assortment planning is Albert Heijn, BV, a leading supermarket chain in the Netherlands. Specifically, they first divide SKUs into categories, and store space is allocated to each category accordingly. Subsequently, they carry out product selection and facing allocation to products as detailed in Kök et al. (2008).

In contrast to academic studies, product placement for industry practitioners is an even more intricate decision that requires many more practical issues to be considered and/or customized to their specific business needs. For instance, as a strategic necessity, Best Buy would carry a rarely demanded product just to maintain consumer perception of it as offering the latest lines of technological products. Retailers are also known to frequently carry unprofitable products in order to strategically drive growth or the overall profitability of a whole category. In addition, Best Buy takes into account various other supply chain factors, such as vendor relations, vendor performance and the scope of products offered by a vendor, when developing its assortment plans (Kök et al. 2008).

At JD.com, there are two types of warehouses or distribution centers, namely regional distribution centers (RDC) and forward distribution centers (FDC). The former have larger capacities and tend to store a comprehensive range of SKUs, including slow-moving products, and also serve as the suppliers for the latter. The latter, on the other hand, have limited capacities, are geographically closer to customers, and usually store mostly high-selling SKUs in order to fulfill orders in a timely manner. Delivery time is a primary feature that customers pay attention to and thus an important metric for e-commerce, so e-retailers seek to cut their delivery time in order to be more attractive to customers. As a result, JD.com needs to maximize the total number of orders that are shipped from FDCs so as to benefit from fast shipping. If an order cannot be satisfied by an FDC (e.g., one product is missing in the FDC), it may be routed to the closest RDC, and the delivery is consequently subject to delay. To make things more complicated, it is normally preferable to have an order with multiple items delivered together. As a result, simply placing the best sellers in an FDC may not make the best use of its capacity. Since a significant portion of orders contain more than 2 products, it is necessary to group items with similar attributes that sell together in the same FDC to avoid splitting orders into multiple packages.

Shi (2018) presented one example of the solutions used for inventory placement at JD.com. To fully leverage the fast delivery from FDCs, they aim to maximize the number of orders that can be fulfilled entirely from the FDCs and minimize the occurrences of order splits. As this is a large-scale optimization problem, a practical solution procedure is to use heuristics that rank orders by their popularity. Specifically, they use the SKU2Vec algorithm, which is motivated by Google’s Word2Vec (Mikolov et al. 2013), to compute a latent vector for each SKU, which is then used to model its closeness to another SKU. An end-to-end neural network framework is then used to make inventory assortment decisions by directly capturing the co-purchase relationship between products, reducing order splits by about 2% (i.e., 2 million fewer split packages per year) compared to a benchmark Greedy Ranked algorithm.
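
The limitation of a purely popularity-based ranking can be seen on a toy example. The sketch below is not the SKU2Vec method or the exact benchmark of Shi (2018); it is an assumed, simplified greedy ranking that fills a hypothetical FDC with the most frequently ordered SKUs and compares the number of orders fully served from the FDC against the best SKU set found by exhaustive enumeration. SKU names, orders and the capacity are made up.

```python
from collections import Counter
from itertools import combinations

def fully_covered(orders, placed):
    """Number of orders whose items are all stocked in the FDC (no split needed)."""
    return sum(1 for order in orders if order <= placed)

def greedy_ranked(orders, capacity):
    """Assumed benchmark: stock the `capacity` SKUs that appear in the most orders."""
    counts = Counter(sku for order in orders for sku in order)
    return {sku for sku, _ in counts.most_common(capacity)}

def best_by_enumeration(orders, capacity):
    """Exhaustive search over SKU sets; only feasible for tiny instances."""
    skus = sorted({sku for order in orders for sku in order})
    return max(
        (set(c) for c in combinations(skus, capacity)),
        key=lambda placed: fully_covered(orders, placed),
    )

# Hypothetical order history: "a" and "d" sell often but always with other items,
# while "b" and "c" sell less often but always together.
orders = [
    {"a", "x"}, {"a", "y"}, {"a", "z"},
    {"b", "c"}, {"b", "c"},
    {"d", "w"}, {"d", "v"}, {"d", "u"},
]
greedy_set = greedy_ranked(orders, capacity=2)
best_set = best_by_enumeration(orders, capacity=2)
```

Here the popularity ranking stocks "a" and "d" and serves no order completely, whereas stocking the co-purchased pair "b" and "c" serves two, which is the kind of co-purchase structure that embedding-based approaches try to capture at scale.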

Alternatively, a data-driven graph-based algorithm that outputs subsets of products to be placed at FDCs has also been proposed at JD.com. This is achieved by sampling batches of orders and aggregating the solutions of a selection problem solved with parametric cut minimization. This approach is demonstrated to outperform the former inventory placement strategy (Jehl et al. 2018).

5 Order fulfillment

Order fulfillment is the process of accepting, processing, and delivering customer orders. Order fulfillment is a crucial component contributing to the customer experience, as well as an important aspect of cost control in the supply chain. For a large retailer like JD.com, the order fulfillment decision is among the most difficult problems given that there are billions of products in its large-scale and multi-stage inventory network.

5.1 Literature review

Order fulfillment has become a popular research topic since e-commerce entered the main stage of retail business. In the era of e-commerce, orders placed online can be delivered to customers within a few days, sometimes even the same day. The scale and complexity of the online order fulfillment system give rise to a broad range of research opportunities. In this section, we review three main technical aspects of the order fulfillment process, namely Fulfillment Optimization, Order Prioritization, and Fulfillment Flexibility.

Fulfillment Optimization is the problem at the center of order fulfillment. When a customer places an order online, the retailer needs to decide from where to ship the products to the customer destination. Unlike in traditional retailing, customers do not select where the products should be shipped from, as the logistic process of order fulfillment is hidden from the customer after the order has been placed. This allows for greater flexibility in selecting the best location and transportation method to ship the products so as to meet customer requirements while minimizing the fulfillment cost. Acimovic and Graves (2017) consider the fulfillment problem of minimizing shipping costs for a single SKU in a large distribution network, and propose an LP-based policy that incorporates the forecast of future orders. Xu et al. (2009) consider a batch optimization policy in which each single order fulfillment decision is delayed until the fulfillment for a batch of orders is solved together using an integer program. Jasin and Sinha (2015) consider a problem similar to Xu et al. (2009) that incorporates different costs of bundled shipments.
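
The benefit of delaying decisions and optimizing over a batch can be illustrated with a deliberately small single-SKU example. The sketch below is not the formulation of Xu et al. (2009); it contrasts an assumed myopic rule (each order goes to the cheapest warehouse that still has stock) with a brute-force search over all feasible joint assignments, which stands in for the integer program. Warehouse names, stock levels and shipping costs are hypothetical, and total stock is assumed to cover all orders.

```python
from itertools import product

def myopic_assign(orders, inventory, cost):
    """Assign orders one at a time to the cheapest warehouse that still has stock."""
    stock = dict(inventory)
    plan, total = {}, 0.0
    for o in orders:
        candidates = [w for w in stock if stock[w] > 0]  # assumes at least one exists
        w = min(candidates, key=lambda wh: cost[o][wh])
        stock[w] -= 1
        plan[o] = w
        total += cost[o][w]
    return plan, total

def batch_assign(orders, inventory, cost):
    """Jointly assign a batch by enumerating all feasible plans (tiny instances only)."""
    warehouses = list(inventory)
    best_plan, best_total = None, float("inf")
    for choice in product(warehouses, repeat=len(orders)):
        if any(choice.count(w) > inventory[w] for w in warehouses):
            continue  # plan would ship more units than a warehouse holds
        total = sum(cost[o][w] for o, w in zip(orders, choice))
        if total < best_total:
            best_plan, best_total = dict(zip(orders, choice)), total
    return best_plan, best_total

# Hypothetical data: one unit per order, one unit of stock per warehouse.
orders = ["o1", "o2"]
inventory = {"W1": 1, "W2": 1}
cost = {"o1": {"W1": 4.0, "W2": 5.0}, "o2": {"W1": 2.0, "W2": 9.0}}
myopic_plan, myopic_cost = myopic_assign(orders, inventory, cost)
batch_plan, batch_cost = batch_assign(orders, inventory, cost)
```

In this instance the myopic rule uses the last unit at W1 for the first order and is then forced into an expensive shipment for the second, while the batch plan accepts a slightly costlier shipment for the first order to lower the total.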

Besides solving the fulfillment optimization problem for a given level of customer requirement, another stream of research focuses on policies that can improve fulfillment performance by differentiating customer requirements, known as Order Prioritization. The idea can be traced back to the inventory rationing policies that use limited inventory to fulfill multiple classes of customer demand, discussed in Topkis (1968), Kaplan (1969), Veinott (1965), and Nahmias and Demmy (1981). In this stream of literature, customers are prioritized, and the inventory system is allowed to either back-order or lose demand from low-priority customers in order to fulfill future demand from high-priority customers. In the context of e-commerce, it is not easy to differentiate customers directly, as usually all customers purchase products from a single website. However, differentiating customer orders by different levels of desired delivery-time standard is feasible. Cattani and Souza (2002) investigate rationing in a direct marketing environment (similar to online retailing), where customers may pay a higher fulfillment fee to reduce their delivery times. The integration of e-commerce and omni-channel retailing also brings new research opportunities. As offline customers usually have priority over online customers in purchasing products, Karp (2017) discusses a protection level policy to ensure the right amount of inventory is reserved for the offline customers. Govindarajan et al. (2018) discuss a joint decision on the inventory and fulfillment policies that optimizes the total costs. Agatz et al. (2008) provide an excellent review of other related omni-channel research.
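
A threshold (protection level) rationing rule of the kind discussed in this stream of literature can be sketched as follows. This is a minimal, assumed version in which unmet requests are simply lost: low-priority requests are served only while doing so leaves at least the protected quantity on hand, whereas high-priority requests may draw on all remaining stock. The request sequence and protection level are hypothetical.

```python
def ration(on_hand, requests, protection_level):
    """Serve (priority, quantity) requests in arrival order under a protection level.

    Returns a list of booleans (served or not) and the remaining stock.
    """
    served = []
    for priority, qty in requests:
        if priority == "high":
            ok = qty <= on_hand  # high priority may use the protected stock
        else:
            ok = on_hand - qty >= protection_level  # low priority must leave the reserve
        if ok:
            on_hand -= qty
        served.append(ok)
    return served, on_hand

# Hypothetical arrivals: two low-priority requests followed by two high-priority ones.
requests = [("low", 2), ("low", 2), ("high", 3), ("high", 1)]
with_reserve, left_with_reserve = ration(5, requests, protection_level=2)
no_reserve, left_no_reserve = ration(5, requests, protection_level=0)
```

With a reserve of two units, the second low-priority request is turned away so that the larger high-priority request can still be met; without a reserve, the opposite happens.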

Fulfillment Flexibility

strategies approach order fulfillment from a network design point of view. Inventories of the same product are stored in multiple storage locations in the network so as to provide “flexibility” for order fulfillment. A higher degree of flexibility allows for more options in satisfying customer orders, which eventually leads to lower order fulfillment costs because fewer orders are split and fewer stock-outs occur at nearby storage locations. Designing a flexible fulfillment network is related to the concept of process flexibility developed by Jordan and Graves (1995). DeValve et al. (2018) extend the flexibility concept to online retailing by investigating a threshold policy whose structure is motivated by the protection levels in revenue management that originated with Littlewood (1972). Acimovic and Graves (2017) show how traditional decentralized allocation policies may perform sub-optimally and induce dynamics (whiplash) that result in costly spillover. The fulfillment flexibility literature also shares some features with models from the inventory transshipment literature. Paterson et al. (2011) show that flexibility can be created by transshipping inventory among the facilities both proactively and reactively. Axsäter (2003) develops a decision rule dictating whether to transship or to incur the back-order costs. Yang and Qin (2007) discuss a model that utilizes virtual lateral transshipment between two factories. Archibald et al. (2009) develop a transshipment heuristic for a realistic multi-location inventory system.
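The value of limited flexibility can be seen in a small sketch inspired by the chaining idea of Jordan and Graves (1995): each facility serves its own region plus one neighboring region, so that the links form a closed chain. The facility and region indices, the supply and demand figures, and the simple augmenting-path max-flow routine below are all illustrative assumptions.

```python
from collections import defaultdict


def dedicated_links(n):
    """Each facility i can serve only region i."""
    return {(i, i) for i in range(n)}


def chain_links(n):
    """Each facility i can serve region i and region (i + 1) mod n."""
    return dedicated_links(n) | {(i, (i + 1) % n) for i in range(n)}


def max_fulfilled(links, supply, demand):
    """Maximum units that can be fulfilled, computed as a max flow
    source -> facility -> region -> sink with augmenting paths (DFS)."""
    cap = defaultdict(lambda: defaultdict(int))
    for f, s in enumerate(supply):
        cap["src"][("f", f)] = s
    for r, d in enumerate(demand):
        cap[("r", r)]["snk"] = d
    for f, r in links:
        cap[("f", f)][("r", r)] = sum(supply)  # the link itself is not a bottleneck

    def augment(node, limit, seen):
        if node == "snk":
            return limit
        seen.add(node)
        for nxt in list(cap[node]):
            residual = cap[node][nxt]
            if residual > 0 and nxt not in seen:
                pushed = augment(nxt, min(limit, residual), seen)
                if pushed:
                    cap[node][nxt] -= pushed
                    cap[nxt][node] += pushed  # residual (reverse) capacity
                    return pushed
        return 0

    total = 0
    while True:
        pushed = augment("src", float("inf"), set())
        if not pushed:
            return total
        total += pushed


# Hypothetical instance: three facilities with 10 units each, unbalanced regional demand.
supply, demand = [10, 10, 10], [15, 10, 5]
print(max_fulfilled(dedicated_links(3), supply, demand))  # 25
print(max_fulfilled(chain_links(3), supply, demand))      # 30
```

With dedicated links, five units of demand in the first region go unserved while five units sit idle in the third facility; adding one extra link per facility is enough to serve all demand in this instance.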

5.2 Industry practice

After fast growth over the past ten years, large e-commerce retailers such as Amazon.com and JD.com have established highly efficient order fulfillment processes. In this section, we introduce the framework of a general order fulfillment engine and the network structure on which it is built.

5.2.1 Order fulfillment engine

The order fulfillment process usually starts with Order Approval. A fraud detection procedure is conducted to identify potential risks. This is typically done with a classification algorithm. In the case of a high-risk transaction, a human verification procedure may be involved to further analyze the risk. After the order is verified, it is dispatched to the warehouses for processing. This step is usually controlled by an Order Management System (OMS). The fulfillment optimization algorithms play a key role in this step. The algorithm needs to take into account all the realistic constraints, such as inventory availability, delivery cut-off times, warehouse processing capacity, outbound transportation capacity, and the specific features of the order, and then minimize the overall fulfillment cost. For various reasons, an order consisting of multiple products may be split across multiple warehouses for processing. As a result, the customer may receive multiple packages for a single order. This Order Dispatching step usually takes a short time and is processed without human intervention. The next key process is Picking, Assembling, and Packing at the warehouses. This step is usually controlled by a Warehouse Management System (WMS), which ensures that the orders dispatched to the warehouse can be completed before the outbound due times while maximizing operational efficiency. The packed orders are then loaded onto trucks, waiting to be shipped. Depending on the distance from the warehouse to the customer, different transportation methods may be involved in the shipment, ranging from air to motorcycles. The order fulfillment is complete once the final-mile delivery of the package to the customer succeeds.
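The order-splitting behavior of the dispatching step can be sketched with a simple greedy heuristic. This is a hypothetical illustration rather than a description of any retailer's OMS: it considers only inventory availability and a flat per-warehouse shipping cost, and ignores cut-off times and capacity constraints. All names are assumptions.

```python
def units_covered(remaining, stock):
    """Units of the outstanding order that one warehouse's stock can cover."""
    return sum(min(qty, stock.get(sku, 0)) for sku, qty in remaining.items())


def dispatch(order, inventory, ship_cost):
    """Greedily split an order across warehouses.

    At each step, pick the unused warehouse that covers the most outstanding
    units (ties broken by lower shipping cost), so as to keep the number of
    packages small. Returns {warehouse: {sku: quantity}}.
    """
    remaining = dict(order)
    shipments = {}
    while any(qty > 0 for qty in remaining.values()):
        candidates = [
            w for w in inventory
            if w not in shipments and units_covered(remaining, inventory[w]) > 0
        ]
        if not candidates:
            raise ValueError("order cannot be fully fulfilled from available inventory")
        best = max(
            candidates,
            key=lambda w: (units_covered(remaining, inventory[w]), -ship_cost[w]),
        )
        shipments[best] = {}
        for sku, qty in list(remaining.items()):
            take = min(qty, inventory[best].get(sku, 0))
            if take:
                shipments[best][sku] = take
                remaining[sku] = qty - take
    return shipments


# Hypothetical order that no single warehouse can fulfill on its own.
order = {"phone": 1, "case": 2}
inventory = {"W1": {"phone": 1}, "W2": {"case": 2}, "W3": {"case": 1}}
ship_cost = {"W1": 5.0, "W2": 4.0, "W3": 3.0}
print(dispatch(order, inventory, ship_cost))
# {'W2': {'case': 2}, 'W1': {'phone': 1}}
```

The customer in this example receives two packages, one from each selected warehouse.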

Product Return

is an important aspect of customer experience. Most companies provide a convenient return policy with little or no additional return shipment fee. To provide more flexibility to the customer, many companies also apply a free Cancellation Window policy, which allows customers to cancel a purchase before a cut-off time. After the cancellation window, customers who want to cancel an order can only do so by waiting for the goods to arrive and then mailing the package back. To further improve customer experience, e-commerce retailers are trying to extend the cancellation window until the delivery of a package. For example, JD.com uses a concept called Cool-Down Period (Wu 2018) to reduce the cost of order cancellation. For each order, a cancellation probability is calculated based on the nature of the order and the customer placing it. An order may be put on hold for a cool-down period before it is dispatched to the warehouses for picking. The cool-down period is longer if the order is associated with a high cancellation probability.
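The relationship between cancellation probability and hold time can be sketched as follows. The linear mapping, the `max_hold` parameter, and the cap by the slack before the dispatch deadline are assumptions made purely for illustration; the source only states that the cool-down period grows with the cancellation probability.

```python
def cool_down_minutes(cancel_prob, max_hold=60.0, slack=None):
    """Hypothetical cool-down rule: hold time grows linearly with the
    predicted cancellation probability.

    cancel_prob: predicted probability in [0, 1] that the order is cancelled.
    max_hold:    hold time (minutes) for an order that is certain to be cancelled.
    slack:       optional minutes left before the order must be dispatched to
                 meet its delivery promise; the hold never exceeds it.
    """
    if not 0.0 <= cancel_prob <= 1.0:
        raise ValueError("cancel_prob must be between 0 and 1")
    hold = cancel_prob * max_hold
    if slack is not None:
        hold = min(hold, max(slack, 0.0))
    return hold


print(cool_down_minutes(0.5))            # 30.0
print(cool_down_minutes(0.9, slack=20))  # 20
```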

The recent integration of online and omni-channel retailing creates new opportunities for order fulfillment. On one hand, when an omni-channel store holds inventory of the product that a customer orders online, the order may be shipped to the customer directly from the store, which reduces the delivery time. On the other hand, having multiple omni-channel stores as potential sources for order fulfillment increases the number of choices for order dispatching and thus the complexity of the fulfillment algorithm. For example, with its expansion into omni-channel retailing, including JD Home for 3C and home appliances, 7Fresh for food and groceries, and the collaboration with Walmart Stores (Shan 2018), JD.com has built a fulfillment system that selects the most cost-efficient fulfillment methods to meet customer requirements by leveraging both online and offline resources.

5.2.2 Order fulfillment network

The design of the order fulfillment network can be generally categorized as either a Fully-Connected Distribution Network or a Hierarchical Distribution Network. A fully-connected distribution network refers to a fulfillment system in which each inventory storage facility (a warehouse or a depot) can be used to fulfill an order from anywhere in the network. This type of network is exemplified by Amazon.com and Walmart.com, the two largest e-commerce retailers in the US. In a fully-connected distribution network, the inventory of a product is stored in many storage facilities, so the order fulfillment decision can be very flexible: there are many ways to fulfill an order from different storage facilities. While this flexibility guarantees that an order can be fulfilled as long as there is inventory somewhere in the network, it increases the transportation cost, as orders are more likely to be shipped over longer distances and even by air in order to meet the delivery promise. To avoid high transportation costs, the fulfillment decision should not only look at the direct fulfillment cost (usually composed of transportation and processing costs) but also consider future customer demand in order to fulfill orders strategically. Acimovic and Graves (2017) provide a good review of the current industry practice from an academic point of view.

In contrast to a fully-connected distribution network, a hierarchical distribution network is composed of multiple levels of inventory storage facilities. When a product is out of stock at the lower-level storage facility, the demand is fulfilled by a higher-level storage facility, which is usually much farther from the customer. This hierarchical distribution network is typified by JD.com. Products need to be carefully placed (Shi 2018) at each tier of the network in order to maximize the utilization of the storage capacity. The risk of inventory stock-out at the lower-level storage facility needs to be carefully managed by frequent inventory transshipment (Ma et al. 2018) from the upper-level storage facility. The advantage of a hierarchical distribution network is its high efficiency. As the lower-level storage facility is usually close to the customers, it enables ultra-fast delivery. A hierarchical system is also easier to manage, as the inventory transshipment flow among the storage facilities is usually unidirectional.
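The fallback logic and the one-way transshipment flow of a hierarchical network can be sketched as below. The facility names, the two-tier structure, and the top-up-to-target replenishment rule are hypothetical and chosen only for illustration.

```python
def fulfill(stock, path, qty):
    """Fulfill from the first facility on `path` (ordered from the lowest,
    closest level to the highest) that holds enough stock.

    Returns the facility used, or None if no level can fulfill the order.
    """
    for facility in path:
        if stock.get(facility, 0) >= qty:
            stock[facility] -= qty
            return facility
    return None


def replenish_from_parent(stock, child, parent, target):
    """Unidirectional transshipment: top the child up toward `target`
    using whatever the parent has available. Returns the units moved."""
    shortfall = max(target - stock.get(child, 0), 0)
    move = min(shortfall, stock.get(parent, 0))
    stock[child] = stock.get(child, 0) + move
    stock[parent] = stock.get(parent, 0) - move
    return move


# Hypothetical two-tier network for one SKU.
stock = {"front_warehouse": 1, "regional_dc": 10}
path = ["front_warehouse", "regional_dc"]

print(fulfill(stock, path, 2))   # regional_dc (the front warehouse is short)
print(replenish_from_parent(stock, "front_warehouse", "regional_dc", target=5))  # 4
print(fulfill(stock, path, 2))   # front_warehouse
```

Because inventory only moves downward, the replenishment decision for each lower-level facility depends on a single parent, which is part of what makes such a system easier to manage.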

6 Future challenges

In this section, we discuss several challenges that retailers still face today in the area of supply chain management. As the retail business evolves over time, we see new challenges as well as new methodologies emerging in practice, which could provide insights into future research directions in supply chain management.

6.1 From supply chain to supply network

The term supply chain was originally coined to describe the chain of activities involved in moving a product or service from supplier to customer. However, as technologies and business scenarios evolve, the supply “chain” is also evolving into a supply “network” in which the different parties within the system form tighter interconnections.

As described in Section 5, the real fulfillment networks used by large-scale retailers have grown into very complex structures (JD operates more than 550 warehouses nationwide in China (JD.com 2018b); Amazon operates 75 fulfillment centers within North America (Amazon.com 2018); Walmart and Sam’s Club combined have 175 fulfillment centers in the United States (MWPVL International Inc. 2018)). A modern fulfillment network may include a tiered network structure similar to the classic multi-echelon design. However, the network topology could also be subject to additional constraints such as the availability of transshipment links or the compatibility of product types. For example, JD.com’s fulfillment network is a superset of 6 networks, e.g. small-to-medium sized warehousing, oversized warehousing, cross border, cold chain delivery, frozen and chilled warehousing, B2B and crowdsourcing logistics (JD.com 2018a), each with its own capability and capacity. These networks can share the same transportation or fulfillment facilities in certain sections of the supply chain process while requiring specialized channels in other sections. For example, some products in the cold chain delivery network can be delivered through the regular delivery channel (shared with other networks) using specialized packaging, but these products must be stored in refrigerated warehouses. At the same time, oversized products have their own last-mile delivery mechanism but can share the same or similar storage facilities as small-to-medium sized items. How to utilize these complex networks efficiently by sharing facilities and capacities still requires extensive research.

Crowdsourcing has also grown to be a viable approach to supplement, or even act as the backbone of, the transportation and fulfillment system (Rougès and Montreuil 2014). Companies such as Amazon.com and Walmart have all established some type of crowd-sourced delivery solution within the last several years. JD.com and Walmart recently invested $500 million in Dada-JD Daojia, a crowd-sourced delivery platform in China with operations in more than 400 major cities (Choudhury 2018). Such a delivery approach makes the supply chain system more scalable, but it also introduces additional uncertainty in both reliability and cost. Many questions remain to be answered, especially when in-house fulfillment capacity is used alongside crowd-sourced capacity.

Furthermore, following the recent push toward omni-channel retailing (Verhoef et al. 2015), it is increasingly common for retailers to use the shelf inventory of physical stores to fulfill online orders as well. Some issues and existing research are summarized by Hübner et al. (2016) and Melacini et al. (2018). Through its recent collaboration with Walmart (Shan 2018), JD.com has gained access to the inventory of Walmart’s local stores. This provides a better customer experience through both better selection and faster delivery. However, it also complicates inventory management and fulfillment decisions for both JD.com and Walmart. How to efficiently manage inventory across the fulfillment network is still a challenging problem for all practitioners.

6.2 From managing supply to managing demand

Supply chain management is now not just about managing supply but also about managing demand. Recent advances in search and recommendation algorithms have made it possible to manage demand and supply simultaneously to maximize a retailer’s revenue and/or profit (Bernstein et al. 2015; Chen et al. 2016). Inventory markdowns (Lazear 1986) have long been a common practice for retailers in managing products, especially seasonal products. But the gap between revenue management and inventory management still exists despite efforts by researchers (Elmaghraby and Keskinocak 2003).

The recent rise of group purchase (Liang 2018) in China also creates new challenges in supply chain management: massive demand is generated within a very short period, putting pressure on both inventory management and logistics. With the help of social networks, group purchases frequently form highly localized patterns, creating geographic hot spots of product demand. As group purchase products are normally limited-time promotions, these hot spots are concentrated in both space and time, making inventory and fulfillment increasingly difficult to manage.

6.3 Big data, small data

With the growth of online marketplaces as well as digitization in offline stores, retailers now have extensive data to work with in order to determine the best management strategy. With the rise of data availability, there is a clear trend toward data-driven approaches in supply chain management research. However, how to effectively utilize such data to make better supply chain management decisions is still not clear to many retailers. Further, the amount of data available to supply chain management is still considerably smaller than that in more popular AI areas such as image or natural language processing. For example, current efforts on reinforcement-learning-based replenishment algorithms (Sui et al. 2010; Chaharsooghi et al. 2008) still need a relatively large number of iterations to converge to an optimal policy, which may be unrealistic for practical implementation. Although there are studies (Shi et al. 2018; Zhang and Gao 2017; Oroojlooyjadid et al. 2016) on using deep learning to solve inventory problems in supply chains, few have been able to demonstrate performance in practical settings.

Besides the “big data” issue, e-commerce marketplaces have greatly increased the number of small players in the retail industry, who manufacture or manage only a small set of SKUs or cater to a particular customer base. Such retailers face a huge disadvantage in the era of big data. How to effectively utilize the “small data” available to these small players to help them make effective decisions is also an exciting problem to solve.