1 Introduction

With the popularity of Internet and the rapid development of Internet of Things (IoT), global data are increasing rapidly. Meanwhile, the development of Cloud technology and virtual storage will make the storage cost of Big Data reduce (Bin and Xin 2013; Cambria et al. 2013). All of these are an opportunity for the application and development of Big Data. Therefore, the era of Big Data has arrived. Under the Big Data background, data as the essential raw materials in the information economy society (Sanders 2015) have penetrated into every industry and affect the development of an industry.

Thus, many companies begin to obtain and use Big Data. Current researches have studied the advantages of Big Data application, and findings that using Big Data can help enterprises change their benefit model and business model (Barton 2012). In addition, using Big Data can also help companies improve production capacity and efficiency (McAfee and B. E. 2012). However, for companies, not all of them have sufficient ability to deal with such complex and large data. Therefore, many companies begin to outsource their Big Data project. The common method is to buy related Big Data information (BDI) from the related Data Company. With this trend development, the status of Data Company in a supply chain will become increasingly important. This change will lead to the change of supply chain structure, namely, Data Company should contained in a supply chain as a supply chain member.

Under the above conditions, to gain more valuable BDI, Data Company will pay extra money for Big Data technology and equipment. In addition, manufacturers will also pay extra money for gaining the related BDI. In this study, BDI in a manufacturer contained the internal part and the external part (the external BDI mainly refers to consumer preference information) and is chosen as the study object. However, Data Company and manufacturer want to know how to balance the relationships between the costs spending on Big Data and the revenues getting from Big Data usage. From the whole supply chain perspective, they want to know how to make the whole supply chain benefits maximize and what contract can coordinate the supply chain.

Therefore, aims of this paper are to solve the investment decision-making and coordination issues of a three stage supply chain considering Data Company in the Big Data environment. Although, many researches have discussed the investment decision-making and coordination problem of supply chain, they focus on a two-stage supply chain with one manufacturer and one retailer. Researches taking Data Company as a supply chain member are few, thus, in this paper, we will make up this gap.

In this study, we first discussed the importance of Data Company in a supply chain, and then described the related researches on the investment decision-making and coordination problems of a supply chain in the Big Data era. Based on the utility function theory, the demand function was built. Then, four benefit models were proposed and analyzed in the decentralized model and the centralized model. Finally, the revenue sharing contract was used to coordinate the decentralized supply chain. Results offer a theoretical guidance for Data Company to invest in Big Data technology, and it also provides a guidance for manufacturer to invest in consumer preference information. Meanwhile, it is a new development of supply chain decision-making and coordination theories in the Big Data era.

2 Literature review

2.1 Significance of Data Company in supply chain

In the Big Data era, the concept of data-driven supply chain has been proposed (Sanders 2015). In the data-driven supply chain, data as driving forces and raw materials are getting more and more attentions. However, for most enterprises, analyzing so large and complex data is a huge challenge. Thus, Data Company owning Big Data analysis capacity is becoming increasingly important. The specific reasons are as follows: (1) The IT department of the traditional enterprise is helpless in using Big Data because of natures of Big Data (Gantz et al. 2011; Xuelong and Gong 2015). It is difficult for the IT department of the traditional enterprise to use Big Data well because of the “volume” nature of Big Data. According to the forecast of IDC, in 2020, the number of global digital information will be 44 times today and most of these data are unstructured (Gantz et al. 2011). In addition, the update speed of these data is very fast, which also brings great challenges to the traditional enterprises (Wen-Lian and Xia 2013). Extracting the potential values of Big Data needs professional IT companies to use professional Big Data technology (Li et al. 2014). (2) Data Company has the ability to improve the data quality and protect data security (Liu and Yi 2016). Data quality will affect the ultimate value of Big Data and influence the usage of Big Data in a supply chain (Hazen et al. 2014). The unreliability of data sources is a vital factor in affecting data quality (Bowie and Jamal 2006). This unreliability usually is caused by consumer privacy protection behavior (Wen-Lian and Xia 2013). However, a Data Company owning the ability to protect consumer privacy will get more trust of consumers and they will offer their personal information to the Data Company.

Overall, Data Company owns the unmatched advantages in the processes of Big Data collection, storage, and analysis compared with the IT department of the traditional enterprise. In the future, Data Company will play an important role in offering BDI to other companies.

Thus, studying the investment decision-making problem of a supply chain should consider the role of Data Company in supply chain. Meanwhile, this problem’s study is also vital for Data Company to invest in Big Data.

2.2 Big Data and supply chain management

Following the arrival of Big Data age, the applications of Big Data in supply chain management are also discussed. However, most of them focus on the qualitative researches and the quantitative researches are few.

In qualitative researches, many researchers have discussed the applications of Big Data in supply chain (Chae and Olson 2013; Hazen et al. 2014; Trkman et al. 2010). In 2013, Waller and Fawcett (2013) discussed the vital of Big Data in supply chain management and invoked researchers to pay more attentions on Big Data usage in supply chain. To help firms use Big Data well, a model according to Ramco Cements Limited was proposed for carrying out Big Data in a company (Dutta and Bose 2015). In addition, an analysis model based on deduction chart way was presented to combine their competitive advantages with other firms (Tan et al. 2015). In addition, effects of Big Data (Bi and Cochran 2014) and its quality (Hazen et al. 2014) on enterprises information system were discussed. Fosso Wamba et al. (2015) proposed that enterprise should pay attention on the values (market segmentation, decision-making, demand discovery, producing new products, etc.) created by Big Data. In addition, in the decision-making process of Big Data application, the decision-making quality would be influenced by the qualities of data collection, analysis, and transmission (Janssen et al. 2016). Some researchers also discussed the effects of Big Data analysis ability on supply chain performance and organization benefits (Akter et al. 2016; Gunasekaran et al. 2016; Wamba et al. 2016). Lamba and Singh (2016) addressed the complexity related with natures of Big Data. According to these, three new Big Data models were proposed to solve the problems of joint procurement, procurement, and facility layout in supply chain management process. Nguyen et al. (2017) reviewed the new researches about the applications of Big Data analytics in supply chain management and got what Big Data analytic models and techniques were used in supply chain management. Shen and Chan (2017) reviewed the researches on the forecast information sharing problems in the Big Data age, and the obstacles and advantages for sharing information were discussed. Hofmann (2017) studied which natures of Big Data were good in reducing the bullwhip effect of supply chain. Kache and Seuring (2017) proposed the challenges and opportunities brought by using Big Data in supply chain management from an empirical research angle. Similarly, Arunachalam et al. (2017) reviewed the documents on the applications of Big Data analytics in supply chain management. Through reviewing previous researches, Lamba and Singh (2017) found that Big Data analytic methods were widely used in the processes of supply chain (manufacturing, logistics, and procurement).

In quantitative researches, Olama et al. (2014) proposed an integration level model to assess the swiftness of a company data system. Tambe (2014) called managers to pay attention on balancing relationships between the costs spending on labor and the revenues getting from Big Data. Shi and Li (2015) discussed the effects of privacy information leakage on supply chain member coordination relationship in the Big Data environment. Wu et al. (2016) presented the coordination strategies of a three-stage supply chain considering Big Data service provider. Based on geospatial data, Singh et al. (2016) proposed a method to solve the location-allocation decision problem of a cold chain. Similarly, Mishra and Singh (2016) proposed a model to get waste minimization strategies based on twitter data. See-To and Ngai (2016) proposed a demand distributional model based on sale Big Data analysis. To reduce the inherent risk of hazardous materials, Zhao et al. (2017) proposed a multi-objective optimization method. In this method, Big Data analytic method was used to get the related parameters, and three scenarios were discussed to optimize green supply chain management. Kaur and Singh (2017) proposed a model about sustainability procurement and logistics, and this model contains MINLP method (Mixed Integer Non Linear Program) and MILP method (Mixed Integer Linear Program) and the input data are from the related cost Data, trading data, and emissions data. Li and Wang (2017) discussed the revenues of the chilled food supply chain by using data driven pricing method. To make Big Data transform a manageable volume scope, Wu et al. (2017) proposed a method based on fuzzy and grey Delphi methods. Li et al. (2017) studied the pricing and production rules of a two-stage supply chain with one manufacturer and two suppliers.

However, researches focused on the investment decision-making and coordination problems of supply chain were few. To make up this gap, based a two-stage supply chain with one manufacturer and one retailer, Liu and Yi (2017) discussed the investment decision-making and coordination issues of supply chain on BDI in the information symmetry circumstance and the information asymmetry circumstance, finally, a revenue sharing contract was used to coordinate the supply chain. Furthermore, Liu and Yi (2016) chose a supply chain with one manufacturer and on retailer to discuss the investment conditions of BDI and used the quantity discount contract to coordinate the supply chain. Meanwhile, the significances of Data Company in future supply chain was stressed and they thought Data Company should be included in a supply chain. Although, they emphasized the importance of Data Company in a supply chain, in the discussion process of the investment decision-making and coordination issues of supply chain, Data Company was not considered. Meanwhile, whether the revenue sharing contract can achieve a three-stage supply chain coordinate is not discussed. Therefore, to make this gap, in this paper, Data Company and the revenue sharing contract will be considered to discuss the investment decision-making and coordination issues of BDI in a three-stage supply chain.

To resolve the above problems, four investment models were presented. (1) In the decentralized supply chain and the centralized supply chain, Data Company will not invest in Big Data technology to gain BDI and the manufacturer will also not invest in BDI. And the retailer will not invest in BDI. (2) In the decentralized supply chain and the centralized supply chain, Data Company will invest in Big Data technology to gain BDI and the manufacturer will also invest in BDI. And the retailer will not invest in BDI.

3 Model establishment

In the Big Data era, with the development of consumer living standards, the heterogeneity preference of consumer is increasing. For manufacturer, gaining the accurate and timely consumer preference information becomes more and more important. And natures of Big Data can meet the demand of manufacturer on the accurate and timely consumer preference information. For manufacturer, the common way to gain consumer demand information is to buy it from Data Company. Based on these preference information, manufacturer designs and produces product to meet consumer demand. For Data Company, to gain more benefits from selling consumer BDI, it should invest in new data analysis technology to gain these BDI. Thus, in this paper, we main discuss how the Data Company and the manufacturer to balance the relationships between their related Big Data costs and benefits.

Currently, due to the immature BDI demand market, generally, the Data Company will set their BDI retail price firstly, and then the manufacturer designs and produces new products based on the BDI gained from Data Company and decides the product wholesale price considering the BDI retail price. Thus, in this paper, assume that the Data Company is the leader, and the manufacturer and the retailer are the follower.

3.1 Variables and Parameters

\(p^{k}\) is the product retail price in different models. Here, \(k=\{ND,D,NC,C\}\). ND model stands for the benefit model of the decentralized supply chain with the circumstance of no using Big Data. D model presents the benefit model of the decentralized supply chain with the circumstance of using Big Data. NC model stands for the benefit model of the centralized supply chain with the circumstance of no using Big Data. C model presents the benefit model of the centralized supply chain with the circumstance of using Big Data.

\(p_b^k \) is the BDI retail price in different models. \(\alpha ^{k}\) stands for the value discount factor and is influenced by the precision of consumer preference information.

\(c_b \) presents the information production cost of the Data Company. \(c_{bi} \) stands for the added cost of the Data Company because of using new Big Data technology to gain BDI.

c stands for the product production cost of the manufacturer. \(c_d \) presents the BDI investment cost of the manufacturer.

\(w^{k}\) is the wholesale price of the product in different models.

\(\theta \) is “the industry cost improvement coefficient. By analyzing the internal BDI and the external BDI of a company and using it in the company can help reduce the company’s cost” (Liu and Yi 2016).

Fig. 1
figure 1

Application of Big Data in the traditional manufacturers

3.2 Demand function and model assumptions

3.2.1 Demand function

In the Big Data environment, to gain more competitive advantages, the traditional manufacturer will buy consumer preference information and produce productions to meet consumer demand, and application of consumer preference information is shown in Fig. 1. Based on the utility function theory, let \(U=\alpha v-p\); here, \(v\in [0,1]\). The market demand formula is shown as follows.

$$\begin{aligned} Q^{k}=1-\frac{p^{k}}{\alpha ^{k}} \end{aligned}$$
(1)

Typically, both charging by service hours and charging by number of messages are the profit models of the Data Company. In reality, both the two profit models are used. In this paper, the profit model charging by number of messages will be chosen. When the manufacturer obtains these consumer demand information, it can design and produce products based on these information. Assume that the total number of consumer information is D. There is \(\gamma =Q/D\), here \(\gamma \in [0,1]\). To simplify the calculation, we set \(\beta =1/\gamma \), \(D=\beta Q\), \(\beta \) presents the consumer preference information conversion coefficient, in different models, it can express \(\beta ^{k}\).

3.2.2 Model assumptions

  1. (i)

    The manufacturer obtains consumers demand BDI from the Data Company. Meanwhile, the Data Company charges by messages’ number.

  2. (ii)

    The manufacturer has enough production capacity. Supply chain members are completely rational and risk neutral.

  3. (iii)

    The manufacturer designs and produces this type production based on the consumer preference information. In addition, the market is completely monopolized.

4 Investment decision-making and coordination of supply chain

In the Big Data environment, to produce products needed by consumers, manufacturer has to gain consumers’ preference information from Data Company. With this trend development, Data Company will become an important part of supply chain. In this condition, if supply chain members want to gain more benefits, how should they invest in BDI? How should the decision maker coordinate the three-stage supply chain? These problems are important for supply members to solve.

To solve the above problems and make up the theoretical research gap, a supply chain with one manufacturer, one retailer, and one Data Company are chosen. Based on the utility function theory, the demand function is built. Then, the benefit functions of the manufacturer, the retailer and the Data Company are constructed in the decentralized model and the centralized model using Stackelberg game. By comparing optimal benefits and prices of supply chain members in the decentralized model and the centralized model, we find that the players’ benefits in decentralized model have the improvement space. Thus, a revenue sharing contract is used to coordinate the supply chain.

4.1 Decentralized model

In the decentralized model, supply chain members pursue their own profit maximization. When the manufacturer sets a wholesale price, it will consider the BDI price of the Data Company. Thus, in this paper, assume that the Data Company is the game leader and first sets the BDI retail price. The manufacturer sets the wholesale price of product considering the BDI retail price, and then the retailer sets the product retail price.

4.1.1 ND model

In the ND model, the Data Company doesn’t use Big Data technology to extract information, thus, the accuracy of consumer preference information is low. If the manufacturer purchases consumer preference information from the Data Company, and then produces products according to these information. The products cannot meet the needs of consumers well. Thus, \(Q^{ND}<Q^{D},\beta ^{ND}<\beta ^{D}\). In the ND model, the benefit functions of the Data Company, the manufacturer, and the retailer are shown in functions (2), (3), and (4), respectively.

$$\begin{aligned} \pi _b^{ND} (p_b^{ND} )= & {} (p_b^{ND} -c_b )\beta ^{ND}Q^{ND} \end{aligned}$$
(2)
$$\begin{aligned} \pi _d^{ND} (w^{ND})= & {} (w^{ND}-p_b^{ND} -c)Q^{ND} \end{aligned}$$
(3)
$$\begin{aligned} \pi _r^{ND} (p^{ND})= & {} (p^{ND}-w^{ND})Q^{ND} \end{aligned}$$
(4)

Based formulas (2), (3), and (4), we get:

$$\begin{aligned} \left( p_b^{ND*} ,w^{ND*},p^{ND*},Q^{ND*}\right) =\left( \frac{\alpha ^{ND}-c+c_b}{2},\frac{3\alpha ^{ND}+c+c_b}{4}, \right. \nonumber \\ \left. \frac{7\alpha ^{ND}+c+c_b}{8},\frac{\alpha ^{ND}-c-c_b}{8\alpha }\right) \end{aligned}$$
(5)

Due to \(Q^{ND}>0\), \(\alpha ^{ND}\) should be bigger than \(c+c_b\), namely,

$$\begin{aligned} \alpha ^{ND}>c+c_b \end{aligned}$$
(6)

Based on formula (5), supply chain members’ benefits are

$$\begin{aligned} \left( \pi _\mathrm{b}^{ND*} ,\pi _d^{ND*} ,\pi _r^{ND*}\right) =\left( \frac{\beta ^{ND}(\alpha ^{ND}-c-c_{b})^{2}}{16\alpha ^{ND}},\frac{(\alpha ^{ND}-c-c_{b})^{2}}{32\alpha ^{ND}},\frac{(\alpha ^{ND}-c-c_{b})^{2}}{64\alpha ^{ND}}\right) \nonumber \\ \end{aligned}$$
(7)

Proposition 1

In the ND model, \(\left( p_b^{ND*} ,w^{ND*},p^{ND*},Q^{ND*}\right) \) is the equilibrium solution. \(\left( \pi _\mathrm{b}^{ND*} ,\pi _d^{ND*} ,\pi _r^{ND*} \right) \) is the optimal revenues of supply chain members.

4.1.2 D model

In the D model, the Data Company will use Big Data technology to extract consumer information, thus, the accuracy of consumer preference information is higher than it in the ND model. If the manufacturer purchases consumer preference information from the Data Company, and then produces product according to these information. The products can meet the demand of consumers well. Thus, the benefit functions of the Data Company, the manufacturer, and the retailer are shown in functions (8), (9), and (10), respectively.

$$\begin{aligned} \pi _b^D (p_b^D )= & {} (p_b^D -c_{bi} -\theta c_b)\beta ^{D}Q^{D} \end{aligned}$$
(8)
$$\begin{aligned} \pi _d^D (w^{D})= & {} (w^{D}-p_b^D -c_d -\theta c)Q^{D} \end{aligned}$$
(9)
$$\begin{aligned} \pi _r^D (p^{D})= & {} (p^{D}-w^{D})Q^{D} \end{aligned}$$
(10)

Based formulas (8), (9), and (10), we get:

$$\begin{aligned} \left( p_b^{D*} ,w^{D*},p^{D*},Q^{D*}\right)= & {} \left( \frac{\alpha ^{D}-\theta c-c_d +c_{bi} +\theta c_b}{2}, \frac{3\alpha ^{D}+\theta c+c_d +c_{bi} +\theta c_b}{4}, \right. \nonumber \\&\quad \left. \frac{7\alpha ^{D}+\theta c+c_d +c_{bi} +\theta c_b}{8}, \frac{\alpha ^{D}-\theta c-c_d -c_{bi} -\theta c_b}{8\alpha } \right) \nonumber \\ \end{aligned}$$
(11)

Due to \(Q^{D}>0\), \(\alpha ^{D}\) should be bigger than \(\theta c+c_d +\theta c_b^ +c_{bi} \), namely,

$$\begin{aligned} c_d +c_{bi} <\alpha ^{D}-\theta c-\theta c_b \end{aligned}$$
(12)

Based on formula (11), supply chain members’ benefits are

$$\begin{aligned} \left( \pi _b^{D*} ,\pi _d^{D*} ,\pi _r^{D*} \right)= & {} \left( \frac{\beta ^{D}(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}{16\alpha ^{D}}, \right. \nonumber \\&\left. \frac{(\alpha ^{D}-\theta c-c_d -\theta c_b -c_{bi} )^{2}}{32\alpha ^{D}},\frac{(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}{64\alpha ^{D}}\right) \nonumber \\ \end{aligned}$$
(13)

Proposition 2

In the D model, \(\left( p_b^{D*} ,w^{D*},p^{D*},Q^{D*}\right) \) is the equilibrium solution. \(\left( \pi _\mathrm{b}^{D*} ,\pi _d^{D*} ,\pi _r^{D*} \right) \) is the optimal revenues of the supply chain members.

4.1.3 Investment decision in the decentralized model

In the Big Data environment, many companies have used BDI to help them win more benefits and advantages. Thus, analyzing the investment condition of Big Data is a vital problem for organizations.

In this chapter, according to the above analyses, the optimal benefits of supply chain members in the decentralized model will be compared. Assume that \(\Delta \pi _x \) stands for the revenue differences, here, \(x=\{1,2,3\}\). \(\Delta \pi _1 =\pi _b^{D*} -\pi _b^{ND*} >0\), and according to formulas (7) and (13), we get \(c_d +c_{bi} < (1-\sqrt{{\beta ^{ND}\alpha ^{D}}/{\beta ^{D}\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\beta ^{ND}\alpha ^{D}}/{\beta ^{D}\alpha ^{ND}}}-\theta )(c+c_b )\). It expresses that when \(c_d +c_{bi} < \quad (1-\sqrt{{\beta ^{ND}\alpha ^{D}}/{\beta ^{D}\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\beta ^{ND}\alpha ^{D}}/{\beta ^{D}\alpha ^{ND}}}-\theta )(c+c_b )\) can be met, the Data Company can get more benefits through using data analysis technology. \(\Delta \pi _2 =\pi _d^{D*} -\pi _d^{ND*} >0\). According to formulas (7) and (13), we get \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\). It shows that when \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\) can be met, the manufacturer can get more benefits through using BDI. \(\Delta \pi _3 =\pi _r^{D*} -\pi _r^{ND*} >0\), and according to formulas (7) and (13), we get \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\). It shows that when \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\) can be met, the retailer can get more benefits. Thus, Conclusion 1 can be get.

Conclusion 1

When \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\), the Data Company and the manufacturer invest in Big Data can get more benefits, moreover, the retailer benefits will also be improved.

Conclusion 1 also indicates that if a member in the supply chain uses and invests in Big Data to improve their revenuers, for other members, “Spillover Effect” exists. This phenomenon is similar to the research of Liu and Yi (2017). In their study, a two-stage supply chain with one retailer and one manufacturer was considered. Through comparing this study with the research of Liu and Yi (2017), we also found that the BDI investment threshold of supply chain has a positive relationship with the improved costs (\(c+c_b )\) and a negative relationship with the industry cost improvement coefficient \(\theta \). These showed that if the users of Big Data wanted to gain more revenues, they should try their best to extract the value of Big Data and using it to reduce production costs Liu and Yi (2016, 2017).

4.2 Centralized model

In the centralized model, supply chain members pursue the maximization of the overall profits. Thus, in this section, the benefit functions in both NC and C models are built. Finally, the supply chain benefits in the two models are compared.

4.2.1 NC model

In the NC model, the benefit function of the centralized supply chain is shown in function (14).

$$\begin{aligned} \pi _j^{NC} =(p^{NC}-c)Q^{NC}-\beta ^{NC}Q^{NC}c_b \end{aligned}$$
(14)

Based on formula (14), we get

$$\begin{aligned} (p^{NC*},Q^{NC*})= & {} \left( \frac{\alpha ^{NC}+c+\beta ^{NC}c_b}{2}, \frac{\alpha ^{NC}-c-\beta ^{NC}c_b}{2\alpha ^{NC}} \right) \end{aligned}$$
(15)

Due to \(Q^{NC}>0\), \(\alpha ^{NC}\) should bigger than \(c+\beta ^{NC}c_b\), namely,

$$\begin{aligned} \alpha ^{NC}>c+\beta ^{NC}c_b \end{aligned}$$
(16)

Based on formula (15), supply chain benefit is

$$\begin{aligned} \pi _j^{NC{*}} =\frac{(\alpha ^{NC}-c-\beta ^{NC}c_b )^{2}}{4\alpha ^{NC}} \end{aligned}$$
(17)

Proposition 3

In the NC model, \(\left( p^{NC*},Q^{NC*}\right) \) is the equilibrium solution. \(\pi _j^{NC*} \) is the optimal revenue of the supply chain.

4.2.2 C model

In the C model, the benefit function of the centralized supply chain is shown in function (18).

$$\begin{aligned} \pi _j^C =(p^{C}-\theta c-c_d )Q^{C}-(\theta c_b +c_{bi} )\beta ^{C}Q^{C} \end{aligned}$$
(18)

Based on formula (18), we get:

$$\begin{aligned} \left( p^{C*},Q^{C*}\right) =\left( {\frac{\alpha ^{C}+\theta c+\theta \beta ^{C}c_b+c_d +\beta ^{C}c_{bi}}{2},\frac{\alpha ^{C}-\theta c-\theta \beta ^{NC}c_b^ -c_d -\beta ^{C}c_{bi}}{2\alpha ^{C}}} \right) \end{aligned}$$
(19)

Due to \(Q^{C}>0\), \(\alpha ^{C}\) should be bigger than \(\theta c+\theta \beta ^{C}c_b^ +c_d +\beta ^{C}c_{bi}^ c+\beta ^{NC}c_b\), namely,

$$\begin{aligned} \alpha ^{C}-\theta c-\theta \beta ^{C}c_b^ >c_d +\beta ^{C}c_{bi} \end{aligned}$$
(20)

Based on formula (19), supply chain benefit is

$$\begin{aligned} \pi _j^{C{*}} =\frac{\left( \alpha ^{C}-\theta c-\theta \beta ^{C}c_b^ -c_d -\beta ^{C}c_{bi}\right) ^{2}}{4\alpha ^{C}} \end{aligned}$$
(21)

Proposition 4

In the C model, \((p^{C*},Q^{C*})\) is the equilibrium solution. \(\pi _j^{C*} \) is the optimal revenue of the supply chain.

4.2.3 Investment decision in the centralized model.

In this chapter, according to the above analyses, the optimal benefits of the centralized supply chain in the two models will be compared. Assume that \(\Delta \pi _j \) stands for the revenue differences between \(\pi _j^{C{*}} \) and \(\pi _j^{NC{*}} \). When, \(\Delta \pi _j =\pi _j^{C{*}} -\pi _j^{NC{*}} >0\), we get \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\). It shows that when \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+ (\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\) can be met, the supply chain members can get more benefits through using BDI. Thus, Conclusion 2 can be get combining Conclusion 1.

Conclusion 2

When \(c_d +c_{bi} <(1-\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}})(\alpha ^{D}-\alpha ^{ND})+(\sqrt{{\alpha ^{D}}/{\alpha ^{ND}}}-\theta )(c+c_b )\), the decision makers investing in Big Data can get more benefits, moreover, the thresholds of Big Data investment in the decentralized supply chain are same as the centralized supply chain.

Based on Conclusions 1 and 2, Conclusion 3 can be get.

Conclusion 3

The threshold investing in Big Data (\(c_d +c_{bi} )\) has a negative relationship with the cost improvement coefficient (\(\theta )\) and has a positive relationship with the improved costs (\(c+c_b )\).

The relationship between the threshold investing in Big Data and the cost improvement coefficient is similar with the research of Liu and Yi (2016, 2017). Meanwhile, the relationship between the threshold investing in BDI and the improved costs is also similar to the research of Liu and Yi (2016, 2017). These indicate that whether it is a third-stage supply chain or a two-stage supply chain, the threshold investing in Big Data will has a negative relationship with the cost improvement coefficient and has a positive relationship with the improved costs. In addition, in the decentralized model and the centralized model, the threshold investing in Big Data is equal.

4.3 Supply chain coordination

Based on the above analyses, we can get that the product retail price in the D model is higher than it in the C model. Meanwhile, benefits of supply chain in the D model is lower than it in the C model. Thus, the decentralized supply chain has a “double marginalized effect”. The benefit of the decentralized supply chain is not optimal and has space for increase. We can use contract to make the supply chain coordinate. Thus, in this study, the revenue sharing contract will be used.

To sell more products, the manufacturer will offer a low wholesale price to the retailer. Meanwhile, to make up the benefit loss of the manufacturer because of reducing the product wholesale price, the retailer will share its benefits with a certain rate \(\rho \). Thus, the benefits of the Data Company, the manufacturer, and the retailer are shown as follows.

$$\begin{aligned} \pi _b^M= & {} (p_b^M -\theta c-c_b)Q^{M} \end{aligned}$$
(22)
$$\begin{aligned} \pi _d^M= & {} (w^{M}-\theta c-c_d )Q^{M}+(1-\rho )(p^{M}-w^{M})Q^{M} \end{aligned}$$
(23)
$$\begin{aligned} \pi _r^M= & {} \rho (p^{M}-w^{M})Q^{M} \end{aligned}$$
(24)

Based on formulas (22), (23), and (24), let \(p_{M*} =p_{C*}\), we get formula (25).

$$\begin{aligned} \left( p_b^{M*} ,w^{M*},p^{M*}\right)= & {} \left( \alpha ^{M}\rho -\beta {c}_{bi} -c_d \rho -\beta \theta c_b -\beta \rho \hbox {c}_{bi} -\theta \rho c-\rho \beta \theta c_b , \right. \nonumber \\&\left. c_d +\beta c_{bi} +\theta c+\beta \theta c_b, \frac{\alpha ^{M}+\theta c+c_d +\beta c_{bi} +\theta \beta c_b}{2} \right) \qquad \end{aligned}$$
(25)

Based on formula (25), profits of the supply chain members are shown in formula (26).

$$\begin{aligned} \left( \pi _b^{M*} ,\pi _d^{M*} ,\pi _r^{M*} \right)= & {} \left( \frac{\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}{2\alpha ^{M}}, \right. \nonumber \\&\left. \frac{\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}{2\alpha ^{M}},\frac{\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}{2\alpha ^{M}}\right) \nonumber \\ \end{aligned}$$
(26)

Conclusion 4

When \(1>\rho > \max \{\xi _1,\zeta _2,\xi _3 \}\) can be met, the revenue sharing contract can make the decentralized supply chain coordinate.

Proof

When \(\pi _b^{M*} -\pi _b^{D*} =\frac{\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}{2\alpha ^{M}}-\frac{\beta ^{D}(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}{16\alpha ^{D}}>0\), because \(\alpha ^{M}=\alpha ^{D}\), we get \(\rho >{\beta ^{D}(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}/{8\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}\), this condition is called \(\xi _1 \). When \(\pi _\mathrm{d}^{M*} -\pi _d^{D*} =\frac{\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}{2\alpha ^{M}}-\frac{(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}{32\alpha ^{D}}>0\), we get \(\rho >{(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}/{16(\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}\), this condition is called \(\xi _2 \). When \(\pi _r^{M*} -\pi _r^{D*} =\frac{\rho (\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}{2\alpha ^{M}}-\frac{(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}{64\alpha ^{D}}>0\) can be met, we get \(\rho >{(\alpha ^{D}-\theta c-c_d -\theta c_b^ -c_{bi} )^{2}}/{32(\alpha ^{M}-\theta c-c_d -\theta \beta c_b^ -\beta c_{bi} )^{2}}\), this condition is called \(\xi _3 \). Thus, Conclusion 4 is proved. \(\square \)

Conclusion 4 indicates that when the revenue sharing factor \(\rho \) in a certain range, the revenue sharing contract can make the three-stage supply chain coordinate. Previous research also showed that the revenue sharing contract could make a two-stage supply chain coordinate in the Big Data era (Liu and Yi 2017). This study expands the application of the revenue sharing contract in a three-stage supply chain.

5 Numerical simulation

In this chapter, a numerical example will be presented to show the results’ effectiveness. Let \(\theta =0.65, \alpha ^{ND}=\alpha ^{NC}=40<\alpha ^{D}=\alpha ^{C}=\alpha ^{M}=50,c=5,\beta ^{ND}=\beta ^{NC}=0.7<\beta ^{D}=\beta ^{C}=\beta ^{M}=0.9\),

$$\begin{aligned} \hbox {c}_b =1, \hbox {c}_{bi} =0.1,c_d =0.5. \end{aligned}$$

Based on Propositions 1 and 3, we get \((\pi _\mathrm{r}^{ND*} ,\pi _d^{ND*} ,\pi _b^{ND*} ,\pi _j^{NC*} )=(0.45,0.90,1.26,7.35)\). According to Conclusions 1 and 2, we get \(c_d +c_{bi} <1.62\). Without loss of generality, we set \(c_d =0.5, \quad c_{bi}=0.1\). Then, we get \((\pi _\mathrm{r}^{D*} ,\pi _d^{D*} ,\pi _b^{D*} ,\pi _j^{C*} )=(0.65,1.29,2.33,10.39)\). Based on the above analyses, we get \(\Delta \pi _\mathrm{r}^=\pi _\mathrm{r}^{D*} -\pi _\mathrm{r}^{ND*} =0.2>0\), \(\Delta \pi _d^ =\pi _d^{D*} -\pi _d^{ND*} =0.39>0\), \(\Delta \pi _b^ =\pi _b^{D*} -\pi _b^{ND*} =1.07>0\), and \(\Delta \pi _\mathrm{j}^ =\pi _j^{C*} -\pi _j^{NC*} =3.04>0\). Thus, Conclusions 1 and 2 are verified.

Effects of the cost improvement coefficient (\(\theta )\) on the benefits of supply chain members in the D model are shown in Fig. 2. From Fig. 2, we can get that with the increase of the cost improvement coefficient (\(\theta )\), the optimal benefits of supply chain members in the D model will decrease. It indicates that if the decision makers want to gain more benefits, they need to extract the values of Big Data fully. In addition, after using Big Data, the increased values of the benefits of the manufacturer and the Data Company are high and the increased values of the benefits of the retailer is low. It shows the rule “who will gain benefit from his/her investment”. Meanwhile, it also indicates the law “positive externalities” for other members. These trends are same with the research of Liu and Yi (2017).

Fig. 2
figure 2

Effects of \(\theta \) on the benefits of supply chain members in the D model

From Fig. 3, we can get that the investment costs (\(c_d +c_{bi} )\) have a negative relationship with the cost improvement coefficient (\(\theta )\). It indicates that if the decision makers want to add the range of the investment costs (\(c_d +c_{bi} )\), they should try their best to extract the value of Big Data and using it to reduce production costs. Moreover, when the cost improvement coefficient (\(\theta )\) is in a certain range, the supply chain members invest in and use Big Data are feasible and profitable. Thus, Conclusion 3 is confirmed.

Fig. 3
figure 3

Relationships between \(\theta \) and the investment costs (\(c_d +c_{bi} )\) in the D model

Figure 4 presents the effects of the consumer preference information conversion coefficient (\(\beta ^{D})\) on benefits of supply chain members. From Fig. 4, we can get that benefits of the retailer and the manufacturer have no relationship with the consumer preference information conversion coefficient (\(\beta ^{D})\). With the growth of \(\beta ^{D}\), the benefit of the Data Company will increase. It indicates that if the Data Company wants to gain more benefits, it should improve the accuracy of consumer preference information.

Fig. 4
figure 4

Effects of \(\beta ^{D}\) on benefits of supply chain members

Due to \(\pi _\mathrm{r}^{D*} +\pi _d^{D*} +\pi _b^{D*} =4.27\) and \(\pi _j^{C*} =10.39\), we get \(\pi _\mathrm{r}^{D*} +\pi _d^{D*} +\pi _b^{D*} <\pi _j^{C*} \). Moreover, \(p^{D*}=44.31\) and \(p^{C*}=27.21\), namely, \(p^{D*}>p^{C*}\). These show that after using Big Data, the decentralized supply chain has a lower product retail price and benefit compared with the centralized supply chain. Thus, the decentralized supply chain after using Big Data has “double marginalized effect.” To reduce this effect, in this study, the revenue sharing contract is used to coordinate the decentralized supply chain. Effects of the revenue sharing coefficient on benefits of supply chain members are shown in Fig. 5. From Fig. 5, we can get that when the revenue sharing coefficient (\(\rho )\) is in a certain range, the revenue sharing contract can achieve supply chain coordinate. Conclusion 4 is confirmed.

Fig. 5
figure 5

Effects of the revenue sharing coefficient on benefits of supply chain members

6 Discussion

In the Big Data environment, Data Company as one provider of BDI is playing an increasingly important role in a supply chain system (Liu and Yi 2016). Many enterprises begin to obtain related BDI from it. However, obtaining BDI from Data Company will add enterprises’ extra money, meanwhile, to gain the accurate and timely BDI, the Data Company will also add extra money in improving new data analysis technologies and equipment. In this process, it is important for Data Company and enterprise to know how to invest in and use Big Data and how to gain more benefits. Thus, in this paper, a supply chain with one manufacturer, one retailer, and one Data Company is chosen. Then, we propose and construct four benefit models based on a three-stage supply chain. Finally, the revenue sharing contract is used to coordinate the supply chain. Through analyzing the above models, some results are got.

When the investment Sum of the Data Company in improving data analysis technology and equipment and the manufacturer invested in obtaining consumer preference information is in a certain range, investing and using Big Data will help investors gain more benefits. This result is similar to the research of Liu and Yi (2016) and Liu and Yi (2017). However, there are some differences in the influencing factors of the range. In addition, the investment thresholds in the decentralized model and the centralized model are same. This result are the same with the research of Liu and Yi (2016). Meanwhile, the investors will get more benefits from using Big Data, for the retailer, its benefits will also be increase, namely, there is “Spillover Effect”. However, the added benefit value of the investors is higher than the retailer’s, it shows the law “who will gain benefit from his/her investment”.

The threshold investing in Big Data (\(c_d +c_{bi} )\) has a negative relationship with cost improvement coefficient (\(\theta )\). These trends are similar to the research of Liu and Yi (2016) and Liu and Yi (2017). It indicates that, for the investors, trying their best to extract the value of Big Data and using it to reduce production costs will help them add the range of the investment costs. The benefits of the retailer and the manufacturer have no relationship with the consumer preference information conversion coefficient. With the increase of \(\beta ^{D}\), the benefit of the Data Company will increase. Namely, if the Data Company wants to gain more benefits, it should improve the accuracy of consumer preference information.

When the revenue sharing coefficient (\(\rho )\) is in a certain range, the revenue sharing contract can achieve supply chain coordinate. It demonstrates that the revenue sharing contract can be used to solve the coordination problem of supply chain in the Big Data environment.

The above results demonstrate that, for the manufacturer, extracting the value of Big Data and using it to reduce production costs will help it obtain more benefits. For the Data Company, improving the accuracy of consumer preference information will help it gain more revenues, in addition, that the manufacturer increase the application value of BDI will also bring benefits to the Data Company. For the retailer, if other members in the supply chain use Big Data, which will make its benefits increase. However, for the whole supply chain, the benefits have an improvement space after using Big Data, the revenue sharing contract can achieve the three-stage supply chain coordination.

7 Conclusions

The main contribution of this study is that we expand the research about the BDI investment and coordination of supply chain to a three-stage supply chain in the Big Data environment, where Data Company is included in the supply chain. Meanwhile, we expand the application of the revenue sharing contact to a three-stage supply chain in the Big Data environment. In addition, we proposed the concept of consumer preference information conversion coefficient \(\beta \).

This study has some scientific values. Theoretically, the important position of a Data Company in a supply chain was analyzed under the Big Data environment. Based on these, four investment models were proposed considering the Data Company, and then the investment decision-making conditions were analyzed. Finally, we also proved that the revenue sharing contract could make the three-stage supply chain coordination. This study is a new development of the investment decision-making theory in a three-stage supply chain in the Big Data environment. It offers a theoretical guidance for the Data Company to invest in Big Data and it also provides a guidance for the manufacturer to use consumer preference information. In practice, if the Data Company wants to get more benefits, it should extract the values of data fully to improve the conversion value of data from BDI to products. For the manufacturer, it should also use the consumer preference information fully to produce the products needed by consumers. In addition, the decision makers can use the revenue sharing contract to coordinate the supply chain, and then gain more benefits.

This paper only studies a three-stage supply chain system with on manufacturer, one Data Company, and one retailer. In fact, a supply chain system is very complex, there are double channel and multi-channel supply chain. Moreover, in a supply chain system, it contains supplier, manufacturer, retailer, and the third party logistics, however, in this paper, only the relationships among manufacturer, Data Company, and retailer are discussed. In addition, supply chain members are assumed to be risk-neutral.

In the future, the investment decision-making rules and coordination policies of a supply chain should be studied with a multi-channel supply chain or a multi-stage supply chain in the Big Data environment. In addition, next, a three-stage supply chain with members having risk appetite should be discussed. We can also explore the related questions in different competitive environments.