1 Introduction

In today’s competitive economy, developing an optimal strategy for supply chain members is challenging under various circumstances. In real business, although strategic cooperation is more lucrative for all supply chain participants simultaneously, the channel members typically adopt a non-cooperative strategy for maximizing their self-profitability. Almost all prior studies have focused on the static decentralized behavior of supply chain members, while in real cases, the behavior of participants is dynamic, i.e., evolves over time (Chai and Xiao 2018). To overcome the research gap, the primary goal of our study is to analyze the dynamic decentralized behavior of retailers and the supply chain under different situations and then evaluate the evolutionary stable behavior (i.e., long-term oriented strategy). Following the evolutionary game theory, one retailer is selected to represent the entire population of retailers, and the population’s dynamic behavior regarding the adoption of strategies is evaluated by modeling the retailer’s activities in a two-player symmetric game with its symmetrical competitor from the population (Hosseini-Motlagh et al. 2022). Here, our study is based on the fact that the representative retailers make decisions on pricing and marketing efforts, while the common manufacturer focuses on green innovation.

In this era of a low-carbon economy, owing to several environmental concerns, consumers are becoming increasingly conscious of buying green commodities, and governments are encouraging manufacturers to produce more green products. As part of an environmental subsidy program, the government frequently uses policies to support and promote green products. Efficient government regulations may control and manage the production and operation plans of business firms. In order to promote the adoption of green automobiles, a 100,000 Yen subsidy with tax deductions and exclusions was offered in Japan in 2009 (Hao et al. 2014). The German government offers consumers who trade in their old cars for newer, more fuel-efficient ones a €2500 rebate (Huang et al. 2014). The Indian government has committed to abolishing single-use plastic completely by the year 2022. Although India has a significantly lower per capita use of plastic than many Western countries, its 1.3 billion people and the fast-growing global economy will make keeping the pledge somewhat difficult. Moreover, governments around the world are now concerned about ecological hazards and, therefore, intervene in manufacturing industries and the entire supply chain to produce more green products.

Nowadays, in order to promote low-carbon economic growth, it is critical to investigate and implement novel modes for reducing carbon emissions and pollution (Long et al. 2021). To reduce the carbon emissions generated during manufacturing, the government often implements a cap-and-trade policy (CTP). A CTP can function in a variety of ways, but generally, the government sets a “cap” (i.e., a limit) on the amount of carbon emissions permitted in a certain industry. Firms that reduce their carbon emissions can trade their unused allowances to companies that emit more. In 2015, the cap of the European Union’s Emission Trading System was 15% lower than when the program first started in 2005 (Mondal and Giri 2020).

Due to the rising demand for natural resources in today’s congested environment, sustainability has become a serious issue. The continuous expansion of industrialization and rapid product upgradation have led to the depletion of enormous resources and energies. To tackle this problem, companies are adopting a reverse channel to collect used products, which can be recycled to create new products, remanufactured, or sold in a secondary market at a reduced price. A supply chain which supplies new products as well as collects used products through the reverse channel is known as a closed-loop supply chain (CLSC). According to Savaskan and Van Wassenhove (2006), implementing CLSC management can provide a competitive edge and promote sustainable development. Many enterprises have realized the efficient trade-off between economic advantages and associated environmental concerns through remanufacturing (Zhang et al. 2020). In practice, some industries have adopted a remanufacturing approach to obtain more profit and establish a sustainable supply chain. For instance, Hewlett Packard (HP) announced Planet Partners, a closed-loop cartridge recycling initiative that has gathered and recycled 566 million ink and toner cartridges across more than 50 countries, accounting for 90% of all cartridges sold since its inception in 1991 (Zhou et al. 2017). Notably, leading companies like Caterpillar, Patagonia, Kodak, Canon, Boeing, and Levi’s have embraced sustainability, and even with the growing demand for consumer electronics, EPSON, HP, Huawei, Xiaomi, Dell, and Apple have established sustainable supply chains to minimize environmental degradation (Mondal and Giri 2020). The ecological remanufacturing initiative saves Xerox 40–65% on manufacturing expenses by recycling materials and components (Savaskan et al. 2004). The CLSC plays a crucial role in achieving economic and environmental sustainability in modern-day business, and it has become a widely discussed topic for decision-makers in various scenarios.

Based on the issues mentioned above, our analysis aims to integrate three important research streams, namely green innovations, government intervention, and decentralized behavior of supply chain entities, with the evolutionary game approach in CLSC. Furthermore, as a common technique for dealing with decision makers’ bounded rationality, evolutionary game theory has been frequently applied in supply chain management studies. It integrates game theory and the dynamic process of Darwinian evolution, emphasizing that the equilibrium choice in a game (evolutionary stable strategy) should be the consequence of ‘dynamic learning’ (replicator dynamics), adjustment, and ‘trial and error.’ Accordingly, the purpose of this article is to address the following questions:

  1. How do the population of retailers’ pricing and sales efforts and the common manufacturer’s green innovation influence the evolutionary game process?

  2. How do government intervention and the CTP affect the dynamic and irrational behavior of the population of retailers and the supply chain?

  3. Should the government intervene in the supply chain to make a long-term decision? Does it benefit the government in any way?

While answering the above questions, this research develops an evolutionary game theoretic model considering two representative symmetrical retailers from the population of retailers. The game theoretic framework helps in formulating the model and analyzing pricing policies, marketing effort levels of retailers, and the manufacturer’s green innovation effort. Further, we introduce a dynamical change of the population of retailers’ decentralized behavior with the manufacturer by the replicator dynamic equation to find the equilibrium points and then analyze the evolutionary stable strategy (i.e., long-term based behavior). Finally, we formulate and analyze an extended model using a two-party evolutionary game to investigate the government’s long-term intervention decision and the decentralized behavior of the supply chain members.

The rest of the paper is organized as follows: We discuss the existing literature as well as the motivation of this study in Sect. 2. Section 3 introduces the development and interpretation of the non-evolutionary model to provide the strategy combinations for evolutionary game theory (EGT) and the evolutionary game model for the symmetrical retailers. Section 4 provides the numerical simulations and sensitivity analysis for the proposed model. An extended model for government intervention and the supply chain’s adoptive non-cooperative behavior is investigated in Sect. 5, using two-party EGT. Finally, Sect. 6 presents the conclusions of the study and future research ideas.

2 Literature review

In this section, the literature review components are structurally organized as green innovation, power structure, government intervention, and evolutionary game theoretic approach in supply chain management.

2.1 Green innovation in supply chain management

The idea of green supply chain management (GSCM) has been put forward by researchers and industry professionals as a viable remedy for raising environmental sustainability. Although the idea of GSCM emerged in the early 1990s, the rise in scholarly articles indicates that it genuinely took off around 2000 (Tseng et al. 2019). Some researchers have demonstrated the significance of green innovations and legislative carbon emission reduction policies for ecological sustainability. Prior studies are summarized chronologically in this subsection. Srivastava (2007) provided a comprehensive and innovative look at the GSCM area and widely reviewed the relevant literature since its creation, particularly from a reverse logistics perspective. Ghosh and Saha (2012) investigated the impact of various decision-making structures and power structures in GSCM, comparing and differentiating the impact of each on channel stakeholders’ pricing and green innovation strategies. Benjaafar et al. (2013) emphasized the influence of strategic activities to reduce carbon emissions, the application of operational models in evaluating the impacts of different regulatory frameworks, and the advantages of making investments in more environment-friendly technology. Chen et al. (2016) explored how investing in environmentally sustainable projects can help logistics operations balance their environmental and financial performance. Linking the objective of environmental sustainability to a dual-channel supply chain including both physical and digital channels, Jamali and Rasti-Barzoki (2018) studied the prices of two alternative products: a green item made by one producer and a non-green item made by another producer. Wei et al. (2018) examined how emission reduction through environmentally friendly production and remanufacturing methods interact in a two-period CLSC made up of a producer and a retailer that compete to collect deteriorated items. Gharaei et al. (2019) developed a coordinated supply chain model subject to penalties, with environmentally friendly production and quality control regulations, after accounting for the tax expenses of carbon emissions. Heydari et al. (2021) analyzed the problem of channel coordination using a hybrid contract of revenue sharing and cost-sharing in a green supply chain with the demand depending on the retail price and the green innovation. By altering the basic market demand and pricing factors, Liu et al. (2022) utilized an evolutionary game model to analyze the behavior of green suppliers and manufacturers in a two-level green supply chain. Their study explored various internal and external factors that impact the decision-making of both parties in the game. The findings of Roh et al. (2022) confirmed the close connection between green administrative innovation, GSCM, and the right to intellectual property as well as the link between greening and environmental efficiency. Pal et al. (2023) examined the production of eco-friendly green innovative products in an uncertain environment to minimize the negative impact on the environment under green supply chain management and found that the manufacturer’s decision on the level of green innovation has the greatest impact on the optimal profit margin.

In summary, the research on decision-making in green supply chain management has received extensive attention from scholars, providing a rich theoretical basis for this study. However, many existing studies have focused on optimal decisions using non-EGT, which assumes that optimal decisions are static and rational. In practice, firms may alter their decisions over time. Therefore, this study comprehensively examines the long-term impact of a product’s greening level on the decisions of other enterprises and the government, utilizing it as a decision variable for the manufacturer in a duopolistic retailer scenario.

2.2 Power structures in supply chain management

The power structure in supply chain management is typically represented through the sequence of actions taken by its participants, with the first mover being considered to hold more power than the second. The effects of power structure on various aspects of supply chain operations have been extensively researched. Tang and Yang (2020) developed a low-carbon supply chain with a capital-constrained manufacturer and a capital-abundant retailer, analyzed the impacts of power structures on financing mechanisms, carbon emissions, and performance, and found an equilibrium in early payment under each power structure. Li et al. (2021) examined how government subsidies and channel power structure affect innovation levels in a two-tier supply chain and concluded that consumer subsidies are more effective than producer subsidies in promoting innovation investment, regardless of the channel power structure. Jena and Ghadge (2022) examined three power-balance scenarios in a supply chain network with two manufacturers and a retailer, including manufacturer Stackelberg, retailer Stackelberg, and vertical Nash, and discovered that the total profit of the supply chain is equal under the manufacturer Stackelberg and vertical Nash scenarios for both manufacturer and retailer bundling strategies. Li and Mizuno (2022) analyzed the impact of power structures (manufacturer Stackelberg, retailer Stackelberg, and vertical Nash) on pricing and inventory decisions in a dual-channel supply chain with stochastic and price-sensitive demand, concluding that the optimal decisions are influenced by the power structure, but have the same structural properties under each structure. Zha et al. (2023) studied how power structures affect price competition and entry strategy in the platform-hardware business industry, considering both vertical channel power imbalance with a wholesale model and horizontal power imbalance with a demand-sharing model. Zhang et al. (2023) explored the impact of three different channel power structures (manufacturer-led, vertical Nash, and retailer-led) on supply chain management. They considered a scenario where a manufacturer decides whether to introduce a direct sales channel, and the retailer has either capital-constrained or capital-sufficient status.

Numerous researchers have examined power structures across various scenarios and derived significant implications. This study specifically aims to investigate the power structure in a green CLSC by analyzing the long-term decentralized behavior of the retailer. We consider two scenarios, namely, vertical Nash and retailer Stackelberg, to comprehend their implications on the power structure within the context of government intervention.

2.3 Government intervention in GSCM

By providing subsidies for green production and enacting regulations to reduce carbon emissions, governments significantly influence the development of green practices in SCM. Many researchers have studied in depth the effectiveness of government interventions in GSCM. Mitra and Webster (2008) explored the impact of government incentives to encourage remanufacturing operations using a model with a manufacturer that produces new products for selling and a remanufacturer that rivals the manufacturer. They considered an incentive proportional to the amount of remanufacturing and given to the producer, the remanufacturer, or a combination of both. In a duopoly setup, Huang et al. (2013) compared and contrasted the supply chains for fuel-only and fuel-and-electric vehicles and then discussed the government subsidy program to encourage the adoption of electric automobiles (EAs) to reduce air pollution. Guo et al. (2016) investigated the effectiveness of government incentives on economic and social well-being, as well as how the profitability of supply chain players and consumer price sensitivity affect the government’s choice of subsidy approach. In order to investigate the coordination of environmentally sustainable supply chains, Zhang and Yousaf (2020) suggested a two-part tariff contract that would involve government action in the form of a subsidy. Ma et al. (2021) discussed the effects of government intervention on manufacturer and retailer cooperation strategies, technological investment strategies, and decisions of supply chain participants. Srivastava et al. (2022) explored how market adoption of electric vehicles (EVs) may be enhanced by incentives provided in the form of subsidies and alternative tax structures. They also investigated several models under homogeneous and heterogeneous taxation strategies that include and exclude subsidies. Mondal and Giri (2022) examined the impact of government intervention on enhancing supply chain efficiency under various game models. Their analysis revealed that government subsidies can increase sales volume and improve the product’s greening level, leading to optimal results. Yang and Yao (2023) examined the decision-making problem in a two-level fresh agricultural products supply chain, considering consumers’ preference for low-carbon options and a carbon cap-and-trade policy set by the government as part of efforts to achieve “carbon neutral” development.

Based on the above references, it can be said that government intervention plays an important role in reducing carbon emissions and promoting environmental sustainability in GSCM, although participants may sometimes experience fewer economic benefits for environmentally sustainable development. To address this, our study examines the long-term decision-making of both governments and participants in a green supply chain by considering subsidies for green production and a cap and trade policy to balance the economic and environmental benefits determined by the government.

2.4 Evolutionary game theoretic approach in supply chain

In supply chains, the behavior of participants may evolve in real-world business, i.e., the decisions of supply chain members are inherently dynamic. Because of the intense rivalry in the market, this has become an important perspective from which to observe the tactics of decision-makers. As a result, several scholars have used evolutionary game theory to analyze long-term decisions in the supply chain. Even though evolutionary game theory has been extensively used since 1971 in a diverse range of fields, only a relatively small percentage of the supply chain literature makes use of this technique. In an effort to combine the advantages of the environment and the economy, Barari et al. (2012) developed cooperation between the manufacturer and the retailer to decide on their strategies for adopting green practices to maximize profitability by leveraging green production. Fan et al. (2017) used an evolutionary game methodology to investigate the monitoring effectiveness and stability issues along with the optimal method for governmental interference of low-carbon incentives. Chen and Hu (2018) used the EGT to analyze how manufacturers reacted to different combinations of taxes and subsidies for carbon emissions, taking into account produced items with the same carbon qualities. They also concluded that government actions have a significant impact on producers’ behaviors, and governments must adapt their strategies dynamically in response. Sun et al. (2019) investigated the green innovation strategy for producers and suppliers in a supply chain to determine the optimal strategy under the government subsidy scheme applying the dynamic game approach. Zhang et al. (2020) concentrated on a competitive CLSC with two original equipment manufacturers (OEMs) and two third-party remanufacturers (TPRs) to select the best choice among used-product recycling modes by an evolutionary process. Considering policy aspects, Long et al. (2021) presented a simple evolutionary game theoretic setup to show how the model’s evolved dynamic and steady strategies are significantly affected by green sensitivities. Incorporating the dynamic process of EGT, Hosseini-Motlagh et al. (2022) examined the impact of supply chain members’ long-term-based stable actions on their decisions to coordinate or not, as well as their participation in the collaboration profit surplus. Luo et al. (2023) developed a tripartite evolutionary game model involving agricultural enterprises, universities, and the government to examine low-carbon technology innovation in agriculture and aimed to resolve the challenge of low-carbon agriculture development in China. Liao and Tan (2023) utilized evolutionary game theory to develop a carbon taxation mechanism for the post-subsidy era in the hoteling demand sector and investigated the interactions between local governments and auto manufacturers using an evolutionary game model.

In summary, evolutionary game theory has been effectively utilized in supply chain management. In this study, we utilize EGT to examine the long-term decentralized behavior of supply chain entities and to investigate the government’s long-term intervention strategy to promote supply chain sustainability.

2.5 Research gap and our contribution

From the above literature review, it is apparent that only a few studies have focused on analyzing the dynamic behavior of supply chain participants across various power structures. Existing literature on the analysis of non-cooperative behavior among entities has primarily focused on static behavior, with only a few studies utilizing EGT to analyze dynamic behavior in the presence of different policies (Shu et al. 2019; Hosseini-Motlagh et al. 2022; Liu and Wei 2022). Research in this area remains scarce, especially in the context of green CLSC scenarios. Almost all studies have treated government intervention in the supply chain as a short-term, static initiative (Mitra and Webster 2008; Huang et al. 2013; Guo et al. 2016; Zhang and Yousaf 2020; Srivastava et al. 2022). In practice, due to the dynamic nature of government behavior, it is important to investigate the impact of supply chain interventions and analyze the resulting stable strategies. Recently, Hosseini-Motlagh et al. (2022) studied the dynamic behavior of manufacturers when dealing with retailers, but their research did not specifically focus on decentralized behavior within a green CLSC under government intervention.

In this paper, we attempt to fill the gaps in the prior studies and study the government’s and supply chain members’ dynamic behavior in a green CLSC under the government’s green subsidy scheme and CTP. A basic model is developed first using one-population evolutionary game theory to investigate the non-cooperative behavior of retailers investing in sales efforts toward a common manufacturer who invests in the greening level of the product. Next, we extend the model to a two-party EGT by considering the supply chain (SC) and government as separate populations. This study puts forward suggestions for the retailer to implement a leading power structure within the supply chain, while the government should intervene across the entire supply chain in this scenario. The major distinction between the proposed study and the prior literature is illustrated in Table 1.

Table 1 The distinction between our proposed study and the prior literature

3 Model formulation

In this section, first the non-evolutionary game models with varied strategy profiles will be discussed and the equilibrium solutions will be obtained by solving the game models for each pair of strategies. Then the evolutionary game model will be developed based on the assumed strategy combinations, and the possible evolutionary stable strategy will be obtained.

3.1 Problem description

We develop mathematical models for a closed-loop green supply chain consisting of a single manufacturer and multiple retailers who invest in sales efforts. The government’s intervention across the entire supply chain and a cap-and-trade policy are also considered in the model. The manufacturer is assumed to be in charge of implementing green innovation in both the manufacturing of new items and the remanufacturing of the collected used products. Here, we consider that the manufacturer collects used products from the market, recycles them, and sells the remanufactured products alongside new items in the same market through the retail channel. To promote sustainability, the government intervenes in the supply chain by providing a green subsidy to the manufacturer for producing environmentally friendly products. Additionally, the government implements a cap-and-trade policy on the manufacturer to reduce carbon emissions resulting from the production and recycling of products. Further, this study takes the green innovation level as the manufacturer’s only decision variable and the retail price and sales effort as the retailers’ decision variables, which are extensively examined in many research papers (see Hosseini-Motlagh et al. 2022; Huang and Hsieh 2012). In this scenario, to construct the EGT model, the different strategy combinations are formed based on the strategies adopted by two representative symmetrical retailers within the population. Specifically, each of them can choose to play with the manufacturer in two separate decentralized games: (i) Vertical Nash (VN) game, where both players make decisions concurrently, which means the retailer does not dominate the market; (ii) Retailer Stackelberg (RS) game, where the retailer dominates the market. Three possible strategy combinations can be assessed according to the non-evolutionary games chosen by the retailers. First, (NN), i.e., both symmetrical retailers opt to play VN with the manufacturer. Second, (NS), i.e., Retailer 1 opts to play VN with the manufacturer while its opponent (i.e., the second representative symmetrical retailer) selects the RS strategy. Because the retailers are symmetric, swapping their chosen strategies gives the combination (SN), which makes no difference; these two scenarios are thus placed in the same strategy profile. Third, (SS), i.e., both retailers select the RS game. All the above strategy combinations and the corresponding decision sequences are shown in Fig. 1.

Fig. 1 Diagrams for different strategy combinations of two representative retailers and the corresponding decision-making sequence for each strategy combination

Table 2 Notations used to formulate the models

The representative retailers for each strategy combination engage in a nonzero-sum game that produces identical outcomes for any other pair of retailers drawn from the population. The relevant outcomes of all these strategy combinations are then explored analytically, as indicated by the non-evolutionary game model. The framework is then designed using the evolutionary game to investigate the long-term-based strategy used by most retailers. In accordance with Barron (2013), the evolutionary game model first derives a matrix of payoffs for the players based on the various strategies they employ in each strategy combination. Subsequently, the evolutionary behavior of players with respect to each strategy is determined by applying the replicator dynamics equation, under which the share of players adopting a strategy grows when that strategy’s payoff exceeds the population-average (system-wide) payoff.
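The payoff-matrix and replicator-dynamics steps described above can be illustrated with a minimal sketch for a symmetric two-strategy game. It assumes a single population in which a share x of retailers plays RS (S) and the rest play VN (N); the payoff numbers, the function names, and the explicit Euler time stepping are illustrative assumptions and do not come from the paper's equilibrium results.

```python
# Hypothetical payoffs: PAYOFF[(own, rival)] is the profit of a retailer
# playing `own` against a rival playing `rival`. The values are made up.
PAYOFF = {("N", "N"): 10.0, ("N", "S"): 8.0,
          ("S", "N"): 12.0, ("S", "S"): 9.0}


def replicator_rhs(x, payoff):
    """Return dx/dt, where x is the share of retailers playing S."""
    u_s = x * payoff[("S", "S")] + (1 - x) * payoff[("S", "N")]  # expected payoff of S
    u_n = x * payoff[("N", "S")] + (1 - x) * payoff[("N", "N")]  # expected payoff of N
    u_avg = x * u_s + (1 - x) * u_n                              # population average
    return x * (u_s - u_avg)                                     # replicator equation


def simulate(x0, payoff, dt=0.01, steps=5000):
    """Integrate the replicator equation with a simple explicit Euler scheme."""
    x = x0
    for _ in range(steps):
        x += dt * replicator_rhs(x, payoff)
        x = min(1.0, max(0.0, x))  # keep the share inside [0, 1]
    return x


final_share = simulate(0.1, PAYOFF)
```

With these made-up payoffs, S earns more than N against either rival, so the share playing S rises toward 1; a different payoff matrix can produce an interior or an all-N stable point.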

3.2 Notations and assumptions

The notations employed throughout this paper are detailed in Table 2. The following assumptions are made for building up the proposed models:

Assumption 1

Market demand is deterministic and linear in the greening level of the product, the retail prices of the product, and the retailers’ marketing efforts. Demand rises in response to a retailer’s own sales effort and the manufacturer’s commitment to environmentally friendly manufacturing, and declines in response to a rising own retail price. The demand function for retailer 1 is formulated as \(D_{ij}(P_{ij}, P_{ji}, A_{ij}, A_{ji}, \theta _{ij}) = a-\alpha P_{ij}+\beta P_{ji}+\gamma A_{ij}-\delta A_{ji}+\lambda \theta _{ij}\), where \(i,j \in {\{N\}} \) (for strategy combination 1) and \(i\in {\{N\}}\), \(j \in {\{S\}} \) or \(i\in {\{S\}}\), \(j \in {\{N\}} \) (for strategy combination 2) and \(i,j \in {\{S\}}\) (for strategy combination 3). The total demand function is given by \(D(P_{ij}, P_{ji}, A_{ij}, A_{ji}, \theta _{ij}) = D_{ij}+D_{ji} = 2a-\alpha (P_{ij}+P_{ji})+\beta (P_{ij}+P_{ji})+\gamma (A_{ij}+A_{ji})-\delta (A_{ij}+A_{ji})+2\lambda \theta _{ij}\). Similar to Kurata et al. (2007), we assume \(\alpha > \beta \), which implies that the influence of the self-price is higher than the effect of the cross-price, and also \(\gamma > \delta \), i.e., the effect of self-marketing effort is greater than that of the cross-marketing effort.
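As a quick check of Assumption 1, the sketch below evaluates the two retailers' linear demands and confirms that their sum matches the stated total-demand expression. The `DemandParams` container, the function names, and all numerical values are hypothetical and chosen only so that \(\alpha > \beta\) and \(\gamma > \delta\) hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemandParams:
    a: float      # base market size per retailer
    alpha: float  # own-price sensitivity
    beta: float   # cross-price sensitivity (alpha > beta)
    gamma: float  # own-effort sensitivity
    delta: float  # cross-effort sensitivity (gamma > delta)
    lam: float    # sensitivity to the greening level


def retailer_demand(p, price_own, price_rival, effort_own, effort_rival, theta):
    """Linear demand of one retailer, following Assumption 1."""
    return (p.a - p.alpha * price_own + p.beta * price_rival
            + p.gamma * effort_own - p.delta * effort_rival + p.lam * theta)


def total_demand(p, price_1, price_2, effort_1, effort_2, theta):
    """Sum of both retailers' demands (same greening level for both)."""
    d1 = retailer_demand(p, price_1, price_2, effort_1, effort_2, theta)
    d2 = retailer_demand(p, price_2, price_1, effort_2, effort_1, theta)
    return d1 + d2


# Hypothetical values, for illustration only.
params = DemandParams(a=100.0, alpha=2.0, beta=1.0, gamma=1.5, delta=0.5, lam=1.0)
D = total_demand(params, 30.0, 32.0, 4.0, 5.0, 3.0)

# Closed form from the text: 2a - (alpha - beta)(P1 + P2) + (gamma - delta)(A1 + A2) + 2*lam*theta
closed_form = (2 * params.a - (params.alpha - params.beta) * (30.0 + 32.0)
               + (params.gamma - params.delta) * (4.0 + 5.0) + 2 * params.lam * 3.0)
```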

Assumption 2

The customers return the used or deteriorated products to the manufacturer at a price \(p_0\). The total returned quantity \(D_R\) collected by the manufacturer is a fraction \(\tau \) of the total demand D, i.e., \(D_R=\tau D\). The manufacturer produces both the new products and the remanufactured products maintaining the same quality, and remanufacturing the used items is more profitable than producing new items, i.e., \(c_r<c_m\) (Savaskan et al. 2004; Taleizadeh et al. 2020).

Assumption 3

In order to maintain the profitability of all supply chain participants, we assume that \(P_{ij}\), \(P_{ji}>w>0\), \(A_{ij}\), \(A_{ji} > 0\) and \(c_m-c_r>p_0>0\). Further, we define \(C_0=c_m-c_r-p_0>0\) to simplify the calculations.

Assumption 4

Due to the government intervention in the two-echelon CLSC, the manufacturer receives subsidies for green production. We consider that the government provides an incentive \(s = k\theta _0 (\theta -\theta _0)\) per unit of green production to the manufacturer, where \(\theta _0\) and k are the floor (minimum) green level and the adjustment factor of the government’s subsidy, respectively. When \(\theta \ge \theta _0\), the subsidy received by the manufacturer is \(k\theta _0 (\theta - \theta _0)\); otherwise, the manufacturer pays a penalty of \(-k\theta _0 (\theta - \theta _0)\) (Zhu and Dou 2011; Yang and Xiao 2017).

Assumption 5

Similar to Mondal and Giri (2020), we assume that the carbon emission per unit of production depends on the greening level and is formulated as \(e = e_0 -\psi \theta \), where \(e_0\) is the basic emission. Here, for modeling simplicity, we assume that the carbon emissions due to the production of new items and the recycling of collected used items are the same. Therefore, the total emission is \(E_m = (D-D_R)e + D_R e = De =D(e_0 - \psi \theta )\). If \(E_m \ge E\), i.e., when the manufacturer’s carbon emission exceeds the carbon cap \(E\) set by the government, the manufacturer has to buy the shortfall in emission permits at a unit trading price of \(c_e\). Otherwise, the manufacturer can sell the surplus emission permits at the same trading price in the same emission trading market to earn some additional profit.
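The cap-and-trade mechanism in Assumption 5 reduces to a single signed cash flow, \(c_e (E - E_m)\): positive when surplus permits are sold and negative when a shortfall must be bought. The sketch below illustrates this reading; the function names and numerical values are hypothetical.

```python
def unit_emission(e0, psi, theta):
    """Emission per unit produced: e = e0 - psi * theta."""
    return e0 - psi * theta


def permit_cash_flow(demand, e0, psi, theta, cap, c_e):
    """Signed cash flow from permit trading, c_e * (E - E_m).

    Positive: total emission is below the cap and surplus permits are sold.
    Negative: total emission exceeds the cap and the shortfall is bought.
    """
    total_emission = demand * unit_emission(e0, psi, theta)
    return c_e * (cap - total_emission)


# Hypothetical values, for illustration only.
surplus_case = permit_cash_flow(demand=153.0, e0=2.0, psi=0.1, theta=3.0, cap=300.0, c_e=0.5)
shortfall_case = permit_cash_flow(demand=153.0, e0=2.0, psi=0.1, theta=3.0, cap=200.0, c_e=0.5)
```

Raising the greening level lowers the unit emission, so a greener product shifts the cash flow toward the selling side for a given demand and cap.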

Assumption 6

The manufacturer has to bear some additional cost to produce green products, which is given by the quadratic function \(\frac{1}{2} \eta \theta _{ij}^2\) in \(\theta _{ij}\). For marketing purposes, the retailer’s additional cost is assumed to be an increasing convex cost function \(\frac{1}{2} \xi A_{ij}^2\) in \(A_{ij}\) (Ghosh and Saha 2012; Mondal and Giri 2020; Hosseini-Motlagh et al. 2022).

3.3 Non-evolutionary game models under different strategy combinations

Here we discuss three different strategy combinations, namely (N, N), (N, S) or (S, N), and (S, S); the decision-making sequence for each strategy combination is depicted in Fig. 1d. To obtain the optimal decisions of the supply chain members, we derive the fitness (payoff) functions of the retailers for each of these non-evolutionary games. The equilibrium solutions of the different strategy combinations are summarized in Table 3.

3.3.1 Strategy combination 1 (NN)

For this first combination, we assume that both retailer 1 and its opponent opt to play the vertical Nash game with the manufacturer, and the corresponding two-echelon CLSC structure is shown in Fig. 1a. The profit functions of the supply chain members (i.e., the retailers and the manufacturer) can be derived as follows:

$$\begin{aligned}{} & {} \Pi _m^{ij} (\theta _{ij}) = (w + \psi _1 \theta _{ij} - \psi _2) \nonumber \\{} & {} \quad \{2a - (\alpha - \beta )(P_{ij} + P_{ji}) + (\gamma - \delta )(A_{ij}+A_{ji}) + 2\lambda \theta _{ij}\}\nonumber \\{} & {} \quad + Ec_e - \frac{\eta }{2} \theta _{ij}^2 \end{aligned}$$
(1)
$$\begin{aligned}{} & {} \Pi _r^{ij} (P_{ij}, P_{ji}, A_{ij}, A_{ji}) \nonumber \\{} & {} \quad = (P_{ij} - w)(a - \alpha P_{ij} + \beta P_{ji} + \gamma A_{ij} -\delta A_{ji} + \lambda \theta _{ij}) - \frac{\xi }{2} A_{ij}^{2} \end{aligned}$$
(2)

where \(\psi _1 = k\theta _0 + c_e\psi \), \(\psi _2 = c_m + k\theta _0^2 - c_0\tau + c_e e_0\) and \(i,j \in \{N\}\).

Thus the model is formulated as

$$\begin{aligned} \left\{ \begin{array}{l} \text{ To } \text{ obtain } ~P_{NN},A_{NN},\theta _{NN}, ~\text{ solve } \text{ simultaneously }\\ \underset{(\theta _{ij})}{\max }~\Pi _m^{ij} (P_{ij}, A_{ij}, \theta _{ij}),\\ \underset{(P_{ij},A_{ij})}{\max }~\Pi _r^{ij} (P_{ij}, A_{ij}, \theta _{ij})\\ \text{ and }\\ \underset{(P_{ji},A_{ji})}{\max }~\Pi _r^{ji} (P_{ji}, A_{ji}, \theta _{ji})\\ \end{array} \right. \end{aligned}$$

In this case, since both retailers choose the Nash game with the manufacturer, we solve the first-order conditions of the retailers’ and the manufacturer’s profit functions simultaneously. Precisely, we first find the retailers’ optimal responses \(P_{NN}(\theta _{ij})\) and \(A_{NN}(\theta _{ij})\) for both symmetrical retailers, taking \(\theta _{ij}\) as a parameter, and, in a similar manner, the common manufacturer’s optimal response \(\theta _{NN}(P_{ij}, P_{ji}, A_{ij}, A_{ji})\), taking \(P_{ij}, P_{ji}, A_{ij}\) and \(A_{ji}\) as parameters. Since the manufacturer serves the total demand of the two symmetrical retailers, it has to produce \(D = D_{ij} + D_{ji}\) units of the product, and thus its profit function depends on all the decision variables \(P_{ij}, P_{ji}, A_{ij}, A_{ji}\) and \(\theta _{ij}\). Therefore, substituting \(P_{NN}(\theta _{ij})\) and \(A_{NN}(\theta _{ij})\) into the manufacturer’s response \(\theta _{NN}(P_{NN}, A_{NN})\), we obtain \(\theta _{NN}\) and then \(P_{NN}\) and \(A_{NN}\), as described in Proposition 1.

Proposition 1

For the strategy combination 1 (NN), the profit functions of the manufacturer and the retailers are concave in the decision variables \(\theta _{NN}\), \(P_{NN}\) and \(A_{NN}\) under the conditions \(2\alpha \xi - \gamma ^2 > 0\) and \(4\lambda \psi _1 - \eta <0\). Consequently, the optimal retail price and sales effort of the retailer and the manufacturer’s best response on green innovation are given, respectively, by

$$\begin{aligned} \theta _{NN}= & {} \frac{X_1 - wX_2}{\Xi _1},\\ P_{NN}= & {} \frac{a\xi \Xi _1 + \lambda \xi X_1 + w\big [\Xi _1\{\alpha \xi - \gamma (\gamma - \delta )\} - \xi \lambda X_2\big ]}{\psi _3 \Xi _1},\\ A_{NN}= & {} \frac{\gamma \big [a\Xi _1 + \lambda X_1 -w\{\lambda X_2 + \Xi _1(\alpha - \beta )\}\big ]}{\psi _3 \Xi _1}, \end{aligned}$$

where \(\psi _3 = \xi (2\alpha -\beta ) - \gamma (\gamma - \delta )\), \(X_1 = 2(a\alpha \xi \psi _1 - \lambda \psi _2\psi _3)\), \(X_2 = 2\{\alpha \xi \psi _1(\alpha - \beta ) - \delta \psi _3\}\) and \(\Xi _1 = \eta \psi _3 - 2\lambda \psi _1(\psi _3 + \xi \alpha )\).

Proof

See Appendix A. \(\square \)
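The simultaneous solution described above can also be checked numerically. Under symmetry (\(P_{ij} = P_{ji} = P\), \(A_{ij} = A_{ji} = A\)), the three first-order conditions obtained from Eqs. (1) and (2) are linear in \((P, A, \theta )\), so a small linear solver is enough. The sketch below is not the derivation of Appendix A: the helper names and all parameter values are hypothetical and were chosen only so that the two concavity conditions of Proposition 1 hold.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def nash_equilibrium(a, alpha, beta, gamma, delta, lam, xi, eta, w, psi1, psi2):
    """Symmetric (N, N) solution (P, A, theta) of the three first-order conditions."""
    if not (2 * alpha * xi - gamma ** 2 > 0 and 4 * lam * psi1 - eta < 0):
        raise ValueError("concavity conditions of Proposition 1 are violated")
    matrix = [
        # dPi_r/dP = 0: (2*alpha - beta) P - (gamma - delta) A - lam*theta = a + alpha*w
        [2 * alpha - beta, -(gamma - delta), -lam],
        # dPi_r/dA = 0: gamma (P - w) - xi A = 0
        [gamma, -xi, 0.0],
        # dPi_m/dtheta = 0, written with P, A and theta on the left-hand side
        [2 * psi1 * (alpha - beta), -2 * psi1 * (gamma - delta), eta - 4 * lam * psi1],
    ]
    rhs = [a + alpha * w, gamma * w, 2 * psi1 * a + 2 * lam * (w - psi2)]
    return solve_linear(matrix, rhs)


# Hypothetical parameter values (not the Table 5 data).
params = dict(a=100.0, alpha=1.0, beta=0.5, gamma=0.6, delta=0.2, lam=0.5,
              xi=2.0, eta=50.0, w=40.0, psi1=2.0, psi2=30.0)
P, A, theta = nash_equilibrium(**params)
print(round(P, 4), round(A, 4), round(theta, 4))
```

Solving the conditions as one system mirrors the "solve simultaneously" step of the vertical Nash game; no player moves first.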

Now, using these decision variables \(P_{NN}, A_{NN}\) and \(\theta _{NN}\), we can calculate the profit functions

$$\begin{aligned} \Pi _m^{NN}= & {} 2(w + \psi _1 \theta _{NN} - \psi _2) \{a - (\alpha - \beta )P_{NN} \nonumber \\{} & {} \quad + (\gamma - \delta )A_{NN} + \lambda \theta _{NN}\} + Ec_e - \frac{\eta }{2} \theta _{NN}^2 \nonumber \\ \Pi _r^{NN}= & {} (P_{NN} - w) \{a - (\alpha - \beta ) P_{NN} \nonumber \\{} & {} \quad + (\gamma -\delta ) A_{NN} + \lambda \theta _{NN} \} - \frac{\xi }{2} A_{NN}^{2} \nonumber \\ \end{aligned}$$
(3)

3.3.2 Strategy combination 2 (NS)

In this strategy combination 2, we assume that retailer 1 decides to participate in the vertical Nash game, while its competitor opts to play the retailer Stackelberg game with the common manufacturer, as shown in Fig. 1b. The profit functions of the manufacturer and retailers are given by

$$\begin{aligned}{} & {} \Pi _m^{ij} (\theta _{ij}) = (w + \psi _1 \theta _{ij} - \psi _2) \nonumber \\{} & {} \quad \{2a - (\alpha - \beta )(P_{ij} + P_{ji}) + (\gamma - \delta )(A_{ij}+A_{ji}) + 2\lambda \theta _{ij}\} \nonumber \\{} & {} + Ec_e - \frac{\eta }{2} \theta _{ij}^2 \end{aligned}$$
(4)
$$\begin{aligned}{} & {} \Pi _r^{ij} (P_{ij}, P_{ji}, A_{ij}, A_{ji}) \nonumber \\{} & {} \quad = (P_{ij} - w) (a - \alpha P_{ij} + \beta P_{ji} + \gamma A_{ij} -\delta A_{ji} + \lambda \theta _{ij}) \nonumber \\{} & {} \qquad - \frac{\xi }{2} A_{ij}^{2}, \quad \text{where } i \in \{N\} \text{ and } j \in \{S\}. \end{aligned}$$
(5)

This model can be represented as

$$\begin{aligned} \left\{ \begin{array}{l} ~~~~~~~~~~~~~~~~\text{ To } \text{ obtain } ~P_{NS}, A_{NS}, \theta _{NS} ~\\ \text{ Retailer } \text{1(Nash } \text{ strategy) } ~~~~~~~\left\{ \begin{array}{l} \text{ solve } \text{ simultaneously }\\ \underset{(\theta _{ij})}{\max }~\Pi _m^{ij} (P_{ij}, A_{ij}, \theta _{ij}) ~,\\ \underset{(P_{ij},A_{ij})}{\max }~\Pi _r^{ij} (P_{ij}, A_{ij}, \theta _{ij},)\\ \end{array} \right. \\ ~~~~~~~~~~~~~~~~\text{ and } P_{SN}, A_{SN} \hbox {are obtained from}\\ \text{ Retailer } \text{2(Stackelberg } \text{ retailer) } \left\{ \begin{array}{l} \underset{(P_{ji},A_{ji})}{\max }~\Pi _r^{ji} (P_{ji}, A_{ji}, \theta _{ji},)\\ \text{ subject } \text{ to }\\ \underset{(\theta _{ji})}{\max }~\Pi _m^{ji} (P_{ji}, A_{ji}, \theta _{ji})~\\ \end{array} \right. \end{array} \right. \end{aligned}$$

This situation arises when one of the population’s representative retailers opts to make its decisions simultaneously with the manufacturer, while its rival adopts the strategy of dominating the market. The decision variables of the two retailers therefore differ from one another, but the manufacturer’s decision variable remains the same as in the (N, N) scenario. This is because we focus on retailer 1, i.e., the manufacturer’s green innovation (\(\theta _{ij}\)) is determined in accordance with the vertical Nash game in order to obtain the profit function of retailer 1. The rival’s Stackelberg choice must nevertheless be taken into account, since retailer 1’s profit depends not only on its own pricing and sales effort but also on those of its rival. Consequently, we first obtain \(P_{NS}\), \(A_{NS}\) and \(\theta _{NS}\) for retailer 1 by solving simultaneously, and then \(P_{SN}\) and \(A_{SN}\) for retailer 2 by backward induction (i.e., we first calculate the best response of the manufacturer, taking \(P_{ji}\) and \(A_{ji}\) as parameters, and then, given this optimal green innovation, we obtain the best response of retailer 2). The results are given in Proposition 2.

Proposition 2

For the strategy combination 2 (NS), if \(\max \big \{ \frac{\gamma ^2}{2\alpha },\,\frac{(3\gamma + \delta )^2}{8(3\alpha + \beta )}\big \}< \xi < \frac{\gamma (3\gamma + \delta )}{2(7\alpha - \beta )}\) and \(4\lambda \psi _1 - \eta <0\), the profit functions of the manufacturer, retailer 1 and retailer 2 are concave with respect to the decision variables \(\theta _{NS}\), \(P_{NS}\), \(A_{NS}\), \(P_{SN}\) and \(A_{SN}\), respectively. Subsequently, the optimal retail price and the sales effort of retailer 1 and its rival and the manufacturer’s best response on green innovation are given, respectively, by

$$\begin{aligned} \theta _{NS}= & {} \frac{X_1 - wX_2}{\Xi _1},\\ P_{NS}= & {} \frac{a\xi \Xi _1 + \lambda \xi X_1 + w\big [\Xi _1\{\alpha \xi - \gamma (\gamma - \delta )\} - \xi \lambda X_2\big ]}{\psi _3 \Xi _1},\\ A_{NS}= & {} \frac{\gamma \big [a\Xi _1 + \lambda X_1 -w\{\lambda X_2 + \Xi _1(\alpha - \beta )\}\big ]}{\psi _3 \Xi _1},\\ P_{SN}= & {} \frac{Y_1 + wY_2}{\Xi _2},\\ A_{SN}= & {} \frac{\psi _4 \big [ Y_1 + w(Y_2 - \Xi _2)\big ]}{\xi \Xi _2 (\eta - 4\lambda \psi _1)},\\ \end{aligned}$$

where \(\psi _4 = \gamma (\eta - 3\lambda \psi _1) - \delta \lambda \psi _1\), \(\psi _5 = \psi _4 (\gamma - \delta )(\eta - 2\lambda \psi _1)\), \(Y_1 = \xi (\eta - 4\lambda \psi _1)\{a(\eta - 2\lambda \psi _1) - 2\lambda ^2\psi _2\}\), \(Y_2 = \xi (\eta - 4\lambda \psi _1)\{\alpha (\eta - 3\lambda \psi _1) + \lambda (2\lambda - \beta \psi _1)\} - \psi _5\) and \(\Xi _2 = \xi (\eta - 4\lambda \psi _1)\{(2\alpha - \beta ) - \lambda \psi _1(5\alpha - \beta )\} - \psi _5\).

Proof

See Appendix A. \(\square \)

Now, we can derive the profit functions of the supply chain members using the decision variables \(P_{NS}\), \(P_{SN}\), \(A_{NS}\), \(A_{SN}\) and \(\theta _{NS}\) as

$$\begin{aligned} \Pi _m^{NS}{} & {} = (w + \psi _1 \theta _{NS} - \psi _2) \{2a - (\alpha - \beta )( P_{NS} + P_{SN}) \nonumber \\{} & {} \quad + (\gamma - \delta )(A_{NS} + A_{SN}) + 2\lambda \theta _{NS}\} + Ec_e - \frac{\eta }{2} \theta _{NS}^2 \nonumber \\ \Pi _r^{NS}{} & {} = (P_{NS} - w) \{a - \alpha P_{NS} + \beta P_{SN} + \gamma A_{NS} -\delta A_{SN} \nonumber \\{} & {} \quad + \lambda \theta _{NS} \} - \frac{\xi }{2} A_{NS}^{2} \nonumber \\ \end{aligned}$$
(6)

3.3.3 Strategy combination 2 (SN)

Mirroring the previous strategy combination, here the representative duopolistic retailer chooses to play the Stackelberg game, while its rival opts to play the vertical Nash game with the common manufacturer. The profit functions of the retailers and the manufacturer are given by

$$\begin{aligned}{} & {} \Pi _m^{ij} (\theta _{ij})= (w + \psi _1 \theta _{ij} - \psi _2) \nonumber \\{} & {} \quad \{2a - (\alpha - \beta )(P_{ij} + P_{ji}) + (\gamma - \delta )(A_{ij}+A_{ji}) + 2\lambda \theta _{ij}\}\nonumber \\{} & {} \quad + Ec_e - \frac{\eta }{2} \theta _{ij}^2 \end{aligned}$$
(7)
$$\begin{aligned}{} & {} \Pi _r^{ij} (P_{ij}, P_{ji}, A_{ij}, A_{ji}) \nonumber \\{} & {} \quad = (P_{ij} - w) (a - \alpha P_{ij} + \beta P_{ji} + \gamma A_{ij} -\delta A_{ji} + \lambda \theta _{ij}) \nonumber \\{} & {} \qquad - \frac{\xi }{2} A_{ij}^{2}, \quad \text{where } i \in \{S\} \text{ and } j \in \{N\}. \end{aligned}$$
(8)

This model can be represented as follows:

$$\begin{aligned} \left\{ \begin{array}{l} ~~~~~~~~~~~~~~~~\text{ To } \text{ obtain } ~P_{SN}, A_{SN}, \theta _{SN} ~\\ \text{ Retailer } \text{1(Stackelberg } \text{ retailer) } \left\{ \begin{array}{l} \underset{(P_{ij},A_{ij})}{\max }~\Pi _r^{ij} (P_{ij}, A_{ij}, \theta _{ij},)\\ \text{ subject } \text{ to }\\ \underset{(\theta _{ij})}{\max }~\Pi _m^{ij} (P_{ij}, A_{ij}, \theta _{ij}) ~,\\ \end{array} \right. \\ ~~~~~~~~~~~~~~~~\text{ and } \text{ to } \text{ obtain } P_{NS}\hbox { and }A_{NS},\\ \text{ Retailer } \text{2(Nash } \text{ Strategy) } ~~~~~~\left\{ \begin{array}{l} \text{ solve } \text{ simultaneously }\\ \underset{(\theta _{ji})}{\max }~\Pi _m^{ji} (P_{ji}, A_{ji}, \theta _{ji}) ~,\\ \underset{(P_{ji},A_{ji})}{\max }~\Pi _r^{ji} (P_{ji}, A_{ji}, \theta _{ji},)\\ \end{array} \right. \end{array} \right. \end{aligned}$$

Because the entire discussion concerns symmetrical retailers, swapping the strategies selected by the two retailers in the previous strategy combination affects only the manufacturer’s decision variable. More specifically, in this case the manufacturer’s decision variable \(\theta _{SN}\) is the same as in strategy combination 3 (SS), since it is determined by focusing on the Stackelberg strategy of retailer 1. The decision variables of the supply chain members are derived in Proposition 3.

Proposition 3

For the strategy combination 2 (SN), if \(\max \{ \frac{\gamma ^2}{2\alpha },~ \frac{(3\gamma + \delta )^2}{8(3\alpha + \beta )}\}< \xi < \frac{\gamma (3\gamma + \delta )}{2(7\alpha - \beta )}\) and \(4\lambda \psi _1 - \eta <0\), the profit functions of the manufacturer, retailer 1 and retailer 2 are concave in nature with respect to the decision variables \(\theta _{SN}\), \(P_{SN}\), \(A_{SN}\), \(P_{NS}\) and \(A_{NS}\), respectively. Subsequently, the optimal retail price and the sales effort of retailer 1 and its rival and the manufacturer’s best response on green innovation are given, respectively, as follows:

$$\begin{aligned} \theta _{SN}= & {} \frac{\left( \begin{array}{c} \psi _1\big [ 2\alpha \xi \Xi _2(\eta - 4\lambda \psi _1) - 2\xi (\alpha - \beta )(\eta - 4\lambda \psi _1)(Y_1 + wY_2)\\ +2(\gamma - \delta )\{Y_1 + w(Y_2 - \Xi _2)\} \big ] + 2\lambda (w - \psi _2)\xi \Xi _2 (\eta - 4\lambda \psi _1) \end{array} \right) }{\xi \Xi _2 (\eta - 4\lambda \psi _1)^2},\\ P_{SN}= & {} \frac{Y_1 + wY_2}{\Xi _2},\\ A_{SN}= & {} \frac{\psi _4 \big [ Y_1 + w(Y_2 - \Xi _2)\big ]}{\xi \Xi _2 (\eta - 4\lambda \psi _1)},\\ P_{NS}= & {} \frac{a\xi \Xi _1 + \lambda \xi X_1 + w\big [\Xi _1\{\alpha \xi - \gamma (\gamma - \delta )\} - \xi \lambda X_2\big ]}{\psi _3 \Xi _1},\\ A_{NS}= & {} \frac{\gamma \big [a\Xi _1 + \lambda X_1 -w\{\lambda X_2 + \Xi _1(\alpha - \beta )\}\big ]}{\psi _3 \Xi _1}. \end{aligned}$$

Proof

See Appendix A. \(\square \)

Now, using the decision variables \(P_{NS}\), \(P_{SN}\), \(A_{NS}\), \(A_{SN}\) and \(\theta _{SN}\), we can derive the profit functions of the supply chain members as

$$\begin{aligned} \Pi _m^{SN}{} & {} = (w + \psi _1 \theta _{SN} - \psi _2) \{2a - (\alpha - \beta )( P_{SN} + P_{NS}) \nonumber \\{} & {} \quad + (\gamma - \delta )(A_{SN} + A_{NS}) + 2\lambda \theta _{SN}\} + Ec_e - \frac{\eta }{2} \theta _{SN}^2 \nonumber \\ \Pi _r^{SN}{} & {} = (P_{SN} - w) \nonumber \\{} & {} \quad \{a - \alpha P_{SN} + \beta P_{NS} + \gamma A_{SN} -\delta A_{NS} + \lambda \theta _{SN} \} \nonumber \\{} & {} \quad - \frac{\xi }{2} A_{SN}^{2} \nonumber \\ \end{aligned}$$
(9)

3.3.4 Strategy combination 3 (SS)

For this strategy combination, we assume that both retailer 1 and its opponent opt to play the retailer Stackelberg game with the manufacturer, and the corresponding two-echelon CLSC structure is shown in Fig. 1c. The profit functions of the supply chain members (i.e., the retailers and the manufacturer) can be derived as follows:

$$\begin{aligned}{} & {} \Pi _m^{ij} (\theta _{ij})=(w + \psi _1 \theta _{ij} - \psi _2) \{2a - (\alpha - \beta )(P_{ij} + P_{ji}) \nonumber \\{} & {} \qquad \qquad \qquad \qquad + (\gamma - \delta )(A_{ij}+A_{ji}) + 2\lambda \theta _{ij}\}\nonumber \\{} & {} \quad \qquad \qquad \qquad + Ec_e - \frac{\eta }{2} \theta _{ij}^2 \end{aligned}$$
(10)
$$\begin{aligned}{} & {} \Pi _r^{ij} (P_{ij}, P_{ji}, A_{ij}, A_{ji})= (P_{ij} - w) (a - \alpha P_{ij} + \beta P_{ji} \nonumber \\{} & {} \quad + \gamma A_{ij} -\delta A_{ji} + \lambda \theta _{ij}) - \frac{\xi }{2} A_{ij}^{2}, \quad \text{where } i, j \in \{S\}. \end{aligned}$$
(11)

This model can be represented as:

$$\begin{aligned} \left\{ \begin{array}{l} ~~~~~~~~~~~~~~~~\text{ To } \text{ obtain } ~P_{SS}, A_{SS}, \theta _{SS} ~\\ \text{ Retailer } \text{1(Retailer } \text{ Stackelberg) } \left\{ \begin{array}{l} \underset{(P_{ij},A_{ij})}{\max }~\Pi _r^{ij} (P_{ij}, A_{ij}, \theta _{ij},)\\ \text{ subject } \text{ to }\\ \underset{(\theta _{ij})}{\max }~\Pi _m^{ij} (P_{ij}, A_{ij}, \theta _{ij}) ~,\\ \end{array} \right. \\ \text{ Retailer } \text{2(Stackelberg } \text{ retailer) } \left\{ \begin{array}{l} \underset{(P_{ji},A_{ji})}{\max }~\Pi _r^{ji} (P_{ji}, A_{ji}, \theta _{ji},)\\ \text{ subject } \text{ to }\\ \underset{(\theta _{ji})}{\max }~\Pi _m^{ji} (P_{ji}, A_{ji}, \theta _{ji}) ~,\\ \end{array} \right. \end{array} \right. \end{aligned}$$

In this non-evolutionary game strategy configuration, both competing retailers play the retailer Stackelberg game with the same manufacturer. Here, the optimal decision variables \(P_{SS}\), \(A_{SS}\) and \(\theta _{SS}\) can be evaluated for both the symmetrical retailers by using the backward induction method. More precisely, we first obtain the best response of the manufacturer by taking \(P_{ij}\) and \(A_{ij}\) as the parameters with respect to the decision variable \(\theta _{ij}\) and then w.r.t. this optimal green innovation, we obtain the best response of retailers for price setting and marketing effort. In a similar manner, we can evaluate the optimal decisions for its rival as they both play the retailer Stackelberg game in this situation. The optimal decision variables are obtained as given in Proposition 4.
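The first step of this backward induction can be written out explicitly. The following is a sketch obtained by differentiating Eq. (10) directly; it is not reproduced from Appendix A. Treating the prices and sales efforts as parameters, the manufacturer's first-order condition and the resulting best response are

$$\begin{aligned} \frac{\partial \Pi _m^{ij}}{\partial \theta _{ij}}&= \psi _1 \{2a - (\alpha - \beta )(P_{ij} + P_{ji}) + (\gamma - \delta )(A_{ij}+A_{ji}) + 2\lambda \theta _{ij}\} \\&\quad + 2\lambda (w + \psi _1 \theta _{ij} - \psi _2) - \eta \theta _{ij} = 0, \\ \theta _{ij}(P, A)&= \frac{\psi _1 \{2a - (\alpha - \beta )(P_{ij} + P_{ji}) + (\gamma - \delta )(A_{ij}+A_{ji})\} + 2\lambda (w - \psi _2)}{\eta - 4\lambda \psi _1}, \end{aligned}$$

with \(\partial ^2 \Pi _m^{ij} / \partial \theta _{ij}^2 = 4\lambda \psi _1 - \eta < 0\), which is the second concavity condition of the propositions. In the second step, each Stackelberg retailer anticipates this response, so that \(\partial \theta _{ij} / \partial A_{ij} = \psi _1(\gamma - \delta )/(\eta - 4\lambda \psi _1)\), and its first-order condition in the sales effort becomes

$$\begin{aligned} A_{ij} = \frac{\{\gamma (\eta - 3\lambda \psi _1) - \delta \lambda \psi _1\}(P_{ij} - w)}{\xi (\eta - 4\lambda \psi _1)}, \end{aligned}$$

assuming, as in the Nash case, that the retailer takes its rival's price and sales effort as given.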

Proposition 4

For the strategy combination 3 (SS), the profit functions of the supply chain members are concave in nature with respect to the decision variables \(\theta _{SS}\), \(P_{SS}\) and \(A_{SS}\), if \(\max \{ \frac{\gamma ^2}{2\alpha },~ \frac{(3\gamma + \delta )^2}{8(3\alpha + \beta )}\}< \xi < \frac{\gamma (3\gamma + \delta )}{2(7\alpha - \beta )}\) and \(4\lambda \psi _1 - \eta <0\). Subsequently, the optimal decision variables \(\theta _{SS}\), \(P_{SS}\) and \(A_{SS}\) of the supply chain members are obtained, respectively, as follows:

$$\begin{aligned} \theta _{SS}= & {} \frac{\left( \begin{array}{c} \psi _1\big [ 2\alpha \xi \Xi _2(\eta - 4\lambda \psi _1) - 2\xi (\alpha - \beta )(\eta - 4\lambda \psi _1)(Y_1 + wY_2)\\ +2(\gamma - \delta )\{Y_1 + w(Y_2 - \Xi _2)\} \big ] + 2\lambda (w - \psi _2)\xi \Xi _2 (\eta - 4\lambda \psi _1) \end{array} \right) }{\xi \Xi _2 (\eta - 4\lambda \psi _1)^2},\\ P_{SS}= & {} \frac{Y_1 + wY_2}{\Xi _2},\\ A_{SS}= & {} \frac{\psi _4 [ Y_1 + w(Y_2 - \Xi _2) ] }{\xi \Xi _2 (\eta - 4\lambda \psi _1)}.\\ \end{aligned}$$

Proof

See Appendix A. \(\square \)

Now, using these decision variables \(P_{SS}, A_{SS}\) and \(\theta _{SS}\), we obtain the profit functions of the manufacturer and the retailer as

$$\begin{aligned} \Pi _m^{SS}= & {} 2(w + \psi _1 \theta _{SS} - \psi _2) \{a - (\alpha - \beta )P_{SS} \nonumber \\{} & {} \quad + (\gamma - \delta )A_{SS} + \lambda \theta _{SS}\}+ Ec_e - \frac{\eta }{2} \theta _{SS}^2 \nonumber \\ \Pi _r^{SS}= & {} (P_{SS} - w) \{a - (\alpha - \beta ) P_{SS} + (\gamma -\delta ) A_{SS} \nonumber \\{} & {} + \lambda \theta _{SS} \} - \frac{\xi }{2} A_{SS}^{2} \end{aligned}$$
(12)

where \(A = a\Xi _1 + \lambda X_1\), \(B = \Xi _1 \{ \alpha \xi - \gamma (\gamma - \delta )\} - \xi \lambda X_2\), \(C = \lambda X_2 + \Xi _1(\alpha - \beta )\), \(D = \psi _1 \big [ 2\alpha \xi \Xi _2(\eta - 4\lambda \psi _1) - 2\xi (\alpha - \beta )(\eta - 4\lambda \psi _1)(Y_1 + wY_2) + 2(\gamma - \delta )\{Y_1 + w(Y_2 - \Xi _2)\} \big ]\), \(G = 2\lambda \Xi _2\xi (\eta - 4\lambda \psi _1)\).

3.4 Development and analysis of evolutionary game model

3.4.1 Evolutionary game theory

The foundation of EGT was laid by incorporating game theory (competition) into the Darwinian evolution process. An illustrative representation of the game system model is shown in Fig. 2. Analogous to natural selection in Darwinian evolution, the classical game-theoretic concept of competition is applied to select a strategy from the strategy set; the strategy that is ultimately adopted by the population is known as the evolutionary stable strategy (ESS). This central idea of EGT was introduced by J. Maynard Smith in 1973 and is described in his book ‘Evolution and the Theory of Games’ (John Maynard Smith 1982).

Fig. 2 Evolutionary game process

Table 3 Summary of equilibrium solutions of the non-evolutionary game for different strategy combinations

In short, the main difference between EGT and classical game theory is the dynamic nature of strategy selection. The important attributes of classical game theory are (a) static play (i.e., the players cannot change their selected strategy), (b) rationality of players and (c) complete sharing of information, whereas EGT is built on dynamic strategy selection and relaxes both the rationality of players and the complete sharing of information. Mathematically, an evolutionary game can be expressed as G = \(\{ I,S,\Pi \}\), where ‘I’, ‘S’ and ‘\(\Pi \)’ denote the set of players in the population, the set of strategies available to the players and the set of pay-off functions of the players for the different strategy combinations, respectively. Two fundamental concepts of evolutionary game theory are stated as follows:

(1) Evolutionary Stable Strategy (ESS) An evolutionarily stable strategy is a strategy that, once adopted by a population in a particular situation, cannot be displaced by an alternative strategy that is initially rare. Formally, with \(u(S, T)\) denoting the payoff of a player using S against an opponent using T, \(S^*\) is an ESS iff, for every \(S \ne S^*\), either of the following two conditions holds:

(a) \(u(S^*, S^*) > u(S, S^*)\);

(b) \(u(S^*, S^*) = u(S, S^*)\) and \(u(S^*, S) > u(S, S)\).

(2) Replicator Dynamics A replicator is an entity with the ability to create more or less accurate replicas of itself; replicator dynamics describes how the population shares of such replicators change over time. It is formulated as

$$\begin{aligned} \frac{\textrm{d}x_i}{\textrm{d}t}= & {} x_i \big [f_i(x) - {\bar{f}}(x)\big ], ~~~~~~~ {\bar{f}}(x) = \sum _{j=1}^{n} x_j f_j(x) \nonumber . \end{aligned}$$

where \(x_{i}\) is the fraction of the population that selects strategy i, x=(\(x_{1},\ldots ,x_{n}\)) is the vector describing the prevalence of the different strategies in the population, \(f_{i}(x)\) is the fitness (payoff) of type i (which depends on the population state), and \({\bar{f}} (x)\) is the average fitness (payoff) of the population.
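The replicator equation can be evaluated directly for any number of strategies. The sketch below is a minimal illustration, assuming that fitness comes from random pairwise matching against a payoff matrix, so that \(f_i(x) = \sum _j \pi _{ij} x_j\); this matrix form, the function names and the payoff values are assumptions made for the example.

```python
def matrix_fitness(payoff):
    """Return a fitness function f(x) with f_i(x) = sum_j payoff[i][j] * x_j."""
    def fitness(x):
        return [sum(p * xj for p, xj in zip(row, x)) for row in payoff]
    return fitness


def replicator_rhs(x, fitness):
    """Right-hand side dx_i/dt = x_i * (f_i(x) - f_bar(x)) for every strategy i."""
    f = fitness(x)
    f_bar = sum(xj * fj for xj, fj in zip(x, f))  # average fitness of the population
    return [xi * (fi - f_bar) for xi, fi in zip(x, f)]


# Hypothetical 2x2 payoff matrix and an evenly split population.
payoff = [[3.0, 1.0], [4.0, 2.0]]
x = [0.5, 0.5]
dx = replicator_rhs(x, matrix_fitness(payoff))
print(dx)
```

The components of `dx` sum to zero, so the shares keep adding up to one: a strategy gains share exactly when its fitness exceeds the population average.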

3.4.2 EGT model for symmetrical retailers

We use the EGT model to analyze the long-term behavior of the retailers and to determine the ESS through a replicator dynamic equation. To obtain a steady solution in EGT, we study the decentralized behavior (whether or not to dominate the market) of the retailer population in dealing with the manufacturer.

Here we assume that the retailer has two pure strategies.

\(S_{N}\): the strategy representing the retailer’s choice to play the vertical Nash game with the manufacturer.

\(S_{S}\): the strategy representing the retailer’s choice to play the retailer-led Stackelberg game with the manufacturer.

The fractions of the retailer population that select each strategy are given by

x: the proportion of retailers who adopt the strategy \(S_N\), \(0<x<1\).

1-x: the proportion of retailers who adopt the strategy \(S_S\).

Table 4 Payoff matrix of two symmetrical retailers

Now, since the game is quantitatively symmetric (i.e., symmetric with respect to the exact payoffs) for two retailers with two strategies (VN and RS), the payoff matrix of one retailer is the transpose of that of its rival, i.e., \( Y = X^t\), where X and Y are the payoff matrices of the retailer and its rival, respectively. For our EGT model, G = \(\{I, S, \Pi \}\), I indicates the representative retailers in the population, S = \(\{(x, 1-x): 0\le x\le 1\}\) is the set of the strategies, and \(\Pi \) represents the set of payoff functions.

From Table 4, the payoff matrix of retailer 1 is \(X = \begin{bmatrix} \Pi _{NN} &{} \Pi _{NS} \\ \Pi _{SN} &{} \Pi _{SS} \end{bmatrix}\), and that of retailer 2 is \(Y=X^t\). Since the fitness (payoff) function is evaluated through the profits under the different strategy combinations, the fitness of retailer 1 when it selects the vertical Nash strategy is given by

$$\begin{aligned} \textrm{ES}_N= & {} x \Pi _{NN} + (1-x) \Pi _{NS}. \end{aligned}$$
(13)

Similarly, the fitness when it chooses the retailer-led Stackelberg game is

$$\begin{aligned} \textrm{ES}_S= & {} x \Pi _{SN} + (1-x) \Pi _{SS}. \end{aligned}$$
(14)

Therefore, the average fitness of retailer 1 against all possible strategies of its rival is given by

$$\begin{aligned} {\overline{\textrm{ES}}}= & {} x\textrm{ES}_N + (1-x)\textrm{ES}_S = x \big [x\Pi _{NN} + (1-x) \Pi _{NS}\big ]\nonumber \\{} & {} +(1-x)\big [x\Pi _{SN} + (1-x)\Pi _{SS}\big ].\nonumber \\ \end{aligned}$$
(15)

Now, we use the replicator dynamic equation to study which strategy is chosen by most of the retailers in the population. The idea behind this method is that the share of individuals employing a strategy grows if the strategy generates an expected fitness greater than the average fitness.

By definition, the replicator dynamic equation is given by

$$\begin{aligned} \frac{\textrm{d}x(t)}{\textrm{d}t}= & {} x(t)[\textrm{ES}_{N}(t) - {\overline{\textrm{ES}}}(t)] \nonumber \\= & {} x(t)(1-x(t))(\textrm{ES}_{N}(t) - \textrm{ES}_{S}(t)) \nonumber \\= & {} x(t)(1-x(t))[(\Pi _{NN}-\Pi _{NS}-\Pi _{SN}+\Pi _{SS})x(t)+(\Pi _{NS}-\Pi _{SS})]. \end{aligned}$$
(16)

The critical points of this dynamical system are obtained as:

$$\begin{aligned} \left\{ \begin{array}{l} x=0;\\ x=1;\\ \text{ and }\\ x = \frac{\Pi _{SS} - \Pi _{NS}}{(\Pi _{NN} - \Pi _{SN})+(\Pi _{SS} - \Pi _{NS})} \end{array} \right. \end{aligned}$$
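The critical points can be verified by substituting them back into Eq. (16). The sketch below does this for the interior point; the function names and payoff values are hypothetical, and returning `None` when the interior point is undefined or lies outside (0, 1) is a convention chosen here.

```python
def replicator_2x2(x, p_nn, p_ns, p_sn, p_ss):
    """Right-hand side of Eq. (16) for the share x of retailers playing S_N."""
    return x * (1 - x) * ((p_nn - p_ns - p_sn + p_ss) * x + (p_ns - p_ss))


def interior_critical_point(p_nn, p_ns, p_sn, p_ss):
    """x* = (Pi_SS - Pi_NS) / ((Pi_NN - Pi_SN) + (Pi_SS - Pi_NS)).

    Returns None if the denominator vanishes or x* is not a valid interior share.
    """
    denom = (p_nn - p_sn) + (p_ss - p_ns)
    if denom == 0:
        return None
    x_star = (p_ss - p_ns) / denom
    return x_star if 0 < x_star < 1 else None


# Hypothetical payoffs with both differences positive: an interior point exists.
x_star = interior_critical_point(5.0, 1.0, 2.0, 4.0)
print(x_star, replicator_2x2(x_star, 5.0, 1.0, 2.0, 4.0))

# Hypothetical payoffs with differences of opposite sign: no interior point in (0, 1).
print(interior_critical_point(3.0, 1.0, 5.0, 2.0))
```

When the two payoff differences have opposite signs, only the boundary points x = 0 and x = 1 remain as admissible critical points.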

Proposition 5

For the two-person symmetric game \(X = \begin{bmatrix} \Pi _{NN} &{} \Pi _{NS}\\ \Pi _{SN} &{} \Pi _{SS} \end{bmatrix}\), satisfying the condition \((\Pi _{NN} - \Pi _{SN})(\Pi _{SS} - \Pi _{NS}) \ne 0 \), the evolutionary stable strategies (x) for the three possible cases are given below:

Case I, when (\(\Pi _{NN} - \Pi _{SN}\))(\(\Pi _{SS} - \Pi _{NS}\)) < 0, the game has one ESS. That ESS is x = 1, if \(\Pi _{NN} - \Pi _{SN} > 0\). In contrast, if \(\Pi _{NN} - \Pi _{SN} < 0\), then x = 0 is the ESS.

Case II, when (\(\Pi _{NN} - \Pi _{SN}\)) and (\(\Pi _{SS} - \Pi _{NS}\)) are both positive, then among the three Nash equilibria, x = 1 and x = 0 are ESS, but the mixed Nash strategy (\( \frac{\Pi _{SS} - \Pi _{NS}}{(\Pi _{NN} - \Pi _{SN})+(\Pi _{SS} - \Pi _{NS})}\), \(\frac{\Pi _{NN} - \Pi _{SN}}{(\Pi _{NN} - \Pi _{SN})+(\Pi _{SS} - \Pi _{NS})}\)) is not an evolutionary stable strategy.

Case III, when (\(\Pi _{NN} - \Pi _{SN}\)) and (\(\Pi _{SS} - \Pi _{NS}\)) both are negative, then the mixed Nash strategy (\(\frac{\Pi _{SS} - \Pi _{NS}}{(\Pi _{NN} - \Pi _{SN})+(\Pi _{SS} - \Pi _{NS})}\), \(\frac{\Pi _{NN} - \Pi _{SN}}{(\Pi _{NN} - \Pi _{SN})+(\Pi _{SS} - \Pi _{NS})}\)) is the evolutionary stable strategy.

Proof

Here the above differential equation

$$\begin{aligned} \frac{\textrm{d}x(t)}{\textrm{d}t}{} & {} = x(t)(1-x(t))[(\Pi _{NN}-\Pi _{NS}-\Pi _{SN}\\{} & {} \quad +\Pi _{SS})x(t)+(\Pi _{NS}-\Pi _{SS})]. \end{aligned}$$

can be solved in a manner similar to Johari et al. (2019), as obtained by Barron (2013). Then we get

$$\begin{aligned} \frac{Q^S}{x(t)^\frac{1}{\Pi _{SS}-\Pi _{NS}} \times (1-x(t))^\frac{1}{\Pi _{NN}-\Pi _{SN}}} = A e^t \end{aligned}$$

where \(Q = x(\Pi _{NN} - \Pi _{SN}) - (1-x)(\Pi _{SS} - \Pi _{NS})\); \(S = \frac{1}{\Pi _{NN}-\Pi _{SN}} + \frac{1}{\Pi _{SS}-\Pi _{NS}}\) and \(A>0\).

From this implicit solution, the proof of three cases of the proposition follows. \(\square \)

Corollary 1

In this model, (\(\Pi _{NN} - \Pi _{SN}\)) and (\(\Pi _{SS} - \Pi _{NS}\)) are of different signs. From Proposition 5, it is concluded that the possibilities of ESS are (I) x = 1 if the condition \((\Pi _{NN} - \Pi _{SN}) > 0\) holds, i.e., vertical Nash game is chosen by all the retailers in the population, and (II) x = 0 if the condition \((\Pi _{NN} - \Pi _{SN}) < 0\) holds, i.e., all the retailers in the population choose the retailer-led Stackelberg game to deal with the manufacturer.
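The case analysis of Proposition 5 and Corollary 1 can be sketched as a short classification routine. The function names and payoff values below are hypothetical. The local-stability helper is an independent cross-check added here (a critical point is treated as stable when the derivative of the right-hand side of Eq. (16) is negative there); it is not the implicit-solution argument used in the proof.

```python
def classify_ess(p_nn, p_ns, p_sn, p_ss):
    """Return the ESS values of x (share playing S_N) according to Proposition 5."""
    a = p_nn - p_sn  # Pi_NN - Pi_SN
    b = p_ss - p_ns  # Pi_SS - Pi_NS
    if a * b == 0:
        raise ValueError("Proposition 5 requires (Pi_NN - Pi_SN)(Pi_SS - Pi_NS) != 0")
    if a > 0 and b > 0:          # Case II: both pure strategies are ESS
        return [1.0, 0.0]
    if a < 0 and b < 0:          # Case III: only the mixed strategy is ESS
        return [b / (a + b)]
    return [1.0] if a > 0 else [0.0]  # Case I: opposite signs


def is_locally_stable(x, p_nn, p_ns, p_sn, p_ss):
    """Check that d/dx of the right-hand side of Eq. (16) is negative at x."""
    c = p_nn - p_ns - p_sn + p_ss
    b = p_ss - p_ns
    slope = (1 - 2 * x) * (c * x - b) + x * (1 - x) * c
    return slope < 0


# Hypothetical payoffs, one set per case.
for payoffs in [(3.0, 1.0, 5.0, 2.0), (5.0, 1.0, 2.0, 4.0), (2.0, 4.0, 5.0, 1.0)]:
    ess = classify_ess(*payoffs)
    print(payoffs, ess, [is_locally_stable(x, *payoffs) for x in ess])
```

The first payoff set has differences of opposite sign with \(\Pi _{NN} - \Pi _{SN} < 0\), which is situation (II) of Corollary 1, so the routine returns x = 0.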

4 Numerical simulation and sensitivity analysis

In this section, we conduct a numerical simulation of our proposed models to examine the effects of important parameters (such as the green sensitivity and different cost-sensitive parameters) on the behavior of the retailer population. Similar to Mondal and Giri (2020), Xu et al. (2016) and Yang and Xiao (2017), we consider the data, with some reasonable changes in parameter values, given in Table 5:

Table 5 Parameter values

According to the parameter settings, the self-price sensitivity (\(\alpha \)) and the self-marketing effort sensitivity (\(\gamma \)) are higher than the cross-price sensitivity (\(\beta \)) and the cross-marketing effort sensitivity (\(\delta \)), respectively, which is representative of actual practice. For the data set given in Table 5, our assumptions and the necessary conditions for the negative definiteness of the Hessian matrices are satisfied (\(c_0 = c_m - c_r - p_0 = 30 >0\)). The optimal outcomes and the decentralized behavior of the retailer population with the manufacturer are shown in Table 6.

Table 6 Comparison between different strategy combinations and the ESS

From Table 6, comparing the different strategy combinations, we infer that all supply chain participants maximize their payoffs when the representative duopolistic retailers in the population choose the third strategy combination (S, S). Additionally, we find that the evolutionary stable strategy is x = 0. In other words, the retailer-led Stackelberg strategy is adopted by \(100\%\) of the retailer population in the long term (i.e., the market is dominated by the retailer population’s stable strategy).

Comparing the decentralized strategies, when a retailer in the population uses the vertical Nash strategy, its retail price is higher (777.42 > 776.47) and its sales effort is lower (0.15445 < 0.15478) than under the retailer-led Stackelberg strategy, regardless of the choice made by its rival. Also, from Table 6, we conclude that the green innovation effort is higher under the retailer-led Stackelberg strategy than under the vertical Nash strategy (5.30105 > 5.29178), i.e., when the retailer dominates the market, the manufacturer makes a greater effort to develop an ecologically friendly product.

Our results also reveal that, regardless of how its competitor interacts with the manufacturer, a retailer generates more revenue when it chooses the RS strategy over the VN strategy. This validates the retailer population’s selection of the RS game.

Fig. 3 The replicator dynamic diagram for the decentralized behavior of the retailers’ population

Fig. 4 Trajectories of retailers’ population’s behavior under different initial conditions

As seen in Fig. 3, the retailer population opts for the retailer-led Stackelberg strategy in the long term. The arrows in the figure, drawn using the replicator dynamic equation, point toward (0, 1), i.e., \(x(t) = 0\) and \(1-x(t) = 1\), which gives the long-run frequency of the RS strategy in the population. Hence, the ESS is the retailer-led Stackelberg strategy in our proposed EGT model.

Figure 4 shows how the frequencies with which retailers in the population adopt the VN and RS strategies evolve from five initial conditions (i.e., at \(t = 0\)). Without any loss of generality, two of the initial conditions are taken at the boundaries, i.e., \(x(0) = 0\) and \(x(0) = 1\), and the other three lie in the interior, \(x(0) \in (0,1)\). From the graph, it can be deduced that the trajectories starting from the two boundary conditions remain at the boundaries, while the trajectories of the other three initial conditions progress toward the ESS \(x = 0\). More precisely, if 100% of the retailer population initially decides on either the VN or the RS strategy to deal with the manufacturer, then the population’s decisions do not change over time. However, if initially only a fraction of the population decides on the VN strategy, then the entire population dynamically moves toward the RS strategy.
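The qualitative behavior described for Fig. 4 can be reproduced with a few lines of code. The sketch below integrates Eq. (16) by the forward Euler method from five initial conditions. The payoffs are hypothetical values chosen only so that \(\Pi _{NN} - \Pi _{SN} < 0 < \Pi _{SS} - \Pi _{NS}\) (they are not the Table 6 results), and the integration scheme, step size and horizon are assumptions of this example.

```python
def simulate(x0, p_nn, p_ns, p_sn, p_ss, dt=0.01, steps=5000):
    """Integrate Eq. (16) with forward Euler and return the final share x(T)."""
    x = x0
    for _ in range(steps):
        growth = (p_nn - p_ns - p_sn + p_ss) * x + (p_ns - p_ss)
        x += dt * x * (1 - x) * growth
        x = min(1.0, max(0.0, x))  # keep the share inside [0, 1]
    return x


# Hypothetical payoffs (Pi_NN, Pi_NS, Pi_SN, Pi_SS).
payoffs = (3.0, 1.0, 5.0, 2.0)
initial_shares = [0.0, 0.2, 0.5, 0.8, 1.0]
final_shares = [simulate(x0, *payoffs) for x0 in initial_shares]
print(final_shares)
```

The two boundary starts stay where they are because the factor x(1 - x) vanishes there, while every interior start drifts toward x = 0, in line with the description above.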

4.1 Effect of price sensitivities

According to Fig. 5a–b, the retailer population always tends to zero, i.e., moves toward the retailer-led Stackelberg strategy as the ESS, to optimize its fitness, regardless of changes in the retailer’s self-price-sensitive demand (\(\alpha \)) and cross-price-sensitive demand (\(\beta \)). It is important to remark that the population’s rate of adopting the RS strategy first slows down as self-price sensitivity increases, but then gradually speeds up as \(\alpha \) increases further. In contrast, the population’s rate of adopting the RS strategy first increases with the rise in cross-price-sensitive demand, but as that demand continues to rise, it begins to slow down.

Although the majority of retailers favor the RS strategy, Fig. 6a shows that when the wholesale price (w) of the manufacturer increases, the speed of adoption of the strategy decreases.
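One way to read "speed of adoption" is through the payoff gap in the replicator equation. As an illustrative simplification (an assumption, not part of the model above), suppose the fitness advantage of RS over VN is a constant \(\Delta > 0\), where \(\Delta \) is a hypothetical symbol introduced only for this sketch. The replicator equation for the VN share \(x(t)\) then has a closed-form solution:

$$\begin{aligned} \frac{\textrm{d}x}{\textrm{d}t} = -\Delta \, x(1-x) \quad \Longrightarrow \quad x(t) = \frac{x(0)}{x(0) + (1-x(0))\, e^{\Delta t}}, \end{aligned}$$

so the time needed to fall from \(x(0)\) to a level \(\bar{x}\) is

$$\begin{aligned} T = \frac{1}{\Delta } \ln \frac{x(0)\,(1-\bar{x})}{\bar{x}\,(1-x(0))}. \end{aligned}$$

Under this simplification, a parameter change that narrows the payoff gap lengthens \(T\) without moving the ESS \(x = 0\), which would be consistent with the slower adoption observed as the wholesale price rises.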

4.2 Effect of sales effort sensitivities

From Fig. 5c–d, it is observed that, despite changes in the retailer’s self-sales effort level (\(\gamma \)) and cross-sales effort level (\(\delta \)), the population of retailers tends to zero, i.e., chooses the retailer-led Stackelberg strategy as the ESS for maximizing its fitness. More precisely, the rate at which the population adopts the RS strategy (to dominate the market) declines when the self-sales effort level rises, and increases when the cross-sales effort level rises.

Fig. 5 Effect of price and sales effort demand sensitivity on evolutionary game equilibrium

4.3 Effect of green sensitivities

4.3.1 Impact of green investment cost (\(\eta \)) on the game equilibrium

According to Fig. 6b, the population moves toward the evolutionary stable solution \(x = 0\) (i.e., the retailer Stackelberg strategy). However, as the manufacturer’s green investment cost increases, the population’s rate of adopting the RS strategy declines, which implies that the retailers adopt the strategy to dominate the market more slowly.

4.3.2 Impact of greening-level-sensitive demand (\(\lambda \)) on the game equilibrium

Figure 6c shows how the retailer population’s decentralized behavior depends on the green-level-sensitive demand (\(\lambda \)). The whole population’s long-term tendency is clearly to dominate the market, and this tendency is adopted dynamically, but its rate depends on \(\lambda \). The adoption of the RS strategy accelerates as demand for environmentally friendly merchandise increases, i.e., as consumer awareness of environmentally friendly products rises, so does the retailer’s propensity to dominate the market.

4.4 Effect of government intervention

Figure 6d indicates that the retailer population consistently approaches zero, i.e., selects the retailer-led Stackelberg strategy as the ESS for optimizing its profit, notwithstanding changes in the government’s subsidy for manufacturing a green product; only the rate of this tendency changes. The manufacturer enhances its green investment as a consequence of increased government subsidies and, based on our previous observations, retailers become quicker to adopt the dominant strategy (RS) dynamically.

4.5 Effect of cap and trade policy

Figure 6e shows that the retailer population is influenced by the carbon emission adjustment factor (\(\psi \)). The overall amount of carbon emissions depends on \(\psi \) and the basic emission rate (\(e_0\)). When the basic emission is fixed, the total carbon emissions decrease as the adjustment factor increases. As a consequence, when \(E_m \ge E\), the cost of purchasing the shortfall in carbon emission permits falls as \(\psi \) increases. If \(E_m < E\), an increase in \(\psi \) raises the profit from selling the surplus emission permits on the emission trading market. This effect of the cap and trade policy on the evolutionary game equilibrium is reflected in Fig. 6e: the retailer population’s adoption of the RS strategy becomes faster as \(\psi \) increases.

Furthermore, the game equilibrium, or evolutionary stable strategy, depends only marginally on the fraction (\(\tau \)) of the items that are remanufactured by the manufacturer, and the population’s propensity to dominate the market does not change with it.

Fig. 6 Effect of wholesale price, green sensitivity, government intervention and CTP on evolutionary game equilibrium

5 Model extension

This section extends the model developed in Sect. 4 under the following settings: (a) the game has two parties, the government (G) and the supply chain (SC); (b) both symmetrical representative retailers, who compete on price and sales effort, choose either the vertical Nash strategy or the retailer-led Stackelberg strategy; and (c) under intervention, the government pays a subsidy to the manufacturer and applies CTP; otherwise, no subsidy and no CTP are applied for green production. Similar to Long et al. (2021), we make the following assumptions: (i) A specific supervision cost, \(C_\textrm{g}\), is paid by the government when it intervenes in the supply chain of environmentally friendly products and employs a cap and trade policy. (ii) The governance savings \(U_1\) come from the manufacturer’s green manufacturing practices as well as the reduction in carbon emissions. (iii) The image advantage of the government from customers’ purchasing of environmentally friendly products under government involvement is represented by \(U_2\). (iv) \(U_3\) denotes the image loss brought on by environmental degradation as a result of traditional manufacturing methods.

To analyze the evolutionary approach for this extended model, we first need to obtain the payoff functions of the supply chain under both decisions of the government. Under government intervention, we have already evaluated the supply chain members’ decision variables in Propositions 1 and 4 when both retailers choose the VN strategy or the RS strategy. Therefore, using the decision variables for these two scenarios, it is simple to determine the total profit of the whole supply chain. We now assess the supply chain participants’ decision variables in the absence of government intervention (i.e., the situation where there is no subsidy and no CTP) using the following non-evolutionary game models. Considering the diverse strategies adopted by the entities, the payoff matrix for the two-party EGT is presented in Table 7.

5.1 Non-evolutionary game model without government intervention

We examine the payoff functions of the whole supply chain in two scenarios: first, when both representative retailers adopt the VN approach to interact with the manufacturer, and second, when both retailers choose the RS strategy.

First case (NN) Here, both symmetrical retailers make their decisions simultaneously with the common manufacturer under the no-cap and no-subsidy situation, i.e., no intervention (NI). In this case, the payoff functions of the supply chain entities are given by

$$\begin{aligned} \Pi _m^{N,NI} (\theta _{ii}) &= 2(w - c_m + \tau C_0) \{a - (\alpha - \beta )P_{ii} + (\gamma - \delta )A_{ii} + \lambda \theta _{ii}\} - \frac{\eta }{2} \theta _{ii}^2 \end{aligned}$$
(17)
$$\begin{aligned} \Pi _r^{N,NI} (P_{ii}, A_{ii}) &= (P_{ii} - w) (a - (\alpha - \beta ) P_{ii} + (\gamma -\delta ) A_{ii} + \lambda \theta _{ii}) - \frac{\xi }{2} A_{ii}^{2}, \quad \text {where } i \in \{N\}. \end{aligned}$$
(18)

Second case (SS) Here, both the symmetrical retailers decide to dominate the market, dealing with the common manufacturer under no cap and no subsidy situation, i.e., no intervention (NI). In this case, the payoff functions of the supply chain participants are obtained as follows:

$$\begin{aligned} \Pi _m^{S,NI} (\theta _{ii}) &= 2(w - c_m + \tau C_0) \{a - (\alpha - \beta )P_{ii} + (\gamma - \delta )A_{ii} + \lambda \theta _{ii}\} - \frac{\eta }{2} \theta _{ii}^2 \end{aligned}$$
(19)
$$\begin{aligned} \Pi _r^{S,NI} (P_{ii}, A_{ii}) &= (P_{ii} - w) (a - (\alpha - \beta ) P_{ii} + (\gamma -\delta ) A_{ii} + \lambda \theta _{ii}) - \frac{\xi }{2} A_{ii}^{2}, \quad \text {where } i \in \{S\}. \end{aligned}$$
(20)

For the preceding two instances, the solutions are described in Proposition 6 as follows.

Proposition 6

Under no government intervention, in both the above cases, the payoff functions of the manufacturer and the retailers are concave in relation to the decision variables \(\theta _{NN}\) (or \(\theta _{SS}\)), \(P_{NN}\) (or \(P_{SS}\)) and \(A_{NN}\) (or \(A_{SS}\)) provided that the condition \(2\alpha \xi - \gamma ^2 > 0\) holds. Consequently, the optimal price and the sales effort of the retailer, and the manufacturer’s best response on green innovation are the same for both cases and are given, respectively, by

$$\begin{aligned} \theta _{NN}&= \theta _{SS} = \frac{2\lambda }{\eta }(w - c_m + \tau C_0), \\ P_{NN}&= P_{SS} = \frac{\xi Z_1 + w Z_2}{\eta \psi _3}, \\ A_{NN}&= A_{SS} = \frac{\gamma (Z_1 + w Z_3)}{\eta \psi _3}, \end{aligned}$$

where \(\psi _3 = \xi (2\alpha -\beta ) - \gamma (\gamma - \delta )\), \(Z_1 = a\eta + 2 \lambda ^2(\tau C_0 - c_m)\), \(Z_2 = \xi (\eta \alpha + 2\lambda ^2) - \eta \gamma (\gamma - \delta )\) and \(Z_3=2\lambda ^2 - \eta (\alpha - \beta )\). The sales effort satisfies the retailer’s first-order condition \(\xi A_{ii} = \gamma (P_{ii} - w)\).

Proof

See Appendix B. \(\square \)
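As a quick numerical sanity check of Proposition 6, the sketch below solves the first-order conditions directly and compares the resulting price with the closed-form expression. All parameter values are hypothetical and chosen only to satisfy \(2\alpha \xi - \gamma ^2 > 0\); the own-demand form used for the retailer's first-order conditions, \(a - \alpha P_i + \beta P_j + \gamma A_i - \delta A_j + \lambda \theta \), is an assumption inferred from the concavity condition and is not restated in this section.

```python
# Hypothetical parameter values, chosen only so that 2*alpha*xi - gamma**2 > 0.
a, alpha, beta = 500.0, 3.0, 1.0          # market base, self/cross price sensitivity
gamma, delta = 2.0, 0.5                   # self/cross sales-effort sensitivity
lam, eta, xi = 1.5, 40.0, 10.0            # green sensitivity, green cost, effort cost
w, c_m, tau, C0 = 60.0, 30.0, 0.3, 10.0   # wholesale price, unit cost, tau, C_0

assert 2 * alpha * xi - gamma**2 > 0      # concavity condition of Proposition 6

# Manufacturer's first-order condition: 2*lam*(w - c_m + tau*C0) - eta*theta = 0.
theta = 2 * lam * (w - c_m + tau * C0) / eta

# Retailer's first-order conditions at a symmetric outcome (ASSUMED own demand
# D_i = a - alpha*P_i + beta*P_j + gamma*A_i - delta*A_j + lam*theta):
#   dPi/dP: a - (2*alpha - beta)*P + (gamma - delta)*A + lam*theta + alpha*w = 0
#   dPi/dA: gamma*(P - w) - xi*A = 0
# Substituting A = gamma*(P - w)/xi leaves one linear equation in P.
psi3 = xi * (2 * alpha - beta) - gamma * (gamma - delta)
P_foc = (xi * (a + lam * theta) + w * (xi * alpha - gamma * (gamma - delta))) / psi3
A_foc = gamma * (P_foc - w) / xi

# Closed-form price stated in Proposition 6.
Z1 = a * eta + 2 * lam**2 * (tau * C0 - c_m)
Z2 = xi * (eta * alpha + 2 * lam**2) - eta * gamma * (gamma - delta)
P_closed = (xi * Z1 + w * Z2) / (eta * psi3)

print(f"theta = {theta:.4f}, P (FOC) = {P_foc:.4f}, "
      f"P (closed form) = {P_closed:.4f}, A = {A_foc:.4f}")
```

Because the manufacturer's best response \(\theta \) does not depend on \(P\) or \(A\) when \(w\) is fixed, the same calculation applies whether the retailers move simultaneously (NN) or lead (SS), which is why the two cases coincide.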

When the government intervenes in the supply chain, its payoff is \(\Pi _G^{i,I}=-C_g + U_1 + U_2 -Ec_e - 2(ec_e - s)\big [a-(\alpha -\beta )P_{ii}+(\gamma -\delta )A_{ii}+\lambda \theta _{ii} \big ]\), where \(i \in \{N, S\}\). When the government does not intervene in the supply chain, its payoff is \(\Pi _G^{i,NI}=U_1 - U_3\), where \(i \in \{N, S\}\).
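The two government payoffs can be written as small helper functions, as in the sketch below. The function and argument names are illustrative, and the values used for the cap `E`, carbon price `c_e`, unit emission `e`, subsidy `s` and demand `D` in the usage lines are hypothetical; only \(C_\textrm{g}\), \(U_1\), \(U_2\) and \(U_3\) take the values reported in Sect. 5.3.

```python
def demand(P, A, theta, a, alpha, beta, gamma, delta, lam):
    """Symmetric per-retailer demand a - (alpha-beta)*P + (gamma-delta)*A + lam*theta."""
    return a - (alpha - beta) * P + (gamma - delta) * A + lam * theta


def gov_payoff_intervene(D, C_g, U1, U2, E, c_e, e, s):
    """Government payoff under intervention: -C_g + U1 + U2 - E*c_e - 2*(e*c_e - s)*D."""
    return -C_g + U1 + U2 - E * c_e - 2 * (e * c_e - s) * D


def gov_payoff_no_intervene(U1, U3):
    """Government payoff without intervention; it does not depend on demand."""
    return U1 - U3


if __name__ == "__main__":
    # C_g, U1, U2, U3 follow Sect. 5.3; D, E, c_e, e and s are hypothetical.
    with_i = gov_payoff_intervene(D=100.0, C_g=1000.0, U1=2000.0, U2=1100.0,
                                  E=50.0, c_e=2.0, e=0.5, s=3.0)
    without_i = gov_payoff_no_intervene(U1=2000.0, U3=1050.0)
    print(with_i, without_i)
```

Since the no-intervention payoff contains no demand term, it is identical for the N and S strategies of the supply chain.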

5.2 Evolutionary game model analysis

As mentioned before, there are two pure strategies for each player in the game model:

For the government,

I: the strategy where the government pays a subsidy for green production and applies CTP.

NI: the strategy where the government neither subsidizes green production nor applies CTP.

For the supply chain,

N: the strategy where both the retailers’ choice for dealing with the manufacturer is VN.

S: the strategy where both the retailers’ choice for dealing with the manufacturer is RS.

Let x (\(0\le x \le 1\)) denote the proportion of the supply chains’ population that plays N, while the remaining proportion \((1-x)\) plays S. Similarly, let y (\(0\le y \le 1\)) denote the proportion of governments that play strategy I, while the remaining proportion \((1-y)\) implements strategy NI. This assignment is the one used in Eqs. (21)–(28) and in the ESS (\(x = 0\), \(y = 1\)) reported in Sect. 5.3.

Table 7 Payoff matrix for the proposed two-party game

From Table 7, the expected payoffs of the supply chains that select N and S, and their average payoff against all possible strategies of the governments, are as follows:

$$\begin{aligned} f_1^N= & {} y \Pi _{SC}^{N,I} +(1-y) \Pi _{SC}^{N,NI} \end{aligned}$$
(21)
$$\begin{aligned} f_1^S= & {} y \Pi _{SC}^{S,I} +(1-y) \Pi _{SC}^{S,NI} \end{aligned}$$
(22)
$$\begin{aligned} f_1= & {} x f_1^N +(1-x) f_1^S \end{aligned}$$
(23)

The expected payoffs of the governments that choose I and NI, and their average payoff against all possible strategies of the supply chains, are formulated as follows:

$$\begin{aligned} f_2^I= & {} x \Pi _G^{N,I} + (1-x) \Pi _G^{S,I} \end{aligned}$$
(24)
$$\begin{aligned} f_2^{NI}= & {} x \Pi _G^{N,NI} + (1-x) \Pi _G^{S,NI} \end{aligned}$$
(25)
$$\begin{aligned} f_2= & {} y f_2^I + (1-y) f_2^{NI} \end{aligned}$$
(26)

According to the Malthusian model, the replicator dynamic equations of supply chains that choose N and governments that choose I are as follows:

$$\begin{aligned} \frac{\textrm{d}x}{\textrm{d}t}= & {} x(1-x) \big [ y(\Pi _{SC}^{N,I} - \Pi _{SC}^{S,I}) + (1-y)(\Pi _{SC}^{N,NI} - \Pi _{SC}^{S,NI})\big ] \end{aligned}$$
(27)
$$\begin{aligned} \frac{\textrm{d}y}{\textrm{d}t}= & {} y(1-y) \big [ x(\Pi _{G}^{N,I} - \Pi _{G}^{N,NI}) + (1-x)(\Pi _{G}^{S,I} - \Pi _{G}^{S,NI})\big ] \end{aligned}$$
(28)
(28)
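The step from the expected payoffs in Eqs. (21)–(26) to the factored replicator equations (27)–(28) can be checked numerically. In the sketch below the payoff entries are hypothetical placeholders rather than values from Table 7, and the function names are illustrative. As in Eqs. (21)–(28), x weights the supply chain strategies N and S, and y weights the government strategies I and NI.

```python
# Hypothetical payoff entries keyed by (supply-chain strategy, government strategy);
# they are placeholders, NOT values from Table 7.
PI_SC = {("N", "I"): 5.0, ("N", "NI"): 4.0, ("S", "I"): 6.0, ("S", "NI"): 4.0}
PI_G = {("N", "I"): 3.0, ("N", "NI"): 1.0, ("S", "I"): 3.5, ("S", "NI"): 1.0}


def expected_payoffs(x, y):
    """Eqs. (21)-(26): returns f1^N, f1, f2^I and f2."""
    f1_N = y * PI_SC[("N", "I")] + (1 - y) * PI_SC[("N", "NI")]
    f1_S = y * PI_SC[("S", "I")] + (1 - y) * PI_SC[("S", "NI")]
    f1 = x * f1_N + (1 - x) * f1_S
    f2_I = x * PI_G[("N", "I")] + (1 - x) * PI_G[("S", "I")]
    f2_NI = x * PI_G[("N", "NI")] + (1 - x) * PI_G[("S", "NI")]
    f2 = y * f2_I + (1 - y) * f2_NI
    return f1_N, f1, f2_I, f2


def replicator_from_fitness(x, y):
    """Malthusian form: a strategy's share grows with its payoff minus the average."""
    f1_N, f1, f2_I, f2 = expected_payoffs(x, y)
    return x * (f1_N - f1), y * (f2_I - f2)


def replicator_factored(x, y):
    """Factored form of Eqs. (27)-(28)."""
    dx = x * (1 - x) * (
        y * (PI_SC[("N", "I")] - PI_SC[("S", "I")])
        + (1 - y) * (PI_SC[("N", "NI")] - PI_SC[("S", "NI")])
    )
    dy = y * (1 - y) * (
        x * (PI_G[("N", "I")] - PI_G[("N", "NI")])
        + (1 - x) * (PI_G[("S", "I")] - PI_G[("S", "NI")])
    )
    return dx, dy


if __name__ == "__main__":
    for point in [(0.2, 0.7), (0.5, 0.5), (0.9, 0.1)]:
        print(point, replicator_from_fitness(*point), replicator_factored(*point))
```

The two functions agree because \(f_1^N - f_1 = (1-x)(f_1^N - f_1^S)\) and \(f_2^I - f_2 = (1-y)(f_2^I - f_2^{NI})\).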

Proposition 7

The equilibrium points of the system of Eqs. (27)–(28) are (0,0), (0,1), (1,0) and (1,1). When the conditions \((\Pi _{SC}^{N,I} - \Pi _{SC}^{S,I})(\Pi _{SC}^{N,NI} - \Pi _{SC}^{S,NI})<0\) and \((\Pi _{G}^{N,I} - \Pi _{G}^{N,NI})(\Pi _{G}^{S,I} - \Pi _{G}^{S,NI})<0\) are satisfied, the point (\(x^*\),\(y^*\)) is also an equilibrium point, where \(x^* = \frac{\Pi _G^{S,NI} - \Pi _G^{S,I}}{(\Pi _G^{N,I} - \Pi _G^{N,NI}) + (\Pi _G^{S,NI}-\Pi _G^{S,I})}\) and \(y^* = \frac{\Pi _{SC}^{S,NI} - \Pi _{SC}^{N,NI}}{(\Pi _{SC}^{N,I} - \Pi _{SC}^{N,NI}) + (\Pi _{SC}^{S,NI}-\Pi _{SC}^{S,I})}\). These values are obtained by setting the bracketed terms of Eqs. (28) and (27), respectively, to zero.

Proof

See Appendix B. \(\square \)

Further, to analyze the stability of the critical points, the trace and the determinant of the Jacobi matrix of the formulated dynamical system need to be determined. The Jacobi matrix (J) of the system is

$$\begin{aligned} J = \begin{bmatrix} (1-2x) \big [ y(\Pi _{SC}^{N,I} - \Pi _{SC}^{S,I}) + (1-y)(\Pi _{SC}^{N,NI} - \Pi _{SC}^{S,NI})\big ] & x(1-x) \big [ (\Pi _{SC}^{N,I} - \Pi _{SC}^{S,I}) - (\Pi _{SC}^{N,NI} - \Pi _{SC}^{S,NI})\big ] \\ y(1-y) \big [ (\Pi _{G}^{N,I} - \Pi _{G}^{N,NI}) - (\Pi _{G}^{S,I} - \Pi _{G}^{S,NI})\big ] & (1-2y) \big [ x(\Pi _{G}^{N,I} - \Pi _{G}^{N,NI}) + (1-x)(\Pi _{G}^{S,I} - \Pi _{G}^{S,NI})\big ] \end{bmatrix} \end{aligned}$$

Here, det (J) and tr (J) are the product and the sum of the eigenvalues of the matrix J, respectively. An equilibrium point is an ESS if det (J) \(> 0\) and tr (J)\(<0\). From Proposition 6 and the payoff functions of the government, we get \(\Pi _{SC}^{N,NI} = \Pi _{SC}^{S,NI}\) and \(\Pi _G^{N,NI} = \Pi _G^{S,NI}\). The first equality violates the first condition of Proposition 7, so (\(x^*,y^*\)) cannot be an equilibrium point for this model. Therefore, there is no mixed-strategy equilibrium for the evolutionary game.

Corollary 2

From Proposition 7, for the equilibrium points (0,0) and (1,0), the determinant of the Jacobi matrix (J) is zero. Hence, these two points cannot be the ESS of the EGT model. To explain it more precisely, the government always makes a profit when it offers subsidies and controls carbon emissions, regardless of how supply chain participants behave.
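The determinant and trace test can be illustrated at the four corner points with a short sketch. The payoff entries are hypothetical (not taken from Table 7); they are chosen only to respect \(\Pi _{SC}^{N,NI} = \Pi _{SC}^{S,NI}\) and \(\Pi _G^{N,NI} = \Pi _G^{S,NI}\), and the labels returned by `classify` are an illustrative convention.

```python
# Hypothetical payoffs keyed by (supply-chain strategy, government strategy).
# They satisfy Pi_SC^{N,NI} = Pi_SC^{S,NI} and Pi_G^{N,NI} = Pi_G^{S,NI}.
SC = {("N", "I"): 5.0, ("N", "NI"): 4.0, ("S", "I"): 6.0, ("S", "NI"): 4.0}
GOV = {("N", "I"): 3.0, ("N", "NI"): 1.0, ("S", "I"): 3.5, ("S", "NI"): 1.0}


def jacobian(x, y, sc=SC, gov=GOV):
    """Jacobi matrix of Eqs. (27)-(28) evaluated at (x, y)."""
    A = sc[("N", "I")] - sc[("S", "I")]
    B = sc[("N", "NI")] - sc[("S", "NI")]
    C = gov[("N", "I")] - gov[("N", "NI")]
    D = gov[("S", "I")] - gov[("S", "NI")]
    return (
        ((1 - 2 * x) * (y * A + (1 - y) * B), x * (1 - x) * (A - B)),
        (y * (1 - y) * (C - D), (1 - 2 * y) * (x * C + (1 - x) * D)),
    )


def classify(x, y, sc=SC, gov=GOV):
    """Label an equilibrium point using det(J) and tr(J)."""
    (j11, j12), (j21, j22) = jacobian(x, y, sc, gov)
    det = j11 * j22 - j12 * j21
    tr = j11 + j22
    if det > 0 and tr < 0:
        return "ESS"
    if det > 0 and tr > 0:
        return "unstable"
    if det < 0:
        return "saddle"
    return "degenerate"  # det == 0 (or tr == 0): the test is inconclusive


if __name__ == "__main__":
    for corner in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(corner, classify(*corner))
```

With these placeholder payoffs, the corners on the line \(y = 0\) have a zero determinant because the entry in the first row and first column reduces to \(\pm (\Pi _{SC}^{N,NI} - \Pi _{SC}^{S,NI}) = 0\), which mirrors the argument of Corollary 2.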

5.3 Evolution toward ESS

Following Long et al. (2021), we set the parameters, with some adjustments according to our basic model, as follows: \(C_\textrm{g}\) = 1000, \(U_1\) = 2000, \(U_2\) = 1100 and \(U_3\) = 1050. In this extended model, we observe that when there is pricing and sales effort competition among retailers, the long-term action of the government and the decentralized behavior of the CLSC tend dynamically toward the government intervention mode and the retailer-led Stackelberg strategy, respectively. The stability of the critical points is examined in Table 8. Figure 7 shows that the arrows head in the direction of the ESS (\(x = 0\) and \(y = 1\)), and Fig. 8 shows the trajectories of the entities’ behavior under different initial conditions.

Table 8 Stability analysis of different strategy combinations

More specifically, the extended model makes it evident that, in any circumstance, intervention is more lucrative for the government, and, analogous to the basic model, the retailer Stackelberg strategy is the best response for the whole population of retailers as well as the whole population of supply chains.
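The convergence toward (\(x = 0\), \(y = 1\)) can be sketched by integrating Eqs. (27)–(28) from several starting points. The payoff differences below are hypothetical and only mimic the qualitative structure described above (the supply chain's payoff difference vanishes without intervention, and intervention always pays for the government); the step size and horizon are also arbitrary choices.

```python
# Hypothetical payoff differences (NOT the paper's values).
D_SC_I, D_SC_NI = -1.0, 0.0  # Pi_SC^{N,.} - Pi_SC^{S,.} under I and under NI
D_G_N, D_G_S = 2.0, 2.5      # Pi_G^{.,I} - Pi_G^{.,NI} when the chain plays N or S


def step(x, y, dt):
    """One forward-Euler step of Eqs. (27)-(28)."""
    dx = x * (1 - x) * (y * D_SC_I + (1 - y) * D_SC_NI)
    dy = y * (1 - y) * (x * D_G_N + (1 - x) * D_G_S)
    return x + dt * dx, y + dt * dy


def trajectory(x0, y0, t_end=40.0, dt=0.01):
    """Integrate from (x0, y0) and return the final state."""
    x, y = x0, y0
    for _ in range(int(t_end / dt)):
        x, y = step(x, y, dt)
    return x, y


if __name__ == "__main__":
    for start in [(0.5, 0.5), (0.9, 0.1), (0.1, 0.9), (1.0, 0.0)]:
        print(start, "->", trajectory(*start))
```

Interior starting points approach (0, 1) in this sketch, whereas a start exactly at a corner such as (1, 0) stays put because both \(x(1-x)\) and \(y(1-y)\) vanish there.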

Fig. 7 Diagram for dynamical change of government and supply chain’s behavior

Fig. 8 Trajectories of government and supply chain’s behavior under different initial conditions

6 Conclusions

This study investigated a closed-loop green supply chain with retailers competing on pricing and sales effort, and a common manufacturer participating in a decentralized game with government intervention (government pays a subsidy to the manufacturer for green production and applies CTP for lower carbon emission). The problem of decentralized behavior of retailers in green CLSCs in the long run is addressed using evolutionary game theory. Taking the government and the supply chain as two parties in a game, the basic model is extended while keeping the other settings unchanged. Finally, the evolutionary stable behaviors of both the government and the supply chain are examined through numerical simulations.

The major contributions of our study are as follows: (1) The optimal decisions and the profit functions of the supply chain participants are first obtained analytically for the various strategy combinations of two representative retailers. The results show that a retailer will make more profit and exert more marketing effort if it chooses the retailer Stackelberg strategy while its competitor chooses the vertical Nash strategy, whereas in the opposite situation its retail price will be higher. (2) When the government intervenes in the supply chain by promoting green production with low carbon emissions, the population of retailers tends to favor the retailer Stackelberg strategy as its long-term and evolutionary stable decentralized strategy when dealing with the manufacturer. (3) In terms of a green CLSC, the supply chain as a whole can benefit more if the population of retailers, who invest in sales efforts, adopts the decentralized strategy of the retailer-led Stackelberg game when dealing with the manufacturer. However, in this scenario, the government needs to step in to support green production by subsidizing the supply chain and setting a cap on carbon emissions in the long run. (4) The numerical simulations show that changes in key parameters do not shift the long-term strategies of the government or retailers, but do affect the adoption rate of specific strategies. For instance, as the cost of green investment increases, the rate of strategy adoption by retailers decreases; however, as the sensitivity to the greening level increases, the adoption rate of the retailer-led Stackelberg strategy by retailers increases.

The results of this study provide valuable guidance and managerial insights for both governments and supply chain participants who aim to establish and manage sustainable closed-loop green supply chains and make decisions on appropriate decentralized power structures. First, government policies and subsidies can significantly influence the behavior of supply chain participants. As a result, it is crucial for supply chain managers to collaborate with the government to ensure that their operations align with the policies and regulations it enforces. Second, the adoption of specific strategies by retailers is influenced by various factors such as green investment costs, green-level sensitivity, and dominant strategies in the market. Supply chain managers should consider these factors carefully when developing their supply chain strategies. Third, the manufacturer’s investment in green initiatives can impact retailers’ behavior and the adoption of dominant strategies in the market. The dominant strategy in the market remains important in retailers’ decision-making. Supply chain managers should consider the impact of manufacturers’ green initiatives on the market, focus on identifying dominant strategies and consider their adoption when making decisions.

Like any other model, this study has some limitations, which leave ample scope for future extensions. First, the demand in our model is assumed to be deterministic and linear; a stochastic or nonlinear demand may be considered in the future. Second, one can incorporate both the decentralized and the centralized games between the representative duopolistic retailers in the population to analyze their dynamic nature. Third, this study considers one-party and two-party EGT models to evaluate the dynamic behavior of the supply chain entities’ population and the government population; a tri-party EGT model can be developed by introducing a third-party collector that accumulates used products from consumers. Using the EGT, one may also look into the dynamic behavior of the participants of the supply chain and decide on the best course of action. Last but not least, our model can be further extended to incorporate a manufacturer-led Stackelberg scenario into the decentralized behaviors.