1 Introduction

A constant increase in the number of academic publications has been reported in the domain of supply chain management (SCM) in recent years (Charvet et al. 2008; Rahman et al. 2011; Hochrein 2014). This trend might be due to the increasing diversification and specialization in the field of SCM observed over the last decades (Hochrein and Glock 2012). The fact that SCM practices may have a positive impact on the performance of the focal firm and its entire supply chain could have advanced this trend, and it has stimulated the interest of both scientists and practitioners in SCM-related topics (Kim 2006; Li et al. 2006; Tan et al. 1999; Hochrein et al. 2014).

To reduce the problems that result from the growing number of scientific publications and to help researchers keep track of scientific advances in the field, it is increasingly important to regularly analyze, evaluate, and synthesize existing works (Glock and Hochrein 2011; Tranfield et al. 2003). Practitioners can also benefit from surveys of the literature, as these studies help them to get an overview of the scientific state of the art.

In the area of SCM, a vast number of literature reviews (LRs) have been published in recent years. As will be shown below, these LRs vary in terms of quality and scope, and quite often several LRs have been published on the same topic. To assist researchers and practitioners in maintaining an overview of major topics that have been discussed in SCM research, and to compare and evaluate the methodologies of existing LRs in this area, a review of SCM LRs is essential. Such LRs of LRs have recently been published in closely related fields of SCM research (Brandenburg et al. 2014; Glock et al. 2014; Hochrein and Glock 2012; Spina et al. 2013).

The aim of this paper is to identify, analyze, evaluate, synthesize, and compare LRs in the SCM field. The following research questions (RQs) will be addressed in this article:

  • RQ1: Which methodologies have been applied in the SCM LRs?

  • RQ2: How can existing LRs in SCM be classified and systematized?

  • RQ3: Which SCM topics have been addressed and which research gaps can be identified?

The remainder of the paper is structured as follows: Sect. 2 briefly defines alternative types of LRs and classifies them with the help of a review taxonomy. Subsequently, several LRs of LRs are analyzed as reference studies to develop a classification framework for the work at hand. Section 3 defines the review process of this study and descriptively evaluates the sample of SCM LRs. Section 4 analyzes the topics covered in the identified LRs and evaluates them with respect to taxonomic and methodological criteria. Section 5 outlines strategies to identify research gaps at the review level and presents some suggestions for future research. Methodological recommendations for writing high-quality and methodologically sound systematic LRs in SCM research are proposed. Section 6 discusses the limitations of this review of LRs, and Sect. 7 summarizes the findings of this paper.

2 Characteristics of literature reviews

2.1 Secondary literature reviews and review taxonomy

The aim of stand-alone LRs is to identify and summarize primary studies on a specific topic. These LRs evaluate the strengths and weaknesses of the existing literature, try to resolve conflicting results, identify promising areas, and give recommendations for future research (Mentzer and Kahn 1995; Harland et al. 2006; Seuring and Müller 2007, 2008; Hochrein et al. 2014). Stand-alone LRs are important for the scientific community, as they are typically based on a precisely defined research question and serve no other purpose than analyzing and synthesizing a research field. Stand-alone LRs can be differentiated from second-type reviews, which are often used to introduce the reader to the topic of a primary study. Second-type LRs are excluded from this study.

LRs can further be differentiated into narrative literature reviews (NLRs), systematic literature reviews (SLRs), and meta-analyses (MAs). Although NLRs typically do not reveal detailed information on the review process, they have to exhibit at least a minimum level of adequacy and are therefore included in this survey. Results provided by SLRs are usually more reliable than those of NLRs, as SLRs avoid methodological errors by applying a well-planned review process and by analyzing all primary studies in a transparent, objective, and thus reproducible and unbiased way (Hochrein and Glock 2012; Hochrein 2014). Due to their representative and rigorous syntheses, SLRs lead to more reliable and more comprehensive results than NLRs (Tranfield et al. 2003; Cooper 2010). MAs are a special case of SLRs: they use quantitative methods for analyzing the extracted data and statistical techniques for synthesizing research findings. As MAs are usually based on SLRs conducted beforehand, we assign SLRs and MAs to the same category in this paper.

Apart from the terminology introduced above, the literature also discusses different classification schemes for LRs. A popular one is Cooper’s (1988) taxonomy of LRs, which differentiates LRs according to the criteria focus, goal, perspective, coverage, organization, and audience. These criteria are defined in Table 1 (see also Hochrein and Glock 2012).

Table 1 Modified taxonomy for characterizing LRs, adapted from Cooper (2010)

2.2 Tertiary literature reviews and evaluation criteria

Research, in general, can be differentiated into primary works (e.g., theoretical or conceptual articles or empirical surveys), secondary works (NLRs, SLRs, or MAs), and tertiary works (reviews of NLRs, SLRs, or MAs). Primary works are defined as independent research that can be based, for example, on newly collected data (Cooper 2010). Secondary works aggregate findings of a certain research field and often have a broader range and coverage than primary works (Brereton et al. 2007; Biolchini et al. 2007; Kitchenham et al. 2009, 2010). Tertiary studies are necessary to evaluate whether research in a certain area has been analyzed, summarized, and synthesized in a methodologically correct way (Becker and Oxman 2008). They evaluate secondary research with a strong emphasis on the methodology of the secondary studies and (if possible) on consolidating research findings of primary studies as reported in the secondary works.

LRs of LRs prevail in different fields of science and are an accepted approach to synthesize secondary studies. As synonyms for ‘tertiary studies’, the terms ‘review-review’, ‘meta-analytic synthesis’, ‘meta-review’, ‘umbrella review’, ‘overview of reviews’, ‘MA of MA’, or ‘(content) analysis of content analyses’ have been used (Becker and Oxman 2008; Cooper and Koenka 2012). A tertiary study systematically maps and/or synthesizes LRs, with the unit of analysis often being the data contained in the secondary studies, instead of data on the primary study level (Hochrein 2014). Tertiary studies are, in general, closely related to secondary research, as they use a similar methodology to synthesize existing research. It is important to note that a literature survey that evaluates primary studies based on an analysis of secondary studies does not qualify as a tertiary study, as the LRs are not the objects of such a survey (Cooper and Koenka 2012).

In the following, we present some selected tertiary studies that fall into the focus of Management Review Quarterly. Fettke (2006), for example, studied the popularity and the quality of LRs in the area of business informatics. vom Brocke et al. (2009) analyzed the methodological quality of LRs in the area of information systems. Kitchenham et al. (2009, 2010) evaluated the state-of-the-art of LRs in the area of software engineering and analyzed how the (methodological) quality of LRs developed over the years. An extension of this work is the one of da Silva et al. (2011). Cruzes and Dybå (2011), Hanssen et al. (2011), Marques et al. (2012), and Verner et al. (2014) also studied the methodology of LRs in the area of software engineering and development. Duriau et al. (2007) examined LRs in the field of organizational science and management research with respect to their focused research topics, data sources, and methodological refinements. Nelson and Kennedy (2009) evaluated the present state of MAs in environmental economics and proposed recommendations for future secondary research. Geyskens et al. (2009) reviewed 69 MAs in management research and showed that the decisions made in applying the MA methodology have a considerable impact on the conclusions derived from the analysis. Therefore, they developed a brief checklist of critical decisions that have to be made in conducting MAs, with a special focus on the statistical analysis of data contained in the sampled articles. Kirca and Yaprak (2010) analyzed how frequently MAs had been conducted in the international business literature. Besides presenting an overview of the research process of MAs, they evaluated the role MAs have played in the synthesis of research in international business, and developed guidelines for future MA applications. Similarly, Hochrein and Glock (2012) developed guidelines for conducting SLRs in purchasing research and explored the methodological rigor of works in this area. 
Glock et al. (2014) surveyed LRs in the field of lot sizing and gave an overview of the different research streams in this area. Spina et al. (2013) and Brandenburg et al. (2014) summarized previously published LRs in the areas of purchasing and SCM to substantiate their research questions and to justify the methodological background of their SLRs.

Seuring and Gold (2012) investigated the importance of LRs in SCM, critically analyzed SLRs in seven sub-fields of SCM, and identified strengths and shortcomings of the sampled articles. They considered works that appeared in the years 2000–2009 and evaluated a sample consisting of 22 articles. Kache and Seuring (2014) analyzed the constructs ‘SC collaboration/integration’ and ‘SC risk/performance’ in a tertiary study and based their research on a content analysis and a contingency analysis. The main difference between this tertiary study and the works of Seuring and Gold (2012) and Kache and Seuring (2014), apart from a different research focus, is that our work is more comprehensive than the other two studies in terms of sample size and steps applied during the literature search. In addition, we provide a more detailed analysis of the papers included in the sample. Online Resource 1 contains a comprehensive comparison of the paper at hand and the two related works of Seuring and Gold (2012) and Kache and Seuring (2014).

To analyze and synthesize LRs in the domain of SCM correctly and reliably, tertiary research needs to be based on an established methodology that enables the reader to reproduce sample generation and evaluation. For this reason, this survey builds on the works presented above and adopts a methodology that consolidates the different approaches used in these papers. Table 2 presents an overview of the methodologies used in these works, which were analyzed along two dimensions: literature search and selection, and dimensions of analysis.

Table 2 Comparison of selected tertiary studies

As Table 2 shows, the selected tertiary studies used different dimensions of analysis as well as diverse literature search and selection strategies. Given that a more sophisticated literature search and selection strategy is more likely to identify all relevant publications than a less sophisticated one, we combined all search and selection strategies recommended by the tertiary studies cited above. In addition, we selected all five dimensions of analysis presented in Table 2 for this study to identify interesting patterns in the publication of LRs in SCM research. The methodological criteria for SLRs are generalized and modified in Table 3 to facilitate assessing the quality of sample selection and its description as well as the quality of the search strategy.

Table 3 Evaluation criteria for the literature search, adapted from Hochrein and Glock (2012)

2.3 Dimensions of analysis and content categories of this tertiary study

This study uses the following dimensions for evaluating SLRs in SCM, which are based on the tertiary studies listed in Table 2:

  • A taxonomic classification of SLRs is presented, and the modified classification scheme of Cooper (2010) is applied as introduced in Table 1.

  • A critical analysis of the literature search and selection process is performed based on the criteria of Hochrein and Glock (2012) as introduced in Table 3.

  • The topics of the SLRs are analyzed based on the modified content categories of SCM research of Wolf (2008) as introduced in Online Resource 2.

To identify LRs in SCM, the term SCM needs to be defined precisely. If the definition is too narrow, potentially relevant LRs might be excluded from the analysis, while in the case of a too broad definition, irrelevant works could distort the results of this study. Therefore, we first draw on the thoroughly elaborated SCM definitions of Mentzer et al. (2001) and Stock and Boyer (2009) and then refer to the well-known SCOR model promoted by the Supply Chain Council. In a second step, we use content categories to evaluate in detail whether the identified LRs cover a SCM-related topic or not. For a classification of SCM-related literature along thematic categories, the schemata of Houlihan (1985), Cooper and Ellram (1993), Cooper et al. (1997a, b), Ganeshan et al. (1999), Tan et al. (1999), Croom et al. (2000), Tracey et al. (2004), Min and Mentzer (2004), Burgess et al. (2006), Cheng and Grimm (2006), Kouvelis et al. (2006), Schoenherr (2009), Melnyk et al. (2009), Talib et al. (2011), and Rahman et al. (2011) were checked in addition to the content categories of the leading SCM journals (cf. Online Resource 4 and for a similar approach Chicksand et al. 2012). This study uses the comprehensive list of content categories defined by Wolf (2008), as they (1) were developed based on a thorough analysis of existing content classification schemes, (2) contain a high number of categories (22 subjects), (3) are clearly defined and characterized by keywords that help in assigning literature to the respective categories, and (4) led to consistent results in the categorization of LRs in a pilot study that preceded this investigation as well as to a high degree of inter-coder reliability. 
Nevertheless, the classification scheme of Wolf (2008) had to be refined and extended to optimize the content framework (see also Seuring and Gold 2012; Kache and Seuring 2014), as the author considered only the results of primary studies and therefore ignored the two categories Mapping Studies (Map) and Industry Studies (Ind). The category Map has proved valuable in related tertiary studies (see Cruzes and Dybå 2011; da Silva et al. 2011; Kitchenham et al. 2009). Secondary reviews were assigned to the Map category if they did not focus on synthesizing evidence from a specific SCM-related (sub-)field, but rather provided a general overview of the subject area per se, often with the intention to systematize research streams, to identify research trends, to compare the research methods and techniques used, or to investigate a certain SCM journal or a theoretical stream in SCM research. The Map category is thus similar to the categories general SCM reviews and empirical SCM reviews proposed in Seuring and Gold (2012). In the following, we outline some examples of mapping studies to further substantiate this category (cf. Online Resource 3 and Case 3 in Sect. 5.2 for a complete overview): Croom et al. (2000) analyzed the SC literature and presented and applied an SCM literature classification framework. Sachan and Datta (2005) examined the current state of SCM research from a methodological point of view. Ho et al. (2002) analyzed the conceptualization, operationalization, and modelling of SCM with a special focus on empirical research. Spens and Kovács (2006) evaluated the application of different research approaches in SCM research. The general SLR of Burgess et al. (2006) also qualifies as a mapping study, as it reports on SCM research per se. Finally, the main reason for defining the Ind category was an increasing trend to conduct industry-specific (primary) research in SCM. 
For example, the management of automotive, bio-energy/biomass, healthcare, or food SCs has received increased attention in recent years (e.g., De Meyer et al. 2014; Dobrzykowski et al. 2014; Narayana et al. 2014; Ringsberg 2014), such that SCM concepts are used to solve industry-specific problems in these areas. LRs that review industry-specific primary studies were thus assigned to the Ind category. The category SM/P was excluded from the analysis, as it includes LRs that did not fit our cross-functional and process-oriented SCM perspective and that had recently been reviewed (Hochrein and Glock 2012). All in all, our category building and coding rules are in line with high-quality SLRs published in top SCM journals, as the iterative coding cycles (1) were performed independently by at least two researchers, and (2) relied on rules that were defined a priori (deductive category building) and adjusted during the coding process (inductive category refinement).

3 Literature review process of this tertiary study

The following section identifies and evaluates LRs in the domain of SCM. The search and analysis were guided by the tertiary studies discussed in Sect. 2.2 and by the process-oriented framework for secondary studies of Hochrein and Glock (2012), which adapted literature review methodologies to the specific needs of the SCM domain.

1. Problem formulation: According to the process-oriented framework, we defined research questions in the first step (cf. Sect. 1, RQ1 to RQ3) and then developed a review protocol. The review protocol (Step 1 of the search strategy) was designed along the following categories: (1) bibliographical data (author names and titles, year of publication, journal name including volume, issue, page numbers and (if any) references for search strategy), (2) type of review, (3) taxonomic categories and review characteristics (cf. Table 1), (4) attributes of the literature search (cf. Table 3), and (5) content categories including the general research topic (cf. Online Resource 2) as well as contribution and major findings. Furthermore, objective and consistent selection criteria as well as minimum requirements were defined to ensure both transparency and reproducibility of results (cf. Appendix). We then defined two groups of keywords, where group A contained SCM-related keywords and group B keywords related to review techniques. Each keyword from group A was then combined with each keyword from group B to generate the final keyword list.

2. Literature search and selection: Subsequently, the literature was searched for SCM-related LRs and relevant works were selected. The search strategy combined (1) a manual review of 35 selected journals, (2) a database search (ABI Inform (ABI) and Business Source Premier (BSP)), (3) a forward and backward snowball search, and (4) an expert consultation (see Appendix). The journal selection was based on SCM journal rankings (Zsidisin et al. 2007; Menachof et al. 2009), which were synthesized and classified into the groups purchasing and supply chain management (11 PSCM journals), operations research, management science, production and operations management (14 OR/MS/POM journals), international marketing management (5 IMM journals), and general management and strategy (5 GMS journals) with the help of the Harzing (2012) journal quality list. The manual review of the pre-selected journals resulted in 71 initial hits (25 relevant LRs), and the database search produced 102 hits in BSP (10 new and relevant LRs) and 101 hits in ABI (4 new and relevant LRs). The database search yielded only a few additional hits because of the comprehensive journal search that had been conducted beforehand. The expert consultation and the snowball approach identified 82 additional relevant LRs. In total, 121 relevant LRs were found, of which 66 could be classified as SLRs and 55 as NLRs. Our sample is thus larger than those of the tertiary studies discussed in Sect. 2.2. The search phase and the selection criteria used are documented in the Appendix. The inclusion/exclusion of a paper was based on an independent evaluation by two reviewers, who, in cases of doubt, discussed the relevance of an article to reduce subjectivity and to ensure consistency.

3. Data evaluation and analysis: In the following, the identified LRs were descriptively analyzed with respect to publication outlet and year of publication. A detailed evaluation is presented in Online Resource 4. Figure 1 illustrates the journal groups that published LRs in SCM (see also Online Resource 5). As can be seen, LRs appeared primarily in PSCM journals (approx. 45 %) and in OR/MS/POM journals (approx. 26 %).

Fig. 1 Distribution of literature reviews over the journal categories and types of reviews

Figure 2 gives an overview of the ten journals that published the highest number of LRs in SCM. It is interesting to note that most of these journals provide a special section for LRs, which is a possible reason for their popularity in publishing LRs in SCM research. Interestingly, Soni and Kodali (2011) and Wolf (2008) obtained complementary results in their analyses of publication outlets of primary SCM studies. A comparison of our results with those of Soni and Kodali (2011) shows that the journals European Journal of Operational Research (EJOR), International Journal of Operations and Production Management (IJOPM), International Journal of Logistics Management (IJLM), International Journal of Production Economics (IJPE), International Journal of Production Research (IJPR), and International Journal of Physical Distribution and Logistics Management (IJPDLM) are also popular outlets for (primary) empirical studies in SCM. Further, it is worth noting that Wolf (2008) provided a similar ranking of the journals IJLM, IJPDLM, IJPE, Journal of Business Logistics (JBL), and IJPR with respect to the number of SCM-related papers published in these journals.

Fig. 2 Analysis of journals with respect to the number of supply chain management LRs published

Figure 3 contains a chronological analysis of the LRs contained in our sample. It can be seen that the first NLR was published in 1994 and the first SLR in 1993 (considered time span: 1980–2011). This shows that the first SCM LRs were published approx. ten years after the SCM concept began to emerge (Cooper et al. 1997a). Figure 3 further illustrates that the number of reviews published per year has followed an increasing trend in recent years (cf. also Online Resource 4).

Fig. 3 Number of reviews and review types per year

Besides the growing importance of SCM, reasons for this trend could be the increasing number of papers published in SCM and several new SCM journals that started publishing in recent years. Figure 3 also conveys the impression that there is a somewhat stronger trend to publish SLRs instead of NLRs. One reason for this could be the methodological requirements of journals with respect to LRs. Even though SCM is still a rather young discipline (Gibson et al. 2005; Storey et al. 2006), the trend to publish an increasing number of SLRs in this field can be seen as a maturing process (cf. Melnyk et al. 2009). We can also see a strong increase in the publication of LRs after the year 2004, which indicates a growing interest in the domain of SCM. While one to five LRs were published per year in the period 1993–2004, we could identify more than ten LRs for each year between 2005 and 2011. Taking into account the time lag between the publication dates of primary works and secondary studies, these findings are in line with the results of Soni and Kodali (2011), who observed an increase in published (primary) empirical studies between 2002 and 2004.

4. Critical analysis, synthesis and interpretation: A statistical analysis in the form of an MA is not possible for our sample, as the heterogeneity of the secondary articles analyzed in this paper, with respect to both the RQs and the methodology used, prohibits the application of statistical methods (Tranfield et al. 2003). For this reason, we systematically examined the selected SLRs after having transferred the corresponding data into a standardized tableau. Subsequently, we synthesized results following the dimensions of conceptual evaluation as developed in Sect. 2.3: taxonomic classification (Cooper 2010), content classification (Wolf 2008), and critical comparison of research strategy (Hochrein and Glock 2012). The analysis and synthesis are combined with a quantitative interpretation of results in Sects. 4 and 5.

5. Presentation of results: This last step refers to writing the SLR itself, which includes presenting major results and potential topics for future research.
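The keyword construction in Step 1 and the sample tally in Step 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the keyword strings, the use of an AND operator to join them, and all function and variable names are assumptions that do not reproduce the lists actually used in this study (see Appendix). Only the counts per search technique are taken from Step 2.

```python
from itertools import product

# Hypothetical keyword groups (assumed for illustration only).
GROUP_A = ["supply chain management", "supply chain", "logistics"]  # SCM-related
GROUP_B = ["literature review", "meta-analysis", "state of the art"]  # review-related


def build_search_strings(group_a, group_b):
    """Pair every keyword of group A with every keyword of group B.

    Joining the two keywords with AND is an assumption about the database syntax.
    """
    return [f'"{a}" AND "{b}"' for a, b in product(group_a, group_b)]


# Relevant LRs contributed by each search technique, as reported in Step 2.
RELEVANT_LRS = {
    "manual journal review": 25,
    "database search (BSP)": 10,
    "database search (ABI)": 4,
    "expert consultation and snowball search": 82,
}


def total_sample(counts):
    """Sum the relevant LRs over all search techniques."""
    return sum(counts.values())


if __name__ == "__main__":
    strings = build_search_strings(GROUP_A, GROUP_B)
    print(len(strings), "search strings, first one:", strings[0])
    print("Total relevant LRs:", total_sample(RELEVANT_LRS))
```

With three keywords per group, the sketch produces nine search strings, and the tally reproduces the 121 relevant LRs (66 SLRs and 55 NLRs) reported above.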

4 Current state of literature reviews in supply chain management research

4.1 Taxonomic classification and assessment

The taxonomic classification of SLRs is based on Sect. 2.1 (see also Online Resource 6 for an individual classification of each SLR). Because classical content analyses are difficult to classify in light of Cooper’s taxonomy, 15 papers were excluded from our taxonomic classification. The results from classifying the remaining 51 SLRs are presented in Table 4.

Table 4 Taxonomic distribution and classification of the identified SLRs

The focus of the SLRs is nearly equally distributed, which shows that the domain has been analyzed with different intentions. Integration as a means of generalization as well as identification of central issues were the primary goals. Since only 15 articles discussed the limitations of their works, the perspective of the SLRs was most often an espousal of positions. For 40 SLRs, coverage was representative, and five SLRs could not be evaluated due to missing information on the selection of literature. The organization of the SLRs was mostly conceptual and/or methodological. 12 SLRs preferred an author-centric approach, and only two were structured. Most SLRs addressed specialized researchers (audience), whereas 30 articles also targeted practitioners (cf. Fettke 2006 for a similar form of illustration).

4.2 Comparison and critical discussion of the search strategies

This section evaluates and compares the search strategies used in the 66 SLRs according to Sect. 2.2 (cf. Online Resources 7 and 8 for a detailed report). 30 SLRs scanned pre-selected journals in a manual process (approx. 45 % of all SLRs), out of which 27 articles explicitly mentioned the journals that were selected. On average, 12 journals were checked manually. If journals were checked manually, the selection of journals was in most cases justified with the help of journal rankings or reference texts. 39 SLRs conducted a database search (approx. 59 % of all SLRs), out of which only 31 articles explicitly named the databases used. In many cases, more than two databases were used. ABI and EBSCO databases were preferred, whereas Scopus, Google Scholar, and Web of Science were employed less often. 45 SLRs used keywords (approx. 68 % of all SLRs), out of which 39 articles named the keywords explicitly. The snowball approach was applied by only 8 SLRs (approx. 12 % of all SLRs), and only as a backward search. While only one SLR applied all three search strategies explicitly, 6 SLRs combined manual and database searches. As a snowball search has to be combined with one of these two strategies, the remaining 59 SLRs did not use any further complementary search strategy.
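The shares reported in this paragraph follow directly from the counts. The following Python sketch recomputes them. The dictionary keys and function names are illustrative assumptions, and rounding to whole percentages is assumed to match the "approx." values given in the text.

```python
TOTAL_SLRS = 66

# Number of SLRs that applied each search element, as reported in Sect. 4.2.
APPLIED = {
    "manual journal search": 30,
    "database search": 39,
    "keywords": 45,
    "snowball search": 8,
}


def share(count, total=TOTAL_SLRS):
    """Percentage share of `count` in `total`, rounded to a whole number."""
    return round(100 * count / total)


def without_combination(total, all_three, manual_and_database):
    """SLRs that neither applied all three strategies nor combined manual and database searches."""
    return total - all_three - manual_and_database


if __name__ == "__main__":
    for name, count in APPLIED.items():
        print(f"{name}: {count} SLRs (approx. {share(count)} %)")
    print("No complementary strategy:", without_combination(TOTAL_SLRS, 1, 6))
```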

Figure 4 summarizes our findings regarding the search strategy, covering the criteria application (has the respective strategy been applied?), description (is the strategy appropriately described?), and justification (is the use of the strategy justified?). Looking at the length of the articles, we obtain an average of 26 pages, with the number of pages varying from 11 to 199. Only 20 SLRs substantiated their methodology by referring to the relevant literature, and only 31 SLRs analyzed the primary studies they identified chronologically. 55 SLRs explicitly stated the final sample size, and only 54 SLRs provided the years of publication they covered in their review.

Fig. 4 Comparison of search strategies applied in SLRs

4.3 Comparison and critical reflection of the content categories

The content classification of the identified LRs is based on Sect. 2.3. The findings are illustrated in Fig. 5. For 21 LRs, the assignment was somewhat ambiguous, i.e., they could have been assigned to a different category as well (see also Online Resource 9).

Fig. 5 Content classification of the identified LRs, NLRs, and SLRs

The LRs are distributed over 17 out of 22 possible categories. The top five categories cover 64.5 % of all LRs we identified. Map represents the largest content group, covering 29 LRs. In these papers, SCM is discussed as a general research topic and discipline per se, which is why 7 content analyses were assigned to the Mapping Studies category (see Online Resource 2). Since a thematic analysis requires a systematic approach, SLRs are dominant. Another large group is IT/E-B, covering 16 LRs. Wolf (2008) and Soni and Kodali (2011) described this category as especially important at the primary level. The CLSC/EnvP group embraces 12 LRs and seems to be gaining in importance. The category PM/RS contains 11 LRs and is ranked fourth, whereas in the work of Soni and Kodali (2011), it was ranked first with respect to the number of published primary studies. LeanSCM covers 10 LRs, a category to which Wolf (2008) also assigned a high level of relevance. Similarly, Soni and Kodali (2011) emphasized the importance of SC integration. We could not assign any LRs to the five categories M/Sal, PowRI, LA, OrgS/P, and DCM.
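The share of the top five categories can be verified from the category counts given in this section. The short Python sketch below does so. The variable and function names are illustrative assumptions, while the counts and the total of 121 LRs are taken from the text.

```python
# LRs per category for the five largest content categories (Sect. 4.3).
TOP_FIVE = {
    "Map": 29,
    "IT/E-B": 16,
    "CLSC/EnvP": 12,
    "PM/RS": 11,
    "LeanSCM": 10,
}
TOTAL_LRS = 121  # size of the full sample (66 SLRs and 55 NLRs)


def top_share(counts, total):
    """Percentage of all LRs covered by the given categories, rounded to one decimal."""
    return round(100 * sum(counts.values()) / total, 1)


if __name__ == "__main__":
    print(sum(TOP_FIVE.values()), "LRs in the top five categories")
    print(f"Share of all LRs: {top_share(TOP_FIVE, TOTAL_LRS)} %")
```

The five categories together contain 78 LRs, which corresponds to the 64.5 % stated above.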

5 Research agenda for future supply chain management literature reviews

5.1 Taxonomic and methodological lessons learned

To cope with the increasing prominence of secondary studies in the domain of SCM, we systematically assessed methodological issues and taxonomic concerns in Sects. 4.1 and 4.2. Based on the descriptive analysis, we present some lessons learned for future SCM SLRs in the following (cf. Seuring and Gold 2012; Hochrein and Glock 2012 for similar approaches).

Lesson learned 1 (link to previous secondary studies): Thoroughly analyzing the scientific background of closely related publications helps in positioning one's own work in the existing literature and in explaining why the developed RQs are important. Although this holds true for all types of research, the analysis of our sample showed that LRs did not always consider previously published secondary studies in detail, even though they were often of high relevance to the respective works. Only a few secondary studies accurately cited related LRs (e.g., Giunipero et al. 2008; Soni and Kodali 2011) to avoid a complete or partial overlap in the overviews. In particular, LRs that use the same papers as earlier LRs, or samples very similar to theirs, should indicate that certain studies may be overrepresented (cf. Fabbe-Costes and Jahre 2007, 2008). Interestingly, several recently published LRs demonstrated that an accurate link to previous LRs helps to substantiate the relevant research gaps (cf. Kunz and Reiner 2012; Spina et al. 2013; Brandenburg et al. 2014; Kache and Seuring 2014). We conclude that it is important to provide information on closely related LRs, especially when data from primary studies have been used more than once.

Lesson learned 2 (link to methodological references): Online Resource 7 shows that 20 SLRs explicitly referred to works their methodology was adapted from. A vast number of LRs, in contrast, did not report relevant methodological details at all (i.e., the NLRs). As LRs often raise the claim to guide future research, a transparent description of their methodology is a minimum requirement to claim validity of the results. We therefore recommend including references to a generic process model (e.g., Tranfield et al. 2003; Mayring 2008; Cooper 2010), to a SCM-specific secondary SLR (cf. Online Resources 7 and 8), or to a (SCM-specific) tertiary study (cf. Table 2 and the paper at hand). The methodological limitations of some earlier works also provide opportunities for future research, as SCM researchers could draw on the NLRs discussed in this tertiary study and improve their methodological foundation. As the identified 55 NLRs provide only a partial overview of the field, their systematic improvement could be a valuable starting point for researchers interested in conducting new, methodologically sound SLRs.

Lesson learned 3 (literature search process): Section 4.2 revealed significant differences in the applied search strategies (cf. Online Resources 7 and 8 for details), and points to a strong need for better inter-subjective verifiability as postulated by Duriau et al. (2007), Hochrein and Glock (2012), and Seuring and Gold (2012). To improve the quality and rigor of the data collection process, the following guidelines that were derived from our analyses may be useful: First, the search strategy should be more thoroughly documented and accurately described, given that the search techniques used may restrict the validity of results. A protocol of the search strategy should report inclusion/exclusion criteria (e.g., time span, language), keywords, scholarly databases used, search techniques (e.g., manual search of key journals, snowball search, or electronic citation tracking), intermediate search results, and final sample size. The Appendix is suggested as an idealized documentation form. Secondly, the literature search should minimize the risk of biases and maximize the chance that all relevant primary studies are identified. Using meta-search engines (e.g., MetaLib) that access and compare the highly relevant scholarly databases and aggregate the results into a single list may help to save time (cf. Giménez and Tachizawa 2012). Thirdly, Sect. 4.2 showed that only a few papers have combined different (complementary) search strategies in the past. SCM scholars are thus strongly recommended to combine (1) a manual focused search of major journals (providing justification of journal selection, e.g., by using journal rankings) with (2) an open search of (at best more than one) renowned scholarly database by applying well-defined search strings. In addition, we recommend (3) working through the reference lists of all previously selected articles (this was only done in eight of the articles in our sample, cf. Online Resource 8) and, if relevant, searching through previously published LRs on the focused topic (backward snowball search); (4) conducting a forward snowball search by checking citation indices (to identify recently published studies); and (5) contacting and consulting experts in the field (cf. Sect. 3).
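
The protocol items named in this lesson (inclusion/exclusion criteria, keywords, databases, search techniques, intermediate results, and final sample size) could be recorded in a simple structured form. The following Python sketch is purely illustrative: the class name, the field names, and all example values are hypothetical and are not taken from the Appendix or from any of the reviewed LRs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SearchProtocol:
    """Hypothetical record of a literature search (illustrative field names)."""
    time_span: Tuple[int, int]          # inclusion criterion: first and last year
    languages: List[str]                # inclusion criterion: publication languages
    keywords: List[str]                 # search strings applied to the databases
    databases: List[str]                # scholarly databases used
    techniques: List[str]               # e.g. manual journal search, snowballing
    stage_counts: Dict[str, int] = field(default_factory=dict)  # intermediate results

    def record_stage(self, stage: str, n_articles: int) -> None:
        """Store the number of articles remaining after a search or screening stage."""
        if n_articles < 0:
            raise ValueError("article count cannot be negative")
        self.stage_counts[stage] = n_articles

    def final_sample_size(self) -> int:
        """Return the count of the most recently recorded stage."""
        if not self.stage_counts:
            raise ValueError("no stage has been recorded yet")
        return list(self.stage_counts.values())[-1]

    def missing_items(self) -> List[str]:
        """List protocol items that were left empty and still need documenting."""
        items = {
            "languages": self.languages,
            "keywords": self.keywords,
            "databases": self.databases,
            "techniques": self.techniques,
            "stage_counts": self.stage_counts,
        }
        return [name for name, value in items.items() if not value]


# Hypothetical example values, chosen only to show the structure.
protocol = SearchProtocol(
    time_span=(2000, 2010),
    languages=["English"],
    keywords=['"supply chain" AND "literature review"'],
    databases=["Database A", "Database B"],
    techniques=["manual journal search", "database search",
                "backward snowball search", "forward snowball search",
                "expert consultation"],
)
protocol.record_stage("database hits", 500)
protocol.record_stage("after title/abstract screening", 90)
protocol.record_stage("after full-text screening", 42)
```

Keeping the intermediate counts per stage makes it possible to report how the sample narrowed at each step, which is the kind of traceability the lesson asks for.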

Lesson learned 4 (article selection): An objective, valid, and reliable article selection process is imperative for secondary studies to limit potential biases within the included and across the excluded studies. Therefore, clear coding rules should be defined to permit transparent decisions from the outset, and relevance tests of articles should be done by at least two coders, who should assess the papers independently of each other (cf. Duriau et al. 2007; Carter and Ellram 2003; Frankel et al. 2005). As some of the review papers included in our sample were written by only a single author, and as these authors in some cases did not mention that other reviewers were involved in discussions on the inclusion/exclusion of articles, we conclude that for these papers, inter-rater reliability could not be achieved, which might be a weakness of these works. In line with that, Schoenherr (2009) stated that a limitation of his work might be that his thematic classification was only conducted by a single author, such that inter-rater reliability could not be achieved. Tangpong (2011), in contrast, overcame potential single author biases by involving additional coders.
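
One common way to quantify the agreement of two independent coders is Cohen's kappa, which corrects the observed share of matching decisions for the agreement expected by chance. The text above does not prescribe a particular statistic, so the choice of kappa is an assumption made here for illustration; the function name and the example decisions below are hypothetical.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who rated the same articles.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each coder's label shares.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of articles")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical include/exclude decisions of two coders for eight candidate articles.
coder_1 = ["include", "include", "exclude", "exclude",
           "include", "exclude", "include", "exclude"]
coder_2 = ["include", "exclude", "exclude", "exclude",
           "include", "exclude", "include", "include"]
kappa = cohen_kappa(coder_1, coder_2)
```

In this made-up example the coders agree on six of eight articles, but because half of that agreement would be expected by chance, the resulting kappa is noticeably lower than the raw agreement rate.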

Lesson learned 5 (data extraction via content categories): Closely related to the quality of article selection, the data of the identified primary articles should also be extracted and analyzed by several evaluators working independently of each other, to reduce the biases that subjectivity and chance introduce when only a single evaluator is involved. To minimize subjectivity in the coding process and to avoid inconsistencies in reporting research results, the objectivity, validity, and reliability of results should be broadly discussed, for example as in Spens and Kovács (2006) or in Tangpong (2011). Consistent content categories should be developed deductively, reflected inductively, and finalized during a recursive integration process and iterative coding cycles (cf. Seuring and Müller 2007). The development of a conceptual framework may support the alignment of the content categories, the synthesis of key dimensions, and the discussion of conflicting findings, and may thus lead to new insights in the field per se (cf. Blankley 2008; Esper et al. 2010; Mentzer and Kahn 1995; Tan 2001; Hochrein et al. 2014).

Lesson learned 6 (meta-analytical studies): A closer look at the analysis of Online Resource 9 shows that there is an important research gap concerning the application of MAs in the domain of SCM. We therefore strongly call for an increased use of this quantitative statistical technique for analyzing extracted data and synthesizing findings from individual articles. Since MAs are usually based on SLRs conducted beforehand, our sample of LRs can be seen as an excellent starting point for future meta-analytical inquiries. The year 2014 has witnessed an increase in the publication of MAs in the area of SCM (Leuschner et al. 2014; Mackelprang et al. 2014; Zimmermann and Foerstl 2014), which could indicate that this approach is becoming more and more important.
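
To illustrate what synthesizing findings from individual articles can mean quantitatively, the following Python sketch pools study-level effect sizes with inverse-variance weights under a fixed-effect model. This is only one of several meta-analytical models and is chosen here as an assumption for simplicity; a random-effects model is often preferred when studies are heterogeneous. The function name and all numbers are hypothetical and do not stem from the cited MAs.

```python
import math


def pooled_effect_fixed(effects, variances):
    """Fixed-effect inverse-variance pooling of study-level effect sizes.

    Each study is weighted by 1 / variance, so more precise studies count more.
    Returns the pooled effect and its standard error.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect size and at least one study")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical effect sizes and sampling variances from two primary studies.
pooled, standard_error = pooled_effect_fixed([0.2, 0.4], [0.01, 0.01])
```

With equal variances, as in this example, the pooled effect is simply the mean of the two effect sizes; with unequal variances it shifts toward the more precise study.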

Lesson learned 7 (reporting of review findings): Having worked through the identified SCM LRs using Cooper’s taxonomy in Sect. 4.1, we became aware of a potential risk caused by the reporting of only selected research findings (cf. Online Resource 6 for an individual classification of each SLR). Although it seems to be more reader-friendly to conceptually organize the LRs along primary research results (cf. Table 4), an exclusive presentation of selected research outcomes may magnify those biases commonly caused by a dominant ‘espousal of position’ and a ‘non-exhaustive’ coverage. Therefore, we strongly recommend that scientists combine the conceptual organization of their LRs with an (additional) author-centric presentation of article content and reporting of research findings (cf. Online Resources 6 to 9 as examples). Our taxonomic analysis revealed a further potential risk: SCM LRs often address specialized scholars and practitioners at the same time. While specialized scholars have little difficulty in understanding complex research results, this is not necessarily true for practitioners. Nevertheless, we encourage researchers to discuss ambiguous review findings and, in particular, to avoid oversimplified recommendations in reporting the results at the secondary level (cf. Hochrein et al. 2014 for a critical reflection of contradictory research results at the secondary level).

In addition to the “lessons learned 1 to 7”, future SCM LRs should more precisely follow the methodological research principles of quality, rigor, and accountability that are generally expected in primary surveys. At the secondary level, important quality indicators of a literature review could be (1) the use of iterative coding cycles, (2) transparent and clear coding rules, including an explicit statement of inclusion and exclusion criteria, (3) the use of multiple coders and the use of cross-coding to achieve inter-rater reliability, (4) the testing of the coding rules, (5) theoretical foundation with specific inductive refinements to achieve validity, and (6) an adequate description of the basic data/studies (see Seuring and Gold 2012; Tangpong 2011; Kitchenham et al. 2010). As best-practice guidelines for LRs in the field of SCM are still in their infancy (cf. Sect. 2.2), we finally make a plea for the development of further recommendations, for example, via tertiary studies.

5.2 Propositions for research on the supply chain management content categories

In the following, we propose some promising areas where more secondary research is needed based on our classification system of SCM research presented in Sect. 4.3 and in Online Resource 9. This framework is ideal to evaluate whether SCM LRs exist along the defined content categories, and it can be used to highlight strategies for the identification of research gaps. In doing so, we substantiate fruitful areas for future SCM LRs, being fully aware that the identification of relevant secondary RQs is definitely one of the most challenging steps of research (cf. Kirca and Yaprak 2010). As we cannot provide a complete overview of all SCM secondary research gaps, we instead discuss some obviously under-researched categories in Case 1, the increasingly researched CLSC/EnvP category in Case 2, and the largest category of mapping studies in Case 3. Figure 6 shows that publication numbers of Map and CLSC/EnvP SCM studies have increased in recent years, which highlights their increasing importance and justifies our selection.

Fig. 6 Development of publication numbers for Map and CLSC/EnvP studies

Case 1

Short summary: In Case 1, we highlight some promising areas of secondary SCM research for selected content categories. As we could not assign any LRs to the categories M/Sal, PowRI, LA, OrgS/P, and DCM, we hypothesize that these SCM domains may be of high interest for future research at the secondary level.

Lesson learned 8: The OrgS/P category focusing on activities and procedures related to the organization of the supply chain and process design is, in our opinion, a particularly fruitful area for future research. Similarly, the (organizational) implementation of SCM also needs further secondary studies (cf. the LRs of Power 2005; Varma et al. 2006 as possible starting points). Quite surprising is the fact that we only found a single article addressing HRM in SCM at the secondary level (Cantor 2008). Consequently, secondary research is needed to systematically analyze and synthesize the specific requirements of skills, competencies, and capabilities at the SC level. Closely related to HRM is the KM category that targets the generation of knowledge and (inter-)organizational learning. As only Chow et al. (2007) focused on this important issue at the supply chain level, we call for further research on this central topic. The QM category includes quality-related techniques to assure and improve the quality performance of the supply chain. Considering the relative importance of quality management in a supply chain, this group is probably underrepresented with only two review articles (Talib et al. 2011; Vanichchinchai and Igel 2009). As the requirements to be met in SCM significantly vary between different industries, a vast number of primary studies concentrated on the characteristics of industry-specific SCs. However, up to now, only four LRs have been published with a special focus on particular industries, namely two on (agri-)food SCs (Ahumada and Villalobos 2009; Rajurkar and Jain 2011), one on bio-energy/biomass SCs (Gold and Seuring 2011), and one on the construction industry (London and Kenley 2001). Another topic that falls into this area is humanitarian logistics in SCM, which is gaining more and more in importance. 
One paper contained in our sample falls into this category (Pettit and Beresford 2009); three recently published papers (Kunz and Reiner 2012; Abidi et al. 2014; Leiras et al. 2014) show that there is on-going research in this area. Given the high importance of strategy alignment, the achievement of strategic fit, and the competitive advantage of SCs, the category Strat/Lead is, in our opinion, quite under-researched at the secondary level. Even though the recently published SLR of Gonzalez-Loureiro et al. (2015) provides a first secondary study on the link between SCM and strategic management based on selected theoretical streams, further research on the alignment and development of SC strategies and the identification of critical success factors of SCM would be highly beneficial. In particular, research on the intra-firm integration of SC strategies with the firm’s overall strategy and inter-firm integration is still developing, and it is an important area for future review research (cf. Hochrein et al. 2014). Another example is the CLSC/EnvP category discussed in more detail below. From the analysis of our sample, we conclude that future research should put a stronger focus on the alignment of sustainable (supplier-related) practices with the overall goals of the firm and the SCM strategies. Therefore, the link between the Strat/Lead ‘alignment category’ and CLSC/EnvP should be strengthened.

Case 2 (Closed-Loop supply chain and environmental protection)

Short summary: The CLSC/EnvP category includes 3 NLRs and 9 SLRs. Two further LRs are aligned with CLSC/EnvP, but were assigned to alternative first categories (cf. Online Resource 9). The first CLSC/EnvP LR was published by Abukhader and Jönson (2004), who explored the ties between logistics/SCM and the environment. In this early phase of CLSC/EnvP research, their content analysis found that little attention had been paid to this area (cf. also Linton et al. 2007). Srivastava (2007) reviewed 227 books, journal articles and edited volumes starting in 1990 and thus provided a broad study classifying the green SCM literature along the problem context in SC design and with respect to the methodology and approaches used. The SLR was again limited to green SCM (and did not cover the broader TBL area), primarily adopting a ‘reverse logistics angle’. A NLR on organizational theories with a focus on environmental issues is that of Sarkis et al. (2011), who critically evaluated research on green SCM using selected theories to categorize the literature and to gain new insights. The focused SLR of Shaw et al. (2010) proposed research directions to examine whether green performance measures can be integrated in an existing SC performance framework. The paper provides recommendations for practitioners on how to measure the environmental impact of their SCs. Seuring and Müller (2007) studied the concept of integrated SCM in the German management literature that takes environmental and social issues into account.

In contrast to the aforementioned focused LRs, three more general CLSC/EnvP LRs were published by Carter and Rogers (2008), Seuring and Müller (2008), and Kudla and Stölzle (2011). The content analysis of Seuring and Müller (2008) adopted a broader SC perspective and reported on sustainable SCM (191 papers published between 1994 and 2007). The authors developed a conceptual framework and provided an overview of sustainable SCM research, although papers focusing on reverse logistics and remanufacturing were excluded. Carter and Rogers (2008) published a large-scale SLR and introduced a holistic concept of sustainability into the field of SCM. Accordingly, they highlighted the relationships among environmental, social and economic SC performance within a SCM context and developed propositions for sustainable SCM. Carter and Easton (2011) extended the secondary study of Carter and Rogers (2008) and proposed a SLR of the empirical sustainable SCM literature that appeared in top SCM journals between 1991 and 2010 (80 articles). The authors addressed primarily methodological and analytical aspects and did not focus on managerial implications. In addition, they excluded non-environmental aspects of reverse logistics and waste disposal from their SLR. Kudla and Stölzle (2011) also presented a general SLR on sustainable SCM (223 papers from 60 journals published between 1987 and 2010) and developed a conceptual framework that summarized research based on their broad content analysis. Mollenkopf et al. (2010) published an integrated SLR that examined the relationship among green, lean, and global SC strategies within separate literature streams. While Gold and Seuring (2011) reviewed the interface of bio-energy production and SCM, Gold et al. (2010) explored the role of sustainable SCM as a catalyst of generating valuable inter-organizational resources that may lead to sustained inter-firm competitive advantage.

Lesson learned 9.1 (dynamics and level of maturity): Our analysis of secondary CLSC/EnvP studies confirmed that this sub-field of SCM is very dynamic, rather new, and rapidly evolving, both on the primary and secondary level, and both in research and practice (see Fig. 6). The secondary LRs showed, for example, an increasing number of CLSC/EnvP publications over the years and reported an evolution from stand-alone research in social and environmental fields to a more convergent view of sustainability (Carter and Easton 2011). Interestingly, CLSC/EnvP had already been described as a particularly important SCM area for the twenty-first century (Seuring and Müller 2008).

Lesson learned 9.2 (representativeness of sub-topics): Green SCM has gained much more attention in primary studies than the social topics in SCM research (Seuring and Müller 2008; Carter and Easton 2011). While the social dimension of sustainability has received less attention than expected given the focus of SCM on partnerships and bidirectional communication, the integration of the three dimensions of sustainability has also not attracted much attention so far (Seuring and Müller 2008). Research with a focus on environmental issues also appears to have been more popular on the secondary level.

Lesson learned 9.3 (terminology and level of analysis): One key issue highlighted in the secondary LRs is the profusion of definitions in the primary literature with respect to the firm, dyad, SC, and network level and within the different streams of research. Although the primary studies often claim to adopt a SC level, they take at least a dyadic view, as was consistently agreed upon at the secondary level. The partly confusing picture of the foci and definitions of the primary studies leads, as a further consequence, to a wide range of very different operationalizations of sustainable SC practices and constructs as reported on the secondary level.

Lesson learned 9.4 (integrated performance measurement): Although sustainable development at the SC level needs to include economic, environmental and social performance measures, a vast number of primary works has exclusively dealt with one or two performance dimensions in isolation (Seuring and Müller 2008). Little research focused on measuring the social performance of SCs, as was reported on the secondary level. The LRs in our sample emphasized that more integrated research on environmental and social sustainability performance measures is needed. To obtain a clearer picture of what has been operationalized in the primary studies, the performance constructs and the concepts of measurement should be more thoroughly substantiated via secondary LRs. Mollenkopf et al. (2010) identified a lack of integrated metrics and measurement methods across green/lean SC strategies and called for a simultaneous implementation of these strategies to provide a more holistic view that allows managers to understand the synergies or conflicts across green/lean strategies in their global SCs.

Lesson learned 9.5 (sustainable SCM practices-performance-link): As reported at the secondary level, the link between green and sustainable SCM practices and the different dimensions of performance is still somewhat unclear: primary studies have reported both positive and negative relationships, or found that no relationship existed at all. More research on the primary and secondary level would provide valuable insights into whether it is beneficial to be green (sustainable) or not. In our opinion, the conflicting results obtained at the primary level create a need for more secondary research that could help to gain insights into the reasons for these conflicts, and that could help to improve the comparability of future primary studies.

Lesson learned 9.6 (contextual factors): Based on our evaluation of secondary studies, we call for more context-sensitive sustainable SCM studies and argue that CLSC/EnvP research should include an evaluation of a wider organizational and inter-organizational context (i.e., industry, country, companies’ size, technology, position in the SC). For example, the predominant use of multi-industry samples as reported by Carter and Easton (2011) requires that individual industries be taken as sampling frames in future research. Similarly, our knowledge of the impact of firm size on sustainable SCM is also limited, as reported on the secondary level.

Lesson learned 9.7 (category links): Linking the InvM category more closely to the CLSC/EnvP category should provide valuable insights, as classical inventory models often lack green metrics. An even stronger link between the RM category and its risk frameworks and CLSC/EnvP could benefit the field as well. Linking the SC Design category more closely with CLSC/EnvP is needed to study the sustainability of different SC network designs on the primary and secondary level. The secondary study of Shaw et al. (2010) is a good example of how to integrate CLSC/EnvP and PM/RS. Additionally, aligning CLSC/EnvP with ProductMan may be interesting to better understand the new product development processes and innovations with a sustainability focus. Finally, it would also be interesting to systematically analyze how the different content categories relate to each other using statistical methods.

Lesson learned 9.8 (theories used): As reported in the secondary studies of Sarkis et al. (2011), Carter and Easton (2011) and Seuring and Müller (2008), primary CLSC/EnvP studies should be more strongly based on established theories. The insights on sustainable SCM obtained by the secondary LR of Sarkis et al. (2011) are based on different theoretical streams and should be used as a starting point for more theoretical soundness. Additionally, Mollenkopf et al. (2010) recommended employing theoretical approaches that take a more holistic and strategic perspective to better explain the phenomenon under study.

Lesson learned 9.9 (inter-disciplinary link and implementation): The sub-field of sustainable SCM may strongly benefit from inter-disciplinary review projects, as this area is not limited to business management (cf. Gold and Seuring 2011). We note that several journals have recently introduced special categories for interdisciplinary primary research, which could help to promote research in this area. Finally, secondary research on implementing CLSC/EnvP practices is also still rare.

Case 3 (mapping studies)

Short summary: In the third case, we first grouped the 29 mapping studies into five main (sub-)streams for an in-depth analysis, and then derived some lessons learned from these LRs, which include 7 NLRs and 22 SLRs. One further LR is related to the Map category, but was assigned to CLSC/EnvP (cf. Online Resource 9).

The first stream includes integrative LRs and analyzes primary studies according to different general dimensions based on analytical frameworks. Croom et al. (2000), for example, critically analyzed 84 randomly selected SCM articles (journal papers, books, and conference proceedings) based on a framework that used content criteria and methodological characteristics. Burgess et al. (2006) reviewed 100 randomly selected articles on SCM (from 614 usable articles across a 19-year period from 1985 to 2003) and discussed ‘descriptive features of SCM’, ‘definitional issues’, ‘theoretical concerns’, and ‘research methodological issues’. In a further study, Schoenherr (2009) explored SCM articles according to their ‘publication year and outlet’, ‘common themes, settings and viewpoints’, and ‘countries or regions investigated’.

A second stream embraces methodological LRs and investigates the research methodologies applied in the field of SCM. Based on three selected journals, Sachan and Datta (2005) examined the state-of-the-art of 442 SCM articles from a broad methodological point of view (‘research design’, ‘number of hypothesis testing’, ‘research methods’, ‘data analysis techniques’, ‘data sources’, ‘level of analysis’, and ‘country of authors’). However, the short time span of five years (from 1999 to 2003) does not permit the identification of long-term trends in SCM research. Giunipero et al. (2008) reviewed 405 articles published in 9 academic journals (from 1997 to 2006) and explored a wide range of (primarily methodological) trends and gaps in the SCM literature (e.g., ‘SCM definitions’, ‘SCM content categories’, ‘empirical vs. non-empirical literature’, ‘level of analysis’, ‘sample populations’, ‘industry and primary research methods’, and ‘data analysis techniques’). Based on a highly selective review approach, Ho et al. (2002) discussed major weaknesses of empirical SCM articles with respect to the conceptualization, operationalization, and modeling of SCM. Kovács and Spens (2005) and Spens and Kovács (2006) reviewed three different types of research approaches to logistics research (inductive, deductive, and abductive reasoning) based on three selected journals and a five-year time span starting in 1998. However, their SLRs did not systematically analyze the application of different methodologies in detail. Soni and Kodali (2011) critically reviewed 569 empirical SCM papers published in 21 selected journals between 1994 and 2008 by analyzing a comprehensive set of evaluation criteria (i.e., ‘empirical research growth in SCM’, ‘principal component bodies and related issues in SCM’, ‘level of analysis’, ‘country of sample industry’, ‘performance measurement’, ‘purpose of empirical research’, ‘entity of analysis’, ‘element of exchange’, and ‘sample industry’). Hilmola et al. (2005) exclusively focused on case study research based on a sample of 55 SCM articles published in refereed journals. Tangpong (2011) analyzed the measurement of constructs in empirical SCM research from 2002 to 2007, while Keller et al. (2002) investigated multi-item scales used in logistics research.

The third stream contains dissertation-specific LRs published by Stock (2001), Stock and Luhrsen (1993), Gubi et al. (2003), Stock and Broadus (2006), and Zachariassen and Arlbjørn (2010). These mapping studies with an exclusive focus on PhD dissertations provide an overview of how PhD students performed their research in SCM, compare the topics under study, contribute to the state-of-the-art, and identify research gaps (Gubi et al. 2003; Zachariassen and Arlbjørn 2010). Stock and Luhrsen (1993) provided a compendium of 422 logistics-related dissertations published between 1987 and 1991 and written at universities in the US and Canada. In two follow-up studies, Stock (2001) covered the period from 1992 to 1998 with a total of 317 dissertations, while Stock and Broadus (2006) reviewed 410 SCM and logistics-related doctoral dissertations published between 1999 and 2004. Gubi et al. (2003) reviewed 71 Scandinavian doctoral dissertations in the field of SCM published between 1990 and 2001 and examined a broad set of methodological variables and topics, while Zachariassen and Arlbjørn (2010) identified 70 Nordic (Finland, Norway, Denmark, and Sweden) doctoral dissertations in SCM published between 2002 and 2008 and analyzed them with reference to an analytical framework with nine criteria (‘year of publication’, ‘dissertation type’, ‘primary entity of analysis’, ‘level of analysis’, ‘main purposes’, ‘research design applied’, ‘time frame for the empirical works’, ‘type of theory generated’, and ‘elements of philosophy of science’), which had earlier been developed by Gubi et al. (2003).

The fourth stream encompasses four journal-specific LRs with an in-depth analysis of a single outlet, although concentrating on a single journal does not permit conclusions to be drawn about SCM research in general. Mentzer and Kahn (1995) analyzed different methodological research types used in the Journal of Business Logistics (JBL), covering the period from 1978 to 1993. Also with an exclusive focus on the JBL, Frankel et al. (2005) examined the SCM research approaches and strategies (from 1999 to 2004) and classified the 108 articles along the methods of data collection. Carter and Ellram (2003) reviewed 774 articles published in the Journal of Supply Chain Management over a 35-year period (from 1965 to 1999) via different analytical categories (e.g., ‘analysis of research methods’, ‘subject categories’, ‘research designs’, ‘theoretical approaches’, and ‘individual and institutional contributions’). Kouvelis et al. (2006) explored SCM articles that had been published in Production and Operations Management (from 1992 to 2006), highlighted important topical issues addressed in recent research, and provided some opportunities for future SCM research.

The fifth stream includes theoretical LRs. Nagarajan and Sošić (2008) investigated some applications of cooperative game theory in SCM (with a special focus on profit allocation and stability), while Leng and Parlar (2005) surveyed more than 130 papers using game theory in SCM for analyzing competitive and cooperative interactions in SCs. Defee et al. (2010) identified 181 applied theories in a broad sample of SCM articles during the period of 2004–2009. Sarkis et al. (2011) critically evaluated research on green SCM using selected organizational theories.

Finally, there are some further stand-alone mapping studies. The inter-disciplinary LR of Cheng and Grimm (2006) argued that SCM researchers often use theories and methodologies from marketing and operations. For this reason, the authors studied the recent empirical strategic management literature to integrate the theoretical and conceptual contributions into SCM research. The country-specific LR of Zhao et al. (2007) evaluated the existing China-based literature on SCM decision science. Tan (2001) provided a historical LR on the development of SCM from two separate paths, namely materials and logistics management, and discussed various SCM strategies and conditions, while Shukla et al. (2011) conducted an unsystematic LR of SCM content along some selected main SC activities.

Lesson learned 10.1 (dynamics and number of publications): The area of SCM is a relatively ‘young’ research field influenced by different backgrounds, and it is fragmented along several narrow disciplines (e.g., Burgess et al. 2006; Giunipero et al. 2008). The secondary mapping LRs further showed that SCM research is still very dynamic both on the primary and secondary level. Accordingly, the number of SCM publications has increased exponentially over the years (e.g., Burgess et al. 2006). Schoenherr (2009) confirmed that the number of articles published continuously increased from 2000 to 2008 (except for 2004, which experienced a slight downturn). This is in line with the findings of Stock (2001) and Stock and Broadus (2006) on the number of published dissertations in this field.

Lesson learned 10.2 (importance and representativeness of topical (sub-)fields): To learn more about dominant research topics, it is highly relevant to analyze which SCM themes were investigated at the primary level, and to reflect the journal editors’ choices and preferences. However, to evaluate whether specific SCM topics dominate the primary studies in certain years as reported in the secondary LRs is quite problematic, as extremely different content classification schemes were used in previous works (cf. Croom et al. 2000; Burgess et al. 2006; Cheng and Grimm 2006; Kouvelis et al. 2006; Schoenherr 2009; Stock and Luhrsen 1993; Stock and Broadus 2006; Carter and Ellram 2003). Therefore, we included an overview of the applied content categories in Online Resource 3 and present some main findings in the following. The LRs of PhD dissertations published in the field of SCM verified that specific topics were more thoroughly investigated in certain years (Stock and Luhrsen 1993; Stock and Broadus 2006). In line with that, Schoenherr (2009) found that some topics were relevant over the years (e.g., ‘use of information technology’ or ‘issues associated with third-party logistics’). Giunipero et al. (2008) observed that ‘SCM strategy’, ‘SCM frameworks, trends, and challenges’ and ‘alliances and relationships’ have received increased attention in the literature. Soni and Kodali (2011) showed that ‘performance measurement’ is the most frequently researched topic in empirical works, followed by ‘supply chain integration’ and ‘assessment of status of SCM in a field or industry or nation’. Overall, our tertiary study provided evidence that the evolution of ‘IT systems’ and ‘e-commerce’ within the SC context are increasingly popular topics in SCM research.

Lesson learned 10.3 (terminology and SCM definitions): Secondary studies pointed out that SCM has been defined from different perspectives and SCM philosophies. Mentzer et al. (2001), for example, identified more than 100 definitions of SCM and argued that no uniform agreed-upon definition for SCM exists. In line with that, Burgess et al. (2006), Esper et al. (2010), and Shukla et al. (2011) also stated that consensus is lacking on a precise definition of SCM. Moreover, some of the mapping studies thoroughly examined the conceptualization and evolution of SCM constructs and found that the lack of commonly-accepted definitions of SCM and associated SC problems stems from the diverging streams of research and schools of thought (e.g., logistics, transportation, operations, or information). Although many definitions of SCM were intensively discussed and significant inconsistencies in SCM conceptualizations prevail, we also became aware of an increasing level of agreement with the basic definitions of Mentzer et al. (2001) and the Council of Supply Chain Management Professionals (cf. Hilmola et al. 2005; Cheng and Grimm 2006; Zachariassen and Arlbjørn 2010; Giunipero et al. 2008).

Lesson learned 10.4 (level of analysis): The level of analysis refers to the perspective from which the primary studies investigate SCs. While Croom et al. (2000) suggested three levels of analysis (‘dyadic’, ‘chain’, and ‘network’), the ‘firm/function’ is used by Gubi et al. (2003), Sachan and Datta (2005), Giunipero et al. (2008), and Soni and Kodali (2011) as a fourth level. Croom et al. (2000) highlighted that there are fewer publications on the network level, while the majority of works focus on the dyadic and SC level. In line with that, Giunipero et al. (2008) reported that network analyses accounted for merely 5 % of publications, while 38 % of publications focused on the firm level. Sachan and Datta (2005) argued that 56 % of 422 articles focused on the firm/function level. The findings of Gubi et al. (2003) are also in line with the aforementioned works, as they stated that only 29.6 % of the papers in their sample focused on inter-organizational issues. Soni and Kodali (2011) verified that empirical SCM research is still very much based on the analysis of focal firms. They revealed that 65 % (370 articles) of the papers in their sample are based on the firm level, and that only 24.8 % are truly inter-organizational (combining 177 articles on levels of dyad, network, and chain). Zachariassen and Arlbjørn (2010) stated that the main level of analysis for the reviewed dissertations is still on firm departments and the firm itself (approx. 27 %), but, interestingly, that a shift occurred from the focal company perspective to inter-organizational aspects in SCs (dyads from 11.4 to 22.8 % and chains from 11.4 to 21.4 %). Overall, the secondary studies agreed that (ideally) a larger share of primary studies should address the network and chain level.

Lesson learned 10.5 (methodology): The secondary studies confirmed the use of a wide range of research methodologies and classified the research designs of the primary studies very heterogeneously. As the methodological secondary LRs are therefore also restricted from different viewpoints (e.g., in terms of ‘number and size of articles considered’, ‘period covered’, or ‘number of reviewed journals’), methodological research findings at the secondary level could not be compared easily. Giunipero et al. (2008), for example, limited their scope to an analysis of research methods used and data analysis techniques and did not investigate issues related to empirical research methodologies, such as ‘research design’, ‘data collection approach’, ‘sample size’, ‘respondents’ profile’, or ‘country coverage’. However, secondary LRs agreed that a large number of articles published in SCM were empirical (e.g., Mentzer and Kahn 1995; Giunipero et al. 2008; Croom et al. 2000; Carter and Ellram 2003; Sachan and Datta 2005; Spens and Kovács 2006; Frankel et al. 2005). Although the reported percentages of methodologies used significantly vary, quantitative empirical SCM surveys are more popular than case study-based research designs. In line with that, Mentzer and Kahn (1995) found that 54.3 % of the articles published in JBL (1978–1993) were based on surveys (only 3.2 % reported case studies), and Carter and Ellram (2003) verified that 75 % of research in the Journal of Supply Chain Management included surveys and case studies. While the analysis of Croom et al. (2000) showed that 56 % of SCM literature is primarily empirically-descriptive, Burgess et al. (2006) reported that the number of empirical research articles was nearly 54 % of the total sample (32 % case-based studies and 22 % surveys). Burgess et al. (2006) further classified 39 % of the articles as conceptual, while only 7 % used analytical/mathematical methods (none of the primary articles used mixed methods). 
The analysis of Giunipero et al. (2008) revealed that 70 % of the total articles published in SCM were empirical, and 30 % of the papers were theoretical. Soni and Kodali (2011) stated that the total number of empirical research articles published since 1982 was 30.1 % (569 out of the total 1,807 articles) in the selected journals. Sachan and Datta (2005) confirmed that 57 % of the reviewed papers were empirical, with survey methods based on quantitative data dominating. The growing number of empirical studies also led to an increased use of not directly observable latent SCM concepts measured via multi-item scales. As a consequence, such multi-item measures require a rigorous development and tests for validation to ensure that they exactly capture the meaning of the respective constructs (Keller et al. 2002). Interestingly, Sachan and Datta (2005) observed an increase in the application of direct observation methods, such as case studies. Exclusively focusing on case study research, Hilmola et al. (2005) showed that most case research lacked a rigorous methodological discussion, as only 12 out of 55 articles referred to case methodology literature. Equally important is the fact that a single case had often been investigated. Therefore, our tertiary study strongly calls for more (ethnographic) case studies to enhance our understanding of SCM, and it recommends a combined use of multiple research methods to achieve greater triangulation (e.g., Carter and Ellram 2003; Tangpong 2011). Secondary LRs further suggested that SCM academics may expand the sample sizes and the response rates of primary studies, and that they should conduct SCM research in developing countries, with a focus on specific industrial sectors, based on longitudinal data collection or multiple informants (Gubi et al. 2003; Giunipero et al. 2008).

With a focus on data analysis techniques, Mentzer and Kahn (1995) and Sachan and Datta (2005) provided evidence that ‘descriptive analyses’ covered 66.7 % and 39.9 % of the works analyzed, respectively. This is in line with the general belief that there has been an increase in ‘hypothesis testing’ and in the ‘application of more advanced analysis techniques’: While Mentzer and Kahn (1995) found that only 4.3 % of the papers involved hypothesis testing, Sachan and Datta (2005) showed in a more recent study that already 15 % of the analyzed works involved hypothesis testing, and that more advanced techniques had been used for data analysis. Carter and Ellram (2003) noted that the use of hypothesis testing had significantly increased from 1989 to 1999. Giunipero et al. (2008) found that 42 % of the empirical studies analyzed used basic data analysis techniques and 49.5 % advanced data analysis techniques (other analyses 8.5 %). Overall, our tertiary study strongly supports the call for the increased use of more sophisticated statistical modeling techniques in SCM (Carter and Ellram 2003; Mentzer and Kahn 1995; Schoenherr 2009). In line with the reviewed secondary studies, we also strongly call for a clearer description of the research approaches used (Spens and Kovács 2006) and for more exact reporting of relevant descriptive (methodological) information (Giunipero et al. 2008).

Lesson learned 10.6 (theories used): The secondary studies reported that SCM research has been grounded in different theoretical streams, but that there is still an absence of theory in some of the primary works. Burgess et al. (2006) found that 20 % of the reviewed articles had no discernible theoretical foundation and, in particular, multi-theory grounding was quite underrepresented. Defee et al. (2010) found that 181 unique theories were used, and that 53.3 % of the articles applied at least one theory. Burgess et al. (2006) further showed that transaction cost economics and the strategic management theory related to competitive advantage dominate the field of SCM, while Defee et al. (2010) provided evidence that transaction cost economics and the resource-based view account for 19 % of all theories used in SCM research. Overall, our review synthesis shows that relatively few theories account for a majority of articles in the field and, in addition, a vast number of theories used in SCM research originated in other disciplines (Cheng and Grimm 2006; Defee et al. 2010). Although we found no evidence that a specific theory has been overused, the discipline of SCM may strongly benefit from greater internal theory development (Defee et al. 2010) and from a more discipline-specific SCM theory of how to manage complex SCs. Burgess et al. (2006), for example, suggested the use of meta-theories, as a high level of diversity in ontological and epistemological bases is prevalent in the field of SCM. SCM scholars should also take the opportunity to apply rarely-used or new theories of related areas (cf. Sarkis et al. 2011). In particular, the use of relational and social theories (e.g., the relational view, social capital/network theory, or social/relational exchange theory) will become increasingly important.

Lesson learned 10.7 (journal titles): Burgess et al. (2006) showed that a total of 31 journals published SCM-related works, with the Journal of Supply Chain Management (21) and Supply Chain Management (27) accounting for 48 % of the publications. In contrast, Schoenherr (2009) collected a total of 222 journal titles in his dataset. The International Journal of Physical Distribution and Logistics Management (IJPDLM) received the highest count of articles, followed by Supply Chain Management. Based on a pre-selection of nine academic journals, Giunipero et al. (2008) found that 55 % of the 405 articles reviewed were published in the Journal of Supply Chain Management, IJPDLM, and the Journal of Operations Management. As particularly important outlets for SCM case study research, Hilmola et al. (2005) identified IJPDLM, Supply Chain Management, and the European Journal of Purchasing and Supply Management. As noted above, SCM is cross-disciplinary in nature, which is why we recommend drawing more attention to non-disciplinary outlets, which may help to distribute research findings as well (Schoenherr 2009).

Lesson learned 10.8 (country-focused research): In reviewing the ‘countries of interest’, Sachan and Datta (2005) found that 6 % of primary studies were conducted in Asia, while, in contrast, 50 % of the studies focused on North America and 37.5 % on Europe. Schoenherr (2009) showed that a wide variety of countries and regions were investigated; China was found to be most popular, followed by the U.K., the U.S. and Europe. He further argued that primary SCM research in the U.S. had been proliferating, while primary ‘overseas’ studies had also gained attention (however, many of these primary overseas articles appeared in non-disciplinary or non-mainstream outlets). In his view, the popularity of China was not surprising given the rapid development of its economy. In line with that, China was also the exclusive subject of the secondary LR published by Zhao et al. (2007). Their findings showed that the majority of SCM articles are descriptive and focus on status updates. Soni and Kodali (2011) provided evidence that 16.5 % of empirical data are sampled in the U.S., while 24.3 % of the articles did not mention the country or region where the data had been collected. Asian countries contributed nearly 10 % of the studies, which is slightly higher than the aforementioned findings of Sachan and Datta (2005). From a secondary point of view, the analysis of our sample showed that there are not many LRs that attempted to understand the role of SCM in a country-specific business context. Considering the significantly increasing number of primary SCM articles with a regional focus, for example on the emerging markets India and China, further country-specific SCM LRs may provide new valuable insights.

Lesson learned 10.9 (industry-specific research): Our tertiary review revealed that SCM research was not restricted to particular industries, but that rather many different industries had been studied. The classification of Burgess et al. (2006) identified possible sectors for SCM research and showed that 35 % of the sampled articles focused on the manufacturing sector. Soni and Kodali (2011) stated that 15 % of the surveys were performed in the manufacturing industry, while 7.9 % of the empirical research collected data in the food and agriculture industries; 19 % took data from multiple industries. Zachariassen and Arlbjørn (2010) also confirmed that more dissertations focused on manufacturing companies and only a few on carriers. Importantly, Burgess et al. (2006) highlighted sectors that received inadequate attention among scientists and practitioners, while Soni and Kodali (2011) stated that retailers or distributors were highly neglected in comparison to manufacturers.

Lesson learned 10.10 (journal/dissertation-specific research): Some LRs focused on publication trends of a single journal or on a limited pool of doctoral dissertations. In particular, the outlet-specific LRs provided valuable insights into the evolution of certain journals and the types of research that are likely to be accepted for publication. Researchers may thus gain a better understanding of a journal’s influence on the discipline as well as the effect of editors, authors, and authors’ affiliations on the outlet. As those findings also represent a call for more forward-looking research in the respective journal, it is clear that additional journal-specific LRs may be valuable also for other outlets such as Management Review Quarterly. In addition to the journal-specific LRs, we call for further secondary studies of doctoral dissertations (e.g., with a focus on a different set of countries) to provide a comparison with publication trends in Nordic and Scandinavian dissertations or dissertations published in North America (cf. Zachariassen and Arlbjørn 2010).

General recommendations: Although the debate on SCM as a discipline was initiated long ago, the discussion on the primary level is far from being concluded. The same is the case for the secondary level, as was shown in the examples of Cases 1–3 in Sect. 5.2. In addition to the lessons learned 8 to 10 drawn from in-depth case-analyses, there are four systematic strategic options to advance our secondary knowledge in the field. First, an easy-to-implement and rather efficient method of extending previously published secondary studies is the updating of both primary research and synthesis data. In light of significant changes SCM experienced in recent years, and given that many LRs are therefore in part outdated, up-to-date analyses and syntheses via follow-up LRs would be very beneficial for some of the defined SCM content categories. Secondly, it may also be possible to address unanswered RQs by refining and recoding the characteristics of primary studies contained in some of the existing SCM LRs (modified LRs). In case of a refinement of existing secondary data, scholars should clearly state what their modifications are, and readers should be alerted to the risk of an overrepresentation of certain primary studies and an overestimation of their findings. Thirdly, some of the mapping studies were only based on a limited set of publications or journals and thus provide only a partial overview of the field. Other mapping studies are partly outdated, which is why a regular update of mapping LRs may provide a valuable contribution to the maturing field of SCM. Fourthly, we strongly call for more interdisciplinary LRs, as only a few interdisciplinary secondary studies have been published so far (cf. Cheng and Grimm 2006). In line with Sachan and Datta (2005), who found that there are also very few inter-disciplinary primary studies, we recommend expanding this stream of research both on the primary and secondary level.

Finally, we note that some secondary LRs stated that future primary studies should direct more attention to the ‘translation’ of insights derived from SCM research for SCM professionals (cf. Kouvelis et al. 2006). For example, research on the SCM practices-performance-link should be translated and generalized to support SCM managers (cf. lesson learned 9.5). As reported at the secondary level, the dominant conceptual SCM models focus mainly on the practices-performance relationship and often overlook the impact of contextual factors (Ho et al. 2002). In general, practitioners could use secondary LRs as a starting point for the identification of references for specific questions. This, in turn, makes it necessary to highlight managerial insights that can be gained from the sampled works to make it easier for practitioners to get access to the topic under study.

6 Limitations of the literature review

This section reflects the limitations of this tertiary study and discusses the boundaries of its analytical dimensions and methodological decisions.

Firstly, this SLR used the taxonomy of Cooper (1988) and employed evaluation criteria for the search strategies based on Hochrein and Glock (2012). The classification of SCM topics, in turn, was based on the content categories of Wolf (2008). It is clear that this review of LRs could have been structured differently by referring to alternative frameworks, and as a consequence the implementation of the literature analysis is subjective to a certain degree. Even though the sophisticated coding rules enhance the validity and reliability of our results, future research could develop and apply different categories and taxonomies and could compare them to the ones used in our SLR.

Secondly, this tertiary study did not provide in-depth analyses of all defined SCM categories and is therefore somewhat restricted. However, we conducted three selected case studies (see Sect. 5) and provided an overview of purpose, content, and main findings for all reviewed articles in Online Resource 9.

Thirdly, the implemented search strategy limits our results. The keywords and search strings we defined, the criteria for eliminating papers from the sample as well as the journals we focused on and the databases we used could all be limiting factors (cf. Appendix), and they could be set differently in future research. To extend the scope of this SLR, scholars may consider employing alternative selection methods or using alternative search engines such as Google Scholar or Science Direct. The results of this study are therefore not universally valid and have to be interpreted within the context of the developed process-oriented framework. However, for the 121 LRs contained in our sample, we received sound results from our analysis.

7 Conclusions

The literature review technique has become one of the most important research methods in maturing research areas. Due to a growing number of primary studies on SCM topics, the systematic analysis, elaborate integration, and critical discussion in secondary studies help to avoid distortions in the domain of SCM. In line with that, this tertiary study analyzed 121 LRs in SCM, evaluated the methodologies used in the sampled reviews, and provided a taxonomic and thematic discussion of secondary SCM research. For selected content categories, future research opportunities were also derived for primary SCM research. This review of reviews was intended to build momentum by stimulating new secondary research in SCM, by deriving insights for scientists, and by providing an accumulated knowledge base for SCM executives. The findings of this study can be summarized as follows:

RQ1: Which methodological techniques have been applied in the SCM LRs?

Geyskens et al. (2009) provided empirical evidence in their ‘MAs of MAs’ that meta-analytical decisions influence research findings. In line with this study, our ‘SLR of LRs’ analyzed the methodological soundness of secondary SCM studies. The results obtained are rather mixed: We found that LRs in SCM significantly differ with respect to the application, description, and justification of the search strategy. As methodological decisions can have a strong impact on research findings, it is important that scientists are extremely sensitive to the consequences of their methodological setup. The methodological guideline proposed in this paper may assist scholars in conducting methodologically rigorous SCM LRs in the future.

RQ2: How can existing LRs in SCM be classified and systematized?

This paper differentiated LRs into NLRs and SLRs and applied an established taxonomy to analyze the characteristics of the SLRs according to the criteria Focus, Goal, Perspective, Coverage, Organization, and Audience. In addition, the NLRs and SLRs of this survey were systematized with regard to their topic along 22 content categories.

RQ3: Which SCM topics have been addressed and which research gaps can be identified?

This tertiary study also illustrated which major SCM topics were addressed in secondary studies in the past. Given the broad scope of our data set and evaluation categories, important research topics could be identified in which only a few LRs have been published so far, and secondary research gaps were presented for these areas. To benefit the scientific community, we did not only identify research gaps at the secondary level, but also provided a comprehensive SCM research agenda for primary studies based on the detailed analysis of selected content categories. In doing so, we identified research potential at the primary and secondary level and further illustrated how the data presented in this tertiary study could be used to identify promising topics for future research.

This study is helpful for academics, as our comprehensive compendium represents an ideal starting point for future research on SCM topics. We hope that our tertiary review will stimulate further discussions on the primary and secondary level, and that the guidelines for conducting SLRs will encourage SCM researchers to clarify methodological issues. Beyond the scientific contribution of this paper, our SLR provides a managerial panacea and beneficial knowledge base, and enables SCM practitioners to better inform their decision making.