1 Introduction

The Social Life Cycle Assessment (SLCA) guidelines (UNEP-SETAC 2009) define SLCA as a method aiming “to assess the social and socio-economic aspects of products and their potential positive and negative impacts along their life cycle” (p. 37). The guidelines distinguish two different SLCA approaches, type I and type II. The distinction between these approaches has been discussed by a number of authors (Jørgensen et al. 2008; Parent et al. 2010; Macombe et al. 2013; Wu et al. 2014; Chhipi-Shrestha et al. 2014). While the type I approach currently represents the lion’s share of SLCA studies (Wu et al. 2014), few comprehensive and analytical reviews have been undertaken to examine the multiplicity of approaches that have been developed within this type of SLCA.

Chhipi-Shrestha et al. (2014) propose a classification for type I SLCA that focuses on the techniques researchers use to compile their results. This logic leads them to identify three types of type I SLCA: checklist (identifying the presence or absence of impact), score (adopting a scale-based approach), and the Social Hotspots Database (SHDB). Wu et al. (2014) also provide a general, richly descriptive review of type I evaluation approaches, likewise with a strong emphasis on the scoring systems used. While these approaches are useful in gaining greater clarity on the different scoring techniques in use, the present article takes a different approach. It aims to provide a classification of impact assessment methods according to two central elements that set methods apart: the characterization step and the weighting step. By doing so, it aims to highlight that the key elements that set methods apart are not so much their scoring systems or their approach to aggregation but rather the reference points that are used in evaluation methods and the type of data they assess.

This paper therefore takes on the task of exploring “what data is assessed and against what it is assessed.” At the heart of this inquiry is the simple question: when researchers carry out their study, what do they look at, and what do they base their assessments on? Ultimately, this inquiry has led us to reflect upon the existing classifications and definitions of type I SLCA, while taking into consideration how SLCA practice has evolved in recent years.

To this end, an extensive critical literature review was undertaken, covering a total of 32 articles published between 2006 and 2015. The selected articles were analyzed with a focus on the inventory data used and the type of characterization and weighting methods used. This analysis made it possible to make explicit the often implicit logic underlying the evaluation methods and to identify both what sets type I SLCA methods apart and what their common denominators are.

Before presenting the methodology guiding the literature review and the analysis (Sect. 3), Sect. 2 gives an overview of the existing definitions of type I SLCA, with an emphasis on what distinguishes type I from type II.

2 Type I and II SLCA

The first mention of type I and type II approaches is encountered in the SLCA guidelines (UNEP-SETAC 2009), which state that “Two types of social and socio-economic impact categories can be identified. Type I impact categories aggregate the results for the subcategories within a theme of interest to a stakeholder, e.g., human rights. Type II impact categories model the results for the subcategories that have a causal relationship defined on the criteria, e.g., autonomy” (UNEP-SETAC 2009, p. 70).

Building upon these definitions, Parent et al. (2010) suggested that the difference between type I and type II lies in the characterization model (the evaluation method); type I methods use performance reference points (PRPs) to assess inventory data, while type II methods use characterization factors based on measurements of causal links in impact pathways. PRPs are typically defined as “internationally set thresholds or goals or objectives according to conventions and best practices” (UNEP-SETAC 2009, p. 72), which make it possible to assess either a social performance or a risk of encountering a specific social performance. Characterization factors are multiplicative factors which represent the causality between two phenomena. Macombe et al. (2013) also highlight the difference between these two types of SLCA and call type I SLCA “life cycle corporate social responsibility (CSR),” given that it evaluates the social performance of organizations along a value chain, not social impacts, which could only be understood through a thorough analysis of causality chains.

Thus, at the heart of current definitions of type I SLCA is the fact that it aims to qualify whether the data collected about an observed situation corresponds to a negative or positive performance (or a high or low risk of encountering negative performance) or to varying degrees in between those two poles. Implicit in this assessment, and untested, is the assumption that a good performance suggests the potential presence of positive impacts down the causality chain (dashed line in Fig. 1) and vice versa. However, as depicted in Fig. 1, in type I SLCA, the inventory data and the evaluation results both focus on the same specific point along the impact pathway. In type II SLCA, on the other hand, the inventory data is positioned further upstream on an impact pathway than the evaluation result; the characterization factor is an actual representation, often though not always quantified, of the causal link (continuous line in Fig. 1). Hence, a simple way of differentiating type I from type II SLCA is to check whether the inventory data and the characterized result are at the same point along an impact pathway (type I) or at different points along it (type II).

Fig. 1 Positioning of type I and II inventory data and characterization results on the impact pathway

3 Methodology: literature review and comparative analysis of methods

This literature review is based on articles gathered through a keyword-based Web search. The search focused on recent scientific literature (published between 2006 and 2015) but also on SLCA reports produced by consulting firms. The articles selected either presented the elaboration of an SLCA method or used an existing one. The main keywords used (including appropriate variations and permutations) were the terms SLCA, social LCA, social performance, evaluation, methodology, mid-point, and impact assessment. Articles were searched using Web of Science, Google Scholar, and Google.

The resulting bibliography of 102 references was then screened to keep only those that corresponded to studies, articles, or manuals related to type I SLCA. Articles making explicit reference to type I, PRPs, corporate behavior assessment, and social performance assessment were kept, as well as articles where the assessment methods generated a result located at the same point as the inventory data with regard to the impact pathway. Ultimately, 32 articles and reports were selected.

Those articles were then analyzed using an analytical grid categorizing different steps or features of a type I SLCA method. For example, one column was dedicated to the use of a functional unit (FU), another to inventory data, and another to the use of PRPs. As we filled out these columns for each of the selected articles, some categories of steps or features of type I SLCA methods were added and others were divided into more precise ones (e.g., the column for PRPs was subdivided according to the various reference points used; the use of an FU was further broken down into the use of an FU to scale inventory data vs. to weight results obtained across the life cycle). Therefore, in an iterative manner, the selected articles were divided into methodological steps or features that were further analyzed transversally across all 32 articles. This method made it possible to identify both what sets type I SLCA methods apart and what their common denominators are.

4 Results

Figure 2 shows the different methodological steps encountered in the type I SLCA studies and articles reviewed, in their most usual order. It should be noted that in some studies, some steps are inverted or carried out simultaneously. Furthermore, not all steps are always carried out, except for the data collection, the weighting (which is often done implicitly, whereby all subcategories or life cycle steps are weighted at 1), and the final interpretation. Some studies comprise two weighting steps, and others do not comprise a characterization step as we define it below.

Fig. 2 Methodological steps encountered in type I SLCA

In the literature, authors’ implicit or explicit definitions of evaluation methods vary; some include all the data manipulation steps shown in Fig. 2 (from data collection to final aggregation), while others include just a few. In order to make explicit the common denominators used in type I studies and to identify what sets the different type I SLCA methods apart, we propose here to define steps narrowly and describe them individually.

Data collection refers to the collection of inventory data. Data collection can be done through different techniques, and the data can represent different phenomena. Section 4.1 discusses the variations in terms of inventory data used, and Sect. 4.2 briefly describes the inventory data scaling on a FU.

Characterization is the step during which the data is analyzed in order to assess its meaning in terms of its potential to cause social impacts. At this step, an implicit or explicit value judgment is made on the collected data. The characterization step is discussed in Sect. 4.3.

At the weighting step, the characterization results are each given a relative weight. The weighting can be based on the relative importance attributed to different subcategories (e.g., salary and working hours) inside an impact category (e.g., working conditions) and a stakeholder category (e.g., worker), or an equal weighting can be attributed to all results. It can also be based on the relative importance of the unit processes or the life cycle steps in the studied product system. These approaches are further detailed in Sect. 4.4.

Finally, the weighted results can be aggregated. If weighted on the basis of equal weighting or the relative importance of social issues, weighted characterization results can be aggregated into impact categories, stakeholder categories, or even a single score. If weighted on the basis of the relative importance of unit processes or life cycle steps, aggregation can be done within life cycle steps or even over the entire life cycle. These approaches often co-exist in studies and, as such, are often applied to the same characterization results in succession. While this article does not discuss aggregation approaches in depth (greater focus is placed on the inventory data, characterization, and weighting steps), they are nonetheless discussed in Sects. 4.2 and 4.4.
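
As a rough illustration of the weighting and aggregation steps just described, the following Python sketch aggregates characterized subcategory scores into a single impact-category score. The function name, the numeric scores, the 1 to 4 scale, the weights, and the choice of a normalized weighted mean are all assumptions made for this example; the reviewed studies use a variety of scales and aggregation rules.

```python
def aggregate(scores, weights=None):
    """Aggregate characterized scores into one value using a weighted mean.

    scores  : dict mapping a subcategory name to its characterized score
    weights : dict mapping the same names to relative weights; when omitted,
              every subcategory is weighted at 1 (implicit equal weighting)
    """
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical characterized scores on an assumed 1 (worst) to 4 (best) scale
working_conditions = {"salary": 2, "working_hours": 4}

equal = aggregate(working_conditions)
salary_heavy = aggregate(working_conditions, {"salary": 3, "working_hours": 1})
```

With equal weights both subcategories count the same, whereas the second call gives salary three times the weight of working hours, which pulls the aggregated score toward the weaker salary result.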

4.1 Data collection: inventory data variations

Different types of data are collected in type I SLCA studies. Wu et al. (2014) classified them into organizations/companies, sector/industry, and country-level data and discussed the collection methods mobilized for each type. Inventory data also vary as to whether they represent activities, immediate effects, or further effects. Indeed, “even though type I SLCA always assesses a performance, a feature of a situation, and not an effect, a change” (Macombe et al. 2013), what we find in the literature is that this “performance” can be observed at different points along an impact pathway linking a product system to people’s well-being—not only at the beginning of the pathway.

In numerous studies, the collected data aims at representing company activities, for example, corporate practices (e.g., offering health and safety training to employees), taxes paid, and jobs created. As expected, some inventory indicators are also related to immediate effects (as depicted in Fig. 1), for example, the number of workers using protective equipment. However, some indicators are positioned further along the impact pathway, such as the turnover rate (Fontes 2014), or even further along the line, such as the composite measure of experienced well-being (Fontes 2014). Therefore, not all type I SLCA approaches focus solely on company activities and their immediate effects.

While the link between the activities of companies in the product system and inventory indicators capturing these activities or their immediate effects is easy to recognize, this link is harder to recognize as indicators represent phenomena positioned further down the impact pathway. Indeed, chances that the “number of workers using protective equipment” can be attributed to corporate practices are fairly high. A bit further down the impact pathway, the “turnover rate” might be influenced by factors that are not generated or influenced by the product system’s companies, but it can still be significantly caused by practices from the product system. The link can be easily assumed, although it remains untested. This assumption is, however, harder to make for an indicator such as the “composite measure of experienced well-being.” In this case, the assumed causal effect chain between the activity in the product system and the data collected is quite tenuous and, in most foreseeable cases, undocumented. Indeed, the further an indicator is positioned down the line, the more factors extraneous to the product system come into play. In these cases, data are collected to represent a social impact, but there is no link between those data and an activity in the product system that could have caused it. Therefore, there seems to be a trade-off between trying to capture data representing stakeholders’ well-being (or phenomena located closer to the end point than the source of impact) and ensuring that these phenomena are indeed due to the product system.

Other data used in type I SLCA are simply not found on an impact pathway that could connect the product system to potential social impacts. The SHDB indicators “risk of not having access to a hospital bed” and “risk that country has not ratified International Labor Organization (ILO) conventions” (Benoît Norris et al. 2013) are good examples of this. Indeed, companies’ possible impact on these issues is quite difficult to determine, aside from that of companies in certain specific sectors such as extractive resources, where education and health can be provided by the company to the local community. Yet, it seems that these types of data can be used in a variety of ways: as contextual information providing insight on potential effects of corporate social performance, as a proxy for a phenomenon occurring within the product system and/or its resulting causality chains, or as an element that may affect/condition activities at the company level. While this type of data might be useful in a number of ways, its use and its presence alongside social performance data, without an explicit acknowledgment of the way in which it is taken into account, can be misleading. In a worst-case scenario, a potential impact of a product system could be wrongfully attributed. This type of data thus ought to be used with transparency with regard to the role it plays in the study, and its difference from social performance data ought to be acknowledged.

Finally, it should be noted that inventory data in the literature can be at different levels of resolution with regard to the product system considered. Some data are at the same level as the product system’s unit processes/companies, while some are at a higher aggregation level than the object of study. For example, sector and country data are often used as proxies for phenomena happening at the company level. Studies opting to use data that correspond to a different level than the object of study often express their results in terms of social risks, rather than performance (Benoît-Norris 2014).

4.2 Scaling and aggregating the inventory data based on a FU

Many authors have questioned the relevance of scaling the inventory data over a FU (e.g., Wu et al. 2014; Parent et al. 2010, 2012; Dreyer et al. 2006). Moreover, inventory data scaled on the FU was rarely found in our review: only in Fontes (2014), Weldegiorgis and Franks (2014), and Kruse et al. (2008).

To be scalable, inventory data ought to be quantitative. Examples are person-hours (for employment), hours of training, injuries/deaths, and lost-time injuries. Scaling consists in reporting the inventory indicator in proportion to the FU, under the assumption of a linear relationship between the FU and the indicator. If 10 h of training are offered for every $100 of product X produced, the inventory indicator would become 5 h for a FU of $50 of product X produced.
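
The proportional scaling described above can be sketched in a few lines of Python. This is a minimal illustration under the stated linearity assumption; the function and parameter names are hypothetical and do not come from any of the reviewed studies.

```python
def scale_to_fu(indicator_value, reference_output, functional_unit):
    """Scale an inventory indicator linearly to the functional unit (FU).

    indicator_value  : quantity observed for `reference_output` of product
    reference_output : amount of product the observation refers to
    functional_unit  : amount of product defined as the FU (same unit)
    """
    if reference_output <= 0:
        raise ValueError("reference_output must be positive")
    return indicator_value * functional_unit / reference_output


# 10 h of training per $100 of product X, scaled to a FU of $50
training_hours_per_fu = scale_to_fu(10, 100, 50)
```

Here `training_hours_per_fu` evaluates to 5 h, matching the worked example in the text.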

Once the scaling is done, Fontes (2014) suggests aggregating the inventory indicators over the life cycle, whereas Weldegiorgis and Franks (2014) aggregate them within life cycle steps, and Kruse et al. (2008) simply do not raise the subject of aggregation. Subsequently, characterization can be applied to FU-scaled data presented in a disaggregated form, aggregated within life cycle steps, or aggregated over the entire life cycle.

4.3 Characterization—six main approaches in type I SLCA studies

An important factor distinguishing type I SLCA approaches lies in the characterization method. How and with which reference points the inventory data is assessed constitutes the basis upon which we build the typology presented below. Type I assessment methods are often summed up as PRP scale analysis (Chhipi-Shrestha et al. 2014; Parent et al. 2010). However, while it is true that most of the methods identified base their assessment on a scale, the various scales utilized rest on vastly different reference points and are thus bound to lead to different results.

As depicted in Fig. 3, six key characterization approaches are identified in the type I SLCA literature: (1) assessment based on norms and best practice; (2) assessment based on norms and best practice and the socio-economic context of unit processes; (3) assessment based on experts’ judgment of companies’ compliance with norms; (4) assessment based on researchers’ expert judgment on companies’ activities; (5) assessment based on how a company, a sector, or a country is positioned with regard to average sector, country, or worldwide performance; and (6) assessment based on how the data associated with the social performance of a company or a sector compares to that of alternative companies/sectors. Each is described in more detail below. It should be noted that in some studies, there is no characterization step at all.

Fig. 3 Six main approaches in type I characterization

4.3.1 Assessments based on norms and best practices

This type of assessment is the most widely used and corresponds most closely to the original definition of a PRP assessment in the guidelines. It is typically performed with a binary or a four- to five-level scale, whose levels correspond to a certain degree of compliance with international, national, or sectoral norms or best practices. Typically, the scales will have these generic levels: non-compliance with the reference point, compliance, and above compliance (Quantis et al. 2012; Ciroth and Franze 2011; Fontes 2014; Blok et al. 2013; CIRAIG-AGÉCO 2011). In some studies, the first and last levels are each divided into two ascending levels. In some studies, multiple connected issues can be addressed in one scale (e.g., a scale addressing multiple issues related to health and safety) (Quantis et al. 2012; Fontes 2014; AGÉCO 2013).

In the literature, some PRP assessment scales are well defined, with all the scale levels transparently documented (Quantis et al. 2012; Fontes 2014), whereas other studies only define the level of compliance within their multi-level scales (Ciroth and Franze 2011), leaving the other scale levels undefined. In general, the levels of non-compliance and compliance are defined in accordance with national law or international UN conventions. In some articles, it is the possession of a certification that plays the pivotal role of reference point (Norris 2006; Andrews et al. 2009).

What determines the content of the levels above conformity varies a lot from one study to the other, with most scales emphasizing certain aspects of best practice related to the investigated social issue over others. Sanchez Ramirez et al. (2014) propose a novel way of establishing the level above compliance, by focusing on whether a company promotes good practices in the value chain. “This is based on the principles of Rio +20 Corporate Sustainability Forum, whereby correct behavior is to be supplemented by the continued improvement and sharing of best practice” (Sanchez Ramirez et al. 2014, p. 1518).

A variation on the use of PRPs is proposed by Ciroth and Franze (2011), in their study of an Ecolabeled Notebook. The authors first carry out a norms-based PRP assessment. As a second step, they determine through expert judgment this performance’s likely effect on impact categories, as defined by the guidelines. For example, a good performance on freedom of association “has positive impacts on working conditions, health and safety, human rights, and positive socio-economic repercussions” (Ciroth and Franze 2011, p. 43), and for this reason, the company is deemed to have in general a positive social impact. However, despite their name, the impact categories used do not represent points further along the impact pathway, such as “immediate and further effects” or “human or stakeholders’ well-being” as defined in Fig. 1. They represent groupings of the guidelines’ subcategories, only at a higher level of abstraction. Thus, the result of the assessment remains at the same point within the impact pathway as the inventory data and the PRP-based performance assessment. This “second” characterization step is thus an extension of the norms-based PRP assessment, upon which some expert judgment has been added. This method is also used by Foolmaun and Ramjeeawon (2012) in the second part of their study on polyethylene terephthalate (PET) bottle disposal in Mauritius.

Another variation in the literature with regard to PRPs is the concept of using an FU-scaled PRP, to be applied to FU-scaled data, as discussed in Sect. 4.2. Fontes (2014) suggests scaling the “reference values” (a binary PRP: compliant or not compliant) on the same FU as the inventory data. For example, a reference value (necessarily quantitative) of 2 h of training for every 10 h worked would become 20 h of training in a system counting 100 h of work.
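
The scaling of a reference value and its use as a binary PRP can be sketched as follows. This is only an illustration of the arithmetic in the example above: the function names are hypothetical, and the compliance rule (observed value at or above the scaled reference) is an assumption that suits an indicator such as training hours, where more is better.

```python
def scaled_reference_value(required_per_unit, unit_size, system_size):
    """Scale a quantitative reference value linearly to the system size."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    return required_per_unit * system_size / unit_size


def is_compliant(observed, reference):
    """Binary PRP (assumed rule): compliant when observed meets the reference."""
    return observed >= reference


# 2 h of training required per 10 h worked, in a system counting 100 h of work
required_training = scaled_reference_value(2, 10, 100)
```

In this example `required_training` is 20 h, so a system providing 25 h of training would be rated compliant and one providing 15 h would not.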

4.3.2 Assessments based on norms and socio-economic geographical context

Dreyer et al. (2010) and Sanchez Ramirez et al. (2014) propose to not only assess data on the basis of whether or not it meets norms and best practice standards but to also consider the social, economic, and political contexts of the companies investigated in the life cycle. This logic takes into account the fact that the social performance of organizations is affected by their operating environment and that certain geographical locations and sectors are more challenging in terms of attaining compliance and acceptable levels of social performance.

However, the two sets of authors propose to consider the geographical context quite differently. Sanchez Ramirez et al. (2014) propose the subcategory assessment method (SAM), which is based on a four-level scale (A, B, C, or D), pegged on a compliance level, which they name “basic requirement” (BR). The compliance and above compliance levels (A and B) are similar to those of other norms-based PRPs as discussed in the previous section. However, level C corresponds to organizations not fulfilling the BR and operating in a “negative context” (challenging socio-economic context), and level D corresponds to organizations not fulfilling the BR and operating in a “positive context” (Sanchez Ramirez et al. 2014). In other words, the worst performance level on the scale (level D) will be granted to organizations which do not comply with the BR yet are located in a positive context, which should in fact promote the attainment of this BR.
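
The decision logic of such a four-level scale can be sketched as below. This is a simplified reading of the description above, not the SAM procedure itself: the function and parameter names are hypothetical, and the sketch assumes that A denotes performance above the BR and B denotes fulfilment of the BR.

```python
def sam_level(fulfils_br, goes_beyond_br=False, positive_context=False):
    """Return a level from A to D on a SAM-style four-level scale.

    Assumed mapping for this sketch: A = above the basic requirement (BR),
    B = BR fulfilled, C = BR not fulfilled in a negative context,
    D = BR not fulfilled in a positive context.
    """
    if fulfils_br:
        # Context is not considered once the BR is met
        return "A" if goes_beyond_br else "B"
    # When the BR is not met, a favorable context leads to the worst level
    return "D" if positive_context else "C"
```

The sketch makes the asymmetry visible: context only matters below the BR, and a favorable context lowers the rating rather than raising it.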

Dreyer et al. (2010) propose a very different approach. Taking as an example the issue of labor rights, the authors propose a multi-criteria indicator that assesses the effort (will and ability) of a company to manage a given social issue, in line with existing norms, in this case ILO conventions. Through a number of steps, the multi-criteria indicator score, which reflects corporate social performance, is then translated into a level of risk of social impacts actually occurring, mostly by considering the context of the company in terms of geographical location and industry. A company demonstrating a weak management performance with regard to a given social issue and operating in an environment typically riddled with severe social risk will obtain a result pointing to a high risk of social impacts occurring (Dreyer et al. 2010).

Dreyer et al. (2010) and Sanchez Ramirez et al. (2014) thus propose two contrasting ways of assessing potential social impacts. From the perspective of the former, companies with poor corporate performance and operating in an adverse context are the worst rated when it comes to their risk of generating social impacts. From the perspective of the latter, companies with poor corporate performance and operating in the most adverse contexts should be granted a better score than if they were operating in a more positive environment. One important difference between the authors is that Dreyer et al. (2010) base their corporate performance assessment on the will and ability to manage social issues (which they call indirect indicators), whereas Sanchez Ramirez et al. (2014) include in their corporate performance assessment indicators related to both management (indirect indicators) and immediate effects of management practices (direct indicators).

4.3.3 Assessments based on stakeholders’ or experts’ judgment of companies’/sectors’ compliance to societal expectations or norms

Some authors propose to assess corporate performance by relying on stakeholders’ perceptions or experts’ judgments of how well the companies investigated comply with societal expectations. In this type of approach, the data collected is stakeholders’ or experts’ input. Here, the characterization is performed upon collection, when stakeholders or experts are asked to rate, on a multi-level scale, the social performance that may be associated with a corporate activity, based on their knowledge and experience.

In their study of recycling systems in low-income countries, Aparcana and Salhofer (2013) ask stakeholders to rate, according to their perceptions, the level of fulfillment of social compliance criteria by companies/organizations within the recycling system. In their study of PET bottle disposal, Foolmaun and Ramjeeawon (2012) follow a similar method with regard to meaning assessment, administering a “yes/no” survey to key stakeholders, scavengers, landfill workers, incinerator workers, and flake-producing industrial workers.

Manik et al.’s (2013) case study of palm oil biodiesel in the Jambi Province of Indonesia also presents an interesting example in this regard. Four types of stakeholders were asked to gauge their social expectation of the performance of this production sector and to disclose their actual perception of this performance. Final stakeholders’ perspectives were assessed by identifying the gaps between expected and perceived importance of each social issue addressed. Thus, in this study, characterization and weighting of societal issues are both performed right at the data collection stage. The authors combine this result with an expert-led weighting step, which will be discussed in Sect. 4.4.3.
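
A gap analysis of this kind can be illustrated with a short Python sketch. It is not the authors’ procedure: the issue names, the 1 to 7 rating scale, the sign convention (perceived minus expected), and the function name are all assumptions made for the example.

```python
def perception_gaps(expected, perceived):
    """Return perceived minus expected rating for every social issue.

    A negative gap means the perceived performance falls short of the
    expectation (sign convention assumed for this sketch).
    """
    mismatched = set(expected) ^ set(perceived)
    if mismatched:
        raise KeyError(f"issues rated in only one survey: {sorted(mismatched)}")
    return {issue: perceived[issue] - expected[issue] for issue in expected}


# Hypothetical stakeholder ratings on an assumed 1 to 7 scale
expected = {"fair_salary": 6, "health_and_safety": 7}
perceived = {"fair_salary": 4, "health_and_safety": 7}
gaps = perception_gaps(expected, perceived)
```

In this made-up case, the fair salary issue shows a gap of -2 while health and safety shows no gap, so the former would stand out as the issue where perception lags expectation.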

A similar approach is also adopted in Hosseinijou et al.’s (2014) hot spot analysis, albeit mostly with experts from industry and academia. After using material flow analysis (MFA) in order to identify which steps of the life cycle they will focus on (cf. Sect. 4.4.5), the authors ask experts to identify the most pressing socio-economic issues associated with different life cycle steps of two materials, cement and steel, in Iran. They subsequently undertake an expert-led weighting step, which will be discussed in Sect. 4.4.3.

The hot spot assessment of Hosseinijou et al. (2014) is an example where the weighting and the meaning assessments are inverted from what is presented in Fig. 2. The MFA first weighs the importance of life cycle steps, and an expert-based meaning assessment of the most important steps follows.

4.3.4 Identification of most salient issues based on researchers’ expert judgment on company’s activities

This approach corresponds to what the guidelines call “simple aggregation,” whereby the researcher brings together qualitative or quantitative information into a single summary, within a subcategory, based on what the most salient issues are in his/her inquiry. This is for instance the approach chosen by Moberg et al. (2009) in their study on ICT services. Bouzid and Padilla (2014) also proceed in this manner, while taking ILO conventions as a general reference point to identify salient issues. In their study of artisanal cobalt mining in the Democratic Republic of Congo, Tsurukawa et al. (2011) proceed similarly, taking into consideration both international and Congolese laws and norms. Their analysis is based upon the Öko-Institut Sustainability Toolkit PROSA, a product sustainability assessment tool which encompasses a comprehensive list of socio-economic indicators based on international standards and international CSR instruments. Prakash et al. (2010) also adopt a similar approach in their study of e-waste management in Ghana, based on field interviews and bibliographic sources.

4.3.5 Assessments based on how a performance is positioned with regards to a distribution of performances

Some characterization approaches focus on assessing a company, a sector, or a country, based on how it performs on social issues in comparison to other companies, sectors, or countries. The SHDB adopts this approach for a number of its social indicators. It compares the data obtained for a sector-country (e.g., data for “sector average wage” in “wheat sector in China”) with the worldwide distribution of data on this issue. The distribution is divided into quartiles: if the sector-country’s data in question falls within the first quartile, it is considered at “low risk”; if it falls within the second quartile, it is considered at “medium risk”; and the last two quartiles correspond to “high risk” and “very high risk” (Benoît Norris et al. 2013). The SHDB characterization scheme is used in numerous studies which integrate SHDB results (Martínez-Blanco et al. 2014; Ekener-Petersen and Finnveden 2013).
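
A quartile-based characterization of this kind can be sketched with the Python standard library. This is an illustration of the general idea, not the SHDB implementation: the function name, the toy distribution, and the handling of values lying exactly on a cut point are assumptions, and the sketch presumes the indicator has been oriented so that lower values mean lower risk (an indicator such as average wage would first need to be inverted).

```python
from statistics import quantiles

RISK_LEVELS = ("low risk", "medium risk", "high risk", "very high risk")


def quartile_risk(value, worldwide_values):
    """Assign a risk level from the quartile in which `value` falls.

    Assumes the data are oriented so that lower values mean lower risk.
    """
    # quantiles(..., n=4) returns the three cut points between quartiles
    cut_points = quantiles(worldwide_values, n=4)
    # Count how many cut points the value exceeds (0 to 3)
    index = sum(value > cut for cut in cut_points)
    return RISK_LEVELS[index]


# Hypothetical worldwide distribution of an indicator (values 1 to 100)
world = list(range(1, 101))
```

Calling `quartile_risk(10, world)` places the value in the first quartile, while `quartile_risk(90, world)` places it in the last one.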

In their comparative study of three types of fertilizers, Martínez-Blanco et al. (2014) also discuss (but do not put into practice) a similar idea. However, they propose to compare sector data to country data averages; “if sector data (for compost) are better/equal/worse than country data, then compost has a better/neutral/worse effect regarding that social indicator” (Martínez-Blanco et al. 2014, p. 45). They also propose that the same logic could be used to assess company data against country or sector data.
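
The better/neutral/worse rule quoted above can be written as a small comparison function. The sketch below is only an illustration: the function name and the `higher_is_better` flag are hypothetical additions, needed because the direction of “better” depends on the indicator.

```python
def compare_to_reference(sector_value, country_value, higher_is_better=True):
    """Return 'better', 'neutral' or 'worse' for sector data vs country data."""
    if sector_value == country_value:
        return "neutral"
    sector_is_higher = sector_value > country_value
    # The sector is better when it deviates in the desirable direction
    return "better" if sector_is_higher == higher_is_better else "worse"
```

The same function could be applied to company data against country or sector data, as the authors suggest, by passing the company value as the first argument.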

4.3.6 Assessments based on comparison between life cycle steps or alternative product systems

In this approach, a comparison is made between the inventory data associated with life cycle steps or alternative product systems. The assessment relies on a simple comparison in which the data that reflects the best company performance is identified.

In their comparative study of three types of fertilizers, Martínez-Blanco et al. (2014) carry out a data collection at different scales of resolution. For the foreground sector scale and the company scale, the characterization is done as a simple comparison of the inventory data (quantitative and qualitative) between the three fertilizer options, at each of the life cycle steps considered. A score is thus granted to each option, based on a comparative scale “best–intermediate–worse–similar” (Martínez-Blanco et al. 2014).

Scaled inventory data can also be assessed in a comparative manner, as done by Weldegiorgis and Franks (2014). Since the data in question is quantitative and usually has a clear direction as to what is best (e.g., more jobs created is better and fewer accidents is better), the comparison is fairly straightforward.

4.4 Weighting: five main approaches in type I SLCA studies

Characterization is always matched with an explicit or implicit weighting step in order to obtain final results. This step usually occurs after characterization, but it can also be carried out before or simultaneously. As identified in Fig. 4, there are two broad categories of weighting: those based on the relative importance of different social issues and those based on the relative importance of unit processes or life cycle steps. Within these two categories, five main approaches have been identified: (1) implicit equal weighting of all subcategories, categories, unit processes, or life cycle steps; (2) prioritization of the worst performance within a subcategory; (3) weighting according to stakeholders’, experts’, or users’ perception of importance of issues; (4) weighting according to importance of issues based on UN conventions; and (5) weighting according to a variable representing the relative importance of different life cycle steps/unit processes. Different weighting approaches can be combined in a single study.

Fig. 4 Five main approaches in type I weighting step

4.4.1 Equal weighting of subcategories and categories or of unit processes and life cycle steps

Implicit or explicit equal weighting of subcategories, categories, unit processes, or life cycle steps takes place in many SLCA studies (Foolmaun and Ramjeeawon 2012; Ciroth and Franze 2011; Revéret et al. 2015; AGÉCO 2013; Revéret and Parent 2012; Manhart and Griesshammer 2006; Tsurukawa et al. 2011). From this perspective, all items considered are granted an equal weight, regardless of their relative topical importance or extent of their involvement in the life cycle.

4.4.2 Worse performance is prioritized within subcategory

When more than one characterization indicator belongs to a subcategory, Ekener-Petersen and Finnveden (2013) suggested selecting the worst scored indicator as the result for the subcategory. Implicitly, a weighting factor of 1 is given to the worst scored indicator and 0 is granted to the rest of the indicators in the subcategory.
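The implicit weights behind this rule can be made explicit in a short sketch. It assumes, for illustration only, that indicators are scored on a numeric risk scale where a higher score is worse; the function name `worst_indicator_weights` and the indicator names are hypothetical.

```python
def worst_indicator_weights(scores):
    """Select the worst-scored indicator of a subcategory.

    scores maps indicator name -> risk score, with higher = worse (assumption).
    Returns the implicit weights (1 for the worst indicator, 0 for the rest)
    and the resulting subcategory score. If several indicators tie, the
    first one encountered receives the weight of 1.
    """
    worst = max(scores, key=scores.get)
    weights = {name: (1 if name == worst else 0) for name in scores}
    subcategory_score = sum(weights[name] * scores[name] for name in scores)
    return weights, subcategory_score

# Hypothetical indicators for one subcategory, scored from 1 (low) to 4 (very high)
child_labor = {"indicator_a": 2, "indicator_b": 4, "indicator_c": 1}
weights, result = worst_indicator_weights(child_labor)
print(weights, result)
```

Written this way, it is clear that the rule is a weighting choice: the better-scored indicators do not influence the subcategory result at all.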

4.4.3 Weighting according to stakeholders’/experts’/users’ judgment of importance of issues

A number of approaches weigh the importance of social issues according to stakeholders’, users’, or experts’ judgment. The techniques for doing so vary, from simple consultations, to ranking surveys on issues (Manik et al. 2013), to multi-criteria decision-making tools such as the analytical hierarchy process (AHP) (Hosseinijou et al. 2014).
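To give a sense of how a tool such as AHP turns judgments into weights, the sketch below derives weights from a pairwise comparison matrix using the row geometric mean approximation. This is one common variant of AHP and not necessarily the procedure followed by Hosseinijou et al. (2014); the function name `ahp_weights` and the judgment values are hypothetical.

```python
from math import prod

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how much more important issue i is than issue j,
    with pairwise[j][i] == 1 / pairwise[i][j] and ones on the diagonal.
    """
    n = len(pairwise)
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]

# Hypothetical judgments for three social issues: the first is judged twice as
# important as the second and four times as important as the third.
judgments = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(judgments)
print([round(w, 3) for w in weights])
```

The weights sum to 1 and preserve the ranking expressed in the judgments, so the outcome depends directly on who supplies those judgments.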

One key differentiating factor among these approaches is the type of people involved in the consultation. While Manik et al. (2013) focus on a range of third-party stakeholders (from academia, activist organizations, NGOs, and governmental agencies), Hosseinijou et al. (2014) focus on 20 experts, mostly from the technical and business fields (Footnote 15).

These different actors must be taken into account in order to understand the results of a study, as different stakeholders are bound to attribute different values to the issues considered. An example of this is Lehmann et al.’s (2011) study, which relies on the input of decision makers/users (in this case, regional and municipal governments; Footnote 16) at the weighting phase. The study considers integrated waste resource management technologies in Indonesia and packaging waste management technologies in Spain and Portugal. Among the issues that are given most weight are “noise generated” and “odor generated,” issues that are certainly relevant to the technologies considered but that are also intimately tied to matters within the purview of municipal and regional governments. Issues such as labor rights and working conditions, which one could argue are more central to human dignity and well-being, are relegated to a lower level of importance.

4.4.4 Weighting according to importance of issues based on norms and general literature

Another less explored approach is to weigh the importance of social issues according to internationally negotiated norms and literature about precursor rights, as proposed by Beaulieu et al. (2014). The authors base themselves on the ILO Declaration on Fundamental Principles and Rights at Work, the four pillars of the Decent Work agenda (rights at work, employment creation, social protection, and social dialogue), as well as the concept of precursor rights to propose a hierarchy among SLCA subcategories. Precursor rights (e.g., freedom of association) are also called “enabling rights”: their fulfillment usually has a significant knock-on effect on the fulfillment of other rights.

4.4.5 Weighting according to an activity variable

Characterization results for one unit process can be weighted according to the importance of that unit process relative to the others in the system, i.e., its relative importance within the life cycle considered.

The most common way to determine the relative importance of unit processes is through the use of an activity variable. An activity variable is a variable representing a quantifiable activity that can be measured at the different unit processes and scaled to a FU (Norris 2006; Andrews et al. 2009). Technically, any quantitative inventory indicator scaled to a FU could serve as an activity variable (Footnote 17). However, working hours are the most used activity variable, as it is coherent to weight all the issues related to working conditions according to the number of working hours “generated” through the provision of a FU. The Social Hotspots Database (Benoît et al. 2012) uses this activity variable and so do other studies (Bouzid and Padilla 2014).
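A minimal sketch of weighting by an activity variable follows. It assumes one possible reading of the approach, namely a weighted average in which each unit process counts in proportion to its share of the working hours per FU; the function name `activity_weighted_score`, the unit processes, and all numbers are hypothetical and are not taken from the SHDB or any cited study.

```python
def activity_weighted_score(unit_processes):
    """Weight each unit process score by its share of worker hours per FU.

    unit_processes maps a name to (worker_hours_per_fu, characterization_score).
    Assumption (ours): the aggregate is a share-weighted average of scores.
    """
    total_hours = sum(hours for hours, _ in unit_processes.values())
    if total_hours <= 0:
        raise ValueError("total worker hours must be positive")
    return sum(
        (hours / total_hours) * score
        for hours, score in unit_processes.values()
    )

# Hypothetical product system: hours per functional unit and risk scores (1-4)
system = {
    "cultivation": (6.0, 3),
    "processing": (3.0, 1),
    "transport": (1.0, 2),
}
weighted = activity_weighted_score(system)
print(round(weighted, 2))
```

Here the cultivation step dominates the result because it accounts for 60 % of the working hours, even though it is only one of three unit processes.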

Hosseinijou et al. (2014) use a material flow analysis to determine material intensity for the different unit processes. The higher the material intensity is, the higher the weight attributed to the characterization results for a unit process is, on the assumption that “important flows are those involving more stakeholders” (Hosseinijou et al. 2014, p. 627).

Finally, Ekener-Petersen and Finnveden (2013) use a weighting variable that gives a relative importance, not exactly between different unit processes but between the different countries involved in the product system (Footnote 18). The logic is, however, very similar, as the idea is to provide a weighting according to “the countries with most stakeholders potentially affected” (Ekener-Petersen and Finnveden 2013). In other words, countries are weighted according to their significance in the product system, based on their share of the worldwide production of a given raw material or component (e.g., Russia is responsible for 12 % of the worldwide oil extraction) and the share of that raw material or component in the material balance of the product (e.g., oil accounts for 33 % of the weight of a laptop).
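The two shares given in this example can be combined into a single country weight. The sketch below assumes that they are simply multiplied, which is our reading and is not stated in the text above; the function name `country_weight` is hypothetical.

```python
def country_weight(production_share, material_share):
    """Combine two shares into a country weight for one material.

    production_share: country's share of worldwide production of the material
    material_share: share of the material in the product's material balance
    Assumption (ours): the weight is the product of the two shares.
    """
    for share in (production_share, material_share):
        if not 0.0 <= share <= 1.0:
            raise ValueError("shares must lie between 0 and 1")
    return production_share * material_share

# Figures from the example above: Russia extracts 12 % of the world's oil,
# and oil accounts for 33 % of the weight of a laptop.
russia_oil = country_weight(0.12, 0.33)
print(f"{russia_oil:.4f}")
```

Under this assumption, Russian oil extraction would carry a weight of roughly 4 % in the laptop product system.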

This weighting step is sometimes followed by an aggregation across life cycle steps or over the whole life cycle, but this is not covered in this paper.

5 Conclusions

The preceding sections have highlighted the multiplicity of approaches encompassed by type I SLCA today, in terms of the data collected (in particular, its positioning along the impact pathway), the presence of some optional steps, and the different characterization and weighting steps found in the literature. They have also put forward the idea that it is helpful to conceptually distinguish the characterization and the weighting steps from one another, as these are often subsumed and intertwined under the term “characterization model” or “impact assessment method” in the literature. Separating characterization from weighting helps to better understand the differences and similarities between the evaluation methods proposed by practitioners. It is also useful for gaining greater clarity on what importance will be granted to different characterized results and, in this process, on what “bad performances” or “good performances” might be rendered more or less visible within the end results of the SLCA. In this regard, approaches used in multi-criteria assessment methods could be explored in order to strengthen the transparency of weighting methods in SLCA.

5.1 Reflecting upon the methods’ adequacy to assess the subject at hand

It is interesting to take stock of the diversity of approaches and identify which ones are most appropriate for the various questions that SLCA aims to answer. Yet, in this polymorphous context, a question arises: which approaches are most likely to tell us anything about the impact of a product system on social well-being? This question is particularly relevant in the absence of well-documented impact pathways between activities within product systems and impacts on social well-being in type I SLCA. While the article raises more questions than answers in this regard, a few elements can still be suggested.

5.1.1 Indicator choice

First, with regard to data collection, this review has highlighted that the further indicators are positioned along the impact pathway, the harder it is to understand the link between them and the activities of companies in the product system. While it is an interesting proposition to use indicators that are not only located at the beginning of the impact pathway, in the absence of characterization factors reflecting documented causality chains, caution and full transparency with regard to the limits of analysis are necessary when using those types of indicators. Indeed, there is a trade-off between aiming to assess impacts located closer to the notion of human well-being (further positioned along the impact pathway) and ensuring that our assessment is closely related to the object of study, namely the product system.

So far, type II SLCA has mostly turned toward econometric models in order to document causality chains. Yet, these models tend to be applicable at the macro-level (regions and countries), rather than at the company level, which tends to be the purview of most efforts in type I SLCA. It seems that some innovative approaches could be applied in type I SLCA in order to develop a better understanding of causality chains at the company level—the work developed through the concept of theory of change, as put forward in social return on investment approaches, could be an avenue to explore in this regard.

The review has also shed light on the presence of indicators that cannot be reasonably considered as a proven “connector” of social phenomena on the pathway between a product system and potential social impacts (e.g., “risk of not having access to a hospital bed”). While some indicators can be useful to describe the context of operation of a given company, their relationship to the product system is very different from that of social performance data, and this difference should be acknowledged. Ultimately, research and further thinking are necessary in order to understand the relevance of these contextual indicators and the best way to include them in SLCA practice.

5.1.2 Characterization and weighting approaches

Second, as presented in Sect. 4, the variation in terms of the characterization and weighting approaches used in type I SLCA today is wide-ranging. To illustrate this mosaic, a typology of six types of characterization methods and five types of weighting methods was presented. While the aim of this paper is not to promote one method over the other, it is still useful to reflect upon which approaches are most likely to shed light upon the potential impact a product system may have on human well-being. It is also useful to reflect on which aspects of different approaches are bound to bring about greater analytical clarity.

With this in mind, approaches that include a concern for norms, for geographical location, and for stakeholders’ input seem to be best placed to provide richer results. Norms have been a long-standing focus of SLCA and provide a common and well-understood frame of reference. The inclusion of stakeholders’ experience is an effective way to give voice to affected stakeholders and, of all approaches, is probably best positioned to shed light upon the potential impact a product system may have on human well-being. Finally, methods that include information both on performance assessment and on geographical contextualization may be better positioned to provide an assessment of the potential impact of corporate performance, as the operating environment can clearly affect the evolving chain of impacts originating from corporate performance.

This being said, concerns for norms and for stakeholders’ opinion do not always easily co-exist in studies. Key to this uneasy co-existence is the debate over what space should be given to affected stakeholders’ voice in the characterization and weighting phase vs. international norms, best practice, and expert knowledge. The question of contextual specificity vs. universality of assessment models is at the heart of this debate. Yet, as argued above, it seems that methods that do give voice to affected stakeholders are probably best positioned to shed light upon the subject at hand. We recognize, however, the logistical challenges associated with this, as well as the tension between this approach and the aim to develop more universal and “replicable” methods.

With regard to methods including notions of geographical contextualization, the approach proposed by Dreyer et al. (2010) is particularly compelling: it rests on the assumption that challenging operating environments are likely to fuel greater potential social impacts, and hence that outstanding corporate performance is necessary there. Sanchez Ramirez et al. (2014) also propose to consider geographical contextualization, but companies displaying substandard performance and operating in a negative context are granted a better PRP level than those operating in a positive context. It seems that Dreyer et al.’s (2010) approach is more oriented toward the stakeholders’ well-being, whereas that of Sanchez Ramirez et al. (2014) is more oriented toward the expected behavior adopted by companies.

However, approaches that rely solely on expert judgment run the risk of lacking transparency. Those focusing on the distribution of results can provide valuable information, and are often the best able to provide bulk analysis for large portions of tentacular product systems, but their results tend to be very general and do not give any indication as to how the observed situations may stack up against expected norms and stakeholders’ experience. Lastly, approaches which provide a simple comparison of two alternatives are useful for their expediency, but the information obtained is solely geared toward providing a comparison and not a thorough analysis of a product system.

Finally, weighting of different social issues can be done according to their relative importance inside an impact category, inside a stakeholder category, or with regard to the overall set of social issues considered. We suggest that weighting inside a stakeholder category (which we did not encounter in our literature review) might be an approach that allows reflecting upon the potential impact of a company’s performance on stakeholders. Indeed, the process of assessing which subcategory among others might be most important to stakeholders might shed light upon the potential impact a product may have on human well-being.

5.2 Reflecting upon what is type I SLCA

Taking stock of the diversity within type I SLCA also forces us to reflect upon the definition of type I SLCA itself. In the literature, important notions that are associated with type I SLCA are the use of norms-based PRPs, a focus on assessing companies’ activities, and the absence of causality-based characterization factors. Yet, the present critical review has highlighted the variety of reference points mobilized in scales used for characterization. This forces us to reflect upon the current definition of PRPs in order to reconcile the fact that, in light of current practice, this assessment approach is not only based on norms and best practice but also on other points of reference, such as stakeholders’ judgment or average performances within a sector (or country or worldwide), to name a few.

The review has also underlined how the inventory data used focus on company activity but can also focus on immediate effects, further effects, and sometimes even effects further along the impact pathway. Hence, the basic element that differentiates type I SLCA from type II SLCA is the absence of causality-based characterization factors, not an exclusive focus on company activity data.