1 Principle of the Test Method and Scientific Basis

Acute irritation is characterised by the non-immunological inflammatory response of living skin following injury caused by a single contact with an irritant substance. This response is local and reversible (unlike that produced by corrosion, which is irreversible). The in vivo evaluation of skin irritation is mainly based on semi-quantitative visual scoring (erythema and oedema). Besides morphological changes, irritation also involves more-complex, subjective and subtle phenomena, such as itching and burning sensations, which are not easily measurable [1]. Since cytotoxicity is also known (among other factors) to trigger irritation, it can be viewed as a first event likely to be shared by the effects of many irritants. Following mechanical or chemical assault, homeostatic mechanisms may be deregulated, leading to non-specific inflammation processes triggered by inflammatory mediators originating mainly from keratinocytes [2]. Cell and tissue damage lead to the release of inflammatory mediators, nerve stimulation, axonal reflexes, pain and itching [3,4,5]. The inflammatory response ultimately leads to observable phenomena such as localised skin swelling (oedema) and redness (erythema). Overall, clinical signs of irritation include the development of a rash, inflammation, swelling, scaling, and abnormal tissue growth in the affected area (Fig. 4.1).

Fig. 4.1 Schematic of skin irritation effects

Initially, to conduct the skin irritation assessment, most regulatory authorities required a standardized in vivo test in which—having first excluded skin corrosion potential—the chemical was applied to the skin of a maximum of three rabbits [6]. The ability of the chemical to induce erythema and/or oedema was scored per animal. A score of between 0 and 4 on the Draize scale, increasing with severity, was subjectively assigned on the basis of erythemal and oedemal effects, usually at 24, 48 and 72 h after application of the substance [7]. However, scientific concerns about the variability [8, 9] and predictive capacities of this animal test in terms of human health effects [10,11,12] were raised.

Animal welfare concerns and, more recently, political pressure in Europe in areas such as legislation relating to chemicals and cosmetics have required the development of appropriate and validated alternative in vitro test methods [13]. In the last 20 years, considerable scientific effort has gone into developing valid in vitro skin models to replace animal testing. Initial progress was made through the availability of non-invasive bioengineering methods applicable to the skin in vivo, such as measurements of transepidermal water loss and electrical resistance. These methods permitted the quantification of physiological changes and opened up new possibilities for in vitro/in vivo comparison [14, 15]. Based on these observations, various in vitro models such as primary human keratinocytes [16] and human skin equivalent models [17,18,19] were evaluated for their ability to assess cutaneous toxicity or irritation. Due to the increasing need for non-animal tests to predict human skin irritation, the European and Japanese Centers for the Validation of Alternative Methods (EURL-ECVAM/JaCVAM) have focused their evaluation on four suitable in vitro reconstructed human epidermis test methods. These now-validated methods have similarly defined characteristics (Fig. 4.2) and include the SkinEthic™ RHE test method [20,21,22].

Fig. 4.2 Specific key points: defined and flexible points

The three-dimensional SkinEthic™ RHE tissue, based on a pioneering concept by Dr. Prunieras, was first released by Martin Rosdy in 1989 [23, 24]. The SkinEthic™ RHE model consists of normal human keratinocytes cultured at the air-liquid interface in a chemically defined growth medium. It produces a highly differentiated and stratified epidermis model comprising the basal, suprabasal, spinous and granular layers and a functional stratum corneum, with a histological morphology comparable to in vivo human tissue [25, 26]. The validated SkinEthic™ RHE skin irritation test method involves topical application of chemicals for 42 min, followed by rinsing and post-incubation for 42 h. Irritant chemicals are identified by their ability to decrease tissue viability (measured by MTT reduction) to or below the defined threshold of 50% viability.

2 Current Validation Status

The reliability and relevance of the SkinEthic™ RHE skin irritation test method have been established through a rigorous inter-laboratory validation study. Based on its scientific validity, this test method has been recommended for the testing of all classes of chemicals and for inclusion in tiered testing strategies [27]. The SkinEthic™ RHE test method was originally validated on the basis of the Performance Standards using the 20 defined reference chemicals (ESAC statement from November 2008; [28]). The test method was found scientifically valid in reliably predicting no-label and R38 (irritant) chemicals with respect to the previous EU classification scheme [29]. A re-evaluation, in which the predictive values of the test method were recalculated under the United Nations (UN) Globally Harmonized System of Classification and Labelling of Chemicals (GHS), was performed in 2008 and confirmed in April 2009 by ESAC for use under the UN GHS system as “applicable to all authorities” [29,30,31]. As a result, since 2010, the SkinEthic™ RHE test method has been accepted in the official Organisation for Economic Co-operation and Development (OECD) Test Guideline 439 (OECD TG 439), allowing the identification of non-irritant and irritant substances and mixtures in accordance with UN GHS and the EU test method B.46 [32,33,34]. The SkinEthic™ RHE test method was also recently included as part of the Integrated Approach to Testing and Assessment (IATA) for Skin Irritation/Corrosion in OECD Guidance Document 203 [27, 35].

3 Performance and Applicability of the Test Method

3.1 Reproducibility

Two types of reproducibility were evaluated for the SkinEthic™ RHE test method: one by testing the same chemicals over time in a single laboratory (within-laboratory reproducibility, WLR) and the other by testing the same chemicals in different laboratories (between-laboratory reproducibility, BLR). WLR was calculated as the percentage of chemicals for which identical classifications were obtained in the three valid runs performed. BLR was calculated as the percentage of chemicals for which identical classifications were obtained between laboratories.
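The two percentages defined above can be sketched in a few lines of Python. This is a minimal illustration, not the data handling used in the validation study: the nested-dictionary layout, the function names and the labels "I"/"NI" are hypothetical, and the between-laboratory function assumes that a chemical counts as concordant only when every run in every laboratory gives the same classification.

```python
from typing import Dict, List

# Hypothetical layout: results[chemical][laboratory] -> labels, one per valid run.
Results = Dict[str, Dict[str, List[str]]]


def within_lab_reproducibility(results: Results, lab: str) -> float:
    """Percentage of chemicals given identical classifications in all runs of one lab."""
    runs_per_chemical = [labs[lab] for labs in results.values() if lab in labs]
    if not runs_per_chemical:
        raise ValueError(f"no results for laboratory {lab!r}")
    concordant = sum(1 for runs in runs_per_chemical if len(set(runs)) == 1)
    return 100.0 * concordant / len(runs_per_chemical)


def between_lab_reproducibility(results: Results) -> float:
    """Percentage of chemicals given identical classifications by every laboratory.

    Assumption: all runs of all laboratories must agree for a chemical to count.
    """
    if not results:
        raise ValueError("no results supplied")
    concordant = 0
    for labs in results.values():
        labels = {label for runs in labs.values() for label in runs}
        if len(labels) == 1:
            concordant += 1
    return 100.0 * concordant / len(results)


# Invented example data: two chemicals, two laboratories, three runs each.
example: Results = {
    "chemical_A": {"lab_1": ["I", "I", "I"], "lab_2": ["I", "I", "I"]},
    "chemical_B": {"lab_1": ["NI", "NI", "NI"], "lab_2": ["NI", "I", "NI"]},
}

print(within_lab_reproducibility(example, "lab_1"))
print(within_lab_reproducibility(example, "lab_2"))
print(between_lab_reproducibility(example))
```

Other conventions are possible, for instance reducing each laboratory's runs to a single final classification before comparing laboratories; the choice affects how a chemical with one discordant run is counted.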

The reproducibility study involved evaluating the ten non-irritant and ten irritant reference test chemicals selected in accordance with the Performance Standards document [36]. The 20 chemicals were coded by Vitroscreen and subjected to blind tests in three laboratories: L’Oréal, Coty and Oroxcell. When considering irritants versus non-irritants, concordant classifications across the three laboratories were observed for 59 out of 60 items (98.3%) [20]. In two laboratories, none of the test substances showed a standard deviation (SD) > 18%; in the third laboratory, only allyl phenoxy-acetate gave an unacceptable SD > 18%, which supports the reproducibility of the test method. The proportion of identically classified test substances derived from the prediction model was 100% for two laboratories and 95% for the third laboratory, when considering all experiments [20]. In conclusion, regardless of the analysis, low intra- and inter-run variability was observed in all laboratories with the negative and positive controls, and the results for the 20 reference test substances indicated high intra- and inter-laboratory reproducibility.

3.2 Predictive Capacity

The study conducted by industry was submitted to EURL-ECVAM for evaluation and peer review. The SkinEthic™ test method was regarded by EURL-ECVAM as sufficiently similar to the validated EpiSkin™ method according to the European Classification System based on the Dangerous Substance Directive (DSD) [28]. Sensitivity and specificity for the 20 reference chemicals were 90% and 80%, respectively [20]. With an overall accuracy of 85%, the results obtained in the three laboratories met the EURL-ECVAM sensitivity (≥80%) and specificity (≥70%) requirements [36]. EURL-ECVAM also evaluated the test method in its in-house laboratory (called ‘Correlate’) with regard to transferability. Based on 19 of 20 test chemicals, a sensitivity of 90% and a specificity of 77.8% were reached (data available in Annexe 5 of the OECD Explanatory Background Document; [33]). The same three test substances (1-bromo-4-chlorobutane, 4-methyl-thio-benzaldehyde and hexyl salicylate) were misclassified as in other reconstructed epidermis test methods [20, 22, 33]. No clear difference in physicochemical properties between the correctly and incorrectly classified test substances was identified to explain this outcome [37]. Increasing the number of test chemicals to 39 led to a similar predictive capacity, with a sensitivity of 90%, a specificity of 80% and an overall accuracy of 85% (33 out of 39 test substances correctly classified) [38].
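The relationship between the reported sensitivity, specificity and accuracy can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the counts passed to it (9 of 10 irritants and 8 of 10 non-irritants correctly predicted for the three-laboratory study; 9 of 10 and 7 of 9 for the in-house evaluation) are inferred from the reported percentages and the ten/ten split of the reference chemicals, not taken from the study tables.

```python
from typing import Tuple


def predictive_capacity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) in percent.

    tp/fn: irritants predicted as irritant / as non-irritant
    tn/fp: non-irritants predicted as non-irritant / as irritant
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts inferred from the reported 90% / 80% / 85% on 20 reference chemicals.
three_lab_study = predictive_capacity(tp=9, fn=1, tn=8, fp=2)
print(three_lab_study)

# Counts inferred from the reported 90% / 77.8% on 19 chemicals.
in_house = predictive_capacity(tp=9, fn=1, tn=7, fp=2)
print(tuple(round(value, 1) for value in in_house))
```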

In December 2008, the EU adopted and implemented the UN GHS [29] through the Classification, Labelling and Packaging (CLP) Regulation [39]. This regulation replaced the previous EU DSD legislation [40] on the classification of substances and mixtures. The CLP system continues to use two categories to distinguish non-classified (No Category) from irritant (Category 2) substances. However, according to the new rules for skin irritation classification and labelling (C&L) [29, 39], the cut-off score to distinguish between No Category and Category 2 substances was raised to 2.3 (UN GHS or CLP) from 2.0 (EU DSD). Consequently, substances with an in vivo score of between 2.0 and 2.3 that were considered irritant under EU DSD are now non-classified under UN GHS. This naturally led to a change in the specificity and sensitivity values. Since UN GHS defines irritants as substances with a score of 2.3 or more, the sensitivity of the SkinEthic™ test system was increased to 100% and the specificity decreased to 69.2% using the 20 reference chemicals. Overall accuracy was 80%, resulting in the test method being endorsed by the EURL-ECVAM and OECD Committees as a stand-alone replacement test method for the in vivo Draize rabbit test [41].
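The effect of moving the cut-off can be illustrated with a small sketch. The two cut-off values come from the text; the function name, the assumption that a score equal to the cut-off counts as irritant under both schemes, and the example scores are illustrative only.

```python
DSD_CUTOFF = 2.0  # EU DSD cut-off quoted in the text
GHS_CUTOFF = 2.3  # UN GHS / CLP cut-off quoted in the text


def in_vivo_class(score: float, cutoff: float) -> str:
    """Classify an in vivo score against a cut-off (score >= cutoff -> irritant)."""
    return "irritant" if score >= cutoff else "not classified"


# Invented scores: only the middle one changes class between the two schemes.
for score in (0.7, 2.1, 3.0):
    print(score, in_vivo_class(score, DSD_CUTOFF), in_vivo_class(score, GHS_CUTOFF))
```

Because some reference chemicals move from the irritant to the non-classified group, the denominators of sensitivity and specificity change even though the in vitro results do not. The reported values are consistent with 7 of 7 irritants (100%) and 9 of 13 non-classified chemicals (69.2%) being correctly predicted, giving 16 of 20 (80%) overall; this split is inferred from the percentages and is not stated explicitly in the text.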

3.3 Applications and Limitations

This test is designed for mono- and multi-component test chemicals and mixtures. The protocol was established for liquid, viscous, semi-solid and solid chemicals. Topical application to the epidermis makes the method suitable for evaluating chemicals that are soluble or insoluble in water, volatile, creamy, sticky, fatty, powdered, etc. The inclusion of HPLC/UPLC-spectrophotometry to measure formazan in the procedures for the in vitro SkinEthic™ RHE test method also extends its applicability to strongly coloured chemicals [42]. The test method is not appropriate for testing gases and aerosols.

3.4 Comparison to Human Data

The in vivo Draize rabbit skin irritation test is an accepted regulatory method of classifying and labelling chemicals. As such, the classification and labelling results of this test were taken as the “gold standard” in the context of the validation study for the reconstructed human epidermis models. Several large-scale studies on human volunteers conducted in the 1990s concluded that the in vivo rabbit test often over-predicts the severity of skin reactions and damage produced by chemicals, although there was also occasional under-prediction [43,44,45,46]. As reported by Jirova et al. [47], concordance between the rabbit test and the results of the 4-h human patch test (HPT) was rather poor (56%), whereas the reconstructed human epidermis methods provided more convincing results. The results presented in Table 4.1 confirm observations that rabbit tests over-predict skin effects in humans. Given that the SkinEthic™ RHE test method was validated against the over-predictive rabbit test, its predictions err on the side of caution for the safety of consumers, which is essential in the context of risk assessment.

Table 4.1 Summary table of in vivo and in vitro results

4 Brief Description of the Protocol

Each test chemical (test material, negative and positive controls) is topically applied to three tissue replicates concurrently for 42 min at room temperature (RT), between 18 °C and 24 °C. Exposure to the test chemical is followed by rinsing with phosphate-buffered saline (PBS) and mechanical drying. The epidermis is then transferred to fresh medium and incubated at 37 °C for another 42 h. Cell viability is measured by enzymatic conversion of the vital dye MTT [3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, thiazolyl blue; CAS number 298-93-1] into a blue formazan salt that is quantitatively measured after extraction from the tissues [48]. For this purpose, the tissues are incubated for 3 h with 0.3 mL MTT solution (1 mg/mL). The formazan crystals are then extracted using 1.5 mL isopropanol for 2 h at RT and quantified by spectrophotometry at a wavelength of 570 nm. Epidermis treated with 5% sodium dodecyl sulphate (SDS) and epidermis treated with PBS are used as positive and negative controls, respectively. For each treated tissue, cell viability is expressed as a percentage of the mean of the negative control tissues. A mean relative viability above 50% predicts that the test chemical is non-irritant. Irritant chemicals are identified by their ability to decrease cell viability to or below the defined threshold (i.e. ≤50%, for UN GHS Category 2). The prediction model is summarised in Table 4.2. Details are provided in the SOP [49] and described in [20]. Key components of the protocol are also available at http://www.episkin.com.
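The viability calculation and the 50% decision rule described above can be sketched as follows. This is a minimal illustration rather than the SOP's data-processing procedure: the function names and the optical density (OD) values are hypothetical, and the readings are assumed to be already blank-corrected absorbances at 570 nm.

```python
from statistics import mean
from typing import List

IRRITANT_THRESHOLD = 50.0  # percent viability; at or below this value -> irritant


def percent_viability(treated_ods: List[float], negative_control_ods: List[float]) -> List[float]:
    """Express each treated tissue's OD as a percentage of the mean negative control OD."""
    nc_mean = mean(negative_control_ods)
    return [100.0 * od / nc_mean for od in treated_ods]


def predict(treated_ods: List[float], negative_control_ods: List[float]) -> str:
    """Apply the prediction model to the mean viability of the tissue replicates."""
    mean_viability = mean(percent_viability(treated_ods, negative_control_ods))
    if mean_viability <= IRRITANT_THRESHOLD:
        return "irritant (UN GHS Category 2)"
    return "non-irritant (No Category)"


# Invented OD readings for three tissue replicates per condition.
negative_control = [1.0, 1.0, 1.0]
print(predict([0.20, 0.30, 0.25], negative_control))
print(predict([0.90, 0.80, 1.00], negative_control))
```

A real analysis would also check the acceptance criteria for the negative and positive controls and the variability between replicates before applying the prediction model; those checks are omitted here.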

Table 4.2 Prediction model of the SkinEthic RHE skin irritation test method

5 Role in a Testing Strategy

The evaluation of the skin irritancy and corrosivity potential of a test chemical is a vital part of safety assessment. Alternatives to the rabbit Draize test for skin corrosivity have already received official approval, including human skin model tests using reconstructed human epidermal equivalents such as the SkinEthic™ RHE skin corrosion test method (see Chap. 10). For skin irritation, the SkinEthic™ RHE skin irritation test method was validated as a stand-alone test replacement for the rabbit Draize test (see above). In light of the full evaluation of local skin effects after a single dermal exposure using in vitro test methods, the OECD Guidance Document No. 203 on an Integrated Approach to Testing and Assessment (IATA) was established [27]. This IATA approach includes in vitro tests for skin corrosion (as described in OECD TG 431) and skin irritation (OECD TG 439) before considering testing on living animals [50].

The top-down approach (an in vitro skin corrosion test followed by an in vitro skin irritation test if the chemical is identified as non-corrosive in the first test) should be used when all available collected information and the weight-of-evidence (WoE) assessment result in a high a priori probability of the chemical being an irritant or a corrosive. The bottom-up approach (an in vitro skin irritation test followed by an in vitro skin corrosion test if the chemical is identified as an irritant in the first test) should be used only when all available collected information and the WoE assessment result in a high a priori probability of the chemical not being a skin irritant.
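The order of testing in the two approaches can be written as a short decision sketch. The function names and category labels are illustrative, and the two test functions passed in are placeholders standing for the outcome of an in vitro skin corrosion test and an in vitro skin irritation test; they are not part of any real library.

```python
from typing import Callable

Test = Callable[[str], bool]  # placeholder: returns True if the in vitro test is positive


def top_down(chemical: str, is_corrosive: Test, is_irritant: Test) -> str:
    """Corrosion test first; irritation test only if the chemical is non-corrosive."""
    if is_corrosive(chemical):
        return "Category 1 (corrosive)"
    if is_irritant(chemical):
        return "Category 2 (irritant)"
    return "No Category"


def bottom_up(chemical: str, is_corrosive: Test, is_irritant: Test) -> str:
    """Irritation test first; corrosion test only if the chemical is an irritant."""
    if not is_irritant(chemical):
        return "No Category"
    if is_corrosive(chemical):
        return "Category 1 (corrosive)"
    return "Category 2 (irritant)"


def choose_strategy(prior_suggests_irritant_or_corrosive: bool) -> Callable[[str, Test, Test], str]:
    """Pick the approach from the weight-of-evidence prior, as described in the text."""
    return top_down if prior_suggests_irritant_or_corrosive else bottom_up


# Invented outcomes for a hypothetical chemical: non-corrosive but irritant.
strategy = choose_strategy(prior_suggests_irritant_or_corrosive=True)
print(strategy("chemical_X", lambda c: False, lambda c: True))
```

The practical difference lies in how many tests are run: a chemical expected to be non-irritant needs only one test in the bottom-up approach, whereas a chemical expected to be corrosive needs only one test in the top-down approach.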

To demonstrate the application and relevance of both approaches using the SkinEthic™ RHE test methods, SkinEthic™ RHE irritation and corrosion data on 86 substances were integrated in a bottom-up and a top-down testing strategy to assess their capacity for hazard and safety assessment under UN GHS classifications [35, 42, 51]. The results showed that the SkinEthic™ RHE model was applicable to a wide range of chemical classes and physical states. The bottom-up and top-down testing strategies showed an identical number of correct and incorrect classifications for the different (sub-)categories (Fig. 4.3). Overall, both strategies showed an accuracy of 89.5% in distinguishing between non-classified and classified substances, and 93.4% in distinguishing between corrosive and non-corrosive substances (Fig. 4.3). Furthermore, sensitivity was excellent for UN GHS Category 1 chemicals (100%) and lower for Category 2 irritant substances (70%), irrespective of the strategy and classification system used. Interestingly, none of the skin corrosive Category 1A or Category 1B-and-1C chemicals was under-predicted as a skin irritant (Category 2) or as non-classified, irrespective of the strategy and classification system used, suggesting that the SkinEthic™ RHE model ensures consumer safety when used in the context of the OECD-recommended IATA. Only a single non-classified substance (2,4-xylidine) was over-predicted as Category 1B-and-1C and none as Category 1A, suggesting that the SkinEthic™ RHE model also helps to avoid unnecessary over-labelling.

Fig. 4.3 In vitro classifications for the 86 test chemicals in the bottom-up and top-down testing strategies, based on the UN GHS classification system. Correct (sub-)category classifications; U-P, under-predicted; O-P, over-predicted

6 Perspectives from the Test Developer

6.1 Critical Steps in the Protocol

The critical steps of the standard operating procedure are as follows:

  • Verify the absence of air bubbles under the epidermis at each step;

  • Test each test chemical alone in a separate plate;

  • For liquids (16 μL ± 2 μL), dispense the substance onto the epidermis with a positive displacement pipette and apply a nylon mesh to gently spread the substance, taking care to cover the entire surface;

  • For solids (10 ± 2 μL H2O and 16 ± 2 mg test item), the substance should be crushed to a fine powder, ensuring good contact with the epidermis;

  • For viscous and sticky chemicals, use a curved flat spatula or weigh directly on the nylon mesh;

  • Apply the chemical-coated side of the nylon mesh to the epidermal surface;

  • Carefully remove the nylon mesh before rinsing;

  • Rinse the tissue thoroughly;

  • Seal the plate thoroughly by stretching three layers of parafilm over it to prevent evaporation during the formazan extraction step.

6.2 Possible Protocol Adaptations

In all reconstructed epidermis test methods, the skin irritation potential of a chemical is determined by measuring tissue viability in treated tissues after topical application to the tissue surface. Tissue viability is determined by enzymatic reduction of MTT tetrazolium salt to purple reduced MTT (formazan) [48]. A known limitation of the photometric MTT-reduction assay is the possible interference of coloured test chemicals with the absorbance measurement of formazan. Analytical methods such as High/Ultra High Performance Liquid Chromatography (HPLC/UPLC) might be more appropriate to detect formazan in the in vitro assay. Cosmetics Europe undertook a study to establish and evaluate the use of this analytical method [42]. Based on the outcome of this project, it was concluded that this analytical endpoint detection system is relevant to all test methods, irrespective of the test system and test method used (e.g. SkinEthic™ RHE skin irritation assay). It was therefore recommended that the OECD Test Guideline 439 be revised to incorporate HPLC/UPLC-spectrophotometry as an additional endpoint detection system in the technical procedures for the in vitro SkinEthic™ RHE skin irritation test method [32].

6.3 Challenges and Opportunities

Challenges and opportunities might be seen in the context of the assessment of specific categories of ingredients (e.g. mixtures) as well as for UN GHS categorization using the SkinEthic™ RHE test method.

The SkinEthic™ RHE test method distinguishes between skin irritants (Cat. 2) and chemicals not classified for skin irritation (No Cat.). However, the test method is not designed to classify chemicals in the optional GHS Cat. 3 (mild irritants). Developing a test method that exploits quantitative analysis of the expression profiles of relevant genes might be considered, since such an approach has been established and defined using the EpiSkin™ RhE-based test system [52].

Mixtures are defined as “a mixture or a solution composed of two or more substances in which they do not react” [34]. Since mixtures cover a wide spectrum of categories and compositions, the type of regulatory testing required may depend on the type of mixture. For example, cosmetic formulations can no longer be tested using animal studies in some parts of the world [53]. In contrast, biocides including mixtures may be subject to specific testing requirements [54]. As such, depending on the field and/or sector, the use of validated in vitro assays to assess mixtures is of relevance. Cases in which in vitro testing of preparations and mixtures could be useful and relevant include cosmetics, cleaning products, biocides and plant protection products [55]. Although high-quality in vivo data exist for these mixtures, not all of them are publicly available, allowing only limited comparisons between the effects observed in vivo and in vitro. Access to in vivo data will permit a better definition of the applicability domain of the test method for mixtures with complex physical properties such as hydrophobicity, sticky/buttery-like texture and waxy/creamy foam characteristics. Further investigation would also be beneficial for agrochemicals, given the limited and contradictory nature of the information available and the difficulty in interpreting the data when the composition of the mixtures has not been identified, as reported for another RhE-based test method [56, 57].

7 Conclusions

The SkinEthic™ RHE skin irritation test method has gained international regulatory acceptance and has been adopted for the regulatory assessment of skin irritation to distinguish between EU CLP/UN GHS Category 2 (irritant) and non-classified (No Category, non-irritant) chemicals (OECD TG 439). Within- and between-laboratory reproducibility findings indicate that the SkinEthic™ RHE model is highly robust in terms of its performance with an enlarged dataset of diverse chemicals and mixtures. Furthermore, the relevance of integrating SkinEthic™ RHE skin irritation data in a bottom-up or top-down strategy has been demonstrated, with similarly high accuracy in determining the potential hazard of chemicals.