Introduction

Global cervical cancer incidence increased from 378,000 (256,000–489,000) cases per year in 1980 to 454,000 (318,000–620,000) cases per year in 2010, corresponding to a 0.6 % annual rate of increase. Cervical cancer death rates have been decreasing, but the disease still killed 200,000 (139,000–276,000) women in 2010, of whom 46,000 (33,000–64,000) were aged between 15 and 49 years in developing countries [1]. Cervical cancer is usually preceded by a long phase of pre-invasive disease called cervical intraepithelial neoplasia (CIN). This precursor phase is generally asymptomatic and can extend over a period of 10–20 years [2]. Cytology-based primary screening for these precancerous lesions has reduced the incidence and mortality of cervical cancer since its introduction in the 1940s [3, 4].
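As a quick check of the quoted annual rate, the following sketch computes the compound annual growth rate implied by the two central estimates above. Treating the increase as compound growth over the 30 years is an assumption made here for illustration; the variable names are hypothetical and the uncertainty intervals are ignored.

```python
# Central incidence estimates quoted in the text (cases per year).
cases_1980 = 378_000
cases_2010 = 454_000
years = 2010 - 1980

# Compound annual growth rate: the constant yearly rate that turns
# the 1980 figure into the 2010 figure after `years` steps.
annual_rate = (cases_2010 / cases_1980) ** (1 / years) - 1

print(f"{annual_rate:.1%}")  # 0.6%
```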

High-risk human papillomaviruses (hr-HPV, in this paper referred to as HPV) are definite aetiological agents of almost all cervical carcinomas [5]. Based on the findings of different clinical trials, HPV testing further improves the efficacy of primary screening as compared with cytological screening [6–8]. The HPV test is less specific than the cytology test because the vast majority of infections are transient and are cleared, particularly in young women [9, 10]. HPV testing is therefore not recommended under 30 years of age; in this age group the cytology test is more specific [11] owing to the high prevalence of HPV infections.

Increasing specificity is important in order to reduce the cost of HPV-based cervical cancer screening, while maintaining high sensitivity, although costly, also has public health significance. Primary HPV DNA-based screening with cytology triage and repeated HPV DNA testing of cytology-negative women has been suggested as an effective triage strategy amongst the available techniques; however, cytology testing lacks sensitivity, whereas repeated HPV testing still lacks specificity [12, 13]. An alternative strategy would be HPV genotyping, which is based on the increased cancer risk associated with the HPV-16 genotype; however, genotyping tests are not widely available [14].

Great efforts have been made to identify novel biomarkers that improve the specificity of screening and that could distinguish between productive and transforming HPV infections and/or predict disease severity. Different technologies have been proposed, such as HPV triage-based technologies like p16INK4A and ProexC (combined MCM2 and TOP2A detection), cellular gene or HPV gene promoter methylation-based biomarkers, and microRNAs [15–21].

p16INK4A has been proposed as a biomarker for transforming HPV infection; it was originally introduced to improve the histological and cytological evaluation of cervical precancerous lesions. Usually, p16INK4A is expressed at a very low level in healthy cells, whereas it is strongly over-expressed in almost all CIN2+ cases in which HPV is present [21–24].

Claudins are functional and structural components of tight junctions (TJ) and belong to a large family of transmembrane proteins that regulate paracellular permeability, maintain cellular polarity and play a role in signal transduction [25]. Alterations of claudin expression patterns have been described in many types of gynaecological cancers, such as cervical, endometrial and ovarian cancers, and in premalignant lesions [26–30]. A significant increase in CLDN1 and CLDN7 expression was detected in premalignant cervical lesions and invasive cancers as compared with the normal cervical epithelium [29–31]. These findings are consistent with the facts that TJs are disassembled during tumorigenesis and that overexpressed claudins may have roles in motility, invasion and survival.

In the current study we analysed the value of CLDN1 and p16INK4A immunocytochemistry and immunohistochemistry in cervical cancer diagnostics, in screening and triage settings, and compared the results with those of cytology and HPV testing.

Material and Methods

Patient Population

In total, 502 patient samples were enrolled including 352 cytology [liquid based cytology (LBC)] controlled conisation (both loop and knife) samples (histology samples) and 150 consecutive screening population based LBC samples enrolled in the HPV_SCREEN multi-centre clinical study and the KTI121128 KMR_BIOMARKER study in Hungary (see details in Table 1.). Cases with valid histology or cytology diagnosis were considered eligible. All clinical samples were obtained with the permission of the National Ethical Committee and all patients gave informed consent.

Table 1 Samples collected for the multicentre clinical study. A. Stratification of samples according to the diagnosis made. In total, 502 samples were enrolled comprising 352 cytology controlled conisation samples (both LBC and histology samples) and 150 consecutive screening population based LBC samples. For cytology and HPV testing, test sensitivities and specificities were calculated according to the gold standard histological diagnosis of CIN2+ as the clinical cut-off. B. Breakdown according to the tests performed (cytology and histology, HPV, immunocytochemistry [IC-CLDN1, IC-p16INK4A] and immunohistochemistry [IH-CLDN1, IH-p16INK4A] diagnoses). Samples were considered statistically eligible if all tests were valid.

Cytological Diagnosis

Cytology was evaluated by the Bethesda system for cervical smears and concurrently the PreservCyt (Hologic, Bedford, USA) cervical specimens were used for subsequent testing including HPV and immunocytochemical reactions (see later).

HPV Testing

DNA was extracted from 4 ml of cervical sample collected in PreservCyt medium using the AmpliLute Liquid Media Extraction Kit (Hoffmann-La Roche Ltd., Basel, Switzerland). HPV testing was carried out using the Full spectrum HPV System HPV Amplification and Detection System (Genoid, Budapest, Hungary) according to the manufacturer’s instructions.

Triage Testing

The diagnostic performances of cytology and HPV triage were calculated for the different tests. Sensitivity and specificity of cytology (for ASCUS+ samples) and HPV (for hr-HPV+ samples) triage populations were assessed using gold standard CIN2+ histology as positivity cut-off. For the sake of comparison, results of the pooled population (PP), where the figures represented the whole study population, as well as results of only the triage test positive population (TP) were calculated.
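The difference between the pooled population (PP) and the triage test positive population (TP) can be sketched as follows. This is a minimal illustration, not the study's actual computation: the `Sample` class, the `sens_spec` function and the counts are hypothetical, and it assumes that in the pooled analysis a sample is called positive only when both the base test and the triage test are positive.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    base_positive: bool    # base test result, e.g. ASCUS+ cytology or hr-HPV+
    triage_positive: bool  # triage test result, e.g. a biomarker stain
    cin2_plus: bool        # gold standard: CIN2+ histology


def sens_spec(samples, base_positives_only):
    """Return (sensitivity, specificity) against CIN2+ histology.

    Assumes at least one CIN2+ and one non-CIN2+ sample is evaluated.
    """
    tp = fp = tn = fn = 0
    for s in samples:
        if base_positives_only and not s.base_positive:
            continue  # TP analysis: base test negatives are left out
        # Assumption: positive call requires both tests to be positive.
        called_positive = s.base_positive and s.triage_positive
        if s.cin2_plus and called_positive:
            tp += 1
        elif s.cin2_plus:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, not study data.
samples = (
    [Sample(True, True, True)] * 8
    + [Sample(True, False, True)] * 2
    + [Sample(False, False, True)] * 2
    + [Sample(True, True, False)] * 3
    + [Sample(True, False, False)] * 5
    + [Sample(False, False, False)] * 10
)

pooled = sens_spec(samples, base_positives_only=False)
triage_only = sens_spec(samples, base_positives_only=True)
print("PP:", pooled)
print("TP:", triage_only)
```

With these toy counts the pooled analysis gives lower sensitivity and higher specificity than the TP analysis, because base test negatives enter the pooled table as test negatives.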

Immunohistochemistry and Immunocytochemistry

Immunohistochemistry (IH)

Four μm thick formalin-fixed paraffin-embedded (FFPE) sections were used for the IH reactions. The CINtec p16INK4a Kit (Hoffmann-La Roche, Basel, Switzerland) was used on the slides according to the manufacturer’s instructions. Parallel slides were prepared from each case and immunostained for CLDN1 using an antibody (Zymed, San Francisco, CA, USA) in 1:100 dilution for one hour at room temperature. The reaction was carried out in a Ventana ES automatic immunostainer (Ventana Medical Systems, Inc., Tucson, AZ, USA) using an HRP multimer-based, biotin-free detection technique. Reagents and secondary antibody were obtained from Ventana (iView DAB Detection Kit, Ventana).

Immunocytochemistry (IC)

Cytology slides were prepared using cytospin centrifugation applying 2 ml of LBC PreservCyt sample. The CINtec p16INK4A Cytology Kit (Hoffmann-La Roche) was used according to the manufacturer’s instructions with slight modification. Briefly, a protein blocking reagent (Protein Block Serum Free, Dako Cytomation, Glostrup, Denmark) was used after the peroxidase blocking step, incubating the slides with the reagent for 30 min, followed by two washing steps for 5 min. Parallel cytospin slides were prepared from each case and immunostained for CLDN1 as well as with the CINtec p16INK4A Cytology Kit.

CLDN1 immunocytochemistry was performed by replacing the p16INK4A antibody in the kit with CLDN1 antibody (Zymed) in 1:100 dilution incubated for one hour at room temperature according to the protocol described above. The slides were evaluated by two experienced cytopathologists both blinded to all results.

Evaluation of Immunocytochemistry

For semiquantitative evaluation of CLDN1 and p16INK4A IH, 10 areas were selected and 100 cells per field were analysed using high power field objective (×40). Different protocols were used for evaluation of immunocytochemistry, which assessed staining characteristics with or without the cytomorphological readings of dysplastic cells.

The simple scoring method (SM) was a semiquantitative evaluation of the staining intensity without cytomorphological reading. The following grades were used: 0: no staining, 1: weak staining, 2: medium staining, 3: strong staining. Score 1 was the positivity cut-off value for this method.

The morphological reading adjusted scoring method (MASM) evaluated and calculated staining intensity as described above. In addition, the cytomorphologically positive (ASCUS+) dysplastic cells were assessed, followed by calculation of the percentage of positively stained ASCUS+ cells.
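The two scoring approaches can be contrasted with a short sketch. The function names and the per-cell data layout are hypothetical; the text gives the SM cut-off (score 1) but no numerical cut-off for the MASM percentage, so the sketch only computes that percentage and does not turn it into a positive or negative call.

```python
def sm_positive(intensity, cutoff=1):
    """Simple scoring method (SM): grade 0-3, positive at or above the cut-off."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity grade must be 0, 1, 2 or 3")
    return intensity >= cutoff


def stained_dysplastic_fraction(cells, cutoff=1):
    """MASM ingredient: share of ASCUS+ (dysplastic) cells that are stained.

    cells: list of (is_dysplastic, intensity) tuples, one per evaluated cell.
    """
    grades = [intensity for is_dysplastic, intensity in cells if is_dysplastic]
    if not grades:
        return 0.0  # no dysplastic cells, so no stained dysplastic cells
    return sum(sm_positive(g, cutoff) for g in grades) / len(grades)


# Hypothetical slides, not study data.
mixed_slide = [(True, 3), (True, 0), (False, 2), (True, 2), (False, 0)]
normal_only_slide = [(False, 3), (False, 1)]

mixed_fraction = stained_dysplastic_fraction(mixed_slide)
normal_fraction = stained_dysplastic_fraction(normal_only_slide)
print(mixed_fraction, normal_fraction)
```

The second toy slide shows why a morphology-adjusted reading can lower the number of positive calls: staining of morphologically normal cells would be positive by intensity alone, but contributes nothing to the dysplastic-cell percentage.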

CLDN1 and p16INK4A were also combined in certain evaluations, providing double triage settings (DTS). In triage settings, the triage test was evaluated only in base test positives; however, to provide better comparability of the methods, sensitivity and specificity values were calculated both for the whole population including base test negatives (pooled population) and for the base test positives only (triage of positives).

Statistical Analysis

Two-way Contingency Table Analysis was carried out using JavaStatistics (http://statpages.org/ctab2x2.html). Yates-corrected chi-square, Mantel-Haenszel chi-square, Fisher Exact Test were calculated, and only the significant measures were used in the study. Statistical measurements were calculated, including sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, diagnostic and error odds ratios. Confidence intervals for the estimated parameters were computed by a general method (based on “constant chi-square boundaries”). The gold standard clinical cut-off was CIN2 or greater histological findings (CIN2+) used in contingency tables. To analyse two treatments given to matched subjects, McNemar’s test was used (see supplement material).
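For readers who want to reproduce this kind of analysis, the following sketch derives several of the listed measures from a 2×2 table and computes McNemar’s chi-square for paired tests. It is not the JavaStatistics implementation: the function names and counts are hypothetical, the continuity-corrected form of McNemar’s test is an assumption, and the error odds ratio and confidence intervals are left out.

```python
import math


def diagnostic_measures(tp, fp, fn, tn):
    """Measures from a 2x2 table of test result vs. gold standard."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


def mcnemar(b, c):
    """McNemar's test on discordant pair counts b and c.

    Uses the continuity-corrected chi-square with 1 degree of freedom.
    """
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not study data.
measures = diagnostic_measures(tp=80, fp=30, fn=20, tn=70)
chi2, p_value = mcnemar(b=15, c=5)
print(measures)
print(chi2, p_value)
```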

Results

Study population characteristics are given in Table 1. In the evaluation of test performances CIN2 or greater histological findings (CIN2+) were considered as being positive for cervical disease. In the eligible study population 45.7 % of cases were CIN2+ (133/291) and ASCUS or greater cytology (ASCUS+) diagnosis was detected in 41.7 % (165/395) of eligible cases. Similarly, the prevalence of HPV was high, 45.5 % (179/395) including all cases (Table 1.). The performance of immunochemistry was evaluated in the triage settings for both cytology and HPV triage.
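To illustrate how an interval can be attached to a proportion such as the CIN2+ share reported above (133/291), the sketch below uses the Wilson score interval. This is a swapped-in, commonly used method and not the “constant chi-square boundaries” approach of the study, so its bounds need not match the intervals reported in the tables; the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95 %)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


low, high = wilson_interval(133, 291)
print(f"{133 / 291:.1%} ({low:.1%} to {high:.1%})")
```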

Comparing test performance for CLDN1 and p16INK4A immunochemistry, scoring based evaluation (SM)

Sensitivities and specificities for immunochemistry were calculated for both CLDN1 (Fig. 1b, d; Fig. 2b, c, d) and p16INK4A (Fig. 1a, c; Fig. 2a) against CIN2+ as positives using the simple scoring method (SM), with mean values given. For details see Table 2. The immunocytochemistry (IC) results indicated that IC-CLDN1 sensitivity was slightly higher than IC-p16INK4A sensitivity [77.3 % (68.7–84.6) vs. 69.3 % (60.9–76.3)], whereas specificity of IC-CLDN1 [60.9 % (53.5–67.2) vs. 80.5 % (73.2–86.5)] was found to be lower. The same pattern was evident for immunohistochemistry (IH) [IH-CLDN1 sensitivity: 88.2 % (81.7–93.2) vs. IH-p16INK4A 75.5 % (67–81.1) and specificity: 33.3 % (28.5–37) vs. 68.1 % (62.6–73)], although the differences between the two biomarkers were more prominent, especially in regard to specificity.

Fig. 1 Immunocytochemical reaction for p16INK4a and CLDN1 in LBC samples. Strong dark brown positive reaction can be seen for p16INK4a in the nuclei and cytoplasm of dysplastic cells (a, c). Thin linear membranous reaction (b) and dot-like membranous and cytoplasmic reaction (d) are observable for CLDN1. Several normal cells in the samples do not express the antigens. Scale bars represent 35 μm.

Fig. 2 Immunohistochemical reaction for p16INK4a and CLDN1 in cervical samples. Strong nuclear and cytoplasmic reaction can be seen for p16INK4a (a) in a CIN3 lesion. The reaction for CLDN1 is membranous, as demonstrated in CIN3 (b, d) and CIN1 (c) lesions. CLDN = claudin. Scale bars represent 50 μm.

Table 2 Sensitivity and specificity data of the immunochemistry tests for samples with CIN2+ histology diagnosis as positivity gold standard clinical cutoff

Comparing CLDN1 immunocytochemistry with cytology (Table 2.), only insignificant differences were demonstrable [sensitivity 77.3 % (68.7–84.6) vs. 75.2 % (68.9–80.8) and specificity 60.9 % (53.5–67.2) vs. 66.5 % (61.1–71.2)]; however, p16INK4A immunocytochemistry showed lower sensitivity [69.3 % (60.9–76.3) vs. 75.2 % (68.9–80.8)] but higher specificity [80.5 % (73.2–86.5) vs. 66.5 % (61.1–71.2)]. Comparing IC-CLDN1 with HPV performance, lower sensitivity was observable [77.3 % (68.7–84.6) vs. 95 % (89.9–97.7)] with virtually the same specificity [60.9 % (53.5–67.2) vs. 61.4 % (57.4–63.6)], and the results of p16INK4A immunocytochemistry revealed even lower sensitivity [69.3 % (60.9–76.3) vs. 95 % (89.9–97.7)] but higher specificity [80.5 % (73.2–86.5) vs. 61.4 % (57.4–63.6)]. The immunohistochemistry of CLDN1 showed unacceptably low specificity [33.3 % (28.5–37)] paired with very high sensitivity [88.2 % (81.7–93.2)]. In contrast, p16INK4A immunohistochemistry performed better, showing specificity in the same range as, but slightly higher than, HPV testing [68.1 % (62.6–73) vs. 61.4 % (57.4–63.6)] and a much lower sensitivity [75.5 % (67–81.1) vs. 95 % (89.9–97.7)].

Comparing test performance for CLDN1 and p16INK4A immunochemistry, morphological reading adjusted scoring based evaluation (MASM)

These figures changed significantly when only the full staining of morphologically evident lesions was counted as positive (morphological reading adjusted scoring method, MASM). In general, the sensitivities were found to be lower; however, better correlations were noticeable between the test results in CIN2+ negative cases [e.g. concordance of CIN2+ negatives between IC-CLDN1 and IC-p16INK4A with SM: 69.0 % (59.6–75.8) vs. MASM for the same tests: 85.1 % (76.8–90.0)] (see Supplement Table 1.). Notably, marginal homogeneity was also improved according to the McNemar’s test values, and the MASM evaluation improved the association between the test results. Mirroring the decrease in sensitivity observed with MASM evaluation, the concordance between tests in CIN2+ positive cases was also lower [e.g. concordance of CIN2+ positives between IC-CLDN1 and IC-p16INK4A with SM: 84.0 % (73.8–89.7) vs. MASM for the same tests: 69.3 % (56.8–79.7), see Supplement Table 1.].
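The concordance figures quoted above can be understood through a minimal sketch, assuming that concordance means the share of samples in which two tests give the same result, computed separately for CIN2+ positive and CIN2+ negative cases. The function name and the toy results are hypothetical and are not study data.

```python
def concordance(test_a, test_b, cin2_plus, among_positives):
    """Share of samples on which two tests agree, within one disease stratum.

    among_positives=True restricts to CIN2+ cases, False to CIN2+ negatives.
    """
    pairs = [
        (a, b)
        for a, b, diseased in zip(test_a, test_b, cin2_plus)
        if diseased == among_positives
    ]
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical paired results for nine samples.
cin2_plus = [True, True, True, True, False, False, False, False, False]
marker_a = [True, True, False, True, True, False, False, False, True]
marker_b = [True, False, False, True, False, False, False, True, True]

agree_pos = concordance(marker_a, marker_b, cin2_plus, among_positives=True)
agree_neg = concordance(marker_a, marker_b, cin2_plus, among_positives=False)
print(agree_pos, agree_neg)  # 0.75 0.6
```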

More importantly, the specificities across the markers and test methods were also improved, especially for p16INK4A (see details in Table 2.). IC-CLDN1 sensitivity was similar to IC-p16INK4A sensitivity [53.3 % vs. 52.0 % (43.8–58.6)]; however, the specificity of IC-CLDN1 was improved as compared with the SM evaluation [77.0 % (69.6–83.6) vs. 60.9 % (53.5–67.2)] but was lower than the IC-p16INK4A value [77.0 % (69.6–83.6) vs. 85.1 % (78–90.8)]. The same pattern was evident for immunohistochemistry regarding IH-CLDN1 [SM vs. MASM sensitivity: 88.2 % (81.7–93.2) vs. 52.0 % (44.3–59) and specificity: 33.3 % (28.5–37) vs. 69.3 % (63.6–74.8)], in which case specificity was again found to be highly improved, whereas sensitivity decreased.

Comparison Between Cytology and HPV Test Performance and Immunochemistry

Regarding comparison of cytology and HPV test performances with immunochemistry (both IC and IH), higher specificity was manifest for the biomarkers, especially with MASM evaluation, which outperformed the specificity of either cytology or HPV testing [77.0 % (69.6–83.6) for IC-CLDN1, 85.1 % (78–90.8) for IC-p16INK4A, 69.3 % (63.6–74.8) for IH-CLDN1, 91.2 % (86.4–94.4) for IH-p16INK4A vs. 66.5 % (61.1–71.2) for cytology and 61.4 % (57.4–63.6) for HPV]. Another important feature of this evaluation method was the remarkably similar test performances of p16INK4A and CLDN1 tests, with the noticeable exception of IH-CLDN1 (Fig. 2b, d), whose specificity was much higher than with the SM evaluation but still significantly lower than that of IH-p16INK4A (Fig. 2a, c). Nevertheless, the agreement between IC and IH tests was also highly improved compared with the SM evaluation in test negative cases (see Supplement Table 1.), along with the marginal homogeneity of the test comparisons.

During analysis, however, it was evident that the concordance between tests for a given sample was only moderate (for CIN2+ positives in the range of 39 %–84 %, where the MASM IH-p16INK4A vs. IC-p16INK4A was at the lower, and the MASM IC-CLDN1 vs. IC-p16INK4A was at the higher end of the range). This finding warrants the combination of these markers (Supplement Table 1.). Theoretically, for any marker combination, requiring positivity in either test would tend to improve the sensitivity and lower the specificity. Accordingly, combinations for MASM evaluation were calculated (Table 2.). The expected tendency of the changes in test performance was found to be generally true; the combinations of both the IC and the IH tests showed higher sensitivity together with a moderate decrease in specificity [sensitivity: 69.3 % (60.7–76.8) for combined IC test, 85.3 % (78.4–90.7) for combined IH test; specificity: 73.6 % (66.1–80) for combined IC test, 67.2 % (62–71.2) for combined IH test]. For the combined IH test, high sensitivity and acceptable specificity were found; however, for the IC tests, the non-combined SM evaluation of IC-p16INK4A performed better than the combined IC-CLDN1-p16INK4A MASM evaluation. Overall, the improved sensitivities of the biomarker combinations imply staining or biological variations between the markers in CIN2+ cases. This is underlined by the significant McNemar’s test for immunohistochemistry of p16INK4a vs. immunohistochemistry of CLDN1 regardless of the evaluation method, also indicating differences in staining behaviour (see Supplement Table 1.).
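The theoretical tendency described above, that an “either test positive” rule raises sensitivity and lowers specificity, can be illustrated with a small sketch. The helper names and the ten paired results are hypothetical and are not study data.

```python
def sensitivity(calls, truth):
    """Share of diseased samples that the test calls positive."""
    diseased = [c for c, t in zip(calls, truth) if t]
    return sum(diseased) / len(diseased)


def specificity(calls, truth):
    """Share of non-diseased samples that the test calls negative."""
    healthy = [c for c, t in zip(calls, truth) if not t]
    return sum(not c for c in healthy) / len(healthy)


# Hypothetical paired results: first five samples CIN2+, last five not.
truth = [True] * 5 + [False] * 5
marker_a = [True, True, True, False, False, True, False, False, False, False]
marker_b = [True, False, True, True, False, False, True, False, False, False]

# Combination rule: positive if either marker is positive.
either = [a or b for a, b in zip(marker_a, marker_b)]

for name, calls in [("A", marker_a), ("B", marker_b), ("A or B", either)]:
    print(name, sensitivity(calls, truth), specificity(calls, truth))
```

Because the combined call is positive whenever either marker is, its sensitivity can never fall below that of the better single marker and its specificity can never exceed that of the more specific one; how large the shift is depends on how often the markers disagree.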

Comparing Test Performance for CLDN1 and p16INK4A Immunochemistry in Cytology and HPV Triage

As the major area of application of biomarkers is in different triage settings, HPV and cytology triages with different biomarker tests were calculated. Without the requirement of full staining of the morphologically evident lesions (SM evaluation), the tests generally have higher sensitivities and lower specificities. In our study, the MASM evaluation of the tests gave higher specificities and lower sensitivities, a behaviour which was in agreement with that observed in non-triage settings. As a baseline, triaging cytology with HPV resulted in a sensitivity of 97.6 % (92.3–99.6) and a specificity of 44.9 % (36.0–48.3) (Table 3.). All biomarker-based triage strategies are likely to perform better in this respect, since they have higher specificities. IC-p16INK4A (in the range of 81–86 % over evaluations and triage settings) demonstrated higher specificity but lower sensitivity than IC-CLDN1 (58–76 %), a finding that validates the established application of IC-p16INK4A in triage. Generally, MASM evaluation further improved the specificity to the detriment of sensitivity (see Supplement Table 2.).

Table 3 Sensitivity and specificity data of cytology (ASCUS+) and HPV (hr-HPV+) triage with combined biomarker tests

As the correlation between biomarkers is moderate at the sample level, the combined performance of these markers in triage was also determined. Only combinations for MASM evaluation were calculated (Table 3.). The specificities were found to be highly improved, whereas only small drops in sensitivity were evident. In particular, cytology IC-CLDN1-p16INK4A triage showed good test performance: 70.5 % (62.4–76.9) sensitivity and 72.7 % (57.7–84.7) specificity. The latter value was much better than the 44.9 % (36–48.3) observed for cytology HPV triage, while the otherwise significant drop in sensitivity from 97.6 % (92.3–99.6) (cytology HPV triage) to 70.5 % (62.4–76.9) could be regarded as an acceptable trade-off in cervical cancer screening. However, the SM evaluation, but not the MASM evaluation, of the standalone IC-p16INK4A test is still a competitive alternative.

Discussion

There are many biomarkers proposed for cervical cancer screening, which have been discussed in detail [32, 33]. Well established commercial diagnostic platforms exist for immunochemical methods, however, more clinical data are needed to support their use, particularly regarding well controlled cross-sectional and longitudinal studies where the candidates are assessed alongside concurrent pathology. In general, there is a lack of literature reviewing side-by-side comparisons between competing tests.

A meta-analysis has established the value of p16INK4A use in cytological or histological specimens of the uterine cervix. It has been shown that the proportion of cervical smears over-expressing p16INK4A increases with the severity of cytological abnormality and histological grade; however, the consistency of positive staining varies depending on the severity of the lesion [20]. The immunohistochemical (IH) p16INK4A staining of cervical biopsies with a rigorous evaluation was found to be a moderate diagnostic adjunct for distinguishing biopsies with or without CIN2+ (sensitivity 86.7 %, specificity 82.8 %) [21]. A multicentric study compared the sensitivity and specificity of p16INK4A immunocytochemistry with HPV testing in histologically detected CIN2+ cases, in the triage of atypical cells of undetermined significance (ASCUS) and low grade squamous intraepithelial lesions (LSIL). The IC-p16INK4A sensitivity was found to be similar to HPV testing in both triage settings (ASCUS: 92.6 % vs. 90.1 %; LSIL: 92.0 % vs. 95.7 %). p16INK4A, however, provided significantly better specificity than HPV alone for the triage of ASCUS Pap cytology cases (63.2 % vs. 37.8 %) and LSIL cases (37.1 % vs. 18.5 %) [22]. In their report, these authors discussed the need for more comprehensive and powerful studies to demonstrate the performance of p16INK4A testing in cervical cancer screening. In addition, there is a high variability in the literature regarding the evaluation protocols and cut-offs for p16INK4A immunocytochemistry and immunohistochemistry positivity [20, 23].

Any new cervical cancer biomarker should be compared with existing tests, especially with p16INK4A. To complicate the picture, the value of p16INK4A has been disputed and analysed in a number of studies [20, 23] with variable conclusions; furthermore, there is no clear consensus on what would be the most appropriate method for evaluation of immunochemical results. Another aspect is the highly anticipated and reportedly [20] unmatched performance of p16INK4A immunochemistry in cytology and histology specimens, which has been confirmed by our data on the basis of the concordance of CIN2+ positive cases comparing SM IH-p16INK4A and IC-p16INK4A. In our study the gold standard histology and immunohistochemistry/immunocytochemistry tests for p16INK4A and the performance of a new biomarker claudin 1 (CLDN1) were evaluated in case-control manner, with morphology control at the sample level in certain cases.

The proposal that the morphology of lesions is still a significant aid in the evaluation of both immunocytochemistry and immunohistochemistry has been underlined in our study by the large differences between cytology and histology concerning the performance of p16INK4A, especially in triage (Supplement Table 1.) when applying the traditional immunocytochemistry reading (SM evaluation). Strikingly, the performance of p16INK4A was highly reduced if the evaluation was restricted to only morphologically evident lesions (MASM evaluation) in negative cases, whereas the positive cases exhibited lost concordance, indicating other factors in the background. It is, however, noteworthy that the marginal homogeneity of concordances was found to be improved. This was also true for CLDN1. Apparently, a large number of CIN2+ negative lesions showed false-positive IH-p16INK4A (and IH-CLDN1) staining (see Supplement Table 2.). Moreover, the unspecific staining was most probably a significant factor of false positivity in case of both immunocytochemistry and immunohistochemistry, since all specificities were found to be improved using MASM evaluations. In this regard, the problem of CLDN1 false positivity in case of both IC and IH was more evident (see Supplement Table 2.).

Recent analyses of claudins have suggested that the TJ-based perm-selective barrier system is involved in the regulation of cell proliferation [25, 26]. CLDN1 overexpression was demonstrated in cervical cancer biopsies by cDNA array technology [34]. In another study the expressions of CLDN1 and claudin-7 were gradually increased in accordance with the progression from LSIL to in situ CC, however expression of these proteins was very low in normal cervical epithelium by immunohistochemistry, thus these proteins may serve as diagnostic markers for CINs [28, 29, 31].

Regarding CLDN1 immunocytochemical staining, a remarkable finding in our study was that CLDN1 – p16INK4a concordance (SM evaluation) was very high [e.g. concordance of CIN2+ positives 84.0 % (73.8–89.3); concordance of CIN2+ negatives 69.0 % (59.6–75.8), see Supplement Table 2.]. As a consequence, CLDN1 is a promising, new immunochemistry cervical biomarker with a very similar performance to, but being generally less specific than p16INK4A (see Table 2, Supplement Table 1.). In our study, CLDN1 showed advantages especially in IC (SM evaluation) and stained p16INK4A CIN2+ negative lesions more intensely than p16INK4A did CLDN1 negative lesions (both IC and IH, Supplement Table 1.). This finding was partly true for MASM as well. As a consequence, the application of CLDN1 as a combinational marker with the p16INK4A triage test seems a straightforward strategy for obtaining a balanced sensitivity and specificity.

An effective cytology triage would be a mandatory technological advance in cervical cancer screening, since in the absence of such technology cytology should be replaced by other screening technologies such as HPV testing, resulting in loss of grading information and producing uncertainty in patient management. Our study population was well suited to assess the value of HPV triage strategies compared to cytology triage strategies, since the number of ASCUS cases was high (62/389). In our study, the cytology HPV triage was only slightly different from the HPV cytology triage [97.6 % (92.3–99.6 %) vs. 81.0 % (75.4–86.3 %) for sensitivity and 44.9 % (36.0–48.3) vs. 43.8 % (32.1–54.7 %) for specificity]. More importantly, the cytology biomarker triage and the HPV biomarker triage were not found to significantly differ in case of either evaluation method (see Supplement Table 1.), enabling usage of biomarkers after both tests.

HPV tests are also used in the triage of equivocal cytological abnormalities and post-treatment surveillance [5]. HPV is a sensitive marker for identifying patients at risk for cervical neoplasia and has greater sensitivity than conventional cytology for identifying CIN2-3 cases [7, 8]. In a published pooled analysis of studies with HPV testing, a pooled sensitivity of 96 % (94 %–97 % CI95%) versus 53 % (49 %–57 % CI95%) for cytology was shown for CIN2+, but the pooled specificity was found to be 91 % (90 %–91 % CI95%) versus 96 % (96 %–97 % CI95%) for cytology in women between the age of 18–96 years [6]. The high sensitivity of HPV testing encourages policies to widen the screening interval with reduced overall costs; however, several drawbacks are involved as compared with the high specificity triage options, including lower chances of incidental diagnosis of advanced lesions and possibly lower screening compliance. Our study indicates that triage strategies combining only HPV and cytology (cytology HPV triage or HPV cytology triage) show low specificities. Regarding immunochemical triage strategies, traditional (SM) immunochemistry evaluation generally showed inferior case-control correlation for CIN2+ negatives between cytology and histology in our study, which was improved by MASM evaluation. In this scenario, cytology IC-CLDN1-p16INK4A triage using MASM evaluation showed a performance comparable to HPV IC-p16INK4A triage with the advantage of better CIN2+ negative correlation between IC and IH than the SM based p16INK4A (82.1 % vs 68 %) (Supplement Table 2.). This underlines the importance of morphological readings of cervical smear immunochemistry and shows that cytology can be improved in order to achieve the state-of-the-art of HPV based screening technologies, which might be of interest in the future of screening protocols. In those countries where cytology screening is in place, its replacement with the less informative HPV screening test would result in reduced quality of patient management.

The current study focused on the clinical behaviour of biomarkers specific for cervical pre-cancer and cancer; general proliferative markers were therefore not considered. The published and newly established immunocytochemical dual staining protocol, which is a combination of p16INK4A and the proliferation marker Ki-67 immunochemistries (CINtec® Plus), is based on a novel definition of positivity [35]. Introduction of a proliferative marker is a sound concept regarding the nature of the carcinogenic process; however, taking into account our morphological findings concerning the unspecific and variable staining of diseased cells, further evaluation is necessary. In particular, the existence of p16INK4A and CLDN1 single marker positive diseased cells warrants more fundamental studies on the gene expression variations in cervical pre-cancer and cancer lesions. In our study, by restricting the immunocytochemical and immunohistochemical readings to morphology positive cells, the number of positive test outcomes was significantly reduced for both CLDN1 and p16INK4A covering all methods, and reduced CIN2+ positive concordance was also shown. Even though we found certain advantages regarding the combination of morphology and immunochemistries, neither morphology nor biomarkers, alone or in combination, were able to deliver an ultimate test performance. In conclusion, the combination of different markers is a logical next step for the future. Such studies can lead to a better understanding of the cervical carcinogenesis process and will ultimately result in cervical diagnostic tests with better diagnostic performance.