Introduction

Colorectal cancer (CRC) mortality can be reduced by the timely detection of adenomas and cancer through colonoscopy [1, 2]. Since colonoscopy is a burdensome and costly procedure, repeated faecal immunochemical testing (FIT) to select individuals at risk for advanced neoplasia is often used in population-wide screening programmes [3]. Screening participants testing positive are referred for colonoscopy, while those with a negative result are re-invited for the next round of FIT screening two years later. FIT is not optimal in detecting advanced neoplasia (AN), the umbrella term for CRC and advanced adenomas [4]. Furthermore, more than half of those who test positive at a threshold of 15 µg Hb/g faeces do not have AN at colonoscopy [5].

Compared to invitations for colonoscopy based on FIT only, combining known CRC risk factors with FIT to select individuals for colonoscopy may be a better alternative. Our group previously developed a cross-sectional risk model [6]. In that study, screening-naïve individuals undergoing primary colonoscopy screening both performed a FIT and completed a questionnaire on several CRC risk factors. By combining the quantitative FIT result with these risk factors (smoking, family history for CRC, sex, and age) in a logistic regression model, we could calculate the risk of detecting AN at colonoscopy. The performance of this model in detecting individuals with AN was significantly better than that of FIT only, without increasing the number of colonoscopies performed. Others have reported similar results by combining FIT with CRC risk factors, sometimes considering other variables as well, such as BMI, diabetes mellitus, or participation status in previous screening rounds [7,8,9].

One of the disadvantages of risk-based screening is the need to collect additional information from participants. Participation rates with FIT screening in the Netherlands currently hover around 72% [10]. Requesting more information on risk factors from participants could be a barrier to participation, jeopardising the benefits of CRC screening: a reduction in CRC-related morbidity and mortality.

So far, no risk models have been evaluated in a comparative study with conventional FIT-only screening, using a single positivity threshold in all participants. Also, the effect of adding a questionnaire on screening participation is unknown. We designed a randomised controlled trial within the Dutch organised FIT-based screening programme to evaluate the yield of AN using a risk model compared to the yield using FIT only, without increasing the number of colonoscopies.

Methods

Study design

This was an invitation-based, parallel-group, unblinded, randomised controlled trial embedded in the Dutch FIT-based organised screening programme. The study was reported according to the CONSORT guidelines for parallel-group randomised trials [11]. By law, the Dutch National Health Council assesses protocols of studies to be performed within the national screening programme. Ethical approval was granted on 20 July 2018. The study protocol was registered at ClinicalTrials.gov under file number NCT04490551.

Study group

We selected 23,000 individuals and randomised them to one of two arms: a risk-model group and a FIT-only group. Eligible for participation were invitees for the second round of the Dutch national screening programme, between 56 and 75 years of age. We invited those living within a 25-km radius of the study centre (Amsterdam University Medical Center, location University of Amsterdam) from neighbourhoods that had a screening uptake similar to the national average. The inclusion criteria and the procedures around FIT analyses and follow-up colonoscopy were similar to those of the national screening programme. A detailed description can be found in the supplementary information.

Randomisation and masking

Pre-randomisation of the selected study group was carried out by the regional screening organisation, the Foundation for Population Screening Mid-West, using the randomisation function in SQL Server 2016 (Microsoft, Redmond, Washington, USA) with a 1:1 allocation ratio to the risk-model group or the FIT-only group. Invitees were not blinded.

Procedures

Invitation and informed consent

Between 6 December 2019 and 9 March 2020, we mailed study invitations to all selected individuals, including a study information leaflet, an informed consent form, and a return envelope. Study invitations were sent 4–6 weeks before invitees were scheduled to receive the FIT from the national screening programme. All study invitees were urged to return their informed consent form and questionnaire (if applicable) before they returned the FIT. Participants could return their informed consent form to the study centre until 31 December 2020. This allowed enough time for all invitees who were willing to participate to return the informed consent form, questionnaire (if applicable), and FIT.

Questionnaire

We designed a one-page questionnaire, which was printed on the back of the informed consent form and sent to all study invitees in the risk-model group (supplementary information). The questionnaire contained five questions, of which two were used in this study: one was aimed at identifying individuals with family members with CRC; the other was aimed at current smoking status. If a question was left unanswered, we assumed that the risk factor concerned was not present.

Risk model

We used an updated version of the risk model developed by Stegeman et al. [6]. This model is based on a logistic regression analysis and calculates the risk of detecting AN at colonoscopy. We dropped the variable calcium intake from the original model because leaving it in would substantially increase the questionnaire length and because its contribution to the risk of AN was anticipated to be very limited according to data from the development study. We also added sex (at birth) to the model, although this factor was not significant in the development study. Sex has consistently been shown to be a risk factor for CRC in other studies and is known for all invitees [12]. We dichotomised the variable family history for CRC as yes/no.

The risk model in this study has the following variables: FIT result (in µg Hb/g faeces), square root of the FIT result, age (in years), sex, current smoking behaviour (smoker or non-smoker), and family history for CRC (present or absent). We fitted this reduced model to the risk model development set with recalibration, adjusting for the anticipated differences in age and sex distribution between the risk model development set and this study group [13].
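To make the structure of the model concrete, the sketch below evaluates a logistic regression with the six terms listed above. The coefficient values are placeholders invented for illustration; the fitted coefficients are not given in this text, so the output should not be read as a real risk estimate. The function and parameter names are likewise hypothetical.

```python
import math

# HYPOTHETICAL coefficients, for illustration only; the fitted values
# of the study model are not reported in this text.
COEF = {
    "intercept": -6.0,
    "fit": 0.001,          # per µg Hb/g faeces
    "sqrt_fit": 0.40,      # per square root of µg Hb/g faeces
    "age": 0.03,           # per year
    "male": 0.30,
    "smoker": 0.50,
    "family_history": 0.40,
}

def predicted_an_risk(fit, age, male, smoker, family_history, coef=COEF):
    """Return the modelled probability of AN at colonoscopy (0 to 1)."""
    linear = (
        coef["intercept"]
        + coef["fit"] * fit
        + coef["sqrt_fit"] * math.sqrt(fit)
        + coef["age"] * age
        + coef["male"] * male
        + coef["smoker"] * smoker
        + coef["family_history"] * family_history
    )
    return 1.0 / (1.0 + math.exp(-linear))  # inverse logit

low = predicted_an_risk(fit=0, age=58, male=0, smoker=0, family_history=0)
high = predicted_an_risk(fit=100, age=58, male=1, smoker=1, family_history=1)
print(f"{low:.3f} {high:.3f}")
```

Entering the FIT result both linearly and as a square root lets the modelled risk rise steeply at low haemoglobin concentrations and flatten at higher ones.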

Test results

As soon as the FIT result of a participant in the risk-model group was available, we calculated the risk using the model described earlier. Participants in the risk-model group were considered to have a positive result if their risk was 0.10 or higher (on a 0 to 1 probability scale). Risk-negative participants in the risk-model group were also invited for colonoscopy if their FIT result was 15 µg Hb/g faeces or higher. Participants in the FIT-only group were considered to have a positive result if their FIT result was 15 µg Hb/g faeces or higher. All participants received their test results within 1 week of FIT return as per the protocol of the national screening programme.
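The positivity rules in the two arms can be summarised in a few lines. The following is a minimal sketch, not the software used in the study; the thresholds are taken from the text, while the function name and the group labels are assumptions made for this example.

```python
RISK_THRESHOLD = 0.10   # probability scale, from the text
FIT_THRESHOLD = 15.0    # µg Hb/g faeces, from the text

def is_test_positive(group, fit, risk=None):
    """Apply the positivity rule for one participant.

    group: "risk-model" or "FIT-only" (labels chosen for this sketch).
    risk:  modelled AN risk; only used in the risk-model group.
    """
    fit_positive = fit >= FIT_THRESHOLD
    if group == "FIT-only":
        return fit_positive
    if group == "risk-model":
        if risk is None:
            raise ValueError("risk is required in the risk-model group")
        # Referred if risk-positive, or if risk-negative but FIT-positive
        return risk >= RISK_THRESHOLD or fit_positive
    raise ValueError(f"unknown group: {group!r}")

print(is_test_positive("risk-model", fit=5, risk=0.12))
print(is_test_positive("FIT-only", fit=5))
```

In the risk-model group the rule is therefore a logical OR of the two criteria, which is why a separate analysis restricted to risk-positives is needed to isolate the contribution of the model.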

The 0.10 risk threshold was selected to match the anticipated proportion of test positives in the FIT-only group. The FIT threshold in this study (15 µg Hb/g faeces) was lower than the threshold used in the national screening programme (47 µg Hb/g faeces) because we anticipated a larger effect at this threshold and because this would allow for further analyses at thresholds relevant to screening programmes outside the Netherlands.

Invitees who declined study participation or did not respond could still participate in the national screening programme. Their screening outcomes were not included in this study. If an invitee returned the informed consent form and questionnaire after they had received a FIT result through the national screening programme, their data were included in the study, but their screening result was not re-evaluated at the study thresholds.

Colonoscopy

Participants with a positive FIT and/or a positive risk result were invited for colonoscopy. Colonoscopies were conducted by endoscopists accredited by the national screening programme [14]. Histology of resected lesions was assessed by experienced pathologists according to the Vienna criteria and the World Health Organisation classification [15, 16]. Advanced adenomas were defined as adenomas with a diameter ≥10 mm (as assessed by the endoscopist), and/or with ≥25% villous component, and/or high-grade dysplasia.

All colonoscopy results up to 6 July 2021 were included. Participants with a positive FIT result underwent colonoscopy at local endoscopy centres, where the staff was blinded to the study participation status. Individuals who were risk-positive but had a FIT result below 15 µg Hb/g faeces were referred to a single endoscopy centre (Bergman Clinics Amsterdam), where study participation status was known to endoscopy staff. Data of participants with a FIT ≥ 15 µg Hb/g faeces were collected by the national screening programme, while data of participants who were only risk-positive with a FIT below 15 µg Hb/g were collected by the authors, according to the format and definitions used by the national screening programme. Separate follow-up for individuals who were only risk-positive was necessary because their negative FIT result meant the algorithm within the database of the national screening programme classified their screening result as negative.

Outcomes

The primary outcome measure was the yield of advanced neoplasia, defined as the number of participants in whom AN was detected at colonoscopy per 1000 invitees (this includes invitees who declined participation or did not respond). Secondary outcomes were the yield of proximal AN and participation rate. We hypothesised that our risk model might detect more proximal lesions than FIT because FIT may have a higher sensitivity for left-sided lesions [17, 18]. Proximal AN was defined as cancer or advanced adenoma that was found in the caecum, ascending colon, transverse colon, or splenic flexure. Participation was defined as the percentage of invitees who returned an informed consent form, questionnaire (if pre-randomised into the risk-model group) and FIT.

Since participants in the risk-model group were invited if they were either risk-positive or FIT-positive, the number of colonoscopies in the risk-model group and the corresponding yield could be artificially higher. We therefore performed a sensitivity analysis, in which we calculated the yield in the risk-model group based on the risk-positives only.

Statistical analysis

We tested the null hypothesis of no difference in yield between the risk-model group and the FIT-only group. Differences in proportions were tested for statistical significance using the Chi-square test statistic. P-values below 0.05 were considered to indicate statistically significant differences. All analyses were performed using R (version 4.0.3) [19].

Based on data from the national screening programme, we anticipated 65% study participation, 6.5% FIT positives, 80% colonoscopy adherence, and a positive predictive value of FIT of 35% in detecting AN at a positivity threshold of 15 µg Hb/g in second-round invitees. These assumptions would result in a yield of 11.8 participants with AN detected at colonoscopy per 1000 invitees in the FIT-only group.
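The anticipated yield follows from multiplying the four assumed proportions. The short sketch below reproduces that arithmetic; the function name is an assumption made for this example.

```python
def yield_per_1000(participation, positivity, adherence, ppv):
    """Expected number of invitees with AN detected, per 1000 invitees."""
    return 1000 * participation * positivity * adherence * ppv

# Assumptions stated in the text for the FIT-only group
anticipated = yield_per_1000(
    participation=0.65, positivity=0.065, adherence=0.80, ppv=0.35
)
print(round(anticipated, 1))  # 11.8
```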

In the risk-model group, we anticipated study participation (65%) and test positivity (6.5%) similar to the FIT-only group. Using data from our previous study, we estimated that with our risk model, the yield might increase up to 16 per 1000 invitees with AN detected at colonoscopy. Sending out 16,000 invitations would then give us 73% power in rejecting the null hypothesis of no gain in yield from screening using the risk model compared to FIT-only, using a 5% significance level.
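The text does not spell out how the power was computed. As a rough check, the sketch below uses a normal approximation for a one-sided comparison of two proportions with 8000 invitees per arm (16,000 invitations at 1:1 allocation). The one-sided test, the approximation, and the function name are all assumptions of this example; under them the power comes out at about 0.73.

```python
from statistics import NormalDist

def power_two_proportions(p0, p1, n_per_arm, alpha=0.05):
    """Approximate power of a one-sided z-test for p1 > p0 (equal arms)."""
    z = NormalDist()
    pooled = (p0 + p1) / 2
    # Standard error under the null (pooled) and under the alternative
    se_null = (2 * pooled * (1 - pooled) / n_per_arm) ** 0.5
    se_alt = (p0 * (1 - p0) / n_per_arm + p1 * (1 - p1) / n_per_arm) ** 0.5
    critical = z.inv_cdf(1 - alpha) * se_null
    return z.cdf((p1 - p0 - critical) / se_alt)

# Anticipated yields: 11.8 versus 16 per 1000 invitees
power = power_two_proportions(p0=0.0118, p1=0.016, n_per_arm=8000)
print(round(power, 2))
```

A two-sided test under the same approximation would give a noticeably lower figure, so the assumption about sidedness matters when reading the reported 73%.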

Because participation in the first 6000 invitees was lower than anticipated, we asked permission to send out an additional 7000 invitations, increasing the total number of selected study invitees to 23,000.

Results

Study group

Of the 22,978 selected eligible individuals, 230 had to be excluded, either because they had moved out of the study region or because they were deceased. Of the remaining second-round invitees, 11,364 were allocated to the risk-model group and 11,384 to the FIT-only group (Fig. 1). Invitations were sent between December 2019 and March 2020.

Fig. 1: Flowchart of the study population.

*Relative to all with informed consent. †Relative to all who returned FIT and questionnaire. ‡Relative to all who received a positive test result.

Screening yield

Of the 164 participants who underwent colonoscopy after a positive test result in the risk-model group, AN was detected in 42 participants (26%; Table 1 and Fig. 2). In the FIT-only group, AN was detected in 39 of the 146 participants undergoing colonoscopy (27%). The yield of AN was 3.70 per 1000 invitees in the risk-model group compared to 3.43 per 1000 invitees in the FIT-only group (absolute difference: 0.27 per 1000, 95%CI: −1.30 to 1.82, p = 0.82). The yield of proximal AN was 0.97 per 1000 invitees in the risk-model group versus 1.14 per 1000 invitees in the FIT-only group (−0.17 per 1000, 95%CI: −1.02 to 0.67, p = 0.84).
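The yields and the p-value for the primary outcome can be recomputed from the counts above. The text names the chi-square test but does not say whether a continuity correction was applied; the sketch below assumes Yates' correction, which is the default for 2×2 tables in R's chisq.test and prop.test. The function and variable names are hypothetical.

```python
import math

def yates_chi2_p(events_a, n_a, events_b, n_b):
    """P-value of a 2x2 chi-square test with Yates' continuity correction."""
    observed = [events_a, n_a - events_a, events_b, n_b - events_b]
    total = n_a + n_b
    events = events_a + events_b
    row_totals = [n_a, n_a, n_b, n_b]
    col_totals = [events, total - events, events, total - events]
    chi2 = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / total
        chi2 += max(abs(obs - expected) - 0.5, 0.0) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

n_risk, n_fit = 11364, 11384              # invitees per arm
an_detected_risk, an_detected_fit = 42, 39  # participants with AN detected

yield_risk = 1000 * an_detected_risk / n_risk
yield_fit = 1000 * an_detected_fit / n_fit
p_value = yates_chi2_p(an_detected_risk, n_risk, an_detected_fit, n_fit)
print(f"{yield_risk:.2f} vs {yield_fit:.2f} per 1000, p = {p_value:.2f}")
# 3.70 vs 3.43 per 1000, p = 0.82
```

Note that the denominators are all invitees in each arm, not only those who participated, in line with the definition of the primary outcome.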

Table 1: Quality of colonoscopy and outcomes of participants undergoing colonoscopy.

Fig. 2: Colonoscopy outcomes of participants who received a positive test result.

Study and screening participation

Study participation was similar in both groups: 3397 invitees (30%) in the risk-model group and 3342 invitees (29%) in the FIT-only group returned the informed consent form (Fig. 1). Relatively fewer men provided informed consent in the FIT-only group: 1571 (47%) versus 1665 (49%) in the risk-model group (p = 0.10). The median age of those who consented was 59 years (IQR: 57–61) in both groups. Of those with consent in the risk-model group, 60% were between 55 and 60 years of age, and 36% were between 60 and 65 years of age. In the FIT-only group, this applied to 61% and 35%, respectively (Supplementary Fig. 1). After providing informed consent, 3113 participants in the risk-model group and 3061 participants in the FIT-only group also returned the FIT (27.4% vs 26.9%, p = 0.40). In the risk-model group, 306 individuals (9.0%) returned an incomplete questionnaire.

Test results

In the risk-model group, 186 participants (6.0%) had a risk exceeding the 0.10 threshold or a FIT result exceeding the 15 µg Hb/g threshold. In the FIT-only group, 161 participants (5.3%) had a FIT result exceeding the 15 µg Hb/g threshold. Five invitees (four in the risk-model group and one in the FIT-only group) tested positive according to these positivity criteria but were not invited for colonoscopy because they had returned their informed consent form after receiving a negative FIT result in the national screening programme. Risk factor distribution across the two groups is shown in Supplementary Table 1. Of the 186 participants who tested positive in the risk-model group, 9 participants were only FIT-positive, 15 participants were only risk-positive, and 162 participants tested both FIT- and risk-positive.

Colonoscopy

Of the 186 participants who received a positive test result, 164 (88%) underwent colonoscopy in the risk-model group versus 146 of the 161 (91%) in the FIT-only group. CRC was detected in three participants; all three participants with CRC were in the risk-model group and had tested both risk-positive and FIT-positive (Table 1). Advanced adenomas were detected in 39 participants in the risk-model group and in 39 in the FIT-only group.

Sensitivity analysis

Nine of the 186 participants in the risk-model group were FIT-positive but did not have an elevated risk. In two of them, advanced neoplasia was detected. Excluding these from the calculation of the yield in the risk-model group resulted in the detection of AN in 40 participants. Consequently, the yield of AN when screening with the risk model was 3.52 per 1000 invitees versus 3.43 per 1000 invitees in the FIT-only group (absolute difference: 0.09, 95%CI: −1.44 to 1.63, p = 1.0).

Discussion

In this randomised controlled trial within the Dutch national CRC screening programme, we observed no significant difference in the yield of AN between screening using a FIT-based risk model and screening with FIT only. There was also no significant difference in the yield of proximal AN between the two strategies. Our data also show that in participants of this study, adding a questionnaire resulted in similar participation in the risk-model group and the FIT-only group.

In recent years, FIT-based risk models have been proposed as a potential improvement to FIT-only screening [20]. Such models may allow for personalised screening. In this randomised trial, we evaluated the yield of a screening model consisting of the quantitative FIT result, age, sex, smoking status, and family history for CRC. Information on these risk factors is already collected in many FIT-based screening programmes (age, sex) or could be easily obtained by using a simple questionnaire (smoking status, family history for CRC).

In this study, we could not demonstrate a higher yield of AN with FIT-based risk model screening compared to FIT-only screening. The point estimate of the gain shows only a minimal improvement of 0.27 per 1000 invitees in screening yield. The 95% confidence interval ranges from −1.30 to 1.82 per 1000 and excludes a substantial gain, such as the 5 per 1000 we had anticipated, based on the development study. We therefore believe it is justified to conclude that using a FIT-based risk model in this setting does not yield sizeable benefits compared to screening with FIT only. We propose a number of potential explanations for this negative finding.

In the risk model development set, all participants were screening-naïve individuals, whereas participants in the current study were second-round invitees. We initially aimed to evaluate our risk model in a group of screening-naïve individuals with a wide age distribution. Since the initiation of our study was delayed due to the formal approval process, most individuals in the national programme had already been invited at least once for screening with FIT by the time the study started. As a consequence, the average risk of AN in participants was lower: the proportion with FIT-positive results was slightly lower than anticipated (5.3% in the FIT group versus 6.5% in our power calculation) as was the positive predictive value of FIT (27% versus 35%). In contrast, adherence to colonoscopy was better than expected (91% versus 80%). The most substantial difference, however, was the lower-than-anticipated participation, substantially reducing the power of our study to detect significant differences.
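The relative contribution of each assumption to the lower-than-anticipated yield can be illustrated by comparing observed and anticipated proportions in the FIT-only group. The sketch below uses the counts reported in the Results; taking participation as the proportion of invitees who returned both the consent form and the FIT is an assumption of this example, as are the names used.

```python
# Assumptions used in the power calculation
anticipated = {"participation": 0.65, "positivity": 0.065,
               "adherence": 0.80, "ppv": 0.35}

# Observed in the FIT-only group, from the counts reported in the Results
observed = {"participation": 3061 / 11384,  # returned consent form and FIT
            "positivity": 161 / 3061,
            "adherence": 146 / 161,
            "ppv": 39 / 146}

def yield_from_factors(f):
    """Yield per 1000 invitees as the product of the four proportions."""
    return (1000 * f["participation"] * f["positivity"]
            * f["adherence"] * f["ppv"])

for name in anticipated:
    ratio = observed[name] / anticipated[name]
    print(f"{name:13s} observed/anticipated = {ratio:.2f}")
print(f"yield: {yield_from_factors(observed):.2f} vs "
      f"{yield_from_factors(anticipated):.2f} per 1000")
```

Of the four factors, participation shows the largest relative shortfall (a ratio of about 0.41), while adherence is the only one above 1.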

Adding a questionnaire to FIT may affect screening participation [21]. In our study, participation was comparable between groups. However, as participation in screening in this randomised trial through an informed consent form (30%) was much lower than participation in the national screening programme (65% in the Amsterdam area), this result must be interpreted with caution. The lower-than-anticipated participation might be explained by the study logistics: invitees first received an informed consent form (and a questionnaire, if applicable) which they needed to return to the study site via mail, while the FIT itself was sent several weeks later with its own return envelope, to be sent to a central laboratory. This process, imposed by the Dutch Ministry of Health, was tailored to fit into the national screening programme: The questionnaires had to be sent separately because no modifications in IT systems, FIT analyses or other elements of the programme were allowed, as these could affect participation in the ongoing national screening programme. The informed consent form had to be accommodated because the FIT cut-off in our trial (15 µg Hb/g faeces) was lower than the FIT threshold in the national screening programme (47 µg Hb/g faeces). Consequently, participation in our study required extra steps that could have discouraged invitees.

In this trial, the positive predictive value of FIT-positivity versus risk-positivity was highly similar (27% versus 26%). This invites an exploration of potential limitations in the model and in the study group. The risk model development set consisted of evenly distributed age groups between the ages of 50 and 75. In the current study, the large majority of participants were between 56 and 60 years old, a much more restricted age distribution. Because age is an important risk factor in the model, the skewed distribution of age may have weakened the performance of our model.

Another potential explanation might be the number of active smokers: fewer individuals identified themselves as active smokers in this study (9.6%) compared to the risk model development set (13%) [6]. The proportion of smokers in this randomised trial was also lower than that in the general population, in which 21% of individuals aged 55–65 and 14% of individuals aged 65–75 are active smokers [22]. Smoking, like age, is a key variable in our model. Consequently, the participation of fewer smokers led to fewer participants being classified as risk-positive and fewer referrals for colonoscopy. The lower percentage of smokers might also be a sign of volunteer bias in our trial.

It should also be noted that our risk model was modified after the development study: we converted the variable family history for CRC from a categorical to a dichotomous variable, dropped the variable calcium intake, and added the variable sex. These changes likely do not explain the discrepancy in performance between the development study and our current results: the incremental risks of having more than one family member with CRC and of calcium intake levels were low, and these risk factors were present in only a minority of participants in the development set.

We were the first to conduct a large randomised controlled trial comparing a risk model to FIT-only as a test method in an established CRC screening programme. While the idea of CRC screening with FIT-based risk models has been considered for several years, the current literature primarily includes reports on risk models that were developed and evaluated in existing data. Such performance evaluations may suffer from overfitting and other forms of bias in the reported performance [23]. Prospective evaluations in separate datasets with pre-selected thresholds provide more valid and applicable results. The findings in this study underline this point, as the yield with the risk-based model markedly differed from the promising results in the risk model development study [6].

Several studies have prospectively investigated screening using a combination of FIT and other risk factors, albeit not by combining the two in a risk model, instead using them in a parallel approach (e.g., screenee invited for colonoscopy if either FIT or questionnaire is positive) or step-wise fashion (e.g., screenees first complete a questionnaire; FIT is offered only in those who test negative) [21, 24, 25]. One disadvantage of these approaches is that they often lead to an increase in referrals for follow-up colonoscopies. Performing more colonoscopies will never lower the absolute number of AN detected and will typically increase it. To remove this artefact in the evaluation of the screening yield, we selected a risk positivity threshold in our trial to match the anticipated proportion of FIT positives.

The findings in this trial do not warrant the implementation of our risk model in CRC screening programmes. Future impact studies should aim to expand the evidence on the incremental value of other risk models compared to current screening tests. Based on the latest evidence, researchers may also consider including previous screening history in their models. Cooper et al. recently described the development of a risk model with data from the English Bowel Cancer Screening Programme [7]. Their findings suggested that a risk model with FIT, age, sex, and participation status of a previous round (not invited, non-responder, or responder) may have significantly better discrimination than FIT only. Several other studies have also reported an association between screen-detected AN and the quantitative FIT result of a previous round [26,27,28]. Another advantage of this risk factor is that it can be collected without using a questionnaire. Preparations are underway for a trial in the Netherlands investigating whether such data can be used to personalise intervals between two consecutive FIT invitations [29].

This large randomised controlled trial within a national CRC screening programme showed that screening with a FIT-based risk model was not better at detecting AN than relying on FIT only. Participation between groups was similar, albeit lower than in the national screening programme. There is an apparent discrepancy between the promising results in the initial model development study and the findings in this randomised trial. This gap highlights the importance and practical difficulties of the prospective evaluation of risk models in a screening setting. Our findings and experiences may inform future evaluations and applications of risk models as potential improvements of FIT-based CRC screening.