1 Introduction

The German National Cohort (GNC) is an interdisciplinary, multicenter, population-based cohort study currently undertaken by a network of over 25 institutions in Germany. Its main goal is to investigate the development of common chronic diseases including cancer, diabetes, and cardiovascular, neurodegenerative/psychiatric, respiratory, and infectious diseases (German National Cohort (GNC) Consortium 2014; Wichmann et al. 2012). The GNC spans 18 study centers across Germany and will examine about 200,000 subjects from the general population between the ages of 20 and 69 years and follow them for a period of at least 25 years. Exams include interviews, questionnaires, a variety of physical exams, and the collection of biologic samples such as blood, urine, saliva, nasal swabs, and stool. While all 200,000 examinees undergo an initial exam, which takes about 4 h, a subgroup of 40,000 examinees participates in an intensified 6-h exam (German National Cohort (GNC) Consortium 2014; Wichmann et al. 2012). A subgroup thereof, about 30,000 examinees, is being imaged with 3 Tesla whole-body MR scanners at five dedicated imaging centers across Germany (Bamberg et al. 2015). The imaging protocol comprises research sequences that differ significantly from the sequences regularly deployed in clinical settings, and no contrast agent is administered. Scan time is 60 min, and the deployed sequences are listed in Table 1. Besides the five imaging centers, four imaging cores have been established to carry out central functions that support large-scale, multicenter imaging. Specifically, an imaging core for coordination and training has been established in Munich, an imaging core for data management in Bremen, an imaging core for quality assurance in Greifswald, and an imaging core for incidental findings (IFs) in Heidelberg (Fig. 1).
The imaging core for IFs has prospectively developed the concept for reporting IFs derived from the MRI exams within the GNC and has implemented the technical prerequisites. During the ongoing study, it provides quality assurance for IF-reporting, gives advice in unclear cases, and updates the standard operating procedures (SOPs) based on the latest clinical and scientific knowledge.

Table 1 MR sequences within the GNC that will be viewed by radiologists for IFs (modified from Bamberg et al. 2015)
Fig. 1

Design of MRI study within the German National Cohort (GNC). While 200,000 subjects will be enrolled across 18 sites in Germany (green areas), about 30,000 subjects will undergo whole-body MR imaging. Thus, five dedicated MR scanners were installed (blue squares). In addition, four imaging cores have been established for central functions, in Munich for coordination and training, in Bremen for data management, in Greifswald for quality assurance, and in Heidelberg for incidental findings (gray squares). The imaging core for incidental findings has developed the basic concept for the management of MR-based incidental findings within the GNC. It provides daily support and advice to the five imaging sites and performs quality control regarding the reporting of incidental findings (Source: The German National Cohort Study)

2 Ethical Framework for IF-Reporting

While most people of the general population could be considered fairly healthy, it is expected that imaging would occasionally lead to the discovery of illnesses of varying degree of medical importance (Lumbreras et al. 2010a, b). Based on the results of similar previous cohort studies, we estimated prospectively that “clinically relevant” IFs can be found in 10 % of the population undergoing MR imaging, considering the targeted age range and morbidity in Germany (Bamberg et al. 2015; Hegenscheid et al. 2013). Therefore, guidance was sought from the ethical commissions of the involved organizations to establish an ethical framework that would help in the management of any finding out of the ordinary, generally designated as IF.

General principles to be considered in the management of IFs were as follows (The Royal College of Radiologists, Management of Incidental Findings Detected During Research Imaging 2011; Weiner 2014):

  • Responsibility for the well-being of the participant: A participant should be informed about IFs relevant to his or her health. This is in accordance with European and international ethical guidelines, for example, Article 26 of the Additional Protocol to the Convention on Human Rights and Biomedicine, concerning Biomedical Research of the Council of Europe (Additional Protocol to the Convention on Human Rights and Biomedicine and concerning Biomedical Research 2007; Convention for the protection of Human Rights and Dignity of the Human Being with regard to the Application of Biology and Medicine: Convention on Human Rights and Biomedicine 1999).

  • Responsibility for the well-being of society: The general population might be affected by undisclosed illnesses a participant might suffer from. This includes, for example, illnesses that might carry an increased risk for the participant to cause a traffic accident. This is in accordance with Article 26 of the Convention for the Protection of Human Rights and Dignity of the Human Being with regard to the Application of Biology and Medicine: Convention on Human Rights and Biomedicine of the Council of Europe (Convention for the protection of Human Rights and Dignity of the Human Being with regard to the Application of Biology and Medicine: Convention on Human Rights and Biomedicine 1999).

While these general ethical principles seem simple and straightforward, their implementation presents certain challenges which will likely never be solved satisfactorily. The simple idea of classifying findings into reportable and non-reportable is confounded by the definition of “IF” itself. While an IF might ideally relate to a diagnosis, imaging by itself, even in clinical settings, rarely allows an abnormality to be specified down to a final diagnosis. IFs in MRI exams can present as any form of atypical imaging characteristic, for example, a hyperintensity where it is not expected; a broad clinical description such as a cystic lesion; or a likely but not certain diagnosis such as an adrenal gland adenoma. Generally, only an accurate and established diagnosis allows for a reliable estimation of the impact on a participant’s future health.

With ethical principles referring to diagnoses but imaging generally providing much less defined information, it becomes apparent that it is often unclear whether an IF should be classified as report-worthy or not. In clinical as well as in research settings, an innocuous finding wrongly reported as an illness (a false positive) may cause severe psychologic and bodily harm, conflicting with the general bioethical principle of primum non nocere (first, do no harm). It may also unnecessarily increase health care spending and costs for society, and lead to occupational and insurance-related consequences.

3 Defining Problems in IF-Reporting

Following the ethical considerations set forth by international guidelines (Additional Protocol to the Convention on Human Rights and Biomedicine and concerning Biomedical Research 2007; Convention for the protection of Human Rights and Dignity of the Human Being with regard to the Application of Biology and Medicine: Convention on Human Rights and Biomedicine 1999; International Ethical Guidelines for Biomedical Research Involving Human Subjects 2002), a process dubbed “IF-reading” was established. IF-reading is a procedure described by SOPs developed by the researchers of the GNC and approved by the ethical commissions of the involved organizations. These SOPs are to ensure that every participant’s imaging data is assessed by a board-certified radiologist within a certain time frame to detect IFs that might warrant a notification of the participant. Participants are only notified in case of “clinically relevant” IFs. This process poses some intricate difficulties that differ from those of IFs encountered during clinical exams. Considerations that need to be accounted for in the particular research setting of the GNC are presented in the following paragraphs.

3.1 Scientific Imaging Sequences

Imaging sequences in the GNC, as in many other research projects, differ from clinically used sequences. They are often not designed to establish a particular clinical diagnosis. In the GNC, MR sequences were chosen with an emphasis on maximizing morphologic data acquisition in a restricted time frame, sacrificing some of the MRI-inherent benefits of analyzing a lesion based on a multitude of MR-characteristic tissue features. Therefore, IF-reading has to be based mostly on T1- and T2-weighted images without common, clinically applied sequences such as diffusion-weighted imaging (DWI), susceptibility-weighted imaging (SWI), etc. Due to their invasive nature, contrast agents, gut motility suppressing medications, bowel distending procedures, and endorectal/endovaginal coils have also been forgone. With such a limited imaging set to characterize a finding, great uncertainty in specifying a finding and a long list of differential diagnoses, including artifacts, have to be expected.

While reacquisition of one sequence is allowed, further reacquisitions, for example, because of motion or breathing artifacts, cannot be afforded due to time restrictions. Similarly, sequence protocols are fixed for comparability. Sequences cannot be swapped for ones that are, for example, less susceptible to motion or more appropriate for a given lesion, as happens in clinical settings. This substantially reduces the sensitivity for picking up a lesion and considerably widens the gap between the ability to detect a finding and the ability to characterize it.

3.2 Limited Clinical Context

To afford unbiased reporting, and because of strict German privacy and data protection laws, radiologists are blinded to personal and clinical information of the participant, except for gender and the year of birth. Moreover, no data from exams conducted in other areas of the GNC (e.g., blood tests) are shared with the radiologist in charge of the IF-reporting. This severely hampers guidance toward a probable diagnosis of an observed lesion.

3.3 Disproportionate Increase of False Positives

The probability that a lesion corresponds to a certain diagnosis is possibly distorted by the fact that examinees randomly selected from the general population are more likely to be healthy than patients with clinical indications for imaging. It is a mathematical property that the positive and negative predictive values of a particular imaging finding depend on the prevalence of the associated disease in the examined population (Bender et al. 1998). Compared to a clinical setting, where MRI is often applied to further characterize an already known or suspected lesion, a mostly healthy general population leads to a lower positive predictive value and accordingly to a higher proportion of false-positive reports. Taking into account the generally poorer specificity of the applied research sequences compared to clinical sequences, the report of a potentially harmful finding would come at the expense of an even larger number of false positives. Along with health care costs, the physical and psychologic side effects of follow-up procedures would increase disproportionately compared to true-positive disease detection. The effect of false-positive reports is aggravated by the fact that possibly healthy individuals, who otherwise would not have been subjected to medical exams, might undergo harmful or side-effect-prone follow-up investigations.
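The dependence of the predictive values on prevalence follows directly from Bayes' theorem and can be illustrated with a short calculation. The sketch below is purely illustrative: the sensitivity, specificity, and prevalence values are hypothetical and are not GNC data.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical example: the same test (90 % sensitivity, 90 % specificity)
# applied to a clinical population and to a mostly healthy cohort.
clinical_ppv, clinical_npv = predictive_values(0.90, 0.90, prevalence=0.30)
cohort_ppv, cohort_npv = predictive_values(0.90, 0.90, prevalence=0.01)

print(f"PPV at 30 % prevalence: {clinical_ppv:.2f}")
print(f"PPV at  1 % prevalence: {cohort_ppv:.2f}")
```

With these assumed numbers, most positive findings in the low-prevalence population are false positives even though the test itself is unchanged, which is the effect described above.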

3.4 Uncertainty Causing Out-of-Proportion Work-Up

The notification of a participant would likely trigger a clinical work-up outside the GNC. Participants would seek advice from their primary care physicians who would be forced, for lack of more complete information, to follow up on IFs, likely starting with proper clinical imaging. While this is the intended purpose of IF-reading, namely to prevent harm from serious illnesses such as cancer, it would, under certain circumstances, lead to unnecessary or exaggerated work-up. While this is obvious for false-positive reports, it would also be the case for certain true-positive reports, namely, when there is uncertainty about the clinical significance of findings. This includes minor ailments, anatomic variations within the normal range, illnesses that would usually be diagnosed and followed up in a less extensive and costly way, or illnesses that would not receive work-up at all at this point in time. Examples might be an arteria lusoria occasionally leading to swallowing problems, a hiatal hernia that might or might not be clinically manifest, or the ubiquitous age-related degenerative joint or spine disease that might occasionally explain a participant’s pain but otherwise would need little or no work-up.

3.5 Reliability, Reproducibility, and Consistency

For the assessment of 30,000 MR scans acquired at five different sites, a relatively large number of radiologists are involved in reading the acquired data. Furthermore, with this imaging round expected to last at least 4 years, a significant turnover of involved radiologists is anticipated. Therefore, high quality standards must be met to ensure consistency and reliability. It is well known from reproducibility studies that radiologists introduce variability in image interpretation and diagnosis making (Robinson 1997). Considering the fundamental obligation to provide the same service and the same quality of service to each of the participants, variability should be limited as much as possible. In theory, this can be achieved by reducing the number of involved radiologists, by involving only highly and specifically trained radiologists, by standardized reporting, by conducting quality assurance in IF-reporting, or by a combination of these measures.

4 Translating the Ethical Framework into a Reporting Algorithm

Considering the abovementioned restrictions, it became clear that reporting every possible disease would necessitate extensive clinical follow-up with significant over-reporting and disproportionate work-up. Individuals might thus come to harm from non-disclosure of disease states as well as from reporting every possible disease state. Therefore, the ethical framework was defined more precisely in an effort to find the best possible balance between informing participants about relevant illnesses and avoiding reporting of irrelevant illnesses. To that end, a robust system guiding radiologists in IF-reporting was established, curtailing in particular the reporting of possibly minor illnesses with questionable relevance, normal variants, and highly uncertain diagnoses.

4.1 The List: An Approach to Define Clinically Relevant IFs

It was decided to develop a specified, categorized, and concise list of reportable findings, limiting uncertainty and false positives as well as establishing consistency. The ground work for this system was laid by expert radiologists familiar with the applied sequences and the ethical considerations.

The ratios of false-positive and false-negative findings are significantly determined by the applied MR sequences. As these ratios are specific to the set of sequences used, comparability with previous cohort studies using different imaging protocols might be severely hampered. Based on the extrapolation of extensive literature research data (excerpt (Abeloos and Lefranc 2011; Al-Shahi Salman 2007; Atalay et al. 2011; Ballantyne 2008; Beigelman-Aubry et al. 2007; Berland et al. 2010; Berlin 2011; Boland et al. 2008; Booth et al. 2012; Borra and Sorensen 2011; Bradley et al. 2011; Childs and Leyendecker 2008; Chow and Drummond 2010; Cordell 2011; Cramer et al. 2011; de Rave and Hussain 2002; Erdogan et al. 2007; Esmaili et al. 2011; Gore et al. 2011; Gross et al. 2010; Hartwigsen et al. 2010; Hoggard et al. 2009; Illes 2008; Irwin et al. 2013; Johnson et al. 2011; Kamath et al. 2009; Khosa et al. 2011; Ladd 2009; Lee et al. 2011; Legmann 2009; Lund-Johansen 2013; MacMahon et al. 2005; McKenna et al. 2008; Megibow et al. 2011; Milstein 2008; Morin et al. 2009; Morris et al. 2009; Nelson 2008; Orme et al. 2010; Pierce et al. 2009; Puls et al. 2010; Richardson 2008; Royal and Peterson 2008; Shoemaker et al. 2011; Subhas et al. 2009; van der Lugt 2009; Vanel et al. 2009; Vernooij et al. 2007; Zarzeczny and Caulfield 2012)), radiologic clinical experience and the knowledge and limitations of the applied sequences, a list of reportable IFs has been specifically tailored to the imaging data available (Table 2). Similarly, a list was created exemplarily specifying IFs that should not be reported.

Table 2 Excerpt of the IF-list for the MRI study of the GNC (by February 2015)

The seemingly random combination of definitions based on clinical entities and on morphologic and size criteria used to assign findings to a specific IF-category was mainly determined by the aforementioned limitations of imaging and the specific study settings. For example, using the available sequences (and likely a certain degree of motion artifacts), a lung nodule smaller than 1 cm could only be evaluated with great uncertainty, since vessel flow artifacts or small dystelectases are so common. Therefore, a size cutoff of 1 cm was chosen. Similarly, cervical lymphadenopathy was defined as at least three lymph nodes with a short-axis diameter of at least 1.5 cm, accounting for the fact that non-contrast-enhanced imaging of the neck would likely lead to an over-reporting of possibly enlarged lymph nodes. As reasoned above, some disease states, like diverticulosis, have been excluded from reporting due to limited general significance. Others have been excluded due to limited clinical significance specific to a non-targeted imaging setting, such as arthrosis or disk bulging, for which pre-symptomatic imaging is not an established, proven method. Cardiomyopathy, for example, is equally excluded from reporting: while it is generally a significant disease, every participant undergoing MR imaging has been subjected to echocardiography in another area of the GNC. Therefore, a possible cardiomyopathy would have been communicated already.
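The two size criteria mentioned above can be written down as simple rules. The following is only an illustrative sketch: the function names are hypothetical, and treating a nodule of exactly 1 cm as reportable is an assumption, since the text only states that a cutoff of 1 cm was chosen.

```python
from typing import List

# Thresholds taken from the text; the inclusive lung nodule cutoff is an assumption.
LUNG_NODULE_MIN_DIAMETER_CM = 1.0
CERVICAL_NODE_MIN_SHORT_AXIS_CM = 1.5
CERVICAL_NODE_MIN_COUNT = 3


def lung_nodule_reportable(diameter_cm: float) -> bool:
    """A lung nodule is considered reportable at or above the size cutoff."""
    return diameter_cm >= LUNG_NODULE_MIN_DIAMETER_CM


def cervical_lymphadenopathy_reportable(short_axes_cm: List[float]) -> bool:
    """At least three nodes with a short-axis diameter of at least 1.5 cm."""
    enlarged = [d for d in short_axes_cm if d >= CERVICAL_NODE_MIN_SHORT_AXIS_CM]
    return len(enlarged) >= CERVICAL_NODE_MIN_COUNT


print(lung_nodule_reportable(0.7))                            # below the cutoff
print(cervical_lymphadenopathy_reportable([1.6, 1.5, 1.2]))   # only two enlarged nodes
print(cervical_lymphadenopathy_reportable([1.6, 1.5, 1.9]))   # three enlarged nodes
```

Encoding criteria as explicit thresholds like this is one way to make the list unambiguous for readers; the actual IF-list in the GNC is implemented in the reporting tool described later in this chapter.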

4.1.1 Separation into Acutely and Non-acutely Relevant IFs

Within this list, IFs were classified into acutely relevant and non-acutely relevant findings. Acutely relevant findings were defined as suspected disease for which the participant should receive immediate clinical care. Examples include possible stroke, pneumothorax, and aortic dissection. These findings not only have to be reported in a timely manner for the benefit of the participant but also to avoid danger to the public, for example, from causing a traffic accident. As of February 2016, the list contains 14 acutely relevant IFs.

4.1.2 Unlisted IFs

It is obvious that a list of a few dozen findings cannot encompass all report-worthy diseases. Therefore, a possibility to report unlisted findings was created. To limit over-reporting by radiologists, who out of professional habit tend to over-report rather than under-report, a dedicated system has been established. It interposes an approval step before a report is sent out to a participant. Unlisted findings deemed report-worthy by the radiologist in charge are submitted for assessment to the imaging core facility in Heidelberg. All requests go through a standardized process (Fig. 2). Minor requests or technical errors, such as the erroneous submission of an IF as unlisted although it already exists in the list, will be answered directly by the team of radiologists at the imaging core Heidelberg. Submissions that are more complex to judge are referred to an external committee composed of two radiologists, a general practitioner, an epidemiologist, and an ethicist. Here, a final decision will be made, balancing in particular the risk of over-reporting on a large scale for similar cases to come.

Fig. 2

Process for unlisted incidental findings (IFs). Findings deemed report-worthy by the radiologist and not listed so far can be submitted to the imaging core facility in Heidelberg. Minor requests or technical errors will be answered directly by the imaging core. For all unlisted findings that are more complex, the imaging core will review the current scientific basis and clinical guidelines, estimate the diagnostic accuracy of the applied imaging technology for such a finding, and develop a recommendation based on this information, which is discussed by the external committee. The external committee is composed of two radiologists, a general practitioner, an epidemiologist, and an ethicist; the committee decides whether this finding is report-worthy or not report-worthy. The IF-list will be updated accordingly

4.2 Technical Translation

4.2.1 Mode and Time Frame of IF-Reporting

Given the abovementioned restrictions on the interaction between the radiologists and the participants due to German privacy and data protection laws and the study design of the GNC, communication is managed by a trust office, which is part of the study recruitment center. No identifying information (e.g., name, postal address, etc.) is linked with any MR findings. Non-acute IFs will be reported via regular mail. The time frame for this scenario stipulates the completion of image reading within 5 working days and the mailing of a letter to the participant within 10 working days after image acquisition.
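As a worked example of these time frames, the due dates can be derived from the acquisition date by counting working days. The sketch below is an assumption-laden illustration: it treats Monday to Friday as working days, ignores public holidays, and uses a hypothetical helper name and an arbitrary example date.

```python
from datetime import date, timedelta


def add_working_days(start: date, n_days: int) -> date:
    """Return the date n_days working days (Mon-Fri) after start; holidays are ignored."""
    current = start
    remaining = n_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday, ..., 4 = Friday
            remaining -= 1
    return current


acquired = date(2015, 2, 2)                   # arbitrary example date (a Monday)
reading_due = add_working_days(acquired, 5)   # image reading completed
mailing_due = add_working_days(acquired, 10)  # letter mailed to the participant

print("Reading due:", reading_due.isoformat())
print("Mailing due:", mailing_due.isoformat())
```

A production system would additionally need a calendar of regional public holidays, which differ between German federal states.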

Acutely relevant IFs require a more direct communication with the participant as soon as the radiologist in charge becomes aware of the situation. This situation may overrule some of the study design concerns. Thus, a detailed algorithm for getting hold of a participant has been developed, which includes immediate telephone contact. In that instance, personal data of the participant (name and phone number) will be provided to the radiologist by the study recruitment center. In case phone contact cannot be established within 24 hours, an expedited letter will be sent, informing the participant of the potentially dangerous condition with the advice to seek immediate medical attention. Participants without reportable IFs will not receive any letter.

In the event of unlisted IFs, the abovementioned time frame may be exceeded for non-acutely relevant unlisted IFs. Unlisted IFs, however, judged by the reading radiologist to be acutely relevant, will be reported in the aforementioned way, before consulting the imaging core Heidelberg. Thereafter, the imaging core Heidelberg will be informed about the unlisted IF and the communication with the participant. The imaging core will then decide if the unlisted IF will be added to the IF-list for similar cases to come.

The purpose of reporting IFs is not to assist in ascertaining a diagnosis. How far GNC imaging could assist primary care physicians in defining a diagnosis was discussed during the initial stages of the GNC. It became obvious that time and manpower limitations would not allow for that. Key points were that primary care physicians would be hard to reach because of busy office hours, and that supplying primary care physicians with image data would require them to be technically and disease-specifically able to evaluate research image protocols, which is generally beyond the expertise and the time resources of primary care physicians. Furthermore, for practical and legal reasons, communication should be directed to the participant, especially since not all participants would have a regular primary care physician. Most importantly, the limited imaging information collected with research protocols would rarely, if ever, negate the need for proper further imaging. Therefore, the purpose of IF-reporting is to call the participant’s attention to a possibly concerning finding and to provide anatomic location data to guide further work-up. Participants will be provided with a CD containing the imaging data when an IF is reported. While this may facilitate further work-up, it is not meant to play a substantial role in establishing a diagnosis.

4.2.2 Data Processing

As imaging is taking place at five imaging centers across Germany and at MR scanners outside of a common hospital infrastructure, dedicated data management and image viewing tools were developed for the IF-reading. As already mentioned, the GNC requires a strict separation of identifying information and the exam results of the participant. Therefore, a dedicated software and hardware system was created that allows for blinded reading but automatically facilitates contacting the participant in the case of report-worthy IFs.

For image assessment, a web-based electronic case report form (eCRF) and an image viewer were developed. De-identified imaging data can be accessed on regular computers through a password-protected, encrypted gateway, allowing the selection of listed IFs and the submission of unlisted IFs. A standardized reporting tool as part of the image viewer has been developed by the imaging core in Bremen together with the management unit of the GNC’s centralized database located in Greifswald. Within the MR images, IFs can be labeled by an arrow or a size indicator. As soon as the IF is marked, a pop-up window opens where the radiologist can select the corresponding finding from the list. The IF-report can be supplemented by anatomic location data where necessary, selected from drop-down menus. The same pop-up window allows for the submission of unlisted IFs, which automatically triggers a notification to the imaging core in Heidelberg, including information on the unlisted IF that the reading radiologist has to enter into preset fields.

Once the IF-reading has been finalized by the radiologist, all information regarding the reported IFs is transferred via the eCRF to the central database of the GNC. The eCRF allows for automatic generation of appropriate reports containing all selected IFs in a standardized, structured form without free text (Fig. 3). The report in PDF format, containing only the participant’s study ID and no person-identifying information, will be automatically accompanied by a suitable cover letter. Specific cover letters exist to match different settings: (1) for notification of acutely relevant IFs, (2) for notification of non-acutely relevant IFs, and (3) for reporting non-acutely relevant IFs after the participant has already been contacted about acutely relevant IFs. The IF-report and the cover letter will be printed at the imaging site and sealed in an envelope labeled with the participant’s study ID. Also enclosed will be a CD with the imaging data. This sealed envelope will be placed into another envelope bearing the participant’s address, which is matched via the participant’s study ID. This step is carried out at the recruitment centers, as only those have access to person-identifying data. From here, letters will be sent out by regular or expedited mail according to the situation.
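The choice among the three cover letters can be sketched as a small decision function. This is a hypothetical illustration only: the names, the input flags, and in particular the handling of cases the text does not spell out (for example, returning no letter when nothing remains to be reported) are assumptions and do not describe the GNC software.

```python
from enum import Enum
from typing import Optional


class CoverLetter(Enum):
    ACUTE = 1                          # (1) notification of acutely relevant IFs
    NON_ACUTE = 2                      # (2) notification of non-acutely relevant IFs
    NON_ACUTE_AFTER_ACUTE_CONTACT = 3  # (3) non-acute IFs after earlier contact about acute IFs


def select_cover_letter(has_acute: bool,
                        has_non_acute: bool,
                        already_contacted_about_acute: bool) -> Optional[CoverLetter]:
    """Pick a cover letter type; None means no letter (assumed behavior)."""
    if has_acute and not already_contacted_about_acute:
        return CoverLetter.ACUTE
    if has_non_acute:
        if already_contacted_about_acute:
            return CoverLetter.NON_ACUTE_AFTER_ACUTE_CONTACT
        return CoverLetter.NON_ACUTE
    return None


print(select_cover_letter(has_acute=False, has_non_acute=True,
                          already_contacted_about_acute=False))
```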

Fig. 3

Process of incidental findings (IFs) reporting in the German National Cohort (GNC). All images are reviewed by a board-certified radiologist using a web-based viewer. With a standardized reporting tool, the radiologist can highlight the findings (in this case an abdominal mass greater than 3 cm). This information is saved in the central database, which automatically generates a standardized letter informing the participant about the detected IFs (here simplified on one sheet of paper). The letter contains the list of observed IFs, general information about IFs and the MR imaging, as well as contact details of the study site in case further consultation is necessary

4.3 Training and Certification of Radiologists

Training and certification of radiologists for the purpose of IF-reading is coordinated and implemented by the imaging core in Heidelberg. All IF-readings are finalized only by board-certified radiologists. Initial reading can be done by radiologists in training with experience in MR imaging. However, their results have to be verified by board-certified radiologists, similar to clinical settings in teaching hospitals in Germany. Only board-certified radiologists are able to finalize and sign off on readings and trigger report letters. On top of that, all radiologists have to be trained and certified with respect to IF-reporting in the GNC. A multistep training system has been implemented, requiring participation in an in-person or videoconference-based teaching session and completion of a test. Instructions regarding image viewer operations and access to the database are given to the radiologist. All necessary SOPs are introduced as well. A dedicated training mode of the image viewer containing example cases with and without IFs is used for training purposes. Finally, a test including simulated cases must be passed in order to be certified as an IF-reader for the GNC. Working as an IF-reader requires awareness of changing protocols and changes in the IF-list. Participation in yearly re-training and re-certification is mandatory. Training of the technicians operating the MR scanners is managed at the imaging core Munich.

4.4 Quality Assurance

To ensure consistency and inter-reader reliability, a protocol has been developed to monitor the performance of IF-reading across different imaging sites and different readers. A random subset of 10 % of all cases will be read again by radiologists of the imaging core in Heidelberg in a supervision reader mode. On top of that, the first 20 cases of each site and the first 5 cases of each reader will be subject to supervision reading. Discrepancies between primary reader and supervision reader will be recognized and recorded automatically. Analysis of those discrepancies may reveal problems in choosing the correct IF, in differently interpreting IFs, in following protocols or in correctly using the image viewer. Equally important, however, it may uncover poor phrasing of an IF, overlapping of IFs, or inadequacy of an IF. Depending on the type of discrepancy, several instruments can be used to solve problems. This includes personalized feedback to readers, discussion of cases at telephone conferences and meetings, especially to resolve structural or site-specific issues, and re-defining IFs or location options. As a last measure, readers can be subjected to re-training and re-certification, or be banned from participating in the GNC as IF-reader.
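The selection rule for supervision reading (a random 10 % of all cases, plus the first 20 cases of each site and the first 5 cases of each reader) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the tuple layout of a case, and drawing the random subset from all cases at once (rather than case by case as they arrive) are hypothetical choices, not the GNC implementation.

```python
import random
from collections import defaultdict


def select_supervision_cases(cases, fraction=0.10, first_per_site=20,
                             first_per_reader=5, seed=0):
    """cases: chronologically ordered list of (case_id, site, reader) tuples.

    Returns the set of case IDs subject to supervision reading.
    """
    site_count = defaultdict(int)
    reader_count = defaultdict(int)
    selected = set()

    # Mandatory supervision: first cases of each site and of each reader.
    for case_id, site, reader in cases:
        site_count[site] += 1
        reader_count[reader] += 1
        if (site_count[site] <= first_per_site
                or reader_count[reader] <= first_per_reader):
            selected.add(case_id)

    # Random subset of all cases (may overlap with the mandatory cases).
    rng = random.Random(seed)
    n_random = round(fraction * len(cases))
    selected.update(rng.sample([case_id for case_id, _, _ in cases], n_random))
    return selected


# Hypothetical example: 100 cases at one site, read by two readers.
example_cases = [(i, "site_A", "reader_1" if i < 50 else "reader_2")
                 for i in range(100)]
supervised = select_supervision_cases(example_cases)
print(len(supervised), "of", len(example_cases), "cases selected for supervision")
```

Because the random subset may overlap with the mandatory cases, the total number of supervised cases varies slightly with the random draw.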

5 Summary

The concept of the German National Cohort for reporting IFs has been in use since the start of the recruitment of MRI participants in spring 2014. Recruitment is currently ongoing and will last for the next few years to achieve the targeted sample size. Based on the applied IF-reporting concepts, several participants have been identified with IFs (Fig. 4) and informed accordingly. However, the clinical significance of our reported IFs as well as the performance of our implemented reporting system remain unknown at present and are subject to ongoing research. Our findings as well as results from other large-scale cohorts utilizing imaging will continuously influence how IFs are reported in the future, in research as well as in clinical settings.

Fig. 4

Examples of common incidental findings (IFs) as observed in the German National Cohort (GNC). The two images on the top show a mediastinal lymphadenopathy (white arrows); the left one is an image from the T1-3D-VIBE-DIXON sequence, and the right one an image from the T2-HASTE sequence. The two images on the bottom show a renal mass >2 cm without any fatty content (yellow arrows); on the left is an opposed-phase image and on the right a fat image from the T1-3D-VIBE-DIXON sequence