Introduction

Since the seventeenth century, the conventional light microscope (CLM) has been the primary device for examining human tissues at the microscopic level for histological/pathological analysis, diagnosis, research, and education [1, 2]. However, CLM has a number of limitations, including the need to produce and store large numbers of glass slides, careful preservation, and periodic slide replacement [3, 4]. More recently, users' needs for quick case discussion, remote and online access, and integration of data (e.g. slides and annotations), together with demand for more attractive and engaging learning platforms, have driven technological advances and the development of electronic tools for education, in an attempt to improve students' learning and commitment to modules [5,6,7].

Digital microscopy (DM), based on whole slide imaging (WSI) systems, is designed to accurately digitize glass slides: dedicated hardware captures numerous high-quality images, and software assembles these multiple images into a single digital image resembling the original glass slide [4, 8, 9]. Given the many possible applications of DM, this technology has also become a useful alternative to CLM for teaching human pathology, with substantial acceptance reported by both students and teachers [10,11,12,13], although some authors disagreed about the total replacement of CLM by DM [14,15,16].
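To make the tiling-and-assembly idea concrete, the following minimal sketch shows how scanner tiles could, in principle, be stitched into one composite image. The tile naming scheme, grid dimensions, and use of a single flat image are our own illustrative assumptions; real WSI systems store pyramidal, multi-resolution formats.

```python
# A minimal sketch of tile stitching, assuming hypothetical tile files named
# tile_{row}_{col}.png on a regular grid; real WSI systems use pyramidal,
# multi-resolution formats rather than one flat image.
from PIL import Image

def stitch_tiles(rows, cols, tile_size, path_fmt="tile_{r}_{c}.png"):
    """Paste a grid of scanner tiles into a single composite 'virtual slide'."""
    slide = Image.new("RGB", (cols * tile_size, rows * tile_size))
    for r in range(rows):
        for c in range(cols):
            tile = Image.open(path_fmt.format(r=r, c=c))
            slide.paste(tile, (c * tile_size, r * tile_size))
    return slide

# Usage (hypothetical): stitch_tiles(10, 10, 2048).save("virtual_slide.png")
```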

Following the implementation of DM, different strategies have been used to assess students' performance and perception, in order to determine the effectiveness of this technology relative to CLM [16,17,18,19]. However, the absence of educational guidelines remains a limitation of these assessments, which may impair the reliability and interpretation of their results.

Considering this scenario, this systematic review compiled the published data on the use of DM for teaching human pathology to medical and dental students, in order to investigate whether this technology is sufficient as a stand-alone teaching and learning tool, and to determine the proper method for evaluating students' performance with DM.

Materials and methods

This study was conducted according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [20] and was registered in the PROSPERO database (protocol CRD42019132602). The review questions were as follows: “Is whole slide imaging reliable enough to be used as a single technique, rather than in association with conventional light microscopy, for human pathology teaching?” and “What is the appropriate method to evaluate students' performance using digital microscopy?”

Literature review

One author carried out a literature review to identify any existing registered systematic reviews (in progress or published) with a scope similar to that of our study. Two similar reviews were identified [21, 22]; however, both had notable limitations. The review published in 2016 [21] searched the literature only up to 2014, which may have missed more recent and relevant articles published in that 2-year interval and between 2016 and 2019. Seven databases were reported in its search; however, Scopus (Elsevier, Amsterdam, the Netherlands) and MEDLINE (Medline Industries, Mundelein, Illinois) via the PubMed platform (National Center for Biotechnology Information, US National Library of Medicine, Bethesda, Maryland) were not used. Although a meta-analysis was performed, the authors did not report which scale each study used to allow quantitative synthesis [21]. Limitations found in both reviews were the lack of clarity in describing the search strategy, the screening only of articles published in English, and the inclusion of studies that may have increased the heterogeneity of the results [21, 22]. Moreover, both reviews included articles involving both cytology and histology samples.

Based on these observations, we decided to proceed with this systematic review to assess more homogeneous and well-designed studies, reducing the risk of bias as much as possible and providing more consistent evidence on the use of digital microscopy as a proper teaching method for human pathology alone, as well as on the best methods for evaluating learners' performance with this technology.

Eligibility criteria

Inclusion criteria comprised studies that assessed the performance and/or perception of students using DM in order to analyse its value for educational purposes. For comparative studies in which participants were distributed into two groups, a crossover design was required. Participants also had to analyse the samples with both methods (DM and conventional light microscopy) at two separate times, without mixing the technologies. Performance results had to be reported as a score, together with the method used to measure performance. Studies that assessed students' perception had to report how it was obtained. Studies published in English, Portuguese, Spanish, or French were screened. Exclusion criteria comprised literature reviews, letters to the editor, book chapters, and abstracts published in conference proceedings. Studies involving cytopathology/hematopathology, as well as those examining animal histology/pathology, were also excluded, as were retrieved publications that could not be found or accessed. Validation studies and studies evaluating efficacy and accuracy were not included, and publications in which the WSI modality was unclear or unspecified were excluded.

Search strategy

An electronic search was conducted on May 15, 2019, without date restriction, in the following databases: Scopus (Elsevier, Amsterdam, The Netherlands), MEDLINE (Medline Industries, Mundelein, Illinois) via the PubMed platform (National Center for Biotechnology Information, US National Library of Medicine, Bethesda, Maryland), and Embase (Elsevier, Amsterdam, The Netherlands). To broaden the retrieval, we combined two different searches, each of which retrieved several original articles; this strategy was reproduced in all three databases. The first search used the terms (ALL ( "digital microscopy" ) AND ALL ( student* ) ); the second used (ALL ( "virtual microscopy" ) AND ALL ( student* ) ). A manual search was also carried out to identify possible additional studies.
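The sketch below restates this strategy in code: the two query strings combined, with duplicate records collapsed by normalized title. This is illustrative only, not the authors' actual tooling, and the record structure is a hypothetical assumption.

```python
# Illustrative sketch (not the authors' actual tooling): the two search
# strings applied in each database, with duplicate records removed by
# normalized title (the dict record structure is a hypothetical assumption).
QUERIES = [
    'ALL("digital microscopy") AND ALL(student*)',
    'ALL("virtual microscopy") AND ALL(student*)',
]

def deduplicate(records):
    """Keep one record per normalized title; records are dicts with a 'title' key."""
    seen, unique = set(), []
    for rec in records:
        key = " ".join(rec["title"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Usage (hypothetical): deduplicate(scopus_hits + medline_hits + embase_hits)
```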

Article screening and eligibility evaluation

Two authors independently screened the titles and abstracts of all articles and excluded those not in accordance with the eligibility criteria. These authors then read the full texts to identify eligible articles. The reasons for exclusion were listed and specified in the flow chart. Disagreements were resolved initially by discussion and then by consulting a third author, to ensure that the appropriate publications were selected according to the eligibility criteria. Rayyan QCRI [23] was used as the reference manager to conduct the screening, exclude duplicates, and record a primary reason for each exclusion.

Quality assessment

The risk-of-bias assessment was carried out using the Joanna Briggs Institute Critical Appraisal Checklist for Analytical Cross-Sectional Studies (University of Adelaide, Australia) [24]. This tool provides a different checklist of items for each study category and is recommended by Cochrane Methods [25]. For cross-sectional studies, the questions cover, in general, the study sample and participants, the design and execution of the methodology, and the tools used to analyse the results. In the “confounding factors” section, we considered previous contact with DM before the study, as well as students' module retention, as potential biases. The available answers for each item were “yes”, “no”, and “not applicable”, and after completing the checklist, an overall score was obtained for each article. We used a cut-off of 50% of checklist answers to rate publications as having a “high”, “moderate”, or “low” risk of bias. Two authors independently performed the quality assessment. Disagreements were resolved initially by discussion and then by consulting a third author for settlement.
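A minimal sketch of this scoring rule follows. Note that the text specifies a 50% cut-off but not the exact three-way mapping onto "high", "moderate", and "low"; the 80% boundary below is a hypothetical assumption added purely for illustration.

```python
# A minimal sketch of the scoring rule described above. The 50% cut-off is
# from the text; the 80% boundary is an assumed threshold for illustration.
def risk_of_bias(answers):
    """answers: a list of 'yes'/'no'/'not applicable' for the checklist items."""
    applicable = [a for a in answers if a != "not applicable"]
    score = sum(a == "yes" for a in applicable) / len(applicable)
    if score < 0.5:
        return "high"
    if score < 0.8:  # assumed moderate/low boundary, not stated in the text
        return "moderate"
    return "low"

print(risk_of_bias(["yes"] * 7 + ["no"]))  # 7/8 = 87.5% 'yes' -> 'low'
```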

Data extraction

Information available in the publications was extracted by one author and reviewed by a second author. A specific extraction form was designed in Microsoft Excel® software, which was also used to organize and process the qualitative and quantitative data. For each selected study, the following information was extracted (when available): year and country of publication, which variable was analysed (performance, perception, or both), number of participants, students' educational level, type of equipment and software used for WSI, type of workstation, digital slide accessibility, equipment training, CLM availability and its specification, number and scope of the samples used, and how students' performance and/or perception were assessed, together with the results.
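For clarity, the extraction form can be pictured as a structured record, as in the hypothetical sketch below; the field names paraphrase the items listed above and are not the authors' own schema.

```python
# A hypothetical sketch of the extraction form as a structured record; the
# field names paraphrase the items listed above and are not the authors' schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractionRecord:
    year: int
    country: str
    variable_analysed: str          # "performance", "perception", or "both"
    n_participants: int
    education_level: str
    wsi_equipment: Optional[str] = None
    wsi_software: Optional[str] = None
    workstation: Optional[str] = None
    slide_accessibility: Optional[str] = None
    equipment_training: Optional[bool] = None
    clm_available: Optional[bool] = None
    samples: Optional[str] = None   # number and scope of the samples used
    assessment_method: Optional[str] = None
    results: Optional[str] = None
```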

Analysis

The qualitative and quantitative data are presented descriptively. Given the high heterogeneity of the information available in the studies, especially regarding the methods used to measure perception and/or performance, we were unable to perform a meta-analysis in this systematic review. Instead, a narrative synthesis of the findings of the included publications was performed.

Results

PRISMA flowchart

The search across all databases initially identified 873 publications, published between 1998 and 2019. After exclusion of duplicates, 563 records were screened by title and abstract, resulting in 60 publications for eligibility assessment. Following full-text reading, 52 articles were excluded according to the eligibility criteria, and 8 were included in the qualitative synthesis. The article selection process is summarized in Fig. 1.
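The same flow can be restated as a quick arithmetic check on the reported counts:

```python
# The selection flow reported above, restated as a sanity check on the counts.
flow = {
    "records_identified": 873,
    "screened_after_duplicate_removal": 563,
    "assessed_full_text": 60,
    "excluded_at_full_text": 52,
}
included = flow["assessed_full_text"] - flow["excluded_at_full_text"]
assert included == 8  # the 8 studies entering the qualitative synthesis
```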

Fig. 1 Flowchart of the study screening process, adapted from PRISMA [20]

Methodological features of the studies

The included articles were published between 2008 and 2019 and originated from six countries: Australia (1), Brazil (1), Germany (1), Grenada (1), Saudi Arabia (1), and the USA (3). Six publications (75%) assessed both students' perception and performance [5, 14, 16, 26,27,28], whereas two studies (25%) evaluated only students' perception of DM in comparison with CLM [3, 12]. The number of participants ranged from 35 to 192 students; three articles (37.5%) included medical students, three (37.5%) included dental students from the second to fifth year, and two (25%) comprised medical residents.

The main methodological features of the included articles are summarized in Tables 1 and 2. The samples used in the studies encompassed general and systemic pathology, dermatopathology, histopathology (general and advanced), oral histology, and oral pathology. The most commonly used WSI workstations were computers with specific software and/or a web-based interface for viewing digital slides (5 studies; 62.5%). Three studies (37.5%) provided additional data, such as clinical history, radiographs, laboratory exams, and specimen annotations [3, 14, 28], and remote access to slides was described in 5 studies (62.5%). Six publications (75%) reported the availability of CLM concomitantly with DM, although 3 studies (37.5%) did not describe whether CLM was completely abolished after DM was assessed and established. Detailed information on the included studies is available in Supplementary Table 1.

Table 1 Methodological features of the included studies
Table 2 Equipment specifications and practice routine according to the included studies

Performance analysis

Two studies required the establishment of a diagnosis in their assessments, through either multiple-choice or open-ended questions [5, 16], and one of them also accepted a differential diagnosis as a correct answer [16]. Two publications used a mixture of tasks, comprising identification of diagnostic features, microscopic description of the specimen, and clinical features in addition to the differential and final diagnoses [14, 28]. One study did not specify the content of its multiple-choice exam questions [27], and another asked participants to assess a tissue specimen and demonstrate their interpretation through paper illustrations [26]. The time allotted for the performance test was described in two studies, with a mean of 66 min. Five of the six studies (83.3%) did not provide any equipment training prior to the exam (Table 1).

One study did not describe the numeric results of the performance assessment, stating only that there were no statistically significant differences in academic achievement between students using either technology [14]. In contrast with three studies, which favoured DM [16, 26, 28], two others reported similar or slightly higher performance using CLM [5, 27] (Table 3).

Table 3 Main results of students’ perception and performance

Perception analysis

All publications used questionnaires to assess students' perceptions of DM and/or CLM, ranging from 5 to 30 questions. Five studies (62.5%) used a 5-point Likert scale for the answers (e.g. 1—strongly agree; 2—agree; 3—undecided; 4—disagree; 5—strongly disagree). Three studies also used open-ended questions, and two provided an additional section for students' personal comments (Table 1). As presented in Table 4, most of the studies asked the students about the ease of use of the equipment (6 studies; 75%), the quality and magnification of the images (6 studies; 75%), and their preference for either DM or CLM (5 studies; 62.5%). Concerning the role of DM, seven studies (87.5%) reported that most students believed this technology to be an appropriate method for improved and efficient learning. When asked to choose one method, students in two studies (25%) expressed a predilection for both methods, whereas the other studies did not offer this answer option. Overall, the students preferred DM over CLM (Table 3).

Table 4 Essential contents of the studies' perception questionnaires regarding the use of digital and conventional light microscopy

Quality assessment (risk of bias)

Six publications (75%) achieved an overall low risk of bias. One study was categorized as having a moderate overall risk, since its criteria for sample selection were not clearly defined, confounding factors and strategies to deal with them were not identified, and the statistical analysis used in the study was not described in a reviewable way. The study classified as having a high risk of bias failed on the following items: clear sample inclusion criteria, description of the measurement criteria, identification and management of confounding factors, and detail of the statistical analysis. The full quality assessment of the included articles is available in Supplementary Table 2 and summarized in Fig. 2.

Fig. 2 Quality assessment results, adapted from the Joanna Briggs Institute Critical Appraisal tool

Discussion

Learning pathology requires associating morphological changes with clinical features. Thus, an unimpaired visual representation helps solidify concepts and principles and adds a real-life component that cannot be grasped through theory alone [10, 14]. Classically, glass slides and CLM have been the core of practical teaching, gradually enhanced by the introduction of digital cameras connected to microscopes, generating static images and live video. Although these devices enabled live examination and exhibition of slides to several participants at the same time, their control was limited to a single operator, so they served merely as a teaching supplement [2, 8].

The Accreditation Council for Graduate Medical Education (ACGME), an American organization, has highlighted six areas of competency: patient care, medical knowledge, professionalism, communication skills, practice-based learning and improvement, and systems-based practice [29, 30]. Moreover, pathology courses have undergone curricular modifications that have changed the dynamics of microscope laboratory sessions, including time, physical space, and equipment availability, and consequently have facilitated the application of different teaching and learning methods, such as cooperative and distance learning, in association with new technologies [10, 31].

In this context, DM represents an important tool, as it allows any computer to work as a CLM. Instead of providing sets of glass slides, especially those with limited availability and variability, all users can access the same material collection, in or outside the educational facilities, at any time, minimizing tissue use and the number of glass slides required while ensuring standardization [9, 30, 31]. Several slides can also be exhibited simultaneously on the same screen, enabling interpretation and comparison between histochemical and immunohistochemical stains, different sections, and specimens [2, 8, 32]. Annotations, measurements, macroscopic pictures, imaging studies, and labels can be added to the images, facilitating interactive study and distance communication [6, 30, 33]. Moreover, WSI systems are more ergonomic for users than a CLM station, provide a larger field of vision, and permit a broader range of magnifications. A thumbnail indicating the area displayed on the screen also promotes better orientation [9, 32].

Challenges of DM include its cost, with a large initial investment for WSI system implementation, including hardware, computers, and software, all of which need regular maintenance to guarantee proper functioning [2]. The high resolution of WSI images results in large files, demanding large-capacity storage, back-up, and data allocation whenever required, as well as a high-speed network [4, 34]. Currently, these systems still have proprietary designs, resulting in the absence of a universal virtual slide format, which limits the access and distribution of DM image files without relying on web-based collections, although this is mitigated to some extent by the availability of free WSI viewing software from most vendors [2, 4, 32], as demonstrated in Table 5. Other common complaints relate to image limitations, such as contrast and resolution [4, 5, 8].

Table 5 Examples of WSI hardware and software currently available and their main features

Conversely, as observed in our results, there has been broad acceptance of DM by students from both medical and dental backgrounds, for technical and/or educational reasons [13, 15, 17, 18]. Since students are increasingly familiar with computers, their preference for DM is not surprising, as CLM use is gradually being reduced or removed from courses [2, 6]. Still, earlier studies reported uncertainty about choosing a single method, especially in pathology residency programs, probably owing to the demands of routine practice, such as image zooming and the speed of glass slide assessment, as well as the importance of learning how to operate a CLM; these concerns are being overcome by more sophisticated technologies, such as multiplane focusing and high-resolution, faster, and more ubiquitous hardware and software [1, 14, 27, 48].

The performance outcomes of the included articles indicate that WSI can be considered an effective learning tool, equivalent to CLM, as previously reported [2, 17, 18, 48]. However, the remarkable heterogeneity of the methodologies used in the studies, i.e. different assessment methods, learners with different levels of education and experience, different samples for testing each technology, long intervals between the use of DM and CLM for evaluation, and lack of prior instruction in manipulating these devices, may have compromised the reliability of the results [5, 16, 27, 28, 49].

For diagnostic purposes, guidelines have been recommended for the validation of WSI systems, such as simulating a real clinical environment for technology use, involving a WSI-trained pathologist, and using at least 60 cases of the kind encountered in routine practice. Other recommendations include evaluation with both DM and CLM, a washout period of at least 2 weeks between viewing digital and glass slides, and assessment of the same material in both glass and digital formats [50].
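These parameters can be summarized compactly, as in the sketch below; the key names are illustrative and not part of any published schema.

```python
# A sketch encoding the diagnostic validation parameters summarized above [50];
# the key names are illustrative and not part of any published schema.
WSI_VALIDATION_PARAMETERS = {
    "simulate_real_clinical_environment": True,
    "wsi_trained_pathologist_involved": True,
    "min_cases": 60,                      # cases of the kind seen in routine practice
    "modalities_evaluated": ("DM", "CLM"),
    "min_washout_weeks": 2,               # between viewing digital and glass slides
    "same_material_in_both_formats": True,
}
```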

Based on these parameters [50] and on the lack of standardized testing for assessing learners' performance and perceptions when comparing WSI systems with CLM, we propose guidelines for further validation studies, as presented in Table 6.

Table 6 Guidelines for validating the use of WSI for educational purposes

In conclusion, DM and WSI can be considered reliable technologies for use in human pathology education, showing good acceptance by users. Although we could not determine the most appropriate approach for assessing students' performance, the data assembled in this study highlight the need for educational validation guidelines. We therefore expect that our recommendations may provide a platform for more homogeneous data and higher-level evidence for future systematic reviews and meta-analyses.