Introduction

With growing interest in small-group learning, web-based instruction and computer-based modules in medical education, there appears to be declining interest in improving the traditional medical school lecture. Within medical education literature, the lecture format is often used as a baseline to which new teaching modalities are compared, for example to small-group learning [1], computer-based learning [2–5], and alternative methods like game-based learning [6, 7].

However, it is unclear whether we are truly shifting from a classroom-based model of medical education to a hybrid model of small groups and computer-based instruction. A survey-based study in 2007 showed that electronic course materials such as recorded lectures and uploaded lecture slides did not deter students from lecture attendance, and that individual decisions about attending lectures were based on previous experiences with the lecturer, predictions of the lecture effectiveness, and introspective factors such as learning preferences and learning needs [8]. The study suggested that lectures have not become obsolete with the advent of electronic material, and that it is the quality of the lectures themselves that dictates attendance rates.

In addition, the decline in lecture attendance rates may simply reflect the increasing availability and robustness of video-recorded lectures. A study conducted at Harvard Medical School in 2008 found that 100 % of students had watched recorded lectures in some capacity, citing increased speed of knowledge acquisition (79.3 % of students), ability to look up additional information (67.7 %), ability to stay focused (64.8 %), and overall ability to learn more (63.7 %) as reasons [7]. Therefore, while lecturers may be discouraged by dwindling numbers in the lecture hall, their influence may be more substantial than they realize as students view and review these lectures at home.

More recently, a study found that students generally prefer live lectures to recorded lectures because they are more engaging, despite the convenience of video podcasts and the ability to stop, review, and repeat segments [9]. The same study found that learning outcomes between watching lectures live or recorded were equal. While students continue to find the traditional lecture an effective and useful part of their medical education, they seem to be largely deterred from attendance only by poor quality, not by the availability of other teaching modalities. Hence, we argue that alongside important advances in technology and the promotion of problem-based learning, the traditional lecture remains a cornerstone of medical education and warrants continued research and improvement.

Two major foci of research in lecture delivery are in-lecture active learning and multimedia use, due in part to interest and support by the Liaison Committee on Medical Education (LCME) [10]. Current research on incorporating active learning into the lecture environment focuses on new techniques such as in-lecture electronic clickers [11], reserving lecture hour blocks within a lecture series to assess short-term retention using electronic clickers [12], and setting up online spreadsheets in the lecture hall to which students can anonymously post questions from their computers during the lecture hour [13]. Meanwhile, research in multimedia use in teaching [14] focuses on concepts such as the multimedia principle (presenting words with pictorial support instead of words alone) and the contiguity principle (placing printed words near corresponding graphics). Many of these multimedia guidelines apply readily to medical school lectures.

While continued research in these domains is important, the introduction of many new tools makes it difficult to incorporate the lessons learned into best-practice guidelines that are sufficiently succinct for the typical lecturer to adopt. Our study aimed to identify a small set of recommended practices that could serve as a current simple model for improving the modern medical school lecture. We created a template to evaluate our faculty’s use of the new techniques in the domains of active learning and multimedia use, alongside traditional practices in lecture delivery style.

A similar study was performed in 2000 by Copeland et al. [15]. Based on three symposia that served as an intensive review of internal medicine topics for physicians, the group identified features of lectures that predict overall audience ratings using Pearson correlation coefficients and cluster analysis. The authors found the core features to be the lecturer’s ability to identify key points, the level of engagement, lecture clarity, slide comprehensibility and formatting. Our study differs in that we evaluated additional characteristics stemming from current research in active learning and multimedia, as well as other best-practice guidelines that have emerged since this prior study. Also, we targeted faculty who lecture to undergraduate medical students instead of physicians in practice.

In this study, we aimed to create a “short list” of most important practices that a typical lecturer may adopt into his/her lecture style. Secondarily, we hoped to illuminate areas for future research in the traditional medical school lecture, and call attention to its continued importance and relevance to medical education.

Methods

Design

This study quantified lecturer style and technical presentation characteristics to determine whether specific characteristics correlated with students’ ratings of the lecturers. We selected a subset of all lecturers to exclude infrequent lecturers who may be unfamiliar with the course objectives, students’ expectations, and the specifics of topics taught before or after their lecture, so that these factors would not bias student ratings. As such, we included only lecturers who were course directors or prior course directors in second-year pathophysiology courses at a large mid-Atlantic medical school in the 2010–2011 academic year, all of whom had given at least three lectures in the course for the year. We identified 13 such eligible lecturers and reviewed a total of 39 lectures. All students completed online evaluations for each lecturer at the end of their series of lectures within a given course, and we used the overall “clarity of presentation” rating of the lecturer as a marker for lecture style effectiveness.

Data Collection

One author reviewed and evaluated each podcast lecture. Each lecture was scored on a checklist of forty-seven objectively quantifiable characteristics based on previous works in the literature focusing primarily on overall lecture style [16–18], specific lecturing strategies [19], use of PowerPoint presentations [20, 21], and newer techniques that are currently undergoing validation [12, 22]. We also included lecturer demographic characteristics: professorship rank and gender. Lecturers were not made aware of the 47 characteristics prior to giving their lectures.

We categorized the characteristics into three domains: lecture style and strategies, use of multimedia, and lecturer demographics. Each characteristic was scored with either a numerical value or yes/no; a behavior-based characteristic was scored as “yes” if it was observed at least once at any point during the lecture.
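The scoring rule for behavior-based characteristics can be sketched in a few lines. This is a minimal illustration and not the instrument used in the study: the characteristic names and the observation logs are hypothetical, and only the rule "yes if observed at least once" is taken from the text.

```python
# Hypothetical observation logs: for each lecture, a list of timestamps
# (in minutes) at which each behavior was seen. Values are illustrative only.
lectures = [
    {"show_of_hands_question": [12.5], "oral_summary": [48.0, 55.0]},
    {"show_of_hands_question": [], "oral_summary": [50.0]},
    {"show_of_hands_question": [], "oral_summary": []},
]


def score_lecture(observations):
    """Score each behavior as True ("yes") if it was observed at least once."""
    return {name: len(times) > 0 for name, times in observations.items()}


def percent_demonstrated(scored_lectures, name):
    """Percentage of lectures in which a characteristic was scored "yes"."""
    hits = sum(1 for scored in scored_lectures if scored[name])
    return 100.0 * hits / len(scored_lectures)


scored = [score_lecture(obs) for obs in lectures]
oral_summary_pct = percent_demonstrated(scored, "oral_summary")
```

Under this rule, how often a behavior occurs within a lecture does not change its score, so the resulting percentages describe how many lectures used a technique, not how heavily it was used.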

At our institution, lecturers were rated by students through online evaluations at the end of each course on multiple domains using a 5-point Likert scale (5, outstanding; 4, excellent; 3, very good; 2, good; and 1, poor). One of these domains was “clarity of presentation”. We selected this rating as the primary marker for lecturer effectiveness because our study focused on identifying desired lecture qualities.

The average “clarity of presentation” rating of all 296 lecturers in the preclinical curriculum who taught in the 2010–2011 academic year was 3.9 ± 0.4 (median 3.86). As expected, our sample with only current or prior course directors of pathophysiology courses (n = 13) had a significantly higher average rating of 4.1 ± 0.5 (p = 0.015) (median 4.24). Since lecture attendance was not mandatory, student evaluations of lecturers were optional; each lecturer received an average of 120 ± 20 evaluations upon which their average score was based.

To identify the common characteristics of top lecturers, we divided our sample into lectures given by above-average (7) and below-average (6) lecturers using the average rating of 4.1 as a cutoff, resulting in 21 lectures considered above average and 18 considered below. For each characteristic, we then compared the values between the two groups using Student’s t test. Each characteristic associated with a statistically significant difference between the groups was considered a quality that distinguishes an above-average from a below-average lecture.
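The group comparison can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' analysis code: it uses a pooled-variance two-sample t statistic, the yes/no scores are coded as 1/0, and the counts (14 of 21 and 4 of 18) are reconstructed from the percentages reported in the Results for oral summarization of key points (67 % and 22 %). The helper names `student_t` and `two_sided_p` are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student t statistic with pooled variance, plus its df."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p value from Simpson integration of the t density on [0, |t|]."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 1 - 2 * (total * h / 3)


# Assumed counts, reconstructed from the reported percentages (not raw data).
above = [1] * 14 + [0] * 7   # 21 above-average lectures, 14 with an oral summary
below = [1] * 4 + [0] * 14   # 18 below-average lectures, 4 with an oral summary

t_stat, df = student_t(above, below)
p_value = two_sided_p(t_stat, df)
```

With these assumed counts the statistic is about 3.02 on 37 degrees of freedom, which gives a two-sided p of roughly 0.005. For yes/no characteristics a test designed for proportions, such as Fisher's exact test, would be a common alternative; the t test is used here only because it is the method named in the text.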

Results

Table 1 shows the percentage of lectures in which lecture style characteristics were demonstrated for “above-average” and “below-average” lecturers. Six characteristics—use of questions targeting individual students, use of audience brainstorming, use of small-group activities, use of role-playing, reading directly from slides, mumbling—were not demonstrated in any lecture. All other characteristics were demonstrated in at least one lecture. All lecturers used PowerPoint as the primary modality for lecture delivery.

Table 1 Percentage of lectures in which each lecture style characteristic was observed

Among characteristics related to active learning and audience engagement, only two characteristics were significantly different (p < 0.05) between “above-average” and “below-average” lectures. In 19 % of “above-average” lectures, the speaker asked questions requiring a show of hands from the audience at least once, whereas this was observed in 0 % of “below-average” lectures. Similarly, in 67 % of “above-average” lectures, the speaker summarized key points, while this was done in only 22 % of “below-average” lectures (p = 0.005).

Table 2 shows PowerPoint characteristics comparing presentations given by “above-average” and “below-average” lecturers. Only one characteristic differed significantly between the two groups: 62 % of “above-average” lectures featured the use of a summary slide, while only 28 % of “below-average” lectures used one (p = 0.03). No other characteristic related to the PowerPoint presentation differed between the two groups.

Table 2 PowerPoint characteristics averaged for all, “above-average”, and “below-average” lectures

For lecturer demographics, 57 % of “above-average” lecturers were full professor rank (as opposed to associate or assistant) while 0 % of “below-average” lecturers were full professor rank (p = 0.03). There was no distinction between above- and below-average lectures with regard to gender. Application of the Bonferroni correction would reduce the acceptable p value for significance to p = 0.002. However, our hypothesis was not that all characteristics must occur and be equivalent, so analysis on an item-by-item level remains valid, although it does not prove correlation.
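The effect of the correction on the reported findings can be checked with a brief sketch. The Bonferroni correction divides the significance level by the number of comparisons; the function name and the example of 10 comparisons below are hypothetical. Because the text does not state how many comparisons underlie the 0.002 figure, that threshold is taken as reported and not recomputed. The show-of-hands item is omitted because no exact p value is given for it.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-comparison significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Generic illustration with a hypothetical number of comparisons.
example_threshold = bonferroni_threshold(0.05, 10)

# p values reported in the text for findings that reached p < 0.05.
reported_p = {
    "oral summary of key points": 0.005,
    "summary slide": 0.03,
    "full professor rank": 0.03,
}

# Corrected threshold exactly as stated in the text (assumed, not derived here).
corrected_threshold = 0.002

# Findings that would remain significant after correction.
survivors = [name for name, p in reported_p.items() if p < corrected_threshold]
```

None of the three listed findings falls below the corrected threshold, which is why the item-by-item results are best read as associations to be confirmed in later work.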

Our study demonstrates that top lecturers more often summarize key points orally, ask questions that require a show of hands from the class, and provide a summary slide in their presentation. It also suggests that full professors are more likely than associate or assistant professors to be identified as top lecturers.

Discussion

The identified characteristics share a theme of summarizing information and engaging the audience through questioning involving the entire class, the latter of which may be related to assessment of audience understanding.

It is notable that most of the 47 characteristics, all of which are considered best practice, were not found to be used differently by above-average compared with below-average lecturers. In fact, the values for use of these characteristics varied widely among lecturers in both groups. Assuming our study had adequate power, there are two possible explanations: (1) As a whole, subjective characteristics of lecturers such as charisma and humor have a greater influence on lecturer ratings than objectively quantifiable characteristics; in other words, the art of lecture delivery is more important than the science. (2) It is not whether a best-practice technique is used but rather how it is used that determines its effectiveness. For example, the common recommendation of using abundant multimedia in lectures must be accompanied by conscientiously selecting only appropriate multimedia and avoiding overuse. Similarly, lecturers may not yet have optimized the use of new techniques such as audience response systems due to a lack of experience.

We approached the problem of identifying best-practice characteristics mindful of the vast array of intersecting and non-intersecting characteristics, which leads to many potential confounders. While it was not possible to prove that adopting the identified characteristics would improve student ratings of a lecturer, there appear to be significant associations between certain identified characteristics and student ratings.

Study Limitations

The most significant limitation of the study is that we divided lectures into above-average and below-average based solely on the “clarity of presentation” student rating given to the lecturers at the end of each course. Student ratings are inevitably biased; recent research has also shown that student ratings are influenced by the following: physical attendance compared to viewing lectures at home and student performance based on grade, level of training, and lecturer degree (MD or PhD) [23]. In addition, ratings are likely influenced by the likability of the lecturer independent of the quality of his/her teaching skills, as well as by poor student effort or recall when completing evaluations. It is hoped that the large number of student evaluators (average 120 ± 20) reduces bias. Another limitation stems from the heterogeneity of lecture content. Different topics warrant different lecturing techniques. We attempted to minimize this problem by selecting only second-year pathophysiology course lectures that are information-based (for example the topic of aortic dissection) and not skill-based (for example the topic of reading EKGs), but content inevitably dictates a lecturer’s decisions in style.

In addition, by selecting only course directors of second-year pathophysiology courses to minimize heterogeneity in content and lecturer demographics, we sacrificed sample size. It is possible that by expanding our inclusion criteria and evaluating more lecturers, we might have identified more characteristics as statistically significant. Also, choosing only current or prior course directors may have limited our selection to our “top” educators, reducing the range of lecture effectiveness and making differences between “above-average” and “below-average” lecturers more difficult to discern.

One further issue is generalizability. Icahn School of Medicine at Mount Sinai has a relatively good standing amongst U.S. medical schools and boasts a larger proportion of nontraditional students than the average school, as represented by a higher average age of matriculation and greater number of students with a college background in the humanities [24]. Our unique student population may lead to poorer translatability of results to other institutions; for example, adult learners practice more self-motivated and practicality-based learning [25], and student academic backgrounds may affect preferred cognitive learning styles. However, given the continued trend of medical education toward active and multi-modality learning [26], we believe the results of our study are appropriate and applicable to the modern medical student, and may be even more applicable over time with continued promotion of active learning. As well, since the lecture format unilaterally best matches the abstract/conceptualization learning styles (as opposed to, for example, concrete/experience) [27], the evaluation of lecture quality is likely only weakly affected by differential learning styles.

Finally, it is notable that we evaluated lecture effectiveness based on student rating as opposed to knowledge gained. Due to the heterogeneous nature of content, it would be challenging to measure and compare knowledge gained across multiple lecturers. Nonetheless, student rating is only one marker for lecturer effectiveness.

Future Directions

Given the vast number of characteristics that were measured, we did not attempt to evaluate subjective characteristics, for example those related to lecturer charisma, humor, or reputation. As discussed above, these subjective factors are likely as influential, if not more so, to student rating than the objective characteristics we evaluated. Possible future work would be to develop and validate quantitative scales to evaluate these characteristics, to determine their relative importance compared to other characteristics.

Our study identified a “short list” of best practices for the medical school lecture, but guidelines are limited in their generalizability and validity. Perhaps a combination of practice, experience, and developing one’s own style over time is the true “best practice”, as supported by our finding that lectures at our institution given by seasoned full professor educators who are course directors are the best received.

Our study suggests that the most important best-practices lecture guidelines for the medical school lecture include oral summarization of key points, availability of a summary slide in the presentation, and asking questions that require a show of hands from the class. Our study did not find improved student satisfaction with the use of newer techniques such as using electronic clickers and asking students to discuss questions among themselves, although this may reflect a lack of faculty experience in optimizing their use. Future work includes assessing how subjective qualities affect lecturer ratings and evaluating a greater number of lecturers including those who are not senior course directors.