Recent reports indicate an increased need for a college-educated science, technology, engineering, and mathematics (STEM) workforce in order for the United States to stay competitive (Langdon et al. 2011) and meet growing employment demands (Lacey and Wright 2009). A substantial hurdle to meeting this need is the large number of undergraduates, nearly 50 %, who intend to earn STEM degrees but fail to do so (Chen and Soldner 2014). Many of those who abandon STEM fields are actually quite strong academically, but they may find non-STEM pursuits more attractive for a variety of reasons (Bettinger 2010; Chen and Soldner 2014; Seymour and Hewitt 1997).

Several strategies have been used to increase undergraduate retention in STEM courses and majors, including preparatory programs to assist those who are less academically prepared at the high school level (e.g., the Advancement via Individual Determination program [Oswald and Austin Independent School District 2002]) or university level (Gilmer 2007; Strayhorn 2011; Walpole et al. 2008); peer mentoring programs to facilitate collaborative learning and peer support (Light and Micari 2013; Quitadamo et al. 2009); and laboratory and lecture course redesigns to heighten problem solving skills, critical thinking, and formal operational reasoning (Kapp et al. 2011; Watkins and Mazur 2013). In particular, the Boyer Commission urged research universities to “make research-based learning the standard” for all students (Boyer Commission on Educating Undergraduates in the Research University 1998); and many laboratory course redesign efforts have been attempted with varying success to create “authentic research experiences” that replicate those experienced by practicing scientists.

Undergraduate research experiences (UREs) provide a chance for students to work with scientists, graduate students, and post-doctoral fellows on real research projects. Such experiences may promote retention, interest, and long-term persistence in STEM fields (Heidel et al. 2011; Jones et al. 2010; Lopatto 2007). In their review, Hunter et al. (2007) concluded that undergraduate research promotes “thinking like a scientist,” which is characterized by improved critical thinking and problem solving as well as affective gains including increased confidence in the ability to perform research, clarity in future career goals, and intentions to pursue graduate school. UREs also develop scientific skills and generate interest in the scientific process, which may encourage a long-term career in STEM fields (Eagan et al. 2013; Seymour et al. 2004).

While these reports demonstrate the value of UREs, it is unclear whether the timing of the experience is important. In particular, there are few reports of such experiences that serve first- and second-year students, and we found only one report of a research preparatory program that enrolls exclusively first-year students. The Biology Undergraduate Scholars Program at the University of California Davis, a program that includes optional laboratory research for first-year students, has demonstrated promising results. Its participants were more likely to persist in STEM disciplines and pursue graduate study compared to other graduates (Barlow and Villarejo 2004). Heidel et al. (2011) described an undergraduate research program targeting pre-freshmen to sophomores that resulted in 87 % attaining STEM degrees or continuing to pursue them. The Undergraduate Research Opportunity Program at the University of Michigan engaged first- and second-year students in research experiences during the academic year (Hathaway et al. 2002), and participants demonstrated higher retention and grades than non-participants.

To investigate whether early engagement in research could improve STEM retention, we designed a program specifically (a) to engage first-year students in research and (b) to encourage long-term participation in research. To maximize research participation and retention in STEM fields, we took the novel approach of selecting participants based on work by McGee and Keller (2007), who identified characteristics that distinguished between students who chose biomedical research (Ph.D.) instead of clinical training (M.D.) opportunities. We do not use academic preparedness or prior research experience to select participants.

Our study addressed the following questions. Does this program encourage first-year students who are already interested in research to (a) persist in research and (b) pursue STEM majors? We also explored students’ science interest before and after completing the program.

The Study

Institutional Context

Our private institution is predominantly white and highly selective, and it has very high research activity. The Office of Undergraduate Research funds students interested in summer research; however, priority is given to juniors, so few first-year students were funded prior to this program.

NU Bioscientist Description

The program began in 2011 as a joint initiative between the Department of Molecular Biosciences (faculty), the Searle Center for Advancing Learning and Teaching (staff), and the Office of Research (administration). It was awarded a grant from the Howard Hughes Medical Institute. The goal of the program is to promote STEM retention by engaging first-year students in authentic research experiences. The objectives are (a) to establish a community of practice (Wenger 1998) consisting of undergraduates, graduate students, post-doctoral fellows, and faculty members and (b) to utilize the community of practice to guide students in authentic research experiences. This involves securing a place in a research laboratory, crafting a research question, designing appropriate methods, writing a research proposal, conducting a summer research project, and sharing findings at a symposium, that is, thinking like a scientist by doing the work of a scientist (Lave and Wenger 1991). In the program design we incorporate current best practices from evidence-based research, including group work, discussion-based learning, peer mentoring, and interdisciplinary approaches. Although the program is housed in our Program in Biological Sciences, it is not limited to biology students, as students at our institution typically do not declare majors until their junior year. Moreover, research selections are not limited to traditional biology research, as we assist participants in finding laboratories in a variety of STEM and social science fields including engineering, chemistry, materials science, psychology, and economics.

The most critical program component is a community of practice (Wenger 1998) to mentor and support students. The community consists of the first-year cohort, the undergraduate peer mentors, laboratory mentors who work with the students in labs and prepare them for their summer research, and the program faculty members. Students take two seminars (detailed below), work in small groups, and review each other’s research proposals. Trained undergraduate peer mentors work with students in the fall to help them identify laboratories, offer critical feedback on research proposal drafts in the winter, and help students integrate themselves into their laboratories. Each student is also paired with a laboratory mentor (graduate student or post-doctoral fellow), who has participated in a series of six one-hour workshops on mentoring skills based on the work of Handelsman et al. (2005). Laboratory principal investigators (PIs) match mentors with mentees. Mentors also work with program faculty members and their students to develop their research questions and methodology and to help them create a poster or talk describing the results of their research project.

The academic component of the program consists of two courses: Biological Thought and Action (fall) and Science Research Preparation (winter). Biological Thought and Action is co-taught by a faculty member in the Department of Molecular Biosciences and a faculty member in the program of Science in Human Culture, who also serve as first-year advisers for the students. Course objectives are to help students gain a deeper understanding of the mechanics of modern science; develop strong argument-based writing using supporting literature; evaluate and communicate core findings in science from diverse perspectives; and address issues of inclusion, accessibility, and ethics in science. Science Research Preparation is co-taught by the program director and a laboratory instructor in the Department of Molecular Biosciences. The objectives of this course are to develop research skills, which include crafting research questions, performing literature reviews, designing studies, and analyzing data. Each week, students participate in a 20-minute conversation with program faculty members on a particular aspect of research (e.g., academic publishing) followed by break-out sessions in small groups (3–5 students) led by a peer mentor. In the small group sessions, peer mentors work with students on a group exercise related to the topic (e.g., use PubMed or Google Scholar to find three recent peer-reviewed articles on a particular topic and compare results) and discuss how to incorporate those skills into their individual projects.

Upon successful completion of the coursework, submission of the research proposal, and approval of the project by the faculty sponsor, students are awarded funding through the program to pursue their project the following summer. During the summer students work in their laboratories for 10–12 weeks for approximately 40 hours a week under the guidance of their mentors (see below). They are not allowed to take coursework or hold jobs during this time so that they can immerse themselves in research.

Participants

The program has a website and online application that is accessible to all students (not only those intending to major in certain disciplines), and it is promoted at student recruiting events. Each year, between 100 and 140 students apply to the program, which is roughly one-quarter of the number of students interested in STEM fields at our institution. Based on the work of McGee and Keller (2007) and with advice from that study’s first author, program managers designed a rubric to score applicant responses to four questions. (a) Of all your experiences related to science in the past, which one(s) did you find the most satisfying? Why? (b) Think about the science research you have done before. What is it that makes you want to do more research? If you have NOT done any science research before, what makes you want to do it now? (c) Think about how you generally work toward a goal. Do you plan out every step and follow it closely, or do you figure it out as you go without exactly knowing where you are headed? Why? (d) When you get stuck on a problem, do you prefer working on it by yourself until you figure it out, or do you quickly try to find someone who can help you solve it?

Using responses from McGee and Keller (2007) as a guide, two program staff members, blind to the demographics or academic preparation of applicants, score the applications on the following: Curiosity to discover the unknown, enjoyment of problem solving, independence, minimally structured views of the future, and a desire to help people indirectly through research. Each domain is scored on a three-point system: 0=no evidence, 1=some evidence, 2=strong evidence. Scores for each domain are based on the intensity of a respondent’s statement and how many times a certain theme is identified. Table 1 depicts these five domains and sample statements from our applicants with assigned ratings. Students with the top 30 scores are invited to participate in the program.
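
The scoring and selection step described above can be sketched in a few lines of Python. This is only an illustration, not the program's actual procedure: the domain keys, the applicant identifiers, the use of a simple sum as the overall score, and the alphabetical tie-break are all assumptions, and the text does not say how the two raters' scores were combined.

```python
# Five rubric domains named in the text (the key names are hypothetical).
DOMAINS = (
    "curiosity",
    "problem_solving",
    "independence",
    "minimally_structured_future",
    "helping_through_research",
)


def total_score(ratings):
    """Sum the five domain ratings (each 0, 1, or 2), giving a total of 0-10.

    Summing the domains is an assumption; the text only says that the
    students with the top 30 scores were invited.
    """
    for domain in DOMAINS:
        if ratings[domain] not in (0, 1, 2):
            raise ValueError(f"{domain}: rating must be 0, 1, or 2")
    return sum(ratings[domain] for domain in DOMAINS)


def select_top(applications, n=30):
    """Return the ids of the n highest-scoring applications.

    applications maps an applicant id to a dict of domain ratings.
    Ties are broken by id, which is an arbitrary illustrative choice.
    """
    ranked = sorted(applications, key=lambda a: (-total_score(applications[a]), a))
    return ranked[:n]
```

Calling `select_top(applications)` with the default `n=30` mirrors the cohort size stated in the text.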

Table 1 Domains used for student selection. Example responses from our applicants are depicted

There were 375 applicants over the 3-year time period. Ninety (24.0 %) were chosen to participate in the program, and 84 (93.3 %) completed the program. The six students who did not complete the program are included in all analyses. Forty-nine females and 35 males earned an undergraduate research grant after their first year.

Evaluation Design

The program evaluation described here pertains to three cohorts (2011, 2012, and 2013), and it was designed to address three questions. (a) Do participants have higher participation in undergraduate research than non-participants, (b) are participants more likely to major in STEM disciplines than non-participants, and (c) do participants’ science self-efficacy and interest increase as a result of the program? To answer these questions, we compared participant data with data from a control group of non-participants who applied but were not accepted into the program. We believe this to be the most appropriate comparison group to demonstrate the impact of the program because it controls for motivation to engage in research. The literature suggests that women, underrepresented minorities, and the less academically prepared are disproportionately lost from STEM majors (Griffith 2010; Bae and Smith 1997). Given this evidence, we matched each participant with a non-participant on the basis of sex, race/ethnicity, and SAT score. When matching, the priority was to match on sex and race/ethnicity first and then to find the closest SAT score. Students vary in their selection of entrance examination (SAT vs. ACT), so we converted ACT to SAT scores based on data from the College Board (2009) and used cumulative SAT scores (critical reading and mathematics) for all analyses. Table 2 shows the demographics (sex and race/ethnicity) and SAT scores for all applicants (N = 375), selected students (N = 90), control students (N = 90), and the university as a whole from 2013 to 2014 (N = 8353). There were no significant differences in sex (χ2(1) = 0.000, p = 1.0) or race/ethnicity (χ2(4) = 0.905; p = 0.924) between participants and non-participants. Participants also did not differ significantly from non-participants in terms of SAT score (Wilcoxon Signed Rank Test, Z = −0.856, p = 0.392). Thus, the experimental and control groups were statistically indistinguishable by typical academic, race/ethnicity, and sex criteria. There were also no significant differences between participants and all applicants on any of these variables (sex, χ2(1) = 0.090, p = 0.764; race/ethnicity, χ2(4) = 2.428, p = 0.657; SAT, Z = −1.741, p = 0.082).
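
The matching rule described above (match on sex and race/ethnicity first, then take the closest SAT score) can be illustrated with a short sketch. It is one possible reading rather than the procedure actually used: the record fields are hypothetical, the matching is greedy and therefore depends on the order of participants, each control is used only once, and a participant with no exact sex and race/ethnicity match is simply left unmatched.

```python
def match_controls(participants, pool):
    """Pair each participant with one unused non-participant.

    Every record is a dict with the (hypothetical) keys "id", "sex",
    "race", and "sat". A candidate must share the participant's sex and
    race/ethnicity; among those candidates, the one with the closest SAT
    score is chosen. Returns {participant id: control id or None}.
    """
    available = list(pool)
    matches = {}
    for p in participants:
        same_group = [
            c for c in available
            if c["sex"] == p["sex"] and c["race"] == p["race"]
        ]
        if not same_group:
            matches[p["id"]] = None  # no exact demographic match remains
            continue
        best = min(same_group, key=lambda c: abs(c["sat"] - p["sat"]))
        matches[p["id"]] = best["id"]
        available.remove(best)  # match without replacement
    return matches
```

A greedy pass is the simplest choice here; an optimal assignment that minimizes the total SAT difference across all pairs would be an alternative design.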

Table 2 Demographic characteristics (N (%)) of all applicants (N = 375), selected students and their matches (N = 90 each), and full-time enrollees at the institution from 2013 to 2014 (N = 8353; 8451 for race data)

Research Participation

We considered three measures for continuation of research: (a) number of quarters that a student participated in paid research through an undergraduate research grant, (b) number of quarters that the student was enrolled in an independent study for research credit, and (c) the combination thereof. The time period for all outcomes was between a student’s matriculation at the institution and when we conducted this study (late 2015) and varied by cohort. We obtained the data from the Registrar and Office of Undergraduate Research, College of Arts and Sciences, Department of Chemistry, School of Engineering, and the Program in Biological Sciences. We also included the grants received by participants as part of the program’s research experience in the undergraduate research grant count, as non-participants also had the ability to earn a grant in that first summer through the Office of Undergraduate Research. We compared persistence in undergraduate research between participants and non-participants cumulatively and by cohort using Wilcoxon Signed Rank Tests. We used a Mann Whitney U Test to compare these outcomes by sex across cohorts.
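
For readers unfamiliar with the paired comparison described above, the following is a minimal pure-Python sketch of a two-sided Wilcoxon signed-rank test using the normal approximation. It is not the SPSS routine used in the study: dropping zero differences, the tie correction, and reporting Z from the smaller rank sum are common conventions assumed here, and the function names are hypothetical.

```python
import math


def combined_quarters(grant_quarters, study_quarters):
    """Measure (c): grant quarters plus independent-study quarters per student."""
    return [g + s for g, s in zip(grant_quarters, study_quarters)]


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    x and y are paired samples, e.g. quarters of research for each
    participant and for that participant's matched non-participant.
    Returns (w_plus, w_minus, z, p).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w_plus, w_minus, z, p
```

With small samples or many ties, an exact test would be preferable to the normal approximation sketched here.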

We also explored laboratory choices amongst participants (data were unavailable for non-participants). We classified laboratory selections into three groups (STEM, social sciences, and none) and then further classified them into sub-disciplines: biomedicine, biology, engineering, psychology, communication sciences and disorders, chemistry, and economics.

Persistence in STEM Majors

We obtained data on declared majors (cohorts 2 and 3) and degrees earned (cohort 1) from the registrar. Majors and degrees were grouped into the following categories: STEM, social sciences (e.g., anthropology, psychology, economics), non-STEM (English, history, political science), and undeclared or dismissed from the university. At our institution, students do not have to declare a major until their junior year, so declared major is a suitable proxy for degree earned. We used chi-square goodness-of-fit tests to analyze data on major/degree selection. We also investigated participation of mentors in the training workshops and mentee outcomes.

Science Interest, Self-Efficacy, Research Skills, and Career Plans

We administered two interest surveys to program participants: one prior to enrollment (pre-survey) and another after the winter quarter (post-survey; cohorts 2 and 3 [N = 27 and 29, respectively]) or after they conducted their summer research projects (cohort 1 [N = 9]). The interest survey included items from (or adapted from) published reports across eight domains: science interest, biology interest, science self-efficacy and biology self-efficacy, critical thinking, research skills and research knowledge, and science career interest (Eccles n.d.; Handelsman et al. 2005; Linnenbrink-Garcia et al. 2010; Lopatto 2004; Marsh 1990; Midgley et al. 2000; Pintrich et al. 1991). (The complete survey is available from the last author upon request.) Respondents indicated their extent of agreement with a series of statements on a six-point Likert scale from strongly disagree (1) to strongly agree (6). We averaged responses to items on the same subscale to create a domain score. We assessed differences between pre- and post-domain scores with Wilcoxon Signed Rank Tests and used Mann Whitney U Tests to assess differences by sex. We administered a follow-up survey in the fall of participants’ junior year. The survey contained items on a five-point Likert scale (1=strongly disagree to 5=strongly agree) related to continuation of research, science interest, community, and the program in general (N = 58).
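
The domain scoring described above (averaging the items on each subscale of the six-point scale) can be sketched as follows. The item identifiers and the item-to-domain mapping are hypothetical, since the survey itself is not reproduced here, and skipping unanswered items is an assumption about missing-data handling that the text does not address.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 6  # six-point Likert scale described in the text


def domain_scores(responses, subscales):
    """Average one respondent's item ratings within each domain.

    responses maps an item id to a rating; subscales maps a domain name
    to the list of item ids on that subscale. Unanswered items are
    skipped, and a domain with no answered items scores None.
    """
    scores = {}
    for domain, items in subscales.items():
        answered = [responses[item] for item in items if item in responses]
        for value in answered:
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"rating {value} is outside {SCALE_MIN}-{SCALE_MAX}")
        scores[domain] = mean(answered) if answered else None
    return scores


def paired_changes(pre, post):
    """Post-survey minus pre-survey score for each domain scored in both."""
    return {
        domain: post[domain] - pre[domain]
        for domain in pre
        if post.get(domain) is not None and pre[domain] is not None
    }
```

The per-respondent changes returned by `paired_changes` are the kind of paired differences that a signed-rank test would then compare.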

Statistical Analyses

We used SPSS Statistics (version 23) for all analyses, and we used non-parametric statistics to analyze Likert scale data. We also used non-parametric tests when there were significant outliers and/or the assumption of normality was violated. A significance level of 0.05 was established.

Results

Increased Research Engagement

The research involvement of participants was significantly higher than that of non-participants (Table 3). Aggregated across cohorts, this difference was significant for undergraduate research grants, independent study enrollment, and the combination thereof. This is in part driven by a higher percentage of participants enrolled in independent studies compared to non-participants (41 % vs. 32 %; χ2(1) = 1.5; p = 0.216).
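
As a worked check of the independent-study comparison above, the sketch below computes a Pearson chi-square statistic for a 2 × 2 table without continuity correction. The raw counts are not given in the text; the values 37 and 29 out of 90 per group are assumptions inferred from the reported 41 % and 32 %, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for observed, row, col in (
        (a, row1, col1), (b, row1, col2),
        (c, row2, col1), (d, row2, col2),
    ):
        expected = row * col / n
        stat += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


# Assumed counts: 41 % and 32 % of 90 students correspond to about 37 and 29.
# Rows: participants, non-participants; columns: enrolled, not enrolled.
stat, p = chi_square_2x2(37, 53, 29, 61)
print(round(stat, 1), round(p, 3))
```

Under these assumed counts the statistic and p-value come out close to the reported χ2(1) = 1.5 and p = 0.216, which suggests the reported test was an uncorrected Pearson chi-square.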

Table 3 Mean (SD) number of quarters of research by participant status

Women in the program had independent study and undergraduate research grant enrollment similar to that of men (mean [standard deviation]: 2.4 [2.2] vs. 2.3 [2.4] quarters). However, among non-participants, men had higher research participation than women (1.2 [1.8] vs. 0.8 [1.2] quarters, ns). Figure 1 shows differences between participants and non-participants in research participation for men and women by cohort. Across all cohorts, the difference in research participation was 1.6 [2.5] quarters for women and 1.1 [2.8] quarters for men. This difference was not significant for any individual cohort or cumulatively across cohorts.

Fig. 1 Difference in research persistence (computed as participant minus non-participant; number of quarters of URG and independent study) between participants and non-participants by sex and cohort

Laboratory PIs assigned a mentor to students after they joined a laboratory. Concordance between participant and laboratory mentor sex was higher for women (71 %) than for men (66 %). Female participants with male laboratory mentors had 3.9 [3.0] quarters of research, whereas female participants with female mentors had only 1.8 [1.3] quarters. For men, having a male mentor seemed to make little difference in research involvement: mean participation was 2.8 [2.5] quarters for male-male and 2.2 [2.4] quarters for male-female laboratory mentor dyads. Men who did not select a laboratory had 0.5 [0.84] quarters of research involvement on average.

Persistence in STEM Majors

Persistence in STEM did not differ significantly between participants and non-participants for any cohort or for all three cohorts in aggregate (Table 4). Most students (~69 % overall) declared STEM majors. We also did not observe a significant difference in STEM persistence when comparing participants (N = 90) and all non-participant applicants across all years (N = 285; χ2(1) = 0.383; p = 0.536; data not shown). Significantly more men (80 %) majored in STEM compared to women (χ2(1) = 5.570; p = 0.018) across all three cohorts.

Table 4 Number (and percent) of degrees or declared majors by participant status. For cohort 1 the percentage represents degree awarded. For cohorts 2 and 3 the percentage represents declared major

Laboratory Selection

The majority of participants (82 %) chose to work in STEM laboratories, 11 % were concentrated in the social sciences, and the remaining 7 % did not select a laboratory. Table 5 shows degrees/majors of the participants and their laboratory disciplines. Of the individuals who worked in engineering laboratories, 83 % later declared STEM majors, as did 80 % of those in chemistry laboratories and 74 % of those in biology laboratories. Fewer students who selected biomedical, communication sciences and disorders, or psychology laboratories eventually declared STEM majors (64, 57, and 43 %, respectively).

Table 5 Number (and percent) of laboratory selections by degrees or declared majors. Non-STEM major/degrees includes social sciences. Research participation is reported as a combination of URGs and independent study (mean [SD]). SAT is on a 1600-point scale and reflects SAT or ACT-converted values

Mentor Participation in Workshops

Of the 84 mentors, 66 (79 %) were trained in the program workshops (“trained mentors”). Overall research involvement was slightly higher for students who had trained mentors than untrained ones (2.5 [2.3] vs. 2.3 [2.0] quarters). Table 6 shows research participation by mentor training status and sex concordance of mentor-mentee dyads. The highest research involvement for both male and female students was seen for those with trained male mentors.

Table 6 Mean (SD) research participation (URGs + independent study; number of quarters) for men and women by laboratory mentor training status and sex concordance

Changes in Science Interest, Self-Efficacy, Research Skills, and Career Plans

Table 7 summarizes pre- and post-survey interest and skills in science across all three cohorts. There was only one significant change: mean research knowledge increased from 4.1 [1.2] on the pre-survey to 4.4 [0.6] on the post-survey (p = 0.002). We examined trends by sex. Although we did not find any significant changes (post-pre), we observed increases for women in all domains except science self-efficacy and critical thinking, whereas for men increases were observed only for science interest, research skills, and research knowledge. Women were also more likely to credit the program for engaging them in research than men (Table 8; p = 0.021).

Table 7 Participants’ scores on pre- and post-surveys (mean [SD]) for each domain (aggregated across cohorts)
Table 8 Participant follow-up survey data for items pertaining to science interest and programmatic feedback

Discussion

Our purpose in undertaking this project was to develop a first-year research preparation program to increase research engagement and STEM retention at our institution. We successfully engaged first-year students in the research process; however, the program did not increase STEM retention. Following the conclusion of the grant, the program is being continued by the Weinberg College of Arts and Sciences (administration) and the Department of Molecular Biosciences (faculty).

Our first goal was to increase participation in undergraduate research experiences, and this goal was achieved. Program participants evidenced significantly higher research involvement than matched non-participants. The conclusion that research participation was increased by the NU Bioscientist Program is bolstered by the close matching of the experimental and control groups. Participants in both the control and experimental group chose to apply to the program and were indistinguishable in terms of academic, race/ethnicity, and sex characteristics.

Interestingly, although program participants had higher research participation than non-participants, some subgroups of participants benefited more than others. Female participants conducted only slightly more quarters of research than did their male counterparts, but this finding is noteworthy because matched non-participant females actually performed less research than non-participant males. Figure 1 shows the gains in research participation compared to the matched non-participants. Consistent with the program promoting increased research, women were more likely to credit the program for engaging them in research than men (Table 8).

We also considered what role mentor sex played in encouraging research persistence. Same-sex mentoring relationships have been shown to be advantageous in terms of psychosocial support by some studies (e.g., Koberg et al. 1998) but not others (Ensher et al. 2002). For our first two cohorts, laboratory faculty were asked to make same-sex pairings when possible. However, in the third year, we eliminated that request so that students could work on a project that best suited their interests. It is possible that attitudinal characteristics or other attributes of mentors are more important to mentoring relationships than sex matching of dyads. Female students with male mentors had higher research involvement than did those with female mentors, suggesting that matching female mentees with female mentors might not encourage research participation.

We also evaluated the program in terms of its impact on retention in STEM fields. Ultimately, our program did not have a significant impact on persistence in STEM majors, as persistence rates were statistically indistinguishable among program participants, matched controls, and the applicant pool. The likely explanation for this finding is that most applicants, regardless of whether or not they were accepted into the program, were highly motivated individuals with strong academic preparation and interest in STEM fields.

Initial laboratory choice might be important in terms of later STEM major declaration. Students who selected laboratories in our biology, engineering, and chemistry departments were more likely to declare a STEM major as opposed to students who selected biomedicine, psychology, or communication sciences & disorders. The lower percentage of future STEM majors who selected biomedical laboratories, compared to biological, chemical, or engineering laboratories, is particularly interesting. There are several potential explanations for this finding. First, our medical school is not located on our undergraduate campus. This means that in addition to commuting between campuses, the students working in biomedical laboratories were advised by faculty members who did not typically teach and mentor undergraduates. Second, students who selected biomedical laboratories might be more likely to see research as a means to pursuing medical school or getting clinical experience; and such individuals may also be more likely to make the strategic choice of declaring a non-STEM major that would typically result in a higher GPA and an increased chance of admission to medical school.

Interestingly, female non-participants actually had more STEM majors than female participants (67 % vs. 57 %) despite the fact that female participants had higher involvement in research than female non-participants (Fig. 1). In addition to the possibility of declaring a non-STEM major to increase chances of medical school admission, an alternative possibility is that students pursued more research to supplement their choice of a non-STEM major in preparation for medical or graduate school. An important follow-up study will be to determine the career paths of participants and non-participants to determine which students ultimately pursued a STEM career.

The study has some limitations. We report continuation in research in terms of grants received and number of quarters of research participation, but we acknowledge that some undergraduate research experiences (i.e., voluntary work in a laboratory unassociated with an independent study or grant) may be missed by this approach. We explored majors/degrees earned as one of the primary outcomes; however, these outcomes were removed in time from exposure to the program. Finally, we recognize that this research was conducted at a highly selective institution, which limits the generalizability of the findings. Whether this program would increase STEM retention in the context of a different student population is an interesting question. For example, would a similar program increase STEM retention at a less selective college? Or, at our own institution, could this program increase STEM retention in sub-populations, such as low-income students, who have a lower retention in STEM majors than the general population? Future studies at other institutions may address the former question, but to address the latter question we have now changed the selection criteria of our program to investigate whether or not engagement in research can increase STEM retention of low-income and first-generation students at a selective enrollment college.

Conclusion

We designed NU Bioscientist to engage undergraduates in research early in their college careers. Our data demonstrate that the program can increase the persistence of first-year students in research. The program did not result in a higher number of STEM degrees/majors; however, there was some evidence that initial laboratory choice might impact later STEM major declaration. An unexpected finding from this study was that female students with female mentors showed reduced persistence in research compared to female students paired with male mentors. Our findings add to the growing body of research about issues relating to the STEM educational process.