Introduction

Disparities in access to quality health care based on race/ethnicity, language, and socioeconomic status (SES) are well documented in the USA (U.S. Department of Health and Human Services 2013a). In particular, underrepresented minorities (URMs), which include African-Americans/Blacks, Hispanics/Latinos, Native Americans, and Pacific Islanders, have less access to health care than Whites; moreover, those who are poor or low income have less access than those who are middle-to-high income (U.S. Department of Health and Human Services 2013b). According to a recent report published by the Coalition of Urban Serving Universities, an estimated 66 million Americans live in areas where access to health care is limited (Danek and Borrayo 2012). Multiple factors contribute to this inequity, namely health care workforce shortages, lack of health insurance, and persistent health disparities.

In addition to expanding insurance coverage, closing the massive gaps in the US health care workforce is necessary to meet the health needs of the country’s rapidly growing population, especially with respect to the shortage of primary care in cities and urban areas (U.S. Department of Health and Human Services 2006). Health care providers from URM groups are more likely to practice in underserved areas than White practitioners, including White practitioners from lower-SES backgrounds (HRSA 2006; Marrast et al. 2014). Yet, URMs continue to be underrepresented in health care. Despite comprising one-third of the total US population, African-Americans, Hispanics, and Native Americans make up only 9 % of physicians, 7 % of dentists, and 6 % of registered nurses (Danek and Borrayo 2012). In response to this national shortage, urban universities are naturally positioned to transform and diversify the future health care workforce by increasing educational access and opportunity. They stand as a key vehicle for health care reform in addressing what Danek and Borrayo (2012) call “the failing K-12 education pipeline, particularly, for urban and minority students” (p. 6). Interventions of this kind are often referred to as “pipeline” programs. Danek and Borrayo (2012) further observed that interventions vary, ranging from outreach and information to mentoring, test preparation, enrichment, and scholarships. Other activities include research labs, classes, field trips, camps, internships, and social events.

The current study aimed to pilot test the impact of an informal summer learning program designed to expose high school students to a wide range of careers in health and medicine, while improving their knowledge and mastery in science and mathematics. Since 2009, the Careers in Health and Medical Professions Program (CHAMPS) has been delivered to rising sophomores, juniors, and seniors in a Summer Institute at Cleveland State University. The majority of youth are low-income URMs recruited from urban schools. Like many other pipeline programs, CHAMPS addresses the academic and non-academic factors impeding URM students’ likelihood of entering a post-secondary program in health science or medicine. These factors include a lack of knowledge about health care professions, academic underpreparation in science, lack of mentors and role models, financial barriers, and lack of family support (e.g., Agrawal et al. 2005; Rao and Flores 2007; Rashied-Henry et al. 2012). Before describing CHAMPS, we briefly review similar programs to give readers a comparative overview of the objectives, activities, impacts, costs, and limitations of interventions in science, technology, engineering, and mathematics plus health (STEM+H) across the USA.

University-Based STEM+H Outreach Programs

University outreach programs in STEM+H that address the career interests and academic readiness of URM students take on a wide variety of formats. While the curricular content and types of activities vary, there is general consensus that inquiry-based approaches to teaching STEM+H subjects are the most effective way to engage students in the learning process and motivate them to pursue a career in STEM+H occupations (Fields 2009; Gibson and Chase 2002; Markowitz 2004). A meta-analysis of 37 experimental and quasi-experimental studies testing the impact of inquiry-based science teaching bolsters this perspective, with an overall mean effect size of .50 (Furtak et al. 2012). In educational policy and research, “inquiry-based” is often equated with “project-based” learning; both terms view the student as an active learner solving ill-structured, open-ended problems rather than a passive learner trying to find the “correct” answer. As such, outreach programs often emphasize hands-on activities that engage youth in real-world problems through “authentic” learning experiences.

Informal STEM+H programs that occur during the summer at universities vary in terms of dosage and format. They typically last about 1–3 weeks, combining off-campus activities with on-campus activities; students may or may not continue across each summer. Despite their wide appeal, we concur with Miranda and Hermann’s (2010) criticism that little information about these programs is published and that even fewer programs have become sustainable and been supported by efficacy trials. In this review, we summarize a number of exemplary programs focusing on STEM+H content, health sciences, and health careers.

Stanford Medical Youth Science Program

This summer residential program offered at Stanford University has been running for over 25 years as a biomedical pipeline program (Winkleby 2007). Cohorts of about 24 low-income high school students are selected to participate each summer; they live on campus with ten undergraduate students for 5 weeks. During their residence, students actively participate in an anatomy lab, an apprenticeship/hospital internship, research experiences, and mentoring/guidance. According to Winkleby et al. (2009), 100 % of the 476 participants from 1988 to 2008 graduated from high school while 84 % graduated from a 4-year college. A much smaller proportion of the total sample, however, was either attending or had completed medical school (7.6 %) or a health profession graduate school (8.1 %). Winkleby et al. noted that their case report design could not rule out selection bias (e.g., high academic achievement and high motivation) and lacked a control group; indeed, the mean GPA of all students was 3.6 and even higher at 3.9 for math and/or science classes.

While the efforts undergirding the sustainability of this program are laudable and rare, it remains to be seen to what extent this type of approach can be replicated at other institutions whose context and resources differ from those of Stanford University. Running a 5-week residential program every year may not be cost-effective at other institutions with fewer resources. Not every university would be capable of providing the program free of charge, including tuition, room, and board, and all other direct costs associated with the program. (The total costs of the program were not provided.) Partitioning out the effects of living on a campus like Stanford from those effects caused by participating in the activities themselves remains a key empirical question.

Launch into Education About Pharmacology

As another inquiry-based science enrichment program, Launch into Education About Pharmacology (LEAP) has been delivered at Duke University since 2006 (Sikes and Schwartz-Bloom 2009). In brief, LEAP seeks to enhance content knowledge in biology and chemistry while fostering interest in science careers among URM high school students. Youth participate in a 3-week course in pharmacology. During the academic year (September to February), youth meet with mentors one Saturday per month and present their research projects at regional and state competitions in the spring. Similar to the program at Stanford, LEAP selects a total of 24 students each year. The cost of the program is approximately $30,000.

According to Sikes and Schwartz-Bloom (2009), youth improved knowledge of basic biology and chemistry principles, with an average gain of 25 percentage points on a 10-item test. In a retrospective survey, no significant gains in students’ interest in science or intentions to pursue science were found, which was attributed to selection bias. What sets LEAP apart from other pipeline programs is its use of a test that assessed the knowledge students were expected to acquire. Oftentimes, programs aim to improve academic knowledge/skills but do not use assessments that measure those outcomes (e.g., Bhattacharyya and Mead 2010; Gibson and Chase 2002; Markowitz 2004). Instead, surveys ask youth to rate how much they learned about or grew confident in academic concepts, alongside other measures assessing interest in or motivation to pursue STEM+H careers. These measures are usually a single-item question with no reported validity (e.g., Bischoff et al. 2008; Michalek and Johnson 2004; Padula et al. 2002; Phillips et al. 2012).

The Teen Medical Academy

At the University of Texas at San Antonio, the Teen Medical Academy (TMA) was developed in 2003 by the family medicine residency program (Oscos-Sanchez et al. 2008). In short, the TMA was created to increase the number of URM medical school applicants through a 9-month program promoting medical careers among high school students. Operating on one Saturday morning per month from September to May, the program consists of six medical workshops that focus on surgery, orthopedics, gastroenterology, cardiology, pulmonology, and obstetrics, as well as three teen health camps in which TMA students teach middle school students their newly learned skills. Youth interact with family medicine faculty, residents, and students. To evaluate the first 3 years of the program (2003–2006), Oscos-Sanchez et al. (2008) mailed a follow-up survey in 2006 to the 361 students who applied to the program during that time frame, resulting in a 71 % response rate. While the authors found that greater participation in TMA significantly predicted greater interest in medical and allied health careers (among a host of attitudinal outcomes), their study was not designed to test program impact, nor did it examine short- or long-term academic outcomes.

Center for Community Outreach Development Summer Science Institute

At the University of Alabama at Birmingham, this particular program (UAB CORD) was designed as a progressive 3-year laboratory-based summer science program to improve the academic performance of inner-city high school students, while modeling and stimulating their interest in what “real” science is like (Niemann et al. 2004). As such, students are exposed to progressively more complex concepts and laboratory skills as rising sophomores (BioTeach course), juniors (ChemTeach course), and seniors (Research Internship). For the first two summers, youth participate in a 6-week, 3-day-per-week laboratory and lecture course; they also participate in weekly mathematics and English workshops and are paid a $1,000 stipend upon satisfactory completion of the course requirements. In the third summer, youth participate in a 9-week, 5-day-per-week advanced seminar and laboratory course; they continue weekly mathematics workshops and are paid a $1,800 stipend upon successful completion of course requirements, culminating in a poster presentation assessed by a jury. An average of 60 students participate each year. The total cost for the program was not reported.

Unlike most university pipeline programs, the UAB CORD program was not explicitly designed to increase interest in STEM+H occupations or the number of URMs in a particular career. Thus, Niemann et al. (2004) did not evaluate its impact on career variables per se, although they did assess youths’ subjective perceptions of how much they believed they had learned about life skills (e.g., critical thinking) and job skills (e.g., calling in sick) after being in the program. A clear limitation of this method is that it was not based on observation or a test. Unfortunately, no assessment was employed at pretest and posttest for each course during each summer.

Summary and Synthesis

To our knowledge, the reviewed programs have not been replicated at other universities. In terms of similarities, CHAMPS uses project-based activities like the anatomy laboratory and research experience employed at Stanford and infuses content in biology and chemistry into the academic activities like the LEAP program at Duke. On the other hand, students do not live on campus during the summer, nor is it as long in duration compared with Stanford; the program content is also more diverse than LEAP’s focus on pharmacology, while deliberately infusing career exploration and work readiness skills into its activities. With respect to the TMA at the University of Texas, CHAMPS contains a Saturday morning component during the academic year as well, but not as frequently or delivered in the same format. CHAMPS also includes other health care occupations, not just those in medicine. With regard to the UAB CORD, CHAMPS is not as intensive or long in the summer, but it does rely on a similar approach to organize its summer activities by grade level. Like UAB CORD, it provides students with a stipend. CHAMPS differs, however, with respect to infusing career activities that target health care fields, in addition to its psychosocial component of mentoring.

CHAMPS can be further distinguished by the number of students it serves each summer and by its financial viability. Compared with the program at Stanford, the TMA, or even the UAB CORD program, CHAMPS is most likely less expensive, perhaps substantially so. It also aims to provide programming for 90 students per summer, whereas Stanford’s program and LEAP are each designed to serve 24 students per summer. Additionally, CHAMPS more directly and comprehensively evaluates its targeted career outcomes using quantitative and qualitative methods. The evaluation of the TMA program by Oscos-Sanchez et al. (2008) was the only one reviewed above that used career measures with reported psychometric properties. Similar to Sikes and Schwartz-Bloom (2009), academic outcomes in CHAMPS are evaluated using teacher-designed tests of academic knowledge or skills. In terms of content, activities, and goals, the published program most similar to CHAMPS is the Junior Fellows Program at the New York Academy of Medicine (Marcelin et al. 2004). Its evaluation study, however, was limited to a retrospective youth survey of measures assessing attitudes and opinions with no reported properties of reliability or validity.

The Careers in Health and Medical Professions (CHAMPS) Program

The CHAMPS program is an informal academic and career readiness program offered for two intensive weeks during the summer for 5 days each week and 6 h per day (the CHAMPS website is located at http://www.csuohio.edu/cehs/te/champs-careers-in-health-and-medical-professions). Youth who participate in the Summer Institute have the option of continuing to participate in the Saturday Academy during the academic year. As previously mentioned, CHAMPS introduces high school students to a wide range of careers in health and medicine and seeks to improve their knowledge in science and mathematics. In doing so, it is designed to transition a greater number of URM, low-income youth into preprofessional health career programs in college. The curriculum is anchored in multidisciplinary topics that aim to promote problem solving through hands-on research activities and group projects. These experiences are meant to transform students into “medical detectives,” challenging them to explore health and disease using laboratories, equipment, and campus facilities. Youth also participate in field trips to medical schools and hospitals, are exposed to professionals in health care, and receive mentoring from college students who are majoring in the health sciences.

In the sections that follow, the program’s components/activities, recruitment procedures, selection criteria, and personnel are described. Since the summer of 2009, CHAMPS has delivered five consecutive Summer Institutes. So far, it has been offered at no cost to students through the support of private foundations in the region. The maximum cost for running the Summer Institute is $60,000, depending on the number of students selected into the program (60–90 in total) and the stipend students receive for participating. In previous years, student stipends have ranged from $100 to $300; there are no stipulations for receiving payment except for successful completion of the program. Program activities are based on grade-level cohorts, with anywhere from 20 to 30 students selected in each cohort of sophomores, juniors, and seniors.

Program Components

In the 2-week Summer Institute, youth participate in both academic and non-academic activities from 9:00 a.m. to 3:30 p.m., Monday through Friday. All students participate in peer mentoring. In the sophomore cohort, students select a disease and provide a research presentation at the end of the 2 weeks. In the junior cohort, the research focuses on the interaction between a health-related case and all of the health professions that treat that case. As displayed in Table 1, there is some overlap in the titles of the sessions between the sophomore and junior cohorts (i.e., forensic science, library research, nursing and anatomy lab), although the content of the curriculum is unique. The titles and the content underlying the sessions for the senior cohort are clearly distinct. Each year, undergraduate students are recruited as CHAMPS peer mentors. A total of 6–9 peer mentors are recruited so that they can be divided into groups of three, with each group providing mentoring to the CHAMPS students during the Summer Institute.

Table 1 Sample schedule of session and activities for CHAMPS Summer Institute (2012 and 2013)

Lunch and snacks are provided, as well as transportation for field trips. At the end of the Summer Institute, students present their culminating research project/presentation. Since the inception of CHAMPS, students have had the option of continuing to participate across their sophomore, junior, and senior years. Participation during a previous grade, however, is not a requirement to participate during the summer of a later grade. In terms of the Saturday Academy, four sessions are provided for any CHAMPS student who wants to continue during the academic year, offered in September, November, February, and April. These sessions consist of guest speakers, laboratory demonstrations in CPR, research skills, ongoing interaction with peer mentors, and academic skills. Students can follow CHAMPS on Facebook and Twitter.

Recruitment and Application Procedures

CHAMPS participants are recruited from multiple school districts in Northeast Ohio. Because of its location in Cleveland and the URM student population it serves, the Cleveland Metropolitan School District (CMSD) provides the largest pool of students. In general, youth from over 15 local high school buildings participate each year in the Summer Institute. CMSD has an enrollment of about 41,000 students, consisting of 68 % Black/African-American and 14 % Hispanic/Latino, with 100 % designated as economically disadvantaged. Methods of recruitment consist of announcements made via presentations on school grounds, flyers, and e-mail blasts to principals and teachers. Interested students complete an application with their parent(s)/legal guardian, consisting of their contact information, academic history, transcripts, essays, teacher references, guidance counselor references, and a list of extracurricular activities. To be eligible, students must complete the ninth grade, have a cumulative GPA of 3.0 or higher, show evidence of aptitude for science and mathematics, and demonstrate an interest in and commitment toward a career in health care. On the day before the Summer Institute, CHAMPS students participate in a half-day orientation, accompanied by their parents and family members.

Program Staff

Over the past 5 years, CHAMPS has been directed by the same faculty member (third author) who teaches in science education programs at Cleveland State University. He has a teaching credential in Life Science and Chemistry and taught high school science for 7 years. This professional background is different from most faculty who develop and/or implement university pipeline programs; indeed, they tend to have very little, if any, experience teaching in K-12 education (Miranda and Hermann 2010). The academic sessions are taught by master teachers from high schools in CMSD and faculty at the university. To date, all master teachers are from biology departments. University faculty teach in the departments of biology, microbiology, forensics, occupational therapy, and mathematics. Graduate and undergraduate students are recruited each year to provide administrative support as well as peer group mentoring. For the past 3 years, CHAMPS has been independently evaluated by another unit on campus.

Main Hypotheses

Based on our review of the pertinent literature and the goals of CHAMPS, the following hypotheses were tested in order to examine its impact and initial promise for an efficacy trial:

H 1

Participants will evidence a significant gain in knowledge of the STEM+H academic content and skills taught in the program.

H 2

Participants will evidence a significant gain in knowledge of health care occupations taught in the program.

In addition, we explored how students experienced or perceived each individual session, the program as a whole, and the peer mentoring activities using a mixed methods approach to evaluation. Moreover, a select number of sessions were observed using an instructional rating instrument to investigate the role of such methods in future evaluations. This study evaluates the impact of CHAMPS over a period of 2 years.

Method

Participants

During the summers of 2012 and 2013, CHAMPS served a total of 155 students (87 in 2012 and 68 in 2013). Twenty-four (35.3 %) of the 2013 cohort were returning students. This return rate is an increase from 2012, when 20 (23 %) of the students were returning. In 2012, five sophomores, six seniors, and seven juniors returned; grade-level data were missing for the other two returning students. In 2013, there were 12 returning juniors and 12 returning seniors. Aggregated across 2012 and 2013, the largest group was African-American (48.4 %); the remaining racial/ethnic makeup of the sample was Caucasian/White (8.4 %), Asian (7.7 %), Hispanic/Latino (7.1 %), and other/not reported (28.4 %). The gender makeup of the aggregate sample was 71.0 % female and 23.2 % male (5.8 % of data on gender were missing).
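The return rates reported above follow directly from the enrollment counts. As a quick arithmetic check (a minimal sketch; the helper name `percent` is ours and purely illustrative):

```python
# Counts taken from the text: total enrollment and returning students per year.
enrolled = {2012: 87, 2013: 68}
returning = {2012: 20, 2013: 24}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Return rate per year, in percent.
return_rates = {year: percent(returning[year], enrolled[year]) for year in enrolled}

print(sum(enrolled.values()))  # 155
print(return_rates)            # {2012: 23.0, 2013: 35.3}
```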

In 2012, rising juniors (43.7 %) made up the largest group; rising seniors (24 %) and rising sophomores (25.3 %) represented the rest of the sample, with 3.4 % missing. In 2013, the grade-level distribution was more balanced, with juniors (35.3 %), seniors (33.8 %), and sophomores (30.9 %) fairly evenly distributed. Seven students in the aggregate sample had a GPA below 3.0 but were nonetheless selected to participate. The average cumulative weighted GPA of the total sample was 3.87 (SD = .62), ranging from a minimum of 1.82 to a maximum of 5.0. Data on GPA were not available for 20.7 % of the sample. Across 2012 and 2013, participants were recruited from 18 different high schools; four of the high schools represented 63.8 % of the aggregate sample.

Procedure

The research proposal was approved by the university human subjects review board. Before orientation day, all parents/legal guardians and students were sent an informed consent form via electronic mail. At the orientation, youth who returned an informed consent form signed by their parent(s)/legal guardian were able to participate in the study. Those who did not receive an informed consent form via e-mail were provided one on orientation day with their parent(s) or legal guardian. Youth completed their grade-level test before the program and at the end (i.e., the day before they presented their research/presentations). They also completed a questionnaire at the end of each session. At the end of the program in 2012, a subgroup of randomly selected students was invited to participate in a focus group or interview. Before completing measures, youth read and signed an assent form. Six gift cards were used as incentives to return consent forms; six students were randomly selected to receive them at the end of the program.

Measures

Assessments

All participants completed an assessment (test) of academic and/or career knowledge designed by the main instructor (master teacher) for each cohort. These assessments were designed to be consistent with the facts, concepts, and principles intended to be taught.

Sophomores

For the 10th grade students, the test consisted of a total of 15 items (1 point each). Each item was based on a multiple-choice format. Items 1 through 7, as well as 11 through 15, assessed science or medical concepts, whereas items 8 through 10 assessed career knowledge. Hence, academic knowledge had a maximum total of 12 points and career knowledge had a maximum total of 3 points. A sample item reads, “Any change, other than injury, that disrupts the normal functions of the body is a _____ (a) disease, (b) vector, (c) bacteria, or (d) infection.” In this test, a high percentage of students missed one item, which was removed in calculating the total score in 2012. In 2013, the test was altered to make the wording of certain items easier to understand. In addition, one academic knowledge question was added, resulting in 13 items; career knowledge items were increased to 7 items in order to improve the sensitivity of the instrument to change. The new sophomore test thus consisted of 20 items.

Juniors

For the 11th grade students (rising juniors), the test consisted of a total of 20 items (1 point each). Each item either had a multiple-choice or true/false answer format. Items 1 through 10 assessed science and medical concepts, whereas items 11 through 20 assessed career knowledge. Hence, academic knowledge contained a maximum total of 10 points while career knowledge had a maximum of 10 points. A sample item reads, “A stroke is the third leading cause of death in the United States—true or false?” The junior test was also altered in 2013. The career items were increased to 14 to make the assessment more sensitive to measuring that domain. Two of them were removed due to concerns about their face validity, leaving the final number of items at 12. Academic items were reduced to six in 2013. Like the sophomore test, the same rationale and procedures were followed for modifications, resulting in a total of 18 items.

Seniors

For the 12th grade students (rising seniors), the test consisted of open-ended questions that required short-answer responses. Each item had a grading rubric with a point system based on predetermined criteria. A maximum of 35 points was available based on the following scoring categories: (a) basic/limited (<25 points), (b) proficient (25–29 points), and (c) advanced (30–35 points). Sample items include “What are microbes?” and “Define and discuss 3 techniques scientists use in the laboratory.” In contrast to sophomores and juniors, this test did not measure career knowledge. The senior test in 2013 was the same test used in 2012. The intra-class correlation coefficient, using a mixed effects model of consistency, was strong at pretest (df = 21, ICC = .75, p < .01) and posttest (df = 21, ICC = .96, p < .01) in 2012 for the total scores. Thus, the scoring rubric showed high levels of inter-rater reliability between two scorers.
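For readers unfamiliar with the statistic, a consistency-type intra-class correlation for two scorers can be sketched as below. This is a minimal illustration, not the authors’ analysis: the text does not say whether the single-measures or average-measures form was reported, so the single-measures form, often written ICC(3,1), is assumed here. The function name and the scores are hypothetical.

```python
def icc_consistency(scores):
    """Single-measures consistency ICC (two-way mixed model), ICC(3,1).

    scores: list of rows, one per student; each row holds one score per rater.
    """
    n = len(scores)       # number of students
    k = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: students (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical rubric totals from two scorers (NOT data from the study).
# Scorer B is always 2 points higher than scorer A, so consistency is perfect.
offset_scores = [[10, 12], [20, 22], [30, 32], [15, 17]]
icc_offset = icc_consistency(offset_scores)
print(round(icc_offset, 3))
```

Because a consistency ICC ignores a constant difference between raters, the offset example yields a coefficient of essentially 1 even though the two scorers never assign identical totals.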

Session Questionnaires

At the conclusion of each session, students completed a brief 3-item questionnaire in a Likert scale format designed to assess their perceptions of: (a) the presenter’s/instructor’s level of engagement, (b) interest in the session, and (c) usefulness of the information. In 2012, students rated each session on a scale of 1 (not at all) to 3 (a great extent or very). These items were modified in 2013, when the scale ranged from 1 (not at all) to 4 (highly). In each summer, the mean of the items was computed, with higher values indicating higher degrees of satisfaction/favorability. Student ratings on these measures were anonymous.

Focus Groups and Interviews

In 2012 only, eight students were randomly selected to participate in a 10–15 min interview designed to assess their experience in the program, what they would want to improve about it, what sessions they liked and disliked, and how it influenced their attitudes about future careers in health care. A semi-structured protocol was used for each interview, with leeway for in-depth probing by the interviewer. A different group of eight youth was randomly selected to participate in a focus group designed to assess the same experiences and perceptions, using a similar semi-structured protocol. Qualitative data were not collected in 2013 due to a shortage of staff.

Classroom Observations

In 2012, a sample of sessions was observed by two raters using the Reformed Teacher Observation Protocol (RTOP; Piburn et al. 2000). The RTOP is designed to assess the extent to which mathematics or science instruction meets “reformed” teaching standards. Observers’ judgments are rated on 25 items, ranging from 0 (not observed) to 4 (very descriptive). The RTOP contains five sections: (a) lesson design and implementation, (b) content of teaching, (c) procedural knowledge, (d) classroom culture, and (e) student–teacher relationships. Scores from each section are summed to obtain a total score, with scores ≥65 indicating reformed levels of teaching. A sample item reads, “Connections with other content disciplines and/or real-world phenomena were explored and valued.” Both observers on the evaluation team were trained by a faculty member in science education at the university with expertise in scoring the RTOP and inquiry-based teaching. The RTOP was not used, nor were observations conducted, in 2013 due to shortage of available staff.
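The scoring rule described above (25 items rated 0–4, summed, with totals of 65 or more indicating reformed teaching) can be sketched as follows. The function names and the example ratings are hypothetical and are not taken from the RTOP manual or from the study data.

```python
RTOP_ITEMS = 25         # number of items, as described in the text
RTOP_MAX_RATING = 4     # each item is rated from 0 to 4
REFORMED_CUTOFF = 65    # totals >= 65 indicate reformed levels of teaching


def rtop_total(ratings):
    """Sum the 25 item ratings (each 0-4) into a total score (0-100)."""
    if len(ratings) != RTOP_ITEMS:
        raise ValueError("expected 25 item ratings")
    if any(r < 0 or r > RTOP_MAX_RATING for r in ratings):
        raise ValueError("each rating must be between 0 and 4")
    return sum(ratings)


def is_reformed(total):
    """Apply the cutoff stated in the text."""
    return total >= REFORMED_CUTOFF


# Hypothetical observation in which every item is rated 3.
example_total = rtop_total([3] * 25)
print(example_total, is_reformed(example_total))  # 75 True
```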

Results

Main Analyses of Assessments and Session Ratings

Table 2 summarizes a series of paired samples t tests on assessments, as measured by the average number of points scored. Results indicate significant gains for each grade cohort across both years on academic knowledge; in terms of career knowledge, the 10th grade and 11th grade cohorts showed significant gains in 2013 only. The 2012 mean scores in academic knowledge for sophomores (n = 24) increased by 1.71 points; in 2013, the mean scores increased by 2.07 points, but among a much smaller cohort (n = 13). In 2012, the mean test scores in academic knowledge for juniors increased by 1.04 points (n = 26) and by 1.90 points in 2013 (n = 20). In both years, mean test scores among the seniors increased more substantially with regard to effect size than sophomores and juniors in academic knowledge, with an average increase of 15.72 points in 2012 (n = 22) and 16.22 points in 2013 (n = 19).
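For reference, the paired samples t statistic used in these comparisons is the mean pretest-to-posttest gain divided by its standard error. The sketch below uses only the Python standard library; the function name and the scores are hypothetical and do not reproduce the values in Table 2.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (mean gain, t statistic, degrees of freedom) for paired scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    gains = [after - before for before, after in zip(pre, post)]
    mean_gain = statistics.mean(gains)
    sd_gain = statistics.stdev(gains)  # sample SD (n - 1); must be nonzero
    t = mean_gain / (sd_gain / math.sqrt(len(gains)))
    return mean_gain, t, len(gains) - 1


# Hypothetical pretest/posttest points for four students (NOT study data).
pre_scores = [1, 2, 3, 4]
post_scores = [2, 4, 5, 7]
mean_gain, t_stat, df = paired_t(pre_scores, post_scores)
print(f"mean gain = {mean_gain:.2f}, t({df}) = {t_stat:.2f}")
# mean gain = 2.00, t(3) = 4.90
```

In practice a statistics package would also return the p value; this sketch stops at the test statistic to keep the arithmetic visible.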

Table 2 Paired samples t tests of academic and career knowledge assessments (2012 and 2013)

Clearly, the seniors made the most dramatic gains in their academic knowledge, scoring far below the upper bound of the basic/limited level (<25 points) at pretest and at the proficient level at posttest; from a grade-equivalent perspective, this corresponds to a move from an average pretest grade of F to a posttest grade of C. The 2013 junior cohort, on average, made a substantial gain from a grade of F to a grade of B; the 2012 junior cohort made smaller gains, moving from an F to a posttest grade of D, although the 2012 test had four more academic items than the 2013 test. As for sophomores, scores moved from a pretest grade of F to a posttest grade of D+ in 2012, and from a D to a B− in 2013. In terms of career knowledge, the sophomore assessment in 2012 was limited to 3 items, so it is not surprising that findings were nonsignificant. By contrast, the 2013 career knowledge test had over twice as many items; even so, the mean was still equivalent to an F at posttest. In 2012, the pretest mean of career knowledge for juniors was an F, increasing only slightly at posttest. In 2013, the new cohort of juniors had a relatively high level of baseline career knowledge before participating (B−), which significantly increased to a B+. For both sophomores and juniors, because the tests were not the same across years, comparisons between cohorts cannot be made.

With respect to questionnaire ratings of CHAMPS sessions, a total of 25 sessions were rated by students in 2012 (N = 828 responses; 58.9 %), and a total of 31 sessions were rated in 2013 (N = 577 responses; 41.1 %). After aggregating responses to all sessions, overall means of 2.54 (SD = .54) in 2012 and 3.24 (SD = .76) in 2013 were found; thus, youth tended to rate the 2012 and 2013 sessions as favorable to highly favorable. A series of one-way ANOVAs was performed to examine grade-level differences. Aggregated across both years, sophomores made up 47.5 % of the total session responses, whereas juniors and seniors represented 34 and 17.8 %, respectively. ANOVA tests revealed no significant differences as a function of grade in 2012, F(2, 808) = 2.75, ns, or in 2013, F(2, 571) = .42, ns. In 2012, the three items showed significant, moderate-to-high levels of association, from r = .52 to .72, p < .01. In 2013, a similar range was evidenced, from r = .61 to .73, p < .01.
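
To make the structure of the grade-level comparison concrete, the following is a minimal sketch of a one-way ANOVA F statistic computed from first principles. The ratings are hypothetical placeholders, not CHAMPS data, and the function name `one_way_anova` is an assumption made for illustration.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of rating lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-groups sum of squares: ratings around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical session ratings grouped by grade level (NOT study data).
ratings_by_grade = {
    "sophomores": [1, 2, 3],
    "juniors": [2, 3, 4],
    "seniors": [3, 4, 5],
}
f_stat, df_b, df_w = one_way_anova(list(ratings_by_grade.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With three grade groups, the between-groups degrees of freedom equal 2, which matches the first value in the reported F statistics; the second value is the number of responses entering the analysis minus the number of groups.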

When the aggregate mean rating of each session was examined, the vast majority of sessions in 2012 were rated in a favorable manner. The five highest ratings belonged to Gram staining (M = 2.86; SD = .38), aseptic techniques (M = 2.83; SD = .30), physician assistant (M = 2.83; SD = .26), nursing skills (M = 2.81; SD = .35), and occupational therapy (M = 2.76; SD = .41). Conversely, the five lowest session ratings belonged to library research (M = 1.83; SD = .56), epidemiology (M = 1.93; SD = .56), autoclave (M = 2.26; SD = .33), pharmacy (M = 2.34; SD = .49), and team building (M = 2.35; SD = .41). In 2013, the vast majority of sessions were also rated in a positive manner. The five highest ratings belonged to physician assistant (M = 3.79; SD = .33), sexually transmitted disease laboratory (M = 3.79; SD = .33), professionalism (M = 3.71; SD = .44), nursing practices for juniors (M = 3.67; SD = .46), and orthopedic surgeon (M = 3.67; SD = .47). The five lowest session ratings were identified as Health Professions Affinity Community or HPAC (M = 1.84; SD = .89), nursing and mathematics (M = 2.12; SD = 1.01), career planning (M = 2.59; SD = .81), Intro to HPAC (M = 2.65; SD = .85), and heart rate, blood pressure, and exercise (M = 2.69; SD = .58).

Secondary Analyses of Qualitative Data

To complement, contradict, and/or expand upon the quantitative results, data gathered from interviews, focus groups, and a sample of classroom observations were analyzed using qualitative methods. In 2012, four sessions were observed using the RTOP. Content analyses were performed to code the data from the interviews and focus groups for dominant or recurring themes. Three members of the research evaluation team independently coded the data. These codes were then audited by an external member to ensure their trustworthiness and resolve discrepancies in the coding and the interpretation of their meanings. A total of eight CHAMPS students were interviewed in 2012. Due to logistical issues concerning availability of staff, only four out of the 25 sessions were observed: (a) epidemiology, (b) anatomy laboratory, (c) health and safety, and (d) biology of disease.

As summarized in Table 3, a predominant theme that emerged was career exploration and occupational knowledge. As evidenced in the illustrative quotes, youth tended to report discovering occupations they never knew existed, thus expanding their range of career choices within health careers. They also reported reconsidering specific occupations they had originally chosen due to their participation in the program. Similarly, another theme was enjoyment of hands-on experiential activities that offered opportunities for youth to interact with health care professionals, engage in authentic research labs, and master concrete research skills. In terms of improvements to the program, students focused on logistical issues of scheduling, in which the organization of the days and structure of the program’s agenda were a source of dissatisfaction.

Table 3 Themes and illustrative quotes from CHAMPS participants in 2012 (N = 8)

The focus group results were largely consistent with the results gleaned from interviews. Specifically, youth experienced the hands-on activities to be the best methods of learning in the program. For example, the focus group liked the experience of seeing cadavers, listening to a doctor talk about surgery, witnessing steps that go into a scientific experiment, and going on a field trip. Students further reported learning new and interesting ideas such as seeing how spices have medicinal purposes or learning tips on writing. Like the interview findings, the focus group experienced problems with the organization and scheduling of the program, such as not having a place to sit until teachers arrived, lack of communication, or not knowing the dress code in certain activities or field trips.

With respect to the classroom observations, results indicated that only one instructor was “reformed.” More specifically, the instructor in epidemiology had RTOP scores of 70 and 79. On the other hand, the instructor in anatomy had scores of 45 and 49; the instructor in health and safety had scores of 40 and 45; and, finally, the instructor for biology of disease received scores of 36 and 41. As we can see, the discrepancies between raters were not substantial, nor were they so great that they would result in different conclusions. Interestingly, while epidemiology had one of the lowest mean aggregate session ratings in 2012, the instructor was rated as highly reformed. By contrast, while the RTOP ratings for biology of disease were low, session ratings indicated that students were satisfied (M = 2.69; SD = .34).
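
The comparison between raters can be sketched as follows, using the RTOP scores reported above. Two elements are assumptions made for illustration and are not stated in this section: the cutoff of 50 used to label an instructor as "reformed," and the order in which the two raters' scores are listed. The helper name `summarize` is likewise hypothetical.

```python
# RTOP scores from the two raters, as reported in the text above.
# ASSUMPTION: the order of the two raters within each pair is not known.
rtop_scores = {
    "epidemiology": (70, 79),
    "anatomy laboratory": (45, 49),
    "health and safety": (40, 45),
    "biology of disease": (36, 41),
}

# ASSUMPTION: a cutoff of 50 is used here only for illustration; the
# threshold actually applied in the study is not given in this section.
REFORMED_CUTOFF = 50


def summarize(scores, cutoff):
    """Summarize rater agreement and classification for each session."""
    summary = {}
    for session, (rater_a, rater_b) in scores.items():
        summary[session] = {
            "mean": (rater_a + rater_b) / 2,
            "gap": abs(rater_a - rater_b),
            # Raters agree when both scores fall on the same side of the cutoff.
            "raters_agree": (rater_a > cutoff) == (rater_b > cutoff),
            "reformed": rater_a > cutoff and rater_b > cutoff,
        }
    return summary


rtop_summary = summarize(rtop_scores, REFORMED_CUTOFF)
for session, row in rtop_summary.items():
    print(session, row)
```

Under this assumed cutoff, both raters place every instructor on the same side of the threshold, which is one way to express the point that the rater discrepancies would not change the conclusions.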

Summary of Results

The two main hypotheses of the study were generally confirmed by the results. The findings were mixed, however, for career knowledge: sophomores and juniors showed no significant change in 2012, although both cohorts showed significant gains in 2013. The most compelling improvements in academic knowledge occurred among the seniors in both years, although substantial effect sizes were also found for improvement in academic knowledge among juniors in 2013 and sophomores in both years. Collectively, students experienced sessions as favorable, regardless of their grade level or year of the study.

Further analysis of individual session ratings suggests that activities that focused on exposure to real people in health occupations and that used hands-on approaches to learning academic concepts were experienced as the most engaging, useful, and interesting. The amount of hands-on activities per session, however, compared with other types of activities youth participated in, was not specifically measured. We can only infer this from the topical nature of the session itself. On the other hand, sessions that were geared toward a specific program (HPAC) and some of the career readiness skills (team building, library research, and career planning) were the least helpful. Several of the least preferred sessions, however, included topics on a health care occupation and academic content, suggesting that not all career-specific sessions were experienced with equal levels of satisfaction, and that satisfaction may have depended on other factors such as the quality of instruction. At the same time, the results based on classroom observations of four sessions in 2012 suggest that the quality of instruction alone may not be sufficient to determine how useful students find each session, as the findings seemed to contradict the corresponding session ratings.

The qualitative results are inconsistent with the quantitative results as they relate to improvements in career knowledge. The triangulation of focus group and interview data suggests that some students may have learned more about their future career paths and choices than what the assessments measured. These findings also underscore the importance of hands-on activities in inquiry-based instruction. As in other university pipeline programs, the organization around scheduling was a consistent concern voiced by the students. This is not surprising given that a significant amount of coordination is required each year in terms of planning, communicating with master teachers and university faculty, recruiting students, and managing program staff.

Discussion

CHAMPS has been in operation since 2009. This study is the first empirical attempt to bring its potential impact on targeted outcomes to light over the course of two summers, serving as a basis for future modifications to its design, delivery, and evaluation. Unlike evaluations of many pipeline programs, our study had a unique strength: the use of objective assessments to measure change in academic and career outcomes, rather than relying on subjective survey items. Overall, the results offer promising evidence for the program’s impact. The dramatic gains made by the senior cohort, in particular, lend credence to the value of inquiry-based approaches to informal STEM+H learning, as their program was based on an inquiry-driven curriculum focusing on the skills and processes of the scientific method that relied on a host of hands-on activities. In contrast to their peers, seniors were assessed by a rubric measuring critical thinking, writing skills, and scientific rationale. This is not to say, though, that the program did not impact sophomores and juniors.

An important issue for the evaluation of CHAMPS has been assessing the relationship between quality programming and assessment of program impact. So far, the program director and evaluation staff have not planned far enough in advance of the summer institutes so that fidelity measures can be designed; this shortcoming can in part be attributed to the lack of a manual or curriculum guide. In the absence of measuring fidelity, it is difficult to conjecture to what extent the quantity and quality of implementation are associated with variation in program outcomes; this key question is made more complicated by the unexpected discrepancies between the RTOP scores and their corresponding session ratings. Although four observations cannot be used as a basis for drawing inferences that represent the program, the results point toward the notion that these session ratings may have limited value; alternatively, the RTOP may not measure all relevant features of session activities at a fine enough level of analysis. Designing a multi-method approach to fidelity, one that is not typically undertaken in evaluations of university pipeline programs, is essential for teasing out these sorts of issues.

Another critical issue that warrants attention is the alignment between what is taught in CHAMPS and the design of assessments that comprehensively measure that content in a reliable and valid manner. We believe this may be especially relevant to explaining the lack of change in the career outcomes for two of the cohorts and to making future modifications to the career knowledge measures. While increasing the career items in 2013 may have led to a better assessment because it adequately covered all of the information that was imparted, the results still suggest that the students did not learn this material well, especially the sophomores. There are several explanations for this pattern. First, the career items may not have been well aligned with the actual content taught. Second, the items may have been properly aligned, but the content may not have been taught with fidelity. Third, and perhaps most interesting, is the notion that change in psychological outcomes like career decision making, career choice expansion/reconsideration, goal setting, and career planning is better captured by other methods that are sensitive to such changes, as suggested by the qualitative data. Though the interviews and focus groups are not representative of the sample, they do suggest that measuring change in these outcomes through qualitative methods may be a viable complement to assessments, as with other types of career-based programs for youth (Perry et al. 2007). Furthermore, using instruments designed by researchers in vocational psychology that assess vocational identities and different modes of career exploration through self-report measures may also provide more accurate information sensitive to change (Porfeli et al. 2011).

The limitations of the design also warrant attention. Given that the CHAMPS summer institute lasts 2 weeks, developmental maturation and history do not pose major threats to internal validity, but other threats in a treatment-only group design were not controlled, namely selection bias. Indeed, selection bias is arguably the most difficult threat to overcome when drawing causal inferences in university pipeline programs like CHAMPS, because randomization is often not feasible and finding a comparison group (either based on applicants who were not selected or students who were not recruited) poses its own inherent problems and still cannot control for unobservable characteristics (e.g., academic motivation and aspirations) that likely influence long-term outcomes, such as choice of major and enrollment in college. The next phase of program evaluation for CHAMPS intends to track previous cohorts of students who have graduated in comparison with another group of similar students matched on various characteristics. Despite the limitations of design and method, the potential efficacy of CHAMPS has a foundation of evidence to build upon. Among similar interventions we reviewed that remain equally limited (if not more limited) in terms of establishing long-term causal impact, CHAMPS stands as a viable program that, over time, can be adapted by other universities at a relatively low cost.

As we move forward, plans are being made to examine the assessment scores in relation to high school science grades and the ACT Science Test, while designing a comprehensive approach to assessing fidelity of implementation across all sessions. Although the Saturday Academy was not the focus of our study, this will be another piece of the program to evaluate in terms of understanding its added value to the summer component, which is the mainstay of the program. With each new year, we seek to continue to revise CHAMPS and create a manual that can be disseminated to the general public. Once CHAMPS has undergone these revisions, produced a replicable manual, and addressed its current limitations with assessments, this program will be capable of being extended to other university-based STEM+H outreach programs that also seek to design interventions that help prepare and excite the minds of young people for entry into health care occupations.