Introduction

Time to completion of a baccalaureate degree has increased markedly over the past two decades (Bound et al. 2010a), and has caught the attention of policy makers, educators and researchers in recent years (Garibaldi et al. 2012). Access to a college education has been steadily expanded to all population groups in past decades (Snyder et al. 2014), but apparently the ultimate goal of a college education is not access but an earned degree. Delay in achieving that goal has made educators and the public start to regard timely degree completion as a critical measure of accountability for colleges and universities (Lee 2010).

In response to these public requests, states are developing policies to boost their colleges and universities’ 6-year degree completion rates. Some states, such as Florida, Louisiana, Ohio, and South Carolina, have linked 6-year graduation rates to performance funding initiatives (Knight and Arnold 2000; National Conference of State Legislatures 2014), and California is also considering similar measures in state funding of its public universities (Megerian and Gordon 2013). At the institution level, many universities and colleges have started to implement intervention programs that encourage and help students to graduate on time.

However, our understanding is still limited regarding the factors that affect time to degree (Adelman 2006; DesJardins et al. 2002c). Most reports in the literature are descriptive and cross-sectional with a focus on student demographics, academic preparation, and first-term performance as predictors of how long a student stays in school before graduation. Findings also tend to be inconclusive. For example, Knight and Arnold (2000) attributed lengthened time to degree to the institution, whereas others attributed it to students’ individual decisions (Adelman 2006).

The lack of understanding is primarily due to two barriers: data availability and methodology (Adelman 2006; Singer and Willett 1991). It is difficult to track students’ progress over time and there is no agreed-upon methodology to model the effects of students’ academic progress on time to degree.

In this study, we examine time to degree and attempt to overcome both barriers. Particularly, this study employs longitudinal data and applies event history analysis (EHA) to answer this question: beyond students’ demographic and college-preparation characteristics, what aspects of their decisions and academic performance affect time to graduation with a baccalaureate degree?

Academic Progress and Time to Degree

Time to degree can be understood in at least two different ways, as currently reported in the literature. One is elapsed time (calendar years) from the initial enrollment in a college to the time a degree is awarded. This definition does not distinguish enrolled time from stop-out time. Nationally, of all first-time recipients of bachelor’s degrees in 2007–08, only 44.2 % graduated within 4 years. Twenty-three percent of the degree recipients took longer than 6 years, and 11.5 % took over 10 years (Woo et al. 2012). Apparently, stopout accounted for a large proportion of the elapsed time from entry to completion for students who took more than 6 years to earn their degree.

When elapsed time is defined as time to degree, progress to graduation is viewed in an open system, in which a university functions as one of the interacting components that shape a student’s life (Berger and Milem 2000). To fully understand time to degree in this context, influences external to higher education institutions must be considered, in addition to intra-institutional factors and student characteristics (Titus 2006).

Research on student retention has suggested many factors originating from the external environment affect student persistence, such as availability of financial aid, health issues, family needs and jobs (Nora et al. 2005; Berger 2000; Bean 1980). Because universities are open to their larger environment, when alternatives to staying in school become more attractive, students may choose to stop out or leave permanently. Anticipated incentives to graduation are important variables in understanding student attrition and stop out. With this assumption, institutions that are perceived as more selective will be better able to retain students because lost opportunities to earn a degree are more costly (Rusbult et al. 1998). In general, the more satisfied a student is, the more investment he or she has committed to his or her enrolled institution, and the fewer the worthy alternatives, the more likely the student will persist to graduation (Chen 2012; Friedman and Mandel 2011; Harvey and Wenzel 2001).

Elapsed time to degree is therefore affected by two groups of factors, one external to the university and the other pertaining to student college experiences and the interaction between the institution and the student. These two groups of factors are related but are distinctively different constructs. Both need to be fully understood and included in modeling time to degree, and both are conditioned by characteristics students bring to college, such as pre-college academic preparation and family socio-economic background.

The second definition of time to degree only counts enrolled terms in school (semester or quarter), with a focus on student use of enrolled time to determine what contributed to the term-by-term progress to degree completion (Knight 2004; Lam 1999; Hoffer and Welch 2006). Variations in time to degree by this definition include types and timing of remediation, choices of majors and change of majors, course withdrawals and repeats, units enrolled and earned, to name a few. This definition excludes external variables and assumes students have made a decision of persistence in school, and therefore is a subset of the comprehensive study of elapsed time to degree.

Several theoretical frameworks have guided research on time to degree, including Interactionalist Theory (Braxton and Hirschy 2005; Tinto 1975), Geometric Model (Swail 2003), Student Attrition Model (DesJardins et al. 2002–2003), and Investment Model (Rusbult et al. 1998).

When students are enrolled, the interactionalist perspective and the Geometric Model are most relevant as a theoretical guide (Braxton and Hirschy 2005; Tinto 1975). The former posits that two dimensions of commitment and their interaction determine time to degree: student commitment and institutional commitment, conditioned by student background characteristics in a dynamic process of interactions between students and the academic and social systems of the institution. Student determination to obtain a degree and their commitment to the institution are negatively associated with time to degree. The Geometric Model, on the other hand, examines the relationship between institutional practices and the academic and social needs of students (Swail 2003). Institutions that work proactively to help students succeed and meet their needs will be able to shorten time to degree. The reciprocity or interaction between an institution and its students is vital in affecting graduations.

These two theories may be limited because they exclude external factors that influence retention and persistence, but once we confine analyses to enrolled terms, the theories highlight an institution’s awareness and determination in meeting student needs, and help us understand how interventions may address deficiencies in promoting graduation. For example, total units required for graduation vary across disciplines in the same institution but are similar in the same disciplines across institutions. In a national survey, Pitter et al. (1996) found program requirements ranging from 122 to 124 credits in social sciences, foreign languages, psychology, mathematics and protective services, to 130–142 credits in engineering, architecture, and health professions. A careful examination of the total required units and an improvement to curriculum that stresses efficiency can be a powerful policy to speed up graduation, everything else being equal. In addition, services provided by an institution that improve student course success, such as redesigned gate-keeping courses and supplemental instruction, can also improve student success and lead to greater commitment to the institution (Roksa et al. 2009; International Center for Supplemental Instruction 2014).

A Conceptual Model

In this study, we incorporate various theoretical perspectives on time to degree, and propose a conceptual model to explain the process of degree completion. Our model integrates important aspects of students’ major choice, enrollment, and academic performance into a comprehensive conceptual framework, after controlling for pre-college characteristics. It is hypothesized that all these aspects affect students’ graduation and time to degree. In particular, we attempt to understand what aspects of student decisions (major choice and enrollment intensity) and academic performance affect time to graduation beyond students’ demographic background and academic preparation at entry (See Fig. 1).

Fig. 1 A conceptual framework

In the model, pre-college characteristics are time-independent variables while major choices, enrollment intensity and academic performance represent time-dependent variables that change from term to term. The time-independent variables include student demographic characteristics and pre-college academic preparation. They are hypothesized to influence student decisions and performance in college, and could also directly affect graduation and time to degree.

Student decisions have two aspects: choosing a major to pursue a career and deciding on enrollment intensity. These decisions and academic performance are time-dependent and may change from term to term. Students make a decision each term to continue or discontinue enrollment, to determine enrollment intensity (a variable heavily conditioned by external factors), and to keep or change their major. Finally, students must perform adequately to earn the enrolled units.

Student Demographic Background

Research has shown that student demographic background is associated with time to degree. First-generation college students, male students, ethnic minority students and low-income students tend to take longer to graduate than their counterparts (Hall 1999; DesJardins et al. 2002a; Ishitani 2003; Chen 2005; Tinto and Pusser 2006; Bound et al. 2010b; Umbricht 2012). These characteristics often overlap, magnifying their negative impact on graduation.

Low-income, minority, and first-generation students likely lack specific types of “college knowledge,” and often do not understand the steps necessary to prepare for higher education, which include knowing how to finance a college education, complete basic admissions procedures, and make connections between career goals and educational requirements (Vargas 2004). They are also less likely to receive support from their families (Thayer and Paul 2000). Take family income, for example: students whose annual family income is below $6000 are about half as likely to graduate in 4 years as those whose family income is $60,000 or above (DesJardins et al. 2002–2003). Ethnic minorities, especially underrepresented groups, have lower graduation rates and take longer to earn a degree if they do graduate (Gross et al. 2012; Pike 2013).

Academic Preparation

Students with adequate preparation for college are more likely to graduate on time than those who are not as prepared (Pike 2013). Preparation includes maintaining a good GPA, taking Advanced Placement courses prior to attending college, and staying highly engaged in high school (DesJardins et al. 2002a; DeAngelo et al. 2011). A student’s preparation for post-secondary education is highly associated with his or her academic performance in college, and high school GPA is shown to have a strong positive association with college GPA (Zhu 2004).

On the other hand, students needing remediation lack the basic academic skills required for success in college (Scott-Clayton et al. 2014). Research has shown that as students take an average of 2.6 courses to complete remediation, their time to degree completion is significantly extended (National Center for Education Statistics 2012). According to Complete College America (Jones et al. 2012), “too many entering freshmen need remediation,” with “51.7 % of those entering a 2-year college enrolled in remediation and 19.9 % of those entering a 4-year college enrolled in remediation.” In addition, under-represented ethnic minority students and low-income students have higher proportions needing remediation, compounding the effect of demographic background with academic preparation.

Major Choice

Students who declare a major at entry and keep it to graduation take less time to earn a degree than those who hesitate in choosing a major and those who change majors, other things being equal. The latter groups lack a clear roadmap to degree completion, caused by many factors including career uncertainty and under-par academic performance (Leppel 2001; Arcidiacono et al. 2012; John et al. 2004; Allen and Robbins 2010; Sklar 2014). Based on Holland (1997)’s theory of interest-environment congruence, students are more satisfied and have higher GPAs when their major links their personal interest to career interest, creating a positive academic/school environment (Trapmann et al. 2007). On the contrary, students who haven’t found a fit between personal and career interests and hesitate with choices of major often have low performance, take fewer units, and consequently prolong time to degree (Foraker 2012; Feldman et al. 1999).

Besides choosing a major and changing a major, students can also decide to take a second major, a minor, or both. These choices independently affect time to degree, regardless of academic performance and enrollment intensity because of increased course units requirement for graduation (Russell et al. 2008).

Enrollment and Enrollment Intensity

Enrollment can be understood in two aspects. First, students must enroll in school and pass all required courses to earn a degree. Stopout not only delays time to degree but also threatens course success due to absences from the academic setting and discontinued learning (Chen 2007). The process of earning required units towards a degree is not continuous but partitioned into academic terms (Bahr 2009; Johnson 2006; Kalamatianou and McClean 2003). Persistent success from term to term is the key to a timely graduation. Within this context, successfully completed units by term are the most important indicator of whether a student is on track to graduate in a timely fashion (Chen 2007; Knight 2002; Beekhoven et al. 2003; Van Der Hulst and Jansen 2002). The number of enrolled units per term and the ratio of enrolled units to earned units are factors that best explain the progress to graduation (Hagedorn et al. 2007; Cabrera et al. 2003).

Second, enrollment intensity affects time to degree. Number of enrolled units per term reflects student decisions (part-time vs. full-time). When a student chooses to enroll in a part-time status, it is obvious he or she must take more terms to graduate than those who are full-time, everything else being equal. The same applies to full-time enrollment with varying units since the federal financial aid standard of full-time enrollment is only 12 units. In most majors, a student must enroll and earn a minimum of 15 units/semester to graduate in 4 years.
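The arithmetic behind the 15-unit threshold can be sketched as follows. The 120-unit degree requirement is an assumption chosen for illustration (it is not a figure reported in this study), and the sketch assumes every enrolled unit is earned.

```python
import math

# Assumed degree requirement, for illustration only (not from the study).
REQUIRED_UNITS = 120


def terms_to_degree(required_units: int, units_per_term: int) -> int:
    """Terms needed to reach the requirement if every enrolled unit is earned."""
    return math.ceil(required_units / units_per_term)


# Compare the federal full-time minimum (12 units) with a 15-unit load.
for load in (12, 15):
    print(f"{load} units/term -> {terms_to_degree(REQUIRED_UNITS, load)} terms")
```

Under these assumptions, a 12-unit load needs ten semesters (five years) while a 15-unit load needs eight (four years), which is the gap the paragraph above describes.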

In our model, enrollment includes “extra enrollment,” typically summer and/or winter sessions in which students may choose to enroll. The effect of extra enrollment on reducing time to degree is mixed. While some students take summer/winter courses to graduate sooner, others do so to make up deficiencies during regular terms. In this latter situation, extra enrollment was found to extend time to degree (Knight 2002).

Academic Performance

Enrollment alone does not lead to a degree. Students must perform adequately and earn the units in which they enroll that apply to their degree; hence academic performance is one of the most important factors determining whether and how soon a student can complete a degree. In our model, academic performance is defined in three aspects: term GPA, cumulative GPA and earned cumulative units. The three build on each other from term to term to reach degree completion. Research suggests that the strongest predictors of degree attainment are average student credit hour load per term, followed by total credit hours earned (Knight 2002; Zhu 2004). Low term GPA indicates a higher probability of course repeats and fewer earned credit units per term, while low cumulative GPA signals a slow accumulation of total earned credit units, leading to longer time in school before degree completion.

Student decisions on how many units to enroll in, and whether students can successfully earn the enrolled units, are dynamic and change from term to term (DesJardins et al. 2002a). Volkwein and Lorang (1996) reported that students may choose to carry lighter unit loads to enhance GPA or to leave more time for other responsibilities while enrolled in school. These decisions are also affected by student financial background, pre-college preparedness (high school GPA, etc.) and remediation status (Boughan 2000). Academic performance is especially important in the first semester in college, which powerfully predicts chances of graduation and enrollment intensity in later semesters (Attewell et al. 2011). Relative to other factors, academic performance has a large direct effect on graduation and time to degree (Allen et al. 2008).

In summary, student decisions to enroll (whether they are retained term to term), enrollment intensity (number of units enrolled) and term success (earned units out of enrolled units) interact to act as time-dependent factors of time to degree.

Methods

We use enrolled terms to describe time to degree in this study and assume that students continuously enrolled in school by excluding stopout terms. This is by no means to ignore the external factors in the open system that influence student persistence. Instead, we focus on student use of enrolled time to determine what aspects of student decisions (major choice and enrollment intensity) and academic performance contributed to the term-by-term progress to degree completion beyond students’ demographic background and academic preparation.

Data and Variables

Data in the study include five cohorts of first-time freshmen from fall 2002 to fall 2006, drawn from a large state university in California. The observation period is from fall 2002 to spring 2014 with a maximum of 24 terms. All students are tracked by enrolled terms, including both fall and spring semesters for a minimum of 16 terms or eight calendar years (the 2006 cohort from 2006 to 2014). The final person-period dataset includes 12,069 first-time freshmen with 95,220 records (per student per enrolled term).

The dependent variable in the study is whether or not a student graduated in a given term during the observation period, coded dichotomously with 1 = graduated and 0 = did not graduate.
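A minimal sketch of how a person-period dataset with this dichotomous coding might be built from one summary record per student. The field names (`student_id`, `terms_enrolled`, `graduated`) and the two example students are hypothetical, not taken from the study data.

```python
from typing import Dict, List


def to_person_period(students: List[Dict]) -> List[Dict]:
    """Expand one record per student into one row per student per enrolled term.

    The outcome is 1 only in the term of graduation and 0 in every other
    enrolled term, matching the dichotomous coding described above.
    """
    rows = []
    for s in students:
        for term in range(1, s["terms_enrolled"] + 1):
            is_last_term = term == s["terms_enrolled"]
            rows.append({
                "student_id": s["student_id"],
                "term": term,
                "graduated": int(is_last_term and s["graduated"]),
            })
    return rows


# Hypothetical students: A graduates in the 8th enrolled term,
# B is last observed in the 3rd enrolled term without a degree.
students = [
    {"student_id": "A", "terms_enrolled": 8, "graduated": True},
    {"student_id": "B", "terms_enrolled": 3, "graduated": False},
]
rows = to_person_period(students)
print(len(rows), sum(r["graduated"] for r in rows))
```

Time-dependent covariates (term units, term GPA, and so on) would be attached to each row by term; they are omitted here to keep the sketch short.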

The independent variables consist of both time-independent and time-dependent factors. Time-independent variables have values that are constant over time, including demographics (gender, under-represented minority status or URM, and first-generation college-going status) and pre-college academic indicators such as high school GPA, remediation status and Pre-college experience (college credit through AP courses or co-enrollment in college when students were attending high school, coded as 1 = having college credits and 0 = not having them). Time-dependent variables have values that may change over time and are conceptualized in two categories: student decisions and academic performance. The former includes enrollment intensity (number of term units enrolled and extra enrollment) and academic major status (major declaration, major change, and double major or minor). The latter consists of term GPA, cumulative GPA and cumulative units earned at the beginning of each term.

Analytical Models

To estimate a dichotomous dependent variable (graduated or not graduated in a given term), we employed event history analysis (EHA) to capture the temporal nature of students’ decisions, academic performance, and the effects of time-independent and time-dependent variables. With roots in biostatistics and epidemiology, EHA models have adopted different names across scientific disciplines, such as “duration models”, “hazard models”, or “failure-time models”. Event history modeling is an empirical technique that allows the researcher to study the occurrence and timing of events in a longitudinal process (DesJardins 2003), and has been recently used in higher education research to investigate the temporal aspects of student dropout and degree completion (Chen 2008, 2012; DesJardins et al. 2002b). EHA focuses on events that are important to the dependent variable and analyzes data obtained by observing individuals over time. It can examine the underlying causal mechanisms behind event occurrence, control for censored data, and explore the impact of time-dependent variables on the outcome (Allison 1982, 1984; Singer and Willett 2003).

“Right-censored” Data

In our study, the event of interest is graduation with a bachelor’s degree. If a student graduated, we know exactly how many enrolled terms he or she took to complete a degree. If a student has not graduated, he or she either dropped out or was still enrolled at the end of the observation period, so the time of graduation is unknown. All cases for which graduation is unknown are right-censored, but they must be considered in estimation with longitudinal events, to avoid biases or loss of information (Allison 1984).

Right-censored data provide incomplete but useful information: we do not know whether or when the event occurs for the censored individuals but we do know about event nonoccurrence by the censoring time. Event history analysis can handle this uncertainty by incorporating information about right-censored cases. In our study, we track students term by term. In each term (referred to as “risk period” in EHA terminology), we only count students who enrolled in that term. The pool of these students, the “risk set”, includes all individuals who are eligible to experience the event in that term. Students who graduated or were censored in one term are dropped from the risk set in future time periods so that everyone remains in the risk set only up to the last term of enrollment.
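The risk-set bookkeeping described above can be sketched with a small life table. The five student spells are invented for illustration; each is a pair of (last enrolled term, whether the student graduated in that term), and a spell with no graduation is right-censored.

```python
from collections import Counter
from typing import List, Tuple


def life_table(spells: List[Tuple[int, bool]]) -> List[Tuple[int, int, int, float]]:
    """Return (term, risk set size, graduations, observed hazard) for each term.

    A student belongs to the risk set of every term up to and including the
    last enrolled term, whether that term ends in graduation or in censoring.
    """
    graduations = Counter(t for t, graduated in spells if graduated)
    last_term = max(t for t, _ in spells)
    table = []
    for term in range(1, last_term + 1):
        at_risk = sum(1 for t, _ in spells if t >= term)
        events = graduations.get(term, 0)
        table.append((term, at_risk, events, events / at_risk))
    return table


# Hypothetical spells: three graduates and two right-censored students.
spells = [(8, True), (8, True), (10, True), (3, False), (10, False)]
for term, at_risk, events, hazard in life_table(spells):
    print(term, at_risk, events, round(hazard, 3))
```

The student censored after term 3 still contributes three terms of "did not graduate" information and only then leaves the risk set, which is how censored cases inform the estimates without being treated as graduates or discarded.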

Within the class of EHA methods, we chose the discrete-time logit hazard model, which can be estimated by a standard maximum likelihood method using a logistic regression procedure.

The discrete-time logit hazard function, denoted by \(h_{it}\), is the conditional probability that individual i will experience the target event (e.g., graduation) in time period t (T = t), given that he or she did not experience it in any earlier time period (T ≥ t).

$$h_{it} = {\text{prob}}\left\{ {T = t \mid T \ge t \;{\text{and}}\; X_{1it} = x_{1it} , X_{2it} = x_{2it} , \ldots , X_{kit} = x_{kit} } \right\}$$
(1)

Equivalently, the equation can be transformed to the logit hazard model as below:

$${\text{logit}}\left( {h_{it} } \right) = {\text{Log}}\left( {\frac{{h_{it} }}{{1 - h_{it} }}} \right)$$
(2)
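As a small numerical sketch of the transformation in Eq. (2), the functions below map a hazard to its logit and back. The hazard value of 0.25 is an arbitrary illustration, not an estimate from the study.

```python
import math


def logit(h: float) -> float:
    """Log-odds of a hazard (conditional probability) strictly between 0 and 1."""
    return math.log(h / (1.0 - h))


def inv_logit(z: float) -> float:
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


h = 0.25                      # hypothetical hazard of graduating in a given term
z = logit(h)                  # log(0.25 / 0.75)
print(round(z, 4), round(inv_logit(z), 4))
```

Working on the logit scale keeps fitted hazards between 0 and 1 for any value of the linear predictor.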

In Eq. (1) the subscript i indicates individual subjects, the subscript t the enrolled term, and the subscript k the independent variable.

The equation posits that the value of the hazard for individual i in time period t depends on his or her values of the K independent variables at time period t. The independent variables in the model can be either time-independent or time-dependent. The possibility that values of independent variables will change over time is reflected in the time subscript to the X’s in the equation.

In order to test assumptions of temporal sensitivity, we first specify and estimate a time-fixing effect model, assuming the effects of independent variables do not change over time. We then relax this restriction by estimating a time-varying effect model that allows the effects of independent variables to change over time.

Modeling Overall Effects

In order to investigate the overall effects of independent variables over time, we first specify a time-fixing effect model.

Let t = 1, 2, 3, … represent the successive enrolled terms. The dependence of \(h_{it}\) on the independent variables is assumed to follow a logit model:

$${\text{Log}}\left[ {\frac{{h_{it} }}{{1 - h_{it} }}} \right] = \alpha + \beta_{1} X_{it1} + \cdots + \beta_{k} X_{itk}$$
(3)

\(\beta_{k}\) in Eq. 3 is held constant across all enrolled terms, assuming that the effect of the independent variable \(X_{k}\) is constant over time. This model provides the estimated overall effect of these independent variables across all enrolled terms.
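To illustrate how a model of the form in Eq. 3 can be estimated by ordinary logistic regression on person-period rows, the sketch below fits an intercept and one binary covariate by Newton-Raphson. It is a stand-in written in plain Python, not the authors' SAS code, and the covariate and counts are hypothetical: 40 student-terms with x = 0 (4 graduations) and 40 with x = 1 (10 graduations).

```python
import math
from typing import List, Tuple


def fit_logit(rows: List[Tuple[int, int]], n_iter: int = 25) -> Tuple[float, float]:
    """Maximum likelihood for logit(h) = a + b*x by Newton-Raphson.

    rows holds one (x, y) pair per student-term, where y = 1 if the student
    graduated in that term and 0 otherwise.
    """
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p              # score for the intercept
            g1 += (y - p) * x        # score for the slope
            h00 += w                 # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical person-period rows (x, graduated).
rows = [(0, 1)] * 4 + [(0, 0)] * 36 + [(1, 1)] * 10 + [(1, 0)] * 30
alpha_hat, beta_hat = fit_logit(rows)
print(round(alpha_hat, 4), round(beta_hat, 4), round(math.exp(beta_hat), 4))
```

With a single binary covariate the fitted hazards reproduce the observed proportions (0.10 and 0.25), so Exp(β) equals the ratio of the two observed odds. With more covariates the same likelihood is maximized, only in higher dimension.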

Modeling Time-Varying Effects

An independent variable’s effect may increase or decrease over time, in alignment with certain events occurring at specific time periods. In fact, both time-dependent and time-independent independent variables can have time-varying effects.

To further explore the possibility of the time-varying effects of independent variables, we employed another discrete-time hazard model that allows the effects of X’s to differ from period to period, examining the unique effects of the independent variables in each period.

Let \(D_{1}, D_{2}, \ldots, D_{T}\) be a series of time dummy variables (for example, \(D_{t} = 1\) for the tth term and 0 otherwise). The model is then:

$${\text{Log}}\left[ {\frac{{h_{it} }}{{1 - h_{it} }}} \right] = \left[ {\alpha_{1} D_{1} + \cdots + \alpha_{T} D_{T} } \right] + [\beta_{11} X_{1i1} D_{1} + \cdots + \beta_{KT} X_{KiT} D_{T} ]$$
(4)

\(\beta_{kt}\) in Eq. 4 represents the unique effect of \(X_{k}\) in the tth term. This model is equivalent to a series of logistic regression models run at each time period. In time period t:

$${\text{Log}}\left[ {\frac{{h_{it} }}{{1 - h_{it} }}} \right] = \alpha_{t} + [\beta_{1t} X_{1it} + \cdots + \beta_{Kt} X_{Kit} ]$$
(5)

where t = 1, 2, …, T.
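The structure of Eq. 4 can be made concrete by writing out one row of its design matrix: T time dummies followed by every covariate multiplied by every dummy. The function name and the example values below are hypothetical.

```python
from typing import List


def design_row(term: int, x: List[float], n_terms: int) -> List[float]:
    """Design-matrix row for Eq. 4: time dummies, then covariate-by-dummy products.

    Columns are ordered D_1..D_T, then X_1*D_1..X_1*D_T, then X_2*D_1.., so
    each covariate gets its own coefficient in every term.
    """
    dummies = [1.0 if t == term else 0.0 for t in range(1, n_terms + 1)]
    interactions = [xk * d for xk in x for d in dummies]
    return dummies + interactions


# Hypothetical student-term: term 2 of 3, one covariate (say, term GPA of 3.1).
print(design_row(term=2, x=[3.1], n_terms=3))
```

Because every column except those belonging to the current term is zero, the rows for term t only involve \(\alpha_{t}\) and \(\beta_{kt}\), which is why the pooled model matches separate logistic regressions fitted term by term.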

Interpretation of Model Parameters of β’s

In both models, the slope parameters (β’s) indicate the effects of the independent variables and can be interpreted in two ways. The first is through the odds ratio, Exp(β), which compares the odds that the event (graduation) occurs at values of the independent variable one unit apart. An Exp(β) larger than 1 indicates a positive effect of the independent variable, while an Exp(β) smaller than 1 indicates a negative effect. The farther Exp(β) is from 1, the greater the effect.

For example, Exp(β) = 0.400 for major change (1 = changed majors and 0 = did not change majors) indicates that the odds of graduating for students who changed majors are 0.40 times the odds for those who did not. Alternatively, a simple transformation of β offers a direct interpretation: \(100(e^{\beta} - 1)\) is the percentage change in the odds of graduation for a one-unit increase in X, holding other variables constant. Therefore, Exp(β) = 1.497 for cumulative GPA suggests that the odds of graduating would increase by 49.7 % if cumulative GPA increased by one point.
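The two interpretations can be checked with a few lines of arithmetic. The odds ratios 0.400 and 1.497 are the illustrative values used in the paragraph above; the helper name is hypothetical.

```python
import math


def pct_change_in_odds(beta: float) -> float:
    """Percentage change in the odds for a one-unit increase in X: 100(e^beta - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)


# Recover beta from the odds ratios quoted in the text.
beta_major_change = math.log(0.400)
beta_cum_gpa = math.log(1.497)

print(round(pct_change_in_odds(beta_major_change), 1))  # odds 60 % lower
print(round(pct_change_in_odds(beta_cum_gpa), 1))       # odds 49.7 % higher
```

Note that these are changes in odds, not in probabilities; the corresponding change in the hazard itself depends on the baseline hazard.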

Controlling for Unobserved Heterogeneity (UH)

Unobserved heterogeneity refers to the variance not explained in a statistical model that originates from correlations between observed independent variables and unobserved patterns in the data. These unobserved patterns are typically represented by differences among individual cases that are not captured in observable variables. Our statistical models have incorporated the most important variables reported in the literature, including students’ demographics, academic preparation, enrollment intensity, major choice, and academic performance. It is unlikely, however, that all factors affecting time to degree can be measured or can be made available to our analysis. Variability between individuals, for example, is such a factor. As in any regression analysis, omitted variables are a source of model misspecification. EHA models are particularly sensitive to unobserved heterogeneity (DesJardins 2003). If unmeasured variables are correlated with the covariates in the model, failure to account for unobserved heterogeneity will likely lead to biased parameter estimates. Furthermore, allowing for unobserved patterns often leads to a change in the shape of the baseline hazard rate (Steele 2005). Allison (2010) also notes that “unobserved heterogeneity tends to produce estimated hazard functions that decline with time” (p. 258).

The common approach to account for unobserved heterogeneity is to introduce in the model a random effect, the so-called frailty, which represents unobserved risk factors that are specific to an individual and fixed over time, usually assumed to follow a certain distributional form. We therefore extend our discrete-time logit hazard models to the discrete-time logit hazard models with normal frailty as below.

For the time-fixing effect model:

$${\text{Log}}\left[ {\frac{{h_{it} }}{{1 - h_{it} }}} \right] = \alpha + \beta_{1} X_{it1} + \cdots + \beta_{k} X_{itk} + u_{i}$$
(6)

And for the time-varying effect model:

$${\text{Log}}\left[ {\frac{{h_{it} }}{{1 - h_{it} }}} \right] = \left[ {\alpha_{1} D_{1} + \cdots + \alpha_{T} D_{T} } \right] + \left[ {\beta_{11} X_{1i1} D_{1} + \cdots + \beta_{KT} X_{KiT} D_{T} } \right] + u_{i}$$
(7)

In the equations, \(u_{i}\) is a random effect for individual i, representing unobserved heterogeneity or frailty. We assume the random effects follow a normal distribution with a mean of zero and variance \(\sigma_{u}^{2}\). The random effect variance is interpreted as the variance between individuals that is due to unobserved time-invariant characteristics, i.e. residual variance.

Random effects logit models can be fitted in Stata (using xtlogit) or SAS (using proc nlmixed) as well as specialist software for multilevel modelling (e.g. MLwiN). In our study, we used SAS proc nlmixed.

One important consideration when fitting random effects frailty models is that the regression coefficients have a different interpretation from those in the models without random effects (Steele 2005). Suppose we have a continuous covariate x, with coefficient β. In models without frailty, Exp(β) is an odds ratio that compares the odds of an event for two randomly selected individuals with x values 1 unit apart (and the same values for other covariates in the model). Exp(β) is the population averaged effect of x. In models with frailty, Exp(β) is an odds ratio only when the random effect is held constant, i.e. if we are comparing two hypothetical individuals with the same random effect value. Exp(β) is the individual-specific effect of x.
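The difference between the individual-specific and population-averaged readings can be sketched numerically for a single term. The values of α, β and \(\sigma_{u}\) below are hypothetical, the normal frailty is integrated with a simple grid sum rather than the quadrature a statistical package would use, and selection out of the risk set over successive terms is ignored.

```python
import math


def inv_logit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def odds(p: float) -> float:
    return p / (1.0 - p)


def marginal_hazard(linear_predictor: float, sigma_u: float,
                    n_grid: int = 2001, width: float = 8.0) -> float:
    """Average inv_logit(linear_predictor + u) over u ~ Normal(0, sigma_u^2).

    Uses a normalised grid sum over +/- `width` standard deviations.
    """
    step = 2.0 * width / (n_grid - 1)
    total = weight_sum = 0.0
    for j in range(n_grid):
        z = -width + j * step          # standard-normal grid point
        w = math.exp(-0.5 * z * z)     # unnormalised normal density
        total += w * inv_logit(linear_predictor + sigma_u * z)
        weight_sum += w
    return total / weight_sum


# Hypothetical parameter values, for illustration only.
alpha, beta, sigma_u = -2.0, 0.4, 1.5

conditional_or = math.exp(beta)  # same frailty value, x one unit apart
marginal_or = (odds(marginal_hazard(alpha + beta, sigma_u))
               / odds(marginal_hazard(alpha, sigma_u)))
print(round(conditional_or, 3), round(marginal_or, 3))
```

Under these assumptions the odds ratio obtained after averaging over the frailty distribution lies closer to 1 than Exp(β), which is why coefficients from models with and without frailty should not be compared as if they measured the same quantity.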

Findings

Describing Students, Graduation and Time to Degree

There are a total of 12,069 students in our dataset as shown in Table 1. The Enrollment column in the table lists all students by time-independent and dependent variables, and the percentages are calculated on the column. For example, the sample is composed of 40.7 % males and 59.3 % females. The Graduation column shows number and percentage of students who have graduated from the student group on the same row, hence 6999 or 58 % of all students in the sample have graduated. Similarly, 53.9 % of all male students and 60.8 % of all female students have graduated.

Table 1 Student characteristics, graduation and time to degree: 2002–2006 cohorts of first-time freshmen

Among the students in our data, 55 % are underrepresented minorities (URM) including African American (7 %), American Indian (1 %), Pacific Islander (1 %), Hispanic (30 %), and Asian (16 %). Asian students are part of URM in our data because half of the Asian students are Hmong, an educationally under-represented minority. Most of the remaining Asian students are non-Hmong Southeast Asians, who share similar characteristics with Hmong students. The institution our data come from is a federally designated Hispanic Serving Institution (HSI) and Asian American Native American Pacific Islander-Serving Institution (AANAPISI).

About 60 % of the students are first-generation college attendees, 46 % are Pell grant eligible, 70 % need remediation in Math and/or English, and only 22 % had college experience prior to entering the university. The average high school GPA is 3.29 and the average SAT score is 944 (not shown in Table 1), over 100 points below the national average among first-year freshmen (NCES 2013).

Fifty-eight percent (58.0 %) of the students (6999 out of 12,069) graduated during the observation period, and their average time to graduation is ten enrolled terms.

Both graduation and time to degree varied among student subgroups, the percentage graduating especially so. The top half of the table presents time-independent variables. Male students, URM students, first-generation students, students who qualified for a Pell grant, those who required Math or English remediation, those who did not have Pre-college experience and those who had a lower high school GPA have a notably lower probability of graduation than their counterparts. The differences range from 6.9 to 14.2 percentage points, except for high school GPA, whose three groups span a difference of 30 percentage points.

The characteristics students bring into college not only affect chances of graduation but also time to degree: among the graduates, student groups with lower probabilities of graduation took longer than their counterparts to complete degrees. As we may expect, student characteristics overlap and have a compounding effect on graduation and time to degree.

Although the pre-college factors may seem to have a powerful impact, they are time-independent and do not reflect student decisions and academic performance in college. The lower half of Table 1 illustrates student decisions in major choices, enrollment intensity, and academic performance represented by term units enrolled and term GPA.

Change of major impacts probabilities of graduation and time to degree with opposite patterns. Students who didn’t change majors are far less likely to graduate (46 vs. 79.6 %) because, like those who never had a major, students who dropped out of school early didn’t stay long enough to change majors. Among those who have graduated, students who did not change majors took slightly less time to complete their degrees than those who did (9.7 vs. 10.1 enrolled terms). The same pattern describes students who have a double major or minor in comparison to those who don’t. Students with only one major are more likely to drop out early than those with double majors or minors, and therefore have a lower chance of graduation. But among the graduated students, those with two majors or a minor took longer to graduate than those with only one major (10.2 vs. 9.9 enrolled terms).

One of the dynamic indicators of progress in college is whether a student has a major in a given term. Students can have a major in one term, drop it in another, and pick up a new major in the next term. Keeping a declared major is a sign of confidence in a chosen pathway to graduation. Those who spent over half of their time in school with a major are three times as likely to have graduated (61.4 vs. 20.3 %) as those who did not have a major in over half of their enrolled terms.

Students who enrolled in summer and/or winter sessions are more likely to graduate than those who didn’t (84.3 vs. 52.4 %) and also graduated sooner (9.8 vs. 10.0 enrolled terms). Enrolled term units and GPA show the expected associations with graduation and time to degree: taking more units and having a higher GPA are both associated with graduation and faster degree completion.

Describing Dynamics of Time-Dependent Variables

Figure 2 shows distribution of total enrolled terms by status of graduation. Only 64 out of the total 12,069 students enrolled for 17 or more terms, and about a quarter (2926 out of 12,069 or 24.2 %) enrolled 4 or fewer terms. Among the latter group only seven have graduated, suggesting that overall graduation rates are most significantly impacted by early dropouts. The large majority of graduates (5909 out of 6999, or 84.4 %) enrolled between 8 and 12 terms.
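
The shares quoted in this paragraph follow directly from the reported counts. The short check below recomputes them; the helper name pct is introduced here for illustration only, and all counts are taken from the text.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

total_students = 12069      # all students in the dataset
graduates = 6999            # students who graduated in the observation period
short_enrollers = 2926      # students enrolled for 4 or fewer terms
short_graduates = 7         # graduates among the 4-or-fewer-terms group
mid_range_graduates = 5909  # graduates enrolled between 8 and 12 terms

overall_graduation_rate = pct(graduates, total_students)
short_enroller_share = pct(short_enrollers, total_students)
mid_range_share = pct(mid_range_graduates, graduates)
short_graduation_rate = pct(short_graduates, short_enrollers)

print(overall_graduation_rate, short_enroller_share,
      mid_range_share, short_graduation_rate)
```

The last figure, the graduation rate among students with four or fewer enrolled terms, is not stated in the text but makes the point about early dropouts concrete.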

Fig. 2

Distribution of total enrolled terms by graduation status

Because the two extreme groups either contain very few cases (17 or more terms) or very few graduates (1–4 terms), we will focus our analysis on 5th–16th terms.

Figure 3a–h illustrate the dynamic change of time-dependent variables across enrolled terms by graduation status, including major change, double major or minor, major un-declaration, extra enrollment (summer or winter), enrolled term units, term GPA, cumulative units earned and cumulative GPA. In each figure, the X-axis represents 5th to 16th enrolled terms and the Y-axis represents the value of time-dependent variable at the given term (% for categorical variables and means for numerical variables). The difference between students who graduated and who didn’t graduate indicates the relationship between graduation and time-dependent variables.

Fig. 3

Dynamics of time-dependent variables by graduation status (a % of major change. b % of double majors/minors. c % of major un-declaration. d % of extra enrollment. e Average term units enrolled. f Average term GPA. g Cumulative units earned at beginning of term. h Cumulative GPA at beginning of term)

In the first 11 terms, students who have graduated are less likely to change their majors than those who didn’t graduate (0.0 vs. 11.3 % in the 5th term and 1.8 vs. 4.0 % in the 11th term). The difference becomes smaller in the 12th and the 13th terms, and is reversed after the 14th term (Fig. 3a). This may indicate that changing majors in the early terms negatively impacts chances of graduation. Students who graduated between the 7th and the 10th terms are more likely to have double majors or minors than those who didn’t graduate (16.2 vs. 10.2 % in the 8th term and 17.8 vs. 12.2 % in the 10th term). The difference becomes more varied without a consistent pattern beyond the 11th term (Fig. 3b). On the other hand, almost all students who graduated had a declared major across all terms (Fig. 3c). In all enrolled terms, particularly from the 5th to the 7th, students who didn’t graduate are more likely to be un-declared than those who graduated (7.0 vs. 0.0 % in the 5th term to 1.1 vs. 0.0 % in the 16th term).

In both the 5th and 6th terms, students who graduated are more likely to enroll in summer or winter courses than those who didn’t graduate (8.3 vs. 0.5 % in the 5th term, 15.7 vs. 7.6 % in the 6th term), but beyond that period the difference becomes smaller (Fig. 3d). Average term units follow a similar pattern. In the first eight terms, students who graduated enrolled in more units per term than those who didn’t graduate (16.9 vs. 14.7 in the 5th term and 15.4 vs. 14.2 in the 8th term), but starting in the 9th term the difference reverses: the former enrolled in fewer units than the latter (13.2 vs. 13.9 in the 9th term and 9.1 vs. 9.8 in the 16th term) (Fig. 3e). This likely indicates an interaction between double major/minor requirements and enrolled units: those who didn’t graduate have more requirements to complete and enroll in more units.

Graduated students have a higher term GPA than students who didn’t graduate (3.32 vs. 2.71 in the 5th term and 2.80 vs. 2.26 in the 16th term), and the difference remains stable at about 0.6 points across all enrolled terms from the 5th to the 16th (Fig. 3f), indicating that term GPA is strongly related to chances of graduation. Those with good GPAs are less likely to repeat courses and have a higher earned-to-enrolled units ratio, so they are more likely to graduate at any given term. In addition, term GPA is much higher in the 5th to 8th terms than in the later terms, and the large decrease after the 8th term occurs for both students who graduated and those who did not. Finally, students who have graduated have higher earned cumulative units and cumulative GPA than those who didn’t graduate (Fig. 3g, h). The differences are larger in the early terms (up to the 8th term) and become progressively smaller in later terms. Clearly, both factors are positively associated with probabilities of graduation, but the relationship is much stronger in earlier terms than in later terms.

Modeling Probabilities of Graduation

Table 2 shows results of three discrete-time logit hazard models on odds of graduation (dependent variable is coded one for having graduated and zero for not having graduated). The models estimate the overall odds of graduation. Term specific results will be presented in Table 3.
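
For readers less familiar with discrete-time hazard models, the data layout behind them can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the toy student records and the helper names to_person_period and empirical_hazard are hypothetical. Each student contributes one row per enrolled term at risk, with the outcome coded one only in the term of graduation. A logit hazard model that contains nothing but a separate indicator for each term reproduces the empirical hazards computed here; the models in Table 2 add covariates to that layout.

```python
from collections import defaultdict

# Hypothetical toy records: (student_id, enrolled_terms_observed, graduated)
students = [
    ("s1", 8, True),
    ("s2", 3, False),   # left or was censored after 3 enrolled terms
    ("s3", 10, True),
    ("s4", 8, True),
    ("s5", 10, False),  # still enrolled or censored after 10 terms
    ("s6", 9, True),
]

def to_person_period(records):
    """Expand one row per student into one row per student-term at risk."""
    rows = []
    for student_id, n_terms, graduated in records:
        for term in range(1, n_terms + 1):
            # The event indicator is 1 only in the final observed term of a graduate
            event = 1 if (graduated and term == n_terms) else 0
            rows.append({"id": student_id, "term": term, "graduated": event})
    return rows

def empirical_hazard(rows):
    """Share of at-risk students who graduate in each enrolled term."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for row in rows:
        at_risk[row["term"]] += 1
        events[row["term"]] += row["graduated"]
    return {term: events[term] / at_risk[term] for term in sorted(at_risk)}

person_period = to_person_period(students)
hazard = empirical_hazard(person_period)

for term, h in hazard.items():
    print(f"term {term:2d}: hazard of graduation = {h:.3f}")
```

Because the risk set shrinks as students graduate or leave, later terms rest on far fewer rows than early terms, which is the same pattern that makes the term-specific sample sizes in this study vary so widely.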

Table 2 Odds of graduation: time-fixing effect hazard model
Table 3 Term-specific odds of graduation: time-varying effect hazard model

It should be noted that the sample size in this study is very large when all students are included (12,059 in Tables 2, 4), but it varies drastically in the term specific analysis (Tables 3, 5) from 9143 in Term 5 to 158 in Term 16. It is not feasible to rely on statistical significance when describing predictor effects across terms. In our discussions below, we will focus on odds ratios as an indication of predictor effect, and use statistical significance as a reference where applicable.

Table 4 Odds of graduation: time-fixing effect hazard model with normal frailty
Table 5 Term-specific odds of graduation: time-varying effect hazard model with normal frailty

Model 1 of Table 2 includes only time-independent variables. Except for Pre-college experience, which does not have a statistically significant effect, all variables show the anticipated associations with graduation. Male, URM, first-generation and Pell Grant eligible students have lower probabilities of graduation than their counterparts, with odds ratios ranging from 0.894 for URM to 0.940 for FGS. On the other hand, high school GPA and Pre-college experience are positively associated with graduation, with estimated odds ratios of 1.387 and 1.054, respectively. Academic remediation, however, reduces the odds of graduation by about 11 %.

This model has a very small Nagelkerke R Square of 0.8 % (0.008) and a χ2 of 314.387, which reduced the initial −2 Log likelihood by only 0.633 % (314.387 out of 49692.714 for the initial model of independence), a clear indication that pre-college characteristics did not effectively explain odds of graduation.

Model 2 added five time-dependent student decision variables: Major change, Double major or minor, Major un-declaration, Extra enrollment and Term units enrolled. These variables are time-sensitive because their status changes by term, as shown in Fig. 3a–e. The logit model analyzes these statuses each term and returns an average effect on graduation. In other words, the status of each time-dependent variable indicates the odds of graduation at any given term when effects are averaged across terms.

The seven pre-college characteristics analyzed in Model 1 largely remain unchanged in Model 2. After they are taken into consideration, the odds ratios of the added time-dependent variables suggest that those who changed majors and those who didn’t have a major are respectively 68.3 and 99.4 % less likely to graduate than their counterparts. In contrast, students with double majors or minors are over 1.65 times more likely to graduate than those who only have one major. Extra enrollment (summer or winter courses) improves chances of graduation by 35.9 % but enrolling more units in regular semesters reduces the chances of graduation by 12.5 %.
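
The percentages in this paragraph are changes in odds, obtained from an odds ratio as (OR − 1) × 100. The sketch below makes the conversion explicit. The odds ratios are back-calculated here from the percentages quoted in the text rather than read from Table 2, and the 10 % baseline probability is a hypothetical value used only to show that a change in odds is not the same as a change in probability.

```python
def pct_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained by scaling the baseline odds by an odds ratio."""
    new_odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return new_odds / (1.0 + new_odds)

# Odds ratios back-calculated from the percentages quoted in the text
odds_ratios = {
    "Major change": 0.317,
    "Major un-declaration": 0.006,
    "Extra enrollment": 1.359,
    "Term units enrolled": 0.875,
}

for name, odds_ratio in odds_ratios.items():
    print(f"{name}: {pct_change_in_odds(odds_ratio):+.1f} % change in odds")

# Hypothetical per-term graduation probability for the reference group
baseline = 0.10
changed = apply_odds_ratio(baseline, odds_ratios["Major change"])
print(f"probability ratio: {changed / baseline:.3f}")
print(f"odds ratio:        {odds_ratios['Major change']:.3f}")
```

Under the assumed baseline, the ratio of probabilities is somewhat closer to 1 than the odds ratio, so "less likely" in this section is best read as "lower odds".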

Model 2 reduced the initial −2 Log likelihood by 7.46 %, an improvement of 6.83 percentage points over Model 1. The Nagelkerke R Square increased to 9.4 % from 0.8 %. Even with both pre-college factors and student decisions in the analysis, Model 2 left over 90 % of the variance in probabilities of graduation unexplained.
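
The deviance figures reported for Models 1 and 2 can be checked with simple arithmetic. The snippet below uses only numbers quoted in the text; the variable names are illustrative.

```python
initial_deviance = 49692.714   # -2 log likelihood of the initial model
model1_chi2 = 314.387          # reduction in -2 log likelihood achieved by Model 1
model2_reduction_pct = 7.46    # reduction reported for Model 2, in percent

# Share of the initial -2 log likelihood removed by Model 1
model1_reduction_pct = 100.0 * model1_chi2 / initial_deviance

# -2 log likelihood remaining after Model 1
model1_deviance = initial_deviance - model1_chi2

# Gain of Model 2 over Model 1, in percentage points
model2_gain_points = model2_reduction_pct - model1_reduction_pct

print(f"Model 1 reduction: {model1_reduction_pct:.3f} %")
print(f"Model 1 -2LL:      {model1_deviance:.3f}")
print(f"Model 2 gain:      {model2_gain_points:.2f} percentage points")
```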

Model 3 is our final model, which adds college academic performance indicators to Model 2: term GPA, cumulative units earned at beginning of term, and cumulative GPA at beginning of term. All three variables are positively associated with chances of graduation, with cumulative units earned at beginning of term being the strongest predictor in the model (by Wald values). With the addition of these three variables, extra enrollment is no longer statistically significant, and the effect of having double majors or minors turned negative, suggesting that once academic performance (GPA and earned units) is controlled, students with double majors or minors are less likely to graduate (at any given term on average) because of the requirements of additional majors or minors. Major un-declaration has the same negative effect as in Model 2.

In Model 3, the effect of the three pre-college academic indicators (HS GPA, remediation and Pre-college experience) is reversed in comparison to Models 1 and 2: those with higher HS GPA and Pre-college experience are now less likely to graduate while those who need remediation are more likely to graduate. These seemingly puzzling effects are in fact logical, because of the compounding correlations among pre-college academic performance, college academic performance and student decisions on majors. The better prepared high school graduates tend to have better college performance (correlation between HS GPA and college GPA is around 0.5) and they are more likely to take double majors and/or minors than those who didn’t do as well in high school (Footnote 1). It is no surprise that these pre-college indicators and double major/minor appear to “slow” graduation once earned units and GPA are held constant.

Stated differently, the best high school graduates tend to be the best college students, who are more likely to take double majors or minors. Once total earned units and GPA are controlled, those with double majors or minors are less likely to graduate in a given term because they have additional requirements to complete. Thus both pre-college performance indicators and double major/minors changed directions in Model 3, and appear to be associated with lower chance of graduation. Similar findings were reported in the literature (Lockeman and Pelco 2013; Ishitani 2003).

Model 3 reduced the initial −2 log likelihood by 48.4 % with a Nagelkerke R square of 55.0 %, a large improvement from 9.4 % in Model 2. Academic performance in college, including earned units, is the most important predictor of graduation.

To examine time-varying effects of the independent variables, we fitted a time-varying effect model by term as shown in Table 3. Odds of graduation are presented from Term 5 to 16 (1–4 terms and 17 or later terms have very few graduates, and they are excluded. See Fig. 2).

In most terms the effects of gender and URM are negative, especially in earlier terms, indicating that men and ethnic minorities are less likely to graduate in these terms, everything else being equal.

High school GPA, Pre-college experience and remediation have similar effects as found in Table 2 (Model 3), and these time-independent variables are more important before the 12th term as judged by their effect size. Disadvantages one brings to college most significantly affect chances of graduation before the sixth year mark.

Major change negatively affects graduation in most terms, while taking double majors or minors gives students a lower probability to graduate across the entire observation period. Major un-declaration has a consistent and extremely large negative effect on graduation (odds ratio close to zero), especially in early and late terms. Extra enrollment has a rather inconsistent pattern and its effect on graduation varies term by term.

Enrolled term units help students graduate in earlier terms (up to the 8th), but after that their effect turns negative. This is mainly because students enrolling after the 8th term tend to take double majors and minors. With comparable enrolled units, they are less likely to graduate in any given term (after the 8th) than those with a single major. Another reason is that enrolled units are not equal to earned units. We noticed that term GPA is much higher in the 5th to 8th terms than in the later terms, and the decrease occurred for both graduated students and students who didn’t graduate. This suggests that in the later terms students earned fewer of the units in which they enrolled.

Term GPA and cumulative units earned are the two most consistent predictors of graduation. On the other hand, cumulative GPA has a varying impact across terms and it is largely insignificant after the 10th term. Because a grade of C can satisfy graduation requirements, cumulative GPA does not as acutely affect graduation as earned units: the difference between 2.00 (C grade) and 4.00 (A grade) does not categorically alter the outcome.

Considering Unobserved Heterogeneity (UH)

To control for unobserved heterogeneity, we fitted two discrete-time logit hazard models with normal frailty for time-fixing effects in Table 4 and for time-varying effects in Table 5.

Table 4 re-presents Model 3 in Table 2 with the addition of a random effect parameter. The random variance (\(\sigma_{u}^{2}\)) is large and highly significant, clearly indicating unobserved heterogeneity across individual students. Compared to Model 3 in Table 2 where unobserved heterogeneity is not included, Gender, URM, Pell, HS GPA, REM (remediation), Double major or minor, Term GPA, Cum GPA and Earned cum units all show a larger effect size, while FGS, Major change, Major un-declared and Enrolled term units have similar or very slightly smaller effect size. All these variables have the same effect directions with or without unobserved heterogeneity in the model.

However, Pre-college experience and Extra enrollment changed effect directions after unobserved heterogeneity is accounted for. The negative effect of Pre-college experience in Table 2 becomes positive and stronger in Table 4. Likewise, Extra enrollment has a positive and significant effect in Table 4, whereas it is non-significant and negative in Table 2. Thus after controlling for individual differences, students with better pre-college preparation have higher probabilities of graduation, as do those who took extra enrollments (summer or winter intersessions).

In Table 5 we included unobserved heterogeneity in time-varying models, and it is apparent that the effect of individual differences (\(\sigma_{u}^{2}\)) varies over time. Large and significant individual variances are observed in the first ten terms, but after that the variances become smaller and insignificant, implying that graduation in early terms is significantly affected by unspecified individual differences in addition to the predictors included in the model. Students who continued enrollment in later terms are more homogeneous.
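
One mechanism consistent with this pattern is selection on the random effect: students with a high unobserved propensity to graduate leave the risk set early, so those still enrolled in later terms resemble one another more closely. The simulation below is a simplified sketch, not a reanalysis of the study data. It assumes a constant per-term logit hazard with a normal frailty, hypothetical parameter values, and exits through graduation only (dropout is ignored).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical settings, chosen only for illustration
n_students, n_terms = 200_000, 12
alpha, sigma_u = -2.0, 1.5

# Normal frailty and the per-term graduation hazard it implies for each student
u = rng.normal(0.0, sigma_u, size=n_students)
hazard = 1.0 / (1.0 + np.exp(-(alpha + u)))

at_risk = np.ones(n_students, dtype=bool)
frailty_variance = []
for term in range(1, n_terms + 1):
    # Variance of the frailty among students still at risk at the start of the term
    frailty_variance.append(float(u[at_risk].var()))
    graduated = at_risk & (rng.random(n_students) < hazard)
    at_risk &= ~graduated

for term, variance in enumerate(frailty_variance, start=1):
    print(f"term {term:2d}: frailty variance among those at risk = {variance:.3f}")
```

In this setup the frailty variance among students still at risk is lower in the final term than at the start, which parallels the smaller estimated \(\sigma_{u}^{2}\) in later terms, although the real data also involve dropout and time-varying covariates.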

All of the time-varying independent variables in Table 5 except Pre-college experience have the same effect direction but larger effect sizes when unobserved heterogeneity is included in the model (Tables 5 vs. 3). These factors include Gender, URM, Pell, HS GPA, REM, Double majors or minors, Term GPA, Cumulative units earned-BOT, and Cumulative GPA-BOT.

For example, gender shows similar impact patterns in all terms in both models, but its effect is slightly larger in the random effect model, particularly in the early terms (prior to the 11th term). Similar patterns are found for URM, Pell, HS GPA and Remediation between the two models, with significant negative effects for the middle terms (the 8th, 9th, 10th and 12th terms), and the random effect model has larger effect sizes.

FGS, Major change, Major un-declared, Extra enrollment and Term units enrolled have similar effect sizes either with or without random effects in the model.

Pre-college experience is the most sensitive predictor to unobserved heterogeneity. Before random effects are controlled, Pre-college experience had a significant and negative effect in the 7th, 9th, 10th and 11th terms. After controlling for random effects, Pre-college experience became positive and significant in 7th, 9th and 10th terms, and positive but insignificant in the 11th term.

Conclusion and Discussion

In this study we analyzed probabilities of graduation and time to degree, based on 12,069 first-time freshmen in a large public university in California. We asked which aspects of students’ decisions and performance affect graduation and time to degree beyond pre-college characteristics.

Probabilities of graduation are affected by two broad sets of factors. The first set describes student pre-college characteristics that do not vary in school, including race, ethnicity (represented by URM status), gender, family financial and education statuses (Pell grant eligibility and first-generation college enrollees), high school academic performance and college-prep experience. Factors in the second set summarize student decisions and performance in college that may vary each enrolled term, including major choices, major change, taking double majors or minors, term units enrolled, term GPA, cumulative units earned, and cumulative GPA at beginning of term. Besides these two sets of factors, our analytical model also included unobserved heterogeneity to allow for individual differences among students that are not captured in the predictors. We fitted the data with discrete-time logit hazard models that analyze the effect of the independent variables across enrolled terms, with odds of graduation as the dependent variable.

Major Findings

Based on the results from stepwise regression for the time-fixing effect model (Table 2), the full model explained about 55 % of total variation in graduation (Nagelkerke R square of 55.0 %). Specifically, 45.6 of the 55 percentage points are due to academic performance, the most important factor affecting graduation and time to degree. Students’ decisions on major choices and enrollment intensity are the second most important aspect, accounting for an additional 8.6 percentage points of the total variation in the dependent variable. Pre-college characteristics that students bring in account for less than 1 % of the total variation.
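
The decomposition above can be reproduced from the Nagelkerke R square values reported for the nested models. The snippet below is a bookkeeping check using only figures quoted in the text; the variable names are illustrative, and the pre-college share is obtained as the remainder.

```python
r2_model3 = 55.0        # full model: pre-college + decisions + performance (%)
r2_model2 = 9.4         # pre-college + student decisions (%)
decisions_share = 8.6   # share attributed to student decisions (percentage points)

# Increment from adding academic performance to Model 2
performance_share = r2_model3 - r2_model2

# What remains of Model 2 once student decisions are set aside
precollege_share = r2_model2 - decisions_share

total = performance_share + decisions_share + precollege_share

print(f"academic performance: {performance_share:.1f} points")
print(f"student decisions:    {decisions_share:.1f} points")
print(f"pre-college:          {precollege_share:.1f} points")
print(f"total:                {total:.1f} points")
```

These increments depend on the order in which blocks of variables enter the model, so they describe the contribution of each block given the blocks entered before it.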

Academic Performance

Academic performance is defined as Term GPA, Cumulative GPA and Cumulative units earned. While the three are related, the ultimate indicator of graduation is cumulative units one earned that apply to one’s degree completion requirement. Earned cumulative units in our analysis are the strongest predictor of graduation and time to degree. Cumulative GPA is important in earlier terms but after the 10th term it no longer has an effect on graduation.

Major Choice

When we examine the time-varying predictors of graduation, student decisions in academic majors heavily influence time to degree. All three variables in this group have a negative association with graduation. Those who are late to declare a major, who change majors or take up double majors/minors are less likely to graduate in a given term in the first few years in school.

A very important distinction between graduation and time to degree is emphatically illustrated by the number of majors or minors students choose. In general, better prepared and better performing students are more likely to choose additional majors and/or minors. Once we control other factors such as GPA and earned units, these students are less likely to graduate in a given term in the first 6 years. Double major/minor is the second strongest predictor, second only to earned cumulative units, in affecting probabilities of graduation in time.

It is important to understand that these students in fact have higher graduation rates, but they take longer to graduate than similar students without a second major or minor. This raises a paradox: while we encourage students to enrich their college experience by taking more majors and minors, we prolong time to degree for the very best students we have. In our data, 14.3 % of the students have double majors and/or minors.

Enrollment and Enrollment Intensity

Continued enrollment and enrollment intensity, including inter-session enrollment, are important to timely graduation, but just enrolling in school with a lot of units may not by itself lead to degree completion. The overall negative effect of enrolled term units is illustrative, relative to double majors/minors. Enrolling more units promotes graduation before the 9th term, but after that the effect turned negative. This is likely due to two reasons: students with double majors need more units to graduate, and enrolled units are not equal to earned units. The latter point is indicated by the effect of term GPA: higher GPA per term means more earned units, or a higher earned to enrolled units ratio, which leads to graduation and shortens time to degree.

Student Demographic Background

Male students, minority students and Pell grant eligible students are less likely to graduate in any given term in our analysis, even after controlling for student decisions, enrollment intensity and academic performance. These effects, however, gradually faded away after the 12th term, suggesting that although disadvantaged students are less likely to graduate within the first 6 years in school, their disadvantage largely disappears after that 6-year mark. The gap in graduation between URM students and non-URM students in today’s universities to a large extent reflects this time-varying effect, because we typically use 4 or 6 years to measure graduation but do not actively track graduation beyond 6 years.

First-generation college students (FGS) have a lower probability of graduation, but this effect becomes insignificant after controlling for academic performance, suggesting that first-generation status affects graduation mainly through students’ academic performance.

Academic Preparation

Students with a low high school GPA and those who are in academic remediation upon entry to college are less likely to graduate. However, we also noticed a compounding effect when cumulative earned college units are controlled: high school GPA became negatively associated, and remediation positively associated, with probabilities of graduation in any term in the first 6 years. Our subsequent analysis indicated that this was largely due to decisions made by better prepared students to take double majors and minors that add required units to graduation requirements. Because remediation course units are not counted toward college credit, students with more required units appear to have a lower chance of finishing their degree in any given term in the first 6 years. Similar to the effect of demographic background, after 6 years high school GPA and remediation no longer impact chances of graduation.

The effect of Pre-college experience is sensitive to model specifications (e.g. other variables that are included in the model) and unobserved heterogeneity. Pre-college experience is positively associated with graduation, but after accounting for academic performance its effect becomes negative. It turns positive again after UH is controlled. Across most terms Pre-college experience shows a positive effect.

Unobserved Heterogeneity (UH)

Our analysis controlled for unobserved heterogeneity, or unmeasured individual differences, among students. Our findings have three major points: (1) UH exists; (2) UH has time-varying effects (early terms and later terms), and (3) UH interacts differently with different variables. The effect of UH is large and significant, but only in the first ten terms.

UH’s effect varies by independent variables. When UH is added to the model, nine (9) of the 15 independent variables kept the same direction but showed larger effects; four (4) remained unchanged, and two (2) reversed their effect directions or changed significance status.

In summary, we found that beyond student background, major contributors to graduation and time to degree include student decision on academic major/minor and academic performance in form of cumulative earned units. Student background such as race, ethnicity, gender and pre-college academic preparation plays a role, but at a much smaller scale. In addition, student individual differences also affect chances of graduation. However, after the 6-year yardstick, most of these indicators are statistically insignificant.

Implications for Educators and Policy Makers

Implications Related to Pre-college Characteristics

Higher education institutions cannot effectively influence pre-college characteristics of their students, but they may be able to help them improve their decision-making and academic performance after they enter college. Our study shows academic performance is the most important factor affecting graduation and time to degree, and pre-college characteristics only explain a small portion of the total variation in graduation and time to degree. In addition, their effect gradually faded away after the 12th term. Disadvantaged students are less likely to graduate within the first 6 years in school, but their disadvantage largely disappears after that. The gap in graduation between URM students and non-URM students to a large extent reflects this time varying effect because we typically use 4 or 6 years to measure graduation, but stop tracking graduation beyond 6 years.

Implications Related to Major Choice

All three variables related to major choice (Major un-declared, Major change and Double major or minor) are negatively associated with graduation in any given term, and the effect is larger in the early terms. This signals the importance of early major advising in helping students to graduate in time. There is a close relationship between academic performance and having double majors/minors. Typically the high performers take double majors or minors. These students take longer to graduate than their counterparts who only have one major. Changing majors likewise inevitably prolongs the time needed to graduate. Major advising in early terms is an important step to help students choose their career wisely, and universities can also use total unit caps to regulate how often or how late a student can change majors.

Implications Related to Enrollment Intensity

Students who enrolled in summer/winter classes or enrolled in more units in regular terms are more likely to graduate. This effect is more pronounced in early terms (prior to the 7th or 8th term), indicating the importance of intensive enrollment in the first 3–4 years in school. Institutions should encourage students to enroll in summer/winter classes or enroll in more units in the early main terms to accumulate the units needed for graduation.

Implications for Researchers

We contributed to the literature by studying both time-dependent and time-independent variables, and our findings provide educators, advisors and administrators an important understanding that early terms in school are extremely important.

However, we need more in-depth research on the time-varying effects of time dependent variables. EHA provides the methodology to identify “the right things at the right time” for timely interventions. We need to implement interventions on university campuses in a timely manner and help students “do the right thing at the right time.” However, what is the right thing to do? When is the right time? To answer these questions, we need to track students across terms and integrate time-dependent factors to assess their time-varying decisions and performance over time. Future research also needs to take a deeper look at unobserved heterogeneity (UH), and its potential impact on estimation of the effects of other variables.

Limitations of the Study

This study is limited by data available for analysis. Students make many more decisions than choosing majors and determining enrollment intensity, but our data can only include these two. Neither do institutional data contain variables of student diligence or engagement. The variables we were unable to include in the analysis are largely external factors, originating from the environment outside a university campus. They include availability of financial aid, health issues, family needs, jobs, and other life events and options that may take students away from school or influence their decisions on enrollment.

We attempted to capture these unmeasured variables by controlling for unobserved heterogeneity using two discrete-time logit hazard models with normal frailty. We also looked at how UH altered the effects of independent variables by comparing the results from the models with and without the random effects. However, the unobserved heterogeneity we included in our analysis does not tell us exactly what these individual differences are, and we are unable to offer recommendations to offset their influence on graduation and time to degree.

Our study did not include stopout semesters in school. Since the conventional method of counting time to degree is based on elapsed time or calendar year, future research may benefit by including both enrolled terms and stopout terms in understanding time to degree.