Introduction

Attention to the ways in which families affect children has increased in recent years. However, rigorous empirical evidence has been somewhat slow to follow qualitative studies of parental involvement and its effect on childhood behavioral outcomes (Avvisati et al. 2010). This has been particularly true of involvement during the later years of school, such as high school; studies of the earlier grades have been more prevalent (Elder and Lubotsky 2006; Neidell 2003). Adding empirical evidence to the qualitative studies is nonetheless important for extending results on parental involvement to the high-school level, partly because these behavioral outcomes may lead to changes in long-term outcomes such as wages, employment, or extended incarceration. The present analysis was therefore constructed to augment the empirical, statistical-testing-based literature on the relationship between parental involvement and childhood outcomes at the high-school level.

One possible reason for the paucity of work at the high-school level may have been the difficulty in precisely defining parental involvement at this level. Additionally, inherent attitudes in the education and economics community may be to blame for this shortcoming in the literature. Specifically, many have acted as if, by the time children reach the age of high-school, parents no longer have a substantive impact on their lives. While evidence has been found for relatively small effects of parental involvement during high-school on cognitive or test score outcomes, there remains reason to believe that parents significantly impact the behavioral outcomes of their children at this same level through their involvement. While the previous literature has been more ambivalent regarding the role of parental involvement in the lives of high school students (Cunha et al. 2010; Segal 2008), the present analysis found a clear, rigorously statistically documented role for parental involvement that helped to improve high school students’ behavioral outcomes.

In this analysis, data from the first follow-up survey to the 1988 National Educational and Longitudinal Study, collected when students were in tenth grade, were employed to examine the effect of parental involvement on student outcomes, including getting in trouble at school, in-school suspension, and arrest. Parental involvement was initially measured in three different ways: (a) whether parents checked homework, (b) whether parents volunteered for the child’s school, and (c) whether the teacher reported the parents as “educationally involved.” This three-part approach, as opposed to examining just one measure of parental educational involvement, enabled a more precise determination of the effects of parental involvement, and somewhat mitigated measurement or over-reporting issues. Baseline estimates, followed by instrumented regressions, have been included.

Using the instrumentation strategy, there was some initial evidence for the compensatory, rather than the “enhancement” framework for parental involvement. Specifically, when parents selected into involvement conditional on the underlying quality of their child, those who decided to be involved with their child’s education may have chosen to do so to compensate for their child performing poorly in school or for exhibiting poor behavior. Alternatively, the parents may have exhibited a desire to provide an extra “boost” (enhancement) or perhaps receive moral rewards when their child was already doing well (Loughran et al. 2008).

In addition to being one of the very few papers to empirically examine the effect of parental involvement at the high-school level on behavioral outcomes in the United States, the present analysis is also unique in its use of a sound instrumentation strategy and associated statistical tests. Within this genre, papers that included tests of validity, relevance and power to supplement the Instrumental Variables results have remained relatively scarce. In effect, the ability to examine the issues of weak instruments, as well as the exclusion restriction in a rigorous statistical fashion makes this paper a clear value addition to the education literature on the topic.

The next section provides the background and motivation for this analysis. This is followed by a discussion of the data and relevant variables. Next, the theoretical and empirical strategy is presented, followed by an examination of the descriptive statistics and regression results. Finally, conclusions and recommendations for future research are presented.

Background

The question of heritability of skills has not been resolved. Plug and Vijverberg (2003) have asserted that up to 55–60 % of what has been termed “parental background” is inherited in the form of ability (“nature” or genetics), while the rest is transmitted through “nurture” factors that parents can supply, for example, via explicit provision of resources or parental involvement. In terms of income, Zimmerman (1992) has found 40 % heritability. On the other hand, parental involvement in a child’s education, in fact, seemed to matter more than liquidity constraints (Cameron and Heckman 2001). Parental involvement also appeared to have an effect in cases in which institutional, community and teacher characteristics do not tell the full story of why children succeed or fail in school (Ehrenberg et al. 1995; Goldhaber and Brewer 1997; Leibowitz 1977; Lemke and Rischall 2003). Parental involvement may work in combination with institutional factors, with, as an example, parents being more likely to volunteer conditional on the size of schools that their children attend (Gee 2011).

The outcomes of interest in most studies of parental involvement have been cognitive in nature (for example, test scores), with little attention paid to behavioral outcomes. It is also true that in both cognitive and behavioral-focused studies, the research has not been particularly rigorous in its structure or approach. The current analysis serves to amend this lack, with an instrumental variables approach using statistical tests for the exclusion restriction, as well as weak instruments.

Parental involvement relates to parental background characteristics, particularly parental income and family structure. In terms of income, lower-earning parents were found to be less likely to be involved in their children’s education (Chevalier and Lanot 2001; Cooper 2010; Jenkins and Schulter 2002). This may be due to a lack of knowledge, motivation, longer working hours, transportation challenges, varying discount rates of the children or some other parental inhibitions (Card 1999; West 2007), as well as some combination thereof. It is also possible that parents with lower abilities on the labor market may also possess lower parenting abilities (Mayer 1997). This might explain why, when low-income and low-education individuals make the choice to be involved, their involvement yields fewer returns to their child than the corresponding contributions of the high-income/education parents’ involvement (Canova and Vaglio 2010; Murnane et al. 1981). In terms of family structure, evidence has been found to suggest that single parent homes serve as an indicator for potential parental involvement (Cooper 2010; Jeynes 2005; Kalenkoski et al. 2007), and that single-parent homes often display lower levels of involvement, such as volunteering (Carlin 2001).

In terms of the numerical results in previous studies, the bulk of research in this area has found a much larger effect of parental involvement (perhaps thought of simply as “supervision”) on behavioral rather than cognitive outcomes of children at upper grade-levels (Avvisati et al. 2013; Glick and Sahn 2009; Segal 2008). These results have been particularly telling when trying to understand the effect of parental involvement in mandatory programs currently under consideration in schools (Hallam et al. 2004), since it did not appear that parent involvement at the higher levels (for example, high school, as in this study) actually improved cognitive outcomes (Balli et al. 1998), but it may have been useful in helping students in terms of behavioral outcomes. [Footnote 1]

As already noted, high school studies in the parental involvement literature showed different results from those conducted in primary schools. As the child enters high school, parental involvement tends to move away from purely school-based involvement which, incidentally, is more often measured. Involvement instead tends to take a form in which parents help their children in a less direct (and less school-based) fashion. Some examples include parents who subtly influence their children’s peer groups and help their children make decisions regarding which colleges to attend (Catsambis and Garland 1997; Patall et al. 2008; Perna and Titus 2005). Since the form of parental involvement evolves over the child’s life, the literature’s focus solely on primary school parental involvement does not reflect the situation in high school (see for example Aizer’s 2004 study of 10–14-year-olds). Additionally, not all forms of parental involvement even at a particular point in a child’s life will operate in the same fashion (Hill and Stafford 1980; McNeal 2001; Sui-Chu and Wilms 1996). This multi-dimensional and evolving nature of parental involvement provided an important impetus for this study’s three-pronged approach to parental involvement at the high school level.

Finally, from a community perspective, parental choices were couched within a framework of the community within which the child and parent reside, and higher quality neighborhoods have the potential to create complementarities which encourage and magnify the effects of basic parental involvement inputs (Patacchini and Zenou 2007; Perna and Titus 2005). This issue has been considered here both by controlling for community factors, as well as in structuring the “Instrumental Variables” section of the analysis.

The current study endeavors to address two gaps in the literature. The first is the paucity of rigorous statistical examinations of the effect of high-school-focused parental involvement on child behavioral outcomes; Houtenville and Conway’s (2008) analysis contained one of the very few instrumental variables structures in this literature, although, since only one instrument was used, no overidentification tests were possible. The second is the lack of careful consideration of the effects of family structure, income, neighborhood, and types of parental involvement. The current analysis addresses both gaps and demonstrates that there is a role for parental involvement at the high-school level in improving student behavioral outcomes, after controlling for the aforementioned elements. It appears that parental involvement, once estimated with an instrumental variables approach, is even more important in affecting student behavioral outcomes than baseline estimates would suggest.

Materials and Methods

Model and Empirics

Theoretical Framework for the Involvement Decision

For this section, background material has been drawn from Becker and Tomes’s (1976) work on the tradeoff between child quantity and child quality. Among the questions addressed by Becker and Tomes is whether parents chose to augment a child’s innate quality in a compensating or enhancing fashion (as defined earlier). For other related work, see also Becker and Tomes (1979) and Becker (1974).

Consider a general class of models where a parent’s utility \( U_{P} \) is a function of their leisure time (L), their income (y), and the amount they are involved with their children (V), assuming other types of household production can be outsourced:

$$ U_{P} = U_{P} (L,y,V) $$
(1)

Define income as \( y = wH + y_{0} \), where w is the wage, H is hours worked, and \( y_{0} \) is non-labor income. Time (T) is split between leisure (L), parental involvement hours (V) and hours worked (H): T = L + V + H. One particular formulation models the parent giving weight (α) to their own utility and weight (1 − α) to their child’s utility:

$$ U = \alpha U_{P} (L,y,V) + (1 - \alpha )U_{K} (V) $$
(2)

In (2), the parent gains utility from leisure, income and parental involvement directly. Essentially, parents enjoy goods, leisure and spending time being involved with—in order to help—their children. This may affect whether the child gets into trouble inside or outside of school (for example, arrests or in-school suspensions) as well as student achievement via, for example, cognitive outcomes.
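
As a hedged illustration, the first-order conditions implied by (2) can be sketched under assumptions that the text does not state explicitly: an interior solution, differentiable utility functions, and substitution of the constraints \( y = wH + y_{0} \) and \( L = T - V - H \) into (2). Choosing H and V then gives:

$$ \frac{\partial U}{\partial H}:\quad \alpha \left( w\frac{\partial U_{P}}{\partial y} - \frac{\partial U_{P}}{\partial L} \right) = 0 $$

$$ \frac{\partial U}{\partial V}:\quad \alpha \left( \frac{\partial U_{P}}{\partial V} - \frac{\partial U_{P}}{\partial L} \right) + (1 - \alpha )U_{K}^{\prime}(V) = 0 $$

Under these assumptions, the parent works until the wage-weighted marginal utility of income equals the marginal utility of leisure, and is involved until the weighted marginal utility of the leisure given up equals the weighted marginal benefit of involvement to parent and child combined.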

If a parent is involved in schooling, this could help reduce the behavioral problems a child faces and allow him or her to focus more attention on school work. In fact, this is exactly the premise tested in the empirical analysis. Specifically, the empirical analysis focuses on the effect of parental involvement in helping their children’s behavior. Presumably, the effect on behavior during high school will, as mentioned earlier, be more noticeable than it might be on test scores.

Parental involvement may increase the child’s potential income \( y_{k} \), possibly through better behavior or test scores. This is expressed linearly as \( y_{k} = \theta + \gamma V \). An even more general model would not require the linearity restriction and would allow potential income of the child to depend on her endowments, that is, innate abilities or skills/qualities: \( y_{k} = f(e,V) \), where e is the child’s endowment. The decision to be involved with the child can thus be expressed as:

$$ V = V(w,y_{0} ,e,\gamma ) $$
(3)

Presumably parental involvement is negatively related to the wage and positively related to non-labor income: \( \frac{\partial V}{\partial w} < 0,\quad \frac{\partial V}{\partial y_{0}} > 0 \). The first inequality should hold because of the opportunity cost of one’s time increasing with a higher wage, negatively affecting the chosen levels of non-work activities. The second inequality should hold because of the wealth increase resulting from an increase in non-labor income. In contrast, the relationship between V and child endowments (e) is indeterminate. The sign of the relationship with endowments depends on whether help is compensatory or enhancing, and the answer to this question depends on the marginal utility for the parent of compensating for the quality of a struggling child or enhancing the quality of a succeeding child. The relationship with γ can also be determined empirically.

The relationship between child outcomes and a parent’s decision to be involved with his or her education is thus complicated by the relationship between involvement and child endowments/abilities developed above. In the empirical section, a child’s skills and abilities are affected by how the child is performing in school and the child’s behavioral outcomes. It is presumed that when a parent is involved with a child’s education, the child may develop better characteristics for the future (less suspension in school, lower probability of getting in trouble) and thus ultimately possess a higher likelihood of succeeding.

The empirical section helped address the previous question by slightly altering the focus to students and their behavior (rather than parents and their choice of whether or not to be involved) and determining whether parental involvement helped improve student behavioral outcomes. The sign of the relationship in the empirical section clarified whether parental help was indeed positively related to enhancing student opportunities and skills. The change in the relationship after introducing an instrumental variables strategy was also a helpful technique by providing additional evidence as to whether parental help was enhancing or compensatory by comparing the magnitude of the IV and the OLS coefficients, with the additional caveat that such changes may instead, at least partially, reflect the nature and construction of the instruments at hand.

It is worth reiterating here that the instrumental variables chosen did not relate to the parent’s own child in particular, and thus did not reflect the parent’s opinions about, or goals for, their own child’s behavior. The instruments, parent self-reports of either “being involved in the neighborhood generally” or “being involved in a non-school club related to children,” do not relate directly to the parent–child relationship. This was obviously true for the first instrument (involvement in the neighborhood) and may have been true for the second instrument as well (involvement in a non-school club related to children).

In particular, there may have been reason for concern regarding the endogenous nature of parental involvement in school, since parents who were involved with their children’s education may have done so specifically because their children were doing poorly (or well) in terms of their behavior or test scores in school. However, that is not to say that this is the only reason for parental involvement. Parents were also likely to be involved with their children’s education and at their children’s school out of a desire to contribute their time and efforts out of a “broader” sense of moral obligation. This may have come from various types of personal preferences, and it was not the goal of the present analysis to disentangle these non-child-focused motivations, since they will largely be passed through to the child through the parent’s involvement with the child, or through other included demographic characteristics of the family.

It was, however, the goal of the instrumental variables strategy to tease out the parent’s desire for general involvement from the endogenously determined desire to help one’s child in particular due to the child’s poor (or superior) levels of achievement and behavior. By choosing instruments which relate to an individual’s community and outside groups, rather than solely relating to the child’s schooling and individual outcomes, it is possible to more clearly isolate the non-endogenously determined effect of parental involvement on the behavioral outcomes of the children.

If at least one of the instruments was exogenous and the joint instruments passed the exclusion restriction via the overidentification strategy, then the strategy was judged statistically sound, and there was no reason to be concerned about validity in instrument choice.
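
The logic of an overidentification check can be sketched on simulated data. The example below uses Sargan's test, one common choice; the text does not say which overidentification test was actually run, so this is an assumption. The function names, coefficients, and data-generating process are all hypothetical, and the outcome is a continuous index rather than a binary indicator to keep the sketch short.

```python
import math
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(direct_effect=0.0, n=20000, seed=1):
    """Hypothetical data: two binary instruments, one endogenous regressor.

    A nonzero direct_effect lets the second instrument shift the outcome
    directly, which violates the exclusion restriction.
    """
    rng = random.Random(seed)
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        e = rng.gauss(0, 1)                # unobserved propensity for trouble
        a = float(rng.random() < 0.5)      # neighborhood involvement
        b = float(rng.random() < 0.5)      # non-school children's club
        xi = float(a + b + 0.8 * e + rng.gauss(0, 1) > 1.0)  # school involvement
        yi = 0.5 - 0.2 * xi + 0.3 * e + direct_effect * b + rng.gauss(0, 0.3)
        y.append(yi); x.append(xi); z1.append(a); z2.append(b)
    return y, x, z1, z2


def sargan_test(y, x, z1, z2):
    """Return (2SLS slope, Sargan statistic, p-value) for one regressor, two instruments."""
    n = len(y)
    Z = [[1.0, a, b] for a, b in zip(z1, z2)]
    g = ols(Z, x)                                        # first stage
    x_hat = [sum(c * v for c, v in zip(g, row)) for row in Z]
    b0, b1 = ols([[1.0, xh] for xh in x_hat], y)         # second stage
    u = [yi - b0 - b1 * xi for yi, xi in zip(y, x)]      # residuals use actual x
    d = ols(Z, u)                                        # auxiliary regression
    fitted = [sum(c * v for c, v in zip(d, row)) for row in Z]
    u_bar = sum(u) / n
    sst = sum((ui - u_bar) ** 2 for ui in u)
    ssr = sum((ui - fi) ** 2 for ui, fi in zip(u, fitted))
    stat = max(n * (1.0 - ssr / sst), 0.0)               # n * R^2
    p_value = math.erfc(math.sqrt(stat / 2.0))           # chi-square, 1 degree of freedom
    return b1, stat, p_value
```

With two instruments and one endogenous regressor there is one overidentifying restriction, so the statistic is compared with a chi-square distribution with one degree of freedom. Calling `sargan_test(*simulate(direct_effect=0.3))` illustrates how a direct effect of one instrument on the outcome inflates the statistic. A large p-value does not prove validity; the test only has power when at least one instrument is exogenous.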

Main Data

The dataset used was the National Educational and Longitudinal Study (NELS) conducted by the National Center for Education Statistics (NCES). Information for the number of students in a school district and the total revenues per student in a school district used data extracted from both the 1991–1992 Census Survey of Finances (F33 survey) and the 1990 Common Core of Data (CCD). One of the major strengths of this individual-level dataset was the careful procedures employed in data collection and nationwide random sampling techniques which were rarely present in other large datasets of this type.

The NELS dataset was, in fact, chosen for this analysis due to its large size and particular richness in documenting the various types of parental involvement and school volunteering. The study began in 1988 with a random sample of students in a random sample of schools in the United States. It surveyed the same students (starting in eighth-grade) every 2 years up through college and into their working years. Throughout the survey process, compositional changes from the initial random sample occurred due to attrition and non-randomized freshening of the sample. For this reason, probability weighting for inclusion in the first wave of the survey was employed in each part of the current analysis.
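
As a minimal sketch of what probability weighting does, the example below computes a weighted share of students with a given outcome, weighting each student by the inverse of an inclusion probability. The function name and the toy probabilities are hypothetical; the actual NELS weight variables and their construction are not described in the text.

```python
def weighted_share(outcomes, inclusion_probs):
    """Share of students with outcome == 1, weighting each student by the
    inverse of a (hypothetical) probability of inclusion in the sample."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


# Toy data: the first student was half as likely to be sampled as the
# others, so that response receives twice the weight.
share = weighted_share([1, 0, 0], [0.5, 1.0, 1.0])
```

The same weights would carry over to regression by weighting each observation's contribution to the estimating equations.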

The 1990 NELS data were termed the first wave (F1) when the individuals of the sample were in 10th grade. The number of students in the NELS during the F1 wave was 27,508. This number was reduced due to sample non-response, as well as the focus in the present analysis on public school students. During the first wave, the largest number of school-related variables, parental involvement variables, and student-level data were collected. Because of the greater data availability in F1, this wave was chosen as the focus of the current analysis. Data availability, in addition to differences in question wording between waves making the questions non-comparable, also provided the impetus for a cross-sectional rather than a longitudinal analysis.

As a final note, the NELS, by its construction, is retrospective, in which individuals answered questions regarding the previous year each time they were surveyed. While this allowed for some inaccuracies in terms of memory, the alternative time-diary data posed more problems in structuring a general sense of involvement of parents with children, and was one of the reasons to instead use this type of dataset.

Variables

Outcome Characteristics

The behavioral variables of interest were coded as categorical responses in the NELS. For example, the question on arrest was: “were you arrested 0 times? 1–2 times? 2 or more times?” Variables were re-coded to binary status so that an individual was considered to have either been arrested at least once, or never. This simplified interpretation and reduced problems of misreporting and incorrect recall. The NELS students answered questions regarding delinquent behavior in each wave. Several representative questions on delinquency were used as the behavioral outcomes of interest, that is, arrest, in-school suspension, and whether the child got in trouble at school. [Footnote 2]
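
The binary re-coding described above can be sketched as follows. The category labels are taken from the quoted question but their exact spelling in the data file is an assumption, as are the function and variable names.

```python
NEVER = "0 times"  # hypothetical label for the lowest response category


def recode_binary(response):
    """Map a categorical frequency response to 0 (never) or 1 (at least once).

    Missing responses (None) stay missing rather than being counted as 0.
    """
    if response is None:
        return None
    return 0 if response.strip().lower() == NEVER else 1


responses = ["0 times", "1-2 times", "2 or more times", None]
arrested = [recode_binary(r) for r in responses]
```

Collapsing all nonzero categories into one value discards information on frequency, which is the trade-off the text accepts in exchange for robustness to misreporting.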

Parental Involvement

In a general sense, parental involvement here was similar in concept to the Harvard Family Research Project definition of “parents are engaged in the education of the child while they are of school age” (see for example Murnane et al. 1981). This was distinct from purely school-based involvement. It is also true that, as Jeynes (2005) noted, there were many different possible ways to measure parental involvement. These different forms may have reflected different aspects of the parent–child relationship as well as parental income, education, and background (Harvey 1996; Monna and Gauthier 2008). There may also have been different types of problems in terms of over-reporting, misremembering, or misrepresentation which varied based on the reporting source. For this reason, results were reported after employing information on different types of involvement as well as from different sources, in this case parents, teachers and children (Monna and Gauthier 2008). The use of a three-part structure of different types of parental involvement helped determine how different forms of parental involvement mattered and ensured that the reporting source did not completely determine the results.

The specific variables used for parental involvement included (a) teacher reports of parental involvement in the child’s education, (b) child reports of how involved a parent was in checking his or her homework, and (c) parent reports of whether either parent chose to volunteer time in the classroom. [Footnote 3] As stated previously, the variables for parental involvement were recoded as binary. This strategy has positive as well as negative implications. Specifically, interpretability of the results was limited to the extensive rather than the intensive margin (see Reich 2013 for an examination of these differences for involvement), and tests of the elasticity of substitution of time and involvement were not possible (Fenstermaker 1996; Cunha et al. 2010). However, issues regarding the simultaneous use of time in involvement and other categories, as well as misreporting, particularly by parents (Monna and Gauthier 2008), were alleviated. As a final note, because of the binary construction of parental involvement measures, non-classical measurement error and its alleviation were addressed in the Appendix.

Control Variables

The family factor most commonly used to explain an individual’s decision to volunteer, and to be involved with the child’s education, was socioeconomic status (SES), and to a similar extent, education (Gibson 1999; Janoski and Wilson 1995). In the current analysis, SES and education were used as independent (right-hand side) variables in both the OLS and IV regressions. [Footnote 4] Family structure (single-parent, dual-parent home, presence of grandparents, and so forth) and number of siblings, as indicators of upbringing and the general family climate, may have had important effects on parental involvement and volunteering (Angrist and Evans 1998; Baydar et al. 1999; Douthitt et al. 1990; Maume 2011; Painter and Levine 2004). Although not all authors agreed regarding the importance of the number of siblings in particular as a control characteristic (Black et al. 2005), the analysis here followed the majority and included number of siblings as a control, along with other family structure factors.

All of the aforementioned characteristics merited particular attention, as outlined in the previous section, and were used as right-hand-side variables in the regression analysis. [Footnote 5] The employment statuses of the mothers and fathers were considered as additional control factors; however, these were ultimately not employed due to a lack of significant effect or evidence of an omitted variables bias problem. [Footnote 6] Means and standard deviations of these variables were, however, displayed in the summary statistics table.

Additional family and individual characteristics used as right-hand-side variables in the regression analysis were binary indicators for race, which might have had a relationship with both student outcomes and parental involvement (Delgado and Canabal 2006; Desimone 1999), as well as child gender, since outcomes (and possibly parental involvement) may vary by gender, with males more likely to have experienced negative behavioral outcomes (Monna and Gauthier 2008).

School and area characteristics included the average income in the school zip code, the percentage of 12–17-year-olds living above the poverty line in the school zip code, revenues per student in the district, district-level enrollment, and the number of families and students in the school’s zip code. All of these control variables were used throughout the analysis as family and school controls even if they were not explicitly displayed in the regression results.

Instrumental Variables

Although parents may have elected to be involved with a child’s education directly via school-based activities (as measured in this analysis through teacher reports), volunteering (parental measures used), or checking homework (child reports employed), these measures of involvement may have underestimated the true impact of parental involvement for several reasons. The first reason to suspect that these school-based measures underestimated true parental involvement with the child and the parent’s effect on children’s behavior is that these measures were taken when the child was already in high-school. During high-school, rather than earlier grades when parents were more involved through the schooling route, parents may have chosen to help children succeed in ways not as obviously observable as is volunteering (for example). Parents may instead have chosen to exercise their supervision in a less direct, school-focused way. A parent’s tendency to be involved more generally and not just in a school-based fashion is thus proxied in the data through measures of parents (1) being involved in a non-school club related to children and (2) being more generally involved in the neighborhood. These instrumental variables from the NELS were re-coded into binary variables from categorical responses.

It was, therefore, posited that parents displaying higher neighborhood involvement generally, as well as higher involvement in non-school children’s clubs (not necessarily their own children’s clubs), were more likely to be involved with their children. Using these two instruments, therefore, helped provide a better measure of how much parents were involved with the success of their children.

The second reason initial measures of parental involvement were improved by employing instruments was that parents were more likely to become involved in a school-based fashion in a compensatory or enhancing way. Specifically, when parents became involved due to their children doing poorly (or well), this was termed a compensatory (enhancing) form of involvement. As a result of this directed type of parental school-based involvement, baseline Ordinary Least Squares estimates employing school-based measures of involvement were biased up (down), since the involvement was enhancing (compensatory) in nature. One way to address this concern was to employ a measure which reflects a parent’s tendency to be involved in a non-school based fashion and not directly related to their own child, as explicated above. Also, it was possible to determine, by comparing the Ordinary Least Squares with the Instrumental Variables estimates, whether there was evidence for the compensatory or enhancing models of involvement.

Employing measures of parental involvement out of context from the community did not fully capture the relationship between parents and the communities in which they resided in affecting involvement. Employing the instruments listed above helped to alleviate this concern by couching the discussion of parental involvement within the framework of the community in which parents reside.

As a final note, it was always true, particularly in education-based studies, that instruments can reflect other unmeasured characteristics of parents and children which may have independently affected the outcome, that is, that they were non-excludable. Significant efforts were made to preclude this possibility. From a theoretical standpoint, it was difficult to imagine that any of these unmeasured factors would have influenced child behavior without the parent passing them on through some type of involvement with their child. Thus, while instruments for parental involvement may have indeed reflected many other factors which benefited their children’s behavior, these will generally have been passed on through involvement with the child. Barring that, these factors may have been passed on through changes in socioeconomic status, family structure, or generally through the other factors which were included in the first stage of the analysis. As a result, although non-excludability of instruments was possible, it was more likely that they affected child outcomes through their relationship with parental involvement or the other demographic factors previously introduced as controls.

Empirical Modeling and Background

The behavioral outcomes of a child were modeled as a function of school and area factors (SchoolChar), family structure (FamilyStruct), child and parent background factors (ParentBack), and parental involvement (ParentInvolv). Specifically, the linear regression model for student i in school j was:

$$ Outcome_{i,j} = \alpha_{0} + \beta_{1} SchoolChar_{j} + \beta_{2} FamilyStruct_{i} + \beta_{3} ParentBack_{i} + \beta_{4} ParentInvolv_{i} + u_{i,j} $$

Here Outcome referred alternately to whether the child got in trouble, was arrested, or had an in-school suspension, and ParentInvolv referred to the three measures of involvement separately employed (checking homework, teacher reports of parent involvement, or parental school volunteering).
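
A baseline regression of this form, with a binary outcome and a linear specification, amounts to a linear probability model. The sketch below fits one by ordinary least squares on a handful of made-up rows; the column layout, values, and names are hypothetical and stand in for the much larger set of controls described in the text.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical rows: [intercept, revenue per student (thousands),
#                     single-parent home, parent involved]
X = [
    [1.0, 5.0, 0.0, 1.0],
    [1.0, 6.5, 1.0, 0.0],
    [1.0, 4.2, 0.0, 0.0],
    [1.0, 7.1, 1.0, 1.0],
    [1.0, 5.8, 0.0, 1.0],
    [1.0, 6.0, 1.0, 0.0],
    [1.0, 4.9, 1.0, 1.0],
    [1.0, 6.8, 0.0, 0.0],
]
in_trouble = [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]  # made-up binary outcome

coefs = ols(X, in_trouble)
beta_involvement = coefs[3]  # change in probability associated with involvement
```

In a linear probability model, the coefficient on the involvement indicator reads as a difference in the probability of the outcome, holding the other regressors fixed. The toy values carry no empirical meaning.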

There were several issues with this baseline model, including:

  • Measures of parental involvement employed were typically school-based, whereas parental involvement in high school tends to have already shifted to a more general, not purely school-based form.

  • Parent involvement occurred within the framework of the community in which it proceeded, with more tightly knit communities being more likely to have exhibited higher levels of volunteering and involvement, as described previously.

  • Parental involvement may have occurred in an endogenous fashion, with parents being educationally involved more for students who were either doing well already (enhancing) or those doing poorly (compensatory), thereby biasing the coefficients of the regression.

Because of these problems with the initial baseline regression, an instrumental variables strategy was employed. Measures of parental involvement with the community (that is, whether parents were involved with their neighbors) and with other child-groups (that is, whether parents were involved in another non-school group focused on children, not necessarily their own) were employed as instruments for the baseline measures of involvement and volunteering. The outlined strategy of instrumentation helped to solve the problem of endogeneity since:

  • This involvement may have been a truer reflection of a parent’s propensity to be involved. It was not used alone, however, since it did not guarantee that parents who were involved with their communities and clubs were actually involved with their children, while the previous (although endogenous) measures did guarantee “school-based” involvement.

  • Employing a measure of how involved parents were in the community helped to contextualize the relationship and test to see the true effect of the involvement after taking into account the relationship between parental decisions and the community in which they occur.

  • If parent involvement was more likely when children were doing poorly (well), then using a measure of whether parents were just “generally” involved helped to fix this endogenous selection and, in fact, a comparison of the effect of parental involvement in the baseline versus the instrumented case assisted in determining whether help is compensatory or enhancing.

If help was enhancing, then the effect of parental involvement should have decreased in the instrumented regressions, with the opposite occurring if help was compensatory. However, because the instruments also addressed community context and captured a more appropriate measure of the type of involvement occurring, it was difficult to determine whether the endogeneity correction was the only effect at play. For instance, an increase from the OLS to the IV regressions could have arisen because help was compensatory, but also because the instruments captured a more accurate measure of parental impulses and carry-through of involvement.
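The comparison between OLS and IV estimates can be illustrated with a small simulation. The sketch below is not the paper's estimation: it uses one hypothetical binary instrument, continuous involvement and outcome variables, no controls, and made-up parameter values. Under compensatory selection (parents become more involved when an unobserved risk of misbehavior is higher), the OLS slope is pulled toward zero, while the simple IV ratio recovers the assumed effect.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def simulate(n=50000, true_effect=-0.10, seed=0):
    """Return (OLS slope, IV slope) under compensatory selection.

    All data are simulated; none of the numbers come from the paper.
    """
    rng = random.Random(seed)
    general, involv, trouble = [], [], []
    for _ in range(n):
        z = 1.0 if rng.random() < 0.5 else 0.0  # hypothetical instrument: parent is "generally" involved
        risk = rng.gauss(0, 1)                  # unobserved tendency of the child to misbehave
        # Compensatory selection: involvement rises when the child is at higher risk.
        x = 1.0 * z + 0.5 * risk + rng.gauss(0, 1)
        # Outcome: more involvement lowers trouble, higher risk raises it.
        y = true_effect * x + 0.3 * risk + rng.gauss(0, 1)
        general.append(z)
        involv.append(x)
        trouble.append(y)
    ols = cov(involv, trouble) / cov(involv, involv)    # biased by the omitted risk term
    iv = cov(general, trouble) / cov(general, involv)   # single-instrument IV (covariance ratio)
    return ols, iv


ols_slope, iv_slope = simulate()
print(f"OLS slope: {ols_slope:.3f}  IV slope: {iv_slope:.3f}")
```

In this setup the IV slope is larger in magnitude than the OLS slope, which is the pattern the text associates with compensatory help. Reversing the sign of the risk term in the involvement equation would mimic the enhancing case.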

In order to determine whether the instruments (Z) were valid, that is, Cov(Z_i, u_i) = 0, relevant, that is, Cov(Z_i, ParentInvolv_i) ≠ 0, and sufficiently powerful, several statistical measures were employed. Specifically, the Cragg-Donald Wald F-statistic was used to check power/quality, Hansen’s J-statistic was used to check validity, and the Kleibergen–Paap rk LM statistic was used to check relevance. This was especially important due to, for example, the concern that community context might have invalidated the instruments if parental involvement in non-school groups or in the neighborhood had an independent effect on the success of the child. Therefore, the validity check that Cov(Z_i, u_i) = 0 was critical. Results showed that this instrumentation strategy alleviated concerns regarding endogeneity, context, and type of involvement at play for this age group without the instruments becoming irrelevant, invalid or overly weak. Results were also similar using a Limited Information Maximum Likelihood strategy, showing that they were indeed robust (see Angrist and Pischke 2009 for more detail).
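For reference, the instrumented model can be written as a two-equation system. This is a sketch in notation introduced here, not taken from the original: $Neighbor_{i}$ and $ChildGroup_{i}$ stand for the two instruments, $X_{i,j}$ collects the controls (SchoolChar, FamilyStruct, ParentBack), and $\pi$, $\gamma$, $\delta$, and $v_{i,j}$ are assumed symbols.

$$ ParentInvolv_{i} = \pi_{0} + \pi_{1} Neighbor_{i} + \pi_{2} ChildGroup_{i} + \gamma' X_{i,j} + v_{i,j} $$

$$ Outcome_{i,j} = \alpha_{0} + \delta' X_{i,j} + \beta_{4} \widehat{ParentInvolv}_{i} + u_{i,j} $$

Here $\widehat{ParentInvolv}_{i}$ is the fitted value from the first equation. In this notation, relevance requires $(\pi_{1}, \pi_{2}) \neq (0, 0)$, while validity requires that both instruments be uncorrelated with $u_{i,j}$.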

Results and Discussion

Trends in the Data

Summary statistics for all the variables used in the analysis can be found in Table 1. Variables were listed by category (control, involvement, instrumental and outcome) and were further stratified on the sample of interest (“balanced” or “full”). The balanced columns contain statistics for 10th grade students from public schools with non-missing values for all relevant control, instrumental, outcome and involvement variable questions. The full sample was used to construct the minimum and maximum values and these values were generally similar for the balanced sample. The full sample did not perfectly correspond to the analysis in Table 2, since the number of observations necessarily varied between the different regressions. In addition, t tests of differences of means for each variable revealed that means were significantly different at p values of 0.01 or better other than for (a) gender, (b) whether parents checked homework, (c) arrest, and (d) getting in trouble at school. Specifically, arrest was different at the 5 % level but not at the 1 % level (p = 0.028), gender was different at the 10 % but not the 5 % level (p = 0.071), while the means for parents checking homework and children getting in trouble at school were not different even at the 10 % level (p = 0.6246 and 0.2281, respectively).
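As a quick check on how the reported p values map onto conventional significance levels, the following sketch classifies each one. The helper name `strictest_level` is hypothetical, and the pairing of 0.6246 with checking homework and 0.2281 with getting in trouble assumes the values are listed in the same order as the variables.

```python
def strictest_level(p_value, levels=(0.01, 0.05, 0.10)):
    """Return the smallest conventional level at which p_value rejects, or None."""
    for level in sorted(levels):
        if p_value <= level:
            return level
    return None


# p values as reported in the text for the four exceptions
reported = {
    "arrest": 0.028,
    "gender": 0.071,
    "parents checked homework": 0.6246,
    "trouble at school": 0.2281,
}

for variable, p in reported.items():
    level = strictest_level(p)
    label = f"significant at the {level:.0%} level" if level else "not significant at the 10% level"
    print(f"{variable}: p = {p} -> {label}")
```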

Table 1 Summary statistics
Table 2 Effect of parental involvement measures on student behavior

An examination of Table 1 reveals several points of important difference between the balanced and full groups. These differences help explain the need for a separate robustness regression in Table 3 (balanced group) that examined the comparability of results to Table 2. Note that Table 2 did not condition on individuals answering any particular questions (although, as a practical matter, it required that certain questions were answered in order to run the regression of interest and use the same number of observations for each OLS regression and the corresponding IV regressions).

Table 3 Effect of parental involvement measures on student behavior

In Table 1, the full group was comprised of just slightly fewer females than males, with 49 % of the full sample versus 51 % of the balanced sample being female, although this difference, as mentioned earlier, was not statistically significant. It is also clear that the balanced sample had fewer extreme values on the education front, with a lower proportion than the full sample either having less than high school (6 vs 11 %) or receiving a Ph.D. or M.D. (4 vs 6 %). As a result, the balanced group had a higher fraction of parents either partaking in some college, finishing college, or receiving an M.A. The fraction that finished high school was nearly identical for both groups. It was also true that the balanced group possessed a slightly higher Socioeconomic Status (0.045 vs −0.02). The racial composition appeared to be slightly less diverse generally in the balanced sample with 4 % Asian (vs 7 % full), 7 % Hispanic (vs 14 % full), and 8 % Black (vs 11 % full). The families in the balanced sample had somewhat fewer children (1.534 siblings vs 1.596 in the full sample) and tended to reside more in the typical nuclear family (75 % have a mother and father at home vs 65 % in the full sample). The balanced sample was slightly more likely to have both parents employed (Mother: 93 vs 90 %; Father: 97 vs 96 %). Overall, parents in the “balanced” sample (that is, those who answered all questions used at any point in the regressions) were slightly wealthier and less diverse than the overall sample. Interestingly, they did not tend to have as many higher degrees.

The balanced sample parents were also more involved in their children’s education and in their communities (77 % involved vs 74 %; 83 % check homework vs 82 %, although this difference was not statistically significant; 29 % volunteer vs 25 %; 32 % involved in a non-school organization to help children vs 25 %; 83 % involved in their neighborhood vs 77 %). Their children, unsurprisingly, had somewhat better behavioral outcomes as well, with only 42 % (vs 44 %) getting in trouble, 10 % having an in-school suspension (vs 12 %), and just slightly fewer arrests (2.6 vs 3.3 %), although, as noted earlier, the difference for getting into trouble was not statistically significant and the difference for arrest was significant only at the 5 % level.

The communities where individuals in the balanced sample resided were somewhat lower income ($29,000 vs $32,000) areas with a slightly higher fraction in poverty (7.3 vs 6.9 %). Their schools had slightly lower revenues per student (5.5 vs 5.7) but their school districts were substantially smaller (19,000 vs 37,000).

Overall, we may conclude that parents who answered all relevant questions represented a somewhat selected fraction of the population, that is, a group that possessed higher income, was less diverse, displayed more involvement, and achieved better student outcomes individually, but resided in communities that were slightly less wealthy. This points to the need to check comparability conditional on whether observations included all or only some answers to the questions of interest for the current analysis and explains the later use of regressions in Table 2 versus 3.

Regression Analysis

Turning next to Table 2, the effect of parental involvement on three separate measures of student behavior was computed. Each column area represents a different outcome measure (arrest, in-school suspension, getting in trouble) while the three horizontal areas show different measures of parental involvement (checking homework, teacher reported involvement, and volunteering). Each measure of parental involvement was used in a separate regression due to the possibility of high collinearity between the various measures and the difficulty in teasing out the true effect of each of the measures. It is also interesting to determine the effect of each of these variables and to compare their coefficients. In total, this table contains eighteen different regressions (nine Ordinary Least Squares, and nine Instrumental Variables).

Each regression yielded the coefficient and t-statistic on the parental involvement measure, and was displayed along with the number of observations in the regression. The instrumental variables regressions in the “IV” columns used two instruments jointly (parents were involved in the neighborhood and parents were involved in an outside group helping children), although the regressions employing only one of the instruments were included for comparison in Table 4 in the Appendix and provided a similar pattern of results. All control variables were used in these regressions (coefficients suppressed for brevity) and all regressions were run using probability weights for sample inclusion, and standard errors were clustered at the school level to retain the most conservative results. The IV columns additionally contained tests for the relevance, validity, and power of the instrumentation strategy through the use of the Kleibergen–Paap rk LM statistic, Hansen J statistic and the Cragg Donald F statistic, respectively.

Here, and with the similar test conducted in Table 3, it is important to recall that the Hansen J statistic is an overidentification test—similar to the Sargan test. Essentially, as long as the argument has been put forth that at least one of the instruments was valid—and in this case, it was more straightforward to make the case for the involvement “generally” of a parent in their neighborhood as the clearer exogenous variable—then the overidentification test functioned in the stated fashion, as a test of the exclusion restriction. The relevance of the chosen instruments was also empirically documented in Table 5 in the Appendix.

Examining Table 2, it is clear that parental involvement had a positive impact on children’s behavioral outcomes. More parental involvement led to a lower likelihood of arrest, suspension or getting in trouble. This was true for all eighteen regressions at the 10 % level, with fourteen of the eighteen regressions additionally significant at the 5 % level or better. The effect varied across the measures of parental involvement (checking homework, being involved at school, and volunteering), each having a clear impact but a relatively small coefficient in some cases: the OLS coefficients showed a 1–3 % impact on arrest, a 2–7 % impact on likelihood of suspension and a 2–13 % impact on getting in trouble.

It is also clear that, when comparing the types of parental involvement to determine their effects, parental volunteering seems to have provided a relatively smaller magnitude of impact, with either parental involvement as measured by teachers, or perhaps parents checking homework, having had a larger impact—and teacher measured parental involvement generally displayed larger effects. This is consistent with less school-focused involvement having mattered more at the high school level, as evidenced by the larger impact of teacher-measured parental involvement, which may have picked up some non-school based involvement seen by the teachers as well. It is also true that all three of the measures show an effect of parent involvement on student outcomes, implying that while type of involvement (as well as reporting) mattered, it does not appear that varying either of these categories made the effect of parental involvement disappear.

When moving to the instrumental variables regressions, the impact of parental involvement was still negative and significant, and the magnitude was greatly increased in many cases. It is notable that all of the instrumental variable regressions achieved significance at least at the 10 % level, and eight of the nine regressions additionally reached significance at the 5 % level or better. The size of the coefficients, however, appeared somewhat larger than expected, with the effect of teacher reported parental involvement on arrest at an 18–50 % decrease in arrest likelihood, suspension a 41–122 % effect and trouble a 55–150 % effect. Clearly, there was some issue in terms of the range of effects in the IV regressions due to the linearized nature of the regressions over-predicting beyond the range of what is feasible (that is, above one in a binary outcome) (see Footnote 7). However, it is also notable that in none of the cases did the instrumental variables regressions move the analysis downwards towards a lack of, or a negative, impact of parental involvement on student behavioral outcomes.
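One way to see how a linear IV estimate for a binary outcome can exceed 100 % in magnitude is the single-instrument Wald ratio: the difference in outcome rates across instrument groups divided by the difference in involvement rates. The sketch below is a simplified illustration, assuming one binary instrument and no controls (the regressions above use two instruments jointly with controls). The rates are hypothetical and are not taken from Tables 2 or 3.

```python
def wald_iv(outcome_z1, outcome_z0, involv_z1, involv_z0):
    """Wald IV estimate for a binary instrument.

    outcome_z1 / outcome_z0: share of children with the outcome when the
        instrument equals 1 / 0.
    involv_z1 / involv_z0: share of involved parents when the instrument
        equals 1 / 0.
    """
    reduced_form = outcome_z1 - outcome_z0  # effect of the instrument on the outcome
    first_stage = involv_z1 - involv_z0     # effect of the instrument on involvement
    if first_stage == 0:
        raise ValueError("instrument is unrelated to involvement; the ratio is undefined")
    return reduced_form / first_stage


# Hypothetical shares: trouble falls by 6 points, involvement rises by 5 points.
effect = wald_iv(outcome_z1=0.40, outcome_z0=0.46, involv_z1=0.80, involv_z0=0.75)
print(f"Implied IV effect of involvement on trouble: {effect:.2f}")
```

Because the denominator is a small difference in involvement rates, a modest reduced-form difference scales up to an implied effect beyond the unit range of a probability, even though every input lies between zero and one.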

It is also true that the instruments performed generally well on tests of power, relevance and validity. The Cragg-Donald F-statistic was ten or higher (ten being the rule of thumb for power with one endogenous regressor) for all three of the parental involvement measures. The Hansen J-statistic for instrument validity uses as the null the hypothesis of instruments being uncorrelated with the error term of the regression (that is, “exogeneity”). The null hypothesis failed to be rejected at any conventional level of significance, as evidenced by the high p values in all nine IV regressions. The regressions also all performed well on relevance, that is, whether there was a relationship between the endogenous variable and the instrument, as seen by p values uniformly rejecting the null of no relationship at the 1 % level of significance. The instruments were thus well-chosen generally for these regressions, with some question remaining regarding the instrument power in the case of the teacher measure of parental involvement (see Footnote 8).
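The rule of thumb of ten can be illustrated with a plain first-stage F-statistic. With a single endogenous regressor and homoskedastic errors, the Cragg-Donald statistic reduces to this F-statistic on the excluded instruments. The sketch below is a simplified stand-in, not the paper's computation: it assumes two simulated binary instruments and ignores the controls, probability weights, and school-level clustering used in the actual regressions. The function names and parameter values are hypothetical.

```python
import random


def first_stage_f(x, z1, z2):
    """F-statistic for H0: both instrument coefficients are zero in x ~ 1 + z1 + z2."""
    n = len(x)
    mx, m1, m2 = sum(x) / n, sum(z1) / n, sum(z2) / n
    dx = [v - mx for v in x]
    d1 = [v - m1 for v in z1]
    d2 = [v - m2 for v in z2]
    # Cross-products of the demeaned variables
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1x = sum(a * c for a, c in zip(d1, dx))
    s2x = sum(b * c for b, c in zip(d2, dx))
    sxx = sum(c * c for c in dx)
    # Solve the 2x2 normal equations for the two slopes
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1x - s12 * s2x) / det
    b2 = (s11 * s2x - s12 * s1x) / det
    explained = b1 * s1x + b2 * s2x
    residual = sxx - explained
    k = 2  # number of excluded instruments
    return (explained / k) / (residual / (n - k - 1))


def simulate_f(strength, n=2000, seed=1):
    """First-stage F for simulated data where each instrument shifts x by `strength`."""
    rng = random.Random(seed)
    z1 = [float(rng.random() < 0.5) for _ in range(n)]
    z2 = [float(rng.random() < 0.5) for _ in range(n)]
    x = [strength * a + strength * b + rng.gauss(0, 1) for a, b in zip(z1, z2)]
    return first_stage_f(x, z1, z2)


strong_f = simulate_f(0.5)
weak_f = simulate_f(0.02)
print(f"strong instruments: F = {strong_f:.1f}; weak instruments: F = {weak_f:.1f}")
```

Instruments that shift involvement substantially produce an F-statistic well above ten, while nearly irrelevant instruments produce one near the null expectation of about one.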

Turning next to Table 3, it is possible to determine whether there was, in fact, an effect of conditioning on being in the much smaller balanced sample. The layout of this table was the same as Table 2, with the additional constraint that all regressions employed the balanced sample (as explained in the summary statistics section), as evidenced by the consistent number of observations at 2,781. It is apparent from this table that the results are quite similar in nature to those in the previous table. This observation is important, since it suggests that selection into the sample based on nonresponse did not bias the results (see Footnote 9).

Regarding the tests for relevance, validity, and power, the most notable difference in this table versus Table 2 is that, in the case of teacher measures of parental involvement (row area two), there is now a reason to be concerned with the problem of weak instruments. It is also true that the Hansen J-Statistic rejected the null at the 10 % level, but not at the 5 % level in the case of parental volunteering or involvement and the outcome of children getting into trouble. The general pattern of results is, however, similar to those in Table 2 with, perhaps unsurprisingly given the smaller number of observations in nearly all of the regressions, larger standard errors and lower test statistics in some instances. It is also true that, while some concerns may have been raised by lower than expected test-statistics here, the first stage regression results and single-instrument results provide additional evidence regarding the strength and relevance of the chosen instruments.

Turning next to the OLS coefficients in Table 3, it is clear that while the effect of parental involvement changed from the Table 2 sample (arrest: 1–2 vs 1–3 % in Table 2; suspension: 4–8 vs 2–7 % in Table 2; trouble: 2–14 vs 2–13 % in Table 2), these changes are within reason, and not improbable given the much smaller sample in Table 3. The IV results also displayed a similar pattern to Table 2 with generally significant values and a similar range. It is also notable that arrest seems to have the lowest levels of significance in the case of IV, as opposed to the outcomes of in-school suspension and getting in trouble.

In summary, both the OLS and the IV results show a clear positive impact of parental involvement, whether measured by teachers, parents, or children, using several measures of parental involvement in the neighborhood and in the student’s life, with a stronger impact from the indicators not based in school. It is also true that while there were some differences between the observations of individuals who answered all the relevant questions and those who did not, the aforementioned results hold true for both classes of individuals. From this evidence, there is reason to believe that parental involvement in the education and lives of children in high school made a clear difference in the behavioral outcomes of children over and above the characteristics of the parents and the schools which the children attended. Thus, there is some room for parental involvement even apart from examining the other factors that parents contribute to the education of children. The baseline estimates of the effect of parental contributions on the behavioral outcomes of students also appear to be underestimated relative to those obtained with the broader measures of parental involvement and interest in helping children. This lends some tentative support to the concept that parents become more involved when children are doing poorly and, therefore, the effects of involvement on children’s lives are underestimated (that is, the compensatory model of involvement), although other interpretations of this result, as mentioned earlier and in the Appendix, are still possible.

Conclusions

Certainly, parents generally choose to be involved in some way in their child’s education. This can range from helping the child with her homework to volunteering at the child’s school. The types of involvement change as the child gets older, and so a unique strategy is necessary to weigh the effect of parental involvement by the time children reach high school. It is also true that parents may select into involvement conditional on the underlying quality of their child, and may also be influenced by community factors.

Due to the aforementioned structure, an instrumental variables strategy was employed to compare the baseline regression effects of a tripartite measure of parental involvement to the instrumented version. In both versions, it is clear that there is a place for parental involvement. While the effect may not be extremely large compared to other related measures in the literature, it does seem more consistently present than previously imagined (Cunha et al. 2010; Segal 2008). The effect of parental involvement also increases in the instrumented version relative to the baseline Ordinary Least Squares regressions. These changes may be driven predominantly by one of several factors, including:

  • Community factors and non-school-based involvement increase the effects of parental involvement, so taking these into account yields a more correct view of the contribution of parents in context (Stacer and Perrucci 2013).

  • Parents contribute in a compensating fashion, with more help being accorded when children do poorly rather than when they do well (see Footnote 10).

It is a non-trivial result to find that parental involvement yields a positive impact on student outcomes at the high school level for several types and forms of measurement of parental involvement. There is also clear variation in the effect of involvement, both between the different school-level measures (for example, checking homework, volunteering, or being generally “involved” based on teacher reports) and between these measures and the involvement elicited through the instrumentation strategy.

In structuring mandatory and optional parental involvement programs at the high school level, it is important to account for the above-mentioned differences and to understand both the context in which involvement occurs and the many different ways that parents can have an effect on children. While it is possible to mandate parental involvement, it remains unclear whether mandated involvement of this kind is beneficial. Also, although the balanced sample in Table 3 had a significantly different set of parental background characteristics than the initial larger samples in Table 2, there is still an effect to be seen from parental involvement. Thus, while parent involvement may, in fact, depend on factors such as education and SES, as well as local characteristics, this analysis does not find strikingly different effects of involvement.

It is also important to note that while income and family structure were used as control characteristics in the current analysis, they may also have contained some “potential” or unmeasured parental involvement. Therefore, while the current study provides estimates of parental involvement which can be influenced through policy (that is, getting parents to volunteer or check homework rather than changing levels of divorce), it is perhaps not the full picture of what is going on at home. As an example, while higher parental education and socioeconomic status generally do significantly relate to a decrease in child behavioral issues in the current analysis, stratification based on these variables was not employed due to the relatively small resulting number of observations in most of these regressions. Extensions to the present analysis employing other types of data could employ this stratification based on family structure and income to help inform the policy discussion regarding ways to improve student outcomes and the differential returns to parental involvement conditional on family background.

On a related note, while the gender of the involved parent is unknown in the present data, extensions with other datasets employing this information could help determine the extent to which male and female parents are efficaciously involved with their children. This would shed light on the debate regarding the benefits of requiring firms to provide paternal leave or excuse family related absences for fathers in the workplace. Future work accounting for teacher involvement would also help determine both the extent to which parents crowd out the involvement of teachers in helping to improve student outcomes and the mitigating effect of teacher versus parental involvement in helping lower socioeconomic status children succeed. Taken together, while the present study makes significant headway in determining the magnitude of the effect of parental involvement on child behavioral outcomes, possible extensions remain for future research.