Introduction

In a twin study, a trait or phenotype is conceptually determined by four latent effects: additive genetic (A), dominant genetic (D), common environmental (C), and unique environmental (E) effects. These latent effects can be statistically estimated from phenotype data collected from monozygotic and dizygotic twins using structural equation modeling. Traditional structural equation modeling (SEM) cannot estimate A, D, C, and E effects simultaneously because C and D are confounded in the classical twin study (Neale and Maes 1992; Rijsdijk and Sham 2002). As a result, C and D cannot both be included in the SEM; ACE and ADE models, but not ADCE or CDE models, are used instead (Neale and Maes 1992; Rijsdijk and Sham 2002). However, it is biologically possible that D and C affect one trait simultaneously. Ignoring the co-existence of D and C would lead to biased estimates (Neale and Maes 1992; Rijsdijk and Sham 2002). To address this analytic limitation, Ozaki et al. recently introduced non-normal structural equation modeling (nnSEM), which can simultaneously estimate additive and dominant genetic influences on, and environmental contributions to, univariate continuous traits using the ADCE model (Ozaki et al. 2011). In the ADCE model, each of A, D, C and E designates a single latent factor. A “four-factor” model is one in which all four factors (ADCE) affect the trait. A “three-factor” model is one in which only three of these four factors are presumed to contribute to the variance of a trait. The sum of additive (A) and dominant (D) genetic influences is the global genetic influence (G) and can be estimated if either a three-factor (i.e., ADE) or the four-factor (i.e., ADCE) nnSEM (Ozaki et al. 2011) is the best fitting model. Appendix A gives a brief statistical introduction to the ADCE model using the nnSEM.
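The confounding of C and D can be illustrated with a small numerical sketch. It assumes the standard expectations of the classical twin design (MZ twins share all of A and D; DZ twins share half of A and a quarter of D; both share C), and the variance components below are hypothetical values chosen for illustration, not estimates from this study. The function name is ours.

```python
from fractions import Fraction as F

def expected_twin_moments(a, d, c, e):
    """Expected second-order moments under the classical twin design.

    a, d, c, e are variance components (hypothetical illustrative values).
    Returns (phenotypic variance, MZ covariance, DZ covariance).
    """
    variance = a + d + c + e
    cov_mz = a + d + c                       # MZ twins share all of A and D, plus C
    cov_dz = F(1, 2) * a + F(1, 4) * d + c   # DZ twins share 1/2 of A, 1/4 of D, plus C
    return variance, cov_mz, cov_dz

# An ACE parameter set (D = 0) and an ADCE parameter set with D > 0.
ace_model = expected_twin_moments(F(4, 10), F(0), F(3, 10), F(3, 10))
adce_model = expected_twin_moments(F(1, 10), F(2, 10), F(4, 10), F(3, 10))

# Both sets imply identical variances and covariances, so second-order
# moments alone cannot tell them apart.
print(ace_model == adce_model)
```

With only three second-order statistics and four unknown components, different decompositions reproduce the same covariances, which is why additional information such as higher order moments is needed to fit ADCE.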

Although genetic and environmental contributions to lipids have been assessed using the traditional SEM (Goode et al. 2007; Heller et al. 1993; Jermendy et al. 2011; O’Connell et al. 1988; Snieder et al. 1999), to date, no studies have applied the novel nnSEM to elucidate additive and dominant genetic influences on lipids along with environmental impact. Circulating levels of high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), total cholesterol (TC) and triglycerides (TG), which together reflect the lipid profile, are well-known risk factors for cardiovascular diseases (Brownson et al. 1998). Using the SEM, prior twin studies have found that additive genetic factors contribute to plasma levels of HDL-C (Goode et al. 2007; Heller et al. 1993; Jermendy et al. 2011), LDL-C (Goode et al. 2007; Heller et al. 1993), TC (Goode et al. 2007; Heller et al. 1993) and TG (Goode et al. 2007; Jermendy et al. 2011). For example, estimated from the ACE SEM model, the longitudinal estimates of the additive genetic effect ranged from 0.46 to 0.57 for TC, 0.49 to 0.64 for LDL-C, 0.50 to 0.62 for HDL-C and 0.28 to 0.61 for TG among 456 twin pairs in the National Heart, Lung and Blood Institute (NHLBI) Twin Study (Goode et al. 2007; Jermendy et al. 2011). Since simultaneous inclusion of both C and D in the SEM is not possible, estimates of the genetic and environmental effects in those prior studies might be biased. It is worth noting that the SEM and the nnSEM are not directly comparable, as they analyze different sample statistics and are based on different statistical methods. However, it is of interest to understand how the ACE and ADCE models differ. Therefore, a method to estimate three-factor models using the nnSEM was developed for our study (Appendix B), which allowed comparison between the nnSEM ADCE and nnSEM ACE models using model fit indices.
In this reported study, using the nnSEM as an alternative approach, we attempted to shed new light on the genetic and environmental influence on circulating fasting lipid concentrations.

Materials and methods

Study population

The National Heart, Lung, and Blood Institute (NHLBI) Twin Study has been described extensively (Dai et al. 2013; Reed et al. 1993; The U.S. National Heart Lung and Blood Institute (NHLBI) 2005). The NHLBI Twin Study was designed to prospectively investigate the genetic and environmental role in cardiovascular disease risk and enrolled 514 white, male, middle-aged veteran twin pairs born in 1917–1927 at baseline (1969–1973) (Dai et al. 2013; Reed et al. 1993). Zygosity was ascertained by eight red blood cell antigen groups (serotyping 22 erythrocyte antigens) in the 1960s and by variable number of tandem repeat DNA markers in the 1980s; on this basis, the baseline sample included 253 MZ and 261 DZ pairs (The U.S. National Heart Lung and Blood Institute (NHLBI) 2005). Informed consent was obtained from all individual participants included in the study.

In this study, we included a total of 1028 twins who had baseline data on fasting plasma lipid profile. The reported study was approved by the Institutional Review Boards of Vanderbilt University.

Measures of the lipid profile

Blood was drawn from the forearm vein after an overnight fast into EDTA tubes and immediately placed on ice. After centrifugation the plasma was aliquoted and frozen at −70 °C (Feinleib et al. 1977; Reed et al. 1994). Plasma lipid fractions were measured in one of three laboratories following standards of the Centers for Disease Control, located in San Francisco, Indianapolis, and Framingham (Selby et al. 1991). LDL-C was estimated (Friedewald et al. 1972) for individuals with measured TG concentration less than 400 mg/dL (Sampson et al. 1975). Plasma samples from a twin pair were assayed in the same analytical run without knowing zygosity.
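For reference, the Friedewald estimate cited above can be sketched as follows. The formula (LDL-C = TC − HDL-C − TG/5, all in mg/dL) is the commonly used form of the Friedewald equation; the function name and the handling of out-of-range TG are illustrative assumptions, not the laboratories' actual procedure.

```python
def friedewald_ldl_c(tc, hdl_c, tg, tg_limit=400.0):
    """Estimate LDL-C (mg/dL) from TC, HDL-C and TG (all in mg/dL).

    Returns None when TG is not below tg_limit, mirroring the restriction
    to individuals with TG less than 400 mg/dL described in the text.
    """
    if tg >= tg_limit:
        return None  # estimate not applied at high triglyceride levels
    # TG / 5 approximates very-low-density lipoprotein cholesterol.
    return tc - hdl_c - tg / 5.0

# Hypothetical example values, not study data.
print(friedewald_ldl_c(tc=200.0, hdl_c=50.0, tg=150.0))
```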

Statistical analysis

Plasma levels of HDL-C, LDL-C, TC and TG were the continuous variables used for the analyses. Univariate analysis was performed using the nnSEM. This model assumes that variations in phenotypic traits can be decomposed into latent factors: additive genetic (A), common environmental (C), dominant genetic (D) and unique environmental (E) factors. In the nnSEM, “non-normal” means that some independent variables (such as D, C, and/or E in our study) and dependent variables (such as the lipid phenotypes in our study) are non-normally distributed. We first analyzed three different nnSEM ADCE full models, in which A was normally distributed, but C, D and E could not be normally distributed simultaneously in one model. The distribution of C, D and E in the three ADCE models was non-normal for C and E in model 1; non-normal for D and E in model 2; and non-normal for C, D, and E in model 3 (Ozaki et al. 2011). The best model was selected based on the goodness of fit to the data. A smaller Bayesian Information Criterion (BIC) or Root Mean Square Error of Approximation (RMSEA) indicates a better fit to the data (Ozaki et al. 2011). Unlike the SEM, in which (first- and) second-order moments, namely (means and) covariances, are used as information, the nnSEM uses higher order moments as well as (first- and) second-order moments (Ozaki et al. 2011) (Appendix A). The genetic effect (heritability) was represented by the “A” and “D” components using the nnSEM. The sum (G) of the additive (A) and dominant (D) genetic effects (i.e. G = A + D) is a good estimator of the global genetic effect for both the three- and four-factor cases (Ozaki et al. 2011).
To compare the nnSEM ADCE full model with its nested reduced models, we developed an nnSEM method for the ACE model in the reported study, given that rMZ < 2rDZ in Supplemental Table 1, where rMZ and rDZ are the correlation coefficients between co-twins for MZ and DZ twins, respectively (code can be accessed at http://www010.upp.so-net.ne.jp/koken/bg.html). This new nnSEM ACE method can estimate four ACE models, four CE models, one AE model and one E model using 2nd and 3rd order moments. For the comparisons among the nnSEM ADCE full model and its nested reduced models, generally, a smaller BIC or RMSEA indicates a better fit to the data (Ozaki et al. 2011). However, if the BIC for one model is slightly larger than that for the other model but the RMSEA values are at or greater than 0.05 for both models, the model with the smaller RMSEA may be better (for example, the ACE model 4 vs CE model 8 for TC in Table 3). In another situation, if the BIC for the ADCE full model is smaller than that for a reduced model but its RMSEA is slightly larger, the BIC indicates the ADCE full model as the best fitting model while the RMSEA indicates the reduced model as the best fitting model. Given that both the BIC and the RMSEA are model fit indices that incorporate model parsimony (Loehlin 2004), the choice between them for selecting the best fitting model is arbitrary. Theoretically or conceptually, the possibility that all four factors (i.e. A, D, C, and E) are important in explaining the phenotype cannot be ruled out; thus, in this study, the ADCE full model is preferred as the best fitting model. The reduced model might also fit well. The detailed statistical method is shown in Appendix B.
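The sample statistics mentioned above can be sketched as follows: co-twin correlations for the rMZ < 2rDZ check, and a third-order central co-moment as one example of a higher order moment. This is an illustrative sketch only; the function names are ours, the decision rule is the conventional twin-design heuristic, and the exact set of moments used by the nnSEM is described in Appendix A and Ozaki et al. (2011), not reproduced here.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between co-twin values (twin 1 in xs, twin 2 in ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def third_order_comoment(xs, ys):
    """Sample third-order central co-moment, mean of (x - mx)^2 * (y - my)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) ** 2 * (y - my) for x, y in zip(xs, ys)) / n

def suggested_three_factor_model(r_mz, r_dz):
    """Conventional heuristic: rMZ < 2*rDZ points to ACE, otherwise ADE."""
    return "ACE" if r_mz < 2 * r_dz else "ADE"

# Hypothetical correlations, not values from Supplemental Table 1.
print(suggested_three_factor_model(r_mz=0.6, r_dz=0.4))
```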

The nnSEM was conducted using the R software (Ozaki et al. 2011). To provide a full view of genetic and environmental contributions to a phenotype, we additionally conducted the traditional SEM using the Mx software (Neale et al. 1994; Posthuma and Boomsma 2005). The other analyses were performed using the SAS 9.1 statistical package (SAS Institute, Cary, NC).

Results

Characteristics of the study population

Among 1028 twins (506 MZ and 522 DZ twins), the average age was 47.8 years and was similar between MZ and DZ twins. The mean (SD) concentrations of lipid markers were also similar between MZ and DZ twins (Table 1). The concentrations of HDL-C tended to be higher in DZ than in MZ twins, while those of LDL-C, TC and TG tended to be lower in DZ than in MZ twins (Table 1).

Table 1 Age and fasting plasma lipid profile at baseline in the NHLBI Twin Study

Univariate nnSEM ADCE models

The ADCE model 1, with non-normally distributed C and E, was the best fitting model for each of the lipid markers using the nnSEM (Table 2), and was used to estimate the genetic and environmental contributions. Supplemental Table 1 shows the descriptive statistics for MZ and DZ pairs used in the nnSEM estimation. The genetic and environmental influences on HDL-C were 0.04 [95% confidence interval (CI): 0.00, 0.29] for the additive genetic effect (A), 0.17 (95% CI: 0.00, 0.35) for the dominant genetic effect (D), 0.47 (95% CI: 0.37, 0.56) for the common environmental effect (C) and 0.33 (95% CI: 0.30, 0.36) for the unique environmental effect (E) (Model 1). The global genetic contribution (i.e., heritability) to HDL-C was 21%. The A, D, C, and E estimates from model 1 for LDL-C were 0.00 (95% CI: 0.00, 0.00), 0.30 (95% CI: 0.31, 0.38), 0.34 (95% CI: 0.26, 0.41), and 0.37 (95% CI: 0.33, 0.40), respectively. The heritability for LDL-C (30%) was mainly from dominant genetic variance. Estimated from model 1 for TC, A was 0.30 (95% CI: 0.20, 0.40), D was 0.00 (95% CI: 0.00, 0.00), C was 0.31 (95% CI: 0.23, 0.39) and E was 0.39 (95% CI: 0.35, 0.42). The heritability for TC (30%) was from A. The estimates for TG were 0.00 (95% CI: 0.00, 0.00) for A, 0.12 (95% CI: 0.07, 0.17) for D, 0.31 (95% CI: 0.26, 0.36) for C and 0.57 (95% CI: 0.50, 0.64) for E. The heritability for TG (12%) was mainly due to D.
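The heritability figures quoted above follow from G = A + D applied to the model 1 point estimates. The short sketch below reproduces that arithmetic using the point estimates as reported in the text; the variable and function names are ours.

```python
# Model 1 point estimates as reported in the text (proportions of variance).
estimates = {
    "HDL-C": {"A": 0.04, "D": 0.17, "C": 0.47, "E": 0.33},
    "LDL-C": {"A": 0.00, "D": 0.30, "C": 0.34, "E": 0.37},
    "TC":    {"A": 0.30, "D": 0.00, "C": 0.31, "E": 0.39},
    "TG":    {"A": 0.00, "D": 0.12, "C": 0.31, "E": 0.57},
}

def global_genetic_effect(components):
    """Global genetic effect G = A + D, rounded to two decimals."""
    return round(components["A"] + components["D"], 2)

heritability = {lipid: global_genetic_effect(c) for lipid, c in estimates.items()}

for lipid, g in heritability.items():
    print(f"{lipid}: G = {g:.2f}")
```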

Table 2 Parameter estimates from univariate nnSEM ADCE model for plasma lipid concentrations in monozygotic and dizygotic twins

Univariate nnSEM ACE model and its nested reduced models

Table 3 shows the estimates for 10 nnSEM ACE models, including 4 ACE full models and their nested reduced models (i.e., 4 CE, 1 AE, and 1 E models). For HDL-C, model 2 was the best nnSEM ACE full model and model 6 was the best CE model. Among the 10 nnSEM ACE models, the best fitting model for HDL-C was the ACE full model 2. Similarly, among the 10 nnSEM ACE models, the best fitting model was the AE model for LDL-C, the ACE model 4 for TC and the AE model for TG.

Table 3 Parameter estimates for ACE, CE, AE and E Models from univariate nnSEM model for plasma lipid concentrations in monozygotic and dizygotic twins

The best fitting nnSEM ADCE model 1 for all lipids was compared with the lipid-specific best fitting nnSEM ACE model using BIC or RMSEA as the model fit indices. The nnSEM ADCE model 1 provided a better fit for HDL-C relative to the nnSEM ACE model 2 (Tables 2, 3: −43.1 vs 29.9 for BIC and 0.029 vs 0.043 for RMSEA); for LDL-C relative to the nnSEM AE model (Tables 2, 3: −46.2 vs −38.2 for BIC and 0.013 vs 0.029 for RMSEA); and for TC relative to the nnSEM ACE model 4 (Tables 2, 3: −38.4 vs −26.7 for BIC and 0.044 vs 0.053 for RMSEA). For TG, the nnSEM ADCE model 1 generated a smaller BIC but a slightly larger RMSEA than the nnSEM AE model (Tables 2, 3: −44.4 vs −40.9 for BIC and 0.024 vs 0.015 for RMSEA). Given its smaller BIC, the nnSEM ADCE model 1 was considered better than the nnSEM AE model for TG. The nnSEM ADCE model 1 was therefore the best fitting model for each of the lipid markers with the nnSEM.
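The comparison above can be tabulated with a short sketch that picks, for each lipid, the model with the smaller BIC and the model with the smaller RMSEA, and flags cases where the two indices disagree. The BIC and RMSEA values are copied from the text as printed; the data layout and function name are illustrative assumptions.

```python
# (BIC, RMSEA) pairs as quoted in the text from Tables 2 and 3.
comparisons = {
    "HDL-C": {"ADCE model 1": (-43.1, 0.029), "ACE model 2": (29.9, 0.043)},
    "LDL-C": {"ADCE model 1": (-46.2, 0.013), "AE model": (-38.2, 0.029)},
    "TC":    {"ADCE model 1": (-38.4, 0.044), "ACE model 4": (-26.7, 0.053)},
    "TG":    {"ADCE model 1": (-44.4, 0.024), "AE model": (-40.9, 0.015)},
}

def preferred_by_index(models):
    """Return (model with smallest BIC, model with smallest RMSEA)."""
    by_bic = min(models, key=lambda name: models[name][0])
    by_rmsea = min(models, key=lambda name: models[name][1])
    return by_bic, by_rmsea

# Lipids for which BIC and RMSEA favor different models.
disagreements = [
    lipid for lipid, models in comparisons.items()
    if len(set(preferred_by_index(models))) > 1
]

for lipid, models in comparisons.items():
    by_bic, by_rmsea = preferred_by_index(models)
    print(f"{lipid}: BIC favors {by_bic}; RMSEA favors {by_rmsea}")
```

Only TG shows a disagreement between the two indices, which is the case resolved in the text by the smaller BIC.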

Discussion

Using the novel nnSEM that could analyze the influence of common environment and dominant genes simultaneously on a trait, we found that the additive and dominant genetic, and common and unique environmental influences simultaneously affected the variance of HDL-C, but either additive or dominant genetic and environmental factors influenced total variance of LDL-C, TC and TG.

To our knowledge, this is the first study to estimate the genetic and environmental influences on lipid markers using the nnSEM ACE and ADE models. The ADCE model cannot be identified by the traditional SEM because C and D are confounded in the classical twin study (Neale and Maes 1992; Rijsdijk and Sham 2002). In contrast, the nnSEM can specify an ADCE model and identify the best model among three types of models, none of which are nested submodels of one another. We newly developed an nnSEM method to estimate the nested reduced models, which were compared with the nnSEM ADCE full model, and found that the ADCE model was the best fitting model for all markers in this study. If the ADCE model is the true model, the ACE and ADE models yield biased estimates (Ozaki et al. 2011). Using the ADCE model, the nnSEM can estimate the global genetic effect (G) [the sum of the additive genetic (A) and dominant genetic (D) effects], which cannot be done with the classic SEM. Furthermore, the sum G is also a good estimator of the global genetic effect in the three-factor case, such as the nnSEM ADE model (Ozaki et al. 2011).

Although there are no methods to compare the traditional SEM and our nnSEM, it might be interesting to explore the apparent differences in the heritability estimated by the two approaches. The heritability estimated by the nnSEM in our study was somewhat lower than that estimated using the SEM in our study (Supplemental Table 2) and in previous studies for HDL-C (0.36–0.76) (Goode et al. 2007; Heller et al. 1993; Jermendy et al. 2011; O’Connell et al. 1988; Snieder et al. 1999) and LDL-C (0.22–1.00) (Goode et al. 2007; Heller et al. 1993; Snieder et al. 1999). The results from the SEM may overestimate the additive genetic component but underestimate the dominant genetic component (Neale and Maes 1992; Ozaki et al. 2011; Rijsdijk and Sham 2002). However, estimated from the nnSEM, the heritability for LDL-C, TC and TG in our study was in the range of previously reported heritability, TC (0.00–0.80) (Goode et al. 2007; Heller et al. 1993; Jermendy et al. 2011; O’Connell et al. 1988; Snieder et al. 1999) and TG (0.19–0.81) (Goode et al. 2007; Jermendy et al. 2011; Snieder et al. 1999). A dominant genetic effect on LDL-C (35%) was also observed in 12,000 Swedish twins born between 1911 and 1958, estimated by the ADE model using the SEM (Rahman et al. 2009). Compared to previous studies, their finding of dominant genetic effects may be due to the enhanced power of the large and homogeneous sample in their study, enabling them to detect weaker variance components underlying the phenotypic traits. Another contributing factor may be the older age of their study participants, possibly leading to decreased influences from shared familial environment. In addition, the SEM may underestimate the effect of common environment. In our study, the best nnSEM model for all four lipid markers was model 1, with little additive genetic effect: common environment accounted for 31–47% of the variance.
By contrast, the best SEM model for all four markers was the AE model, in which the effect of common environment was estimated as zero. Even the SEM ACE full model estimated a smaller effect of common environment than the best nnSEM model (model 1) for each lipid marker. When all four factors (i.e. ADCE) affect a trait, given the mathematical comparisons between the nnSEM and the SEM, the SEM always underestimates the variance due to common environment (Neale and Maes 1992; Rijsdijk and Sham 2002) but the nnSEM does not (Ozaki et al. 2011).

Potential limitations of our study require acknowledgement. Given that the univariate nnSEM ADCE model is developed only for univariate continuous traits, we could not analyze the genetic and environmental influence on a categorical trait such as hypercholesterolemia, and we were unable to adjust for the effect of potential confounding factors such as age, obesity and other lifestyle factors on the estimations. Our twins were white men, thus our results may not be generalizable to women and other ethnic groups.

In conclusion, as shown by the ADCE model fitted via the novel nnSEM (which cannot be performed with the traditional SEM), additive and dominant genetic influences, together with common and unique environmental influences, simultaneously affected the concentrations of HDL-C; an additive or dominant genetic effect, as well as common and unique environmental effects, simultaneously affected the concentrations of LDL-C, TC and TG. Both genetic and environmental factors were important determinants of the concentrations of lipid markers.