
Introduction

Investigations regarding the development of delinquency during the life course are currently of great importance in longitudinal criminological research. Over the past 20 years a variety of criminologists have argued that there are distinctive groups of offenders which can be described by different delinquent trajectories (Loeber & LeBlanc 1990; Moffitt 1993; Sampson & Laub 2003; Thornberry 2005).

A trajectory is a pathway or line of development over the life span such as worklife, parenthood, and criminal behavior. Trajectories refer to long-term patterns of behavior and are marked by a sequence of transitions. Transitions are marked by life events (e.g. first job or first marriage) that are embedded in trajectories and evolve over shorter time spans. (Sampson & Laub 1997, p. 142)

Major methodological developments in criminological longitudinal research are influenced by the debate whether distinctive groups of criminal behavior can be identified and in which way the development of a “criminal career” can be incorporated in a statistical model. The debate is mainly driven by Moffitt’s dual taxonomy of offending behavior: The adolescence-limited offenders exhibit antisocial behavior only during adolescence while life-course-persistent offenders begin to behave antisocially early in childhood and continue this behavior into adulthood (Moffitt 1993). In further analyses of data from the “Dunedin Multidisciplinary Health and Development Study” (Moffitt, Caspi, Rutter, & Silva 2011) four antisocial behavior trajectory groups were identified among females and males: life-course-persistent, adolescent-onset, childhood-limited, and low trajectory groups (Odgers et al. 2008). Furthermore, Nagin and collaborators explored population heterogeneity in behavioral trajectories using other longitudinal studies, like the “Cambridge Study” (Farrington & West 1990), the “Philadelphia Study” (Tracy, Wolfgang, & Figlio 1990), and the “Montreal Study” (Tremblay, Desmarais-Gervais, Gagnon, & Charlebois 1987). Depending on the type of the dependent variable, nature of the sample, and characteristics of the community, three to five trajectories were detected which reflect different intensity and growth of delinquency. These trajectories distinguish between non-offenders, offenders whose delinquency is limited to adolescence, and a more or less chronic group of offenders (D’Unger, Land, McCall, & Nagin 1998; Nagin 1999; Nagin & Land 1993).

The reported findings suggest that there is a variety of heterogeneous trajectories which differ in the age of entry and exit in delinquency, its intensity, its duration, and its continuity (for an overview, see Piquero 2008). Furthermore the research on delinquent trajectories has shown that most people commit delinquent acts rarely or do not become delinquent at all. In most cases early intensive offenders desist from crime. There are, however, trajectories which are marked by high delinquency rates or by increases in delinquency (Mariotti & Reinecke 2010; Moffitt 1993; Sampson & Laub 2003; Thornberry 2005). Moreover, research has shown that transition points are relevant for the analysis of delinquent trajectories (Sampson & Laub 1993).

Techniques of longitudinal statistical modeling are highly relevant and have gained considerable attention in the examination of delinquent trajectories. With latent growth curve models (LGM) (Meredith & Tisak 1990), the structural equation methodology offers a strategy to examine intra- and interindividual developmental processes of delinquent behavior. It cannot, however, be assumed that there is always a single population underlying the growth curves. Therefore, observed as well as unobserved heterogeneity has to be taken into account. Observed heterogeneity can be considered by relevant exogenous variables (e.g. gender) which are related to the growth curve variables and explain parts of their variances. To capture unobserved heterogeneity, the latent growth curve model has to be extended to a growth mixture model (Muthén & Shedden 1999), which contains categorical latent variables in addition to the continuous manifest and latent variables. The categorical variables refer to particular subgroups reflecting different developmental processes. Analyses with a growth mixture model usually assume single-phase data which associate any event with a specific time period. However, longitudinal data often contain transition points which separate different phases of the development under study. An appropriate framework for multi-phase longitudinal data regarding unobserved heterogeneity is the extension of the growth mixture models (GMM) to stage-sequential growth mixture models (Kim & Kim 2012). These models can lead to a better understanding of the developmental process over several phases.

The following section “Method and Models” discusses basic conceptions of growth curve, growth mixture, and the multi-phase growth mixture models as well as the respective methods of model estimation and model evaluation. Additional attention is given to the distributional assumptions of the manifest time-variant variable under study. In the case of count variables Poisson or negative binomial distributions (Hilbe 2011) can be considered which give a better model representation compared to the assumption of a continuous distribution (Reinecke & Seddig 2011).

All models are applied to panel data from the German panel study Crime in the Modern City (CrimoC). The data set contains 3938 adolescents and young adults who participated at least twice in a row in the eight panel waves. Data, variables and descriptive statistics are discussed in section “Data, Variables, and Descriptive Statistics”.

Results are presented in section “Modeling Results”. The analysis starts with single-phase growth mixture models considering up to eight classes and considers various specifications of stage-sequential growth mixture models. Finally, in section “Conclusion” models are compared and discussed with recommendations for further analyses.

Method and Models

Latent Growth Curve Models

LGM specified with structural equations have already been discussed in several papers (McArdle 2009; McArdle & Epstein 1987; Meredith & Tisak 1990) and books (Bollen & Curran 2006; Duncan, Duncan, Strycker, Li, & Alpert 2006; Reinecke 2012). As is typical for all structural equation models, growth curve models distinguish between a measurement and a structural model. The measurement model refers to the intraindividual development whereas the structural model refers to interindividual differences in those trends. For a growth curve model with two factors, the measurement model can be formulated as follows:

$$\displaystyle{ \begin{array}{ccccccc} y_{t}& =&\lambda _{1t}\eta _{1} & +&\lambda _{2t}\eta _{2} & +&\epsilon _{t}\\ \end{array} }$$
(1)

\(y_{t}\) are the manifest variables at time t, which are related to the latent variables \(\eta _{1}\) and \(\eta _{2}\). \(\eta _{1}\) is the initial level factor or intercept factor while \(\eta _{2}\) is the linear growth factor or slope factor. \(\lambda _{1t}\) and \(\lambda _{2t}\) are the factor loadings on \(\eta _{1}\) and \(\eta _{2}\). \(\epsilon _{t}\) is the measurement error of \(y_{t}\). For each latent variable \(\eta\), a structural equation has to be formulated as follows:

$$\displaystyle\begin{array}{rcl} \begin{array}{ccccc} \eta _{1} & =&\alpha _{1} & +&\zeta _{1}\\ \end{array} & &{}\end{array}$$
(2)
$$\displaystyle\begin{array}{rcl} \begin{array}{ccccc} \eta _{2} & =&\alpha _{2} & +&\zeta _{2}\\ \end{array} & &{}\end{array}$$
(3)

The latent variables \(\eta _{1}\) and \(\eta _{2}\) are described by their means (\(\alpha _{1}\) and \(\alpha _{2}\)) as well as by their residuals (\(\zeta _{1}\) and \(\zeta _{2}\)). \(\zeta _{1}\) and \(\zeta _{2}\) can be defined as deviations of the latent variables from their mean values. Variances and covariances of the latent variables are specified in the matrix \(\Psi\):

$$\displaystyle{ \Psi = \left (\begin{array}{*{10}c} \psi _{11} &\\ \psi _{ 21} & \psi _{22}\\ \end{array} \right ) }$$
(4)

Assuming linear growth, the factor loadings for \(\eta _{1}\) have to be fixed to one and the factor loadings for \(\eta _{2}\) have to be restricted according to a linear development:

$$\displaystyle{ \left [\begin{array}{c} y_{1} \\ y_{2} \\ y_{3} \\ y_{4}\\ \end{array} \right ] = \left [\begin{array}{cc} 1&0\\ 1 &1 \\ 1&2\\ 1 &3\\ \end{array} \right ]{\ast}\left [\begin{array}{c} \eta _{1}\\ \eta _{ 2}\\ \end{array} \right ]+\left [\begin{array}{c} \epsilon _{1}\\ \epsilon _{ 2}\\ \epsilon _{3} \\ \epsilon _{4}\\ \end{array} \right ] }$$
(5)

To model non-linear growth curves it is possible to extend the two-factor model by additional latent variables, for instance a quadratic term. The measurement and structural equations (1)–(3) of the two-factor model described above can be extended as follows:

$$\displaystyle\begin{array}{rcl} \begin{array}{ccccccccc} y_{t}& =&\lambda _{1t}\eta _{1} & +&\lambda _{2t}\eta _{2} & +&\lambda _{3t}^{2}\eta _{3} & +&\epsilon _{t}\\ \end{array} & &{}\end{array}$$
(6)
$$\displaystyle\begin{array}{rcl} \begin{array}{ccccc} \eta _{1} & =&\alpha _{1} & +&\zeta _{1} \\ \eta _{2} & =&\alpha _{2} & +&\zeta _{2} \\ \eta _{3} & =&\alpha _{3} & +&\zeta _{3}\\ \end{array} & &{}\end{array}$$
(7)

Another possibility to cope with nonlinearity is the so-called piecewise growth curve model which is useful when transition points are assumed across the time range (Bollen & Curran 2006). Such a model contains two or more latent variables. Contrary to the linear growth model, those models can be used to analyze multiphase data. Piecewise growth curve models are meaningful when transition points can be found in the course of development (see, for instance, Raudenbush & Bryk 2002). Assuming one transition point, the first trajectory describes the development between the intercept and the transition point. The second trajectory describes development after the transition point. If six panel waves and a transition point for the third panel wave are assumed, the following measurement model can be formulated:

$$\displaystyle{ \left [\begin{array}{c} y_{1} \\ y_{2} \\ y_{3} \\ y_{4} \\ y_{5} \\ y_{6}\\ \end{array} \right ] = \left [\begin{array}{ccc} 1&0&0\\ 1 &1 &0 \\ 1&2&0\\ 1 &2 &1 \\ 1&2&2\\ 1 &2 &3\\ \end{array} \right ]{\ast}\left [\begin{array}{c} \eta _{1}\\ \eta _{ 2}\\ \eta _{3} \end{array} \right ]+\left [\begin{array}{c} \epsilon _{1}\\ \epsilon _{ 2}\\ \epsilon _{3} \\ \epsilon _{4}\\ \epsilon _{ 5}\\ \epsilon _{6}\\ \end{array} \right ] }$$
(8)

\(\eta _{1}\) is the intercept, \(\eta _{2}\) is the linear slope for the first phase, and \(\eta _{3}\) is the linear slope for the second phase. Because of the transition point at the third panel wave, the factor loadings of \(\eta _{2}\) remain fixed at the value of two for the subsequent waves.
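The loading pattern in Eq. (8) can be generated for any number of waves and any transition wave. The following is a minimal sketch, assuming a two-phase linear piecewise model with unit time steps; the function names `piecewise_loadings` and `implied_means` and the factor means used in the example are hypothetical and serve illustration only.

```python
def piecewise_loadings(n_waves, transition_wave):
    """Rows [intercept, slope_1, slope_2] of a two-phase linear piecewise model.

    transition_wave is the 1-based panel wave at which the first phase ends.
    """
    rows = []
    for wave in range(1, n_waves + 1):
        # First slope grows until the transition wave and is then held constant.
        slope_1 = min(wave, transition_wave) - 1
        # Second slope is zero up to the transition wave and grows afterwards.
        slope_2 = max(wave - transition_wave, 0)
        rows.append([1, slope_1, slope_2])
    return rows


def implied_means(loadings, alpha):
    """Model-implied mean of each y_t: loading row times the factor means."""
    return [sum(l * a for l, a in zip(row, alpha)) for row in loadings]


loadings = piecewise_loadings(n_waves=6, transition_wave=3)
for row in loadings:
    print(row)

# Hypothetical factor means: increase in phase one, decrease in phase two.
print(implied_means(loadings, alpha=[1.0, 0.5, -0.25]))
```

With six waves and a transition at the third wave, the rows reproduce the loading matrix of Eq. (8).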

Growth Mixture Models for Single-Phase Data

With GMM it is possible to control for unobserved heterogeneity in the data. If the variances of the growth factors in a linear or piecewise growth curve model are not different from zero, growth mixture modeling is not necessary. The GMM extends Eqs. (1)–(3) by a categorical variable c with \(k = 1,2,\ldots,K\) classes. Assuming a two-factor growth mixture model, the following measurement and structural equations can be formulated:

$$\displaystyle\begin{array}{rcl} \begin{array}{ccccccc} y_{tk}& =&\lambda _{1tk}\eta _{1k}&+&\lambda _{2tk}\eta _{2k}&+&\epsilon _{tk}\\ \end{array} & &{}\end{array}$$
(9)
$$\displaystyle\begin{array}{rcl} \begin{array}{ccccc} \eta _{1k}& =&\alpha _{1k}&+&\zeta _{1k}\\ \end{array} & &{}\end{array}$$
(10)
$$\displaystyle\begin{array}{rcl} \begin{array}{ccccc} \eta _{2k}& =&\alpha _{2k}&+&\zeta _{2k}\\ \end{array} & &{}\end{array}$$
(11)

Means and variances of the latent variables are estimated for each class k \((\alpha _{1k},\alpha _{2k},\psi _{11k},\psi _{22k})\). The matrix \(\Psi _{k}\) contains the class-specific variances and covariances:

$$\displaystyle{ \Psi _{k} = \left (\begin{array}{*{10}c} \psi _{11k}&\\ \psi _{ 21k}&\psi _{22k}\\ \end{array} \right ) }$$
(12)

The so-called latent class growth analysis (LCGA; Muthén 2004) is a submodel of the GMM which gained great importance in criminological research under the name group-based trajectory modeling (Nagin 2005). LCGA assumes that variances and covariances of the growth factors are restricted to zero (\(\Psi _{k} = 0\)). Consequently, there are no residual terms in the structural equations of the latent variables \(\eta _{1}\) and \(\eta _{2}\) and therefore all class members are treated as homogeneous regarding their individual developments:

$$\displaystyle\begin{array}{rcl} \begin{array}{ccc} \eta _{1k}& =&\alpha _{1k}\\ \end{array} & &{}\end{array}$$
(13)
$$\displaystyle\begin{array}{rcl} \begin{array}{ccc} \eta _{2k}& =&\alpha _{2k}\\ \end{array} & &{}\end{array}$$
(14)

Previous analyses of delinquent trajectories with longitudinal data show that specifications of the LCGA lead to quite reasonable substantive results (see, for instance, Kreuter & Muthén 2008). From a methodological point of view Muthén (2004, p. 350) suggests using LCGA as starting point for the analysis of trajectories, because it can be explored how many different classes might be necessary to estimate distinct developmental trends appropriately.

In most criminological studies the longitudinal response variable is a count measure (e.g., the number of convictions). Therefore, the Poisson regression model as a special case of the generalized linear model has to be used. Let \(y_{i}\) be the observed number of events, \(x_{i}\) the vector of covariates, and \(\nu _{i}\) the expected number of events. The number of events in an interval of a given length is Poisson distributed and the Poisson regression model can be formulated via a log link function (Hilbe 2011, p. 31):

$$\displaystyle{ Pr(y_{i}\vert x_{i}) = exp(-\nu _{i})\nu _{i}^{y_{i} }/y_{i}! }$$
(15)

with \(\nu _{i} = exp(\alpha +x_{i}^{'}\beta )\). \(\beta\) is the vector of regression coefficients. The conditional mean function of the Poisson distribution is \(E(y_{i}\vert x_{i}) =\nu _{i}\) with its equidispersion \(V ar(y_{i}\vert x_{i}) =\nu _{i}\). Small values of \(\nu _{i}\) indicate the rarity of the event and the skewness of the distribution.
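The equidispersion property of Eq. (15) can be checked numerically. The sketch below uses only the Python standard library; the coefficient values are hypothetical and not taken from the CrimoC data.

```python
import math


def poisson_pmf(y, nu):
    """Pr(y | x) = exp(-nu) * nu**y / y!, as in Eq. (15)."""
    return math.exp(-nu) * nu ** y / math.factorial(y)


def poisson_mean(alpha, beta, x):
    """Log link: nu = exp(alpha + x'beta)."""
    return math.exp(alpha + sum(b * xi for b, xi in zip(beta, x)))


# Hypothetical intercept, coefficient, and covariate value.
nu = poisson_mean(alpha=-0.5, beta=[0.8], x=[1.0])

# Sum over a support wide enough that the truncated tail is negligible.
support = range(0, 60)
total = sum(poisson_pmf(y, nu) for y in support)
mean = sum(y * poisson_pmf(y, nu) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf(y, nu) for y in support)

print(round(nu, 4), round(mean, 4), round(var, 4))
```

Mean and variance both coincide with \(\nu _{i}\), which is the restriction that the negative binomial model relaxes.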

If the assumption of equidispersed data does not hold, the negative binomial regression model can be employed by introduction of latent heterogeneity in the conditional mean of the Poisson model (Hilbe 2011, p. 185):

$$\displaystyle{ E(y_{i}\vert x_{i},\epsilon _{i}) = exp(\alpha +x_{i}'\beta +\epsilon _{i}) = h_{i}\nu _{i} }$$
(16)

where \(h_{i} = exp(\epsilon _{i})\) is assumed to have a one-parameter gamma distribution, \(G(\theta,\theta )\), with a mean equal to 1 and variance \(\kappa = 1/\theta\). The negative binomial distribution is obtained by integrating \(h_{i}\) out of the joint distribution. The conditional mean function is still \(E(y_{i}\vert x_{i}) =\nu _{i}\), while overdispersion arises from the latent heterogeneity through the variance function \(V ar(y_{i}\vert x_{i}) =\nu _{i}[1 + (1/\theta )\nu _{i}]\).
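The effect of the gamma heterogeneity can be illustrated numerically. The following sketch assumes the standard NB2 parameterization, in which the mean is \(\nu\) and the variance is \(\nu +\nu ^{2}/\theta\); the values of \(\nu\) and \(\theta\) are hypothetical.

```python
import math


def negbin_pmf(y, nu, theta):
    """NB2 probability of count y with mean nu and gamma shape parameter theta.

    This is the Poisson-gamma mixture after integrating out the heterogeneity.
    """
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + nu))
        + y * math.log(nu / (theta + nu))
    )
    return math.exp(log_p)


nu, theta = 1.5, 0.5  # hypothetical mean and shape parameter

# The tail decays geometrically, so a support of 400 counts is ample here.
support = range(0, 400)
total = sum(negbin_pmf(y, nu, theta) for y in support)
mean = sum(y * negbin_pmf(y, nu, theta) for y in support)
var = sum((y - mean) ** 2 * negbin_pmf(y, nu, theta) for y in support)

print(round(mean, 6), round(var, 6), nu + nu ** 2 / theta)
```

The mean stays at \(\nu\) while the variance exceeds it by \(\nu ^{2}/\theta\), so smaller values of \(\theta\) imply stronger overdispersion.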

Within the context of the CrimoC study, previous analyses of GMM assuming a negative binomial distributed variable have shown that those models consistently fit better than models assuming a continuous or Poisson distributed variable (Reinecke & Seddig 2011, p. 432). Therefore the negative binomial distribution assumption will be used for the current analyses.

Growth Mixture Models for Multi-Phase Data

The discussed mixture models always assume that every estimated trajectory relies on longitudinal data covering a single phase of development. In case of long repeated panel designs this assumption might not be appropriate. The larger the time span of the longitudinal data, the higher is the chance that modeling of different phases is necessary to estimate the trajectories of the particular latent classes. The difference between single-phase and multi-phase data does not depend on specific features of a panel design but on whether transition points are likely between the particular measurement occasions.

For homogeneous populations piecewise growth curve models, as discussed above, are able to consider transition points. In case of unobserved heterogeneity the piecewise growth curve model can be extended to a so-called Traditional Piecewise Growth Mixture Model (TPGMM, Kim & Kim 2012, p. 300). TPGMM has multiple growth components and additionally one mixture component. The growth components are the same as for piecewise growth models whereas the finite mixture component is the same as for the GMM. The growth trajectories before and after a transition point are connected at the transition point. Figure 1 illustrates the model assumption: \(y_{1}\) to \(y_{8}\) are the measures for eight panel waves, I is the intercept and S1 as well as S2 are the particular slopes. The first and second growth trajectory are connected at the transition point (e.g. \(t_{6}\)). c represents the mixture component, X is a time-invariant exogenous variable, U represents outcome variables. Both X and U will not be considered in the applications (cf. section “Modeling Results”).

Fig. 1 Traditional Piecewise Growth Mixture Model. Source: Kim and Kim (2012, p. 297)

If a larger change or a discrepancy (e.g. intervention) is expected at the transition point, the TPGMM might not be sufficient to model this effect. One possible extension of the TPGMM is the so-called Discontinuous Piecewise Growth Mixture Model (DPGMM, Kim & Kim 2012, p. 301) in which an intercept is specified for each phase. Figure 2 shows an example with eight panel waves and a transition point between the fourth and fifth measurement: \(y_{1}\) to \(y_{4}\) are the first-phase measures, \(y_{5}\) to \(y_{8}\) are the second-phase measures, I1 is the first intercept and S1 is the first slope, I2 is the second intercept and S2 is the second slope. All other variables are the same as in Fig. 1. In contrast to the TPGMM, the trajectories of the first and the second phases are not directly connected at the transition point.

Fig. 2 Discontinuous Piecewise Growth Mixture Model

Introducing a second intercept changes the measurement part of the DPGMM compared to the TPGMM while the structural part remains the same. The measurement part of the model is given as follows:

$$\displaystyle{ \left [\begin{array}{c} y_{1} \\ y_{2} \\ y_{3} \\ y_{4} \\ y_{5} \\ y_{6} \\ y_{7} \\ y_{8}\\ \end{array} \right ] = \left [\begin{array}{cccc} 1&0&0&0\\ 1 &1 &0 &0 \\ 1&2&0&0\\ 1 &3 &0 &0 \\ 0&0&1&0\\ 0 &0 &1 &1 \\ 0&0&1&2\\ 0 &0 &1 &3\\ \end{array} \right ]{\ast}\left [\begin{array}{c} \eta _{11}\\ \eta _{ 21}\\ \eta _{12} \\ \eta _{22}\\ \end{array} \right ]+\left [\begin{array}{c} \epsilon _{1}\\ \epsilon _{ 2}\\ \epsilon _{3} \\ \epsilon _{4}\\ \epsilon _{ 5}\\ \epsilon _{6} \\ \epsilon _{7}\\ \epsilon _{ 8}\\ \end{array} \right ] }$$
(17)

\(\eta _{11}\) and \(\eta _{12}\) are the intercepts for the first and the second phase whereas \(\eta _{21}\) and \(\eta _{22}\) are the particular slopes. Both the TPGMM and the DPGMM assume a common mixture component c for all phases. If in addition changes between the classes due to the transition between the phases have to be considered, the DPGMM can be extended to a so-called Sequential-Process Growth Mixture Model (SPGMM, Kim & Kim 2012, p. 303). Transition points as well as changes in latent class membership can be modeled with the SPGMM. Figure 3 shows an example with the two mixture components c1 and c2. The relationship between both mixture components is specified via a transition probability matrix which contains the estimated probabilities of latent class membership in the second phase, conditional on latent class membership in the first phase. The number of intercepts and slopes and the specifications of the measurement part of the model are equal to the DPGMM.
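The role of the transition probability matrix can be illustrated with a small computation. The sketch below assumes three classes in the first phase and two classes in the second phase; all proportions and transition probabilities are hypothetical and not estimates from the chapter.

```python
def second_phase_proportions(first_phase, transition):
    """Marginal phase-two class proportions.

    first_phase[j]   : P(c1 = j)
    transition[j][k] : P(c2 = k | c1 = j), so each row must sum to one
    """
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to one")
    n_second = len(transition[0])
    return [
        sum(first_phase[j] * transition[j][k] for j in range(len(first_phase)))
        for k in range(n_second)
    ]


# Hypothetical values: three first-phase classes, two second-phase classes.
first_phase = [0.6, 0.3, 0.1]
transition = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
]

print(second_phase_proportions(first_phase, transition))
```

Each row of the matrix describes where the members of one first-phase class are expected to be found in the second phase.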

Fig. 3 Sequential-Process Growth Mixture Model

Model Estimation and Model Evaluation

Mixture models are estimated by maximizing the log-likelihood function within the admissible range of parameter values given classes and data. The program Mplus employs the EM-algorithm for maximization (Dempster, Laird, & Rubin 1977; Muthén & Shedden 1999). Thereby, different sets of starting values are tested for the calculation of the optimal function value and the best set is used for the estimation of the parameters. For a given solution, each individual’s probability of membership in each class is estimated. Individuals can be assigned to the classes by calculating the posterior probability that an individual i belongs to a given class k. Each individual’s posterior probability estimate for each class is computed as a function of the parameter estimates and the values of the observed data. The number of classes has to be specified in each model variant.
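The posterior class probabilities follow from Bayes' rule once class proportions and class-specific parameters are given. The following sketch is a simplification, not the Mplus implementation: it assumes an LCGA-type model with class-specific Poisson means per wave (the Poisson is swapped in for the negative binomial only to keep the example short), and all parameter values are hypothetical.

```python
import math


def poisson_logpmf(y, nu):
    """Log of the Poisson probability of count y with mean nu."""
    return -nu + y * math.log(nu) - math.lgamma(y + 1)


def posterior_probabilities(y, class_weights, class_means):
    """Posterior probability of each class given one individual's counts y.

    class_weights[k] : estimated proportion of class k
    class_means[k]   : expected counts of class k at each panel wave
    """
    log_joint = []
    for weight, means in zip(class_weights, class_means):
        log_lik = sum(poisson_logpmf(y_t, nu_t) for y_t, nu_t in zip(y, means))
        log_joint.append(math.log(weight) + log_lik)
    peak = max(log_joint)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(value - peak) for value in log_joint]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical two-class solution over three waves.
weights = [0.7, 0.3]
means = [[0.1, 0.1, 0.1], [2.0, 3.0, 2.0]]

print(posterior_probabilities([0, 0, 0], weights, means))
print(posterior_probabilities([3, 4, 2], weights, means))
```

An individual is typically assigned to the class with the highest posterior probability.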

Standard errors of estimates are asymptotically correct if the underlying mixture model is the true model. \(\chi ^{2}\)-differences between the particular mixture model variants, however, cannot be calculated because a k-class model is not nested within a (k + 1)-class model. Therefore, the Bayesian Information Criterion (BIC, Schwarz 1978) is used for model comparisons. Furthermore, Mplus calculates a sample size adjusted BIC which was found to give superior performance for model selection (adj. BIC, Yang 1998). Models with the lowest BIC or adjusted BIC can be selected for further substantive interpretation. But accepting or rejecting a model on the basis of the BIC is more or less descriptive and does not imply any statistical test.
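Both criteria can be computed from the log-likelihood, the number of free parameters, and the sample size. The sketch below assumes the usual definition of the BIC and, for the adjusted BIC, the commonly used replacement of n by (n + 2)/24; this adjustment formula is not given in the text and is stated here as an assumption. The log-likelihood and parameter count are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: -2 logL + p * ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def adjusted_bic(log_likelihood, n_params, n_obs):
    """Sample-size adjusted BIC, assuming n is replaced by (n + 2) / 24."""
    return -2.0 * log_likelihood + n_params * math.log((n_obs + 2) / 24.0)


# Hypothetical log-likelihood and parameter count; n matches the sample size.
log_l, p, n = -20000.0, 18, 3938
print(round(bic(log_l, p, n), 2), round(adjusted_bic(log_l, p, n), 2))
```

Because the adjusted criterion penalizes each parameter less, it tends to favor solutions with more classes than the BIC does.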

However, Lo, Mendell, and Rubin (2001) have developed a statistical test for mixture models. The so-called Lo-Mendell-Rubin likelihood ratio test (LMR-LRT) tests a k-class model against a (k − 1)-class model. For this purpose the ratio of the likelihood of a (k − 1)-class model to that of a k-class model is calculated. If the p-value of the test is small, the k-class model should be accepted (Reinecke 2012, p. 38). The LMR-LRT can only be calculated for GMM and TPGMM.

In addition, the entropy of a particular mixture model can be used to decide about the adequate number of classes. Entropy is a summary measure of classification quality based on the estimated posterior probabilities that ranges from zero to one:

$$\displaystyle{ E_{k} = 1 -\frac{\Sigma _{i}\Sigma _{k}(-\hat{p}_{ik}ln\hat{p}_{ik})} {n\ lnK} }$$
(18)

\(\hat{p}_{ik}\) is the estimated posterior probability for individual i to be in class k. The closer the entropy value is to one, the better the classification.
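Eq. (18) can be evaluated directly from a matrix of posterior probabilities. The following is a minimal sketch; the function name `relative_entropy` and the example probabilities are hypothetical.

```python
import math


def relative_entropy(posteriors):
    """Entropy of Eq. (18) for an n x K matrix of posterior probabilities."""
    n = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        raise ValueError("entropy is only defined for at least two classes")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the limit of p * ln(p) for p -> 0 is zero
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(n_classes))


# Perfect classification, complete uncertainty, and an intermediate case.
print(relative_entropy([[1.0, 0.0], [0.0, 1.0]]))
print(relative_entropy([[0.5, 0.5], [0.5, 0.5]]))
print(relative_entropy([[0.9, 0.1], [0.2, 0.8]]))
```

The two boundary cases show why the measure ranges from zero (uniform posterior probabilities) to one (every individual assigned with certainty).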

BIC, adjusted BIC, LMR-LRT, entropy, and the substantive interpretability of the classes should be considered for the decision process (Muthén & Muthén 2000). In the context of multi-phase mixture models three additional aspects have to be taken into account (Kim & Kim 2012, 305f): first, the number of latent classes in each phase should be kept as small as possible; second, for multiple latent classes nearly empty patterns can be accepted (e.g., as outliers); and finally, redundant classes should be avoided as well as classes which are misleadingly omitted. All in all, it is advisable to base the decision about the number of latent classes not on one information source alone, but on various statistical and substantive arguments (see also Kim 2014).

Data, Variables, and Descriptive Statistics

The data used for the current analyses are taken from the panel survey of Duisburg which is part of the ongoing German panel study CrimoC. Duisburg is an industrial city of about 500,000 inhabitants. It is located in the western part of the Ruhr area in Germany. The sample was drawn from secondary schools in Duisburg. Eight annual panel waves have been collected between 2002 and 2009, which covers the period from early to late adolescence. The self-administered questionnaires were completed in school classes as long as the students attended the particular schools. After leaving school participants were usually contacted by mail. If repeated contacts were unsuccessful personal contacts were realized to conduct the interviews. Retention rates are between 84 and 92 % (Boers et al. 2014, p. 184).

The panel data contain individuals who participated twice in a row between 2002 and 2009 (n = 3938). Table 1 gives descriptive information about each panel wave (age, sex). In the first panel wave (2002) the sample’s average age is 13, in 2009 it is about 20 years. The sex ratio in each panel wave is relatively balanced although there are always more female than male participants. In 2002, for instance, 48.6 % of the respondents are male and 51.4 % female. In the subsequent panel waves, however, there are larger differences. In 2009 only 42.2 % of the respondents are male. Therefore, females are slightly overrepresented in the data.

Table 1 Descriptive information about the sample

To measure deviant and delinquent behavior, about 15 different offences are covered in the questionnaires of each panel wave. These offences can be classified into property offences (burglary, theft of cars, theft out of cars, fencing, theft out of vending machines, theft of bicycles, shoplifting), violent offences (robbery, purse snatching, assault with a weapon, assault), and criminal damage offences (graffiti, scratching, other criminal damage). For each of these offences, the respondents were asked whether they had ever committed it (lifetime prevalence) and whether they had committed it in the past year (annual prevalence). If they had committed the particular offence in the past year, they were also asked about the frequency of their offending (annual incidence). The time-variant dependent variable of the mixture models is the annual incidence rate, which is given as the sum of the particular rates of the 15 offences.

Table 2 gives a descriptive overview of the distributions of prevalence and incidence rates. The prevalence of the self-reported criminal behavior increases in early adolescence between 2002 and 2003 and decreases later on. The peak is reached in 2003 in which the adolescents were about 14 years old. In the year 2002 nearly 31 % of the respondents reported an offence. This rate increased to 40 % in 2003 and decreased continuously down to 7 % in 2009.

Table 2 Annual prevalence and incidence rates

Incidence mean rates are based on the number of persons who reported at least one offence in the prevalence measure. Some of the respondents gave an answer to the annual prevalences but not to the annual incidences. Therefore the numbers of persons are slightly different for each of the eight panel waves. The lower half of Table 2 shows the means of the annual incidences of the offenders. The first row of means is based on the number of valid answers in each panel wave. The second row contains the means estimated via the Full Information Maximum Likelihood procedure (FIML, Enders 2010, p. 88) considering unit nonresponses in each panel wave. The FIML estimated means are slightly higher than those based on the complete cases in each panel wave, which reflects a certain underreporting of the incidence rates (see also Reinecke & Weins 2013). Nevertheless, both rows of means reflect the typical development of adolescents’ delinquent behavior with the peak at age 15 (year 2004) and a continuous decline thereafter. FIML estimated means, variances, and covariances are used for the GMM in section “Modeling Results”.

Modeling Results

Variants of the TPGMM (see Fig. 1) with different slope specifications are evaluated first. One specification assumes three phases with one turning point at the second panel wave and another turning point at the sixth panel wave. The factor loadings of the intercept and the three linear slopes are restricted as follows:

$$\displaystyle{ \left (\begin{array}{*{10}c} 1&0& 0 & 0\\ 1 &1 & 0 & 0 \\ 1&1&0.5& 0\\ 1 &1 &1.5 & 0 \\ 1&1&2.5& 0\\ 1 &1 &3.5 & 0 \\ 1&1&3.5&20.25\\ 1 &1 &3.5 &30.25\\ \end{array} \right ) }$$
(19)

The first linear slope (second column in the matrix) specifies the first turning point. Therefore subsequent factor loadings are fixed to the value of one. The second linear slope (third column) specifies the continuous development of delinquency up to the sixth panel wave in steps of one. Therefore subsequent factor loadings are fixed to the value of 3.5. The third slope (fourth column) reflects a faster development by squaring the value of 3.5 and adding constants (\(3.5^{2} + 8 = 20.25\) and \(3.5^{2} + 8 + 10 = 30.25\)). These fixed values were previously explored by different model specifications of the piecewise growth curve model.

Alternatively, a more parsimonious specification assumes only two slopes and a faster development of delinquency after the second wave. The factor loadings of the intercept and the two linear slopes are restricted as follows:

$$\displaystyle{ \left (\begin{array}{*{10}c} 1&0& 0\\ 1 &1 & 0 \\ 1&1& 0.25\\ 1 &1 & 2.25 \\ 1&1& 6.25\\ 1 &1 &12.25 \\ 1&1&20.25\\ 1 &1 &30.25\\ \end{array} \right ) }$$
(20)

The restrictions of the factor loadings of the first slope (second column in the matrix) are equal to the previous specification in Matrix (19). The second slope (third column) specifies the continuous development of delinquency by successively adding the constants 2, 4, 6, 8, and 10, starting from the value of 0.25. Hence the factor loadings of the second slope for the last two panel waves do not differ from the factor loadings of the third slope in Matrix (19). In general, restrictions of the factor loadings influence the form of the trajectory pieces. The direction of the development (increase or decrease of delinquency) can only be read from the sign of the particular slope mean estimates (see vector α in Eq. (7)).
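The fixed loadings of the second slope in Matrix (20) can be verified arithmetically. The sketch below rebuilds the column from the constants named in the text; the additional observation that the values coincide with the squares 0.5², 1.5², …, 5.5² is not stated in the text and is offered only as a check.

```python
from itertools import accumulate

# Constants added step by step from the third to the eighth panel wave.
increments = [2, 4, 6, 8, 10]

# Start at 0.25 (third wave) and add the running sum of the constants.
second_slope = [0.25] + [0.25 + total for total in accumulate(increments)]

# Alternative view: squared distance from the midpoint between waves 2 and 3.
squares = [(wave - 2.5) ** 2 for wave in range(3, 9)]

print(second_slope)
print(squares)
```

Both constructions yield the same six values, and the last two agree with the fixed loadings of the third slope in Matrix (19).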

TPGMM are calculated from two up to eight classes. Incidence rates are treated as a negative binomial distributed count variable (cf. Eq. (16)). Intercept and slopes are specified according to LCGA, i.e., all variances and covariances of the growth curve variables are fixed to zero (cf. section “Growth Mixture Models for Single-Phase Data”). Table 3 shows the particular fit information for the TPGMM with three and two linear phases according to the specifications in Matrices (19) and (20).

Table 3 Model fit information of the TPGMM with three and two phases

All BIC and adjusted BIC values of the models with two phases are lower than those of the corresponding models with three phases. This indicates that the development of delinquency can be modelled sufficiently well with two phases: one for the increase and one for the decrease. Regarding the TPGMM with two phases, the p-value of the LMR-LRT shows no redundancy up to six classes.

Table 4 and Fig. 4 give an overview of the six-class model. The largest class in this model represents a group of adolescents who were hardly involved in delinquent behavior during the observed period (non-offenders, 49.9 %). The second largest class is characterized by a slight increase in early adolescence and a likewise slight decrease later on. Here, delinquency was limited to the period of adolescence (adolescence-limited offenders, 16.8 %). The third largest group comprises adolescents who committed crimes only in early adolescence (low-level-decliners, 13.5 %). The fourth largest class is a group of adolescents who reported only a few crimes during the observed period (low-rate-offenders, 7.6 %). Only a small proportion of the adolescents can be classified as persistent offenders with a large incidence rate (6.9 %). A likewise small proportion is characterized by a high crime rate in early adolescence and a low crime rate later on (high-level-decliners, 5.3 %).

Fig. 4 Traditional Piecewise Growth Mixture Model with six classes. Labels of the classes (from the bottom to the top): non-offenders, low-rate-offenders, low-level-decliners, adolescence-limited-offenders, high-level-decliners, persistent offenders

Table 4 Means of the growth variables (TPGMM)

The TPGMM, however, “can distort real growth patterns in data, when a more dynamic change or a discrepancy is expected at the transition point” (Kim & Kim 2012, p. 300). The DPGMM (see Fig. 2) contains an intercept for each phase. For the substantive application, a DPGMM would have one intercept and slope for the increase of delinquency and one intercept and slope for the decrease of delinquency. This model expects a larger discrepancy at the transition point, that is, a substantial discontinuity between the phases. However, previous analyses with the CrimoC panel data did not support a discontinuity of the developmental process, and the specification of a DPGMM was therefore rejected.

As described in section “Growth Mixture Models for Multi-Phase Data”, the SPGMM extends the DPGMM by additional latent class variables, assuming that the number of classes can change between the phases. Following the results for the DPGMM, and in contrast to Fig. 3, we assume not two intercepts but only one for the phases. In addition, it is proposed that the number of classes will decrease over time. Substantively, this means that a larger unobserved heterogeneity of the trajectories is expected in early adolescence than in late adolescence. With increasing age and increasing desistance from crime, a smaller unobserved heterogeneity is expected. As with the TPGMM, the models are tested with three and two phases. In line with the assumption of decreasing heterogeneity, the number of classes is always higher in the first phase than in the subsequent phases. Table 5 shows the model fit information for the estimated SPGMMs. Model selection is limited to the BIC and adjusted BIC (the LMR-LRT is not calculated in Mplus when different class patterns are specified). As with the TPGMM, the results show that two phases are sufficient. The model with three classes in the first phase and two classes in the second phase is selected for further interpretation.

Table 5 Model fit information of the SPGMM with three and two phases

The combination of the first and the second phase leads to a six-class pattern with different combinations of classes (see Table 6):

  1. Class pattern 1 1: 12.3 % of the adolescents change from low-rate-offenders in early adolescence to non-offenders later on.

  2. Class pattern 1 2: 4.1 % of the adolescents are characterized by a high and increasing delinquency rate in early adolescence and a declining delinquency rate later on.

  3. Class pattern 2 1: Nearly half of the adolescents are characterized as non-offenders in both phases.

  4. Class pattern 2 2: 19.3 % of the adolescents show a slight increase in early adolescence and a slight decrease later on.

  5. Class pattern 3 1: 10.5 % of the adolescents are characterized as low-rate-offenders in both phases.

  6. Class pattern 3 2: 5.8 % of the adolescents show persistent delinquency on a high level with a decreasing tendency in late adolescence.

Table 6 Number and proportion of persons in the class patterns (SPGMM)

The first phase is characterized by three classes. The first one comprises adolescents who started to behave delinquently early in adolescence, partly on a high level (16.4 %). The second class encompasses non-offenders and adolescents whose crime rate increases slightly on a low level (67.2 %). Finally, low-rate offenders and high starters can be found in the third class. The second phase comprises two classes. The first of them encompasses non- and low-rate offenders, the second adolescents with decreasing delinquency (see Fig. 5).

Fig. 5 Sequential-Process Growth Mixture Model. Labels of the classes (from bottom to top): non-offenders, low-rate-offenders, low-rate-offenders \(\rightarrow\) non-offenders, early increasers \(\rightarrow\) decreasers, high-starters \(\rightarrow\) decliners, high-starters \(\rightarrow\) persisters

With the two latent class variables C1 and C2, the SPGMM is able to estimate transition probabilities. With a probability of 66 %, adolescents are more likely to remain or to become non- or low-rate-offenders during the life course than to continue acting delinquently in late adolescence. A considerable number of adolescents, however, commit crimes in late adolescence as well.
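How class-pattern shares relate to transitions between C1 and C2 can be illustrated with a short Python sketch. It uses the pattern percentages reported for the six class patterns; the share of pattern 2 1 is only described as "nearly half", so the value 48.0 is an assumption obtained as the remainder to 100 %. The resulting figures are descriptive shares based on pattern membership and need not coincide with the model-based transition probabilities estimated by the SPGMM.

```python
# Shares (in %) of the class patterns (class in phase 1, class in phase 2).
# ASSUMPTION: 48.0 for pattern (2, 1) is the remainder to 100 %; the text
# only states "nearly half".
pattern_share = {
    (1, 1): 12.3, (1, 2): 4.1,
    (2, 1): 48.0, (2, 2): 19.3,
    (3, 1): 10.5, (3, 2): 5.8,
}

def marginal(shares, position):
    """Sum pattern shares over one phase (position 0 = phase 1, 1 = phase 2)."""
    totals = {}
    for pattern, share in shares.items():
        key = pattern[position]
        totals[key] = totals.get(key, 0.0) + share
    return totals

def conditional_transitions(shares):
    """Share of each phase-1 class that moves to each phase-2 class."""
    first = marginal(shares, 0)
    return {pattern: share / first[pattern[0]] for pattern, share in shares.items()}

phase1 = marginal(pattern_share, 0)
phase2 = marginal(pattern_share, 1)
transitions = conditional_transitions(pattern_share)

for (origin, target), prob in sorted(transitions.items()):
    print(f"phase-1 class {origin} -> phase-2 class {target}: {prob:.3f}")
```

The phase-1 marginal of class 1 reproduces the 16.4 % reported for the first phase, which serves as a consistency check on the pattern shares.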

One way to compare and validate the results of different mixture model specifications is to cross-tabulate the latent classes based on the most likely latent class membership and to inspect the resulting proportions. The estimated TPGMM contains six classes and one latent class variable; the estimated SPGMM also contains six classes, which can be differentiated into different class patterns. Table 7 gives the result of the crosstabulation between the class distributions of the TPGMM and the SPGMM. Most of the individuals in pattern 1 1 of the SPGMM belong to the third class of the TPGMM (low-level decliners, 96.07 %), and nearly all individuals in pattern 1 2 of the SPGMM belong to the sixth class of the TPGMM (high-level decliners, 99.38 %). Differences between these two patterns (SPGMM) or classes (TPGMM) refer only to the level of delinquency; both patterns or classes are characterized by processes of desistance.

Table 7 Individual latent class membership of the TPGMM (row) and SPGMM (column)

Non-offenders in pattern 2 1 of the SPGMM belong entirely (100 %) to class 1 of the TPGMM. Pattern 2 2 of the SPGMM is characterized by processes of early increasing and later declining delinquency. Eighty-six percent of these individuals belong to class 2 of the TPGMM (adolescence-limited offenders). The rest of pattern 2 2 is distributed across the other classes of the TPGMM. Pattern 3 1 of the SPGMM (low-rate offenders) shows the lowest congruence with its TPGMM counterpart, class 4: only 68 % of its individuals fall into the corresponding cell of the cross-table. Nearly 17 % of this pattern belongs to the non-offender class 1 and about 10 % to the low-level-decliner class 3 of the TPGMM. Like the non-offenders, pattern 3 2 (persistent offenders) of the SPGMM belongs entirely (100 %) to class 5 of the TPGMM. In total, the crosstabulation of both class memberships confirms the stability of the latent class distributions even though the specifications of the TPGMM and the SPGMM differ. Non-offenders and low-rate offenders overlap in their developments, and their assignments can therefore differ between the models. This has been observed in previous applications of GMM with criminological panel data (see, for example, Mariotti & Reinecke 2010; Piquero 2008; Reinecke & Seddig 2011).
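The percentages discussed for Table 7 are shares within each SPGMM pattern. A minimal Python sketch of such a crosstabulation of most likely class memberships is given below; the function name and the ten toy cases are hypothetical and serve only to show the computation, not to reproduce the table.

```python
from collections import Counter

def share_within_pattern(tpgmm_class, spgmm_pattern):
    """Percentage of each SPGMM pattern that falls into each TPGMM class.

    Both arguments are equally long sequences holding the most likely class
    membership of the same individuals under the two models.
    """
    cells = Counter(zip(tpgmm_class, spgmm_pattern))
    pattern_totals = Counter(spgmm_pattern)
    return {
        (row, col): 100.0 * count / pattern_totals[col]
        for (row, col), count in cells.items()
    }

# Hypothetical memberships for ten individuals (illustration only).
tpgmm = [1, 1, 1, 3, 3, 4, 4, 1, 2, 2]
spgmm = ["2 1", "2 1", "2 1", "1 1", "1 1", "3 1", "3 1", "3 1", "2 2", "2 2"]

shares = share_within_pattern(tpgmm, spgmm)
for (tp_class, sp_pattern), pct in sorted(shares.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"SPGMM pattern {sp_pattern} in TPGMM class {tp_class}: {pct:.1f} %")
```

A pattern with a share of 100 % in a single TPGMM class corresponds to perfect agreement between the two classifications for that group, while shares spread over several classes indicate overlap.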

Conclusion

This study has shown that with an increasing number of panel waves, unobserved heterogeneity of developmental processes results not only from a mix of these developments but also from multiple phases. In contrast to the TPGMM, the SPGMM has separate mixture parts with a latent class variable in each phase. Whereas the TPGMM has only one intercept over multiple phases, the DPGMM and SPGMM specify separate intercepts as well as separate slopes. Kim and Kim (2012) showed how growth and mixture models can be extended to more complex and flexible stage-sequential growth mixture models within the structural equation modeling framework. Their model applications use continuous data related to smoking behavior. In the present study, the observed variable was treated as a count variable with overdispersion. Piecewise and stage-sequential growth mixture models have therefore been applied with the specification of a negative binomial distributed variable. All analyses, however, are limited to the LCGA specification, meaning that no variances and covariances were estimated for the growth curve variables within classes.

With eight panel waves of self-reported delinquency obtained from the CrimoC study (Boers et al. 2014), separate intercepts could not be detected and identified, while separate growth components reflect the increase and decrease of delinquency through the period of adolescence and young adulthood. If only one intercept is required, the specification of the DPGMM collapses to the TPGMM. One possible explanation is that the CrimoC study contains no experimental intervention and the different trajectory pieces therefore do not reflect phases of discontinuity.

Starting with the single-phase TPGMM, six distinct classes of delinquent developments could be identified: non-offenders who were hardly involved in delinquent behavior at all, adolescence-limited offenders with the typical development of the age-crime curve, low-level-decliners who limited their delinquency to early adolescence, low-rate-offenders who reported only a few crimes during the panel study, persistent offenders with the largest incidence rate of all classes, and high-level decliners with a high crime rate in early adolescence and a declining tendency later on. A specification with six classes could also be verified with the multiple-phase SPGMM. Different models with two and three phases were tested and compared. The first phase can be characterized by the development of delinquency in early adolescence, the second phase by the development in late adolescence. A possible third phase belongs to the period of young adulthood, which might be detected with further panel waves. The specification of the different SPGMM variants always assumes a higher number of classes in the first phase than in the second or third phase. Heterogeneity of the development of delinquency is expected to be higher in the first panel waves and to decrease thereafter. On average, this assumption was confirmed. The number of offenses decreases over time and the development of delinquency tends to become more homogeneous. Two class patterns of the final SPGMM are stable across the phases: the non-offenders and the low-rate offenders. One pattern shows the transition from low-rate to non-offending, two patterns show the transition from high starters to decliners or persisters, and another pattern is characterized by the transition from early increasing to later decreasing delinquency.
Transition parameters between the phases show that the probability of remaining or becoming a non- or low-rate-offender is much higher than the probability of persisting as a delinquent person during the life course. The cross-table of the most likely latent class memberships of the TPGMM and the SPGMM reflects the stability of the classification and serves as a quality check for the substantive interpretations.

Although applications of single- and multi-phase mixture models are very useful techniques for longitudinal criminological research in various fields, some unresolved issues have to be mentioned. The complexity of the models requires not only large sample sizes but also a large number of starting values. In the initial stage, 500 random sets of starting values were generated and an optimization was run for each set. The ending values of the 20 optimizations with the highest log-likelihoods were used as starting values in the final stage. Under the assumption of a negative binomial distribution, stable results could only be obtained with the LCGA specification. Evaluation of model fit is not the same for single-phase and multiple-phase mixture models. The LMR-LRT is only available for models with one latent class variable, while the statistical evaluation of multiple-phase models is limited to descriptive information criteria, with preference given to the adjusted BIC (Kim 2014). In addition, the large number of zeros in the incidence rates can be accounted for by an inflation part of the particular mixture model (Reinecke & Seddig 2011). This extension has to be studied in future applications of stage-sequential growth mixture models.
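The two-stage use of random starting values can be illustrated with a small Python sketch. It is not the estimation routine used for the analyses: as a stand-in, it fits a two-component Poisson mixture by EM to a hypothetical vector of counts, and the function names, the numbers of starts (50 and 5 instead of 500 and 20) and the iteration limits are assumptions chosen to keep the example small.

```python
import math
import random

def log_poisson(y, lam):
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def component_logs(y, weight, lam1, lam2):
    return (math.log(weight) + log_poisson(y, lam1),
            math.log(1.0 - weight) + log_poisson(y, lam2))

def log_likelihood(data, weight, lam1, lam2):
    total = 0.0
    for y in data:
        a, b = component_logs(y, weight, lam1, lam2)
        m = max(a, b)
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total

def em_steps(data, params, n_steps):
    """Run a fixed number of EM iterations from the given parameter values."""
    weight, lam1, lam2 = params
    for _ in range(n_steps):
        resp = []  # E-step: posterior probability of component 1
        for y in data:
            a, b = component_logs(y, weight, lam1, lam2)
            m = max(a, b)
            resp.append(math.exp(a - m) / (math.exp(a - m) + math.exp(b - m)))
        n1 = sum(resp)  # M-step, with bounds that keep all logarithms defined
        n2 = len(data) - n1
        weight = min(max(n1 / len(data), 1e-6), 1.0 - 1e-6)
        lam1 = max(sum(r * y for r, y in zip(resp, data)) / max(n1, 1e-9), 1e-6)
        lam2 = max(sum((1.0 - r) * y for r, y in zip(resp, data)) / max(n2, 1e-9), 1e-6)
    return weight, lam1, lam2

def two_stage_fit(data, n_initial=50, n_final=5, initial_steps=10,
                  final_steps=200, seed=1):
    rng = random.Random(seed)
    stage1 = []
    for _ in range(n_initial):  # initial stage: short runs from random starts
        start = (rng.uniform(0.1, 0.9), rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0))
        params = em_steps(data, start, initial_steps)
        stage1.append((log_likelihood(data, *params), params))
    stage1.sort(key=lambda item: item[0], reverse=True)
    stage2 = []
    for _, params in stage1[:n_final]:  # final stage: continue the best runs
        params = em_steps(data, params, final_steps)
        stage2.append((log_likelihood(data, *params), params))
    return max(stage2, key=lambda item: item[0])

# Hypothetical counts: many low values and a smaller group of high values.
data = [0] * 30 + [1] * 25 + [2] * 10 + [7] * 8 + [8] * 10 + [9] * 7 + [10] * 5
best_loglik, (weight, lam1, lam2) = two_stage_fit(data)
print(f"log-likelihood {best_loglik:.2f}, weight {weight:.3f}, "
      f"rates {lam1:.2f} and {lam2:.2f}")
```

Checking whether the best log-likelihood is replicated by several final-stage runs is a common safeguard against local maxima; the sketch keeps only the single best solution for brevity.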