Multilevel Analysis, Hierarchical Linear Models
The term “Multilevel Analysis” is mostly used interchangeably with “Hierarchical Linear Modeling,” although strictly speaking these terms are distinct. Multilevel Analysis may be understood to refer broadly to the methodology for research questions and data structures that involve more than one type of unit. It originated in studies involving several levels of aggregation, such as individuals and counties, or pupils, classrooms, and schools. Robinson’s (1950) discussion of the ecological fallacy, in which associations between variables at one level of aggregation are mistakenly regarded as evidence for associations at a different aggregation level (see Alker 1969 for an extensive review), led to interest in how to analyze data that include several aggregation levels. This situation arises as a matter of course in educational research, and studies of the contributions made by different sources of variation, such as students, teachers, classroom composition, and school organization, were seminal in the development of statistical methodology in the 1980s (see the review in Chap. 1 of de Leeuw and Meijer 2008). The basic idea is that studying the simultaneous effects of variables at the levels of students, teachers, classrooms, etc., on student achievement requires regression-type models that comprise error terms for each of those levels separately; this is similar to the mixed effects models studied in the traditional linear models literature, such as Scheffé (1959).
The prototypical statistical model that expresses this is the Hierarchical Linear Model, which is a mixed effects regression model for nested designs. In the two-level situation (applicable, e.g., to a study of students in classrooms) it can be expressed as follows. The more detailed level (students) is called the lower level, or level 1; the grouping level (classrooms) is called the higher level, or level 2. Highlighting the distinction with regular regression models, the terminology speaks of units rather than cases, and there are specific types of unit at each level. In our example, the level-1 units, students, are denoted by \(i\) and the level-2 units, classrooms, by \(j\). Level-1 units are nested in level-2 units (each student is a member of exactly one classroom) and the data structure is allowed to be unbalanced, such that \(j\) runs from 1 to \(N\) while \(i\) runs, for a given \(j\), from 1 to \(n_j\). The basic two-level hierarchical linear model can be expressed as
\[ Y_{ij} = \beta_0 + \sum_{h=1}^{r} \beta_h x_{hij} + U_{0j} + \sum_{h=1}^{p} U_{hj} z_{hij} + R_{ij} \qquad (1) \]
or, more succinctly, as
\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{U} + \mathbf{R}. \]
Here \(Y_{ij}\) is the dependent variable, defined for level-1 unit \(i\) within level-2 unit \(j\); the variables \(x_{hij}\) and \(z_{hij}\) are the explanatory variables. The variables \(R_{ij}\) are residual terms, or error terms, at level 1, while the \(U_{hj}\) for \(h = 0, \ldots, p\) are residual terms, or error terms, at level 2. The case \(p = 0\) is called a random intercept model; for \(p \geq 1\) it is called a random slope model. The usual assumption is that all \(R_{ij}\) and all vectors \(U_j = (U_{0j}, \ldots, U_{pj})\) are independent, \(R_{ij}\) having a normal \(\mathcal{N}(0,{\sigma }^{2})\) and \(U_j\) having a multivariate normal \({\mathcal{N}}_{p+1}(\mathbf{0},\mathbf{T})\) distribution. The parameters \(\beta_h\) are regression coefficients (fixed effects), while the \(U_{hj}\) are random effects. The presence of both makes (1) a mixed linear model. In most practical cases, the variables with random effects are a subset of the variables with fixed effects (\(x_{hij} = z_{hij}\) for \(h \leq p\); \(p \leq r\)), but this is not necessary.
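To make the roles of the two error terms concrete, the following minimal sketch draws data from a random intercept model (p = 0) with a single explanatory variable. The function name, the parameter values, and the group sizes are illustrative assumptions, not part of the entry; the sketch only checks that the combined residual U_0j + R_ij has variance close to the sum of the level-2 and level-1 variances.

```python
import random
import statistics

def simulate_random_intercept(group_sizes, beta0, beta1, tau0, sigma, seed=0):
    """Draw (j, x_ij, y_ij) from Y_ij = beta0 + beta1 * x_ij + U_0j + R_ij."""
    rng = random.Random(seed)
    rows = []
    for j, n_j in enumerate(group_sizes):
        u0j = rng.gauss(0.0, tau0)        # level-2 random effect, shared within group j
        for _ in range(n_j):              # n_j may differ between groups (unbalanced)
            x = rng.gauss(0.0, 1.0)
            r = rng.gauss(0.0, sigma)     # level-1 residual, one per unit
            rows.append((j, x, beta0 + beta1 * x + u0j + r))
    return rows

# Hypothetical design: 400 groups with sizes 5, 10, 15, 20 repeating.
sizes = [5 + (j % 4) * 5 for j in range(400)]
data = simulate_random_intercept(sizes, beta0=1.0, beta1=0.5, tau0=1.0, sigma=2.0)

# Remove the fixed part using the true coefficients; what remains is U_0j + R_ij.
resid = [y - (1.0 + 0.5 * x) for _, x, y in data]
total_var = statistics.pvariance(resid)   # should be near tau0**2 + sigma**2 = 5
```

Units in the same group share the draw of U_0j, which is what induces the within-group dependence that an ordinary regression model ignores.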
More Than Two Levels
This model can be extended to a model with three or more levels, for data with three or more nested levels, by including random effects at each of these levels. For example, for a three-level structure where level-3 units are denoted by \(k = 1, \ldots, M\), level-2 units by \(j = 1, \ldots, N_k\), and level-1 units by \(i = 1, \ldots, n_{jk}\), the model is
\[ Y_{ijk} = \beta_0 + \sum_{h=1}^{r} \beta_h x_{hijk} + U_{0jk} + \sum_{h=1}^{p} U_{hjk} z_{hijk} + V_{0k} + \sum_{h=1}^{q} V_{hk} w_{hijk} + R_{ijk} \qquad (2) \]
where the \(U_{hjk}\) are the random effects at level 2, the \(V_{hk}\) are the random effects at level 3, and the \(w_{hijk}\) are the explanatory variables that have random effects at level 3. An example is research into outcome variables \(Y_{ijk}\) of students (\(i\)) nested in classrooms (\(j\)) nested in schools (\(k\)); the presence of error terms at all three levels provides a basis for testing effects of pupil variables, classroom or teacher variables, and school variables.
The development both of inferential methods and of applications was oriented first to this type of nested model, but much interest is now also given to the more general case where the restriction to nested random effects is dropped. In this sense, multilevel analysis refers to the methodology for research questions and data structures that involve several sources of variation; each type of unit then refers to a specific source of variation, with or without nesting. In social science applications this can be fruitfully applied to research questions in which different types of actor and context are involved; e.g., patients, doctors, hospitals, and insurance companies in health-related research; or students, teachers, schools, and neighborhoods in educational research. The word “level” is then used for such a type of unit. Given the use of random effects, the most natural applications are those where each “level” is associated with some population of units.
Longitudinal Studies
A special area of application of multilevel models is longitudinal studies, in which the lowest level corresponds to repeated observations of the level-2 units. Often the level-2 units are individuals, but these may also be organizations, countries, etc. This application of mixed effects models was pioneered by Laird and Ware (1982). An important advantage of the hierarchical linear model over other statistical models for longitudinal data is the possibility to obtain parameter estimates and tests also under highly unbalanced situations, where the number of observations per individual, and the time points at which they are measured, differ between individuals. Another advantage is the possibility of seamless integration with nesting of individuals within higher-level units.
Model Specification
The usual considerations for model specification in linear models apply here, too, but additional considerations arise from the presence in the model of the random effects and the data structure being nested or having multiple types of unit in some other way. An important practical issue is to avoid the ecological fallacy mentioned above; i.e., to attribute fixed effects to the correct level. In the original paper by Robinson (1950), one of the examples was about the correlation between literacy and ethnic background as measured in the USA in the 1930s, computed as a correlation at the individual level, or at the level of averages for large geographical regions. The correlation was .203 between individuals, and .946 between regions, illustrating how widely different correlations at different levels of aggregation may be.
Consider a two-level model (1) where variable \(X_1\) with values \(x_{1ij}\) is defined as a level-1 variable (literacy in Robinson’s example). For “level-2 units” we also use the term “groups.” To avoid the ecological fallacy, one will have to include a relevant level-2 variable that reflects the composition of the level-2 units with respect to variable \(X_1\). The most commonly used composition variable is the group mean of \(X_1\),
\[ \bar{x}_{1.j} = \frac{1}{n_j} \sum_{i=1}^{n_j} x_{1ij}. \]
The usual procedure then is to include \(x_{1ij}\) as well as \(\bar{x}_{1.j}\) among the explanatory variables with fixed effects. This allows separate estimation of the within-group regression (the coefficient of \(x_{1ij}\)) and the between-group regression (the sum of the coefficients of \(x_{1ij}\) and \(\bar{x}_{1.j}\)).
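The distinction between within-group and between-group regressions can be illustrated with a small deterministic sketch. The data below are hypothetical and constructed so that the within-group slope is -1 and the between-group slope is +2; the helper `slope` and all variable names are assumptions made for this illustration. Least squares is used here in place of the mixed-model estimation discussed in the entry, which is enough to show how the regression ignoring groups mixes the two slopes.

```python
from statistics import mean

def slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data: 4 groups of 3 units; y = 2 * group_mean - 1 * (x - group_mean).
groups = {}
for j, group_mean in enumerate([0.0, 2.0, 4.0, 6.0]):
    xs = [group_mean + d for d in (-1.0, 0.0, 1.0)]
    ys = [2.0 * group_mean - 1.0 * (x - group_mean) for x in xs]
    groups[j] = (xs, ys)

# Within-group regression: pool the deviations from the group means.
dev_x = [x - mean(xs) for xs, ys in groups.values() for x in xs]
dev_y = [y - mean(ys) for xs, ys in groups.values() for y in ys]
within = slope(dev_x, dev_y)

# Between-group regression: regress group means of y on group means of x.
between = slope([mean(xs) for xs, _ in groups.values()],
                [mean(ys) for _, ys in groups.values()])

# Regression that ignores the grouping lies between the two.
all_x = [x for xs, _ in groups.values() for x in xs]
all_y = [y for _, ys in groups.values() for y in ys]
total = slope(all_x, all_y)

# Coefficient of the group mean when x itself is also in the model.
contextual = between - within
```

In this construction the coefficient of the level-1 variable is the within-group slope, and adding the coefficient of the group mean (`contextual`) recovers the between-group slope, matching the decomposition described above.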
In some cases, notably in many economic studies (see Greene 2003), researchers are interested especially in the within-group regression coefficients, and wish to control for the possibility of unmeasured heterogeneity between the groups. If there is no interest in the between-group regression coefficients, one may use a model with fixed effects for all the groups: in the simplest case this is
\[ Y_{ij} = \beta_0 + \sum_{h=1}^{r} \beta_h x_{hij} + \gamma_j + R_{ij}. \qquad (3) \]
The parameters \(\gamma_j\) (which here have to be restricted, e.g., to have mean 0, in order to achieve identifiability) then represent all differences between the level-2 units, as far as these differences apply as a constant additive term to all level-1 units within the group. For example, in the case of longitudinal studies where level-2 units are individuals and a linear model is used, this will represent all time-constant differences between individuals. Note that (3) is a linear model with only one error term.
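A short derivation may help to show why a model with group-specific fixed effects yields the within-group regression. It is written for a single level-1 variable, which is a simplifying assumption, and the notation \(\bar{Y}_{.j}\) and \(\bar{R}_{.j}\) for group means is introduced here by analogy with \(\bar{x}_{1.j}\). Starting from

\[ Y_{ij} = \beta_0 + \beta_1 x_{1ij} + \gamma_j + R_{ij}, \]

averaging over the level-1 units in group \(j\) gives

\[ \bar{Y}_{.j} = \beta_0 + \beta_1 \bar{x}_{1.j} + \gamma_j + \bar{R}_{.j}, \]

and subtracting the second equation from the first removes both \(\beta_0\) and \(\gamma_j\):

\[ Y_{ij} - \bar{Y}_{.j} = \beta_1 \left( x_{1ij} - \bar{x}_{1.j} \right) + \left( R_{ij} - \bar{R}_{.j} \right). \]

The coefficient \(\beta_1\) is therefore determined only by variation within groups, so any difference between groups that is constant within the group cannot bias it.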
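Model (1) implies the following distribution for the vector \(\mathbf{Y}_j\) of outcomes in group \(j\), where \(\mathbf{X}_j\) and \(\mathbf{Z}_j\) are the corresponding design matrices for the fixed and random effects, each including a column of ones for the intercept:
\[ \mathbf{Y}_j \sim \mathcal{N}\left( \mathbf{X}_j \boldsymbol{\beta},\; \mathbf{Z}_j \mathbf{T} \mathbf{Z}_j^{\prime} + \sigma^2 \mathbf{I}_{n_j} \right). \]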
Generalizations are possible where the level-1 residual terms \(R_{ij}\) are not i.i.d.; they can be heteroscedastic, have time-series dependence, etc. The specification of the variables \(Z\) having random effects is crucial to obtain a well-fitting model. See Chap. 9 of Snijders and Bosker (1999), Chap. 9 of Raudenbush and Bryk (2002), and Chap. 3 of de Leeuw and Meijer (2008).
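Under the independence and normality assumptions stated for model (1), the outcomes within one level-2 unit have covariance matrix Z T Z' + σ² I, where Z holds that unit's values of the variables with random effects (with a leading column of ones). The sketch below computes this matrix for one group; the function name and all numerical values are hypothetical and chosen only for illustration.

```python
def implied_covariance(Z, T, sigma2):
    """Return Z T Z' + sigma2 * I for one level-2 unit (matrices as lists of lists)."""
    n, q = len(Z), len(T)
    ZT = [[sum(Z[i][a] * T[a][b] for a in range(q)) for b in range(q)]
          for i in range(n)]
    V = [[sum(ZT[i][b] * Z[k][b] for b in range(q)) for k in range(n)]
         for i in range(n)]
    for i in range(n):
        V[i][i] += sigma2   # level-1 variance enters only on the diagonal
    return V

# Random intercept model (p = 0): Z is a column of ones, T is 1 x 1.
V_intercept = implied_covariance([[1.0], [1.0], [1.0]], [[2.0]], sigma2=1.0)

# Random slope model (p = 1) with hypothetical z-values 0, 1, 2 for the three units.
T = [[2.0, 0.5],
     [0.5, 1.0]]
V_slope = implied_covariance([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], T, sigma2=1.0)
```

In the random intercept case every pair of units in the group has the same covariance, whereas with a random slope both the variances and the covariances depend on the values of z.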
Inference
A major reason for the take-off of multilevel analysis in the 1980s was the development of algorithms for maximum likelihood estimation for unbalanced nested designs. The EM algorithm (Dempster et al. 1981), Iterative Generalized Least Squares (Goldstein 1986), and Fisher Scoring (Longford 1987) were applied to obtain ML estimates for hierarchical linear models. The MCMC implementation of Bayesian procedures has proved very useful for a large variety of more complex multilevel models, both for non-nested random effects and for generalized linear mixed models; see Browne and Draper (2000) and Chap. 2 of de Leeuw and Meijer (2008).
Hypothesis tests for the fixed coefficients \(\beta_h\) can be carried out by Wald or Likelihood Ratio tests in the usual way. For testing parameters of the random effects, some care must be taken because the estimates of the random effect variances \(\tau_{hh}^{2}\) (the diagonal elements of \(\mathbf{T}\)) are not approximately normally distributed if \(\tau_{hh}^{2} = 0\). Tests for these parameters can be based on estimated fixed effects, using least squares estimates for \(U_{hj}\) in a specification where these are treated as fixed effects (Raudenbush and Bryk 2002, Chap. 3); based on appropriate distributions of the log likelihood ratio; or obtained as score tests (Berkhof and Snijders 2001).
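As an illustration of what an appropriate distribution of the log likelihood ratio can look like, consider the simplest case of testing whether a single random intercept variance is zero. A commonly used large-sample reference distribution in that case is a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom, which amounts to halving the usual chi-square p-value. The sketch below assumes this case only; the function names and the deviance difference of 3.84 are hypothetical.

```python
import math

def chi2_sf_1df(x):
    """Upper tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0

def lrt_pvalue_variance_component(deviance_difference):
    """P-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if deviance_difference <= 0:
        return 1.0
    return 0.5 * chi2_sf_1df(deviance_difference)

naive = chi2_sf_1df(3.84)                          # treats the test as a regular one
adjusted = lrt_pvalue_variance_component(3.84)     # accounts for the boundary value
```

Using the unadjusted chi-square reference in this situation gives a conservative test, because the null value of the variance lies on the boundary of the parameter space. Tests involving random slopes require other mixtures and are not covered by this sketch.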
About the Author
Professor Snijders is Elected Member of the European Academy of Sociology (2006) and Elected Correspondent of the Royal Netherlands Academy of Arts and Sciences (2007). He was awarded the Order of Knight of the Netherlands Lion (2008). Professor Snijders was Chairman of the Department of Statistics, Measurement Theory, and Information Technology, of the University of Groningen (1997–2000). He has supervised 52 Ph.D. students. He has been associate editor of various journals, and Editor of Statistica Neerlandica (1986–1990). Currently he is co-editor of Social Networks, Associate editor of Annals of Applied Statistics, and Associate editor of Journal of Social Structure. Professor Snijders has (co-)authored about 100 refereed papers and several books, including Multilevel analysis. An introduction to basic and advanced multilevel modeling. (with Bosker, R.J., London etc.: Sage Publications, 1999). In 2005, he was awarded an honorary doctorate in the Social Sciences from the University of Stockholm.
References and Further Reading
To explore current research activities and to obtain information, training materials, etc., visit the website www.cmm.bristol.ac.uk. There is also an on-line discussion group at www.jiscmail.ac.uk/lists/multilevel.html.
There is a variety of textbooks, such as Goldstein (2003), Longford (1993), Raudenbush and Bryk (2002), and Snijders and Bosker (1999). A wealth of material is contained in de Leeuw and Meijer (2008).
Alker HR (1969) A typology of ecological fallacies. In: Dogan M, Rokkan S (eds) Quantitative ecological analysis in the social sciences. MIT Press, Cambridge, pp 69–86
Berkhof J, Snijders TAB (2001) Variance component testing in multilevel models. J Educ Behav Stat 26:133–152
Browne WJ, Draper D (2000) Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Computational Stat 15:391–420
de Leeuw J, Meijer E (2008) Handbook of multilevel analysis. Springer, New York
Dempster AP, Rubin DB, Tsutakawa RK (1981) Estimation in covariance components models. J Am Stat Assoc 76:341–353
Goldstein H (1986) Multilevel mixed linear model analysis using iterative generalized least squares. Biometrika 73:43–56
Goldstein H (2003) Multilevel statistical models, 3rd edn. Edward Arnold, London
Greene W (2003) Econometric analysis, 5th edn. Prentice Hall, Upper Saddle River
Laird NM, Ware JH (1982) Random-effects models for longitudinal data. Biometrics 38:963–974
Longford NT (1987) A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects. Biometrika 74:812–827
Longford NT (1993) Random coefficient models. Oxford University Press, New York
Raudenbush SW, Bryk AS (2002) Hierarchical linear models: applications and data analysis methods, 2nd edn. Sage, Thousand Oaks
Robinson WS (1950) Ecological correlations and the behavior of individuals. Am Sociol Rev 15:351–357
Scheffé H (1959) The analysis of variance. Wiley, New York
Snijders TAB, Bosker RJ (1999) Multilevel analysis: an introduction to basic and advanced multilevel modeling. Sage, London
© 2011 Springer-Verlag Berlin Heidelberg
Cite this entry
Snijders, T.A.B. (2011). Multilevel Analysis. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_387
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04897-5
Online ISBN: 978-3-642-04898-2
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering