Abstract
It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between the latent variables and dichotomous observed variables, which may be responses to tests or questionnaires. It is shown that the multilevel model with measurement error in the observed predictor variables can be estimated in a Bayesian framework using Gibbs sampling. Handling measurement error via the normal ogive model is compared with alternative approaches using the classical true score model. Examples using real data are given.
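As a rough illustration of the measurement model named in the abstract, the sketch below computes normal ogive response probabilities and simulates dichotomous responses for one person. It assumes the common two-parameter form P(Y = 1 | theta) = Phi(a * theta - b); the function names, the item parameter values, and the latent value are hypothetical and are not taken from the article. It shows only the link between the latent predictor and the observed items, not the multilevel model or the Gibbs sampler.

```python
import random
from statistics import NormalDist

# Standard normal distribution; its CDF is the normal ogive link Phi.
_STD_NORMAL = NormalDist()


def normal_ogive_prob(theta, a, b):
    """Return P(Y = 1 | theta) = Phi(a * theta - b).

    Assumed parameterization: a is the item discrimination,
    b is the item difficulty (threshold).
    """
    return _STD_NORMAL.cdf(a * theta - b)


def simulate_responses(theta, items, rng):
    """Draw one 0/1 response per item for a person with latent value theta."""
    return [
        1 if rng.random() < normal_ogive_prob(theta, a, b) else 0
        for a, b in items
    ]


# Hypothetical (a, b) item parameters and a hypothetical latent predictor value.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5)]
theta = 0.3

rng = random.Random(0)  # seeded so the simulated data are reproducible
responses = simulate_responses(theta, items, rng)
probabilities = [normal_ogive_prob(theta, a, b) for a, b in items]
```

The observed responses are noisy indicators of `theta`, which is why the latent value, rather than an observed sum score, enters the regression model as the predictor.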
References
Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.
Bock, R.D., & Zimowski, M.F. (1997). Multiple group IRT. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York, NY: Springer.
Béguin, A.A. (2000). Robustness of equating high-stakes tests. Unpublished doctoral dissertation, Twente University, Enschede, Netherlands.
Béguin, A.A., & Glas, C.A.W. (2001). MCMC estimation of multidimensional IRT models. Psychometrika, 66, 541–562.
Bernardo, J.M., & Smith, A.F.M. (1994). Bayesian theory. New York, NY: John Wiley & Sons.
Best, N.G., Cowles, M.K., & Vines, S.K. (1995). CODA: Convergence diagnosis and output analysis software for Gibbs sampler output: Version 0.3 [Computer software and manual]. University of Cambridge: MRC Biostatistics Unit.
Bollen, K.A. (1989). Structural equations with latent variables. New York, NY: John Wiley & Sons.
Bosker, R.J., Blatchford, P., & Meijnen, G.W. (1999). Enhancing educational excellence, equity and efficiency. In R.J. Bosker, B.P.M. Creemers, & S. Stringfield (Eds.), Evidence from evaluations of systems and schools in change (pp. 89–112). Dordrecht/Boston/London: Kluwer Academic Publishers.
Box, G.E.P., & Tiao, G.C. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley Publishing.
Bryk, A.S., & Raudenbush, S.W. (1992). Hierarchical linear models. Newbury Park, CA: Sage Publications.
Carlin, B.P., & Louis, T.A. (1996). Bayes and empirical Bayes methods for data analysis. London: Chapman & Hall.
Carroll, R., Ruppert, D., & Stefanski, L.A. (1995). Measurement error in nonlinear models. London: Chapman & Hall.
Chen, M.-H., & Shao, Q.-M. (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8, 69–92.
Chib, S., & Greenberg, E. (1995). Understanding the Metropolis-Hastings algorithm. The American Statistician, 49, 327–335.
Cook, T.D., & Campbell, D.T. (1979). Quasi-experimentation, design & analysis issues for field settings. Chicago, IL: Rand McNally College Publishing.
de Leeuw, J., & Kreft, I.G.G. (1986). Random coefficient models for multilevel analysis. Journal of Educational and Behavioral Statistics, 11, 57–86.
Fox, J.-P. (2001). Multilevel IRT: A Bayesian perspective on estimating parameters and testing statistical hypotheses. Unpublished doctoral dissertation, Twente University, Enschede, Netherlands.
Fox, J.-P., & Glas, C.A.W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 269–286.
Fuller, W.A. (1987). Measurement error models. New York, NY: John Wiley & Sons.
Gelfand, A.E., & Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
Gelfand, A.E., Hills, S.E., Racine-Poon, A., & Smith, A.F.M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. Journal of the American Statistical Association, 85, 972–985.
Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian data analysis. London: Chapman & Hall.
Gelman, A., Meng, X.-L., & Stern, H.S. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733–807.
Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.
Gilks, W.R., & Roberts, G.O. (1996). Strategies for improving MCMC. In W.R. Gilks, S. Richardson, & D.J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in practice (pp. 89–114). London: Chapman & Hall.
Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.
Gruber, M.H.J. (1998). Improving efficiency by shrinkage. New York, NY: Marcel Dekker.
Hoijtink, H., & Boomsma, A. (1995). On person parameter estimation in the dichotomous Rasch model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 53–68). New York, NY: Springer.
Johnson, V.E., & Albert, J.H. (1999). Ordinal data modeling. New York, NY: Springer-Verlag.
Lindley, D.V., & Smith, A.F.M. (1972). Bayes estimates for the linear model. Journal of the Royal Statistical Society, Series B, 34, 1–41.
Liu, J.S., Wong, W.H., & Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika, 81, 27–40.
Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
MacEachern, S.N., & Berliner, L.M. (1994). Subsampling the Gibbs sampler. The American Statistician, 48, 188–190.
McDonald, R.P. (1967). Nonlinear factor analysis. Psychometrika Monograph Number 15.
McDonald, R.P. (1982). Linear versus nonlinear models in latent trait theory. Applied Psychological Measurement, 6, 379–396.
McDonald, R.P. (1997). Normal-ogive multidimensional model. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 257–269). New York, NY: Springer.
Muthén, B.O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54, 557–585.
Patz, R.J., & Junker, B.W. (1999). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.
Raudenbush, S.W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13, 85–116.
Raudenbush, S.W., Bryk, A.S., Cheong, Y.F., & Congdon, R.T., Jr. (2000). HLM 5. Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.
Richardson, S. (1996). Measurement error. In W.R. Gilks, S. Richardson, & D.J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in practice (pp. 401–417). London: Chapman & Hall.
Robert, C.P., & Casella, G. (1999). Monte Carlo statistical methods. New York, NY: Springer.
Roberts, G.O., & Sahu, S.K. (1997). Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. Journal of the Royal Statistical Society, Series B, 59, 291–317.
Seltzer, M.H. (1993). Sensitivity analysis for fixed effects in the hierarchical model: A Gibbs sampling approach. Journal of Educational Statistics, 18, 207–235.
Seltzer, M.H., Wong, W.H., & Bryk, A.S. (1996). Bayesian analysis in applications of hierarchical models: Issues and methods. Journal of Educational and Behavioral Statistics, 21, 131–167.
Snijders, T.A.B., & Bosker, R.J. (1999). Multilevel analysis. London: Sage Publications.
Tanner, M.A., & Wong, W.H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528–550.
Tierney, L. (1994). Markov chains for exploring posterior distributions. The Annals of Statistics, 22, 1701–1762.
van der Linden, W.J. (1998). Optimal assembly of psychological and educational tests. Applied Psychological Measurement, 22, 195–211.
Zellner, A. (1971). An introduction to Bayesian inference in econometrics. New York, NY: John Wiley & Sons.
Zimowski, M.F., Muraki, E., Mislevy, R.J., & Bock, R.D. (1996). Bilog MG, Multiple-group IRT analysis and test maintenance for binary items. Chicago, IL: Scientific Software International.
Additional information
This paper is part of the dissertation by Fox (2001) that won the 2002 Psychometric Society Dissertation Award.
Cite this article
Fox, J.-P., & Glas, C.A.W. (2003). Bayesian modeling of measurement error in predictor variables using item response theory. Psychometrika, 68, 169–191. https://doi.org/10.1007/BF02294796
DOI: https://doi.org/10.1007/BF02294796