Abstract
Large-scale assessments of student competencies address rather broad constructs and use parsimonious, unidimensional measurement models. Differential item functioning (DIF) in certain subpopulations has usually been interpreted as error or bias. Recent work in educational measurement, however, assumes that DIF reflects the multidimensionality that is inherent in broad competency constructs and leads to differential achievement profiles. Thus, DIF parameters can be used to identify the relative strengths and weaknesses of certain student subpopulations.
The present paper explores profiles of mathematical competencies in upper secondary students from six countries (Austria, France, Germany, Sweden, Switzerland, the US). DIF analyses are combined with analyses of the cognitive demands of test items based on psychological conceptualisations of mathematical problem solving. Experts judged the cognitive demands of TIMSS test items, and these demand ratings were correlated with DIF parameters. We expected that cultural framings and instructional traditions would lead to specific aspects of mathematical problem solving being fostered in classroom instruction, which should be reflected in differential item functioning in international comparative assessments. Results for the TIMSS mathematics test were in line with expectations about cultural and instructional traditions in mathematics education of the six countries.
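The step of relating demand ratings to DIF parameters can be illustrated with a minimal sketch. It is not the authors' procedure: it replaces Rasch-based DIF estimation with a simple logit transform of proportions correct, and the country labels, proportions, and demand ratings are invented for illustration only. All function names are hypothetical.

```python
import math

def logit_difficulty(p_correct):
    """Item difficulty on the logit scale, from a proportion correct."""
    return -math.log(p_correct / (1.0 - p_correct))

def centered(values):
    """Subtract the mean, e.g. to remove a country's overall achievement level."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dif_parameters(p_by_country):
    """Per-country DIF: centered item difficulty minus its cross-country mean."""
    diffs = {c: centered([logit_difficulty(p) for p in ps])
             for c, ps in p_by_country.items()}
    n_items = len(next(iter(diffs.values())))
    item_mean = [sum(d[i] for d in diffs.values()) / len(diffs)
                 for i in range(n_items)]
    return {c: [d[i] - item_mean[i] for i in range(n_items)]
            for c, d in diffs.items()}

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    xc, yc = centered(x), centered(y)
    num = sum(a * b for a, b in zip(xc, yc))
    den = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in yc))
    return num / den

# Invented proportions correct for four items in two hypothetical countries.
p_by_country = {
    "A": [0.80, 0.70, 0.40, 0.30],
    "B": [0.60, 0.55, 0.50, 0.45],
}
# Invented expert ratings of one cognitive demand for the same four items.
demand = [1, 2, 3, 4]

dif = dif_parameters(p_by_country)
# A positive correlation means items high on this demand are relatively
# harder in that country, i.e. a relative weakness on that demand.
corr = {c: pearson(d, demand) for c, d in dif.items()}
```

In this toy setup, a country's DIF values sum to zero across items, so the correlation describes its profile of relative strengths and weaknesses rather than its overall achievement level.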
Summary
Large-scale assessments of student competencies address fairly global dimensions and use restricted, unidimensional measurement models.
Differential item functioning (DIF), observed in certain subpopulations, has been interpreted as error or bias. Recent work in educational assessment suggests, however, that DIF reflects the multidimensionality inherent in competency dimensions, which leads to differential competency profiles. Consequently, the parameters from DIF analyses can identify the relative strengths and weaknesses of certain student subpopulations.
This article examines profiles of mathematical competencies among upper secondary students from six countries (Austria, France, Germany, Sweden, Switzerland, and the United States). DIF analyses were combined with an analysis of the cognitive demands of the items, based on psychological concepts of mathematical problem solving. Experts judged the cognitive demands of the TIMSS items, and these judgments were then related to the DIF parameters.
Our hypothesis was that different cultural frameworks and teaching traditions should translate into different priorities being given to different aspects of problem solving in the classroom, a phenomenon that DIF analyses should reveal in international comparative assessments. The results of the TIMSS mathematics test were consistent with expectations based on the cultural and instructional traditions of mathematics education in the six countries examined.
Cite this article
Klieme, E., Baumert, J. Identifying national cultures of mathematics education: Analysis of cognitive demands and differential item functioning in TIMSS. Eur J Psychol Educ 16, 385–402 (2001). https://doi.org/10.1007/BF03173189
DOI: https://doi.org/10.1007/BF03173189