Abstract
This chapter deals with the various ways in which test takers' answers to test tasks or questions can be recorded and coded ("response formats"). These give rise to different item types. The free response format is contrasted with the fixed response format, weighing the advantages and disadvantages of each. Within the latter, ordering tasks and selection tasks, as well as continuous and discrete rating tasks, are the most widely used item types. Drawing on numerous examples, the chapter addresses many practical aspects of item construction and discusses them with reference to different objectives. It closes with decision aids for choosing a task type.
Copyright information
© 2020 Springer-Verlag GmbH Deutschland, part of Springer Nature
About this chapter
Cite this chapter
Moosbrugger, H., Brandt, H. (2020). Antwortformate und Itemtypen. In: Moosbrugger, H., Kelava, A. (eds) Testtheorie und Fragebogenkonstruktion. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-61532-4_5
DOI: https://doi.org/10.1007/978-3-662-61532-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-61531-7
Online ISBN: 978-3-662-61532-4
eBook Packages: Psychology (German Language)