Abstract
Secondary teachers across the United States are being asked to use formative assessment data (Black and Wiliam 1998a,b; Roediger and Karpicke 2006) to inform their classroom instruction. At the same time, critics of the US government’s No Child Left Behind legislation are calling the bill “No Child Left Untested”. Among other things, critics point out that every hour spent assessing students is an hour lost from instruction. But does it have to be? What if we better integrated assessment into classroom instruction and allowed students to learn during the test? We developed an approach that provides immediate tutoring on practice assessment items that students cannot solve on their own. Our hypothesis is that we can achieve more accurate assessment by using not only data on whether students get test items right or wrong, but also data on the effort required for students to solve a test item with instructional assistance. We have integrated assistance and assessment in the ASSISTment system. The system helps teachers make better use of their time by offering instruction to students while providing teachers with a more detailed evaluation of student abilities, which is impossible under current approaches. To assess student math proficiency, we use the data that our system collects through its interactions with students to estimate their performance on an end-of-year high-stakes state test. Our results show that we can predict student end-of-year exam scores reliably better by leveraging the interaction data, and that a model based only on the interaction information makes better predictions than the traditional assessment model that uses only information about correctness on the test items.
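As a rough illustration of the distinction drawn above between correctness data and effort data, the following Python sketch collapses one student's item-level interaction records into per-student predictors. The record schema (`ItemLog`), its field names, and the three summary features are assumptions made for illustration only; the abstract does not specify which effort measures the system records or how they enter the prediction model.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ItemLog:
    """One student's interaction with one practice item (hypothetical schema)."""
    correct_first_try: bool   # right/wrong signal used by a traditional test
    hints_requested: int      # assumed effort measure: hints asked for
    attempts: int             # assumed effort measure: answers submitted


def summarize(logs):
    """Collapse a student's item logs into per-student predictor features."""
    return {
        # Correctness-only information.
        "percent_correct": mean(1.0 if log.correct_first_try else 0.0 for log in logs),
        # Interaction information about how much assistance was needed.
        "avg_hints": mean(log.hints_requested for log in logs),
        "avg_attempts": mean(log.attempts for log in logs),
    }


# Made-up records for a single student, for illustration only.
logs = [
    ItemLog(correct_first_try=True, hints_requested=0, attempts=1),
    ItemLog(correct_first_try=False, hints_requested=2, attempts=3),
    ItemLog(correct_first_try=False, hints_requested=4, attempts=2),
    ItemLog(correct_first_try=True, hints_requested=0, attempts=1),
]
features = summarize(logs)
print(features)
```

In this sketch, a correctness-only model would see just `percent_correct`, whereas a model of the kind the abstract describes could also use features such as `avg_hints` and `avg_attempts` as predictors of the end-of-year score.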
References
Anozie, N., Junker, B.W.: Predicting end-of-year accountability assessment scores from monthly student records in an online tutoring system. In: Beck, J., Aimeur, E., Barnes, T. (eds.) Educational Data Mining: Papers from the AAAI Workshop, pp. 1–6. AAAI Press, Menlo Park, CA, Technical Report WS-06-05 (2006)
Ayers, E., Junker, B.W.: Do skills combine additively to predict task difficulty in eighth grade mathematics? In: Beck, J., Aimeur, E., Barnes, T. (eds.) Educational Data Mining: Papers from the AAAI Workshop, pp. 14–20. AAAI Press, Menlo Park, CA, Technical Report WS-06-05 (2006)
Baker, R.S., Corbett, A.T., Koedinger, K.R.: Detecting student misuse of intelligent tutoring systems. In: James, C.L., Vicari, R.M., Paraguacu, F. (eds.) Intelligent Tutoring Systems: 7th International Conference ITS 2004, Maceió, Alagoas, Brazil Proceedings, pp. 531–540. Springer-Verlag Berlin Heidelberg, Berlin, Germany (2004)
Baker, R.S., Roll, I., Corbett, A.T., Koedinger, K.R.: Do performance goals lead students to game the system? In: Proceedings of the 12th International Conference on Artificial Intelligence in Education, pp. 57–64. Amsterdam, The Netherlands (2005)
Beck, J.E., Sison, J.: Using knowledge tracing in a noisy environment to measure student reading proficiencies. Int. J. Artif. Intell. Educ. 16, 129–143 (2006)
Beck, J.E., Jia, P., Mostow, J.: Automatically assessing oral reading fluency in a computer tutor that listens. Technol. Instr. Cogn. Learn. 2, 61–81 (2004)
Black, P., Wiliam, D.: Assessment and classroom learning. Assess. Educ.: Princ., Policy Pract. 5, 7–74 (1998a)
Black, P., Wiliam, D.: Inside the black box: raising standards through classroom assessment. Phi Delta Kappan 80(2), 139–149 (1998b)
Boston, C.: The concept of formative assessment. Pract. Assess. Res. Eval. 8(9) (2002)
Campione, J.C., Brown, A.L., Bryant, R.J.: Individual differences in learning and memory. In: Sternberg, R.J. (eds.) Human Abilities: An Information-processing Approach, pp. 103–126. W. H. Freeman, New York (1985)
Corbett, A.T., Bhatnagar, A.: Student modeling in the ACT Programming Tutor: Adjusting a procedural learning model with declarative knowledge. In: User Modeling: Proceedings of the Sixth International Conference on User Modeling UM97, Chia Laguna, Sardinia, Italy, pp. 243–254. Springer-Verlag Wien, New York (1997)
Corbett, A.T., Anderson, J.R., O’Brien, A.T.: Student modeling in the ACT programming tutor. In: Nichols, P., Chipman, S., Brennan, R. (eds.) Cognitively Diagnostic Assessment. Erlbaum, Hillsdale, NJ (1995)
Computing Research Association.: Cyberinfrastructure for Education and Learning for the Future: a Vision and Research Agenda. Final report of Cyberlearning Workshop Series workshops held Fall 2004–Spring 2005 by the Computing Research Association and the International Society of the Learning Sciences. Retrieved from http://www.cra.org/reports/cyberinfrastructure.pdf on 10 November 2006 (2005)
Embretson, S.E.: Structured Rasch models for measuring individual-difference in learning and change. Int. J. Psychol. 27(3–4), 372–372 (1992)
Feng, M., Heffernan, N.T.: Towards live informing and automatic analyzing of student learning: Reporting in the assistment system. J. Interact. Learn. Res. 18(2), 207–230. AACE, Chesapeake, VA (2007)
Feng, M., Heffernan, N.T., Koedinger, K.R.: Addressing the testing challenge with a web-based e-assessment system that tutors as it assesses. In: Carr, L.A., De Roure, D.C., Iyengar, A., Goble, C.A., Dahlin, M. (eds.) Proceedings of the Fifteenth International World Wide Web Conference, pp. 307–316. Edinburgh UK, 2006. ACM Press, New York, NY (2006)
Feng, M., Heffernan, N., Beck, J., Koedinger, K.: Can we predict which groups of questions students will learn from? In: Baker, Beck (eds.) Proceedings of the First International Conference on Educational Data Mining, pp. 218–225. Montreal, Canada (2008)
Feng, M., Beck, J., Heffernan, N., Koedinger, K.: Can an intelligent tutoring system predict math proficiency as well as a standardized test? In: Baker, Beck (eds.) Proceedings of the First International Conference on Educational Data Mining, pp. 107–116. Montreal, Canada (2008)
Fischer, G., Seliger, E.: Multidimensional linear logistic models for change. Chap. 19. In: van der Linden, W.J., Hambleton, R.K. (eds.) Handbook of Modern Item Response Theory. Springer-Verlag, New York (1997)
Grigorenko, E.L., Sternberg, R.J.: Dynamic testing. Psychol. Bull. 124, 75–111 (1998)
Hulin, C.L., Lissak, R.I., Drasgow, F.: Recovery of two- and three-parameter logistic item characteristic curves: A Monte Carlo study. Appl. Psychol. Meas. 6(3), 249–260 (1982)
Jannarone, R.J.: Conjunctive item response theory kernels. Psychometrika 55(3), 357–373 (1986)
Koedinger, K.R., Aleven, V., Heffernan, N.T., McLaren, B., Hockenberry, M.: Opening the door to non-programmers: authoring intelligent tutor behavior by demonstration. In: Proceedings of the 7th International Conference on Intelligent Tutoring Systems, pp. 162–173. Maceio, Brazil (2004)
Massachusetts Department of Education.: Massachusetts Mathematics Curriculum Framework. Retrieved from http://www.doe.mass.edu/frameworks/math/2000/final.pdf, 6 November 2005 (2000)
MCAS technical report.: Retrieved from http://www.cs.wpi.edu/mfeng/pub/mcas_techrpt01.pdf, 5 August 2005 (2001)
Mitchell, T.: Machine Learning. McGraw-Hill, Columbus, OH (1997)
Mostow, J., Aist, G.: Evaluating tutors that listen: an overview of Project LISTEN. In: Feltovich, P. (eds.) Smart Machines in Education, pp. 169–234. MIT/AAAI Press, Menlo Park, CA (2001)
Olson, L.: State test programs mushroom as NCLB Mandate Kicks. In: Education Week, 20 November, pp. 10–14 (2004)
Olson, L.: Special report: testing takes off. Education Week, 30 November 2005, pp. 10–14 (2005)
Raftery, A.E.: Bayesian model selection in social research. Sociol. Methodol. 25, 111–163 (1995)
Razzaq, L., Heffernan, N.T.: Scaffolding vs. hints in the Assistment System. In: Ikeda, Ashley, Chan (eds.) Proceedings of the 8th International Conference on Intelligent Tutoring Systems, pp. 635–644. Springer-Verlag, Jhongli, Taiwan, Berlin, Germany (2006)
Razzaq, L., Feng, M., Nuzzo-Jones, G., Heffernan, N.T., Koedinger, K.R., Junker, B., Ritter, S., Knight, A., Aniszczyk, C., Choksey, S., Livak, T., Mercado, E., Turner, T.E., Upalekar, R., Walonoski, J.A., Macasek, M.A., Rasmussen, K.P.: The ASSISTment project: blending assessment and assisting. In: Proceedings of the 12th Annual Conference on Artificial Intelligence in Education. Amsterdam, The Netherlands, pp. 555–562. IOS Press, Amsterdam (2005)
Razzaq, L., Heffernan, N.T., Lindeman, R.W.: What level of tutor interaction is best? In: Luckin, Koedinger (eds.) Proceedings of the 13th Conference on Artificial Intelligence in Education, pp. 222–229. IOS Press, Los Angeles, CA, Amsterdam, The Netherlands (2007)
Roediger, H.L. III, Karpicke, J.D.: The power of testing memory. Perspect. Psychol. Sci. 1(3), 181–210 (2006)
Sternberg, R.J., Grigorenko, E.L.: All testing is dynamic testing. Issues Educ. 7, 137–170 (2001)
Sternberg, R.J., Grigorenko, E.L.: Dynamic Testing: The Nature and Measurement of Learning Potential. Cambridge University Press, Cambridge (2002)
Tan, E.S., Imbos, T., Does, R.J.M.: A distribution-free approach to comparing growth of knowledge. J. Educ. Measure. 31(1), 51–65 (1994)
Tatsuoka, K.K.: Rule space: an approach for dealing with misconceptions based on item response theory. J. Educ. Measure. 20, 345–354 (1983)
van der Linden, W.J., Hambleton, R.K. (eds.): Handbook of Modern Item Response Theory. Springer-Verlag, New York, NY (1997)
Walonoski, J., Heffernan, N.T.: Detection and analysis of off-task gaming behavior in intelligent tutoring systems. In: Ikeda, Ashley, Chan (eds.) Proceedings of the 8th International Conference on Intelligent Tutoring Systems, Jhongli, Taiwan, pp. 382–391. Springer-Verlag, Berlin (2006)
Zimowski, M., Muraki, E., Mislevy, R., Bock, D.: BILOG-MG 3: Multiple-Group IRT Analysis and Test Maintenance for Binary Items. Scientific Software International, Inc., Lincolnwood, IL. URL http://www.ssicentral.com/ (2005)
About this article
Cite this article
Feng, M., Heffernan, N. & Koedinger, K. Addressing the assessment challenge with an online system that tutors as it assesses. User Model User-Adap Inter 19, 243–266 (2009). https://doi.org/10.1007/s11257-009-9063-7
DOI: https://doi.org/10.1007/s11257-009-9063-7