New Sources of Information in Computerized Testing | Vestnik Tomskogo gosudarstvennogo universiteta – Tomsk State University Journal. 2021. № 465. DOI: 10.17223/15617793/465/24

New Sources of Information in Computerized Testing

Traditionally, psychometrics has been concerned with theory-based information about human behavior: indicators of the targeted construct, such as item responses, performance assessment products, etc. However, over the past forty years, advances in psychometric modeling and the development of information technologies have allowed for the analysis of so-called collateral information. This information is not theory-based and is easy to collect in computerized testing. Most importantly, collateral information is intended solely to increase the reliability of measurements while preserving the construct's original interpretation. This article distinguishes between target and collateral information gathered during computerized testing. A carefully crafted measurement model is required to properly process collateral information along with target information. Social scientists usually choose Item Response Theory (IRT) models as such measurement models because of their clear interpretation, which facilitates discussing measurement results in the terms of the social sciences. Since the choice of the correct IRT model is crucial for preserving the original interpretation of the parameter estimates, the classification of such models can be used to describe sources of collateral information systematically. This article introduces a classification of sources of collateral information based on the type of data they describe: (i) collateral information about respondents, (ii) collateral information about items, and (iii) collateral information about interactions between respondents and items. The latter type of collateral information is particularly intriguing: it typically includes item response times, response strategies, action log data, gaze data, and other types of process data. In addition to IRT modeling, process mining and sequential pattern mining are presented as examples of approaches to analyzing collateral information. The article illustrates the use of collateral information in educational psychometrics with a recent literature review. We describe cases where the choice of the measurement model changes the interpretation of the IRT parameter estimates and thereby violates the conditions that define collateral information; these cases include both large- and small-scale educational and psychological research. We also highlight the most illustrative cases of using collateral information in modern psychometric practice with regard to its source and the IRT model used to process it. Moreover, we demonstrate that using these new sources of information in computerized testing contributes to developing evidence-based pedagogical practices and makes their application more manageable. Directions for future research in the area of collateral information in psychometrics are provided.
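To make the classification above concrete, the sketch below writes the three types of collateral information as extensions of a baseline Rasch model. The notation is ours rather than the article's; the components correspond to standard models from the explanatory IRT and speed-accuracy literature listed in the references (latent regression conditioning, Fischer's LLTM, van der Linden's hierarchical framework).

```latex
% Baseline Rasch model for the response X_{pi} of respondent p to item i:
\[
P(X_{pi} = 1 \mid \theta_p, \delta_i) = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)}
\]

% (i) Collateral information about respondents: a latent regression of the
% ability \theta_p on background variables Z_p (the "conditioning" used in
% international large-scale assessments):
\[
\theta_p = \mathbf{Z}_p^{\top}\boldsymbol{\gamma} + \varepsilon_p,
\qquad \varepsilon_p \sim \mathcal{N}(0, \sigma^2)
\]

% (ii) Collateral information about items: the LLTM decomposes the difficulty
% \delta_i into effects \eta_k of item design features q_{ik}:
\[
\delta_i = \sum_{k} q_{ik}\,\eta_k
\]

% (iii) Collateral information about interactions: a lognormal model for the
% response time T_{pi}, linked to the response model by a joint distribution
% of ability and speed (the hierarchical speed-accuracy framework):
\[
\ln T_{pi} = \beta_i - \tau_p + \epsilon_{pi},
\qquad (\theta_p, \tau_p)^{\top} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})
\]
```

In the same spirit, sequential pattern mining of log data can be illustrated with a minimal, self-contained sketch of n-gram counting, similar in flavor to the n-gram approach of He and von Davier. The action labels and the log below are hypothetical; in practice, such n-gram counts would serve as collateral features describing the respondent-item interaction.

```python
from collections import Counter

def ngrams(actions, n):
    """Return all contiguous sub-sequences (n-grams) of an action log."""
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]

# Hypothetical log of one respondent's actions on a single interactive item.
log = ["start", "open_tab", "drag", "drag", "check", "open_tab", "drag", "check", "submit"]

# Bigram counts summarize the interaction and can be used as collateral features.
bigram_counts = Counter(ngrams(log, 2))
print(bigram_counts.most_common(3))
```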


Keywords

collateral information, computerized testing, item response theory, models with latent variables, psychometrics

Authors

Name | Organization | E-mail
Federiakin Denis A. | Higher School of Economics | dafederiakin@hse.ru
Uglanova Irina L. | Higher School of Economics | iuglanova@hse.ru
Skryabin Maksim A. | Higher School of Economics | maxim.skryabin@gmail.com

References

Stoeffler K., Rosen Y., Bolsinova M., von Davier A.A. Gamified performance assessment of collaborative problem solving skills // Computers in Human Behavior. 2020. № 104, Article 106136.
Shute V.J., Wang L., Greiff S., Zhao W., Moore G. Measuring problem solving skills via stealth assessment in an engaging video game // Computers in Human Behavior. 2016. № 63. P. 106-117. DOI: 10.1016/j.chb.2016.05.047
Shute V.J. Stealth assessment in computer-based games to support learning // Computer games and instruction. 2011. № 55 (2). P. 503-524.
Toth K., Rolke H., Goldhammer F., Barkow I. Educational process mining: New possibilities for understanding students' problem-solving skills // The Nature of Problem Solving: Using Research to Inspire 21st Century Learning / B. Csapo, J. Funke (eds.). Paris : OECD Publishing, 2017. P. 193-209. DOI: 10.1787/9789264273955-14-en
Begicheva A.A., Lomazov I.A. Discovering high-level process models from event logs // Modelirovanie i analiz informatsionnykh sistem – Modeling and Analysis of Information Systems. 2017. № 24 (2). P. 125-140. DOI: 10.18255/1818-1015-2017-2-125-140
He Q., von Davier M. Analyzing Process Data from Problem-Solving Items with N-Grams. Insights from a computer-based large-scale assessment // Handbook of Research on Technology Tools for Real-World Skill Development, Information Science Reference / Y. Rosen, S. Ferrara, M. Mosharraf (eds.). Hershey, PA : IGI Global, 2016. P. 750-777. DOI: 10.4018/978-1-4666-9441-5.ch029
Guo H., Deane P.D., van Rijn P.W., Zhang M., Bennett R.E. Modeling Basic Writing Processes from Keystroke Logs // Journal of Educational Measurement. 2018. № 55 (2). P. 194-216. DOI: 10.1111/jedm.12172
Chen Y., Li X., Liu J., Ying Z. Statistical Analysis of Complex Problem-Solving Process Data: an Event History Analysis Approach // Frontiers in Psychology. 2019. № 10. P. 1-10. DOI: 10.3389/fpsyg.2019.00486
Liao D., He Q., Jiao H. Mapping Background Variables with Sequential Patterns in Problem-Solving Environments: an Investigation of United States Adults' Employment Status in PIAAC // Frontiers in Psychology. 2019. № 10. P. 1-32. DOI: 10.3389/fpsyg.2019.00646
Lee Y. How to Make an Assessment More Informative and Interpretable Using the Ordered Partition Model // Journal of Curriculum and Evaluation. 2011. № 14. P. 333-361. DOI: 10.29221/jce.2011.14.3.333
Wilson M. The ordered partition model: an extension of the partial credit model // Applied Psychological Measurement. 1992. № 16 (4). P. 309-325. DOI: 10.1177/014662169201600401
Kuravsky L.S., Yuryev G.A., Ushakov D.V., Yuryeva N.E., Valueva E.A., Lapteva E.M. Diagnostics based on test trajectories: the pattern method // Eksperimental'naya psikhologiya – Experimental Psychology (Russia). 2018. № 11 (2). P. 77-94. DOI: 10.17759/exppsy.2018110206
Kuravsky L.S., Artemenkov S.L., Yuryev G.A., Grigorenko E.L. A new approach to computerized adaptive testing // Eksperimental'naya psikhologiya – Experimental Psychology (Russia). 2017. № 10 (3). P. 33-45. DOI: 10.17759/exppsy.2017100303
Goldhammer F. Measuring ability, speed, or both? Challenges, psychometric solutions, and what can be gained from experimental control // Measurement: interdisciplinary research and perspectives. 2015. № 13 (3-4). P. 133-164. DOI: 10.1080/15366367.2015.1100020
Molenaar D., Tuerlinckx F., van der Maas H.L. A bivariate generalized linear item response theory modeling framework to the analysis of responses and response times // Multivariate Behavioral Research. 2015. № 50 (1). P. 56-74. DOI: 10.1080/00273171.2014.962684
Molenaar D., Tuerlinckx F., van der Maas H.L. A generalized linear factor model approach to the hierarchical framework for responses and response times // British Journal of Mathematical and Statistical Psychology. 2015. № 68. P. 197-219. DOI: 10.1111/bmsp.12042
van der Linden W.J. A hierarchical framework for modeling speed and accuracy on test items // Psychometrika. 2007. № 72 (3). P. 287-308.
Thorndike E.L., Bregman E.O., Cobb M.V., Woodyard E. The measurement of intelligence. New York : Teachers College Bureau of Publications, 1926. 616 p.
Klein P., Kuchemann S., Bruckner S., Zlatkin-Troitschanskaia O., Kuhn J. Student understanding of graph slope and area under a curve: a replication study comparing first-year physics and economics students // Physical Review Physics Education Research. 2019. № 15. P. 1-17. DOI: 10.1103/PhysRevPhysEducRes.15.020116
He Q., von Davier M., Han Z. Exploring Process Data in Computer-based International Large-scale Assessments // Data analytics and psychometrics: informing assessment practices / H. Jiao, R. Lissitz, A. van Wie (eds.). Charlotte, NC : Information Age Publishing, 2018. P. 53-76.
Goldhammer F., Naumann J., Kessel Y. Assessing Individual differences in Basic Computer Skills: Psychometric characteristics of an interactive performance measure // European Journal of Psychological Assessment. 2013. № 29. P. 263-275. DOI: 10.1027/1015-5759/a000153
Dobria L. Longitudinal Rater Modeling with Splines : Doctoral dissertation. Chicago, IL : University of Illinois, 2011.
Schaefer E. Rater bias patterns in an EFL writing assessment // Language Testing. 2008. № 25 (4). P. 465-493. DOI: 10.1177/0265532208094273
Myford C.M., Wolfe E.W. Detecting and measuring rater effects using many-facet Rasch measurement: Part I // Journal of Applied Measurement. 2003. № 4 (4). P. 386-422.
Myford C.M., Wolfe E.W. Detecting and measuring rater effects using many-facet Rasch measurement: Part II // Journal of Applied Measurement. 2004. № 5 (2). P. 189-227.
Eckes T. Introduction to Many-Facet Rasch Measurement. Analyzing and Evaluating Rater-Mediated Assessments. Berlin : Peter Lang GmbH, Internationaler Verlag der Wissenschaften, 2015. 241 p. DOI: 10.3726/978-3-653-04844-5
Rolfes T., Roth J., Schnotz W. Effects of tables, bar charts, and graphs on solving function tasks // Journal Fur Mathematik-Didaktik. 2018. № 39 (1). P. 97-125. DOI: 10.1007/s13138-017-0124-x
Hahne J. Analyzing position effects within reasoning items using the LLTM for structurally incomplete data // Psychology Science Quarterly. 2008. № 50. P. 379-390.
Sonnleitner P. Using the LLTM to evaluate an item-generating system for reading comprehension // Psychology Science Quarterly. 2008. № 50 (3). P. 345-362.
Baghaei P., Kubinger K.D. Linear Logistic Test Modeling with R // Practical Assessment, Research & Evaluation. 2015. № 20. Article 1. DOI: 10.7275/8f33-hz58
Baker F.B. EQUATE 2.0: a computer program for the characteristic curve method of IRT equating // Applied Psychological Measurement. 1993. № 17 (1). DOI: 10.1177/014662169301700105
De Boeck P. Random item IRT models // Psychometrika. 2008. № 73 (4). P. 533-559. DOI: 10.1007/s11336-008-9092-x
Fischer G.H. Logistic latent trait models with linear constraints // Psychometrika. 1983. № 48 (1). P. 3-26. DOI: 10.1007/BF02314674
Fischer G.H., Formann A.K. Some applications of logistic latent trait models with linear constraints on the parameters // Applied Psychological Measurement. 1982. № 6 (4). P. 397-416. DOI: 10.1177/014662168200600403
Wilson M., Zheng X., McGuire L. Formulating latent growth using an explanatory item response model approach // Journal of Applied Measurement. 2012. № 13 (1). P. 1-22.
Andersen E.B. Estimating latent correlations between repeated testings // Psychometrika. 1985. № 50. P. 3-16. DOI: 10.1007/BF02294143
Embretson S.E. A multidimensional latent trait model for measuring learning and change // Psychometrika. 1991. № 56. P. 495-515. DOI: 10.1007/BF02294487
Gonzalez J., Wiberg M. Applying test equating methods using R. New York : Springer, 2017. 196 p. DOI: 10.1007/978-3-319-51824-4
Bock R.D., Mislevy R.J. Adaptive EAP estimation of ability in a microcomputer environment // Applied psychological measurement. 1982. № 6(4). P. 431-444. DOI: 10.1177/014662168200600405
OECD. PISA 2018 Results (Vol. I): What Students Know and Can Do. Organisation for Economic Co-operation and Development (OECD) Publishing, 2020. DOI: 10.1787/19963777
Wu M., Tam H.P., Jen T.H. Multidimensional IRT Models // Educational measurement for applied researchers. Theory into practice. Singapore : Springer, 2016. P. 283-296. DOI: 10.1007/978-981-10-3302-5
Wilson M., Gochyyev P. Having your cake and eating it too: Multiple dimensions and a composite // Measurement. 2020. № 151, Article 107247. DOI: 10.1016/j.measurement.2019.107247
Markus K.A., Borsboom D. Frontiers of test validity theory: measurement, causation, and meaning. New York : Routledge, 2013. DOI: 10.4324/9780203501207
Foy P., Yin L. Scaling the TIMSS 2015 Achievement Data // Methods and procedures in TIMSS 2015 / M.O. Martin, I.V. Mullis, M. Hooper (eds.); TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College and International Association for the Evaluation of Educational Achievement (IEA). Boston, 2016. P. 13.1-13.62.
Explanatory item response models: a generalized linear and nonlinear approach / P. De Boeck, M. Wilson (eds.). New York : Springer Science & Business Media, 2004. DOI: 10.1007/978-1-4757-3990-9
Handbook of item response theory / W.J. van der Linden (ed.). CRC Press, 2016. Vol. 1: Models. DOI: 10.1201/9781315374512
Wang W., Chen P., Cheng Y. Improving measurement precision of test batteries using multidimensional item response models // Psychological Methods. 2004. № 9 (1). P. 116-136. DOI: 10.1037/1082-989X.9.1.116
De Boeck P., Bakker M., Zwitser R., Nivard M., Hofman A., Tuerlinckx F., Partchev I. The estimation of item response models with the lmer function from the lme4 package in R // Journal of Statistical Software. 2011. № 39 (12). P. 1-28. DOI: 10.18637/jss.v039.i12
Bennett R.E., Goodman M., Hessinger J., Kahn H., Ligget J., Marshall G., Zack J. Using multimedia in large-scale computer-based testing programs // Computers in Human Behavior. 1999. № 15 (3-4). P. 283-294. DOI: 10.1016/S0747-5632(99)00024-2
Polyak S.T., von Davier A.A., Peterschmidt K. Computational psychometrics for the measurement of collaborative problem solving skills // Frontiers in psychology. 2017. № 8. P. 1-16. DOI: 10.3389/fpsyg.2017.02029
Goldhammer F., Zehner F. What to make of and how to interpret process data // Measurement: Interdisciplinary Research and Perspectives. 2017. № 15 (3-4). P. 128-132. DOI: 10.1080/15366367.2017.1411651
Csapo B., Ainley J., Bennett R.E., Latour T., Law N. Technological issues for computer-based assessment // Assessment and teaching of 21st century skills / P. Griffin, B. McGaw, E. Care (eds.). Dordrecht : Springer, 2012. P. 143-230. DOI: 10.1007/978-3-319-65368-6
DiCerbo K., Shute V., Kim Y.J. The future of assessment in technology rich environments: Psychometric considerations // Learning, design, and technology: An international compendium of theory, research, practice, and policy / M. Spector, B.B. Lockee, M.D. Childress (eds.). Switzerland AG : Springer Nature, 2017. P. 1-21. DOI: 10.1007/978-3-319-17727-4_66-1
Qiao X., Jiao H. Data mining techniques in analyzing process data: a didactic // Frontiers in psychology. 2018. № 9. P. 1-11. DOI: 10.3389/fpsyg.2018.02231
Ulinskas M., Damasevicius R., Maskeliunas R., Wozniak M. Recognition of human daytime fatigue using keystroke data // Procedia computer science. 2018. № 130. P. 947-952.
Bylieva D., Lobatyuk V., Safonova A., Rubtsova A. Correlation between the Practical Aspect of the Course and the E-Learning Progress // Education Sciences. 2019. № 9 (3), Article 167. P. 1-14. DOI: 10.3390/educsci9030167
De Boeck P., Jeon M. An overview of models for response times and processes in cognitive tests // Frontiers in psychology. 2019. № 10. P. 1-11. DOI: 10.3389/fpsyg.2019.00102
Vlug K.F.M. Because every pupil counts: the success of the pupil monitoring system in The Netherlands // Education and Information Technologies. 1997. Vol. 2, № 4. P. 287-306.
Unt I.E. Individualization and Differentiation of Instruction. Moscow : Pedagogika, 1990. 190 p.
Avdeeva S.M., Rudnev M.G., Vasin G.M., Tarasova K.V., Panova D.M. Assessing students' information and communication competence: approaches, instrument, validity and reliability of results // Voprosy obrazovaniya – Educational Studies Moscow. 2017. № 4. P. 104-132. DOI: 10.17323/1814-9545-2017-4-104-132
He Q., von Davier M., Greiff S., Steinhauer E.W., Borysewicz P.B. Collaborative Problem-Solving Measures in the Programme for International Student Assessment (PISA) // Innovative assessment of collaboration / A.A. von Davier, M. Zhu, P.C. Kyllonen (eds.). Cham : Springer, 2017. P. 95-111. DOI: 10.1007/978-3-319-33261-1_7
Lee A.T. Flight simulation: virtual environments in aviation. London : Routledge, 2017. DOI: 10.4324/9781315255217
Kyllonen P. New constructs, methods and directions for computer-based assessment // The transition to computer-based assessment / F. Scheuermann, J. Bjornsson (eds.). Luxembourg : Office for Official Publications of the European Communities, 2009. P. 151-156. DOI: 10.2788/60083