Abstract
Cross-validation is an important evaluation strategy in behavioral predictive modeling; without it, estimates of a predictive model's accuracy are likely to be overly optimistic. Statistical methods have been developed that allow researchers to straightforwardly cross-validate predictive models by using the same data employed to construct the model. In the present study, cross-validation techniques were used to construct several decision-tree models with data from the MacArthur Violence Risk Assessment Study (Monahan et al., 2001). The models were then compared with the original (non-cross-validated) Classification of Violence Risk assessment tool. The results show that the measures of predictive model accuracy (AUC, misclassification error, sensitivity, specificity, positive and negative predictive values) degrade considerably when applied to a testing sample, compared with the training sample used to fit the model initially. In addition, unless false negatives (that is, incorrectly predicting individuals to be nonviolent) are considered more costly than false positives (that is, incorrectly predicting individuals to be violent), the models generally make few predictions of violence. The results suggest that employing cross-validation when constructing models can make an important contribution to increasing the reliability and replicability of psychological research.
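The link between misclassification costs and the number of violence predictions can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a model that outputs a probability of violence for each individual and applies the standard expected-cost rule, predicting violence when that probability exceeds cost_fp / (cost_fp + cost_fn). The function names, the cost values, and the ten probabilities and outcomes below are all hypothetical and serve only to show how the accuracy measures named in the abstract shift when false negatives are weighted more heavily.

```python
def confusion_counts(actual, predicted):
    """Count outcomes, coding 1 = violent and 0 = nonviolent."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def _ratio(numerator, denominator):
    """Return NaN instead of raising when a measure is undefined."""
    return numerator / denominator if denominator else float("nan")


def accuracy_measures(actual, predicted):
    """Compute the classification measures listed in the abstract (except AUC)."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return {
        "misclassification_error": _ratio(fp + fn, tp + fp + fn + tn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def cost_threshold(cost_fp, cost_fn):
    """Probability cutoff that minimizes expected cost under the assumed rule."""
    return cost_fp / (cost_fp + cost_fn)


def classify(probabilities, cost_fp=1.0, cost_fn=1.0):
    """Predict violence (1) when the probability exceeds the cost-based cutoff."""
    cutoff = cost_threshold(cost_fp, cost_fn)
    return [1 if p > cutoff else 0 for p in probabilities]


# Hypothetical outcomes and model probabilities (illustrative only).
actual = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
probs = [0.45, 0.10, 0.30, 0.60, 0.20, 0.35, 0.05, 0.25, 0.40, 0.15]

equal_cost = classify(probs)                         # cutoff 0.5
fn_costlier = classify(probs, cost_fp=1.0, cost_fn=3.0)  # cutoff 0.25

print(sum(equal_cost), accuracy_measures(actual, equal_cost))
print(sum(fn_costlier), accuracy_measures(actual, fn_costlier))
```

With equal costs, only one of the ten hypothetical individuals is predicted to be violent; tripling the cost of a false negative lowers the cutoff and raises that count to five, trading specificity for sensitivity. To mirror the abstract's main point, the same measures would be computed separately on a training sample and on a held-out testing sample.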
References
BANKS, S., ROBBINS, P.C., SILVER, E., VESSELINOV, R., STEADMAN, H.J., MONAHAN, J., and ROTH, L.H. (2004), “A Multiple-Models Approach to Violence Risk Assessment Among People With Mental Disorder”, Criminal Justice and Behavior, 31, 324–340.
BERK, R. (2011), “Asymmetric Loss Functions for Forecasting in Criminal Justice Settings”, Journal of Quantitative Criminology, 27, 107–123.
BERK, R. (2012), Criminal Justice Forecasts of Risk: A Machine Learning Approach, New York, NY: Springer.
BREIMAN, L. (1996), “Bagging Predictors”, Machine Learning, 26, 123–140.
BREIMAN, L. (2001), “Random Forests”, Machine Learning, 45, 5–32.
BREIMAN, L., FRIEDMAN, J.H., OLSHEN, R.A., and STONE, C.J. (1984), Classification and Regression Trees, Belmont, CA: Wadsworth and Brooks.
BREIMAN, L., and SPECTOR, P. (1992), “Submodel Selection and Evaluation in Regression. The X-Random Case”, International Statistical Review, 291–319.
DOYLE, M., SHAW, J., CARTER, S., and DOLAN, M. (2010), “Investigating the Validity of the Classification of Violence Risk in a UK Sample”, International Journal of Forensic Mental Health, 9, 316–323.
FERNÁNDEZ-DELGADO, M., CERNADAS, E., BARRO, S., and AMORIM, D. (2014), “Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?”, The Journal of Machine Learning Research, 15, 3133–3181.
GARDNER, W., LIDZ, C.W., MULVEY, E.P., and SHAW, E.C. (1996), “A Comparison of Actuarial Methods for Identifying Repetitively Violent Patients with Mental Illnesses”, Law and Human Behavior, 20, 35–48.
GINI, C. (1912), Variability and Mutability: Contribution to the Study of Distributions and Report Statistics, Bologna, Italy: C. Cuppini.
HARE, R.D. (1980), “A Research Scale for the Assessment of Psychopathy in Criminal Populations”, Personality and Individual Differences, 1, 111–119.
HARRIS, G.T., and RICE, M.E. (2013), “Bayes and Base Rates: What is an Informative Prior for Actuarial Violence Risk Assessment?”, Behavioral Sciences and the Law, 31, 103–124.
HASTIE, T., TIBSHIRANI, R., and FRIEDMAN, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.), New York, NY: Springer.
JAMES, G., WITTEN, D., HASTIE, T., and TIBSHIRANI, R. (2013), An Introduction to Statistical Learning, New York, NY: Springer.
KUHN, M., and JOHNSON, K. (2013), Applied Predictive Modeling, New York, NY: Springer.
MCCUSKER, P.J. (2007), “Issues Regarding the Clinical Use of the Classification of Violence Risk (COVR) Assessment Instrument”, International Journal of Offender Therapy and Comparative Criminology, 51, 676–685.
MCDERMOTT, B.E., DUALAN, I.V., and SCOTT, C.L. (2011), “The Predictive Ability of the Classification of Violence Risk (COVR) in a Forensic Psychiatric Hospital”, Psychiatric Services, 62, 430–433.
MEEHL, P.E., and ROSEN, A. (1955), “Antecedent Probability and the Efficiency of Psychometric Signs, Patterns, or Cutting Scores”, Psychological Bulletin, 52, 194–215.
MONAHAN, J., STEADMAN, H.J., APPELBAUM, P.S., GRISSO, T., MULVEY, E.P., ROTH, L.H., and SILVER, E. (2006), “The Classification of Violence Risk”, Behavioral Sciences and the Law, 24, 721–730.
MONAHAN, J., STEADMAN, H.J., ROBBINS, P.C., APPELBAUM, P.S., BANKS, S., GRISSO, T., and SILVER, E. (2005), “An Actuarial Model of Violence Risk Assessment for Persons with Mental Disorders”, Psychiatric Services, 56, 810–815.
MONAHAN, J., STEADMAN, H.J., ROBBINS, P.C., SILVER, E., APPELBAUM, P.S., GRISSO, T., and ROTH, L.H. (2000), “Developing a Clinically Useful Actuarial Tool for Assessing Violence Risk”, The British Journal of Psychiatry, 176, 312–319.
MONAHAN, J., STEADMAN, H.J., SILVER, E., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2001), Rethinking Risk Assessment: The MacArthur Study of Mental Disorder and Violence, New York, NY: Oxford University Press.
MOSSMAN, D. (2006), “Critique of Pure Risk Assessment or, Kant Meets Tarasoff”, University of Cincinnati Law Review, 75, 523–609.
MOSSMAN, D. (2013), “Evaluating Risk Assessments Using Receiver Operating Characteristic Analysis: Rationale, Advantages, Insights, and Limitations”, Behavioral Sciences and the Law, 31, 23–39.
PASHLER, H., and WAGENMAKERS, E.J. (2012), “Editors’ Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence?”, Perspectives on Psychological Science, 7, 528–530.
POLLACK, I., and NORMAN, D.A. (1964), “A Non-Parametric Analysis of Recognition Experiments”, Psychonomic Science, 1, 125–126.
R CORE TEAM (2014), R: A Language and Environment for Statistical Computing (Version 3.1.1), Vienna, Austria, http://www.R-project.org/.
ROBERTS, S., and PASHLER, H. (2000), “How Persuasive is a Good Fit? A Comment on Theory Testing”, Psychological Review, 107, 358–367.
SNOWDEN, R.J., GRAY, N.S., TAYLOR, J., and FITZGERALD, S. (2009), “Assessing Risk of Future Violence Among Forensic Psychiatric Inpatients with the Classification of Violence Risk (COVR)”, Psychiatric Services, 60, 1522–1526.
SPSS, INC. (1993), SPSS for Windows (Release 6.0), Chicago, IL: SPSS, Inc.
STEADMAN, H.J., SILVER, E., MONAHAN, J., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2000), “A Classification Tree Approach to the Development of Actuarial Violence Risk Assessment Tools”, Law and Human Behavior, 24, 83–100.
STURUP, J., KRISTIANSSON, M., and LINDQVIST, P. (2011), “Violent Behaviour by General Psychiatric Patients in Sweden: Validation of Classification of Violence Risk (COVR) Software”, Psychiatry Research, 188, 161–165.
VRIEZE, S.I., and GROVE, W.M. (2008), “Predicting Sex Offender Recidivism. I. Correcting for Item Overselection and Accuracy Overestimation in Scale Development. II. Sampling Error-Induced Attenuation of Predictive Validity over Base Rate Information”, Law and Human Behavior, 32, 266–278.
Additional information
Ehsan Bokhari is now a Senior Analyst with the Los Angeles Dodgers in Los Angeles, California.
Cite this article
Bokhari, E., Hubert, L. The Lack of Cross-Validation Can Lead to Inflated Results and Spurious Conclusions: A Re-Analysis of the MacArthur Violence Risk Assessment Study. J Classif 35, 147–171 (2018). https://doi.org/10.1007/s00357-018-9252-3
DOI: https://doi.org/10.1007/s00357-018-9252-3