The Lack of Cross-Validation Can Lead to Inflated Results and Spurious Conclusions: A Re-Analysis of the MacArthur Violence Risk Assessment Study

Bokhari, Ehsan; Hubert, Lawrence

doi:10.1007/s00357-018-9252-3

Published: 26 March 2018

Volume 35, pages 147–171 (2018)

Journal of Classification

Ehsan Bokhari¹,² & Lawrence Hubert¹

Abstract

Cross-validation is an important evaluation strategy in behavioral predictive modeling; without it, a predictive model is likely to be overly optimistic. Statistical methods have been developed that allow researchers to straightforwardly cross-validate predictive models by using the same data employed to construct the model. In the present study, cross-validation techniques were used to construct several decision-tree models with data from the MacArthur Violence Risk Assessment Study (Monahan et al., 2001). The models were then compared with the original (non-cross-validated) Classification of Violence Risk assessment tool. The results show that the measures of predictive model accuracy (AUC, misclassification error, sensitivity, specificity, positive and negative predictive values) degrade considerably when applied to a testing sample, compared with the training sample used to fit the model initially. In addition, unless false negatives (that is, incorrectly predicting individuals to be nonviolent) are considered more costly than false positives (that is, incorrectly predicting individuals to be violent), the models generally make few predictions of violence. The results suggest that employing cross-validation when constructing models can make an important contribution to increasing the reliability and replicability of psychological research.
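
The point about asymmetric costs can be made concrete with a minimal sketch. It is not the authors' decision-tree procedure: the outcomes and predicted probabilities below are made-up illustrative values, the names `tabulate` and `summarize` are hypothetical, and the classification rule is the standard expected-cost threshold (predict violence when the predicted probability exceeds cost_fp / (cost_fp + cost_fn)), assumed here for illustration. The sketch tabulates the accuracy measures listed in the abstract under equal costs and under false negatives costing twice as much as false positives.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int = 0  # violent, predicted violent
    fp: int = 0  # nonviolent, predicted violent (false positive)
    tn: int = 0  # nonviolent, predicted nonviolent
    fn: int = 0  # violent, predicted nonviolent (false negative)


def tabulate(outcomes, probs, cost_fn=1.0, cost_fp=1.0):
    """Classify each case with an expected-cost threshold and count the results."""
    # Predicting "violent" has lower expected cost when
    # p * cost_fn > (1 - p) * cost_fp, i.e. p > cost_fp / (cost_fp + cost_fn).
    threshold = cost_fp / (cost_fp + cost_fn)
    c = Confusion()
    for y, p in zip(outcomes, probs):
        predicted_violent = p > threshold
        if predicted_violent and y == 1:
            c.tp += 1
        elif predicted_violent:
            c.fp += 1
        elif y == 1:
            c.fn += 1
        else:
            c.tn += 1
    return c


def ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def summarize(c):
    """Compute the accuracy measures named in the abstract (AUC omitted)."""
    n = c.tp + c.fp + c.tn + c.fn
    return {
        "misclassification": ratio(c.fp + c.fn, n),
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Hypothetical held-out cases: 1 = violent, 0 = nonviolent (base rate 30%).
outcomes = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted probabilities of violence for the same cases.
probs = [0.45, 0.10, 0.20, 0.30, 0.35, 0.05, 0.15, 0.40, 0.25, 0.60]

equal_costs = tabulate(outcomes, probs)               # threshold 0.5
fn_twice_fp = tabulate(outcomes, probs, cost_fn=2.0)  # threshold 1/3

print(equal_costs, summarize(equal_costs))
print(fn_twice_fp, summarize(fn_twice_fp))
```

With these illustrative numbers, equal costs flag a single case as violent, while doubling the cost of a false negative flags four. In a real analysis the probabilities would come from a model fit on training folds and be scored on held-out folds, so that the measures are not inflated by reuse of the fitting data.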

References

BANKS, S., ROBBINS, P.C., SILVER, E., VESSELINOV, R., STEADMAN, H.J., MONAHAN, J., and ROTH, L.H. (2004), “A Multiple-Models Approach to Violence Risk Assessment Among People With Mental Disorder”, Criminal Justice and Behavior, 31, 324–340.
BERK, R. (2011), “Asymmetric Loss Functions for Forecasting in Criminal Justice Settings”, Journal of Quantitative Criminology, 27, 107–123.
BERK, R. (2012), Criminal Justice Forecasts of Risk: A Machine Learning Approach, New York, NY: Springer.
BREIMAN, L. (1996), “Bagging Predictors”, Machine Learning, 26, 123–140.
BREIMAN, L. (2001), “Random Forests”, Machine Learning, 45, 5–32.
BREIMAN, L., FRIEDMAN, J.H., OLSHEN, R.A., and STONE, C.J. (1984), Classification and Regression Trees, Belmont, CA: Wadsworth and Brooks.
BREIMAN, L., and SPECTOR, P. (1992), “Submodel Selection and Evaluation in Regression. The X-Random Case”, International Statistical Review, 291–319.
DOYLE, M., SHAW, J., CARTER, S., and DOLAN, M. (2010), “Investigating the Validity of the Classification of Violence Risk in a UK Sample”, International Journal of Forensic Mental Health, 9, 316–323.
FERNÁNDEZ-DELGADO, M., CERNADAS, E., BARRO, S., and AMORIM, D. (2014), “Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?”, The Journal of Machine Learning Research, 15, 3133–3181.
GARDNER, W., LIDZ, C.W., MULVEY, E.P., and SHAW, E.C. (1996), “A Comparison of Actuarial Methods for Identifying Repetitively Violent Patients with Mental Illnesses”, Law and Human Behavior, 20, 35–48.
GINI, C. (1912), Variability and Mutability: Contribution to the Study of Distributions and Report Statistics, Bologna, Italy: C. Cuppini.
HARE, R.D. (1980), “A Research Scale for the Assessment of Psychopathy in Criminal Populations”, Personality and Individual Differences, 1, 111–119.
HARRIS, G.T., and RICE, M.E. (2013), “Bayes and Base Rates: What is an Informative Prior for Actuarial Violence Risk Assessment?”, Behavioral Sciences and the Law, 31, 103–124.
HASTIE, T., TIBSHIRANI, R., and FRIEDMAN, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.), New York, NY: Springer.
JAMES, G., WITTEN, D., HASTIE, T., and TIBSHIRANI, R. (2013), An Introduction to Statistical Learning, New York, NY: Springer.
KUHN, M., and JOHNSON, K. (2013), Applied Predictive Modeling, New York, NY: Springer.
MCCUSKER, P.J. (2007), “Issues Regarding the Clinical Use of the Classification of Violence Risk (COVR) Assessment Instrument”, International Journal of Offender Therapy and Comparative Criminology, 51, 676–685.
MCDERMOTT, B.E., DUALAN, I.V., and SCOTT, C.L. (2011), “The Predictive Ability of the Classification of Violence Risk (COVR) in a Forensic Psychiatric Hospital”, Psychiatric Services, 62, 430–433.
MEEHL, P.E., and ROSEN, A. (1955), “Antecedent Probability and the Efficiency of Psychometric Signs, Patterns, or Cutting Scores”, Psychological Bulletin, 52, 194–215.
MONAHAN, J., STEADMAN, H.J., APPELBAUM, P.S., GRISSO, T., MULVEY, E.P., ROTH, L.H., and SILVER, E. (2006), “The Classification of Violence Risk”, Behavioral Sciences and the Law, 24, 721–730.
MONAHAN, J., STEADMAN, H.J., ROBBINS, P.C., APPELBAUM, P.S., BANKS, S., GRISSO, T., and SILVER, E. (2005), “An Actuarial Model of Violence Risk Assessment for Persons with Mental Disorders”, Psychiatric Services, 56, 810–815.
MONAHAN, J., STEADMAN, H.J., ROBBINS, P.C., SILVER, E., APPELBAUM, P.S., GRISSO, T., and ROTH, L.H. (2000), “Developing a Clinically Useful Actuarial Tool for Assessing Violence Risk”, The British Journal of Psychiatry, 176, 312–319.
MONAHAN, J., STEADMAN, H.J., SILVER, E., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2001), Rethinking Risk Assessment: The MacArthur Study of Mental Disorder and Violence, New York, NY: Oxford University Press.
MOSSMAN, D. (2006), “Critique of Pure Risk Assessment or, Kant Meets Tarasoff”, University of Cincinnati Law Review, 75, 523–609.
MOSSMAN, D. (2013), “Evaluating Risk Assessments Using Receiver Operating Characteristic Analysis: Rationale, Advantages, Insights, and Limitations”, Behavioral Sciences and the Law, 31, 23–39.
PASHLER, H., and WAGENMAKERS, E.J. (2012), “Editors’ Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence?”, Perspectives on Psychological Science, 7, 528–530.
POLLACK, I., and NORMAN, D.A. (1964), “A Non-Parametric Analysis of Recognition Experiments”, Psychonomic Science, 1, 125–126.
R CORE TEAM (2014), R: A Language and Environment for Statistical Computing (Version 3.1.1), Vienna, Austria, http://www.R-project.org/.
ROBERTS, S., and PASHLER, H. (2000), “How Persuasive is a Good Fit? A Comment on Theory Testing”, Psychological Review, 107, 358–367.
SNOWDEN, R.J., GRAY, N.S., TAYLOR, J., and FITZGERALD, S. (2009), “Assessing Risk of Future Violence Among Forensic Psychiatric Inpatients with the Classification of Violence Risk (COVR)”, Psychiatric Services, 60, 1522–1526.
SPSS, INC. (1993), SPSS for Windows (Release 6.0), Chicago, IL: SPSS, Inc.
STEADMAN, H.J., SILVER, E., MONAHAN, J., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2000), “A Classification Tree Approach to the Development of Actuarial Violence Risk Assessment Tools”, Law and Human Behavior, 24, 83–100.
STURUP, J., KRISTIANSSON, M., and LINDQVIST, P. (2011), “Violent Behaviour by General Psychiatric Patients in Sweden: Validation of Classification of Violence Risk (COVR) Software”, Psychiatry Research, 188, 161–165.
VRIEZE, S.I., and GROVE, W.M. (2008), “Predicting Sex Offender Recidivism. I. Correcting for Item Overselection and Accuracy Overestimation in Scale Development. II. Sampling Error-Induced Attenuation of Predictive Validity over Base Rate Information”, Law and Human Behavior, 32, 266–278.

Author information

Authors and Affiliations

University of Illinois at Urbana-Champaign, Champaign, IL, USA
Ehsan Bokhari & Lawrence Hubert
Los Angeles, USA
Ehsan Bokhari

Corresponding author

Correspondence to Ehsan Bokhari.

Additional information

Ehsan Bokhari is now a Senior Analyst with the Los Angeles Dodgers in Los Angeles, California.

Electronic supplementary material

ESM 1 (PDF 826 kb)

ESM 2 (PDF 852 kb)

About this article

Cite this article

Bokhari, E., Hubert, L. The Lack of Cross-Validation Can Lead to Inflated Results and Spurious Conclusions: A Re-Analysis of the MacArthur Violence Risk Assessment Study. J Classif 35, 147–171 (2018). https://doi.org/10.1007/s00357-018-9252-3

Published: 26 March 2018
Issue Date: April 2018
DOI: https://doi.org/10.1007/s00357-018-9252-3
