Common Statistical Methods for Primary and Secondary Analysis in Substance Abuse Research

King, Adam; Li, Libo; Hser, Yih-Ing

doi:10.1007/978-3-319-55980-3_5

Abstract

This chapter presents statistical methods and issues commonly encountered in the design and analysis of substance abuse research studies. It begins with a general discussion contrasting primary and secondary data analysis, followed by an overview of study design from the perspective of primary research, including specification of hypotheses and planned analyses, sampling schemes, and power analysis. Next, study characteristics are described from the perspective of secondary analysis, with particular attention to those that must be considered when choosing appropriate analytic methods and interpreting results. The statistical methods reviewed include several types of regression (linear regression, logistic regression, and survival analysis); related topics such as moderators and mediators; multilevel models for longitudinal or clustered observations; and latent variable modeling techniques, including structural equation modeling, latent class analysis, latent transition analysis, and growth mixture modeling. Finally, the chapter gives overviews of four special topics that are particularly important when using secondary data: multiplicity of hypotheses, combining data and results from multiple studies, missing data, and propensity scores. Where helpful, concepts and methods are illustrated with practical examples.

Acknowledgements

The writing of this chapter was supported by the National Institute on Drug Abuse, Center for Advancing Longitudinal Drug Abuse Research (CALDAR, P30 DA016383, PI: Hser).

Author information

Authors and Affiliations

University of California, 11075 Santa Monica Blvd., Suite 200, Los Angeles, CA, 90025, USA
Adam King, Libo Li & Yih-Ing Hser
Department of Mathematics and Statistics, California State Polytechnic University, 3801 West Temple Ave., Pomona, CA, 91768, USA
Adam King

Corresponding author

Correspondence to Adam King.

Editor information

Editors and Affiliations

Department of Health Policy and Management, College of Public Health, Kent State University, Kent, Ohio, USA
Jonathan B. VanGeest
Survey Research Laboratory, Department of Public Administration, College of Urban Planning and Public Affairs, University of Illinois at Chicago, Chicago, Illinois, USA
Timothy P. Johnson
Department of Health Policy and Management, College of Public Health, Kent State University, Kent, Ohio, USA
Sonia A. Alemagno

About this chapter

Cite this chapter

King, A., Li, L., Hser, YI. (2017). Common Statistical Methods for Primary and Secondary Analysis in Substance Abuse Research. In: VanGeest, J., Johnson, T., Alemagno, S. (eds) Research Methods in the Study of Substance Abuse. Springer, Cham. https://doi.org/10.1007/978-3-319-55980-3_5

DOI: https://doi.org/10.1007/978-3-319-55980-3_5
Published: 20 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55978-0
Online ISBN: 978-3-319-55980-3
eBook Packages: Social Sciences, Social Sciences (R0)
