Using machine learning to accelerate our understanding of risks for early substance use among child-welfare and community youth

Ninety percent of adults with a substance use disorder started using substances during adolescence (U.S. Department of Health and Human Services, 2012, 2014; National Center on Addiction and Substance Abuse, 2011). However, substance use screening during adolescent primary care visits is infrequent, and patients are less likely to disclose use during a medical visit than through anonymous means (Gryczynski et al., 2019).

Children and adolescents with a history of child abuse/neglect or child welfare (CW) involvement are at higher risk for substance use and abuse (Lewis et al., 2011; Mills et al., 2014, 2017; Tonmyr et al., 2010). Unfortunately, evidence regarding predictors of substance use/abuse has come only from normative (i.e., non-CW) samples. This study therefore aims to fill the critical gap in knowledge of the risk factors for adolescent substance use that are specific to the CW population. (For the purposes of this study, we are interested in any alcohol, marijuana, or illicit drug use.)

This study is co-led by Kaiser Permanente Southern California (KPSC), the largest integrated health system in Southern California. KPSC serves approximately 1 million children with diverse racial/ethnic and socioeconomic backgrounds. All KPSC patient information is captured in electronic health records (EHRs).

Importantly, we are collaborating with youth medical providers throughout the study to inform variable selection, how the models will be set up and run, and what outcome period is most important to predict. Providers will also establish a risk stratification and corresponding treatment recommendations (e.g., no intervention for very low risk, referrals to addiction medicine for very high risk).

Goals

We are using machine learning (ML) to provide vital new information about how best to identify both child welfare-involved and community (non-CW) youth who are at risk for early substance use, and to develop risk scores that can be implemented in electronic health record (EHR) systems.

We aim to:

–Determine shared and unique predictors of early substance use for CW and non-CW youth among early, mid, and late adolescents.

–Use retrospective data to determine whether risk factors from a recent medical visit can predict current substance use for CW vs. non-CW youth, to inform early identification and intervention.

–Prepare for future implementation in routine clinical care by engaging clinical stakeholders to use the ML models to set cut points for risk stratification, which would then be accompanied by appropriate recommendations.

Methods

We will develop our prediction models using two unique data sources:

1. Primary data will come from the EHRs of KPSC members. We will use maltreatment diagnosis codes to select the CW sample, on the reasonable assumption that a maltreatment diagnosis leads to a child welfare referral. Risk factors will be obtained from diagnosis codes and progress notes in the EHRs of children and parents, as well as from county crime and geographic income data.

2. To address the limitations of EHRs in providing detailed psychosocial data, we will use an existing longitudinal dataset of 454 youth, 303 referred from child welfare and 151 comparison youth (the Young Adolescent Project). Participants were seen at mean ages 11, 13, 15, and 18 years and are racially/ethnically diverse (89% African American, Latinx, or multi-racial). Data include child-level, parent-level, family-level, and neighborhood risk factors, as well as CW case records.

Analysis

Natural language processing (NLP) will be used to ascertain any missing diagnostic codes/conditions based on clinical notes.

Machine learning models will be used to predict any use of alcohol, marijuana, or illicit drugs. Following best practices in ML research, we will conduct a thorough and reproducible multi-step process: 1) feature selection to reduce the number of variables and improve model accuracy; 2) cross-validated model fitting, with final evaluation on a held-out test set; and 3) feature importance analysis to identify key risk and protective factors. This multi-step approach will be applied to a selection of ML (predictive) models.
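The three steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the "model" is a simple additive score whose weights are outcome-rate gaps, AUC is the assumed metric, and every name here (rate_gap, fit, cv_auc, permutation_importance) is hypothetical. In practice an ML library would replace each piece.

```python
import random
from statistics import mean

def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rate_gap(X, y, j):
    """Outcome rate when binary feature j is present minus the rate when absent."""
    on = [yi for xi, yi in zip(X, y) if xi[j] == 1]
    off = [yi for xi, yi in zip(X, y) if xi[j] == 0]
    return mean(on) - mean(off)

def fit(X, y, k):
    """Step 1: keep the k features with the largest absolute gap, gaps as weights."""
    gaps = {j: rate_gap(X, y, j) for j in range(len(X[0]))}
    keep = sorted(gaps, key=lambda j: abs(gaps[j]), reverse=True)[:k]
    return {j: gaps[j] for j in keep}

def predict(weights, X):
    """Additive risk score over the selected features."""
    return [sum(w * x[j] for j, w in weights.items()) for x in X]

def cv_auc(X, y, k, folds=5):
    """Step 2: mean validation AUC; selection is redone inside every fold."""
    n, out = len(X), []
    for f in range(folds):
        val = set(range(f * n // folds, (f + 1) * n // folds))
        tr = [i for i in range(n) if i not in val]
        w = fit([X[i] for i in tr], [y[i] for i in tr], k)
        out.append(auc(predict(w, [X[i] for i in val]), [y[i] for i in val]))
    return mean(out)

def permutation_importance(weights, X, y, j, rng):
    """Step 3: drop in AUC after shuffling column j across rows."""
    base = auc(predict(weights, X), y)
    col = [x[j] for x in X]
    rng.shuffle(col)
    X_perm = [x[:j] + [c] + x[j + 1:] for x, c in zip(X, col)]
    return base - auc(predict(weights, X_perm), y)

# Synthetic stand-in data: five binary flags, only the first two affect the outcome.
rng = random.Random(0)
X = [[rng.randint(0, 1) for _ in range(5)] for _ in range(600)]
y = [1 if rng.random() < 0.1 + 0.35 * x[0] + 0.35 * x[1] else 0 for x in X]

X_train, y_train = X[:450], y[:450]
X_test, y_test = X[450:], y[450:]  # held out; never used for selection or tuning

best_k = max(range(1, 6), key=lambda k: cv_auc(X_train, y_train, k))
model = fit(X_train, y_train, best_k)
test_auc = auc(predict(model, X_test), y_test)
importance = {j: permutation_importance(model, X_test, y_test, j, rng) for j in model}
```

The number of selected features is chosen by cross-validation on the training rows only. The held-out rows are touched once, for the final AUC and the importance scores, so the reported performance is not inflated by tuning.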

Specifically, the models will be used to determine concurrent shared and unique predictors of early substance use for CW and non-CW youth among early, mid, and late adolescents. We will also identify the variables that are of highest importance at each developmental period (i.e., early, mid, late adolescence) and thus might be included in a clinical risk score.
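Once each group has its own importance ranking, separating shared from unique predictors is a set comparison of the top-ranked variables. The sketch below assumes a hypothetical top-k rule. The feature names and importance values are placeholders invented for illustration, not study results.

```python
def top_k(importance, k):
    """Names of the k features with the highest importance score."""
    return set(sorted(importance, key=importance.get, reverse=True)[:k])

def shared_and_unique(cw_importance, non_cw_importance, k=3):
    """Split the two groups' top-k predictors into shared and group-specific sets."""
    cw = top_k(cw_importance, k)
    non_cw = top_k(non_cw_importance, k)
    return {
        "shared": cw & non_cw,
        "cw_only": cw - non_cw,
        "non_cw_only": non_cw - cw,
    }

# Placeholder importance scores (hypothetical values, not findings).
cw_scores = {"feature_a": 0.30, "feature_b": 0.22, "feature_c": 0.05, "feature_d": 0.01}
non_cw_scores = {"feature_a": 0.25, "feature_b": 0.02, "feature_c": 0.18, "feature_d": 0.03}

result = shared_and_unique(cw_scores, non_cw_scores, k=2)
```

The same comparison could be repeated for each developmental period (early, mid, late adolescence) to see which variables stay important over time.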

We will also use ML with the retrospective data to determine whether risk factors prior to visits can predict current substance use for CW vs. non-CW youth, to inform early identification and intervention.

Bringing it all together: implications

Led by our physician collaborator, Dr. Mercie DiGangi, and co-investigator Claudia Nau, we will engage KPSC providers from pediatrics, adolescent medicine, social work, psychiatry, and addiction medicine to create an advisory panel. The provider panel will determine which ML model should be incorporated into the EHR and decide on cut points for low, medium, and high risk. Suggested care for each risk group would then be displayed to the provider.
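As an illustration of how cut points turn a model's predicted probability into a risk group, here is a minimal sketch. The thresholds are hypothetical, since the provider panel has not set them. The low and high entries echo the examples given earlier for very low and very high risk, and the medium entry is a placeholder.

```python
from bisect import bisect_right

# Hypothetical cut points on predicted probability; the real values
# would be decided by the provider panel.
CUT_POINTS = [0.2, 0.5]
TIERS = ["low", "medium", "high"]

# Placeholder text for what the EHR might display for each tier.
SUGGESTED_CARE = {
    "low": "no intervention",
    "medium": "to be defined by the provider panel",
    "high": "referral to addiction medicine",
}

def risk_tier(probability):
    """Map a predicted probability to a tier; a score exactly at a
    cut point falls into the higher tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return TIERS[bisect_right(CUT_POINTS, probability)]

def recommendation(probability):
    """Return the tier and the suggested care text shown to the provider."""
    tier = risk_tier(probability)
    return tier, SUGGESTED_CARE[tier]
```

Keeping the cut points in one list means the panel could revise them without changing any other logic.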

Prior Work

Previous work with the Young Adolescent Project (YAP) demonstrated that earlier timing of puberty predicted higher levels of alcohol and marijuana use in later adolescence, and that the effect of peer substance use was stronger for youth with a history of maltreatment (Negriff & Trickett, 2012). Additionally, Negriff (2018) used a longitudinal model to test pathways from maltreatment to multiple risk behaviors across adolescence and found that maltreatment increased the likelihood of multiple risk behaviors, including substance use (see figure below).

We also ran ML models using the YAP data to predict marijuana use at the last time point (about 7 years after the first interview; age at baseline was 9-13 years) for child welfare-involved youth and non-CW youth. We found that peer marijuana use and parental monitoring were the top predictors for CW youth, whereas externalizing behavior and delinquency were the top predictors of marijuana use for non-CW youth. (See figure below; Negriff et al., 2022.)

Bistra Dilkina

Sonya Negriff

Eric Rice

Claudia Nau

Chengyi Zheng

Brian Mittman

Mercie DiGangi

Divyajyoti Panda

National Institute on Drug Abuse

Gryczynski J, Mitchell SG, Schwartz RP, et al. Disclosure of adolescent substance use in primary care: Comparison of routine clinical screening and anonymous research interviews. Journal of Adolescent Health. 2019;64(4):541-543.

Lewis TL, Kotch J, Wiley T, et al. Internalizing Problems: A potential pathway from child maltreatment to adolescent smoking. Journal of Adolescent Health. 2011;48(3):247-252. 

Mills R, Alati R, Strathearn L, Najman JM. Alcohol and tobacco use among maltreated and non-maltreated adolescents in a birth cohort. Addiction. 2014;109(4):672-680.

Mills R, Kisely S, Alati R, Strathearn L, Najman JM. Child maltreatment and cannabis use in young adulthood: a birth cohort study. Addiction. 2017;112(3):494-501.

National Center on Addiction and Substance Abuse. Adolescent substance use: America’s #1 public health problem. New York: Columbia University; 2011.

Negriff S. Developmental pathways from maltreatment to risk behavior: Sexual behavior as a catalyst. Development & Psychopathology. 2018;30(2):683-693. 

Negriff S, Dilkina B, Matai L, Rice E. Using machine learning to determine the shared and unique risk factors for marijuana use among child-welfare versus community adolescents. PLoS One. 2022;17(9):e0274998. 

Negriff S, Trickett PK. Peer substance use as a mediator between early pubertal timing and adolescent substance use. Drug and Alcohol Dependence. 2012;126:95-101. 

Tonmyr L, Thornton T, Draca J, Wekerle C. A review of childhood maltreatment and adolescent substance use relationship. Current Psychiatry Reviews. 2010;6(3):223-234.

US Department of Health and Human Services. The Health Consequences of Smoking-50 Years of Progress: A Report of the Surgeon General. Atlanta: Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health; 2014.

US Department of Health and Human Services. Preventing Tobacco Use Among Youth and Young Adults: A Report of the Surgeon General. Atlanta: Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health; 2012.
