Introduction

The interdisciplinary fields of precision psychiatry, machine learning, neural network algorithms, and neuroimaging have made considerable progress in recent years [1,2,3]. The objective of a machine learning method is to build a data-driven algorithm that learns from past or present data and uses the learned knowledge to make predictive decisions about unknown future events or unseen data [4,5,6]. In general terms, the roadmap for a machine learning method comprises three steps: building the model from initial inputs, evaluating and tuning the model, and then using the model to make a predictive decision [4,5,6]. In the field of precision psychiatry, machine learning approaches integrate multiple data types such as neuroimaging and multi-omics data by using state-of-the-art statistical and data mining algorithms that can automatically learn to recognize complicated patterns in empirical datasets [1,2,3]. To address the pressing challenges that precision psychiatry faces today, there is a strong need for machine learning software frameworks that can deliver clinical predictions of a given categorical or quantitative phenotype using next-generation neuroimaging and multi-omics data [1,2,3].
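The three-step roadmap can be illustrated with a minimal sketch. The nearest-centroid classifier, the synthetic two-feature data, and all function names below are assumptions chosen for brevity; they are not taken from any of the cited studies. Because this toy classifier has no hyperparameters, the tuning part of the second step is omitted.

```python
import random

def centroid(rows):
    """Mean vector of a list of equal-length feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit(features, labels):
    """Step 1: build the model (one centroid per class)."""
    classes = sorted(set(labels))
    return {c: centroid([x for x, y in zip(features, labels) if y == c])
            for c in classes}

def predict(model, x):
    """Step 3: assign the class whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, model[c]))
    return min(model, key=dist2)

def accuracy(model, features, labels):
    """Step 2: evaluate on data that were not used for fitting."""
    hits = sum(predict(model, x) == y for x, y in zip(features, labels))
    return hits / len(labels)

# Synthetic data: class 0 clusters around (0, 0), class 1 around (3, 3).
rng = random.Random(0)
data = [([rng.gauss(3 * y, 0.5), rng.gauss(3 * y, 0.5)], y)
        for y in (0, 1) for _ in range(50)]
rng.shuffle(data)
train, held_out = data[:70], data[70:]

model = fit([x for x, _ in train], [y for _, y in train])
held_out_accuracy = accuracy(model,
                             [x for x, _ in held_out],
                             [y for _, y in held_out])
new_case_label = predict(model, [2.9, 3.2])  # prediction for an unseen case
```

Keeping the held-out cases separate from the training cases is what makes the second step an estimate of performance on future data rather than a measure of fit.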

Precision psychiatry, an emerging field of medicine, is growing into a cornerstone of medical practice, with the prospect of customizing healthcare for patients with psychiatric disorders so that medical practices, decisions, and treatments are tailored to individual patients [7]. More precisely, patient populations are subdivided into groups by biomarkers such as neuroimaging and multi-omics data, so that medications might be adapted to each individual patient with relevant or comparable genetic and imaging characteristics [7]. To date, a growing number of biomarkers have been reported that could affect clinical drug response and disease prognosis in the treatment of patients with psychiatric disorders [8]. Furthermore, it has long been suggested that selected single nucleotide polymorphisms (SNPs) and gene expression profiles could serve as biomarkers of clinical treatment response and adverse drug reactions for antidepressants in patients with major depressive disorder (MDD) [9, 10].

Recent advances in machine learning, especially deep learning, have demonstrated its ability to learn and recognize nonlinear and complicated hierarchical patterns from large-scale empirical datasets [11,12,13,14,15]. Owing to developments such as general-purpose computing on graphics processing units, deep learning has achieved state-of-the-art performance in a wide variety of applications, including precision psychiatry [2, 12,13,14,15]. In general, the goal of deep learning is to construct a machine learning algorithm that builds a hierarchical representation of the data through multiple layers of abstraction, as in neural networks [12,13,14,15,16]. In other words, a deep learning algorithm for classification applications such as medical diagnosis in precision psychiatry is a procedure for choosing the best hypothesis using a neural network with multiple layers, instead of a neural network with only a single layer [12,13,14,15,16].

With advances in neuroimaging and multi-omics technologies, novel diagnostic tools as well as new drugs show high growth potential to address the needs of precision psychiatry for treatment and therapeutic interventions [17]. The use of biomarkers based on machine learning approaches has played a vital role in precision medicine in psychiatry [17]. Recently, there have been a number of key diagnostic studies that considered machine learning methods for various diseases and treatments of significance for psychiatry [2]. To that end, it would be of great interest to create machine learning models that can forecast the probable outcome of disease status and drug treatment for patients with psychiatric disorders [2, 17]. In addressing this need, machine learning approaches might provide invaluable tools to fulfill the promise of precision psychiatry by tailoring treatment based on individual biomarkers [2, 17]. In this article, we present various precision psychiatry studies that assess disease status and drug treatment using machine learning, deep learning, and neural network approaches. In addition, we summarize the limitations of these studies and discuss future directions and challenges.

Method

Literature Search and Analyses

In this review, we present relevant studies on precision psychiatry and machine learning applications after a comprehensive search of the electronic PubMed database (2015–present). Key words in the search included “machine learning,” “deep learning,” “psychiatry,” “neuroimaging,” “precision psychiatry,” and “neural network.” Furthermore, we employed a manual search procedure of bibliographical cross-referencing. We manually screened the obtained articles and aimed at identifying original papers and reviews with a particular focus on precision psychiatry and machine learning applications. While this article is by no means a comprehensive review of all potential studies reported in the literature, we merely pinpointed various examples for machine learning methods in precision psychiatry.

Machine Learning and Neural Network Applications

Here we describe selected studies that focus on four main areas in the context of machine learning and neural network methods in psychiatry: diagnosis prediction, prognosis prediction, treatment prediction, and the detection of potential biomarkers. Clinical or biological suggestions from these four categories could serve as a decision support aid for future prognosis and optimal treatments in translational psychiatry [18].

While this summary does not provide the entire set of relevant studies reported in the literature, it nonetheless provides a synthesis of those that can markedly influence public and population health-oriented applications in psychiatry and machine learning in the near to midterm future.

Diagnosis Prediction

In recent years, there has been a growing trend in combining machine learning techniques with structural and functional neuroimaging to provide new insights into brain disorders such as Alzheimer’s disease, autism spectrum disorder, and schizophrenia [19]. In the following selected studies, we focus in particular on emerging big data methodologies such as deep learning.

In order to distinguish mild to severe sporadic Alzheimer’s disease from normal aging, Kloppel et al. pioneered a project to utilize a machine learning method, support vector machines (SVMs), using structural magnetic resonance imaging (MRI) and achieved 89% accuracy in their model [20]. In their two-step procedure, SVMs learned the differences between patients with Alzheimer’s disease and healthy controls in the first step, and then the framework was tested on a new brain scan in the second step [20].

In a recent study, Ju et al. proposed a deep learning approach, which consists of auto-encoders and a softmax regression layer, to predict the early diagnosis of Alzheimer’s disease [21]. In general, the auto-encoder is an encoding architecture, which consists of a neural network with the input layer representing the MRI data, multiple hidden layers representing nonlinear transformations from the previous layer, and the output layer representing the reconstructed MRI instances. Compared to widely used single-kernel SVM (accuracy = 84.40%) and multi-kernel SVM (accuracy = 86.42%), their proposed auto-encoder model had a better accuracy (87.76%). Their work highlights that deep learning approaches have an advantage over traditional machine learning methods to predict and prevent Alzheimer’s disease at an early stage [21].
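The auto-encoder and softmax layer described above can be written compactly. The following is a generic single-hidden-layer formulation offered as a sketch; it is not necessarily the exact architecture or loss used by Ju et al., and the symbols (weights W and W', biases b and b', activation functions f and g, softmax parameters w_k and c_k) are notation introduced here for illustration.

```latex
\begin{align}
h &= f(W x + b) && \text{(encoder: hidden representation of the input } x\text{)}\\
\hat{x} &= g(W' h + b') && \text{(decoder: reconstructed input)}\\
\mathcal{L}_{\mathrm{rec}} &= \frac{1}{N}\sum_{i=1}^{N} \lVert x_i - \hat{x}_i \rVert_2^2 && \text{(reconstruction loss to be minimized)}\\
P(y = k \mid h) &= \frac{\exp(w_k^\top h + c_k)}{\sum_{j=1}^{K} \exp(w_j^\top h + c_j)} && \text{(softmax output over } K \text{ classes)}
\end{align}
```

Stacking several encoders, each taking the previous hidden representation as its input, gives the multiple hidden layers mentioned in the text; the softmax layer then maps the final representation to class probabilities.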

With a belief network-based algorithm, Ortiz et al. employed a deep learning architecture that integrates gray matter images from brain areas and automated anatomical labeling data for identifying Alzheimer’s disease with MRI data [22]. Their model was composed of an ensemble of two deep belief networks with four different voting schemes [22]. The analysis results demonstrated that the proposed deep belief network method provided good performances for differentiating images between healthy controls and Alzheimer’s disease individuals (accuracy = 90%) as well as differentiating images between mild cognitive impairment and Alzheimer’s disease (accuracy = 84%) [22].

In machine learning, a deep belief network is a class of deep neural networks which consist of multiple layers of latent variables and can be viewed as a composition of auto-encoders [23]. Deep belief networks have also been applied to discriminate young children with autism spectrum disorders using functional MRI [23]. In addition, Pinaya et al. utilized a deep belief network model to characterize differences between patients with schizophrenia and healthy controls (accuracy = 73.6%) using MRI data, and the performance was better than the SVM model (accuracy = 68.1%) [24].

Prognosis Prediction

In order to identify course trajectories of MDD, Schmaal et al. proposed a machine learning framework that integrates structural and functional MRI using Gaussian process classifiers [25]. Gaussian process classifiers are a type of multivariate pattern recognition method; they are similar to SVMs and have the advantage of providing predictive probabilities of class membership [25]. Schmaal et al. employed Gaussian process classifiers to evaluate three MDD trajectories (chronic, gradually improving, and fast remission patients) using the prognostic value of MRI and clinical data (such as baseline severity, duration, and comorbidity) [25]. Their analysis showed that their machine learning framework can discriminate chronic patients from remitted patients with up to 73% accuracy.

To predict health status and better inform clinical decision-making, Miotto et al. proposed a deep learning framework which constructs a general-purpose patient representation from electronic health records (EHRs) to facilitate clinical predictive modeling [26]. Specifically, the deep learning framework uses a multilayer neural network, which is a three-layer stack of de-noising auto-encoders with sigmoid activation functions to derive hierarchical regularities and dependencies in the dataset of about 700,000 patients [26]. Then, random forest classifiers were implemented to evaluate the probability that patients might develop a certain disease given their current clinical status [26]. In the proposed deep learning framework, de-noising auto-encoders were trained to reconstruct the input from a noisy version of the original data to avoid overfitting [26]. The findings indicated that schizophrenia and attention deficit hyperactivity disorder could be forecasted with high accuracy (area under the receiver operating characteristic curve (AUC) = 0.85) [26]. Moreover, Miotto et al. compared the proposed deep learning framework with well-known conventional machine learning algorithms, including principal component analysis, k-means clustering, Gaussian mixture model, and independent component analysis [26]. The performance metrics of the proposed deep learning framework were superior to those obtained by the conventional machine learning algorithms [26]. The main strength of the proposed approach is that preprocessing EHR data with the deep learning framework can help provide more effective predictions because of nonlinear transformations [26].
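The de-noising idea can be sketched in a few lines: the network receives a corrupted copy of each record, but its reconstruction is scored against the clean original. The sketch below assumes masking noise (randomly zeroing entries), which is one common choice; the text above does not state which noise type was used. The record values and function names are hypothetical.

```python
import random

def mask_corrupt(x, rate, rng):
    """Return a copy of x with roughly `rate` of its entries set to zero."""
    return [0.0 if rng.random() < rate else v for v in x]

def squared_error(clean, reconstructed):
    """Reconstruction loss, measured against the clean input."""
    return sum((c - r) ** 2 for c, r in zip(clean, reconstructed))

rng = random.Random(42)
clean = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8]  # hypothetical normalized record
noisy = mask_corrupt(clean, rate=0.3, rng=rng)

# A trained network would map `noisy` to a reconstruction. The identity map
# stands in here only to show which vector the loss is compared against.
loss_if_identity = squared_error(clean, noisy)
```

Because simply copying the input cannot recover the masked entries, the network is pushed to learn dependencies between features, which is the sense in which the corruption acts against overfitting.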

Treatment Prediction

The use of machine learning in terms of predicting treatment response in psychiatric drugs is still in its infancy as scant human studies have investigated methods to build prediction models for estimating treatment response. We focus on antidepressant treatment response in this section.

In order to predict patient-specific antidepressant treatment outcomes in MDD, Lin et al. developed a deep learning prediction algorithm and applied it to datasets integrated from several data types (including genetic data such as SNPs, demographic data such as age, sex, and marital status, and clinical data such as baseline Hamilton Rating Scale for Depression score, depressive episodes, and suicide attempt status of MDD patients) [27]. First, they conducted a genome-wide association study (GWAS) to pinpoint potentially significant SNPs for antidepressant treatment response and remission in a hypothesis-free manner [27]. Their deep learning prediction algorithm is called the multilayer feedforward neural network (MFNN) approach and is trained with the back-propagation algorithm. MFNN was employed to model the complex relationship between antidepressant treatment response and biomarkers [27]. An advantage of their approach is that these predictive MFNN methods possess the benefits of nonlinear models, fault tolerance, real-time processing, and integrated systems [27]. In their analysis results, Lin et al. identified the MFNN model with three hidden layers (AUC = 0.81; sensitivity = 0.77; specificity = 0.66) for remission and the MFNN model with two hidden layers (AUC = 0.82; sensitivity = 0.75; specificity = 0.69) for antidepressant treatment response [27].
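For readers less familiar with the reported metrics, the following sketch shows how AUC, sensitivity, and specificity are computed from true labels and predicted scores. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data from Lin et al.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """Probability that a random positive case outscores a random negative
    case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = responder, 0 = non-responder.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)  # 0.75, 0.75
area = auc(y_true, scores)                            # 0.8125
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why studies often report all three.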

Several studies have used traditional machine learning methods to predict antidepressant treatment response. A study by Kautzky et al. reported that a machine learning prediction model pinpointed 25% of responders correctly for treatment outcome by using clinical and genetic information [28]. Their model was based on the random forests algorithm, an ensemble learning method that constructs a multitude of decision trees to perform classification tasks [28]. In particular, Kautzky et al. identified several potential biomarkers, including a clinical variable, melancholia, and three SNPs: brain-derived neurotrophic factor (BDNF) rs6265, 5-hydroxytryptamine receptor 2A (HTR2A) rs6313, and protein phosphatase 3 catalytic subunit gamma (PPP3CC) rs7430 [28].

Moreover, by leveraging information such as age, structural imaging, and mini-mental status examination scores, a subsequent study by Patel et al. showed that a machine learning model predicted treatment response with 89% accuracy by using an alternating decision tree model, which generalizes decision trees and is related to boosting, a technique used primarily to reduce bias and variance [29].

Furthermore, another study by Chekroud et al. demonstrated that a machine learning model estimated clinical remission with 59% accuracy by using 25 variables [30]. First, the top 25 predictors were identified by the elastic net, and then a gradient boosting machine, a tree-based ensemble method, was used to combine several weakly predictive models (typically decision trees) into a final ensemble model [30]. In particular, Chekroud et al. identified the top three potential biomarkers of non-remission as baseline depression severity, feeling restless during the past 7 days, and reduced energy level during the past 7 days [30]. On the other hand, the top three potential biomarkers of remission were total years of education, currently being employed, and loss of insight into one’s depressive condition [30]. One advantage of their approach is that their model and predictors were externally validated in other independent cohorts [30].

Iniesta et al. also indicated that a machine learning model based on clinical and demographic characteristics can forecast response with clinically meaningful accuracy by using regularized regression models (AUC = 0.72) [31]. They utilized the elastic net, a regularized regression approach in which linear models are fitted with penalties that enable variable selection from a large number of variables while avoiding overfitting [32].
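The penalties used by the elastic net can be made explicit. The standard objective for a quantitative outcome is given below as a sketch in one widely used parametrization; the cited study may have used a different but equivalent scaling of the tuning parameters. Here N is the number of patients, x_i the predictor vector, y_i the outcome, and λ ≥ 0 and α ∈ [0, 1] are the tuning parameters.

```latex
\begin{equation}
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^\top \beta\bigr)^2
+ \lambda\Bigl[\alpha \lVert \beta \rVert_1 + \tfrac{1-\alpha}{2}\lVert \beta \rVert_2^2\Bigr]
\end{equation}
```

Setting α = 1 gives the LASSO penalty, which drives some coefficients exactly to zero and thereby performs variable selection, while α = 0 gives ridge regression, which only shrinks coefficients. For a binary outcome such as response versus non-response, the squared-error term is replaced by the negative log-likelihood of a logistic model.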

Finally, a recent study by Maciukiewicz et al. suggested that a machine learning model based on SNPs can predict treatment response with 52% accuracy by using both SVM-based and decision tree-based models [33]. They first conducted a GWAS to search for genetic susceptibility loci of antidepressant treatment response in a hypothesis-free manner [33]. Then, they performed least absolute shrinkage and selection operator (LASSO) regression to identify potentially significant predictors such as the rs2036270 and rs7037011 SNPs [33]. LASSO performs both variable selection and regularization in order to enhance prediction accuracy [34].

Detection of Potential Biomarkers

In order to derive a novel risk factor called Alzheimer’s disease pattern similarity scores, Casanova et al. implemented a high-dimensional machine learning framework, namely a regularized logistic regression method with an elastic net penalty, to simultaneously accomplish data integration and prediction for Alzheimer’s disease [35]. Regularized regression with the elastic net and its variant, the adaptive elastic net, which uses elastic net estimates as the initial weights in the model, has been successfully applied to high-dimensional data [32, 36]. Their approach uses MRI data, is based on a coordinate-wise descent technique [37], and is able to discriminate patients with Alzheimer’s disease from healthy controls [35]. Casanova et al. revealed that Alzheimer’s disease pattern similarity scores had strong associations with age, cognitive function, and cognitive status and may be used as an Alzheimer’s disease risk factor [35].
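Coordinate-wise descent updates one coefficient at a time while holding the others fixed, and each update has a closed form based on soft-thresholding. The sketch below illustrates this for an elastic net penalty with a squared-error loss, which is swapped in here for brevity; the study above used logistic regression, where each update is applied to a weighted quadratic approximation of the log-likelihood. The toy data, function names, and parameter values are assumptions for illustration.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values in [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def elastic_net_cd(X, y, lam, alpha, n_sweeps=100):
    """Coordinate-wise descent for
    (1/2N) * sum_i (y_i - x_i . beta)^2
        + lam * (alpha * |beta|_1 + (1 - alpha) / 2 * |beta|_2^2).
    Assumes the columns of X and the vector y are centered (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # mean squared value of feature j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k]
                                    for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            rho /= n
            norm /= n
            beta[j] = (soft_threshold(rho, lam * alpha)
                       / (norm + lam * (1 - alpha)))
    return beta

# Toy centered data: the outcome depends only on the first feature.
X = [[-2, 1], [-1, -1], [0, 1], [1, -1], [2, 0]]
y = [-4, -2, 0, 2, 4]

beta_lasso = elastic_net_cd(X, y, lam=0.5, alpha=1.0)
beta_enet = elastic_net_cd(X, y, lam=0.5, alpha=0.5)
```

In both fits the coefficient of the uninformative second feature is set exactly to zero, while the coefficient of the first feature is shrunk below its unpenalized value of 2. This is the variable selection behavior that makes these penalties attractive for high-dimensional imaging and genetic data.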

Limitations

The findings discussed in the previous sections should be interpreted in light of some limitations of these studies in psychiatry and machine learning applications. First, the small sample sizes of the aforementioned studies do not allow definite conclusions to be drawn because of the possibility of overfitting during the training of machine learning and deep learning algorithms [38]. Second, it is important to replicate their results with comparable data from an independent cohort [2]. However, most of these studies did not include replication, possibly because additional and larger datasets were not available to facilitate subsequent studies. Furthermore, these findings may not be generalizable. An open challenge is to ensure that the most useful findings generalize across various testing conditions, broader populations, different ethnic groups, numerous sites, and diverse real-life clinical settings [39].

In addition, variability in data quality such as missing data may make prior selection of predictive variables or driver biomarkers essential; for example, preselecting SNPs from the GWAS data [40]. However, we speculate that this preselection can be expected to impact the overall structure and implementation in the final machine learning model. In future work, large prospective clinical trials are necessary in order to answer whether the relevant biomarkers are reproducibly associated with disease status and drug response in machine learning studies.

While in this article we selected a few studies to exemplify relevant machine learning and deep learning algorithms in the neurobiology of psychiatric disorders, it should be pointed out that psychiatry research could further benefit from combining the advanced machine learning algorithms with multi-omics techniques [17]. Several existing biobanks have been established for multi-omics studies, including the COMBINE Biobank [41], the Korea Healthy Twin Study [42], LifeGene [43], and the Taiwan Biobank [44,45,46,47,48,49,50,51].

Finally, it should be noted that in order to demonstrate the robustness of a single biomarker or even a set of biomarkers, future studies in precision psychiatry and machine learning should assess the overlap or lack of overlap between biomarkers in diverse machine learning analyses by using tools such as Venn diagrams [52]. Current evidence indicates that various machine learning analysis strategies may often yield different biomarkers, even when they were applied to the same type of disease [53].
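The overlap assessment suggested above amounts to simple set operations on the markers selected by each method, which is the quantity a Venn diagram displays. In the sketch below the method names and marker names are hypothetical placeholders, not results from any cited study, and the Jaccard index is added as one possible summary of pairwise agreement.

```python
from itertools import combinations

# Hypothetical selections; placeholder names only.
selected = {
    "random_forest": {"marker_a", "marker_b", "marker_c"},
    "elastic_net": {"marker_b", "marker_c", "marker_d"},
    "neural_network": {"marker_c", "marker_e"},
}

# Markers chosen by every method (the central region of a Venn diagram).
shared_by_all = set.intersection(*selected.values())

def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)

# Pairwise agreement between methods.
pairwise = {
    (m1, m2): jaccard(selected[m1], selected[m2])
    for m1, m2 in combinations(selected, 2)
}
```

A marker that appears in `shared_by_all` across analyses and cohorts is a stronger candidate for follow-up than one selected by a single method.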

Perspectives

As suggested by the aforementioned studies, precision psychiatry promises to offer new therapeutic and diagnostic techniques for accurate diagnosis, prognosis, and treatment in a disease-specific and patient-specific manner [7, 8, 54, 55]. In the context of multi-omics and neuroimaging-driven approaches, it is of major importance that potential clinical applications of machine learning for the prediction of drug response or disease provide appropriate solutions for global primary care, so that they can be taken up by users and governments. Moreover, it is hypothesized that the next generation of psychiatric therapies ought to take into consideration the interplay among neuroimaging and multi-omics data. The latest advances in single-cell sequencing and data-intensive life sciences are likely to trigger more novel machine learning software tools for population health in the next decade [56]. Furthermore, these advances are also expected to generate increasingly application-oriented solutions for the field of public health in light of the pressing needs of precision psychiatry for innovative diagnostics [57]. Over the next few years, machine learning-based precision psychiatry for pretreatment prediction may become a reality in patient care, after large prospective clinical trials have validated the clinical factors and relevant biomarkers [58].