Introduction

Food addiction (FA) has recently attracted growing scientific interest. Although the concept remains controversial, food addiction has been hypothesized to be a substance-related disorder [1] characterized by a high consumption of palatable foods. Food addiction has been associated with a higher prevalence of binge eating, high depression rates [2] and lower quality of life scores, especially in patients with obesity [3]. Thus, early assessment of food addiction may be relevant when considering obesity management. However, it may be difficult for a primary care physician to detect and diagnose this disorder. A prescreening tool could therefore help physicians refer patients to specialists who will then be able to diagnose and treat food addiction.

The SCOFF-test (Sick, Control, One stone, Fat, Food) is a 5-item eating disorder screening tool [4] and might be considered an example of this type of prescreening tool. Owing to its simplicity, it is considered a helpful tool for routine screening. However, the SCOFF is less suitable for obesity or binge eating disorder [5], and the wording of its items does not appear relevant for food addiction. The Yale Food Addiction Scale (YFAS), developed by Gearhardt et al. in 2009 and modified in 2016, is currently regarded as the “gold standard” for food addiction screening [6]. However, the time needed for the patient to complete the questionnaire may limit its use as a routine screening tool (e.g., during a general practitioner consultation).

Given the interest in diagnosing food addiction or emotional eating, we aimed to establish a short screening test similar to the SCOFF-test. The objective of the present study was to develop a fast screening tool for the detection of emotional eating in patients with obesity, using artificial intelligence (machine learning).

Materials and methods

Population

This is a retrospective study conducted on electronically registered, anonymized clinical data available for patients with obesity (NCT02857179 at clinicaltrials.gov). The study was approved by the national research ethics committee and was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Participants were hospitalized in a tertiary care center for obesity between January 2017 and January 2018. Patients were included if they had either obesity with a Body Mass Index ≥ 35 kg.m−2 and at least two obesity-related comorbidities (i.e., type 2 diabetes, hypertension, dyslipidemia, and obstructive sleep apnea), or obesity with a Body Mass Index ≥ 45 kg.m−2. Exclusion criteria were age under 18 or over 70 years and incomplete datasets.

Diagnosis of food addiction—Yale Food Addiction Scale (YFAS)

For the present study, participants were separated into two groups according to the dichotomous diagnosis result (target variable). The FA+ group had a positive diagnosis of food addiction using the YFAS (≥ 3 symptoms and satisfied distress criteria [6]), while the FA− group did not meet the criteria for a food addiction diagnosis.

Clinical variables used in the dataset

During their hospital stay, patients had to complete an electronic 152-item survey, the results of which were stored in their personal electronic medical record. These items included self-assessment questionnaires (Hospital Anxiety and Depression Scale [7], Lehmann and Golay questionnaire [8]) and items concerning medical history, history of body weight management, lifestyle habits, physical activity, and the medical, functional, and psychological impact of obesity. A comprehensive list of items is available in supplemental material Table 1.

Table 1 Ranking analysis of the 10 most relevant variables to discriminate bariatric candidates with and without food addiction, using the fast correlation-based filter (FCBF) algorithm

Data-mining procedures

Data-mining analyses were performed using Orange Data Mining Software® (version 2.7, University of Ljubljana, Slovenia) [9]. The total dataset was split into two class-labeled datasets, FA+ and FA−, according to YFAS scores (target variable). Then, a 3-step data-mining procedure was performed with all 152 items of the survey used as independent variables (predictor variables).

The first step aimed at reducing entropy in the dataset. Indeed, our dataset is based on daily clinical assessment and included variables that may appear heterogeneous or irrelevant. Data-mining procedures make it possible to include predictor variables without strong prior assumptions about their potential interest. A ranking procedure was used to identify the most discriminating predictors between the two class-labeled datasets (FA+ vs. FA−). Items were ranked using the Fast Correlation-Based Filter (FCBF). This entropy reduction-based measure also identifies redundancy due to pairwise correlations between features. The method identifies and ranks the most relevant items in large data panels by calculating the correlation between each item and the food addiction class (i.e., FA+ or FA−). Only items with an FCBF score > 0.1 were retained for the subsequent steps of the data-mining analysis. Several other ranking methods were also tested to identify the most discriminant variables: Information Gain, Gain Ratio, GINI, ANOVA, Chi2 and ReliefF. The retained variables were thereafter used for food addiction prediction.
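As an illustration of the ranking step, the sketch below computes symmetrical uncertainty, the entropy-based correlation measure on which FCBF is built, between each item and the class label, and keeps items above a threshold. It is a minimal sketch and not the Orange implementation used in the study: the function names, the toy answers and the assumption that the retained score corresponds to symmetrical uncertainty are ours, and the redundancy-removal part of FCBF is omitted.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy      # mutual information
    return 2 * info_gain / (h_x + h_y)


def rank_items(items, target, threshold=0.1):
    """Return (item, score) pairs with score > threshold, best first."""
    scores = {name: symmetrical_uncertainty(vals, target)
              for name, vals in items.items()}
    kept = [(name, s) for name, s in scores.items() if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: 1 = FA+, 0 = FA-; item answers coded 0-2.
target = [1, 1, 1, 0, 0, 0, 0, 0]
items = {
    "informative_item": [2, 2, 1, 0, 0, 0, 1, 0],
    "constant_item":    [1, 1, 1, 1, 1, 1, 1, 1],
}
ranking = rank_items(items, target)
```

In this toy example the constant item carries no information about the class and is dropped, which mirrors how most of the 152 survey items would fall below the retention threshold.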

The second step of the data-mining analysis aimed at selecting the most relevant predictive algorithm for food addiction. The performances of different predictive algorithms were tested and compared: logistic regression, artificial neural networks, naive Bayes classification, decision tree, AdaBoost meta-algorithm, CN2 rule inducer algorithm, SVM algorithm, k-nearest neighbors algorithm and stochastic gradient algorithm. These artificial intelligence algorithms were cross-validated ten times in a row: for each run, a randomized learning sample representing 66% of the study population was drawn, and the remaining 33% of the population served as the validation sample. The predictive algorithm with the best precision and F1 score was considered the best-performing algorithm.
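The validation scheme described above can be sketched as repeated random splits of the population, with precision, recall and F1 computed on each validation sample. This is a simplified illustration written with the Python standard library, not the Orange workflow used in the study; the function names, the random seed and the rounding of the 66% learning sample are assumptions.

```python
import random


def repeated_holdout(n_samples, train_fraction=0.66, repeats=10, seed=0):
    """Yield (train_idx, valid_idx) index lists from repeated random splits."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_fraction)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive class (0.0 if undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def evaluate(fit, X, y, **split_kwargs):
    """Average (precision, recall, F1) of a model over repeated splits.

    `fit(X_train, y_train)` must return a function mapping one row to a label.
    """
    results = []
    for train, valid in repeated_holdout(len(y), **split_kwargs):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_pred = [predict(X[i]) for i in valid]
        results.append(precision_recall_f1([y[i] for i in valid], y_pred))
    n = len(results)
    return tuple(sum(r[k] for r in results) / n for k in range(3))


# With the 176 participants of the study, each split would contain
# 116 learning and 60 validation patients under this rounding.
splits = list(repeated_holdout(176))
```

Each candidate algorithm would be passed to `evaluate` as a `fit` function, and the averaged scores compared across algorithms.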

The third step aimed at building a nomogram, i.e., a two-dimensional graphical calculating tool designed to allow the approximate graphical computation of the best-performing predictive algorithm determined at step 2.
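For a naive Bayes classifier, a nomogram of this kind can be understood as follows: each answer to each item contributes a number of points equal to a log-likelihood ratio, the points are summed, and the total is converted into a probability together with the prior odds of the positive class. The sketch below illustrates that principle only. The function names, the Laplace smoothing and the answer counts are hypothetical and are not taken from the study data; only the class sizes (45 FA+ out of 176) come from the text.

```python
from math import exp, log


def nomogram_points(counts_pos, counts_neg, alpha=1.0):
    """Log-likelihood-ratio points for each answer level of one item.

    counts_pos[v] / counts_neg[v]: number of FA+ / FA- patients giving
    answer v. `alpha` is a Laplace smoothing constant (an assumption).
    """
    levels = sorted(set(counts_pos) | set(counts_neg))
    n_pos = sum(counts_pos.values()) + alpha * len(levels)
    n_neg = sum(counts_neg.values()) + alpha * len(levels)
    return {
        v: log((counts_pos.get(v, 0) + alpha) / n_pos)
           - log((counts_neg.get(v, 0) + alpha) / n_neg)
        for v in levels
    }


def probability(total_points, prior_pos):
    """Convert summed points plus the prior log-odds into a probability."""
    log_odds = log(prior_pos / (1 - prior_pos)) + total_points
    return 1 / (1 + exp(-log_odds))


# Hypothetical answer counts for one item (levels 0, 1, 2); the class
# totals match the study (45 FA+ and 131 FA-), the breakdown does not.
item_points = nomogram_points(
    counts_pos={0: 10, 1: 15, 2: 20},
    counts_neg={0: 80, 1: 35, 2: 16},
)
prior = 45 / 176

# Probability for a patient answering level 2 on this single item.
p_level_2 = probability(item_points[2], prior)
```

With three items, the points of the three answers would simply be added before calling `probability`, which is what the parallel scales of a nomogram do graphically.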

Results

Complete data were available for 176 participants out of the 320 patients hospitalized during the study period. Among them, 45 (25%) had a positive food addiction diagnosis according to their YFAS scores and were assigned to the FA+ class. The two classes (FA+ and FA−) did not differ significantly in terms of age (42.93 ± 11 years in FA+ vs. 46.55 ± 14 years in FA−, p = 0.09), gender (75.5% of women in FA+ vs. 68.7% in FA−, p = 0.48) or body mass index (44.6 ± 5.3 kg.m−2 in FA+ vs. 45.9 ± 7.3 kg.m−2 in FA−, p = 0.25).

Data-mining procedure results

Following entropy reduction based on the FCBF algorithm, only 3 of the 152 available variables showed an FCBF score > 0.1 and were retained: “I eat to forget my problems”; “I eat more when I’m alone”; “I eat sweets or comfort foods” (Table 1).

Using these three variables, the food addiction predictive performance of several algorithms was evaluated based on the AUC and F1 scores. Two algorithms stood out: logistic regression (AUC: 0.732) and naive Bayes classification (AUC: 0.729). The highest F1 score (0.463) and recall (0.352) were obtained using naive Bayes classification, which was subsequently retained as the best food addiction predictive algorithm (Supplemental material Table 2).

The three-parallel-scale nomogram obtained at the third step is presented in Fig. 1. This tool allows the graphical calculation of the probability of food addiction according to the YFAS: each of the three discriminant items is given a score by projecting the patient's answer onto the upper line (Fig. 1a). The scores of the three items are then summed to give the global probability (as illustrated in Fig. 1b). The nomogram can thus quickly assess the likelihood of a positive YFAS score, as if the patient had completed the classic paper-and-pencil YFAS, and therefore identify patients with a high risk of food addiction.

Fig. 1 Three-parallel-scale nomogram proposed as a screening tool for food addiction (FA) in bariatric candidates. a Presents the tool for a low probability of FA. b Presents a concrete example: a patient who declares often eating more when alone, often eating to forget his problems and consuming comforting food more than three times a week (score of 2 for each item) will have a 78% probability of presenting with FA

Discussion

The present study showed that a brief and simple tool could be created with the assistance of artificial intelligence to screen for disordered eating (e.g., emotional eating) in a population with obesity. The 3-item Food Addiction Screening Test (FAST, available at http://medecine-bariatrique.com/fast/) could facilitate large-scale screening for emotional eating.

Although item selection was driven by artificial intelligence, the variables that were retained are consistent with clinical common sense and with the literature in the field. The first item, “I eat to forget my problems”, underlines the role of anxiety and stress in substance-use disorder. Indeed, stress plays a key role in obesity, and its repetition is likely to play a role in the genesis of food misuse [10]. The second item, “I eat more when I’m alone”, introduces a reference to loneliness, and more particularly to the avoidance of loneliness. This feeling has been identified as influencing the onset or severity of substance addiction [11] or emotional eating. The last item, “I eat sweets or comfort foods”, by questioning the frequency of consumption, might evaluate the severity of the addiction. Lemeshow et al. have recently confirmed the strong association between an almost daily consumption of certain comfort foods (high-fat, snack…) and the diagnosis of food addiction [12]. Moreover, these items are also close to the definitions of binge eating disorder and emotional eating. This is consistent with the substantial overlap between these disorders and should be addressed in further studies.

A strength of the present study is the broad range of clinical variables that were supplied to the artificial intelligence algorithms. One of the limitations is the relatively small sample size for a data-mining study. However, 152 variables were used, representing a set of almost 27,000 data elements. The variables covered broad information on medical history, history and management of obesity, and behavioral and psychological phenotypes. Despite this small sample size, the results appear to be relevant enough to be considered in clinical practice, especially regarding the F1 score and AUC. Another major limitation of our study is its retrospective design, as well as the absence of other measures of eating behaviors (such as the Binge Eating Scale for binge eating disorder or the Dutch Eating Behavior Questionnaire for emotional eating) and of the latest version of the YFAS (which better fits DSM-5 criteria). Therefore, it is not possible to determine the specificity of the test for food addiction as opposed to other close constructs. However, early identification of binge eating disorder or emotional eating is also of interest for patient care.

In conclusion, our study provided an artificial intelligence-derived tool to ease the screening of emotional eating (use of food in response to negative emotional states) in obesity. The next step will be a validation study of the FAST nomogram in a larger population, regarding its sensitivity, its specificity and its relevance for the diagnosis of disordered eating.

What is already known on this subject?

Food addiction is relevant in patients with obesity. The SCOFF-test seems to be less suitable for obesity. Food addiction assessment during general practitioner consultation can be challenging.

What does this study add?

FAST, a brief and simple 3-item tool, could help to determine which patients have a high risk of a positive YFAS score. A further validation study is needed to clarify the underlying construct.