Introduction

Migraine is the second most disabling neurological disorder and the third most prevalent medical condition in the world [1, 2]. It is characterized by recurrent episodes of moderate or severe headache accompanied by reversible neurological and systemic symptoms [3,4,5]. The prevalence of migraine varies by gender, with annual prevalence rates of 6% and 18% in men and women, respectively, and lifetime prevalence rates of 13% and 33%, respectively [3]. In terms of age, the prevalence is similar for boys and girls before puberty, but in females it rises significantly after puberty, peaking between 35 and 39 years of age [4]. Migraine not only reduces patients' health-related quality of life and causes migraine-related disability, but also increases the risk of cardiovascular and cerebrovascular diseases and of psychiatric disorders such as anxiety and depression, and it has a significant impact on daily activities and direct medical costs [6,7,8,9,10]. Therefore, strengthening the early prevention, diagnosis, and timely and effective treatment of migraine is of great significance to patients, families, and society.

Despite rapid advances in the understanding of migraine pathophysiology and the availability of evidence-based guidelines to inform clinical decision-making, the prevention and treatment of migraine remain suboptimal, particularly with respect to early prevention and diagnosis [5, 11]. In practice, few people seek medical attention for mild headaches (e.g., stress- and tension-related headaches), which can lead to underdiagnosis of early migraine [12]. By the time patients arrive at the hospital seeking headache treatment, they have usually progressed to more severe migraine or to chronic migraine of longer duration, which can lead to heavier migraine-related burdens [12,13,14]. In recent years, many researchers have conducted preliminary explorations of diagnostic biomarkers for migraine, including those based on genetics, provocation modeling, biochemistry, and neuroimaging [15, 16], as well as a number of migraine-specific biomarkers identified using an "omics" approach [17]. However, most of the biomarkers in previous studies are limited by their invasiveness or high acquisition costs, which makes them poorly suited to early diagnosis and population screening for migraine. Therefore, there is an urgent need for more convenient, efficient, safe, and noninvasive methods for early prediction and diagnosis of migraine.

Migraine is known to be associated with a variety of sleep disorders and psychiatric disorders, such as anxiety and depression, and the bidirectional relationship between migraine and these disorders is of increasing interest to researchers [7, 18,19,20,21,22]. Our previous studies showed that poor sleep quality, anxiety, and depression were significantly and positively associated with increased risk of migraine and migraine burdens, and that Pittsburgh Sleep Quality Index (PSQI) score, Zung's Self-rating Anxiety Scale (SAS) and Self-rating Depression Scale (SDS) scores had good predictive value for migraine and could be used as potential predictors for migraine [23, 24].

Traditional Chinese Medicine Constitution (TCMC) has a complete functional state classification system, which is widely used in health care, subhealth prevention, quality of life evaluation, and disease diagnosis, treatment, and prevention [25,26,27,28,29,30]. A previous study analyzed 1,639 clinical studies on TCMC-disease correlations published over the past 10 years, covering 19 disease categories and 333 different disorders, and showed that eight biased TCMCs were strongly associated with specific diseases and could be used to guide personalized prevention and treatment [26]. Another study found that the Yang-deficiency constitution was closely related to metabolic syndrome and may be a potential predictor of metabolic syndrome [25]. In addition, a previous study found that TCMC was strongly associated with the risk of depression in women [31]. Based on these findings, and considering the close relationship between migraine and depression, we initially hypothesized that TCMC may be associated with the risk of migraine. Consistent with this hypothesis, a survey of TCMC characteristics in migraine patients in Hong Kong found that the Qi-depression and Blood-stasis constitutions were risk factors for migraine [32]. However, the predictive value of TCMC scores for migraine is unclear. In addition, it remains to be explored whether combining TCMC scores with PSQI, SAS, and SDS scores can improve predictive ability for migraine.

Therefore, this study aimed to systematically explore the relationship between TCMC scores and the risk of migraine, to evaluate the predictive value of different TCMC scores for migraine, and then to screen out appropriate TCMC predictors and combine them with PSQI, SAS, and SDS scores to establish novel non-invasive prediction models for migraine. The prediction models were then externally validated, with the goal of providing new ideas and methods for the early prevention and diagnosis of migraine.

Methods and materials

Study design and participant recruitment

This study utilized a case–control study method to develop and validate a new non-invasive prediction model for migraine in Chinese adults. The research design and paper writing referred to the TRIPOD reporting guidelines [33]. This study was approved by the Ethics Committee of Beijing University of Chinese Medicine (Project No.2022BZYLL0903). All subjects volunteered to participate in this study and signed an informed consent form.

First, 128 migraine patients who met the research criteria and 60 sex- and age-matched healthy control subjects attending the headache clinic of the Department of Neurology at China-Japan Friendship Hospital between April 2021 and September 2022 were recruited consecutively as the training set used to establish the non-invasive prediction models for migraine. Then, from September 2022 to June 2023, 64 migraine patients who met the research criteria and 30 healthy control subjects were recruited consecutively as the external validation set used to further validate the predictive performance of the models. The research flowchart is shown in Fig. 1.

Fig. 1 Research flowchart of this study

Outcome and diagnosis criteria

The main outcome of this study was the presence or absence of migraine. All subjects underwent a professional migraine diagnostic evaluation and completed a standardized questionnaire, with interviews conducted by certified neurologists and headache specialists who were blinded to the other independent variables and predictors. The diagnostic criteria for migraine followed the International Classification of Headache Disorders, third edition (ICHD-3) [34]. Two types of migraine were included in this study: migraine with aura and migraine without aura.

Inclusion and exclusion criteria

Inclusion criteria of migraine subjects: 1) Fulfilled the diagnostic criteria for migraine (including migraine without aura and migraine with aura). 2) Aged between 18 and 65 years old, regardless of gender. 3) Had at least one migraine attack in the past month. 4) Those who volunteered to participate in this study and signed the informed consent form.

Exclusion criteria: 1) People under the age of 18 or over the age of 65, or pregnant or breastfeeding women. 2) People with serious primary diseases, such as heart, liver, kidney, blood system, mental diseases or malignant tumors. 3) People with infectious diseases, such as tuberculosis and AIDS. 4) Secondary headache caused by brain or other organic lesions. 5) Specific types of migraine, such as hemiplegic migraine, ophthalmoplegic migraine, and other migraines. 6) Subjects with missing or clearly incorrect important data.

In addition, all healthy control subjects were required to have no family history of migraine and no current or prior diagnosis of primary or secondary migraine.

Measurement of main research indicators

Pittsburgh Sleep Quality Index (PSQI): The PSQI is a common self-assessment questionnaire used to assess sleep quality [35]. It consists of 19 items across 7 components, and PSQI scores range from 0 to 21, with higher scores indicating poorer sleep quality [36]. Poor sleep quality is defined as a PSQI score greater than 5, a threshold with a diagnostic sensitivity of 98.7% and a specificity of 84.4% [37].

Zung's Self-rating Anxiety Scale (SAS) and Self-rating Depression Scale (SDS): The SAS and SDS are commonly used questionnaires for evaluating anxiety and depression, and their reliability and validity have been validated in Chinese populations [38,39,40]. Both scales include 20 items, with the original scale scores ranging from 20–80 and the transformed index scores ranging from 25–100, with higher scores indicating higher levels of anxiety and depression [41].
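The conversion from raw score to index score can be sketched as follows. This is a minimal illustration that assumes the conventional Zung rule, in which the index score is the raw total multiplied by 1.25 and truncated to an integer; the rule is not stated in this paper, the function name is hypothetical, and reverse scoring of positively worded items is assumed to have been applied beforehand.

```python
def zung_index_score(item_scores):
    """Convert 20 Zung item scores (each 1-4) into the 25-100 index score.

    Assumption: index = int(raw * 1.25), the conventional Zung rule.
    Reverse-scored items must already be recoded by the caller.
    """
    if len(item_scores) != 20:
        raise ValueError("SAS and SDS each have 20 items")
    if any(s not in (1, 2, 3, 4) for s in item_scores):
        raise ValueError("each item is scored from 1 to 4")
    raw = sum(item_scores)      # raw score, 20-80
    return int(raw * 1.25)      # index score, 25-100
```

The two extremes of the raw range (20 and 80) map to the index range of 25 to 100 quoted above.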

Traditional Chinese medicine constitution (TCMC): The TCMC comprises nine subtypes: Gentleness type, Qi-deficiency type, Yang-deficiency type, Yin-deficiency type, Phlegm-Wetness type, Wet-heat type, Blood-stasis type, Qi-depression type, and Special diathesis type. These were measured using the Nine-Constitution Scale, which has an internal consistency of 0.72 to 0.82 and a retest reliability of 0.77 to 0.90 [31, 42]. The subscale for each TCMC subtype consists of 6 to 8 items, each scored from 0 to 4 [26, 27]. The raw score for each subtype was then converted to a score of 0 to 100, which was the TCMC score for that subtype.
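The conversion to a 0 to 100 score can be illustrated with a short sketch. The exact conversion formula is not given here, so the code assumes a simple linear rescaling of the raw subscale total by its maximum possible value (4 points per item); the function name is hypothetical.

```python
def tcmc_converted_score(item_scores):
    """Rescale one TCMC subscale (6-8 items, each scored 0-4) to 0-100.

    Assumption: converted = raw / (4 * number_of_items) * 100.
    """
    n_items = len(item_scores)
    if not 6 <= n_items <= 8:
        raise ValueError("each subscale has 6 to 8 items")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("each item is scored from 0 to 4")
    raw = sum(item_scores)
    return raw / (4 * n_items) * 100
```

Under this assumption, a respondent who answers 2 on every item of a six-item subscale receives a converted score of 50.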

Other measurement

Demographic and baseline characteristics such as gender, age, height, weight, smoking history, drinking history, weekly exercise time, and subjective pressure score were collected through standardized questionnaire interviews and quality control. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters. Smoking history and drinking history were defined as current or previous smoking and drinking behavior. Weekly exercise time was defined as total weekly exercise time (hours). Subjective stress was assessed by a visual analog pressure scale ranging from 0 to 10.

Statistical analysis

This study was a preliminary exploratory study in which the sample sizes of the training and validation sets were set in a 2:1 ratio and met the minimum sample size required by the statistical methods used to develop the prediction models. Samples with missing important data were excluded. Statistical analyses and figure plotting were performed using SPSS version 25.0, the Python package sklearn version 0.22.1, and the R package logreg version 6.2.0. All statistical tests were two-tailed, and P values < 0.05 were considered statistically significant.

First, baseline characteristics of the migraine and control groups in the training set were compared. The chi-square test was used to compare categorical data between groups, expressed as number of cases and percentage [n (%)]. The independent samples t-test was used to compare normally distributed quantitative data between groups, expressed as mean ± standard deviation (SD). The Mann–Whitney U-test was used to compare non-normally distributed quantitative data between groups, expressed as median [interquartile range (IQR)]. Then, logistic regression analyses were performed to assess the effects of PSQI scores, SAS scores, SDS scores, and the nine TCMC scores on the risk of migraine under different adjustment conditions, to screen the risk predictors of migraine, and to establish the optimal prediction models for migraine containing different numbers of indicators. In addition, receiver operating characteristic (ROC) curves were plotted to assess the discrimination of the prediction models, and the areas under the ROC curves (AUCs) of the prediction models in the total sample and in subgroups of different genders and ages were compared using the DeLong test. Calibration curves and the Hosmer–Lemeshow test were used to assess the calibration of the prediction models. Finally, we validated the predictive ability of each prediction model in an external validation set.
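For readers unfamiliar with the AUC, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. It is an illustration in plain Python rather than the software pipeline used in this study, and the scores are made-up toy values.

```python
def auc(case_scores, control_scores):
    """AUC as P(case score > control score) + 0.5 * P(tie)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy example (hypothetical scores, not study data)
toy_cases = [9, 7, 6, 4]
toy_controls = [5, 4, 3]
toy_auc = auc(toy_cases, toy_controls)  # 10.5 favourable pairs out of 12
```

This pairwise form is quadratic in the sample size, which is acceptable for samples of a few hundred subjects.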

Results

Baseline characteristics of participants

The baseline characteristics of the training and validation sets are presented in Table 1. In the training set, there were no significant differences in gender, age, BMI, smoking history, drinking history, weekly exercise time, pressure score, Yin-deficiency score, Phlegm-wetness score, Wet-heat score, or Special diathesis score between the migraine group and the control group (all P values > 0.05). The PSQI score, SAS score, SDS score, Qi-deficiency score, Yang-deficiency score, Blood-stasis score, and Qi-depression score of the migraine group were significantly higher than those of the control group, while the Gentleness score was significantly lower (all P values < 0.05). The same results were found for the validation set. There were no statistically significant differences in any variable between the validation set and the training set (all P values > 0.05), indicating that the validation set was representative and comparable.

Table 1 Comparison of baseline data between training and validation sets

Risk predictors for developing migraine

Logistic regression analyses were conducted to explore the effects of PSQI score, SAS score, SDS score and nine TCMC scores on the risk of developing migraine. It can be seen from Table 2 that the PSQI score, SAS score, SDS score, Gentleness score, Qi-deficiency score, Blood-stasis score, and Qi-depression score were significantly correlated with the risk of migraine (all P values < 0.05). After adjusting for gender, age, smoking history, drinking history, BMI, weekly exercise time, and pressure score, the odds ratios (ORs) for a 1-standard deviation (SD) increase in PSQI score, SAS score, SDS score, Gentleness score, Qi-deficiency score, Blood-stasis score, and Qi-depression score were still 1.748 (95% CI: 1.436–2.128), 1.139 (95% CI: 1.081–1.200), 1.042 (95% CI: 1.007–1.079), 0.959 (95% CI: 0.938–0.979), 1.053 (95% CI: 1.027–1.078), 1.034 (95% CI: 1.011–1.058) and 1.047 (95% CI: 1.025–1.069), respectively.
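The relationship between a logistic regression coefficient and the reported OR with its 95% CI can be sketched as follows. The coefficient and standard error below are not reported in the text; they are back-calculated from the published PSQI estimate (OR 1.748, 95% CI 1.436–2.128) purely for illustration, assuming a Wald-type interval.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Back-calculated (assumed) inputs from OR 1.748 (95% CI: 1.436-2.128)
beta_psqi = math.log(1.748)
se_psqi = (math.log(2.128) - math.log(1.436)) / (2 * 1.96)

or_psqi, lower_psqi, upper_psqi = odds_ratio_with_ci(beta_psqi, se_psqi)
```

Because the interval is symmetric on the log scale, the OR is the geometric mean of the two confidence limits, which is a quick consistency check on reported values.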

Table 2 Results of logistic regression analysis and AUCs on the risk of migraine by various variables

In addition, Table 2 also shows the AUCs of these 12 variables for predicting migraine. The three variables with the highest predictive ability were PSQI score, Qi-depression score, and SAS score, all with AUCs greater than 0.7, indicating good predictive ability for migraine.

Establishment of non-invasive risk prediction models for developing migraine

First, PSQI score [AUC = 0.83 (95% CI: 0.77–0.89)], the best predictor of migraine among all predictors, was selected and entered into a logistic regression to build a prediction model for migraine containing only one variable (Model 1). Then, PSQI score, SAS score, SDS score, Gentleness score, Qi-deficiency score, Blood-stasis score, and Qi-depression score, which were significantly associated with the risk of migraine, were further entered into a forward stepwise logistic regression. The results showed that only PSQI score and Qi-depression score remained significantly and independently associated with an increased risk of migraine. On this basis, a logistic regression prediction model containing these two variables (Model 2) was constructed.

In addition, to further improve prediction performance for migraine, we entered PSQI score, SAS score, SDS score, and the nine TCMC scores into the forward stepwise logistic regression in various combinations and screened out the best prediction models containing 3 to 8 variables (Model 3 to Model 8, respectively). The logistic regression equation and predictive performance of each model are shown in Table 3, which also provides the optimal cutoff value for each model; these cutoffs can be used in clinical applications to predict the presence of migraine.
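How a fitted logistic equation and its cutoff would be applied to an individual can be sketched as below. The intercept, coefficients, and cutoffs in this example are hypothetical placeholders chosen for illustration; they are not the values in Table 3.

```python
import math


def predicted_probability(intercept, coefs, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    logit = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-logit))


def classify(prob, cutoff):
    """Predict migraine when the probability reaches the model's cutoff."""
    return prob >= cutoff


# Hypothetical two-variable model in the style of Model 2
example_coefs = {"psqi": 0.4, "qi_depression": 0.04}
example_values = {"psqi": 8, "qi_depression": 30}
prob = predicted_probability(-3.0, example_coefs, example_values)
```

With these placeholder numbers the linear predictor is 1.4, giving a predicted probability of about 0.80, which would be classified as migraine at a cutoff of 0.5.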

Table 3 Prediction performance results of various models in training set

Evaluation of discrimination and calibration of prediction models

First, to evaluate the discrimination of these prediction models, the ROC curves of the eight prediction models were drawn (Fig. 2A), and the DeLong test was used to compare the predictive ability of the models. The results showed that the AUCs of models 1 to 8 were 0.83 (95%CI: 0.77–0.89), 0.86 (95%CI: 0.80–0.91), 0.87 (95%CI: 0.82–0.92), 0.88 (95%CI: 0.84–0.93), 0.90 (95%CI: 0.85–0.94), 0.91 (95%CI: 0.86–0.95), 0.91 (95%CI: 0.87–0.95), and 0.92 (95%CI: 0.89–0.96), respectively. Compared with Model 1, the AUCs of Models 3 to 8 (containing 3 to 8 variables) were significantly higher. Compared with Model 2, the AUCs of Models 4 to 8 (containing 4 to 8 variables) were significantly higher. In addition, the AUCs of Model 6 to Model 8 were significantly higher than those of Model 1 to Model 4 (all P values < 0.05) and were all at least 0.9. It is worth noting that Model 6 had the best specificity (0.95) and Model 8 had the best sensitivity (0.92).
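Sensitivity and specificity follow directly from the confusion-matrix counts at a given cutoff, as the sketch below shows. The counts are hypothetical: they were chosen only to be consistent with the training-set group sizes (128 cases, 60 controls) and with values of roughly 0.92 and 0.95, and they combine two different models' best values in one table purely for illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate among cases
    specificity = tn / (tn + fp)   # true negative rate among controls
    return sensitivity, specificity


def youden_index(sensitivity, specificity):
    """Youden's J, commonly maximised when choosing an optimal cutoff."""
    return sensitivity + specificity - 1.0


# Hypothetical counts: 118 of 128 cases and 57 of 60 controls classified correctly
sens, spec = sensitivity_specificity(tp=118, fn=10, tn=57, fp=3)
j_stat = youden_index(sens, spec)
```

Whether the optimal cutoffs in this study were chosen by maximising Youden's J is an assumption; the text does not state the criterion.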

Fig. 2 ROC curves of eight prediction models for the total sample (A) and different subgroups (B-E) in the training set. In the training set, the AUC for models 1 to 8 ranged from 0.83 (95% CI: 0.77–0.89) to 0.92 (95% CI: 0.89–0.96) in the total sample and 0.80 (95% CI: 0.63–0.97) to 0.95 (95% CI: 0.91–1.00) in the different sex and age subgroups. AUC Area under the ROC curve

A net reclassification index (NRI) analysis was then performed to compare the diagnostic accuracy of the eight models with the single best predictor, PSQI score. The results, as shown in Fig. 3 and Supplementary Tables 18, showed that the percentage of correct reclassification from model 1 to model 8 was improved by 51.1%, 47.6%, 54.0%, 52.4%, 64.0%, 59.1%, 61.5%, and 72.2%, respectively, in comparison with PSQI score, which indicated that the prediction accuracies of the eight models established in the present study for migraine were all greatly improved and significantly better than the PSQI score.
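The idea behind the NRI can be sketched for the simplest case of binary predictions. The code assumes the standard categorical definition (net upward reclassification among cases plus net downward reclassification among non-cases); which NRI variant was used in this study is not stated, and the data are toy values.

```python
def net_reclassification_index(y_true, old_pred, new_pred):
    """Categorical NRI for binary predictions (1 = migraine, 0 = no migraine)."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    for y, old, new in zip(y_true, old_pred, new_pred):
        if new > old:              # reclassified from negative to positive
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif new < old:            # reclassified from positive to negative
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1
    n_event = sum(y_true)
    n_nonevent = len(y_true) - n_event
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Toy data (hypothetical): 4 cases followed by 4 controls
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_old = [1, 0, 0, 1, 1, 1, 0, 0]
toy_new = [1, 1, 1, 0, 0, 1, 0, 0]
toy_nri = net_reclassification_index(toy_truth, toy_old, toy_new)
```

In the toy data, the net gain is one of four cases and one of four controls, so the NRI is 0.25 + 0.25 = 0.5.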

Fig. 3 Plots of NRI analysis results between the eight models and the PSQI score predictor in the training set. Compared to the PSQI score, the correct reclassification rate from Model 1 to Model 8 increased by 51.1%, 47.6%, 54.0%, 52.4%, 64.0%, 59.1%, 61.5%, and 72.2%, respectively. NRI Net reclassification index, PSQI Pittsburgh sleep quality index

In addition, in order to evaluate the calibration of these prediction models, calibration curves were drawn (Fig. 4) and the Hosmer–Lemeshow tests were performed on each model. The results showed that models 1 to 7 had excellent calibration (P for H–L test > 0.05), but the calibration of model 8 was not very good (P for H–L test = 0.049 < 0.05).
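The Hosmer–Lemeshow statistic can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the routine used in this study: subjects are split into equal-sized groups by sorted predicted probability, the degrees of freedom are taken as the number of groups minus 2, and the P value uses a closed-form chi-square tail that is valid only for even degrees of freedom (so the number of groups must be even). Groups whose expected count is 0 or equal to the group size would cause a division by zero and are not handled.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, probs, groups=10):
    """Return (statistic, p_value) comparing observed and expected events."""
    pairs = sorted(zip(probs, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        stat += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return stat, chi2_sf_even_df(stat, groups - 2)
```

A P value above 0.05 indicates no detected departure between observed and predicted event rates, which is the criterion applied to models 1 to 7 above.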

Fig. 4 Calibration curves of eight prediction models for migraine in the training set. The Hosmer–Lemeshow tests showed that models 1 to 7 had excellent calibration (P for H–L test > 0.05), but the calibration of model 8 was not very good (P for H–L test = 0.049 < 0.05). H–L Hosmer–Lemeshow

Collinearity diagnosis of prediction model variables

Some variables included in the models were not significant in the logistic regression analyses in Table 2, or the signs of their coefficients changed, which suggests possible collinearity among the variables. Therefore, we drew a correlation cluster graph of the variables (Fig. 5) and performed collinearity diagnostics on the variables included in each model. As shown in Table 4, the variance inflation factors (VIFs) of all variables in the eight models were less than 5, indicating that there was no multicollinearity problem and that these prediction models were well constructed.
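The VIF threshold can be made concrete with a short sketch. For a predictor j, VIF_j = 1 / (1 − R_j²), where R_j² is obtained by regressing that predictor on the other predictors in the model; with only two predictors, R_j² reduces to the squared pairwise correlation. The function name and the example value are illustrative and are not taken from Table 4.

```python
def vif_from_r_squared(r_squared):
    """VIF for one predictor given the R^2 from regressing it on the others."""
    if not 0 <= r_squared < 1:
        raise ValueError("R^2 must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Illustrative two-predictor case with a pairwise correlation of r = 0.60
example_vif = vif_from_r_squared(0.60 ** 2)
```

Under this definition, a VIF below 5 means that less than 80% of a predictor's variance is explained by the other predictors in the model.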

Fig. 5 Correlation cluster graph of each variable in the training set

Table 4 Collinearity diagnosis of various model variables

Subgroup analysis of prediction models

Considering the epidemiological differences in migraine by gender and age, we further conducted subgroup analyses according to gender and median age (35 years). The ROC curves of the eight prediction models in the different subgroups were drawn (Fig. 2B-E), and the DeLong test was used to compare the predictive ability of the models within each subgroup. The results are shown in Table 5. The AUCs of the eight models across subgroups ranged from 0.80 (95% CI: 0.63–0.97) to 0.95 (95% CI: 0.91–1.00), indicating that all eight models had good predictive value for migraine in the different gender and age subgroups. The DeLong test showed that Model 6, Model 7, and Model 8 had relatively high AUCs in every subgroup, consistent with the results in the total training sample.

Table 5 Subgroup analysis of prediction performance of various models in training set

External validation of prediction models

To evaluate the applicability of the prediction models, 94 subjects whose baseline data did not differ significantly from those of the training set were recruited as the external validation set. The ROC curves of the eight prediction models for migraine in the validation set were drawn to evaluate their predictive performance. As shown in Fig. 6A, the AUCs of models 1 to 8 were 0.76 (95% CI: 0.66–0.85), 0.80 (95% CI: 0.71–0.88), 0.79 (95% CI: 0.70–0.88), 0.83 (95% CI: 0.75–0.91), 0.83 (95% CI: 0.75–0.91), 0.81 (95% CI: 0.73–0.90), 0.80 (95% CI: 0.71–0.89), and 0.82 (95% CI: 0.74–0.90), respectively. In addition, Fig. 6B-E shows the ROC curves of the eight prediction models for the different gender and age subgroups in the validation set. The AUCs of the eight prediction models ranged from 0.73 (95% CI: 0.64–0.84) to 0.93 (95% CI: 0.82–1.00) across subgroups. Therefore, the eight non-invasive prediction models established in this study showed excellent predictive ability in the overall population and in subgroups of different genders and ages in the external validation set.

Fig. 6 ROC curves of eight prediction models for the total sample (A) and different subgroups (B-E) of migraine in the validation set. In the validation set, the AUC for models 1 to 8 ranged from 0.76 (95% CI: 0.66–0.85) to 0.83 (95% CI: 0.75–0.91) in the total sample and 0.73 (95% CI: 0.64–0.84) to 0.93 (95% CI: 0.82–1.00) in the different sex and age subgroups. AUC Area under the ROC curve

In addition, we predicted the presence or absence of migraine using the optimal cutoff values of the eight models derived in the training set, and then analyzed the association between the predicted outcomes and the actual occurrence of migraine in the training and validation sets separately by multivariable logistic regression. After adjusting for confounders including age, gender, smoking history, drinking history, BMI, weekly exercise time, and pressure score, the predicted outcomes of all eight models in both the training and validation sets were significantly, independently, and positively associated with the actual occurrence of migraine. As shown in Fig. 7, the ORs in the training set ranged from 8.481 (95% CI: 4.131–17.411) to 39.886 (95% CI: 16.422–96.880), and those in the validation set ranged from 6.111 (95% CI: 2.171–17.201) to 10.895 (95% CI: 3.605–32.921), which further confirmed the excellent predictive performance of the eight prediction models for migraine.

Fig. 7 Effects of the predicted outcomes of eight models on the risk of migraine in training and validation sets. *Adjusted for age, gender, smoking history, drinking history, BMI, weekly exercise time, pressure score

Discussion

This study first systematically explored the relationships of PSQI score, SAS score, SDS score, and the nine TCMC scores with the risk of migraine, and evaluated their predictive ability for migraine. The results not only confirmed the predictive value of PSQI score, SAS score, and SDS score for migraine, but also showed that the Qi-deficiency score, Blood-stasis score, and Qi-depression score were significantly associated with the risk of migraine and had a certain predictive ability for migraine. We then developed eight non-invasive prediction models for migraine, evaluated the discrimination and calibration of each model, conducted subgroup analyses stratified by gender and age, and validated the excellent predictive performance of each model in the overall sample and in the different subgroups using an external dataset. These models provide new ideas and reliable methods for the early prevention and diagnosis of migraine and have important clinical application value.

To analyze the risk factors for developing migraine, seven predictors were preliminarily screened out by multiple logistic regression analysis and then further entered into a forward stepwise logistic regression analysis. Finally, PSQI score and Qi-depression score were shown to be significant independent risk factors for migraine. Two of our previous studies found that the PSQI score, SAS score, and SDS score were all significantly and positively associated with the risk of migraine, with the PSQI score having the best predictive power for migraine among the three [23, 24]. The results of this study showed that the AUCs of PSQI score, SAS score, and SDS score for migraine were 0.83 (95% CI: 0.77–0.89), 0.75 (95% CI: 0.68–0.83), and 0.61 (95% CI: 0.52–0.69), respectively, which were similar to those in our previous studies and further demonstrated that the PSQI score had the best predictive ability for migraine among the three. These results also further confirmed the close relationship of anxiety, depression, and poor sleep quality with migraine.

In recent years, TCMC has shown significant value in many studies of psychosomatic and cardiovascular diseases. For example, a systematic review of TCMC as a predictor of depression showed that the Qi-deficiency and Qi-depression constitutions could be used as predictors of depression [43]. A cross-sectional study during the COVID-19 pandemic showed that the Qi-deficiency and Qi-depression constitutions were associated with depression, and that the Qi-depression constitution was associated with anxiety, in patients with systemic sclerosis [44]. A study of Malaysian university students also showed that the Qi-depression constitution was an important risk factor for depression among university students [45]. Another study showed that Qi-deficiency was significantly associated with emotional symptoms, pain, and fatigue in patients with systemic lupus erythematosus (SLE) [46]. Moreover, the Qi-deficiency, Qi-depression, and Blood-stasis constitutions were all significantly associated with insomnia [47]. In addition, other studies have shown that the Blood-stasis constitution was a risk factor for cognitive dysfunction [48], was strongly associated with the development of peripheral arterial disease in patients with type 2 diabetes mellitus [49], and may also be used as an early predictive diagnostic indicator for the development of coronary artery disease in patients with chest pain [50]. Considering that migraine is a neurological disorder closely related to psychosomatic disorders, we hypothesized that TCMC might also be closely related to migraine. The results of this study showed that the Qi-deficiency score, Blood-stasis score, and Qi-depression score were significantly and positively associated with the risk of migraine, which supported our hypothesis. Meanwhile, the correlation analysis showed that the Qi-deficiency score, Blood-stasis score, and Qi-depression score were also positively correlated with SAS, SDS, and PSQI scores.

Interestingly, the Qi-depression score had the best predictive ability for migraine among the TCMC scores, with an AUC of 0.76 (95% CI: 0.68–0.84). According to the correlation cluster graph (Fig. 5), the Qi-depression score was also the TCMC score most strongly correlated with SAS score (r = 0.56) and SDS score (r = 0.60), which might indirectly explain the close relationship between the Qi-depression score and the risk of developing migraine. From the perspective of TCM, people with a Qi-depression constitution often show low mood, nervousness, fearfulness, or sighing for no apparent reason, which closely resembles the anxiety and depression comorbidities of migraine [7]. People with a Qi-deficiency constitution are mainly characterized by lethargy and tiredness, and this constitution is more common in women; these features closely resemble the sleepiness and fatigue that are among the triggers of migraine attacks [51]. People with a Blood-stasis constitution are often characterized by relative stagnation of blood in a local area of the body, resulting in pain; this stagnation is often caused by abnormal blood flow or abnormal vasoconstriction and vasodilation, which is also similar to the pathogenesis of migraine [4].

To develop ideal prediction models for migraine, this study first developed Model 1, which included PSQI score alone, and Model 2, which included PSQI score and Qi-depression score, with AUCs of 0.83 (95% CI: 0.77–0.89) and 0.86 (95% CI: 0.80–0.91), respectively. On this basis, we further developed Models 3 to 8, which included 3 to 8 indicators, and the results showed that the predictive performance of these models for migraine was further improved, with AUCs ranging from 0.87 (95% CI: 0.82–0.92) to 0.92 (95% CI: 0.89–0.96). Compared with the single best predictor, PSQI score, the predictive accuracy from Model 1 to Model 8 increased by 51.1%, 47.6%, 54.0%, 52.4%, 64.0%, 59.1%, 61.5%, and 72.2%, respectively, which suggested that the models developed in this study were significantly better than PSQI score for predicting migraine. Moreover, 7 of these 8 models included TCMC scores, which indirectly indicated that the inclusion of TCMC scores improved the overall performance of the models. It is worth noting that Model 6 had the best specificity (0.95), indicating the lowest misdiagnosis (false-positive) rate, while Model 8 had the best sensitivity (0.92), indicating the lowest missed diagnosis (false-negative) rate. In clinical practice, different prediction models could be selected for different application scenarios.

In terms of calibration, Model 8, which included eight variables, was less well calibrated than the other models, possibly because the added variables left the sample size relatively small for the logistic regression analysis. Despite this, Model 8 remained a strong predictor of migraine in the validation set, with an AUC of 0.82 (95% CI: 0.74–0.90). Meanwhile, Models 1 through 7 showed both good discrimination and good calibration, and also demonstrated good predictive ability for migraine in the validation set, with AUCs ranging from 0.76 (95% CI: 0.66–0.85) to 0.83 (95% CI: 0.75–0.91). In addition, the predicted outcomes of the eight models in both the training and validation sets were significantly, independently, and positively associated with the actual occurrence of migraine, which further confirmed the excellent predictive performance of the eight prediction models for migraine.

Considering the epidemiological differences in migraine by gender and age [3, 4], the predictive ability of the models for migraine in different genders and ages was worth further exploration. The ROC curves showed that, in the training set, there was no significant difference in the predictive ability of the models between men and women, whereas in the validation set the models tended to have higher predictive ability in men than in women. This may be due to the relatively small number of male subjects in the validation set, which may have amplified the apparent gender difference. Nonetheless, we do not believe that the models are applicable only to females and not to males, because the subgroup analyses showed that the AUCs ranged from 0.80 (95% CI: 0.63–0.97) to 0.95 (95% CI: 0.91–1.00) across the age and gender subgroups in the training set, and from 0.73 (95% CI: 0.64–0.84) to 0.93 (95% CI: 0.82–1.00) in the validation set. All AUCs were greater than 0.7, further supporting that these models are applicable to migraine patients of different ages and genders with good predictive performance.

Compared with the invasive or high-cost biomarkers used in previous studies to predict migraine, the noninvasive prediction models in this study have the advantages of being easier to obtain and less costly. In addition, compared with other diagnostic models based on subjective clinical descriptions, this study used specialized scales to assess more objectively the PSQI, SAS, and SDS scores, which are closely related to migraine, and combined them with the TCMC scores to establish novel prediction models. This makes the prediction results more accurate and better suited to efficiently predicting and screening populations at potential risk of migraine. The models are particularly suitable for early warning of migraine risk in community hospitals and homes with relatively limited medical resources, and for people at high risk of migraine who have sleep and mood disorders. In the future, we hope to develop our noninvasive prediction models into a portable mobile application for early screening and for timely advice and intervention for migraine in Chinese adults.

In summary, this study developed a series of convenient and novel non-invasive prediction models for migraine and conducted external validation and subgroup analyses, confirming the excellent predictive ability of these models for migraine in Chinese adults of different genders and ages. This is of great significance for the early prevention, screening, and diagnosis of migraine. However, this study also had some limitations. Firstly, the participants in the training and validation sets were recruited from a single clinical center; although the enrollment periods differed, there might still be selection bias. In the future, further external validation needs to be conducted by including participants from different regions and clinical application scenarios. Secondly, in the subgroup analyses, the number of male subjects was relatively small; although the minimum sample size required for the statistical analyses was reached, the apparent gender differences may have been amplified by selection bias. Therefore, we hope to further expand the sample size in the future to validate the gender differences in the prediction models for migraine. Thirdly, the main research indicators of this study were obtained through questionnaire scales; although they were all completed under the guidance of professional doctors, some subjectivity may remain. In the future, we hope to incorporate more objective laboratory biochemical markers, imaging markers, and other indicators to construct a series of more diversified migraine prediction models and achieve more precise prediction of migraine.

Conclusions

This study developed and validated a series of convenient and novel non-invasive prediction models for migraine, which have good predictive ability for migraine in Chinese adults of different genders and ages. These models are of great significance for the early prevention, screening, and diagnosis of migraine.