1 Introduction

Obesity has reached alarming levels among adults, adolescents, and children. Excess weight and obesity, along with a sedentary lifestyle and a family history of cardiovascular disease [6], contribute to a significant prevalence of metabolic disorders such as Metabolic Syndrome (MS) [23, 25], Insulin Resistance (IR) [33, 34], atherosclerosis [41], and impaired glucose tolerance [14, 32]. These conditions significantly increase the risk of developing type 2 diabetes and cardiovascular disease [24]. The high prevalence of cardiovascular diseases (CVD) and diabetes poses a significant public health concern, as they are the primary causes of disability and mortality in many countries worldwide.

The excessive accumulation of adipose tissue characterizes obesity [21]. However, in epidemiological and clinical contexts, assessing fat quantity and distribution relies on simple anthropometric measurements due to the challenges posed by direct adiposity measurement [20]. The Body Mass Index (BMI) [3] is a prevalent measure for estimating total fat quantity. Moreover, indicators such as Waist Circumference (WC), Waist-to-Hip Ratio (WHR), and the more recently introduced Waist-to-Height Ratio (WHtR) offer valuable insights into the distribution of visceral, central, or abdominal fat [10]. These measurements play a crucial role in understanding and assessing obesity-related health risks.

While WC is considered a valuable indicator for CVD, IR, and MS, its usefulness is limited due to variations in diagnostic cut-off points based on ethnic and racial backgrounds [22]. The literature suggests that a better predictor in this regard is the WHtR, which is a universal index with gender-specific variations. A WHtR value of 0.50 or higher is associated with cardiometabolic risk in individuals aged 18 and above [8]. Young adults with a normal BMI and a WHtR above 0.50 exhibit higher IR, plasma insulin concentration, and triglyceride levels, and lower HDL cholesterol levels, than those with a WHtR below 0.50 [17]. Furthermore, research evaluating WHtR as a predictor of coronary heart disease has shown a higher prevalence of this disease among individuals with a WHtR of 0.50 or higher (indicating abdominal obesity) [4].

Various research studies demonstrate the potential of integrating machine learning into medical tasks, including disease diagnostics and personalized treatment provision. Incorporating machine learning techniques into clinical information processing offers several advantages. Firstly, it enables the analysis of datasets with high dimensionality. Additionally, it allows for analyzing information in diverse formats, such as images [9, 15] and electrical data [19]. Moreover, machine learning algorithms can identify intricate patterns and relationships within the data [26].

The application of artificial neural networks (ANNs) in the diagnosis of obesity [30], MS [37], and metabolic diseases has shown promising results in the field of medical research. ANNs, known for their capacity to learn and recognize complex patterns, have been used to analyze large datasets of various metabolic parameters, including BMI [28], WC [38], blood glucose levels [27], and lipid profiles [11]. Once trained on relevant data, these networks can identify patterns and relationships that contribute to diagnosing obesity, MS, and related metabolic disorders. The integration of neural networks in this diagnostic process has the potential to enhance accuracy and efficiency, ultimately leading to improved patient outcomes and personalized treatment strategies.

Hence, the primary objective of this research is to use anthropometric parameters and an ANN classifier to identify individuals with impaired WHtR. To this end, a database comprising 1978 subjects and 26 different anthropometric variables was employed. Section 2 provides a detailed account of the methodology used. Section 3 presents the main findings obtained from the analysis. Sections 4 and 5 discuss these findings and present the concluding remarks, respectively. By applying the ANN technique to this extensive dataset, this study aims to contribute to understanding and identifying individuals with impaired WHtR, thereby aiding in the diagnosis and management of related health conditions.

2 Methodology

2.1 Database

The dataset used in this study comprises a total of 1978 individuals, 678 of whom are male and the remainder female. The Nutritional Evaluation Laboratory of Simón Bolívar University collected this dataset between 2004 and 2012 [12]. The data collection protocol included 28 anthropometric measurements, covering parameters such as height, weight, body circumferences, and body folds. These measurements were carefully recorded as part of the comprehensive assessment carried out during the data collection period. With its wide range of anthropometric variables, this dataset provides a robust foundation for analyzing and investigating various aspects of body composition and nutritional evaluation. The WHtR was calculated using Eq. (1).

$$\begin{aligned} WHtR=\frac{Waist}{Height} \end{aligned}$$
(1)

where Waist is the waist circumference (measured in centimeters) and Height is the subject’s height (measured in centimeters) [5, 7]. In this research, the inclusion criterion for impaired WHtR was based on [4, 16], where a WHtR above 0.5 is considered indicative of an increased risk of health issues such as obesity, cardiovascular diseases, and metabolic disorders.
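Eq. (1) and the 0.5 cut-off can be illustrated with a short Python sketch. This is not the authors' code: the function names `whtr` and `is_impaired` are hypothetical, and the strict inequality follows the "above 0.5" wording of this section (some of the cited literature uses "0.50 or higher" instead).

```python
def whtr(waist_cm: float, height_cm: float) -> float:
    """Waist-to-height ratio as in Eq. (1); both inputs in centimeters."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    return waist_cm / height_cm


def is_impaired(waist_cm: float, height_cm: float, cutoff: float = 0.5) -> bool:
    """Label a subject as impaired when WHtR is strictly above the cutoff.

    The strict '>' is an assumption taken from the 'above 0.5' wording.
    """
    return whtr(waist_cm, height_cm) > cutoff
```

For instance, a subject with a 90 cm waist and a height of 170 cm has a WHtR of about 0.53 and would be labeled as impaired under this rule.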

All methodologies employed in this research adhered to the ethical guidelines set forth by the Bioethical Committee of Simón Bolívar University, following the principles outlined in the 1964 Helsinki Declaration and its subsequent revisions or any equivalent ethical standards. Before participating in the study, all subjects provided their informed consent by signing the necessary documentation. This ensured that the participants were fully aware of the nature of the study, its objectives, and any potential risks or benefits associated with their involvement. By upholding these ethical standards and obtaining informed consent, the study aimed to protect the rights, privacy, and well-being of the individuals involved.

2.2 Classifier Assessment Metrics

To evaluate the ANN classifiers, the true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) were counted [35]. The accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and F1 score (F1) were calculated using Eqs. (2), (3), (4), (5), (6), and (7), respectively.

$$\begin{aligned} ACC= \frac{(TP+TN)}{(TP+FP+TN+FN)} \end{aligned}$$
(2)
$$\begin{aligned} SEN= \frac{TP}{(TP+FN)} \end{aligned}$$
(3)
$$\begin{aligned} SPE= \frac{TN}{(TN+FP)} \end{aligned}$$
(4)
$$\begin{aligned} PPV= \frac{TP}{(FP+TP)} \end{aligned}$$
(5)
$$\begin{aligned} NPV= \frac{TN}{(FN+TN)} \end{aligned}$$
(6)
$$\begin{aligned} F1=2~\frac{(PPV)~(SEN)}{(PPV+SEN)} \end{aligned}$$
(7)
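As an illustration, Eqs. (2) to (7) can be computed directly from the four confusion-matrix counts. The sketch below is a minimal example rather than the authors' implementation; the function name `classifier_metrics` is hypothetical, and it assumes every denominator is nonzero.

```python
from typing import Dict


def classifier_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute Eqs. (2)-(7) from confusion-matrix counts.

    Assumes all denominators are nonzero (each class is present and
    the classifier predicts each class at least once).
    """
    acc = (tp + tn) / (tp + fp + tn + fn)  # Eq. (2)
    sen = tp / (tp + fn)                   # Eq. (3)
    spe = tn / (tn + fp)                   # Eq. (4)
    ppv = tp / (fp + tp)                   # Eq. (5)
    npv = tn / (fn + tn)                   # Eq. (6)
    f1 = 2 * ppv * sen / (ppv + sen)       # Eq. (7)
    return {"ACC": acc, "SEN": sen, "SPE": spe,
            "PPV": ppv, "NPV": npv, "F1": f1}
```

A call such as `classifier_metrics(tp=80, fp=20, tn=85, fn=15)` uses made-up counts chosen only to show the call signature; they are not results from this study.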

2.3 ANN Implemented

Artificial neural networks (ANNs) are computational frameworks that draw inspiration from the intricate structure and functional mechanisms of the human brain. Comprising interconnected nodes, often referred to as “neurons”, these networks possess the ability to process and transmit information. Engineers specifically design ANNs to acquire knowledge and generate predictions by discerning intricate patterns within complex datasets. This learning process, commonly known as training, enables ANNs to uncover underlying relationships and make accurate predictions based on the acquired knowledge [39].

ANNs can be trained to detect and classify input patterns by analyzing their inherent features and attributes [2]. ANNs find utility in various applications, including image recognition, speech recognition, and data analysis. They consist of multiple interconnected layers of nodes, or neurons, which process and evaluate the input data [2]. During the training phase, the network adjusts the weights and biases of its connections to optimize its capability for pattern recognition and classification. Once trained, the network can effectively classify new and unseen patterns based on the knowledge it acquired during training [42].

ANNs have emerged as highly effective instruments in diverse domains, encompassing computer vision, natural language processing, and bioinformatics. They facilitate automated and efficient examination of intricate data, enabling tasks like object recognition, handwriting recognition, and disease diagnosis [13].

Monte Carlo Cross Validation (MCCV). The MCCV, widely employed in machine learning, evaluates model performance. It entails conducting the cross-validation process multiple times using distinct data splits. This methodology effectively reduces performance estimate fluctuations and offers a more dependable evaluation of the model’s generalization. By averaging the outcomes across numerous iterations, MCCV provides a reliable measure of the model’s anticipated performance on unseen data [1]. This method is notably effective when dealing with limited data or in cases of significant variability in performance estimates.

Characteristics of ANN Implemented. In this study, a feedforward neural network was employed to classify individuals with impaired WHtR, together with a training function that updates weight and bias values using the scaled conjugate gradient technique. This approach allows the network to iteratively adjust its weights and biases, facilitating effective navigation through complex and multidimensional input spaces. This adaptive capability enhances the network’s flexibility and promotes robust learning from the available data. To obtain a more reliable estimate of classification performance, a Monte Carlo cross-validation (MCCV) technique was implemented [36]. Figure 1 depicts the research methodology followed in this study.

The feedforward neural network classifier was trained and tested by first randomly dividing the dataset into two groups, one for training and the other for testing. This random partitioning was repeated 100 times to ensure the reliability and robustness of the results, and the performance metrics were calculated in each iteration. The partitions evaluated were 90% training and 10% testing, 80% training and 20% testing, 70% training and 30% testing, 60% training and 40% testing, and 50% training and 50% testing. The random partitioning allows a comprehensive evaluation of the ANN’s performance under various training and testing scenarios.
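The repeated random partitioning described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the names `mccv_splits` and `mccv_evaluate`, the `score_split` callback (which stands in for training the ANN on one split and returning a metric), the rounding of the training-set size, and the fixed seed are all assumptions introduced here.

```python
import random
import statistics
from typing import Callable, Iterator, List, Tuple


def mccv_splits(n_samples: int, train_fraction: float,
                n_repeats: int = 100, seed: int = 0
                ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each random partition."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    n_train = round(n_samples * train_fraction)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh random permutation per repetition
        yield indices[:n_train], indices[n_train:]


def mccv_evaluate(n_samples: int, train_fraction: float,
                  score_split: Callable[[List[int], List[int]], float],
                  n_repeats: int = 100, seed: int = 0
                  ) -> Tuple[float, float]:
    """Return (mean, standard deviation) of a metric over all partitions.

    score_split is a hypothetical user-supplied function that trains a
    model on the training indices and returns one metric value measured
    on the test indices.
    """
    scores = [score_split(train, test)
              for train, test in mccv_splits(n_samples, train_fraction,
                                             n_repeats, seed)]
    return statistics.mean(scores), statistics.stdev(scores)
```

Under these assumptions, `mccv_splits(1978, 0.9)` produces 100 partitions with 1780 training and 198 testing subjects each, and the mean and standard deviation returned by `mccv_evaluate` correspond to the "mean ± STD" summary used for the reported metrics.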

Furthermore, the procedure was repeated for the feedforward neural networks with different numbers of hidden layers, ranging from 10 to 100 layers. This extensive analysis enabled a thorough examination of the ANN’s performance across different configurations.

Fig. 1. General methodology schematic for the ANN classification.

2.4 Statistical Tests

To compare the metrics from each experiment statistically, the Mann-Whitney U test was used. This test was selected because it applies to unpaired samples and does not require the data to be normally distributed. Statistical significance was defined as a p-value below 0.05 [18]. The results in Tables 1, 2, 3, 4, and 5 are presented as mean and standard deviation values (mean ± STD).
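For readers unfamiliar with the test, the sketch below shows one way to compute a two-sided Mann-Whitney U test with the Python standard library. It is an illustrative approximation rather than the procedure used in this study: the function name `mann_whitney_u` is hypothetical, the p-value comes from the normal approximation with a tie correction and no continuity correction, and statistical packages may therefore report slightly different values.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test for two unpaired samples.

    Returns (U, p) where U is the smaller of the two U statistics and p
    is based on the normal approximation (tie-corrected, no continuity
    correction). Intended for moderately large samples.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, min(1.0, p)
```

Here `x` and `y` would be, for example, the 100 accuracy values obtained under two different training and testing ratios; a returned p-value below 0.05 would indicate a statistically significant difference under the criterion stated above.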

3 Results

Tables 1, 2, 3, 4, and 5 present the area under the ROC curve, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score obtained from applying ANNs to classify subjects with impaired WHtR. These results correspond to the different training and testing percentages used in the experiments and to the best-performing number of hidden layers in the ANN for each case.

Table 1. Monte Carlo cross-validation test results for 90% training and 10% test and 70 hidden layers.
Table 2. Monte Carlo cross-validation test results for 80% training and 20% test and 70 hidden layers.
Table 3. Monte Carlo cross-validation test results for 70% training and 30% test and 70 hidden layers.
Table 4. Monte Carlo cross-validation test results for 60% training and 40% test and 50 hidden layers.
Table 5. Monte Carlo cross-validation test results for 50% training and 50% test and 30 hidden layers.

4 Discussion

This research used ANNs to classify impaired WHtR based on anthropometric parameters. To achieve this classification, several tests were conducted using Monte Carlo cross-validation (MCCV) with different ratios for training and testing. Furthermore, the ANN architecture was modified by varying the number of hidden layers. The performance of the model was evaluated using various metrics, including accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score.

The analysis of accuracy revealed that the model’s predictive performance improved marginally as the percentage of training data increased. Nevertheless, no statistically significant differences were observed among the experiments. This implies that the model’s performance remained stable across the various training ratios. In general, accuracy measures a classification model’s ability to correctly predict the class labels of the data. In this study, the model achieved a classification accuracy surpassing 82.4% [29].

Concerning sensitivity, our experiments consistently showed that a split of 90% for training and 10% for testing resulted in values exceeding 79.9%, indicating the model’s effective detection of positive cases. The consistent sensitivity observed across all experiments suggests that our model is a good classifier for identifying individuals with impaired WHtR [29].

The measure of specificity refers to the accurate identification of the true negative cases by the model [29]. Our research findings demonstrate a slight enhancement in specificity as the training dataset expands. The model showcases excellent performance in correctly categorizing negative cases, as evidenced by its impressive score exceeding 85%.

As the amount of training data increases, the PPV, also known as precision, improves slightly. PPV reflects the proportion of predicted positive instances that are truly positive. A PPV exceeding 80.8% means that fewer than 19.2% of the model’s positive predictions are false positives [umberger2017understanding]. False positives may lead to unnecessary medical treatments for subjects with normal WHtR.

Similarly, NPV represents the proportion of predicted negative instances that are truly negative. The results indicate a slight increase in NPV as the training data expands. With an NPV exceeding 83.8%, the model effectively identifies negative instances, with fewer than 16.2% of its negative predictions being false negatives [31]. False negatives can result in delayed or missed diagnoses, preventing patients from receiving timely treatment and care, which can contribute to the development of obesity and MS, among other conditions.
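The complementary error rates quoted for PPV and NPV follow directly from Eqs. (5) and (6). As a brief worked step, using the labels FDR (false discovery rate) and FOR (false omission rate), which are standard terms introduced here rather than notation from the original text:

$$\begin{aligned} FDR = 1 - PPV = \frac{FP}{(FP+TP)}, \qquad FOR = 1 - NPV = \frac{FN}{(FN+TN)} \end{aligned}$$

With PPV above 0.808 and NPV above 0.838, this gives FDR below 0.192 and FOR below 0.162.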

The F1 score is a metric that assesses a classification model’s performance by considering both precision and sensitivity. It provides an overall evaluation of how well the model achieves a balance between accurately identifying positive instances (precision) and capturing all relevant positive instances (sensitivity) [40]. In this particular study, obtaining an F1 score above 0.794 suggests that the model exhibits a favorable balance between precision and sensitivity when classifying impaired WHtR subjects. This indicates that the model can accurately identify positive instances while also capturing a significant proportion of the actual positive instances.

It is important to highlight that the standard deviations associated with accuracy, specificity, positive predictive values, and F1 score values are relatively low. This indicates that the model’s performance remains consistent across multiple runs, regardless of different training-test splits of the database. The low standard deviations suggest that the results are reliable and not significantly affected by random variations. In other words, the model’s performance can be considered stable and not heavily influenced by chance fluctuations in the data.

5 Conclusions

In conclusion, this research utilized ANNs to classify impaired WHtR based on anthropometric parameters. Multiple tests were conducted using MCCV with different training and testing ratios, and the ANN architecture was modified by varying the number of hidden layers. The model exhibited an impressive level of classification accuracy, surpassing 82.4%. Sensitivity values consistently exceeded 79.9%, indicating the model’s effective detection of positive cases. The model also demonstrated excellent specificity, with a score exceeding 85%. Positive and negative predictive values showed slight improvements as the training data expanded. The F1 score, which considers both precision and sensitivity, was above 0.794, indicating a favorable balance in classifying impaired WHtR subjects. The model’s performance remained consistent across different training-test splits, suggesting stability and reliability.

In future work, we will apply this methodology to different demographic groups, cultural contexts, or environmental conditions. By doing so, we can gain a deeper understanding of the model’s generalizability and identify any potential biases or limitations that may arise in specific contexts.