1 Introduction

Medical science has long focused on disease while paying little attention to the human body itself. The discipline of Chinese medicine constitution, by contrast, studies the physiological and pathological characteristics of each constitution and analyzes the state and development of disease in light of different constitutions, in order to guide disease prevention and medical treatment. An individual’s constitution comprises morphological structure, physiological function, psychological status and other relatively stable characteristics. The future of medicine will focus on preventive medicine, so classifying individual constitution is an essential step in protecting health. The classification of body constitution not only reflects the physical differences between individuals accurately but also lays a solid foundation for the future standardization of Chinese medicine constitution.

How to identify constitution accurately remains a controversial issue. Since 2009, Wang Qi’s classification of nine body constitutions has been the standard for Chinese medical diagnosis and treatment. The nine body constitutions are Gentleness, Qi-deficiency, Qi-depression, Dampness-heat, Phlegm-dampness, Blood-stasis, Special diathesis, Yang-deficiency and Yin-deficiency [1]. To classify an individual’s constitution, he or she can complete a paper constitution test or undergo a medical examination. These two methods either produce a high error rate or require medical devices, time and manpower. Pulse diagnosis, on the other hand, can identify constitution accurately, but it requires a physician with long accumulated experience. Much research has applied modern science and technology to the classification of Traditional Chinese Medicine (TCM) pulse. Traditional identification methods such as the multilayer perceptron (MLP) and the support vector machine (SVM) ignore the complexity and deep hidden features of TCM pulse. As a result, these methods often have a low accuracy rate on TCM pulse multi-classification.

This paper introduces a deep convolutional neural network (CNN) model that achieves a practically useful multi-classification accuracy rate on TCM pulse. The architecture of the CNN model is designed to extract deep hidden features and to model a small training dataset, which suits TCM pulse well. The pulse dataset collected by the China Academy of Traditional Chinese Medicine (CACMS) is used to evaluate the performance of the CNN model. As far as we know, this dataset contains the largest number of pulse signal samples of different constitutions in the world. The experimental results show that the CNN model achieves a high accuracy rate on the classification of nine constitutions.

The major contributions of this paper are summarized as follows:

  • The method uses CNN to classify 9 body constitution types and achieves an accuracy of 95 %.

  • The CNN model combines a large number of convolutional layers, compound regularization, an advanced activation layer and a highly efficient optimizer in order to adapt to the TCM pulse dataset.

This paper has five sections. The second section will present related work. The third section will describe the details of applying CNN model to body constitution classification. The fourth section will show the experiment result. The final section will draw a conclusion of this paper.

2 Related Work

Classification of TCM pulse has drawn a lot of attention in the past few years. Existing approaches focus primarily on two tasks, namely the classification of TCM pulse conditions and the classification of certain diseases. Traditional methods, such as the support vector machine (SVM) and random forest (RF), have been used to recognize a certain disease. With the rise of neural networks, researchers began to classify TCM pulse conditions using back propagation (BP) neural networks and probabilistic neural networks (PNN).

However, most traditional methods perform poorly on pulse signal multi-classification. The core problem of previous models is that most researchers use the pulse signal from only a single diagnosis point, which is insufficient to cover all the information of pulse diagnosis. Another important cause of low accuracy is that these neural networks cannot support an extremely deep structure, which makes them unable to extract deep hidden features.

3 Body Constitution CNN Model

This section describes the preprocessing of the pulse signal, the major improvements and the overall architecture of the CNN model.

3.1 Signal Preprocessing

Signal Cropping. Since the pulse signal is a weak physiological signal, hand movement or interference from other devices can cause abnormal fluctuations during acquisition. Manual signal cropping is applied to pulse signals that have a distorted waveform.

Signal Smoothing. Signal smoothing retains the original signal while eliminating noise. A 10th-order Butterworth band-pass filter with a pass band of 0.00001 Hz–48 Hz is applied to the signal to remove noise.
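The smoothing step could be sketched as follows, assuming SciPy is available. The helper name `smooth_pulse` and the toy signal are hypothetical. The paper does not say how the filter order is counted: here `order=5` is passed to `scipy.signal.butter`, which yields a 10th-order band-pass filter because the band-pass transform doubles the order. Second-order sections are used because the very low cut-off frequency is numerically fragile in transfer-function form.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def smooth_pulse(x, fs=1000.0, low_hz=1e-5, high_hz=48.0, order=5):
    """Band-pass filter one pulse channel (hypothetical helper).

    order=5 gives a 10th-order band-pass filter in SciPy's convention;
    whether the paper counts the order this way is an assumption.
    """
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, x)

# Toy signal: a 1.2 Hz "pulse" plus 200 Hz interference, 20 s at 1000 Hz.
fs = 1000.0
t = np.arange(0, 20.0, 1.0 / fs)
raw = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.sin(2 * np.pi * 200.0 * t)
smoothed = smooth_pulse(raw, fs=fs)
```

With such a low lower cut-off the filter behaves almost like a 48 Hz low-pass filter over a recording of this length, so the 200 Hz component is strongly attenuated while the 1.2 Hz component passes.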

Detrending. Irregular breathing can introduce an overall trend into the pulse signal. A 10th-order polynomial is fitted to the signal, and the fitted polynomial is then subtracted to remove the trend from the signal.
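A minimal sketch of polynomial detrending with NumPy is given below. The helper name `detrend_poly` and the toy drift are hypothetical; only the polynomial order of 10 comes from the text.

```python
import numpy as np
from numpy.polynomial import Polynomial

def detrend_poly(x, degree=10):
    """Subtract a least-squares polynomial trend from a 1-D signal."""
    n = np.arange(len(x), dtype=float)
    # Polynomial.fit rescales the sample index to [-1, 1] internally,
    # which keeps a 10th-degree fit well conditioned.
    trend = Polynomial.fit(n, x, deg=degree)(n)
    return x - trend

# Toy signal: a 1.2 Hz oscillation riding on a slow drift (20 s at 500 Hz).
t = np.arange(0, 20.0, 1.0 / 500.0)
drift = 0.5 * t - 0.02 * t**2
noisy = np.sin(2 * np.pi * 1.2 * t) + drift
flat = detrend_poly(noisy)
```

Because the fit includes a constant term, the detrended signal has zero mean up to rounding error.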

Decimation. Decimation reduces the sampling rate of the original pulse signal from 1000 Hz to 500 Hz. Given the graphics memory limitation of our experiment, this step improves the learning efficiency of the neural network and provides more flexibility for its architecture.

Segmentation. Each sample’s signal is separated into four parts. The original pulse data contains too many data points, so some of these points may be thrown away during the training process. Separating the signal therefore preserves as many features as possible and at the same time increases recognition accuracy.
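Decimation and segmentation could be sketched together as below, assuming SciPy. The helper name `decimate_and_segment` is hypothetical, random values stand in for a real recording, and the anti-aliasing filter is SciPy's default rather than anything stated in the paper.

```python
import numpy as np
from scipy.signal import decimate

def decimate_and_segment(recording, factor=2, n_segments=4):
    """Downsample a (channels, samples) array and split it along time.

    Returns an array of shape (n_segments, channels, samples_per_segment).
    """
    # decimate applies an anti-aliasing filter before keeping every
    # `factor`-th sample (1000 Hz -> 500 Hz for factor=2).
    reduced = decimate(recording, factor, axis=-1)
    # Drop trailing samples that do not fill a whole segment.
    usable = reduced.shape[-1] - reduced.shape[-1] % n_segments
    return np.stack(np.split(reduced[:, :usable], n_segments, axis=-1))

# Toy recording: 6 channels, 40 s at 1000 Hz (random stand-in data).
rng = np.random.default_rng(0)
recording = rng.standard_normal((6, 40000))
segments = decimate_and_segment(recording)
print(segments.shape)  # (4, 6, 5000)
```

The segment length here follows from the uncropped toy input. The paper reports a final sample length of 2500, which this sketch does not attempt to reproduce.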

Synthetic Sampling. To handle the imbalanced distribution of the dataset, an adaptive oversampling method called adaptive synthetic sampling (ADASYN) is applied [2]. The number of nearest neighbors K, the imbalance ratio threshold and the growth percentage are set to 7, 0.6 and 0.75, respectively.
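The oversampling idea can be illustrated with a small NumPy sketch of the two-class form of adaptive synthetic sampling. This is not the authors' implementation: the function name `adasyn_binary`, the mapping of the three reported settings onto `k`, `d_th` and `beta`, and the choice to interpolate only towards minority neighbors are assumptions. How the method was extended to nine classes is not stated in the text.

```python
import numpy as np

def adasyn_binary(X_min, X_maj, k=7, d_th=0.6, beta=0.75, seed=0):
    """Return synthetic minority samples (two-class sketch)."""
    rng = np.random.default_rng(seed)
    m_s, m_l = len(X_min), len(X_maj)
    if m_s / m_l >= d_th:  # imbalance is tolerated, nothing to add
        return np.empty((0, X_min.shape[1]))
    G = int(round((m_l - m_s) * beta))  # total number of samples to synthesize

    X_all = np.vstack([X_min, X_maj])
    is_major = np.r_[np.zeros(m_s, bool), np.ones(m_l, bool)]

    # Distances from every minority point to every point in the dataset.
    d_all = np.linalg.norm(X_min[:, None, :] - X_all[None, :, :], axis=2)
    d_all[np.arange(m_s), np.arange(m_s)] = np.inf  # exclude the point itself
    nn_all = np.argsort(d_all, axis=1)[:, :k]

    # Minority points surrounded by majority points receive more samples.
    r = is_major[nn_all].mean(axis=1)
    r_hat = r / r.sum() if r.sum() > 0 else np.full(m_s, 1.0 / m_s)
    g = np.rint(r_hat * G).astype(int)

    # Interpolate towards one of the nearest minority neighbours.
    nn_min = np.argsort(d_all[:, :m_s], axis=1)[:, :min(k, m_s - 1)]
    synthetic = []
    for i in range(m_s):
        for _ in range(g[i]):
            j = rng.choice(nn_min[i])
            lam = rng.random()
            synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic).reshape(-1, X_min.shape[1])

# Toy data: 100 majority and 20 minority points in two dimensions.
rng = np.random.default_rng(1)
X_maj = rng.normal(0.0, 1.0, size=(100, 2))
X_min = rng.normal(1.5, 1.0, size=(20, 2))
X_new = adasyn_binary(X_min, X_maj)
```

Each synthetic point lies on the line segment between two existing minority points, so the new samples stay inside the region already occupied by the minority class.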

3.2 Leaky Rectified Linear Unit

The traditional way to train a neural network is to use a saturating activation function, such as the tanh or sigmoid function. Non-saturating activation functions such as ReLU are far superior to these saturating functions in terms of addressing the vanishing gradient problem and improving convergence efficiency. Following the neural network acoustic model of Maas et al., the CNN model trains neurons with the leaky rectified linear unit (LReL) [3]. LReL gives saturated, inactive units a very small but non-zero gradient. LReL has been reported to give better performance and faster learning in deep neural networks than tanh and ReLU [3].
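The leaky rectifier and its gradient can be written in a few lines of NumPy. The slope `alpha=0.01` for negative inputs is an assumed value, since the text does not report the one used in the model.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky rectifier: x for x > 0, alpha * x otherwise.

    alpha=0.01 is an assumed slope; the paper does not report its value.
    """
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    """Derivative of the leaky rectifier (alpha instead of 0 for x <= 0)."""
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, 0.0, 3.0])
y = leaky_relu(x)
dy = leaky_relu_grad(x)
```

The gradient for inactive units is `alpha` rather than zero, which is what keeps those units trainable.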

3.3 Initialization

Initialization determines the probability distribution from which the initial weights are drawn. The model uses uniform initialization scaled by fan-in, known as He weight initialization [4]. This method effectively removes a bottleneck in training extremely deep neural networks. When initialized with a fixed standard deviation, CNN models that have more than 8 convolutional layers often have difficulty converging. He initialization therefore draws the weights with a standard deviation of

$$\sigma = \mathrm{gain} \sqrt{\frac{1}{\mathrm{fan}_{in}}} \tag{1}$$

This derivation takes the nonlinearity of the rectified linear unit into account. He initialization allows the weights to remain well scaled through many layers in extremely deep rectifier network models.
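Equation (1) can be turned into a uniform sampler as sketched below. A uniform distribution on [-a, a] has standard deviation a/√3, so the limit is a = √3·σ. The value gain = √2 is the usual choice for rectifiers and is an assumption here, as is the helper name `he_uniform`.

```python
import numpy as np

def he_uniform(fan_in, fan_out, gain=np.sqrt(2.0), seed=0):
    """Draw a (fan_in, fan_out) weight matrix with std = gain * sqrt(1 / fan_in)."""
    sigma = gain * np.sqrt(1.0 / fan_in)
    limit = np.sqrt(3.0) * sigma  # U(-a, a) has standard deviation a / sqrt(3)
    rng = np.random.default_rng(seed)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = he_uniform(fan_in=100, fan_out=2000)
print(round(float(W.std()), 2))  # 0.14
```

For fan_in = 100 the target standard deviation is √(2/100) ≈ 0.141, which the empirical value matches closely.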

3.4 Optimization

A stochastic optimization method, Adam, is applied to the CNN model to update the network parameters and optimize the objective function. Adam is well suited to neural networks that have a large number of parameters [5]. The method combines the strengths of both AdaGrad and RMSProp. The CNN model uses \(\beta _1 = 0.9, \beta _2 = 0.999, \varepsilon = 10^{-8}\) as the optimizer’s parameters.
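A single Adam update with the stated values of β1, β2 and ε can be sketched as follows. The learning rate is not given in the text, so `lr` is an assumed parameter, and the quadratic objective is only a toy example.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = (theta - 3)^2 starting from 0.
theta = np.array([0.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 3001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
```

The running average of squared gradients plays the role of RMSProp's scaling, while the bias-corrected first moment adds momentum.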

3.5 Overfitting Prevention

The CNN model is prone to overfitting because its complex structure and the numerous filters in each convolutional layer give it a large number of parameters. This is a major shortcoming that the model must overcome. To prevent overfitting, two techniques are implemented, namely dropout and regularization.

Dropout. Dropout is the most direct way to prevent overfitting. The idea of dropout is to drop a given fraction of units at random at each training step, which prevents units from co-adapting [6]. This technique enhances the robustness of each unit by forcing it to work with other randomly chosen units and thus to learn useful features on its own. The experiment shows strong overfitting when the model does not use dropout.
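The mechanism can be illustrated with a short NumPy sketch. Rescaling the surviving units by 1/(1 - rate), often called inverted dropout, is an assumption about the implementation rather than something stated in the text, and the helper name `dropout` is hypothetical.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return activations  # dropout is disabled at inference time
    keep = rng.random(activations.shape) >= rate
    # Rescaling keeps the expected activation unchanged.
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones((1000, 100))
dropped = dropout(a, rate=0.5, rng=rng)
```

A fresh mask is drawn on every call, so each training step sees a different thinned network.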

Regularization. Regularization adds penalties on the parameters or activities of a neural network layer to reduce overfitting of the regression coefficients. A weight regularization penalty, known as Ridge (L2), and L2 activity regularization are applied in the fully connected layer. Ridge regularization shrinks the estimated regression coefficients towards zero in order to prevent overfitting caused by high dimensionality [7]. The penalty parameter is set to 0.01.
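The two penalty terms can be sketched as below. Using the same coefficient of 0.01 for both terms and summing (rather than averaging) the squares are assumptions, and the helper name `l2_penalty` is hypothetical.

```python
import numpy as np

def l2_penalty(weights, activations, lam=0.01):
    """Ridge penalty on the weights plus an L2 penalty on the layer output."""
    weight_term = lam * np.sum(weights ** 2)
    activity_term = lam * np.sum(activations ** 2)
    return weight_term + activity_term

W = np.array([[1.0, -2.0], [0.5, 0.0]])
a = np.array([0.3, -0.4])
penalty = l2_penalty(W, a)  # 0.01 * 5.25 + 0.01 * 0.25
```

This value is added to the classification loss, so large weights and large activations both increase the objective.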

Fig. 1. The visualization of the CNN model

3.6 Overall Architecture

The basic architecture of the CNN model is presented in Fig. 1. The input dimension of the 1st convolutional layer is \(6 \times 1 \times 2500\). The 1st, 2nd and 3rd convolutional layers convolve the input from the previous layer with 10 convolutional kernels of size \(1 \times 10\). The 4th to 9th convolutional layers follow this structure with kernels of identical size and are followed by max-pooling, which subsamples the output and further reduces its size by a factor of 2. The numbers of kernels in the 2nd, 3rd, 4th, 5th and 6th convolutional layers are 20, 40, 80, 160 and 320, respectively. The 7th, 8th and 9th convolutional layers all have the same number of convolutional kernels, 640. The final feature map has a size of \(6 \times 1 \times 6\). A small final feature map helps the model to see and learn the whole sample. Dropout with a probability of 0.25 is applied to the output of the 3rd, 4th and 5th convolutional layers, and dropout with a probability of 0.5 is applied to the 6th, 7th, 8th and 9th convolutional layers and the fully connected layer. The final layer of the CNN model is a 9-way softmax, which classifies the output into 9 class labels.

4 Experiment

This section introduces the dataset, experiment environment and the overall performance of the CNN model.

4.1 The Dataset

This section describes the data collection process and the details of the dataset.

Data Acquisition. The pulse signal acquisition system records the pulse signal simultaneously from 6 pulse locations on the participant’s hands, namely the Cun, Guan and Chi of the left hand and the Cun, Guan and Chi of the right hand. Traditional Chinese medicine defines Cun, Guan and Chi as pulse diagnosis locations that reflect the progress of a disease and indicate an individual’s health condition. The sampling rate is 1000 Hz. Each acquisition takes 40 s.

Constitution Classification. TCM researchers record each participant’s blood biochemical measurements, symptoms, pulse diagnosis result and body constitution scale sheet result. They then analyze the overall scale result to identify each participant’s constitutional type.

Details of Dataset. The pulse dataset contains the pulse signals of 1661 participants in total, unevenly distributed over nine constitutional types. The numbers of gentleness, dampness-heat, qi-depression, qi-deficiency, yang-deficiency, yin-deficiency, blood-stasis, special diathesis and phlegm-dampness constitutions in the dataset are 867, 79, 83, 205, 234, 76, 43, 33 and 43, respectively. Each pulse signal is sampled at 1000 Hz for 40 s, which produces a sequence of 40000 data points. Signals are acquired at 6 pulse locations per person, so the dimension of one sample is \(6 \times 40000\). After signal preprocessing, the total number of samples is 12046, and the length of each sample is 2500. All samples are shuffled before being input to the CNN model. 80 % of the samples are randomly selected as the training set, while the remaining 20 % are used as the validation set.
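The shuffle and the 80/20 split could be sketched with NumPy as follows. The helper name `shuffle_split`, the seed and the rounding of the training set size are assumptions, since the text only gives the proportions.

```python
import numpy as np

def shuffle_split(n_samples, train_fraction=0.8, seed=0):
    """Return shuffled index arrays for the training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(train_fraction * n_samples)  # rounded down (assumption)
    return order[:n_train], order[n_train:]

train_idx, val_idx = shuffle_split(12046)
print(len(train_idx), len(val_idx))  # 9636 2410
```

Indexing the sample array and the label array with the same index arrays keeps signals and labels aligned after shuffling.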

4.2 Experiment Settings

The experiment is built on Keras and Scikit-learn [8]. We evaluate the classification performance on a GTX TITAN with 12 GB of memory.

4.3 Result

We perform the classification of nine constitutional types with seven different classifiers to compare the effects of various methods. The results of each method are shown in Table 1. Accuracy is defined as the number of correctly identified samples divided by the total number of test samples. The task is to classify pulse signals into 9 constitutional types. The SVM and RF models achieve relatively close results, 54.66 % and 54.34 %. The initial CNN model achieves a higher accuracy rate of 62.49 %. With dropout, the accuracy rate increases significantly, by 29.09 percentage points. Applying L2 regularization slightly increases the rate to 92.31 %. With the ADASYN method, the rate decreases by 3.24 percentage points. The final CNN model with He initialization produces the best performance, 95.33 %.

In addition to the global accuracy, we perform classification testing on each individual constitutional type by randomly selecting 100 samples from each type. According to Table 2, the Gentleness (0), Qi-deficiency (1), Qi-depression (2), Dampness-heat (3), Phlegm-dampness (4), Blood-stasis (5), Special diathesis (6), Yang-deficiency (7) and Yin-deficiency (8) types achieve accuracy rates of 100 %, 94 %, 94 %, 97 %, 97 %, 96 %, 95 %, 91 % and 95 %, respectively.
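The global accuracy and the per-type accuracy described above can be computed as in the following sketch. The function names and the toy labels are hypothetical and do not reproduce any value from the tables.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Correctly identified samples divided by the total number of samples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def per_class_accuracy(y_true, y_pred, n_classes=9):
    """Accuracy restricted to the samples of each true class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return {c: float(np.mean(y_pred[y_true == c] == c))
            for c in range(n_classes) if np.any(y_true == c)}

# Toy labels using the class indices 0-8 from the text.
y_true = [0, 0, 0, 1, 1, 7, 7, 7, 7, 8]
y_pred = [0, 0, 0, 1, 0, 7, 7, 1, 7, 8]
overall = accuracy(y_true, y_pred)             # 8 of 10 correct
by_class = per_class_accuracy(y_true, y_pred)
```

Classes that do not appear in `y_true` are omitted from the result rather than reported as zero.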

To verify that the CNN model can be applied to a wide range of pulse diagnosis tasks, we perform classification by gender, age and acquisition time on 748 gentleness-type participants. The ratio of male to female participants is 1.6 to 1, while the numbers in the three age groups (14–44, 45–59 and 60+) are 647, 77 and 24. For acquisition time, the numbers corresponding to spring, summer, autumn and winter are 230, 117, 181 and 220. According to Table 3, the classifications by gender, age and acquisition time achieve accuracies of 94.42 %, 99.38 % and 95.76 %, respectively.

Table 1. Comparison of results on the pulse dataset
Table 2. Prediction for each body constitution
Table 3. Results of different classification tasks

4.4 Discussion

Compared with the other classifiers, the CNN model demonstrates its superiority on very complex, multidimensional pulse input with limited samples. However, the CNN model requires a much longer training time, approximately 6 h, because of the use of dropout.

5 Conclusion

The convolutional neural network avoids the shortcomings of traditional recognition methods and improves the multi-classification accuracy rate on compound pulse signals. The experiment shows that the CNN model is capable of achieving a high accuracy rate on an extremely complex pulse dataset. Ultimately, we want to apply the CNN model to the multi-classification of diseases to enhance public health and support clinical treatment.