
1 Introduction

In recent years, the EEG-based concealed information test has drawn considerable attention in the field of criminal investigation, and many effective methods have been developed for EEG signal analysis in the Concealed Information Test (CIT) [1]. Compared to traditional polygraphy based on physiological responses, which is easily affected by emotion and stress, polygraphy based on cognitive behavior is considered more reliable and scientific and can reduce the risk of false-positive errors [2]. In addition, EEG is more convenient, less harmful, and more economical than other brain-activity monitoring methods such as PET, MEG, and fMRI [3].

Due to the complexity and particularity of actual criminal-investigation tasks and the low signal-to-noise ratio (SNR) of EEG, improving the recognition performance on raw EEG signals remains an open problem. Among the existing approaches, methods based on machine learning algorithms have achieved the best results. Numerous feature extraction approaches have been adopted in machine learning algorithms, such as time- or periodicity-based methods [4], model-parameter methods [5], and methods based on wavelet decomposition [6], among others [7]. However, the discriminability of a given feature varies across tasks, which may lead to recognition failure. Therefore, feature extraction methods capable of feature self-learning deserve study in this field. Recently, deep learning has made great progress, and the related algorithms have been adopted in various fields, including EEG signal processing [8]. It can be viewed as a computational intelligence method, since its mechanism resembles that of the human brain. To improve the generalization performance of EEG features, a deep belief network (DBN) is adopted here to learn features automatically.

In this paper, we use the CIT technique and focus primarily on extracting features from the different brain waves evoked by relevant and control stimuli. A DBN was applied to self-learn features of the EEG signals, and a support vector machine (SVM) was then used as the classifier. The classification performance is satisfactory and the runtime is acceptable.

2 Methods

2.1 Data Description

Data in this paper were taken from an autobiographical paradigm test [9]. Eleven male volunteers aged 22 to 35 participated. All were right-handed, with normal or corrected-to-normal vision. They were not told the purpose of the test and only knew how to carry it out. Each subject was required to provide five four-digit numbers, one of which was his year of birth; the subjects did not reveal the birth-year number to the experimenter until the experiment ended. Subject 11 took part in 3 runs, while the other subjects each completed 2 runs. Owing to incorrect target-stimulus counting (as described below), one run each from subjects 1, 3, and 7 was discarded, leaving a total of 20 runs. In each run, each of the five numbers was presented 30 times in random order. Each number was displayed for one second, followed by a two-second blank screen. Instead of responding to the items, the subjects were required to count how many times the birth-year number appeared; they were not told that every stimulus was repeated 30 times. EEG signals were digitally sampled at 256 Hz and recorded at the Fz, Cz, and Pz electrode positions of the international 10–20 electrode placement system (Fig. 1), referenced to linked mastoids. Vertical EOG signals were also recorded for blink-artifact detection.

Fig. 1. 10–20 system of electrode placement

2.2 Signal Processing

Owing to the complexity and weak anti-interference capability of EEG, it is not easy to recognize effective information in the raw signals. Figure 2 shows the raw waveforms. The potential offset varies considerably even among samples of the same category, and there is no obvious distinction between samples of different categories.

Fig. 2. Raw EEG waveforms

Figure 3 shows that signal processing mainly comprises data collection, pre-processing, feature extraction, and pattern classification.

Fig. 3. The flowchart of signal processing

(1) Pre-processing

This process consists of electrode selection, signal segmentation, superposition, and filtering. Because of the low SNR of EEG signals, the stimulations are repeated so that responses can be superposed to suppress noise and enhance the useful signal. Since the energy of the P300 component is concentrated at low frequencies, a 6th-order Chebyshev Type I band-pass filter with cut-off frequencies of 0.5 and 35 Hz was applied to each epoch. Finally, the data matrix is normalized to the range from 0 to 1 according to Eq. (1).

$$ {\mathbf{x}}_{norm} = \frac{{\mathbf{x}} - {\mathbf{x}}_{min}}{{\mathbf{x}}_{max} - {\mathbf{x}}_{min}} $$
(1)
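A minimal sketch of this pre-processing step is given below, assuming the epochs are stored as rows of a NumPy array. The filter order, cut-off frequencies, and sampling rate follow the text; the passband ripple, function names, and array layout are assumptions.

```python
import numpy as np
from scipy.signal import cheby1, filtfilt

FS = 256  # sampling rate in Hz, as stated in Sect. 2.1

def preprocess(epochs, ripple_db=0.5):
    """Band-pass filter each epoch and normalize it to [0, 1].

    epochs: array of shape (n_epochs, n_samples). ripple_db is an
    assumed passband ripple; the paper does not specify one.
    """
    # 6th-order Chebyshev Type I band-pass, 0.5-35 Hz
    b, a = cheby1(N=6, rp=ripple_db, Wn=[0.5, 35], btype='bandpass', fs=FS)
    filtered = filtfilt(b, a, epochs, axis=1)  # zero-phase filtering
    # Min-max normalization per epoch, Eq. (1)
    mn = filtered.min(axis=1, keepdims=True)
    mx = filtered.max(axis=1, keepdims=True)
    return (filtered - mn) / (mx - mn)
```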
(2) Deep Feature Extraction

To begin with, the k-means method is adopted for a preliminary feature representation, as described in [11] and sketched below. Taking subject 1 as an example, some differences between the two categories can be seen in Fig. 4 after this initial feature extraction. However, the difference is still too small to distinguish the samples, and further feature extraction is implemented as follows.
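As a rough sketch of this preliminary step (the exact clustering setup of [11] is not reproduced here), k-means centroids can be fitted to the epochs and each epoch re-expressed by its distances to the centroids; the function name and the number of clusters are illustrative assumptions.

```python
from sklearn.cluster import KMeans

def kmeans_features(epochs, n_clusters=8, seed=0):
    """Preliminary feature representation via k-means (sketch after [11]).

    Each epoch (one row) is mapped to its distances to the learned
    centroids. n_clusters is an illustrative choice, not from the paper.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    km.fit(epochs)
    return km.transform(epochs)  # shape: (n_epochs, n_clusters)
```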

Fig. 4. Comparison of mean values of the two categories

A DBN can be viewed as a stack of RBMs (Restricted Boltzmann Machines), whose energy function is motivated by the idea of equilibrium in the statistical physics literature [12]:

$$ E\left( {{\mathbf{v,h}} ;{\varvec{\uptheta}}} \right) = - \sum\limits_{j} {a_{j} v_{j} } - \sum\limits_{i} {b_{i} h_{i} } - \sum\limits_{i,j} {v_{j} h_{i} w_{ij} } $$
(2)

where \( w_{ij} \) is the symmetric interaction term between visible unit \( v_{j} \) and hidden unit \( h_{i} \), and \( a_{j} \) and \( b_{i} \) are the corresponding bias terms. \( {\varvec{\uptheta}} = \left\{ {{\mathbf{w}},{\mathbf{a}},{\mathbf{b}}} \right\} \) denotes the model parameters to be learned.

The model in Eq. (2) can be trained efficiently by contrastive divergence, which approximates the required expectation with a sample obtained from a small number of Gibbs sampling iterations [13].

When defined on a probability space, the joint distribution over \( {\mathbf{v}} \) and \( {\mathbf{h}} \) is:

$$ P\left( {{\mathbf{v}},{\mathbf{h}}} \right) = \frac{1}{z}e^{{ - E\left( {{\mathbf{v}},{\mathbf{h}}} \right)}} $$
(3)

where \( z \) is a normalizing factor (the partition function). Then

$$ P\left( {\mathbf{v}} \right) = \sum\limits_{{\mathbf{h}}} {P\left( {{\mathbf{v}},{\mathbf{h}}} \right)} = \frac{{e^{{ - F\left( {\mathbf{v}} \right)}} }}{z} $$
(4)

in which

$$ F\left( {\mathbf{v}} \right) = - \log \sum\limits_{{\mathbf{h}}} {e^{{ - E\left( {{\mathbf{v}},{\mathbf{h}}} \right)}} } $$
(5)

Model (2) can be simplified by using binary variables, in which case the conditional probabilities take the form:

$$ \begin{aligned} P\left( {h_{i} = 1\left| {\mathbf{v}} \right.} \right) & = sigm\left( {b_{i} + w_{i} {\mathbf{v}}} \right) \\ P\left( {v_{j} = 1\left| {\mathbf{h}} \right.} \right) & = sigm\left( {a_{j} + w_{j}^{'} {\mathbf{h}}} \right) \\ \end{aligned} $$
(6)

Then

$$ F\left( {\mathbf{v}} \right) = - {\mathbf{a^{\prime}v}} - \sum\limits_{i} {\log \left( {1 + e^{{\left( {b_{i} + w_{i} {\mathbf{v}}} \right)}} } \right)} $$
(7)
$$ - \frac{{\partial \log P\left( {\mathbf{v}} \right)}}{\partial \theta } = \frac{{\partial F\left( {\mathbf{v}} \right)}}{\partial \theta } - \sum\limits_{{{\tilde{\mathbf{v}}}}} {P\left( {{\tilde{\mathbf{v}}}} \right)} \frac{{\partial F\left( {{\tilde{\mathbf{v}}}} \right)}}{\partial \theta } $$
(8)

For the RBM to be stable, the energy of the system should be minimized; by the formulas above, this means \( P\left( {\mathbf{v}} \right) \) should be maximized. With contrastive divergence, the sample \( {\tilde{\mathbf{v}}} \) in Eq. (8) is replaced by \( {\mathbf{v}}^{\left( k \right)} \), the visible state after \( k \) Gibbs steps, and the partial derivatives of the loss function \( - \log P\left( {\mathbf{v}} \right) \) become:

$$ \begin{aligned} - \frac{{\partial \log P\left( {\mathbf{v}} \right)}}{{\partial w_{ij} }} & = E_{{\mathbf{v}}} \left[ {P\left( {h_{i} \left| {\mathbf{v}} \right.} \right) \cdot v_{j} } \right] - v_{j}^{\left( k \right)} \cdot sigm\left( {w_{i} \cdot {\mathbf{v}}^{\left( k \right)} + b_{i} } \right) \\ - \frac{{\partial \log P\left( {\mathbf{v}} \right)}}{{\partial b_{i} }} & = E_{{\mathbf{v}}} \left[ {P\left( {h_{i} \left| {\mathbf{v}} \right.} \right)} \right] - sigm\left( {w_{i} \cdot {\mathbf{v}}^{\left( k \right)} + b_{i} } \right) \\ - \frac{{\partial \log P\left( {\mathbf{v}} \right)}}{{\partial a_{j} }} & = E_{{\mathbf{v}}} \left[ {P\left( {v_{j} \left| {\mathbf{h}} \right.} \right)} \right] - v_{j}^{\left( k \right)} \\ \end{aligned} $$
(9)
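A minimal CD-1 update for a binary RBM, following Eqs. (6) and (9) with k = 1 Gibbs step, might look as follows. The learning rate matches the value given in Sect. 3, but the function name, mini-batch interface, and the omission of momentum are assumptions, not the authors' implementation.

```python
import numpy as np

def sigm(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.07):
    """One contrastive-divergence (CD-1) step on a mini-batch v0.

    v0: (n, n_visible); W: (n_hidden, n_visible);
    a: visible bias; b: hidden bias, as in Eq. (2).
    """
    # Positive phase: P(h = 1 | v0), Eq. (6)
    ph0 = sigm(v0 @ W.T + b)
    h0 = (np.random.rand(*ph0.shape) < ph0).astype(v0.dtype)
    # Negative phase: one Gibbs step v0 -> h0 -> v1 -> P(h = 1 | v1)
    pv1 = sigm(h0 @ W + a)
    v1 = (np.random.rand(*pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigm(v1 @ W.T + b)
    # Gradient estimates of Eq. (9), averaged over the batch
    n = v0.shape[0]
    W += lr * (ph0.T @ v0 - ph1.T @ v1) / n
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b
```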

Thus, the parameters \( {\varvec{\uptheta}} \) that maximize \( P\left( {\mathbf{v}} \right) \) are obtained. The DBN can then be trained with the greedy layer-wise method [12]: each RBM is trained greedily and unsupervised [14], and the hidden-unit activations of the first RBM are used as the input of the second RBM (see the sketch below). Finally, the weights are fine-tuned by a back-propagation (BP) neural network. Figure 5 shows the architecture of the DBN model, and Fig. 6 compares the mean values of the two categories; the difference is significant after feature learning by the DBN.
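The greedy layer-wise procedure might be sketched as follows, reusing the hypothetical cd1_update and sigm helpers from the previous sketch; the layer sizes follow Sect. 3, while the number of training epochs and the weight initialization are assumptions, and the final BP fine-tuning is omitted.

```python
def train_dbn(data, layer_sizes=(200, 100, 50), lr=0.07, n_epochs=50):
    """Greedy layer-wise DBN pre-training (sketch).

    layer_sizes follows Sect. 3: 200 visible units, then 100 and 50
    hidden units. Returns the trained RBMs and the top-layer features.
    """
    rbms, x = [], data
    for nv, nh in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = 0.01 * np.random.randn(nh, nv)  # small random init (assumed)
        a, b = np.zeros(nv), np.zeros(nh)
        for _ in range(n_epochs):
            W, a, b = cd1_update(x, W, a, b, lr=lr)
        rbms.append((W, a, b))
        x = sigm(x @ W.T + b)  # activations feed the next RBM
    return rbms, x  # x: 50-dim features; BP fine-tuning omitted
```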

Fig. 5. The architecture of DBN model

Fig. 6. Comparison of mean values of the two categories

(3) Classification

In this paper, the DBN model is viewed as a feature extraction system. The outputs of its last layer, together with the sample labels, were used as new feature vectors to train the SVM classifier, as sketched below.
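A hedged sketch of this stage using scikit-learn's SVC, which wraps libsvm; the kernel and regularization constant are illustrative choices, since the paper states only that libsvm was used.

```python
from sklearn.svm import SVC

def train_classifier(features, labels):
    """Train an SVM on the 50-dimensional DBN features (sketch).

    The RBF kernel and C = 1.0 are assumptions, not from the paper.
    """
    clf = SVC(kernel='rbf', C=1.0)
    clf.fit(features, labels)
    return clf
```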

3 Experiments

Responses to the subject's birth year are expected to contain the P300 component, a late positive component regarded as the most typical and common event-related potential (ERP) closely related to human cognitive processes. Because of the time-locked relation between stimulus and response [15], the signal within 0–700 ms after stimulus onset was used. The weights were randomly initialized, and the tuning parameters were set to: learning rate = 0.07, momentum = 0.95. The first RBM has 200 visible units and 100 hidden units; the second RBM has 100 visible units and 50 hidden units. The resulting fifty-dimensional feature vectors are input to libsvm.
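For concreteness, these settings might be wired together as follows, reusing the hypothetical helpers from the earlier sketches. The stand-in arrays are purely illustrative; note that momentum (0.95 in the text) is omitted from the cd1_update sketch, and the mapping from the 0–700 ms epochs (179 samples at 256 Hz) to 200-dimensional RBM inputs is not detailed in the paper.

```python
import numpy as np

# Illustrative stand-ins; real inputs come from the pipeline above.
inputs = np.random.rand(600, 200)      # 200-dim representation (assumed)
labels = np.random.randint(0, 2, 600)  # probe vs. irrelevant (assumed)

rbms, feats = train_dbn(inputs, layer_sizes=(200, 100, 50), lr=0.07)
clf = train_classifier(feats, labels)  # feats are 50-dimensional
```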

To ensure the reliability of the training and testing results, 10-fold cross-validation was employed: the dataset was divided into ten subsets [16], and in each of the ten iterations one subset was used as the testing set while the remaining nine formed the training set. Notably, data from the test fold were never involved in the optimization procedure. All final figures were obtained by averaging the ten results.
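A sketch of this validation scheme with scikit-learn's KFold; the shuffling, seed, and accuracy metric are assumptions beyond the ten-fold split described above.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

def cross_validate(features, labels, n_splits=10, seed=0):
    """10-fold cross-validation (sketch): average accuracy over folds.

    The test fold never takes part in training, as stated in the text.
    """
    accs = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for tr, te in kf.split(features):
        clf = SVC(kernel='rbf', C=1.0).fit(features[tr], labels[tr])
        accs.append(clf.score(features[te], labels[te]))
    return float(np.mean(accs))
```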

4 Results and Discussion

This section tests the performance of the DBN-SVM classification algorithm on the dataset presented in Sect. 2.1. The results are shown in Table 1 and Fig. 7. Specifically, Table 1 reports the recognition accuracy and runtime for all eleven subjects, while Fig. 7 compares the performance of the SVM classifier under different effective feature extraction methods. All experiments were repeated ten times, and the average results are reported.

Table 1. Performances of the algorithm over all subjects
Fig. 7. Comparison of classification performances over different feature extraction methods

In terms of effectiveness, a high average accuracy is obtained. In addition, as shown in Fig. 7, our approach performs significantly better than the methods based on other features.

Moreover, it is worth noting that the method requires no time-consuming pre-processing operations such as artifact removal or bootstrapping, which makes the approach applicable to actual tasks.

However, complex application environments and unpredictable interference will certainly impose higher requirements on practical crime-information identification tasks. As future work, it would be interesting to investigate globally fine-tuning the weights of the DBN model with respect to the SVM learning rule [13].

5 Conclusion

In this paper, a deep learning strategy is applied to signal processing in the EEG-based concealed information test. The DBN is introduced to better express the characteristics of the different signals, and SVM is chosen as the classifier because it effectively avoids over-fitting. The results show that the method performs well. This study suggests that further development of deep learning and other computational intelligence strategies for EEG-based CIT is worthwhile and can provide reliable support for future practical applications.