1 Introduction

Neuromorphic computing has the potential to lead the next generation of computing because of its high computing efficiency [1]. The neuromorphic computing model draws on the computation mechanism of the human brain, which performs the critical functions of perception and cognition [2, 3]. Since about 80% of information about the external environment is obtained through the visual neural mechanism, and about 90% through the combined audiovisual neural mechanisms, a neuromorphic computing model of audiovisual evocations is essential to modeling the human perception and cognition mechanism. Meanwhile, the audiovisual neuromorphic computing model is equally significant for patrol robots that use the human perception and cognition model [4,5,6].

To model the perception and cognition mechanism using human electroencephalograph (EEG) signals of audiovisual evocations, a suitable filtering algorithm is first needed to eliminate noise from the raw EEG signals [7,8,9]. Researchers have used different filtering methods to remove noise, such as the filter bank method, which decomposes the EEG signals into sub-bands with overlapping frequency cutoffs, and the coincidence filtering method, which removes different types of noise [10, 11]. In addition, a denoising method combining empirical mode decomposition (EMD) with principal component analysis has been proposed for seizure EEG processing [12]. However, EEG signals of different evocation patterns have particular characteristics, and different filtering methods are required [13,14,15].

This paper proposes an improved convolutional neural network (CNN) parallel computation strategy for the convolutional layer, together with brain source estimation. Results and comparisons are provided in the conclusions.

2 Related Works

In this section, we introduce the existing studies related to our approach and describe the differences.

For neural computing, researchers have performed a simple fusion of EEG and neurocomputing [16,17,18]. After filtering, neuromorphic feature extraction is required. Commonly used features include the one-dimensional time-frequency feature of a single-channel EEG [19], the two-dimensional EEG energy feature of multiple channels [20,21,22], and the three-dimensional EEG connection feature of the brain network [23]. Furthermore, researchers have proposed the composite multivariate multiscale fuzzy entropy feature of EEG to describe motor imagery [24,25,26]. Besides, the EEG and the electrooculogram (EOG) have been fused to represent a brain-muscle feature [27]. However, the evocation of EEG signals is a temporal process, so its dynamic characteristics should be considered [28].

Moreover, the neuromorphic computation model is also an important part of modeling human perception and cognition mechanisms [29]. Hidden Markov models and recurrent neural networks have been used for event detection, and a multi-modal neural network has been constructed for person verification using signatures and EEGs [30]. In addition, scientists have presented a brain informatics-guided systematic fusion method and proposed a predictive model to build a bridge between brain computing and application services. A self-regulated neuro-fuzzy framework and hierarchical brain networks via a volumetric sparse deep belief network have also been developed to conduct neuromorphic computation.

3 Estimation of Brain Network Source Model

Brain networks are a novel method for studying the interactions between important brain regions. Combined with brain inverse estimation, they make it possible to extract nonlinear information about the sources of local electrical activity from the potentials generated directly by local neural circuits in the cerebral cortex. The source model is calculated from the segmentation of a Magnetic Resonance Image (MRI). Usually, the white matter-gray matter interface is chosen as the main source space. The MRI anatomy and channel locations are registered using the same anatomical landmarks (the left and right preauricular points and the nasion).

According to dipole theory, the EEG signal X(t) recorded from M channels can be considered a linear combination of P time-varying current dipole sources S(t):

$$ X(t) = \begin{pmatrix} x_1(t) \\ \vdots \\ x_M(t) \end{pmatrix} = G \cdot \begin{pmatrix} s_1(t) \\ \vdots \\ s_P(t) \end{pmatrix} + N(t) = G \cdot S(t) + N(t) $$
(1)

where G is the lead field matrix describing the deterministic, quasi-instantaneous projection of the sources onto the scalp electrodes, and N(t) is the measurement noise inherent in any acquisition process; G is calculated from the head model and the positions of the electrodes. If the source distribution is constrained to a current dipole field uniformly distributed across the cortex and normal to the cortical surface, both the position and the direction of each source are defined.
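A small numerical sketch of Eq. (1) follows, with a random matrix standing in for the matrix G (in practice G is computed from the head model and electrode positions) and illustrative sinusoidal sources:

```python
import numpy as np

rng = np.random.default_rng(0)
M, P, T = 40, 3, 1000          # channels, dipole sources, time samples (illustrative sizes)

G = rng.normal(size=(M, P))    # stand-in for the lead field (from the head model in practice)
t = np.arange(T) / 1000.0      # 1 kHz sampling, matching the acquisition setup
S = np.vstack([np.sin(2 * np.pi * f * t) for f in (10, 20, 40)])  # P source time courses
N = 0.1 * rng.normal(size=(M, T))                                 # measurement noise

X = G @ S + N                  # scalp recording per Eq. (1)
print(X.shape)                 # (40, 1000): M channels by T samples
```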

Using a validated head model together with the internally generated source model, every two elements within the network are connected so that it functions as a whole network, and the effect of this network on EEG signals under audiovisual modal stimulation is then analyzed in the following experiments.

4 Experiments and Results

4.1 Experimental Environment

The EEG signals were collected with a 40-lead Neuroscan electrode cap. The international 10–20 system electrode placement method is adopted to standardize the position of each lead electrode, so the distance between electrodes is moderate. According to the analysis of the visual and auditory experimental areas in previous studies, the electrodes can accurately and effectively collect the raw EEG signals, which meets the requirements of this study. Moreover, the symmetry between electrodes guarantees the accuracy of the subsequent brain source inverse estimation results. The placement of standard 10–20 system electrodes is shown in Fig. 1.

Fig. 1. Electrode distribution diagram of the standard 10–20 system

The reference electrode is placed between Fz and Cz, and the ground electrode between Fz and Pz. EEG signals are collected synchronously at a sampling frequency of 1 kHz, and the impedance between scalp and electrode is kept below 5 kΩ. Electrodes are placed over each lobe of the brain, and the electrode labels identify their locations: odd numbers denote the left hemisphere, even numbers the right, and the suffix “z” the midline. For example, the F3 electrode sits over the left frontal lobe, and Cz is placed at the vertex on the top of the head.
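The labeling convention just described can be captured in a small helper function; this is an illustrative sketch, not code from the study:

```python
def electrode_side(label: str) -> str:
    """Return the hemisphere implied by a 10-20 electrode label:
    odd index = left, even index = right, trailing 'z' = midline."""
    suffix = label[-1]
    if suffix.lower() == "z":
        return "midline"
    return "left" if int(suffix) % 2 == 1 else "right"

print(electrode_side("F3"))   # left  (left frontal lobe)
print(electrode_side("C4"))   # right
print(electrode_side("Cz"))   # midline (vertex)
```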

In this experiment, five subjects were selected, the age distribution was about 25 years old, both sexes were male, without any history of brain diseases, and all of them had normal hearing and normal or corrected vision. Before the experiment, make sure that the subjects have enough rest time, and keep their hair clean within one hour before the experiment, to ensure less interference in the collection process. The experiment was conducted in a shielded room, Including three kinds of stimuli: visual, auditory, and audio-visual stimuli.

4.2 Experiment and Analysis of Brain Source Estimation Under Audiovisual Stimulation

In this experiment, a 30 s scalp EEG sequence recorded from the subjects under audio-visual mixed stimulation was selected. The collected EEG signals were first denoised by the CEEMDAN-FastICA algorithm, the sources were then estimated in different periods, and finally the brain network connection model under audio-visual stimulation was constructed from the denoised signals.
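The full CEEMDAN-FastICA pipeline depends on an empirical-mode-decomposition library, but the FastICA stage can be sketched in plain NumPy. The code below is a simplified illustration on a synthetic two-channel mixture, not the actual denoising pipeline or data used in the experiment:

```python
import numpy as np

def fastica(X, n_iter=200, tol=1e-6):
    """Minimal deflationary FastICA (tanh nonlinearity).
    X: (channels, samples). Returns estimated sources, same shape."""
    X = X - X.mean(axis=1, keepdims=True)
    # Whiten via eigendecomposition of the covariance matrix
    d, E = np.linalg.eigh(np.cov(X))
    Z = E @ np.diag(d ** -0.5) @ E.T @ X
    n = Z.shape[0]
    W = np.zeros((n, n))
    rng = np.random.default_rng(0)
    for k in range(n):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            wx = w @ Z
            g, gp = np.tanh(wx), 1 - np.tanh(wx) ** 2
            w_new = (Z * g).mean(axis=1) - gp.mean() * w   # fixed-point update
            w_new -= W[:k].T @ (W[:k] @ w_new)             # deflation: decorrelate
            w_new /= np.linalg.norm(w_new)
            if np.abs(np.abs(w_new @ w) - 1) < tol:
                w = w_new
                break
            w = w_new
        W[k] = w
    return W @ Z

# Synthetic two-channel mixture of a sine and a sawtooth "artifact"
t = np.linspace(0, 1, 2000)
S = np.vstack([np.sin(2 * np.pi * 8 * t), np.mod(5 * t, 1.0) - 0.5])
A = np.array([[1.0, 0.6], [0.4, 1.0]])
est = fastica(A @ S)
```

Up to sign and scale, each recovered component should match one of the original sources, which is the property the artifact-removal step relies on.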

As shown in Figs. 2, 3 and 4, from top to bottom, the audio-visual mixed-modal stimuli collected in three different periods are used for brain source estimation. Figure 5 shows the brain network connections under audio-visual mixed-mode stimulation.

Fig. 2. Estimated images of brain sources under audio-visual stimulation in different periods (R0)

Fig. 3. Estimated images of brain sources under audio-visual stimulation in different periods (R1)

Fig. 4. Estimated images of brain sources under audio-visual stimulation in different periods (R2)

Fig. 5. Brain connection diagram under audio-visual stimulation

Through brain network visualization and brain connectivity, we can obtain the density and weight changes of the global network connections at different audio-visual stimulation stages, which makes the analysis of EEG signals more intuitive.

4.3 Experiment and Analysis of Improved CNN Audiovisual Model

In the experiment, binary cross-entropy is used as the cost function, and L2 regularization adds a penalty term to the cost function to avoid overfitting. Visual stimulation is used as the evoked modality of the EEG signals to obtain the relevant EEG data. Taking the MSE values of the data as the input of the improved CNN, the curves of training accuracy and loss are shown in Fig. 7.
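As a sketch of the cost function described above, binary cross-entropy plus an L2 penalty can be written out directly; the weight vector, λ, and the example labels here are purely illustrative, not values from the experiment:

```python
import numpy as np

def bce_l2_loss(y_true, y_pred, weights, lam=1e-3, eps=1e-12):
    """Binary cross-entropy with an L2 regularization penalty on the weights."""
    y_pred = np.clip(y_pred, eps, 1 - eps)          # avoid log(0)
    bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    l2 = lam * np.sum(weights ** 2)                 # penalty term discouraging large weights
    return bce + l2

y_true = np.array([1.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8, 0.6])
w = np.zeros(10)                                    # with zero weights the penalty vanishes
loss = bce_l2_loss(y_true, y_pred, w)
```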

Fig. 6. Model performance under auditory stimuli: (a) model accuracy under auditory stimuli; (b) model loss value under auditory stimuli

For visual stimulation, the best training effect is achieved at about the 200th iteration, when the training accuracy of the model is 82.34% and the loss value is 0.0117. For auditory stimulation, the best training effect is achieved at about the 870th iteration, with a training accuracy of 84.78% and a loss value of 0.0072. When the visual and auditory stimuli are mixed, the best training effect is achieved around the 200th iteration, when the training accuracy is 90.12% and the loss value is 0.0041 (Fig. 6).

To compare the visual, auditory, and audio-visual mixed stimulation modes, the classification accuracy of the recognition model is evaluated as follows. The EEG signals denoised by the CEEMDAN-FastICA algorithm are used as the input of a linear classifier, a Support Vector Machine (SVM), and the improved CNN. The classification accuracies are shown in Table 1.

Table 1. Accuracy of EEG classification (%)

With the feature vectors constructed from the scale entropy values under the above three stimulation modes as the input of each classifier, the accuracy visualization results are shown in Fig. 8.
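As an illustration of the kind of entropy feature mentioned above, a minimal sample-entropy computation (one building block of multiscale entropy; the parameters m = 2 and r = 0.2·SD are conventional defaults assumed here, not taken from this study) might look like:

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D signal; r is a fraction of the SD."""
    x = np.asarray(x, dtype=float)
    r *= x.std()
    def matches(mm):
        # Count template pairs (i < j) within Chebyshev distance r
        templ = np.array([x[i:i + mm] for i in range(len(x) - m)])  # same count for m and m+1
        d = np.abs(templ[:, None, :] - templ[None, :, :]).max(axis=2)
        n = len(templ)
        return ((d < r).sum() - n) / 2                 # exclude self-matches
    B, A = matches(m), matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

rng = np.random.default_rng(0)
noise = rng.normal(size=500)
tone = np.sin(np.linspace(0, 20 * np.pi, 500))
# A regular signal is expected to yield lower entropy than white noise
```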

Fig. 7. Comparison of EEG signal classification results of different classifiers

According to the classification accuracy results in the chart, among the linear classifier, the SVM, and the improved CNN, the improved CNN gives the best classification results. Meanwhile, among visual, auditory, and mixed audio-visual stimuli, the mixed audio-visual stimuli give the best classification results.

Fig. 8. Visualization of model evaluation metrics

In this experiment, we assess the significance of the correlations between the first canonical gradient and data from other modalities (curvature, cortical thickness, and T1w/T2w image intensity). A standard significance test of the correlation cannot be used, because the spatial autocorrelation in EEG data may bias the test statistic. Figure 9 shows three approaches to null hypothesis testing: spin permutations, Moran spectral randomization, and autocorrelation-preserving surrogates based on variogram matching.
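The basic null-distribution logic can be sketched with a naive permutation test; note that random shuffling destroys spatial autocorrelation, which is precisely why spin permutations or variogram-matched surrogates are preferred for cortical maps. The data below are synthetic:

```python
import numpy as np

def perm_corr_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided p-value for corr(a, b) against a naive shuffle null.
    (Shuffling ignores spatial autocorrelation; see spin/variogram methods.)"""
    rng = np.random.default_rng(seed)
    obs = np.corrcoef(a, b)[0, 1]
    null = np.array([np.corrcoef(rng.permutation(a), b)[0, 1] for _ in range(n_perm)])
    return (np.abs(null) >= abs(obs)).mean()

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)   # a genuinely correlated pair
p = perm_corr_pvalue(x, y)
```

Spin permutations and variogram matching replace `rng.permutation` with surrogates that preserve the spatial structure of `a`, keeping the rest of this logic unchanged.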

5 Conclusions

EEG signals in different periods under audio-visual stimulation are analyzed by constructing a brain connection network and visualizing the brain source maps, thereby providing a powerful guarantee for the processing of EEG signals of audio-visual stimuli.

For brain source estimation and modeling of audiovisual evocations, a parallel computing strategy for the convolutional layer is proposed. The convolutional kernel in the convolutional layer is set as a vector to extract only spatial features, and a regularization operation is added to the network structure to prevent overfitting. The audio-visual model is integrated by combining the improved CNN model with the inception network.

Through experiments, the classification performance of traditional classification methods and the improved CNN model on EEG signals is compared. The experimental results indicate that the environment safety recognition accuracy of the neuromorphic computing model reaches 87.98%, 88.4%, and 90.12% for the single visual, single auditory, and audiovisual stimuli, respectively. The results show that the improved CNN model has better recognition accuracy, and the classification results under the audio-visual mixed stimulus mode are better than those under a single stimulus mode.