1 Introduction

The term “extended reality” (XR) has recently been introduced to encompass the possible combinations of real and virtual environments [1]. Although the literature proposes a continuous scale ranging from reality to virtuality [2], XR is usually divided into three categories: virtual reality, mixed reality, and augmented reality. On the whole, these technologies can enhance human-machine interaction by exploiting computer-generated perceptual information, which may or may not be overlaid on the real world [3]. In healthcare, the great potential of XR has been directed to many applications, such as surgery [4, 5], rehabilitation [6, 7], and clinician training [8, 9]. This applies particularly to the health 4.0 framework, which introduces customization and real-time adaptation in patient care [10]. The exploitation of XR technologies enhances health-related services such as prevention, diagnosis, and treatment from the point of view of both patients and caregivers. In such a scenario, a decentralization of the health system is highly desirable, since moving from the hospital to the patient’s home leads to cost savings and optimization of the therapeutic outcome [11]. To this aim, novel mobile devices are boosting this evolution, primarily due to their wearability and portability [12], as well as their increasing availability on the market.

As a further development, integrating XR with a brain-computer interface (BCI) enables unprecedented interactions between a human and the external world. Indeed, a BCI is a novel means of interaction relying on direct measurement of brain activity, and it is itself receiving much investment from the scientific and technical communities [13, 14]. Through a BCI, both control and monitoring are possible: a user can communicate his/her intentions to a machine by voluntarily modulating brain waves, or the machine can acquire information about the mental state of the user [15]. Naturally, such an interface has found many applications in healthcare [16], but it has also been investigated in other fields. For instance, BCI systems have been proposed in conjunction with virtual reality for controlling an avatar or navigating a virtual environment [17]. Their applications address both able-bodied people and paralyzed patients, who may thereby restore basic communication. Moreover, a survey has highlighted that BCI enables hands-free interaction for extended reality [18] in medicine, robotics, and domotics. In that work, head-mounted devices were mostly considered feasible for real-life applications. Nonetheless, it was also shown that the development of XR-BCI systems still requires more effort.

To address wearability and portability, brain activity is acquired by means of electroencephalography (EEG), which is also non-invasive and relatively low-cost [19]. On the other hand, optimizing user-friendliness could lead to performance loss. Therefore, investigating different approaches has been crucial in the implementation of daily-life applications. Different paradigms can be taken into account in developing a BCI. A useful distinction is between reactive, active, and passive paradigms [20]. In a reactive BCI, the user is voluntarily exposed to sensory stimulation, and the brain potentials evoked by that stimulation are exploited for communication and control. As an example, a recent work reports the implementation and validation of an XR-BCI monitoring system for health 4.0 relying on visually evoked potentials [21]. In that system, smart glasses are simultaneously used for stimulating the user and for data visualization. In contrast, stimulation is not required in active and passive BCIs because their operation relies upon voluntary modulation of spontaneous brainwaves or detection of involuntary activity, respectively. This basic difference makes them more user-friendly, and hence suitable for daily-life applications. Given that, the present work focuses on active and passive paradigms implemented in conjunction with XR. The active XR-BCI system is based on the detection of motor imagery, while the passive XR-BCI system is based on engagement monitoring. Both systems find wide application in health 4.0 because they can be used for enhancing human-machine interaction for able-bodied users, or they can be exploited in customized rehabilitation.

The remainder of the paper is organized as follows. Section 2 discusses the implementation and the preliminary results of a motor imagery-based (active) BCI exploiting neurofeedback in virtual reality. Then, Sect. 3 reports the validation of an EEG-based component for engagement detection in an adaptive XR rehabilitation system. Conclusions and implications are addressed in Sect. 4.

2 Active BCI

2.1 Background

An active BCI derives its outputs from brain activity directly and consciously controlled by the user, independently of external events [22]. The most typical paradigm employed for an active BCI is motor imagery. In this paradigm, the user imagines a specific movement for a control application, for instance controlling a wheelchair or navigating a virtual environment. Remarkably, the user must train before being capable of performing a mental task properly. Moreover, the system performance also depends on the user's ability to focus on the imagery task. In this regard, neurofeedback (NF) helps users to self-regulate brain rhythms, and it increases both the user’s motivation and attention span. NF is accomplished by providing the BCI user with sensory information related to the ongoing neural activity. In particular, the creation of a dedicated virtual environment helps to engage the user. This is especially true in the field of rehabilitation, where XR-BCI systems have been helping patients to avoid “feeling like a patient”, even while remaining in clinical surroundings [23].

Currently, visual feedback is a widely employed modality in the field of BCI/NF. However, in recent years, other modalities, such as auditory and haptic, have also been exploited. Some studies tested the impact of unimodal feedback modalities on system performance and user comfort and found them to be roughly equivalent [24]. Therefore, recent studies are exploring the simultaneous use of multiple feedback modalities, which may be more effective than simple unimodal feedback [25]. Hereafter, the combination of visual and vibrotactile feedback is investigated by relying on XR technologies.

2.2 System Implementation

An XR-BCI prototype was implemented to investigate the effects of multimodal feedback during motor imagery. Virtual reality was taken into account in this first prototype. The task consisted of controlling both the intensity and the direction of a moving virtual object. Multimodal feedback was obtained by merging visual and vibrotactile modalities. In detail, the visual feedback was provided by the movement of a virtual ball on a PC screen, while the vibrotactile one was given by a wearable suit with vibrating motors. Intensity and direction of the feedback were determined by the user’s brain activity, measured through EEG. The feedback actuators of the XR-BCI system are shown in Fig. 1, while the EEG acquisition hardware is shown in Fig. 2.

Fig. 1. Hardware components of the XR-BCI system for multimodal (visual plus vibrotactile) feedback.

In this prototype, a generic PC monitor was used to provide the visual feedback. However, this will be replaced by smart glasses to provide a more immersive experience and hence further increase user engagement. A Unity application was purposely developed to render a virtual environment with a rolling ball, as well as to control the haptic suit. Gravity was applied to the ball to keep it on the virtual floor, while an EEG-modulated force was applied to the ball during the experiments. Note that the application also indicated the task to carry out (Fig. 1a). Vibration was modulated according to the ball position, aiming to increase immersiveness through simple multi-sensory stimulation. The hardware for vibrotactile feedback consisted of a wireless suit from bHaptics Inc. This has a \(5 \times 4\) matrix of vibration motors on both the front and the back of the torso. The intensity of the vibration can be varied for each motor, thus allowing the creation of customized patterns. The suit is shown in Fig. 1b. Finally, the employed EEG acquisition system consisted of the Olimex EEG-SMT acquisition board with two bipolar channels. The EEG signals were collected by four active dry electrodes by means of a differential measurement between each pair of them. The measurement electrodes were placed at C3 and C4, while the reference electrodes for the differential measurement were placed at Fp1 and Fp2, respectively. A ground electrode was also placed on the left ear lobe. Electrodes were held by a soft headband.

Fig. 2. Low-cost wearable EEG acquisition system.

During each trial, the virtual ball could roll to the left or to the right of the display according to the detected brain activity. Vibration was simultaneously provided by the haptic suit on the left or on the right part of the torso. The target task was decided by a cue-based paradigm. In particular, with reference to Fig. 3, the user was relaxed until an indication appeared at time \(t_{CUE}\) = 2 s; then, motor imagery started at \(t_{MI}\) = 3 s and stopped 5 s later. In this time interval, feedback was provided to the user depending on the detected brain activity. Notably, the direction of the feedback was chosen according to the detected motor imagery class, while the class score modulated the ball velocity (visual feedback) and the vibration intensity (vibrotactile feedback).
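The mapping from classifier output to feedback described above can be sketched as follows. This is only an illustration under stated assumptions, not the Unity/Matlab implementation used in the experiments: the function name `feedback_command`, the signed-velocity convention (negative for left), and the linear scaling of the class score are all hypothetical.

```python
def feedback_command(predicted_class, class_score, max_speed=1.0, max_vibration=1.0):
    """Map a classifier decision to a feedback command (hypothetical helper).

    predicted_class: "left" or "right", the detected motor imagery class.
    class_score: class probability in [0, 1], used as the feedback intensity.
    max_speed, max_vibration: assumed full-scale values for the two actuators.
    """
    if predicted_class not in ("left", "right"):
        raise ValueError("predicted_class must be 'left' or 'right'")
    if not 0.0 <= class_score <= 1.0:
        raise ValueError("class_score must lie in [0, 1]")

    # The class decides the direction; the score decides the intensity.
    sign = -1.0 if predicted_class == "left" else 1.0
    return {
        "ball_velocity": sign * class_score * max_speed,      # visual feedback
        "vibration_side": predicted_class,                    # side of the torso
        "vibration_intensity": class_score * max_vibration,   # vibrotactile feedback
    }
```

With this convention, a confident "left" decision moves the ball quickly to the left and drives the left-side motors strongly, while a score near 0.5 yields weak feedback on the chosen side.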

Fig. 3. Timing diagram of a single trial in the BCI experiment with neurofeedback.

The algorithm for EEG processing was the filter bank common spatial pattern (FBCSP), a widely adopted approach in MI-BCI [26, 27]. The algorithm was adjusted to provide feedback in real time. Its main blocks were (i) an array of bandpass filters, (ii) the “common spatial pattern” (CSP) for extracting spatial features from the filtered EEG signal, (iii) a selector of features that discards the less informative ones, and (iv) a naive Bayesian Parzen window (NBPW) classifier. Notably, the classifier exploits signal features to assign a class probability to each trial. Therefore, the most probable class was assigned to the incoming EEG data for each trial, while the class probability was exploited as a score. These two pieces of information were exploited to modulate the direction and the intensity of the feedback, respectively. It should be noted that the algorithm had to be trained by means of labeled EEG data. Since the users were completely new to the system, there were no available data for them, and the algorithm had to be trained on already available datasets. Furthermore, in adapting the algorithm for real-time processing, the trained model had to classify 1.0 s-long time windows starting from \(t_{MI}\) and progressively shifted by 0.5 s. This implies that, in the 5 s-long motor imagery window, the signal was classified to provide feedback 9 times.
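The count of 9 classifications per trial follows from the window length, the shift, and the duration of the imagery interval. A minimal sketch of this arithmetic is given below; the function name `window_starts` is hypothetical, and the sketch assumes that only windows lying entirely within the motor imagery interval are classified.

```python
def window_starts(t_mi=3.0, mi_duration=5.0, window_len=1.0, step=0.5):
    """Return the start times (s) of the sliding classification windows.

    Assumes a window is used only if it fits entirely inside the
    interval [t_mi, t_mi + mi_duration].
    """
    starts = []
    k = 0
    # Small tolerance guards against floating-point rounding at the boundary.
    while k * step + window_len <= mi_duration + 1e-9:
        starts.append(t_mi + k * step)
        k += 1
    return starts


starts = window_starts()
print(len(starts))            # number of feedback updates per trial
print(starts[0], starts[-1])  # first and last window start times
```

Equivalently, the number of windows is \((5 - 1)/0.5 + 1 = 9\), with the first window starting at 3.0 s and the last at 7.0 s.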

2.3 Preliminary Results

The first experiments with the multimodal feedback were conducted by considering two motor imagery tasks, namely left versus right imagery. Each user was instructed to mentally visualize the left part of the upper body or the right one, focusing in particular on the respective arm. During each trial, the XR-based feedback aimed to help the user in the motor imagery task. In particular, positive feedback should have strengthened the correct imagination of the indicated direction, while negative feedback would have indicated the need to improve the mental task execution. Unfortunately, incorrect feedback could also mislead the user, which implies that the brain activity classification had to be as accurate as possible.

The classification algorithm was trained on the data of the best subject from the BCI competition III dataset 3a (subject “k3b”). This public dataset contains data from 60 EEG channels related to four motor imagery tasks, and visual feedback was provided during their acquisition. The algorithm training relied upon the left-hand and right-hand imagery tasks, and two-channel data were extracted to match the EEG channels of the proposed system. The algorithm was implemented in Matlab, which communicated with the Unity application to match the timing protocol and provide feedback.

Four subjects (one female) participated in the preliminary experiments. Among them, two subjects already had experience with a reactive XR-BCI system, but only one already had experience with motor imagery. The experiment was designed for executing 80 trials per subject, divided into 2 runs of 40 trials each. However, due to technical problems, two subjects carried out fewer trials (down to 30). Note that during the preliminary experiments, the user was wearing the suit and the EEG cap while seated on a comfortable chair in front of the PC monitor, and he/she was asked to limit unnecessary movements during the EEG acquisition. Indeed, movements affect the stability of the EEG electrodes. In contrast, the vibrations associated with the haptic feedback were found not to influence the EEG acquisition.

After the experiments with feedback in real time, an offline accuracy assessment was conducted to evaluate the XR feedback efficacy. Firstly, cross-validation was considered. In this procedure, the acquired EEG data are split several times into training and evaluation data, so that the accuracy can be calculated for each split. The cross-validation accuracy is then obtained as the mean of these accuracies, and the associated standard deviation can be obtained as well. In the present case, a 4-fold cross-validation with 5 repetitions was considered, meaning that the data were split \(5 \times 4\) times into 75% for training the algorithm and 25% for using the algorithm to classify data (evaluation phase). The accuracy in the evaluation phase can be obtained by knowing the true labels of the evaluation data. Clearly, this would not be possible in real applications, but labels are indeed known for all data during these experiments. After the cross-validation, another accuracy assessment was conducted by dividing the available data in half, so that the first half could be used for training and the second half for evaluation. Note that in both procedures, data were considered subject-by-subject.
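The splitting scheme of the repeated cross-validation can be sketched as follows. This is a minimal illustration, assuming a plain random (non-stratified) shuffle before each repetition; the function names `repeated_kfold_indices` and `summarize` are hypothetical, whether the original analysis stratified the folds by class is not stated, and the use of the sample standard deviation is likewise an assumption.

```python
import random
import statistics


def repeated_kfold_indices(n_trials, n_folds=4, n_repeats=5, seed=0):
    """Return a list of (train_indices, eval_indices) pairs.

    Each repetition reshuffles the trials and cuts them into n_folds
    parts; every part serves once as evaluation data.
    """
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        order = list(range(n_trials))
        rng.shuffle(order)
        folds = [order[i::n_folds] for i in range(n_folds)]
        for i, eval_fold in enumerate(folds):
            train = [j for k, fold in enumerate(folds) if k != i for j in fold]
            splits.append((sorted(train), sorted(eval_fold)))
    return splits


def summarize(accuracies):
    """Mean and sample standard deviation of the per-split accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


splits = repeated_kfold_indices(80)
print(len(splits))                             # 5 repetitions x 4 folds
print(len(splits[0][0]), len(splits[0][1]))    # training and evaluation sizes
```

For 80 trials this yields 20 splits, each with 60 training trials (75%) and 20 evaluation trials (25%); the classifier would be trained and scored once per split, and `summarize` applied to the resulting accuracies.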

The results are reported in Table 1. Apart from subject S3, the accuracies are above the chance level (50% for two classes). Cross-validation accuracy goes up to 84% for subject S2, which is relatively high for the motor imagery field. However, this performance is not confirmed by evaluation accuracy, which is only 65%. Instead, cross-validation accuracy and evaluation accuracy are compatible for S4, and the resulting 70% is still an acceptable result.

Table 1. Classification results for the preliminary experiments with four subjects.

In summary, the relatively good classification accuracies obtained in the preliminary experiments suggest that the multimodal XR neurofeedback might help the user during the mental tasks. However, experiments with further subjects are necessary to better validate the system design. Furthermore, feedback modalities should be further investigated in order to optimize the XR environment for maximum engagement. Portability of the visual feedback will be enhanced by replacing the PC monitor with a virtual reality visor, but augmented and mixed reality could be investigated as well. Finally, the motor vibration should be better modulated to offer a more realistic haptic sensation.

3 Passive BCI

3.1 Background

The notion of passive BCI was officially defined at the 4th Graz BCI Conference [28].

In the passive BCI field, a promising application is engagement detection, which can be applied in several fields such as rehabilitation, education, or work. According to Lamborn, engagement stands for active participation and concentrated attention, as opposed to superficial participation or apathy [29]. Many researchers have defined three types of engagement: emotional engagement, behavioral engagement, and cognitive engagement [30]. A high level of emotional engagement is related to a positive emotional response to a task. Cognitive engagement represents the mental effort required to perform activities. Finally, behavioral engagement can be evaluated by direct observation of the individual physical effort produced during the activities.

Moreover, the concept of engagement must be associated with motivation. For example, according to Maehr and Meyer, motivation, conceptualized in terms of direction, intensity, and quality of individual energies, answers the question of “why” for a given behavior. In addition, Maier and Seligman argue that lack of motivation can lead to negative individual cognitive and emotional states (disengagement) [31].

3.2 Adaptive VR Based Rehabilitation

An adaptive rehabilitation game platform is proposed for the rehabilitation of children with neuro-psychomotor deficits. The rehabilitation tasks change according to the engagement of the user. The cognitive and emotional engagement were assessed by means of the EEG signals. By combining the different levels of cognitive and emotional engagement, four different states can be identified: (i) participation (high level of both emotional and cognitive engagement), (ii) boredom (low cognitive engagement and low emotional engagement), (iii) stress (high cognitive engagement and low emotional engagement), and (iv) distraction (low cognitive engagement and high emotional engagement).
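The four states listed above form a simple two-by-two lookup over the binary engagement levels. A minimal sketch is given below; the function name `engagement_state` and the boolean encoding of the "high" and "low" levels are illustrative assumptions, not part of the described platform.

```python
def engagement_state(cognitive_high: bool, emotional_high: bool) -> str:
    """Combine binary cognitive and emotional engagement into one state."""
    states = {
        (True, True): "participation",   # high cognitive, high emotional
        (False, False): "boredom",       # low cognitive, low emotional
        (True, False): "stress",         # high cognitive, low emotional
        (False, True): "distraction",    # low cognitive, high emotional
    }
    return states[(cognitive_high, emotional_high)]


print(engagement_state(cognitive_high=True, emotional_high=False))
```

The call above returns "stress", the state in which the activity is simplified by the adaptive platform.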

Fig. 4. Architecture of the adaptive rehabilitation game platform.

Architecture. The system architecture is shown in Fig. 4 and is composed of a Rehabilitation Game Platform and an Engagement Detection Component. The main hardware of the Rehabilitation Platform is the GE73 Raider RGB 8RF-212IT Gaming Notebook (Intel i7-8750H, 16 GB of RAM, 256 GB NVMe SSD and 1 TB HDD, nVidia GTX 1070 Graphics Card, Windows 10 Home). The Platform presents the rehabilitative task to the patient through the Monitor and the Earphones. The Monitor is a Philips 223V5LHSB2 (LCD-TFT desktop monitor, 21.5 inch, LED, Full HD, 1920 \(\times \) 1080).

The visual stimulus is an avatar chosen by the patient (bee, ladybug, girl, or fish). The auditory stimulus is background music. The Content Production Module updates the audio-visual stimuli as a function of two inputs.

The first input is information about the user’s posture. It comes from the Posture Tracker on the basis of the images acquired by the Video Camera. The second input contains the patient status information. It is produced by the Adaptivity Manager on the basis of the emotional and cognitive engagement levels detected by the Engagement Detection Component.

The latter is organized into two blocks: the EEG Acquisition unit and the EEG Processing unit. The EEG Acquisition unit acquires and digitizes the patient's EEG signal and subsequently sends it to the EEG Processing unit by wireless transmission. In the EEG Processing unit, a custom Filter Bank (FB) filters the EEG signal with 12 band-pass filters. Then the Common Spatial Pattern (CSP) performs feature extraction and selection. Finally, the Classifier identifies the engagement levels of the patient.

Data Processing. The FB decomposes the EEG into 12 frequency bands using fourth-order Chebyshev filters. In this study, 12 band-pass filters with 4 Hz bandwidth were used, equally spaced from 0.5 to 48.5 Hz. Then CSP, a supervised machine learning algorithm, polarized the variance of the EEG signals according to the class they belong to and maximized the discriminability of the two classes for each type of engagement. The FB-CSP outputs are sent to the Classifier for the detection of the cognitive and emotional engagement levels.
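The band layout of the filter bank can be sketched as follows, assuming that the 12 bands are contiguous and non-overlapping (an interpretation of "equally spaced"; overlapping bands are not ruled out by the text). The function name `filter_bank_edges` is hypothetical, and the Chebyshev filter design itself is omitted because the filter type and ripple are not specified.

```python
def filter_bank_edges(f_start=0.5, bandwidth=4.0, n_bands=12):
    """Return (low, high) cut-off frequencies in Hz for each band.

    Assumes contiguous, non-overlapping bands of equal width.
    """
    return [
        (f_start + i * bandwidth, f_start + (i + 1) * bandwidth)
        for i in range(n_bands)
    ]


edges = filter_bank_edges()
print(edges[0], edges[-1])  # first and last band of the bank
```

Under this assumption the first band spans 0.5 to 4.5 Hz and the last 44.5 to 48.5 Hz, so that 12 bands of 4 Hz exactly cover the stated 0.5 to 48.5 Hz range.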

Operation. The Rehabilitation Game Platform assesses the orientation of the user's face through the Video Camera to launch the activity. During the activity, the patient must fixate on the avatar to move it. The avatar completes the path if the patient maintains the correct posture. The activity content is adapted on the basis of the patient's state. For example, if the patient is experiencing a stress state, the activity level is simplified.

3.3 Preliminary Results

An experimental campaign was carried out to validate the Engagement Detection Component of the adaptive rehabilitation game platform.

The participants included in the experimental activity were four children aged between five and eight years (three males and one female) suffering from disturbances in motor-visual coordination. The children were affected by double hemiplegia, motor skills deficit with dyspraxia, neuropsychomotricity (NPM) delay, and severe NPM delay in spastic expression from perinatal suffering, respectively. The experimental campaign took place during the hours already scheduled for rehabilitation activities. The wearability of the EEG-based component was guaranteed by the use of the Emotiv Epoc+ [32] for the EEG signal acquisition.

The experimental protocol was authorized by the ethics committee of the University Federico II. Families signed the informed consent before the experiment. The procedures were implemented in accordance with the appropriate directives and guidelines. EEG signals were acquired during the rehabilitation sessions for a total of thirty minutes per week. The acquisition was conducted in a room with natural light and air exchange. Each child was asked to sit in front of the display wearing the EEG cap, and the adhesion of the electrodes to the scalp was checked. Then, the audio-visual stimuli were produced to launch the activity. During the activity, the child had to fixate on the avatar to move it, maintaining eye contact and a correct posture.

Each session was video recorded. A multidisciplinary team classified the videos (and, therefore, the EEG tracks), identifying all the transitions between two consecutive states (e.g., from participation to stress) experienced by the subjects. The children completed all the sessions without expressing discomfort during the use of the proposed devices. Moreover, the measurements of emotional engagement exhibited a strong prevalence of the high level. These observations support the conclusion that the system is not a source of discomfort for the users.

As a preliminary analysis, the Median Absolute Deviation (MAD) algorithm [33] was implemented to identify and eliminate anomalous trials from the EEG dataset.
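A common form of MAD-based outlier rejection is sketched below. It is only an illustration: the per-trial summary value (for example a peak amplitude), the consistency constant 1.4826, and the threshold of 3 scaled MADs are conventional choices assumed here, not parameters reported for this study, and the function name `mad_outliers` is hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.0):
    """Return the indices of values lying far from the median.

    A value is flagged when its absolute deviation from the median
    exceeds `threshold` times the scaled MAD (assumed rule).
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    # 1.4826 makes the MAD comparable to a standard deviation for
    # normally distributed data.
    scaled_mad = 1.4826 * median(deviations)
    if scaled_mad == 0:
        return []  # no spread around the median: nothing can be ranked
    return [i for i, d in enumerate(deviations) if d > threshold * scaled_mad]


# Hypothetical per-trial amplitudes; the last trial is clearly anomalous.
amplitudes = [10, 11, 9, 10, 12, 10, 95]
print(mad_outliers(amplitudes))
```

Because the median and the MAD are barely affected by a few extreme trials, this rule remains usable even when the anomalous trials themselves would inflate a mean-and-standard-deviation criterion.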

Subsequently, the EEG signal was divided into epochs. Two types of epoch division were performed in parallel: 3 s epochs and 9 s epochs. Independent Component Analysis (ICA) was implemented to remove artifacts from the signals. Then the signals were filtered with the Filter Bank, and the Common Spatial Pattern was applied for feature extraction and selection.
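The division into epochs can be sketched as follows. The sketch assumes non-overlapping epochs and discards any incomplete epoch at the end of the recording; the sampling rate of 128 Hz used in the example is a placeholder and not a value reported in this work, and the function name `epoch_bounds` is hypothetical.

```python
def epoch_bounds(n_samples, fs, epoch_s):
    """Return (start, stop) sample indices of consecutive epochs.

    Assumes non-overlapping epochs; a trailing partial epoch is dropped.
    """
    epoch_len = int(round(fs * epoch_s))
    n_epochs = n_samples // epoch_len
    return [(i * epoch_len, (i + 1) * epoch_len) for i in range(n_epochs)]


fs = 128                 # assumed sampling rate in Hz (placeholder)
n_samples = fs * 30      # a hypothetical 30 s recording
print(len(epoch_bounds(n_samples, fs, 3)))  # number of 3 s epochs
print(len(epoch_bounds(n_samples, fs, 9)))  # number of 9 s epochs
```

For the same recording, the longer epochs give fewer but longer training examples, which is the trade-off explored by comparing the two epoch lengths.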

Finally, different classifiers were compared in addressing a two-class (high-low) classification problem both for cognitive and emotional engagement.

Moreover, both intra-subjective and inter-subjective classifications were implemented.

As concerns the inter-subjective classification, under-sampling was performed to balance the dataset by reducing the size of the abundant class. Subsequently, the data classification was implemented with three different classifiers: Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and K-Nearest Neighbors (KNN). The EEG features were input to the above-mentioned classifiers, whose hyperparameters were optimized with a grid-search cross-validation procedure.
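The balancing step can be sketched as random under-sampling of the majority class. This is a minimal illustration assuming uniform random selection without replacement; the function name `undersample_indices` is hypothetical, and the selection rule actually used in the study is not specified.

```python
import random


def undersample_indices(labels, seed=0):
    """Return sorted indices of a class-balanced subset of the epochs.

    Every class is reduced to the size of the least represented class
    by uniform random selection without replacement (assumed rule).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_min))
    return sorted(kept)


# Hypothetical labels: 7 "high" epochs and 3 "low" epochs.
labels = ["high"] * 7 + ["low"] * 3
kept = undersample_indices(labels)
print([labels[i] for i in kept])
```

After balancing, the chance level of the two-class problem is 50%, which makes the reported accuracies directly interpretable.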

The results are shown in Table 2 and Table 3.

Table 2. Best subject-independent accuracy (and relative classifier) obtained for the cognitive engagement detection
Table 3. Best subject-independent accuracy (and relative classifier) obtained for the emotional engagement detection

As for the intra-subjective classification, data balancing was not performed before the classification. In this case, Support Vector Machine (SVM), Artificial Neural Network (ANN), and Graph-based Neural Network (GNN) classifiers were used. The EEG features were input to the above-mentioned classifiers, whose hyperparameters were optimized with a 3-fold cross-validation procedure. The results are shown in Table 4 and Table 5.

Table 4. Best subject-dependent accuracies (and relative classifier) obtained for the cognitive engagement detection
Table 5. Best subject-dependent accuracies (and relative classifier) obtained for the emotional engagement detection

4 Conclusion

The present manuscript has discussed the possible integration of XR and BCI for health 4.0 applications. Indeed, XR technologies are extensively proposed nowadays to enhance prevention, diagnosis, and treatment in healthcare. In this context, a BCI furnishes a novel means for control and monitoring, so that the user can communicate with a machine by modulating his/her brain activity, or the machine can be adapted to the detected mental state of the user.

In this regard, motor imagery was investigated as an active BCI paradigm in which the spontaneous brain activity is measured during the mental visualization of a movement. The XR feedback, provided both visually and through haptic stimulation, aims to enhance the user’s engagement and thus help movement imagination. Classification accuracies were assessed to test the XR neurofeedback efficacy. Up to 84% classification accuracy was obtained in cross-validation, while the evaluation accuracy was up to 70%. Compared to typical results in motor imagery classification, these values are encouraging, although further experiments are required to better assess the system efficacy. Foreseen applications for this XR-BCI system concern motor rehabilitation or the control of robotic prostheses.

Emotional and cognitive engagement were detected in the framework of passive BCI. A wearable EEG-based component of an adaptive rehabilitation game platform was prototyped and validated on 4 children. Experimental results showed a subject-independent accuracy of 73.3% and 72.3% for the emotional and cognitive engagement, respectively, when an epoch of 9 s was adopted. The subject-dependent accuracy increased up to 87.2% in the case of cognitive engagement detection for the fourth subject. Considering the lack of studies on engagement detection by a passive BCI, the results suggest that this line of research is very promising.