1 Introduction

In previous research, many types of physiological signals, including ECG (electrocardiography), PPG (photoplethysmography), GSR (galvanic skin response), SKT (skin temperature), and EEG (electroencephalography), have been adopted for quantitatively measuring human intention and emotion [1]. With ECG or PPG, the heart rate can be determined by analyzing successive pulse-to-pulse intervals [2]. With GSR and SKT, amplitude levels serve as the parameters for measuring skin response and skin temperature, respectively. EEG data can be interpreted with various methods in the time or frequency domain. However, these methods are inconvenient because sensors must be attached to the body, and the attachment itself can inadvertently induce a negative emotion that adds noise to the readings.

Recently, camera vision-based physiological data acquisition methods that do not require conventional physiological sensors have been proposed. The Cardio-Cam was proposed for measuring human heart rate through ICA (independent component analysis)-based color channel analysis without any sensor attachment [3]. Additionally, many smartphone applications have been released that measure heart rate in real time by using the built-in rear camera together with the white illuminator [4]. In these applications, the brightness of successive images changes because the amount of reflected illumination varies continuously and regularly with blood flow. Although these methods are meaningful because no sensors have to be attached, they only measure heart rate.

Various types of micro-movements of the human body offer significant measurements without sensor attachment. In this study, we defined measurements of human body movement (Fig. 1) that clarify the dependency (or independence) of each part’s micro-movement. As shown in Fig. 1, a hierarchical model explains the dependency of micro-movements between body parts. For example, the amount of facial micro-movement can be calculated by summing the bust amount and the facial self-amount. In Fig. 1, the hierarchical model is defined as “Full body > (Bust > Arm = (Face > (Eye = Mouth))).” Because full body micro-movement is not suitable for measurement with a camera, the comparatively higher ranked “Bust” micro-movement was chosen in this research for the purpose of estimating social emotion.

Fig. 1

Definition of various micro-movements of human body
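The additive relation in the hierarchical model can be illustrated with a minimal sketch. The parent map below is only one reading of the expression “Full body > (Bust > Arm = (Face > (Eye = Mouth)))”, and the names `PARENT`, `observed_amount`, and `self_amount` are hypothetical, introduced here for illustration.

```python
# Hypothetical parent map: one reading of the hierarchy in Fig. 1.
PARENT = {
    "bust": "full_body",
    "arm": "bust",
    "face": "bust",
    "eye": "face",
    "mouth": "face",
}


def observed_amount(part, self_amount, root="bust"):
    """Sum the self-amounts from `part` up to `root` (inclusive).

    `self_amount` maps a body part to the movement it produces on its own.
    The walk stops at `root` because the camera view is assumed to start
    at the bust, not at the full body.
    """
    total = 0.0
    node = part
    while True:
        total += self_amount[node]
        if node == root:
            return total
        node = PARENT[node]  # raises KeyError if `part` is not below `root`


# Facial movement = bust amount + facial self-amount.
amounts = {"bust": 2.0, "face": 0.5, "eye": 0.25}
print(observed_amount("face", amounts))
```

Under this reading, a part lower in the hierarchy inherits all movement of the parts above it, which is why a bust-level measurement also contains the contribution that would otherwise have to be subtracted from a facial measurement.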

To measure the amount of micro-movement, the region of interest (ROI) was first defined in the captured bust image frame on the basis of the face position detected by the adaptive boosting (Adaboost) method [5]. The amount of bust micro-movement was then calculated by subtracting two successive images. By varying the interval between the two image frames, the amounts of micro-movement at several frequency bands could be acquired. The results of a feasibility test comparing intimate and non-intimate groups showed that less movement occurred in the intimate group than in the non-intimate group.

2 Proposed Method

First, the face region was detected in the first frame of the upper body image by using the OpenCV [6] Adaboost face detector. The Adaboost method uses a strong classifier, generated by combining simple weak classifiers, to detect a face in an input image [5]. Although this algorithm requires a long training time, it offers fast detection and good detection performance. Detecting the facial region in a 1/4 decimated image took 29 ms per image on average. Figure 2 shows an example of face detection, marked by the red rectangle.

Fig. 2

Face detection results using Adaboost. (Red: detected facial region; Green: defined candidate region of bust micro-movement)

After the face detection shown in Fig. 2, the candidate region for the image subtraction used to calculate bust micro-movement was defined by expanding the facial region rectangle by 160 pixels in the horizontal directions, as shown by the green rectangles in Fig. 2.
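The region definition can be sketched as follows. This is a minimal illustration, not the original program: it assumes a single rectangle widened by 160 pixels on both the left and the right of the detected face, and it clamps the result to the frame width, which the text does not state. The function name `bust_candidate_region` and the `(x, y, w, h)` rectangle layout are hypothetical choices.

```python
def bust_candidate_region(face_rect, frame_width=640, pad=160):
    """Widen a face rectangle horizontally to obtain the bust candidate region.

    face_rect is (x, y, w, h) with (x, y) the top-left corner in pixels.
    Clamping to [0, frame_width] is an assumption made for this sketch so
    that the region never leaves the image.
    """
    x, y, w, h = face_rect
    left = max(0, x - pad)
    right = min(frame_width, x + w + pad)
    return (left, y, right - left, h)


# A 160-pixel-wide face centred in a 640-pixel-wide frame.
print(bust_candidate_region((240, 100, 160, 160)))  # (80, 100, 480, 160)
```

If the face lies close to an image border, the clamped region is narrower than the nominal face width plus 320 pixels, so W in the averaging step should be taken from the returned rectangle rather than assumed constant.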

To measure the amount of micro-movement, a camera vision analysis program was implemented. In the analysis, the captured color image was converted to a gray-level image because the color components are not important for estimating motion. The average amount of micro-movement at the F Hz frequency band \( (O_{F_{Hz}}) \) is calculated by the following equation.

$$ O_{F_{Hz}} = \frac{1}{WH} \sum_{j = y}^{H} \sum_{i = x}^{W} \left| I_{n}(i,j) - I_{R/F}(i,j) \right| $$
(1)

In Eq. (1), W and H are the horizontal and vertical lengths of the bust micro-movement candidate region, respectively, and \(I_{n}(i,j)\) is the pixel value at the ith column and jth row of the nth image frame. R is the frame rate of the camera, which was 15 frames per second in our method. The conceptual diagram of the proposed method for extracting micro-movement is shown in Fig. 3.
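For a single pair of frames, Eq. (1) reduces to a mean absolute gray-level difference over the candidate region. The sketch below assumes that both frames have already been cropped to that region and converted to gray level, and that each frame is a list of rows of integer pixel values; the function name `micro_movement_amount` is hypothetical.

```python
def micro_movement_amount(current, reference):
    """Mean absolute gray-level difference between two cropped frames.

    Both arguments are lists of rows with identical dimensions (H rows,
    W columns). The result corresponds to one value of O in Eq. (1).
    """
    height = len(current)
    width = len(current[0])
    total = 0
    for row_cur, row_ref in zip(current, reference):
        for pixel_cur, pixel_ref in zip(row_cur, row_ref):
            total += abs(pixel_cur - pixel_ref)
    return total / (width * height)


# Two 2 x 2 crops: differences are 2, 0, 5, 0, so the mean is 7 / 4.
current = [[10, 20], [30, 40]]
reference = [[12, 20], [25, 40]]
print(micro_movement_amount(current, reference))  # 1.75
```

Dividing by WH makes the value independent of the region size, so amounts from faces detected at different scales remain comparable.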

Fig. 3

Conceptual diagram of the proposed micro-movement calculation method (the rightmost images with white backgrounds are inverted for visibility)

Because the successive O values extracted with Eq. (1) form a 1D temporal signal, as shown in Fig. 4, they can be analyzed in the same way as conventional signals. In addition, the micro-movement information can be analyzed at various frequency bands. This frequency analysis is the contribution that distinguishes this chapter from the previously presented method [7]. For example, an intended big gesture can be extracted well at a low-frequency band, and its region might then serve as a mask for rejecting micro-movement measurement at a high-frequency band. Although previous background subtraction methods for object detection have problems with continuously changing backgrounds or complex background modeling, our proposed method is independent of background changes because it uses only the latest two image frames.
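The construction of one temporal signal per frequency band can be sketched as below. The sketch rests on an assumption that is not spelled out in the text: the reference image for band F is taken to be the frame R/F frames before the current one, so that at R = 15 the bands 0.5, 1, 3, 5, and 15 Hz correspond to intervals of 30, 15, 5, 3, and 1 frames. The names `mean_abs_diff` and `band_signals` are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute difference of two equally sized gray-level frames."""
    pixel_count = len(frame_a) * len(frame_a[0])
    total = sum(
        abs(p - q)
        for row_a, row_b in zip(frame_a, frame_b)
        for p, q in zip(row_a, row_b)
    )
    return total / pixel_count


def band_signals(frames, frame_rate=15, bands_hz=(0.5, 1, 3, 5, 15)):
    """Return {band: [O values]} for a sequence of cropped frames.

    Assumption: the frame interval for a band is round(frame_rate / band),
    i.e. frame n is compared with frame n - R/F.
    """
    signals = {}
    for band in bands_hz:
        step = max(1, round(frame_rate / band))
        signals[band] = [
            mean_abs_diff(frames[n], frames[n - step])
            for n in range(step, len(frames))
        ]
    return signals


# Synthetic 1 x 1 frames whose gray level rises by one per frame.
frames = [[[k]] for k in range(31)]
signals = band_signals(frames)
print(len(signals[15]), signals[0.5])  # 30 [30.0]
```

Each list in the returned dictionary is a 1D signal that can be passed to conventional temporal analysis. Lower bands yield fewer samples because the first R/F frames have no reference image.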

Fig. 4

Example of the amounts of micro-movements at several frequency bands

The proposed method can be applied to any body part in Fig. 1. If a detection method for a particular body part is performed beforehand, that part’s micro-movements or muscle movements can be measured and analyzed. For example, if the Adaboost-based face detection method is used, the micro-movement of the facial region alone can be analyzed. Figure 5 shows various micro-movement detection results obtained with the program. In this figure, bright regions indicate regions in micro-movement.

Fig. 5

Feasibility test of detecting micro-movements [7]. a Almost no micro-movement. b Bust micro-movement. c Upper facial muscle movement. d Mouth movement. e Eye blink. f Pupil movements caused by changing gaze direction

3 Experimental Result

Micro-body movement was assumed to be an intrinsic response of social emotion in this study. Our proposed method allowed social interaction free from sensor attachment and was used to compare intimate and non-intimate social emotion groups. For this purpose, 4 subject pairs participated, and each pair had a conversation about a given topic for 25 minutes; 2 subject pairs had an intimate relation with each other, while the other 2 subject pairs had no social relation. During the conversations, upper body images were captured by using a conventional webcam at a resolution of 640 × 480 pixels and 15 frames per second. The captured image frames were then analyzed to obtain the amounts of micro-movement at five frequency bands: 0.5, 1, 3, 5, and 15 Hz.

The extracted average amounts of micro-movement at the various frequency bands are shown in Fig. 6. According to this result, the amounts of micro-movement for the intimate social relation were smaller than those for the non-intimate one, except at the 0.5 Hz band. Also, the difference between the two groups was larger at higher frequency bands. From this result, non-intimate social emotion was assumed to induce a larger amount of micro-movement of the human bust. However, this result should be verified by testing statistical significance or by exploring correlations with various conventionally used bio-signal-based emotion estimation features.

Fig. 6

The average amounts of micro-movement for the two social emotion groups (intimate and non-intimate relations) at various frequency bands

4 Conclusion

We proposed a new method for estimating social emotion by analyzing micro-body movement. By allowing social interaction without the measurement burden of sensor attachment, our method has an advantage in observing the development of social emotion. Micro-body movement could be classified into two levels of intimacy. Our study successfully analyzed the micro-movement of the human bust from successive image frames captured by a conventional webcam. For that, the amount of bust micro-movement was measured by subtracting two adjacent image frames. Because the measured successive values of bust movement form a 1D temporal signal, conventional temporal signal processing methods can be applied. The results showed that micro-movement in the intimate relation was less than in the non-intimate relation.

In future work, we will experimentally validate the connectivity between the micro-movement of each body part and various kinds of conventional physiological responses. For example, we will analyze the correlation between the pulse-to-pulse interval and the amount of bust micro-movement after acquiring both the ECG signal and bust movement for a specific social emotion. Also, the higher frequency band signal will be partially rejected by masking the big-gesture sections extracted from the low-frequency band.