
1 Introduction

The recent spread of broadband networks and social networking services has made it easy for workers to collaborate remotely using video, audio, and document sharing through multipoint video conferencing systems. Workers who collaborate at the same location, on the other hand, often use physical assets such as paper and whiteboards. Making these physical assets usable and sharing workers' physical states are therefore assumed to be desirable for collaboration between remote workers.

An improved method for sharing physical states is needed to enhance remote collaboration when users are located in different physical spaces, as depicted in Fig. 1. In this example of remote collaboration, several people undertake a task in a conference room in space X, while a worker in space Y participates remotely. The workers in space X may focus only on the discussion among themselves, which causes problems if they forget to share the discussion with the remote worker in space Y. To resolve this problem, a remote collaboration system needs to ensure that the workers in space X naturally share their working states with the remote worker in space Y. If the workers in space X are aware of the presence of the remote worker, the collaboration will proceed more smoothly.

To support remote collaboration, realistic methods that represent face and gaze direction with a 3D display have been proposed [3, 7, 8]. Such methods are expected to effectively support awareness of remote workers, but they have two issues: first, video transmission requires a very high-capacity network, and second, video generation requires many input devices that are not easy to set up. An alternative method that does not use video is therefore worth considering.

In this research, a module combining a vibration sensor and a distance sensor is proposed, prototyped, and tested. The vibration sensor detects signals associated with worker behavior, and the distance sensor detects the area in which the worker is present. Each sensor is extremely compact, is easy to integrate into the environment, and can detect various kinds of information about the worker. Such combinations of sensors create an ambient sensing environment, which is expected to develop further with the Internet of Things (IoT).

Fig. 1. Concept of remote collaboration.

2 Related Research

Recognition of a remote worker’s state is an important factor in remote collaboration. Research related to this factor has focused on awareness support [2, 4, 5, 11]. Various methods to estimate worker states have been investigated, but each method has some drawbacks.

The first method estimates a user state from the operational log of a personal computer (PC). Hashimoto et al. [6] obtained useful information from workers with this method, but it could not distinguish each individual user's state within the log of the shared workspace.

The second method estimates worker states with wearable sensors attached to the worker. Olguin et al. [1] had workers wear various sensors, such as microphones, acceleration sensors, and infrared sensors, to measure their conditions. This direct measurement captured detailed information, but attaching sensors to users can be a major hindrance in remote collaboration.

The third method estimates worker states with a microphone or a camera. Kennedy et al. [9] installed a microphone in the environment to estimate worker states during a conference, but this method had difficulty estimating the states of participants who were not speaking. Otsuka [10] reconstructed the states of the speaker and the listener from video images. The camera method can estimate various kinds of information, but it has difficulty adapting to varying conditions and does not work under physical occlusion.

In this research, a method that combines vibration sensor data and distance sensor data to estimate worker states is investigated through prototyping and testing. This method is not intended as a replacement for the methods above, but may supplement them in the emerging IoT environment.

3 Estimation Method and Prototype Module

A prototype module with a combination of vibration and distance sensors was developed. The module is used simply by placing it on a desk (Fig. 2). The module has one vibration sensor and one distance sensor, and receives data from both sensors synchronously via an Arduino microcontroller. The data are transmitted to and stored on a PC.
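On the PC side, the synchronized sensor stream must be parsed before storage. The following is a minimal sketch of such a receiver, assuming a hypothetical "vibration,distance" line format of 10-bit A/D values; the actual serial protocol of the prototype is not specified in the paper.

```python
# Sketch of parsing synchronized samples sent by the Arduino.
# The "vib,dist" line format is an assumption for illustration,
# not the authors' actual protocol.

def parse_sample(line: str) -> tuple[int, int]:
    """Parse one 'vibration,distance' line of 10-bit A/D values (0-1023)."""
    vib_str, dist_str = line.strip().split(",")
    vib, dist = int(vib_str), int(dist_str)
    if not (0 <= vib <= 1023 and 0 <= dist <= 1023):
        raise ValueError("value outside 10-bit A/D range")
    return vib, dist

# With real hardware this would wrap a serial port, e.g. with pyserial:
#   import serial
#   with serial.Serial("/dev/ttyACM0", 115200) as port:
#       vib, dist = parse_sample(port.readline().decode())
```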

Fig. 2. Prototype module with Arduino.

To classify the worker state, the relationship between the sensor data and the state of the worker is analyzed with a self-organizing map (SOM). In the data analysis, a Fourier transform is applied to the vibration sensor data, and the distance sensor data are averaged over a short time.
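The clustering step can be sketched with a minimal SOM. The implementation below is a stand-in for whatever SOM software the authors used; the map size, learning-rate schedule, and Gaussian neighborhood are illustrative assumptions.

```python
import numpy as np

# Minimal self-organizing map (SOM) sketch for clustering feature
# vectors; hyperparameters here are assumptions, not the paper's.

class SOM:
    def __init__(self, rows: int, cols: int, dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w = rng.random((rows, cols, dim))      # unit weight vectors
        self.grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                         indexing="ij"), axis=-1)

    def bmu(self, x: np.ndarray) -> tuple:
        """Grid coordinates of the best-matching unit for input x."""
        d = np.linalg.norm(self.w - x, axis=-1)
        return np.unravel_index(np.argmin(d), d.shape)

    def train(self, data: np.ndarray, epochs: int = 50,
              lr0: float = 0.5, sigma0: float = 2.0) -> None:
        n, t = epochs * len(data), 0
        for _ in range(epochs):
            for x in data:
                lr = lr0 * (1 - t / n)              # decaying learning rate
                sigma = sigma0 * (1 - t / n) + 0.1  # shrinking neighborhood
                bi, bj = self.bmu(x)
                dist2 = np.sum((self.grid - np.array([bi, bj])) ** 2, axis=-1)
                h = np.exp(-dist2 / (2 * sigma ** 2))[..., None]
                self.w += lr * h * (x - self.w)     # pull units toward x
                t += 1
```

After training, each feature vector is assigned to its best-matching unit, and contiguous regions of the map (such as the three areas reported in Sect. 4) form the clusters.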

4 Test of Prototype Module

4.1 Test Method

The prototype module was tested on its ability to infer four states: “writing on a desk,” “key typing,” “viewing a PC monitor,” and “leaving a seat.” The first three states are shown in Fig. 3. These four states were selected as typical activities in work that uses both a PC and physical assets.

Fig. 3. Worker states in the test of the prototype module.

During the 20-minute test, a user works at a desk and at times leaves it. The sensor data are sampled at 1 kHz, and the user is video-recorded so that the sensor data can be labeled with user states. The window size of the Fourier transform on the vibration sensor data is 5 s, and the distance sensor data are averaged over the same interval.
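With these parameters, each analysis window covers 5000 samples. The per-window feature extraction described above can be sketched as follows; the exact feature layout (magnitude spectrum concatenated with the mean distance) is an assumption.

```python
import numpy as np

# Sketch of per-window feature extraction: a 5 s window of 1 kHz
# vibration samples is Fourier-transformed, and the distance samples
# over the same window are averaged. The feature layout is an
# assumption for illustration.

FS = 1000          # sampling rate [Hz]
WINDOW = 5 * FS    # 5 s window -> 5000 samples

def window_features(vib: np.ndarray, dist: np.ndarray) -> np.ndarray:
    assert len(vib) == len(dist) == WINDOW
    # Remove the DC offset, then take the one-sided magnitude spectrum.
    spectrum = np.abs(np.fft.rfft(vib - vib.mean()))
    mean_dist = dist.mean()  # short-time average of the distance sensor
    return np.concatenate([spectrum, [mean_dist]])
```

With a 5000-sample window the spectrum has a 0.2 Hz bin spacing, so a vibration component at 50 Hz, for example, lands in bin 250.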

Fig. 4. Data from both sensors and vibration data labeled with worker states. The y-axis shows the 10-bit A/D converter value from the Arduino.

Fig. 5. Results of the short-time Fourier transform of the vibration sensor data.

Fig. 6. SOM resulting from the analysis.

4.2 Result and Discussion

Figure 4 shows the data from both sensors, with the vibration data labeled with the worker states. Figure 5 shows the results of the short-time Fourier transform of the vibration sensor data, and Fig. 6 shows the resulting SOM. The rate of correct worker states included in each cluster was 63%.

The SOM classified the sensor data into three areas, as shown in Fig. 6. Area 1 contained the state “key typing,” area 2 contained the two states “writing on a desk” and “viewing a PC monitor,” and area 3 contained the state “leaving a seat.” Comparing Fig. 4 with Fig. 5 shows that the vibration data was strongly related to the state “key typing”; the difference in vibration properties therefore drove the separation of area 1 from area 2. Figure 4 also shows that the distance data was related to the state “leaving a seat,” so the distance sensor drove the classification into area 3. These effects suggest that the two sensors can be useful in estimating worker states.

In this test, area 2 contained two states, illustrating that the data from the two sensors cannot easily distinguish “writing on a desk” from “viewing a PC monitor.” The vibration level for “writing on a desk” was small owing to frequent pauses in the writing motion, and the vibration level for “viewing a PC monitor” was also low. To distinguish these two states, we will consider the duration of low-vibration activity, in particular the intermittent motion of writing.
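The proposed refinement, measuring how long the vibration stays low, can be sketched as a run-length computation. Intermittent writing should produce many short quiet runs punctuated by bursts, while monitor viewing should produce one long quiet run; the threshold is an illustrative assumption.

```python
import numpy as np

# Sketch of the proposed duration-based feature: lengths of contiguous
# low-vibration runs. The threshold value would need tuning; it is an
# assumption, not a value from the paper.

def quiet_run_lengths(vib: np.ndarray, threshold: float) -> list[int]:
    """Lengths (in samples) of contiguous runs where |vibration| < threshold."""
    quiet = np.abs(vib) < threshold
    runs, count = [], 0
    for q in quiet:
        if q:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs
```

A classifier could then compare, for example, the maximum or median quiet-run length within a window against the 5 s analysis window to separate the two low-vibration states.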

The vibration sensor data vary depending on the material that transmits the vibration, and the distance sensor data change with the relative position of the worker's seat and the sensor. The relationship between the sensor data and the worker's state will therefore be investigated together with the state of the working environment. To estimate more detailed worker states, this method will also be investigated in combination with other methods that use cameras, microphones, PC logs, etc.

5 Conclusion

Recognizing a worker's state at a remote site is necessary to develop a sense of shared work in remote collaboration. In this research, a prototype module that combines vibration sensor data and distance sensor data was developed, and its ability to classify four worker states was tested.

The test confirmed that three states (“key typing,” “leaving a seat,” and “other”) can be automatically classified from the sensor data using a SOM. In the future, the prototype module will be applied to remote collaboration to evaluate its effect on a local worker's recognition of the remote worker.