1 Introduction

The traditional mode of working is changing. Over the past decade, communications technology has evolved rapidly, and a variety of online interactive tools have emerged. Using the Internet, people can communicate with others anytime and anywhere, free of time and space constraints. Moreover, continual functional improvements bring online communication closer to real-time face-to-face communication and make it more effective. As a result, increasingly sophisticated online interactive technologies are being accepted by the public and are becoming widespread. This trend has also changed modern corporations. Corporate activity is increasingly moving away from the traditional working mode in which everyone comes to work in the same place at the same time. With online interactive technologies, a new working mode has emerged: teleworking.

Teleworking, also known as telecommuting or virtual work, has been defined as a work mode that involves using communication and computer technologies to work from home or another location away from the traditional office [1]. As an alternative to the traditional mode of work, teleworking has many benefits. Employees can complete job requirements without being physically present at the employer’s office. Flexible working hours and a free choice of workplace reduce work pressure and increase productivity. Companies may consider teleworking as a means of cost control and, potentially, as an alternative to some layoffs [2]. Because of these benefits, teleworking has become popular in modern corporate activities, and its prospects are promising.

However, teleworking also makes organizational management more difficult. Compared to the traditional working mode, teleworking increases the distance between employees and the company. The increased distance is not only physical but also psychological, and it can cause adverse effects such as increased role ambiguity and reduced support and feedback, resulting in employee engagement issues [3]. A decline in employee engagement can negatively affect the company in many respects, so maintaining engagement has always been an essential topic in organizational management. For teleworking to be more effective, it is imperative to address its negative effect on engagement.

To address this issue, the concept of interactive people analytics has been proposed [4]. In this paper, we apply it by using a short video as a daily report in teleworking. Employees record the progress of their day in a short video and report it to their managers. We aim to understand changes in a reporter’s engagement by analyzing the subjective information that the reporter unconsciously reveals when recording the video. Managers can then give targeted feedback to help employees adjust. Because teleworkers no longer work in the same place, this process can compensate for the reduced interaction between organization members. We hope it can help reduce the distance between employees and the company and thereby maintain engagement.

To investigate the effectiveness of our interactive people analytics approach, in this paper we explored the utility of using a short video as a daily report in teleworking. First, we collected video report samples as the analytical dataset, together with the reporter’s ratings of each video report on engagement level and on our original scale. Then we manually extracted paralinguistic cues, which are rich in subjective information, from the video reports. We extracted two kinds of paralinguistic cues: filled pause and silent pause. Finally, we analyzed the relationship between the occurrence of paralinguistic cues and the rating results. Our results illustrate a relationship between engagement and video reports: the frequency and duration rate of filled pauses are significantly higher at the lower engagement level. For our original scale, the relationship between rating results and paralinguistic cues reveals the potential of using video reports to understand employees’ subjective evaluations.

2 Related Work

2.1 Employee Engagement

Employee engagement is defined as working with passion and a willingness to drive organizational goals, and it is theorized in terms of physical, cognitive, and emotional factors. Engagement can be critical for employees and their organizations: it can have a significant effect on productivity, performance, resilience, and organizational profitability [5]. Because high employee engagement has many positive effects on an organization, maintaining and improving it has always been an essential topic in organizational management and has been studied extensively.

Regarding the factors that affect employee engagement, many studies show that executives [6,7,8] and managers [9, 10] play an essential role in shaping and enhancing the engagement of their employees. They explain that superiors can have a significant impact on employee engagement and that appropriate feedback from managers can help improve it. In addition, Muller et al. discussed the influence of peers, friends, and managers on engagement [11]. Their research gives a more social view of employee engagement, showing that an employee’s engagement is associated with the engagement of her/his peers, friends, and managers. It shows that not only the manager but also the interaction between organizational members is a crucial factor influencing engagement.

In terms of ways to maintain and enhance employee engagement, Mitra et al. discussed the spread of engagement in a large organizational network [5]. They examined how engagement and disengagement spread from one employee to another. Their results suggest that organizations need to sense and address workplace disengagement promptly in order to maintain engagement. Beyond such conventional approaches, Morales et al. discussed civic engagement using gamification [12]. They proposed a gamified volunteer management platform to enhance engagement and satisfaction. Gamification has been employed to enhance user attraction, satisfaction, and retention in a wide variety of applications, and their work offers a new perspective on enhancing engagement. Previous studies have thus discussed the factors that affect employee engagement from various perspectives and have offered many ways to increase it. However, there is limited research on increasing employee engagement under teleworking conditions.

2.2 People Analytics

From the perspective of organizational management, in this data-centric era the data collected by various sensors constitute a vast data set that supports human resource management. People analytics is one such trend and deals with the analysis of human resources data. Waber combined human resource management with big data and presented the concept of people analytics [13]. It can be described as using data analytics to improve organization and human resource management; it is fully data-centered and always needs unbiased and consistent measurement. It suggests using sensors and analytics to understand how employees work and collaborate, and to build a more effective, productive, and positive organization.

Regarding the practical use of people analytics in organizational management, Xu et al. proposed an approach that enables data mining of inter-company talent transitions using an online business social network [14]. They created a job transition network to extract the characteristics of talent circles. Based on this, they developed a talent circle model and designed a talent exchange prediction method for talent recommendation. Roy et al. proposed a conceptual ontology to evaluate human factors [15]. They applied data mining techniques to an email corpus, using sentiment analysis to evaluate six components of human resource constructs: performance, engagement, leadership, workplace dynamics, organizational developmental support, and learning and knowledge creation. These studies illustrate that people analytics techniques can make organizational management more effective. However, how to leverage people analytics for teleworking has not been explored in depth. Moreover, because engagement is a subjective state related to emotions, whether people analytics based on objective data is suitable for it needs to be reconsidered.

2.3 Paralinguistic Cues

Speech information includes verbal and non-verbal information, and the latter includes paralanguage. Unlike verbal information, paralinguistic information is content-free, but it plays an essential role in speech communication. It is informative and can help us understand what is hidden behind the speech. Paralinguistic information appears in speech as paralinguistic cues. These cues can add context to an utterance, such as information on how the speaker feels at a given moment. By analyzing these cues over time during an interaction, one also gets some hints about the emotional profile of the speaker [16].

Paralinguistic cues fall into many categories, including filled pause, silent pause, repetition, pitch, speech rate, energy, and loudness [17]. Many prior studies have shown that paralinguistic cues are closely related to subjective states such as emotion and mental state. Wang et al. showed the relationship between speech rate and emotional mood [18]. Their results showed that when people are happy, angry, scared, or in another excited mood, their speech naturally becomes faster. Conversely, negative moods such as sadness or boredom lead to slower speech. Goto et al. suggested that filled pauses are also related to the mental state of the speaker [19]. Their results indicated that speakers unconsciously use filled pauses to express mental states such as diffidence, anxiety, hesitation, and humility, and also to express different thinking states, such as retrieving information from memory. In addition, Lee et al. discussed the connection between silent pauses and the speaker’s state of stress [20]. A stressed speaker unconsciously uses more silent pauses than an unstressed one, and the silent pauses are also longer than usual. Although the relationships between paralinguistic cues and subjective states such as mood and mental state have been addressed extensively in previous studies, the relationship between paralinguistic cues and employee engagement remains unexplored.

3 Proposal

3.1 Interactive People Analytics

As mentioned before, people analytics is deeply data-driven. However, it is questionable how reliably the subjective mental state or engagement of employees can be measured and evaluated from such objective data alone. As an alternative, a previous study proposed the concept of interactive people analytics [4]. Interactive people analytics is defined as data analytics that focuses on interaction with organization members to improve engagement and the achievement of organizational goals. It enhances interaction between people during data gathering and provides feedback for further interaction.

People analytics requires unbiased and consistent measurements, and in many cases these conditions are not easy to ensure. In contrast, interactive people analytics emphasizes human-centeredness and deliberately introduces intended biases into the measurement. It also holds that the act of reporting can itself change people, or trigger them to change, in the direction they wish. It allows biased and intended interaction for positive organizational development.

3.2 Using a Short Video as a Daily Report

To implement the concept of interactive people analytics, we propose a new approach that uses a short video as a daily report in telework. Employees record and upload video reports, and managers give feedback based on the content. Through this process, interaction between organizational members can be realized in teleworking. Reporting in video form achieves the human-centered goal while collecting data. Further analysis of the video reports may make it possible to capture changes in employee engagement. Employees whose engagement is declining can be identified, and managers can then give them targeted and appropriate feedback based on the content, increasing interaction with them and helping them adjust. In this way, the goal of maintaining employee engagement can be achieved.

Our approach has several benefits. First, recording a video report does not require a specific place or recording device. It can be done anytime and anywhere using only a mobile device such as a mobile phone. A short video report of about 30 s a day also does not place a heavy burden on employees. Second, compared with a traditional text report, a video report makes it easier to understand changes in engagement. A previous study shows that video is better than text at engaging the user emotionally in the content [21]. We therefore expect video to contain more subjective information than text, and this information is closely related to engagement. Third, making video reports a daily routine can also play a role in self-management. Another study shows that using more self-management strategies can help employees improve their engagement [22]. Regularly reviewing previous video reports can encourage introspection and help employees maintain their engagement.

Our hypothesis is that when employee engagement changes, the content of the video report changes accordingly. For example, suppose an employee’s work is not going well and his engagement declines as a result. His report may then reflect this, for instance through more pauses than usual. If our hypothesis is correct, it supports the utility of our approach. Therefore, in the remainder of this paper, we conduct a utility study of our approach, focusing on the relationship between engagement and video reports.

4 Utility Study of Daily Video Report

4.1 Material

Daily Video Report Dataset.

To investigate the utility of short videos as daily reports, we first collected sample daily video reports and built a dataset for further analysis. The sample video reports were all recorded by a male Japanese employee in his 50s [4]. He works for an IT company based in Tokyo, Japan, and is involved in development-related work. He persistently recorded video reports for over two years, from April 2017 to May 2019, for a total of 418 video reports: 142 in 2017, 194 in 2018, and 82 in 2019. He recorded the video reports using the front camera of his smartphone and reported that day’s work progress in Japanese. After recording, he uploaded each video to Google Photos. An example of the daily video reports is shown in Fig. 1. We used all 418 videos for the investigation without any screening.

Specifically, his recording process for each video report was as follows. First, before recording started, the reporter spent around one minute writing a simple text summary while reviewing what he had done that day. After that, he sat in front of the camera and reported for around 30 s. During the recording, his upper body was in the camera frame. The report content included items such as the date, the progress of ongoing projects, business meeting attendance, the future work schedule, problems encountered at work, and other matters related to his work.

Video reports in this dataset have the following features. Given the nature of the report content, the reporter almost always maintained a neutral tone when speaking. Likewise, his expression remained serious and almost unchanged. In addition, the reporter was not teleworking. Although there was no restriction on the recording place, most of the reports were recorded at his workplace, and for this reason they did not contain loud speech.

Rating of the Engagement Level.

After obtaining the sample video reports, we measured the reporter’s level of engagement. Employee engagement is an abstract and subjective state that is difficult to measure directly, so we investigated the engagement level using a questionnaire. We used the UWES-3 (Utrecht Work Engagement Scale-3) self-report questionnaire [23]. The UWES is based on in-depth interviews and was introduced as a 17-item self-report questionnaire that covers three dimensions: vigor, dedication, and absorption. To reduce the demand placed on the reporter, we used a shortened version, the UWES-3, which has only three questions and therefore places only a small burden on the reporter. Each question is rated on a scale from 0 to 6. The reporter recorded all the video reports first and then, at a later date, rated the engagement level for each day based on the content of the video report and his memory.
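The per-video engagement score can be illustrated with a short sketch. The text does not state how the three item ratings were combined, so the mean of the three items is an assumption here, as are the function name `uwes3_score` and the mapping of one item to each of the three dimensions.

```python
from statistics import mean


def uwes3_score(vigor: int, dedication: int, absorption: int) -> float:
    """Combine three UWES-3 item ratings (each 0 to 6) into one score.

    Assumption: the score is the plain mean of the three items; the
    source only says that each question is rated from 0 to 6.
    """
    items = (vigor, dedication, absorption)
    for value in items:
        if not 0 <= value <= 6:
            raise ValueError(f"item rating {value} is outside the 0-6 range")
    return float(mean(items))


# Hypothetical ratings for a single video report.
print(uwes3_score(2, 3, 2))
```

Under this assumption, a per-video score stays on the same 0 to 6 range as the individual items, which keeps it directly comparable with the reported average rating.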

Fig. 1. Example of daily video report samples.

Rating by Original Scale.

In addition to the engagement level, we asked the reporter to rate each video report on the following original scale, in order to explore what video reports can convey beyond the known engagement level. This rating was also conducted at a later date. Our original scale consists of the following three items:

  1. Satisfaction. This item refers to the degree to which the reporter is satisfied with his work. It is rated as high, medium, or low.

  2. Clarity. Clarity here refers not to the clarity of sound in the video, but to the clarity of the reported content. Specifically, the item evaluates whether a report contains clearly observable progress of the work. It is rated as high, medium, or low.

  3. Assessment. This item refers to the degree to which the content includes self-assessment aspects. Specifically, some tasks are routine and some are not. Routine tasks typically require less self-assessment, while non-routine tasks require more self-assessment, such as recognition of achievement or recognition of problems. It is rated as high, medium, or low.

4.2 Method

We were interested in the relationship between the paralinguistic cues in the video reports and the ratings, and processed the material as follows. We extracted paralinguistic cues from the daily video report dataset and focused on two kinds, filled pause and silent pause, as these two were observed much more often than other paralinguistic cues. For the ratings, we tallied the results for each item and divided the videos into groups. The details are described below.

Paralinguistic Cues Extraction.

A previous study suggests that the mental state affects the occurrence of filled pauses [19]: speakers unconsciously use pauses to express mental states. Filled pauses also appear frequently in our daily video report dataset, and we wanted to explore whether their occurrence is related to our rating results. For each video report, we measured the frequency of filled pauses and the rate of filled pause duration. We used the ELAN software to mark the filled pauses in each video report manually, with a precision of 0.01 s. Our counting rule was to count typical Japanese fillers such as /ee-/, /maa-/, and /ano-/, as well as most word-lengthening sounds, as filled pauses [19]. We also measured the duration of each video from the beginning of the reporter’s speech to the end of the last sentence, which we call the actual speech duration. Using the actual speech duration of each video report, we normalized the counts and durations to obtain the frequency of filled pauses (times/min) and the rate of filled pause duration (%).
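The normalization described above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `filled_pause_metrics` and the representation of annotations as `(start, end)` pairs in seconds are assumptions, and the example values are made up.

```python
def filled_pause_metrics(pauses, speech_start, speech_end):
    """Return (frequency in times/min, duration rate in %) for one video.

    pauses: list of (start, end) times in seconds for each marked filled
        pause (hypothetical format for the manual annotations).
    speech_start, speech_end: bounds of the actual speech duration, from
        the start of the reporter's speech to the end of the last sentence.
    """
    speech_duration = speech_end - speech_start
    if speech_duration <= 0:
        raise ValueError("actual speech duration must be positive")
    count = len(pauses)
    total_pause_time = sum(end - start for start, end in pauses)
    frequency_per_min = count * 60.0 / speech_duration
    duration_rate_pct = total_pause_time * 100.0 / speech_duration
    return frequency_per_min, duration_rate_pct


# Made-up annotations for a 30 s report with three filled pauses.
example_pauses = [(2.0, 2.5), (10.0, 10.75), (20.5, 20.75)]
print(filled_pause_metrics(example_pauses, 0.0, 30.0))
```

In this made-up example, three filled pauses totalling 1.5 s in 30 s of speech correspond to 6 filled pauses per minute and a duration rate of 5%. Normalizing by the actual speech duration makes reports of slightly different lengths comparable.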

Under high mental stress, speakers tend to pause more while speaking [20]. Because each video report is only around 30 s long in our case, a silent pause of several seconds may carry a wealth of information. Moreover, because the reporter wrote the text summary before recording the video report, long or frequent silent pauses would be unusual. We therefore also calculated the frequency of silent pauses and the rate of silent pause duration for each video report. Here we counted only silent pauses longer than 1.0 s.
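The 1.0 s threshold can be illustrated with a short sketch. The source does not describe how silent pauses were located, so deriving them as gaps between consecutive speech segments is an assumption, as are the function name `silent_pauses` and the `(start, end)` segment format.

```python
MIN_SILENT_PAUSE_S = 1.0  # threshold stated in the text


def silent_pauses(speech_segments, min_duration=MIN_SILENT_PAUSE_S):
    """Return the gaps between consecutive speech segments that are
    longer than min_duration, as (gap_start, gap_end) pairs in seconds.

    speech_segments: list of (start, end) times of continuous speech
        (hypothetical format; segments are assumed not to overlap).
    """
    ordered = sorted(speech_segments)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Strictly "more than" the threshold, as in the text.
        if next_start - prev_end > min_duration:
            gaps.append((prev_end, next_start))
    return gaps


# Made-up segments: gaps of 1.5 s, 0.5 s, and exactly 1.0 s.
segments = [(0.0, 8.0), (9.5, 15.0), (15.5, 22.0), (23.0, 30.0)]
print(silent_pauses(segments))
```

Only the 1.5 s gap is kept in this example; the gap of exactly 1.0 s is excluded because the rule counts pauses of more than 1.0 s. The resulting pauses can then be normalized by the actual speech duration in the same way as the filled pauses.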

Video Grouping According to the Rating.

For the engagement item, the average score of the 418 ratings was 2.31. We first arranged the 418 rating results in descending order and then divided them, in that order, into two groups with the same number of samples: the high engagement group (209 samples) and the low engagement group (209 samples).
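The split into two equal-sized groups can be sketched as below. The function name `split_by_engagement` and the dictionary input are assumptions for illustration. The text does not say how tied ratings at the boundary were assigned; this sketch keeps tied videos in their original order, which is only one possible choice.

```python
def split_by_engagement(scores):
    """Split videos into (high, low) engagement groups of equal size.

    scores: dict mapping a video identifier to its engagement rating
        (hypothetical format). Videos are ranked in descending order of
        rating; the top half forms the high group. Ties keep insertion
        order because Python's sort is stable.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


# Made-up ratings for four videos.
example = {"v1": 3.0, "v2": 1.0, "v3": 4.33, "v4": 2.0}
high, low = split_by_engagement(example)
print(high, low)
```

With an even number of videos, as in the 418-video dataset, both groups have the same size. With an odd number, this sketch would place the extra video in the low group.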

For the original scale, videos were divided directly into groups according to their three-category ratings. For satisfaction, 67 videos were rated high, 124 medium, and 227 low. For clarity, 181 were rated high, 225 medium, and 12 low. For assessment, 67 were rated high, 22 medium, and 329 low.

For the analysis, we first calculated the mean (M) and the corresponding standard deviation (SD) of each paralinguistic cue measure, covering both frequency and duration rate, in each group for each rating item. Then we conducted a one-way analysis of variance (ANOVA) in SPSS to check whether there was a statistically significant difference between groups. Furthermore, for the original scale, we conducted Fisher’s Least Significant Difference (LSD) test to determine whether there was a statistically significant difference between each pair of groups.
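The test statistics behind this procedure can be sketched in plain Python. The analysis in the paper was run in SPSS; the code below is only an illustration of the standard one-way ANOVA F statistic and the Fisher LSD t statistic, with hypothetical function names and made-up data. It does not compute p-values, which would require the F and t distributions from a statistics library.

```python
import math
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, mse) for a list of samples."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    mse = ss_within / df_within  # pooled within-group variance
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


def lsd_t(group_a, group_b, mse):
    """Fisher LSD t statistic for one pair of groups, using the pooled
    mean square error from the ANOVA (df = df_within)."""
    standard_error = math.sqrt(mse * (1 / len(group_a) + 1 / len(group_b)))
    return (mean(group_a) - mean(group_b)) / standard_error


# Made-up filled pause frequencies (times/min) for three rating groups.
low, medium, high = [6, 8, 7, 9], [5, 6, 7, 6], [3, 4, 2, 3]
f_stat, df_b, df_w, mse = one_way_anova([low, medium, high])
print(f_stat, df_b, df_w)
print(lsd_t(low, high, mse))
```

The LSD test reuses the pooled error term from the ANOVA instead of estimating a separate variance for each pair, which is why `lsd_t` takes `mse` as an argument. In this sketch a positive t means that the first group has the higher mean.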

4.3 Results

Engagement Level.

We explored the effect of different engagement levels on filled pauses in video reports. The relationship between filled pause and engagement level is shown in Fig. 2. One-way ANOVA found a significant difference in the frequency of filled pauses between engagement levels, with more frequent filled pauses at the low engagement level (F = 14.143, p < .001). It also found a significant difference in the rate of filled pause duration between engagement levels, with a higher rate at the low engagement level (F = 6.163, p = .013).

Fig. 2. (a) The relationship between engagement level and filled pause frequency. (b) The relationship between engagement level and filled pause duration.

Fig. 3. (a) The relationship between engagement level and silent pause frequency. (b) The relationship between engagement level and silent pause duration.

The relationship between silent pause and engagement level is shown in Fig. 3. With one-way ANOVA, we did not find a significant difference between engagement levels in either the frequency of silent pauses (F = .027, p = .869) or the rate of silent pause duration (F = .114, p = .736).

Satisfaction.

Figure 4 shows the relationship between satisfaction level and filled pause. One-way ANOVA did not find a significant difference in the frequency of filled pauses between satisfaction levels (F = 2.155, p = .117). It found a marginally significant difference in the duration rate of filled pauses between satisfaction levels (F = 2.864, p = .058), with a lower duration rate at the high satisfaction level. Follow-up LSD tests found differences between low and high (t = 2.167, p = .031) and between medium and high (t = 2.252, p = .025).

Figure 5 shows the relationship between satisfaction level and silent pause. One-way ANOVA found a marginally significant difference in the frequency of silent pauses between satisfaction levels (F = 2.684, p = .069), with less frequent silent pauses at the high satisfaction level. A follow-up LSD test found a difference between medium and high (t = 2.289, p = .023). On the other hand, ANOVA did not find a significant difference in the duration rate of silent pauses between satisfaction levels (F = .921, p = .399).

Fig. 4. (a) The relationship between satisfaction level and filled pause frequency. (b) The relationship between satisfaction level and filled pause duration.

Fig. 5. (a) The relationship between satisfaction level and silent pause frequency. (b) The relationship between satisfaction level and silent pause duration.

Clarity.

The relationship between clarity level and filled pause is shown in Fig. 6. One-way ANOVA found a significant difference in the frequency of filled pauses between clarity levels (F = 30.664, p < .001), with less frequent filled pauses at the high clarity level. Follow-up LSD tests found differences between low and high (t = 3.096, p = .002) and between medium and high (t = 7.643, p < .001). ANOVA also found a significant difference in the duration rate of filled pauses between clarity levels (F = 43.963, p < .001), with a lower duration rate at the high clarity level. Follow-up LSD tests found differences between low and high (t = 4.389, p < .001) and between medium and high (t = 8.957, p < .001).

The relationship between clarity level and silent pause is shown in Fig. 7. One-way ANOVA did not find a significant difference between clarity levels in either the frequency of silent pauses (F = .487, p = .615) or the duration rate of silent pauses (F = 1.738, p = .177).

Fig. 6. (a) The relationship between clarity level and filled pause frequency. (b) The relationship between clarity level and filled pause duration.

Fig. 7. (a) The relationship between clarity level and silent pause frequency. (b) The relationship between clarity level and silent pause duration.

Assessment.

Figure 8 presents the relationship between assessment level and filled pause. One-way ANOVA found a marginally significant difference in the frequency of filled pauses between assessment levels (F = 2.465, p = .086), with less frequent filled pauses at the high assessment level. A follow-up LSD test found a difference between low and high (t = 2.210, p = .028). By contrast, ANOVA did not find a significant difference in the duration rate of filled pauses between assessment levels (F = 1.599, p = .203).

Figure 9 shows the relationship between assessment level and silent pause. One-way ANOVA found a significant difference in the frequency of silent pauses between assessment levels (F = 4.233, p = .015), with less frequent silent pauses at the low assessment level. A follow-up LSD test found a difference between low and high (t = −2.689, p = .007). ANOVA also found a significant difference in the duration rate of silent pauses between assessment levels (F = 8.489, p < .001), with a lower duration rate at the low assessment level. Follow-up LSD tests found differences between low and medium (t = −2.552, p = .011) and between low and high (t = −3.481, p = .001).

Fig. 8. (a) The relationship between assessment level and filled pause frequency. (b) The relationship between assessment level and filled pause duration.

Fig. 9. (a) The relationship between assessment level and silent pause frequency. (b) The relationship between assessment level and silent pause duration.

5 Discussion

5.1 Summary of Results

To summarize the analysis results on engagement, the frequency and duration rate of filled pauses were significantly higher at the low engagement level than at the high engagement level. In contrast, we did not find any relationship between silent pauses and engagement level.

In terms of our original scale, we did not find a relationship between the frequency of filled pauses and satisfaction level. The duration rate of filled pauses was higher at the low and medium satisfaction levels than at the high satisfaction level in the pairwise comparisons, although the overall ANOVA was only marginally significant. Likewise, silent pauses were more frequent at the medium satisfaction level than at the high satisfaction level, again with a marginally significant overall ANOVA.

On the clarity item, the frequency and duration rate of filled pauses were significantly higher at the low and medium clarity levels than at the high clarity level. We did not find a relationship between silent pauses and clarity level.

Regarding the assessment item, the frequency of filled pauses was higher at the low assessment level than at the high assessment level in the pairwise comparison, with a marginally significant overall ANOVA. The frequency of silent pauses was significantly lower at the low assessment level than at the high assessment level. The duration rate of silent pauses was significantly lower at the low assessment level than at the medium and high assessment levels.

5.2 Findings

Among all the results, we found significant relationships between some rating items and paralinguistic cues. When filled pauses become more frequent and longer across a series of video reports, this might indicate a decline in engagement.

At the same time, reports with low content clarity are likely to have more frequent filled pauses and a higher duration rate of filled pauses. Frequent and long filled pauses are therefore not necessarily caused by low engagement. They might be caused, for example, by reporting tasks that are not very clear. Thus, when filled pauses become more frequent and longer across a series of video reports, we should also consider whether this comes from the reporter’s engagement level, from unclear characteristics of the reported tasks, or from the reporter’s unclear understanding of the tasks.

Moreover, we found that high assessment reports had more frequent silent pauses and a higher duration rate of silent pauses. This result suggests that it is possible to understand employees’ self-assessment by analyzing video reports: the level of self-assessment an employee conducts may be inferred from the silent pauses in their video reports.

5.3 Limitations and Future Works

The study reported in this paper has some limitations. The video report data we analyzed come from a single reporter, which makes it hard to generalize our findings. People’s tendency to express their subjective evaluation or engagement through paralinguistic cues might vary from person to person. Thus, it is not clear whether analyzing the paralinguistic cues of other people can determine their engagement in the same way. Collecting and analyzing video reports from other reporters should be considered in the near future. In addition, all ratings in this study were conducted at a later date, which may have affected the accuracy of the rating results. The later ratings might not truly reflect the reporter’s evaluation at the time. Therefore, real-time rating should also be conducted in future work. In this study, we discussed two paralinguistic cues, filled pause and silent pause, because these two were observed much more often than other paralinguistic cues in our dataset. The relationships involving other paralinguistic cues can also be considered as part of future work.

Regarding our rating items, we used the UWES-3 questionnaire, the short version of the UWES, to reduce the reporter’s burden. The reporter can indeed answer only three questions in a short time, but the limited number of questions may make it harder to capture subtle changes in engagement. Using the complete version of the UWES or other employee engagement scales is an option for future work. Also, in this study we did not use standardized questionnaires to measure satisfaction, clarity, and assessment, so the rating results can be ambiguous and subjective. We aim to study this further to tease out the relationship between these engagement-related items and paralinguistic cues.

5.4 Design Implications for Organizational Structure

Although this study is preliminary and its results are limited by the size of the data, we found indications of relationships between paralinguistic cues in the videos and engagement. In particular, our results indicate that it is possible to use video reports to understand changes in employee engagement. This suggests that our approach is useful for addressing the engagement issue in teleworking.

Maintaining a high level of engagement is critical in collaborative work and in enterprises, especially when working remotely. Teleworking has traditionally had issues concerning awareness, communication opportunities, and so forth, which have made engagement difficult to maintain. To address this issue, we propose a more active use of video technology in computerized organizational settings. This study supports the claim that using video together with analytical computing helps increase awareness of teleworkers’ inner feelings of engagement.

Whether we like it or not, teleworking has been increasing in organizations. Organizational structure should be, and will be, designed to incorporate the teleworking environment. Interactive people analytics could serve as a nerve of such novel organizations.

6 Conclusion

In this paper, we proposed a new approach based on the concept of interactive people analytics to address the engagement issue in teleworking. We proposed using a short video as a daily report to increase interaction between organizational members in teleworking, and analyzing the video reports to understand changes in employee engagement. To investigate the utility of our approach, we first collected video report samples to construct a dataset for analysis and obtained the reporter’s ratings of engagement level and of our original scale. Next, we extracted paralinguistic cues from the video reports, which are informative and closely related to the subjective state of the reporter. We extracted two paralinguistic cues, filled pause and silent pause, and then analyzed the frequency and duration rate of each cue for the different rating items. For the engagement item, our results illustrated a relationship between engagement and video reports: filled pauses were more frequent and longer at the lower engagement level. This shows the utility of using daily video reports to understand changes in engagement. For our original scale, the results show relationships between subjective evaluations and video reports. They indicate the possibility of using daily video reports to understand employees’ job satisfaction, the clarity of the reported content, and how employees conduct self-assessment in daily work. Our results suggest that using a short video as a daily report in teleworking is a promising way to address the engagement problem and support effective organizational management.