1 Introduction

The past few decades have seen the emergence of a new research paradigm, which considers human communication as a multimodal system, so that communication is increasingly being studied by considering gesture alongside speech (see e.g. Kendon 2004; McNeill 2005; Duncan et al. 2007; Poggi 2007; Cienki and Müller 2008; Gullberg and de Bot 2010; Gibbon 2011; Enfield 2012). The general effort made by theoreticians in gesture studies to re-define the realm of linguistic analysis to encompass gestural behaviour goes hand in hand with the development of multimodal corpora, where subjects are video-recorded while they interact in different types of communicative situation, and their speech and gesture behaviour is annotated with rich descriptive features. The existence of such corpora, and of specialised tools for their annotation and analysis, provides a unique opportunity for researchers from different fields to work on naturally occurring multimodal data.

In this article, we describe the Danish NOMCO corpus, which we consider an important contribution to the fields of multimodal corpora and gesture studies, not only because the corpus has specific and interesting properties related to the communicative situation in which it has been collected, but also because we believe the methods used to annotate and analyse it will be helpful to the research community. Note that the term modality in this work is used to refer to production modality (speech, and different types of gestural behaviour, e.g. head movements, facial expressions, and body posture). Following this definition, an annotated multimodal corpus is a video-recorded collection in which contributions in two or more of these modalities are annotated.

We start in Sect. 2 by describing the way the data were collected. Then in Sect. 3 we describe the annotation methods used to annotate speech as well as gestural behaviour and give counts of the various annotation features. We also analyse the relation between gesture and speech, and the way emotional attitudes are expressed. Finally, we provide an account of how inter-coder agreement was measured. In Sect. 4, we discuss a number of phenomena in light of the annotated data. These include the issue of temporal coordination between speech and gesture, the relation between gestures and focusing, and the mechanisms of multimodal feedback and turn management. The last part of the section is dedicated to an overview of results from machine learning studies carried out on the NOMCO data. Section 5 contains the conclusions.

2 The recordings

The Danish NOMCO corpus is one of a collection of first acquaintance dialogues created under the auspices of the Nordic NOMCO project. The collection consists of video-recorded and annotated conversations in Danish, Swedish, Finnish, and Estonian (Paggio et al. 2010), comparable with one another in terms of dialogue type, recording setting, and annotation methodology. Recently, a similar corpus was also recorded for Maltese (Paggio and Vella 2014).

Fig. 1 Recordings from the Danish NOMCO dialogues: total and split views

The Danish corpus, which is the focus of this article, consists of twelve recordings featuring six male and six female subjects aged 21–36, each taking part in one dialogue with a female and one with a male participant, for a total of about an hour of interaction. The two conversations took place on different days, and in both cases the dialogue participants had never seen each other before. They were told that they had about five minutes to get to know each other, as if they were at a party or a similar occasion. As a consequence, they spoke freely about any topic they wanted. The dialogues were recorded in a studio, with the participants standing in front of each other on a carpet that delimited the possible distance between them. Each dialogue was filmed by three cameras, as shown in Fig. 1. The video format is MOV (six files) and AVI (six files), both using the CINEPAK codec. The audio is uncompressed (44.1 kHz); five files were recorded in stereo and seven in mono. The three camcorders and two cardioid microphones used were synchronised by the IT and Media group at the Faculty of Humanities of the University of Copenhagen.

Table 1 Self-assessment scores (Likert scale 0–5, N = 12)

It was a goal of the project to create a multimodal corpus of natural and free conversations. It was important, therefore, to ensure that the conversations proceeded in as natural a way as possible in spite of their being recorded in a studio. Accordingly, the participants were not made to wear any kind of equipment, not even microphones. In addition, to assess how much they were affected by the artificial setting, after each conversation the subjects filled in a questionnaire with questions about the setting, the interaction and the emotional attitudes felt during the dialogues (Paggio and Diderichsen 2010). Each question was answered by assigning a score on a Likert scale from 0 to 5. The results are shown in Table 1. In general, the participants were positive about the interaction and not too affected by the setting. They felt quite free to express themselves (the average score for Free is 4.13) and relaxed (the average scores for Relaxed and Not affected are 3.58 and 3.46, respectively). Importantly, the subjects scored these dimensions positively even though they judged the setting to be slightly unnatural (the average score for Natural is 2.33). To sum up, based on the results of the questionnaires, and given the important fact that they were not scripted, the NOMCO dialogues can be considered close to naturally occurring conversations, and can be studied with a view to understanding naturally occurring multimodal behaviour in first acquaintance encounters.

3 The annotation

3.1 Transcription and annotation of speech

An orthographic transcription of the spoken contributions was made using PRAAT (Boersma and Weenink 2009). The transcription includes word boundaries as well as word stress, indicated by a “,” before the stressed vowel. Pauses are represented by a “+”, and filled pauses are either transcribed as fillers, e.g. mm, or glossed with English words, e.g. laugh, breath. The PRAAT transcriptions were then imported into the ANVIL tool (Kipp 2004), which was used for the gesture annotation.

Table 2 Word statistics

Statistics concerning speech tokens and types are shown in Table 2. If we look at frequency, we see that fillers such as breath, laugh, smack, mm, but also feedback words like ja, okay, nej, are all among the 25 most frequent speech tokens, shown in Table 3. False starts are also quite frequent. These frequency patterns are typical of spoken language, and conversational data in particular. We will discuss the role of feedback words and feedback gestures in more detail in Sect. 4.3.

Table 3 Frequent speech tokens

In order to make it possible to investigate the relation between gestures and focusing, an annotation of information structure was added to the transcription, following a methodology used in previous studies on different Danish data (Paggio 2006a, b). First of all, utterance boundaries (Footnote 1) were identified and annotated with the attribute “boundary true”. For this annotation, syntactic cues, but also pauses, repairs etc. were considered. Secondly, for each sentence-like utterance, topic and focus were identified, and the attributes “topic true” and “focus true” were added to the corresponding words in the ANVIL annotation. In short, the topic indicates the presupposed entity about which the sentence predicates something new, while the focus indicates non-presupposed information. Words that belong to neither topic nor focus are considered background and left untagged. Not all sentences have a topic, whereas a focus is always present. The annotation guidelines include principles for how to assign topic and focus in general, as well as in specific syntactic constructions such as clefts, epistemic constructions, and topicalised sentences.

For a simple example, consider the following short exchange from one of the conversations (’,’ indicates stress, “+” stands for a pause, and boldface marks topicality):

Speaker A: + jeg hedder Chr,esten +

Speaker B: + h,ej + jeg hedder T,anja +

(my name is Chresten

hi my name is Tanja)

In both turns, the subject jeg is the topic, while the focus is the name of each person. In speaker B’s turn, the word hej is also in focus. Pauses correspond to sentence boundaries. Figure 2 displays the orthographic transcription of the first turn in the ANVIL XML format. Each element, enclosed by “el” tags and marked by start and end time points, corresponds to a speech token.

Fig. 2 Orthographic transcription example
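To give a concrete, if simplified, picture of this format, the following minimal Python sketch builds a hypothetical ANVIL-style track for the first turn of the example above and reads the tokens back. The track name, attribute names (start, end) and time stamps are assumptions made for illustration; only the “el” elements marked by start and end time points are taken from the description of Fig. 2.

```python
import xml.etree.ElementTree as ET

# Hypothetical ANVIL-style speech track; attribute names and times are invented.
xml_snippet = """
<track name="speech">
  <el start="0.00" end="0.35">+</el>
  <el start="0.35" end="0.55">jeg</el>
  <el start="0.55" end="0.90">hedder</el>
  <el start="0.90" end="1.45">Chr,esten</el>
  <el start="1.45" end="1.80">+</el>
</track>
"""

def read_tokens(xml_text):
    """Return (start, end, token) triples for every speech token in the track."""
    root = ET.fromstring(xml_text)
    return [(float(el.get("start")), float(el.get("end")), el.text)
            for el in root.iter("el")]

for start, end, token in read_tokens(xml_snippet):
    print(f"{start:5.2f} {end:5.2f}  {token}")
```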

Table 4 shows the distribution of words with respect to the three information structure categories. Words belonging to the focus make up more than one third of the material, whereas topic words, as expected, account for only a small percentage. The total number of syntactic clauses in the corpus is 2955, with 6.255 words per clause on average. The average length of a focus phrase is 2.26 words.

Table 4 Distribution of focus and topic in the corpus

3.2 Annotation of gestural behaviour

The gestural behaviour annotated in the corpus concerns head movements, facial expressions, and body posture. In our terminology, these correspond to different modalities of expression, or modalities of production, as also proposed in Allwood (2002). Modalities of expression should not be confused with sensory modalities (vision, hearing, etc.), which are relevant when discussing the perception or reception of communicative signals. The annotation was done using the ANVIL annotation tool (Kipp 2004), following the MUMIN coding scheme, which has proven useful for annotating gestural behaviour in terms of its shape and dynamics as well as its communicative function (Allwood et al. 2007). Only a subset of the attributes defined in MUMIN was used for the modalities considered in NOMCO: eye-gaze features, for example, were not assigned, due to the difficulty of doing so manually given the quality of the recordings and the angle from which the subjects were filmed. Moreover, MUMIN also provides annotation features for the hand gesture modality, which was not targeted in the project. On the other hand, new features were developed for the annotation of emotional attitudes, as will be detailed below.

3.2.1 Gesture shape and dynamics

The attributes and values used to annotate gesture shape and dynamics, shown in Table 5, are relatively coarse-grained. They are meant to capture the various gesture types that serve different communicative functions rather than to provide a detailed description of the movement. For example, the scheme distinguishes between a nod and a shake, but does not allow the annotator to describe qualities of nodding or shakes of different sizes. If a more fine-grained annotation is needed, it could be provided by computer vision analysis techniques (Footnote 2).

Table 5 Annotation features for gesture shape and dynamics

For most of the attributes, a category called Other (HeadOther, FaceOther, etc.) is available to the annotator for cases in which no other value seems to fit the data. As far as shape is concerned, Other was mainly used to annotate cases in which two of the other categories were combined (e.g. BodyForward and BodySide). The frequency of the category is 6 % for head movements, 6 % for facial expressions, and 27 % for body direction. This indicates that the range of categories for the annotation of body direction could be developed further.

Table 6 Gesture statistics
Fig. 3 Distribution of gesture types across speakers in the NOMCO corpus

The total number of gestural behaviours, together with average and standard deviation per speaker per conversation, is shown in Table 6. More detailed counts of the various gesture types are provided in the Appendix. Box plots of the distribution of behaviours in the three modalities across speakers are also shown in Fig. 3. Head movements constitute by far the most frequently occurring behaviour type, followed by facial expressions and body movements. There is also quite a bit of speaker variation, especially in the use of facial expressions, where we see outliers at both ends of the distribution. As already mentioned, the gestural behaviours annotated only refer to movements or expressions that were considered communicative by the annotators. In other words, so-called adaptors (e.g. scratching one’s nose) were not considered. The distinction between communicative and non-communicative gestures was proposed already by Kendon (1978). For a more recent discussion of its reliability in annotation of head movements, see Kousidis et al. (2013).

An interesting question is whether there are correlations between richness in spoken language output and gestural expressivity. This was measured by testing for correlations between the number of words and the number of gestural behaviours of each type produced by the individual speakers. There is a low to moderate correlation (Pearson’s r = 0.42) (Dancey and Reidy 2004) between the number of words spoken and the number of head movements produced, which accounts for only about 18 % of the variance. In turn, this means that there must also be genuine individual variation in how much speakers use head movements when they speak. The correlation is even weaker between speech and the other two modalities.
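As an illustration of how such a per-speaker correlation can be computed, the following minimal sketch uses hypothetical per-speaker counts (the actual figures are those summarised in Table 6 and the Appendix) and Pearson’s r:

```python
from scipy.stats import pearsonr

# Hypothetical per-speaker counts of words and head movements (one pair per speaker);
# the real NOMCO counts are reported in Table 6 and the Appendix.
words          = [820, 950, 610, 1100, 730, 880, 990, 560, 1040, 700, 860, 910]
head_movements = [230, 310, 150,  280, 260, 240, 330, 170,  250, 190, 300, 220]

r, p = pearsonr(words, head_movements)
print(f"Pearson's r = {r:.2f}, p = {p:.3f}, variance explained = {r**2:.0%}")
```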

Interdependencies between the number of behaviours produced in the three gestural modalities (independently of speech) were also investigated, but no correlations were found. This may be due to different reasons. Behaviours in different modalities, for example head movements and facial expressions, may on average have quite different durations: a facial expression may overlap with several head movements, so that the two will not correlate in number. In addition, genuine individual differences may also be at play here. Thus, some people may tend to use their heads a lot, while their faces are not very expressive. Others may change their posture often without necessarily moving their heads, or vice versa.

3.2.2 Gesture function

Of particular interest in NOMCO was the annotation and analysis of communicative functions, mainly those related to the regulation of the conversational interchange between speakers, in other words the diverse roles played in conversations by interactive gestures, also known as regulators (Ekman and Friesen 1969). Such communicative functions are indeed the main focus of the MUMIN coding scheme. Thus, the scheme provides features to annotate gestures related to feedback, turn management, and sequencing. The features were streamlined and slightly simplified in order to be applied to the NOMCO data, but the original spirit of the annotation scheme was essentially maintained. Several of the features defined in the scheme are very similar to standard categories for dialogue act analysis, as proposed in Bunt et al. (2012).

Feedback is defined in Allwood et al. (1992: p. 1) and Allwood et al. (1993), as the mechanism through which speakers exchange information about (1) contact, the fact that participants are willing and capable of continuing to interact; (2) perception, the fact that they are willing and capable of perceiving what is being communicated; and (3) understanding, the fact that they are able to understand the message that is being communicated. In practice it is very difficult to discern these specific aspects, and therefore our scheme combines them into the unified value CPU, which stands for contact, perception and understanding. In addition, feedback has a direction depending on whether it is being given or elicited (or both, or unspecified), and an agreement feature to specify agreement or disagreement.

The annotation of turn management relies on a distinction between turn change achieved in agreement or as a result of an interruption. In the former situation, the two relevant values are TurnElicit and TurnAccept, in the latter TurnTake and TurnYield. The value TurnHold is used for a behaviour which indicates that the speaker wants to keep the turn, and TurnComplete for one where the speaker stops speaking without the turn being picked up by the interlocutor.

Sequencing is concerned with the structuring of the discourse. Thus, discourse sequences can be opened and closed. They can be resumed after an interruption or continued. All these situations can be signalled by gestural behaviour.

Head movements, facial expressions and body posture movements were all annotated with features referring to the three communicative functions just discussed. In addition, the annotation scheme also provides features to assign gestural behaviours to semiotic classes. As suggested in Allwood (2008), the scheme builds on Peirce’s three categories indexical, iconic and symbolic (Peirce 1931). Indexicals are sub-divided into deictic gestures, which point to an entity in the conversation situation, and non-deictic gestures, which include displays, beats (also sometimes called batonic), and other indexical gestures with interactive function, e.g. head movements used to give and elicit feedback. It must be remembered, however, that the same gesture type can play different functions (and belong to different semiotic classes), depending on the context. A nod, for example, can be a symbol when it corresponds to an acceptance or agreement act, or it can function as a beat that accompanies a stressed word. In fact, one and the same gesture can often be interpreted at different levels: to stay with the same example, an affirmative nod will typically also function as a beat. In the current version of the NOMCO corpus, the annotation of semiotic classes is limited to the two classes deictic (101 total occurrences) and iconic (16 total occurrences).

Table 7 Annotation features for gesture functions and semiotic classes

Attributes and values referring to communicative functions and semiotic classes are displayed in Table 7. For some of the functional features, the annotators could use the value Other. However, the value was never used in the annotation of feedback. In the annotation of emotions it was employed by the annotators to mark when they wanted to use a new emotion label. Then, three annotators adjudicated whether the marked emotion was missing from the existing emotion list, and if so a new label and corresponding PAD values were created (the PAD values are explained in Sect. 3.2.4).

Table 8 Gesture function statistics

Statistics concerning the number of behaviours associated with feedback, turn management and sequencing in the three gestural modalities are shown in Table 8. For each modality, the table shows the total number of gestures and the proportions of these gestures that have been annotated with features related to the three communicative functions. Note that in many cases, the same gesture may be annotated with a feature in either two or even three of the functions. In other words, the functions are not mutually exclusive, and the proportions do not therefore add up. It can be noted that, in terms of functional content, head movements and body postures are similar in that in roughly 50 % of the cases they serve a feedback function, and in about 25 % of the cases they are used in turn management. Facial expressions, on the other hand, are related to feedback more often (73 %), and to turn management less so (17 %). The proportion of behaviours related to sequencing is quite low for all three modalities (5–7 %).

3.2.3 Relation between gesture and speech

In addition to shape and function features, for each gesture under consideration a relation with the corresponding speech expression is also explicitly annotated. Two attributes are used for this purpose. One is MMRelationSelf, which establishes a link between the gesture under consideration and the semantically related speech token or tokens in the orthographic transcription of the gesturer’s speech. The other is MMRelationOther, which is used to codify a relation between a gesture and the interlocutor’s speech in cases where the gesturer is silent.

Table 9 Speech sequences linked to gestures

The adoption of these two relations is motivated by the wish to relate gestures to the corresponding speech on the basis of semantics rather than mere temporal alignment between the two elements. The relations are thus reminiscent of the notion of lexical affiliate (Schegloff 1984; Kipp 2004), although they are applied to any kind of gesture, not just iconic ones. Note also that in almost half of the cases they link the gesture to a speech sequence consisting of two or more speech tokens, as shown in Table 9. Having defined an explicit link between gestures and the speech sequence they are associated with allows us to study how the two modalities are coordinated both from a content-oriented point of view (based on the explicit link) and at a temporal level (based on the time stamps in the annotation).
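The following minimal sketch illustrates, with assumed attribute names and invented values, how such semantic links can be represented and queried alongside the time stamps; it is not the actual NOMCO data model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SpeechToken:
    start: float   # seconds
    end: float
    form: str

@dataclass
class Gesture:
    start: float
    end: float
    category: str                         # e.g. "Nod", "Smile"
    mm_relation_self: List[SpeechToken]   # related tokens in the gesturer's own speech
    mm_relation_other: List[SpeechToken]  # related tokens in the interlocutor's speech

def affiliated_words(g: Gesture) -> str:
    """Return the word sequence a gesture is semantically linked to, preferring
    the gesturer's own speech and falling back to the interlocutor's."""
    tokens = g.mm_relation_self or g.mm_relation_other
    return " ".join(t.form for t in tokens) if tokens else "<unimodal>"

nod = Gesture(2.10, 2.95, "Nod",
              mm_relation_self=[SpeechToken(2.15, 2.40, "ja"),
                                SpeechToken(2.40, 2.70, "okay")],
              mm_relation_other=[])
print(affiliated_words(nod))   # "ja okay"
```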

3.2.4 Emotional attitudes

Emotion studies often deal with the basic emotions described by Ekman and Friesen (1975) and Ekman (1992), typically investigated in acted data (Bourbakis et al. 2011; Kipp and Martin 2009), in specialised situations like clinical settings (Lucey et al. 2012; Aung et al. 2014), or in computer games (Savva et al. 2012). Emotional behaviour in the NOMCO conversations mainly consists of emotional attitudes (Allwood et al. 2007), also called affective epistemic states (Allwood et al. 2014), which concern the way people feel about the communicative situation, the interlocutor and the content of the ongoing conversation. Such emotional attitudes are quite different from the basic emotions, and also from the reactions expressed by subjects in clinical settings or during games: examples are the feelings of being amused, uncertain, engaged or surprised. They are also often not as strong or as easily identifiable as acted basic emotions, so generalisations made on the basis of acted data, or of data from different communicative and interactional settings, may not carry over to the expression and effect of emotional attitudes in normal face-to-face conversation. Figure 4 illustrates facial expressions of amusement and uncertainty in our data.

Fig. 4 Facial expressions of emotion in the NOMCO corpus

Emotions have often been classified via emotion labels (Ekman and Friesen 1975), but they have also been described in terms of their position in more or less complex dimensional spaces. One of the dimensional models that has been used to describe emotions in communication is the three-dimensional model proposed by Russell and Mehrabian (1977), with the three dimensions being Pleasure, Arousal and Dominance (henceforth the PAD model). In the MUMIN framework, which is at the basis of our work, emotions are annotated via an open list of emotion labels to reflect the fact that affective states and attitudes, corresponding to minor emotions in Ekman (1992), are often more frequent in communication than the six basic emotions, so that which emotion labels are relevant depends on the communicative situation.

In our corpus, an annotation of emotions and emotional attitudes was added to facial expressions using both coding styles: the annotators had to choose a label from a list of 28 emotions at the same time as picking a value for each of the three dimensions in the PAD scheme. Emotion labels were added incrementally as needed during the annotation. The resulting list of annotated emotions and their PAD values is shown in Table 10. In general, it was found that using the PAD values in combination with the emotion labels ensured better inter-coder agreement than using the emotion label list alone (Studsgård and Navarretta 2013; Navarretta 2014). Note that only the emotions conveyed by facial expressions were annotated. However, the annotators used the whole context in order to decide whether a facial expression conveyed an emotion and how to classify it.

Ten of the emotion labels in the corpus have positive values and five labels have negative values in all three of the dimensions; six have a positive value for Arousal and negative values for the other two dimensions. The remaining PAD combinations match fewer emotion labels, with the last possibility in the table (negative Pleasure and Arousal with positive Dominance) matching no label and never occurring in the corpus.
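Purely as an illustration of this sign-valued encoding, the sketch below groups a handful of hypothetical label-to-PAD assignments by their sign combination; the actual mapping used in the corpus is the one given in Table 10.

```python
from collections import defaultdict

# Hypothetical sign-valued PAD annotations (+1/-1 for Pleasure, Arousal, Dominance).
# These label-to-PAD assignments are invented; the real mapping is in Table 10.
pad = {
    "amused":     (+1, +1, +1),
    "interested": (+1, +1, +1),
    "friendly":   (+1, -1, +1),
    "uncertain":  (-1, -1, -1),
    "surprised":  (-1, +1, -1),
    "bored":      (-1, -1, -1),
}

# Group labels by their PAD sign combination and print the groups.
by_combination = defaultdict(list)
for label, triple in pad.items():
    by_combination[triple].append(label)

for triple, labels in sorted(by_combination.items(), reverse=True):
    print(triple, labels)
```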

Table 10 Emotion labels and corresponding PAD values

The total number of facial expressions annotated with an emotion feature constitutes 70 % of the facial expressions in the entire corpus. The remaining ones were judged to be neutral. Not all emotions are equally represented since conversation participants mainly express positive emotional attitudes towards each other.

3.2.5 Annotation procedure and inter-annotator agreement

Six annotators, students and researchers in linguistics or multimodal communication, were involved in the transcriptions and annotations of the shape and functions of gestures.

Speech was transcribed by three expert students who had previously participated in projects involving the transcription of Danish speech using PRAAT. The students corrected each other’s transcriptions, discussed and resolved problematic speech segments.

Gestural behaviour, that is communicative head movements, facial expressions and body postures, was annotated one track at a time, in that order. However, in all cases the annotators also considered concomitant speech and other behaviours. In other words, the entire context was used to analyse and code the gestural behaviour.

Before annotating the gestures, the annotators were trained in the use of the ANVIL tool and in the MUMIN model and annotation scheme. The training programme included a group annotation of the head movements and facial expressions in one of the videos. The attributes chosen for the inter-coder annotation test were the shape and feedback-related ones. Subsequently, three coders annotated a second video independently in order to run the test. The inter-coder agreement results of this experiment, expressed in terms of Cohen’s kappa (Cohen 1960), are shown in Table 11.
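As a sketch of the agreement measure, the following snippet computes Cohen’s kappa for two coders’ labels over pre-aligned segments; the labels are invented, and the real experiment also covers segmentation decisions.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical head-movement labels assigned by two coders to the same
# (pre-aligned) segments.
coder_a = ["Nod", "Nod", "Shake", "Jerk", "Nod", "Tilt", "Shake", "Nod"]
coder_b = ["Nod", "Jerk", "Shake", "Jerk", "Nod", "Nod",  "Shake", "Nod"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.2f}")
```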

Table 11 Results of the first inter-coder agreement experiment: head and face attributes

As shown in the table, the inter-coder agreement is not the same for all pairs. The most frequent disagreement cases were: (a) head movements which were identified as one repeated gesture by one coder and as a sequence of single gestures by another; (b) the segmentation of facial expressions; (c) the distinction between jerks and nods; (d) the choice of the primary value in cases of multifunctional gestures, especially between feedback and self-feedback signals, and between feedback giving and feedback eliciting signals (Footnote 3); and (e) categories which were left uncoded.

Disagreement cases were discussed, and annotation strategies as well as new annotation guidelines were developed. Furthermore, a procedure for checking contradictory or missing values was agreed upon. A second inter-coder agreement experiment involving only two of the coders (pair 2) was then run. The results, given in Table 12, constitute an average improvement of 0.09 (Navarretta et al. 2012).

Table 12 Results of the second inter-coder agreement experiment: head and face attributes

The figures indicate that the annotators’ agreement is 0.63 on average (0.55–0.69). Given the difficulty of the task and the fact that the figures cover both the segmentation and classification of gestures, this level of agreement is acceptable, and compares well with measures provided in connection with other multimodal corpus annotation tasks (Cavicchio and Poesio 2009).

Inter-coder agreement was also measured for the annotation of emotional behaviour, as described in detail in Navarretta (2012). The annotators could choose between 26 available emotion labels and their corresponding PAD values. Sixteen emotion labels were assigned during the experiment, and seven PAD combinations were annotated. The resulting Cohen’s kappa scores, shown in Table 13, are in line with results reported in other studies on the annotation of the six basic emotions in acted data.

Table 13 Results of final inter-coder agreement experiment on the annotation of emotions

To produce the final annotated corpus, one coder annotated a behaviour and a second one checked the annotations. Disagreement cases were resolved by a third coder (Paggio and Navarretta 2011). Procedures for checking missing or inconsistent annotations in the corrected and final version were then applied by an expert annotator.

4 Studies

In this section we discuss several characteristics of the NOMCO corpus that shed light on theoretical aspects of multimodal communication. Some of these characteristics have been treated in previous publications, which we mention along the way. However, more detail is provided here, and this is also the first time that these analysis results are reported together in a systematic way. We start by analysing the temporal coordination between speech and gesture, and the related issue of how gesture contributes to the expression of focus. Then we describe the role of gesture in the mechanisms of feedback and turn management. Finally, we summarise a number of machine learning experiments that were conducted on the corpus to predict several communicative behaviours.

4.1 Coordination of speech and gesture

Many studies have claimed that speech and gesture, in particular hand gestures, are two manifestations of the same underlying cognitive mechanism (McNeill 1992, 2005; Kendon 2004; Kita and Özyürek 2003; De Ruiter 2000). One aspect of this tight relation is the temporal coordination between the two modalities. There seems to be general agreement that hand gestures are coordinated with prosodic events, such as pitch accents and prosodic phrase boundaries (Bolinger 1986; Kendon 1980; Loehr 2004, 2007). This temporal coordination has also been studied experimentally by manipulating the synchronisation between the visual and the auditory streams in video-recorded stimuli. The results indicate that subjects are sensitive to asynchrony, especially when gesture strokes are made to lag behind the accompanying speech (Leonard and Cummins 2010), and also that coordination with prosody contributes to the well-formedness of multimodal signals (Giorgolo and Verstraten 2008).

Fig. 5 Duration of head movements and associated speech sequences in the NOMCO dialogues

Fig. 6 Start and end delays in the NOMCO corpus. In histogram a, bars to the left of zero (negative) correspond to speech preceding the onset of the corresponding movements, and those to the right to speech onset following movement onset. In histogram b, bars to the left of zero count speech ending before, and those to the right speech ending after movement offset. Histogram bins correspond to intervals of half a second

These studies deal with hand gestures, especially those used as beats. Head movements often have the same quality as manual beats, being rapid, simple and often repeated movements. Therefore, we would expect them also to show tight temporal synchronisation with the words they co-occur with. Temporal synchronisation between head movements and speech is dealt with in Hadar et al. (1985), where it is argued that coordination with speech, together with physical properties of the movements (cyclicity, amplitude, duration), is indicative of the diverse communicative functions of head movements.

As we saw earlier, in the annotation of the NOMCO corpus an explicit link is established between gestural behaviours and the speech sequences they are semantically and temporally related to. These links were used to derive measures of start and end delays between head movements and associated speech. We are only interested in head movements that are linked to word sequences in the gesturer’s own speech stream, of which there are 2795 in total. The remaining head movements are unimodal signals and are ignored here. As shown in the left-hand graph in Fig. 5, the duration of most head movements in the NOMCO corpus is around 1 s, although there are occurrences of up to 7 s (mean = 0.93 s, SD = 0.58 s). The duration of the word sequences linked with the head movements (see the same figure) is on average slightly shorter, although there are single outliers of up to 8 and 12 s (mean = 0.59 s, SD = 0.67 s). On average, head movements tend to start 0.05 s before the onset of the associated speech sequence (SD = 0.40 s), and to end 0.28 s after its offset (SD = 0.64 s).
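A minimal sketch of how such delays can be derived from the annotated time stamps is given below, with invented interval pairs standing in for the real ANVIL annotations; the sign conventions follow the description above (a positive start delay means the movement starts before the linked speech, a negative end delay means it ends after).

```python
import statistics

# Hypothetical (movement, linked-speech) interval pairs in seconds;
# in the corpus these come from the ANVIL time stamps and MMRelationSelf links.
pairs = [  # ((move_start, move_end), (speech_start, speech_end))
    ((2.10, 2.95), (2.15, 2.60)),
    ((5.40, 6.10), (5.35, 5.90)),
    ((9.00, 10.2), (9.10, 9.70)),
]

# Positive start delay: the movement starts before the linked speech.
start_delays = [s_start - m_start for (m_start, _), (s_start, _) in pairs]
# Negative end delay: the movement ends after the linked speech.
end_delays = [s_end - m_end for (_, m_end), (_, s_end) in pairs]

print("start:", round(statistics.mean(start_delays), 2),
      "±", round(statistics.stdev(start_delays), 2))
print("end:  ", round(statistics.mean(end_delays), 2),
      "±", round(statistics.stdev(end_delays), 2))
```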

The histograms in Fig. 6 show that in more than 2500 cases delays range between \(-0.5\) and 0.5 s, and that about 1750 start delays are in fact positive, between 0 and 1 s. In other words, in almost two thirds of the cases head movements start before the corresponding speech. As for the end delays, slightly more than 1800 are between \(-1\) and 0, showing that in almost two thirds of the cases the head movement ends up to 1 s after speech offset. To get an intuition of what a one second delay means, Leonard and Cummins (2010) find that subjects are sensitive to asynchrony of as little as 0.2 s if a gesture lags behind speech, whereas Giorgolo and Verstraten (2008) claim that subjects react to gesture-speech misalignments of at least 0.5 s. Thus, a delay of 1 s is not negligible.

Fig. 7 Start and end delays in the NOMCO dialogues: means and confidence intervals

If we look at the delay data in the individual dialogues, shown in Fig. 7, we see that there are slight differences between them, with the mean start delay varying from 0.13 to \(-0.06\), and the mean end delay in the range \(-0.07\) to \(-0.42\).

Fig. 8 Start and end delays across speakers: means and confidence intervals

The variation of the length of the delays in the dialogues reflects variation across the individual speakers, which is shown in Fig. 8.

In spite of the variation, however, the general picture seems to indicate that head movements tend to start slightly before the onset of the corresponding speech sequence and to end slightly after it. We would expect the length of the speech sequence (consisting of more than one word in about 50 % of the cases, see Table 9), and the position of the pitch accent in the sequence, to affect the length of the delays in both directions. The function of the head movement may also play a role, as argued in Hadar et al. (1985). For a discussion of these effects, see Paggio (2016).

4.2 Focus and gesture

We saw in the preceding section that gesture strokes and pitch accents are described in the literature as being correlated; in particular, it has been claimed that hand gesture strokes occur slightly before the intonation peak of a clause (McNeill 1992). Indeed, in a recent formal semantic approach, it is treated as a grammatical constraint that the words hand gestures co-occur with should be prosodically prominent (Alahverdzhieva and Lascarides 2010). In many languages, prosodic prominence is an indicator of sentence focus (Vallduví and Engdahl 1996, among many others), so it is reasonable to ask whether gesture is related to focus. The issue is investigated in an empirical study of 276 hand gestures in German by Ebert et al. (2011), where the authors look at the relation between whole gesture phrases (including preparation and retraction) and focus phrases. The study shows that on average gesture strokes tend to precede the sentence accent by 0.36 s, and that the onsets of gesture phrases and new-information foci align with a time lag of \(-0.31\) s. No alignment is observed, on the other hand, between gestures and contrastive foci.

Table 14 Co-occurrence of head movements with focused words

All these studies deal with hand gestures. Here, we approach the topic of gesture-focus alignment from the point of view of head movements. We saw earlier (Table 9) that head movements mostly co-occur with speech elements containing at least one stress. However, not only focused words are stressed in Danish sentences. Thus, in order to understand whether there is a relation between head movements and focus phrases, we counted how many of the head movements in the corpus are associated with words belonging to a focus phrase (tagged as “focus true”). Table 14 shows the counts. Head movements co-occur with focused words about twice as often as would be expected had the distribution been random (\(\chi ^{2}\) = 975.2301, df = 1, p value <2.2e−16; expected frequency proportions reflect the relative frequency of each category in the corpus).
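The goodness-of-fit test can be sketched as follows, with hypothetical counts in place of those in Table 14 and an assumed share of focus words of roughly 37 % (cf. Table 4):

```python
from scipy.stats import chisquare

# Hypothetical counts of head movements linked to focus vs. non-focus words;
# the actual figures are those in Table 14.
observed = [1960, 835]                 # [on focus words, on non-focus words]

# Expected proportions reflect the relative frequency of the two word categories
# in the corpus (assumed here: ~37 % focus words, cf. Table 4).
focus_share = 0.37
total = sum(observed)
expected = [total * focus_share, total * (1 - focus_share)]

chi2, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```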

There is some, but not a lot of, variation in the distribution if we consider the individual dialogues. On average, 163.25 head movements in each conversation are linked to words belonging to the focus (SD: 29.04) and 96.5 to words outside the focus domain. Figure 9 shows the proportions for the individual dialogues.

Fig. 9 Distribution of head movements on focus and non-focus words in the NOMCO dialogues

It is difficult to provide data from NOMCO to verify the patterns discovered for German in the Ebert et al. study, because focus phrases are not annotated as units. In other words, the annotation marks, for each word, whether or not it is part of the focus, but not where the left-hand boundary of the focus domain is. To make a better comparison possible, the beginning of all focus phrases was marked in one of the dialogues. This shows that the majority of head movements in that dialogue occur in conjunction with the first word in the focus domain, as can be seen in Table 15. The trend is highly statistically significant (\(\chi ^{2}\) = 70.9878, df = 2, p value = 3.848e−16).

Table 15 Head movements and left-hand focus boundary

In conclusion, we can observe a relation between focus and head movements in the Danish dialogues, in the sense that most head movements occur in conjunction with words in the focus domain. There is also an indication, from one of the dialogues, that head movements in fact align with the first word of the focus domain. In future, a more precise analysis of this relation will be carried out on the entire dataset, and the temporal alignment between the onset of the movement and the onset of the focus domain will also be measured.

4.3 Feedback by speech and head movements

As we saw in Sect. 3.2.2, feedback is one of the most frequent communicative functions of gestural behaviours, in particular of facial expressions and head movements (73 and 53 % of the cases, respectively). In this section we look at feedback by head movement, in particular the relation between head movements and feedback words, such as ja/jo (yes), nej/næ (no), okay, and mhm. Firstly, we focus on the occurrence of nods and up-nods (rapid down-up movements called “jerks” in the MUMIN coding scheme), since nods in general are the most common and most frequently studied head movement type (Duncan 1972; Hadar et al. 1985; McClave 2000). In total there are 926 nods (746 up-down and 180 down-up) in the corpus, one every 4.27 s on average. Compared with the frequency of nods reported for other cultures, e.g. the study by Maynard (1987), where Japanese speakers are reported to produce a nod every 5.57 s in contrast to only one every 22.5 s for Americans, it would appear that Danish speakers nod quite frequently. Our data cannot be directly compared to those used by Maynard, since the situation and the setup are different. Indeed, the NOMCO speakers may be giving a lot of feedback by head movement because they are particularly polite in first acquaintance dialogues. However, even allowing for such an effect, Japanese speakers do not seem to be alone in their frequent use of head nodding (Paggio and Navarretta 2011).

About 68 % of the nods in the NOMCO data are used to signal feedback, mostly together with a feedback word but also without, as shown in Table 16.

Table 16 Head nods with feedback function
Table 17 Distribution of nods and up-nods in multimodal feedback

Table 17 shows how combined speech and nod feedback is distributed across four different types of nods. The up-nod type is typical of Scandinavian languages, and has been described especially for Swedish, where it is even more common than in Danish both in the single and the repeated variant (Cerrato 2007; Navarretta et al. 2012). Interestingly, the distribution of repeated versus simple nods is slightly different depending on whether the head movement is accompanied by a word or not. Thus, repeated nods are relatively more frequent in the absence of words. As far as up-nods are concerned, there is no difference between multimodal and unimodal signals.Footnote 4

All types of nod occur mostly in conjunction with positive feedback words, as shown in Fig. 10, where it can also be noted that repeated nods occur mostly together with yes words.

Fig. 10 Distribution of nod types in connection with feedback words: values are given as proportions (%). Repeated head nods are omitted because of their rarity

Another interesting question is whether the distinction between stressed and unstressed words plays a role in whether feedback words are accompanied by feedback gestures. To investigate the issue we looked at a larger set of feedback phrases, including repeated words, e.g. ja ja, and sequences, e.g. ja okay. This resulted in a list of 1382 examples, which were examined to see whether stress has an effect on the occurrence of an accompanying head movement (not necessarily a nod).

Table 18 Stress on feedback phrases and accompanying head movements

The results are shown in Table 18. In general, there are many more stressed feedback phrases than unstressed ones. Therefore, they have a much higher probability of occurring both with and without a head movement. If we look at unstressed feedback phrases, however, the probability of their occurring together with a head movement is nearly half as high as the probability of their occurring without, and the difference is highly statistically significant (\(\chi ^{2}\) = 25.3187, df = 1, p value = 4.86e−07). In other words, although feedback is expressed by means of speech alone more or less as often as together with a head movement, head movements are rarer in conjunction with an unstressed feedback expression. This tendency is not surprising, since gestural behaviour in general tends to be associated with stressed words, as was shown in Sect. 3.2.3 (Table 9).

4.4 Multimodal turn management

The multimodal quality of turn management has been pointed out in numerous studies, starting with Kendon (1967), Duncan Jr. and Fiske (1977), and Goodwin (1981). The focus of these studies has been on specific features, such as the role of gaze (Kendon 1967), mutual gaze (Argyle and Cook 1976), hand gestures (Duncan 1972) and types of head movement (Hadar et al. 1984).

Concerning the relation between speech and gesture in turn management, the study in Duncan (1972) identifies verbal and non-verbal turn giving cues in dyadic conversations. The cues comprise (a) intonational cues, (b) the use of hedges, such as you know and I guess, (c) the syntactic completion of an utterance, and (d) the completion of on-going hand gestures as signals that the speaker wants to pass the turn.

In Hadar et al. (1984) it is found that postural shifts of the head tend to occur after “grammatical” pauses and towards the initiation of speech turns or syntactic phrases inside a turn.

Inspired by these studies, we have looked at what types of gesture are related to turn management features in the first encounter corpus, and made a qualitative study of turn shifts and co-occurring gestures in two of the dialogues (Navarretta and Paggio 2013a, b).

As we saw in Sect. 3, 24 % of the occurrences of head movements, 17 % of the occurrences of facial expressions, and 23 % of the occurrences of body posture in the corpus have a turn management function. In Table 19, we show how the turn management function is distributed across the three different types of gesture. It must be noted, however, that turn features are often expressed by several modalities at the same time. In other words, head movements, facial expressions and body postures may reinforce each other in the expression of a turn behaviour.

Table 19 Turn management distribution across gesture types

Figure 11 shows the types of gestural behaviour most often associated with turn management. As can be seen, many types of head movement have a turn management function, not only side turns, shakes and nods as proposed in Hadar et al. (1984). Figure 12 illustrates how the most frequent turn management categories are distributed across the three modalities.

Fig. 11 Most frequently occurring turn management behaviours: absolute counts

Fig. 12 Turn management related types and body behaviours

The turn management categories we see in the corpus are in accordance with the social activity and conversational setting. The participants meet for the first time and want to make a good impression. They do not discuss controversial issues and do not interrupt each other. Thus, TurnYield is rarely assigned, and there are no occurrences of turn release under pressure (TurnRelease), while categories such as TurnHold, TurnAccept and TurnElicit are much more common.

Head movements are often related to TurnHold, TurnAccept and TurnElicit, while TurnAccept, TurnElicit and TurnHold are the functions most frequently assigned to body posture. Finally, facial expressions most often have the functions TurnElicit, TurnAccept and TurnTake.

In the qualitative study of multimodal turn management, we investigated how various syntactic, prosodic and gestural features contribute to turn management, inspired by Duncan (1972). Our analysis of turn eliciting cues confirms Duncan’s observation that a speaker’s completion of a syntactic phrase and a high or low pitch can signal that the speaker wants to relinquish the turn. Hedges and vowel lengthening, on the other hand, occur seldom or not at all in the two analysed dialogues. We also find that the speaker can signal an intention to offer the turn by keeping the head and the body still. This behaviour can be seen as parallel to that of finishing off on-going hand gestures, as also noted by Duncan.

4.5 Prediction and validation

A number of machine learning experiments were carried out on the NOMCO annotations. Given the project’s focus on the multimodal expression of communicative functions, the primary objective of the experiments was to determine to what extent these functions can be predicted from the coarse-grained shape annotations of the gestures together with co-occurring speech, and which information contributes most to the prediction. A positive outcome would provide insight not only into the relation between gesture shape and gesture function, but also into the possibility of semi-automatic annotation of similar corpora. We also experimented with the prediction of other phenomena in the corpus, and of similar phenomena in different data. In general, an important additional aim was to test whether the annotation scheme distinguishes different behaviours in a consistent and reliable way.

All experiments were run in Weka (Witten and Frank 2005), and ten-fold cross-validation was applied in the evaluation. Five classifiers were tested on most tasks: Naive Bayes, KStar, BFTree, logistic regression and support vector machine. The results of a majority classifier were used as the lowest baseline, and the results achieved on various datasets were compared. Details on datasets and results can be found in a number of previously published papers (Paggio and Navarretta 2012; Navarretta and Paggio 2012, 2013a; Navarretta 2011, 2013a, b). Here, we want to reflect in general on the knowledge we have acquired from those studies.
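The original experiments were run in Weka; the sketch below merely reproduces the general evaluation scheme (ten-fold cross-validation, a majority baseline, and a support vector classifier) in scikit-learn, with invented categorical features standing in for the MUMIN attributes.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVC

# Hypothetical instances: categorical gesture/speech features and a binary
# feedback label; the real features follow the MUMIN attributes.
X = np.array([["Nod",   "Smile", "ja"],
              ["Shake", "None",  "nej"],
              ["Nod",   "None",  "ja"],
              ["Jerk",  "Smile", "okay"],
              ["Tilt",  "None",  "og"],
              ["Nod",   "Smile", "mm"]] * 10)   # repeated to allow 10-fold CV
y = np.array([1, 1, 1, 1, 0, 1] * 10)           # 1 = feedback, 0 = no feedback

baseline = make_pipeline(OneHotEncoder(handle_unknown="ignore"),
                         DummyClassifier(strategy="most_frequent"))
svm = make_pipeline(OneHotEncoder(handle_unknown="ignore"), SVC(kernel="linear"))

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, clf in [("majority baseline", baseline), ("SVM", svm)]:
    scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")
    print(f"{name:17s} mean F1 = {scores.mean():.2f}")
```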

Two studies addressed the prediction of the function of gestures from their shape together with information of co-occurring speech. In all these tests, the classifier which performed best was the support vector classifier.

In the first study, classifiers were trained on the annotations to predict the feedback function of head movements and facial expressions. As seen in Sect. 4.3, feedback is the most frequently occurring function assigned to these gesture types in our data. The best result, with an F-score of 0.76, was obtained with the classifier that combined features of head movements, facial expressions and co-occurring speech. Considering head movements and facial expressions separately, on the other hand, yields an F-score in the range 0.63–0.65. In all cases, the classifiers outperform the majority baseline. These results not only indicate that feedback is annotated in a reliable way, but also confirm the fact that feedback in face-to-face communication is inherently multimodal. Moreover, the accuracy reached by the classifiers is similar to the accuracy shown by the human annotators performing the same task, which in turn indicates that the NOMCO data can be used as training data for the automatic annotation of multimodal feedback.

In the second study, supervised machine learning was applied to the classification of multimodal turn management signals. Features concerning head movements, facial expressions as well as body posture were included. The F-scores obtained by the various classifiers are in the range 0.4–0.46. Although this level of accuracy is higher than what is yielded by the majority baseline, it is considerably lower than the results obtained for feedback classification. A number of factors, however, make the task more difficult. While head movements and facial expressions in our data often have a feedback function, turn related behaviours involving these two modalities are much rarer, a fact that clearly influences classification. In fact, turn management is often expressed via prosodic cues, gaze, hand gestures and, as a qualitative analysis of the data seems to indicate, pausing in gesturing. In future, therefore, we will test whether the classification of turn management behaviours can be improved by taking into account hand gestures and prosodic features.

A phenomenon that is closely related to turn management is speech overlap. Several studies have shown that overlapping is not infrequent in conversational data (Campbell and Scherer 2010). Indeed, there is overlapping speech in 90 % of the contributions in our data. Overlapping speech mainly occurs in back-channeling or in cooperative speech, in other words when the interlocutor helps the speaker to complete an utterance, e.g. by suggesting a word, or repeats part of the speaker’s utterance. In all these cases, the interlocutor does not interrupt the speaker and there is no turn shift. We conducted a study to investigate to what extent overlapping speech could be predicted based on features coming from different modalities. A number of classifiers were trained on various feature combinations to predict overlapping speech tokens in two ways: either by considering speech and gestural features in the overlapping segment, or by using multimodal features of the contexts preceding and following the overlaps.

When we only consider features of the overlapping segments, we see that facial expressions in combination with the speech tokens are useful for the classification. Head movements and body postures, on the other hand, are not: they simply do not often occur at the same time as overlapping speech. If we take the context into consideration, however, the picture changes. Here, the most accurate results, with an F-score of 0.83, are obtained by a Naive Bayes classifier trained on features concerning three speech tokens before and after the overlap, together with the co-occurring behaviours from all three gestural modalities.
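The context-based setup can be sketched as follows, with an invented token stream, a window of three tokens before and after each position, and a Naive Bayes classifier standing in for the Weka learner; the gestural features used in the actual study would be appended to the window in the same way.

```python
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

def window_features(tokens, i, size=3, pad="<none>"):
    """The `size` tokens before and after position i, padded at the edges."""
    left = tokens[max(0, i - size):i]
    right = tokens[i + 1:i + 1 + size]
    return [pad] * (size - len(left)) + left + right + [pad] * (size - len(right))

# Hypothetical token stream with overlap marks (1 = token overlaps the other speaker).
tokens  = ["jeg", "hedder", "Tanja", "ja", "okay", "og", "hvad", "laver", "du"]
overlap = [0,      0,        0,       1,    1,      0,    0,      0,       0]

X = [window_features(tokens, i) for i in range(len(tokens))]
clf = make_pipeline(OneHotEncoder(handle_unknown="ignore"), BernoulliNB())
clf.fit(X, overlap)
print(clf.predict([window_features(tokens, 3)]))   # context around "ja"
```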

If machine learning experiments on the corpus shed light on the way in which multimodal behaviour is realised in first acquaintance dialogues, applying the models built on the NOMCO data to other domains helps us understand how general the models are. Therefore, in the fourth study we want to mention here, we tested how well a classifier trained on the NOMCO data could predict feedback in another conversational corpus, and vice versa. The other corpus used in these experiments was the DK-CLARIN corpus, which consists of dyadic and triadic spontaneous and naturally occurring conversations recorded at the participants’ private homes. The two corpora are both annotated following the MUMIN annotation model, but the granularity and the number of features used differ slightly.

The results show that although the classifiers do better than the majority baseline, their accuracy is significantly lower than is the case when training and test data come from the same corpus. In other words, although some knowledge can be transferred across corpora, the outcome of the experiment also shows that different communicative situations and settings have an impact on the type of multimodal behaviour produced.

In spite of the fact that the NOMCO corpus is not large, and the shape annotations of the gestures relatively coarse-grained, the results of our machine learning experiments show that the annotations can be used to model and predict a number of phenomena with levels of accuracy in some cases well above the majority baseline. Therefore, models trained on the NOMCO corpus could be used for semi-automatic annotation of the functions of gestural behaviour in new data from a similar communicative situation, provided that relevant shape features are available.

In general, the lesson learnt from the studies summarised above is that the automatic analysis of face-to-face conversations cannot abstract away from the important role played by gestural behaviour, and that the type of annotation we have adopted in the NOMCO project provides a representation of the communicative function of multimodal behaviour that is adequate for automatic analysis.

5 Conclusions and future directions

We hope in this article to have given a thorough description of the way in which the NOMCO corpus was collected and annotated, the information it contains, and the way it has been used to analyse how speech and gestural behaviour interact in the expression of communicative phenomena such as focusing, feedback, and turn management. The size of the corpus is limited if we compare it with the numbers we are used to from language corpora. However, even though research on multimodal behaviour goes back many decades, the area of multimodal corpus development is still relatively new (the first LREC workshop on multimodal corpora was held in 2000), and consequently annotation methods are far from standardised. Furthermore, all the annotation in NOMCO was produced manually and was therefore time-consuming. In future, automatic methods based on image processing are likely to make the annotation of gestures easier, at least as far as gesture shape and dynamics are concerned, and the insight gained in NOMCO into how the form and function of gestures interact will hopefully contribute to faster annotation of the functional level, too.

In spite of the fact that the NOMCO data were analysed from many different perspectives, there is still room for future development. First of all, hand gestures, movements of limbs and gaze are still to be annotated. Secondly, deeper levels of prosodic and linguistic analyses can be added to allow for other types of analysis, e.g. how content is expressed in speech and representational gestures.