1 Introduction

From an epistemological and ontological perspective, Conversation Analysis (CA) and Corpus Linguistics (CL) have very different origins and research foci. CA takes as its starting point turn-taking and looks at how interactants establish and maintain sequential order. By using a detailed, microscopic approach to spoken interaction, CA sets out to explain how interactants co-construct meanings, repair breakdowns and orient to each other. Data are naturally occurring and the aim of the analyst is to show what ‘really happened’ by asking the question ‘why here, why now?’ in relation to sequences of turns-at-talk and by using a very small data-set. The ensuing rich, detailed and up-close commentary focuses on key features of the interaction which provide vital clues as to what is happening: these include pauses, overlapping speech, latched turns, ‘smiley voice’, laughter tokens, and so on (Heritage, 2004). CL, on the other hand, also draws upon naturally occurring data, but offers a very different type of analysis. Here, the aim of the analyst is to examine specific linguistic features of the data in terms of word frequency, concordances, multi-word units and keyness. Put simply, the analysis is highly quantitative, uses a large sample of data and sets out to describe patterns and key linguistic features. The main focus is at the level of word or word patterns. CL allows for the (rapid) quantification of recurring linguistic features, which can be examined in their immediate linguistic contexts. Software programs enable analysis which is accurate and consistent, fast and without human bias (see, for example, O’Keeffe & McCarthy, 2010).

On closer examination, however, we can see that both CA and CL have much in common. They both use a corpus of empirical, naturally occurring data and refer to baseline comparisons with other types of interactions (sequential order in the case of CA, reference corpora in CL). They both look at language in context; an understanding of context lies at the heart of both approaches to analysis. For CA, the ‘language’ under investigation is social interaction, while for CL, ‘language’ often means words or word clusters. Yet both approaches use language which is context specific. Similarly, for both CA and CL, turn level analysis is crucial to enhancing understandings. It is at this level that the most revealing insights can be found—a point I will come back to later in the article.

By way of illustration of the viability of a conjoint approach to the analysis of spoken discourse, Corpus Linguistics (CL) and Conversation Analysis (CA) are used in the investigation of a set of recordings which had been transcribed to form a principled collection, or corpus: the Limerick Belfast Corpus of Academic Spoken English (hereafter LI-BEL). This corpus currently comprises 500,000 words of recorded lectures, small group seminars and tutorials, laboratories and presentations. Following on from previous research (see, for example, Walsh, 2006; Walsh & O’Keeffe, 2007), the focus in this study was on looking at discourse in the context of longer stretches of text. To do this, I needed to combine CL with another approach since CL was unable to account for some of the features of spoken interaction which occurred at the ‘higher levels’ of utterance and turn (e.g., adjacency pairs). In order to conduct a detailed analysis at this level of the discourse, I used CA, an established and respected approach to providing detailed, micro-analytic descriptions of spoken interaction. This combined approach, using both CL and CA (henceforth, CLCA), I suggest, gives a more ‘up-close’ description of spoken interactions in context (in this case, an educational setting) than could be gained by using either one on its own. From the analysis, we can gain powerful insights into the ways in which interactants establish understandings and observe how words, utterances and text combine in the co-construction of meaning.

Increasingly, CL is being applied to contexts and domains outside of the study of language itself where the use of language is the focus of empirical study in a given context. Such contexts include courtrooms and forensic linguistics (Cotterill, 2010), the workplace (Koester, 2006), the classroom and educational contexts (O’Keeffe & Farr, 2003; Walsh & O’Keeffe, 2007), political discourse (Ädel, 2010), and advertising and the media (O’Keeffe, 2006), among other areas. In all of these cases, CL is used alongside a complementary approach, such as CA, discourse analysis or pragmatics. The use of CL with other complementary research methods is regarded as a definite strength in research design, and something which should be given further consideration in future CL-based research projects. It has to be stressed that none of the above studies could have achieved the same insights without using CL in combination with another approach. In all of these studies, and in the present one, CL is applied to achieve a particular goal rather than used to describe the language features of a corpus. It is important to note here that two different types of corpus research may be used:

  1. descriptive corpus research: the corpus as an end in itself. The researcher looks ‘into’ the corpus so as to scrutinise the use of language and further our description of language patterns in a particular genre. An example of this type of research is Bednarek (2006). This is a corpus study of evaluation in a corpus of newspapers. It tells us a lot about patterns of evaluation across and within the genre but it is not concerned with issues in a broader context of newspaper discourse, such as power, ideology, identity, and so on.

  2. applied corpus research: the corpus as a means to an end. The researcher looks beyond the corpus for both its research questions and its analysis. The corpus is a powerful methodological tool which leads to greater depth of analysis in combination with another theoretical framework. An example of this is O’Halloran (2010), where a corpus of newspaper articles about immigrants is analysed within the framework of Critical Discourse Analysis and the result is an in-depth analysis of the links between the use of language patterns and ideology.

In this study, it would have been possible to take a descriptive approach to the data and, by comparison with other corpora, come to a description of the language of small group interactions in Higher Education settings. This in itself would be a valuable exercise, but the aim in this study is to address broader issues of interaction in these settings and so a complementary framework (using both CL and CA) is utilised in order to understand the patterns of language used. I refer to this as a CLCA approach.

2 Context

The focus of this study is small group teaching (henceforth SGT) in higher education contexts. SGT, such as seminars and tutorials, is used to support lectures by allowing tutors and students to engage in discussion and debate. While small group teaching is generally highly valued by both staff and students, there are some sources of dissatisfaction, with departments identifying problems relating to student engagement and tutor skills and a lack of time (see, for example, Bennet, Howe, & Truswell, 2002). In the present study, I was interested in identifying some of the reasons for students’ apparent dissatisfaction and for a lack of engagement, especially in relation to tutor skills in managing the interaction, their ‘interactional competence’ (see Walsh, 2006).

From the perspective of corpus linguistics, much influential work on spoken interaction in higher education is based on the Michigan Corpus of Academic Spoken English or MICASE. This corpus comprises data from across a range of speech events in higher education. It includes contexts relevant to the study reported here, such as classroom discussions, seminars, lab work and advising sessions. Studies based on the MICASE corpus have explored a wide range of phenomena in academic spoken interaction, such as metadiscourse in lectures, the use of conditionals, and, of more direct relevance to this study, the effect of class size on lecture discourse (Lee, 2009).

Outside corpus linguistics, there is quite a long history of research into spoken interaction in higher education. Some of this research has taken as its main focus spoken interaction in SGT situations and has used CA as a research methodology. More recent research on talk-in-interaction in SGT in higher education has uncovered important aspects of the processes or ‘machinery’ by which seminars and tutorials ‘get done’, for example, by focusing on cues and signals used to manage interaction and participant roles (Viechnicki, 1997), sequential organisation and negotiation of meaning (Basturkmen, 2002), and the issue of ‘topicality’ in small group discussion (Stokoe, 2000; Gibson, Hall, & Callery, 2006). Other research has explored the formulation and uptake of tasks and resistance to ‘academic’ identities (Benwell & Stokoe, 2002).

Much of the more recent work on talk in SGT (particularly that of Benwell and Stokoe) draws on perspectives from ethnomethodology, conversation analysis and discursive psychology. In these perspectives, human social activities, such as small group seminars or tutorials, are seen as locally produced accomplishments in which participants display their own understandings of the unfolding context. Participants take actions to further their own goals and agendas and display their orientations to others’ actions. In SGT contexts, tutors will demonstrably orient to the accomplishment of pedagogical goals and tasks, and students may accept or resist these actions (Benwell & Stokoe, 2002).

In the present study, the focus was on the ways in which tutors and students manage the complex relationship between pedagogic goals and the talk used to realise them. In SGT settings, as in most educational contexts, there is a strong relationship between pedagogic goals and pedagogic actions and the language used to achieve them (Seedhouse, 2004). Understanding this relationship, and the ways in which tutors and students engage in tightly organised and intricate negotiations of a set of pedagogic agendas, lies at the heart of any enterprise which sets out to improve teaching and learning in higher education. I adopt the strong position taken by others that interaction and learning are inextricably linked. Any attempt to enhance learning in SGT should, therefore, begin by gaining a closer understanding of the interactions taking place. By using a combination of CL and CA, we are able to provide a more realistic description of the relationship between pedagogic actions and the language used to achieve those actions in classroom discourse (Walsh, 2013), thus offering a greater understanding of the finer interactional adjustments and variations which exist in SGT interaction. We can then use these insights as a means to the end of addressing the problem identified above, that of the relationship between student engagement and tutor interactional skills in SGT in one higher education context.

3 Data and Analysis

The study is based on data from the Limerick Belfast Corpus of Academic Spoken English (hereafter LI-BEL), which comprises 500,000 words of recorded lectures, small group seminars and tutorials, laboratory practicals and presentations. These data were collected in two universities on the island of Ireland, Limerick and Belfast, across common disciplinary sites within the participating universities: Arts and Humanities, Social Sciences, Science, Engineering and Informatics and Business. From the main corpus, a sub-corpus of 50,000 words was created by identifying all the instances of SGT, defined as sessions comprising between 15 and 25 students and where there was evidence of sustained interaction. It is perhaps significant that only 50,000 words (or 10% of the corpus) were identified in which there was evidence of extended interactions. This, in itself, is indicative of the current state-of-play of SGT in the two universities under investigation; it is apparent that tutorials and seminars are functioning more as extensions of lectures than as opportunities for engagement and sustained debate.

Using WordSmith Tools (Scott, 2004), key words and word frequency lists for both single words and multi-word units were generated. The one-million word Limerick Corpus of Irish English (LCIE) was used as a reference corpus (Farr, Murphy, & O’Keeffe, 2002). Table 1 illustrates the top 20 key words.

Table 1 Top 20 key words from LI-BEL sub-corpus
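
The key word procedure can be sketched in a few lines of code. The following is a minimal illustration only, assuming a log-likelihood keyness statistic in its common two-term form; the source does not state which statistic or settings were used, and the function names `log_likelihood` and `key_words` are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-term log-likelihood keyness score for a single word."""
    total = freq_study + freq_ref
    # Expected frequencies if the word were spread evenly across both corpora
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    score = 0.0
    if freq_study > 0:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score

def key_words(study_tokens, ref_tokens, top_n=20):
    """Rank words that are relatively more frequent in the study corpus."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, freq in study.items():
        # Keep only positively key words (over-represented in the study corpus)
        if freq / n_study > ref[word] / n_ref:
            scored.append((word, log_likelihood(freq, n_study, ref[word], n_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

In this sketch the study corpus would correspond to the LI-BEL sub-corpus and the reference corpus to LCIE; a word with identical relative frequency in both scores zero.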

Through concordance and source text analysis via WordSmith, differences in the functioning of these higher frequency words were brought into relief. For example, the word if, when used in ‘first conditional’ type structures, had three main functions:

  • pedagogic illustration of ‘general truths/facts’ if John Kerry takes Texas, …he takes every vote…;

  • projecting, ‘meaning when you find yourself in this situation’ if you are on TP and you have a class that…;

  • demonstrating, if you click the mouse and then click
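
The concordance step behind these observations can be approximated as a key-word-in-context (KWIC) search. This is a rough sketch under simple assumptions (a regular-expression tokeniser and a fixed context window); the helper names `tokenize` and `kwic` are hypothetical and do not reproduce any particular concordancer.

```python
import re

def tokenize(text):
    """Split text into word tokens (letters and apostrophes only)."""
    return re.findall(r"[A-Za-z']+", text)

def kwic(tokens, node, width=5):
    """Return (left context, node word, right context) for each hit of `node`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# The fragment quoted in the third bullet above, used as toy input
sample = "if you click the mouse and then click"
for left, node, right in kwic(tokenize(sample), "if", width=3):
    print(f"{left:>20} [{node}] {right}")
```

Sorting the resulting lines by their right-hand context is one way to bring recurrent patterns after the node word to the surface.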

Figure 1 illustrates the most salient items when we looked at the LI-BEL sub-corpus frequencies using LCIE as a baseline for comparison.

Fig. 1 Single word frequencies in LI-BEL sub-corpus and LCIE

As Fig. 1 illustrates, nine single-word items were found to be significantly different in frequency when compared to the reference corpus. Some of these are context-specific, for example, the prevalence of the interrogative pronoun what, discourse markers so, okay, alright, deictic next (as in next week, next semester, next lecture), modality (what I need you to do, you need to, etc.), and so on. Even at the word level the corpus data was pointing to the significance of such actions as eliciting information, signposting the discourse, locating learning and teaching in time and directing learners to perform certain actions and carry out tasks.

Having scoped out the word frequencies and word patterns related to these, the next level of analysis was multi-word units. More than 128 multi-word units (MWUs) were identified and these further illuminated the earlier results for key words and the single word frequencies. This resulted in the emergence of clear categories into which the words and their patterns could be divided. Like the single word items referred to above, the MWUs which proved statistically salient in this context have the broad function of marking the discourse. They signpost, manage, demonstrate, sequence, set up activities/groups and mark out shared and new knowledge (Carter & Fung, 2007), as Table 2 illustrates.

Table 2 Broad functional categorisation of significant multi-word units in LI-BEL sub-corpus
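
Candidate multi-word units are typically extracted as recurrent word sequences (n-grams). The sketch below assumes sequences of two to four words and a minimum frequency threshold; both settings and the function names are illustrative assumptions, not the settings used in the study.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def frequent_mwus(tokens, sizes=(2, 3, 4), min_freq=2):
    """Return recurrent n-grams of the given sizes as {phrase: frequency}."""
    found = {}
    for n in sizes:
        for gram, freq in ngram_counts(tokens, n).items():
            if freq >= min_freq:
                found[" ".join(gram)] = freq
    return found

# Toy input built from a pattern reported in the source ('I want you to')
tokens = "i want you to i want you to".split()
print(frequent_mwus(tokens))
```

Raw recurrence is only a first filter: the functional categorisation of the units still depends on reading them in context.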

Having identified the most frequently occurring words, multi-word units and language functions, the next stage of the analysis was to interrogate the corpus using CA.

The CL analysis clearly identified a number of key linguistic features whose distribution was in some way marked in terms of frequency. In order to gain a deeper understanding of spoken interaction in this context, it was important to see how these statistically salient features actually operated in speakers’ turns and in longer sequences of interaction. In the qualitative analysis (see below), the corpus was examined using CA, building ‘collections’ of similar instances of stretches of interaction where there was both a clustering of the linguistic features identified in the corpus analysis and specific patterns of sequential organisation.

The CL analysis, therefore, had helped identify longer stretches of discourse which were marked in some way, indicating particularly high frequencies of usage or high levels of ‘keyness’. The CA analysis highlighted a number of specific interactional features of the discourse which were considered alongside the linguistic features previously identified in the CL analysis. Using CA alongside the findings from the CL analysis enabled a focus on the ways in which linguistic and interactional features come into play and how both sets of features collectively contribute towards co-constructed meaning. In short, this dual analysis revealed patterns and relationships between tutors’ and learners’ language use which each methodology on its own would be unable to uncover.

By way of example, compare the sample plot graphs for the high frequency items last week, next week and okay in the data. These show whether these items cluster at certain points and in which files (i.e., which interactions/classes). References to last week and next week prevail at the beginning and end of interactions, whereas okay is denser at the beginning of interactions but is also used throughout, with ‘clusterings’ around certain phases of a seminar or tutorial (Figs. 2, 3 and 4).

Fig. 2 Sample dispersion plot graph of ‘last week’ in LI-BEL sub-corpus

Fig. 3 Sample dispersion plot graph of ‘next week’ in LI-BEL sub-corpus

Fig. 4 Sample dispersion plot graph of ‘okay’ in LI-BEL sub-corpus
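
The idea behind a dispersion plot can be illustrated by dividing each transcript into equal segments and recording where an item occurs. This is a simplified sketch with an invented toy transcript; the segment count and the names `dispersion` and `plot_line` are assumptions made for illustration.

```python
def dispersion(tokens, phrase, segments=10):
    """Count hits of `phrase` in each of `segments` equal slices of the text."""
    target = phrase.lower().split()
    n = len(target)
    lowered = [t.lower() for t in tokens]
    counts = [0] * segments
    for i in range(len(lowered) - n + 1):
        if lowered[i:i + n] == target:
            # Map the token position onto a segment index
            counts[min(segments - 1, i * segments // len(lowered))] += 1
    return counts

def plot_line(counts):
    """Render one text as a row: '|' marks a segment with at least one hit."""
    return "".join("|" if c else "." for c in counts)

# Invented toy transcript, not taken from the corpus
toy = "last week we looked at turn taking okay so that is it for today next week".split()
print(plot_line(dispersion(toy, "last week", segments=4)))  # hit in the opening segment
print(plot_line(dispersion(toy, "next week", segments=4)))  # hit in the closing segment
```

Stacking one such row per file gives a picture comparable in spirit to Figs. 2, 3 and 4: items that cluster at openings and closings show up as marks at the two ends of each row.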

The dialectic between CA and CL thus allowed a better understanding of why certain items were clustering at certain points. In the next part of the analysis, I present the most salient contexts in which high frequency items clustered. For the sake of convenience, each context is labelled according to its predominant pedagogic function.

3.1 Organisational Talk

Much of what goes on in SGT entails tutors organising learning in some way, often temporally or spatially. Here, tutors’ pedagogic goals are to inform students about different procedural matters (the date and time of an examination, the materials to bring to the next session, and so on). Consider extract 1 below, where the tutor makes frequent reference to time (the next day, week nine, and so on). The prime purpose here is to alert students to upcoming tasks and activities, and to the overall organisation of modules and courses. Note too the use of okay at the end of this sequence as a marker of a transition to the next stage of the SGT session. okay was the third most statistically ‘key’ lexical item in the corpus, occurring very frequently in stretches of organisational talk.

Extract 1

The interactional organisation of ‘organisational talk’ is characterised by long turns by one participant (normally the tutor), while the other participants produce short responses or no responses at all. It is here that the tutor may use discourse markers such as ‘okay?’ to check understanding, but often will not wait for a verbal response (presumably relying on visual information to monitor the state of comprehension of the other participants). In the data, it was very obvious that the tutor may also perform the role of both questioner and answerer, as evidenced below in extract 2, where the tutor produces both the first and second-pair parts of a question, with no pause between them to indicate a turn transition relevance place, showing that no response is expected:

Extract 2

3.2 Instructional Talk

Much of the interaction of SGT was found to be reminiscent of more traditional classroom discourse, dominated by display questions, IRF (Initiation by teacher, Response by learner, Feedback by teacher) exchanges, short utterances from students, and so on. In what is labelled ‘instructional talk’, the discourse is highly controlled, with the main responsibility for managing the interaction firmly in the hands of the tutor. Turn-taking is tightly controlled by the tutor, who manages both next turn allocation and questions addressed to individual participants, thus making the respondee’s provision of the second pair-part strongly relevant. In terms of corresponding linguistic features, the most obvious example is found in concordance searches of the pattern tell me (Fig. 5). Another example is I want you/ye to (Fig. 6).

Fig. 5 Extracts from concordance lines of ‘tell me’

Fig. 6 Concordance extracts of the pattern ‘I want you to’

Consider extract 3 below, which comes from a teacher education seminar, and which makes extensive use of the MWUs found in the corpus and used for eliciting information (as in lines 7, 9–10, 12):

Extract 3

This extract shows one other typical feature of this context, an IRF exchange in lines 5–7, with a cued elicitation used in the initiation move. The use of ‘perfect’ (line 7) is an example of evaluative feedback, typical of the follow-up move in the IRF exchange. In fact, it is telling of the ubiquity of this exchange that the statistically salient or ‘key’ lexical items with the function of giving feedback on elicitations were typically found in this position. The tutor’s long turn (lines 7–14) consists almost entirely of elicitation, to the extent that this provokes a meta-comment on the discourse in lines 13 and 14.

3.3 Discursive Talk

One of the most important indicators of success in any educational discourse, arguably, is a tutor’s ability to create shared space where learning can take place. This is particularly true in a higher education context, where students must feel able and willing to participate and contribute to the discussion. In this study, the focus was on the ways in which tutors, through their choice of linguistic and interactional features, created ‘space for learning’: interactional space in which students could become involved, engaged, and willing to take risks in the discussion. The quantitative analysis showed quite clearly that ‘discourse markers of shared space’ occurred frequently in this context, labelled here ‘discursive talk’. By discursive talk, I mean instances where students produce accounts of experiences that they are having as part of the course, often accompanying these accounts with assessments of situations and behaviour. The tutor accepts and builds on these accounts, converting them into pedagogical material in the form of reflective statements about appropriate behaviour, roles and identities in the professional practice of the discipline. Agreement to assessments is favoured (there is a lack of dispreferred responses) and there is frequent use of interpersonal discourse markers to provide supportive responses to the speaker (yeah) and to mark shared knowledge (you know; you see).

A CL analysis reveals telling differences in the comparison of you see and you know (Fig. 7).

Fig. 7 Comparison of ‘you see’ and ‘you know’ in LI-BEL sub corpus and LCIE (normalised results)

You see usually marks new information while you know generally marks shared information. It is revealing that, in a corpus recorded in higher education classrooms, we find an exceptional number of you knows (marking shared information) relative to the reference corpus, but more or less the same number of you sees (marking new information). The priority to build on and appeal to shared knowledge and ‘shared space’ is central to both the pedagogic and interactional process.
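
Because the sub-corpus (50,000 words) and LCIE (one million words) differ greatly in size, raw counts have to be normalised before they can be compared. A minimal sketch of that arithmetic follows; the per-million base and the raw counts are hypothetical and are not the figures behind Fig. 7.

```python
def per_million(raw_count, corpus_size):
    """Convert a raw frequency into occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size

# Corpus sizes as reported in the text; raw counts are invented for illustration
SUB_CORPUS_SIZE = 50_000
LCIE_SIZE = 1_000_000

hypothetical_counts = {"sub-corpus": 25, "LCIE": 500}
print(per_million(hypothetical_counts["sub-corpus"], SUB_CORPUS_SIZE))  # 500.0
print(per_million(hypothetical_counts["LCIE"], LCIE_SIZE))              # 500.0
```

In this invented case an item occurring 25 times in the sub-corpus and 500 times in LCIE has the same normalised frequency, which is the kind of parity the text describes for you see.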

The interactional features of this kind of talk show that there is considerable symmetry; tutor and students adopt almost equal roles and it may not be immediately obvious who the tutor is. Typically, turns are evenly distributed and often managed by students themselves, in a way which closely resembles everyday conversation. Tutors may initiate exchanges as a form of open invitation to produce accounts of experience, as in extract 4, taken from a film studies seminar.

Extract 4

The overall tone of this interaction is conversational. In response to the teacher’s opening turn, one student (S3) produces an account of a group’s experiences of making a film, including an assessment of the situation (it’s crazy), to which the tutor offers a preferred (agreeing) response with the discourse marker ‘yeah’ and the repetition of the assessment, before building on this to project what experiences will be like in the future. It seems apparent that participants can express feelings such as frustration with aspects of the course, or in the case above, with other students’ behaviour. In lines 11 and 12, S3 indicates that ‘some people’ may have problems in accepting that material has to be cut, and in line 19, seems to be expressing frustration either about the existing director, or the lack of a director’s role in the group.

The role of the tutor here is to ‘take a back seat’, listen to what students contribute, take their experiences and feelings and build on them, and so on. The pedagogic goal is to reinforce appropriate behaviours and identities, especially in a context where professional practice is important, as in the one above. However, there may be a tension between the establishment of a more ‘equal’ turn-taking system, with the freedom to express feelings, and the need for tutors to convert this into pedagogically useable material. This can be seen in the tutor’s last turn in the extract (lines 21–26), in which ‘okay’ marks a switch in orientation, and the content about appropriate roles and behaviours is prefaced with a lengthy string of hedges, indicating pragmatic work in switching roles from an empathic listener to a ‘reflexive judge’ (Baumgart, 1976). This tutor does quite a lot of interactional work in order to change footing (‘okay yeah you see that’s the thing like you know I mean like really’); his stance after this preface is that of teacher again, giving instruction and passing on new knowledge. The interactional work is apparently needed in order to change from equal interactant to tutor, to move from a position of role symmetry to one of role asymmetry.

3.4 Argumentative Talk

A key aim of higher education is to foster criticality and promote individualised thinking. Most tutors would be delighted if students would engage with their discipline, discuss, debate and argue about new concepts, challenge existing principles and offer new ideas of their own. Unfortunately, all too often, this does not occur and students resort to being passive recipients, apparently uninterested and only motivated by information which will help them pass the course or succeed in an assignment. In the present study, there were instances of what I am calling ‘argumentative talk’ where there was some kind of discussion or debate, even argument.

Typically, and based on the quantitative CL analysis, argumentative talk occurred most frequently when there was a preponderance of discourse markers of shared space. Accompanying these discourse markers, there was frequent use of negation and of adversative items such as ‘but’, as exemplified in Fig. 8.

Fig. 8 Sample concordance lines of ‘but’ preceded by a discourse marker in argumentative contexts

From an interactional perspective, contexts in which argumentative talk could be found were characterized by a symmetrical speech exchange system, with ‘give and take’ in the interaction as tutor and students collaboratively negotiate meanings and co-construct understandings. There can be quite rapid exchanges of assertions, with frequent occurrence of dispreferred options such as straight rebuttals, and there is a high frequency of latched turns and a relative lack of pauses at transition relevance places. Extract 5, which is from a politics seminar, is a clear example of argumentative talk in action:

Extract 5

Extract 5 opens with an apparent challenge from S5 (are we are we defining (.) ethnicity or nationalism) followed by an uncertain response from the tutor (lines 2–4). Note the frequent use of pausing (.) which may indicate hesitation or uncertainty. In line 5, the same student appears to be dissatisfied with the tutor’s previous response and interrupts (indicated by =) with a further challenge. S5 also appears to show some uncertainty in line 8 (a pause (.) followed by aggh), allowing the tutor an opportunity to interrupt again in line 11. The tutor succeeds in holding the floor from lines 11–16 and, despite some obvious transition relevance places (marked (.)), nobody challenges his explanation further. Indeed, he even closes down space in line 14 (do you get my point (.) okay). The discourse marker ‘okay’ here seems to show a degree of finality to the discussion, pointing to a transition in this stage of the seminar and a time to move on.

The pedagogic orientation appears to be towards an open and dialogic exploration of disciplinary knowledge, similar to Mercer’s (2000) ‘exploratory talk’. However, this micro-context actually shows characteristics of ‘disputational talk’ (Mercer, 2000) in which participants, rather than interacting to build knowledge together, dispute each other’s meanings in ways which may not move the discussion forward.

4 Discussion

In terms of the main findings within the study, several implications can be identified. First, there is a need for further research to consider more carefully the relationship between language use, interaction and learning in SGT sessions. At present, we only have a partial understanding of the complex relationships between language, pedagogy, interaction, learning and knowledge. The linguistic and interactional features identified in this study, I suggest, perform a central role in co-constructing meaning, in promoting criticality and in engaging learners in academic debate. More work is needed to promote an understanding of the ways in which these features assist in the creation of space for learning.

Second, I would argue that there is a need for tutors to develop greater interactional competence in order to facilitate the kind of ‘whole class interactive teaching’ such as that currently being advocated in the national literacy strategy in secondary classrooms. Classroom interactional competence (Walsh, 2013) refers to the specific interactional strategies that tutors use to help learners express new ideas, discuss key concepts, question accepted knowledge, and articulate emerging understandings. By helping tutors gain greater interactional competence, the overall quality of learning can be enhanced, both in terms of depth and breadth.

Third, there is a need to look more closely at ways of including and involving students more fully in the discourse of SGT sessions, raising students’ interactional competence and facilitating a more interactive, engaging learning environment. Much can be done to improve the learning experience of students by helping them to consider how they can become better interactants, more able to articulate complex ideas or take a particular stance in relation to an idea, concept or theory.

Finally, further research is needed to evaluate and assess the extent to which the micro-contexts identified in this study stand up to closer scrutiny when extended to other contexts in which SGT takes place. This study was carried out in one national context (Ireland) using a relatively small corpus. Further studies using larger corpora across a range of contexts in higher education would be likely to reveal the robustness of the framework for understanding interaction in these contexts.

5 Conclusion

In terms of the overarching focus of this paper, namely the proposition that CL and CA are suitable bedfellows, I set out to demonstrate not only how they can be mutually beneficial, but how they can actually synergize. Through an over and back process, a methodological dialectic, I was able to identify and verify four distinct micro-contexts which emerged through a combination of the tutors’ and students’ orientations to certain pedagogic goals, and the speech exchange systems set up to produce this knowledge as an interactional accomplishment between them. Implicated in, and indexical of, these micro-contexts is the use of high frequency items in the corpus at particular points in the interactions. Had CL been used on its own, an interesting list of high frequency items and their functions would have been identified with no corresponding depth of analysis such as that offered by CA. Similarly, looking at the data purely from a CA perspective might have established the four micro-contexts, but the finding that these were marked by high frequency items (i.e., key words, high frequency words and multi-word units) would have been overlooked. In addition, by drawing on quantitative methods within CL, it was possible to reference findings against another dataset. All in all, therefore, it seems safe to assert that CL and CA are ‘well met’. By way of final reflection on what CL can gain from CA, the narrowness of CA transcription could be accommodated more into the transcription of spoken corpora; similarly, using recordings with transcripts is something which will hopefully become more of a reality as the next generation of spoken corpora emerges.