

1 Introduction

Pronunciation is considered to be one of the most complex human motor skills (Levelt, 1989). One of the major goals of FL learners is to acquire a pronunciation that does not diverge significantly from the target native norm. This task is made difficult by a complex interaction between the L1 sound system and the FL sound system. Non-native sounds are shaped in both perception and production by already established native sound categories. This ongoing process of assimilation and dissimilation is modeled by both the Perceptual Assimilation Model (Best, 1995; Best & Tyler, 2007) and the Speech Learning Model (Flege, 1995). The ultimate success in acquiring native-like pronunciation in FL is influenced by an array of linguistic and non-linguistic factors (Piske et al., 2001). The difficulty of reaching native-like or accent-free FL pronunciation is highlighted by the estimate that only between 5 and 15 % of adult learners are ultimately successful (Birdsong, 1999, 2005; Novoa et al., 1988; Seliger et al., 1975).

The elicitation techniques used in FL speech research rely on collecting speech samples of different length and structure in the foreign language in order to assess the learners’ divergence from the native norm. Such non-native speech samples may take the form of individual words, sentences, picture descriptions, accounts of personal experience, or repetitions after a native model (references in Piske et al., 2001). All of these techniques use the foreign language itself as a means of assessing the level of acquisition of the investigated phonetic parameters. In the current study, we propose FL accent imitation in L1 as an alternative method of investigating the degree of acquisition of FL speech parameters. The main tenet of this method, as will be argued below, is that a feature which is well acquired in FL will be transferred as a characterizing feature in the imitation of FL accent in L1. The method is then tested with the parameter of Voice Onset Time (VOT) to ascertain the degree of transfer of a non-native feature into L1 pronunciation.

2 FL Accent Imitation in L1

The fundamental concept in FL accent imitation in L1 is the notion of transfer. The fact that some structures are transferred from the native language to a foreign language has been discussed in numerous studies (Andersen, 1983; Ausubel, 1963; Gagné, 1965; Kellerman, 1995; Odlin, 1989; Osgood, 1946). Transfer is not understood here as a construct for explaining why certain features of L1 are transferred into FL and how they can predict learners’ errors (see Major, 2008, for a discussion). It is understood here as the process “which incorporates (…) previously acquired capabilities” into a new activity (Gagné, 1965, p. 129), and which has ‘relevant aspects’ that are relatable to the new experience (Ausubel, 1963). In the technique of FL accent imitation in L1, the concept of transfer is taken to the extreme. In standard elicitation, in which learners provide productions in FL, some features may be less exposed depending on the specifics of the task; in the FL accent imitation task in L1, those features must be highly exposed because of the task requirements. This is the key element of the proposed technique. When asked to imitate FL accent in their L1, the learners will be pressed to consistently reveal the features of FL pronunciation that they have already acquired. We argue here that this technique will shed more light on which FL pronunciation features the learners perceive as the most characteristic. Moreover, the task may paradoxically be more informative than standard elicitation in FL because of its element of mockery and fun. Recording speech samples in laboratory conditions may be inhibiting, and when the recording is done in the speakers’ FL, the combined inhibitions may influence the obtained samples. The element of mockery and entertainment implicit in accent imitation may lead to more open and less restricted productions.
When the test material is appropriately devised, such a technique may be no less informative than the classic elicitation in FL, if not more, depending on the construction of the experiment.

Accent imitation is not new in speech research and has been used for various purposes. For example, Adank et al. (2013) exposed participants to a Glaswegian accent of English and asked them to either repeat or imitate the sentences produced by a model talker. They applied the Communication Accommodation Theory (Giles et al., 1991; Shepard et al., 2001) to interpret the results, which showed that imitation had a positive effect on the perceived social attractiveness of the speaker. Other studies have shown that people acquire the features of a new accent or dialect relatively easily in new surroundings (Delvaux & Soquet, 2007; Evans & Iverson, 2007; Munro et al., 1999; Trudgill, 1986).

Some studies in forensic speaker identification have investigated imitation of a foreign accent as a form of voice disguise, with results indicating that such disguise is rather ineffective and easily detected (Baldwin & French, 1990; Rose, 2002; Storey, 1996). As noted by Rose (2002, p. 194) “offenders often try to assume an accent as part of a disguise. A discussion of the sound structure of language is a good place to point out how difficult it is to imitate an accent correctly … It is relatively easy for a linguist who knows the phonological structure of a given language to detect a bogus accent when it is used for disguise”. Most relevant to the current study, Neuhauser (2011) looked into the production of voicing and VOT in voiced and voiceless stops by native speakers of German imitating French accent as well as by native speakers of French imitating German accent. The study relied on the cross-linguistic difference whereby German contrasts short-lag and long-lag VOT values for voiced and voiceless stops, while French uses prevoicing and short-lag VOT values respectively. The results showed a complex pattern in which German speakers reduced VOT for voiceless plosives during French accent imitation, with values even lower than those of native French. On the other hand, French speakers imitating German accent did not achieve native-like values for voiceless plosives. The author concludes that there may be a degree of exaggeration in accent imitation. This degree of exaggeration is not a disadvantage in the elicitation technique proposed here. From the point of view of the acquisition of FL pronunciation, any degree of exaggeration in the imitation of an FL accent in L1 will be a marker of a pronunciation feature that learners find characteristic in FL and have acquired to an extent that allows them to transfer it to L1 as an element of FL phonetics.
More recently, Sypiańska and Olender (2013) looked into VOTs produced by Polish learners in imitated English accent in Polish, a procedure they termed ‘phonetically transplanted speech’, as a function of the amount of phonetic training. They observed that both theoretical and practical courses in phonetics increase the phonetic awareness of the FL sound system that can be transferred into L1 in an accent imitation task.

3 The Current Study

The current study uses the proposed elicitation technique of FL accent imitation in L1 to investigate the production of English accent in Polish by Polish learners. The tested parameter is VOT, which is implemented differently in English and Polish to cue the voicing contrast between /p, t, k/ and /b, d, g/. English contrasts short-lag and long-lag VOT values for voiced and voiceless plosives (Keating et al., 1983; Lisker & Abramson, 1964). On the other hand, Polish uses prevoicing or negative VOT values for voiced plosives and short-lag values for voiceless plosives (Keating, 1980; Keating et al., 1981; Kopczyński, 1977; Mikoś et al., 1978). These cross-linguistic differences result in two types of learning challenges that Polish learners of English must face. First, they must learn to reduce voicing in the hold phase of English /b, d, g/. Second, they must learn to increase VOT values in English /p, t, k/ by delaying the onset of voicing of the following vowel after the release burst. While the former challenge has relatively little influence on the perception of accent, because English native speakers may also prevoice in hyperspeech or for certain places of articulation or vowel contexts (Kessinger & Blumstein, 1997; Magloire & Green, 1999; Miller et al., 1986), the latter challenge has more serious consequences for the perceived accent. Polish learners’ production of insufficiently long VOT values for English voiceless plosives contributes significantly not only to the perception of foreign accent but also to the miscategorization of voiceless /p, t, k/ as voiced /b, d, g/ by native speakers. Experimental research has shown that speakers of Polish normally produce VOT values for English voiceless plosives that are shorter than those reported for native speakers (Waniek-Klimczak, 2005) and that they do not have the categorical shift between voiced and voiceless categories along the positive VOT continuum that is typical for native speakers (Rojczyk, 2010).
Consequently, the purpose of the current study is to use the proposed elicitation technique to investigate if, and to what extent, the Polish learners of English will transfer long-lag values for English /p, t, k/ in their imitation of English accent in Polish. Accordingly, the two research questions are formulated below:

1. Do Polish learners of English transfer longer VOT values for voiceless plosives in Polish to imitate English accent? Do they find this temporal parameter as a salient cross-linguistic phonetic feature?

2. Is the production of longer VOT values for voiceless plosives in English correlated with increasing VOT values in the imitation of English accent in Polish? Will learners who produce longer VOT values in English also produce relatively longer values in imitated English accent in Polish?

3.1 Materials

The materials were composed of nine sentences, each containing words beginning with voiceless /p, t, k/ in English and Polish. English sentences were used to establish the learners’ VOTs for English voiceless stops. Polish sentences also contained voiceless /p, t, k/ each. All words containing /p, t, k/ in both languages were stressed on the first syllable. Neither the prosodic strength nor the lexical frequency was controlled. The distribution of test words in English and Polish sentences was not uniform, because the purpose of the experiment was not to directly compare VOTs in English and Polish. The purpose was to directly compare the sentences produced with default Polish accent and imitated English accent, which means that all the positional and prosodic factors were uniform in both tasks which were based on the same sentences. The English and Polish sentences with underlined target words containing voiceless stops are presented below:

Put it on top of the cake

Take this pistol carefully

Cats are tiring pets

Cast this pen on the table

Ten people caused this accident

This Coke is for two parties

This passenger can’t take all the baggage

Those pots and cans are totally everywhere

The cost of this pan is too high

Ta kawa jest pyszna

Tak właśnie ma pan kosić

Polak musi kupić taki samochód

Ten kielich będzie znowu pełny

Postaw kasę na ten mecz

Każdy chce pięknie tańczyć

Powiedz komuś o niej, ale tylko nie kłam

To jest koncert w pełni księżyca

Tamten film miał bardzo pusty koniec

3.2 Participants

Ten advanced Polish learners of English participated in the study, six females and four males. They were recruited from the second year of a 5-year English programme at the Institute of English, University of Silesia. They had received three semesters of training in English phonetics and reported being fluent in English. The mean age was 20.4 years. All participants volunteered to take part in the experiment. None of them reported or showed any indication of speech or hearing disorders.

3.3 Procedure and Recording

All participants were instructed that they would perform three tasks: reading sentences in English, reading sentences in Polish, and reading sentences in Polish with imitated English accent. They were encouraged to treat the last task as entertainment, a type of mockery. They were told to imagine that they were native speakers of English trying to read in Polish and to demonstrate in mockery what it would sound like. They were naive as to the parameter under analysis. The sentences were presented in orthography on a printed sheet. Upon entering the lab, the participants were given approximately 5 min to read the sentences in quiet and prepare their productions prior to the recording session. They were repeatedly encouraged to treat the task as entertainment and were assured that their imitation would not be judged as correct or incorrect. They were asked to read the sentences with natural tempo and intonation.

The experiment took place in the Acoustic-Phonetic Laboratory at the Institute of English, University of Silesia. The recordings were made in a sound-proof booth. The signal was captured with a Sennheiser HMD 26 dynamic headset microphone and preamplified with a USBPre2 (Sound Devices), then saved in .wav format at a sampling rate of 48 kHz with 24-bit quantization.

3.4 Measurements

All measurements were made using waveform and spectrogram displays available in Praat (Boersma, 2001). The Voice Onset Time was measured using the standard definition as “the time interval between the burst that marks release and the onset of periodicity that reflects laryngeal vibration” (Lisker & Abramson, 1964, p. 422). The plosive release was measured as the first distinct pulse in the amplitude and the onset of voicing was measured as the first zero crossing of the periodic pulse. The total number of measured tokens was 810.
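The measurement definition and the token total can be illustrated with a minimal sketch. It is not the authors' Praat procedure: the function name and the two time stamps are hypothetical, and the breakdown of the 810 tokens (three target stops per sentence, nine sentences per task, three tasks, ten participants) is inferred from Sects. 3.1 to 3.3 rather than stated explicitly in the text.

```python
def voice_onset_time_ms(burst_s: float, voicing_onset_s: float) -> float:
    """Return VOT in ms: the interval from the release burst to the
    onset of periodicity. Both arguments are time stamps in seconds."""
    return (voicing_onset_s - burst_s) * 1000.0


# Hypothetical time stamps for a single token (not measured data).
vot = voice_onset_time_ms(burst_s=0.512, voicing_onset_s=0.547)
print(f"VOT = {vot:.1f} ms")

# Token count implied by the design (an inference from Sects. 3.1-3.3).
SENTENCES_PER_TASK = 9
TARGET_STOPS_PER_SENTENCE = 3   # one each of /p/, /t/, /k/ (assumed)
TASKS = 3                       # English, Polish, Polish with English accent
PARTICIPANTS = 10

total_tokens = (SENTENCES_PER_TASK * TARGET_STOPS_PER_SENTENCE
                * TASKS * PARTICIPANTS)
print(total_tokens)  # 810
```

A positive result corresponds to a voicing lag; a negative result would correspond to prevoicing, which is not at issue for the voiceless stops measured here.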

3.5 Analysis and Results

The measurements in ms were analysed using repeated-measures 2 × 3 ANOVAs with two independent variables: task (Polish; Polish with English accent) and the place of articulation (/p/; /t/; /k/). The correlation was performed using non-parametric Spearman correlation, which is more conservative than parametric correlations and does not assume normal distribution of the data.
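For readers unfamiliar with the rank-based correlation, the following is a minimal pure-Python sketch of Spearman's coefficient, computed as the Pearson correlation of average ranks. It is not the software used in the study, which the text does not name; the function names and the five paired values are hypothetical and serve only to show the computation.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical VOTs in ms (illustration only, not the study's data).
english_vot = [30, 45, 52, 60, 71]
imitated_vot = [28, 33, 31, 47, 55]
rho = spearman_rho(english_vot, imitated_vot)
print(round(rho, 3))
```

Because only the ordering of the values enters the computation, the coefficient is insensitive to outlying VOT values and to departures from normality.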

The analysis of the participants’ VOTs in English sentences yielded the following values: /p/ (M = 36; SE = 2.8); /t/ (M = 56; SE = 2.8); /k/ (M = 61; SE = 2.6). These values are higher than those reported by Keating et al. (1981) for Polish in isolated words, which means that the learners learnt to increase VOTs for English voiceless stops. That fulfills the assumption that they already possessed the temporal feature of FL that they might or might not transfer to their native language in imitation of English accent.

There was a highly significant effect of task for sentences in Polish (Polish accent; imitated English accent), indicating that imitation significantly influenced the produced VOTs [F(1, 89) = 106.81, p < 0.001]. The mean VOT in Polish pooled across all three places of articulation was 28 ms (SE = 0.7) and for imitated English accent 42 ms (SE = 1.6). This clearly shows that the participants consistently increased VOTs for /p, t, k/ in Polish as a marker of English accent (Fig. 1).

Fig. 1 Mean VOT values in ms for Polish sentences produced with Polish accent (left) and imitated English accent (right)

The interaction of the task (Polish accent; imitated English accent) and the place of articulation (/p/; /t/; /k/) was not significant [F(2, 178) = 0.24, p > 0.05], showing that the place of articulation did not influence the increased VOTs in imitated English accent. Post hoc Bonferroni tests revealed that VOTs were significantly longer in imitated English accent for all the three places of articulation (p < 0.001). The VOT values in Polish accent were 23 ms (SE = 1.1) for /p/; 25 ms (SE = 0.8) for /t/; and 37 ms (SE = 1.3) for /k/. Respectively, the values for imitated English accent were 35 ms (SE = 2.4) for /p/; 38 ms (SE = 2) for /t/; and 51 ms (SE = 2) for /k/ (Fig. 2).

Fig. 2 The interaction between the task (1: Polish accent; 2: imitated English accent) and the place of articulation (1: /p/; 2: /t/; 3: /k/) in the production of VOT in ms
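The absence of an interaction can be made concrete with simple arithmetic on the reported means. The sketch below only subtracts the mean VOTs given above for each place of articulation; it is a descriptive illustration and does not reproduce the ANOVA. The variable names are hypothetical.

```python
# Mean VOTs in ms as reported in the text for each place of articulation.
polish_accent = {"p": 23, "t": 25, "k": 37}
imitated_english = {"p": 35, "t": 38, "k": 51}

# Lengthening under imitation, per place of articulation.
lengthening = {place: imitated_english[place] - polish_accent[place]
               for place in polish_accent}
print(lengthening)  # {'p': 12, 't': 13, 'k': 14}

# How much the lengthening differs across places of articulation.
spread = max(lengthening.values()) - min(lengthening.values())
print(spread)  # 2
```

The lengthening is 12 to 14 ms at every place of articulation, which is the pattern a non-significant task by place interaction would lead one to expect.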

A Spearman rank-order correlation was run to determine the relationship between the learners’ VOTs in English and in the imitated English accent in Polish. There was a strong positive correlation [r(88) = 0.602, p < 0.001], which indicates that the learners who produce longer VOT values in English voiceless plosives also produce longer values while imitating English accent in Polish.

4 Conclusions

The purpose of the current study was to test the proposed elicitation technique which uses the imitation of FL accent in L1. The major assumption of the technique is that learners will transfer the FL pronunciation features that they find characteristic or salient into L1, thus revealing the perceived hierarchy of FL phonetic features and the level of their acquisition. The current study investigated the Polish learners’ production of VOT values in voiceless plosives. Two research questions were formulated and subsequently tested.

1. Do Polish learners of English transfer longer VOT values for voiceless plosives in Polish to imitate English accent? Do they find this temporal parameter as a salient cross-linguistic phonetic feature?

The analysis of the results revealed that Polish learners produced significantly longer VOTs when imitating English accent than when producing the test sentences in their L1 accent. The transfer of an English phonetic parameter to Polish indicates that this parameter is perceived as salient or characteristic and typical for the English phonetic repertoire. It also points to the degree of acquisition of long-lag VOT values for English /p, t, k/. Although the analysis of the sentences produced in English already showed that the learners increased VOTs for voiceless stops, we are inclined to argue that the fact that they also transferred them into their L1 in accent imitation is a stronger indication of the acquisition of this feature. While longer VOTs may be lexically encoded in some words in English as a result of exposure to multiple instances from the input in English, transferring this feature into Polish is a conscious strategy bypassing lexical influences. In other words, producing words which are only heard with shorter VOTs with longer VOTs requires more phonetic sensitivity and control than resorting to stored instances of tokens in the lexicon.

2. Is the production of longer VOT values for voiceless plosives in English correlated with increasing VOT values in the imitation of English accent in Polish? Will learners who produce longer VOT values in English also produce relatively longer values in imitated English accent in Polish?

The results of the correlation showed that the learners who produced longer VOTs in English also produced longer VOTs in imitated English accent in Polish. This leads to the natural conclusion that an FL pronunciation feature that is acquired more successfully will be transferred more effectively into L1. Learners who do not produce an FL pronunciation feature in FL can hardly be expected to render it in the imitation. The fact that the correlation, although highly significant, is not perfect suggests that imitational skills may contribute to the outcome of this elicitation technique. There may be a population of learners who produce the FL feature in FL but are not able to transfer it to L1 due to lower imitational skills. The talent for mimicry has been identified in second-language speech research as a significant predictor of the degree of L2 foreign accent (Flege et al., 1999; Purcell & Suter, 1980; Suter, 1976; Thompson, 1991). It is also considered a subcomponent of language aptitude for pronunciation in L2, connected with empathy and the ability to overcome the ‘ego boundary’ (Guiora, 1967; Guiora & Acton, 1979; Guiora et al., 1972; Hu et al., 2013). More research is needed to determine the degree of influence of the talent for mimicry in the imitation of FL accent in L1.