1 Introduction

Since Ge [1] introduced the term ‘China English’, it has attracted growing interest from scholars in China and, more recently, abroad. In the early stage, scholars debated whether a language variety called China English existed, and some tried to define it [2,3,4,5,6]. Scholars then began to examine the linguistic features of China English [6,7,8,9,10,11,12,13]. Most of these studies focused on its phonological and lexical features. Only recently has structural nativization become a topic for researchers on China English. These recent studies [14,15,16,17] have identified several new linguistic structures in English as it is used in China, including new collocational structures, ditransitive verbs, and verb-complementation patterns. However, no previous study of China English has approached structural nativization through colligation. It is therefore worthwhile to explore colligational patterns in China English. The present study extends research on China English, and its findings can attest to the process of structural nativization.

Colligation refers to the grammatical company that a word or word category keeps. Put simply, colligation is the co-occurrence of syntactic categories, usually within a sentence. For example, verbs of the agree type (including choose, decline, manage, etc.) co-occur with a to-infinitive but not with a gerund. The difference between collocation and colligation is that “while collocation relates to the lexical aspect of a word’s selection of neighbouring words, colligation can be described as relating to the grammatical aspect of this selection” [18]. Colligational patterns were chosen for this study for four reasons. First, although colligation is a basic grammatical category in English, it has not been given due attention in the field of world Englishes, including China English. Second, thanks to existing tools and corpora, colligational patterns can be examined by analyzing large amounts of corpus data. Third, differences between China English and “standard” English in colligational constructions are indicative of the structural nativization of English as used in the context of Chinese culture. Finally, colligational patterns carry constructional meanings, which are significant for natural language processing tasks.

The following research questions will be addressed in the present study:

  (1) Are there any common colligational patterns shared by both China English and British English?

  (2) Are there any preferences for certain colligational patterns in China English? If so, what are they?

  (3) Are there any innovative colligational patterns in China English? If so, what are they?

In the present study, a corpus-based methodology is adopted to examine the differences between China English and British English in the use of colligational constructions. The subsequent sections of this paper discuss the data obtained from the corpora in order to answer the above research questions.

2 Literature Review

Structural nativization is defined as “the emergence of locally characteristic linguistic patterns” [19]. Studies of structural nativization in the field of world Englishes focus mainly on new ditransitive verbs and verb complementation in new Englishes, such as Indian English, Singapore English, Fijian English, and Jamaican English. For example, Mukherjee and Hoffmann [20] conducted a case study of the verb complementation of “give” and found that, while in British English the verb is most frequently used in the ditransitive construction, in Indian English it is most frequently used in the monotransitive construction. Schilk [21] compared the collocational and verb-complementation patterns of three verbs in Indian English and British English. Schilk et al. [22] also compared verb-complementation patterns in two Outer Circle varieties: Indian English and Sri Lankan English.

Although much has been done to investigate structural nativization in New Englishes, such as Indian English and Singapore English, fewer such studies have been conducted on China English. Yu [23] conducted a comparative study of the adjective “foreign” in China’s English newspapers and British newspapers. He argues that the use of the word in China’s news reports shows clear evidence of nativization, which is a result of Chinese people’s way of thinking. Using the same data as Yu’s study [23], Yu and Wen [17] examined the collocations of 20 adjectives and reached a similar conclusion: the English used in China’s news reports shows a distinct tendency towards systematic nativization at the collocational level. They argue that this different and innovative use of certain collocational patterns is a result of language contact.

Other scholars have investigated structural nativization in the grammatical structures of China English. Using three sets of data (interview data, newspaper data, and short-story data), Xu [6] identified eight categories of syntactic features in the interview data: adjacent default tense, null-subject/object utterances, co-occurrence of connective pairs, subject pronoun copying, yes–no response, topic–comment, unmarked OSV, and inversion in subordinate finite wh-clauses; three categories in the newspaper data: nominalization, multiple-coordinate construction, and modifying-modified sequencing; and the use of imperatives and a tag variation strategy in the short-story data. Xu’s study is the most comprehensive of these studies, and it identified distinctive grammatical patterns with Chinese characteristics.

Hu and Zhang [15] collected data from the China Daily and built a corpus of about 1 million tokens. They compared the collocation and colligation patterns of some high-frequency verbs of transformation in that corpus with those in the Corpus of Contemporary American English (COCA). They found that the intransitive use of the verb “grow” is more frequent in China’s English newspapers than in American news reports, whereas the transitive use of the verbs “develop” and “increase” is less frequent.

Using corpus data collected from an online discussion forum, Ai and You [14] examined several locally emergent linguistic patterns in China English, and found some new ditransitive verbs, verb-complementation and collocation patterns in China English.

These studies have contributed new knowledge about the grammatical features of China English. However, they share some limitations. Xu’s [6] data set is small: his written data consist of only 20 newspaper articles and 12 short stories. Given the availability of large-scale corpora nowadays, this collection of texts is too small for linguistic studies, especially for identifying grammatical patterns. The other studies [15, 17, 23] have the same problem with data size. Besides, Xu’s [6] spoken data come from interviews with Chinese learners of English, for which, in his own words, it “is difficult to distinguish between the syntactic features of Chinese learners’ English and those of China English” [6]. That might be why more distinct grammatical features were detected in the interview data than in the written data, which were produced by more competent English users. Ai and You’s [14] corpus has about 7 million tokens. However, it consists only of data from Chinese learners of English in an online discussion forum. From the point of view of language acquisition, these “creative uses are in fact instances of erroneous use” [14]. Therefore, the new grammatical patterns they report for China English could be ungrammatical uses of English by learners rather than accepted new syntactic structures.

In short, these previous studies leave room for improvement. First, no previous study has systematically examined the acculturation of English in China through colligation. Second, the data used in previous studies were either too limited in size or not representative of China English. The data for most studies came from a single source, either newspapers [15, 17, 23] or learners’ production [6, 14]. The present study therefore aims to fill these gaps by exploring structural nativization through colligation in a larger corpus of China English.

3 Methods

Owing to the availability of large corpora and corpus-query software, it is now possible to identify the linguistic features of a language on the basis of large volumes of data. In the present study, a corpus-based, comparative methodology is adopted to investigate the use of colligational patterns in different English varieties. This section introduces the source of the data and the research procedures.

3.1 Source of the Data

To facilitate studies on China English, the Corpus of China English (CCE) was built with 13,962,102 tokens. It contains written texts from four genres: magazine, newspaper, fiction, and academic, each of which contains about 3.5 million tokens. It thus covers the same genres as COCA except for the spoken component. To ensure representativeness, only texts written by Chinese speakers of English were included in the corpus. The writers include Chinese journalists, magazine writers, novelists, scholars, and others who write and communicate in English. As they are proficient English speakers, their production can be regarded as China English. The British National Corpus (BNC), which has 96,134,547 tokens, was chosen as the reference corpus in this study.

In the present study, communicative verbs were selected as the objects of study. Verbs, as the core element of a sentence, have been a focus of linguistic research in semantics and syntax. Verbs of communication are considered one of the basic categories of verbs, as they represent the most essential objectives and motivations in human communication [24]. In FrameNet, 435 verbs of communication are collected and classified into 37 groups, such as discussion, encoding, reasoning, verdict, commitment, request, questioning, reporting, statement, response, summarizing, etc. In this study, the sub-category of discussion was selected, which includes the following verbs: discuss, confer, communicate, debate, negotiate, parley, consult, bargain, etc. These eight verbs occur 792, 17, 251, 36, 174, 0, 93 and 19 times respectively in CCE. Because frequency matters in the analysis of linguistic data, the three most frequent verbs, discuss (792), communicate (251), and negotiate (174), were chosen for this study.

3.2 Procedures

After the CCE was built, data extraction and analysis were carried out in the following steps:

First, the CCE was tagged with CLAWS7. Tagging is a prerequisite for extracting colligational patterns with Colligator 2.0, a tool developed by members of the Foreign Language Education Research Informed by Corpora (FLERIC) team at Beijing Foreign Studies University [25].

Second, the colligational patterns of the three selected verbs, discuss, communicate and negotiate, were extracted from the CCE with Colligator 2.0.

Third, from the BNC, the concordance lines of the three verbs were retrieved, out of which 1000 concordance lines were sampled for each verb. The colligational patterns of these verbs were also extracted by Colligator 2.0.

Fourth, the differences between China English and British English in their colligational patterns were analyzed. The significance of the differences was calculated with the log-likelihood ratio calculator developed by members of the FLERIC team [25]. A difference is treated as statistically significant if the significance value is less than 0.05.
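The significance test in the fourth step can be sketched as follows. The source does not give the formula used by the FLERIC calculator, so this sketch assumes the two-corpus log-likelihood statistic commonly used in corpus linguistics, with the significance value read from a chi-square distribution with one degree of freedom. The function names are hypothetical.

```python
import math


def log_likelihood(freq1, size1, freq2, size2):
    """Two-corpus log-likelihood (assumed formula, not the FLERIC tool itself).

    freq1, freq2: observed counts of an item in corpus 1 and corpus 2.
    size1, size2: total token counts of the two corpora.
    """
    total = freq1 + freq2
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def significance(ll):
    """p-value for a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(ll / 2))


# Counts for "discuss" as reported in the text: 792 hits in CCE, 5,505 in BNC.
ll_discuss = log_likelihood(792, 13_962_102, 5505, 96_134_547)
print(ll_discuss, significance(ll_discuss) < 0.05)
```

Under this formula, a log-likelihood above roughly 3.84 corresponds to a significance value below 0.05.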

4 Results

In this section, the data of the three verbs discuss, communicate, and negotiate in the CCE and BNC will be given for further discussion in the next section. The number of occurrences of the three verbs in the CCE and BNC will be given first. Then the top 10 colligational patterns in CCE will be listed.

4.1 Number of Occurrences of the Three Verbs

Both the original frequency and the normative frequency of the three verbs in CCE and BNC are given in Table 1. The normative frequency is the original frequency divided by the total number of tokens in the corpus, expressed as a percentage. The log-likelihood and significance values were calculated with the log-likelihood ratio calculator.
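As a minimal illustration of the normative frequency described here, the sketch below divides a raw count by the corpus size and multiplies by 100. The function name is hypothetical, and the count of 792 is the CCE figure for discuss reported in Sect. 3.1.

```python
def normative_frequency(raw_count, corpus_tokens):
    """Raw count divided by corpus size, expressed as a percentage."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return raw_count / corpus_tokens * 100


CCE_TOKENS = 13_962_102  # CCE size reported in Sect. 3.1

cce_discuss = normative_frequency(792, CCE_TOKENS)
print(f"{cce_discuss:.5f}%")
```

Normalizing in this way makes counts from corpora of very different sizes, such as CCE and BNC, directly comparable.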

Table 1. Number of occurrences of the three verbs in the two corpora

In Table 1, the log-likelihood and significance values show that the verb communicate occurs more frequently in CCE than in BNC. The difference is significant, as the p-value is less than 0.05. The other two verbs, discuss and negotiate, are used less frequently in CCE, but the differences are not significant.

4.2 Colligational Patterns of Discuss

The frequencies of the colligational patterns of the word discuss in CCE are shown in Table 2. The lemma discuss has four inflected forms in the corpus: discuss, discusses, discussing and discussed. For ease of comparison and because of space limits, only the word form discuss was retrieved from the two corpora with Colligator 2.0. Columns 1 and 2 list the ten patterns used most frequently in CCE, with their frequencies and normative frequencies. Column 3 gives the corresponding number of occurrences of each pattern in BNC: the first number is the number of occurrences in the 1,000-line sample, and the second is the estimated total number of occurrences of the pattern in BNC. For example, the colligational pattern V deter. occurs 123 times in the 310 concordance lines of the word form “discuss” in the 1,000-line sample. Scaled to all 5,505 concordance lines of the lemma “discuss” in BNC, the estimated frequency of V deter. is a × b/c, where a is 5,505, b is 123 and c is 310, which gives approximately 2,184. In columns 4 and 5, the log-likelihood and significance values are those calculated with the log-likelihood ratio calculator.
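The a × b/c estimate can be written as a short function. This is only an illustrative sketch: the function and parameter names are hypothetical, and the numbers are those given in the worked example for V deter.

```python
def estimate_pattern_total(lemma_hits, pattern_in_sample, form_in_sample):
    """Scale a pattern count from the sample up to all lemma hits (a * b / c).

    lemma_hits:        a, concordance lines of the lemma in the full corpus
    pattern_in_sample: b, occurrences of the pattern in the sample
    form_in_sample:    c, occurrences of the word form in the sample
    """
    if form_in_sample <= 0:
        raise ValueError("form_in_sample must be positive")
    return lemma_hits * pattern_in_sample / form_in_sample


# V deter. for "discuss": a = 5505, b = 123, c = 310
estimate = estimate_pattern_total(5505, 123, 310)
print(round(estimate))  # 2184
```

The estimate assumes that the proportion observed in the sample (b/c) holds across all hits of the lemma in BNC.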

Table 2. Colligational patterns of “discuss” in CCE and BNC

In Table 2, the colligational pattern V deter. means the verb is followed by a determiner, such as “the” or “a”, V pl. n. followed by a plural noun, V sing. n. by a singular noun, V adj. by an adjective, V prep. by a preposition, V pers. pron. by a personal pronoun, V poss. pron. by a possessive pronoun, V conj. by a conjunction, V inter. pron. by an interrogative pronoun, and V prop. n. by a proper noun.
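The labels above can be illustrated by classifying the CLAWS7 tag of the token that follows the verb. The sketch below is not the logic of Colligator 2.0, which the source does not describe. The mapping from CLAWS7 tag prefixes to pattern labels is an assumption made for illustration, and all names are hypothetical.

```python
from collections import Counter

# Assumed mapping from CLAWS7 tag prefixes to the pattern labels of Table 2.
# Order matters: more specific prefixes are checked before general ones.
TAG_PREFIX_TO_PATTERN = [
    ("APPGE", "V poss. pron."),  # my, their
    ("PPGE", "V poss. pron."),   # mine, yours
    ("PNQ", "V inter. pron."),   # who, whom
    ("DDQ", "V inter. pron."),   # which, what
    ("PP", "V pers. pron."),     # it, them, us
    ("AT", "V deter."),          # the, a
    ("DD", "V deter."),          # this, these
    ("NN2", "V pl. n."),
    ("NN1", "V sing. n."),
    ("NP", "V prop. n."),
    ("JJ", "V adj."),
    ("I", "V prep."),            # II, IF, IO, IW
    ("C", "V conj."),            # CC, CS, ...
]


def pattern_after(tokens, index):
    """Label the token following position `index`, or None at the end."""
    if index + 1 >= len(tokens):
        return None
    next_tag = tokens[index + 1][1]
    for prefix, label in TAG_PREFIX_TO_PATTERN:
        if next_tag.startswith(prefix):
            return label
    return "other"


def count_patterns(tokens, word_form):
    """Count right-hand colligates of `word_form` when tagged as a lexical verb."""
    counts = Counter()
    for i, (word, tag) in enumerate(tokens):
        if word.lower() == word_form and tag.startswith("VV"):
            label = pattern_after(tokens, i)
            if label is not None:
                counts[label] += 1
    return counts


# Toy input: (word, CLAWS7 tag) pairs for two invented sentences.
tagged = [
    ("We", "PPIS2"), ("discuss", "VV0"), ("the", "AT"), ("plan", "NN1"), (".", "."),
    ("They", "PPHS2"), ("discuss", "VV0"), ("issues", "NN2"), (".", "."),
]
print(count_patterns(tagged, "discuss"))
```

Checking only the tag immediately to the right of the verb keeps the sketch simple; a real tool may use a wider window or finer tag distinctions.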

Table 2 shows that V deter. is the most frequently used pattern in both corpora. It is the common colligational structure for the verb discuss shared by China English and British English. However, there are marked differences between China English and British English in their colligational patterns. On the one hand, three colligations (V pl. n., V conj. and V prop. n.) are used more frequently in China English than in British English; on the other hand, three colligations (V deter., V pers. pron. and V poss. pron.) occur less frequently. These differences are statistically significant, as the p-values are less than 0.05. The frequencies of the other patterns also differ, but the differences are not statistically significant.

4.3 Colligational Patterns of Communicate

The verb communicate ranks second in frequency in the sub-category of communicative verbs in CCE. The frequencies of its colligational patterns in the two corpora are given in Table 3. The frequencies of the colligational patterns in BNC are calculated in the same way as described in Sect. 4.2. There are 1507 hits of the lemma “communicate” in BNC, and 555 hits of the word form “communicate” in the 1,000-line sample from BNC. The estimated frequency of each colligational pattern is therefore a × b/c, where a is the total number of hits of the lemma in BNC, b is the number of occurrences of the pattern in the 1,000-line sample, and c is the number of occurrences of the word form “communicate” in the sample.

Table 3. Colligational patterns of “communicate” in CCE and BNC

In Table 3, the colligational patterns are named in the same way as in Table 2. The pattern “V,” means that the verb communicate is used intransitively and is followed by a comma, e.g. “By contrast, all of us need to learn how to communicate, and to understand the language we use.” The pattern V prep. to means that the verb is followed by the preposition “to”, e.g. “I try to communicate to the audience why I think they should bother to listen to what I’ve got to say.”

Table 3 shows that the pattern “V prep.” occurs most frequently of all the patterns in both CCE and BNC. The greatest difference between China English and British English lies in the pattern “V deter.”: it is the second most frequently used pattern in BNC, but it occurs only twice in CCE. The pattern “V sing. n.” is also used much less frequently in CCE. The frequencies of the other patterns differ to varying degrees, but the differences are not statistically significant.

4.4 Colligational Patterns of Negotiate

The verb negotiate is the third most frequently used verb in the sub-category of communicative verbs in CCE. The frequencies of its colligational patterns in the two corpora are given in Table 4. The frequencies of the colligational patterns in BNC are calculated in the same way as described in Sects. 4.2 and 4.3. There are 1259 hits of the lemma “negotiate” in BNC, and 328 hits of the word form “negotiate” in the 1,000-line sample from BNC. The estimated frequency of each colligational pattern is therefore a × b/c, where a is the total number of hits of the lemma in BNC, b is the number of occurrences of the pattern in the 1,000-line sample, and c is the number of occurrences of the word form “negotiate” in the sample.

Table 4. Colligational patterns of “negotiate” in CCE and BNC

In Table 4, the colligational patterns are named in the same way as in Tables 2 and 3. The data in Table 4 indicate that the colligational patterns of “negotiate” in China English are closer to those in British English than is the case for the verbs “discuss” and “communicate”. The patterns “V prep.” and “V deter.” are the top two patterns in both corpora, although in reverse order: China English has more “V prep.”, and British English more “V deter.”. However, the most striking difference between China English and British English is in the colligational pattern “V infin. to”, which occurs twice in CCE but not at all in the BNC sample. Another significant difference is in the “V adj.” pattern, which is used less frequently in China English. The log-likelihood values show that seven of the ten patterns have a lower frequency in China English than in British English. This may indicate that China English relies heavily on a few patterns and uses a narrower range of patterns.

5 Discussion

With the results shown in Sect. 4, it is now possible to answer the research questions listed in Sect. 1. The answer to the first research question is affirmative. Although the three verbs are used with different grammatical structures, each of them has some colligational patterns shared by China English and British English. For the verb “discuss”, the shared pattern is V deter. A verb’s colligational patterns are closely related to its transitivity. As “discuss” is mainly used as a transitive verb, its most frequent structure is likely to be “discuss sth.”, in which a determiner appears when the noun requires one. However, this colligational pattern is used much less in China English than in British English, with a significance value of less than 0.05. This may suggest that speakers of China English tend to omit the determiner or to use other forms instead. For example, the pattern “V pl. n.” is used significantly more often in China English than in British English (see Table 2).

For the verb “communicate”, the common colligational patterns are V prep. and V adv., as shown in Ex. 1 and Ex. 2. These are the top two patterns in both China English and British English, so the verb is mainly used intransitively.

Ex. 1. Zhou required that that the courts expand their use of technology to better communicate with the public (from CCE).

Ex. 2. The addresser and the addressee must communicate simultaneously at two levels (from CCE).

For the verb “negotiate”, the top two colligational patterns shared by China English and British English are V prep. and V deter. The verb is used both transitively and intransitively. However, in China English it is more frequently used intransitively, because the colligational pattern “V prep.” occurs more often in CCE than in BNC. By contrast, British English makes more use of the transitive colligational pattern “V deter.”.

The answer to the second research question is also affirmative. Although China English and British English share some colligational patterns, there are marked differences between them in how frequently the patterns are used. Speakers of China English tend to use some colligational patterns more frequently and other patterns less frequently than their British counterparts. For the verb “discuss”, the patterns V pl. n., V conj., and V prop. n. are more frequent in China English than in British English, whereas V deter., V pers. pron., and V poss. pron. are less frequent. For the verb “communicate”, the pattern “V prep.” is used more frequently in China English, but the patterns “V sing. n.” and “V deter.” are used less frequently. For the verb “negotiate”, the pattern “V adj.” is significantly less frequent in CCE than in BNC. This may suggest that Chinese speakers of English tend to make less use of the structure “negotiate + adj. + n.”, as in “The case of minjian online writers has shown that both the Chinese state and Internet users constantly negotiate new boundaries in this new domain” (from CCE).

The answer to the third research question is negative. A previous study [14] claimed that new grammatical structures such as “discuss about sth.” were found in China English. In the present study, however, no such colligational patterns have been found. This difference might be due to the source of the corpus data. Ai and You’s [14] data came from Chinese learners of English in an online forum, so that grammatical structure is more likely an erroneous use than a creative one.

In the present study, only one colligational pattern has been found in CCE but not in the sampled data of BNC, i.e. “V infin. to”, in which the verb “negotiate” is followed by an infinitive with “to” as in Ex. 3 and Ex. 4.

Ex. 3. Sometimes he can negotiate to have a larger loan because the future interest provides him growing collateral beyond the 10 million yuan (from CCE).

Ex. 4. If one party of the married couple signs the purchase contract of a real estate property and makes the down payment with his or her savings prior to marriage, and the couple jointly repay the mortgage within the marriage, in case of divorce, the couple should negotiate to divide the property (from CCE).

However, a thorough examination of the full BNC did find similar uses of the pattern, although there were only a few examples. The sample used in the present study is not large enough to include them. Therefore, the colligational pattern “V infin. to” is used in both China English and British English, and the difference is only quantitative.

6 Conclusion

The present paper uses corpus data to examine colligational patterns in China English and British English in order to identify differences between these two varieties in the use of grammatical constructions. The results show distinct differences between them in frequency. These results provide further evidence of structural nativization in China English. Mukherjee and Gries [26] argue that “structural nativisation not only refers to entirely new and innovative forms and structures in individual varieties, but also covers quantitative differences between varieties of English in the use of forms and structures that belong to the common core that is shared by all Englishes”. Below is a brief summary of the study’s major contributions:

(1) It sheds light on the differences between China English and British English in the use of colligational patterns. No previous literature was found that studies structural nativization through colligation in the field of world Englishes. (2) The Corpus of China English was built for the study of China English and is a valuable language resource for natural language processing, computational linguistics, and the study of world Englishes. (3) It has explored quantitative differences in the use of colligational patterns in China English by analyzing very large amounts of natural data, which is a methodological contribution to the study of China English.

The findings of the study show that speakers of China English have a clear preference for certain colligational patterns. However, the present study has some limitations. For each verb investigated in the BNC, only 1,000 concordance lines were selected by random sampling. This might not be a serious problem for the verbs “communicate” and “negotiate”, because there are 1507 and 1295 hits for the two verbs respectively. But there are 5505 hits for the verb “discuss”, so some uses of colligational patterns might be missing from the sampled data. In addition, only three verbs of communication were chosen as objects of study because of time and space limits. Future work should investigate more communicative verbs or other types of verbs.