Over the last two decades, the integration of technology with education has become a highly sought-after approach to teaching and learning, as it has brought new opportunities, resources, tools and methods to teachers and learners. In language teaching and learning, data-driven learning (DDL) advocates learner engagement through technological means, which has been found effective in enriching EFL learners’ learning experience and in providing opportunities for them to explore language knowledge and language use. Researchers have reported on how corpus-based or corpus-aided DDL has been used to assist learners with their English language development (Chen and Flowerdew 2018; Larsen-Walker 2017; Gui et al. 2010; He 2010).

While the DDL approach has been applied to learners’ English writing, grammar and vocabulary enhancement, a less researched aspect is the influence of DDL on the development of advanced EFL learners’ analytical reading ability. The current study designed a 6-week teaching experiment that used different corpus-processed materials, such as wordlists, keywordlists and concordances, with a class of 31 advanced Chinese EFL learners, with the purpose of helping them gain a deeper and quicker understanding of the texts. A post-study survey found that the subjects responded positively to the approach. To verify some of the subjects’ feedback on the corpus-aided approach, a reading-writing experiment was then conducted with three parallel classes (135 students in total) using three different instructional modes. Class A wrote a summary after reading a text with teacher instruction, while Class B wrote the summary after reading the same text with additional corpus-aided input but without teacher instruction. Class C completed the same task with both teacher instruction and corpus-aided input. The data obtained provide further evidence that DDL is effective in helping the subjects establish a deeper understanding of the text, and that the students performed best under the mode of corpus-aided input with teacher instruction.

2 Corpus-Aided Reading Teaching for EFL Learners

Data-driven learning is a language teaching and learning approach which enables students to use corpora to explore specific linguistic issues through authentic language materials. The theory underpinning it is corpus linguistics, which can be applied far beyond the discipline itself (McEnery et al. 2006: 8). Cobb and Boulton (2015) suggest that its methodologies have greatly improved the description of language varieties and features, which are important factors in language teaching. The design of DDL activities, namely corpus-based learning, covers a wide range from “hard” to “soft” (Bernardini 2004; Gabrielatos 2005), depending on the degree to which learners themselves take over the learning stages (Cobb and Boulton 2015).

Because of its diverse advantages, DDL is frequently discussed in language teaching. Cobb and Boulton (2017) review 64 studies and find that most focus on corpus-based learning of vocabulary from a lexicogrammatical or grammatical perspective. They argue that DDL plays a valuable role in helping learners deal with the difficulties they encounter in producing natural and effective language. Johns (1991) analyses two samples of DDL materials and states that this approach provides a new style of “grammatical consciousness-raising” (Rutherford 2014), which differs from the traditional teaching approach that presents students with a set of “rules” directly.

While many studies focus on grammar and vocabulary, few scholars have paid attention to the application of corpora to reading teaching, and its feasibility and advantages have not been fully researched. Cobb and Boulton (2015) point out that corpus tools can be flexibly adjusted and applied to individual texts. They are helpful in deciding which elements of a text deserve the most attention, and this focus can help teachers decide which reading activities to conduct. Beyond feasibility, research shows the strengths of incorporating technology into reading instruction. After reviewing 42 studies, Jamshidifarsania et al. (2019) conclude that combining technology with reading instruction can increase motivation, reduce cognitive load and allow students to learn at their own pace. Higher motivation coupled with less reading anxiety helps improve reading outcomes. Others (Kamil and Chou 2009; Boekaerts and Corno 2005; Zimmerman and Tsikalas 2005) point out the potential and benefits of teaching reading through computer-based approaches. As one of the pioneering studies, Hadley and Charles (2017) reveal the effects of DDL in reading instruction and confirm that the method has a positive effect on the extensive reading program the study designed.

The studies cited above lay a foundation for the corpus-aided approach to reading teaching that the present study explores. What sets the present study apart is its concentration on enhancing students’ analytical reading, which is a kind of in-depth reading. According to Chall (1983), who divides reading ability development into five stages based on Piaget’s cognitive development theory, college students are at the stage in which their analytical reading ability develops quickly. Learners at this stage attempt to establish their view of the world by reading and analyzing different types of texts. They make inferences and judgments and draw their own conclusions by interacting with the texts. They also develop their logical thinking abilities by analyzing the themes of the texts, features of textual development, linguistic and rhetorical features, etc. English majors at college level are required to develop excellent English skills and are expected to enhance their critical thinking ability, which can be demonstrated through analytical reading performance. This expectation is clearly indicated in the nationally used Syllabus of English Teaching for English Majors of Higher Education published by China’s Ministry of Education (2000).

As improving Chinese advanced EFL learners’ analytical reading ability has become a current goal in English education, the question arises of how this can be done empirically. How the teaching can be assisted by technology to improve its effects also needs to be considered. The corpus-aided approach of the study makes use of a user-friendly corpus tool, AntConc 3.2.0 (Anthony 2014), which can quickly present useful information about a text or texts, such as keywordlists, concordances and n-grams (clusters of different numbers of words centering around certain words). With this core information, teachers can guide students towards understanding the theme of a text, considering how it develops into sub-themes, and examining which language forms are used, and how, to create the aboutness of the text. Beyond such guided analysis, students’ reading can be enhanced because the DDL approach gives them opportunities for autonomous exploration, i.e., they are engaged in making exploratory analyses of the different kinds of materials the corpus tools instantaneously offer.

3 Research Design

3.1 Research Questions

In order to examine the effectiveness of corpus-aided teaching of reading, the study aims at answering the following questions: (1) How can the teaching of analytical reading make use of the corpus method and tools? (2) Can corpus-aided DDL help Chinese advanced EFL learners improve analytical reading performance?

To answer the first question, a 6-week teaching experiment of corpus-aided instruction was conducted. To answer the second question, a post-study questionnaire survey, interviews, and a reading-writing experiment were carried out.

3.2 Participants

The teaching experiment, together with the questionnaire survey and interviews, took place from April to May 2016. A class of 31 students participated. They were third-year English majors who had all passed the Test for English Major Band 4, with an English proficiency level comparable to IELTS 6.0–6.5. They had taken the Corpus Linguistics course in the previous term and were familiar with the corpus tools used in this study. In the second stage of the study, three classes totalling 135 second-year English majors participated in a 40-min reading-writing experiment. They had also taken the Corpus Linguistics course and were familiar with the corpus tools used in this study.

3.3 The Corpus and Tools

A mini corpus of five reading passages taken from an intensive reading course for English major students was constructed for the teaching experiment. Table 1 shows the titles and lengths of the texts.

The corpus software used in the study is AntConc (Anthony 2014). The main tools used are the keywordlist and the concordance. In corpus linguistics, keywords are the words of a text (or texts/corpus) which are unusually frequent, with statistical significance, when compared with the wordlist of a reference text (or texts/corpus). The reference corpus consists of data on different topics and is larger in word count (at least five times larger). Figure 1 shows part of the keywordlist extracted from the text The Science of Custom. The higher the keyness value, the greater the difference in word frequency between the target text and the reference corpus. Keyness is regarded as a mostly textual quality (Scott and Tribble 2006). Keywords often indicate the aboutness and stylistic features of a text.
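The idea of keyness can be illustrated with a minimal sketch. It assumes the log-likelihood statistic, which is one commonly used keyness measure; the study does not state which statistic was selected in AntConc, so this is an illustration rather than a reproduction of the tool. The function names, the simple tokenizer and the cut-off of 3.84 (the chi-squared critical value for p < 0.05 with one degree of freedom) are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a: frequency in the target text, b: frequency in the reference corpus,
    c: total tokens in the target text, d: total tokens in the reference corpus.
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the target text
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score


def keyword_list(target_text, reference_text, min_keyness=3.84):
    """Return (word, keyness) pairs for words overused in the target, highest first."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    c, d = sum(target.values()), sum(reference.values())
    if c == 0 or d == 0:
        return []
    results = []
    for word, a in target.items():
        b = reference[word]
        # Keep only positive keywords: relatively more frequent in the target text.
        if a / c > b / d:
            score = log_likelihood(a, b, c, d)
            if score >= min_keyness:
                results.append((word, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Called as `keyword_list(passage, rest_of_textbook)`, the sketch returns the words whose relative frequency in the passage is unusually high, ordered by keyness, which is the kind of list shown in Fig. 1.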

A concordance tool searches a text or a corpus for a selected word or phrase (a node) and presents every instance of the node in the center, with the words that come before and after it displayed to the left and right. Concordance lines are the instances the tool presents for the nodes keyed in. Some corpus tools, such as AntConc, provide a function which allows users to sort and highlight words at different distances to the left or right of the node. Figure 2 shows what concordance lines look like.
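A concordance display of this kind can be sketched in a few lines. The example below is a simplified illustration, not AntConc's implementation: the function name `concordance`, the `width` parameter and the sample sentence are invented for the example, and the node is given as a regular expression so that inflected forms can be matched together.

```python
import re


def concordance(text, node_pattern, width=40):
    """Return keyword-in-context lines for every match of node_pattern in text."""
    flat = " ".join(text.split())  # collapse line breaks so each context stays on one line
    lines = []
    for match in re.finditer(rf"\b(?:{node_pattern})\b", flat, flags=re.IGNORECASE):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so every node starts in the same column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Made-up sample text; the pattern matches "custom", "customs", "customary", etc.
sample = "Custom is old. The customs of a tribe shape it."
for line in concordance(sample, r"custom\w*", width=20):
    print(line)
```

Aligning the node in a fixed column is what lets readers scan the neighbouring words to the left and right, which is the basis of the guided reading described in this study.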

3.4 The Teaching Experiment

In the teaching experiment, there were four lessons each week, each lasting for 40 min. The first week was used to demonstrate how corpus-aided analytical reading was carried out. In the following four weeks, the students studied four passages using the corpus-aided DDL method. In the last week, the questionnaire survey and interviews were carried out. The corpus-aided teaching method consisted of the following steps:

    Prepare a corpus of the reading passages ready to be searched by AntConc.

    Familiarize the students with the tools used by reviewing the functions of AntConc, i.e. wordlist, keywordlist, concordances, and collocates.

    Demonstrate to the students how to make keyword lists (the content words with significant keyness indicated in Fig. 1) and guide them to analyze and categorize the words, as shown in Table 2.

    Table 2. Keywords identified and categorized from The Science of Custom
    Guide students to make predictions and discuss possible connections and logical relationships among the items on the keyword list.

    Focus students’ attention on the most important keywords and use them as nodes to generate the concordance lines for analytical reading. Figure 3 shows part of the concordance lines of custom.* (footnote 1). Figure 4 shows how guided reading of the words and phrases associated with the nodes is done.

    Encourage students to use corpus tools to read for key information, identify how the information is organized, and uncover the deeper meanings of the text, etc., by helping them understand the theme and its supporting sub-themes, the meanings contained in keywords, and their possible logical connections.

    Have students read the text and bear the analysis they have done with the corpus data in mind to confirm and deepen their understanding.

    After Week one’s demonstration, students were required to do the corpus-aided DDL reading for four weeks with necessary teacher guidance.

After the teaching experiment, a questionnaire survey was carried out to gather responses from the students. Part One of the survey sought the students’ opinions about the effect of the teaching method, and Part Two focused on their suggestions for the use of the method by having them answer a multiple-choice question (see Table 4). On the basis of the survey, follow-up interviews were conducted with some of the students.

3.5 The Reading-Writing Experiment

The survey and interviews after the teaching experiment suggest that the students expected to have the corpus-processed input prior to reading, and that they would like to have teacher guidance in using the corpus materials. To further investigate the significance of teacher instruction in DDL, a reading-writing experiment was conducted. Three classes of English-major students from the same cohort (classes A, B, and C) were required to read a passage with different methods of instruction and to write a summary of it at the end of the experiment.

The reading passage used in the experiment was a 1674-word essay, Some Thoughts on Writing (footnote 2). The additional corpus materials included a keyword list and concordance lines of selected keywords of the passage. AntConc was used to compare the reading passage with the rest of the same textbook (79,960 words). The theme of the passage on writing can be inferred from some of the most salient keywords such as writing, write, work, how, published, rejected, etc. The frequent use of first-person and second-person pronouns such as I, my, you, your, etc. is a feature of the dialogic style of the writing. In addition, concordance lines were extracted for three selected keywords, write, you, and I, as well as their inflectional forms.

The methods of instruction involved two variables: form of input and teacher instruction. Class A read the passage with teacher instruction. Class B were given both the passage and corpus-aided materials, but no teacher instruction. Class C were provided with the passage as well as the corpus input with teacher instruction. The instruction provided to Class A was regular directions in a reading lesson and brief guidance about the topic and organization of the text, while in Class C it focused on the reading of the concordance lines, especially on guiding the students’ attention to words and expressions indicating the topic and subtopics of the reading passage.

The experiment was carried out in language labs. It lasted for 40 min, within which the reading process and the writing task were completed. The summaries were written directly on computer and were collected at the end of the experiment. Of the 135 summaries collected, 16 were excluded because they were too short or not in the proper form of a summary. The remaining 119 valid summaries were used as the final data of the study. Table 3 provides a description of the summaries collected.

Table 3. Description of the summaries collected

After the summaries were gathered, they were carefully graded against the following criteria: (1) main idea coverage, which focuses on the number of main ideas included in the written summary; (2) integration, which examines the extent to which the information in the text is presented succinctly; and (3) language use, which focuses on the correctness of grammar and vocabulary. Four independent raters were invited to do the rating, and an average score was obtained for each summary.
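The averaging step can be illustrated with a short sketch. The scores and summary identifiers below are hypothetical and serve only to show the arithmetic; the scoring scale and any weighting of the three criteria are not specified here, so each rater is assumed to give a single overall score.

```python
from statistics import mean

# Hypothetical data: four raters' overall scores for each summary.
ratings = {
    "summary_01": [78, 80, 75, 79],
    "summary_02": [82, 85, 80, 81],
}


def average_scores(ratings):
    """Map each summary to the mean of its raters' scores."""
    return {summary_id: mean(scores) for summary_id, scores in ratings.items()}


for summary_id, score in average_scores(ratings).items():
    print(summary_id, score)
```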

4 Results and Discussion

The results of the questionnaire survey are shown in Table 4. The results from Part One show that the students had a positive attitude towards the teaching method: 93.4% of them agreed that the corpus-aided method improved their understanding of the theme and main ideas, 75.8% reported that the method was more likely to draw their attention to language forms, and 83.3% stated that the method deepened their understanding of the meaning of the keywords and their contribution to the logical development of the texts.

Table 4. Results of the survey

The data from Part Two indicate that the students had the highest expectation for teacher-guided reading with corpus-aided input. It was their top choice, accounting for a third of all the choices they made. The students regarded teacher guidance as more important than other aspects, such as further explanation of the use of the tools, whether printed input is necessary, and the timing of the corpus-aided materials.

The students commented positively on corpus-aided DDL in reading because they were more engaged in reading. While they explored the theme of the text and its development, they were given bottom-up data centering on the keyness of the text. The corpus tools, which instantaneously highlight the keywords and the neighbouring words of the nodes in concordance lines, can help students focus their attention on forms and observe how meanings are constructed. In the follow-up interviews, students reported that they were able to “quickly grasp the main theme and relationships of topic and sub-topics”, while in traditional reading class they “tend to read word by word or line by line from the beginning to the end and still miss the core meaning”. They stated that “exploration of the texts is more effective because specific tasks are given”, and that “the rich data displayed and the requirements to work out the themes and the underneath meanings help to improve my critical reading ability”.

With regard to the reading-writing experiment, the treatment of both corpus-aided input and teacher instruction proved to be most effective in helping the students with summary writing. Table 5 shows the results of the experiment.

Table 5. Statistics of results of the reading-writing experiment

The mean scores for summary writing of classes A, B, and C are 76.55, 79.53 and 81.1 respectively. Classes B and C, which had corpus-aided input, performed better than Class A, which received no corpus-aided input. Class C, with both corpus-aided input and teacher guidance, performed best. Its mean score is significantly higher than that of Class A, suggesting that teacher-guided use of corpus-aided materials is effective in enhancing analytical reading.

Summary writing is an important output skill which is closely related to the quality of one’s reading. A good summary includes the main idea of the text and the most essential supporting details, and uses paraphrasing skills with correct grammar and vocabulary. The skill to handle these elements in summary writing has much to do with the students’ critical reading ability, which lays a foundation for the writing. The experiment provides evidence of the positive influence of the mode of corpus-aided input with teacher guidance on analytical reading. It also indicates the importance of considering learner needs in teaching: in-depth interaction with bottom-up data is more effective when coupled with proper guidance and challenge from teacher-student interactions.

5 Concluding Remarks

The present study conducted a 6-week reading teaching experiment with Chinese advanced EFL learners using the corpus-aided DDL method. The effect of the method was investigated through a questionnaire survey, interviews and a follow-up reading-writing experiment. Unlike previous studies, which used much larger corpora when applying the DDL approach, the study concerns single texts. It shows how corpus tools can be used to assist analytical reading and language learning. The corpus-aided bottom-up method helps students to focus on the most relevant information in a text. When students observe keywords, concordances, clusters centering on the nodes, etc., they are exploring how the theme and its sub-topics or supporting details are developed. When reading keywordlists, for example, they compare and categorize the contents. When observing concordance lines of selected nodes, they make inferences about and evaluate the ideas expressed.

The integration of technology with teaching brings opportunities for innovation in classroom teaching. The present study suggests that the effect of such innovations can be mediated by the teacher instruction provided. In addition, it has implications for further research applying the corpus-aided DDL method to the teaching of reading with more texts in one particular discipline or subject area.