1 Introduction

Research using Translog-II (Carl 2012) in conjunction with eye-tracking data in translation studies (Carl and Jakobsen 2009; Jakobsen 2011; Hvelplund 2011; Carl and Dragsted 2012; Sjørup 2013; Balling and Carl 2014; Mesa-Lao 2014; among others) has focused on tasks involving Western European languages and, consequently, alphabetic scripts. Studies focusing on languages that use logographic scripts, however, are still scarce. This chapter reports on a study of from-scratch translation and post-editing tasks carried out from Portuguese into Chinese by Chinese translators of Portuguese (L2). Drawing mostly on the methodology used by Sjørup (2013), we examined gaze and key-logging data from six participants while translating and from six other participants while post-editing a 79-word news report. These data are available in the CRITT Translation Process Research Database (TPR-DB) as study MS13 (translation session 16, post-editing session 18).

Building on Halliday and Hasan (1976) and Hasan (1984), referents pertaining to the main cohesive chain of the source text (labelled chain A) were defined as our focus of enquiry and contrasted with items in a secondary cohesive chain traceable in the same text (labelled chain B). Our assumption was that tracking participants (referents) in chain A would be critical for the 12 translators to build a coherent interpretation of the source text (ST) and would require them to retrieve the identity of what was being talked about by referring to another expression either in the co-text or in the context of situation and culture. We thus expected more eye fixations and more keyboard activity during the reading and production of chain A. A secondary chain, in contrast, would contribute less to the coherence of the ST and of the target text (TT) and would thus demand less attention, as well as less keyboard and eye activity.

2 Review of the Literature

The role of cohesion in the establishment of a coherent interpretation of text is one of the core questions of reading comprehension in translation tasks (Bell 1991; Hatim and Mason 1990). Since it concerns translators’ active participation in understanding an ST as it unfolds and in building a TT patterned on it, the issue is particularly well suited to being approached from a translation process perspective.

Among the different resources playing a part in texture, i.e., that which makes a text a text and makes it function “as a unity with respect to its environment” (Halliday and Hasan 1976, 2), cohesive devices are responsible for non-structural relations between items in a text. Such relations are established through the creation of semantic bonds, so that one item is interpreted with reference to the other.

One cohesive relation in particular is especially relevant to discourse coherence in text unfolding: participant tracking, i.e., the mapping of referents pertaining to the main cohesive chains running through a text. Both grammatical cohesion (more precisely, reference) and lexical cohesion are recruited in participant tracking. By reference is meant a relationship in meaning construed through the use of a personal reference item (a personal pronoun or possessive determiner) that enters into a semantic relationship with an item mentioned either before in the text (anaphora) or afterwards (cataphora). Occasionally, reference is made to entities that cannot be retrieved from the text and need to be established situationally; this is referred to as exophoric reference. When two items share the identity of a referent, this is termed co-referentiality. Items sharing identity can also be linked through lexical cohesion, be it repetition, synonymy or hyponymy.

Cohesive ties, i.e. semantically bonded items, are particularly important when they form so-called cohesive chains, responsible for strong integration of cohesive ties and a more coherent text. A cohesive chain built on participant tracking may be realized through co-reference or lexical cohesion categories that are valid for language in general but that ultimately need to be interpreted in a particular text. Thus, this type of cohesive chain is crucial to text organization and comprehension. Conversely, secondary chains are not essential to participant tracking and are built upon lexical relations that are not text-specific, but general to the lexicon of the language.

When text is processed in translation tasks, Hatim and Mason (1990) argue, translators rely both on contextual and co-textual cues in order to identify cohesive items deemed relevant to a coherent construction of the TT. These cues can be sought in the immediate co-text or demand integration of items that are more distant in the text.

Cohesion has not been extensively examined in translation process research. Denver (2009) investigated adversative-concessive logical-semantic relations in translations from Spanish into Danish. The author found a variety of right and wrong choices among translators and students, but no trace of mental activity devoted to processing the relations realized through conjunctions in Spanish, i.e. no verbalization, keystroke, pause or revision signalling that these relations constituted translation problems for the participants.

Angelone (2010) studied the uncertainty management and metacognitive problem solving of a professional translator, two students and a bilingual. He classified the textual levels at which the participants’ metacognitive activity operated into lexis, term, collocation, phrasal, syntax, sentential, macro level, and unclassified. The macro level category refers, according to the author, to beyond-sentence considerations such as cohesion, coherence, and genre. Only a small part of the elicited verbalizations fell into the sentential and macro level categories.

Both authors relied on think-aloud protocol data; Denver also used key logging data, and Angelone also used screen recordings. To the best of our knowledge, no other translation process research using eye-tracking has addressed translators’ or post-editors’ processing of cohesion in STs and TTs.

Staub and Rayner (2007) claim that many eye-tracking studies have focused on syntactic parsing, but few have looked into how discourse processing (including cohesion) affects eye movements in reading. Staub and Rayner (2007, 335) argue that recognizing individual words and analyzing grammatical structures of each sentence does not suffice to understand a text; the reader “must also maintain a representation of the entities and events that have been mentioned, and relate the information that is currently being processed to this stored representation.”

Overall, eye-tracking studies focusing on cohesion have so far shown increases in fixation times due to: a long distance between an anaphor and its antecedent (O’Brien et al. 1997); the antecedent being a low-frequency word (van Gompel and Majid 2004); and reading a target word that requires drawing conclusions not explicitly stated in the text (O’Brien et al. 1988).

3 Methodology

The results described in this chapter are part of a larger empirical-experimental project carried out by the AuTema-PostEd Group, which aims at tapping into translation and post-editing processes as a source of insight into the role of translators’ understanding in task problem solving. In this chapter, we report the results regarding the Portuguese(L2)-Chinese(L1) translation and post-editing of a text about the China Gold Research Institute.

3.1 Equipment and Analysis Tools

Data from gaze and keyboard activity were collected and analyzed using Translog-II (Carl 2012, 2013), version 0.1.0189, connected to a Tobii T120 remote eye tracker. The eye-tracking software application Tobii Studio 3.2.1 was also used as a recorder for the participants’ verbalizations and gestures. Calibration was performed in both Translog-II and Tobii Studio, the latter running in the background while the participants worked in Translog-II.

Figure 11.1 shows the screen setting in the post-editing task: the ST appears in the top half of the application window, and the TT in the bottom half, which is empty in the translation task. The ST font was Tahoma, and the TT font was SimSun with font size 17. Both texts were double-spaced.

Fig. 11.1 Screenshot of Translog-II user post-editing window

3.2 Participants

Originally, 23 professional translators performed two translation tasks (L1 into L2, and L2 into L1) and two post-editing tasks (one in their L1 and another one in their L2) using machine-translated (MT) input provided by the software PCT (Portuguese-Chinese Translator) [Footnote 1] (Wong and Chao 2010). In this chapter, we report the analysis of Text 1 (see Sect. 11.3.4), for which we have data from 17 participants [Footnote 2] contained in the MS13 study within the TPR-DB (translation session 16, post-editing session 18). Twelve participants, all with more than 1 year of experience and aged 23–32, remained in our sample (see Table 11.1). The others were excluded because of failure to comply with the data quality criteria described in Sect. 11.3.3.

Table 11.1 Results for the three quality assurance criteria

All participants provided informed consent. They were Chinese nationals and had an undergraduate degree in Portuguese Studies or a Master’s degree in Chinese-Portuguese Translation Studies. All had been granted a fellowship to live in a Portuguese-speaking country, and all of them wore glasses or contact lenses. None of them had experience in post-editing. Each was paid MOP 90.00 to take part in the experiments.

The participants were instructed to sit approximately 55 cm from the eye-tracker monitor. They were told that they could move freely, but should keep their eyes on the monitor as much as possible.

3.3 Data Quality

During data collection, we tried to control for the numerous factors that may affect the quality of gaze data, especially lighting, glasses, and distance from the monitor. To ensure consistency in the sample, three data quality assurance criteria were also observed: a participant’s data had to comply with at least two of the three criteria so that the results would not be skewed by flawed data.

The first criterion was mean fixation duration: following Sjørup (2013) and Hvelplund (2011), our threshold was established at a minimum of 180 ms. The second criterion was gaze time on screen (GTS), that is, the percentage of time spent gazing at the text in relation to the total time of translation production: again following Sjørup (2013) and Hvelplund (2011), our threshold was 30 %. The third criterion, called “% of valid win gaze data,” was calculated as the percentage of valid gaze data on ST and TT tokens, based on the attribute “win” in the XML files produced after each Translog session. More specifically, we divided the number of occurrences of win = 1 (gaze on ST) plus win = 2 (gaze on TT) by the total number of “wins”, which included win = 1 and win = 2 as well as win = 0 (gaze not ascribed to either the ST or the TT). As, to the best of our knowledge, no study has reported such a measure before, we arbitrarily established our threshold at ≥40 %.
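For concreteness, a minimal sketch of this computation in R is given below. It assumes that the Translog-II XML logs gaze samples as elements carrying a “win” attribute; the XPath expression and file name are illustrative, not the exact Translog-II schema.

    # Minimal sketch of the "% of valid win gaze data" criterion,
    # assuming gaze samples are XML elements with a "win" attribute
    # (0 = unassigned, 1 = gaze on ST, 2 = gaze on TT). The XPath
    # ".//*[@win]" is illustrative, not the exact Translog-II schema.
    library(xml2)

    valid_win_share <- function(log_file) {
      doc  <- read_xml(log_file)
      wins <- as.integer(xml_attr(xml_find_all(doc, ".//*[@win]"), "win"))
      100 * sum(wins %in% c(1, 2)) / length(wins)
    }

    # A session passes the third criterion when the share is at least 40:
    # valid_win_share("P01_translation.xml") >= 40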

Table 11.1 shows the figures for the remaining data according to the three criteria described above.

3.4 The Experimental Text

The ST, Text 1, is a short news report written in Brazilian Portuguese about the growth of gold extraction and of the gold consumer market in China (see Appendix 2). Chain A is the main chain, in which the participant being tracked is the production volume. Chain B is the secondary chain and refers to the country (i.e. China). Table 11.2 shows both chains and their cohesive devices.

Table 11.2 Main chain and cohesive devices of ST (selected tokens in italics)

The relative position and length of the selected tokens (words and noun groups) in chain A and chain B were accounted for in our statistical analysis (see Sect. 11.4.2).

3.5 Task

After a brief warm-up session, which consisted of a copy test before the experiment, each participant was asked to perform four tasks, presented in randomized order: two translations (one into their L1 and another into their L2) and two post-editing tasks (one in their L1 and another in their L2) using MT output.

Table 11.3 shows the tasks performed by each participant. The analyses in this chapter refer to T1 and P1, highlighted in Table 11.3.

Table 11.3 Task distribution across participants

The brief informed the participants that they should render texts aimed at a target audience analogous to that of the ST. They were free to produce the human TT or post-edit the MT text without any time constraint, but they could not use any kind of translation aids. As they had little to no experience in post-editing, we provided them with the guidelines reported in Mesa-Lao (2014, 225); see also Sects. 11.3.5, 11.7 and 11.8, and Chap. 13 in this volume.

After each task, the participants were asked to provide a retrospective protocol, in which they could comment on anything concerning their translation or post-editing, such as difficulties, challenges, strategies, or doubts. The retrospective protocols were carried out by means of the Translog-II Supervisor replay function (Jakobsen 2011, 39).

3.6 Processing of Chinese Data

The CRITT TPR-DB contains four study folders with translation and post-editing data involving both the Chinese language and a language written in the Latin alphabet (see Appendix 1, Chap. 2). As Chinese uses a logographic script that does not require blank spaces between characters (Zang et al. 2011), processing the Chinese language data involved additional procedures so that they could be automatically analyzed using the Study Analysis script [Footnote 3]. These procedures are described in Sects. 11.3.6.1 and 11.3.6.2.

3.6.1 Chinese Input System

A logographic language like Chinese requires an input method [Footnote 4] with a graphical user interface (GUI), which converts sequences of alphabetic letters into Chinese characters. The participants used Sogou [Footnote 5] as their Chinese input method.

Figure 11.2 shows a snapshot of a post-editing session. In the bottom half of the window, Sogou’s dialog box pops up below the line where the participant wants to introduce new characters. A zoom-in shows that, while the participant types a sequence of alphabetic letters in pinyin [Footnote 6], a number of candidate options are displayed, one of which may be the desired character(s). To select the desired characters and insert them in Translog-II, the participant presses the space bar or the corresponding number key.

Fig. 11.2 Snapshot of a post-editing session showing Sogou’s dialog box. Note: the circle indicates a fixation

As shown in Fig. 11.2, the Chinese input system is prone to gaze-to-word mapping errors, since the place where the pinyin is typed is not the same place where the Chinese characters are inserted. Since Sogou’s dialog box pops up right below the space where the characters are to be inserted, we manually [Footnote 7] attribute each such fixation to a specific word through Translog-II Supervisor [Footnote 8], with the support of the Tobii Studio replay function.

3.6.2 Chinese Tokenization and Alignment

The procedures to analyze Translog-II data can be retrieved from the CRITT website [Footnote 9]. However, some additional steps were required to tokenize and align the data because of Chinese language specificities. As Chinese texts are written as a stream of characters without blank spaces, there is no explicit delimiter with which to identify word boundaries and automatically tokenize the data using the “StudyAnalysis.pl tokenize” script. To tackle this problem, we devised an alternative workflow and developed applications to (semi-)automate the process.

Figure 11.3 shows the conventional and the alternative workflows; the alternative workflow contains the additional steps 0, 2.5, and 3, as shown in Fig. 11.3b. First, we fix the incorrect gaze data (see Sect. 11.3.6.1) and save each log file as a new *.xml file. Second, a Chinese tokenization step (Step 2.5) is added after the extraction of the text data in Step 2: we use the in-house tool ChiSegmentor (Leong et al. 2006; Zeng et al. 2013) to automatically identify word boundaries and then manually revise the output drawing on the Modern Standard Chinese Dictionary (Li 2010); a sketch of this segmentation step is given below. This information is recorded in the corresponding log file (*.src or *.tgt).
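Since ChiSegmentor is an in-house tool, the minimal sketch below uses the openly available jiebaR segmenter instead, purely to illustrate the shape of the automatic word-boundary step; the example sentence and its segmentation are ours, not taken from the experimental text.

    # Illustrative stand-in for the tokenization step (Step 2.5). The
    # chapter uses the in-house ChiSegmentor; jiebaR is an open-source
    # segmenter shown here only to convey the shape of the operation.
    library(jiebaR)

    seg    <- worker()                        # default segmentation model
    tokens <- segment("中国黄金产量持续增长", seg)
    # e.g. "中国" "黄金" "产量" "持续" "增长"; the automatic output is
    # then revised manually against a reference dictionary before being
    # written back to the *.src / *.tgt files.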

Fig. 11.3 Procedures to prepare Translog-II data for “Study Analysis”: (a) conventional process for Roman-alphabet languages; (b) adapted process for the Chinese language

Another change to the workflow concerns the alignment step, for which we use LexAligner (Tian et al. 2011) to automatically estimate possible word alignments. To check the alignments, we draw on the criteria provided by the Guidelines for Chinese-English Word Alignment (Li et al. 2009). Because translation renditions result from the processing of an ST (Mossop 2003), we align all ST and TT tokens.

Finally, we run the “StudyAnalysis.pl tables” script to extract several kinds of simple and compound process and product units, which are represented in tables (see Chap. 2 for details). From these units it is possible to generate Translation Progression Graphs (TPGs) using the R environment for statistical computing.

It is worth noting that Chinese and Portuguese tokens sometimes raise encoding conflicts that prevent us from generating TPGs. To overcome this problem, which is identified while running R to generate the TPGs, we replace the problematic character with its pinyin for Chinese and with “a” for Portuguese [Footnote 10].
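A minimal sketch of this replacement, with example characters of our own choosing, might look as follows:

    # Sketch of the encoding workaround: a character that breaks TPG
    # generation is replaced by its pinyin (Chinese) or by "a"
    # (Portuguese). The specific characters below are only examples.
    fix_zh <- function(x) gsub("\u91d1", "jin", x, fixed = TRUE)  # 金 -> "jin"
    fix_pt <- function(x) gsub("\u00e3", "a", x, fixed = TRUE)    # ã -> "a"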

3.6.3 Chinese Production Data Extraction

For technical reasons related to the Chinese input method [Footnote 11], which is external to Translog-II, the system logs only the text modifications (delete and space keys), but not the actual keystrokes (see Sect. 11.3.6.1). Table 11.4 shows examples of incorrect numbers of character insertions in the target tokens (TToken) (see Sect. 2.4): in the column Ins, the number “2” refers to the Chinese characters shown on the screen, whereas the column should record the number of keystrokes actually typed (five insertions for TTid50 and nine insertions for TTid51; see Fig. 11.2).

Table 11.4 Example of wrong log of keystroke insertions and deletions

A problem also occurs with the logged duration of the production of Chinese characters (as reported in Balling and Carl 2014, 260). As shown in Table 11.5, it is common to find long pauses and short production times (most of them of 1 ms) when, in fact, the participant made virtually no pause and took longer to produce the given token (TToken).

Table 11.5 Example of wrong log of pause and duration (Dur)

To solve the aforementioned problems, we use the Tobii Studio replay function to identify exactly when the participants started and stopped typing the keystrokes corresponding to each word logged in Translog-II. While doing this, we also count the actual keystrokes that the participants pressed to produce the characters logged in Translog-II. Then, we manually correct the TT tables.

We also observed that the edited units are coherently aligned with the respective STid and TTid in most cases. However, because Translog-II aligns only the initial ST and the final TT, when characters of the MT text are deleted and/or immediately edited (such as Edit 1 [], which represents the deletion of [that amount] in TTid49, Table 11.6), the system does not identify which words they were originally part of (one part of Edit 1 in TTid49 should belong to TTid50 [that], and another part [amount] should belong to TTid51). To account for this, we check all actual Edit1 and Edit2 operations to identify the actual ST and TT tokens (STokens and TTokens).

Table 11.6 Example of intermediate production alignment problem (Edit1)

In sum, a substantial part of our production analysis was built on manually processed data. To ensure quality, all manually extracted data were double-checked. These spreadsheets are available in the TPR-DB.

4 Data Analysis

4.1 Research Question

Our main objective in this study is to compare the cognitive effort demanded for translating and post-editing the selected tokens in the main cohesive chain (chain A) in the ST, which is built on participant tracking, with the cognitive effort demanded for processing selected tokens in a secondary cohesive chain (chain B) (see Sect. 11.3.4).

Our first research question is “Is it cognitively more demanding to understand and produce a cohesive chain that is built on participant tracking than a secondary cohesive chain?” This question is based on the assumption that participant tracking is crucial to construing a coherent representation of a text (Halliday and Hasan 1976).

Our second research question is “In dealing with cohesive chains, is it cognitively more demanding to translate than to post-edit?” Given that translation takes longer than post-editing (Balling and Carl 2014; Mesa-Lao 2014), it is possible that processing cohesive chains during post-editing is also faster or that cohesive chains are processed differently in the two tasks.

4.2 Statistical Analysis and Variables

Our investigation is divided into three statistical analyses: (1) eye movements on chain A and chain B in the ST, (2) eye movements on chain A and chain B in the TT, and (3) keyboard movements relating to chain A and chain B.

In the following, we describe the variables of analysis in the order they were included in the statistical model (for details on the model, see Sect. 11.4.3).

The dependent variables [Footnote 12] for analyses (1) and (2) were:

  • Total reading time on ST and TT token (TrtS and TrtT);

  • Number of fixations on ST and TT token (FixS and FixT); and

  • First pass duration on ST and TT token (FPDurS and FPDurT).

The dependent variable for analysis (3) was the total production time of the TT token, measured as any pause preceding the TT token plus its duration (Dur) (see Sjørup 2013, 126–127 for further details).
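In terms of the TPR-DB TT tables, this amounts to a simple sum per token; the column and data-frame names below mirror those tables but should be read as illustrative.

    # Hedged sketch: total production time of a TT token is the pause
    # preceding the token plus its typing duration (Dur), following
    # Sjørup (2013, 126-127). Column names are illustrative.
    tt_table$ProdTime <- tt_table$Pause + tt_table$Dur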

Our analysis investigates how the dependent variables vary as a function of several explanatory variables, as described below. For further details, see Baayen (2008) and Balling (2008).

The first group of explanatory variables consists of random effects. Random factors are not repeatable and are assumed to have been selected randomly from a given population (Baayen 2008, 241). As such, we included participant and item, i.e., the selected ST and TT tokens of chains A and B (see Sect. 11.3.4).

The second group of explanatory variables consists of fixed effects, which refer to factors with repeatable levels (Baayen 2008, 241). They were included to account for effects reported in previous studies or for their importance for TT cohesion. Due to space restrictions, we report only those that were significant in our model (see Table 11.10 in Appendix 1).

Four fixed effects were used in the analysis of both eye and keyboard movements as proxies for processing of chains A and B in both ST and TT:

  • Token Length: The length in characters of the ST and TT tokens of chains A and B was expected to affect the dependent variable, as longer words generally receive longer fixations than shorter words (e.g. Rayner 1998; Hyönä et al. 2003; Staub and Rayner 2007);

  • Token Position: The position of the ST and TT tokens of chains A and B in the text was expected to have an effect on the participants’ gaze behaviour. Such an effect may be due to fatigue (e.g. Rayner 1998; Balling 2008, 2013) and/or to a priming effect (Rayner 1998, 390; Staub and Rayner 2007, 331), which, following Halliday and Hasan (1976), may imply that the beginning of a text deserves more attention because it determines the understanding of the rest of the text and contains items that will serve as referents for items further on (see also Chap. 9);

  • Token Unigram Frequency: Readers are expected to fixate longer on low-frequency words than on high-frequency words (e.g. Rayner et al. 2005; Rayner 1998). The Corpus of Portuguese [Footnote 13] and the Corpus of the Peking University Center for Chinese Linguistics [Footnote 14] were used to measure frequency;

  • Token Trigram Probability: High predictability of word association was expected to have an impact on processing effort (Frisson et al. 2005). The variable was computed following McDonald and Shillcock (2003, 650), considering the selected token and the two preceding tokens as they occur in the text (Balling 2013); a sketch of this computation is given right after this list.
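As a rough illustration of the trigram-probability variable, the sketch below estimates the probability of a token given its two predecessors from raw corpus counts; the unsmoothed ratio is our simplification, not McDonald and Shillcock’s exact formulation, and the counts are invented.

    # Rough sketch of a trigram-probability estimate, P(w3 | w1, w2),
    # from raw corpus counts. The unsmoothed ratio is illustrative; a
    # real estimate over sparse corpus counts would require smoothing.
    trigram_prob <- function(count_w1w2w3, count_w1w2) {
      count_w1w2w3 / count_w1w2
    }

    # The model then enters the natural log of the estimate:
    log(trigram_prob(12, 340))   # counts invented for illustration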

Two variables were added to the analysis of TT Token production, namely:

  • Token Character Count: sum of insertions and deletions of TT Token (see Sect. 11.3.6.3); and

  • Correctness of Token in the Chain: TT Tokens were assessed as right or wrong in relation to the ST Tokens; right TT Tokens were assumed to be instances in which “patterns of lexical cohesion in texts are maintained, subject to the constraints of particular text norms in particular languages” (Hatim and Mason 1990, 200).

Two other fixed effects were included to directly answer our research questions:

  • Task: to investigate the effect of translating or post-editing on ST and TT comprehension and on TT production; and

  • Type of Chain: to investigate the effect of chain A and chain B on ST and TT comprehension and on TT production.

All continuous variables were naturally logarithmically transformed.

4.3 Data Analysis: Statistical Models

Following the methods used in Balling and Carl (2014, 250ff.) and Sjørup (2013), we applied a linear mixed-effects regression model (LMER) as implemented in the lme4 package (Bates et al. 2014) in the R environment for statistical computing (version 3.1.2, R Development Core Team 2014). The final model, containing only the significant variables, is summarized in the appendix. Table 11.10 in Appendix 1 provides the fixed effects, with variable names in the first column, estimated effect sizes in the second, the standard errors of these estimates (indicating the amount of variation in the data) in the third, t-values in the fourth, and the associated p-values in the fifth. We set the significance level at p ≤ 0.05.
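For concreteness, a schematic lme4 call with the structure described above might look as follows; the variable and data-frame names are illustrative rather than the chapter’s actual code, and the final model retains only the significant predictors reported in Table 11.10.

    # Schematic LMER specification mirroring the model structure in the
    # text; variable and data-frame names are illustrative.
    library(lme4)

    m <- lmer(log(TrtS) ~ log(TokenLength) + log(TokenPosition) +
                log(TrigramProb) + Task + Chain +
                (1 | Participant) + (1 | Item),
              data = st_tokens)

    summary(m)  # estimates, standard errors and t-values as in Table 11.10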

Table 11.11 in Appendix 1 shows the random effects part of the model. Random effects are not associated with p-values, but are included in the model in order to estimate individual effects and dependencies between observations. The standard error of the effects indicates how much variation the different levels capture.

5 Results and Discussion

In this section, we report the fixed effects that had a significant impact on the dependent variables.

5.1 Comprehension: Eye Movements along Chain A and Chain B in the ST

Table 11.7 summarizes the effects (√) that had a significant impact on our dependent variables related to ST comprehension. Type of chain and type of task were non-significant for all dependent variables relating to source text processing.

Table 11.7 Summary of significant results for ST comprehension

ST Token length had a significant effect on total reading time on ST Token (Fig. 11.4) and number of fixations on ST Token (Fig. 11.5). This confirms claims in the literature (e.g. Rayner 1998, 387; Sjørup 2013, 140) that the longer the word or words, the longer the gaze time and the higher the number of fixations.

Fig. 11.4 Total reading time on ST Token (TrtS, in ms) vs. length of ST Token in the chain (character count, naturally logarithmically transformed)

Fig. 11.5 Number of fixations on ST Token (FixS) vs. length of ST Token in the chain (character count, naturally logarithmically transformed)

Position of ST Token had a significant effect on total reading time (Fig. 11.6) and on the number of fixations on ST Token (Fig. 11.7). ST Tokens in initial positions were gazed at longer and more often than tokens towards the final positions. Since the type of chain had no significant impact on eye movements on the ST and the text was relatively short, the results seem to indicate a priming effect: the initial items in the chains receive more attention because the beginning of the text is crucial for orienting the participants’ understanding of the entire text (Halliday and Hasan 1976).

Fig. 11.6 Total reading time on ST Token (TrtS, in ms) vs. position of ST Token (naturally logarithmically transformed)

Fig. 11.7 Number of fixations on ST Token (FixS) vs. position of ST Token (naturally logarithmically transformed)

The probability of word association, measured through trigram probability, had a significant effect on total reading time on ST Token (Fig. 11.8) and on the number of fixations on ST Token (Fig. 11.9). This confirms findings in the literature (Frisson et al. 2005; McDonald and Shillcock 2003) that the more probable or common a word or expression is, the shorter and the less often it is fixated.

Fig. 11.8 Total reading time on ST Token (TrtS, in ms) vs. trigram probability (naturally logarithmically transformed)

Fig. 11.9 Number of fixations on ST Token (FixS) vs. trigram probability (naturally logarithmically transformed)

The probability of a three-word combination also had a significant effect on first pass duration (Fig. 11.10), which may be indicative of the processing of higher-level information (see Staub and Rayner 2007, 329). According to our results, the less probable the occurrence of such a combination, the longer it was fixated during the participants’ first pass over it (see the effect of trigram probability on FPDurS).

Fig. 11.10 First pass reading time on ST Token (FPDurS) vs. trigram probability (naturally logarithmically transformed)

5.2 Comprehension: Eye Movements along Chain A and Chain B in the TT

Table 11.8 summarizes the effects (√) that had a significant impact on the dependent variables related to TT comprehension. Each variable had an impact on only one of the dependent variables. The type of task had no impact on any of the dependent variables regarding eye movements on the TT, and no variable had an impact on first pass duration on TT Token.

Table 11.8 Summary of significant results for TT comprehension

Token length was significant for the number of fixations on TT Token: the longer the token, the greater the number of fixations (Fig. 11.11). Unigram frequency was significant for total reading time on TT Token (Fig. 11.12): there seems to be a tendency to fixate more on both the most and the least frequent words; the greater attention to frequent words may be related to their role in the chain, as we are observing instances of participant tracking. Type of chain was significant for total reading time on TT Token: tokens in chain A were fixated longer than those in chain B (Fig. 11.13).

Fig. 11.11 Number of fixations on TT Token (FixT) vs. length of TT Token in the chain (naturally logarithmically transformed)

Fig. 11.12 Total reading time on TT Token (TrtT, in ms) vs. unigram frequency of TT Token (naturally logarithmically transformed)

Fig. 11.13 Total reading time on TT Token (TrtT, in ms) vs. type of chain (A and B)

5.3 Production: Keyboard Movements for Producing Chain A and Chain B

Table 11.9 summarizes the effects (√) that had significant impact on the dependent variable “duration of token production time.”

Table 11.9 Summary of significant results for TT production

The more characters the participants inserted or deleted, the longer the duration of their token production time (Fig. 11.14). The participants took longer to produce a wrong token in the chain than a right one (Fig. 11.15), which may be related to the number of renditions they tried out while uncertain about an adequate solution. The participants also took longer to produce the items in chain A (Fig. 11.16).

The longer time for producing tokens in chain A may be indicative of hesitation, of a need for internal support to make decisions, and of on-line revision. These results seem to be consistent with our previous findings (Sect. 11.5.2). The participants’ verbalizations also showed that they found it difficult to render some items in chain A. The type of task did not have any significant effect on the results, which suggests that the processing of cohesive ties is similar in from-scratch translation and post-editing.

6 Summary and Future Directions

We set out to answer two research questions: (1) Is it cognitively more demanding to understand and produce a cohesive chain that is built on participant tracking than a secondary cohesive chain? And (2), in dealing with cohesive chains, is it cognitively more demanding to translate than to post-edit?

Fig. 11.14 Duration of token production time (Dur, in ms) vs. character count (naturally logarithmically transformed)

Fig. 11.15 Duration of token production time (Dur, in ms) vs. correctness of token (R = right, W = wrong)

Fig. 11.16 Duration of token production time (Dur, in ms) vs. type of chain

As for question 1, our results show a significant effect of the type of cohesive chain on eye movements on the TT, which lets us infer that producing a chain built on participant tracking is cognitively more demanding as far as the TT is concerned; no significant effect, however, was observed for the metrics related to the ST.

We expected the results to be significant for the ST as well, especially because we assumed that keeping track of participants in the main chain of a text would be challenging both for understanding the ST and for producing the TT. However, both translation and post-editing involve shifting gaze between ST and TT, and the reading of the TT may therefore be closely connected to the understanding of the ST. Future studies should include transitions from ST to TT and vice versa as an additional measure of effort and investigate whether ST and TT comprehension should be addressed as a single event. Measures of global processing should also be developed and tried out to account for “relationships between pieces of text information that span relatively long distances in a text” (Hyönä et al. 2003, 314), especially across sentences.

Furthermore, in a larger-scale study involving the four tasks for which we collected data, we intend to check whether the order in which the tasks were carried out had an effect on the participants’ processing. As we collected all data from each participant on the same day, there might have been an effect of fatigue on the results (e.g., the total time of the last sessions seems to be shorter than that of the first sessions).

As for question 2, the results pointed to no significant impact of the type of task on the measures assessed in this chapter. We see several potential, non-mutually exclusive explanations for this. One is experience: none of the participants had post-editing experience. Another is that our design is between-subjects, so differences in the results may stem from differences between the groups; moreover, the sample size is small. Considering previous studies that do show significant differences between post-editing and translation (e.g. Balling and Carl 2014; Mesa-Lao 2014), it is also possible that the type of task affects the transitions across the ST and TT areas, or the global processing of the entire text, rather than particular cohesive chains. A final possibility concerns MT quality: as reported in the protocols, the participants found the MT text ambiguous at some points, and the referents of some MT tokens could not be retrieved, even exophorically.

To address some of the aforementioned limitations, in future work we intend to analyze more than one task and use a within-subject design, which will allow us to have data from participants who both translated and post-edited. We also intend to compare the tasks considering the entire text and a larger volume of data (including four texts). Following Alves et al. (2014), we further intend to perform a more fine-grained analysis by qualitatively examining the renditions and their processing as shown, for instance, in scan paths.

Besides answering the research questions, the workarounds we developed to cope with the limitations of processing Chinese language data are also a contribution of this chapter. We hope that the procedures reported herein facilitate further studies involving the Chinese language and that our results awaken the interest of new scholars in language pairs other than those involving only alphabetic scripts. Although using the same language pairs and scripts may have methodological advantages (e.g., one language, English, is kept as a standard for comparison, and blank spaces unambiguously delimit words), it not only overlooks the insights that other language pairs and scripts may add to our understanding of (non-)language- and (non-)script-specific cognitive aspects of post-editing and translation, but also prevents future generalizations based on a comprehensive body of research encompassing multiple languages, language pairs, and scripts.