Introduction

Titles, although they are the smallest component of academic publications, play a critical role in bringing about desirable communication between the writer and the potential reader (Haggan 2004; Ball 2009; Cheng et al. 2012). The significance of titles has been well recognized. They serve as an “eye catcher” and may influence the reader’s decision about whether to continue reading, or that of the editor about whether to consider the manuscript for publication. In some situations, titles, especially those of articles in scientific journals, are self-explanatory and mirror a set of requisites which are crucial to the construction, communication, and progress of new knowledge (Soler 2007). Besides, a well-phrased title enables an article to be more “findable” by incorporating keywords and index terms required in online databases (Fox and Burns 2015). The study of titles was first termed titrologie in French by Duchet (1973, as cited in Salager-Meyer and Alcazar-Ariza 2013, p. 298). It was later formally introduced in English as “titleology” by Baicchi (2003). He further elaborated the relational complexity of titles and texts through a semiotic taxonomy and declared “titleology” as a new field of study. Early title research during the 1970s mainly concentrated on written literary texts (Soler 2011; Salager-Meyer and Alcazar-Ariza 2013) and barely extended to other domains (Netterville and Hirsch 1958; Wilsmore 1987). However, scholarly attention began to be paid to titles of academic publications in 1990 when Swales mentioned in his book Genre analysis: English in academic and research settings that titles in academic genres were not fully addressed and were worth being further explored. Since then, extensive research has been conducted on academic English (Swales 1990; Yitzhaki 1994, 1997), and specifically on English titles in various academic genres. For example, titles of medical literature, including medical case reports, were broadly investigated (e.g. Wang and Bai 2007; Jacques and Sebire 2009; Habibzadeh and Yadollahie 2010; Salager-Meyer and Alcazar-Ariza 2013). He (2000) examined the information transfer in the translation of medical article titles between English and Chinese. Furthermore, conference paper titles have also been studied (Afful 2017). Within the academic genres, unsurprisingly, journal article titles have received the most attention among researchers in recent years (Ball 2009; Guo et al. 2015). Title research has so far been conducted in a wide range of disciplines. However, while some disciplines, such as medicine and engineering, have been adequately explored, others, such as linguistics and literature, still call for further investigation. Therefore, the present study intends to examine the long-range trend of the past four decades and disciplinary variations concerning title features of research articles published in linguistics and literature journals.

Literature review

Title research has covered both form features and content attributes of journal article titles. Form features are generally classified into three types—title length, syntactic structure and the use of punctuation marks, while content attributes include prominent information types provided in titles, title informativity and lexical diversity.

Previous research on title length has mainly focused on the variations in title length over time (Whissell 2013) and the correlations between title length and other title features, such as article length (Yitzhaki 2002), author numbers (Kuch 1978; White 1991; Yitzhaki 1994) and citation rate (Nair and Gibbert 2016; Gnewuch and Wohlrabe 2017). All diachronic studies of title length seemed to conclude that titles had become increasingly longer with time (Yitzhaki 1994; Méndez et al. 2013), though title length might vary across disciplines. For example, Méndez et al. (2013) found that titles in natural sciences articles were longer than those in social sciences, therefore presenting more extensive information content. On the other hand, results have been inconsistent in correlational research, not only across disciplines but within sub-disciplines belonging to the same broad area of knowledge. For example, Rostami et al. (2013) found that title length of articles published in Addictive Behaviors Journal had no obvious relation to citation rate while Paiva et al. (2012) came to the conclusion that shorter titles gathered more citations in Biomed Central (BMC) journals. However, a few studies did confirm the results of previous studies or even supplemented further findings. The study conducted by Jacques and Sebire (2009) suggested a strong positive correlation between title length and citation rate in medical research journals, which was in line with the finding in Habibzadeh and Yadollahie’s study (2010). They concluded that longer titles of medical research articles had a positive impact on the number of citations.

Another high-profile form feature is syntactic structure. Several models for analyzing syntactic structure based on different taxonomies have been widely employed. Soler (2007) identified four main syntactic structures used in titles, including nominal, full-sentence, compound and question constructions. Cheng et al. (2012) classified syntactic structure into compound structure, nominal structure, V-ing phrase structure, full sentence structure and prepositional phrase structure. Moattarian and Alibabaee (2015) adapted the existing model proposed by Dietz in 1995 (as cited in Busch-Lauer 2000) and examined the frequency of nominal, verbal and prepositional construction of titles in three disciplines. Their results revealed a marked dominance of nominal constructions over other structures. Meanwhile, some researchers focused on the subdivision of only one syntactic structure type. Wang and Bai (2007) grouped the nominal structure into three types, which are Uni-head nominal group, Bi-head nominal group, and Multi-head nominal group, and they counted the frequencies of each type. Existing research on syntactic structure seemed to show that nominal construction is the most recurrent structural construction with an overwhelming majority. Various disciplines were examined, in which articles in medicine accounted for the majority (Wang and Bai 2007; Ball 2009). A contrastive approach was often employed in the study of syntactic structure, featuring either several subject fields belonging to different disciplines or two broad-spectrum disciplines. For example, Moattarian and Alibabaee (2015) compared syntactic structure of article titles from applied linguistics, civil engineering, and dentistry. Soler’s (2007) study contrasted syntactic structure of journal article titles between sciences, including medicine, biology and biochemistry, and social sciences, including linguistics, psychology and anthropology. 
However, most investigations were synchronic studies without observing the long-range trends.

The third category of research on form features focuses on punctuation marks, which are used to establish sentence boundaries, thereby identifying different types of titles. Haggan (2004) defined three basic types of titles, including full sentence (e.g. Was Spenser a Republican?), compound (e.g. Circling the spheres: A Dialogue) and a remaining group made up largely of noun phrases with or without postmodification (e.g. Neo-Petrarchan Kitsch in Romeo and Juliet). According to the taxonomy proposed by Méndez et al. (2013), compound titles comprised a general heading and a specific theme generally separated by a colon, a dash, a full stop, or written in two different lines, while simple titles only consisted of a general heading. Lewinson and Hartley (2005) reported that titles with colons were longer and more informative than those without. Hartley (2007) found that there was a greater use of colons in arts and humanities than in sciences. His finding was also verified in Nagano’s (2015) investigation. Furthermore, researchers were interested in the relationship between the use of punctuation marks and citations (Rostami et al. 2013; Paiva et al. 2012; Nair and Gibbert 2016) and the change of punctuation marks in titles with time (Ball 2009; Hartley and Cabanac 2015). Rostami et al. (2013) discovered that articles (from Addictive Behaviors) with titles using hyphens or colons had a higher number of citations, while Nair and Gibbert (2016) summarized that the use of punctuation marks and citations did not seem to correlate in management science articles. Studies by Ball (2009) and Hartley and Cabanac (2015) both came to the conclusion that there was a growing inclusion of question marks in scientific article titles and in Hartley’s academic papers.

Content-related title research has mainly investigated information types provided in titles based on different taxonomies. Anthony (2001), as one of the earliest scholars who studied the content of academic article titles, proposed a set of categories named “rhetorical combination” based on Hamp-Lyon’s (1987) language assessment criteria. In his taxonomy, Name-Description, Description-Name, Topic-Description, Topic-Scope, and Topic-Method were used to describe the content of compound titles. Based on Anthony’s model, Cheng et al. (2012) developed a more precise classification, consisting of eleven information types. Salager-Meyer and Alcazar-Ariza (2013) identified titles that contain a statement of purpose, method and/or outcome as Research Procedure Titles, and explored the long-range trend of this attribute in titles extracted from the British Medical Journal. Sahragard and Meihami (2016) established a framework by adding definition to each feature of the prototype of Goodman et al. (2001). They identified five elements conveyed by titles: topic only, method/design, dataset, result and conclusion, and applied this framework to their diachronical study on the research titles of applied linguistics journal. McGowan and Tugwell (2005) as well as Gjersvik and Nylenna (2014) examined whether article titles in the medical field were “declarative”, i.e., whether titles revealed the research conclusion. The research on title content revealed information trends across different journals or disciplines, whether they showed similar or different patterns, and provided suggestions to novice writers on how to formulate their article titles.

Since Diener (1984) investigated the informational dynamics in journal article titles, an increasing number of scholars have begun to study the informativity of article titles. Title informativity is of great importance, and highly informative titles tend to perform their functions more effectively (Yitzhaki 1994). This acknowledgement thus promoted increased attention to the study on the informativity of journal article titles (Buxton and Meadows 1977; Nagano 2015). Related research commonly employed the method of counting the number of “substantive” words. Tocatlian (1970) conducted a diachronic analysis of informativity of chemical paper titles. He defined non-substantive words as words that convey little or no information, such as articles, prepositions, conjunctions, pronouns, and auxiliary verbs. This objective approach was then extensively used by scholars to study title informativity. Journals of various subject fields or disciplines, such as ecology (Rodríguez and Moreiro 1996), humanities (Yitzhaki 1997) and multi-disciplines (Buxton and Meadows 1977) have been investigated. Existing studies on the informativity of journal article titles were all diachronic, consistently indicating the increase of the number of substantive words in titles with time.

Lexical diversity is also an important parameter in the content of titles. According to Bérubé et al. (2018), lexical diversity is an indication of whether academic discourse develops towards a more disparate or concentrated vocabulary. On the one hand, it seems that the advancement of a scientific discipline may bring about a diversification of its terminology. On the other, stabilizing the vocabulary is necessary to ensure effective transmission of knowledge within a subject field (Jacob 2004). This contradiction naturally calls for empirical studies on the variation trend of lexical diversity of scholarly discourse, particularly journal article titles. However, only a few studies can be found addressing this question. In 2015, Milojević tracked the evolution of vocabulary diversity in three scientific fields, namely, physics, astronomy and biomedicine, to quantify the extents of cognitive domains of different bodies of scientific literature independently from publication volume. Results showed that while the publication rates continued to grow exponentially, the number of distinct noun phrases expanded on linear scales “within a factor of few” (Milojević 2015). Later, Bérubé et al. (2018) examined the trend of lexical diversity of scholarly titles. The findings of this study revealed that titles in the fields of natural sciences & engineering and social sciences & humanities incorporated the use of an increasingly restricted and cross-disciplinary set of words, which was indicated by the slight decrease of lexical diversity. Up to now, the most frequently-used and intuitive way of measuring lexical diversity on the basis of word repetition patterns is type-token ratio (hereafter TTR) (Bérubé et al. 2018). Unfortunately, TTR is a function of sample size, which means a larger sample size could lead to a lower TTR. Other commonly used measures derived from TTR are also problematic although they are claimed to be independent of sample size (Malvern and Richards 2002). In 2004, Malvern et al. proposed Measure of Textual Lexical Diversity (MTLD) to deal with the sample size dependency of TTR through fixed size sampling procedures. Later Milojević (2015) proceeded by calculating the number of different noun phrases within consecutive segments of 1500 noun phrases, which was similar in kind to the method advocated in Malvern et al.’s (2004) study. However, according to Bérubé et al. (2018), such fixed size sampling techniques did not solve the sample size dependency problem at all, but merely transformed it into a sampling scheme dependency problem. In Bérubé et al.’s (2018) study, the authors adopted a new indicator based on Zipfian frequency-rank distribution tail fits, which was claimed to be more independent of corpus size than other lexical diversity indicators. Still, in the same study, Bérubé et al. (2018) pointed out that there is no universally agreed method of standardizing samples to avoid the limitation of TTR yet (Malvern et al. 2004). But they also clarified that lexical diversity variables measured by TTR would still be valid in cases where language segments are of relatively small size. Nevertheless, they did not define what is “small enough”. Given the considerably smaller sample size in the present study compared to the previously-mentioned studies on lexical diversity of titles (e.g. Milojević 2015; Bérubé et al. 2018), TTR will be used to measure lexical diversity of linguistics and literature article titles.

Decades of title research has yielded fruitful results, but limitations still exist. First, some disciplines received more scholarly attention than others. Medical journals were explored the most comprehensively with an uncontested conclusion that medical article titles tend to be long, precise, and informative (Méndez et al. 2013). In contrast, a rather limited number of studies were conducted on language-related journals, such as applied linguistics journals (Cheng et al. 2012; Sahragard and Meihami 2016) and literature journals (Diener 1984; Yitzhaki 1994; Haggan 2004; Cook and Plourde 2016). Even within the same discipline, imbalanced attention was paid to different sub-fields. For example, in linguistics, unlike applied linguistics journals, theoretical linguistics journals have seldom been examined. Second, in comparative title studies, researchers usually chose to compare sciences or engineering with social sciences or humanities (e.g. Buxton and Meadows 1977; Haggan 2004; Soler 2007; Méndez et al. 2013; Moattarian and Alibabaee 2015), but seldom between humanities and social sciences (Yitzhaki 1994). However, differences in title features may still exist among these two closely related disciplines or subject fields although social sciences and humanities both study the human aspects of the world. Third, most of the existing title research concentrated on the correlation between various title features and citations. However, the results of these studies were sometimes inconsistent (Haggan 2004; Hartley 2008; Soler 2011). A change of approach then might reveal some interesting and new findings. Hence, the present study intends to investigate research article titles in linguistics (social sciences) and literature (humanities) journals between 1980 and 2018 from a diachronic and comparative perspective. 
Specifically, we will study both form features and content attributes by examining titles from seven leading journals in each of the two disciplines. Our study will therefore address the following two research questions:

  1. Are there any diachronic changes in title features of the two disciplines respectively between 1980 and 2018?

  2. Do title features in linguistics journals and literature journals exhibit a similar pattern?

Methodology

Data

We adopted a two-step approach to compiling the corpus. Firstly, we selected journals following the criteria below:

  1. The linguistics journals are all SSCI-indexed (Social Sciences Citation Index) and the literature journals are all A&HCI-indexed (Arts & Humanities Citation Index);

  2. The journals should have begun publishing by 1980;

  3. The journals publish articles not limited to only one particular research area;

  4. Considering the differences between linguistics and applied linguistics in terms of definition, principles, scope, focus, etc., the linguistics journals we selected generally publish research on language itself rather than applied studies, such as language teaching.

Taking the above four criteria into consideration, we selected the following fourteen journals (see Table 1).

Table 1 The 14 selected journals

In the second phase, we divided the years between 1980 and 2018 into four periods (A: 1980–1989; B: 1990–1999; C: 2000–2009; D: 2010–2018). To mitigate sampling bias, we collected research article titles from all the volumes in 1988, 1998, 2008 and 2018 respectively to represent each period, excluding book reviews, review articles, announcements, notes, squibs and discussions, remarks and replies, as well as forum articles. Articles written in languages other than English were not included either. The numbers of titles collected from linguistics journals and literature journals are 679 and 685 respectively. Therefore, our corpus contained a total of 1364 titles. Table 2 presents the specific information of this corpus.

Table 2 The corpus data

Data analysis methods

The present study will specifically investigate form features (average title length and syntactic structure) and content attributes (information types, lexical diversity and informativity) of journal article titles. Unlike Méndez et al. (2013) and Nair and Gibbert (2016), we adopted both manual counting and automated calculation using the software WordCounter to measure title length. The two results were then manually checked for consistency. Title length refers to the number of words in a title. A word, in our research, refers to a unit occurring with space or punctuation on either side. Acronyms or abbreviations combining capital letters and figures are counted based on the number of capitalized initials (as well as the number of figures). For example, “EFL” (English as a Foreign Language) is counted as three words; “L2” (Second Language) is counted as two words. Considering that the semantic components of hyphenated words in the disciplines of linguistics and literature are usually interpreted as a whole, a hyphenated word in this corpus is counted as one word.
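The counting rules above can be illustrated with a minimal Python sketch. It is not the WordCounter procedure used in the study; the function names, the regular expressions, and the decision to treat each run of digits as one figure are assumptions made for illustration only.

```python
import re

# A word is a run of letters/digits; an internal hyphen or apostrophe
# keeps the unit together, so a hyphenated word counts as one.
WORD_RE = re.compile(r"[^\W_]+(?:[-'’][^\W_]+)*")

# Assumption: an acronym/abbreviation is two or more characters drawn only
# from capital letters and digits, with at least one capital letter.
ACRONYM_RE = re.compile(r"(?=.*[A-Z])[A-Z0-9]{2,}")


def word_units(token: str) -> int:
    """Return how many words a single token contributes to title length."""
    if ACRONYM_RE.fullmatch(token):
        capitals = sum(1 for ch in token if ch.isupper())
        # Assumption: each run of digits counts as one figure.
        figures = len(re.findall(r"\d+", token))
        return capitals + figures
    return 1


def title_length(title: str) -> int:
    """Count the words in a title under the rules described above."""
    return sum(word_units(tok) for tok in WORD_RE.findall(title))


print(title_length("EFL"))  # three capitalized initials
print(title_length("L2"))   # one capitalized initial plus one figure
print(title_length("Emphasis Harmony in a Modern Aramaic Dialect"))
```

One limitation of this sketch is that any fully capitalized word would be treated as an acronym, so titles set in capitals would need to be normalized first.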

For the purpose of the present study, existing frameworks were employed to examine the other title features manually. Specifically, Sahragard and Meihami’s (2016) framework (see Table 3), adapted from Goodman et al. (2001), was used to investigate the information type of titles. The types include topic only, method/design, dataset, result, and conclusion. It is worth mentioning that many titles contain more than just one information type. For instance, the title Evaluating S(c)illy Voices: The Effects of Salience, Stereotypes, and Co-present Language Variables on Real-time Reactions to Regional Speech includes both method and result. To investigate the syntactic structure of titles, we adopted the framework proposed by Cheng et al. (2012). They identified five types of syntactic structures: nominal, compound, full sentences, V-ing phrases and prepositional phrases (see Table 4).

Table 3 Framework of information types (Sahragard and Meihami 2016, adapted from Goodman et al. 2001)
Table 4 Framework of syntactic structures (Cheng et al. 2012)

To measure informativity, we followed the existing approach used by Tocatlian (1970) and Buxton and Meadows (1977) by counting the number of “substantive” words in each title manually with articles, prepositions, conjunctions, pronouns, and auxiliary verbs excluded. For example, the number of substantive words in the title Emphasis Harmony in a Modern Aramaic Dialect is five and the informativity (the number of substantive words divided by the total) of this title is 71.4%.
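The calculation can be sketched in Python as follows. The counting in the study was done manually; the word list below is a small, deliberately incomplete set of non-substantive words assumed for illustration, and the function name is hypothetical.

```python
import re

# Illustrative, incomplete list of non-substantive words (an assumption,
# not the inventory used in the study).
NON_SUBSTANTIVE = {
    "a", "an", "the",                                # articles
    "in", "on", "of", "to", "from", "for", "with",   # prepositions
    "by", "at", "between",
    "and", "or", "but",                              # conjunctions
    "it", "its", "we", "our", "their",               # pronouns
    "is", "are", "was", "were", "be", "do", "does",  # auxiliary verbs
    "can", "may",
}

WORD_RE = re.compile(r"[^\W_]+(?:[-'’][^\W_]+)*")


def informativity(title: str) -> float:
    """Share of substantive words among all words in a title."""
    tokens = WORD_RE.findall(title.lower())
    if not tokens:
        return 0.0
    substantive = [tok for tok in tokens if tok not in NON_SUBSTANTIVE]
    return len(substantive) / len(tokens)


score = informativity("Emphasis Harmony in a Modern Aramaic Dialect")
print(f"{score:.1%}")  # five substantive words out of seven
```

A fixed word list cannot resolve words that are substantive in one title and not in another, which is one reason manual counting may remain preferable for a corpus of this size.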

In the present study, we adopted TTR to measure lexical diversity. It is calculated by dividing the type count of a title by its token count. The token count of a language segment represents the total number of words it contains, while the type count refers to the number of different words in it. For example, lexical diversity of the title Inversion and finiteness in Spanish and English: Developmental evidence from the optional infinitive and optional inversion stages is 76.5% (the total number of words in this title is seventeen, while the number of different words in it is thirteen).
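A minimal Python sketch of the TTR calculation is given below. The function name and tokenization pattern are assumptions for illustration; comparing words without regard to case is also an assumption, although it is what reproduces the count of thirteen types in the example above (“Inversion” and “inversion” are treated as one type).

```python
import re

WORD_RE = re.compile(r"[^\W_]+(?:[-'’][^\W_]+)*")


def type_token_ratio(title: str) -> float:
    """Number of different words divided by the total number of words."""
    tokens = WORD_RE.findall(title.lower())  # case-insensitive comparison
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


title = (
    "Inversion and finiteness in Spanish and English: Developmental "
    "evidence from the optional infinitive and optional inversion stages"
)
print(f"{type_token_ratio(title):.1%}")  # thirteen types over seventeen tokens
```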

Results and discussion

The present study intends to unveil the long-range trends of research article title features published in linguistics and literary journals between 1980 and 2018 and to examine possible disciplinary variations in title features. In this section we will firstly present the features in different time periods and analyze the diachronic changes in linguistics journals and literature journals respectively. Then we will describe the similarities and differences in title features between the two disciplines.

Diachronic description of title features in linguistics journals

Average title length

Table 5 shows that the average title length of linguistics journals followed an upward trend for each decade year, with an increase from 6.6 words per title in 1988 to 11.2 words per title in 2018. The increase in the earlier periods was gradual rather than significant, whereas the latest decade saw a rather dramatic increase of 2.3 words per title, suggesting that linguistics researchers are more likely to use longer titles than before. In general, from 1980 to 2018, the title length increased by 4.6 words, an increase of about 70%.

Table 5 Average title length in linguistics journals

Syntactic structure

Figure 1 shows the information on syntactic structure of titles in linguistics journals. On the whole, nominal structure and compound structure are the dominant structures used, taking up almost ninety percent of the entirety. However, the use of nominal structure has suffered a continuous decline while compound structure has enjoyed a steady 10% growth for each decade year. In the latest period, the popularity of compound structure surpassed nominal structure and reached its peak (50%) in the four periods. This result in the 2010s is consistent with the finding in Cheng et al.’s (2012) study, in which compound structure stood at 54% and nominal structure took up about 39% of the applied linguistics journal article titles between 1999 and 2008. The other three structures, including full sentence, V-ing phrase and prepositional phrase, were also adopted in linguistics journals, but they made up a very small proportion, less than 20% combined for each decade year.

Fig. 1 The trends of syntactic structures of titles in linguistics journals

Information types

In general, titles containing the information on topic only predominated over the past four periods with no exception, followed by result, method/design, dataset and conclusion in sequence (see Fig. 2). However, from 1980 to 2018, topic only titles have experienced a dramatic decrease from 74 to 53%, while titles with information about result and method/design have both enjoyed relatively substantial increases, from 14 to 23% and 7 to 18% respectively. Although topic only titles in the 2010s still accounted for the largest portion, the absolute predominance was gradually weakened in this period of time. Meanwhile, it is worth noticing conclusion titles were the least favored with a very slight fluctuation between 0 and 2%. This finding is basically in line with Sahragard and Meihami’s (2016) results, which showed that applied linguistics journals tended to provide the least information on conclusion of the studies. However, their research revealed a decline in titles containing information about conclusion in Modern Language Journal, Language Learning, and Foreign Language Annals, while our findings suggest a weak increase of conclusion titles between 2010 and 2018. As the proportion of titles containing methods/design and result both rise steadily, it is not difficult to make the case that more dimensions of research content are being included in linguistics article titles. That means that journal article titles in the field of linguistics have become increasingly information-diversified and can better serve as the “compass” to provide initial information for readers. Another important finding about information type is that the proportion of dataset did not witness much change and is relatively lower than that presented in Sahragard and Meihami’s (2016) study. A possible explanation might be the difference in our data source. 
The journals chosen by Sahragard and Meihami (2016) mainly publish applied linguistics articles, which focus more on the subjects and specific groups of people; while the journals that we selected all center around pure linguistic topics. Another explanation could be the differences in journal guidelines between disciplines. Researchers normally formulate titles in accordance with the journal policies and discourse conventions to a great extent (Gesuato 2009).

Fig. 2 The trends of information types in titles in linguistics journals

Informativity

Informativity of titles can greatly determine whether the title is catchy or not (Nagano 2015). The figures shown in Table 6 reveal that informativity of linguistics article titles reached a low of 59.8% in the 2000s and peaked at 70% for the last decade year. Although the number of substantive words in titles increased in the 2000s, its growth rate could not catch up with that of average title length, resulting in a slight decline in the 2000s. We can see from Table 6 an overall growing trend of informativity of titles in linguistics journals, which is consistent with findings of the previous research conducted by Yitzhaki (1997), who claimed that the increase in informativity in humanities journal article titles from one decade to the following was not always significant and not necessarily linear even if the long-range trend indicated an increase. More specifically, Yitzhaki (1997) examined the number of substantive words in article titles in the journal Language, which also contributes to the corpus of our research, between 1930 and 1990 in his study. He came to the conclusion that the average number of substantive words in titles published in Language in 1980 and 1990 were 4 and 5 respectively, which was exactly consistent with our results.

Table 6 The trends of informativity of titles in linguistics journals

Lexical diversity

Table 7 shows the trend of lexical diversity of titles in linguistics journal articles. From the 1980s to the 2010s, lexical diversity went through an overall decrease from 50.6 to 44.6%, with slight increases of 2.2% and 1.3% for the second and the last decade year respectively. A more significant decline of nearly ten percent was registered for the third period, from 52.8 to 43.3%. Nevertheless, variations in lexical diversity of titles seem to be small in linguistics journals.

Table 7 The trends of lexical diversity of titles in linguistics journals

Diachronic description of title features in literature journals

Title length

The average title length of literature journals has kept increasing steadily during the four periods, from 8.7 words per title to 11.2 words per title, which is in line with the previous studies on the variation of title length (Yitzhaki 1994; Méndez et al. 2013). Furthermore, there is only a slight increase after the 2000s, indicating that the average title length of literary articles has been staying around 11 words per title for the past two decades (Table 8).

Table 8 Average title length in literature journals

Syntactic structure

Results show that compound structure is the most widely used syntactic structure in literature journals throughout the four periods, followed by nominal structure (see Fig. 3). Since most of the literature journal articles concentrate on one specific topic, such as a book, an author, a literary figure, a particular phenomenon, etc., the authors seem to prefer using compound structure, especially using colons to give an abstract concept first and then further introduce the real topic after the colon. For instance, Woman in a Trap: Pope and Ovid in “Eloisa to Abelard” (published in College Literature) can be a good illustration in point. Our finding is consistent with that of Haggan’s (2004), which concluded that in literature journals compound titles had a much higher percentage than nominal titles. The proportions of the other three syntactic structures remained very small and witnessed few noticeable changes across the periods examined. However, there was a slight increase of 4% in V-ing phrase in the last decade year. This change might suggest that researchers tend to include the analysis process of literary works in their titles.

Fig. 3 The trends of syntactic structures of titles in literature journals

Information types

Figure 4 shows that titles containing the information on topic only are the most favored and frequently-used in literature journals from 1980 to 2018. Although there were small fluctuations between different periods, the long-term trend of increase in topic only titles is still significant. Given that most literature journal articles do not require experimental designs or fixed and objective results, result and method/design especially were not provided in most titles throughout the four periods. Since a great number of literature journal articles focus on a literary work or even a narrator, the authors would sometimes include the name of the work or even one specific quotation in the title, thus revealing part of the source of the dataset. For example, the title Back to the Future: Late Modernism in J.G. Ballard’s The Drowned World (published in College Literature) directly revealed the research subject is the book The Drowned World written by J.G. Ballard. Therefore, titles providing information on dataset account for more than one third of the whole proportion, which is much higher than that of method/design and result.

Fig. 4 The trends of information types in titles of literature journals

Informativity

The trend of informativity in literature article titles is presented in Table 9. In general, informativity increased over the periods investigated, which is broadly consistent with the results of other studies (Buxton and Meadows 1977; Yitzhaki 1994). Informativity remained high, though with occasional fluctuations. The proportion fell suddenly to 67.3% in the 2000s before peaking at about 76.8% in the 2010s. Though the literature journals chosen for this paper were not included in Yitzhaki’s (1997) study, our finding confirms its conclusion that the pace of increase in humanities journals has been much slower, with decreases at certain points along the long-range trend.

Table 9 The trends of informativity of titles in literature journals

Lexical diversity

The trend of lexical diversity of titles in literature journal articles is presented in Table 10. Average lexical diversity decreased from 54.9% in the 1980s to 48.2% in the 2010s, with a minor increase of 0.7% in the second decade. Since the 2000s, the downward trend has become more prominent, with a decrease of 5.2%. Nevertheless, the overall process appears to be a gradual fall without a trough.

Table 10 The trend of lexical diversity of titles in literature journals

Comparison between linguistics journals and literature journals in terms of title features

Titles in linguistics journals and literature journals show consistency in the long-range trends of title length and informativity: both display a steady, linear growth in length and an increase with fluctuations in informativity, which is broadly consistent with the findings of previous studies. For example, Buxton and Meadows (1977) concluded that article titles in most subjects showed a significant increase in informativity from 1947 to 1973. Although they did not examine article titles in linguistics and literature, our findings are consistent with theirs and support their conclusion. Compared with the 1980s, titles in both disciplines showed a marked increase in length in the 1990s. While length continued to grow steadily in the third decade, titles in both disciplines underwent a sharp decrease in informativity in the same period. Although longer titles may imply an increase in substantive words, our finding suggests that more words do not necessarily mean more information content. Moreover, when title lengths in both disciplines were increasing rapidly from the 1980s to the 1990s, the proportions of compound structures also rose suddenly in the same period, from 18% to 28% and from 49% to 60%, respectively. The average length of compound structures is the longest in general, since the compound structure can serve the role of the other three while playing its unique part (Wang and Bai 2007); therefore, the growth in the proportions of compound structure may contribute to the growth in title lengths. As far as lexical diversity is concerned, both disciplines experienced a downward trend. Lexical diversity increased slightly after the 1980s and has declined noticeably since the 2000s. This might be the result of the increasing use of the same words for important concepts and problems. The downward trend is also generally consistent with the finding of Bérubé et al. (2018) on the lexical diversity of humanities and social sciences titles, which decreased by 8% over the period from 1975 to 2014. However, a difference can still be identified between the two subject domains. Even though the lexical diversity of linguistics article titles rose slightly in the latest period, it was still about four percentage points lower than that of literature article titles. A possible reason for this might be the difference in the information types carried by titles in these two fields. While incorporating terminology in the description of problems, methods and results may be a common practice in formulating linguistics article titles, literary works or main characters are often mentioned in literature article titles, contributing to a relatively higher level of lexical diversity.

Nevertheless, significant variations between the two disciplines do exist in several aspects. First, compound structure has remained predominant in literature journals. While nominal structure was the most widely adopted syntactic structure of titles in linguistics journals for the first three periods, it was surpassed by compound structure in the most recent decade. While Wang and Bai (2007) recognized that the heads in nominal titles usually function to inform readers of the general focus of study and often need further specification, Soler (2007) pointed out in the same year that nominalization is the materialization of informativity via the piling up of pre- and post-modifiers. For example, in The Role of Diffusion in the Genesis of Hawaiian Creole (published in Language), the subject, Diffusion, is an abstraction of the research object. However, the specific research result and the origin of the data are revealed through both the pre-modifier, The Role, and the post-modifier, Genesis of Hawaiian Creole. The pre-modifier The Role is an elicitation of the result of the research, while the post-modifier Genesis of Hawaiian Creole states specifically in which group of subjects the research was conducted. This may account for the high proportion of nominal structure used in linguistics article titles. However, as titles also serve the function of an eye catcher, the use of punctuation marks in titles, especially question marks, can greatly arouse readers’ interest in further reading. Though nominal titles effectively and coherently summarize the essential information of research articles (Cheng et al. 2012), the same authors also clearly show the advantages of compound structures in their study. Titles with a compound structure can better indicate a specific research focus (Cheng et al. 2012) and can realize more functions.
The most irreplaceable feature of compound titles is their format, which allows the general topic and the specific content of the research to coexist. Specifically, as either part of a compound title can be a nominal, V-ing phrase, full sentence or prepositional phrase, a compound title can realize more functions than the other structures. For example, in What Are You Cookin’ on a Hot?: Movement Constraints in the Speech of A Three-Year-Old Blind Child (published in Language), the author(s) combined a full sentence with a nominal statement modified by a post-modifier. This title achieves the functions of an interrogative sentence and a nominal structure at the same time, arousing readers’ interest as well as revealing the research subject and result. It is understandable that linguistics article titles tend to explore the complicated relationships between a number of key elements such as social context, data source, participants, method, and scope. However, as researchers try to pack more information into the title, the traditional nominal structure is declining and is being overtaken by compound structures. Because of the literariness and subjectivity of literary research, colons are employed in a great number of literary titles so as to retain the romance and mystery before the colon and supply subjective elaboration after it. Therefore, in literature journals, compound structure has always been the most popular across the four periods.

As far as information type is concerned, the proportion of topic only titles has always been overwhelmingly high (around 60%) in literature journals. By contrast, topic only titles in linguistics journals, though still dominant compared with other information types, underwent a linear and consistent decrease from 74% to 53%. Specifically, linguistics article titles have gradually covered more information types by incorporating the description of result and method/design. The variation can be accounted for in two ways. One is that the empirical and quantitative approach has become increasingly popular in linguistic research; it calls for rational experimental designs and is expected to present objective results. In contrast, literary research usually focuses on the analysis of specific literary figures or phenomena based on subjective interpretation. The other possible explanation is that literature research does not necessarily generate specific and objective results, let alone specify information on result in titles. Furthermore, dataset takes up the second highest proportion in literature article titles, while it is seldom included in linguistics article titles. This might be related to the particularity of literary research. A large amount of literature research focuses on a specific literary work; therefore, the title of that work, a type of dataset, would normally be included in the article title.

Conclusions

This paper investigated five title features of research articles in seven linguistics journals and seven literature journals. Since linguistics and literature are both language-related disciplines, it was of interest to conduct a diachronic study of their respective prominent features and long-range trends, as well as the similarities and differences between the two disciplines. Article titles in both disciplines demonstrated a similar trend: title lengths increased steadily, informativity went up with fluctuations, and lexical diversity experienced a modest decline. Nominal structure and compound structure combined have the largest proportion, accounting for more than ninety percent, which is in line with the results of Cheng et al. (2012). As for information types, in our study, the proportion of method/design in linguistics journal article titles is not as high as that of topic only, which is overwhelmingly high in the first three decades and still relatively high during the latest decade. We also found that topic only is the information type provided most frequently by titles in both linguistics and literature journals, which is consistent with the conclusions of Siegel et al. (2006) and Goodman et al. (2001) drawn from articles in the field of medicine.

The present study has enriched existing title research and may bear some implications for future studies. First, since titles in literature and linguistics journals have not been studied as comprehensively as those in the science subjects investigated previously, this study has verified the results of earlier research and serves as a solid supplement. In addition, the findings of this article can provide valuable information on the formulation of a research article title for those who intend to publish in the fourteen journals selected for this study. Considering the finding that compound article titles are more prevalent than the other four types, EAP teachers ought to encourage students of linguistics and literature to use compound structures more often when formulating research article titles in academic English writing classes.

However, the present study still has some limitations. First, the results for lexical diversity in this research may still need further validation, since no empirical assessment of the sample size robustness of the TTR on our corpus was conducted. Second, information types and syntactic structures of article titles in this study were analyzed manually. Although two authors went through the same procedures and checked for consistency, the classification may still be subjective. In addition, a larger corpus of titles covering a wider time span would probably reveal more pronounced variations. With a larger sample, a different lexical diversity measure could also be employed to resolve the sample size dependency problem, thus hopefully yielding more convincing results for lexical diversity. Moreover, this study mainly focused on the diachronic variation of each of the five features studied. Not much attention was paid to the possible correlations among the five features, which may call for further investigation as well. Future studies can also be conducted on the specific types of compound structure, since a comprehensive model has already been developed by Anthony (2001) for further analysis.
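
To make the sample size dependency of the TTR concrete, the following is a minimal sketch in Python rather than the procedure used in this study. It computes the plain type-token ratio alongside a moving-average TTR (MATTR), which is offered here only as one possible length-robust alternative; the study does not name a specific replacement measure. The function names, the simple regular-expression tokenizer, and the window size of 50 tokens are all illustrative assumptions.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and extract alphabetic word tokens.

    Assumption: a simple regex tokenizer is adequate for illustration;
    the study's actual tokenization is not specified here.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ttr(tokens: List[str]) -> float:
    """Type-token ratio: number of distinct tokens over total tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mattr(tokens: List[str], window: int = 50) -> float:
    """Moving-average TTR: mean TTR over every window of fixed length.

    Because each window has the same number of tokens, the result is
    less sensitive to overall sample size than the plain TTR.
    The default window of 50 is a hypothetical choice.
    """
    n = len(tokens)
    if n == 0:
        return 0.0
    if n <= window:
        # Sample shorter than one window: fall back to the plain TTR.
        return ttr(tokens)
    ratios = [ttr(tokens[i:i + window]) for i in range(n - window + 1)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Two titles quoted in this paper, pooled into one small sample.
    titles = [
        'Woman in a Trap: Pope and Ovid in "Eloisa to Abelard"',
        "The Role of Diffusion in the Genesis of Hawaiian Creole",
    ]
    pooled = [tok for title in titles for tok in tokenize(title)]
    print("TTR:", round(ttr(pooled), 3))
    print("MATTR (window=10):", round(mattr(pooled, window=10), 3))
```

In such a sketch, the titles of one decade would be pooled into a single token list and both measures compared across decades; if the two measures diverge as the number of titles per decade changes, that would indicate how strongly the TTR figures depend on sample size.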