Introduction

Writing highly cited articles is an important goal for scholars. It gives prestige to the authors and the institutions with which they are associated. Measurements of citations are used to rank and evaluate universities, departments and individual scholars, as well as the countries in which they are located (Haslam et al. 2008; Ball et al. 2009). More importantly, whether or not claims in an article become facts depends on if and how later papers refer to them (Latour 1987). Scientific facts are settled by broad agreement (Collins 1990). “Scientific activity is not ‘about nature,’ it is a fierce fight to construct reality.” (Latour and Woolgar 1986).

A scholar who is able to align large numbers of other scholars (Latour 1987; Latour and Woolgar 1986; Collins 1990) has a greater impact on what does and does not become a fact in his or her field than a scholar who is unable to align other scholars. Aligning these scholars and obtaining their agreement “may involve funding, status, or persuasive ability” (Martin and Groth 1991). Being cited by others is a signal of this influence on scientific progress.

Are claims and content all that matters? What if seemingly superficial factors influence the number of times an article is cited? The many guidebooks on how to write research papers suggest that there are tricks to writing better papers. In addition to this literature, often based on authors’ own experiences or methods, there is also evidence that there are factors that influence the frequency with which articles are cited. For example, it has long been established that scholars of higher rank are more promptly and widely cited (Merton 1968) than less well-known scholars. Having an established name on a paper might ensure that a paper is not ignored, the worst fate to befall a scientific paper (Latour 1987).

In the remainder of this introduction, we summarize earlier research about non-content related factors that affect subsequent citation, including length of titles and abstracts, numbers of pages, authors and cited references, and readability. We then outline our own methodology for selecting articles for analysis and for operationalizing our selected variables. We discuss the results for each of the three subject categories we analyzed, separately and in comparison, before making some recommendations about how to write highly cited articles in Sociology, General & Internal Medicine, and Applied Physics.

In research guidebooks it is recommended to use keyword and title search, preferably in indexes and/or bibliographies, and to base selection of articles to read on their abstracts (Booth et al. 2003; Neuman 1991). This indicates the importance of a catchy title, a good selection of keywords and an attractive abstract. In addition to the content, the readability of an abstract might contribute to its attractiveness. While Haslam et al. (2008) assumed informative and attention-capturing titles might improve impact, they found no association between the catchiness of a title and the impact of an article in the field of Social and Personality Psychology. However, in a regression of what they refer to as the organization characteristics of an article, they did find that title length had a small negative effect and the presence of a colon in the title had a positive effect on impact. A possible explanation for this is that a colon may indicate scholarly complexity and distinction (Haslam et al. 2008). Stremersch et al. (2007) hypothesized that title length would have an impact on the number of citations an article in marketing would receive, but could not confirm this with their data. Jacques and Sebire (2010), in comparing highly and lowly cited articles in three medical journals, found a positive correlation between the number of citations received and the length of the title, the presence of a colon and the presence of an acronym. Jamali and Nikzad (2011), however, found a negative correlation between the number of citations and both the title length and the presence of a colon in a set of six PLoS journals.

An effective way to boost impact might be sought in working together with others. There are various reasons why collaboration might positively influence the number of times an article is cited. Some argue that it positively affects the quality of a paper (for instance Haslam et al. 2008) as there will be a more extensive internal review process. Collaboration also increases the opportunities for self-citation (for instance Smart and Bayer 1986) and increases the network of scholars into which a paper can easily be introduced (for instance Frenken et al. 2005). Conclusions about whether or not collaboration indeed has a positive impact on citations vary. In an analysis of 270 articles in three applied fields (Clinical Psychology, Educational Measurement, and Management Science), Smart and Bayer (1986) conclude that “collaboration generally has little effect on aggregate quality, regardless of field, as measured by citation indices”. Furthermore, their conclusion holds irrespective of whether or not self-citations are included (Smart and Bayer 1986). More recently, Haslam et al. (2008) found first author eminence and total author eminence influenced impact in the field of Social and Personality Psychology, although the number of authors had no significant influence. Webster et al. (2009) uncovered a significant positive relation between the number of authors and the number of citations a paper receives in Evolutionary Psychology. Similar relationships were found in the fields of Biology & Biochemistry, Chemistry, Mathematics and Physics (Vieira and Gomes 2010). Using raw data from the Web of Science over a ten-year period, Glänzel and Thijs (2004) were able to conclude that “multi-authorship increases above all the probability to be cited by others”. Multi-authored papers are cited more, but the increase in self-citation rates is weaker than the increase in foreign citations (Glänzel and Thijs 2004). 
Important outliers in their set are single-authored papers, which have a very low share of self-citations. Furthermore, Franceschet and Costantini (2010), in their study of 18,500 Italian research outputs, conclude that collaboration has a positive influence on the impact of papers. Important exceptions are hyperauthored papers, as are common in Physics, which receive fewer citations than papers with a smaller group of authors. Frenken et al. (2005) found that the number of authors (and the number of organizations) had a positive impact in the field of Biotechnology and Applied Microbiology. Within the field of Information Science and Technology, collaboration has a significant positive influence on citation rates (Levitt and Thelwall 2009).

Another important factor is the number of references a paper contains. In the past, this was stable at ten references per paper (Price 1963), but it is widely assumed this number has since increased. Larivière et al. (2008) have shown that while the growth of publications in medical fields and in natural sciences & engineering has been progressively slowing down since 1980, the number of references has not leveled off, which would indicate a growth in the number of references per paper. A paper that itself contains many references to previous work is likely to develop a stronger standing than a paper with no or few references (Latour 1987; Latour and Woolgar 1986). References are used to increase a paper’s power of persuasion (Gilbert 1977). Webster et al. (2009) suggest that, among other reasons, a form of reciprocal altruism (“I cite you, you cite me”) could cause a paper with many references to be cited more often. They found a linear relation between a log transformation of the number of citations and the number of references (Webster et al. 2009); however, they also indicate that other untested and unknown variables could be influencing this relationship. Similar results were found by Vieira and Gomes (2010).

Several scholars have found a positive relationship between article length and the number of citations an article receives (Haslam et al. 2008; Wang et al. 2012; Vieira and Gomes 2010; Hudson 2007), simply because longer articles more often contain more findings.

Hartley et al. (1988), in a short literature review about the relation between readability and prestige, found indications that readability can have both a positive and negative effect on prestige, and thus concluded that superior measuring instruments were needed. For journals in the field of Marketing an increase in readability might negatively influence credibility (Stremersch et al. 2007). An article which is very readable might be thought of as simplistic, whereas an article that is difficult to read “presents us with a choice of whether to judge the author inept for not being clear, or ourselves stupid for not grasping what is going on.” (Botton 2001), suggesting there is an optimum somewhere between ‘too easy’ and ‘too hard’ to read.

Different techniques have been developed to measure readability, including the Flesch Reading Ease Score (Flesch 1948) and Flesch-Kincaid Grade Level (Kincaid et al. 1975). These types of measurements have been criticized for only looking at surface level linguistic information (Crossley et al. 2008; Lin et al. 2009). Nonetheless the Flesch Reading Ease Score correlates quite well with comprehension (Fry 1968), and is widely used in readability research (see for instance Hayden 2008; Weeks and Wallace 2002; Wager and Middleton 2002; Roberts et al. 1994; Friedman et al. 2004; Villere and Stearns 1976; Hartley et al. 1988).

The Flesch-Kincaid Grade Level (FKGL) expresses the US school grade level or the years of education the reader should have completed in order to understand the text, while the Flesch Reading Ease Score (FRES) expresses readability on a scale that, for practical purposes, can be thought of as ranging from 0 to 100, where a higher score indicates easier readability (e.g. 0–30 very difficult, 90–100 very easy). Whilst both FKGL and FRES are used in research, FRES appears to be the more widely used, even in more recent studies, and will be used in the rest of this study. Both formulas are included in Microsoft Word and are implemented as follows (Microsoft 2003):

$$ \begin{aligned} & \text{Flesch Reading Ease Score} = 206.835 - 1.015 \times \frac{\text{Total Words}}{\text{Total Sentences}} - 84.6 \times \frac{\text{Total Syllables}}{\text{Total Words}} \\ & \text{Flesch-Kincaid Grade Level} = 0.39 \times \frac{\text{Total Words}}{\text{Total Sentences}} + 11.8 \times \frac{\text{Total Syllables}}{\text{Total Words}} - 15.59 \end{aligned} $$
(3)

Using these internal functions of Microsoft Word, the readability of a text can be calculated. For instance, this introduction, from beginning to end, has a FRES of 31.3 and an FKGL of 14.
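
For readers without access to Microsoft Word, the two formulas can be sketched in a few lines of Python. This is only an illustration: the coefficients are those given above, but the sentence splitter, the word pattern and the vowel-group syllable counter are simplifying assumptions of this sketch, and the function names are hypothetical. Microsoft Word's own counting rules are not described here, so scores from this sketch may differ from the values Word reports.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel groups (an assumption, not Word's rule)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> tuple:
    """Return (sentences, words, syllables) using simple regex heuristics."""
    # Assumption: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of letters, optionally with one apostrophe part.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease Score with the coefficients quoted in the text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level with the coefficients quoted in the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    counts = text_counts("The cat sat. The dog ran!")
    print(counts, flesch_reading_ease(*counts), flesch_kincaid_grade(*counts))
```

Separating the counting step from the formulas makes it possible to swap in a better syllable counter without touching the scores themselves.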

Methodology

In order to analyze how the factors identified above affect citations, we selected three different subject categories, namely Sociology, General & Internal Medicine, and Applied Physics. Journal names for the 10 journals with the highest impact factor were extracted from the Journal Citation Report for 2005. Using the Web of Knowledge (WoK) advanced search function, information for the document type ‘Article’ was collected over the period 1996 to 2005, creating a corpus of these three categories. Records of these papers were downloaded between 3 and 11 February 2011, and stored in a database for further analysis. As most articles are cited within 5 years of publication, it was important to choose an early cut-off date.

For the papers identified, we attempted to collect full-text PDFs via the publishers. We searched on the issue and volume, the journal and the article title. Not all journals or journal issues fall within the scope of the library subscription of Maastricht University. Other journals only contained reviews. Not all articles were found, sometimes due to misspellings in WoK, where some titles seem to have been read using optical character recognition (OCR), which can lead to mistaken characters; for instance, ‘rn’ is read as ‘m’, and vice versa. Some text could not be extracted for analysis, as articles were sometimes locked for text extraction, which might be unintended. Articles with more than 100 words in the full text and five words in the abstract were included. Only journals for which at least 50 % of the articles were found, extracted and contained at least 100 words are included in the analysis (see Table 1 for an overview of the number of journals and articles included in the analysis compared against the number of articles in the Web of Knowledge, per included category).

Table 1 Overview of the number of journals and articles included in the analysis compared to the number of articles in the Web of Knowledge, per category

Text was extracted from the downloaded PDF files for analysis. For each paper the following information was recorded:

  • Number of pages, cited reference count, times cited count: all directly from the WoK record

  • Length of the title and the number of authors: based on or inferred from the WoK record

  • Number of sentences in the abstract and FRES of the abstract: based on the WoK abstract, and analyzed by Microsoft Word

  • Number of sentences in the full text and FRES of the full text: based on the downloaded paper and analyzed by Microsoft Word.

As the citation and reference counts are expected to be positively skewed (Wang et al. 2012; Webster et al. 2009), a log transformation [Log Times Cited = Log10(Times Cited + 1) and Log Reference count = Log10(Reference count + 1)] was applied as exemplified by Webster et al. (2009). Since the author count is also highly positively skewed, a log transformation [Log Author Count = Log10(Author Count)] was also applied. Following the suggestion by Haslam et al. (2008) the presence of a (semi-) colon in the title was indicated by a binary variable.
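
The transformations described above can be written out as a short Python sketch. The function names are hypothetical and chosen for illustration only. The comment on why the author count carries no offset is an assumption of this sketch, not a statement from the analysis itself.

```python
import math


def log_times_cited(times_cited: int) -> float:
    """Log Times Cited = Log10(Times Cited + 1); the +1 keeps uncited papers defined."""
    return math.log10(times_cited + 1)


def log_reference_count(reference_count: int) -> float:
    """Log Reference count = Log10(Reference count + 1)."""
    return math.log10(reference_count + 1)


def log_author_count(author_count: int) -> float:
    """Log Author Count = Log10(Author Count).

    Assumption: no +1 is needed because every article has at least one author.
    """
    return math.log10(author_count)


def has_colon(title: str) -> int:
    """Binary indicator: 1 if the title contains a colon or semicolon, else 0."""
    return int(":" in title or ";" in title)


if __name__ == "__main__":
    print(log_times_cited(99), log_author_count(1), has_colon("Form: and style"))
```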

The relation between the independent variables and the number of citations was analyzed using bivariate correlation for each category. Since there are indications that both readability and its opposite might have a negative impact on the number of citations, the relationship between the number of citations a paper receives and its readability might be parabolic. As bivariate correlations and linear regressions are linear, the square root of the readability scores is also included in the analysis, as this makes the relationship behave in a more linear fashion.
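
As a rough illustration of this step, the sketch below computes a Pearson correlation between a square-root-transformed readability score and a log-transformed citation count. The helper names are hypothetical, the numbers in the usage example are invented for demonstration and are not taken from our data, and the handling of negative FRES values (clamped to zero here) is an assumption, since the formula can in principle return values below zero.

```python
import math


def pearson_r(x, y) -> float:
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sqrt_fres(score: float) -> float:
    """Square root of a FRES value; negative scores are clamped to 0 (an assumption)."""
    return math.sqrt(max(score, 0.0))


if __name__ == "__main__":
    # Invented example values, for illustration only.
    fres_scores = [12.0, 25.0, 31.3, 40.0, 55.0]
    log_cited = [1.9, 1.6, 1.5, 1.2, 1.0]
    print(pearson_r([sqrt_fres(s) for s in fres_scores], log_cited))
```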

A more advanced statistical analysis is required for two reasons: the journal in which an article is published might have characteristics, independent of the individual article, that influence the number of citations received, and the elapsed time since publication also influences the number of citations. A linear regression for each category using dummy variables for the journal and publication year was created. However, since the journal itself changes over time, for instance when a new editor takes over, time and journal cannot be seen as independent of each other. Therefore these dummies are combined in one dummy representing a journal in a year. To look at the effects of some factors independent of the journal in which the papers were published, a model without the journal/year dummy was first created.
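
The combined journal/year dummy can be illustrated with a minimal Python sketch. The function name, the `journal|year` label format and the choice to drop the first combination as a reference category are assumptions made for this example; they show one common way to encode such a variable and are not a description of the software settings used in the analysis.

```python
def journal_year_dummies(records):
    """Build 0/1 dummy columns for each journal-in-a-year combination.

    records: list of (journal, year) tuples, one per article.
    Returns (column_labels, rows), where rows[i] holds the dummies for article i.
    """
    keys = sorted({f"{journal}|{year}" for journal, year in records})
    # Assumption: the first combination serves as the reference category and is
    # dropped, so that the dummies are not perfectly collinear with an intercept.
    columns = keys[1:]
    rows = [
        [int(f"{journal}|{year}" == column) for column in columns]
        for journal, year in records
    ]
    return columns, rows


if __name__ == "__main__":
    # Hypothetical journal names, for illustration only.
    articles = [("Journal A", 1996), ("Journal A", 1997), ("Journal B", 1996)]
    labels, matrix = journal_year_dummies(articles)
    print(labels)
    print(matrix)
```

With J journals and Y years, this yields up to J × Y − 1 columns instead of the (J − 1) + (Y − 1) columns of separate journal and year dummies, which lets each journal have its own profile over time.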

Results

From Tables 2, 3, and 4 it is immediately clear that there is a high correlation between all the factors, which could lead to multicollinearity in the regression model. Whilst this could have a negative impact on the reliability of the coefficient estimates in the regression model, the predictive power of the model remains intact. Before making inferences about the coefficient estimates, we tested for multicollinearity.

Table 2 Correlations between the different variables included in the Sociology category
Table 3 Correlations between the different variables included in the General & Internal Medicine category
Table 4 Correlations between the different variables included in the Applied Physics category

The number of Words in Title is significantly correlated (p < .01) with the Log Times Cited in all three subject categories (see Tables 2, 3, 4). This correlation is negative in both Sociology and Applied Physics (articles with shorter titles received more citations), but positive in General & Internal Medicine (the longer the title, the more citations received), confirming a hypothesis put forth by Stremersch et al. (2007) and results obtained by Jacques and Sebire (2010).

For all three categories the number of pages correlates significantly (p < .01) and positively with the Log of the number of times an article is cited, in concurrence with earlier literature. Another measurement of article length, the number of sentences in an article, also correlates significantly (p < .01) and positively with the Log times an article is cited in all three categories (see Tables 2, 3, 4).

The length of the abstract, in terms of numbers of sentences, correlates positively and significantly (p < .01) with the Log times cited in both General & Internal Medicine and Applied Physics, but not in Sociology (see Tables 2, 3, 4).

The Log of the number of references an article contains correlates positively and significantly (p < .01) with the Log of the number of citations the article received in all three categories (see Tables 2, 3, 4).

For all three categories a positive, significant correlation (p < .01) between Log of the Author Count and Log times cited has been found (see Tables 2, 3, 4).

The square root of the readability of the abstract, as measured by the Flesch Reading Ease Score, has a negative, significant correlation (p < .01) with the Log times cited in all fields. The square root of the FRES of the whole text correlates negatively and significantly (p < .01) with the Log times cited in Sociology and General & Internal Medicine, but not in Applied Physics (see Tables 2, 3, 4).

Regression

All variables were entered in a regression model as predictor variables for each of the subject categories (see Table 5). The journal/year dummies were entered in model two to further explain the variance of the Log Times Cited (see Table 6).

Table 5 Summary of the first model regression
Table 6 Summary of the second model regression

Compared to Sociology and Applied Physics, the variance in the Log Times Cited in General & Internal Medicine is explained to a high degree (31.7 %) by these seemingly superficial factors (see Table 5).

For all three fields, the variance explained increases after adding the journal/year dummies, and these increases are significant (p < .01). For General & Internal Medicine, more than 50 % of the variance in the Log Times Cited is explained (see Table 6).

The (unstandardized) Beta values of the significant predictors from Tables 7, 8 and 9 are combined in Table 10, where the standardized Beta values are also given. From these standardized Betas we can see that a standard deviation change in the number of sentences in the full text has the largest impact on the log of the times an article is cited (and thus on the number of times an article is cited) in Sociology in the first model. In the second model an increase in the number of pages results in the largest change in the log of the times an article is cited. Likewise, we can see that in General & Internal Medicine the largest change in the log of the number of times an article is cited is caused by a change in the log of the author count (first model) and the number of pages (second model). In Applied Physics an increase in the number of sentences in the full text (first model) and the log of the author count (second model) results in the largest increase in the log of the times an article is cited.
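
For readers who want to relate the two kinds of coefficients, the usual conversion is to multiply an unstandardized coefficient by the standard deviation of its predictor and divide by the standard deviation of the outcome. The sketch below illustrates that textbook conversion only; the function name is hypothetical, the example data are invented, and the sketch is not a reproduction of the statistical software output on which the tables are based.

```python
import statistics


def standardized_beta(unstandardized_b: float, predictor, outcome) -> float:
    """Convert an unstandardized coefficient to a standardized one.

    standardized = b * sd(predictor) / sd(outcome), using sample standard deviations.
    """
    return unstandardized_b * statistics.stdev(predictor) / statistics.stdev(outcome)


if __name__ == "__main__":
    # Invented values: outcome is exactly 2 * predictor, so b = 2.
    pages = [1, 2, 3, 4]
    log_cited = [2, 4, 6, 8]
    print(standardized_beta(2.0, pages, log_cited))
```

A standardized coefficient expresses the expected change in the outcome, in standard deviations, for a one standard deviation change in the predictor, which is what allows predictors measured in different units to be compared.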

Table 7 Descriptive statistics, unstandardized Beta and p values for variables in the regression models for the Sociology category
Table 8 Descriptive statistics, unstandardized Beta and p values for variables in the regression models for the General & Internal Medicine category
Table 9 Descriptive statistics, unstandardized Beta and p values for variables in the regression models for the Applied Physics category
Table 10 Unstandardized and standardized (between brackets) Beta values of significant predictors in first and second regression models for all three categories

In the first model, for all three fields there were no parameters with a VIF (Variance Inflation Factor) greater than five, which would indicate multicollinearity. After adding the journal/year dummies in the second model, the number of pages had a VIF greater than five in all three fields (Sociology: 8.347; General & Internal Medicine: 5.643; Applied Physics: 6.986). In Sociology the number of sentences in the full text was also above the threshold of five (7.202); in General & Internal Medicine and Applied Physics the VIF came close to the threshold but did not exceed it (4.837 and 4.920 respectively).
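
To make these VIF values easier to interpret, recall the standard definition VIF = 1 / (1 − R²), where R² comes from regressing one predictor on all the other predictors in the model. The short sketch below inverts that definition; the function names are hypothetical, and the script simply applies the formula to the VIF values for the number of pages reported above.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """Variance Inflation Factor from the R-squared of a predictor regressed on the others."""
    return 1.0 / (1.0 - r_squared)


def implied_r_squared(vif: float) -> float:
    """Invert the VIF definition: share of a predictor's variance explained by the others."""
    return 1.0 - 1.0 / vif


if __name__ == "__main__":
    # VIF values for the number of pages in the second model, as reported in the text.
    reported = {
        "Sociology": 8.347,
        "General & Internal Medicine": 5.643,
        "Applied Physics": 6.986,
    }
    for field, vif in reported.items():
        print(field, round(implied_r_squared(vif), 3))
```

By this definition, the threshold of five corresponds to 80 % of a predictor's variance being accounted for by the other predictors in the model.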

Discussion

Our analysis shows that some of the variance in the number of citations an article receives can be explained by seemingly superficial factors that have nothing to do with the content of the article. In the Sociology articles, 13.3 % of the variance in the log times cited can be explained by such factors. Changes in the log of the number of references, the log of the number of authors, and the number of sentences in the full text and the number of pages have the most influence. Adding the journal and year of publication to the model explains 31.7 % of the variance in the log times cited. The variables with the most influence are the log of the numbers of references, log of the number of authors, the number of words in the title, the presence of a colon in the title and the number of pages.

In General & Internal Medicine articles, 31.7 % of the variance in the log of the number of times an article is cited can be explained by superficial factors such as the log of the number of authors, the log of the number of references, the presence of a colon in the title and the number of pages. When journal and publication year dummies are added, the model can explain 50.9 % of the variance. Relevant factors are the log of the number of references and number of authors, number of pages and the square root of the Flesch Reading Ease Score (FRES) of the abstract.

In Applied Physics, only 6.7 % of the variance in the log of the number of times an article is cited can be explained by factors such as the log of the numbers of references and authors, number of pages and the square root of the full text FRES. In the second model, this rises to 12.2 % of the variance. The log of numbers of references and authors plus the numbers of pages and title words are significant factors.

While the influence of these superficial factors varies between fields, it is clear that such factors are not trivial as they can influence the number of citations an article obtains. Adding the journal and year dummies has an effect on the influence of some of the more superficial variables on the variance in the frequency with which an article is cited. When we look at the two Sociology models, adding journal/year dummies changes the influence of a standard deviation change to the number of sentences in the full text from positive to negative. This suggests that there is a difference between the distributions of these variables between journals. Also there are differences between the categories, for instance, in Sociology, the square root of the full-text FRES is not an important explanatory variable, though it is in General & Internal Medicine, and for the abstracts of Applied Physics and General & Internal Medicine articles.

Why this difference in distribution between journals exists is not clear from this research. Possibly some factors influence the acceptance rate of papers in some journals, or some factors are influenced in the editing process. Another suggestion might be that it depends on the specific subfield in which a journal operates; this could be especially true in Sociology, which seems a broader field than General & Internal Medicine and Applied Physics. Also, we currently have no explanation for the between-field variation in the influence of the factors studied. We can only speculate as to whether this has to do with different citation practices, or with the training, position and time allocated to research by the people writing it up (full-time scholars vs. doctors who do some research alongside their clinical practice).

While these results are based on statistical analysis, they could be used to help people to prepare articles that might become more highly cited. For instance, we notice, when we look at the first and second models, that the number of references and the number of authors explain some of the variance in the number of citations articles received in all three of the fields. This does not mean that one should artificially inflate the number of references (for instance by copying references from other articles, as discussed in Ramos et al. 2012) and the number of authors. The positive effect of an increase in the number of references should be understood in the context of the persuasion factor of papers that build on previous literature, as well as some reciprocal altruism. Also, the positive impact of an increase in the number of authors should be understood as arising from the extension of the network of scholars into which work can easily be introduced, as well as a possible increase in the quality of a paper resulting from rigorous internal review.

Further lessons can be gleaned for sociologists: do not use titles that are too long. An article with a title shorter than the mean will, all other things being equal, receive more citations than an article with a title longer than the mean. The articles themselves ought to be longer than the mean (as measured by the number of pages, at least) and the number of sentences in the abstract should also be greater than the mean in Sociology. In contrast, a longer title does help in General & Internal Medicine, as do more pages and more sentences in the abstract. Here there is also the somewhat paradoxical result that the abstract should be less readable than the mean abstract but the article itself should be more readable, both as measured by the square root of the Flesch Reading Ease Score. In Applied Physics, not only the title but also the article itself should be shorter than the mean. Both the abstract and the full text should contain more sentences than the mean, and the abstract should not be less easy to read. Short articles with many sentences could indicate that short sentences should be used, but could also indicate that one should avoid too many figures and tables in an article, which would inflate article length. Future research could shed some light on this matter.

For those variables that do not surface as significant, we cannot claim they do not contribute to the number of citations an article receives. It could well be that the sample lacks the ability to discriminate between highly and lowly cited articles for these variables.

There are limitations to this research: Applied Physics Letters accounts for 93.9 % of the sample, overshadowing all other journals in the Applied Physics category. In a future research project this could be circumvented by another way of selecting journals and articles to create a more homogeneous set of articles. Whilst General & Internal Medicine and Applied Physics are subfields of the broader fields of Medicine and Physics, respectively, Sociology itself is a broad field, making it a more diverse category compared to the other two categories. There are also limitations to the text extraction method, which are summarized in the Appendix.

Some research questions remain:

  • Are there differences between journals in the same category?

  • Would a more homogeneous set of articles produce the same results?

  • Do these factors already play a role in the selection of papers, or are they introduced during the editing process (as suggested by Roberts et al. 1994; Wager and Middleton 2002 with respect to readability)?

  • Does readability surface as an influential factor when using more advanced techniques, such as the soft fuzzy rough set model (Wang et al. 2012)?

  • Why does the influence of these factors vary so much between the three fields?

If scholars or their institutions want to contribute to scientific literature, and to be seen to contribute, and if they wish to promote their individual and collective reputations in rankings and evaluations, they need to be aware of how the invisible hand in science works, and how it can be influenced. Form and style also influence how well individual scholars and their institutions fare in the global competition that scientific publication has become.