Introduction

Sociologists often divide prose fiction into the broad categories of ‘highbrow’, ‘middlebrow’, and ‘lowbrow’, where the third of these equates to the products of the category publishing industry, the second equates to bestsellers and aspirant bestsellers, and the first equates to ‘literary’ fiction, i.e. novels and shorter works which are consecrated or legitimated by critics working within academia or writing for periodicals with an intellectual reputation. The distinction between ‘highbrow’ or legitimate culture and other forms of culture, like the role of professional critics in differentiating the two, appears to be maintained in the Internet era despite the emergence of forms of peer criticism such as online customer reviews (Verboord 2009).

Although employing computational linguistic methods, this study can be considered a conceptual replication of qualitative studies which have found a distinction between customer reviews that do and customer reviews that do not adopt the style of cultural consumption legitimated by professional critics. It argues that this style may be required for the appreciation of those cultural forms to which professional critics give legitimacy. Online customer reviews are not intentionally created as research data, and as such do not come accompanied by information of the sort that a sociologist would hope to use in order to explain any patterns arising within them: in particular, demographic information such as age, gender, and level of education. However, it is still possible to ask whether the patterns found to arise are compatible with what one would expect given a particular theory of the relationship between social position and style of cultural consumption.

The patterns to be studied here consist of associations between numeric ‘star ratings’ awarded by Amazon customers and the words which appear in the accompanying reviews. No claim is made that such a method can substitute for qualitative analysis (and thus no position is taken on the debates engaged in by Lee and Martin 2015). Indeed, once these patterns are identified, what one critic of quantitative text analysis refers to as ‘humanistic interpretation’ (Biernacki 2015) is deployed in order to make sense of the meanings that may be—to use the terminology of another such critic—‘embedded’ in the use of each lexical item (Reed 2015). Such an approach is possible here because reviews (and accompanying ratings) are aggregated by title, rather than being treated as an undifferentiated mass. This makes it feasible to study them (first quantitatively and then qualitatively) as collective discourses on specific books.

The following section of this paper explains the theory of cultural consumption which informed the design and conduct of the research. Beyond that theory, both quantitative and qualitative aspects of the research are based on the intuitive assumption that the numeric rating which a customer assigns to a book reflects his or her evaluation of it (one star = very negative, five stars = very positive), while the accompanying text provides insight into the style of cultural consumption in relation to which that evaluation was produced.

Elite and popular styles of cultural consumption

Following Bourdieu (1984 [1979]), many quantitative studies have been carried out to investigate the proposition that, in the contemporary western world, social position is asserted through the performance of particular forms of aesthetic discernment. Historically, much of the debate has focussed on whether high social standing is associated with exclusive consumption of ‘highbrow’ cultural goods, or whether this association has broken down thanks to the adoption of an ‘omnivorous’ cultural diet by social elites (Bennett et al. 2009; Bryson 1996; Coulangeon and Lemel 2007; Lizardo 2005; Peterson and Simkus 1992; Prieur et al. 2008; Tampobulon 2008; Warde et al. 2007, 2008). When it comes to books, for example, it has been found that genre preferences are associated with social status (Bennett et al. 1999; Bukodi 2007), but that gender also plays a role, with less highly educated men and women being more likely to make obviously ‘gendered’ reading choices (sports books and romantic fiction, respectively), and with more highly educated men and women tending ‘disproportionately [to] read books demanding symbolic mastery, particularly classic and contemporary fiction, art/photography books, poetry, politics books, and history books’ (Atkinson 2016, p. 263).

This debate continues, with recent analysis of survey data finding continued correlation between social position and the consumption of ‘highbrow’ culture, across both decades (Weingartner and Rössel 2019) and national borders (Reeves 2019). However, it has been argued that this research tradition neglects the importance of styles of consumption of cultural goods: that is, not the goods consumed but the manner in which they are appropriated by the consumer (Jarness 2015). Holt writes as follows:

when someone details Milos Forman’s directorial prowess in The People vs Larry Flynt to a friend over dinner (or, conversely, offers a damning harangue of Forman as an unrepentant proselytiser of the dominant gender ideology), this discussion not only recreates the experiential delight that the movie provided, but also serves as a claim to particular resources (here, knowledge of directorial styles in movies, and the ability to carefully analyse these characteristics) that act as reputational currency. Such actions are perceived not as explicit class markers but as bases for whom one is attracted to and admires, whom one finds uninteresting or doesn’t understand, and whom one finds unimpressive and so seeks to avoid. Thus, status boundaries are reproduced simply through expressing one’s tastes. (1997, p. 102)

It is a clear implication of Bourdieu’s position that works produced for an elite or popular audience are designed in anticipation of different styles of consumption. ‘Highbrow’ or legitimate culture is intentionally challenging to appreciate and understand, and as such ‘excites the symbolic mastery, the “cultural need” of the habitus comprised of large stocks of cultural capital, as it is read and recognised as a decipherable symbolic code’ (Atkinson 2011, p. 175). In a certain sense, then, choice between legitimate and non-legitimate cultural goods—the focus of research in the ‘omnivores’ tradition—is only a proxy for style of cultural consumption. It is not only that cultural capital—Bourdieu’s term for the store of explicit, implicit, and embodied cultural knowledge that is the permanent fruit of formal education—is required for the appreciation of ‘highbrow’ cultural goods, but also that it inclines its possessor to seek out such goods—which is to say, to seek out goods which have been designed in such a way as to be enjoyable only through a style of consumption which demands the possession of cultural capital. Bourdieu writes as follows:

The perception called for by the work produced within the logic of the field [of art and literature] is a differential perception … drawing into the perceiving of each singular work the space of compossible works, and hence attentive and sensitive to the deviations in relation to other works, contemporary but also past. … [I]t is increasingly rare that delectation does not have as a precondition the consciousness and the knowledge of the historical games and stakes of which the work is the product, of the ‘impact’ … that it has and which clearly cannot be grasped without historical comparison and references. (Bourdieu 1995, p. 248)

Thus, legitimacy in cultural production is defined by orientation to consumers both able and inclined to bring to the act of consumption a subjectivity founded in legitimate cultural knowledge. Like that knowledge, this subjectivity is inculcated through classroom socialisation (see Allington 2012). Its exercise is modelled through the work not only of academic literary critics but also of professional book reviewers: on the basis of 30 in-depth interviews with the latter, Chong argues that, ‘[w]hile subjectivity of taste is an accepted condition of the art world, legitimate artistic judgement’ necessarily involves ‘regulat[ing] the place of subjectivity in knowledge making about artistic quality’ (2013, p. 278). Although Bourdieu acknowledges the existence of naive artists who do not understand why their creations are valued by ‘highbrow’ consumers, he suggests that it is artists such as Marcel Duchamp, fully able to anticipate the conditions of the reception of their work, who are more likely to achieve success in the field:

By … reinforcing the ambiguity which makes the work transcendent over all interpretations, including those of the author himself, [Duchamp] methodically draws on the possibilities of a willed polysemy which, with the appearance of a corps of professional interpreters – meaning [interpreters] professionally determined to find meaning and necessity, however much work of interpretation or overinterpretation is involved – is found inscribed in the field itself, and therefore in the creative intention of producers. (Bourdieu 1995, p. 247)

None of the above applies within the sphere of ‘middlebrow’ culture, the ideal producer of which is not an autonomous artist working in pursuit of an uncompromising vision but an ‘entertaining technician’ (Bourdieu 1993, p. 130), skilled in producing goods with such characteristics as will readily provide an enjoyable experience for the general listener, viewer, or reader. This implies that works destined for a ‘middlebrow’ audience are likely to disappoint the expectations of the ‘highbrow’ consumer, and vice versa: one thinks, for example, of the social climber whom Proust ironically describes imagining the work of the ‘great poets’ to consist of ‘romantic and heroic verse in the style of the Vicomte de Borelli, only even more moving’ (1996 [1913], pp. 289–290).

The Inheritance of Loss and The White Tiger

Kiran Desai’s 2006 novel, The Inheritance of Loss, and Aravind Adiga’s 2008 novel, The White Tiger (henceforth, IoL and WT), belong to the specific genre of ‘highbrow’ or literary fiction that is usually referred to as Indian Writing in English or Indian English Literature (see e.g. Dwivedi and Lau 2014): a label whose success can be compared to that of ‘literatura latinoamericana’ in facilitating the marketing and critical valorisation of the works and authors that it groups together (see Santana-Acuña 2020). The two novels are frequently associated with one another, both because of their similarities of theme and because they won one of the Anglophone world’s most prestigious literary prizes in successive years: the Man Booker prize was won by IoL in 2007 and by WT in 2008. Although some have argued that cultural prizes reproduce the values of the market by rewarding blockbusters, major literary prizes are not typically awarded to best-selling books, and it has indeed been argued that ‘the awards industry has helped to shape a scale of value ever further removed from the scale of bestsellerdom’ (English 2005, p. 331). Winning a major prize is one means by which a book may progress towards classic status, which, as Santana-Acuña (2013) argues, involves transcending its original ‘vernacular organisational context’ through appropriation by agents outside that context (here including all those individuals involved in the awarding of the prize, as well as the institution of the prize itself). But this also brings financial reward: IoL sold 534 copies in the week before its Man Booker win, which rose to 4726 copies in the week after and eventually resulted in sales of 182,044 copies by 2012, while WT made a still more dramatic rise from 463 copies in the week before to 8033 in the week after and enjoyed total sales of 551,061 by 2012 (Guardian Datablog 2012). The greater sales for WT might be taken to indicate greater orientation to (or at least, compatibility with) the values of the mass market, but its relative success must be seen in the context of what can be achieved by true ‘middlebrow’ works: a roughly contemporaneous novel by Da Vinci Code author Dan Brown reportedly sold over a million copies in a single day (Reuters Staff 2009).

To introduce both the books themselves and the style of consumption for which they are intended, I shall now consider synopses of IoL and WT which were published in scholarly journals devoted to the study of literature. Within the context of this article, these synopses serve at once to outline the content of the books and to exemplify legitimacy in cultural consumption, revealing that content as it takes shape before the eyes of the ‘corps of professional interpreters’ referred to by Bourdieu:

The Inheritance of Loss is structured in such a way that it explores interconnections between colonialism, nationalism, postcolonial conflicts, globalization, cultural imperialism, class-based exploitation, cosmopolitanism, migrancy, and diaspora. The narrative begins in the 1980s and moves back and forth in time and in space between a Nepali separatist movement in rural northwest India, a migrant experience in the basements and kitchens of New York restaurants, and the colonial past in which anglicised Indians could be made to feel deeply alienated from their families and themselves. Linking together this complex plot is a varied cast of characters including the teenage Sai and her embittered grandfather, judge Jemubhai Patel, who live together in a crumbling mansion in Kalimpong; their cook’s son, Biju, who has migrated to New York in pursuit of a more prosperous life; Sai’s young tutor Gyan, whose romance with her is thwarted by his involvement in the Nepalese separatist movement; and various minor characters including a host of charmingly eccentric neighbours, most notably Father Booty and Uncle Potty. These latter individualistic misfits transcend the divisions and alliances based on class, nationality, and ethnic background, among other artificial markers of human difference, which influence the interactions between other characters. (Jackson 2016, p. 26)

[The] White Tiger … employs an epistolary form – the story is told as a series of letters which the protagonist, Balram, purports to write to the Chinese Premier – and a relatively simple time-frame in which Balram recollects his early life and deeds from the security of his new identity as a successful businessman. Apart from these narrative devices, the novel is a very direct account, appropriate for one narrated by a rural uneducated (if sly) villager, with little by way of philosophical or moral reflection, deep symbolism or great psychological insight into Balram or human nature – although of course the reader is invited to make such reflections and insights precisely because of Balram’s narrative shortcomings.

What characterises the narrative is the chilling frankness and simplicity with which Balram recounts his career of poverty and oppression, desperation, and finally murder … . It is the contradictory nature of Balram … which primarily sustains the reader’s interest in the novel. Adiga creates a protagonist who, in his social background could well be India’s Everyman, but if so, his inner workings and career suggest the deep moral malaise which lies at the heart of modern India. (Goh 2011, pp. 333–334)

Each of the above embodies the style of cultural consumption in anticipation of which ‘highbrow’ novels are designed. Plot, character, and narrative structure are all viewed in relation to theme: they are not seen as discrete elements to be enjoyed in themselves or appraised in relation to external criteria, but as facets of an imaginative work every aspect of which is to be understood as expressive of an underlying meaning. This meaning is explicated through discussion of these formal features, and those formal features are discussed through explication of that meaning.

One novel has a ‘complex plot’ that ‘moves back and forth in time and in space’ not for the pleasure of the reader but as a means of ‘explor[ing] interconnections’ between myriad social and philosophical topics. In other words, it is bewildering in construction, but the sufficiently perceptive reader may perceive a unity beneath its fragmented surface, in recognising the latter to have been deliberately constructed in such a way as to inspire meditation on important ideas. For example, there are certain characters whose function is to provide contrast with ‘the divisions and alliances ... which influence the interactions between other characters’. The other novel, by contrast, is ‘a very direct account’ of events taking place within ‘a relatively simple time-frame’, and yet, through its ‘chilling frankness and simplicity’, it invites ‘reflections and insights’ from the reader ‘precisely because of [the focal character’s] narrative shortcomings’. In other words, it is very simple in construction and thus offers relatively little on a surface level, but its simplicity is such as to lead the reader into meditation on ‘the deep moral malaise which lies at the heart of modern India’.

Other readings of the books are, of course, possible, thanks to the ‘willed polysemy’ that Bourdieu associates with self-aware literary and artistic creation. For example, one critic first observes that several other critics have read WT as engaged in a form of Orientalist othering of low-status Indians, and then rebuts them by interpreting it as ‘a response to the circulation of critiques of the writer as Orientalist’ (Brouillette 2011, p. 41). Responses to a cultural product such as a novel are legitimate not to the extent to which they reproduce some Platonic ideal of a ‘correct’ reading of the work in question, but to the extent that they embody a culturally legitimate style of consumption. It is not, in other words, that the ‘highbrow’ reader must interpret these books as having the precise meanings attributed to them in the above synopses, but that these synopses embody the style of the reader who is, as Bourdieu put it, ‘determined to find meaning and necessity, however much work of interpretation or overinterpretation is involved’ (quoted above; see “Elite and popular styles of cultural consumption”). Because the two novels were constructed in anticipation of such a reader’s preferences, it seems likely that they may fail to satisfy the needs and expectations of readers who do not approach cultural goods in the same way.

A prize such as the Booker brings with it a wave of publicity: enough to carry a ‘highbrow’ novel beyond the relatively narrow circle of ‘highbrow’ readers. But this is likely to mean being read by people who do not read novels in the way that ‘highbrow’ novels are designed to be read. Kovács and Sharkey’s analysis of ratings on the Goodreads website suggests that ‘[w]hen audience members evaluating an object are attracted to it because of its status rather than its substantive features, mismatches between the focal object and the taste of the audience members are more likely to occur’ (2014, p. 3). The current article goes further by theorising these ‘mismatches’ in terms of the contrast between legitimate and non-legitimate prose fiction. It argues that, given such an understanding of the relationship between the kinds of books that win prizes like the Booker and the kinds of books that most readers enjoy, it is reasonable to expect to find in lay reviews of prize-winning ‘highbrow’ works such as IoL and WT evidential traces not only of a more legitimate style of consumption similar to that of professional literary critics and relatively more likely to be satisfied with these novels but also of a less legitimate style of consumption relatively more likely to be disappointed or frustrated by those novels’ failure to conform with expectations formed from subjective enjoyment of more ‘middlebrow’ works.

Amazon customer reviews

Incongruous though it may now seem, the giant web services, e-commerce, and video streaming corporation, Amazon, began as a book retailer. In common with other online retailers, it provides customers with the ability to ‘rate’ products by awarding them from one to five ‘stars’, and also to ‘review’ products by entering text which will be displayed for other customers to see. Ratings are aggregated both as an average and via a sort of histogram, while reviews (together with the individual ratings that accompany them) are presented in the form of a list that may run to many pages in length. These reviews are freely available to read for users of the website.

Online customer reviews of this kind have been most intensively studied within disciplines such as marketing. This is because such reviews are acknowledged to have considerable commercial importance as a form of ‘electronic word-of-mouth’ (eWOM) communication through which information and opinions of brands and products may be disseminated (see e.g. Ismagilova et al. 2017). For example, multiple studies have investigated the persuasiveness of online customer reviews, their influence or potential influence on purchasing decisions, and the ways in which internet users make use of them as an information source (see e.g. Filieri and McLeay 2013; Gottschalk and Mafael 2017; Jiménez and Mendoza 2013; Pyle et al. 2021; Qiu et al. 2012; Tsao et al. 2015; Ye et al. 2009; Zhang et al. 2010). However, customer reviews of books—especially on the various Amazon websites—have also attracted attention from those who are interested in ‘popular’ modes of literary consumption, because they provide easy and direct access to a form of reception data (see Staiger 2005 for an authoritative introduction to the field of reception studies).

Indeed, the study presented in this article follows several earlier studies in using Amazon customer reviews of books as data in order to study the contrast between elite and popular modes of literary consumption. However, those earlier studies may be contrasted with it in that they did not use computational methods to study the discourse of reviews on a linguistic level, instead relying on forms of content analysis or thematic analysis (see Braun and Clarke 2006; Neuendorf 2017 [2002] for methodological introductions to these analytic approaches). Steiner (2008), for example, contrasts Amazon customer reviews of a single ‘literary’ novel with those of multiple ‘chick-lit’ novels, finding that reviews of the former tend to follow the methods of academic literary criticism in grounding positive comments in observations about the characteristics of the text, in particular the author’s prose style, while reviews of the latter tend to evaluate books positively or negatively according to whether or not they meet a range of criteria that are recognised as important for an enjoyable reading experience, such as the presence of a sympathetic main character. Gutjahr (2002) examines many hundreds of Amazon customer reviews of a single work of Christian popular fiction and its sequels, and finds positive comments to focus on the fast-moving plot and the persuasiveness of the biblical interpretation which underlies it; while he does not examine elite responses to the same novels, he notes that scholars attempting to study the same novels for research purposes appear by contrast to have found them much less enjoyable to read. Both Steiner’s and Gutjahr’s findings recall the contrast between ‘highbrow’ and popular cultural goods, and the modes of consumption for which they are designed. Elsewhere, I have contrasted professional reviews and Amazon customer reviews of IoL, finding that different standards seem to be applied: professional reviews were over three times more likely to praise the book for its humour, more than twice as likely to praise it for its narrative, and nearly four times less likely to criticise it for its characters, than customer reviews (2016, pp. 269–270).

All of the above studies employ forms of analysis which require human experts to read the reviews that comprise the data. There have also been many studies which employ a computational linguistic methodology to study customer reviews, but these have by contrast been primarily concerned with the theoretical and technical problems faced by such analysis when applied to this particular form of data, and with potential commercial applications for technical approaches capable of surmounting those problems. This emphasis is apparent in the sheer number of studies concerned with the evaluation of alternative methods for accuracy in automatic classification of reviews, especially with regard to review polarity, but also sometimes with regard to other features, such as sarcasm and unfairness (Alghamdi 2019; Almjawel et al. 2019; Chiavetta et al. 2016; Elmurngi and Gherbi 2018; Jagdale et al. 2019; Lee et al. 2017; Rathora et al. 2018; Shrestha and Nasoz 2019; Sygkounas et al. 2016; Ul Haque et al. 2018).

One exception to this trend is Teso et al.’s (2018) study of book reviews on the Ciao UK website, which focuses on gender differences, identifying linguistic, content, and (inferred) psychological characteristics which appear to differentiate reviews authored by male and female customers. A second notable exception is Gao et al.’s (2018) analysis of tens of thousands of Amazon customer reviews of three specific consumer electronics products in order to understand the ways in which users evaluated one of the three. The success of the latter analysis suggests the possibility of using similar methods to study large numbers of reviews of a limited number of books with the aim of identifying styles of cultural consumption, as well as the ways in which the books in question satisfy or frustrate those styles.

Methodology

Data collection and cleaning

An R script was written to download every page of reviews of each of the two novels from the Amazon.co.uk website in November 2020. At the time of data collection, there were 29 pages of reviews of IoL and 144 pages of reviews of WT. A second R script was written to scrape data from each page. Altogether 1648 reviews were extracted, of which 280 related to IoL and 1368 related to WT. For each review, the date, rating, and text were extracted. Reviews of more than ten words, of which fewer than 10% were non-lexical English-language words (such as ‘and’ and ‘the’), were assumed not to be in English and were therefore removed from the dataset. This resulted in the removal of 28 reviews, of which three were associated with IoL and 25 with WT.
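The language filter described above can be sketched as follows. This is a minimal illustration in Python rather than the R script used for the study; the function name, the tokenising regular expression, and the short list of non-lexical words are all assumptions made for the example, since the list actually used is not reproduced in this article.

```python
import re

# Hypothetical, deliberately short list of non-lexical (function) words.
# The list actually used in the study is not given in the text.
NON_LEXICAL = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "i",
               "this", "that", "was", "for", "but", "with", "as", "on"}

def looks_english(text, min_words=10, min_share=0.10):
    """Return False for reviews of more than `min_words` words in which
    fewer than `min_share` of the words are non-lexical English words."""
    # Runs of letters, optionally with an internal apostrophe (assumed tokeniser).
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    if len(words) <= min_words:
        return True  # short reviews are kept regardless of their content
    share = sum(w in NON_LEXICAL for w in words) / len(words)
    return share >= min_share
```

Under these assumptions, a review is discarded only when it is long enough to judge and contains almost no common English function words, which is why very short reviews pass through the filter untouched.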

Review text was transformed into lower case, then tokenised (that is, divided into individual words), with the individual words being automatically lemmatised (that is, their grammatical inflections were removed, so that e.g. ‘won’ becomes ‘win’ and ‘characters’ becomes ‘character’). Non-lexical words were removed using the same list that was used to identify English-language reviews, and all remaining words of fewer than two letters were removed on the grounds that the English language has only two words of one letter, both of which are non-lexical.
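The preprocessing steps just described might be sketched as below. Again, this is an illustrative Python sketch and not the code used in the study: the stopword list is a hypothetical stand-in, and the lemmatiser is reduced to a toy lookup table containing only the two examples given in the text, whereas a real analysis would rely on a full lemmatisation tool.

```python
import re

# Hypothetical stand-ins: the study's list of non-lexical words and its
# lemmatiser are not specified in the text.
NON_LEXICAL = {"a", "and", "the", "of", "to", "is", "it", "i", "this", "was"}
LEMMAS = {"won": "win", "characters": "character"}  # toy lookup table

def preprocess(text):
    """Lower-case, tokenise, lemmatise, then drop non-lexical words
    and any remaining words of fewer than two letters."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())   # runs of letters
    lemmas = [LEMMAS.get(t, t) for t in tokens]       # unknown words pass through
    return [w for w in lemmas if w not in NON_LEXICAL and len(w) >= 2]
```

For instance, `preprocess("The characters WON a prize")` yields the three lexical lemmas `character`, `win`, and `prize`.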

Data analysis

The analytic procedure was adapted from Gao et al. (2018; see “Amazon customer reviews”, above, for discussion). For each title, the 200 most frequent lexical words were identified in reviews (a) with accompanying ratings of four or five stars, and (b) with accompanying ratings of one to three stars. These were treated as potential keywords. Words with fewer than twenty occurrences for a given title were excluded from subsequent analysis. Ratings were centred (that is, the mean rating for the title in question was subtracted from each individual rating), and for each potential keyword and title, the mean centred rating was calculated for all reviews of that title in which it appeared. For each title, the 25 remaining words with the highest and the 25 remaining words with the lowest mean centred ratings were retained for analysis (with ties broken by number of occurrences). Being associated with ratings that were most divergent from the mean, these were treated as keywords and assumed to reflect the evaluative and interpretative vocabularies used by reviewers who did and did not find each title to their taste. No statistical tests were required as the data were collected from whole population samples for each title.
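The keyword selection procedure can be sketched for a single title as follows. This is a hedged Python illustration, not the study's own code. It assumes that ‘occurrences’ means the total number of times a word appears across all reviews of the title, and that ties are broken in favour of the more frequent word; neither detail is stated explicitly above. The function and parameter names are hypothetical.

```python
from collections import Counter
from statistics import mean

def keywords(reviews, n_candidates=200, min_count=20, n_keep=25):
    """reviews: list of (rating, tokens) pairs for ONE title, with tokens
    already preprocessed. Returns (positive, negative), each a list of
    (word, mean centred rating) pairs."""
    # Word frequencies in higher-rated (4-5 stars) and lower-rated (1-3 stars) reviews.
    high = Counter(t for r, toks in reviews if r >= 4 for t in toks)
    low = Counter(t for r, toks in reviews if r <= 3 for t in toks)
    candidates = ({w for w, _ in high.most_common(n_candidates)}
                  | {w for w, _ in low.most_common(n_candidates)})

    total = high + low                       # occurrences across all reviews (assumption)
    title_mean = mean(r for r, _ in reviews)

    scored = []
    for w in candidates:
        if total[w] < min_count:
            continue                         # too rare for this title
        centred = [r - title_mean for r, toks in reviews if w in toks]
        scored.append((w, mean(centred), total[w]))

    # Highest (or lowest) mean centred rating first; ties go to the
    # more frequent word (assumed direction of the tie-break).
    positive = sorted(scored, key=lambda s: (-s[1], -s[2]))[:n_keep]
    negative = sorted(scored, key=lambda s: (s[1], -s[2]))[:n_keep]
    return [(w, m) for w, m, _ in positive], [(w, m) for w, m, _ in negative]
```

Because ratings are centred on the mean for each title, a word's score expresses how far the reviews containing it diverge from the typical rating of that particular book, which keeps the keyword lists for a well-liked title and a more divisive one comparable.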

Once positive and negative keywords had been identified, it was possible to search through reviews for concrete examples of their usage, and thus to establish a qualitative impression of how they were typically being used. Quotations of selected examples are presented with minimal editing (e.g. correction of spelling or typing errors) in order to provide the reader with a direct impression of the forms of discourse employed.

Dataset

Table 1 shows the total number of reviews for each title, with mean rating and mean word length (after exclusion of non-English reviews but before removal of non-lexical words). It can be seen that there were nearly five times more reviews of WT than of IoL, and that the reviews were more positive, with a mean rating of 4.2 rather than 3.4, but also on average about 23% shorter. The greater frequency most likely reflects WT’s greater sales: there will have been far more potential reviewers of WT than of IoL, because it had more readers. The standard deviation for IoL was higher on both measures, indicating lower levels of agreement among its customer reviewers.

Table 1 Reviews of IoL and WT, with mean ratings and lengths (in words)

Figure 1 is a bar chart showing the distribution of reviews by rating. With regard to both titles, five-star reviews were most common. However, the trend towards positivity was far more pronounced in the case of WT: while IoL gained very similar numbers of two-, three-, and four-star reviews and an only slightly greater number of five-star reviews, WT gained 1.3 times more three-star reviews than one- and two-star reviews combined, 2.7 times more four-star than three-star reviews, and 1.8 times more five-star than four-star reviews. Figure 2 shows that in both cases, reviews accompanying ratings that were nearer to the middle of the scale tended to be longer, with the shortest reviews being those which accompanied one- and five-star ratings. The numbers behind each of these visualisations are presented in Table 2.

Fig. 1 Frequency of ratings, IoL and WT

Fig. 2 Mean review length (in words) by rating, IoL and WT

Table 2 Reviews of IoL and WT by rating, with mean lengths (in words)

Figures 3 and 4 show numbers of reviews and mean ratings (respectively) for each title by year, with the underlying numbers presented in Table 3. WT received more reviews and higher ratings than IoL in every year during which both were in print. In view of the argument above that the publicity associated with Man Booker success may have introduced both books to audiences beyond the community of ‘highbrow’ readers, it is interesting that the two years in which WT received the lowest ratings were 2009 and 2008—that is, the year following and the year of its Man Booker win (which was also its year of publication)—while IoL received its lowest ratings in 2008 and 2007—again the year following and the year of its Man Booker win (in the year of its publication, it received very few reviews, but they were very positive).

Fig. 3 Reviews per calendar year, IoL and WT

Fig. 4 Mean rating by calendar year, IoL and WT

Table 3 Reviews for IoL and WT by year, with mean ratings

Findings

The top 25 positive and negative keywords for each title are reported in Tables 4 and 5 (each of which juxtaposes positive or negative keywords for both titles) and visualised in Fig. 5 (which juxtaposes positive and negative keywords for each title).

Table 4 Top 25 positive keywords in reviews of IoL and WT by mean centred rating
Table 5 Top 25 negative keywords in reviews of IoL and WT by mean centred rating
Fig. 5 Mean centred rating for top 25 positive and negative keywords, IoL and WT
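To make the ranking of keywords by mean centred rating concrete, the following is a minimal sketch under a stated assumption: that a word's mean centred rating is the mean rating of the reviews in which it appears, minus the mean rating across all reviews of the same title. This reading, the function name `mean_centred_keyword_ratings`, the `min_reviews` threshold, and the toy data are all assumptions made for illustration, and the Python stands in for the R actually used.

```python
from collections import defaultdict

def mean_centred_keyword_ratings(reviews, min_reviews=2):
    """reviews: list of (rating, iterable of words) for ONE title.

    Returns {word: mean rating of reviews containing the word
                   minus the mean rating of all reviews}.
    Words found in fewer than `min_reviews` reviews are dropped
    (a hypothetical threshold, chosen only for this example).
    """
    overall_mean = sum(rating for rating, _ in reviews) / len(reviews)
    ratings_by_word = defaultdict(list)
    for rating, words in reviews:
        for word in set(words):  # count each word once per review
            ratings_by_word[word].append(rating)
    return {
        word: sum(rs) / len(rs) - overall_mean
        for word, rs in ratings_by_word.items()
        if len(rs) >= min_reviews
    }

# Hypothetical toy reviews, already reduced to sets of content words.
toy_reviews = [
    (5, {"beautifully", "recommend"}),
    (4, {"recommend", "colonial"}),
    (1, {"booker", "finish"}),
    (2, {"booker", "finish", "recommend"}),
]

scores = mean_centred_keyword_ratings(toy_reviews)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print("most positive:", ranked[0])
print("most negative:", ranked[-1])
```

Because the scores are centred separately for each title, a positive value indicates a word associated with above-average ratings of that title, which keeps keywords comparable between two books whose overall mean ratings differ.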

Some of the top positive keywords are not surprising, and would appear unlikely to be title- or genre-specific. These include ‘recommend’ (which appears in the top positive keyword lists for both IoL and WT), ‘great’ (which appears in IoL’s), and ‘highly’ (which appears in WT’s). The appearance of ‘beautifully’ (IoL and WT) and ‘beautiful’ (IoL), however, might suggest something more specific about the evaluative criteria in relation to which ‘literary’ novels are valued, orientating towards them as aesthetic artefacts (as opposed to entertaining experiences). Other words used to evaluate the two writers’ artistry included ‘create’ (IoL), ‘describe’ (WT), ‘picture’ (WT), and ‘draw’ (WT; this usually referred either to how the characters were ‘drawn’ or to how the narrative technique ‘draws’ the reader into the world of the book, or ‘draws’ him or her back to the book itself). ‘Love’ (WT) generally referred to the reader’s feelings for the book or (sometimes) its main character. Interestingly, ‘funny’ only appeared as a keyword for WT, although both books contained comedic elements.

Beyond the above, the top positive keywords give us an idea of the kinds of things that people who liked these books found relevant to talk about. Several of the top positive keywords for both titles referred to their settings, in particular ‘Kalimpong’, ‘India’, and ‘world’ (IoL), and ‘village’, ‘travel’, and ‘move’ (WT), while seven of those for IoL referred to the novel’s four main characters: ‘father’ and ‘cook’, ‘Biju’ and ‘son’, ‘grandfather’ and ‘judge’, and also ‘Sai’. There were further top positive keywords that closely related to the themes of the two books: ‘immigrant’, ‘dream’, ‘culture’, ‘colonial’, and ‘class’ for IoL, and ‘darkness’, ‘dark’, ‘light’, and ‘social’ for WT (‘The Darkness’ and ‘The Light’ are that book’s terms for rural and urban India, respectively). Some examples of positive reviews featuring these words follow, showing the ways in which many positive reviewers sought not merely to evaluate, but also to interpret the two books:

It reminded me very much of The God of Small Things, exploring how the legacy of colonialism impacts current generations and permeates one's relationships and self-identity. (IoL, 4 star)

Kiran Desai handles several themes in this book – the colonial legacy that still dominates India, India’s own multiculturalism, being an outsider in one's own country, and some painful truths about immigration to and from India. (IoL, 4 star)

Far-reaching effects of colonialism mark the isolated grandfather judge, who learned self-loathing under the Raj (IoL, 5 star)

It covers the class system the poverty in India and America, the violence and corruption of all the classes through small details in character development. (IoL, 4 star)

It […] tells a fascinating story about relationships between upper- and lower-class Indians several decades after independence. (IoL, 5 star)

At once so many dimensions – a political novel, tackling issues of class, prejudice and race; one of the few works portraying the realities of the illegal immigrant underclass in America – and the hopes and dreams that started it all (IoL, 5 star)

[The narrator’s] letters … [cast] a grievous pall across both affluent, corrupt, urban India (the Light) and the Darkness, the traditional life of the villagers, painted as bigoted, often unpleasant and oppressed. (WT, 5 star)

Some very strong images and unsentimental views of life on the (much) poorer side of Indian life, through the eyes and mouth of one who shows the amazing gumption to plot his way out of the ‘darkness’ – the almost inescapable poverty and family trap that the majority of Indians find themselves in. (WT, 5 star)

The reader learns about the caste system, how those at its top are in The Light, those trapped at its base are in The Darkness. He learns that both worlds are controlled by corruption and retribution. (WT, 5 star)

The India portrayed is one of bribery, corruption and huge social injustices. (WT, 4 star)

it’s a book with a social and political message, showing corruption and misery in modern India (WT, 4 star)

It’s a brilliant satire and social commentary, which is occasionally hilariously funny but always totally engaging. (WT, 5 star)

These reviews speak with almost the same voice as the critical readings quoted above (see “The Inheritance of Loss and The White Tiger”). They enact the same style of cultural consumption, approaching characters not as (imaginary) individuals towards whom one might hope to feel affection, and plot not as a sequence of (imaginary) events of which one might enjoy vicarious experience, but both as expressions of an inner logic, such that appreciation and interpretation of the books in question are inseparable, a single process of explication de texte.

Aside from a few predictable-seeming items such as ‘bad’ and ‘quality’ (WT), the top negative keywords are just as interesting, and it is worth quoting extensively from the reviews in which they appear in order to show that they too constitute a coherent discourse. To begin with perhaps the most notable of the overlaps between the two lists, ‘win’, ‘Booker’, and ‘prize’ occur in both of the negative keyword lists, along with ‘winner’ and ‘award’ for WT. It might have been expected that words such as these would primarily appear in positive reviews, but in fact, reviewers who used them were more likely to do so in order to express bafflement or even anger that the authors had received such institutional rewards: a point that may perhaps be related to the finding that award-winning books are 57% more likely to attract online reviews expressive of disappointed expectations on Goodreads (Kovács and Sharkey 2014, p. 22). Examples follow:

The characters are flat, the ‘wit’ seems to consist mainly of being negative about various characters, and there isn’t enough tension to sustain a story. Worse, it's very gimmicky, with lots of words being arranged in arty, quirky ways on the page, usually the sign that content is lacking. Oh, and the over-writing. Puh-LEEZE! I really wonder why this book won the Booker – good lobbying by Mummy? (IoL, 1 star)

I’m confused as to why/how this book won a Booker Prize award. The author has a very disjointed writing style and continually uses references in Indian [sic] which made it extremely difficult to read. (IoL, 1 star)

I found this quite hard to finish for the simple reason that the plot was quite thin and the writing style not one that I enjoyed. This wasn’t the page turner that I expected it to be, given the various prizes it has been awarded. (IoL, 2 star)

The one redeeming feature of this novel is it somehow managed to win the Booker Prize – well done to the author for managing that, but what on earth were the judges thinking? (WT, 1 star)

I am at a loss to understand how this book has received the Man Booker Prize and the plaudits that it has. With the hero or antihero of a book you want to be able to care about what happens to them. In this case, I could not have cared less. (WT, 1 star)

Shame on the Booker judges to go with the Ugly and Raw. Like the fine art market all that is beautiful is considered outmoded. (WT, 1 star)

The word ‘cover’ appears with negative associations for both. Some occurrences of this word simply featured in complaints about the wrong edition having been delivered, or about the physical condition in which it arrived—a reminder that Amazon reviews are tied to the transaction of book purchase in a way that reviews on sites such as Goodreads are not—but many were expressions of dissatisfaction with one or other book’s marketing, suggesting that the work in question had been mis-sold:

the cover says the book is funny. It’s not (IoL, 1 star)

According to the back cover, the sordid and commonplace adventures of this guy ‘Illuminate on the consequences of Colonialism’ (IoL, 1 star)

the cover was attractive and ... I guess that’s it (IoL, 2 star; ellipsis in original)

Some of Adiga’s descriptive prose is excellent, but this is not really enough to make this a ‘blazingly savage and brilliant’ novel as described on the front cover (WT, 3 star)

all the accolades and hyperbole spattered across the cover lead you to believe that […] this is [a great novel]. Unfortunately, White Tiger is not a great novel (WT, 3 star)

After seeing […] the ecstatic quotes on the cover, I was expecting a lot more (WT, 3 star)

Implicit in the above-quoted negative references both to the Man Booker Prize and to the covers of both books is a critique of, hostility to, or perhaps resentment of the literary institutions (prize committees, publishers) that brought these books to the attention of readers who did not enjoy them. Perhaps relatedly (given that book clubs or reading groups often choose titles on which professional critics or literary prizes have bestowed legitimacy; see Long 2003), negative WT reviewers often made reference to the novel having been chosen by book clubs that they were members of (hence the appearance of the word ‘club’ among the top negative keywords). The words ‘disappoint’ and ‘expect’ were also top keywords among negative reviews of that novel, with negative reviewers often using these terms in order to highlight features which they felt that a ‘good’ book should possess, but which WT lacked:

I found this book a bit disappointing. […] It was a real struggle to plough through to the end and when I did get there I felt as if I had wasted my time. (WT, 1 star)

having won the Man Booker Prize, I expected [it would be] a good read. However, I was disappointed, it is a slow book with not a lot happening, we’re told the main event right at the beginning and one would expect some twists or other big events, but no. (WT, 2 star)

I found this novel disappointing. […] A good book needs either a stunning plot or really sympathetic characters, and for me, this has neither. (WT, 3 star)

I'm stunned this won the Booker prize […] although the book featured lots of nice imagery and description, where was the story? […] It’s not an awful read, but certainly not as good as I was expecting. (WT, 2 star)

As the story unfolds, you expect there to be some kind of twist that will change everything. It doesn’t come. (WT, 3 star)

While references to specific characters were, as we have seen, associated with positive reviews of IoL, references to ‘character’ were associated with negative ratings of that book—as too were references to ‘style’:

It was difficult to warm to any of the characters and I couldn’t care less what happened to any of them (IoL, 1 star)

There were no endearing characters, well maybe Biju a touch (IoL, 1 star)

Nothing much happens to the consistently loathsome characters until page 300 (IoL, 1 star)

The author has a very disjointed writing style (IoL, 1 star)

I found the poetic style of the early sections quite laboured (IoL, 2 star)

the writing style is sparse to the point of emotional emptiness (IoL, 2 star)

References to ‘narrative’ (IoL) or the ‘narrator’ (WT) were also very often complaints about the way in which the two books were written:

I found the narrative too dull (IoL, 1 star)

boring and confusing narrative (IoL, 1 star)

This is a narrative without any real plot (IoL, 2 star)

I struggled through the first 100 pages, mainly because of the awkward premise that the first person narrator style was in a letter to the Chinese premier (WT, 3 star)

The format of this novel through the narrator sending emails to Premier Jiabao of China I found gratuitously contrived (WT, 3 star)

a narrator who not only would be uninterested in the society around him anyway, but who is either a communist or an imbecile to boot (WT, 3 star)

The words ‘care’ and ‘interest’ (IoL) typically referred to what the book did not inspire in the reader. In a similar vein, the words ‘story’ (IoL) and ‘plot’ (WT) were typically used to refer to a specific aspect of both books which many negative reviewers considered to be poorly constructed or even absent:

a very depressing and confusing story (IoL, 2 star)

Biju's story was almost completely unrelated to the main narrative (IoL, 2 star)

there's no intrigue, no plot, NO STORY (IoL, 2 star)

One last bugbear of mine was the plot. Because the narrator tells us of the climactic event right at the beginning, I was expecting ... a plot twist towards the end … This, too, was non-existent (WT, 1 star)

The plot is ungripping (WT, 1 star)

No plot and a boring silly unrealistic story, no real characters, grotesque descriptions and vulgar for no purpose (WT, 1 star)

In this connection, it is worth noting that the words ‘murder’, ‘kill’, and ‘employer’ were all used to discuss the central event of the plot of WT, and appear as top negative keywords within its reviews. Reviewers who described the plot as an account of how this fictional event came to pass, rather than as an expression of themes of moral corruption, appeared to find the book unsatisfactory.

The word ‘finish’ appeared in both top negative keyword lists. This was because of the frequency with which negative reviewers referred to the difficulty they found in making themselves read the books in question, which they often explicitly attributed to the presence of features that they found disagreeable or to the absence of features that they would have found enjoyable:

I took so long to finish it, I nearly didn't!! Like some of the other readers, I only continued reading as I thought any minute now it will get interesting! But it didn't. I would have liked to see Sai and Gyan’s romance blossom a little more. (IoL, 2 star)

I found that I really had to push myself to finish this book (IoL, 2 star)

I didn’t finish the book, I found it boring and horrible in places. (IoL, 2 star)

The plot is quite interesting but the writing style is a bit old fashioned, with too long and too descriptive phrases. I could hardly finish that book. (WT, 1 star)

Too tedious to finish. (WT, 1 star)

The only reason I finished this book was for a class, and I despise my professor for it. Boring book with no redeemable characters, and written from the perspective of an egomaniac with a 3rd grade reading level. (WT, 1 star)

The last of these is particularly interesting when considered alongside the critical synopsis of the same book quoted in “The Inheritance of Loss and The White Tiger”: rather than a literary device whose ‘narrative shortcomings’ constitute an invitation to social and philosophical reflection, this customer reviewer perceives the narrator only as an ‘egomaniac with a 3rd grade reading level’, and angrily rejects the institutional context that privileges texts of this kind and attempts to socialise students into the style of cultural consumption that such texts are designed to afford.

Conclusion and scope for further work

This study has focussed on points of comparison between reviews of two literary or ‘highbrow’ novels, and on points of contrast between positive and negative reviews of both, rather than on points of contrast between literary and non-literary novels. Its findings have suggested explanations relating to styles of cultural consumption. This has meant a focus on the characteristic expectations of readers who do and do not typically read literary fiction, and on the ways in which such expectations may have been satisfied or frustrated by these two specific exemplars of that genre.

Within each corpus of reviews, it is possible to discern a contrast between two styles of cultural consumption. The contrast can perhaps best be illustrated with two short quotations from customer reviews of IoL. The first reviewer rejected the book, explaining that he or she ‘do[es]n’t do miserable reads any more’ because ‘life is too short’. The second praised it as ‘a beautifully written book that is difficult to read because of the depth of pain and truth in its descriptions’. Customer reviewers who evaluated these two books positively often implied that they appreciated them in part because they were so difficult to enjoy, while those who evaluated them negatively often suggested that it was for precisely the same reason. Enjoyment of books such as these appears identical with enjoyment of engaging in the learned interpretation of their formal features. This style of cultural consumption is modelled by what Bourdieu referred to as ‘professional commentator[s] on texts’ (1995, p. 194), both in published literary criticism and also in lecture theatres and classrooms (see Allington 2012). Some readers are able, and choose, to engage in it. Others are not, or do not. The novels whose Amazon customer reception has been investigated here would appear to have little to offer to the latter group: hence the disappointment and frustrated expectations expressed by so many of those readers who gave them below-average ratings, and hence also the anger and resentment which such readers sometimes expressed towards the prize which both novels won.

This study’s findings are consistent with what one would expect if there are to be found among the readers of IoL and WT both some who have assimilated the values of the field of art and literature as conceived by Bourdieu, and who therefore approach novels such as these in a manner modelled on that of ‘professional interpreters’, and some who have not assimilated those values, and therefore approach novels such as these in a manner that might perhaps be better suited to ‘middlebrow’ titles. However, reliance on data from this source makes it impossible to investigate the important question of whether these styles of consumption are indeed associated with social status and with level and type of formal education (as Bourdieu’s work would suggest). As with other uses of what has been called ‘by-product data’ (Beer and Taylor 2013), the greatest limitation of this study therefore lies in the lack of information on the individuals whose online activities brought the data into being. Nonetheless, it may still serve to inform studies employing methods such as interviewing and sample surveying by providing an idea of which topics to focus on in questions: not only which types of cultural goods are preferred, but which characteristics are sought out and valued in those goods.

Finally, it is worth considering how much further it might be possible to take a study of this nature without yet returning to more conventional forms of sociological data. To gain a better sense of contrast between literary and non-literary fiction and their reception among Amazon customer reviewers, it might be necessary to use a representative sample (for example, a random sample) of reviews of a much larger number of works, including both literary and non-literary works. However, such a study would not be able to go into the same depth on the linguistic detail of reviews, and would likely lose many of the finer points that were brought out in the current study: for example, it has here been possible to identify that positive reviewers had a tendency to write about specific characters in one novel, based on the appearance of those characters’ names in the top positive keywords list (e.g. ‘Biju’ and ‘Sai’), and to identify that positive reviewers of both novels had a tendency to write interpretatively about themes in the novels in question, based on the appearance of words relating to the specific themes of those novels (e.g. ‘colonial’ and ‘darkness’). Such title-specific lexis would be likely to disappear from view if reviews of multiple titles were considered together. Thus, it might be that future research will need to focus on negative keywords, as there appeared to be more commonality between the negative keyword lists for the two titles studied here than between the corresponding positive keyword lists. Alternatively, some form of quantitative content analysis could be employed, in order, for example, to identify instances where reviewers engaged in critical interpretation of the texts they reviewed, rather than relying on the ability to detect words that were plausibly related to the specific themes of individual literary works.

Technical note

Collection and analysis of data were done using R v. 3.6.1 (R Core Team 2019) together with the rvest package, v. 0.3.5 (Wickham 2019) for extraction of data from web pages, the tidytext package, v. 0.2.2 (see Silge and Robinson 2016) for tokenisation and preparation of text data, the textstem package, v. 0.1.4 (Rinker 2018) for lemmatisation of words, and the ggplot2, v. 3.2.1 (see Wickham 2016) and patchwork, v. 1.1.0 (Pedersen 2020) packages for visualisation. Non-lexical English words were identified using the ‘smart’ stopwords list from the stopwords package, v. 1.0 (Benoit et al. 2019).
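For readers unfamiliar with this kind of text preparation, the tokenisation and stopword-removal steps can be illustrated with a minimal sketch. It is written in Python rather than R, does not call any of the packages named above, omits lemmatisation entirely, and uses a very small hypothetical stopword set in place of the ‘smart’ list; the function name `content_words` is likewise an assumption made for illustration.

```python
import re

# Tiny illustrative stopword set: a hypothetical stand-in for the much
# longer 'smart' list used in the study.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "it", "this", "i", "was"}

def content_words(text, stopwords=STOPWORDS):
    """Lower-case the text, split it into word tokens, drop stopwords.

    Lemmatisation (e.g. mapping 'draws' and 'drawn' to 'draw') is NOT
    performed here; the study handled that step with a dedicated package.
    """
    # Match runs of ASCII letters, optionally with one internal straight
    # apostrophe (so that "isn't" stays a single token).
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [token for token in tokens if token not in stopwords]

example = "This is a beautifully written book, and I recommend it."
print(content_words(example))
```

A pattern restricted to ASCII letters and straight apostrophes is a simplification: review text containing accented characters or typographic apostrophes would need a broader pattern.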