1 Introduction

Online reviews, a form of electronic word-of-mouth (eWOM), have become an influential information source for customers making purchasing decisions and, in turn, promote the sale of products or services [71, 85]. In the tourism industry, customers read online reviews from other travelers to help them make travel plans and reduce uncertainty in their choices [6, 33]. Research has indicated that online hotel reviews can increase customers’ booking intention and enhance hotel performance [1, 47, 98]. Despite their benefits, “the more the better” does not apply to online reviews: too many reviews can cause information overload, reducing the efficiency of customers’ decision-making [12, 23]. Many e-commerce platforms (e.g., TripAdvisor and Ctrip) therefore allow customers to give “helpfulness” votes to online reviews. Customers are more likely to give “helpful” votes to reviews that are informative and readable [59].

A helpful review is a “peer-generated product evaluation that facilitates the consumer’s purchase decision process” [70]. Helpfulness is often measured by the number or ratio of helpful votes [32]. In the hospitality field, studies have explored the determinants of hotel review helpfulness, including review-related factors such as review length [90, 100], readability [36, 82], rating [48], extremity [20, 59], valence [8, 51], and emotions [10, 46], as well as reviewer-related factors such as reviewer expertise [83], reputation [59], and identity disclosure [52]. However, some other potential factors have rarely been examined. In particular, although prior research has explored how review content influences review helpfulness, most studies focused on the linguistic features of reviews (e.g., the number of characters, syllables, words, or sentences in a review) [33, 48, 59], and only a few explored the semantic features of reviews (i.e., the meaning of the text) [34, 82, 90]. Prior research indicated that reviews focusing on one or two topics were more attractive to readers than reviews with disorganized information [39], suggesting that the topics in review text may contribute to review helpfulness [90, 92]. Accordingly, the present study aims to explore the following research question:

RQ How does review content type impact the helpfulness of hotel reviews?

Our research contributes to the literature in the following ways. First, we advance the understanding of the drivers of review helpfulness by focusing on the semantic features of reviews (i.e., review content type), which have rarely been examined in previous studies. Second, we shed new light on the moderators in review helpfulness research by discussing the interaction effects between review content type and other determinants, whereas existing studies mainly consider the moderating effects of hotel features (e.g., hotel price, hotel size, or hotel class) [20, 34, 100]. Finally, our results may help hotel managers use the information from helpful hotel reviews to design more effective marketing strategies to attract customers and improve their loyalty.

2 Related research on helpful hotel reviews

A “helpful” hotel review provides more valuable information than a general hotel review and helps customers make decisions about hotel bookings [32, 48, 52]. Some scholars have discussed the determinants of hotel review helpfulness, including the characteristics of reviews and reviewers as well as potential moderators, which we summarize in Table 1. Next, we discuss these studies in more detail.

Table 1 Previous studies on the determinants and moderators of hotel review helpfulness

2.1 The determinants of hotel review helpfulness

In terms of review-related factors, a review consists of four components including linguistic features, semantic features, sentiment, and its source (the reviewer information) [90]. Previous hotel review helpfulness studies mainly focus on aspects of linguistic features, sentiment, and source of reviews, but limited attention has been paid to the semantic features.

Linguistic features relate to characteristics of the review text (e.g., the number of words, readability, relevance, and completeness). In this regard, relevant variables in prior hotel review helpfulness research include review length [94, 96], review depth [54, 59], review relevance [83], and review readability [36, 82]. Yin et al. [96] demonstrated that review length (measured by word count) had a positive effect on consumer perceptions of review helpfulness. Srivastava and Kalro [83] indicated that a review that is easily readable, relevant, comprehensive, and complete is more helpful. Sentiment relates to the valence or emotions (positive/negative) embodied in a review, and relevant factors include review sentiment [33, 52, 90], review valence [51, 83], and review emotions [46]. Lee et al. [51] examined how review valence (positivity or negativity) affected consumers’ perceptions of hotel review helpfulness and demonstrated that negative reviews were more helpful than positive reviews. Likewise, Kim and Hwang [46] found that consumers considered negative emotional expressions in reviews more helpful, as they have a higher level of information diagnosticity. In addition, other review features have been investigated in previous studies, including review rating [48, 94], review extremity [20, 59], and review photos [57, 65]. For instance, Kwok and Xie [48] indicated that hotel review helpfulness is negatively influenced by review rating. Ma et al. [65] used a deep learning model to estimate how user-provided photos affected review helpfulness and found that photos provided alongside review texts improved the helpfulness of online hotel reviews.

Regarding the source of review (features of reviewers), previous studies mainly focused on reviewer’s reputation [59, 83], expertise [36, 100], experience [48, 59], level [31] and demographic characteristics (e.g., gender or age) [37, 52]. Zhu et al. [100] demonstrated that reviewer credibility (i.e., expertise and online attractiveness) influences the perceived helpfulness of online hotel reviews. Lee et al. [52] developed review helpfulness prediction models using classification techniques and found that reviewer characteristics were good predictors of hotel review helpfulness. More recently, Liang et al. [59] investigated how a reviewer’s background influences the helpfulness of hotel reviews and found that customers often give helpful votes to reviews provided by reviewers with a high reputation and local cultural background who had had poor experiences.

Semantic features are related to the meaning of the review text (e.g., keywords, topics) [7]. To the best of our knowledge, only two studies have explored how review topics impact hotel review helpfulness. Specifically, Xiang et al. [90] explored how five hotel attributes extracted from reviews, as well as their relationships with review sentiment, review rating, and review helpfulness, varied across three review platforms. Shin et al. [82] investigated how the extent to which four topics of hotel attributes were mentioned in review text impacted review helpfulness, as well as the moderating effect of review rating. However, these two studies did not compare the effects of different review topics on hotel review helpfulness. Moreover, they considered only the direct effects of review topics and did not discuss how review topics moderate the relationships between other factors and review helpfulness.

2.2 The moderators of hotel review helpfulness

Several scholars have explored factors that moderate the helpfulness of hotel reviews. These moderators mainly involve hotel features (e.g., hotel class, hotel price, and hotel size) [31, 34, 57, 100], review features (e.g., review emotion, review length, review rating, and review type) [20, 51, 76], and manager factors (e.g., manager response) [48, 95]. In terms of hotel features, Wang et al. [87] claimed that the presence of price cues could increase helpful votes for low-class hotels but not for their high-class counterparts. Li et al. [57] found that reviews with photos were perceived as more helpful for low-priced hotels than for high-priced hotels. Regarding review features, Filieri et al. [20] indicated that extreme reviews were more helpful when they were long and included photos. Shin et al. [82] explored how review rating moderated the effects of reviews’ semantic and linguistic features on review helpfulness and found that review rating moderated the effects of review topics and review length on review helpfulness. Qazi et al. [76] found that review type (i.e., regular, comparative, suggestive) moderated the effects of review length and review wordiness on helpfulness, such that longer reviews were more helpful when the review content expressed comparative opinions. As to manager factors, Kwok and Xie [48] demonstrated that manager response moderated the impact of reviewer experience on the helpfulness of hotel reviews. Yang et al. [95] demonstrated that topic consistency and linguistic style similarity between manager response and consumer review moderated the effects of text sentiment and review length on review helpfulness. Based on the above, prior research on the moderators of hotel review helpfulness is limited. In particular, how the determinants of review helpfulness vary across review topics remains unknown.

3 Identifying review content type in hotel reviews

The purpose of this section is to identify the main types of review content in helpful hotel reviews using a text mining approach applied to data collected from Ctrip.com.

3.1 Methodology

3.1.1 Latent Dirichlet allocation

Latent Dirichlet allocation (LDA), an unsupervised Bayesian learning algorithm, is one of the most popular machine-learning topic models and uses a probabilistic framework to infer topics from massive review content [80]. LDA assumes that all online reviews share the same set of probabilistically distributed topics and that each topic can be represented by a probability distribution over words. Specifically, if d represents the document, w represents a word in the document, and t represents a topic, then the probability P of a word appearing in the document can be expressed as follows [25]:

$$P\left( {\left. w \right|d} \right) = P\left( {\left. w \right|t} \right) \times P\left( {\left. t \right|d} \right)$$
(1)

\(P\left(\left.w\right|d\right)\) represents the probability of the word w appearing in the document d, which is known; \(P\left(\left.w\right|t\right)\) represents the probability of the word w appearing in the topic t, and \(P\left(\left.t\right|d\right)\) represents the probability of the document d corresponding to the topic t, both of which are unknown [58]. The LDA model uses a statistical sampling method to estimate the two unknown quantities from the known one and thereby perform topic analysis of documents, an approach that has been widely adopted in previous studies [64, 80]. Accordingly, we applied LDA to obtain the key topics from unstructured helpful hotel reviews.
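To make Eq. (1) concrete, the following minimal sketch computes word probabilities for one document from hypothetical topic-word and document-topic distributions. The three-word vocabulary, the two topics, all numbers, and the function name are illustrative assumptions rather than values from the Ctrip.com data. With more than one topic, the product in Eq. (1) is summed over all topics, which is what the sketch does.

```python
# Toy illustration of Eq. (1): P(w|d) as a mixture over topics.
# All numbers below are hypothetical and chosen only for illustration.

def word_given_doc(p_w_given_t, p_t_given_d):
    """Return P(w|d) for every word by summing P(w|t) * P(t|d) over topics."""
    vocab = p_w_given_t[0].keys()
    return {
        w: sum(topic[w] * weight for topic, weight in zip(p_w_given_t, p_t_given_d))
        for w in vocab
    }

# Two topics over a three-word vocabulary; each topic's probabilities sum to 1.
p_w_given_t = [
    {"room": 0.7, "metro": 0.1, "clean": 0.2},  # resembles "room experience"
    {"room": 0.1, "metro": 0.8, "clean": 0.1},  # resembles "location convenience"
]
p_t_given_d = [0.6, 0.4]  # topic mixture of a single document

p_w_given_d = word_given_doc(p_w_given_t, p_t_given_d)
print(p_w_given_d)
```

In practice the direction is reversed: the word frequencies in each document are observed, and LDA infers the two distributions on the right-hand side.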

3.1.2 Data collection

We collected data from Ctrip.com, one of the most popular online hotel review platforms in China, which lists approximately 800,000 hotels. Ctrip.com provides hotel reviews and relevant information about reviewers, and it has been studied by many scholars [60, 78, 86]. Figure 1 shows an example of a review on the site. Using a Python crawler, we obtained all customer reviews on Ctrip.com for Chengdu’s five main urban areas from November 28, 2016 to November 30, 2019. We chose Chengdu because it is an important city in western China as well as one of the most well-known tourism cities in China. In particular, Chengdu has been ranked among the top 10 tourist destinations in China in recent years and ranked fourth in 2020 [41]. Moreover, many previous tourism studies have used Chengdu as an example [62, 91, 97].

Fig. 1 An example of an online review on Ctrip.com

We divided our data according to three aspects: review, reviewer, and hotel attribute (hotel level). The review items were customers’ reviews, helpfulness votes for the reviews, the number of pictures included in the reviews, the review ratings, the positions of reviews, and the published date of reviews. The reviewer items were the reviewer’s level, their total number of reviews and their total number of helpful votes. The names, levels, and rating scores of hotels were also acquired.

3.1.3 Preprocessing and text representation

Online hotel reviews are unstructured text that needs to be transformed into a structured format to enable analysis. Therefore, to use LDA and obtain meaningful information in our final clustering results, we had to preprocess the corpus. We undertook several preprocessing steps: (a) removing punctuation and other special symbols; (b) removing non-Chinese text; (c) filtering stop words (removing common words and words composed of fewer than two characters); and (d) using text segmentation technology to split reviews into individual words or tokens and removing repeated words. The final sample comprised 166,546 reviews across 2690 hotels.
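The preprocessing steps (a) to (d) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the stop-word list is hypothetical, the segmenter is passed in as a function (`toy_segment` is a placeholder that merely cuts text into two-character pieces, where a real Chinese word segmenter would be used), and segmentation is applied before stop-word filtering because filtering needs tokens to operate on.

```python
import re

# Hypothetical stop-word list; the list used in the study is not specified.
STOP_WORDS = {"我们", "非常"}

def preprocess(review, segment, stop_words=STOP_WORDS):
    """Apply steps (a)-(d): clean, segment, filter, and de-duplicate tokens."""
    # (a) + (b): keep only Chinese characters; everything else becomes a space.
    chinese_only = re.sub(r"[^\u4e00-\u9fff]+", " ", review)
    tokens = []
    for chunk in chinese_only.split():
        tokens.extend(segment(chunk))  # (d) word segmentation
    seen, result = set(), []
    for tok in tokens:
        # (c) drop stop words and tokens shorter than two characters
        if len(tok) < 2 or tok in stop_words:
            continue
        # (d) drop repeated words, keeping the first occurrence
        if tok not in seen:
            seen.add(tok)
            result.append(tok)
    return result

def toy_segment(chunk):
    """Placeholder segmenter: cuts the text into two-character pieces."""
    return [chunk[i:i + 2] for i in range(0, len(chunk), 2)]

tokens = preprocess("房间干净,房间很大!WiFi good", toy_segment)
print(tokens)
```

Keeping the first occurrence of each repeated word preserves the original word order, which is convenient for inspection but does not matter for a bag-of-words model such as LDA.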

Before proceeding, it was necessary to choose an a priori parameter: the number of topics. If the number of topics is too small, the content details of the topics cannot be explained; if the number of topics is too large, the distribution of topics will be too scattered and unsystematic. Most researchers use perplexity (a measure of how well a probability model fits the data) to determine the number of topics in LDA [24, 26, 68]. However, perplexity may increase irregularly as the number of topics increases, so selecting the number of topics based on perplexity alone is not completely reliable. Mimno et al. [69] determined the number of topics by combining perplexity with the coherence score, which effectively addressed this problem. Accordingly, by comparing the perplexity and coherence scores for 1 to 20 topics, we selected the point with the lowest perplexity when the coherence score was the highest (as shown in Figs. 2 and 3), which indicated that the number of topics is 5.
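One possible reading of this selection rule is sketched below: among the candidate numbers of topics, take the one with the highest coherence score and use the lowest perplexity to break ties. The function name and the scores are hypothetical; they are not the values plotted in Figs. 2 and 3, and the tie-breaking interpretation is an assumption.

```python
def choose_num_topics(perplexity, coherence):
    """Pick k with the highest coherence; break ties by the lowest perplexity."""
    return max(perplexity, key=lambda k: (coherence[k], -perplexity[k]))

# Hypothetical scores for k = 3..7 (not the values shown in Figs. 2 and 3).
perplexity = {3: 410.0, 4: 395.0, 5: 380.0, 6: 384.0, 7: 391.0}
coherence = {3: 0.41, 4: 0.45, 5: 0.52, 6: 0.49, 7: 0.47}

best_k = choose_num_topics(perplexity, coherence)
print(best_k)
```

When the two criteria disagree, a rule like this still returns an answer, so inspecting both curves, as done in Figs. 2 and 3, remains advisable.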

Fig. 2 Perplexity comparison with number of topics

Fig. 3 The coherence score of different numbers of topics

3.2 Qualitative analysis

As discussed above, the LDA approach identified 5 topics. The 10 most prevalent words in each topic and their relative weights are shown in Table 2. The procedure of qualitative analysis relies on previous studies about computer-assisted content analysis [3, 30]. Two trained coders with domain knowledge about hotel attributes in reviews [34, 90] read through the top 10 most probable words (shown in Table 2) and the top 20 most representative documents (10 documents with the highest probability, and other 10 documents randomly selected from the remaining documents) (examples shown in Appendix A) for each topic. Then they independently assigned labels to topics, and discussed the coding results and generated a reconciled set of labels. When facing multiple topics included in a review, coders assigned the label to a review mainly relying on the sequence, sentiment, or length of topic content as well as some key adverbs (e.g., particularly, especially, primarily, and importantly).

Table 2 Ten most prevalent words and word clouds for topics

To illustrate the qualitative validation of our results, we provide details of the coding procedure for one of the topics (i.e., Topic 3). Topic 3 was labeled “personalization & uniqueness” after analyzing the corresponding word distribution and representative reviews. In terms of word distribution, this topic contained service-related words such as free, friendly, help, and gift. Then, by reading the 20 representative reviews, we found that customers mainly describe the personalized service provided by a hotel, for example when a customer is traveling with kids or is sick. A typical review is “Very warm service! Knowing that I had kids with me, the hotel provided toothpaste, toothbrush, slippers, and snacks especially for the kids”. Meanwhile, we found that customers also highlight the unique features of a hotel (e.g., decoration style, smart devices) in most representative reviews. For example, one review states that “The decoration of the room is exquisite and all the devices are smart, as well as the location is convenient. Importantly, the service is very warm! The hotel staff buy medicine for me when knowing I caught a cold!!! Thumbs up!”. Accordingly, we define Topic 3 as “personalization & uniqueness”.

Following the above procedure, we obtained five topics: room experience, location convenience, personalization & uniqueness, event management & staff attitude, and cleanliness & smell. Specifically, room experience (Topic 1) describes the holistic experience customers have when they stay in a hotel room, including the room size, room facilities and amenities, insulation, and ventilation. Location convenience (Topic 2) highlights the convenience of a hotel’s location, involving transportation, attractions, and surrounding supporting facilities (e.g., supermarkets, restaurants). Moreover, personalization & uniqueness (Topic 3) is related to personalized services provided by a hotel (e.g., for customers who have kids with them or are sick) as well as its unique features (e.g., decoration style, smart devices). Event management & staff attitude (Topic 4) focuses on the procedure and efficiency with which a hotel deals with customers’ individual events (e.g., a wallet left in the room) as well as the attitude of hotel staff when interacting with customers. Finally, cleanliness & smell (Topic 5) is associated with the degree of cleanliness (e.g., room environment, bedding sets) and smell (e.g., non-smoking, fragrance) of a hotel room.

Further, to ensure the reliability of the coding results, as suggested by prior research [42, 89], two other researchers assigned the labels to 200 randomly sampled documents to calculate the inter-rater reliability. As shown in Table 3, the degrees of agreement between LDA and researcher A, between LDA and researcher B, as well as between researcher A and researcher B are 0.586, 0.587, and 0.567, respectively, indicating moderate agreement [15]. Most of the Kappa coefficients of each topic are moderate or above. Thus, our coding results are reliable to some extent [15, 42].
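For reference, agreement between two raters can be computed as in the sketch below, assuming the reported coefficients are Cohen's kappa (the text refers only to "Kappa coefficients"). The label sequences are hypothetical and far shorter than the 200 sampled documents.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical topic labels (1-5) assigned to ten documents.
rater_a = [1, 2, 2, 3, 5, 5, 4, 1, 2, 5]
rater_b = [1, 2, 3, 3, 5, 4, 4, 1, 2, 2]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Here the raters agree on 7 of 10 documents, while agreement expected by chance is 0.2, so kappa is (0.7 - 0.2) / (1 - 0.2).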

Table 3 Results of inter-rater reliability test

4 Exploring how review content type impacts hotel review helpfulness

We examine how review content type, along with other determinants concerning the characteristics of reviews and reviewers, directly impacts hotel review helpfulness. We also examine how review content type moderates the relationships between other determinants and review helpfulness. Moreover, hotel level, review position, and review elapsed days are included as control variables because these factors have been found to influence review helpfulness [36, 87, 96]. Our research model is shown in Fig. 4.

Fig. 4 Research model

4.1 The direct effects of review content type along with other determinants on hotel review helpfulness

4.1.1 Review content type

Review content type is associated with the semantic characteristics of reviews, which concern the meaning of the words in a review [7]. Prior research indicated that review semantic features could increase the helpfulness votes of reviews and could be even more influential than other review characteristics [32, 56, 81]. For example, Wang et al. [87] found that price cues, as a semantic feature of reviews, could attract helpful votes. This may be because reviews with specific concepts contain more valuable information, which makes them more attractive to customers [39]. Also, rich content in reviews helps build customers’ trust [7].

In hospitality, topics in hotel reviews are mainly related to hotel attributes (e.g., location, facility, and service), and the effects of hotel attributes on consumer preferences have been demonstrated by previous studies [21, 35, 45]. For example, by extracting hotel attributes from reviews, scholars could identify the factors influencing customers’ satisfaction [55, 99] and hotel rating [5, 9]. Moreover, several scholars have begun to pay attention to the relationship between review topics regarding hotel attributes and review helpfulness, and found that review topics could impact review helpfulness and that this impact varied across hotel classes and platforms [34, 82, 90]. For instance, Shin et al. [82] found that reviews that included more content regarding the topics “value”, “landmarks and attractions”, and “core product” are perceived as more helpful. As such, we propose the following hypothesis:

H1

Hotel review helpfulness would vary in different types of review content.

4.1.2 Review characteristics

Review length has a positive influence on review helpfulness, as long reviews usually contain more information and more convincing arguments than short reviews, which can reduce uncertainty about products or services and help customers make decisions [70, 73, 96]. However, longer is not necessarily better because not all words in reviews are valuable or useful [38]. Thus, we use word count after repeated words are deleted to represent review length. A longer effective review length suggests a review will be more helpful.

Review sentiment is defined as the level of positive or negative emotion expressed in a review [40]. Individuals are more sensitive to losses than gains when facing uncertain situations [43]. They therefore pay more attention to negative reviews to reduce the risk of their purchase decisions, especially for experience goods [93]. Accordingly, reviews with negative sentiment may have a stronger influence on review helpfulness.

Review extremity reflects the extremity of an individual’s attitude and is often measured by rating scores [70]. Usually, scores of 1 or 5 represent an extremely negative or positive opinion, respectively, and 3 suggests a moderate view [77]. Reviews with extremely low or high ratings are more influential than moderate reviews [74, 85, 96], perhaps because extreme reviews are associated with strong emotions, which drive customers to devote more cognitive effort to expressing their attitudes [59]. Moreover, moderate reviews with balanced arguments have ambiguous viewpoints, whereas extreme reviews with strong arguments reflect unequivocal opinions, which may help customers to make choices [20, 74]. Therefore, customers are more likely to be influenced by reviews with extreme ratings.

Review photos, as visual cues, are more likely to capture individuals’ attention than text, as photos demand less cognitive effort [50]. Moreover, photos can provide vivid and convincing information that may easily trigger customers’ emotions and in turn influence their purchase decisions [17, 20]. Photos also provide customers with visual evidence, so reviews with photos are more objective and reliable than text-only reviews, especially for experience goods [88, 94]. As such, the more photos a review contains, the more helpful it is.

Based on the above discussion, we propose the following hypotheses:

H2

Reviews with longer effective text are perceived as more helpful than reviews with shorter effective text.

H3

Reviews with negative sentiment are perceived as more helpful than reviews with positive sentiment.

H4

Reviews with extreme ratings are perceived as more helpful than reviews with moderate ratings.

H5

Reviews with more photos are perceived as more helpful than reviews with fewer photos.

4.1.3 Reviewer characteristics

Reviewer expertise reflects the knowledge and skills of a reviewer in relation to writing high-quality reviews [100]. Generally, expertise improves the credibility of information and in turn induces persuasion [4, 63]. Customers are more likely to read and trust information provided by reviewers with high levels of expertise because they believe expert reviewers provide more objective and valuable information based on their rich experience [13, 74]. Accordingly, we expect that a higher level of reviewer expertise will improve the helpfulness of a reviewer’s current reviews.

Reviewer reputation relates to the recognition and opinions of others on review platforms [92]. Reviewers with a high reputation are often considered more trustworthy and influential [59]. Customers believe that positive feedback from others will encourage reviewers to put more effort into writing high-quality reviews [77]. As such, reviews by reviewers with a high reputation will be perceived as more credible and useful and thus be more persuasive [14, 19].

According to the above discussion, we posit the following hypotheses:

H6

Reviews provided by reviewers with a high level of expertise are perceived as more helpful than reviews published by reviewers with a low level of expertise.

H7

Reviews provided by reviewers with high reputation are perceived as more helpful than reviews published by reviewers with low reputation.

4.2 The moderating effects of review content type on the relationships between determinants and hotel review helpfulness

As discussed above, some factors moderate the relationships between determinants and hotel review helpfulness, mainly hotel features, review features, and manager factors [20, 48, 100]. Our work focuses on how review content type, as a review feature, moderates the relationships between determinants and hotel review helpfulness.

In terms of determinants concerning review characteristics, many previous studies verified that certain review features (e.g., review rating) could moderate the relationships between other review characteristics (e.g., review length) and hotel review helpfulness [20, 51, 82]. In particular, Qazi et al. [76] investigated the moderating effect of review type (i.e., regular, comparative, suggestive) on the links between review length, review wordiness, and review helpfulness, and found that longer reviews were more helpful when the review content expressed comparative opinions. Meanwhile, several scholars verified that the relationships between review topics regarding hotel attributes and review helpfulness could be moderated by review sentiment and review rating [82, 90], suggesting interaction effects on review helpfulness between review topics and other review characteristics. Accordingly, we infer that review content type could moderate the relationships between review-related factors and hotel review helpfulness, and posit the following hypothesis:

H8

The relationships between review characteristics and review helpfulness vary in different types of review content.

With regard to determinants concerning reviewer characteristics, prior research also indicated interaction effects on review helpfulness between reviewer characteristics and review features [20, 28, 100]. For instance, Zhu et al. [100] verified that the positive influence of reviewer expertise on review helpfulness was weaker in reviews with extreme ratings than in those with moderate ratings, suggesting that review features could moderate the relationship between reviewers’ characteristics and review helpfulness. Regarding review content type, different textual information in reviews could impact customers’ preferences [7, 81], which may change how much emphasis they place on reviewer features when making decisions. Therefore, we suppose that review content type could moderate the relationships between reviewer-related factors and hotel review helpfulness, and propose the following hypothesis:

H9

The relationships between reviewer characteristics and review helpfulness vary in different types of review content.

4.3 Methodology

4.3.1 Data collection and measurements

The data collection process was as described in Sect. 3.1.2. The data were divided into review, reviewer, and hotel feature items. The review items were review content, review rating, the number of helpfulness votes, the number of pictures corresponding to reviews, the positions of reviews, and the posted date of reviews. The reviewer items were related to reviewer identity, as well as the reviewer’s total number of reviews and helpful votes. We also acquired hotel names, levels, and ratings. After excluding non-Chinese and repeated reviews, we obtained 166,546 reviews.

The descriptions of variables are shown in Table 4. The dependent variable is review helpfulness, measured by the number of helpful votes for a review; the independent variables include the characteristics of reviews (i.e., content type, effective length, sentiment, extremity, and photo) and the characteristics of reviewers (i.e., expertise and reputation). The review content type also serves as a moderator. Hotel level, review position, and review elapsed days serve as control variables.

Table 4 Descriptions of variables

The review content type is coded from 1 to 5 based on the results of LDA (e.g., 1 for topic 1, 2 for topic 2). Effective review length was measured by counting the number of words in a review after deleting repeated words. We used the counting method suggested by Huang et al. [38], which is more suitable for Chinese text. Review sentiment represents the extent to which a customer feels positively or negatively towards the subject of their review. To measure the level of sentiment in a review, we applied the SnowNLP package to determine a sentiment score for each review, which ranged from 0 (completely negative) to 1 (completely positive). Review extremity was measured by the review rating score on a scale from 1 (very dissatisfied) to 5 (very satisfied) and its quadratic term. Review photo was measured by the number of photos included in a review. Reviewer expertise was measured by the total number of reviews posted by a reviewer, and reviewer reputation was measured by the total number of helpful votes received by a reviewer. Review elapsed days was measured by the number of days between the date the review was posted and December 30, 2019. Hotel level was scored according to the star rating assigned to a hotel by the platform, ranging from 1 (two-star hotel) to 4 (five-star hotel). As there were very few reviews of one-star hotels, we did not include these hotels in our sample. Review position was measured as the crawling sequence number of a review divided by the total number of reviews of the hotel.
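Several of these measures are simple transformations of the crawled fields, as the following sketch illustrates. The function names and the sample record are hypothetical, and effective length is approximated here as the number of distinct tokens, which may differ in detail from the counting method of Huang et al. [38].

```python
from datetime import date

REFERENCE_DATE = date(2019, 12, 30)  # reference date stated in the text

def effective_length(tokens):
    """Number of distinct words in a segmented review."""
    return len(set(tokens))

def elapsed_days(posted, reference=REFERENCE_DATE):
    """Days between the posting date and the reference date."""
    return (reference - posted).days

def review_position(crawl_index, hotel_review_count):
    """Crawl-order index of a review divided by the hotel's total review count."""
    return crawl_index / hotel_review_count

def extremity_terms(rating):
    """Linear and quadratic rating terms used to capture extremity."""
    return rating, rating ** 2

# Hypothetical review record.
tokens = ["房间", "干净", "房间", "很大"]
print(effective_length(tokens))
print(elapsed_days(date(2019, 11, 30)))
print(review_position(25, 100))
print(extremity_terms(5))
```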

4.3.2 The descriptive statistics of data

Table 5 shows the descriptive statistics for all variables across all reviews, helpful reviews, and non-helpful reviews. Among the 166,546 valid samples, the average review content type was 3.23 (SD = 1.52). Of these, 121,546 reviews are helpful and 45,000 reviews are non-helpful. Specifically, among helpful reviews, 19,765 reviews belonged to topic 1 (room experience), 30,088 reviews belonged to topic 2 (location convenience), 16,562 reviews belonged to topic 3 (personalization & uniqueness), 14,340 reviews belonged to topic 4 (event management & staff attitude), and 40,791 reviews belonged to topic 5 (cleanliness & smell). Among non-helpful reviews, 6621 reviews belonged to topic 1, 12,929 reviews belonged to topic 2, 4674 reviews belonged to topic 3, 3615 reviews belonged to topic 4, and 17,161 reviews belonged to topic 5.

Table 5 Descriptive statistics

4.4 Results

To provide a deep understanding of how review content type impacted review helpfulness, we examined its direct and moderating effects through regression analysis, ANOVA, and subgroup analysis.

(1) Regression analysis

In the regression analysis, we treated each review topic as a continuous variable (from 0 to 1) following the procedure in previous studies [82, 90]. Specifically, each review was assigned five values corresponding to the five review topics, and each value referred to the probability that a topic is included in the review based on the LDA results. For example, if the values of a review are (0.52, 0.31, 0.24, 0.01, 0.02), the probabilities of the five topics being included in the review are 0.52 for “room experience”, 0.31 for “location convenience”, 0.24 for “personalization & uniqueness”, 0.01 for “event management & staff attitude”, and 0.02 for “cleanliness & smell”.
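The relation between these continuous topic values and the categorical content type (coded 1 to 5) can be illustrated as follows. The sketch assumes that the categorical code corresponds to the most probable topic, which the text does not state explicitly; the function name is hypothetical, and the input vector is the example given above.

```python
TOPIC_LABELS = [
    "room experience",
    "location convenience",
    "personalization & uniqueness",
    "event management & staff attitude",
    "cleanliness & smell",
]

def dominant_topic(topic_probs):
    """Return the 1-based code and label of the most probable topic."""
    idx = max(range(len(topic_probs)), key=lambda i: topic_probs[i])
    return idx + 1, TOPIC_LABELS[idx]

# Example values from the text, used as five continuous regressors.
probs = (0.52, 0.31, 0.24, 0.01, 0.02)

code, label = dominant_topic(probs)
print(code, label)
```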

Then, we conducted a hierarchical regression analysis to examine how review content type, along with other determinants, impacted review helpfulness. All variables were centered to mitigate multicollinearity concerns. Specifically, we applied the negative binomial regression model because the number of helpful votes is a non-negative count variable and testing showed that the variance of the dependent variable was larger than its mean. Thus, the negative binomial regression model is more suitable for our work than the conventional multiple regression model [53]. In model 1, we included only the control variables (i.e., hotel level, review positions, and review elapsed days). In model 2, we added the other determinants as independent variables (i.e., effective review length, review sentiment, review extremity, review photos, reviewer expertise, and reviewer reputation). In model 3, review content type was added to test its direct effect on review helpfulness. In model 4, interaction terms were added to test the moderating effects of review content type. All regression analyses were conducted in SPSS 25.
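Two steps in this paragraph lend themselves to a small sketch: the check that the variance of the vote counts exceeds their mean (the motivation for a negative binomial model) and the cumulative build-up of predictors across the four models. The vote counts below are hypothetical, the variable labels are shorthand for the measures named in the text, and the comparison is a plain descriptive check rather than the formal test run in SPSS.

```python
from statistics import mean, variance

def is_overdispersed(counts):
    """True when the sample variance of the counts exceeds their mean."""
    return variance(counts) > mean(counts)

# Predictor blocks entered at each step of the hierarchical regression.
BLOCKS = [
    ["hotel_level", "review_positions", "review_elapsed_days"],
    ["effective_length", "sentiment", "extremity", "photos",
     "reviewer_expertise", "reviewer_reputation"],
    ["topic_1", "topic_2", "topic_3", "topic_4", "topic_5"],
    ["determinant_x_topic_interactions"],
]

def predictors_for(model_number):
    """All predictors included up to and including the given model (1-4)."""
    return [name for block in BLOCKS[:model_number] for name in block]

# Hypothetical helpful-vote counts, for illustration only.
votes = [0, 0, 1, 0, 7, 2, 0, 15]
use_negative_binomial = is_overdispersed(votes)
```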

As shown in Table 6, the adjusted R squared increased across the four models, and all improvements were significant. In model 1, all control variables except review positions have significant positive impacts on review helpfulness. Model 2 examines the contribution of reviews’ and reviewers’ characteristics to review helpfulness. We found that the longer the effective length of the review text, the more helpful the review (β = 0.018, p < 0.001), verifying H2. However, the effect of review sentiment on helpfulness is nonsignificant, thus rejecting H3. The negative coefficient of Rating (β = − 1.872, p < 0.001) and the positive coefficient of Rating² (β = 0.276, p < 0.001) indicate that there is a U-shaped relationship between rating and review helpfulness [29, 61]. In other words, extreme reviews are more helpful than moderate reviews, and therefore H4 is supported. Review photos has a positive effect on review helpfulness (β = 0.136, p < 0.001), thus confirming H5. However, the coefficient of reviewer expertise is negative (β = − 0.008, p < 0.001), which is contrary to our hypothesis, and therefore H6 is rejected. The relationship between reviewer reputation and review helpfulness is positive (β = 0.052, p < 0.001), thus verifying H7.
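The U-shape can be made concrete by locating the turning point of the quadratic term. For a linear predictor of the form b1·x + b2·x², the extremum lies at x = −b1 / (2·b2), and it is a minimum when b2 is positive. The sketch below applies this to the reported coefficients. The result is expressed in whatever units the rating variable enters the model; since the variables were centered, mapping it back to the raw 1 to 5 scale would require the sample mean, which is not given here.

```python
def quadratic_turning_point(b_linear, b_quadratic):
    """x-value at which b_linear*x + b_quadratic*x**2 reaches its extremum."""
    return -b_linear / (2 * b_quadratic)

# Coefficients of Rating and its quadratic term as reported in the text.
b_rating = -1.872
b_rating_squared = 0.276

turning_point = quadratic_turning_point(b_rating, b_rating_squared)
is_u_shaped = b_rating_squared > 0  # positive curvature means a minimum
```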

Table 6 Results of hierarchical regression analysis

In model 3, we examine the direct effects of review topics on review helpfulness. Results show that all review topics have significant influences on review helpfulness, thus supporting H1. Specifically, Topic 1 (room experience) (β = − 4.889, p < 0.05), Topic 2 (location convenience) (β = − 4.827, p < 0.05), and Topic 5 (cleanliness & smell) (β = − 4.807, p < 0.05) negatively impact helpfulness, suggesting that reviews become less helpful as they focus more on these three topics. In contrast, Topic 3 (personalization & uniqueness) (β = 5.203, p < 0.05) and Topic 4 (event management & staff attitude) (β = 5.297, p < 0.05) positively influence helpfulness, indicating that the more a review discusses these two topics, the more helpful it is.

Further, in model 4, we added interaction terms between other determinants and review topics to investigate the moderating effects of review content type. Results show that most interaction terms are significant, suggesting that review content type moderates the relationships between certain determinants and review helpfulness, and therefore both H8 and H9 are supported. Specifically, the positive effect of effective review length on review helpfulness becomes stronger for Topic 5 (cleanliness & smell) (β = 0.013, p < 0.001), but becomes weaker for Topic 1 (room experience) (β = − 0.022, p < 0.001), Topic 2 (location convenience) (β = − 0.021, p < 0.001), and Topic 3 (personalization & uniqueness) (β = − 0.016, p < 0.001). However, review content type does not moderate the relationship between review sentiment and review helpfulness. In terms of review extremity, the helpfulness of extreme reviews is enhanced when the review contains content about Topic 3 (personalization & uniqueness) (β = 0.312, p < 0.05). Reviews with photos about Topic 1 (room experience) (β = 0.060, p < 0.1), Topic 4 (event management & staff attitude) (β = 0.117, p < 0.001), and Topic 5 (cleanliness & smell) (β = 0.082, p < 0.05) receive more helpful votes. Moreover, the negative effect of reviewer expertise on review helpfulness becomes stronger for Topic 3 (personalization & uniqueness) (β = − 0.004, p < 0.001) and Topic 4 (event management & staff attitude) (β = − 0.010, p < 0.001), and becomes weaker for Topic 1 (room experience) (β = 0.013, p < 0.001) and Topic 2 (location convenience) (β = 0.006, p < 0.001). Meanwhile, the positive effect of reviewer reputation on review helpfulness becomes stronger for Topic 3 (personalization & uniqueness) (β = 0.051, p < 0.001) and Topic 4 (event management & staff attitude) (β = 0.054, p < 0.001), and becomes weaker for Topic 1 (room experience) (β = − 0.050, p < 0.001), Topic 2 (location convenience) (β = − 0.055, p < 0.001), and Topic 5 (cleanliness & smell) (β = − 0.709, p < 0.001).

(2) ANOVA analysis

To gain a further understanding of the relationship between review content type and review helpfulness, we conducted an analysis of variance (ANOVA), treating review content type as a categorical variable, to compare each pair of hotel attributes in terms of their influence on helpfulness.

The significant result of Levene’s test showed that the variances across groups were not equal (p < 0.001), so the Welch F-test was adopted [84]. The results revealed significant differences in review helpfulness between the review content type groups (F (4, 61,261.291) = 116.616, p < 0.001). As such, we conducted post-hoc tests to explore these differences in more detail, and the results are shown in Table 7.
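For readers unfamiliar with the Welch F-test, the following sketch implements the standard Welch one-way ANOVA statistic, which weights each group by its size divided by its variance and yields a non-integer second degree of freedom (as in the value 61,261.291 reported above). It is not the SPSS routine used in the study, it does not compute the p-value, and the sample groups are hypothetical.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Welch's one-way ANOVA: returns (F, df1, df2).

    Each group needs at least two observations and a non-zero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)

    # Variance-weighted grand mean.
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    between = sum(
        w * (m - grand_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1)
        for w, n in zip(weights, sizes)
    )
    correction = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = between / correction
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2

# Hypothetical helpful-vote counts for three content types.
sample_groups = [[0, 1, 2, 1, 0], [3, 5, 4, 8, 2], [1, 0, 4, 9, 2]]
f_stat, df1, df2 = welch_anova(sample_groups)
```

With five content type groups, df1 is 4, matching the first degree of freedom reported in the text.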

Table 7 Results of ANOVA analysis

The results indicate that review content about Topic 1 (room experience) receives fewer helpful votes than review content regarding Topic 3 (personalization & uniqueness) (Md = − 0.611, p < 0.001), Topic 4 (event management & staff attitude) (Md = − 0.625, p < 0.001), and Topic 5 (cleanliness & smell) (Md = − 0.222, p < 0.001). Meanwhile, review content about Topic 2 (location convenience) scores lower on helpfulness than review content about Topic 3 (personalization & uniqueness) (Md = − 0.582, p < 0.001), Topic 4 (event management & staff attitude) (Md = − 0.596, p < 0.001), and Topic 5 (cleanliness & smell) (Md = − 0.194, p < 0.001). Moreover, review content regarding Topic 3 (personalization & uniqueness) receives more helpful votes than review content about Topic 5 (cleanliness & smell) (Md = 0.389, p < 0.001), but fewer helpful votes than review content regarding Topic 4 (event management & staff attitude) (Md = − 0.014, p < 0.001). In addition, review content regarding Topic 4 (event management & staff attitude) receives more helpful votes than review content regarding Topic 5 (cleanliness & smell) (Md = 0.403, p < 0.001). However, there was no significant difference in review helpfulness between review content regarding Topic 1 (room experience) and Topic 2 (location convenience). Based on the above, among the five types of review content regarding hotel attributes, customers value Topic 4 (event management & staff attitude) the most, followed by Topic 3 (personalization & uniqueness), Topic 5 (cleanliness & smell), and Topic 2 (location convenience) / Topic 1 (room experience), reflecting that review helpfulness varies across types of review content.

(3) Subgroup analysis

We conducted a subgroup analysis [2, 16] to explore how the effects of determinants on hotel review helpfulness vary in different types of review content. We divided the sample into five groups based on the types of review content, and tested the hypotheses by regression analysis in all subgroups separately, and results are presented in Table 8.

Table 8 Subgroup analysis on path differences among different types of reviews

The results indicate that there are significant differences in the relationships between some of the reviews’ characteristics and hotel review helpfulness among different types of review content, thus further confirming H8. Specifically, the positive effect of effective review length on helpfulness is stronger for Topic 5 (cleanliness & smell) than for the other four topics, but weaker for Topic 3 (personalization & uniqueness) (diff3-2 = − 0.001, p < 0.05) and Topic 4 (event management & staff attitude) (diff4-2 = − 0.001, p < 0.05) than for Topic 2 (location convenience). The positive effect of review sentiment is greater when review content is about Topic 5 (cleanliness & smell) rather than Topic 1 (room experience) (diff5-1 = 0.144, p < 0.05), Topic 2 (location convenience) (diff5-2 = 0.139, p < 0.05), and Topic 3 (personalization & uniqueness) (diff5-3 = 0.121, p < 0.05). Meanwhile, extreme reviews gain fewer helpful votes when related to Topic 1 (room experience) than when related to Topic 2 (location convenience) (diff1-2 = − 0.186, p < 0.001), Topic 3 (personalization & uniqueness) (diff1-3 = − 0.163, p < 0.001), and Topic 5 (cleanliness & smell) (diff1-5 = − 0.151, p < 0.001). The positive effect of review photo is greater when review content is about Topic 2 (location convenience) (diff2-3 = 0.015, p < 0.05) and Topic 5 (cleanliness & smell) (diff5-3 = 0.027, p < 0.001) than when it is about Topic 3 (personalization & uniqueness).

Further, we also found that the relationships between some of the reviewers’ characteristics and review helpfulness vary across types of review content, thus further verifying H9. Specifically, reviewer expertise has a more negative effect on review helpfulness when review content is about Topic 5 (cleanliness & smell) than when it is about Topic 1 (room experience) (diff1-5 = − 0.001, p < 0.05). Regarding reviewer reputation, its positive effect on helpfulness is weaker when review content is related to Topic 1 (room experience) than when it is related to the other four topics.

5 Discussion

The present study aims to explore how review content type, as a semantic feature of reviews, impacts hotel review helpfulness by examining its direct and moderating effects, based on 166,546 reviews on Ctrip.com. A summary of our key findings is shown in Table 9. We can see that most findings based on different methods support each other.

Table 9 The summary of key findings of our results

We identified five review topics regarding hotel attributes (i.e., room experience, location convenience, personalization & uniqueness, event management & staff attitude, and cleanliness & smell) to represent the review content type in our work. Next, we utilize the Kano model to explain these attributes in more detail. Specifically, the Kano model identifies the quality attributes that influence customer satisfaction with products and classifies them into three main types: basic, performance, and attractive attributes [11, 44]. Regarding our topics, room experience, location convenience, and cleanliness & smell are related to basic attributes, corresponding to the basic requirements of a hotel. Their poor performance would generate absolute dissatisfaction, but their sufficient performance does not enhance satisfaction. Event management & staff attitude corresponds to performance attributes: the higher the performance of this attribute, the higher the customer’s satisfaction, and vice versa. Finally, personalization & uniqueness is associated with attractive attributes, which are usually unexpected by customers and can lead to great satisfaction, while their absence leads to no dissatisfaction. Hotels should maintain basic attributes and place more emphasis on performance and attractive attributes to achieve advantages in differentiation [67].

Further, we explored how review content type impacts review helpfulness. In terms of its direct effect, we found that hotel review helpfulness varies across types of review content, which supports findings in previous studies [34, 82, 90]. In particular, our results show that reviews regarding room experience, location convenience, and cleanliness & smell receive fewer helpful votes. This may be because these three hotel attributes are related to the basic requirements of a hotel, and customers do not need much additional information about them. Also, we found that reviews about personalization & uniqueness as well as event management & staff attitude receive more helpful votes. One possible explanation is that it is hard for customers to learn about these two attributes from hotel descriptions, and former customers’ experience shared in reviews is an effective way to get relevant information. Likewise, the results of the comparison analysis also show that review content associated with event management & staff attitude is voted as the most helpful, followed by personalization & uniqueness, cleanliness & smell, and location convenience/room experience. As mentioned above, viewing feedback from other customers is a main way to find clues about hotel service, as judgments of hotel service quality rely on feedback from other customers who have experienced it [90]. These findings reveal that customers pay more attention to personalized and extra services provided by hotels than to standard requirements of hotels (e.g., cleanliness, room facilities). Previous research also indicated that personalized service could add value for customers and increase their satisfaction and loyalty [18, 75].

With regard to its moderating effects, we verified that review content type moderates the relationships between certain determinants and review helpfulness, which supports previous research [76]. Specifically, in terms of the reviews’ characteristics, our results show that long reviews (in terms of effective length) regarding cleanliness & smell are perceived as more helpful than long reviews associated with the other four hotel attributes. This may be because the smell and certain details of the cleanliness of a hotel cannot be identified from photos and can only be learned from text descriptions by experienced customers. Also, positive reviews regarding cleanliness & smell receive more helpful votes than positive reviews regarding the other two basic hotel attributes (i.e., room experience and location convenience). One explanation is that customers find an unclean or malodorous hotel room harder to tolerate than a poor location or a bad room experience (e.g., small room size). Meanwhile, customers sometimes have a positive pre-decision preference for a hotel, so they pay more attention to reviews with a positive sentiment [79]. Moreover, extreme reviews related to room experience are perceived as less helpful than extreme reviews related to location convenience, personalization & uniqueness, and cleanliness & smell. Extreme opinions on room experience may mainly depend on people’s own preferences (e.g., customers have different criteria for the size of a room), so these reviews are not helpful for customers [20, 34]. Reviews with photos about location convenience and cleanliness & smell are voted as more helpful than reviews with photos regarding personalization & uniqueness. This may be because location convenience and cleanliness & smell are two important basic requirements of a hotel, which need more photos to provide visual cues and in turn enhance persuasiveness [57, 65]. The above findings show that different review topics have heterogeneous moderating effects, suggesting that customers have different information requirements for different hotel attributes.

In addition, in terms of reviewers’ characteristics, our findings show that reviews from expert reviewers regarding cleanliness & smell are perceived as less helpful than reviews from expert reviewers associated with room experience. One explanation is that, compared to room experience, cleanliness & smell depends more on customers’ subjective evaluation than on professional assessment, so customers prefer to trust service information offered by ordinary reviewers who are similar to them [77]. Moreover, reviews from reviewers with high reputation regarding room experience are perceived as less helpful than reviews from reviewers with high reputation regarding the other four hotel attributes. This may be because customers have different criteria for evaluating the room experience of their hotel stay.

6 Supplementary analysis

According to the above discussion, we further categorized the five review topics into three attribute types based on the Kano model [11, 44]: basic attributes (i.e., room experience, location convenience, and cleanliness & smell), performance attributes (i.e., event management & staff attitude), and attractive attributes (i.e., personalization & uniqueness). Then, as suggested by previous studies [20, 76], we applied the Tobit regression model to investigate how these three new types of review content impact review helpfulness. Review content type was added as a dummy variable, and we set basic attributes as the reference category.
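The recoding described here amounts to a lookup from topic to Kano attribute type followed by dummy coding with basic attributes as the reference category. A minimal sketch, with hypothetical names, is given below; the Tobit estimation itself is not shown.

```python
# Topic -> Kano attribute type, as categorized in the text.
KANO_TYPE = {
    "room experience": "basic",
    "location convenience": "basic",
    "cleanliness & smell": "basic",
    "event management & staff attitude": "performance",
    "personalization & uniqueness": "attractive",
}

def kano_dummies(topic):
    """Dummy-code a topic with 'basic' as the reference category.

    Basic topics get zeros on both dummies; the other two types get a
    single 1 on their own dummy.
    """
    kind = KANO_TYPE[topic]
    return {
        "performance": int(kind == "performance"),
        "attractive": int(kind == "attractive"),
    }

coded = {topic: kano_dummies(topic) for topic in KANO_TYPE}
```

With this coding, each dummy coefficient in the regression is read as a difference relative to reviews about basic attributes.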

As shown in Table 10, reviews regarding attractive attributes (β = 0.202, p < 0.001) or performance attributes (β = 0.140, p < 0.05) were perceived as more helpful than reviews regarding basic attributes, which supports our previous findings. Moreover, most of the interaction terms are significant, suggesting a moderating effect of this new review content type. In particular, long reviews (in terms of effective length) regarding basic attributes are perceived as more helpful than those regarding performance (β = − 0.008, p < 0.001) or attractive attributes (β = − 0.008, p < 0.001). Extreme reviews related to performance attributes (β = 0.285, p < 0.001) receive more helpful votes than those related to basic attributes. Reviews with photos become more helpful when they concern basic attributes rather than performance attributes (β = − 0.058, p < 0.001). Moreover, reviews from expert reviewers receive fewer helpful votes when related to performance (β = − 0.003, p < 0.001) or attractive attributes (β = − 0.011, p < 0.001) rather than basic attributes, while reviews from reviewers with high reputation are more helpful when they concern performance (β = 0.041, p < 0.001) or attractive attributes (β = 0.055, p < 0.001) rather than basic attributes.

Table 10 Results of Tobit regression analysis

Based on the above, we can see that customers prefer to view reviews with long text or photos when these concern basic attributes, as well as extreme reviews about performance attributes. Meanwhile, they favor reviews about basic attributes from expert reviewers and reviews about performance or attractive attributes from reviewers with high reputation.

7 Conclusions

To gain a deep understanding of how review content type impacts hotel review helpfulness, we conducted two studies based on 166,546 reviews across 2,690 hotels on Ctrip.com. Based on the LDA topic model algorithm, we extracted five topics that were common in hotel reviews: room experience, location convenience, personalization & uniqueness, event management & staff attitude, and cleanliness & smell, which are conceptualized as the review content type. Then, we performed regression analysis, ANOVA, and subgroup analysis to investigate how review content type, along with other determinants regarding characteristics of reviews and reviewers, directly impacts hotel review helpfulness, as well as the moderating effects of review content type. We found that review helpfulness and its relationships with certain determinants (e.g., review extremity and reviewer reputation) vary across types of reviews. These findings have several theoretical and practical implications.

7.1 Theoretical contributions

The present study makes the following contributions to the literature. First, our work extends previous studies about dimensions in hotel reviews by identifying five review topics regarding hotel attributes. Although some of the topics (e.g., room experience, location convenience) were reported in previous studies [27, 34], no studies have considered them together and highlighted their relative importance. Most importantly, the topic regarding “personalization & uniqueness” has never been mentioned in prior research on hotel reviews. We found that customers put emphasis on the personalized services and unique features of a hotel when viewing reviews.

Second, our work deepens the understanding of how reviews’ semantic features impact hotel review helpfulness by exploring the direct and moderating effects of review content type regarding hotel attributes. Although two studies discussed the effects of review topics (i.e., basic service, value, landmark & attraction, dining & experience, and core product) on hotel review helpfulness through regression analysis [82, 90], they did not compare review topics with each other. Our work conducted a comparison analysis on each pair of review topics, which could provide more valuable guidelines for hotels to allocate their limited time and resources efficiently. For example, Shin et al. [82] found that reviews with landmark & attraction content could be more helpful, while our findings show that review content about location convenience scores lower on helpfulness than review content about personalization & uniqueness, suggesting that hotel managers could pay more attention to improving the personalization and uniqueness of hotels.

Moreover, previous studies failed to consider the moderating effects of review topics on the relationships between other factors and hotel review helpfulness. Although some scholars have explored the moderators of hotel review helpfulness, including hotel features, review features, and manager factors [20, 31, 95], relevant studies are still limited. In particular, regarding moderators of review features, the moderating effect of review content type is rarely examined. Our work contributes to this research stream by demonstrating that the effects of the characteristics of reviews (e.g., review length) and reviewers (e.g., reviewer reputation) on hotel review helpfulness vary across types of review content. These findings can serve as a foundation for future research on the moderating effects of the semantic features of reviews on review helpfulness.

Finally, we add new knowledge of review types by further categorizing review topics into three types (i.e., basic, performance, and attractive attributes) based on the Kano model and exploring their moderating effects. Several types of reviews were proposed by prior research (e.g., regular vs. comparative vs. suggestive; abstract vs. concrete) [76, 81], but almost no scholars categorized reviews based on the Kano model. Our findings could offer new insights into review types for studies of the effects of reviews on customer attitudes and behaviors.

7.2 Practical implications

Our work has several practical implications. First, in a competitive market environment, hotel operators should invest more resources in improving the efficiency of event management, personalized service, and uniqueness to attract customers, as we found that customers want to know more about event management & staff attitude as well as personalization & uniqueness than about basic hotel attributes. Thus, hotels could improve these two hotel attributes and encourage customers to post more content regarding them. For instance, hotels could provide customized services for customers traveling with children or elderly guests. Also, hotels could offer unique and superior amenities (e.g., free gifts, customized welcome cards) to create surprises for customers, which could differentiate them from other hotels and in turn create competitive advantage.

Further, given the importance of review content type for review helpfulness, online review platforms could optimize their interfaces by highlighting the topics of review content. For instance, they could require reviewers to tag a specific topic when posting a review (e.g., #personalization, #staff attitude). Moreover, based on our findings, as customers give high priority to the event management & staff attitude, personalization & uniqueness, and cleanliness & smell of hotels, platforms could automatically highlight relevant review content for customers through text mining methods. Such actions may enhance users’ experience and the “stickiness” of the platforms. Meanwhile, our findings also reveal that other factors (e.g., review photo, reviewer reputation) contribute to hotel review helpfulness for certain hotel attributes, so travel websites could use our findings to identify helpful hotel reviews effectively and quickly. For instance, we highlight the effect of user-provided photos on review helpfulness for several basic hotel attributes (e.g., room facilities and cleanliness). Travel websites could encourage customers to share more photos of their hotel stay by designing a reward system. Likewise, reviews from reviewers with high reputation regarding event management and personalized service also contribute to the helpfulness of reviews. Thus, travel websites should develop ways to motivate these reviewers to post more content about hotels’ event management and personalized service to improve the usability of travel platforms.

7.3 Limitations and future research directions

This study has several limitations. First, we only collected data from Ctrip.com, one of the most popular travel websites in China. This may limit the generalizability of our findings, and future studies could collect data from additional platforms such as Qunar.com or Booking.com. Meanwhile, our work is restricted to hotels in one city (Chengdu), and the findings may not be applicable to hotels in other cities. Further research could use data from more geographical locations or cross-cultural settings to explore helpful hotel reviews in more depth. In addition, as reviewers’ information can change over time, the data on reviewer expertise and reputation may not correspond to the time at which a review received its helpful votes. Moreover, although we controlled for the influence of review positions on review helpfulness based on the crawling sequence of a review, this measurement may not represent the review position viewed by each customer. We hope future studies can control these biases in more effective ways.

Second, we used an LDA approach to extract dimensions from helpful reviews, but big data may give misleading information [49]. As such, future studies could use more validation methods and other text-mining methods, such as latent semantic analysis (LSA), probabilistic latent semantic analysis (PLSA), and hierarchical Dirichlet process (HDP), to replicate this study. Meanwhile, as prior research indicates that individual characteristics (e.g., gender or age) influence which aspects of hotels reviewers focus on [27], future studies could explore how these dimensions vary with conditions such as gender, age, or travel types.

Finally, although we highlight the importance of semantic features of reviews, we only consider the effect of review content type regarding hotel attributes on review helpfulness. Future research could investigate how other features of specific review content (e.g., review content style) affect the helpfulness of hotel reviews. Likewise, we did not consider other situational factors that may influence hotel review helpfulness, such as hotel surroundings, travel type, or season of travel, and further studies are encouraged to include them to gain higher predictive power.