Introduction

Since the implementation of reforms and opening-up in the 1980s, China has experienced significant transitions in its economic system and social structure. The economy has shifted from a planning system to a market system, and society has evolved from traditional to modern forms. These transformations have profoundly affected the class structure of Chinese society, transitioning from two classes and one stratum (working and peasant classes, and intellectual stratum) to a diversified class structure (Li 2008; Li et al. 2012; Lin and Wu 2010; Liu 2018). Moreover, the diversity of social classes is evident not only at the objective level but also in subjective dimensions, such as individuals’ feelings, cognition, and judgment, collectively known as subjective social status or subjective class identification (Jackman and Jackman 1973). Generally, people’s judgments of their own social status are primarily based on their objective social status, leading to a general consistency between objective and subjective social status. However, scholars have found through extensive discussions on subjective and objective social status that there is a discrepancy between individuals’ subjective and objective social status (Chen and Fan 2015; Benjamin et al. 2013; Evans and Kelley 2004), that is, social status discordance.

Three main theories explain subjective social status. The first theory attributes it to structural factors such as social structure and urban–rural duality (Li 2005; Han and Qiu 2015; Fan and Chen 2015). The second theory, which is based on the concept of reference groups, suggests that irrational reference group selection leads to discordance (Liu 2001, 2002). The third theory, aligned with the status process theory, argues that individuals assess their current status by integrating their past social status, with social mobility playing a significant role (Liu 2002; Fan and Chen 2015; Zhang et al. 2019; Zhang and Liang 2021).

While previous research has provided valuable insights, most studies have focused on analyzing how objective social status shapes subjective social status, neglecting other influencing factors. Due to this limitation, it is essential to explore additional dimensions that shape subjective social status. With the rise of information technology, the role of new media in shaping subjective social status has gained importance (Zhou 2011). Consequently, some studies have focused on the relationship between internet usage and subjective social status, yielding valuable observations (Luo and Liu 2022; Zhou 2011; Feng and Liu 2022). However, these studies primarily address subjective social status, paying insufficient attention to social status discordance. Given the significant transformations in China’s social and economic structures, understanding the nuances of social status discordance has become increasingly important. As a critical component of modern digital engagement, internet usage plays a vital role in shaping individuals’ perceptions and interactions within the social hierarchy. The internet provides access to information and resources that can enhance one’s social and economic capital. It exposes individuals to diverse social groups and status symbols, potentially influencing their subjective social status and perceptions of social mobility. Therefore, investigating the impact of internet usage on social status discordance is crucial for obtaining a comprehensive understanding of how digital engagement influences social hierarchy perceptions, potentially offering new insights into existing research.

Based on the above analysis, this research aims to explore the impact of internet usage on social status discordance among Chinese residents. The analysis proceeds in three steps. First, previous research has confirmed that internet access and usage behavior are important factors affecting individuals’ subjective social status (Zhou 2011). Consequently, this study examines the individual effects of internet usage frequency and behavior on social status discordance. Second, different internet usage behaviors affect subjective social status differently (Luo and Liu 2022). Therefore, with reference to the relevant research (Hargittai and Hinnant 2008), we categorize internet usage behaviors into capital-enhancing and other behaviors and analyze how each category impacts social status discordance, with a particular focus on capital-enhancing behaviors. Finally, internet usage capacities differ across groups: individuals with higher objective social status tend to have stronger and deeper internet usage capabilities (Peter and Valkenburg 2006). Therefore, we examine how internet usage affects social status discordance among people with different objective social statuses. The specific research questions are as follows.

RQ1:

What impact does internet usage frequency have on the social status discordance of Chinese residents? Specifically, does greater internet usage reduce the likelihood of status deflation (or inflation)?

RQ2:

What impact does internet usage behavior have on the social status discordance of Chinese residents? Specifically, do variations exist in the impacts of different internet usage behaviors on social status discordance?

RQ3:

Does group heterogeneity exist in the impact of internet usage on social status discordance? Specifically, what is the difference in the impact of internet usage on social status discordance among individuals with various objective social statuses?

This study extends existing work in the following ways. First, in the discussion of social status discordance, this study considers the internet as a potential factor distinct from objective social status. It examines social stratification from the perspective of a digital society, which provides an alternative perspective for research on social status discordance. Second, this study analyzes the impact of the internet on social status discordance from two aspects: internet usage frequency and behavior. This deepens our understanding of the relationship between the internet and social status. Finally, using a causal random forest model, we conduct a heterogeneity test to analyze the impact of internet usage on social status discordance across groups with different objective social statuses. This approach not only retains the emphasis of previous studies on objective social status but also clarifies the group heterogeneity of the impact of internet usage on social status discordance.

The remainder of this article is organized as follows. Section 2 presents the research hypotheses by reviewing existing research and related theories. Section 3 presents the specific methods used in this study, including data collection and variable measurements. Section 4 presents the results and tests the research hypotheses. Section 5 discusses the results and identifies current limitations and future research directions. Finally, Section 6 summarizes the conclusions of the study.

Literature review

Internet usage frequency and social status discordance

As a platform for information exchange and dissemination, the internet’s anonymity and equality can profoundly impact individuals’ existing social structures and socioeconomic status (Sun et al. 2023). On the one hand, the rapid development of the internet and other new media technologies provides interest groups with more convenient channels to express their opinions, allowing them to dominate the public opinion environment and exacerbating social isolation and division (Turow 1997). On the other hand, the timeliness and convenience of the internet offer people increased opportunities for education, work, and political participation, positively influencing the subjective social status of various social strata and their members (Wei 2006).

Despite its importance, few studies have directly examined the relationship between internet usage and social status discordance. Most relevant research has concentrated on the internet’s impact on subjective social status (Skogen et al. 2022; Feng and Liu 2022; Lin and Liu 2020; Zhou 2011), subjective well-being (Lu and Kandilov 2021; Yang et al. 2022), mental health (Sun et al. 2023; Kwak et al. 2022), and other issues. A review of the relevant literature reveals a debate on the internet’s impact on social status discordance.

According to reference group theory, the development of the internet and the information society has expanded people’s social networks. Individuals now consider not only those around them when selecting reference groups but also evaluate and define their socioeconomic status through ideal social groups and lifestyles observed online. Consequently, status identification is influenced by other status groups on the internet (Lu and Kandilov 2021).

When individuals focus on their similarities with positive information and believe that they can reach the comparison object’s level through effort, an assimilation effect occurs, leading to increased positive emotional experiences and behaviors (Mussweiler and Rüter 2003). This process fosters positive status identification, reducing the likelihood of status deflation. However, upward social comparison via online social media often results in negative emotions (Fardouly et al. 2015), problematic behaviors (Duffy et al. 2012), and low self-evaluations (Vogel et al. 2014), which can lead to negative status cognition and underestimation of one’s social status. For example, after browsing positive information from strangers on Instagram, individuals with a greater tendency for social comparison experience negative emotions, whereas those with lower tendencies experience positive emotions (De Vries et al. 2018). Additionally, Jackman and Jackman (1973) reported that online language, representing network groups, can easily aggravate status antagonism, leading to lower subjective social status through social comparison. Therefore, we propose the following competing hypotheses based on existing research and theories:

H1a

Internet usage frequency is positively related to individuals’ status deflation.

H1b

Internet usage frequency is negatively related to individuals’ status deflation.

Internet usage behavior and social status discordance

In studying the relationship between internet usage and status identification, researchers have gradually noticed the heterogeneity of internet usage and found that different internet usage behaviors result in varied status perceptions among individuals. The digital divide theory highlights inequalities in information and communication technologies, including access, usage, and utilization (Wen et al. 2023; Scheerder et al. 2017). Overall, individuals in society differ in their internet access and the modes, methods, and content of internet usage. The benefits obtained through the internet also vary to a certain extent, resulting in different types of subjective status identification.

Hargittai and Hinnant (2008) categorized internet usage into capital-enhancing and recreational. Capital-enhancing activities involve seeking political or government information, personal career development, and consulting financial and medical services, whereas recreational activities include checking sports scores and reading jokes. These activities reflect and affect actors’ economic, cultural, and social capital (Riggins and Dewan 2005; Baker et al. 2020). Individuals engaging in capital-enhancing activities are more likely to benefit from the internet, improving their objective social status. The higher the objective social status, the less room for overestimating one’s own social status (Chen and Fan 2015), thus reducing the likelihood of status inflation.

Therefore, we speculate that there are differences in the impact of internet usage behavior on social status identification. Accordingly, we propose the following research hypothesis:

H2

Capital-enhancing behavior is negatively related to individuals’ status inflation.

Objective social status, internet usage, and social status discordance

When investigating the relationship between internet usage and social status discordance, it is crucial to consider the role of objective social status. Existing studies demonstrate that objective social status is the most important determinant of individuals’ perceptions of themselves (Evans and Kelley 2004; Goldman et al. 2007; Oddsson 2018). Many studies have discussed the differences in status identification among various social status groups. Han and Qiu (2015) used CGSS2010 data to examine the status identity deviation of Chinese urban residents. They reported that the status identification of residents with upper status and upper-middle status shifted downward, that of middle-status residents was consistent with their objective status, and that of lower-status and lower-middle-status residents shifted upward. Sun and Wang (2019) noted that three major objective stratum indicators (income level, occupational prestige, and education level) have a significant effect on social status identification.

In addition, acceptance and use of the internet are largely the result of an individual’s objective social status (Zhu and He 2002). Members of different statuses often exhibit differences in their internet usage behavior. Studies on digital inequality have described and analyzed these differences (Riggins and Dewan 2005; Blank 2013).

To understand these differences, it is important to consider how socioeconomic status influences digital skills and functional choices regarding internet access and use (Hargittai and Hinnant 2008; Chen 2013). High-status groups have more opportunities and resources to develop advanced digital skills (Tapia et al. 2011), which allows them to use the internet more effectively for activities that enhance their social and economic capital (Baker et al. 2020; Van Deursen and Helsper 2015). This contrasts with individuals in lower-status groups, who might primarily use the internet for entertainment and social interactions (Correa 2016). However, it should also be noted that high-status groups have higher baseline expectations and standards (Field et al. 2024), which can lead to a perception of underachievement even when engaging in capital-enhancing internet activities, resulting in a bias in their status perceptions. Additionally, high-status individuals compare themselves with their elite peers online and offline (Prato et al. 2024), which may exacerbate the misjudgment of their own social status. In contrast, individuals from low-status groups are less likely to experience downward status cognition bias because they have lower initial expectations and fewer opportunities, and the marginal utility of the same capital appreciation behavior is greater than that of high-status individuals.

The discrepancy in social status discordance between social groups can be attributed to the varying reference points and benchmarks employed by individuals across social strata. When engaging in capital-enhancing activities, high-status individuals often compare themselves to peers of even higher status, leading to a perception of insufficient achievement despite actual improvements. This phenomenon aligns with social comparison theory, which suggests that individuals evaluate their status relative to others within their social context. Conversely, individuals from low-status groups, with lower initial expectations and fewer opportunities, set lower criteria for success. Therefore, participating in capital-enhancing activities may lead to a more pronounced sense of advancement, reducing the likelihood of status deflation within low-status groups.

Combining the above analysis, we speculate that differences exist in internet usage and social status discordance among groups with different objective social statuses. Therefore, the impact of internet usage on social status identification will show a certain degree of heterogeneity owing to the different objective social statuses of the research object. Thus, we propose the following hypothesis:

H3

The effects of internet usage differ among various social status groups.

Method

Data collection

The microdata used in this study were obtained from the Chinese Social Survey (CSS). The CSS, conducted by the Institute of Sociology at the Chinese Academy of Social Science, is a continuous large-scale social survey project in China. This survey systematically and comprehensively collected data on social changes in China during the transition period.

This study considers individual residents to be the analytical units. We measure objective social status and investigate the structure and distribution of social status discordance among residents in China. Furthermore, we analyze internet usage frequency and behavior and their impact on social status discordance. Considering the above research requirements and data availability, we use the CSS2021 data, which includes 30 provinces in China, and collect 10,136 qualified questionnaires. The samples are highly representative.

We eliminated invalid observations with missing or abnormal values of key variables, such as individuals who did not know or refused to answer questions about subjective social status, personal income, and years of education, and obtained 8483 valid samples.

Variable definition and operationalization

Dependent variable: social status discordance

The dependent variable in this study is social status discordance, which is operationalized as the difference between individual status identification and objective social status. The CSS2021 questionnaire directly measures status identification via the following questions: “What level do you think your current socioeconomic status is in the local area?” Responses are rated on a five-point Likert scale (1 = lower; 2 = lower-middle; 3 = middle; 4 = upper-middle; 5 = upper).

The measurement method for objective social status is relatively complex, and various measurement systems have been adopted in the academic community. As income, education, and occupation are key factors that determine social status (Hodge and Treiman 1968), researchers generally measure the objective social status of individuals in three dimensions: income, education level, and occupational prestige (Kirsten et al. 2023a, b; Chen and Fan 2015; Rarick et al. 2018; Li 2023). Income and education are the two most commonly used indicators for measuring objective social status (Van Doesum et al. 2017; Kraus et al. 2009). As the survey contained many missing values regarding occupation, using this indicator would invalidate many effective samples; therefore, we refer to Xu (2018) and use the Communist Party of China (CPC) membership as an alternative indicator of occupational prestige. In Chinese society, not only is CPC membership an important reference condition for entering certain professions, such as the civil service and job promotions, but also CPC member recruitment comprehensively considers the social status of applicants (Walder et al. 2000). This, to a certain extent, reflects the social status of the interviewees (Xu 2018). In addition, Hukou registration is an indicator of an individual’s socioeconomic status in China (Rarick et al. 2018). Hukou registration determines an individual’s social welfare, and residents with a rural identity have a lower social status in society (Han et al. 2011). Therefore, we use income, education, CPC membership, and Hukou registration to measure an individual's objective social status.

The operationalization process of each variable is as follows. First, we divide the interviewees’ income into five levels (1–5) according to the income quintiles of residents’ per capita disposable income reported in the 2021 China Statistical Yearbook. Second, we transform educational attainment into an ordinal variable according to the highest level of education reported by respondents, ranging from “not attending school” (1) to “graduate” (8). Third, the original classification of CPC membership includes Communist Party members, democratic parties, Communist Youth League members, and the masses. In this article, members of the democratic parties, Communist Youth League members, and the masses are uniformly classified as non-Communist Party members and assigned a value of 1; Communist Party members are assigned a value of 2. Finally, we use a dichotomous variable for Hukou registration (nonurban = 1, urban = 2). We standardize income level, education level, CPC membership, and Hukou registration and adopt factor analysis to combine the above four indicators into a comprehensive index to measure objective social status. Table 1 presents the results of the factor analysis.

Table 1 Factor analysis of objective social status

Social status discordance can be measured by the objective social status obtained through factor analysis and respondents’ subjective social status. We first divide the objective social status index of the sample into five equal parts to obtain an ordinal measure of objective social status, with values of 1–5 representing lower, lower-middle, middle, upper-middle, and upper statuses, respectively. We then subtract the abovementioned ordinal measure from the sample’s subjective social status to obtain the raw social status discordance (ranging from -4 to 4). A negative value indicates that the individual’s status identification shifts downward; smaller values indicate stronger status deflation. If the value of the raw social status discordance is zero, then the subjective and objective social statuses are consistent. If the value of the raw social status discordance is positive, this indicates that the individual’s status identification deviates upward; higher values indicate stronger status inflation.

To simplify the analysis, we convert the raw social status discordance into a standardized measure called the standardized social status discordance, which takes the values -1, 0, and 1. For the standardized social status discordance, -1 indicates status deflation (a downward shift in social status identification), 0 indicates status concordance (alignment between subjective and objective social status), and 1 indicates status inflation (an upward shift in social status identification). Unless otherwise specified, the standardized social status discordance is used in subsequent analyses.
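The construction described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the sample values are hypothetical, and "five equal parts" is read here as five equal-width intervals, which is an assumption (quantile bins would be another reading). The standardized measure is taken to be the sign of the raw score, consistent with the definitions of deflation, concordance, and inflation given above.

```python
def to_five_levels(index_values):
    """Bin a continuous objective-status index into ordinal levels 1..5.

    Assumption: "five equal parts" means five equal-width intervals
    between the sample minimum and maximum of the index.
    """
    lo, hi = min(index_values), max(index_values)
    width = (hi - lo) / 5
    levels = []
    for v in index_values:
        level = int((v - lo) / width) + 1 if width > 0 else 3
        levels.append(min(level, 5))  # the sample maximum belongs to level 5
    return levels


def raw_discordance(subjective, objective):
    """Subjective minus objective status, both on a 1..5 scale (range -4..4)."""
    return subjective - objective


def standardized_discordance(raw):
    """Sign of the raw score: -1 deflation, 0 concordance, 1 inflation."""
    return (raw > 0) - (raw < 0)


# Hypothetical factor scores and self-ratings for five respondents.
index = [-1.0, -0.5, 0.0, 0.6, 1.5]
subjective = [3, 2, 3, 2, 4]

objective = to_five_levels(index)
raw = [raw_discordance(s, o) for s, o in zip(subjective, objective)]
std = [standardized_discordance(r) for r in raw]
print(objective, raw, std)
```

In this toy sample the first respondent rates themselves two levels above their objective level (status inflation), while the last two rate themselves below it (status deflation).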

Independent variable: Internet usage

The independent variables in this study mainly consist of internet usage frequency and behavior.

The first part concerns internet usage frequency. The questionnaire asked the interviewees whether they used mobile phones or computers to surf the internet. Respondents who never use the internet form one category. Referring to Li and Ren (2022), we measure the frequency of internet usage among users by calculating the arithmetic average of all internet usage behaviors and, based on this average, divide users into two categories, occasional and frequent internet access. This yields a three-category variable.

The second part concerns internet usage behavior. This variable is measured based on the respondent’s answer to the question “How often do you go online for the following activities?” in the CSS2021 questionnaire. There are seven types of internet usage behaviors under this question: “browsing current political information (e.g., watching party and government news),” “entertainment and leisure (e.g., playing online games/listening to music/watching videos/reading novels),” “chatting and making friends (e.g., using WeChat, QQ, and other socializing apps to chat and make friends),” “business or work (including opening online stores, web anchors, and live streaming),” “learning and education,” “online shopping or life services (e.g., online shopping, takeout, map navigation, map positioning),” and “investment and financial management.” Responses are rated on a five-point Likert scale (1 = several times a year; 5 = almost daily). Higher values indicate a higher frequency of the corresponding internet usage behavior. Referring to Hargittai and Hinnant (2008), we divide internet usage behavior into two categories: capital-enhancing behavior and other behavior. The former includes “browsing current and political information,” “business or work,” “learning and education,” and “investment and financial management,” whereas the latter includes “entertainment and leisure,” “chatting and making friends,” and “online shopping or life services.” We measure the frequency of individual internet usage under different usage types by calculating the arithmetic average of different internet usage behaviors.
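As a hedged illustration of the averaging described above, the following Python sketch computes the capital-enhancing, other, and overall usage scores for one respondent. The item keys and the example responses are hypothetical labels introduced here, not CSS2021 variable names, and the cut-off separating occasional from frequent users is not reproduced because the text does not state it.

```python
from statistics import mean

# Hypothetical keys for the seven CSS2021 activity items (each rated 1..5).
CAPITAL_ENHANCING = [
    "political_info",
    "business_work",
    "learning_education",
    "investment_finance",
]
OTHER = ["entertainment", "chatting", "shopping_services"]


def behavior_scores(responses):
    """Arithmetic averages of item frequencies by behavior type."""
    return {
        "capital_enhancing": mean(responses[k] for k in CAPITAL_ENHANCING),
        "other": mean(responses[k] for k in OTHER),
        # Average over all seven items, used for overall usage frequency.
        "overall": mean(responses[k] for k in CAPITAL_ENHANCING + OTHER),
    }


# Hypothetical respondent.
respondent = {
    "political_info": 4,
    "business_work": 2,
    "learning_education": 3,
    "investment_finance": 1,
    "entertainment": 5,
    "chatting": 5,
    "shopping_services": 2,
}
scores = behavior_scores(respondent)
print(scores)
```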

Control variables

Considering that gender, age, marital status (Bucciol et al. 2019), personal income, education level (Chen and Fan 2015), and residential region (Wei et al. 2022) may affect an individual’s social status, this study includes these factors as control variables in the regression model. In addition, given that insufficient social security can cause downward bias in individuals’ status identification (Zou 2023), this study also includes social security participation as a control variable. By doing so, we aim to reduce the interference of omitted variables, ensure the robustness of the results, and more comprehensively explore the factors influencing social status discordance.

Gender (Women = 0; Men = 1), social security participation, and marital status are used as dummy variables. Social security participation is operationalized via the following two questions in the CSS2021 questionnaire: “Do you currently have any endowment insurance or pension provided by the government?” and “Do you currently have medical insurance or publicly funded health care provided by the government?” If the response to both questions is negative, the value of social security participation is 0; otherwise, the value is 1. The marital status classification in the questionnaire includes unmarried, cohabiting, first marriage, remarriage, divorced, and widowed. This study simplifies this as follows: cohabitation, first marriage, and remarriage are classified as married and assigned a value of 1; unmarried, divorced, and widowed individuals are classified as unmarried and assigned a value of 0. Age is calculated by subtracting the respondent’s year of birth from the survey year (2021) and is treated as a continuous variable. Personal annual income refers to the total income of respondents in 2020. To improve the model’s goodness of fit, we add one to the sample’s annual personal income and take the logarithm. The original data on education level was an ordinal variable; in the regression analysis, we transform these data into a continuous variable according to the standard years of each education level as follows: 0 years indicate no education, 6 years indicate primary school, 9 years indicate junior high school, 12 years indicate vocational and senior high school, 13 years indicate technical secondary school, 15 years indicate college, 16 years indicate an undergraduate degree, and 19 years indicate a graduate degree or above. The living location is divided into western, central, and eastern regions according to the province where the respondent is located. The descriptive statistical results of each variable in the sample are shown in Table 2.

Table 2 Descriptive statistics of the variables
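The coding rules for the control variables can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the dictionary keys and category labels are hypothetical, and the natural logarithm is assumed because the text only says "take the logarithm".

```python
import math

# Standard years of schooling per education level, as listed in the text.
EDU_YEARS = {
    "no education": 0,
    "primary school": 6,
    "junior high school": 9,
    "vocational or senior high school": 12,
    "technical secondary school": 13,
    "college": 15,
    "undergraduate": 16,
    "graduate or above": 19,
}
# Categories coded as married (1); all others are coded 0.
MARRIED = {"cohabiting", "first marriage", "remarriage"}


def code_controls(r, survey_year=2021):
    """Recode one raw response (hypothetical field names) into controls."""
    return {
        "male": 1 if r["gender"] == "man" else 0,
        "age": survey_year - r["birth_year"],
        "married": 1 if r["marital_status"] in MARRIED else 0,
        # Assumption: natural log of (income + 1).
        "log_income": math.log(r["income_2020"] + 1),
        "edu_years": EDU_YEARS[r["education"]],
        # 0 only if both the pension and the medical insurance answers are negative.
        "social_security": 1 if (r["has_pension"] or r["has_medical_insurance"]) else 0,
    }


example = {
    "gender": "man",
    "birth_year": 1985,
    "marital_status": "divorced",
    "income_2020": 0,
    "education": "college",
    "has_pension": False,
    "has_medical_insurance": True,
}
coded = code_controls(example)
print(coded)
```

Adding one before taking the logarithm keeps respondents with zero income in the sample, since the logarithm of zero is undefined.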

Model setting

We first fit an OLS model, treating social status discordance as a continuous variable. Additionally, since the dependent variable can also be viewed as a multicategorical variable comprising status deflation, status concordance, and status inflation, this study conducts multinomial logistic regression for hypothesis testing. Multinomial logistic regression comprises a set of log-odds (logit) equations, each comparing one outcome category with a reference category. The status concordance sample (p1) is used as the reference category to examine how a group of independent variables affects social status discordance, and p2 and p3 represent the probabilities of status deflation and inflation, respectively. The resulting multinomial logistic regression equations are as follows:

$$ \mathrm{Logit}\left( p_{2} / p_{1} \right) = \alpha_{2} + \sum_{i=1}^{n} \beta_{2i} x_{i} + \mu $$
$$ \mathrm{Logit}\left( p_{3} / p_{1} \right) = \alpha_{3} + \sum_{i=1}^{n} \beta_{3i} x_{i} + \mu $$

where j = 2 and j = 3 in Logit(pj/p1) represent status deflation and inflation, respectively; Logit(pj/p1) represents the log odds of outcome j relative to status concordance, which is the reference category of this model; xi indicates factors affecting social status discordance; α2 and α3 are constant terms; β2i and β3i represent the partial regression coefficients of factor i; and μ is the error term.
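To make the link between the two log-odds equations and the three outcome probabilities concrete, the following Python sketch converts fitted linear predictors into probabilities. The coefficient values are hypothetical and chosen only for illustration, and the error term μ is omitted because predicted probabilities are computed from the systematic part of the model.

```python
import math


def multinomial_probs(x, alpha2, beta2, alpha3, beta3):
    """Return (p1, p2, p3) with status concordance (p1) as the reference.

    eta_j = alpha_j + sum_i beta_ji * x_i is the log odds of outcome j
    relative to outcome 1, so p_j / p_1 = exp(eta_j).
    """
    eta2 = alpha2 + sum(b * xi for b, xi in zip(beta2, x))
    eta3 = alpha3 + sum(b * xi for b, xi in zip(beta3, x))
    denom = 1.0 + math.exp(eta2) + math.exp(eta3)
    return 1.0 / denom, math.exp(eta2) / denom, math.exp(eta3) / denom


# Hypothetical coefficients and covariates (not estimates from the paper).
x = [1.0, 0.0]
p1, p2, p3 = multinomial_probs(
    x, alpha2=0.2, beta2=[0.5, -0.3], alpha3=-0.1, beta3=[-0.4, 0.1]
)
print(p1, p2, p3)
```

A positive coefficient in the first equation raises the odds of status deflation relative to concordance, and a negative coefficient in the second lowers the odds of status inflation relative to concordance.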

Results

Descriptive analysis

We calculated the objective and subjective social status of Chinese residents in 2021. Figure 1 shows that the highest proportion of residents (31.24%) have an objective lower-middle status. As objective social status increases, the proportion of the population within each objective social status decreases. Moreover, most residents (41.27%) identified as middle social status, with very few (0.75%) seeing themselves as upper status. Comparing subjective status identity with objective social status, we find that fewer residents (51.41%) consider themselves below middle social status than those who are actually below it (61.61%). Additionally, the proportion of residents who see themselves above the middle social status (7.32%) is lower than the actual proportion in that status (16.75%), highlighting a clear inconsistency between objective social status and subjective status identification in Chinese society.

Fig. 1 Distributions of objective and subjective social status. The proportion of residents with an objective lower-middle status is the highest. In terms of subjective social status, most residents place themselves in the middle status of society, and only a few residents regard themselves as upper status, indicating a significant inconsistency between objective and subjective social status in Chinese society

The raw social status discordance is determined by subtracting the objective social status score from the subjective status score. The values range from -4 to 4. A negative value indicates a downward shift in subjective social status, with larger absolute values indicating stronger status deflation. A zero value indicates consistency between subjective social status and objective social status. A positive value indicates an upward deviation in subjective social status, with larger values indicating greater status inflation. Figure 2 shows the distribution of social status discordance among Chinese residents, which is approximately, though not exactly, normal. The highest proportion of residents (28.41%) have status concordance. More residents identify one or two units above their actual social status (35.27%) than one or two units below it (31.23%). Most residents either accurately determine their social status (28.41%) or exhibit one to two status deviations (66.50%), with only a few showing three or more deviations (5.08%).

Fig. 2 Distribution of raw social status discordance (percent). The proportion of residents with status concordance is the highest (28.41%). Most residents exhibit one to two status deviations (66.50%), and only a few residents have three or more status deviations (5.08%)

To describe the types of social status discordance, we transform the continuous values of social status discordance into three categories: status deflation, status concordance, and status inflation. We examined the distribution of social status discordance across different internet usage frequencies and among different objective social status groups (Table 3). To make the statistical results more intuitive, this study merges the lower and lower-middle social statuses into the lower-middle category and the upper and upper-middle social statuses into the upper-middle category while keeping the middle social status unchanged (Han and Qiu 2015).

Table 3 The distribution of social status discordance among different groups

The results reveal that 53.29% of the respondents who never used the internet exhibited status inflation, whereas 55.33% of those who frequently used the internet underestimated their social status. Among respondents who occasionally used the internet, the proportions of the three types of social status discordance are relatively close, with a greater proportion experiencing status inflation. Among groups with different objective social statuses, most individuals in the lower-middle-status group tend to overestimate their social status, whereas most individuals in the middle- and upper-middle-status groups experience status deflation.

These findings highlight the nuanced relationship between internet usage and social status discordance. Specifically, internet users are more likely to underestimate their social status, suggesting that increased exposure to online information and social comparison may lead to a more critical self-assessment. In contrast, non-internet users are more likely to exhibit status inflation, possibly due to a lack of comparative information. Because these descriptive patterns cannot establish causality, the relationship between internet usage and social status discordance needs to be examined further.

Baseline regression

Before the regression analysis, considering possible multicollinearity between variables, a collinearity test is conducted on each regression model with internet usage frequency and behavior as the independent variables. The variance inflation factor (VIF) values of all the variables in the models are less than 3, indicating no clear multicollinearity problem, so the subsequent analysis can be carried out.
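For readers unfamiliar with the diagnostic, the VIF of predictor j equals 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. Equivalently, it is the j-th diagonal element of the inverse of the predictors' correlation matrix. The sketch below uses this second form with the Python standard library only. It is an illustration under those assumptions, not the software routine used in the study, and the function names are hypothetical.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sqrt(sum(d * d for d in dev))
        if norm == 0:
            raise ValueError("constant predictor: correlation is undefined")
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor = diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two uncorrelated hypothetical predictors: both VIFs are close to 1.
example_vifs = variance_inflation_factors(
    [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
)
```

A VIF near 1 means a predictor is nearly uncorrelated with the others. The threshold of 3 applied in the text is stricter than the commonly cited rule-of-thumb cutoffs of 5 or 10.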

Table 4 reports the baseline regression results of the impact of internet usage frequency and behavior on the social status discordance of residents. In Model 1, only the variable of internet usage frequency is added, and the group without internet access is taken as the reference to examine the impact of internet usage frequency on social status discordance. Model 2 adds control variables based on Model 1. In Models 1 and 2, the regression coefficients of occasional internet access and frequent internet access are significantly negative, indicating that people who use the internet are less likely to inflate their status than those who do not use the internet. Thus, H1a is partially supported, and H1b fails the test. Furthermore, the regression results show that frequent internet access has a greater impact on social status discordance than occasional internet access. To examine this difference directly, we retain only respondents who use the internet and analyze the impact of internet usage frequency in Model 3. Since only internet users are retained, the internet usage frequency variable includes only occasional internet access and frequent internet access. Model 3 uses occasional internet access as the reference and finds that, among internet users, an increase in internet usage frequency reduces the likelihood of status inflation. This result again partially supports H1a. Finally, Model 4 analyzes the impact of different internet usage behaviors on social status discordance. Since there is no internet usage behavior variable for respondents who never use the internet, Model 4 also retains only internet users. The results show that capital-enhancing behavior has a significantly negative effect (β = -0.039, p < 0.001) on social status discordance, whereas the regression coefficients of the other behaviors are not significant. This result shows that using the internet to increase capital can have a more significant effect on social status discordance than internet use for other purposes. Specifically, individuals using the internet for capital enhancement are more likely to show less status inflation. Thus, H2 is supported.

Table 4 Baseline regression results

Among the control variables, gender has a significant effect on social status discordance. Status identification among men is less likely to shift upward than that among women. Furthermore, higher personal income and educational level are associated with less likelihood of status inflation.

Multinomial logistic regression

We adopt a multinomial logistic regression model to test the regression results further. With status concordance as the reference, two statistical models are established. Table 5 summarizes the results of the multinomial logistic regression model. Model 5 examines the impact of internet usage frequency on social status discordance. Models 6 and 7 select samples that use the internet to analyze the impact of different internet usage frequencies and behaviors on social status discordance.

Table 5 Multinomial logistic regression model

First, from the perspective of internet usage frequency, groups that use the internet are more likely to experience a downward shift in social status discordance. Specifically, individuals who occasionally use the internet are approximately 24.61% (\(e^{0.220} - 1 = 0.2461\)) more likely to underestimate their own status than to assess their social status accurately. Individuals who frequently use the internet are approximately 38.13% (\(e^{0.323} - 1 = 0.3813\)) more likely to have status deflation than status concordance, whereas the likelihood of status inflation has decreased by approximately 23.59% (\(e^{-0.269} - 1 = -0.2359\)). Further analysis of groups that use the internet revealed that, compared with individuals who occasionally use the internet, those who frequently use the internet are approximately 14.80% (\(e^{0.138} - 1 = 0.1480\)) more likely to have social status deflation than to have status concordance. In contrast, the likelihood of status inflation is approximately 14.96% (\(e^{-0.162} - 1 = -0.1496\)) lower than the likelihood of status concordance. The above results show that individuals who use the internet are more likely to underestimate their social status than those who do not, and the higher the frequency of internet usage, the greater the likelihood of having a lower status identity. Thus, H1a is fully supported.

Second, when the frequency of capital-enhancing behavior increases by one unit, compared with status concordance, the likelihood of status inflation decreases by 9.16% (\(e^{-0.0961} - 1 = -0.0916\)). The regression coefficient of capital-enhancing behavior is not significant in the comparison between status deflation and status concordance, indicating that capital-enhancing behavior reduces the likelihood of upward deviation in status identification. However, this does not prove that capital-enhancing behavior significantly impacts the downward deviation of status identification. Moreover, the regression coefficient of other behavior is not significant in Model 7; therefore, there is no evidence that other behavior has a significant effect on the social status discordance of individuals. In summary, individuals who use the internet for capital enhancement are more likely to show less upward bias in status identification. H2 is supported again.

Finally, according to the three models, the results for the control variables are consistent with the results of the baseline regression. Specifically, (1) men are more likely to underestimate their social status than women, whereas the likelihood of status inflation for men is lower than that for women; and (2) compared with status concordance, an increase in personal income and education level increases the likelihood of status deflation and reduces the likelihood of status inflation.
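The percentages quoted in this subsection follow from the usual conversion of a multinomial logit coefficient β into a relative change in the odds against the reference category, exp(β) - 1. The short Python check below reproduces them from the coefficients quoted in the text. The dictionary labels are illustrative and are not taken from Table 5.

```python
import math


def percent_change_in_odds(coef: float) -> float:
    """Percent change in the odds, relative to the reference outcome, implied by a logit coefficient."""
    return 100.0 * (math.exp(coef) - 1.0)


# Coefficients as quoted in the text; the labels are illustrative only.
quoted_coefficients = {
    "occasional vs. no use, deflation": 0.220,
    "frequent vs. no use, deflation": 0.323,
    "frequent vs. no use, inflation": -0.269,
    "frequent vs. occasional use, deflation": 0.138,
    "frequent vs. occasional use, inflation": -0.162,
    "capital-enhancing behavior, inflation": -0.0961,
}

for label, coef in quoted_coefficients.items():
    print(f"{label}: {percent_change_in_odds(coef):+.2f}%")
```

Strictly speaking, these figures are changes in the odds of an outcome relative to status concordance, not changes in its probability.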

Endogeneity analysis

Both internet usage and social status discordance are related to the current objective status of social members, and different social members have varying internet usage patterns, which potentially leads to an endogeneity issue between internet usage and the dependent variable. Therefore, this article employs both the multicategory treatment effects intervention (Teffects) and the instrumental variable (IV) methods to conduct endogeneity tests on the impact of internet usage frequency and usage behavior on social status discordance.

Propensity score matching methods, which are commonly used to address endogeneity, face challenges when dealing with multiclass treatment variables. The Teffects method can effectively address this issue (Zhang and Liang 2021). This method’s fundamental concept is similar to that of binary treatment effects but involves multiple (k-1) average treatment effects (ATEs). Additionally, the Teffects method cannot handle negative values, so we transformed the social status discordance into positive integers ranging from 1 to 3 for analysis. The analysis results are presented in Table 6.

Table 6 Endogeneity analysis (Teffects)

In Table 6, the treatment variable is the frequency of internet usage, which is categorized into three groups: no internet access, occasional internet access, and frequent internet access. We use occasional and frequent internet access as the treatment groups, with no internet access or occasional internet access as the corresponding control group. The table reports the impact of different internet usage frequencies (treatment groups) compared with the control group on social status discordance.

The results indicate that holding other conditions constant, individuals who occasionally use the internet experience a 5.57% decrease in social status discordance compared with nonusers; frequent internet users experience an 8.82% decrease compared with nonusers; and frequent users experience an 8.87% decrease compared with occasional users. These statistically significant results suggest that even after addressing the endogeneity between internet usage frequency and social status discordance, the impact of internet usage frequency on social status discordance remains. In other words, internet usage reduces the likelihood of an upward bias in social status identification. Therefore, the empirical results presented earlier are robust and reliable.

Additionally, we conduct an endogeneity analysis via the instrumental variable (IV) method. Specifically, this study first treats internet usage frequency as a continuous variable and selects the per capita number of mobile phones in each province as an instrumental variable for internet usage frequency. The per capita number of mobile phones in each province influences individual internet usage but does not directly affect individual social status, thus satisfying the relevance and exogeneity assumptions of instrumental variables. To address the endogeneity issue between internet usage behavior and social status discordance, this study adopts the method proposed by Li and Cai (2024), using the average frequency of different internet usage behaviors at the provincial level as instrumental variables for each internet usage behavior. On the one hand, internet usage at the provincial level is closely related to individual internet usage, satisfying the relevance assumption of instrumental variables. On the other hand, provincial internet usage does not directly affect individual status discordance, meeting the exogeneity assumption of instrumental variables. The two-stage least squares (2SLS) method is employed in this section for parameter estimation. The analysis results are presented in Table 7.
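The logic of 2SLS can be illustrated with a deliberately small example. The sketch below is not the study's specification. It assumes a single endogenous regressor x, a single instrument z, and no control variables, and it uses a constructed toy dataset in which an unobserved factor u affects both x and the outcome y but is uncorrelated with z. All names and numbers are hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """2SLS point estimate with one endogenous regressor and one instrument."""
    # Stage 1: project the endogenous regressor on the instrument.
    intercept_1, slope_1 = ols_fit(z, x)
    x_hat = [intercept_1 + slope_1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    _, beta = ols_fit(x_hat, y)
    return beta


# Toy data: u is an unobserved confounder constructed to be uncorrelated with z.
z = [0, 1, 2, 3, 4, 5]
u = [1, -1, 0, 0, -1, 1]
x = [2 * zi + ui for zi, ui in zip(z, u)]      # instrument is relevant for x
y = [3 * xi + 5 * ui for xi, ui in zip(x, u)]  # assumed true effect of x on y is 3

naive_slope = ols_fit(x, y)[1]               # biased upward by the confounder
iv_slope = two_stage_least_squares(y, x, z)  # recovers the assumed effect
```

In this constructed example, the naive regression overstates the effect because u raises both x and y, whereas the instrumented estimate recovers the assumed coefficient. Standard errors taken from the second-stage regression alone are not valid, so dedicated IV routines should be used for inference.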

Table 7 Endogeneity analysis (IV-2SLS)

In the two models presented in Table 7, the weak instrument test results indicate that the first-stage F statistics are 22.225 and 14.249, both above the conventional rule-of-thumb threshold of 10, confirming that the selected instrumental variables are not weak. Furthermore, the Hausman test results for both models have p values less than 0.0001, indicating that internet usage is an endogenous variable and justifying the use of an instrumental variable model. The number of endogenous explanatory variables is equal to the number of instrumental variables, indicating exact identification; therefore, there is no need for overidentification tests.

The findings demonstrate that increased internet usage frequency decreases the likelihood of status inflation after addressing potential endogeneity issues. Additionally, greater engagement in internet activities that enhance capital decreases the tendency for individuals to overestimate their social status. Moreover, the significance of the coefficients for other internet usage behaviors in Table 7 differs from the previous regression results, suggesting that these behaviors may not have a robust impact on social status discordance. However, since the primary research hypotheses of this study focus on the frequency of internet usage and capital-enhancing behaviors, this result does not affect the robustness of the core conclusions.

Heterogeneity analysis

To account for the potential heterogeneity in the impact of internet usage on social status discordance across different social strata, this study conducted a heterogeneity analysis. Traditional heterogeneity analysis methods usually rely on subgroup regressions and interaction terms and therefore depend on how the researcher specifies the model. Although treatment effect heterogeneity analysis by propensity score matching has made some breakthroughs, it has its own problems. Nonparametric models, such as causal forests, use a “matching” strategy to estimate the treatment effect of a certain treatment variable on “everyone,” allowing researchers to analyze further what factors affect such interindividual differences, thereby explaining the heterogeneity of treatment effects. Since the results of the baseline regression show that the impact of other behaviors on social status discordance is not significant, we use the causal forest model to test the heterogeneity of the treatment effects of internet usage frequency and capital-enhancing behavior. This method includes the following four steps.

Step 1 The heterogeneous treatment effect (HTE) is based on the conditional average treatment effect (CATE) as follows:

$$ \tau \left( x \right) = E\left[ {Y_{i} \left( 1 \right) - Y_{i} \left( 0 \right) \mid X_{i} = x} \right] $$

Step 2 Use the homogeneous treatment effect estimator \(\hat{\tau }\) to generate the “R-learner” function for the HTE (Nie and Wager 2021).

The estimator of the homogeneous treatment effect τ is as follows:

$$ \hat{\tau } = \frac{{\frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} \left[ {Y_{i} - \hat{m}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right]\left[ {Z_{i} - \hat{e}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right]}}{{\frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} \left[ {Z_{i} - \hat{e}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right]^{2} }} $$

where \(e\left( x \right) = P\left[ {Z_{i} \mid X_{i} = x} \right]\) represents the propensity score and \(m\left( x \right) = E\left[ {Y_{i} \mid X_{i} = x} \right]\) represents the conditional expectation of the outcome. The superscript \(\left( { - i} \right)\) denotes an “out-of-bag” prediction, meaning that \(Y_{i}\) is not used to compute \(\hat{m}^{{\left( { - i} \right)}} \left( {X_{i} } \right)\). The propensity score \(e\left( x \right)\) in the above equation is a probability and therefore takes values between 0 and 1. The above equation is used to generate an “R-learner” function:

$$ \hat{\tau }\left( \cdot \right) = \mathop {\arg \min }\limits_{\tau } \left\{ {\mathop \sum \limits_{i = 1}^{n} \left[ {\left( {Y_{i} - \hat{m}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right) - \tau \left( {X_{i} } \right)\left( {Z_{i} - \hat{e}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right)} \right]^{2} + \Lambda_{n} \left[ {\tau \left( \cdot \right)} \right]} \right\} $$

where \(\Lambda_{n} \left[ {\tau \left( \cdot \right)} \right]\) is the regularization factor controlling the function \(\hat{\tau }\left( \cdot \right)\).

Step 3 Combine the “R-learner” function with the “planting” of a random forest. A causal forest is the generalized random forest obtained from the above equation and the random forest results (Athey et al. 2019). The random forest formed by regression trees can be written as:

$$ \hat{m}\left( x \right) = \mathop \sum \limits_{i = 1}^{n} \left[ {\alpha_{i} \left( x \right)Y_{i} } \right] $$
$$ \alpha_{i} \left( x \right) = \frac{1}{B}\mathop \sum \limits_{b = 1}^{B} \frac{{1\left( {\left\{ {X_{i} \in L_{b} \left( x \right), i \in S_{b} } \right\}} \right)}}{{\left| {\left\{ {i: X_{i} \in L_{b} \left( x \right), i \in S_{b} } \right\}} \right|}} $$

where \(L_{b} \left( x \right)\) represents the leaf of the b-th tree in the forest that contains the point x, \(S_{b}\) is the subsample associated with that tree, and B is the number of regression trees. The term \(\alpha_{i} \left( x \right)\) is a data-driven kernel that indicates how frequently the i-th training individual falls in the same leaf as x.

Step 4 We obtain the treatment effects by employing the above method and the “planting” forest process. The estimated form of the HTE is as follows:

$$ \hat{\tau }\left( x \right) = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \alpha_{i} \left( x \right)\left[ {Y_{i} - \hat{m}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right]\left[ {Z_{i} - \hat{e}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right]}}{{\mathop \sum \nolimits_{i = 1}^{n} \alpha_{i} \left( x \right)\left[ {Z_{i} - \hat{e}^{{\left( { - i} \right)}} \left( {X_{i} } \right)} \right]^{2} }} $$
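The weighting scheme in Steps 3 and 4 can be illustrated with a minimal Python sketch. It does not grow a causal forest. Instead, it assumes that the leaf memberships of a target point x and the out-of-bag predictions of the outcome and the propensity score are already available, and it only shows how the forest weights and the weighted residual-on-residual estimate are computed from them. All function names and toy values are hypothetical.

```python
def forest_weights(leaf_members, n):
    """Forest weights alpha_i(x) for a target point x.

    leaf_members holds, for each tree b, the indices i in the tree's subsample S_b
    whose covariates fall in the same leaf L_b(x) as the target point x.
    """
    n_trees = len(leaf_members)
    alpha = [0.0] * n
    for members in leaf_members:
        if not members:
            continue  # this tree contributes nothing for x
        share = 1.0 / (len(members) * n_trees)
        for i in members:
            alpha[i] += share
    return alpha


def local_treatment_effect(alpha, y, m_hat, z, e_hat):
    """Kernel-weighted residual-on-residual estimate of the treatment effect at x."""
    numerator = 0.0
    denominator = 0.0
    for a, yi, mi, zi, ei in zip(alpha, y, m_hat, z, e_hat):
        z_resid = zi - ei  # treatment minus estimated propensity score
        numerator += a * (yi - mi) * z_resid
        denominator += a * z_resid ** 2
    return numerator / denominator


# Toy example: two trees, four training individuals.
alpha = forest_weights([[0, 1], [1, 2, 3]], n=4)

z = [1, 0, 1, 0]                # treatment indicator
e_hat = [0.5, 0.5, 0.5, 0.5]    # assumed out-of-bag propensity estimates
m_hat = [2.0, 2.0, 2.0, 2.0]    # assumed out-of-bag outcome predictions
y = [2.75, 1.25, 2.75, 1.25]    # built so that the effect is 1.5 for everyone
tau_at_x = local_treatment_effect(alpha, y, m_hat, z, e_hat)
```

With equal weights for all individuals, the same function reduces to the homogeneous estimator of Step 2. The forest weights localize that estimator around the covariate profile x.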

Referring to Athey and Imbens (2016), we adopt the principle of “honest estimation”; that is, the training set is divided into two equal subsets, where all samples in one subset are used to create the partition, and all the samples in the other subset are used to calculate the average treatment effect of each leaf. Table 8 shows that the estimated coefficient of objective social status is significantly positive at the level of 0.001, indicating that the treatment effects of internet usage frequency and capital-enhancing behavior on social status discordance vary significantly across different social statuses. Thus, H3 is supported. Although the previous regression results indicate that an increase in internet usage frequency and capital-enhancing usage behaviors decreases the likelihood of individuals’ overestimating their own social status, this heterogeneity analysis reveals that this decrease in the likelihood of status inflation is more pronounced among high-social-status groups than among low-status groups.

Table 8 Heterogeneity test based on the causal forest algorithm

Discussion

Using the CSS2021 data, this study examines the impact of internet usage on social status discordance among residents from multiple dimensions. The results show that there is a discrepancy between the objective and subjective social status of Chinese residents. An increase in the frequency of internet use reduces the likelihood of upward bias in status identification and increases the likelihood of downward bias in social status identification. Using the internet for capital enhancement can reduce the likelihood of status inflation. In addition, the impact of internet usage on social status discordance differs across various social statuses. These findings enhance our understanding of residents’ subjective social status and provide different perspectives for studying social stratification in the digital age. The specific implications are discussed further below.

First, this study incorporates internet usage as an influencing factor of social status discordance into the analytical model, which further enriches the research on status identification. Status identification has long been a key topic in social stratification research, and social status discordance and its influencing factors have been analyzed and tested from many angles (Evans and Kelley 2004; Goldman et al. 2007; Shirahase 2010; Chen and Fan 2015). The structural (Li 2005; Han and Qiu 2015; Fan and Chen 2015), reference group (Lu and Zhang 2006; Liu 2001), and social mobility theories (Liu 2002; Fan and Chen 2015; Zhang et al. 2019) provide classical explanations for the mechanisms that generate social status discordance. However, new media in the digital age also shape individual social status identification (Zhou 2011), whereas most studies on social status discordance focus on objective social status. Therefore, this research extends existing studies by bringing the internet into the study of social status discordance. Drawing on the internet’s social comparison and digital divide effects, this study investigates the impact of internet usage frequency and behavior on social status discordance, enriching the literature in this field.

Second, this study divides internet usage behaviors into different types, examines the impact of different usage behaviors on social status discordance, and explains the causal relationship between internet usage and social status discordance from a digital divide perspective. In the field of social stratification, an increasing number of studies have focused on the digital divide and digital inequality, and many studies have taken objective or subjective social status as the research object and conducted in-depth analyses and discussions of the relevant issues (Hargittai and Hinnant 2008; Zhou 2011; Correa 2016; Baker et al. 2020). Based on existing research, this study extends the focus of the research to social status discordance by examining the impact of internet usage frequency on social status discordance and the effects of different types of internet usage behaviors on social status discordance, which enriches the research on the digital divide in the field of social stratification.

Finally, in line with the emphasis that previous studies place on objective social status, this study incorporates the objective social status of individuals into the research framework and analyzes how the influence of internet usage on social status discordance varies by social status. Research on the digital divide and inequality has discussed the relationship between objective social status and internet usage (Parsons and Hick 2008; Ignatow and Robinson 2017; Lopez-Sintas et al. 2020), and the relationship between objective social status and social status discordance has been explained in related studies (Evans and Kelley 2004; Benjamin et al. 2013; Chen and Fan 2015). Based on existing research, this study uses a causal random forest model to analyze the differences in the impact of internet usage on social status discordance among various social statuses, which further enriches the literature in related fields.

This study has several limitations. First, some variables in the original data contain missing values, resulting in the loss of some cases. As the study relies on secondary data, these issues could not be resolved. If more complete data become available in the future, they will allow a fuller discussion of the impact of internet usage on the social status discordance of residents. Second, in the classification of objective social status, most studies have used latent class analysis (Chen and Fan 2015; Sun and Wang 2019) or the Erikson–Goldthorpe–Portocarero class scheme (Kirsten et al. 2023a, b; Zhang and Liang 2021; Zhang et al. 2019), which makes the division of different social statuses more accurate. However, these methods have high data requirements; therefore, this study can use only a relatively simplified method. The simplified method may cause slight inaccuracies in the measurement results; however, this does not significantly impact our conclusions. We will consider using a more accurate method to measure objective social status in the future. Third, this article studies the impact of internet usage on social status discordance but lacks an analysis of the influence mechanism, which will be further studied in the future.

Conclusion

Based on the CSS2021 data, this study examines the impact of internet usage frequency and behavior on social status discordance among Chinese residents. In addition, this study combines causal inference techniques with machine learning algorithms and uses a causal random forest model to examine the heterogeneity of treatment effects. First, this study adopts factor analysis to measure the objective social status of individuals, divides it into five equal parts, subtracts it from the subjective social status score, and obtains the value of individual social status discordance. Accordingly, social status discordance is divided into three types: status deflation, status concordance, and status inflation. Second, the study considers internet usage frequency and behavior as independent variables and uses the type of social status discordance as the dependent variable to investigate the impact of internet usage frequency and different usage behaviors on social status discordance. Finally, the study uses the causal forest model to test the heterogeneity in the treatment effects of internet usage frequency and capital-enhancing behavior.

The specific findings of this study are as follows. First, this study reveals a clear inconsistency between objective and subjective social status in Chinese society. Second, increased internet usage frequency reduces the likelihood of status inflation and increases the likelihood of status deflation. Third, capital-enhancing behavior significantly reduces the likelihood of status inflation, whereas other types of internet usage behavior do not significantly impact individual social status discordance. Finally, individuals with higher social status are less likely to overestimate their own social status than those with lower social status when they use the internet frequently and use it for capital enhancement.