1 Introduction

Accessing and using the Internet can give residents digital advantages, such as obtaining valuable information, finding a job, consulting on health issues, or accessing public services. However, Internet access and use presuppose access to technology and infrastructure, as well as the skills needed to cope with innovations in the digital world [1]. Differences in technology, facilities, and skills create a digital divide. The term describes such differences between countries, and also differences in Internet access and use by gender, income, and urban or rural residence [2]. Although China’s Internet access facilities have achieved widespread coverage [3], China’s rural areas still face a serious digital divide. Therefore, this paper uses CFPS 2018 data to conduct a detailed analysis of Internet diffusion in rural China. We chose the CFPS because it focuses on the economic and non-economic well-being of Chinese residents and covers a range of research topics, including economic activity, educational outcomes, health, and Internet use. It is a nationwide, large-scale, multidisciplinary longitudinal social survey, with a sample covering 25 provinces, municipalities, and autonomous regions. Its data on Internet use allow us to profile Internet users and to define various patterns of Internet use.

This paper contributes to the study of the digital divide in rural areas of developing countries in three ways. First, existing studies concentrate on the urban digital divide, and relatively few address the rural digital divide. Second, by studying the rural digital divide with microeconomic data, this study further clarifies the current state of, and the reasons for, the relative lag in rural development under China’s urban-rural dual system, especially as it appears in “digital space”. Third, this study provides empirical evidence to help the Chinese government formulate public policies, such as rural revitalization and common prosperity, that aim to increase the benefits of adequate and effective use of information technology (IT), including the Internet, and thus make it easier for a growing number of people living in rural areas of developing countries to access and use IT.

This article is structured as follows. Section 1 describes the significance of the study and introduces the research question. Section 2 reviews the literature on the new characteristics of rural China in the new era, the digital divide, and the determinants of Internet access and use. Section 3 presents the CFPS survey data and the variables used in the econometric models. Section 4 reports regression results on Internet access, use, and usage patterns in rural China. Section 5 gives the conclusions, policy recommendations, and limitations of the study.

2 Theoretical Analysis and Literature Review

2.1 New Characteristics of China’s Rural Areas in the New Era

In the more than 40 years since the reform and opening up, China’s rural society has undergone great changes, with the following main features. First, as the urbanization rate has risen, the number of villages in China has shown an overall downward trend, and rural space has “given way” to urban space [4]. Second, rural population mobility has increased and heterogeneity has grown; in particular, young and able-bodied people have moved out, leaving mainly the elderly, children, and women in rural areas [5]. Third, as population mobility reshapes rural mechanisms, significant differences have also emerged within rural China [6]. These new features have strengthened non-agricultural activities in the countryside and multiplied the links between rural areas and towns, highlighting the multifunctional role of rural space [7].

In the context of the rural revitalization strategy, the Chinese government has put forward a digital rural development strategy, which seeks to modernize agriculture, rural areas, and farmers through digital empowerment and to address their unbalanced and insufficient development. The aims are to narrow the development gap between urban and rural areas, to find endogenous drivers of rural development, and to provide new momentum for the modernization of rural society [8]. Information and communication technologies (ICT) are thus incorporated into the countryside as a key element of change in rural life [9], which in turn makes rural and urban areas more connected.

Owing to factors such as relative geographic isolation and low population density, digital access in rural areas lags behind that in urban areas [8]. Although the Chinese government has dramatically improved Internet coverage in rural areas through telecommunication policies such as “Village to Village”, a lack of access devices [10], educational limitations [11], and other factors mean that a large proportion of rural residents still have no access to ICT products and services. Although mobile devices such as cell phones have improved rural residents’ Internet access, most villagers have not yet been able to enjoy the digital dividends of the Internet because of the limitations of cell phones and residents’ generally low digital skills [12].

2.2 Digital Divide

The term “digital divide” originated in the United States in the 1990s and quickly became a topic of concern as ICT spread and penetrated widely [13]. Existing research distinguishes three levels. Studies of the first-level digital divide focus on differences in Internet access and use [14]. Studies of the second-level digital divide examine the digital skills needed to use the Internet and differences in the activities carried out online [15]. Studies of the third-level digital divide focus on the inequalities that result from the outcomes of Internet use [16], suggesting that differences created through Internet use may exacerbate real-world social stratification [17]. In addition, current studies provide further detail on differences in Internet use. For example, Internet use gives users more opportunities and resources to improve their education, employment, professional life, and social status [18]. Research on connected usage offers more insight into differences in the devices, uses, skills, and purposes of Internet use [19].

2.3 Literature Review: From Internet Access to Internet Use

Studies on Internet access and use have covered both developed and developing countries. For China, some scholars point out that the digital divide manifests mainly as an urban-rural divide [20] and an intergenerational divide [21]. In essence, the Matthew effect in the interaction between information and the social economy makes the digital society “replicate” the inequality of the real society and even “create” new inequality. Research on the elderly [22], low-income groups [23], and other groups on the disadvantaged side of the digital divide shows that the choice of online activities is determined by the specific characteristics of each group, and that the government needs to help these groups access and use the Internet through public policy.

Studies have found that the main factors affecting Internet penetration across countries include income, education level, and infrastructure [24]. Some scholars have found that income, telephone costs, and years of education are major determinants of online access [25]. Existing studies generally agree that Internet infrastructure in rural areas is still in its infancy, largely because of low population density and high network costs [26].

From a demand perspective, income disparities within countries hinder Internet diffusion [27]. Dohse and Cheng emphasize the important impact of geographic location on the digital divide [24], a finding that can be interpreted as a consequence of the lack of telecommunications infrastructure in rural and remote areas. Some studies have found that household Internet connection is influenced by income, education level, and number of children [28]. Other scholars have focused on the factors influencing the choice of Internet usage patterns, arguing that the choice of online activities, such as communication, entertainment, social networking, and e-commerce, depends largely on digital skills [29].

In contrast, the literature on the digital divide in China is less extensive and focuses mainly on comparative studies of urban-rural differences [30]. Regarding Internet diffusion, some scholars believe that, in addition to the traditional socio-economic determinants of Internet access, the presence of children in the family also plays an active role [31]. Scholars generally believe that education remains the biggest constraint on Chinese residents’ Internet use [32]. Regarding the quality of Internet use, economic income [33], education level, age [34], type of work [35], and digital skills [36] have been found to affect Chinese residents’ Internet use.

In the case of China, few studies have used microeconomic data to address the digital divide in the rural context. Some scholars have analyzed the disadvantages of rural areas through urban-rural comparisons [37]; others have studied the impact of the digital divide on rural income [38], consumption structure [39], and so on. Some studies focus on the impact of the digital divide on special groups such as youth [40] and the elderly [41]. However, existing studies have not comprehensively analyzed Internet access, use, and usage patterns (entertainment, social networking, e-commerce, study, and work) in rural areas, or the factors influencing them. Therefore, this paper distinguishes between types of Internet access (mobile access and regular access) and between usage patterns (entertainment, social networking, e-commerce, study, and work), and analyzes rural Internet access, use, and usage patterns together with their influencing factors.

3 Data Sources and Research Methodology

3.1 Data Sources

The data used in this paper come from the 2018 wave of the China Family Panel Studies (CFPS), organized and implemented by the Institute of Social Science Survey at Peking University. The CFPS tracks data at the individual, family, and community levels over a long period in order to record, in a comprehensive way, the characteristics of and changes in China’s society, economy, population, education, and health. A sample of 15,605 rural residents aged 12 years and above was selected from the CFPS for this study. The sample data contain socio-demographic information such as gender, age, education level, and occupation. The survey items of main interest concern Internet access (both regular access, for example through computers, and mobile access, for example through cell phones) and Internet use: respondents were asked about their Internet access, frequency of use, online activities, digital skills, and so on.

The descriptive statistics of the sample are as follows. In terms of gender, 51.84% of the respondents were male, which is broadly in line with the share of males in the population (51.13%) reported in the China Statistical Yearbook 2019. In terms of age, 22.33% of the respondents were 32 years old or younger, which represents the young population in rural areas. In terms of education, 45.11% had primary school education or below, 32.40% middle school, 12.41% high school, and only 10.08% college. These figures reflect the low proportion of people with higher education that is typical of rural China. Because education is an important factor influencing the digital divide [42], this deepens the possibility of an internal digital divide in rural areas and exacerbates the digital exclusion and educational backwardness of the rural population [8]. In terms of work, 59.14% of the respondents were mainly engaged in their own agricultural production and management, accounting for 35.71% of the employed. Interestingly, the western region has the highest percentage of farmers with Internet access, at 39.99%, followed by the eastern and central regions. As for digital connectivity in rural China, 35.53% of respondents had an Internet connection (either regular access or mobile access). Among them, the proportion with mobile Internet access (42.08%) exceeds the proportion with regular Internet access (11.48%), so Internet users connect mainly through mobile devices. The proportion of rural residents using the Internet is 34.27%, which means that 96.45% of those with an Internet connection meet the conditions for Internet use set in this paper.

3.2 Research Methodology

In order to determine the factors influencing Internet usage patterns in rural China, this paper identifies five usage patterns: entertainment, social networking, e-commerce, study, and work. Overall, rural users in the sample engaged relatively little in e-commerce, study, and work activities. This shows that rural residents seldom use the Internet for value-creating activities such as work, study, and transactions. The finding is similar to that of the China Academy of Information and Communications Technology’s “Research Report on China’s Urban and Rural Digital Inclusive Development - Digitalization Helps Revitalize the Rural Areas and Shared Prosperity,” which concluded that the proportion of China’s rural residents using the Internet for learning and business activities is low [43].

First, we use a logistic regression model to estimate the determinants of Internet access. There are two dependent variables, regular access (regular) and mobile access (mobile), both of which are binary (1 indicates an Internet connection and 0 indicates no Internet connection). The independent variables are a set of socioeconomic and demographic characteristics at the household and individual levels.

Second, two equations model Internet use decisions and usage patterns in rural areas. The choice of usage pattern depends on the Internet use decision: usage patterns are observed only for residents who decide to use the Internet, so the second equation may suffer from a sample selection problem. Because the dependent variable of the Internet use decision is binary (1 means use, 0 means no use), we follow Shen Hongbo et al. [44] and apply the Heckman two-stage method to address sample selection bias and obtain consistent and asymptotically valid parameter estimates.
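To illustrate the logistic access model described above, the following minimal sketch shows how a logit index maps to an access probability and how a coefficient translates into an odds ratio. The coefficient values and the two regressors are hypothetical placeholders chosen for illustration; they are not estimates from this paper.

```python
import math


def logit_probability(intercept, coefs, x):
    """Return P(access = 1 | x) under a logistic regression model."""
    index = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-index))


def odds_ratio(coef):
    """Multiplicative change in the odds of access for a one-unit increase."""
    return math.exp(coef)


# Hypothetical coefficients (NOT estimates from the paper).
# Regressors: [university degree (0/1), log of household income per capita]
INTERCEPT = -6.0
COEFS = [1.2, 0.5]

# Two otherwise identical residents who differ only in education.
p_no_degree = logit_probability(INTERCEPT, COEFS, [0, 9.0])
p_degree = logit_probability(INTERCEPT, COEFS, [1, 9.0])

print(f"P(access | no degree) = {p_no_degree:.3f}")
print(f"P(access | degree)    = {p_degree:.3f}")
print(f"Odds ratio for degree = {odds_ratio(COEFS[0]):.2f}")
```

In practice the coefficients would come from fitting the model to the survey data, once for regular access and once for mobile access.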

3.2.1 Using Probabilistic Models to Estimate Whether Residents Use the Internet or Not

The model is based on utility maximization: the decision to use the Internet depends on a range of individual and household characteristics, reflecting differences in education, Internet use skills, financial status, social capital, and age [45]. Each individual decides whether to use the Internet so as to maximize the utility derived from its use, which can be expressed as:

$$ {\text{y}}^{*}_{i0} = X_{i0} \beta_{0} + \varepsilon_{i0} $$
(1)

where \({\text{y}}^{*}_{i0}\) is the latent utility of individual \(i\), \(X_{i0}\) represents a matrix of independent variables (such as socio-demographic characteristics, social capital and digital skills), \(\beta_{0}\) represents a vector of coefficients, and \(\varepsilon_{i0}\) is a normally distributed random error term. Total utility is unobservable, but the decision to access the Internet is observable. The observed decision \({\text{y}}_{i0}\) is therefore the result of a decision-making process that is influenced by the explanatory variables: \({\text{y}}_{i0} = 1\) indicates an individual’s decision to access the Internet (regular access or mobile access), which occurs when \({\text{y}}^{*}_{i0} > 0\), and \({\text{y}}_{i0} = 0\) indicates not doing so.
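As a hedged illustration of how Eq. (1) links the unobserved utility to the observed decision, the step below assumes, as is conventional for this kind of model but not stated explicitly in the text, that the error variance is normalized to one and that an individual goes online when the latent utility is positive. Writing \({\text{y}}_{i0}\) for the observed decision and \(\Phi\) for the standard normal distribution function, the choice probability is then:

$$ \Pr({\text{y}}_{i0} = 1 \mid X_{i0}) = \Pr(\varepsilon_{i0} > -X_{i0}\beta_{0}) = \Phi(X_{i0}\beta_{0}) $$

The last equality uses the symmetry of the normal distribution, so a positive element of \(\beta_{0}\) raises the probability of going online for the corresponding characteristic.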

3.2.2 Internet Usage Pattern Selection Model

After determining whether the residents use the Internet or not, the Internet usage pattern selection model is determined as shown in (2):

$$ {\text{y}}_{ij} = X_{ij} \beta_{j} + \varepsilon_{ij} $$
(2)

where \(j\) (\(j = 1, \ldots, J\)) indexes the Internet usage patterns, \({\text{y}}_{ij}\) measures individual \(i\)’s use of pattern \(j\), \(X_{ij}\) represents the matrix of independent variables, \(\beta_{j}\) is the corresponding vector of coefficients, and \(\varepsilon_{ij}\) denotes the normally distributed random error term. This section uses the Heckman two-stage method for regression analysis and assumes that the error terms of Eqs. (1) and (2) follow a bivariate normal distribution with zero means and a possibly non-zero correlation. In applying this method, a significant estimated Lambda coefficient indicates the presence of sample selection bias [46], in which case a sample selection model is appropriate.
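For readers less familiar with the method, the second-stage equation can be sketched as follows. This is the textbook form of the Heckman correction, written under the assumptions that the first-stage error has unit variance, that \(\varepsilon_{ij}\) has standard deviation \(\sigma_{j}\), and that the two errors have correlation \(\rho_{j}\); the symbols \(\sigma_{j}\), \(\rho_{j}\), \(\lambda_{i}\), \(\phi\), and \(\Phi\) are introduced here for illustration only.

$$ E[{\text{y}}_{ij} \mid {\text{y}}_{i0} = 1] = X_{ij}\beta_{j} + \rho_{j}\sigma_{j}\lambda_{i}, \qquad \lambda_{i} = \frac{\phi(X_{i0}\hat{\beta}_{0})}{\Phi(X_{i0}\hat{\beta}_{0})} $$

Here \(\phi\) and \(\Phi\) are the standard normal density and distribution functions, and \(\hat{\beta}_{0}\) is the first-stage estimate. The coefficient on \(\lambda_{i}\) equals \(\rho_{j}\sigma_{j}\), so testing whether it is zero amounts to testing whether selection into Internet use is correlated with the usage-pattern outcome.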

3.2.3 Variable Selection

This paper considers five modes of use: entertainment, study, work, socialization, and e-commerce. To describe rural residents’ Internet use, a respondent is deemed to have used the Internet if the frequency of use of any of these five modes is at least once a month. The independent variables in the Internet use equation (Eq. 1) and the Internet usage pattern equation (Eq. 2) fall into three categories: the socioeconomic and demographic characteristics of the respondents, digital skills, and children in the household.
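The usage rule above can be sketched in a few lines. The frequency labels, their ordering, and the mode names below are hypothetical stand-ins for whatever coding the survey data actually use, so they would need to be adapted before being applied to the CFPS files.

```python
# Hypothetical frequency scale, ordered from most to least frequent.
# The actual survey coding may differ; adjust before use.
FREQUENCY_RANK = {
    "almost_daily": 0,
    "several_times_a_week": 1,
    "several_times_a_month": 2,
    "once_a_month": 3,
    "less_than_monthly": 4,
    "never": 5,
}
MONTHLY_THRESHOLD = FREQUENCY_RANK["once_a_month"]

# Hypothetical names for the five modes of use.
MODES = ("entertainment", "study", "work", "social", "ecommerce")


def uses_internet(frequencies):
    """Return 1 if any of the five modes is used at least once a month, else 0.

    `frequencies` maps a mode name to a frequency label; a mode that is
    missing from the mapping is treated as "never".
    """
    for mode in MODES:
        label = frequencies.get(mode, "never")
        if FREQUENCY_RANK[label] <= MONTHLY_THRESHOLD:
            return 1
    return 0


respondent = {"entertainment": "almost_daily", "work": "never"}
print(uses_internet(respondent))
```

Treating a missing answer as "never" is a design choice made for this sketch; dropping such respondents would be an equally defensible alternative.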

The socio-economic variables include gender, age, education level, household economic status, and personal occupation. For age, this paper follows Marlen et al., who identified three age ranges: 12–32, 33–64, and 65 and above [47]. Education level is measured in four categories: primary school and below (reference), middle school, high school/technical school/vocational high school, and university and above. Similarly, respondents’ work types are divided into own agricultural production and operation (reference), private enterprises/individual industrial and commercial households/other self-employment, agricultural wage work, employment, and non-agricultural casual work. This paper uses “net household income per capita” from the CFPS 2018 database to reflect the living conditions of households. Digital skills are measured with the questions “Do you send and receive emails?”, “Online shopping expenses (yuan/year)?”, and “How many hours do you spend online in your spare time every week?”. The number of children in the family is used to measure the impact of children on Internet use. Finally, this article divides China into eastern, central, and western regions to account for regional differences in Internet use.
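The variable coding described above can be sketched as follows. The category labels and function names are hypothetical and chosen for readability; only the age cut-offs and the choice of reference categories come from the text.

```python
EDUCATION_LEVELS = (
    "primary_or_below",  # reference category
    "middle_school",
    "high_school",
    "university_or_above",
)


def age_group(age):
    """Assign an age in years to one of the three ranges used in the paper."""
    if age < 12:
        raise ValueError("the sample only covers respondents aged 12 and above")
    if age <= 32:
        return "12-32"
    if age <= 64:
        return "33-64"
    return "65+"


def dummy_code(value, levels, reference):
    """Return one 0/1 indicator per non-reference level."""
    if value not in levels:
        raise ValueError(f"unknown category: {value!r}")
    return {f"is_{lvl}": int(value == lvl) for lvl in levels if lvl != reference}


print(age_group(45))
print(dummy_code("high_school", EDUCATION_LEVELS, "primary_or_below"))
```

A respondent in the reference category receives zeros on every indicator, so each estimated coefficient is read relative to that group.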

4 Empirical Results

4.1 Determinants of Internet Access

Table 1 shows the regression results for Internet access in rural China. The results confirm that education level and household income are core determinants of Internet access: residents with more education and higher household income are more likely to have access to the Internet. This result is consistent with Chaudhuri et al. [48]. A possible reason is that education enhances a person’s ability to receive information and learn, and mobile Internet access requires a certain learning ability, while the devices that enable mobile Internet access require a certain ability to pay. The significance of age and its squared term for mobile Internet access suggests that older people are more likely to adopt mobile Internet access, but with decreasing marginal effects. However, the size of the age coefficient shows that the effect of age on mobile Internet access is not very strong. Gender is another core factor affecting mobile access: men are more likely than women to have mobile Internet access, which may be related to the fact that men’s education is more likely to be valued in rural areas. Residents who are ‘private enterprise/self-employed/enterprise self-employed’ and ‘employed’ are more likely to have both types of Internet access than those who run their own agricultural business. A possible reason why employed residents are more likely to have access is that the organizations they work for have certain requirements for Internet use, which increases their opportunities and likelihood of going online. Geographically, residents in Western China (compared with Eastern China) are less likely to have access to the Internet (both regular and mobile), and residents in Central China (compared with Eastern China) are less likely to have mobile Internet access. This reveals that Internet access in rural China may be closely linked to regional economic development, suggesting that the digital divide in rural areas in the digital era is of the same nature as the development divide of the industrialized era (Qiu Zeqi). The data do not confirm a relationship between household size or number of children and Internet access. In general, farmers with Internet access tend to be better-educated men in the eastern part of the country who have higher family income and are employed by an organization.
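The age result can be made concrete with a short worked expression. Suppose, as an assumption consistent with the description above rather than a reported estimate, that age enters the model index as \(\beta_{a}\,\text{age} + \beta_{a2}\,\text{age}^{2}\) with \(\beta_{a} > 0\) and \(\beta_{a2} < 0\) (the symbols are introduced here for illustration). The effect of one more year of age on the index is then:

$$ \frac{\partial (\beta_{a}\,\text{age} + \beta_{a2}\,\text{age}^{2})}{\partial\,\text{age}} = \beta_{a} + 2\beta_{a2}\,\text{age}, \qquad \text{age}^{*} = -\frac{\beta_{a}}{2\beta_{a2}} $$

The effect shrinks as age rises and changes sign at \(\text{age}^{*}\), which is what “decreasing marginal effects” means in a specification of this kind. Whether \(\text{age}^{*}\) falls inside the sampled age range depends on the estimated coefficients in Table 1.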

Table 1. Regression results of factors influencing rural Internet access

4.2 Determinants of Internet Use (First Stage)

Table 2 shows the first-stage regression results of the Heckman two-stage estimation model. Age, education level, digital skills, type of job, and household economic status are determinants of Internet adoption. Overall, the decision to use the Internet depends on the expected benefits of going online and the associated costs. The sample shows that young people in rural areas with a university degree or higher, stronger digital skills, and better household economic circumstances are more likely to use the Internet. There is a clear disparity by age: the younger the resident, the more likely they are to use the Internet, suggesting that younger people are more willing to experiment with and adapt to new technologies, while older people are less comfortable with them. Residents with a university degree or higher are more likely to use the Internet, which is consistent with existing research showing that more educated people are more likely to go online; this positive effect remains significant in rural areas. Residents in the “private enterprise/self-employed/enterprise self-employed” and “employed” categories are more likely to use the Internet than residents running their own agricultural operations, probably because they have a greater need to use the Internet at work.

Table 2. Regression results of factors influencing rural Internet use

Access to wealth is crucial for Internet use: people of higher economic status are more likely to use the Internet, which is consistent with the findings of Kilenthong et al. [49]. As for digital skills, the likelihood of using the Internet is higher when a person has adequate work-study, business, or recreational skills. This result is consistent with Hu Ying’s study of Chinese rural residents [50] and suggests that digital skills are a main driver of residents’ Internet use.

Likewise, geographical location is crucial to Internet use. The data show that people living in rural areas of western China are more likely to use the Internet, which is an interesting finding. Scholars generally hold that since the reform and opening up, China’s economy has been stronger in the east and weaker in the central and western regions. Information and communication technology (ICT) has been an important support for less developed regions to achieve “catch-up” development, but weak infrastructure and other factors have made Internet access and use significantly less favorable in the west than in the east [51]. Our study nevertheless finds that residents of rural areas in the west are more likely to use the Internet, which may be related to China’s western development strategy and the “Village to Village” telecommunication policy, which encourage and support more rural residents in the west to use the Internet.

4.3 Determinants of Internet Use (Second Stage)

The Lambda coefficient of the treatment variable (Internet use) can be calculated from the first-stage regression results. After adding the Lambda term, the second stage of the two-stage estimation model can be estimated; the results are shown in Table 3. The Lambda coefficients for the five Internet usage patterns are all significant at the 0.001 level, which indicates that a sample selection model is appropriate. Table 3 reports the Heckman two-stage results for the factors influencing Internet usage patterns (entertainment, social networking, e-commerce, learning, and work) in rural China. Internet usage patterns vary by gender, age, education level, occupation, and geographic location. In terms of gender, women are more likely to use the Internet for social networking and e-commerce, while men are more likely to use it for learning. This shows that Chinese rural women mainly use the Internet to maintain contact with family and friends, activities related to the traditional role of rural women in the Chinese family. In terms of age, young people are more likely to use the Internet for entertainment, social networking, e-commerce, study, and work, whereas the elderly are less likely to carry out any of these activities online, except for learning, where no significant relationship was found.
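In the standard Heckman procedure, the Lambda term is the inverse Mills ratio evaluated at each Internet user’s first-stage index. The sketch below assumes a probit first stage and uses only the Python standard library; the function names and the example index values are hypothetical and are not taken from the paper’s estimates.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function.

    Written with erfc so that it stays accurate for large negative z.
    """
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills_ratio(z):
    """lambda(z) = pdf(z) / cdf(z), computed for selected observations."""
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical first-stage index values (X * beta_hat) for three users.
first_stage_index = [-1.0, 0.0, 1.5]
lambdas = [inverse_mills_ratio(z) for z in first_stage_index]
for z, lam in zip(first_stage_index, lambdas):
    print(f"index = {z:+.1f}  lambda = {lam:.4f}")
```

The resulting values would be added as an extra regressor in each second-stage usage-pattern equation, and the significance of its coefficient is what the text refers to as the Lambda coefficient.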

Table 3. Regression results of factors influencing rural Internet use

In terms of education, those most likely to study and work via the Internet have a university education or above, while those most likely to engage in e-commerce have a high school education or above. This implies that rural users with more education are more likely to engage in value-added activities via the Internet. Interestingly, residents with different education levels are relatively similar in their likelihood of using online entertainment. This is consistent with Hu Ying’s (2022) study: at present, most residents use the Internet mainly for entertainment. This structure of Internet usage patterns urgently needs to be improved and adjusted [52].

In terms of occupation, residents who are “employed” or who work in the private sector are more likely to participate in value-added online activities such as learning and working, while “agricultural wage earners” and “non-agricultural casual” workers are less likely to do so. This shows that differences in occupational type have created a digital divide at the level of Internet use in rural areas.

Digital skills significantly increase the likelihood of engaging in entertainment, social networking, e-commerce, and learning via the Internet, but have no significant effect on work-related use. This is broadly consistent with existing research, which generally agrees that people with higher digital skills are more likely to use different Internet modes [53].

Geographic location is crucial in explaining Internet usage patterns, and Table 5 illustrates the heterogeneity in Internet usage patterns across different rural areas in China. In rural China, users in the central region are more likely to use the Internet for social activities, and users in the western region are more likely to use it for social and learning activities. This is inconsistent with some existing research, which holds that residents in rural areas of western China are drowning in entertainment and other recreational “junk information”. We suggest instead that, as China’s western region continues to develop, rural residents’ attitudes towards the Internet have changed dramatically, and that the Internet may become an effective means for the western region to catch up by leaps and bounds. In addition, the higher the income, the higher the likelihood of using the Internet for entertainment, socializing, e-commerce, learning, and work. Finally, users with more children in the household are more likely to use the Internet for learning, which may be related to the current promotion of online education and the importance that families place on their children’s education.

5 Conclusions and Recommendations

Using the Heckman two-stage model and data from the 2018 China Family Panel Studies (CFPS), this paper studies the factors influencing Internet access, use, and usage patterns in rural China, and provides empirical evidence of first- and second-level digital divides in rural China. By explaining the factors behind rural residents’ Internet access and usage behaviors at the micro level, it helps enrich theoretical research on the digital divide. The factors influencing Internet access, use, and usage patterns in rural China are found to be similar to those observed in other developing countries at the early stage of Internet diffusion. The main findings are as follows: (1) Internet users in rural China are typically young and well educated, with good income and digital skills, and there is a large gap between access methods, with mobile Internet access significantly more common than conventional Internet access; (2) age, education level, digital skills, and household economic status are important factors affecting Internet use; (3) Internet usage patterns (entertainment, social networking, e-commerce, learning, and work) differ significantly by gender, age, education level, occupation, and geographic location; and (4) rural users’ Internet use is dominated by non-value-creating activities such as entertainment and socializing.

Based on these conclusions, the following countermeasures are proposed for Internet popularization and use in rural areas. First, it is necessary to understand comprehensively how the digital divide in rural areas is evolving in practice, to establish sound digital policies that reflect rural areas’ reliance on smartphones for Internet access, and to step up the promotion of online services and applications related to employment, education, and public health, so that more people in rural areas of developing countries can enjoy the digital dividend. Second, rural Internet access infrastructure should continue to be improved to help more rural residents access the Internet through broadband; communication companies should be encouraged to keep strengthening smartphone products and related services so that mobile Internet access offers the same functions as conventional Internet access; and a sound training system for rural users’ Internet skills should be established, so that improved digital skills translate into Internet use and enhance the advantages and benefits that it brings. Third, e-commerce-related policies should be continuously improved to provide policy support for rural young people to engage in e-commerce; e-commerce should be treated as an important way to address youth employment in rural areas; and young people’s digital skills should be continuously improved to support the sustainable and healthy development of rural e-commerce. Finally, the government must pay attention to the digital divide among the rural elderly. On the one hand, the community should serve as the unit for improving the digital skills of the elderly and minimizing the digital divide among them; on the other hand, in the various specific scenarios of grassroots governance (especially those aimed at “intelligent” governance), attention should be paid to the fact that the elderly cannot use the Internet well, so that grassroots governance shows warmth.

It should be pointed out that this study deals only with the first two levels of the digital divide, Internet access and Internet use, and does not discuss the third level, namely the inequality that results from Internet use. This is a direction for future research: examining the current state of the third-level digital divide in rural areas and its influencing factors, so as to further enrich research on the impact of Internet access and use on the life satisfaction of residents in rural areas of China and other developing countries.