1 Introduction

The continued spate of unauthorized procurement and exploitation of personal information from web sites has raised levels of privacy concern in the online community. This has had a significant negative impact on online business transactions and online social activities. Previous studies estimate that the decline in users’ trust caused by privacy concerns has led to heavy losses in business-to-consumer online business. Social network sites have also come under criticism because of eroding privacy policies. For example, in 2006, the ability to post content on a friend’s Wall on Facebook was met with some consumer backlash, and some users protested against the feature being implemented at all. In 2008, Facebook’s Beacon social advertising system received a great number of complaints, became the target of a class action lawsuit, and was ultimately shut down in September 2009. In 2010, Facebook’s Places, which enabled location sharing with friends, came under fire for its potential use as a cyberstalking tool or for other less-than-savory purposes.

Given general privacy concerns, consumers are less inclined to submit information on social network sites, at times even providing fake personal information for “date-of-birth” or “post code” data entry fields. The self-disclosure of personal information on social network sites can, after all, provide criminals with the opportunity to commit identity theft and a host of other cybercrimes. How service providers can continue to convince users to provide personal information online despite the ever-increasing risks is therefore a significant research issue for the sustainable development of social network sites.

Previous scholarly works have been mainly focused on studying the connection between privacy concern and information disclosure. Dwyer et al. [10] compared Facebook and Myspace in relation to privacy concerns and trust. Acquisti and Gross [1] analyzed how privacy concerns affect people’s information disclosure on Facebook. However, these studies did not take into account those factors that actually work to encourage users to disclose personal information on social network sites. According to the Theory of Reasoned Action (TRA) and the Theory of Planned Behavior (TPB) [2], the adoption of some behaviors must be directly related to some benefit. In this instance, the self-disclosure of personal information must be commensurately connected with some perceived benefits coming as a result of that disclosure. The privacy calculus model which describes the intention of self-disclosure of personal information within a risk-return exchange has been drawing more and more attention [5, 22, 30]. Some scholars explain the behavior of disclosing private information publicly using the constructs of privacy concern and perceived benefits [5, 36], although most of the studies are limited to online business transactions. Compared to online business sites, social network sites mainly request demographic information such as the user’s age and gender. Users are also encouraged to use their real name and not a pseudonym. In addition, the benefits perceived on social network sites are not discounts or free services, but the foundation of social capital or community attachment. This paper combines the Theory of Planned Behavior and privacy calculus to present a new holistic model toward the explanation of self-disclosure of personal information on social network sites.

2 Literature review

2.1 The theory of planned behavior

Since the 1980s, theories that explain human behavior have become increasingly refined. Theories such as the Social Cognitive Theory (SCT), the Theory of Reasoned Action (TRA), and the Theory of Planned Behavior (TPB) explain people’s behavioral intentions. In TPB, for example, attitude, behavior control and subjective norm are the main factors affecting behavioral intention, which in turn directly determines the adoption of a behavior (Fig. 1).

Fig. 1 Theory of planned behavior [2]

The Theory of Planned Behavior model has also been used to explain consumer online preferences. Lee (2009) added perceived risk and trust into TPB to study online transactions with acceptable results. The deductive method seemingly supports the conclusion that combined with certain characteristics in social networks, the TPB model can be used to explain the self-disclosure of personal information by users of online social network sites.

2.2 The theory of privacy calculus

Defining privacy is not straightforward, despite the vast number of studies on the topic. Given the diverse contexts in which privacy is described, a universally applied definition of privacy does not exist. In the theory of privacy calculus, “privacy” and the self-disclosure of “private” information are viewed from an economic angle. Klopfer and Rubenstein [17], for instance, regarded privacy as an entitlement which can be exchanged for more value. On that basis, many studies hold that a behavior is adopted only after its risks and returns have been weighed [6, 22]. For privacy issues in particular, the perceived benefit must exceed the perceived risk for the motive of self-disclosure to arise [33].

3 The proposed integrated model

Combining the TPB model with privacy calculus, an integrated model is proposed (Fig. 2). In this model, privacy disclosure is determined by privacy concern and perceived benefit, which are drawn from the theory of privacy calculus. The constructs of behavior control, subjective norm and the factors of attitude are taken from the TPB model.

Fig. 2 Integrated model

3.1 Constructs and hypotheses in the model

3.1.1 Perceived benefit

As most of the relevant studies were in the context of online business transactions, perceived benefit was regarded as discounted or free services. Yang and Wang [36] performed an experiment which used discounts as compensation for the submission of personal information. Phelps et al. [27] found that the voluntary disclosure of personal information by consumers equated to a contraction in shopping time and better purchasing recommendations. Xu et al. [35] regarded more personalized service as perceived benefit. Generally, the perceived benefits mentioned above can be regarded more as attention to personalized service.

In the context of online social network sites, the self-disclosure of personal information can also achieve the same kind of return. For example, renren.com, a China-based social networking platform, uses a star rating system to rate users by the amount of personal information they disclose on the site. The higher the rating, the more services the user enjoys. However, most users who engage with online social network sites do not expect such practical benefits. Most users are motivated to disclose personal information in order to interact with online communities with which they share a common interest. To these users, connecting within and between social networks is, in essence, a form of social capital.

People wish to be accepted and valued by the social networks they are members of and this can be explained in two ways.

Participating in a social network repays members’ organizational commitments. According to Forman et al. [12], people can achieve community attachment by enrolling and taking part in a social network. Forman et al. [12] found that people tend to make their personal information public for two reasons: (a) to make themselves easier to identify; and (b) to gain an attachment to relevant communities. Khan and Shaikh [16] investigated why people add many strangers as “friends” on social network sites and postulated that it had to do with gaining popularity. In organizational behavior (OB) this is known as “the sense of belonging”. Salancik [29] held that community attachment is the dependence people have on a certain organization, and the behavior prompted by that organization of which they are a member.

Social network sites provide people with a great deal of virtual social capital. The accumulation of social capital permits people to attain rich resources, such as information linkages, organizing and cooperative abilities, and access to a web of relationships. Granovetter [13] argued for the necessity of social networks using the theory of weak ties. Having more relationships and connections gives people more resources to call on when they are in need of a helping hand.

The introduction of online social network sites helped some people consolidate former relationships and made it possible to establish new circles of friends. It is said that the development of these platforms has reduced the cost of communication and has sharply increased the number of weak ties one can handle [9]. More advanced internet techniques have provided diverse methods to maintain relationships online, such as address lists with powerful functions and video conferencing. Social capital can only be derived from the willingness of participants in the network to submit fundamental personal information. That is what motivates people to disclose who they are for authentication purposes.

The accumulation of social capital and the sense of community attachment are related. The social network site offering more social capital draws the user’s loyalty and guarantees attachment providing a certain level of “stickiness” in the form of quality return visits and participation; in turn, the sites with abundant loyal users can offer even more in terms of social capital for the individual user, and the members at large.

Numerous articles present the relationship between the benefit perceived by consumers and their online behavior. For example, shoppers will continue to disclose personal information if it means they will qualify for online discounts or promotions [8, 15]. Ellison et al. [3] found that perceived benefits had a positive effect on the usage of social network sites. Based on this research, this paper assumes that perceived benefits have a positive effect on the self-disclosure of personal information on social network sites.

Hypothesis 1

Perceived benefit will have a positive effect on the self-disclosure of personal information on social network sites.

3.1.2 Privacy concern

Privacy concern refers to the users’ concern about threats to their privacy online. This construct reflects users’ response to the perceived possibility of a privacy leak and the expected loss induced by the abuse of privacy. According to Paine et al. [24], privacy concern is not only the reaction to the security of privacy but also a motivator for users to take care of their personal information. Milne and Culnan [21] proved high levels of privacy concern provided the impetus for users to read, at least, the introduction of privacy policies online. And users with high levels of privacy concern may also be inclined to refuse to submit personal information to a social network site [31] or to submit false information [14]. Privacy concern has become one of the most important factors in studying online privacy issues.

Hypothesis 2

Privacy concerns will have a negative effect on the self-disclosure of personal information on social network sites.

3.1.3 Privacy sensitivity

Privacy sensitivity represents people’s attitudes toward revealing differing levels of personal information during the online shopping experience. Phelps et al. [28] divided personal information into three categories: (1) demographic information; (2) lifestyle and shopping information; and (3) personal financial information. Malhotra et al. [20] found that people are willing to provide less sensitive information online, but when they are faced with the dilemma of providing more sensitive information they usually decline participation. To measure privacy sensitivity, Yang and Wang [36] set matched groups in their experiment: (1) users that were asked demographic information only; and (2) users that were asked both demographic information and personal financial information.

Yang and Wang’s categorization of personal information mentioned above is not entirely suitable for social network sites, as extensive demographic data is not usually requested, and personal financial information is relevant to an even lesser extent. When compared with online shopping sites, the only information that users of social network sites are requested to submit belongs to the demographic information category. Thus, it can be assumed that even though the same request to submit “essential” personal information is being made on the social network site, users may individually react in different ways. This is analogous to the Technology Acceptance Model (TAM), in which different users can perceive different usefulness in the same facility. In experiential studies, it is not the attributes of a given technology that are of importance but a respondent’s feeling towards that technology that should be measured [32]. On social network sites, a user can possess differing levels of personal information sensitivity, while the categories of information themselves are less relevant.

Hypothesis 3

Privacy sensitivity will have a positive effect on the level of privacy concern.

3.1.4 Privacy risk

Privacy risk refers to users’ expectation of losses associated with privacy disclosure online, which is caused by opportunistic behavior and the misuse of personal information. The greater the losses caused by the disclosure of personal information, the greater the risk users would perceive.

Experiential studies of online transactions showed that privacy risk had a negative effect on retail transactions [23, 25]. Many scholars have confirmed the existence of a relationship between privacy concern and privacy risk. Chellappa and Sin [4] found a positive relationship between privacy concern and privacy risk, but did not discuss the direction of causation. In the model presented by Chellappa and Sin [4], the level of privacy concern of Internet users was found to have a positive effect on privacy risk. Xu et al. [34] and Dinev and Hart [8] found that perceived risk had a positive effect on privacy concern. The Theory of Reasoned Action holds that expected outcomes govern attitudes toward a behavior, and that expected outcomes are affected by exogenous variables. So we further suppose that the level of privacy concern comes from perceived risk, and that perceived risk is rooted in the attributes of web sites and the networked environment.

Hypothesis 4

Privacy risk will have a positive effect on privacy concern.

3.1.5 Information control

Information control refers to the capacity people have to control information released online. Factors that determine the perception people have of information control relate to manners in which web sites collect, store and utilize user personal information. Those factors can be reduced to four points: (1) the presence of a privacy policy on the online site; (2) knowing that information is being collected; (3) voluntary/involuntary submission of the personal information in question; and (4) the openness of the type of information usage by the online organization.

We could postulate that if the privacy policy of a given web site made users feel they could maintain their privacy, it would significantly decrease the level of user privacy concern. In researching online shopping, Phelps et al. [28] reported that the lack of control over personal information explained 42.5 % of the change in people’s level of privacy concern. Dinev and Hart [8] considered that it was the control that people perceived over their personal information that governed the extent of self-disclosure online.

Hypothesis 5

Information control will have a negative effect on the level of privacy concern.

3.1.6 Subjective norm

As a notion deriving from psychology and sociology, subjective norm has been widely used in studies exploring factors acting on people’s attitude about certain behaviors. Lehikoinen et al. [18] found that social culture had a significant effect on people’s information disclosure in social networks. Lewis et al. [19] found that students were more likely to take part in social network sites where their classmates had already gained membership. Xu et al. [34] demonstrated that subjective norm acted on the degree to which people regard privacy online. All of these studies indicate that subjective norm influences people’s privacy disclosure online.

In contrast to previous studies, the application of the subjective norm on social network sites also includes the effects of other users. Dwyer et al. [10] believed that users who willingly disclose personal information linked their behavior to the trust they had established with other users in the network. If in a given social network site, everyone tended to share real personal information with one another, then that behavior would be considered a subjective norm on that site.

Hypothesis 6

The subjective norm will have a positive effect on privacy concerns.
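The six hypotheses can be summarized compactly as two linear relations. This is only a notational sketch that assumes linear effects; the coefficient symbols $\gamma_i$ and $\delta_i$ and the error terms $\zeta_i$ are introduced here for illustration and are not part of the original model figure.

```latex
\begin{aligned}
\text{privacy concern} &= \gamma_1\,\text{privacy sensitivity}
  + \gamma_2\,\text{privacy risk}
  + \gamma_3\,\text{information control}
  + \gamma_4\,\text{subjective norm} + \zeta_1, \\
\text{privacy disclosure} &= \delta_1\,\text{perceived benefit}
  + \delta_2\,\text{privacy concern} + \zeta_2 .
\end{aligned}
```

Under this notation the hypothesized signs are $\delta_1 > 0$ (Hypothesis 1), $\delta_2 < 0$ (Hypothesis 2), $\gamma_1 > 0$ (Hypothesis 3), $\gamma_2 > 0$ (Hypothesis 4), $\gamma_3 < 0$ (Hypothesis 5) and $\gamma_4 > 0$ (Hypothesis 6).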

3.2 Model evolution

Factors in our proposed integrated model have been used in former studies about online privacy behavior. We reviewed empirical studies about privacy disclosure on the Internet, and abstracted the main factors used. There are six constructs thought to be related to the Theory of Planned Behavior model. The relational path between these constructs can be integrated into a comprehensive model according to former studies.

Table 1 shows highly cited articles focusing on privacy disclosure online and the main factors mentioned in those articles. Thirteen articles reporting experiential studies of online privacy information disclosure are presented, and eight constructs are abstracted from those studies, six of which are discussed in our proposed integrated model. Trust means users’ sense of trust in certain web sites. Trust factors concerning privacy online are influenced by the user’s experience of and familiarity with a web site. Usually, but not always, an online organization’s privacy policy sets out how it uses personal information (e.g. age, gender).

Table 1 Empirical studies about the self-disclosure of personal information online

Table 1 is presented to show the theoretical basis underpinning the integrated model being proposed in this paper. The constructs used in the model have all been taken from previous studies. Each hypothesis in the model has also been previously proven in studies of situations like online business. The theoretical contribution of this integrated approach lies in a combination of former conclusions and interests on a new topic, the self-disclosure of personal information on social network sites.

Though in the original TPB model behavior control, subjective norm and attitude were the key factors affecting behavioral intention, previous studies provided several other factors that may affect privacy concern. In the following section, an exploratory factor analysis is conducted to choose the right constructs.

Ten graduate students who were long term users of renren.com were interviewed after completing a structured pilot questionnaire. Based on the feedback of these students, questions were adapted accordingly.

4 Methodology

The data presented in this paper was collected from Nanjing University using self-administered questionnaires. The sample was collected among senior students and graduate students. Other age groups were not discussed in this paper as most users of renren.com are young people. The questionnaires were distributed by hand at the main cafeteria and auditorium of Nanjing University from October 10 to November 10 in 2009, and 200 questionnaire responses were received. Of the entire sample of 200 students, 171 valid questionnaires were retrieved.

After reviewing the previous literature, several constructs were found to have an impact on privacy concern. These constructs included: Information Control, Subjective Norm, Privacy Risk, Privacy Sensitivity, Trust and Privacy Policy. Xu et al. [34] created a model to explain people’s privacy concern using the first four of these variables. This paper uses the same approach but differs in that it provides empirical evidence in its analysis. It is the first time that data collected through a questionnaire has been used to support this type of modeling approach.

Conducting a principal component analysis with all the named constructs showed that three components remained after extraction (Table 2). Factor loadings of the indicators of Subjective Norm, Information Control and Privacy Risk on corresponding components all exceeded 0.70. And indicators of Privacy Policy, Trust and Information Control loaded highly on the same factor, which suggests a high correlation among the three constructs. Most of the variances of the three constructs were positively correlated and could be reflected by one construct.

Table 2 Exploratory factor analysis of constructs of privacy concern
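To make the extraction step concrete, the following is a minimal sketch of how unrotated principal-component loadings can be computed from an item correlation matrix. It is not the SPSS procedure used in the study: the data are synthetic, the function name `pca_loadings` and the two hypothetical constructs are assumptions made for illustration, and no rotation is applied.

```python
import numpy as np

def pca_loadings(data, n_components):
    """Unrotated principal-component loadings from the item correlation matrix."""
    X = np.asarray(data, dtype=float)
    corr = np.corrcoef(X, rowvar=False)        # items are columns
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)      # scale each eigenvector by sqrt(eigenvalue)
    return loadings, eigvals

# Hypothetical questionnaire: 171 respondents, two constructs, three items each.
rng = np.random.default_rng(42)
f1 = rng.normal(size=(171, 1))
f2 = rng.normal(size=(171, 1))
items = np.hstack([
    f1 + 0.3 * rng.normal(size=(171, 3)),  # items driven by the first latent factor
    f2 + 0.8 * rng.normal(size=(171, 3)),  # items driven by the second latent factor
])

loadings, eigvals = pca_loadings(items, n_components=2)
print(np.round(loadings, 2))
print(np.round(eigvals, 2))
```

Each row of `loadings` corresponds to one questionnaire item and each column to one retained component; an item is usually assigned to the component on which its absolute loading is largest.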

The indicator of information sensitivity in this study differed from previous research as the data requested by social network sites of users is usually limited to demographic details. So the measurement of information sensitivity mainly relies on the users’ perception, and thus a new indicator was created in this research to meet this need.

Table 3 shows the results of a principal component analysis of Subjective Norm, Information Control, Privacy Risk and Privacy Sensitivity. All the indicators loaded on the corresponding construct, and no significant multicollinearity was detected. So in the integrated model, Privacy Concern was explained by four variables: Subjective Norm, Information Control, Privacy Risk and Privacy Sensitivity, as depicted in Tables 3 and 4.

Table 3 Confirmatory factor analysis of constructs of privacy concern
Table 4 Result of EFA and CFA

As the questionnaire in this research was based on individual items used in previous research, the reliability and validity of the constructs must be verified. Cronbach’s alpha coefficient was chosen to evaluate reliability, and the average variance extracted (AVE) was chosen to evaluate the convergent validity of the constructs. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were conducted using SPSS 16.0 and LISREL 8.70, and Cronbach’s alpha coefficient and AVE were derived (Table 4).

All of the constructs passed the critical level of 0.7 for Cronbach’s alpha, and the average variances extracted were all greater than 0.5, which meant that the reliability and validity of the questionnaire were established.
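The two statistics used above can be sketched as follows. This is a minimal illustration using the standard textbook formulas, not the SPSS or LISREL output of the study; the item scores and the standardized loadings are hypothetical values chosen only to exercise the functions.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for one construct.

    items: (n_respondents, k_items) matrix of item scores.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of the item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the summed scale
    return k / (k - 1) * (1.0 - sum_item_var / total_var)

def average_variance_extracted(loadings):
    """AVE as the mean of the squared standardized factor loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

# Hypothetical construct with three items answered by 171 respondents.
rng = np.random.default_rng(0)
latent = rng.normal(size=(171, 1))
scores = latent + 0.5 * rng.normal(size=(171, 3))

alpha = cronbach_alpha(scores)
ave = average_variance_extracted([0.82, 0.78, 0.75])  # hypothetical loadings
print(f"alpha = {alpha:.2f}, AVE = {ave:.2f}")
print("reliable:", alpha > 0.7, "| convergent validity:", ave > 0.5)
```

The thresholds of 0.7 and 0.5 in the last line are the ones stated in the text.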

A structural equation model (SEM) was created using LISREL 8.70, which analyzed the relationships between variables based on the covariance matrix. The results of this analysis are shown in Fig. 3.

Fig. 3 Path analysis (P<0.1=⋆, P<0.05=⋆⋆, P<0.01=⋆⋆⋆)

According to the result of the SEM, Hypotheses 1, 2, 4 and 5 were all well-demonstrated, while Hypotheses 3 and 6 did not pass the significance testing. This means that the self-disclosure of personal information is determined by both the level of privacy concern and the level of perceived benefit. It was also found that indeed, privacy concern was related to the perceived risk and information control, but not to information sensitivity or subjective norm. CFA showed that the fit indices, such as AGFI, CFI and NNFI, were all over 0.90, which supported the construct validity of the model.

4.1 Mediator variable test

This section tests privacy concern as a mediator variable between information control, perceived risk and privacy disclosure. The verification method used was multiple hierarchical regression analysis. Five equations were estimated to verify the mediating role of privacy concern, with perceived benefit treated as a control variable.

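\[ \text{privacy disclosure} = \beta_1 \cdot \text{perceived benefit} + \alpha \tag{1} \]

\[ \text{privacy disclosure} = \beta_1 \cdot \text{perceived benefit} + \beta_2 \cdot \text{information control} + \beta_3 \cdot \text{privacy risk} + \alpha \tag{2} \]

\[ \text{privacy disclosure} = \beta_1 \cdot \text{perceived benefit} + \beta_2 \cdot \text{information control} + \beta_3 \cdot \text{privacy risk} + \beta_4 \cdot \text{privacy concern} + \alpha \tag{3} \]

\[ \text{privacy disclosure} = \beta_1 \cdot \text{perceived benefit} + \beta_2 \cdot \text{privacy concern} + \alpha \tag{4} \]

\[ \text{privacy concern} = \beta_1 \cdot \text{information control} + \beta_2 \cdot \text{privacy risk} + \alpha \tag{5} \]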

Table 5 presents the results of the regression analysis of the five equations in SPSS. Equation (1) showed that perceived benefit was significantly related to privacy disclosure. The result of Eq. (2) showed that, after controlling for the effect of perceived benefit, information control and perceived risk were still significantly related to privacy disclosure. But after privacy concern was included in Eq. (3), information control and perceived risk were no longer significantly related to privacy disclosure: the significance level (p value) of information control rose from 0.049 to 0.871, and that of perceived risk rose from 0.029 to 0.791. As the relationship between privacy concern and privacy disclosure was confirmed in Eq. (4), and information control and perceived risk were shown to affect privacy concern in Eq. (5), this research demonstrates that privacy concern fully mediates the relationship between information control, perceived risk and privacy disclosure.

Table 5 Multiple hierarchical regression analysis
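The logic of the mediation test can be illustrated with a small simulation. This is a sketch only: the data are synthetic and constructed so that privacy concern fully mediates the effects, the coefficients are hypothetical and are not the values in Table 5, and the helper `ols` fits each equation by ordinary least squares with NumPy rather than SPSS. Significance testing, as reported in Table 5, would additionally require a statistics package.

```python
import numpy as np

def ols(y, *predictors):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, hypothetical data in which concern fully mediates control and risk.
rng = np.random.default_rng(1)
n = 171
benefit = rng.normal(size=n)
control = rng.normal(size=n)
risk = rng.normal(size=n)
concern = -0.5 * control + 0.6 * risk + 0.3 * rng.normal(size=n)
disclosure = 0.7 * benefit - 0.8 * concern + 0.3 * rng.normal(size=n)

eq2 = ols(disclosure, benefit, control, risk)           # Eq. (2): without the mediator
eq3 = ols(disclosure, benefit, control, risk, concern)  # Eq. (3): with the mediator
eq5 = ols(concern, control, risk)                       # Eq. (5): mediator on predictors

print("Eq. (2) control, risk:", np.round(eq2[2:4], 2))
print("Eq. (3) control, risk, concern:", np.round(eq3[2:5], 2))
print("Eq. (5) control, risk:", np.round(eq5[1:3], 2))
```

In this simulated setting the coefficients of information control and privacy risk shrink toward zero once privacy concern enters the regression, which is the pattern that the comparison of Eqs. (2) and (3) is designed to detect.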

5 Conclusion and contributions

5.1 Theoretical contribution

After presenting a review of previous studies, this article put forward a model to explain people’s privacy disclosure on social network sites. It also modified some constructs, focusing not on exogenous variables such as the expected outcomes and risks of certain behaviors, but on endogenous variables of adopters and the perceived influence of certain behaviors. Though each hypothesis in the integrated model had been proven previously, this is the first time that the theoretical conclusions have been combined. In addition, this paper addresses an increasingly important issue: what factors affect the self-disclosure of personal information by users on social network sites. Using this modeling approach, it is possible to identify the main factors affecting people’s privacy concern and privacy disclosure on social network sites. These main indicators can aid in the development of a sustainable Internet where users of social network sites have fewer privacy concerns and are willing to contribute to the development of a stable SNS environment.

Though the issue of privacy concern has been previously addressed, this paper is focused on an emerging topic, the disclosure of private information on social network sites. Three theoretical innovations were proposed in this paper: (1) an integrated model; (2) the redefinition of benefits; and (3) the redefinition of sensitivity. First, the model proposed in this paper integrated the Theory of Planned Behavior and the Privacy Calculus model, which together are claimed here to interpret privacy disclosure online more comprehensively. Second, the perceived benefit in social networks is different from the profit in online retail stores, and defining it from the angle of organizational behavior would be more credible. Third, information sensitivity has previously been measured by experiments that set control groups with different information categories, which is not suitable for social network sites as only demographic information is requested. Accordingly, this paper paid more attention to the sensitivity perceived by people when they are requested to submit personal information to social network sites.

5.2 Practical contribution

According to the results of the data analysis, perceived risk and information control were shown to have significant effects on privacy concern, while the relationships between information sensitivity, subjective norm and privacy concern did not pass significance testing. The result of the structural equation modeling showed that, compared to the unauthorized disclosure of personal information, perceived risk caused by the invasion of privacy played a more powerful role in shaping users’ privacy concern. In practice, social network sites demand only basic demographic personal information. People can voluntarily share more detailed information through web blogs and online photo albums (such as on facebook.com and renren.com). For the time being, there is a gap between the ability to submit personal information on a social network site in order to become enrolled on a system and the effort that social network sites make to relieve privacy concerns via the use of more robust privacy policies. In general, it is alleged that social network sites do not seek the privacy protection of their users.

To change the attitudes of people toward the self-disclosure of personal information online, the entire Internet world would need to become secure through a process of regulation. This requires, among other things, the cooperation of all the major traffic-generating web sites, such as search engines, micro-blogs, Internet banking sites and online ticketing sites.

Compared to privacy concerns, perceived benefits were shown to have a stronger effect on the self-disclosure of personal information, with a t value of 11.39. This value illustrates the desire for community attachment and identification that motivates people to post personal news publicly online. This paper concludes by suggesting that one strategy social network sites can use is to create and deploy an even greater number of online activities, which can entice users to participate, interact and become engaged with one another, building even stronger relationships. This would not only increase the number of return visits and hits on the social network site, but also encourage users to share their real stories, building even greater social capital to be shared.

5.3 Limitations

There are a number of limitations in this study which each provide opportunities for further research in the future. First, there are still other factors that may affect people’s behavioral choices which are not contained in the integrated model presented here. For instance, this model does not include some factors about people’s characteristics such as gender [11] and age [26] which were proved to be related to privacy disclosure online.

Secondly, the sample used in this paper was collected from a university student base in China, and because Chinese culture differs from that of more developed countries, the results may not be universally applicable. However, it is true that most users of social network sites are young adults. For example, by the end of 2009, about 70 % of users of Facebook.com were between 18 and 25 years old. Additionally, college students constitute a high proportion of those enrolled in online social networks, and their network behaviors are allegedly representative.

Finally, this research differs from former investigations in that the relationships between information sensitivity, subjective norm and privacy concern did not pass significance testing. This may be explained by the particular nature of social network sites. Still, further verification is needed in the future.