
1 Introduction

As a promising sensing paradigm in the big data era, Mobile Crowdsensing Systems (MCSs) [1, 2] aim at realizing large-scale sensing targets, including environment monitoring [3], online urban sensing [4], mobile social networking [5, 6], and other Internet of Things applications [7, 8], by leveraging pervasively distributed personal mobile smart devices. Mobile smart devices with enriched sensing capabilities are capable of providing fine-grained and economically cheap sensing services [9, 10]. With the proliferation of MCS applications with increasing security requirements [11, 12], the potential privacy leakage raised by the use of personal mobile devices has been hindering the development of MCSs [13, 14]. Realizing the importance of user privacy protection in MCSs, considerable research efforts have been devoted to preserving mobile users’ privacy, including their real identities [15], geographical locations [16, 17], and contributed data [18], among others.

Nonetheless, in privacy-preserving MCSs, it is difficult to link a mobile user to its MCS task reports once the user’s real identity is obscured. The quality of sensing data reported by mobile users is thus difficult to guarantee, owing to the lack of accountability information. In addition, self-interested mobile users are more likely to report non-objective information in MCS tasks, since the cost of such dishonest behavior is negligible while the benefits may be substantial and monetary. Ensuring sensing data quality in privacy-preserving MCSs is therefore essential for deploying MCSs in real-world practice.

In this paper, we present a user incentive-based scheme against potential dishonest reporting in privacy-preserving MCSs, which guarantees the quality of sensing data by rewarding honest users who upload objective sensing reports. Our contributions are summarized as follows: First, we develop an online method for the MCS server to simultaneously compute the sensing truth and estimate the quality of user-reported data without the use of any historical information. Our method is feasible in privacy-preserving MCSs, since it only requires the sensing observations of all participants of the assigned MCS task. Second, based on the developed data quality estimation method and user profit model, we present a user incentive-based method to encourage MCS participants to report objective information for higher profit. Our method can autonomously adjust the actual profit of each participant according to its impact on the entire system. In doing so, our method can ensure that honest participants are rewarded while dishonest participants are reprimanded. Third, we carry out a thorough evaluation on the performance of our proposed scheme based on extensive real-world trace-driven simulations. Simulation results demonstrate that our scheme can realize effective user incentives to guarantee the sensing accuracy of MCSs against dishonest reporting in privacy-preserving MCSs.

The remainder of the paper is organized as follows: In Sect. 2, we present the system model and describe the user incentive problem in privacy-preserving MCSs. In Sect. 3, we present our scheme in detail. In Sect. 4, we present our experimental design and the results of our performance evaluation. Finally, we conclude the paper in Sect. 5.

2 System Model and Problem Formalization

In this section, we first introduce the system model. As shown in Fig. 1, a general MCS consists of a cloud server s and a set of registered mobile users \(\mathcal {P}\), where \(|\mathcal {P}|\geqslant 2\). All users in \(\mathcal {P}\) can communicate with s via either WiFi access points or cellular base stations. To meet the privacy-preserving requirement, each user in \(\mathcal {P}\) has a pseudo-identity \(i\in \{1,2,\ldots ,N\}\) (where \(N=|\mathcal {P}|\)), which enables anonymous communication between i and s (as explained later).

Fig. 1. A system model

For a specific MCS task \(\tau \), s publishes an announcement that clarifies the required sensing observation \(o_{\tau }\) (e.g., what data is required) and the desired sensing location \(l_{\tau }\) (e.g., where the data should be sensed). A user \(i\in \mathcal {P}\) within the effective sensing area (e.g., the circular area with \(l_{\tau }\) as the center and \(d_{max}\) as the radius) can participate in \(\tau \) according to its interests.

For each task participant \(i\in \mathcal {E}=\{1,2,\ldots ,M\}\) (\(\mathcal {E}\subseteq \mathcal {P}\)), i determines its serving time \(t_i\) for \(\tau \) based on its serving confidence \(c_i\) and local status \(\varvec{x}_i\) (containing i’s geographical zone \(z_i\) and unit serving price \(p_i\)), which will be discussed later. Then, during the serving time \(t_i\), participant i performs the required sensing operations and records the corresponding observation \(o_i\) under s’s instruction. When \(\tau \) is finished, each \(i\in \mathcal {E}\) uploads its sensing report \(\varvec{r}_i\), which contains \(o_i\), \(c_i\), and \(\varvec{x}_i\), to s. According to \(\mathcal {T}=\{t_1,t_2,\ldots ,t_M\}\) and \(\mathcal {R}=\{\varvec{r}_1,\varvec{r}_2,\ldots ,\varvec{r}_M\}\), s pays corresponding monetary credits \(\varPhi =\{\phi _1,\phi _2,\ldots ,\phi _M\}\) to \(\mathcal {E}\).
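For concreteness, the contents of a sensing report \(\varvec{r}_i\) can be sketched as a small data structure. The field names below are our own illustrative choices, not part of the system specification.

```python
from dataclasses import dataclass

@dataclass
class SensingReport:
    """One participant's report r_i for a task (field names are our own)."""
    observation: float   # o_i: the sensed value
    confidence: float    # c_i in (0, 1]: expected quality of o_i
    zone: int            # z_i: coarse geographical zone, z_min..z_max
    unit_price: float    # p_i in (0, 1]: synthesized unit serving cost

# Example: a participant in zone 3 reporting 11.2 degC with confidence 0.9
r = SensingReport(observation=11.2, confidence=0.9, zone=3, unit_price=0.5)
```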

We now explicitly define the user incentive problem in privacy-preserving MCSs. To begin with, we clarify the privacy-preserving settings in MCSs. We consider the protection of user privacy with respect to both real identities and precise geographical locations. First, via anonymous communications based on dynamically adjusted pseudo-identities, the cloud server s maintains no derivable relation between sensing reports and their generators. For any user i, no information about its real identity is disclosed through participation in MCS tasks. Second, instead of precise GPS coordinates, s is only interested in the rough geographical locations of MCS task participants. As illustrated in Fig. 1, the effective sensing area is divided into several geographical zones (i.e., from \(z_{min}\) to \(z_{max}\), approaching the desired sensing location \(l_{\tau }\)). As a user i only uploads its geographical zone \(z_i\) as a part of its local status, the cloud server s cannot obtain the precise spatio-temporal tracks of MCS task participants.

In order to incentivize users to participate in the sensing efforts of MCSs, it is reasonable for the cloud server s to pay MCS task participants the maximum possible credits as a stimulus. Nonetheless, to guarantee MCS sensing accuracy, the actual profit that a participant can obtain by serving a certain task must be determined by the quality of its reported data and its serving costs. In privacy-preserving MCSs, estimating the quality of reported data is challenging for the following reasons. First, the lack of a relation between users’ real identities and their reported data obstructs the accountability of historical user behaviors. Second, it is highly likely that self-interested users report dishonestly for higher profits, especially in this non-accountable privacy-preserving scenario. Therefore, to achieve effective user incentives that encourage mobile users to upload quality reports in privacy-preserving MCS tasks, the cloud server s needs to address the following three problems: (i) how to estimate the quality of reported data without knowing users’ historical behaviors, (ii) how to maximize a participant’s profit for stimulation according to the quality of its report, and (iii) how to realize effective user incentives against potential dishonest reporting.

3 Our Approach

In this section, we design our approach to address the aforementioned three problems.

3.1 Online Quality Estimation of User-Reported Data

In the following, we address the issue of how to estimate the quality of reported sensing data using only the information related to the current MCS task. Since there is no available ground truth for an MCS task \(\tau \), and no referable historical information about task participants, s needs to compute \(\tau \)’s sensing truth \(o_{\tau }\) from all reported sensing observations \(\mathcal {O}=\{o_1,o_2,\ldots ,o_M\}\); this computed truth serves as the criterion for estimating the quality of user-reported data.

To compute \(o_{\tau }\) for the estimation of reported data quality, we construct Algorithm 1 based on the sensing truth discovery process in our prior work [23]. Here, it is worth noting the difference between Algorithm 1 and the truth discovery process in [23]: since the work of [23] did not focus on the privacy-preserving scenario, its weighted truth discovery process has access to the historical reputation of each participant, which reflects the prior credibility of its sensing observation. In contrast, our Algorithm 1 computes the sensing truth only based on observations of the current task. In addition, the primary purpose of truth discovery in [23] is to evaluate each participant’s contribution to the MCS task, while the purpose of our Algorithm 1 is to estimate the quality of reported observations.
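Since Algorithm 1 itself is rendered as a figure, the following minimal sketch illustrates the general idea of iterative truth discovery from current-task observations only: the truth estimate is a weighted average of the observations, and each observation’s weight grows as it approaches the current estimate. This is a generic illustration under our own weighting choice, not the paper’s exact Algorithm 1.

```python
def discover_truth(observations, eps=0.1, max_iter=100):
    """Iteratively estimate the sensing truth o_tau from observations alone.

    A generic weighted truth-discovery loop (not the exact Algorithm 1):
    the truth is the weighted average of observations, and each weight
    shrinks as the observation's distance from the truth grows.
    """
    n = len(observations)
    truth = sum(observations) / n            # start from the plain mean
    weights = [1.0 / n] * n
    for _ in range(max_iter):
        # Closer observations get larger (normalized) weights.
        dists = [abs(o - truth) + 1e-9 for o in observations]
        inv = [1.0 / d for d in dists]
        total = sum(inv)
        weights = [w / total for w in inv]
        new_truth = sum(w * o for w, o in zip(weights, observations))
        if abs(new_truth - truth) < eps:     # convergence threshold epsilon
            truth = new_truth
            break
        truth = new_truth
    # Per-observation quality q_i: weight normalized by the maximum weight.
    wmax = max(weights)
    quality = [w / wmax for w in weights]
    return truth, quality
```

With four consistent observations around 11 and one outlier at 30, the estimate settles near 11 and the outlier receives the lowest quality score.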

3.2 User Profit Definition and Maximization

In this subsection, we formally define the computation of user profits for serving MCS tasks. Meanwhile, we further demonstrate that any participant of an MCS task can obtain its maximum profit only when it behaves honestly and uploads an objective sensing report.

Intuitively, for each participant \(i\in \mathcal {E}\) of task \(\tau \), its serving profit \(\phi _i\) is determined by its serving time \(t_i\) (\(t_i\geqslant 0\)), serving confidence \(c_i\) (\(0<c_i\leqslant 1\)), reported observation \(o_i\), and local status \(\varvec{x}_i=(z_i,p_i)\) (i.e., geographical zone \(z_i\) (\(z_{min}\leqslant z_i\leqslant z_{max}\)) and unit serving price \(p_i\) (\(0 < p_i\leqslant 1\))). Here, \(c_i\) indicates the expected quality of i’s observation \(o_i\), and \(p_i\) indicates i’s synthesized unit cost with respect to battery and network traffic for serving \(\tau \).

Algorithm 1

Specifically, participant i can compute its expected serving profit \(\hat{\phi }_i\) as:

$$\begin{aligned} \hat{\phi }_i=c_it_i-p_i\frac{z_i}{z_{max}}t_i^2. \end{aligned}$$
(1)

Similarly, the cloud server s can derive i’s actual serving profit \(\phi _i\) as:

$$\begin{aligned} \phi _i=q_it_i-p_i\frac{z_i}{z_{max}}t_i^2, \end{aligned}$$
(2)

where \(q_i\) denotes the estimated quality of i’s reported data, computed by Algorithm 1 based on \(\mathcal {O}\).

From a practical perspective, we assume that, for each MCS task \(\tau \), participant i locally determines \(t_i\), \(c_i\), and \(o_i\) by itself, whereas \(z_i\) and \(p_i\) are extracted by the mobile device firmware as its local status \(\varvec{x}_i\). In this case, it is easy to verify that \(\hat{\phi }_i\) reaches its maximum \(\hat{\phi }_{imax}=\frac{z_{max}c_i^2}{4p_iz_i}\) when participant i serves \(\tau \) for \(\hat{t}_{imax}=\frac{z_{max}c_i}{2p_iz_i}\). Because \(\varvec{x}_i\) is always objectively reported, \(\phi _{imax}=\hat{\phi }_{imax}\) holds only when \(q_i=c_i\). Therefore, if participant i uploads a \(c_i\) that is closer to \(q_i\) and serves for \(\hat{t}_{imax}\), it obtains an actual profit \(\phi _i\) closer to \(\hat{\phi }_{imax}\). Combined with Subsect. 3.1, this implies that, for each MCS participant, the most profitable strategy in privacy-preserving MCSs is to honestly report objective sensing observations.
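The maximization argument above can be checked numerically. Assuming illustrative parameter values (q = 0.8, p = 0.5, z = 3, z_max = 5, all our own), the sketch below evaluates the actual profit of Eq. (2) when a participant serves for \(\hat{t}_{imax}\) computed from its reported \(c_i\), and confirms that the profit peaks when the reported \(c_i\) equals the true quality \(q_i\).

```python
def t_hat(c, p, z, z_max):
    """Serving time that maximizes the *expected* profit of Eq. (1)."""
    return z_max * c / (2 * p * z)

def actual_profit(q, c, p, z, z_max):
    """Actual profit of Eq. (2) when the participant serves for t_hat(c)."""
    t = t_hat(c, p, z, z_max)
    return q * t - p * (z / z_max) * t * t

# With true quality q = 0.8, sweep the reported confidence c:
q, p, z, z_max = 0.8, 0.5, 3, 5
profits = {round(c, 1): actual_profit(q, c, p, z, z_max)
           for c in [0.2, 0.4, 0.6, 0.8, 1.0]}
best = max(profits, key=profits.get)   # the c that yields the top profit
```

Here the actual profit peaks exactly at c = 0.8 = q, and the peak value matches the closed form \(z_{max}q^2/(4pz)\).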

3.3 User Incentive-Based Scheme Against Dishonest Reporting

Based on our data quality estimation mechanism and user serving profit model, we now design a user Incentive-based scheme Against Dishonest Reporting (IADR) for cloud servers of privacy-preserving MCSs.

According to Subsect. 3.2, each participant \(i\in \mathcal {E}\) of task \(\tau \) can determine \(t_i\), \(c_i\), and \(o_i\) in its sensing report \(\varvec{r}_i\). In fact, given their self-interested nature, participants may report dishonestly for potentially higher profits. To encourage participants to upload objective reports, the cloud server s is responsible for rewarding participants who upload quality data and reprimanding those who are dishonest. Nonetheless, this is difficult to realize based solely on our data quality estimation method and user serving profit model.

From the perspective of the cloud server s, the local report determinations of all participants can be formalized as a non-cooperative game, considering the fact that each \(i\in \mathcal {E}\) does not know the decisions of the other participants. Thus, inspired by the BMT algorithm in [24], we develop the IADR scheme for the cloud server s to autonomously enhance or downgrade the serving profit of each participant according to its actual impact on the total profit of all participants. Specifically, IADR consists of two components: (i) Zone-Distinguished Quality Estimation and (ii) Impact-Driven Profit Determination.

Zone-Distinguished Quality Estimation. When receiving all \(\varvec{r}_i\in \mathcal {R}\) of a single MCS task \(\tau \), it is reasonable for the cloud server s to estimate the quality of reported data separately for different geographical zones, considering the fact that sensing observations at different distances from the desired sensing location are likely to differ. The quality of each observation is estimated against all other observations from the same zone.
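The zone-distinguished step can be sketched as partitioning the reports by geographical zone and scoring each observation only against the others in its zone. The per-zone scorer below is a simple distance-to-zone-mean placeholder standing in for Algorithm 1, and the function and field names are our own.

```python
from collections import defaultdict

def estimate_quality_by_zone(reports):
    """Score each observation only against others in its own zone.

    `reports` is a list of (zone, observation) pairs; returns a dict
    mapping report index -> quality in (0, 1]. The scorer here is a
    simple distance-to-zone-mean placeholder, not the exact Algorithm 1.
    """
    zones = defaultdict(list)
    for idx, (zone, obs) in enumerate(reports):
        zones[zone].append((idx, obs))
    quality = {}
    for zone, members in zones.items():
        mean = sum(o for _, o in members) / len(members)
        spread = max(abs(o - mean) for _, o in members) or 1.0
        for idx, o in members:
            # Quality is 1 at the zone mean and decreases farther away,
            # never dropping below 0.5 within the zone's own spread.
            quality[idx] = 1.0 - abs(o - mean) / (2 * spread)
    return quality
```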

Impact-Driven Profit Determination. After obtaining all \(q_i\in \mathcal {Q}\), s needs to determine the final serving profit \(\phi _i\) for each \(i\in \mathcal {E}\) based on \(\mathcal {T}\), \(\mathcal {R}\), and \(\mathcal {Q}\). Specifically, our impact-driven profit determination process is shown in Algorithm 2: if a participant has a positive effect on \(\tau \), it deserves a bonus reward; otherwise, it is reprimanded.

Algorithm 2
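As Algorithm 2 is rendered as a figure, the sketch below illustrates one plausible reading of the impact-driven idea: score a participant by the marginal change its presence causes in a group-level value, and add that impact to its base profit as a bonus or penalty. This is our own marginal-contribution sketch, not the paper’s exact procedure.

```python
def impact_adjusted_profits(base_profits, group_value):
    """Adjust each participant's base profit by its marginal impact.

    `group_value(subset)` scores a set of participant indices as a whole
    (e.g., negative aggregate truth-estimation error). A participant's
    impact is the group value with it minus the group value without it;
    positive impact earns a bonus, negative impact a penalty.
    """
    everyone = list(range(len(base_profits)))
    total = group_value(everyone)
    adjusted = []
    for i, phi in enumerate(base_profits):
        without_i = [j for j in everyone if j != i]
        impact = total - group_value(without_i)  # i's marginal impact
        adjusted.append(phi + impact)
    return adjusted
```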

4 Performance Evaluation

In this section, we demonstrate the effectiveness of IADR in confronting dishonest reporting through extensive simulations. In the following, we first describe the evaluation methodology and then present evaluation results.

4.1 Evaluation Methodology

To evaluate the effectiveness of IADR, our evaluation adopted a real-world outdoor temperature dataset crowdsensed by taxis in Rome, Italy [25]. The dataset contains 4485 entries, opportunistically uploaded by 366 taxis within 24 h. Each entry contains a temperature sensing observation, its generator, the sensing time, and the GPS coordinates of the sensing location.

According to the dataset, we constructed an MCS with one cloud server and 366 mobile users on the OMNeT++ 4.6 simulator. A fixed effective sensing area was divided into 5 geographical zones (i.e., \(z_{min}=1\), \(z_{max}=5\)), and the GPS coordinates in all data entries were mapped into corresponding geographical zones. For each round of simulation, the cloud server spontaneously announced MCS tasks, and each user participated in the tasks and uploaded sensing reports according to the corresponding data entries. Each round of simulation lasted for 86400 simulation seconds.

In terms of parameter settings, we set the unit serving price p of all users to 0.5 for the sake of simplicity, and set the truth discovery convergence threshold \(\epsilon =0.1\) for controlling accuracy. For benign participants who follow the system regulations, the serving confidence c is set to the data quality \(q'\) estimated in their last MCS task, and they serve each task for \(\hat{t}_{max}\) before uploading an objective sensing report \(\varvec{r}\). Dishonest participants, by contrast, may upload a non-objective sensing observation o and/or serving confidence c.

In our simulations, we formulated two patterns of dishonest reporting from participants: (i) only a single factor in the report is non-objective (i.e., random \(o_i\) (\(2\,^{\circ }\text {C}\leqslant o_i\leqslant 24\,^{\circ }\text {C}\)), random \(c_i\) (\(0<c_i\leqslant 1\)), or higher \(c_i\) (\(c_i=1\))), and (ii) multiple factors in the report are non-objective (i.e., random \(o_i\) and random \(c_i\), or random \(o_i\) and higher \(c_i\)). We first ran a round of simulation without deliberately introducing any dishonest reporting, in which all participants followed the system regulations; this serves as our baseline scenario. Then, we designed two sets of experiments to analyze the impacts of dishonest reporting on the MCS’s sensing accuracy and on participants’ serving profits, respectively. We further discuss whether IADR can realize effective user incentives against dishonest reporting based on the results.
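The dishonest-reporting patterns above can be sketched as a small report generator; the pattern names and helper function below are our own, while the value ranges come from the text.

```python
import random

def make_report(pattern, honest_o, honest_c, rng=random):
    """Produce (o, c) for a participant under one dishonesty pattern.

    Patterns: "random_o", "random_c", "higher_c" (single factor), and
    "random_o_random_c", "random_o_higher_c" (multiple factors).
    Any factor not named by the pattern stays at its honest value.
    """
    o, c = honest_o, honest_c
    if pattern in ("random_o", "random_o_random_c", "random_o_higher_c"):
        o = rng.uniform(2.0, 24.0)     # random o_i in [2, 24] degC
    if pattern in ("random_c", "random_o_random_c"):
        c = rng.uniform(1e-9, 1.0)     # random c_i in (0, 1]
    if pattern in ("higher_c", "random_o_higher_c"):
        c = 1.0                        # inflated c_i
    return o, c
```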

4.2 Impact of Dishonest Reporting on MCS’s Sensing Accuracy

Considering the fact that the primary requirement of an MCS is to achieve accurate sensing, in this set of experiments we investigate the impacts of different dishonest reporting ratios (i.e., the ratio of dishonest reports to total reports) on the MCS’s sensing accuracy. Specifically, we conducted two groups of simulations, in which participants were set to perform dishonest reporting at ratios of 10% and 15% in all MCS tasks, respectively. Note that these settings are reasonable considering the ratios of malicious behaviors assumed in existing mechanisms (e.g., 10% in [26] and 4% in [27]). All discovered sensing truths during the simulations were collected, and their cumulative distributions are shown in Figs. 2 and 3.

Fig. 2. Impact of general dishonest reporting (10%) on discovered truth

According to Fig. 2, we can observe that, with a 10% general dishonest reporting ratio, neither reports with a single non-objective factor nor reports with multiple non-objective factors noticeably affect the cumulative distribution of the discovered sensing truth. In detail, the average discovered sensing truth of the ‘baseline’ scenario is \(11.28\,^{\circ }\text {C}\), and those of the other scenarios are: (i) random \(o_i\): \(11.39\,^{\circ }\text {C}\), (ii) random \(c_i\): \(11.26\,^{\circ }\text {C}\), (iii) higher \(c_i\): \(11.28\,^{\circ }\text {C}\), (iv) random \(o_i\) and random \(c_i\): \(11.54\,^{\circ }\text {C}\), and (v) random \(o_i\) and higher \(c_i\): \(11.39\,^{\circ }\text {C}\). The maximum deviation of the average sensing truth is 2.30%, which is nearly negligible, especially considering that 10% of the total reports are non-objective.
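The quoted deviation can be reproduced directly from the averages listed above:

```python
# Deviation of each scenario's average discovered truth from the
# 11.28 degC baseline, in percent (values taken from the text).
baseline = 11.28
scenarios = {
    "random o": 11.39, "random c": 11.26, "higher c": 11.28,
    "random o + random c": 11.54, "random o + higher c": 11.39,
}
deviations = {k: abs(v - baseline) / baseline * 100
              for k, v in scenarios.items()}
max_dev = max(deviations.values())  # about 2.30%, for 'random o + random c'
```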

Fig. 3. Impact of general dishonest reporting (15%) on discovered truth

According to Fig. 3, the evaluation results with a 15% general dishonest reporting ratio are similar: the cumulative distribution of the discovered sensing truth is not significantly affected by either a single non-objective factor or multiple non-objective factors, and the maximum deviation of the average sensing truth is 2.04%, which is also negligible.

To summarize, the evaluation results indicate that IADR can effectively guarantee the MCS sensing accuracy in MCSs with dishonest reporting.

4.3 Impact of Dishonest Reporting on User Profit

Considering the fact that the essential reason for mobile users to participate in MCS tasks is to obtain profit for the sensing services offered, in this set of experiments we investigate the impact of a user’s dishonest reporting on its own serving profit. Specifically, we conducted three groups of simulations, in which the user who, in the baseline scenario, (i) participates in the largest number of MCS tasks (the TopC user), (ii) obtains the most serving profit (the TopP user), or (iii) obtains less serving profit (the LessP user), was set to perform dishonest reporting in all MCS tasks, respectively. Evaluating these representative users provides a thorough understanding of IADR’s performance. The variations of the total serving profit of these users over an entire simulation round were collected, and are depicted in Figs. 4, 5, and 6.

Fig. 4. Impact of TopC user’s dishonest reporting on its total profit

Fig. 5. Impact of TopP user’s dishonest reporting on its total profit

Fig. 6. Impact of LessP user’s dishonest reporting on its total profit

According to Fig. 4, we can observe that the total profit of the TopC user is severely downgraded as long as it reports one or more non-objective factors. In detail, in the ‘baseline’ scenario, its total profit at the end of the simulation round is 19.87, and those of the other scenarios are: (i) random \(o_i\): 4.32 (−78.26%), (ii) random \(c_i\): 8.49 (−57.27%), (iii) higher \(c_i\): 17.58 (−11.52%), (iv) random \(o_i\) and random \(c_i\): 1.33 (−93.31%), and (v) random \(o_i\) and higher \(c_i\): \(-11.63\) (−158.53%). Although the TopC user is one of the most active users in the MCS, its serving profit is still dominated by the quality of its reports.
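The quoted percentage changes follow directly from the listed totals:

```python
# Percentage change of the TopC user's total profit against its
# 19.87 baseline (values taken from the text).
baseline = 19.87
totals = {"random o": 4.32, "random c": 8.49, "higher c": 17.58,
          "random o + random c": 1.33, "random o + higher c": -11.63}
change = {k: (v - baseline) / baseline * 100 for k, v in totals.items()}
# e.g. change["random o"] is about -78.26 and
# change["random o + higher c"] is about -158.53
```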

According to Fig. 5, we can see that as long as the TopP user performs non-objective reporting, its total profit is severely downgraded. Considering the fact that the TopP user is one of the users who provide the highest-quality sensing reports in the MCS, objective reporting is demonstrated to be effective in enhancing a user’s serving profit.

According to Fig. 6, the total profit of the LessP user is also severely downgraded if it uploads non-objective reports. In addition, since the LessP user represents the majority of MCS participants (i.e., ordinary ones), the simulation results indicate that, with the adoption of IADR, the LessP user cannot obtain higher profit by uploading non-objective reports.

To summarize, our results demonstrate that IADR achieves effective user incentive against dishonest reporting in MCSs.

5 Conclusion

In this paper, we developed a user incentive-based scheme against dishonest reporting in privacy-preserving MCSs, which encourages mobile users to honestly report sensing information. Specifically, we first developed an online mechanism for the MCS server to estimate the quality of reported data. We then constructed a serving profit model that maximizes the profit of honest users. Further, we developed a mechanism for the MCS server to autonomously reward honest users while reprimanding those who are dishonest. To demonstrate the effectiveness of our proposed scheme, we performed an extensive performance evaluation using real-world crowdsensing data. Our experimental results demonstrated that our scheme can ensure MCS sensing accuracy even under severe dishonest reporting, while effectively reprimanding users who report dishonest sensing information.