1 Introduction

Mobile devices have evolved into open platforms that execute various applications. Mobile applications are software packages that can be installed and executed on a mobile device, for example, a mobile email client for accessing email on a mobile phone. Generally, mobile applications developed by various vendors can be downloaded for installation. Which mobile application is trustworthy enough for a user to consume has therefore become a crucial issue that impacts an application's final success.

Trust is a multidimensional, multidisciplinary, and multifaceted concept, and various definitions can be found in the literature. Common to these definitions are the notions of confidence, belief, and expectation regarding the reliability, integrity, ability, or character of an entity [1]. The trustworthiness of mobile applications relates to their dependability, security, and usability [2], as well as their popularity [3]. Many application reputation systems evaluate application trust based on the number of downloads, although this measure is not very accurate. Herein, we define a user’s trust in a mobile application as his/her belief that the application can fulfill a task as expected. Reputation is public trust derived from direct and indirect knowledge or experiences; in our study, it is defined as the public belief that a mobile application can fulfill a task according to many people’s expectations. Obviously, trust plays an important role in application consumption and usage because it helps users overcome perceptions of uncertainty and risk and engage in “trust-related behaviors”, in short trust behaviors. A trust behavior is a user’s action of depending on an application or believing that the application will perform as expected, e.g., providing personal information to the application to engage in a purchase transaction, using the application regularly to fulfill a routine task, or continuing to consume the application even when facing some errors [4]. However, a user’s trust in a mobile application is highly subjective. It is built up over time and changes with the use of the application under the influence of many factors. As trust is an internal “state” of the user, it is hard to measure directly.

Marsh reasoned that it might prove more suitable to model trust behavior rather than trust itself, removing the need to adhere to specific definitions [5]. Meanwhile, modeling trust behavior overcomes the challenge of measuring a subjective concept by evaluating it through objective observation of trust behaviors, which provides a concrete clue of trust. Regarding mobile application usage, we posit that credible information is gained only after a mobile user has both engaged in trust behaviors (e.g., acted on using a mobile application) and assessed the trustworthiness of the application by observing the consequences of its performance and depending on it in his/her routine life.

However, few existing trust models explore trust from the view of human trust behaviors [6]. Thus, little work in the literature generates reputation and provides recommendations based on trust behaviors. In this paper, we propose TruBeRepec, a trust-behavior-based reputation and recommender system for mobile applications. We explore a model of trust behavior for mobile applications through a large-scale user survey with more than 1,500 participants. Its construct has been examined and shown to have sound validity and reliability through principal components analysis, reliability analysis, and confirmatory factor analysis [3]. We further formalize this model in order to evaluate an individual mobile user’s trust in a mobile application through trust behavior observation. Thereafter, we design several algorithms to generate an application’s reputation by aggregating individual trust and to provide application recommendations based on the correlation of trust behaviors. The contributions of this paper are:

  • TruBeRepec achieves auto-data collection for an individual user’s trust evaluation through trust behavior observation and provides application recommendation based on trust behavior correlation;

  • TruBeRepec supports both voting and non-voting; it offers sound usability by reducing the need for user-device interaction while providing a convincing explanation of trust, which can be easily accepted by users since the trust explanation follows a model derived from a large-scale user study;

  • TruBeRepec’s reputation scheme is robust according to our simulation results. It applies the device’s auto-generated individual trust as the credibility of a user’s voting, thus overcoming the unfair rating attack. Meanwhile, TruBeRepec adopts recommendation trust in reputation generation, accounting for recommendation quality and time decay, in order to punish on–off attackers and conflict behavior attackers, as well as attackers on trust behaviors;

  • TruBeRepec preserves user privacy since it does not require users to share or specify personal details, e.g., usage statistics and personal interests.

The rest of the paper is organized as follows. Section 2 gives a brief overview of related work in the literature. Section 3 introduces the trust behavior model for mobile applications. This is followed by the TruBeRepec system design in Sect. 4. The algorithms used for individual trust evaluation, application reputation, and recommendation generation are described in Sect. 5. We further evaluate TruBeRepec through simulations and analysis in Sect. 6. Thereafter, we discuss some additional issues such as practical significance and user privacy preservation in Sect. 7. Finally, conclusions and future work are presented in the last section.

2 Background and related work

2.1 Trust model

A trust model is the method to specify, evaluate, set up, and ensure trust relationships among entities, while trust modeling is the technical approach used to represent trust [6]. One of the earliest formalizations of trust in computing systems was done by Marsh [5], who integrated the various facets of trust from the disciplines of economics, psychology, philosophy, and sociology. Since then, many trust models have been constructed for various computing paradigms including ubiquitous computing, peer-to-peer systems, ad hoc networks, GRID virtual organizations, multi-agent systems, web services, e-commerce, and component software [6, 7]. In almost all of these studies, trust is accepted as a subjective notion, which brings us to the question: how can trust be measured? Translating this subjective concept into a machine-readable language is the main objective of trust modeling.

In computer science, a trust model aids the digital processing and/or management of trust. Most existing trust models are based on an understanding of trust characteristics, accounting for the factors that influence trust. A common approach in the literature concerns computational trust [8–11]. Despite the availability of various trust models, the fundamental criteria of trust models are still not well understood. Current work focuses on concrete solutions in specific systems, and additional examination is required before applying an existing solution to another domain.

One promising approach of trust modeling aims to conceptualize trust based on user studies through a psychological or sociological approach (e.g., using a measurement scale). This kind of research aims to recognize the complicated relationships among trust and multiple other factors in different facets. However, the trust model achieved with this method is conceptual and semantic, and thus cannot be directly applied in computer systems. Two examples are the initial trust model proposed by McKnight et al. [4] and the technology trust formation model (TTFM) studied by Li et al. [12]. Initial trust refers to trust in an unfamiliar trustee, a relationship in which the involved entities do not yet have credible, meaningful information about, or affective bonds with, each other [13]. Both models used the framework of the theory of reasoned action (TRA) to explain how people form initial trust in an unfamiliar entity [14]. Since the objective of both models was to predict initial trust (i.e., trusting intention) before any actual interaction with the trusting object, trust behavior was excluded from them.

On the other hand, short-term trust is built up over the first interactions with an entity, and long-term trust is developed over continuous interactions with an entity for a longer period of time. On-going trust covers both. In our study, we mainly focus on on-going trust evaluation based on the user’s usage behaviors. In particular, on-going trust can contribute to the trusting object’s reputation and thus greatly help other entities generate their initial trust.

2.2 Trust behavior study

The TRA posits that beliefs lead to attitudes, which lead to behavioral intentions, which lead to the behavior itself [14]. Applying this theory, we propose that trusting beliefs (e.g., perceptions of specific mobile application attributes) lead to trusting intentions (e.g., the intention to engage in trust behaviors of using a mobile application through user-device interaction), which in turn result in trust behaviors (e.g., using the application in various contexts). Additionally, numerous researchers have conceptualized trust as a behavior, which has been validated in work collaboration and social communications [15–17]. Prior research has also confirmed a strong correlation between behavioral intentions and actual behavior, especially for software system usage [18, 19]. However, very few studies have examined trust from the view of trust behaviors; some work studies trust behavior in e-banking [20]. To our knowledge, no existing work explores the trust behavior of mobile application usage, which differs from the above research domains with regard to running environment and user interface. Due to these differences and the constraints of small device interfaces, the design of reputation systems for mobile applications faces additional challenges concerning usability and performance.

Muir found a positive correlation between trust and use [21, 22]. The relationship between trust and interaction behavior is intuitive, since usage through human–device interaction implies trust. Lee and Moray [23] found that trust in a system partially explained system use, but that other factors (such as the user’s own ability to provide manual control) also influenced it. All of the above studies serve as the foundation of our work: a user’s trust in a mobile application can be evaluated based on the user’s trust behaviors. This is the hypothesis under which we explore and confirm the structure of the trust behavior model for mobile application usage through a large-scale user experiment. However, these studies do not provide any implications for the design and development of a reputation and recommender system for mobile applications. In our work, based on the explored and verified trust behavior structural model, we formalize it as a mathematical measure and further design a reputation and recommender system for mobile applications that can be applied in practice with sound effectiveness.

Existing trust behavior studies focus on a human’s trust in automation and intelligent machines [21–23]. A number of trust models have been proposed in the context of e-commerce [4, 20], while little work has been done in the context of mobile applications. Prior art also lacks studies on the influence of recommendations, personality, and usage context on human–computer trust. With the rapid development of mobile computing technologies, a mobile device has become a multi-application system for multiple purposes and usages. A mobile device is an open platform with an always-on network connection that allows deploying new or upgraded applications anytime and anywhere. Therefore, such a dynamically changing system introduces new challenges for the study of human–computer trust behavior.

2.3 Trust management, reputation, and recommender systems

Trust management is concerned with: collecting the information required to make a trust relationship decision; evaluating the criteria related to the trust relationship; monitoring and reevaluating existing trust relationships and ensuring them dynamically; and automating the process [6, 7, 24]. Recently, trust management has emerged as a promising technology to facilitate collaboration among entities in a distributed and uncertain environment [25]. However, prior art generally lacks consideration of the means to gather experiential evidence for effective trust evaluation. Many systems rely on a user to provide feedback [6]. Sometimes, it may not be appropriate or convenient to request this feedback, especially from a mobile user, who may be bothered by such a request during usage; his/her usage experience could be negatively influenced. Moreover, user interface design for feedback requires extra design effort, which may cause additional challenges for mobile devices with small displays. Another issue is that different users may apply different scales in their feedback, which may cause confusion, or even attacks. All of the above introduces a requirement to automate the collection of experiential evidence in a uniform norm. In our opinion, observing trust behaviors directly during mobile application usage is a good way to automatically collect evidence for trust evaluation in a uniform norm. On the other hand, we found that most methods applied in trust management are not user-centric or user-driven [26], lacking consideration of, or support from, users in system design. Few of them study trust based on the trusting subject’s behaviors; trust management in these systems is mostly based on trust evaluation of the trusting object’s behavior or performance [6]. The work presented in this paper explores users’ trust in a mobile application based on their trust behaviors (i.e., the trusting subject’s behaviors). Our study supports automatic evidence collection for trust evaluation and management.

There are various trust management systems in the literature and in practice [25]. However, common criteria to evaluate these systems are still lacking, and the diversity of system contexts complicates the situation. Most results in the literature are difficult to apply directly in practice because the assumed conditions are hard to satisfy and the system designs cannot fulfill practical requirements, e.g., usability and privacy [26, 27].

2.3.1 Reputation and recommender systems

A category of large practical importance is the reputation-based trust management system, in short, the reputation system. Reputation is a measure derived from direct or indirect knowledge of earlier interactions of entities and is used to assess the level of trust put into an entity [28]. Thus, the reputation system is a specific approach to evaluating and managing trust. Recommender systems generally apply information filtering techniques that attempt to recommend information items (e.g., films, books, web pages) that are likely to be of interest to users [29]. Typically, a recommender system compares a user profile to some reference characteristics and seeks to predict the “rating” that a user would give to an item he/she has not yet considered or experienced [30]. These characteristics may come from the information item (a content-based approach) or the user’s social environment (a collaborative filtering approach) [31]. In [32], the authors introduced using trust for both weighting and filtering in recommendations: recommendation partners should have similar tastes and preferences, and they should be trustworthy, with a history of making reliable recommendations. This trust information can be incorporated into the recommendation process. To our knowledge, however, most characteristics used for recommendations are not based on trust behaviors, which are an important clue to users’ preferences.

Jøsang et al. [33] classified reputation network architectures into two main types: centralized and distributed. The network architecture determines how ratings and reputation scores are communicated between participants in a reputation system. In the literature, distributed trust evaluations have been studied for distributed systems, e.g., ad hoc networks and peer-to-peer systems [8–11]. On the other hand, practical reputation systems generally apply a centralized server to collect feedback for reputation generation. However, many existing systems (e.g., Amazon, eBay [34], Yahoo auctions [35]) lack consideration of the credibility of a user’s rating, which greatly influences the quality of the produced reputations. Moreover, a centralized reputation network architecture may not be suitable or flexible enough for the context of mobile applications. TruBeRepec adopts a hybrid reputation network architecture: the algorithms designed to generate reputation and recommendation can be applied both in the mobile devices and at a centralized reputation service provider.

Reputation and recommender systems still face several challenges. Firstly, incentives are required to encourage users to provide feedback on interactions and their personal interests and profiles (due to privacy issues, some users are hesitant to provide their details) [29, 35]. This may raise usability issues, especially for mobile users. Secondly, reputation systems may face the problem of unfair ratings that artificially inflate or deflate reputations [34–37]. They are vulnerable to a number of potential attacks, such as the Sybil attack, on–off attack, independent bad mouthing attack, collaborative bad mouthing attack, and conflict behavior attack [37–39]. The usage of pseudonyms introduces new challenges by making it hard to trace malicious behaviors, which also influences the accuracy of reputation. Sun et al. [39, 40] proposed a number of schemes to overcome some of the above attacks, but they did not consider the additional challenges posed by usability and privacy preservation. In addition, collecting the reference characteristics for recommendation is sometimes not easy due to users’ privacy concerns. Thirdly, existing reputation and recommender systems based on user ratings generally lack uniform criteria, which makes rating a totally subjective behavior. Meanwhile, different users could treat and interpret reputation and recommendation information in different ways. These issues further complicate users’ decisions and could negatively influence their usage behaviors. However, credibility is a positive signal of the trustworthiness of an object [41] since it provides a reason to trust. Feedback credibility is essential for generating a reliable reputation value that overcomes the above challenges. In particular, trust behaviors provide an important clue to indicate feedback credibility and users’ preferences.

Obviously, the success of a practical reputation and recommender system requires sound usability with regard to user-device interaction. It should be robust enough to overcome various potential attacks. Meanwhile, a mechanism is expected to standardize users’ voting with trustworthy credibility. Finally, the system should preserve the user’s privacy to a certain level while it collects user data for reputation and recommendation generation.

3 Trust behavior model

In order to achieve the trust behavior model, we designed a questionnaire survey that asks for user opinions about trust behaviors regarding mobile application usage. Based on the data collected from 1,575 participants, we obtained a conceptual trust behavior model, as shown in Fig. 1 [3]. The construct of the model and the relations among all factors were analyzed and validated using principal component analysis (PCA), confirmatory factor analysis (CFA), correlation analysis, and reliability analysis, with positive psychometric properties and sound validity and reliability [3]. The relationships between the components (i.e., the edge values in Fig. 1) are set based on the correlation analysis, with values in the range [0, 1]. The model relates the trust behavior to three types of usage behaviors: using behavior (UB), reflection behavior (RB), and correlation behavior (CB). These behaviors can be automatically monitored by a mobile device during application consumption. They also relate to a number of external factors: personal motivation, brand impact, perceived device quality, and personality. They are further delineated into twelve measurable sub-constructs. Figure 2 illustrates the sub-constructs of UB, RB, and CB according to the PCA, CFA, and correlation analysis.

Fig. 1 Trust behavior construct of mobile applications

Fig. 2 Internal relationships of a UB; b RB; and c CB

What follows refers to the notations used in Fig. 2. The first type of trust behavior, using behavior, relates to normal application usage, which is reflected mainly by elapsed usage time, number of usages, and usage frequency. We found that trust is reflected by UB1, normal usage behavior. Meanwhile, usage context such as risk, importance, and urgency could also influence the trust behavior (i.e., UB2: behavior related to context). Generally, a mobile application provides a number of functionalities, i.e., features. The more features experienced by the user, the more proficient he/she is in using the application (i.e., UB3: feature-related usage behavior). What is more, frequent usage can somehow indicate trust, which is also reflected in the user data collected in our survey [3]. Herein, we pay more attention to public trust, i.e., reputation, which aggregates many users’ trust opinions on an application.

The second type of trust behavior is reflection behavior. It concerns the usage behaviors after the user confronts application problems/errors or has good/bad usage experiences. It contains six sub-constructs: RB1: bad performance reflection behavior; RB2: bad performance reflection behavior related to context; RB3: good performance reflection behavior; RB4: good performance reflection behavior related to context; RB5: bad experience reflection related to context; RB6: good experience reflection related to context. The difference between the reflection behavior and the using behavior lies in the fact that the first is an event-related behavior while the second concerns general usage statistics; their contributions to trust evaluation can therefore differ. For example, one type of reflection behavior is the usage behavior when the user confronts an application error: whether he/she continues using the application in such a situation. The using behavior only reflects normal usage information and does not indicate changes in usage.

The future mobile market could be very competitive: a number of similarly functioning mobile applications developed by different vendors may be available for consumption at the same time. The third type of behavior is correlation behavior, which concerns the usage behaviors correlated across a number of similarly functioning mobile applications. Since trust is obviously correlated with use [21–23], usage can imply trust. Meanwhile, it is also influenced by various contexts [5, 15]. The correlation behavior has three sub-constructs: CB1: comparison of normal usage behavior; CB2: comparison related to context; CB3: recommendation behavior (i.e., suggesting that other people use a mobile application).

The trust behavior construct (i.e., the trust behavior model) for mobile applications is achieved with sound reliability (UB: alpha = 0.71; RB: alpha = 0.85; CB: alpha = 0.79; overall trust behavior: alpha = 0.90) [3]. Reliability is reflected by alpha, a value between 0 and 1, with a larger value indicating better reliability; generally, an alpha above 0.7 implies sound reliability [42]. We found that UB, RB, and CB have significant correlations with the trust behavior (as shown in Fig. 1), which indicates that these three factors can represent it. We also found that these factors have lower correlations with each other than with the trust behavior, which indicates that they can measure not only the general aspects but also the specific aspects of the trust behavior. Notably, their mutual correlations are around 0.5, which implies that these factors may influence each other. However, the assumed relationships cannot be well proved by the internal nomological validity of our experiment or by theory in the literature. This means that these factors could simply be in parallel, without any causal relationships. We also found the influence of a number of external variables (i.e., personal motivation, brand impact, perceived device quality, and personality) on UB, RB, and CB; their correlations are shown in Fig. 1. Note that ** indicates correlation significant at the 0.01 level (2-tailed) and * indicates correlation significant at the 0.05 level (2-tailed). Herein, the level of correlation significance indicates the error probability of the correlation; thus, the lower the level, the more significant the correlation.

As can be seen from Fig. 2, the correlation between each internal sub-factor (e.g., UB1, UB2, and UB3) and its corresponding principal factor (e.g., UB) is at almost the same level (except that CB3’s correlation with CB is a bit lower than those of CB1 and CB2). This correlation is also higher than the correlations among the sub-factors, which indicates that the sub-factors belonging to a concrete principal factor can measure not only the general aspects but also the specific aspects of the represented type of trust behavior.

Our work in [3] only explores the conceptual structure of trust behaviors for mobile application usage based on a user survey. This paper designs TruBeRepec by formalizing the trust behavior model and developing a number of algorithms that can be adopted by TruBeRepec to provide application reputation and recommendation according to trust behaviors.

4 System design

We design a distributed client–server system structure for TruBeRepec, see Fig. 3. Its client software, “Trust Manager”, can be installed in a number of mobile devices (MD_k, k = 1,…,K). The Trust Manager contains a Trust Behavior Monitor that monitors trust behaviors and inputs statistical data about UB, RB, and CB into a secure storage (Trust Data), which is located inside the device platform and has a secure channel to communicate with the behavior monitor and the Trust/Reputation Information Presenter. The statistical data can be accessed by the Data Interpreter for (a) individual trust evaluation regarding a specific application by the Trust Evaluator; (b) data dissemination, which sends local trust information and application votes to the reputation service provider (RSP) or other devices via the Trust Value Disseminator; (c) reputation/recommendation extraction, which gets mobile application reputation information and/or application recommendations from the RSP or other devices via the Reputation/Recommendation Extractor. In particular, the Trust Evaluator can also generate application reputation and recommendation at the user’s device based on information collected from other mobile devices, e.g., through an ad hoc network. The Data Interpreter is a secure mechanism for accessing the user’s usage statistics from the Trust Data, since these data are private information. We design the Data Interpreter based on trusted computing technology [25, 43]: only authorized data interpreter mechanisms can access and unseal the protected usage information. The reputation/recommendation extraction can be tailored based on the mobile user’s preference: a reputation extraction policy, a recommendation extraction policy, or both. In addition, the Trust/Reputation Information Presenter shows trust/reputation information to the user in order to aid his/her application usage [44, 45].

Fig. 3 TruBeRepec system structure

In the RSP, the Trust Value Receiver receives individual trust information and votes from the mobile devices, automatically or by request. The Reputation/Recommendation Generator generates application reputations and recommendations for mobile users. Herein, the reputation could be generated based on all users’ usage statistics, but due to privacy concerns, we instead aggregate the individual trust values based on UB, RB, and CB (calculated in each mobile device). The reputation/recommendation information about each mobile application is saved in a secure storage (Reputation Data) at the RSP. This information can be retrieved and distributed to the mobile devices through the Reputation/Recommendation Distributor, which receives reputation retrieval requests and provides application reputations and recommendations to the requestors.

Consider a mobile application for which a user receives a recommendation from the RSP indicating a high public reputation and a high personalized recommendation to install it, because most people using it consume other applications in a way similar to his regarding trust behaviors. Further prompted by some external factors (e.g., personal motivation, brand impact, perceived device quality, and personality), the user installs the application and starts consuming it. The Trust Behavior Monitor monitors his trust behaviors regarding UB, RB, and CB and inputs the collected statistical data into the Trust Data. Based on these data, the Trust Evaluator evaluates the user’s individual trust in the application. The Trust/Reputation Information Presenter shows the application’s individual trust value, reputation value, and recommendation value to the user. Periodically or by request, the user sends local trust information (e.g., the individual trust value and the trust values contributed by UB, RB, and CB, respectively) and/or application votes to the RSP. The RSP then regenerates the application’s reputation and re-provides the application’s recommendations based on the newly collected data. If the recalculated recommendation suggests that the user should not trust an application he/she has already installed, the device can automatically ask the user whether the application should be uninstalled or a warning shown at application start-up. The user can configure personal settings in the mobile device to handle such situations.

In particular, the user can recommend this application to a friend directly, for example via an ad hoc network. In this case, the user’s trust information is attached to the recommendation message. The friend’s device can then generate the application’s reputation and recommendation value (at its Trust Evaluator) based on the recommender’s individual trust and their trust behavior correlations with regard to commonly consumed applications.
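
To make the data flow concrete, the following Python sketch shows one possible shape of the trust report a device’s Trust Value Disseminator might send to the RSP. The field names and types are hypothetical, chosen only to mirror the values named in this section (the individual trust value, its UB/RB/CB contributions, an optional vote, and a reporting time).

```python
from dataclasses import dataclass
from typing import Optional
import time

@dataclass
class TrustReport:
    """Hypothetical payload sent by the Trust Value Disseminator to the RSP.

    It carries only aggregated trust values, never raw usage statistics,
    in line with the privacy discussion in Sect. 7.2.
    """
    user_id: str                  # pseudonym registered at the RSP
    app_id: str                   # identifier of the rated application
    trust: float                  # overall individual trust T_i(t)
    trust_ub: float               # contribution of using behavior
    trust_rb: float               # contribution of reflection behavior
    trust_cb: float               # contribution of correlation behavior
    vote: Optional[float] = None  # optional explicit vote V_i^k in [0, 1]
    timestamp: float = 0.0        # reporting time t_p

# Example: a report with no explicit vote; the RSP will then use the
# individual trust value in place of the vote (see Sect. 5.2).
report = TrustReport("u42", "app.email", trust=0.71, trust_ub=0.65,
                     trust_rb=0.02, trust_cb=0.04, timestamp=time.time())
```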

5 Algorithms

Based on the above system design and the trust behavior model, we propose a number of algorithms to implement individual trust evaluation in a mobile device, as well as application reputation and recommendation generation. For ease of reference, Table 1 summarizes the notations used in this section.

Table 1 Notations

5.1 Individual trust evaluation

We formalize the conceptual trust behavior model as a computational measure. It is a coherent, adaptive trust model for quantifying an individual user’s trust in a mobile application based on trust behavior observation.

5.1.1 Formalizing using behavior

The PCA assumes that the extracted factors are based on linear combinations. Formalizing the using behavior, we consider the influence on trust of the number of usages, elapsed usage time, usage frequency, and experienced features, based on the trust behavior model. Their influence is scaled by the total number of usages and elapsed usage time of all applications, the total number of the application’s features, and the usage frequency of all mobile applications on the underlying device (refer to UB1 and UB3). Meanwhile, we further tailor trust based on the index of importance, urgency, and risk, i.e., the context index (CI), refer to UB2. Denoting the importance index (ii), urgency index (ui), and risk index (ri) of the nth usage of mobile application i as ii(i, n), ui(i, n), and ri(i, n), then

$$ {\text{CI}}_{i} (t) = \frac{1}{{N_{i} (t)}}\sum\limits_{n = 1}^{{N_{i} (t)}} {{\text{ci}}(i,n)} = \frac{1}{{N_{i} (t)}}\sum\limits_{n = 1}^{{N_{i} (t)}} {(\alpha * {\text{ii}}(i,n) + \beta * {\text{ui}}(i,n) + \gamma * {\text{ri}}(i,n))} $$
(1)

where α, β, and γ are parameters to weight the importance of the different context indices. Thus, the individual trust contributed by the using behavior \( T_{i} (t)_{\text{UB}} \) can be calculated as:

$$ T_{i} (t)_{\text{UB}} = \left( {\frac{{N_{i} (t)}}{N(t)} * \frac{{{\text{UT}}_{i} (t)}}{{{\text{UT}}(t)}} * \frac{{{\text{EF}}(i)}}{F(i)} * \frac{{{\text{FE}}_{i} (t)}}{{{\text{FE}}(t)}}} \right) * {\text{CI}}_{i} (t) $$
(2)

In our user study [45], we found that the importance rate is highly related to the elapsed usage time, frequency, and the number of usages. If the weights of the urgency index and risk index are 0 (i.e., β = 0 and γ = 0), we can simplify formulas (1) and (2) as:

$$ {\text{CI}}_{i} (t) = \mu *\left( {\frac{{N_{i} (t)}}{N(t)} * \frac{{{\text{UT}}_{i} (t)}}{{{\text{UT}}(t)}} * \frac{{{\text{FE}}_{i} (t)}}{{{\text{FE}}(t)}}} \right) $$
(1’)
$$ T_{i} (t)_{\text{UB}} = \mu *\left( {\frac{{N_{i} (t)}}{N(t)} * \frac{{{\text{UT}}_{i} (t)}}{{{\text{UT}}(t)}} * \frac{{{\text{FE}}_{i} (t)}}{{{\text{FE}}(t)}}} \right)^{2} \left( {\frac{{{\text{EF}}(i)}}{F(i)}} \right) $$
(2’)

where μ is the parameter used to adjust the context index.
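
As an illustration, the following Python sketch computes the simplified using-behavior trust of formula (2'). The argument names are ours and map directly onto the counters named above: numbers of usages, elapsed usage times, and usage frequencies of application i versus all applications, and the experienced versus total features of application i.

```python
def trust_ub(n_i, n_total, ut_i, ut_total, fe_i, fe_total, ef_i, f_i, mu=1.0):
    """Simplified using-behavior trust, formula (2').

    All counters come from the Trust Behavior Monitor: n_* are numbers of
    usages, ut_* elapsed usage times, fe_* usage frequencies, ef_i the
    number of experienced features and f_i the total features of app i.
    """
    core = (n_i / n_total) * (ut_i / ut_total) * (fe_i / fe_total)
    return mu * core ** 2 * (ef_i / f_i)

# Example: app i accounts for 1/4 of all usages, 1/5 of all usage time and
# 1/4 of all usage frequency, and the user has tried 6 of its 10 features.
t_ub = trust_ub(50, 200, 10.0, 50.0, 5, 20, 6, 10)
```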

5.1.2 Formalizing reflection behavior

Our user study on the reflection behavior showed that changes in the elapsed usage time, the number of usages, and the usage frequency, as well as the change in the CI, influence trust (refer to RB1–RB6). We introduce a parameter called the performance index (PI) to reflect application performance

$$ {\text{PI}}_{i} (t) = d_{t} \left\{ {N_{i} (t) + {\text{UT}}_{i} (t) + {\text{FE}}_{i} (t)} \right\} + d_{t} \left\{ {{\text{CI}}_{i} (t)} \right\} $$
(3)

where \( d_{t} \{ g(t)\} = \frac{g(t) - g(t - \tau )}{\tau } \), \( (\tau \to 0) \); g(t) is a function of t; and τ is the time interval applied to measure the changes in usage behavior and context. For the same reason mentioned above, we simplify formula (3) as

$$ {\text{PI}}_{i} (t) = 2\left( {d_{t} \left\{ {N_{i} (t) + {\text{UT}}_{i} (t) + {\text{FE}}_{i} (t)} \right\}} \right) $$
(4)

The contribution of the reflection behavior to individual trust generation \( T_{i} (t)_{\text{RB}} \) can be specified as

$$ T_{i} (t)_{\text{RB}} = {\text{PI}}_{i} (t) $$
(5)
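
A minimal Python sketch of formulas (4) and (5), assuming the Trust Behavior Monitor delivers (N_i, UT_i, FE_i) snapshots taken τ time units apart so that d_t{g} can be approximated by a finite difference:

```python
def trust_rb(stats_now, stats_prev, tau):
    """Reflection-behavior trust via the performance index, formulas (4)-(5).

    stats_* are (N_i, UT_i, FE_i) snapshots taken tau time units apart;
    d_t{g} is approximated by (g(t) - g(t - tau)) / tau.
    """
    diff = sum(now - prev for now, prev in zip(stats_now, stats_prev))
    pi = 2 * diff / tau   # formula (4): PI_i(t)
    return pi             # formula (5): T_i(t)_RB = PI_i(t)
```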

5.1.3 Formalizing correlation behavior

Trust based on the correlation behavior contains two parts. The first part reflects the comparison of normal usage behavior and the level of the context index with similar applications (refer to CB1 and CB2). The second part reflects the recommendation behavior (refer to CB3). Herein, we derive the contribution of the recommendation behavior from the current trust value T i (t) and the context index CI i (t). We have

$$ T_{i} (t)_{\text{CB}} = \sum\limits_{k = 1,\,k \ne i}^{I} {ac(i,k)} * \left\{ {\frac{{N_{i} (t) - N_{k} (t)}}{{N(t)}} + \frac{{{\text{UT}}_{i} (t) - {\text{UT}}_{k} (t)}}{{{\text{UT}}(t)}} + \frac{{{\text{FE}}_{i} (t) - {\text{FE}}_{k} (t)}}{{{\text{FE}}(t)}} + \left( {{\text{CI}}_{i} (t) - {\text{CI}}_{k} (t)} \right) + \frac{{{\text{NR}}_{i} (t) - {\text{NR}}_{k} (t)}}{{{\text{NR}}(t)}}} \right\} + \lambda \frac{{{\text{NR}}_{i} (t)}}{{{\text{NR}}(t)}} * {\text{CI}}_{i} (t) * T_{i} (t) $$
(6)

where λ is a parameter that weights the contribution of the recommendation behavior. An important reason to introduce λ is that the correlation of CB3 to CB is lower than the correlations of CB1 and CB2 to CB, as shown in Fig. 2. We use ac(i, k) to indicate the similarity of applications i and k.
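
The following Python sketch mirrors formula (6). The data layout (per-application statistics, global totals, and a similarity callback ac) is our own assumption about how a device might store the monitored values.

```python
def trust_cb(i, stats, totals, ci, nr, ac, trust_now, lam=0.5):
    """Correlation-behavior trust, formula (6); a minimal sketch.

    stats[k] = (N_k, UT_k, FE_k) per application, totals = (N, UT, FE, NR)
    over all applications, ci[k] the context index, nr[k] the number of
    recommendations, and ac(i, k) the similarity of applications i and k.
    """
    n_t, ut_t, fe_t, nr_t = totals
    n_i, ut_i, fe_i = stats[i]
    t_cb = 0.0
    for k in stats:
        if k == i:
            continue
        n_k, ut_k, fe_k = stats[k]
        t_cb += ac(i, k) * ((n_i - n_k) / n_t + (ut_i - ut_k) / ut_t
                            + (fe_i - fe_k) / fe_t + (ci[i] - ci[k])
                            + (nr[i] - nr[k]) / nr_t)
    # recommendation-behavior contribution, weighted by lambda
    return t_cb + lam * (nr[i] / nr_t) * ci[i] * trust_now
```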

5.1.4 General metric of individual trust

The PCA assumes that the observed data set is a linear combination of a certain basis. Aggregating all of the above, we obtain the following unified formula for individual trust evaluation.

$$ T_{i} (t) = T_{i} (t)_{\text{o}} + \rho T_{i} (t)_{\text{UB}} + \vartheta T_{i} (t)_{\text{RB}} + \varsigma T_{i} (t)_{\text{CB}} $$
(7)

where parameters ρ, ϑ, and ς denote the normalized weight factors for using behavior evaluation, reflection behavior evaluation, and correlation behavior evaluation.

This metric consists of four parts. The first part is the original trust value, which could be an initial trust value at the beginning of the application usage or a trust value generated in the previous time window. This value could be negative, since usage could go down or the user could prefer using another similar application. The second part is a pure usage-experience-based trust evaluation according to the using behavior; we consider the influence of elapsed usage time, frequency, the number of usages, and experienced application features, as well as the context. The third part is contributed by the reflection behavior according to the application’s performance, which is reflected by usage changes and context index changes. The last part is a weighted evaluation contribution from the correlation and recommendation behaviors. It takes the current trust value into account to counter dishonest recommendations and to capture the context influence on the recommendations; this history-based evaluation can be seen as a prediction of the recommendation behaviors’ contribution to the trust evaluation. Inside the last part there is also an application-comparison-based contribution, which adjusts the trust value based on differences in usage number/time/frequency, recommendations, and the context index with regard to similarly functioned applications. In order to normalize the trust value into the range (0, 1), we apply a sigmoid function to the trust value

$$ T_{i} (t) = f\left\{ {T_{i} (t)_{\text{o}} + \rho T_{i} (t)_{\text{UB}} + \vartheta T_{i} (t)_{\text{RB}} + \varsigma T_{i} (t)_{\text{CB}} } \right\} $$
(8)

It is important to note that this general metric of individual trust may take different forms depending on which parameters are switched on and how the parameters and weight factors are set. The setting of ρ, ϑ, and ς can be based on the correlations of UB, RB, and CB with the trust behavior: 0.776, 0.897, and 0.778, respectively. Algorithm 1 is applied to evaluate individual trust at the Trust Evaluator (refer to Fig. 3).

Algorithm 1 Individual trust evaluation
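
Since Algorithm 1 essentially evaluates formulas (7) and (8), a minimal Python sketch could look as follows. The default weights follow the UB/RB/CB correlations quoted above (standing in for ρ, ϑ, and ς), and the sigmoid f is instantiated here as the logistic function, which is one possible choice; the paper leaves the exact sigmoid open.

```python
import math

def individual_trust(t_old, t_ub, t_rb, t_cb,
                     w_ub=0.776, w_rb=0.897, w_cb=0.778):
    """A minimal sketch of Algorithm 1, combining formulas (7) and (8).

    t_old is the previous (or initial) trust value; w_ub, w_rb and w_cb
    play the roles of the weight factors rho, vartheta and varsigma.
    """
    raw = t_old + w_ub * t_ub + w_rb * t_rb + w_cb * t_cb
    return 1.0 / (1.0 + math.exp(-raw))  # logistic sigmoid into (0, 1)
```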

5.2 Application reputation generation

The RSP collects individual trust in various mobile applications. During individual trust sharing, some mobile users may want to vote on the applications directly, according to our user study [45]. TruBeRepec considers the individual trust automatically generated by the mobile device and/or direct votes (i.e., the user’s subjective opinion on the application). Based on the votes and individual trust values, we generate the application reputation at the RSP’s Reputation/Recommendation Generator (see Fig. 3) by applying Algorithm 2. We apply weighted aggregation, using the individual trust as the credibility of a vote, and also consider the influence of time and the number of reputation contributors. Note that Algorithm 2 can also be applied by the Trust Evaluator (see Fig. 3) to generate application reputation based on locally collected trust behavior information.

Algorithm 2 Application reputation generation

Obviously, user k, \( u_{k} \) (\( u_{k} \in U \), k = 1,…,K), could vote on application i many times and at different times \( t_{p} :\left\{ {V_{i}^{k} } \right\} = \left\{ {V_{i}^{k} \left( {t_{p} } \right)} \right\} \). Considering the time influence and the potential on–off attack, we pay more attention to the user’s recent votes. \( \overline{{R^{k} (i)}} \) is the aggregated reputation of application i based on user k’s experiences.

$$ \overline{{R^{k} (i)}} = \frac{1}{O}\sum\limits_{p} {V_{i}^{k} (t_{p} )*T_{i}^{k} (t_{p} )*e^{{ - \frac{{\left| {t - t_{p} } \right|^{2} }}{\tau }}} } $$
(9)

where \( O = \sum\nolimits_{p} {T_{i}^{k} (t_{p} )*e^{{ - \frac{{\left| {t - t_{p} } \right|^{2} }}{\tau }}} } \), \( V_{i}^{k} (t_{p} ) \) is user k’s vote on application i at time \( t_{p} \), t is the reputation generation time, and τ is the parameter controlling the time decay (τ = 2 in our simulations). \( T_{i}^{k} (t_{p} ) \) is the individual trust of user k reported at time \( t_{p} \), with vote \( V_{i}^{k} (t_{p} ) \) attached. If \( V_{i}^{k} (t_{p} ) \) is not provided by the user, we set \( V_{i}^{k} (t_{p} ) = T_{i}^{k} (t_{p} ) \) automatically (note that \( V_{i}^{k} (t_{p} ) \in [0,1] \)).
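
A minimal Python sketch of formula (9), assuming each stored record is a (t_p, V, T) tuple of reporting time, vote, and attached individual trust:

```python
import math

def user_reputation(records, t_now, tau=2.0):
    """Per-user aggregated reputation of an application, formula (9).

    records is a list of (t_p, vote, trust) tuples; when no explicit vote
    was given, the caller sets vote = trust. tau controls the time decay.
    """
    num = den = 0.0
    for t_p, vote, trust in records:
        decay = math.exp(-abs(t_now - t_p) ** 2 / tau)
        num += vote * trust * decay
        den += trust * decay      # normalizer O in formula (9)
    return num / den if den else 0.0
```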

We combine the users’ experiences with application i to generate its public reputation, denoted R(i), using the following function, which also takes the recommendation trust s k into account:

$$ R(i) = \frac{\theta (K)}{W}\sum\limits_{k = 1}^{K} {s^{k} *\overline{{R^{k} (i)}} } $$
(10)

where K is the number of users who consume application i. Herein, we apply the Rayleigh cumulative distribution \( \theta (K) = \left\{ {1 - \exp \left( {\frac{{ - K^{2} }}{{2(\sigma + \varepsilon )^{2} }}} \right)} \right\} \) to model the impact of K on the application reputation. The percentage of usage, \( \varepsilon = - K/K^{\prime } \), is the factor indicating the popularity of an application, where K′ is the total number of users in the TruBeRepec system, and \( W = \sum\nolimits_{k = 1}^{K} {s^{k} } \). We introduce user k’s recommendation trust s k in order to overcome potential attacks on TruBeRepec. At registration time, s k is set to an initial value (e.g., 0.5 in our simulations) at the RSP. It then evolves based on user k’s performance regarding application reputation generation. We have

$$ \rho = \frac{1}{2}\left\{ {{ \max }\left( {V_{i}^{k} } \right) - { \min }\left( {V_{i}^{k} } \right)} \right\} $$
$$ y = \rho - \left| {R(i) - V_{i}^{k} (i)} \right| $$
$$ {\text{If}}\,y < \varphi \quad (\varphi = 0),\quad \gamma + + ; $$
$$ s^{k} = \left\{ \begin{array}{ll} s^{k} + \delta y & \quad \left( {\gamma < {\text{thr}}} \right) \\ s^{k} + \delta y - \mu \gamma & \quad \left( {\gamma \ge {\text{thr}}} \right) \\ \end{array} \right. ,\qquad s^{k} = \left\{ \begin{array}{ll} 1 & \quad \left( {s^{k} > 1} \right) \\ 0 & \quad \left( {s^{k} < 0} \right) \\ \end{array} \right. $$
(11)

where δ > 0 is a parameter that controls the adjustment of s k. In order to detect on–off attackers and conflict behavior attackers, we further introduce a warning flag γ that records the number of bad inputs into reputation generation. γ’s initial value is 0, and it is increased by 1 each time a bad input happens. The parameter thr is a threshold indicating an on–off attack or conflict behavior attack (thr = 3 in our simulations), and μ > 0 controls the punishment of bad inputs. \( { \max }\left( {V_{i}^{k} } \right) \) is the maximum voting value, while \( \min \left( {V_{i}^{k} } \right) \) is the minimum voting value. φ is a parameter for deciding whether an input is bad. We set δ = 0.05, μ = 0.1, and φ = 0 in our simulations.
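
The update of s^k in formula (11) can be sketched in Python as follows. Treating the latest vote as the V_i^k compared against R(i) is our reading of the notation; the parameter defaults follow the simulation settings above.

```python
def update_rec_trust(s_k, votes, r_i, gamma,
                     delta=0.05, mu=0.1, phi=0.0, thr=3):
    """Recommendation trust update, formula (11); a minimal sketch.

    votes is user k's list of voting values on application i, r_i the
    freshly generated reputation R(i), gamma the running warning counter.
    Returns the new (s_k, gamma) pair, with s_k clamped to [0, 1].
    """
    rho = 0.5 * (max(votes) - min(votes))
    y = rho - abs(r_i - votes[-1])   # quality of the latest input
    if y < phi:                      # bad input: raise the warning flag
        gamma += 1
    s_k += delta * y if gamma < thr else delta * y - mu * gamma
    return min(1.0, max(0.0, s_k)), gamma
```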

5.3 Application recommendation

In addition to \( T_{i}(t) \), the RSP also collects \( T_{i}(t)_{\text{UB}} \), \( T_{i}(t)_{\text{RB}} \), and \( T_{i}(t)_{\text{CB}} \) in order to provide appropriate recommendations based on the correlation of trust behaviors. TruBeRepec also provides the public reputation of recommended applications for the mobile users’ reference. Algorithm 3 is used at the RSP’s Reputation/Recommendation Generator (see Fig. 3) to generate the application recommendation vector, which contains the recommendation value of application i for each TruBeRepec user. Herein, we only consider good users as recommendation contributors. Note that Algorithm 3 can also be applied by the Trust Evaluator (see Fig. 3) to generate the application recommendation vector based on locally collected trust behavior information.

Algorithm 3 Application recommendation vector generation

Suppose a set of applications \( A = \left\{ {a_{1} ,a_{2} , \ldots ,a_{i} , \ldots ,a_{I} } \right\} \) is considered in the system. For each application i, a user has \( T_{i}(t)_{\text{UB}} \), \( T_{i}(t)_{\text{RB}} \), and \( T_{i}(t)_{\text{CB}} \), where t is the recommendation time. K users \( U = \left\{ {u_{1} ,u_{2} , \ldots ,u_{K} } \right\} \) contribute to the application recommendation in TruBeRepec.

For the kth user \( u_{k} \), we have the following matrix D(u k ) to represent his/her trust behaviors regarding the applications \( A = \left\{ {a_{1} ,a_{2} , \ldots ,a_{I} } \right\} \) based on past experiences:

$$ D(u_{k} ) = \left\{ {\left\{ \begin{gathered} T_{1}^{k} (t)_{\text{UB}} \hfill \\ T_{1}^{k} (t)_{\text{RB}} \hfill \\ T_{1}^{k} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \ldots \left\{ \begin{gathered} T_{I}^{k} (t)_{\text{UB}} \hfill \\ T_{I}^{k} (t)_{\text{RB}} \hfill \\ T_{I}^{k} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\}} \right\} $$
(12)

The matrix that expresses all users’ trust behaviors is:

$$ D(U) = \left\{ \begin{gathered} \left\{ \begin{gathered} T_{1}^{1} (t)_{\text{UB}} \hfill \\ T_{1}^{1} (t)_{\text{RB}} \hfill \\ T_{1}^{1} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \ldots \left\{ \begin{gathered} T_{I}^{1} (t)_{\text{UB}} \hfill \\ T_{I}^{1} (t)_{\text{RB}} \hfill \\ T_{I}^{1} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \hfill \\ \ldots \hfill \\ \left\{ \begin{gathered} T_{1}^{k} (t)_{\text{UB}} \hfill \\ T_{1}^{k} (t)_{\text{RB}} \hfill \\ T_{1}^{k} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \ldots \left\{ \begin{gathered} T_{I}^{k} (t)_{\text{UB}} \hfill \\ T_{I}^{k} (t)_{\text{RB}} \hfill \\ T_{I}^{k} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \hfill \\ \ldots \hfill \\ \left\{ \begin{gathered} T_{1}^{K} (t)_{\text{UB}} \hfill \\ T_{1}^{K} (t)_{\text{RB}} \hfill \\ T_{1}^{K} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \ldots \left\{ \begin{gathered} T_{I}^{K} (t)_{\text{UB}} \hfill \\ T_{I}^{K} (t)_{\text{RB}} \hfill \\ T_{I}^{K} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} \hfill \\ \end{gathered} \right\} $$
(13)

The recommendation vector \( R_{i}^{k} \) at time t for \( u_{k} \) regarding application i can be calculated with the following formula to provide personalized recommendations according to the correlation of trust behaviors:

$$ R_{i}^{k} = \left\{ \begin{gathered} R_{i}^{k} (t)_{\text{UB}} \hfill \\ R_{i}^{k} (t)_{\text{RB}} \hfill \\ R_{i}^{k} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} = \frac{{\sum\nolimits_{j \ne k} {\left( {\left\{ \begin{gathered} T_{i}^{j} (t)_{\text{UB}} \hfill \\ T_{i}^{j} (t)_{\text{RB}} \hfill \\ T_{i}^{j} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} * {\text{Rel}}(u_{j} ,u_{k} )} \right)} }}{{\sum\nolimits_{j \ne k} {{\text{Rel}}(u_{j} ,u_{k} )} }},\quad (i = 1, \ldots ,I) $$
(14)
$$ {\text{Rel}}(u_{j} ,u_{k} ) = \frac{1}{I - 1}\sum\limits_{i^\prime \ne i} {\left( {1 - \sqrt {\frac{{\left( {T_{i^\prime }^{k} (t)_{\text{UB}} - T_{i^\prime }^{j} (t)_{\text{UB}} } \right)^{2} + \left( {T_{i^\prime }^{k} (t)_{\text{RB}} - T_{i^\prime }^{j} (t)_{\text{RB}} } \right)^{2} + \left( {T_{i^\prime }^{k} (t)_{\text{CB}} - T_{i^\prime }^{j} (t)_{\text{CB}} } \right)^{2} }}{3}} } \right)} $$
(15)

Considering the influence of the number of recommenders, we set

$$ N_{K} = \left\{ {1 - { \exp }\left( {\frac{{ - K^{2} }}{{2\sigma^{2} }}} \right)} \right\} $$
(16)

where σ > 0 is a parameter that inversely controls how fast the number of recommenders impacts \( R_{i}^{k} \); N K increases as K increases. The parameter σ can be set from 0 to, theoretically, ∞ in order to capture the characteristics of different scenarios.

We use N K to adjust the recommendation vector by considering the influence of the population K. The final recommendation vector \( R_{i}^{k} \) is:

$$ R_{i}^{k} = \frac{{\sum\nolimits_{j \ne k} {\left( {\left\{ \begin{gathered} T_{i}^{j} (t)_{\text{UB}} \hfill \\ T_{i}^{j} (t)_{\text{RB}} \hfill \\ T_{i}^{j} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} * {\text{Rel}}(u_{j} ,u_{k} )} \right)} }}{{\sum\nolimits_{j \ne k} {{\text{Rel}}(u_{j} ,u_{k} )} }} * N_{K} \quad (i = 1, \ldots ,I) $$
(17)
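
Putting formulas (15)-(17) together, a Python sketch of the recommendation vector computation might look as follows. The representation of user profiles as dictionaries of (T_UB, T_RB, T_CB) triples is our own assumption, and Rel is averaged over the applications both users actually share, which reduces to 1/(I-1) when all profiles are complete.

```python
import math

def rel(d_j, d_k, i):
    """Behavior similarity Rel(u_j, u_k), formula (15), over apps i' != i.

    d_* maps app id -> (T_UB, T_RB, T_CB) trust-behavior triple.
    """
    common = [a for a in d_j if a in d_k and a != i]
    total = 0.0
    for a in common:
        dist2 = sum((x - y) ** 2 for x, y in zip(d_k[a], d_j[a])) / 3
        total += 1 - math.sqrt(dist2)
    return total / len(common) if common else 0.0

def recommend(i, k, profiles, sigma=5.0):
    """Recommendation vector R_i^k for user k, formulas (16)-(17).

    profiles[u] is the D(u) matrix of formula (12), as a dict of triples.
    """
    num = [0.0, 0.0, 0.0]
    den = 0.0
    contributors = 0
    for j, d_j in profiles.items():
        if j == k or i not in d_j:
            continue
        w = rel(d_j, profiles[k], i)
        num = [n + t * w for n, t in zip(num, d_j[i])]
        den += w
        contributors += 1
    n_k = 1 - math.exp(-contributors ** 2 / (2 * sigma ** 2))  # formula (16)
    return [n / den * n_k for n in num] if den else num
```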

6 Analysis and evaluation

We have evaluated the trust behavior formalization (i.e., Algorithm 1) based on a number of usage models in [46]; see “Appendix 1”. Herein, we focus on the evaluation of the reputation and recommender algorithms. In TruBeRepec, malicious users could provide dishonest votes on applications in order to frame good ones and/or boost bad ones. This attack, referred to as the bad mouthing or unfair rating attack, is the most straightforward attack [39, 40]. Malicious users could also alternate between behaving well and behaving badly, hoping to remain undetected while causing damage [39, 40]; this attack is called the on–off attack. In particular, they can perform differently toward different applications in order to impair good users’ recommendation trust; this attack is referred to as the conflict behavior attack. TruBeRepec aims at overcoming the above attacks caused by subjective user voting. Herein, we assume that a malicious user is a user whose opinion on an application obviously differs from the application’s public reputation with regard to his/her voting, or whose trust behavior obviously differs from others’ in terms of application usage. This generally accords with reality. In this section, we design a number of experimental simulations to investigate the effectiveness and robustness of the reputation algorithm. We also evaluate the recommender algorithm based on a number of usage examples.

In our simulations, we assume K = 50 users who commonly consume one application in a TruBeRepec system with a total of 100 registered users (i.e., K′ = 100). There are a number of applications \( A = \left\{ {a_{1} ,a_{2} , \ldots ,a_{I} } \right\} \) that the users can select and consume. In the experiment, an honest vote means that the voting value matches the user’s individual trust, while a dishonest vote means that the voting value mismatches the user’s individual trust. Obviously, TruBeRepec can work without user voting, thus automatically avoiding the above attacks. We apply Algorithm 2 to generate the reputation of an application and Algorithm 3 to generate the application recommendation vector. We simply use R to denote an application’s reputation value, where R = 0.1 indicates low reputation and R = 0.9 indicates high reputation; \( R_{i} \) is the reputation of application \( a_{i} \).

We adopt metrics commonly used in information retrieval, Recall (E), Precision (P), and F-measure (F), to describe the malicious user detection performance [38]. For the RSP, x denotes the number of users that belong to the Malicious User (MU) set and are indeed detected as MU; y denotes the number of users that do not belong to MU but are detected as MU; z denotes the number of users that belong to MU but are not detected as MU. With these data, we perform a precision-recall evaluation. Define

$$ E = \frac{x}{x + z} $$
(18)
$$ P = \frac{x}{x + y} $$
(19)
$$ F = \frac{{2{\text{PE}}}}{P + E} $$
(20)
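
In Python, these three metrics reduce to a few lines; x, y, and z are the detection counts defined above:

```python
def f_measure(x, y, z):
    """F-measure from true positives x, false positives y and false
    negatives z, per formulas (18)-(20)."""
    recall = x / (x + z)       # E, formula (18)
    precision = x / (x + y)    # P, formula (19)
    return 2 * precision * recall / (precision + recall)
```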

Using the F-measure, we express TruBeRepec’s robustness against various attacks. In the simulation, if user k’s recommendation trust \( s^{k} \le 0.1 \), TruBeRepec treats \( u_{k} \) as a malicious user.

In the experiment, we assume that the reputation of one application is not related to another’s. A user could behave honestly in using one application while behaving badly in consuming another. The F-measure indicates the performance of TruBeRepec according to the users’ recommendation trust values. The recommendation trust value is evaluated according to the user’s contribution to the applications’ reputation generation, no matter which application the user contributes to. Herein, we focus on voting on one application commonly used by 50 users, and on two applications in the case of the conflict behavior attack investigation.

The simulation results are similar if the 50 users vote on different applications at the same time; TruBeRepec can detect the malicious users faster if it can accumulate more information. Due to space limitations, we only report the simulation results for the hardest detection cases, that is, when the users contribute to the reputation of one or two applications.

6.1 Effects of TruBeRepec reputation mechanism

We test the performance of Algorithm 2 with the following scenario: 50 users consume one application. Each user’s initial recommendation trust is 0.5. They recommend the application honestly (e.g., no votes provided) at different time periods while their individual trust is (1) fixed (e.g., 0.1 or 0.9); (2) increasing from 0.1 to 0.9; or (3) decreasing from 0.9 to 0.1. We evaluate how effective TruBeRepec is. Figure 4 shows the simulation results. We observe that TruBeRepec performs very well in these situations (see Fig. 4a). Since the users are honest, their recommendation trust gradually increases until reaching full trust (see Fig. 4b).

Fig. 4 Effects of TruBeRepec reputation mechanism: a application reputation in the case that all users are honest: (1) all users have fixed individual trust (0.1); (2) all users have fixed individual trust (0.9); (3) all users’ individual trust increases gradually from 0.1 to 0.9; (4) all users’ individual trust decreases gradually from 0.9 to 0.1. b A good user’s recommendation trust in the above situations

We further test TruBeRepec in scenarios with malicious users who intentionally use the application differently from others; that is, their individual trust differs from others’ even though their votes seem honest. We evaluate how robust TruBeRepec is to this attack on the trust behavior model. Figure 5 shows the simulation results. We observe that TruBeRepec can evaluate the application’s real reputation efficiently even though some users’ usage behaviors are malicious.

Fig. 5 Performance of TruBeRepec reputation mechanism with 10 and 20% malicious users

6.2 Unfair rating attack

Unfair rating can influence the TruBeRepec system when users are allowed to vote on the application. The influence of the unfair rating attack is demonstrated in Fig. 6. In the simulation, we assume that attackers consume the application in a normal way but vote unfairly. We test four scenarios with 5, 10, 15, and 20 unfair rating attackers, respectively, while the other users vote on the application (with R = 0.1 or R = 0.9) honestly. We observe that TruBeRepec can overcome the unfair rating attack very efficiently: it mostly finds the unfair rating attackers immediately if the percentage of attackers is below 30%. Even when the percentage of attackers is up to 40%, TruBeRepec can still find the attackers within 10 time periods if the attackers continuously vote unfairly.

Fig. 6 a Performance of TruBeRepec reputation mechanism with 10, 20, 30, and 40% unfair rating attackers; b F-measure with 10, 20, 30, and 40% unfair rating attackers

6.3 On–off attack

The influence of the on–off attack due to malicious voting is demonstrated in Fig. 7. We test four scenarios: 5, 10, 15, and 20 attackers vote on the application (with R = 0.1 or R = 0.9) with alternately honest and dishonest recommendations, while the other users vote on the application honestly. We can see that TruBeRepec efficiently overcomes the on–off attack when the percentage of attackers is below 40%. Even when half of the users are on–off attackers, TruBeRepec can still detect them, but it needs more time periods.

Fig. 7 a Performance of TruBeRepec reputation mechanism with 10, 20, 30, and 40% on–off attackers; b F-measure with 10, 20, 30, and 40% on–off attackers

6.4 Conflict behavior attack

The influence of the conflict behavior attack is demonstrated in Fig. 8. We test five scenarios: 5, 10, 15, 20, and 25 attackers vote on one application a_1 (with R_1 = 0.9) dishonestly while voting on another application a_2 (with R_2 = 0.1) honestly at the same time, while the other users vote on both applications honestly. We observe that TruBeRepec performs very well against this attack, even when the attackers make up 50% of the users. It mostly detects the attackers in the first time period, within the 2nd time period when the attackers are 40% of the users, and within the 3rd time period when the attackers reach 50% of the users.

Fig. 8 a Performance of TruBeRepec reputation mechanism with 10, 20, 30, 40, and 50% conflict behavior attackers; b F-measure with 10, 20, 30, 40, and 50% conflict behavior attackers

6.5 Recommendation accuracy

We illustrate the accuracy of the TruBeRepec recommendation mechanism with the following example: 10 users use three applications, with simulated \( T_{i}(t)_{\text{UB}} \), \( T_{i}(t)_{\text{RB}} \), and \( T_{i}(t)_{\text{CB}} \). For the 11th user, who only consumes two of the three applications, a_0 and a_1, we calculate the recommendation vector regarding the third application a_2. Our simulation results are shown in Table 2. The random numbers generated in the simulation are given in “Appendix 2”, where Table 4 provides the random \( T_{i}^{k} (t)_{\text{UB}} \) generated for the third test and Table 5 provides the random \( T_{i}^{k} (t)_{\text{UB}} \), \( T_{i}^{k} (t)_{\text{RB}} \), and \( T_{i}^{k} (t)_{\text{CB}} \) generated for the fourth test. We can see that TruBeRepec provides personalized recommendations on the basis of trust behavior correlation, which is a concrete clue of interest similarity and preferences. For a simple example shown in Table 2, if all eleven users have the same trust behavior values [e.g., (0.6, 0.6, 0.6)] for a_0 and a_1, and the first 10 users have the same trust behavior values [e.g., (0.6, 0.6, 0.6)] regarding a_2, i.e.,

$$ \left\{ \begin{gathered} T_{i}^{k} (t)_{\text{UB}} \hfill \\ T_{i}^{k} (t)_{\text{RB}} \hfill \\ T_{i}^{k} (t)_{\text{CB}} \hfill \\ \end{gathered} \right\} = \left\{ \begin{gathered} 0.6 \hfill \\ 0.6 \hfill \\ 0.6 \hfill \\ \end{gathered} \right\}\quad\left( {i = 0,1,2;\,k = 0, \ldots ,10(i = 0,1);\,k = 0, \ldots ,9(i = 2)} \right), $$

the 11th user gets the recommendation vector (0.6, 0.6, 0.6) for \( a_{2} \) based on Algorithm 3. This is obviously correct. For another example,

$$\left\{\begin{gathered} T_{i}^{k} (t)_{\text{UB}} \hfill \\T_{i}^{k} (t)_{\text{RB}} \hfill \\ T_{i}^{k} (t)_{\text{CB}} \hfill\\ \end{gathered} \right\} = \left\{ \begin{array}{l} 0.7 \\1 - 0.1*k \\ 0.1*k \\ \end{array} \right\}\quad\left( {i= 0,1,2;\, k = 0, \ldots ,10(i = 0,1);\, k = 0, \ldots ,9(i = 2)}\right),$$

the 11th user’s behavior correlation with the other users increases with k for RB and CB and is highest for UB, since all \( T_{i}^{k} (t)_{\text{UB}} \) values are the same (0.7). Thus, the recommendation vector for the 11th user regarding \( a_{2} \) holds the same value 0.7 for UB and values close to the average for RB and CB, i.e., (0.7, 0.43, 0.57).
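Since Algorithm 3 is not reproduced here, the following sketch only illustrates the general idea of correlation-weighted recommendation; the inverse-distance similarity and the weighting scheme are illustrative assumptions, so the second example’s exact output depends on the actual algorithm.

```python
# Illustrative correlation-weighted recommendation in the spirit of
# Algorithm 3 (not reproduced here). Each profile is the flat list of a
# user's (UB, RB, CB) values over the commonly used applications; the
# similarity measure and weighting are assumptions of this sketch.

def similarity(u, v):
    """Inverse-distance similarity between two behavior profiles."""
    dist = sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    return 1.0 / (1.0 + dist)

def recommend(target_profile, peer_profiles, peer_targets):
    """Similarity-weighted average of peers' (UB, RB, CB) for the new app."""
    weights = [similarity(target_profile, p) for p in peer_profiles]
    total = sum(weights)
    return tuple(
        sum(w * t[d] for w, t in zip(weights, peer_targets)) / total
        for d in range(3)  # UB, RB, CB
    )

# First worked example from the text: all eleven users share (0.6, 0.6, 0.6)
# on a0 and a1, and the ten peers share (0.6, 0.6, 0.6) on a2.
peers = [(0.6,) * 6] * 10          # behavior profiles over a0 and a1
targets = [(0.6, 0.6, 0.6)] * 10   # the peers' behavior values for a2
print(recommend((0.6,) * 6, peers, targets))  # -> approx. (0.6, 0.6, 0.6)
```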

Table 2 Recommendation vectors

7 Further discussions

7.1 Practical significance and limitations

Developing TruBeRepec based on the trust behavior model has practical significance. First, the model provides a valuable guideline on what kind of user data should be monitored and collected for the purpose of user trust evaluation. In practice, it is hard to directly evaluate user-perceived trust, which actually reflects the technical trustworthiness of mobile applications. Second, applying the trust behavior model helps ease the load of extra human–device interaction that may be required by some existing trust management solutions [25], because trust behaviors can be monitored through an auto-observation mechanism located on the mobile device; no extra usability study is needed if TruBeRepec is employed. By auto-monitoring users’ trust behaviors via user–device interactions during application consumption, we can automatically extract useful information for trust evaluation, reputation generation, and recommendation provision. Thereby, TruBeRepec provides sound usability. Third, the trust behavior model has been examined through a user study, so a trust explanation mechanism based on this model can be easily understood and accepted by users [26]. Meanwhile, a recommendation from a user or the RSP can be further assessed and explained with the trust behavior model in order to help other users select a trustworthy mobile application. Therefore, TruBeRepec supports usable trust management.
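As an illustration of the auto-observation idea, the sketch below counts trust-related events from user–device interaction. The event set, its mapping to the behavior dimensions (UB, RB, CB), and the normalization are hypothetical; the validated trust behavior model, not this sketch, defines what is actually monitored.

```python
from dataclasses import dataclass

@dataclass
class BehaviorObserver:
    """Counts trust-related events observed from user-device interaction."""
    launches: int = 0             # regular usage events
    error_continuations: int = 0  # kept using the app after an error
    recommendations: int = 0      # recommended the app to other people

    def on_launch(self):
        self.launches += 1

    def on_error_then_continue(self):
        self.error_continuations += 1

    def on_recommend(self):
        self.recommendations += 1

    def normalized(self, period_days: int):
        """Crude normalization of the counters to [0, 1] (assumed)."""
        def cap(x):
            return min(1.0, x / period_days)
        return (cap(self.launches), cap(self.error_continuations),
                cap(self.recommendations))
```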

The TruBeRepec design assumes that malicious or dishonest users make up a small proportion (<50%) of all mobile application users. Although this generally accords with reality, TruBeRepec cannot withstand large-scale collaborative attacks, e.g., when the malicious users number more than half of all users and attempt to collaboratively attack the reputation of an application at the same time. This is an aspect in which TruBeRepec should be further improved in the future.

7.2 User data privacy preservation

Based on interviews with about 180 participants [45], we found that people pay special attention to user data privacy (e.g., usage statistics). In TruBeRepec, the RSP only collects the trust values \( T_{i} (t) \), \( T_{i} (t)_{\text{UB}} \), \( T_{i} (t)_{\text{RB}} \), and \( T_{i} (t)_{\text{CB}} \) in order to generate application reputations and provide recommendations. Users do not need to share their detailed application usage information or their personal interests and preferences. In Privacy Enhancing Technology (PET), data minimization, i.e., minimizing the personal data collected and used by service providers and merchants, is one important technique for preserving privacy. Our method falls into this PET category, although it may not be a perfect instance. Thus, TruBeRepec preserves user data privacy to a certain level.

7.3 Attack on trust behavior

Trust is a subjective concept; to trust or not to trust is a user’s personal opinion. Meanwhile, the reputation published by the reputation service provider can be consulted by a malicious user. If a user intensively uses an application he/she dislikes or distrusts, which, based on his/her own experience, is for example full of bugs and errors, his/her behavior is malicious. Likewise, if a number of users try to raise the reputation of an application with a very low reputation through intensive usage, their behaviors are malicious.

Thereby, apart from the voting-based attacks discussed in Sect. 6, malicious users could attack the proposed trust evaluation mechanism by frequently using a bad application over a long time, by continuing to consume it even though the application has many problems, by always recommending it to other people, and meanwhile by voting for it positively. We argue that this attack has little influence on the accuracy of application reputation and recommendation as long as most users are normal users. It is also very costly for malicious users to use a bad application in a trustworthy way. In addition, this kind of malicious user can be easily detected by the RSP by introducing the recommendation trust \( s^{k} \) into the reputation generation, as shown in Fig. 9 (R = 0.1). This is because a user’s recommendation trust \( s^{k} \) is negatively influenced if the user’s behaviors and/or votes differ from those of most other users, i.e., deviate greatly from the reputation value. In this test, we set \( V_{i}^{k} (t_{p} ) = T_{i}^{k} (t_{p} ) \) automatically since \( V_{i}^{k} (t_{p} ) \) is not provided by the user. We can see that TruBeRepec also overcomes the attack caused by malicious behaviors without user voting; thus, it supports both voting and non-voting.

Fig. 9 Performance of TruBeRepec reputation mechanism regarding malicious usage behavior
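The deviation penalty on \( s^{k} \) can be sketched as follows. The exponential-smoothing update and the rate alpha are illustrative assumptions rather than the paper’s exact formula; the fallback \( V_{i}^{k} (t_{p} ) = T_{i}^{k} (t_{p} ) \) for non-voting users follows the test above.

```python
def update_recommendation_trust(s_k, vote, trust, reputation, alpha=0.2):
    """Penalize s^k when a user's vote/behavior deviates from reputation.

    Exponential smoothing with rate `alpha` is an illustrative choice.
    """
    v = vote if vote is not None else trust   # V = T when no vote is given
    deviation = abs(v - reputation)           # lies in [0, 1]
    return (1 - alpha) * s_k + alpha * (1.0 - deviation)

# A non-voting malicious user: intensive usage yields trust 0.9 for an
# application whose reputation converges to R = 0.1.
s = 0.8
for period in range(5):
    s = update_recommendation_trust(s, None, 0.9, 0.1)
    print(period, round(s, 3))   # s^k decays: 0.68, 0.584, 0.507, ...
```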

Herein, we assume that the mobile device computing platform applies, for example, trusted computing technology to reject malware installation, or has a good detection mechanism to find and remove malware [25, 43, 47, 48].

7.4 Performance impact

TruBeRepec can run as an independent mobile application for the user to check usage information and the trust/reputation values of installed applications; in this case, it does not impact other applications’ performance. It can also run as a backend application to collect usage information and display both the reputation and the detected individual trust values during application usage. To avoid impacting other applications’ performance, individual trust evaluation and reputation retrieval from the RSP are conducted when an application starts up or shuts down.
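The start-up/shutdown scheduling can be sketched as follows; all names are hypothetical and the function bodies are stubs.

```python
def fetch_reputation_from_rsp(app_id: str) -> float:
    """Stub: would query the RSP over the network."""
    return 0.5

def evaluate_individual_trust(app_id: str) -> float:
    """Stub: would aggregate locally observed trust behaviors."""
    return 0.5

class TrustHooks:
    """Run trust-related work only at session boundaries."""

    def on_app_start(self, app_id: str):
        # Retrieve the latest reputation before the session begins.
        self.reputation = fetch_reputation_from_rsp(app_id)

    def on_app_exit(self, app_id: str):
        # Evaluate local trust after the session, off the critical path.
        self.trust = evaluate_individual_trust(app_id)
```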

In addition, we also explored the effects of visualizing trust information on mobile application usage in Finland and China. Although the user experiment results obtained in the two countries showed differences, both positively indicated that displaying an application’s reputation value and/or an individual user’s trust value can assist the usage of mobile applications. Detailed results and discussions are reported in [44].

7.5 Synchronization of \( s^{k} \)

In TruBeRepec, the local \( s^{k} \) (i.e., \( s_{l}^{k} \)) generated at a user’s device and the global \( s^{k} \) (i.e., \( s_{g}^{k} \)) generated at the RSP can be synchronized based on different policies in practice.

Under the first policy, \( s_{l}^{k} \) is overwritten by \( s_{g}^{k} \) each time the latter is issued by the RSP, which suits the case that the RSP collects much more trust information than the local device. That is,

$$ s_{l}^{k} = s_{g}^{k} $$

Under the second policy, \( s_{l}^{k} \) is further evolved each time \( s_{g}^{k} \) is issued, which suits a user who would like to take personally accumulated information into account. For example, \( s_{l}^{k} = \omega_{l} *s_{l}^{k} + \omega_{g} *s_{g}^{k} \), where \( \omega_{l} \) and \( \omega_{g} \) are weighting factors to aggregate \( s_{l}^{k} \) and \( s_{g}^{k} \). Supposing the number of trust-related information items collected locally about user k is \( N_{l}^{k} \) and the number collected globally about user k is \( N_{g}^{k} \), we have \( \omega_{l} = \frac{{N_{l}^{k} }}{{N_{l}^{k} + N_{g}^{k} }} \) and \( \omega_{g} = \frac{{N_{g}^{k} }}{{N_{l}^{k} + N_{g}^{k} }} \).
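Both policies transcribe directly into code; the function names below are illustrative.

```python
def sync_overwrite(s_l: float, s_g: float) -> float:
    """Policy 1: the local value is replaced by the RSP's global value."""
    return s_g

def sync_weighted(s_l: float, s_g: float, n_l: int, n_g: int) -> float:
    """Policy 2: weight each value by the amount of evidence behind it."""
    w_l = n_l / (n_l + n_g)
    w_g = n_g / (n_l + n_g)
    return w_l * s_l + w_g * s_g

# e.g., 20 locally collected items vs. 180 at the RSP:
print(sync_weighted(0.9, 0.6, 20, 180))   # -> approx. 0.63
```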

8 Conclusions and future work

This paper proposed TruBeRepec, a trust-behavior-based reputation and recommender system for mobile applications. Based on the trust behavior model explored through a large-scale user survey and validated using principal components analysis, confirmatory factor analysis, reliability analysis, and correlation analysis, we developed a number of algorithms to evaluate an individual user’s trust in a mobile application, generate application reputations, and provide application recommendations based on trust behaviors. We showed the practical significance of TruBeRepec through simulations and analysis with regard to effectiveness, robustness, and usability, as well as privacy.

Regarding future work, we will further improve the TruBeRepec system and eventually develop it into a product-quality implementation. Meanwhile, we will attempt to embed the system into a pervasive social networking platform [49] as part of its trust solution for mobile application services.