1 Introduction

The combination of artificial intelligence and mobile edge computing (MEC) is considered a promising path for the future development of machine learning techniques. As a model-level coordinated learning paradigm, federated learning [1] can make full use of the distributed computing resources in MEC systems. It allows mobile devices to train a global model in a decentralized manner: instead of uploading their original data to complete a machine learning task, mobile users only need to send local model updates to the task publisher after each iteration. Google’s research team first proposed this concept in 2016 [2] to solve the problem of input prediction on users’ mobile devices. With a huge number of mobile phones available, mobile users can train their input models locally. Specifically, users first train the model on their own mobile devices; the platform then integrates the training parameters and sends them back to each user to update and optimize the local model. In this way, federated learning is able to train the model with various sources of data while protecting user privacy. This new machine learning framework has been widely adopted in fields including smart medical treatment [3], finance [4], intelligent manufacturing [5], and education [6], as well as in applications such as UAVs (unmanned aerial vehicles) [7] and the Waze app [8].

Despite the benefits federated learning has brought, many challenges remain in practice. First, most existing studies optimistically assume that all mobile users will voluntarily contribute their resources. This assumption is not realistic because model training consumes resources, and without a well-designed compensation mechanism, mobile users are reluctant to participate in federated learning tasks. Second, most existing incentive mechanisms focus on the compensation of mobile users while ignoring the profits of the federated learning platform. For example, the mechanisms proposed in [9,10,11,12] only consider the benefits of users, so as to attract users to participate in training. Most of these mechanisms fail to consider the profits of the platform, and some even generate negative profits for it. If the profits of the platform cannot be guaranteed, the platform may stop publishing tasks, and the cooperation between the two sides fails. Third, current incentive mechanisms fail to consider the time cost of users. The benefits of both the platform and the users suffer if the incentive mechanism takes too long to complete.

This paper aims to solve the above-mentioned problems from the following perspectives: (1) we propose a two-stage incentive mechanism based on combinatorial auction and bargaining (CAB) to encourage the participation of users and the platform; (2) we show that the proposed mechanism can effectively protect the benefits of the federated learning platform and ensure social welfare maximization of the whole system; and (3) we introduce a time discount factor to measure time cost in the bargaining stage. Our theoretical and numerical analyses show that the proposed mechanism is individually rational and incentive compatible, and performs better than the baseline models.

The rest of this paper is organized as follows. Section 2 reviews the related work. The system model is introduced in Sect. 3. The relevant propositions of this mechanism are proved in Sect. 4. Simulation results are presented in Sect. 5, and finally, Sect. 6 concludes the paper.

2 Related work

Existing studies primarily focus on incentive mechanisms, client selection, and efficiency. Some studies investigate the impact of resource allocation on incentive mechanisms in federated learning. Nishio et al. [13], for example, propose a protocol called FedCS. FedCS includes a resource request phase that collects computing power and wireless channel status from randomly selected clients, i.e. a subset of federated learning (FL) workers. Based on this information, it solves a client selection problem under resource constraints so that the selected clients can complete local training on time and the server can aggregate as many client updates as possible. Hence, FedCS can achieve better performance than protocols that ignore client selection. The problem of fairness is also discussed in [14, 15]. Yu et al. [14] propose the FL exciter, a real-time algorithm that dynamically divides a given budget among federated data owners in a context-aware manner. Data owners who have contributed a large amount of high-quality data and have not been fully compensated for a long time receive a higher share of the revenue subsequently generated by the federation, which keeps the scheme fair. Zeng et al. [16] propose a multidimensional auction incentive mechanism, known as FMore, which covers a series of scoring functions. Game theory can also be used to derive the optimal strategy of marginal users, and expected utility theory can be used to guide aggregators to obtain the required resources effectively. Jiao et al. [17] design a reverse multidimensional auction (RMA) mechanism to achieve social welfare maximization. The RMA mechanism takes into account not only the bid price of the training services provided by workers, but also multiple attributes of each worker, including data size, earth mover's distance (EMD), and wireless channel requirements. Dütting et al. [18] use multiple neural networks to establish an auction model and deep learning to solve multi-item auction problems.
Incentive schemes based on procurement auctions have been designed to solve various problems, such as allocation of radio spectrum [19], public testing [20], display advertising [21] and customer-assisted cloud storage systems [22, 23]. However, none of them can be applied directly to federated learning because they only consider the specific attributes of their own problems. Le et al. [8] propose a combinatorial auction in which each mobile user submits a bid according to the minimum energy cost required to complete the federated learning task and is paid through the Vickrey-Clarke-Groves (VCG) payment model. Although this idea ensures the profits of users, the profits of the federated learning platform are still not considered.

Client selection is one of the important factors that influence incentive mechanisms. Domingos et al. [15] propose the q-FedAvg algorithm, which gives a higher weight to users with low performance when optimizing the objective function. Kang et al. [24] introduce reputation as a metric to measure the reliability and credibility of mobile users. Based on a multiple subjective logic model, they design a reliable, reputation-based worker selection scheme for federated learning. Xiong et al. [25] propose a scheme that selects workers for federated learning tasks according to their reputation value. The value is calculated from the local reputation opinions generated by direct interaction and is stored in an open-access consortium blockchain, called the reputation blockchain, so that reputation is managed in a decentralized manner. Auction-based incentive mechanisms have also been widely applied in federated learning tasks. For example, Ng et al. [7] use an auction incentive mechanism for unmanned aerial vehicles in which the platform has the complete bidding information of all users and can therefore select the winners of the auction accordingly.

Existing literature has also explored the role of efficiency in the design of incentive mechanisms for federated learning tasks. Dhakal et al. [26] design a novel coded computing technique for federated learning that increases the convergence speed of the global model by nearly four times. To address the problem of information asymmetry, Lim et al. [9] propose a hierarchical incentive mechanism based on multiple model owners and alliances. In addition, contract theory has been applied to encourage different types of workers to contribute high-quality data. To maximize the profits of the platform in the federation, Zhan et al. [10] propose an incentive mechanism based on deep reinforcement learning (DRL) that addresses the challenges of non-shared information and the difficulty of contribution evaluation in federated learning. In order to maximize the profits of the users, Pandey et al. [27] design a two-stage Stackelberg game that maximizes the interests of both the client and the server. Furthermore, Feng et al. [28] propose a Stackelberg game model to study the interaction between servers and mobile devices, where mobile devices determine the price of each unit of data to maximize personal profits, while servers choose the size of the training data to optimize their profits. Huang et al. [29] design a novel adaptive gradient descent differentially private convolutional neural network (DPAGD-CNN) algorithm to update each user’s training parameters, which can protect data privacy more effectively than existing works.

To sum up, although existing studies construct incentive mechanisms from various perspectives, there are still problems to be solved. (1) Most incentive mechanisms focus on the profits and fairness of local users, but fail to consider the profits of the federated learning platform. It is important to consider the profits of the platform explicitly, because the platform cannot be fully motivated to participate if its benefits are not properly protected. (2) Most existing research assumes that the information of local users is transparent and available to all other parties. Such an assumption is not realistic in application. (3) Most previous studies fail to consider the cost of time in their incentive mechanisms, leading to low efficiency of federated learning tasks.

3 System model

3.1 Mechanism introduction

As shown in Figure 1, the combinatorial auction stage proceeds in three steps. First, the platform acts as a buyer and publishes a federated learning task. Second, users, who act as sellers, receive information about the task, decide whether to participate, and submit bids to the platform according to their costs, such as the cost of data transmission and local model training. Third, the platform selects as winners the users for whom the total profit of the platform and the user is greater than zero, and adopts their bidding price as the provisional price. Next comes the bargaining stage. In this stage, the platform classifies the winner set ω into two categories and, after the model training is finished, applies a different payment method to each category.

Figure 1 System model

As shown in Figure 2, there are three payment strategies for the platform in the bargaining stage: (1) paying directly at the provisional price determined in the combinatorial auction stage; (2) proposing a new price to bargain; and (3) announcing the failure. We classify the winners who are paid the bargaining price as category 1 winners (ω1), and the winners who are paid the provisional price determined in the combinatorial auction stage as category 2 winners (ω2). For users in ω2, the federated learning task completes immediately and no time is wasted for either party. For users in ω1, the platform proposes a new bargaining price and waits for the users to decide whether to continue bargaining. The auction therefore takes longer, which reduces the profits of both the platform and the users.

Figure 2 Classification and trading strategy of winning users

3.2 Hypothesis

  1. In the combinatorial auction stage [30], the platform and users (usually referred to as bidders in the auction) have incomplete information about each other. No user knows other users’ estimates of task value in this stage.

  2. In the bargaining stage [31], all participants submit their bidding information (including the time of local model training and the time of data transmission) to the platform. Therefore, the platform has complete bidding information about the users [7]. However, users still do not know the private information of the platform and the bidding information of other participants.

  3. All participants are fixed in the federated learning tasks discussed here. The platform does not have to attract new participants.

4 Modeling analysis

4.1 Preliminary of federated learning

Suppose a federated learning platform publishes a federated learning task with N users. Each user i submits a bid b to the platform, which includes the time of local model training \({T}_{i}^{comp}\) and the time of data transmission to the federated learning platform \({T}_{i}^{com}\).

$$b=f\left({T}_{i}^{comp}, {T}_{i}^{com}\right)$$
(1)

We use VCG mechanism in the combinatorial auction stage and Formula (2) gives the requirements for determining the winning user i. The purpose of the auction is to maximize the profit of the system, that is, to maximize the difference between the platform’s valuation of the bid Si(b) and the user’s bid price Vi(b).

$$\mathrm{max}\sum_{i=1}^{n}\left[{S}_{i}\left(b\right)-{V}_{i}\left(b\right)\right]$$
(2)

This model assumes that there are enough users in the auction, i.e. N is large enough so that there is at least one feasible solution to the optimization problem (2).

Formula (3) is the provisional price function for winners in the combinatorial auction stage. The provisional price will be notified to the winners of the combinatorial auction when it’s over. If user i wins (\(i\in \omega\)), the provisional price is calculated as follows

$${P}_{i}=V\left(I\right)-V\left({I}_{-i}\right)+{V}_{i}\left(b\right)$$
(3)

where

$$V\left(I\right)=\mathrm{max}\sum_{k=1}^{n}\left({S}_{k}\left(b\right)-{V}_{k}\left(b\right)\right)$$
(4)
$$V\left({I}_{-i}\right)=\mathrm{max}\sum_{k\in {I}_{-i}}\left({S}_{k}\left(b\right)-{V}_{k}\left(b\right)\right)$$
(5)

where I refers to the set of all users, \({I}_{-i}\) refers to the set of all users except i, V(I) in Formula (4) represents the maximum total system profit generated by the participation of all users, and \(V\left({I}_{-i}\right)\) in Formula (5) denotes the maximum total system profit generated by users except i.
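The provisional price in Formulas (3) to (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cap `k` on the number of winners is a hypothetical feasibility constraint introduced only so that removing a winner changes the optimal allocation, and the function names and toy numbers are illustrative.

```python
from typing import Dict, List, Tuple

def max_welfare(surplus: List[float], k: int, skip: int = -1) -> Tuple[float, List[int]]:
    """Best total surplus from at most k users, optionally excluding user `skip`.

    The cap k is an assumed constraint, not something stated in the paper.
    """
    ranked = sorted(
        (i for i, d in enumerate(surplus) if i != skip and d > 0),
        key=lambda i: surplus[i],
        reverse=True,
    )
    chosen = ranked[:k]
    return sum(surplus[i] for i in chosen), chosen

def provisional_prices(S: List[float], V: List[float], k: int) -> Dict[int, float]:
    """Provisional price P_i of every winner, following Formula (3)."""
    surplus = [s - v for s, v in zip(S, V)]              # S_i(b) - V_i(b)
    v_all, winners = max_welfare(surplus, k)             # V(I), Formula (4)
    prices = {}
    for i in winners:
        v_without, _ = max_welfare(surplus, k, skip=i)   # V(I_{-i}), Formula (5)
        prices[i] = v_all - v_without + V[i]             # Formula (3)
    return prices

# Toy instance: four users, at most two winners.
prices = provisional_prices(S=[90, 80, 60, 30], V=[40, 50, 45, 35], k=2)
```

In this toy instance users 0 and 1 win, and each is paid its bid plus the welfare it adds relative to the best allocation without it.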

Formula (6) defines the profit of winning user i in the combinatorial auction stage, and Ci(b) represents the true cost of user i’s bid.

$${\pi }_{i}={P}_{i}-{C}_{i}\left(b\right)$$
(6)

Formula (7) presents the profit of the platform from the federated learning task.

$${\pi }_{s}={S}_{i}\left(b\right)-{P}_{i}$$
(7)

The net system profit brought by user i is the sum of the profit of user i (πi) and the profit of the platform (πs).

$${D}_{i}={\pi }_{i}+{\pi }_{s}$$
(8)

In the combinatorial auction stage, user i is selected as a winner if the net profit it brings to the system is greater than zero, i.e. Di > 0. Then, the platform and all winning users of the first stage enter the bargaining stage. Both the platform and the winners hope to finish the bargaining process as soon as possible.

The total profit of the system is the sum, over all winning users i, of the profit of user i (πi) and the corresponding profit of the platform (πs).

$$\begin{aligned} D\left(I\right)&=\sum\limits_{i\in \omega }\left({\pi }_{i}+{\pi }_{s}\right)\\ &=\sum\limits_{i\in \omega }\left({S}_{i}\left(b\right)-{C}_{i}\left(b\right)\right)\end{aligned}$$
(9)

where ω is the set of winning users.

This model uses time discount rate r (r > 0) to measure the aversion of procrastination for both sides and the net profit of the platform in the bargaining stage is calculated in Formula (10).

$${\pi }_{s}^{*}={e}^{-{rt}_{1}}\left(\sum_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{1}}{P}_{i}^{*}\right)+\sum_{i\in {\omega }_{2}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{2}}{P}_{i}$$
(10)

where t1 is the bargaining time of category 1 winners ω1, t2 is the bargaining time of category 2 winners ω2, and Pi* is the new bargaining price of category 1 winners ω1. Since there is no need to bargain for category 2 winners, we have t2 = 0.

The net profit of category 1 winner in the bargaining stage can be calculated as:

$${\pi }_{i}^{*}={e}^{-{rt}_{1}}\left({P}_{i}^{*}-{C}_{i}\left(b\right)\right),i\in {\omega }_{1}$$
(11)

The net profit of category 2 winner in the bargaining stage can be calculated as:

$${\pi }_{i}^{*}={P}_{i}-{C}_{i}\left(b\right),i\in {\omega }_{2}$$
(12)

Here, \({e}^{-{rt}_{1}}\) represents the assumption of a negative exponential decline in utility over time [30,31,32]. The time interval for each round of bargaining between the platform and the user is

$$t=-\frac{1}{r}\log\delta$$
(13)

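Formula (13) gives \(\delta ={e}^{-rt}\). If bargaining lasts m rounds, then \({t}_{1}=mt\) and \({e}^{-r{t}_{1}}={\delta }^{m}\), so Formulas (10) and (11) can be simplified as follows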

$${\pi }_{s}^{*}={\delta }^{m}\left(\sum_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{1}}{P}_{i}^{*}\right)+\sum_{i\in {\omega }_{2}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{2}}{P}_{i}$$
(14)
$${\pi }_{i}^{*}={\delta }^{m}\left({P}_{i}^{*}-{C}_{i}\left(b\right)\right),i\in {\omega }_{1}$$
(15)

where δ is the time discount factor of the bargaining period and m is the number of bargaining rounds.
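As an illustrative sketch (not the authors' code), the discounting in Formulas (13) to (15) can be written in a few lines of Python. The function names and the numbers in the example are assumptions chosen for illustration only.

```python
import math
from typing import Dict, List

def discount_factor(r: float, t: float) -> float:
    """delta = exp(-r * t), i.e. Formula (13) solved for delta."""
    return math.exp(-r * t)

def platform_profit(S: Dict[int, float], P_star: Dict[int, float], P: Dict[int, float],
                    omega1: List[int], omega2: List[int], delta: float, m: int) -> float:
    """Net profit of the platform in the bargaining stage, Formula (14)."""
    bargained = sum(S[i] - P_star[i] for i in omega1)   # category 1, discounted
    direct = sum(S[i] - P[i] for i in omega2)           # category 2, not discounted
    return delta ** m * bargained + direct

def category1_user_profit(P_star_i: float, C_i: float, delta: float, m: int) -> float:
    """Net profit of a category 1 winner, Formula (15)."""
    return delta ** m * (P_star_i - C_i)

# Hypothetical example: user 0 bargains for one round, user 1 is paid directly.
profit = platform_profit(S={0: 100, 1: 80}, P_star={0: 70}, P={1: 60},
                         omega1=[0], omega2=[1], delta=0.9, m=1)
```

Because \({\delta }^{m}={e}^{-rmt}\), calling `discount_factor(r, t) ** m` gives the same discount as \({e}^{-r{t}_{1}}\) in Formula (10) with \({t}_{1}=mt\).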

The major notations to describe the system are listed in Table 1.

Table 1 Table of key notations

4.2 Winner determination and bidding strategy in combinatorial auction

In this paper, a winner determination problem (WDP) model is proposed to help the platform determine the winning users and assign the optimal number of users to the platform in the combinatorial auction stage. Since it is impossible to know the users’ valuation information, the subsequent bargaining stage leads to one of three results: (1) Direct payment: for category 2 winners, the platform pays directly with the provisional price determined in the combinatorial auction stage. (2) Continued bargaining: for category 1 winners, the platform proposes a new bargaining price, which makes the platform more profitable. (3) Announcing failure: the platform declares the failure of the transaction to users who make the profit of the platform negative.

However, since the probability of the above three situations is stochastic, the corresponding bidding strategy at equilibrium is unpredictable. It is impossible for the platform to determine the winner set based on the criterion of platform profit maximization. In this case, the platform decides winners according to the following two criteria: (1) maximization of the overall profit of the whole system (the profits of the platform and all users) and (2) incentive compatibility, where each user achieves its best profit simply by bidding according to its true cost [22]. As mentioned earlier, Formula (2) gives the optimization problem for criterion (1). For criterion (2), Proposition 1 below shows that the VCG mechanism in the auction stage is incentive compatible.

Proposition 1

Users get the highest profit if and only if they bid with their real cost in the combinatorial auction stage.

Proof

Assuming that user i is in the winner set ω, Ci(b) represents the true cost of bid b of user i, and the profit of user i in the combinatorial auction stage is

$$\begin{array}{l}{\pi }_{i}={P}_{i}-{C}_{i}\left(b\right)\\ \ \ \ \ =\left\{V\left(I\right)-V\left({I}_{-i}\right)+{V}_{i}\left(b\right)\right\}-{C}_{i}\left(b\right)\\ \ \ \ \ =\sum\limits_{k=1,k\ne i}^{n}\left({S}_{k}\left(b\right)-{V}_{k}\left(b\right)\right)-V\left({I}_{-i}\right)+{S}_{i}\left(b\right)-{C}_{i}\left(b\right)\end{array}$$
(16)

where Ci(b) is a fixed value for user i. Users must make their bid price Vi(b) equal the true cost of their bid Ci(b) in order to maximize the profit of the system.

$${V}_{i}\left(b\right)={C}_{i}\left(b\right)$$
(17)

The VCG mechanism in the combinatorial auction stage ensures incentive compatibility of all users. This is because users generate more profit if they bid with their real cost. It can be inferred from Formula (16) that any bid higher than the real cost will result in a negative profit for users, and rational users will not bid with a price lower than their real cost. Bidding with the real cost is the dominant strategy for all users.
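The dominant-strategy argument can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: it reuses the provisional price of Formula (3), assumes a hypothetical cap `k` on the number of winners, and treats a user whose marginal contribution is zero as not selected.

```python
from typing import List

def vcg_user_profit(S: List[float], bids: List[float], true_cost: float,
                    i: int, k: int) -> float:
    """Profit of user i (Formula (6)) when paid the price of Formula (3)."""
    surplus = [s - v for s, v in zip(S, bids)]   # reported S_j(b) - V_j(b)

    def welfare(skip: int = -1) -> float:
        # Best total surplus from at most k users, optionally without `skip`.
        positive = sorted(
            (d for j, d in enumerate(surplus) if j != skip and d > 0),
            reverse=True,
        )
        return sum(positive[:k])

    u_i = welfare() - welfare(skip=i)            # V(I) - V(I_{-i})
    if u_i <= 0:                                 # user i is not selected
        return 0.0
    price = u_i + bids[i]                        # Formula (3)
    return price - true_cost                     # Formula (6)

S = [90, 80, 60]          # platform valuations (hypothetical)
costs = [40, 50, 45]      # true costs (hypothetical)

# Profit of user 1 when bidding truthfully, and under a range of other bids.
truthful = vcg_user_profit(S, costs, costs[1], i=1, k=2)
deviations = [
    vcg_user_profit(S, [costs[0], bid, costs[2]], costs[1], i=1, k=2)
    for bid in range(20, 90, 5)
]
```

In this instance no deviation earns user 1 more than the truthful bid. A user who would lose when truthful can win by underbidding, but only at a price below its true cost.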

In previous discussions on incentive compatibility, we usually assume that all users are required to participate in the auction. But in reality, users are free to decide whether to participate or not. Users are reluctant to participate in a federated learning task if they are not able to earn a profit. Therefore, the platform needs to ensure the profit of the winning users to guarantee their participation, i.e. the individual rationality of users needs to be considered.

Proposition 2

The proposed two-stage incentive mechanism is individually rational for all users.

Proof

According to Dütting et al. [18], a user is individually rational if its utility is non-negative. The utility of the winning user i can be expressed as

$$\begin{array}{l}{U}_{i}=V\left(I\right)-V\left({I}_{-i}\right)+{V}_{i}\left(b\right)-{C}_{i}\left(b\right)\\ \ \ \ \ =V\left(I\right)-V\left({I}_{-i}\right)\end{array}$$
(18)

According to the theory of incentive compatibility, the bid price Vi(b) is equal to the true cost Ci(b). In VCG mechanisms, the system profit brought by user i is necessarily greater than that without user i; therefore, Formula (18) is non-negative, i.e. users are individually rational in the combinatorial auction stage.

The combinatorial auction stage is of great significance to enhance task efficiency and to save the bargaining cost of the second stage. It reveals the true private information of users and helps to determine the winner set of the next bargaining stage in a short period of time. Without this stage, it is not possible to select a set of winners because the platform will have to bargain with each user one by one, eventually resulting in high time cost. Our analysis proves that the profit of the platform can be improved by introducing the bargaining stage.

4.3 Winner category and pricing strategy in the bargaining stage

According to the Rubinstein bargaining model [33], in which two players split a piece of cake by alternating offers, if both parties possess complete information and their time discount factors satisfy 0 < δ1 < 1 and 0 < δ2 < 1 respectively, there is a unique subgame perfect Nash equilibrium. The profit of the proposer, i.e. the party that makes the first offer, is \(\frac{1-{\delta }_{2}}{1-{\delta }_{1}{\delta }_{2}}\) and the recipient’s profit is \(\frac{{\delta }_{2}-{\delta }_{1}{\delta }_{2}}{1-{\delta }_{1}{\delta }_{2}}\) [34].

In our proposed CAB mechanism, if the platform knows the true cost Ci(b) of all winners and all winners know the true platform valuation Si(b) of their bids, then Proposition 3 holds.

Proposition 3

There is a unique equilibrium in the bargaining stage and the equilibrium price Pi* is

$${P}_{i}^{*}=\frac{\delta {S}_{i}\left(b\right)+{C}_{i}\left(b\right)}{1+\delta }$$
(19)

The equilibrium profit of the platform is

$${\pi }_{si}^{*}=\frac{{S}_{i}\left(b\right)-{C}_{i}\left(b\right)}{1+\delta }$$
(20)

And the equilibrium profit of user i is

$${\pi }_{i}^{*}=\frac{\delta \left({S}_{i}\left(b\right)-{C}_{i}\left(b\right)\right)}{1+\delta }$$
(21)

Proof

In this model, the time discount factors of the platform and the winning users are the same, namely δ1 = δ2 = δ < 1. The size of the “cake” is related to the winning user i and equals the total system profit Si(b)-Ci(b), so the corresponding profits of both parties in Formulas (20) and (21) can be calculated by using the Rubinstein bargaining model with the platform as the proposer. Considering the first term in (14), it can be inferred that, in the complete information scenario, the profit of the platform from winner i after bargaining is

$${\pi }_{si}^{*}={\delta }^{m}\left({S}_{i}\left(b\right)-{P}_{i}^{*}\right)$$
(22)

According to the Rubinstein bargaining model, user i immediately accepts this offer, so there is no time cost in bargaining, i.e. m = 0. Setting m = 0 in Formula (22) and equating the result with Formula (20), we obtain the platform’s bargaining price in Formula (19).
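The equilibrium quantities in Proposition 3 can be verified with a short Python sketch. The function names and sample values are illustrative assumptions. The check confirms that Formulas (19) to (21) split the surplus Si(b)-Ci(b) exactly as the Rubinstein shares do when δ1 = δ2 = δ and the platform makes the first offer.

```python
import math
from typing import Tuple

def rubinstein_shares(delta1: float, delta2: float) -> Tuple[float, float]:
    """Equilibrium shares (proposer, responder) of a unit-size cake."""
    denom = 1 - delta1 * delta2
    proposer = (1 - delta2) / denom
    responder = delta2 * (1 - delta1) / denom
    return proposer, responder

def bargaining_equilibrium(S_i: float, C_i: float, delta: float) -> Tuple[float, float, float]:
    """Equilibrium price and profits of Proposition 3."""
    price = (delta * S_i + C_i) / (1 + delta)        # Formula (19)
    platform = (S_i - C_i) / (1 + delta)             # Formula (20)
    user = delta * (S_i - C_i) / (1 + delta)         # Formula (21)
    return price, platform, user

# Hypothetical values: valuation 100, true cost 40, discount factor 0.5.
price, platform, user = bargaining_equilibrium(S_i=100, C_i=40, delta=0.5)
proposer_share, responder_share = rubinstein_shares(0.5, 0.5)
```

With these numbers the price is 60, the platform keeps 40 and the user keeps 20, which are two thirds and one third of the surplus of 60.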

Next, we discuss three special points that divide the winners into different categories.

  • (1) The first special point is the one where the profit the platform makes from the provisional price in the first stage equals the profit it makes from the new bargaining price in the second stage. If the platform adopts the provisional price immediately, there is no time cost for bargaining and the profit is not discounted. If the platform chooses to bargain with winner i, the profit of the platform is discounted by δ due to the time delay. According to Proposition 3, the bargaining price is given by Formula (19). From Formulas (3), (19), and (22), we obtain Formula (23), which represents the first special point:

    $${S}_{i}\left(b\right)-\left[V\left(I\right)-V\left({I}_{-i}\right)+{V}_{i}\left(b\right)\right]=\delta \left\{{S}_{i}\left(b\right)-\frac{\delta {S}_{i}\left(b\right)+{C}_{i}\left(b\right)}{1+\delta }\right\}$$
    (23)

From Formula (23), we have

$${D}_{i}=\left(1+\delta \right){U}_{i}$$
(24)

where Di = Si(b)-Vi(b) represents the net system profit brought by user i and \({U}_{i}=V\left(I\right)-V\left({I}_{-i}\right)\) indicates the increase in total social welfare brought by the participation of winner i.

  • (2) The second special point is the one where the provisional price in the first stage equals the new bargaining price in the second stage, i.e.

    $$V\left(I\right)-V\left({I}_{-i}\right)+{V}_{i}\left(b\right)=\frac{\delta {S}_{i}\left(b\right)+{C}_{i}\left(b\right)}{1+\delta }$$
    (25)

    According to Formula (25), we have

    $${P}_{i}={P}_{i}^{*}$$
    (26)
  • (3) The third special point refers to the one where the platform gets zero profit from the new bargaining price in the second stage, i.e.

    $${D}_{i}=0$$
    (27)

The above three special points divide the winning users into three categories (see Figure 2). Although category 1 users ensure a positive profit for the platform, the platform is still willing to enter the bargaining stage with them since bargaining brings it more profit.
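For the first special point, the step from Formula (23) to Formula (24) can be spelled out as follows. This is a sketch that assumes truthful bidding, \({V}_{i}\left(b\right)={C}_{i}\left(b\right)\) (Proposition 1), and uses the shorthand \({D}_{i}={S}_{i}\left(b\right)-{V}_{i}\left(b\right)\) and \({U}_{i}=V\left(I\right)-V\left({I}_{-i}\right)\). The left-hand side of (23) is \({D}_{i}-{U}_{i}\), and the braces on the right-hand side reduce to \(\frac{{S}_{i}\left(b\right)-{C}_{i}\left(b\right)}{1+\delta }\), so

$$\begin{aligned}{D}_{i}-{U}_{i}&=\frac{\delta {D}_{i}}{1+\delta }\\ \frac{{D}_{i}}{1+\delta }&={U}_{i}\\ {D}_{i}&=\left(1+\delta \right){U}_{i}\end{aligned}$$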

As shown in Figure 3, category 2 users are divided into two sub-categories by special point Pi = Pi*: i.e. those whose provisional price is less than or equal to the bargaining price and those whose provisional price is greater than the bargaining price. Furthermore, it is worth mentioning that category 2 users will be paid with the provisional price no matter whether the provisional price is higher than the bargaining price or not. This is because there is time cost in the bargaining stage and the profit of the platform will be discounted by δ. Hence, the platform will finally choose the provisional price to conduct the federated learning task.

Figure 3 Classification of category 2 users

To sum up, for users of category 1, i.e. those with Di ≤ (1 + δ)Ui, the platform adopts the bargaining strategy to increase its profit; for users of category 2, i.e. those with Di > (1 + δ)Ui, the platform immediately pays the provisional price.
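The classification rule can be sketched in Python as below. This is a minimal illustration, assuming truthful bids (Vi(b) = Ci(b)) and a single round of discounting as in Formula (23). The function names and numbers are hypothetical.

```python
from typing import Tuple

def classify_winner(D_i: float, U_i: float, delta: float) -> int:
    """Return 1 (bargain) if D_i <= (1 + delta) * U_i, otherwise 2 (pay provisional price)."""
    return 1 if D_i <= (1 + delta) * U_i else 2

def platform_profit_options(D_i: float, U_i: float, delta: float) -> Tuple[float, float]:
    """Platform profit from user i under each strategy, i.e. the two sides of Formula (23)."""
    direct = D_i - U_i                     # pay the provisional price at once
    bargain = delta * D_i / (1 + delta)    # bargain, discounted by one round
    return direct, bargain

# User A: small marginal contribution, so paying directly is better (category 2).
cat_a = classify_winner(D_i=30, U_i=15, delta=0.9)
# User B: large marginal contribution, so bargaining is better (category 1).
cat_b = classify_winner(D_i=50, U_i=35, delta=0.9)
```

In both cases the category picked by the threshold coincides with the strategy that gives the platform the larger of the two profits.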

Remark

In the bargaining stage of this model, the winners still have incomplete information about the platform. After the combinatorial auction, the platform already knows the true valuation of the winning users (see Proposition 1), but the winning users still have no information about the platform’s valuation of their bid Si(b).

The equilibrium price of bargaining stage shown in Proposition 3 can be used as the reference point to analyze this incomplete information scenario in the bargaining stage.

This mechanism requires that after the combinatorial auction, the provisional price is immediately disclosed to winner i. Therefore, winner i knows its profit Ui from participating in the federated learning task, i.e. winners know all the information in Formulas (23) to (27), which enables them to calculate the above three special points Di and the corresponding platform’s valuation of their bids Si(b). At the same time, if the platform pays directly with the provisional price to category 2 users, we have Di > (1 + δ)Ui. If the platform proposes a new bargaining price for category 1 users, we have Si(b) > Ci(b) + (1 + δ)Ui.

Cramton [35] has shown that the auctioneer cannot let the winning users know and believe its real valuation without adopting the time delay strategy as a signal of its real valuation. However, if the time delay strategy is adopted, both sides lose part of their profits. Therefore, our proposed mechanism suggests that the platform should propose a bargaining price that enables both parties to finish the deal as soon as possible instead of adopting the time delay strategy, i.e. Si(b) = Ci(b) + (1 + δ)Ui. Substituting this value into Formula (19), the bargaining price can be simplified as

$$\begin{array}{l}{P}_{i}^{*}=\frac{\delta \left({C}_{i}\left(b\right)+\left(1+\delta \right){U}_{i}\right)+{C}_{i}\left(b\right)}{1+\delta }\\ \ \ \ \ \ ={C}_{i}\left(b\right)+\delta {U}_{i}\end{array}$$
(28)
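As a quick numerical check of Formula (28), the sketch below evaluates Formula (19) at Si(b) = Ci(b) + (1 + δ)Ui and compares it with Ci(b) + δUi. The function names and sample values are illustrative assumptions.

```python
import math

def price_from_valuation(S_i: float, C_i: float, delta: float) -> float:
    """Equilibrium bargaining price of Formula (19)."""
    return (delta * S_i + C_i) / (1 + delta)

def price_without_delay(C_i: float, U_i: float, delta: float) -> float:
    """Simplified bargaining price of Formula (28)."""
    return C_i + delta * U_i

# Hypothetical values: true cost 40, marginal contribution 15, discount factor 0.8.
C_i, U_i, delta = 40.0, 15.0, 0.8
S_i = C_i + (1 + delta) * U_i          # valuation assumed in the text before Formula (28)
p19 = price_from_valuation(S_i, C_i, delta)
p28 = price_without_delay(C_i, U_i, delta)
```

Both expressions evaluate to 52 for these inputs, in line with the simplification in Formula (28).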

The above bargaining price is the highest price that the platform can offer so that the winners accept the deal immediately. In addition, if we replace \(\frac{\delta {S}_{i}\left(b\right)+{C}_{i}\left(b\right)}{1+\delta }\) in Formula (23) with Formula (28), we obtain the special point Di shown in (24), which shows that the bargaining price in Formula (28) is more beneficial to the platform than the provisional price Di = (1 + δ)Ui determined in the combinatorial auction stage. Theorem 1 gives the condition for the platform to declare the failure of the transaction.

Theorem 1

If Za < Zb, (Za-Zb) + U > 0, and D(I) < (Za-Zb) + U, the platform will declare the failure of the transaction.

Proof

Here, we define Za and Zb as follows

$${Z_a}\ \text{:=}\ \sum\limits_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum\limits_{i\in {\omega }_{1}}\left({U}_{i}+{V}_{i}\left(b\right)\right)$$
(29)
$${Z_b}\ \text{:=}\ \delta \left(\sum\limits_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum\limits_{i\in {\omega }_{1}}{P}_{i}^{*}\right)$$
(30)
$$\begin{array}{l}U\ \text{:=}\ \sum\limits_{i\in \omega }{U}_{i}\\\ \ \ \ =\sum\limits_{i\in \omega }\left[V\left(I\right)-V\left({I}_{-i}\right)\right]\end{array}$$
(31)

where Za denotes the profit of the platform brought by category 1 users when adopting the provisional price strategy, Zb denotes the profit of the platform brought by category 1 users after bargaining, and U represents the sum of the utility of all winners.

The platform will declare the failure of the transaction if its net profit in Formula (14) is negative. With one round of bargaining (m = 1) and the provisional price Pi = Ui + Vi(b) from Formula (3), this condition reads

$$\delta \left(\sum_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{1}}{P}_{i}^{*}\right)+\sum_{i\in {\omega }_{2}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{2}}\left({U}_{i}+{V}_{i}\left(b\right)\right)<0$$
(32)


The above Formula (32) is equivalent to

$$\delta \left(\sum_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{1}}{P}_{i}^{*}\right)+\left(D\left(I\right)-U\right)<\sum_{i\in {\omega }_{1}}{S}_{i}\left(b\right)-\sum_{i\in {\omega }_{1}}\left({U}_{i}+{V}_{i}\left(b\right)\right)$$
(33)

Considering the definitions of Za and Zb, the above formula is equivalent to

$$D\left(I\right)<\left({Z}_{a}-{Z}_{b}\right)+U$$
(34)

Formula (34) is the condition for the failure of the transaction between the two parties.
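The equivalence between Formulas (32) and (34) can be checked numerically. The following Python sketch is illustrative only: it assumes truthful bids (Vi(b) = Ci(b)), one bargaining round, and the bargaining price of Formula (28), and the per-user values are hypothetical rather than the output of an actual auction.

```python
from typing import List, Tuple

def failure_quantities(S: List[float], V: List[float], U: List[float],
                       omega1: List[int], omega2: List[int],
                       delta: float) -> Tuple[float, float, float, float, float]:
    """Return (Za, Zb, U_total, D_total, lhs32) for the given winner sets."""
    P_star = {i: V[i] + delta * U[i] for i in omega1}         # Formula (28), V_i = C_i
    Za = sum(S[i] - (U[i] + V[i]) for i in omega1)            # Formula (29)
    Zb = delta * sum(S[i] - P_star[i] for i in omega1)        # Formula (30)
    U_total = sum(U[i] for i in omega1 + omega2)              # Formula (31)
    D_total = sum(S[i] - V[i] for i in omega1 + omega2)       # Formula (9), V_i = C_i
    lhs32 = Zb + sum(S[i] - (U[i] + V[i]) for i in omega2)    # left side of (32)
    return Za, Zb, U_total, D_total, lhs32

def declares_failure(Za: float, Zb: float, U_total: float, D_total: float) -> bool:
    """Failure condition in the form of Formula (34)."""
    return D_total < (Za - Zb) + U_total

# Case 1: the platform's net profit is negative, so the transaction fails.
fail = failure_quantities(S=[50, 60], V=[40, 50], U=[30, 5],
                          omega1=[0], omega2=[1], delta=0.9)
# Case 2: the platform's net profit is positive, so the transaction proceeds.
ok = failure_quantities(S=[100, 80], V=[40, 50], U=[35, 15],
                        omega1=[0], omega2=[1], delta=0.9)
```

In both cases the sign of the left side of (32) and the test in (34) agree, as the derivation from (32) to (34) requires.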

Corollary 1

If all winning users are category 1 users, the platform will adopt the bargaining strategy; if D(I) ≤ δU, where U is the total utility of the winners defined in Formula (31), the platform will announce the failure of the transaction.

Proof

Assuming that all winners are category 1 users, if the bargaining price brings negative profits to the platform, we have

$$\sum_{i\in \omega }{S}_{i}\left(b\right)-\sum_{i\in \omega }{P}_{i}^{*}<0$$
(35)

Substituting (28) and (17) into (35), we have

$$\sum_{i\in \omega }{S}_{i}\left(b\right)-\sum_{i\in \omega }\left({V}_{i}\left(b\right)+\delta {U}_{i}\right)<0$$
(36)

Therefore, the platform will announce the failure of the federated learning task if D(I) ≤ δU, where U is the sum of Ui over all winners as in Formula (31).

Corollary 2

If all winning users are category 2 users, the platform will directly use the provisional price determined in the combinatorial auction stage; if D(I) ≤ U, where U is the total utility of the winners defined in Formula (31), the platform will declare the failure of the transaction.

Proof

Assuming that all winners are category 2 users, if the provisional price brings negative profits to the platform, we have

$$\sum_{i\in \omega }{S}_{i}\left(b\right)-\sum_{i\in \omega }{P}_{i}<0$$
(37)

Substituting (3) and (24) into (37), we have

$$\sum_{i\in \omega }{S}_{i}\left(b\right)-\sum_{i\in \omega }\left({V}_{i}\left(b\right)+{U}_{i}\right)<0$$
(38)

Therefore, the platform will announce the failure of the federated learning task if D(I) ≤ Ui.

5 Numerical simulation

In this section, we use numerical simulations to evaluate the effectiveness of our proposed mechanism. It is worth noting that our proposed mechanism is only applicable to horizontal federated learning tasks, where the datasets provided by users are homogeneous, e.g. a bank that needs user data from multiple other banks for a credit evaluation task. The parameters in the simulation are set as follows. The whole system consists of one platform that publishes federated learning tasks and 600 mobile users (\(i=600\)) who participate in the task. We randomly generate the bidding price Vi(b) of user i from the interval (20, 70) [36] and the platform’s valuation Si(b) of user i from the interval (10, 100) [37].
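
The setup above can be sketched in a few lines of Python. This is a minimal sketch rather than the code used for the reported results: the text only says the values are generated randomly, so the uniform distribution, the independence of the two draws, the fixed seed and all names below are assumptions.

```python
import random


def generate_users(n_users=600, seed=0):
    """Draw one bidding price and one platform valuation per user.

    Assumptions (not stated in the text): draws are uniform and
    independent, and a fixed seed is used for reproducibility.
    """
    rng = random.Random(seed)
    users = []
    for i in range(n_users):
        users.append({
            "id": i,
            "bid": rng.uniform(20, 70),         # V_i(b)
            "valuation": rng.uniform(10, 100),  # S_i(b)
        })
    return users


users = generate_users()
print(len(users))  # 600
```

Note that `random.uniform` may return an endpoint of the interval, which differs from the open interval in the text only on a set of negligible probability.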

In order to make comparisons with other baseline models, we adopt the simulation data used in the literature for VCG [30], TIM [36], and UPA [37]. We also compare our results with these baseline models in terms of individual rationality [38], platform profits [10], total system profits [8, 17], system efficiency [18] and truthfulness [36]. The TIM mechanism iteratively selects the winners who bid with a lower average bidding price. The UPA mechanism first calculates the system profit that a user brings, then selects winning users who make the system profit greater than 0, sorts the winning users according to the bidding price, and pays the corresponding payment. The results are shown in Table 2.
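
The selection and sorting steps of the UPA baseline, as described above, can be sketched as follows. The sketch assumes that the system profit a user brings is the platform's valuation minus the user's bidding price, and it omits the payment rule, which is not specified here. The function name and the sample data are hypothetical.

```python
def upa_style_winners(users):
    """Keep users whose assumed system profit is positive, sorted by bid.

    users: list of dicts with keys id, bid (V_i) and valuation (S_i).
    The payment step of UPA is intentionally left out.
    """
    # Assumed form of the per-user system profit: valuation minus bid
    winners = [u for u in users if u["valuation"] - u["bid"] > 0]
    # Sort the winners by bidding price, lowest first
    return sorted(winners, key=lambda u: u["bid"])


sample = [
    {"id": 0, "bid": 60, "valuation": 50},  # negative system profit, rejected
    {"id": 1, "bid": 45, "valuation": 90},
    {"id": 2, "bid": 25, "valuation": 30},
]
print([u["id"] for u in upa_style_winners(sample)])  # [2, 1]
```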

Table 2 Simulation results

5.1 The individual rationality of users

As shown in Figure 4, all mechanisms except UPA guarantee positive profits for their users, so users are motivated to participate in the federated learning task. As shown in Figure 4(a) and Figure 4(b), the profits of users in CAB are lower than those in VCG, because our proposed mechanism provides more profits to the platform. More importantly, compared with Figure 4(d), the number of users of CAB and VCG is significantly greater than that of TIM, which demonstrates the advantage of our proposed mechanism.

Figure 4

(a) The individual rationality of CAB (b) The individual rationality of VCG (c) The individual rationality of UPA (d) The individual rationality of TIM

5.2 The profits of the platform

Figure 5(a) shows the provisional price, transaction price, and platform valuation, where the final profit of the platform [10] is equal to the platform’s valuation of user i’s bid minus the transaction price. As shown in Figure 5(a), each winner will receive a price no more than its previous bid, and therefore the profit of the platform will be no less than the profit if it directly adopts the provisional price determined in the combinatorial auction stage. The results show that users will be adequately compensated and the platform will be more strongly motivated to participate in federated learning tasks.

Figure 5

(a) Performance under CAB (b) Platform profits under different mechanisms (c) Impact of time discount rate on platform profits

Figure 5(b) shows that the profit of the federated learning platform under our proposed mechanism is always greater than or equal to that under the VCG, TIM, or UPA mechanism. As the number of participants increases, the profits of the platform continue to increase. These simulation results show that our proposed mechanism can guarantee the profit of the platform.

In addition, different time discount rates have different effects on the profits of the platform. Figure 5(c) shows that the time discount rate is negatively related to the profits of the federated learning platform.

5.3 The total profit of the whole system

According to Le et al. [8] and Jiao et al. [17], the total system profit is the sum of the profits of the platform and all users. Figure 6 shows that as the number of users increases, the total profit of the system continues to increase. Our proposed mechanism and the traditional VCG mechanism produce the same total system profit, which is significantly higher than that under the UPA and TIM mechanisms.
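
To illustrate the definition, the short Python sketch below computes the platform profit, the users' profit and their sum for a fixed set of winners under two hypothetical pricing rules. It ignores time discounting and assumes each user's cost equals its bid, so it is only an illustration of the definition, not a reproduction of Figure 6. All names and numbers are made up.

```python
def system_profit(winners):
    """Return (platform profit, users' profit, total system profit).

    winners: list of dicts with keys valuation (S_i), bid (V_i)
    and price (transaction price paid to the user).
    """
    platform = sum(w["valuation"] - w["price"] for w in winners)
    users = sum(w["price"] - w["bid"] for w in winners)
    return platform, users, platform + users


# Same winners, two hypothetical pricing rules.
high_price = [{"valuation": 80, "bid": 40, "price": 60},
              {"valuation": 90, "bid": 50, "price": 70}]
low_price = [{"valuation": 80, "bid": 40, "price": 45},
             {"valuation": 90, "bid": 50, "price": 55}]
print(system_profit(high_price))  # (40, 40, 80)
print(system_profit(low_price))   # (70, 10, 80)
```

In this undiscounted setting the transaction price is a pure transfer, so it changes how the total is split between the platform and the users but not the total itself.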

Figure 6

Total profits of the system under different mechanisms

5.4 The influence of time

In this experiment, we analyze the impact of bargaining time on the price, profit and equilibrium of the platform and user i in the bargaining stage. As we can see from Figure 7(a) to Figure 7(d), the platform decreases its bargaining price as t (the number of bargaining rounds) increases. The bargaining process stops when the bargaining price equals the bidding price of the user (the user gets zero profit). Therefore, as the bargaining time t increases, the profit of user i decreases and the profit of the platform increases. In addition, as the time discount factor decreases, the profit of user i declines increasingly faster and the bargaining process stops sooner. According to Proposition 3, because of the discount factor δ, a rational user will immediately agree to the bargaining price proposed by the platform instead of adopting the time delay strategy, since immediate agreement incurs no time loss. Therefore, user i is willing to finish the bargaining stage as soon as possible to maximize its profit.
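
The effect of the discount factor on the agreed price can be illustrated with the form of the bargained price that appears when comparing (35) with (36), namely P*i = Vi(b) + δUi. The Python sketch below evaluates this expression for the four values of δ used in Figure 7. It does not model the round-by-round dynamics, the platform margin is reported before discounting, and the function name and input values are hypothetical.

```python
def bargained_outcome(valuation, bid, provisional_utility, delta):
    """Return (bargained price, user profit, undiscounted platform margin).

    Uses P*_i = V_i(b) + delta * U_i, read off from (35) and (36).
    """
    price = bid + delta * provisional_utility
    user_profit = price - bid
    platform_margin = valuation - price
    return price, user_profit, platform_margin


for delta in (0.8, 0.6, 0.4, 0.2):
    print(delta, bargained_outcome(80, 40, 10, delta))
```

With these inputs, a smaller δ gives a lower bargained price and a smaller user profit, in line with the trend described above.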

Figure 7

(a) The price and profit when δ = 0.8 (b) The price and profit when δ = 0.6 (c) The price and profit when δ = 0.4 (d) The price and profit when δ = 0.2

5.5 System efficiency

According to Dütting et al. [18], we use the proportion of users who eventually reach a deal with the platform to measure system efficiency. Intuitively, the efficiency of the system [36] depends on the bidding price of the users and the valuation of the platform. If users bid far above the platform’s valuation, there will be few successful transactions, resulting in a reduction of the overall system efficiency. As shown in Figure 8(a), we compare the system efficiency of four mechanisms: CAB, VCG, UPA, and TIM. The simulation results show that the system efficiency of CAB and VCG is the highest, while TIM has the lowest system efficiency. As shown in Figure 8(b), as the number of users increases, the average system profit is always greater than 0 and becomes more stable.
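
The efficiency measure itself is a simple ratio, sketched below in Python. The function name and arguments are hypothetical, and the normalization used in Figure 8(a) is not reproduced here.

```python
def system_efficiency(n_users, winner_ids):
    """Proportion of users who eventually reach a deal with the platform."""
    if n_users <= 0:
        raise ValueError("n_users must be positive")
    # Count each winning user once, even if an id is repeated
    return len(set(winner_ids)) / n_users


print(system_efficiency(600, range(150)))  # 0.25
```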

Figure 8

(a) Normalized system efficiency (b) Average system profits

5.6 Truthfulness

A mechanism is truthful if and only if every participant gets the highest profit when it reports its true value. Figure 9(a) shows a randomly chosen winner i who bids with Vi(b) = Ci(b) = 45.57 and receives a profit of πi = 7.069. It suggests that user i cannot increase its profit by using any other bid price. Figure 9(b) shows a different scenario in which user i loses the auction when it bids truthfully, i.e. Vi(b) = Ci(b) = 44.105. Thus, user i will not participate in this task and gets zero profit. Figure 9(b) also shows that even if user i bids untruthfully, its profit will not be greater than 0. The following figures show the truthfulness of different mechanisms.
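
The kind of check plotted in Figure 9 can be sketched as a sweep over alternative bids. Because the full CAB mechanism is not restated in this section, the Python sketch below uses a single-winner reverse second-price auction as a stand-in truthful mechanism. It is not the CAB mechanism, and all names and numbers are hypothetical.

```python
def reverse_vickrey_profit(my_bid, my_cost, other_bids):
    """Profit in a single-winner reverse second-price auction.

    The lowest bid wins and is paid the lowest competing bid.
    Stand-in mechanism for illustration only; ties count as a loss.
    """
    best_other = min(other_bids)
    if my_bid < best_other:
        return best_other - my_cost
    return 0


def is_truthful_for(my_cost, other_bids, candidate_bids):
    """Check that no candidate bid beats the profit of bidding the true cost."""
    truthful = reverse_vickrey_profit(my_cost, my_cost, other_bids)
    return all(
        reverse_vickrey_profit(b, my_cost, other_bids) <= truthful
        for b in candidate_bids
    )


# A user who wins when truthful, and one who loses when truthful.
print(is_truthful_for(45, [52, 60], range(20, 71)))  # True
print(is_truthful_for(55, [52, 60], range(20, 71)))  # True
```

In the second case, underbidding to win yields a payment of 52 against a cost of 55, so the profit is negative, which mirrors the losing scenario of Figure 9(b).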

Figure 9

Truthfulness in different mechanisms

6 Conclusion

In this paper, we propose a two-stage federated learning incentive mechanism based on combinatorial auction and bargaining. The properties of this mechanism are established by theoretical analysis and numerical simulation. Compared with traditional mechanisms, our proposed mechanism has the following advantages: (1) it ensures profit maximization of the whole system; (2) owing to the bargaining stage, it ensures that the profit of the platform is higher than that of other baseline mechanisms; (3) it considers the time value of mobile users and therefore improves the efficiency of the user selection process; (4) it satisfies individual rationality, incentive compatibility and truthfulness for both the platform and users. Numerical results also support our analysis.

In future work, in order to improve the benefits of the federated learning platform and achieve reliable federated learning, researchers could consider more factors that affect the efficiency of the federated learning platform, such as geographical location and the seller’s reputation. The conclusions of this study can also be applied to other similar tasks, such as crowdsourcing.