1 Introduction

In traditional machine learning (ML) approaches, data are collected and stored by a single node (or centralized server) and used for training and testing. However, transmitting and centralizing data raises numerous administrative, ethical, and legal issues, mainly related to privacy and data protection under the General Data Protection Regulation (GDPR) [1]. Federated learning (FL) enables collaborative learning that addresses these data issues while protecting information security [2]. Recently, the FL framework has been increasingly used in real-world applications, e.g., healthcare [3, 4], purchase recommendation [5, 6], and distributed synthetic data generation systems [7, 8].

Generally, the FL framework involves three primary steps: (i) all parties receive the latest global model W from the centralized server (also called a broker), (ii) the parties train the received model using their local data, and (iii) they upload their locally trained models Wi back to the centralized server to be aggregated into an updated global model. These steps are repeated until a particular convergence criterion is met. However, such a distributed framework also incurs communication costs and can create a training bottleneck; communication efficiency therefore remains a significant concern for FL.

Recently, several frameworks have been proposed to improve communication efficiency in the horizontal federated learning (HFL) scenario [9,10,11]. The vertical federated learning (VFL) scenario is the opposite of the HFL scenario, in which all parties hold homogeneous data: in VFL, the parties partially overlap in the sample space but differ in the feature space. As a result, a VFL framework requires a more intricate communication architecture to ensure that each party remains unaware of the data and features of the other parties. The literature has proposed several VFL frameworks. For example, in 2019, Yang et al. proposed a simple VFL framework based on the client-server (C-S) communication architecture with one parameter server (PS) and two parties [12]. Figure 1 shows that the PS acts as a trusted coordinator mainly responsible for data aggregation and information distribution. Ou et al. [13] designed a vertical federated learning system utilizing Bayesian machine learning with homomorphic encryption, while Hou et al. [14] proposed a verifiable privacy-preserving scheme (VPRF) based on a vertical federated random forest. However, the stability and reliability of the PS are critical, as once the PS fails to provide accurate computation results, the VFL may produce a low-quality model [15]. To eliminate the effect of the PS, Chen et al. [16] proposed a secure VFL framework based on a pseudo-decentralization communication architecture. As illustrated in Fig. 2, the parties are divided into one active party and many passive parties, where the active party replaces the PS as the coordinator. In 2021, Zhu et al. [17] introduced a secure VFL framework named PIVODL, which trains GBDTs with data labels distributed across multiple devices. Zhang et al. [18] suggested a VFL framework based on an LSTM fault classification network for a firefighting IoT platform. Chen et al. [19] proposed an efficient and interpretable inference framework for decision tree ensembles in a VFL scenario. However, the pseudo-decentralization communication architecture still requires many communications to achieve high test accuracy and privacy security, and real-world applications involving such an intricate communication architecture impose high time and monetary costs. Although Gu et al. [20] proposed an efficient VFL framework called VFB2 to simplify the communication architecture, VFB2 is still vulnerable to semi-honest attacks [21] and affected by the coordinator. Hence, it is quite challenging to design a framework that considers both communication efficiency and privacy security in the VFL scenario.

Fig. 1  The VFL framework based on the C-S communication architecture

Fig. 2  The VFL framework based on the pseudo-decentralization communication architecture

In addition, a simplified framework is urgently needed to complete VFL modeling with limited communication resources and a reduced coordinator effect. Hence, this paper proposes VFL-R, a novel VFL framework that integrates a ring communication architecture with a homomorphic encryption (HE)-based approach, enabling multiple parties to train the model collaboratively. We summarize the contributions of this paper as follows.

  • We are the first to incorporate the ring communication architecture into the VFL framework. Our novel VFL framework thereby avoids a complicated communication protocol and reduces the coordinator’s effect. The performance of the VFL-R framework is evaluated on benchmark datasets and compared against other frameworks. The experimental results reveal that VFL-R effectively reduces the coordinator’s communication cost during the modeling process while preserving high test accuracy.

  • We provide a detailed theoretical analysis of our framework’s loss function and gradient. This is important because the theoretical analysis affords a better understanding of the framework’s operating mechanism and guides model optimization.

  • To protect the privacy of each party, we integrate an HE-based approach into our framework. Meanwhile, we analyze the classic semi-honest threat models and demonstrate VFL-R’s robustness to semi-honest attacks.

The remainder of this paper is organized as follows. Section 2 introduces the necessary methods and concepts for the VFL-R framework. Section 3 defines our framework’s new model formula, while Section 4 describes the proposed framework in detail. Section 5 presents the security analysis, and Sections 6, 7, and 8 present the experimental setup, the varying experiment settings, and the comparison of VFL-R against different VFL frameworks, respectively. Finally, Section 9 concludes this work and provides some future research directions.

2 Preliminary

This section introduces some methods and concepts for the VFL-R framework.

2.1 Paillier homomorphic encryption

Our framework employs the Paillier Homomorphic Encryption (PHE) [22, 23] scheme to protect data privacy. PHE is an additively homomorphic encryption method that supports addition of ciphertexts and multiplication of a ciphertext by a plaintext value. This paper defines a new encryption operation \('\odot ^{\prime }\).

Definition 1 (Encryption operation \('\odot ^{\prime }\))

For any \(a,b\in \mathbb {R}^{n}\), the operation \('\odot ^{\prime }\) denotes either of the following calculations:

  • \(a \oplus b = [[a]] + [[b]] = [[a + b]]\) (addition)

  • \(a \otimes b = a^{T}[[b]] = [[a^{T}b]]\) (scalar product),

where the additive homomorphic encryption of a vector \(u (u\in \mathbb {R}^{n})\) is represented as [[u]].
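
For concreteness, the two operations above can be reproduced with the open-source python-paillier library (`phe`); the library choice, key length, and sample values below are our own illustration and are not prescribed by this paper.

```python
# A minimal sketch of Definition 1 using python-paillier (pip install phe).
# The library, key size, and sample vectors are illustrative assumptions.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

a = [0.5, -1.2, 3.0]
b = [2.0, 0.7, -0.4]

# [[a]] and [[b]]: element-wise Paillier encryption of the two vectors
enc_a = [public_key.encrypt(x) for x in a]
enc_b = [public_key.encrypt(x) for x in b]

# Addition: [[a]] + [[b]] = [[a + b]]  (ciphertext + ciphertext)
enc_sum = [ea + eb for ea, eb in zip(enc_a, enc_b)]

# Scalar product: a^T [[b]] = [[a^T b]]  (plaintext scalars times ciphertexts)
enc_dot = sum(ai * eb for ai, eb in zip(a, enc_b))

print(private_key.decrypt(enc_sum[0]))  # 2.5   = 0.5 + 2.0
print(private_key.decrypt(enc_dot))     # -1.04 = a^T b
```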

2.2 Vertical federated learning

Let the training samples be \(\left \{\left (\mathbf {x}_{i}, y_{i}\right ): i=1,2, \ldots , n\right \}\), where \(\mathbf {x}_{i}\in \mathbb {R}^{d}\) and yi denote the input vector and output label, respectively, and d is the feature dimension of the training samples. In the VFL setting, xi is vertically distributed among K parties, and each party owns a disjoint subset of the feature vector, \(\mathbf {x}_{[i,k]} \in \mathbb {R}^{d_{k}}, k=1,2,\ldots ,K\), where dk is the feature dimension of the k-th party and \( \sum \limits _{k=1}^{K}d_{k}=d\). Similarly, we define \({\varTheta }=\left [\theta _{1};\theta _{2};\ldots ;\theta _{K}\right ]\), where \(\theta _{k} \in \mathbb {R}^{d_{k}}\) denotes the parameter of the k-th party. Supposing that the K-th party holds the label information \(y_{[i,K]}\in \mathbb {R}\), we focus on the following empirical risk minimization problem:

$$ \min_{\varTheta} \mathcal{L}({\varTheta}) \triangleq \frac{1}{n} {\sum}_{i=1}^{n} f\left( {\sum}_{k=1}^{K} \mathbf{x}_{[i,k]}\theta_{k}, y_{[i,K]}\right)+\lambda R\left( {\varTheta}\right), $$
(1)

where \({\mathscr{L}}: \mathbb {R}^{d} \rightarrow \mathbb {R}\) is smooth and convex, λ is a tuning parameter, and f(⋅) and R(⋅) denote the loss function and regularizer, respectively.
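
To make the vertical partition concrete, the following sketch (our illustration; the dimensions and random data are arbitrary) splits a feature matrix column-wise among K parties and verifies that the per-party contributions \(\mathbf {x}_{[i,k]}\theta _{k}\) sum to the centralized prediction.

```python
# Column-wise (vertical) split of a feature matrix among K parties.
# Dimensions and data are arbitrary; this only illustrates Section 2.2.
import numpy as np

rng = np.random.default_rng(0)
n, d, K = 6, 8, 4                                 # samples, features, parties
X = rng.normal(size=(n, d))
theta = rng.normal(size=d)

splits = np.array_split(np.arange(d), K)          # disjoint feature blocks, sum(d_k) = d
X_parts = [X[:, idx] for idx in splits]           # x_[i,k] held by party k
theta_parts = [theta[idx] for idx in splits]      # theta_k held by party k

# the sum over parties of the local contributions equals the centralized inner product
w = sum(Xk @ tk for Xk, tk in zip(X_parts, theta_parts))
assert np.allclose(w, X @ theta)
```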

2.3 Gradient descent

Stochastic gradient descent (SGD) [24, 25] is one of the most commonly used algorithms for solving convex optimization problems. In this paper, we employ a gradient-based method to optimize (1). Assuming \({\mathscr{L}}({\varTheta })\) is differentiable, the local parameters 𝜃k of the k-th party are updated according to:

$$ \theta_{k}^{*}=\theta_{k}-\alpha \nabla\mathcal{L}\left( \theta_{k}\right), $$
(2)

where α is the learning rate and \(\nabla {\mathscr{L}}\left (\theta _{k}\right )\) denotes the gradient of \({\mathscr{L}}({\varTheta })\) with respect to 𝜃k. The empirical risk is reduced step by step along the negative gradient direction.

2.4 Loss function

The model’s loss function depends on the model’s purpose and can be regarded as a function of a single variable \(t (t\in \mathbb {R})\) [26]. We rewrite the loss function in (1) as \(\frac {1}{n} \sum \limits _{i=1}^{n} f\left (t\right )\) with t = w − y for regression or t = wy for classification, where \(w=\sum \limits _{k=1}^{K}\mathbf {x}_{[i,k]}\theta _{k}\) and y = y[i,K]. Some common loss functions are reported in Table 1.

Table 1 Typical loss functions used in machine learning

3 Preparations for VFL-R framework

A natural question that arises is which loss function in Table 1 should be adopted by our framework. To answer this question, this section introduces the necessary theoretical analysis and derives a new model formula applicable to our framework.

3.1 Theoretical analysis

For the existing VFL frameworks, Wan et al. [27] assumed that the loss function is implicitly linearly separable in the form f(t) = g(h(t)), where g is any differentiable function and h(t) is a linearly separable function of the form \(\sum \limits _{k=1}^{K}h\left (\theta _{k},\mathbf {x}_{[i,k]}\right )\). In this paper, we give a new property for the loss function involving the encryption operation \('\odot ^{\prime }\).

Property 1 (Encryption composed property)

For all \(t\in \mathbb {R}\), \([[f(t)]] \in {\mathscr{M}}\left (\ell ;^{\prime } \odot ^{\prime }\right )\). The encrypted set \({\mathscr{M}}\left (\ell ;^{\prime } \odot ^{\prime }\right )=\{m\}\), where each m is composed of the elements of \(\ell =\left \{\ell _{1}, \ell _{2}, \ldots , \ell _{K}\right \}\) via the operation \('\odot ^{\prime }\) as follows: \(m=\ell _{1} \odot \ell _{2} \odot {\cdots } \odot \ell _{K}\).

3.2 New model formula in \({\mathscr{L}}({\varTheta })\)

In our framework, the K-th party computes the encrypted loss function [[f(t)]] in the form of \({\mathscr{M}}_{[K]}\left (\ell ;'\odot ^{\prime }\right )\), where [K] is the index of the K-th party. We assume that the regularizer also satisfies Property 1. Then, the K-th party computes the encrypted regularizer [[R(Θ)]] in the form of \(\mathcal {N}_{[K]}\left (\theta ;'\odot ^{\prime }\right )\) with the set \(\theta =\left \{\theta _{1},\theta _{2},\ldots ,\theta _{K} \right \}\). The new model formula in (1) can be rewritten as:

$$ \frac{1}{n} \sum\limits_{i=1}^{n} \mathcal{M}_{[K]}\left( \ell;'\odot^{\prime}\right)+\lambda \mathcal{N}_{[K]}\left( \theta;'\odot^{\prime}\right). $$
(3)

3.3 Aggregation of the encrypted gradient \([[\nabla {\mathscr{L}}(\theta _{k})]]\)

The encrypted gradient aggregation is important for our framework to update the local parameters. Thus, this subsection introduces the assumption for the gradient.

Assumption 1

The gradients ∇f and ∇R satisfy Property 1, namely [[∇f]] comprises elements from set \(\mathcal {A}\) and [[∇R]] is composed of elements from set \({\mathscr{B}}\).

Theorem 1

Under Assumption 1, the encrypted gradient \([[\nabla {\mathscr{L}}(\theta _{k})]]\) can be composed of the elements from the set \({\mathscr{L}}\): for t = w − y, \({\mathscr{L}} =\mathcal {A} \cup {\mathscr{B}} \cup \left \{\mathbf {x}_{[i, k]}\right \}\), and for t = wy, \({\mathscr{L}}=\mathcal {A} \cup {\mathscr{B}} \cup \left \{\mathbf {x}_{[i, k]}, y_{[i, K]}\right \}\).

Proof

For t = w − y, we derive the explicit form of \(\nabla {\mathscr{L}}(\theta _{k})\) according to (1) as:

$$ \nabla\mathcal{L}\left( \theta_{k}\right)=\frac{1}{n}\sum\limits_{i=1}^{n} \left( \nabla f \times \mathbf{x}_{[i,k]}\right)+\nabla R. $$
(4)

Considering the encrypted form

$$ \begin{array}{@{}rcl@{}} [[\nabla\mathcal{L}\left( \theta_{k}\right)]]&=&\frac{1}{n}\sum\limits_{i=1}^{n} \left( \nabla f \otimes \mathbf{x}_{[i,k]}\right)\oplus\nabla R\\ &=&\frac{1}{n}\sum\limits_{i=1}^{n}\left( \mathbf{x}_{[i, k]}^{T} [[\nabla f]]\right)+[[\nabla R]], \end{array} $$
(5)

hence \([[\nabla {\mathscr{L}}(\theta _{k})]]\) is composed of elements from the set \({\mathscr{L}}=\mathcal {A}\cup {\mathscr{B}}\cup \left \{ \mathbf {x}_{[i,k]}\right \}\).

For t = wy, we derive the explicit form of \(\nabla {\mathscr{L}}(\theta _{k})\) according to (1) as:

$$ \nabla\mathcal{L}\left( \theta_{k}\right)=\frac{1}{n}\sum\limits_{i=1}^{n} \left( y_{[i,K]} \nabla f \times \mathbf{x}_{[i,k]}\right)+\nabla R. $$
(6)

Considering the encrypted form

$$ \begin{array}{@{}rcl@{}} [[\nabla\mathcal{L}\left( \theta_{k}\right)]]&=&\frac{1}{n}\sum\limits_{i=1}^{n} \left( y_{[i,K]} \otimes \nabla f \otimes \mathbf{x}_{[i,k]}\right)\oplus\nabla R\\ &=&\frac{1}{n}\sum\limits_{i=1}^{n} \left( y_{[i,K]} \mathbf{x}_{[i,k]}^{T} [[\nabla f]] \right)+[[\nabla R]], \end{array} $$
(7)

hence \([[\nabla {\mathscr{L}}(\theta _{k})]]\) can be composed of elements from the set \({\mathscr{L}}=\mathcal {A} \cup {\mathscr{B}} \cup \left \{ \mathbf {x}_{[i,k]} ,y_{[i,K]}\right \}\). □

According to Theorem 1, the local data \(\left \{\mathbf {x}_{[i, k]}, y_{[i, K]}\right \}\) are necessary to compute \([[\nabla {\mathscr{L}}(\theta _{k})]]\). However, in this paper, during the aggregation process the K-th party only computes the encrypted results [[∇f]] and [[∇R]], written as \(\mathcal {D}_{[K]}\left (D,^{\prime } \odot ^{\prime }\right )\), where the set \(D=\mathcal {A}\cup {\mathscr{B}}\) is written as \(\left \{ d_{1},d_{2},\ldots ,d_{K} \right \}\). Then, each party computes its encrypted gradient \(\left [\left [\nabla {\mathscr{L}}\left (\theta _{k}\right )\right ]\right ]\) during the local updating process. The purpose is to avoid gradient information leakage and to reduce the computational burden during the aggregation process.

4 The VFL-R architecture

This section introduces the novel VFL framework based on the ring architecture illustrated in Fig. 3. The design framework has the following characteristics:

  • It includes two party types: one coordinator and several workers. The coordinator does not participate in the model training.

  • A one-way channel exists between adjacent workers, and a two-way channel exists between the coordinator and the K-th worker.

  • During the modeling process, each worker only needs one public key from the coordinator. Changing the encryption pairs in our framework is unnecessary.

Fig. 3  The pipeline of the VFL-R framework

4.1 The VFL-R framework

We divide our framework into three phases. In Phase One, the primary task is to aggregate the model function and the encrypted results, while Phase Two performs local updating at each worker. Finally, Phase Three focuses on decrypting the encrypted local parameters.

a. Phase One

The K-th worker needs to compute the \({\mathscr{M}}_{[K]}(\ell ;'\odot ^{\prime })\), \(\mathcal {N}_{[K]}\left (\theta ;'\odot ^{\prime }\right )\) and \(\mathcal {D}_{[K]}\left (D,'\odot ^{\prime }\right )\). The aggregation ideas are summarized as:

$$ \begin{array}{@{}rcl@{}} &\mathcal{M}_{[1]}=\mathcal{M}\left( \ell_{1};'\odot^{\prime}\right)\\ &\mathcal{M}_{[2]}=\mathcal{M}\left( \mathcal{M}_{[1]}\cup \ell_{2};'\odot^{\prime}\right)\\ & {\cdots} \cdots\\ &\mathcal{M}_{[K-1]}=\mathcal{M}\left( \mathcal{M}_{[K-2]}\cup \ell_{K-1};'\odot^{\prime}\right)\\ &\mathcal{M}_{[K]}\left( \ell;^{\prime} \odot^{\prime}\right)\in\mathcal{M}\left( \mathcal{M}_{[K-1]}\cup \ell_{K};'\odot^{\prime}\right). \end{array} $$
(8)

Denote by \({\mathscr{M}}_{[i]} (i=1,2,\ldots ,K-1)\) the encrypted set computed by the i-th worker. Each element in \({\mathscr{M}}_{[i]}\) can be used to compute \({\mathscr{M}}_{[K]}\left (\ell ;'\odot ^{\prime }\right )\). The 1-st worker computes the encrypted \(\ell _{1}\) to form \({\mathscr{M}}_{[1]}\), while the 2-nd worker computes new elements based on the elements from \({\mathscr{M}}_{[1]} \cup \ell _{2}\). With the transfer of \({\mathscr{M}}_{[i]}\) in our proposed framework, we increase the availability of elements when computing the target model. Hence, the K-th worker can compute \({\mathscr{M}}_{[K]}\left (\ell ;'\odot ^{\prime }\right )\), and similarly, \(\mathcal {N}_{[K]}\left (\theta ;'\odot ^{\prime }\right )\) can be aggregated as:

$$ \begin{array}{@{}rcl@{}} &{}\mathcal{N}_{[1]}=\mathcal{N}\left( \theta_{1};'\odot^{\prime}\right)\\ &\mathcal{N}_{[2]}=\mathcal{N}\left( \mathcal{N}_{[1]}\cup \theta_{2};'\odot^{\prime}\right)\\ &{\cdots} \cdots\\ &\mathcal{N}_{[K-1]}=\mathcal{N}\left( \mathcal{N}_{[K-2]}\cup \theta_{K-1};'\odot^{\prime}\right)\\ &\mathcal{N}_{[K]}\left( \theta;'\odot^{\prime}\right)\in\mathcal{N}\left( \mathcal{N}_{[K-1]}\cup \theta_{K};'\odot^{\prime}\right). \end{array} $$
(9)

The \(\mathcal {D}_{[K]}\left (D,'\odot ^{\prime }\right )\) can be aggregated as:

$$ \begin{array}{@{}rcl@{}} &{}\mathcal{D}_{[1]}=\mathcal{D}\left( d_{1};'\odot^{\prime}\right)\\ &\mathcal{D}_{[2]}=\mathcal{D}\left( \mathcal{D}_{[1]}\cup d_{2};'\odot^{\prime}\right)\\ & {\cdots} \cdots\\ &\mathcal{D}_{[K-1]}=\mathcal{D}\left( \mathcal{D}_{[K-2]}\cup d_{K-1};'\odot^{\prime}\right)\\ &\mathcal{D}_{[K]}\left( D;'\odot^{\prime}\right)\in\mathcal{D}\left( \mathcal{D}_{[K-1]}\cup d_{K};'\odot^{\prime}\right). \end{array} $$
(10)

This phase includes the following three steps; a minimal code sketch of the ring pass follows the list.

  • Step 1: The coordinator creates encryption pairs and sends the public key to the K-th worker. Then, the K-th worker sends the public key to the 1-st worker.

  • Step 2: The 1-st worker receives the public key and computes the encrypted sets \({\mathscr{M}}_{[1]}\), \(\mathcal {N}_{[1]}\), \(\mathcal {D}_{[1]}\). The 2-nd worker does the same operations as the 1-st worker after receiving the public key and \({\mathscr{M}}_{[1]}\), \(\mathcal {N}_{[1]}\), \(\mathcal {D}_{[1]}\). This process is repeated until the encrypted sets \({\mathscr{M}}_{[K-1]}\), \(\mathcal {N}_{[K-1]}\), \(\mathcal {D}_{[K-1]}\) are sent to the K-th worker.

  • Step 3: The K-th worker completes the aggregation of \({\mathscr{M}}_{[K]}\left (\ell ;^{\prime } \odot ^{\prime }\right ), \mathcal {N}_{[K]}\left (\theta ;^{\prime } \odot ^{\prime }\right )\) and \(\mathcal {D}_{[K]}\left (D,^{\prime } \odot ^{\prime }\right )\).
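
The following sketch illustrates Steps 1-3 for a single quantity, the encrypted per-sample partial predictions; it is our simplified reading of the ring pass (using python-paillier), not the authors' implementation, and the decryption at the end is included only to verify the result.

```python
# Simplified ring-pass aggregation (Steps 1-3) for one element of the passed
# sets: the encrypted partial predictions w_i = sum_k x_[i,k] theta_k.
# Library choice, sizes, and the final verification step are our assumptions.
from phe import paillier
import numpy as np

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

K, n, d_k = 3, 4, 2
rng = np.random.default_rng(1)
X_parts = [rng.normal(size=(n, d_k)) for _ in range(K)]   # local features per worker
theta_parts = [rng.normal(size=d_k) for _ in range(K)]    # local parameters per worker

# Step 1: the coordinator's public key reaches worker 1 (via worker K)
running = [public_key.encrypt(0.0) for _ in range(n)]     # [[0]] for each sample

# Step 2: each worker adds its encrypted contribution and forwards the set
for k in range(K):
    w_k = X_parts[k] @ theta_parts[k]                     # local x_[i,k] theta_k
    running = [acc + public_key.encrypt(float(v)) for acc, v in zip(running, w_k)]

# Step 3: worker K now holds [[w_i]] = [[sum_k x_[i,k] theta_k]] for every sample.
# Decryption below is for verification only; in VFL-R only the coordinator decrypts.
w_dec = np.array([private_key.decrypt(c) for c in running])
assert np.allclose(w_dec, sum(Xk @ tk for Xk, tk in zip(X_parts, theta_parts)))
```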

b. Phase Two

In this phase, each worker computes the encrypted gradient and updates the local parameters. The steps are as follows.

  • Step 4: The K-th worker uses \(\mathcal {D}_{[K]}\left (D,'\odot ^{\prime }\right )\) to compute \([[\nabla {\mathscr{L}}(\theta _{K})]]\) and updates its local parameters in the form \([[\theta _{K}^{*}]]=[[\theta _{K}]]-\alpha [[\nabla {\mathscr{L}}\left (\theta _{K}\right )]]\) under the ciphertext environment. Next, \(\mathcal {D}_{[K]}\left (D,'\odot ^{\prime }\right )\) is sent to the 1-st worker, which performs the same operations as the previous worker. This procedure repeats until all workers complete the local updating (a sketch of this ciphertext-domain update follows this list).

  • Step 5: As illustrated in Figs. 4 and 5, all workers perform Steps 2-4 during the t-th (1 < t < T) iteration. The coordinator does not play any role during the modeling process and rarely has access to the intermediate results concerning the target model.
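
A per-coordinate sketch of the ciphertext-domain update in Step 4 is shown below; the gradient values are placeholders (in VFL-R each worker would assemble \([[\nabla {\mathscr{L}}(\theta _{k})]]\) from the aggregated set), and the final decryption is only a correctness check.

```python
# Ciphertext-domain update [[theta*]] = [[theta]] - alpha * [[grad]] with
# python-paillier. Gradient values are placeholders; decryption is a check only.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)
alpha = 0.1

theta_k = [0.3, -0.8]                       # current local parameters
grad_k = [1.5, -0.5]                        # stand-in gradient coordinates

enc_theta = [public_key.encrypt(t) for t in theta_k]
enc_grad = [public_key.encrypt(g) for g in grad_k]

# only ciphertext addition/subtraction and scaling by a plaintext constant are needed
enc_theta_new = [t - alpha * g for t, g in zip(enc_theta, enc_grad)]

print([private_key.decrypt(t) for t in enc_theta_new])   # [0.15, -0.75]
```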

c. Phase Three

Since the local parameter updating is in the ciphertext environment, it is necessary to decrypt the local parameters in the T-th iteration.

  • Step 6: As illustrated in Fig. 6, the K-th worker sends [[𝜃K]] to the 1-st worker after updating its local parameters. Then the 1-st worker sends {[[𝜃1]], [[𝜃K]]} to the 2-nd worker after updating its local parameters. This process repeats until the encrypted set \({\varTheta }=\left \{[[\theta _{1}]],[[\theta _{2}]],\ldots ,[[\theta _{K}]] \right \}\) is sent to the coordinator. The coordinator then decrypts the encrypted set using its private key.

    Fig. 4  The aggregation process for the VFL-R framework in Steps 1-3

    Fig. 5  The local updating process for the VFL-R framework in Step 4

    Fig. 6  The decryption of local parameters for the VFL-R framework in Step 6

5 Security analysis

This section discusses our framework’s privacy security. Given that the semi-honest threat models have been widely used in FL security analysis [28,29,30], we introduce two assumptions for semi-honest threat models and analyze the privacy security from two aspects: the coordinator and the workers.

Assumption 2 (Honest-but-curious)

Each party follows the designed protocol and performs the correct computations. However, some parties may infer other parties’ raw data and models by retaining records of the intermediate computation results.

Assumption 3 (Honest-but-colluding)

Each party follows the designed protocol and performs the correct computations. Unlike Assumption 2, some parties may collude to infer other parties’ raw data and models by sharing their retained records.

For workers

In our framework, each worker passes on the intermediate results and updates its local parameters in the ciphertext environment. Workers receive only encrypted values from other workers; under this encryption protection, it is challenging to perform inference attacks against the other workers under Assumptions 2–3.

For the coordinator

In our framework, the coordinator’s task is to distribute the public key in the first iteration and to decrypt the local parameters in the T-th iteration. Even if the coordinator obtains the actual values of the local parameters, it is still difficult to infer the raw data under Assumption 2.

6 Experiment setting

All experiments simulate the VFL scenario using Python 3.8.5 on an Intel Core E5-2640 CPU (2.40 GHz). The data are partitioned vertically into four non-overlapping parties with a nearly equal number of features. We randomly select 70% of the samples as training data and use the remaining 30% as testing data.

6.1 Problem

The following experiments focus on the binary classification problem and utilize the logistic regression [31, 32] model, written as:

$$ f(w,y)\triangleq \frac{1}{n}{\sum}_{i=1}^{n} \log \left[1+\exp \left( -wy\right)\right], $$
(11)

where \(w=\sum \limits _{k=1}^{K}\mathbf {x}_{[i,k]}\theta _{k}\) and \(y\in \left \{-1,1\right \}\). We add the 2-norm regularizer \(R({\varTheta })=\frac {1}{2}\|{\varTheta }\|_{2}^{2}\) to avoid overfitting. Meanwhile, we use a second-order Taylor approximation of the logistic loss function to handle its non-linearity [33]. The model function can be written as:

$$ f(w,y)\approx\frac{1}{n} \sum\limits_{i=1}^{n} \left( \log 2-\frac{1}{2} wy +\frac{1}{8}w^{2}\right)+\frac{\lambda}{2}\|{\varTheta}\|_{2}^{2}. $$
(12)
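
For clarity, (12) follows from the second-order Taylor expansion of the logistic loss \(f(z)=\log (1+e^{-z})\) around z = 0, where \(f(0)=\log 2\), \(f^{\prime }(0)=-\frac {1}{2}\), and \(f^{\prime \prime }(0)=\frac {1}{4}\), together with \(y\in \{-1,1\}\) so that \((wy)^{2}=w^{2}\):

$$ \log\left( 1+e^{-wy}\right)\approx\log 2-\frac{1}{2}wy+\frac{1}{8}(wy)^{2}=\log 2-\frac{1}{2}wy+\frac{1}{8}w^{2}. $$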

The gradient with respect to 𝜃k is:

$$ \nabla_{[k]} f(w,y)\approx\frac{1}{n} \sum\limits_{i=1}^{n}\left( \frac{1}{4} w -\frac{1}{2} y\right) \mathbf{x}_{[i,k]} + \lambda\theta_{k}. $$
(13)
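
As a plaintext sanity check of (12) and (13) (our sketch, with arbitrary data and a single party's feature block; `lam` stands for λ):

```python
# Plaintext check of the Taylor-approximated loss (12) and per-party gradient (13).
# Data, dimensions, and lam are arbitrary; the regularizer here uses theta_k only.
import numpy as np

def taylor_loss(w, y, theta, lam):
    # (1/n) sum(log 2 - wy/2 + w^2/8) + (lam/2) * ||theta||^2
    return np.mean(np.log(2) - 0.5 * w * y + 0.125 * w ** 2) + 0.5 * lam * theta @ theta

def taylor_grad_party(w, y, X_k, theta_k, lam):
    # (1/n) sum((w/4 - y/2) * x_[i,k]) + lam * theta_k
    return X_k.T @ (0.25 * w - 0.5 * y) / len(y) + lam * theta_k

rng = np.random.default_rng(2)
n, d_k = 8, 3
X_k = rng.normal(size=(n, d_k))
theta_k = rng.normal(size=d_k)
y = rng.choice([-1.0, 1.0], size=n)
w = X_k @ theta_k          # in VFL-R this would be the aggregated sum over all parties

print(taylor_loss(w, y, theta_k, lam=0.3))
print(taylor_grad_party(w, y, X_k, theta_k, lam=0.3))
```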

6.2 Benchmark datasets

We evaluate our framework’s performance on benchmark datasets with various numbers of samples and features. Specifically, we select four datasets from the UCI repository [34]: the Ionosphere, Statlog (Heart), Sonar, and Breast Cancer Wisconsin Diagnostic (WDBC) datasets. The sample and feature numbers are listed in Table 2, and the values of each feature are standardized into [0,1].

Table 2 Statistics of the benchmark datasets

6.3 The algorithm

Owing to limited computational power, it is hard to perform many repeated PHE operations with a single public key. To address this problem, we set t as a fixed training period in Algorithm 1; specifically, t is set to the maximum number of PHE operations that can be performed. Algorithm 2 gives the overall VFL-R system over T (T > t) iterations. In practice, the total period T is divided into many small periods of length t, and each party executes the VFL-R framework within each divided period.

Algorithm 1  The VFL-R framework with the t training period.

Algorithm 2  The VFL-R framework with T iterations.

7 Varying experiment settings

The performance assessment metrics are the convergence performance and the classification results on the benchmark datasets. Moreover, we explore the effects of various learning rates α and tuning parameters λ.

7.1 Varying learning rates α

In our experiments, it is hard to obtain the loss function curves directly because the local parameters remain encrypted. Thus, we assume that all workers save the encrypted parameter results and jointly compute the loss function with the coordinator (t = 1). The loss function curves of the VFL-R framework under various learning rates are presented in Figs. 7, 8, 9 and 10. These figures highlight that the loss function curves follow a consistent overall trend regardless of the learning rate. Moreover, from α = 0.01 to α = 0.3, the convergence speed of VFL-R improves. The classification results under various learning rates are reported in Table 3. When α = 0.1, the VFL-R framework achieves the best classification results; overall, the learning rate affects the classification performance. Therefore, the learning rate must be tuned appropriately rather than simply choosing a large value.

Fig. 7  Loss function curve with various α on the Ionosphere dataset, where α = \(\left \{ 0.01, 0.1, 0.2, 0.3 \right \}\)

Fig. 8  Loss function curve with various α on the Statlog (Heart) dataset, where α = \(\left \{ 0.01, 0.1, 0.2, 0.3 \right \}\)

Fig. 9  Loss function curve with various α on the Sonar dataset, where α = \(\left \{ 0.01, 0.1, 0.2, 0.3 \right \}\)

Fig. 10  Loss function curve with various α on the WDBC dataset, where α = \(\left \{ 0.01, 0.1, 0.2, 0.3 \right \}\)

Table 3 Classification results of the VFL-R framework under various learning rates on the benchmark datasets for T = 300

7.2 Varying tuning parameters λ

For this case, we set the learning rate to 0.1 and vary the tuning parameter. The loss function curves for various tuning parameters λ are illustrated in Figs. 11, 12, 13 and 14, demonstrating that the loss function curves have a similar convergence trend. The classification results of VFL-R with different tuning parameters are reported in Table 4, indicating that when λ increases from 0.1 to 0.9, VFL-R achieves a high classification performance of 84.69% - 85.05% on the Ionosphere dataset, 86.11% - 86.33% on the Statlog (Heart) dataset, 81.43% - 82.05% on the Sonar dataset, and 95.55% - 95.83% on the WDBC dataset.

Fig. 11  Loss function curve with various λ on the Ionosphere dataset, where λ = \(\left \{ 0.1, 0.3, 0.6, 0.9 \right \}\)

Fig. 12  Loss function curve with various λ on the Statlog (Heart) dataset, where λ = \(\left \{ 0.1, 0.3, 0.6, 0.9 \right \}\)

Fig. 13  Loss function curve with various λ on the Sonar dataset, where λ = \(\left \{ 0.1, 0.3, 0.6, 0.9 \right \}\)

Fig. 14  Loss function curve with various λ on the WDBC dataset, where λ = \(\left \{ 0.1, 0.3, 0.6, 0.9 \right \}\)

Table 4 Classification results of the VFL-R framework with various tuning parameters on four datasets for T = 300

8 Comparison with different VFL frameworks

At present, existing VFL frameworks pay little attention to innovations in the communication architecture. To better highlight the performance of VFL-R, we compare it against the VFL [12] and VFB2 [20] frameworks in different aspects, including functionality, test accuracy, and communication performance.

8.1 Functionality analysis

Table 5 reports the functional comparison of the above frameworks. Specifically, VFL is based on the C-S communication architecture and can defend against semi-honest attacks to preserve data security. However, the C-S communication architecture is inefficient, especially when many parties are involved. The VFB2 framework relies on the tree communication architecture and supports distributed learning. Although the tree communication architecture can significantly reduce the number of communications during the modeling process, it cannot guarantee high privacy security because it does not use encryption technology [35].

Table 5 Comparison analysis of the VFL and VFB2 frameworks

Furthermore, the VFL and VFB2 frameworks impose a significant communication burden on the coordinator: during the modeling process, the coordinator sends the gradient or other parameters, incurring unnecessary communication cost and a high risk of information disclosure. In contrast, our proposed framework balances the two frameworks and reduces the coordinator’s communication burden.

8.2 Test accuracy

To evaluate the test accuracy of the VFL-R framework, we compare it against the VFL and VFB2 frameworks. Furthermore, we examine the accuracy gap between different loss functions by considering a non-federated (NonF) experiment in which all data are integrated for modeling with the logistic loss function.

Figures 15, 16, 17 and 18 plot the test accuracy of the four frameworks on the benchmark datasets. For the Taylor loss function, the VFL-R framework achieves a test accuracy similar to the VFL and VFB2 frameworks, with the test accuracy of each framework deviating by at most 4% on the Ionosphere dataset. Considering the logistic loss function, the VFL-R framework attains a small test accuracy gap.

Fig. 15  Test accuracy on the Ionosphere dataset with various VFL frameworks, where T = {25,50,100,150,200,250,300,350,400}, α = 0.1 and λ = 0.3

Fig. 16  Test accuracy on the Statlog (Heart) dataset with different VFL frameworks, where T = {25,50,100,150,200,250,300,350,400}, α = 0.1 and λ = 0.3

Fig. 17  Test accuracy on the Sonar dataset with different VFL frameworks, where T = {25,50,100,150,200,250,300,350,400}, α = 0.1 and λ = 0.3

Fig. 18  Test accuracy on the WDBC dataset with different VFL frameworks, where T = {25,50,100,150,200,250,300,350,400}, α = 0.1 and λ = 0.3

8.3 Communication cost

We assume that each VFL framework includes N parties. \(\operatorname {Enc}\left (\cdot \right )\) denotes the encryption operation, and \(\left |\cdot \right |\) denotes the data size of each party during the modeling process. wi and gi represent the intermediate results for the i-th party, which are used to compute the loss function and gradient, respectively, and G is the gradient used in the modeling process.

For the VFL-R framework

During the aggregation process, each party sends \(\operatorname {Enc}\left (w_{i},g_{i}\right )\) to the next party and receives \(\operatorname {Enc}\left (w_{i-1},g_{i-1}\right )\) from the previous party. In the VFL-R framework, the coordinator does not participate in the modeling process, so the communication cost of the third-party coordinator is O(1).

For the VFL framework

Each party needs to send \(\operatorname{Enc}\left (w_{i},g_{i}\right )\) to the major party and receives \(\operatorname {Enc}\left (w_{n},g_{n}\right )\) from the other parties, where \(n\in \left \{1,2,\ldots ,i-1\right \}\). The coordinator has to send \(\operatorname {Enc}\left (G\right )\) to each party, and therefore the communication cost of the coordinator is \(O(\left |\operatorname {Enc}\left (G\right )\right |\cdot N )\).

For the VFB2 framework

Each party sends \(\left (w_{i},g_{i}\right )\) to the next party based on the tree communication architecture. Meanwhile, each party receives G from the coordinator. Thus, the communication cost of the coordinator is \(O(\left | G\right |\cdot N )\).

Table 6 compares in detail the communication costs of all competing frameworks, demonstrating that our framework reduces the coordinator’s communication cost. Meanwhile, compared with the VFL framework, our method greatly reduces the communication cost for each party.

Table 6 Communication cost of the VFL-R framework compared with the VFL and VFB2 frameworks in one communication round

8.4 The number of communications

Next, we compare the number of communications of the three VFL frameworks per communication round. In Fig. 19, the horizontal axis shows the number of parties K and the vertical axis shows the number of communications in one communication round. The figure reveals that the VFL framework requires O(K²) communications as the number of participants increases, whereas our proposed framework requires only O(K) communications, similar to the VFB2 framework. Nevertheless, as discussed in Section 8.1, the VFB2 framework provides poor privacy security, an issue that does not arise in our framework.

Fig. 19  The number of communications of the three VFL frameworks in one communication round

9 Conclusion and future work

This work proposes VFL-R, a new VFL framework that utilizes a ring communication architecture to simplify the intricate communication among the parties. In particular, the ring communication architecture reduces the coordinator’s communication burden and decreases the number of communications in one communication round. Furthermore, our framework employs HE-based technology to guarantee privacy security. Functionality analysis and extensive experiments demonstrate that VFL-R effectively reduces the communication cost and achieves high accuracy on all benchmark datasets.

Our framework is limited by the necessary assumptions on the loss function and gradient. Therefore, future work will pursue improvements using more complex machine learning approaches or other methods that relax these assumptions. Meanwhile, we will continue our research on designing an efficient framework that further enhances communication performance in the VFL scenario.