
1 Introduction

Mashup technology has emerged as a promising software development method in service-oriented environments, allowing developers to compose existing Web APIs into new or value-added composite RESTful Web services [1]. More and more service providers have published Web APIs that enable developers to easily integrate data and functionality in the form of Mashups [2]. For example, as of July 2016, ProgrammableWeb already listed more than 15,400 Web APIs, and that number is still growing. Consequently, discovering the most suitable Web APIs to construct a user-desired Mashup application from such a large collection has become a significant challenge.

To address this challenge, some researchers exploit service recommendation to improve Web service discovery [3, 4]. In particular, topic model techniques (e.g., Latent Dirichlet Allocation (LDA) [5]) have been used to derive the latent topics of Mashups and Web APIs and thereby improve recommendation accuracy [3, 4]. A limitation of LDA is that the optimal number of topics must be determined in advance: each candidate topic number requires a complete retraining of the model, which is time-consuming. To solve this problem, Teh et al. [6] proposed a non-parametric Bayesian model, the Hierarchical Dirichlet Process (HDP), which automatically infers the optimal number of topics and saves training time. It can therefore be used to derive the topics of Mashups and Web APIs for more accurate service recommendation.

In recent years, matrix factorization has been used to decompose Web API invocations in historical Mashups for service recommendation [7, 8]. It factorizes the Mashup-Web API matrix into two lower-dimensional matrices. However, matrix factorization based service recommendation relies on rich records of historical Mashup-Web API interactions [8]. To mitigate this problem, some recent works incorporated additional information, such as users' social relations [9] or location similarity [10], into matrix factorization for more accurate recommendation. Even though matrix factorization relieves the sparsity between Mashups and Web APIs, it is not a general-purpose predictor: it works only with a single, special form of input data, and its performance degrades when more additional information, such as the co-occurrence and popularity of Web APIs, is incorporated into the model. Factorization Machines (FMs), proposed by Rendle [11, 12], are a general predictor that works with any real-valued feature vector and models all interactions between multiple input variables. FMs can thus be used to predict the probability that a Web API is invoked by a Mashup.

In this paper, we propose a Web APIs recommendation approach based on HDP and FMs for Mashup development. The contributions of this paper are as follows:

  • We use HDP to derive latent topics from the description documents of Mashups and Web APIs. Based on these topics, similar Mashups and similar Web APIs are identified to support the model training of FMs.

  • We apply FMs to the topics obtained by HDP to predict the probability of Web APIs being invoked by Mashups and to recommend high-quality Web APIs for Mashup development. In the FMs model, multiple kinds of auxiliary information are utilized to improve prediction accuracy.

  • We conduct a set of experiments on a real-world dataset crawled from ProgrammableWeb. The experimental results show that, compared with existing methods, our method achieves a significant improvement in terms of MAE and RMSE.

The rest of this paper is organized as follows: Sect. 2 describes the proposed method. Section 3 gives the experimental results. Section 4 presents related works. Finally, we draw conclusions and discuss our future work in Sect. 5.

2 Method Overview

2.1 The Topic Modeling of Mashup and Web APIs Using HDP

The Hierarchical Dirichlet Process (HDP) is a powerful non-parametric Bayesian method [13]; it is a multi-level form of the Dirichlet Process (DP) mixture model. Let \( (\varTheta ,{\text{B}}) \) be a measurable space with \( G_{0} \) a probability measure on the space, and let \( a_{0} \) be a positive real number. A Dirichlet Process [14] is defined as the distribution of a random probability measure \( G \) over \( (\varTheta ,{\text{B}}) \) such that, for any finite measurable partition \( \left( {A_{1} ,A_{2} , \ldots ,A_{r} } \right) \) of \( \varTheta \), the random vector \( \left( {G(A_{1} ), \ldots ,G(A_{r} )} \right) \) follows a finite-dimensional Dirichlet distribution with parameters \( \left( {a_{0} G_{0} (A_{1} ), \ldots ,a_{0} G_{0} (A_{r} )} \right) \):

$$ \left( {G(A_{1} ), \ldots ,G(A_{r} )} \right)\sim Dir\left( {a_{0} G_{0} (A_{1} ), \ldots ,a_{0} G_{0} (A_{r} )} \right) $$
(1)
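The stick-breaking construction gives a constructive view of this definition: a draw \( G \sim DP\left( {a_{0} ,G_{0} } \right) \) is a discrete measure whose weights are obtained by repeatedly breaking off Beta-distributed fractions of a unit-length stick. The following pure-Python sketch draws such a sample, truncated at a fixed number of atoms; all names are illustrative and not part of the paper's implementation:

```python
import random

def stick_breaking_dp(alpha, base_sampler, n_atoms, seed=0):
    """Draw a truncated sample G ~ DP(alpha, H) via stick-breaking.

    Returns (weights, atoms): the weights sum to 1 and each atom is an
    independent draw from the base measure H (base_sampler).
    """
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        # Break off a Beta(1, alpha) fraction of the remaining stick.
        b = rng.betavariate(1.0, alpha)
        weights.append(remaining * b)
        remaining *= (1.0 - b)
    weights.append(remaining)  # mass left on the final stick segment
    atoms = [base_sampler(rng) for _ in weights]
    return weights, atoms

# For illustration, H is a uniform base measure over [0, 1].
weights, atoms = stick_breaking_dp(alpha=1.0,
                                   base_sampler=lambda r: r.random(),
                                   n_atoms=50)
```

Larger values of the concentration parameter `alpha` spread the mass over more atoms, which is why \( \gamma \) and \( a_{0} \) control the effective number of topics in the HDP.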

In this paper, we use the HDP to model the documents of Mashups and Web APIs. The probabilistic graph of the HDP is shown in Fig. 1, in which the documents of Mashups or Web APIs, their words, and their latent topics are presented. Here, D represents the whole set of Mashup documents from which topics are derived, and d represents a single Mashup document in D. \( \gamma \) and \( a_{0} \) are concentration parameters, \( H \) is the base probability measure, and \( G_{0} \) is the global random probability measure. \( G_{d} \) represents the topic probability distribution of Mashup document d, \( \beta_{d,n} \) represents the topic of the nth word in d drawn from \( G_{d} \), and \( w_{d,n} \) represents the word generated from \( \beta_{d,n} \).

Fig. 1. The probabilistic graph of HDP

The generative process of our HDP model is as below:

  (1) For the document set D, draw the global probability measure \( G_{0} \sim DP\left( {\gamma ,H} \right) \) from the Dirichlet Process with base measure \( H \).

  (2) For each document d in D, draw its topic distribution \( G_{d} \sim DP\left( {a_{0} ,G_{0} } \right) \).

  (3) For each word \( n \in \left\{ {1,2, \ldots ,N} \right\} \) in d:

    • Draw the topic of the nth word by sampling \( \beta_{d,n} \sim G_{d} \);

    • Draw the word \( w_{d,n} \sim Multi\left( {\beta_{d,n} } \right) \) from the generated topic \( \beta_{d,n} \).

To achieve the sampling of the HDP, a construction method is needed to infer the posterior distribution of the parameters. The Chinese Restaurant Franchise (CRF) is a typical such construction, widely applied in document topic mining. Suppose J restaurants share a common menu \( \upphi = \left( {\upphi_{k} } \right)_{k = 1}^{K} \), where K is the number of dishes. The jth restaurant contains \( m_{j} \) tables \( \left( {\uppsi_{jt} } \right)_{t = 1}^{{m_{j} }} \) and serves \( N_{j} \) customers. Customers are free to choose tables, and each table serves only one dish: the first customer at a table orders the dish, and the other customers at that table share it. Here, restaurants, customers, and dishes correspond to documents, words, and topics in our HDP model, respectively. Let \( \delta \) denote the point-mass measure. The topic assignment \( \theta_{ji} \) of word \( x_{ji} \) can be regarded as a customer: the customer sits at table \( \uppsi_{jt} \) with probability \( \frac{{n_{jt} }}{{i - 1 + a_{0} }} \) and shares its dish \( \upphi_{k} \), or sits at a new table \( \uppsi_{{jt_{new} }} \) with probability \( \frac{{a_{0} }}{{i - 1 + a_{0} }} \), where \( n_{jt} \) is the number of customers at the tth table of the jth restaurant. If the customer opens a new table, an existing dish \( \upphi_{k} \) is assigned to it with probability \( \frac{{m_{k} }}{{\sum\nolimits_{k} {m_{k} } + \gamma }} \), proportional to the dish's popularity, or a new dish \( \upphi_{{k_{new} }} \) with probability \( \frac{\gamma }{{\sum\nolimits_{k} {m_{k} } + \gamma }} \), where \( m_{k} \) is the number of tables serving dish \( \upphi_{k} \). This yields the following conditional distributions:

$$ \theta_{ji} |\theta_{j1} ,\theta_{j2} , \ldots ,\theta_{j,i - 1} ,a_{0} ,G_{0} \sim \sum\nolimits_{t = 1}^{{m_{j} }} {\frac{{n_{jt} }}{{i - 1 + a_{0} }} \delta_{{\uppsi_{jt} }} } + \frac{{a_{0} }}{{i - 1 + a_{0} }}G_{0} $$
(2)
$$ \uppsi_{jt} |\uppsi_{11} ,\uppsi_{12} , \ldots ,\uppsi_{j,t - 1} ,\gamma ,H\sim \sum\nolimits_{k = 1}^{K} {\frac{{m_{k} }}{{\sum\nolimits_{k} {m_{k} } + \gamma }}\delta_{{\upphi_{k} }} } + \frac{\gamma }{{\sum\nolimits_{k} {m_{k} } + \gamma }}H $$
(3)

Thus, the CRF construction is exactly the process of assigning tables and dishes to customers, which corresponds to topic assignment for words and topic clustering of documents in the Mashup document set. After completing the CRF construction, we use Gibbs sampling to infer the posterior distribution of the HDP parameters, and thus obtain the topic distributions of the whole Mashup document set.
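The seating scheme behind the two conditional distributions above can be simulated directly. The sketch below is an illustrative pure-Python simulation (not the paper's actual Gibbs sampler): each customer joins an existing table with probability proportional to its occupancy \( n_{jt} \) or opens a new table with probability proportional to \( a_{0} \); each new table picks an existing dish with probability proportional to \( m_{k} \) or a new dish with probability proportional to \( \gamma \):

```python
import random

def crf_seating(n_customers, a0, gamma, seed=0):
    """Simulate table/dish assignments in one restaurant of a Chinese
    Restaurant Franchise (topic-assignment analogue; illustrative only)."""
    rng = random.Random(seed)
    table_counts = []   # n_jt: number of customers at each table
    table_dish = []     # dish (topic) index served at each table
    dish_tables = []    # m_k: number of tables serving dish k
    for i in range(1, n_customers + 1):
        # Existing table t w.p. n_jt/(i-1+a0); new table w.p. a0/(i-1+a0).
        r = rng.random() * (i - 1 + a0)
        acc = 0.0
        for t, n_jt in enumerate(table_counts):
            acc += n_jt
            if r < acc:
                table_counts[t] += 1
                break
        else:
            # New table: existing dish k w.p. m_k/(sum_k m_k + gamma),
            # or a brand-new dish w.p. gamma/(sum_k m_k + gamma).
            total_m = sum(dish_tables)
            r2 = rng.random() * (total_m + gamma)
            acc2, dish = 0.0, len(dish_tables)
            for k, m_k in enumerate(dish_tables):
                acc2 += m_k
                if r2 < acc2:
                    dish = k
                    break
            if dish == len(dish_tables):
                dish_tables.append(0)
            dish_tables[dish] += 1
            table_counts.append(1)
            table_dish.append(dish)
    return table_counts, table_dish

tables, dishes = crf_seating(n_customers=100, a0=1.0, gamma=1.0)
```

Because popular tables and dishes attract more customers, the number of distinct dishes (topics) grows slowly with the number of customers (words), which is how the HDP infers the topic number automatically.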

The HDP model construction and topic generation for the Web API document set proceed in the same way as for the Mashup document set, and are therefore omitted here.

2.2 Web APIs Recommendation for Mashup Using FMs

2.2.1 Rating Prediction in Recommendation System and FMs

A traditional recommendation system is a two-dimensional user-item model. Given a user set \( U = \left\{ {u_{1} ,u_{2} , \ldots } \right\} \) and an item set \( I = \left\{ {i_{1} ,i_{2} , \ldots } \right\} \), the rating prediction function is defined as:

$$ y: U \times I \to R $$
(4)

Here, \( y \) represents the rating, i.e. \( y\left( {u,i} \right) \) is the rating of user \( u \) for item \( i \). The task of rating prediction is to predict the rating of any user-item pair.

FMs are a general predictor that can estimate reliable parameters under very high sparsity, as in recommender systems [11, 12]. FMs combine the advantages of SVMs with factorization models: they work with any real-valued feature vector like SVMs, but also model all interactions between feature variables using factorized parameters. Thus, they can be used to predict item ratings for users. Suppose there is an input feature matrix \( X \in R^{n \times p} \) and an output target vector \( y = \left( {y_{1} ,y_{2} , \ldots ,y_{n} } \right)^{T} \), where \( n \) is the number of input-output pairs and \( p \) is the number of input features; that is, the ith row vector \( x_{i} \in R^{p} \) has \( p \) feature values, and \( y_{i} \) is the target value of \( x_{i} \). The 2-order FMs model is defined as:

$$ \hat{y}\left( x \right): = w_{0} + \sum\nolimits_{i = 1}^{p} {w_{i} x_{i} } + \sum\nolimits_{i = 1}^{p} {\sum\nolimits_{j = i + 1}^{p} {x_{i} x_{j} } } \sum\nolimits_{f = 1}^{k} {v_{i,f} v_{j,f} } $$
(5)

Here, \( k \) is the factorization dimensionality, \( w_{i} \) is the weight of the ith feature, and \( x_{i} x_{j} \) captures the pairwise interaction between features \( x_{i} \) and \( x_{j} \). The model parameters \( \left\{ {w_{0} , w_{1} , \ldots , w_{p} , v_{1,1} , \ldots ,v_{p,k} } \right\} \) to be estimated are:

$$ w_{0} \in R,\;w \in R^{p} ,\;V \in R^{p \times k} $$
(6)
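The prediction of formula (5) can be computed in \( O(kp) \) time using Rendle's reformulation of the pairwise term. The sketch below is a minimal illustration with hypothetical parameter values, not values from the paper:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """2-order Factorization Machine prediction (formula (5)):
    y(x) = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j.

    The pairwise term uses Rendle's O(k*p) identity:
    sum_{i<j} <v_i,v_j> x_i x_j
      = 0.5 * sum_f [ (sum_i v_{i,f} x_i)^2 - sum_i v_{i,f}^2 x_i^2 ].
    """
    linear = w0 + w @ x
    s = V.T @ x                                   # per-factor sums, shape (k,)
    pairwise = 0.5 * np.sum(s ** 2 - (V ** 2).T @ (x ** 2))
    return linear + pairwise

# Tiny worked example: p = 3 features, k = 2 factors (all values made up).
x = np.array([1.0, 0.0, 2.0])
w0, w = 0.1, np.array([0.2, 0.0, -0.1])
V = np.array([[0.1, 0.2],
              [0.3, 0.0],
              [0.0, 0.4]])
y_hat = fm_predict(x, w0, w, V)   # 0.1 (linear) + 0.16 (pairwise) = 0.26
```

The reformulated pairwise term is what makes FMs trainable in linear time even when the one-hot encoded feature vector is extremely sparse.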

2.2.2 The Prediction and Recommendation of Web APIs for Mashup Based on FMs

In this paper, the prediction target is a typical binary classification problem, i.e. \( y \in \left\{ { - 1, + 1} \right\} \). Web API prediction is defined as the task of ranking Web APIs and recommending adequately relevant Web APIs for a given Mashup. If y = +1, the relevant API is chosen as a member Web API of the given Mashup. In practice, formula (5) yields a predicted decimal value between 0 and 1 for each input feature vector. We rank these predicted values and classify them into positive (+1, the Top-K results) and negative (−1) values. The Web APIs with positive values are recommended to the target Mashup.

As described in Sect. 2.2.1, a traditional recommendation system is a two-dimensional user-item model. In our FMs modeling of Web API prediction, the active Mashup is regarded as the user and the active Web API as the item. Besides these two dimensions, further feature dimensions, such as similar Mashups, similar Web APIs, and the co-occurrence and popularity of Web APIs, are exploited as input features. Thus, the two-dimensional prediction model of formula (4) is expanded into a six-dimensional prediction model:

$$ y: MA \times WA \times SMA \times SWA \times CO \times POP \to S $$
(7)

Here, \( MA \) and \( WA \) represent the active Mashup and active Web API, \( SMA \) and \( SWA \) represent the similar Mashups and similar Web APIs, \( CO \) and \( POP \) represent the co-occurrence and popularity of Web APIs, and \( S \) represents the prediction ranking score. In particular, we exploit the latent topic probabilities of the documents of similar Mashups and similar Web APIs, derived from the HDP model of Sect. 2.1, to support the model training of FMs.
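One way to realize formula (7) as an FMs input row is to concatenate one feature block per dimension: one-hot blocks for MA and WA, topic-similarity-weighted blocks for SMA and SWA, a co-occurrence block for CO, and a scalar for POP. The following sketch is purely illustrative; every function and variable name is hypothetical, and the similarity weights would come from the HDP topic distributions:

```python
import numpy as np

def build_feature_row(n_mashups, n_apis, active_mashup, active_api,
                      similar_mashups, similar_apis, cooccur_apis, popularity):
    """Concatenate the six feature groups of formula (7) into one FMs input row.

    similar_mashups / similar_apis map index -> topic-similarity weight;
    cooccur_apis maps index -> normalized co-occurrence; popularity is a scalar.
    """
    ma = np.zeros(n_mashups); ma[active_mashup] = 1.0   # MA: one-hot
    wa = np.zeros(n_apis);    wa[active_api] = 1.0      # WA: one-hot
    sma = np.zeros(n_mashups)                           # SMA: similarity weights
    for m, sim in similar_mashups.items():
        sma[m] = sim
    swa = np.zeros(n_apis)                              # SWA: similarity weights
    for a, sim in similar_apis.items():
        swa[a] = sim
    co = np.zeros(n_apis)                               # CO: co-occurrence
    for a, c in cooccur_apis.items():
        co[a] = c
    pop = np.array([popularity])                        # POP: scalar popularity
    return np.concatenate([ma, wa, sma, swa, co, pop])

# Toy instance with 3 Mashups and 3 Web APIs, mirroring the 0.3/0.7
# similarity pattern described for Fig. 2.
row = build_feature_row(3, 3, active_mashup=1, active_api=0,
                        similar_mashups={0: 0.3, 2: 0.7},
                        similar_apis={1: 0.3, 2: 0.7},
                        cooccur_apis={2: 0.5}, popularity=0.8)
```

Each such row, paired with its ±1 target, forms one training instance for the FMs model of formula (5).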

Figure 2 shows an example of the FMs model for recommending Web APIs for a Mashup, in which the data consist of two parts: an input feature vector set X and an output target set Y. Each row represents an input feature vector \( x_{i} \) with its corresponding output target \( y_{i} \). The first binary indicator matrix (Box 1) represents the active Mashup \( MA \); for example, in the first row there is a link between M 2 and A 1. The second binary indicator matrix (Box 2) represents the active Web API \( WA \); for example, the active Web API in the first row is A 1. The third indicator matrix (Box 3) indicates the Top-A similar Web APIs \( SWA \) of the active Web API in Box 2, according to the latent topic distribution similarity derived from the HDP model of Sect. 2.1; in Box 3, the similarity between A 1 and A 2 (A 3) is 0.3 (0.7). The fourth indicator matrix (Box 4) indicates the Top-M similar Mashups \( SMA \) of the active Mashup in Box 1, again by latent topic distribution similarity; in Box 4, the similarity between M 2 and M 1 (M 3) is 0.3 (0.7). The fifth indicator matrix (Box 5) shows all co-occurring Web APIs \( CO \) of the active Web API in Box 2, i.e. those invoked or composed together in historical Mashups. The sixth indicator matrix (Box 6) shows the popularity \( POP \) (i.e. invocation frequency) of the active Web API in Box 2 in historical Mashups. Target Y is the output result, and the prediction ranking scores S are classified into positive (+1) and negative (−1) values according to a given threshold: if \( y_{i} > 0.5 \), then \( S = + 1 \); otherwise \( S = - 1 \). The Web APIs with positive values are recommended to the target Mashup. For example, if the active Mashup M 1 has two candidate member Web APIs A 1 and A 3, then A 1 is preferentially recommended to M 1 since it has the higher predicted value. Moreover, in the experiment section, we investigate the effects of Top-A and Top-M on Web API recommendation performance.

Fig. 2. The FMs model of recommending Web APIs for Mashup

3 Experiments

3.1 Experiment Dataset and Settings

To evaluate the performance of the different recommendation methods, we crawled 6,673 real Mashups, 9,121 Web APIs, and 13,613 invocations between them from ProgrammableWeb. For each Mashup or Web API, we first obtained its descriptive text and then preprocessed it to obtain standard description information. To enhance the reliability of our experiments, five-fold cross-validation is performed: the Mashups in the dataset are divided into 5 equal subsets, each fold in turn serves as the testing set, and the other 4 subsets are combined into the training set. The results of the folds are averaged and reported. For the testing set, we vary the number of score values provided by the active Mashups as 10, 20, and 30 by randomly removing score values from the Mashup-Web API matrix, denoted Given 10, Given 20, and Given 30; the removed score values serve as the expected values for studying prediction performance. For the training set, we randomly remove score values from the Mashup-Web API matrix to obtain sparser matrices with densities of 10%, 20%, and 30%, respectively.
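The five-fold protocol described above can be sketched as follows; this is an illustrative pure-Python version, and the actual split used in the paper may differ in ordering and seeding:

```python
import random

def five_fold_splits(mashup_ids, seed=42):
    """Split Mashups into 5 folds; each fold serves once as the test set
    and the remaining 4 folds form the training set."""
    ids = list(mashup_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::5] for i in range(5)]   # 5 interleaved, equal-size folds
    for i in range(5):
        test = folds[i]
        train = [m for j, f in enumerate(folds) if j != i for m in f]
        yield train, test

# 100 hypothetical Mashup ids -> 5 (train, test) pairs of sizes (80, 20).
splits = list(five_fold_splits(range(100)))
```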

3.2 Evaluation Metrics

Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are two frequently used evaluation metrics [15], and we choose them to evaluate Web API recommendation performance. The smaller the MAE and RMSE, the better the recommendation quality.

$$ MAE = \frac{1}{N}\sum\nolimits_{ij} {\left| {r_{ij} - \hat{r}_{ij} } \right|} $$
(8)
$$ RMSE = \sqrt {\frac{1}{N}\sum\nolimits_{ij} {\left( {r_{ij} - \hat{r}_{ij} } \right)^{2} } } $$
(9)

Here, N is the number of predicted scores, \( r_{ij} \) is the true score of Mashup M i for Web API A j , and \( \hat{r}_{ij} \) is the predicted score of M i for A j .
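For concreteness, formulas (8) and (9) translate directly into code; the example score lists below are made up for illustration:

```python
import math

def mae(true_scores, pred_scores):
    """Mean Absolute Error, formula (8)."""
    n = len(true_scores)
    return sum(abs(r - p) for r, p in zip(true_scores, pred_scores)) / n

def rmse(true_scores, pred_scores):
    """Root Mean Squared Error, formula (9)."""
    n = len(true_scores)
    return math.sqrt(
        sum((r - p) ** 2 for r, p in zip(true_scores, pred_scores)) / n)

r    = [1.0, 0.0, 1.0, 0.0]   # true invocation scores (hypothetical)
rhat = [0.9, 0.2, 0.7, 0.1]   # predicted scores (hypothetical)
# mae  = (0.1 + 0.2 + 0.3 + 0.1) / 4 = 0.175
# rmse = sqrt((0.01 + 0.04 + 0.09 + 0.01) / 4) ≈ 0.1936
```

RMSE squares the errors before averaging, so it penalizes large individual prediction errors more heavily than MAE does.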

3.3 Baseline Methods

In this section, we investigate and compare our proposed method with baseline methods. The baseline methods are briefly described as below:

  • WPCC. Like IPCC [15], the Web API-based Pearson Correlation Coefficient method (WPCC) uses PCC to calculate the similarities between Web APIs and makes recommendations based on similar Web APIs.

  • MPCC. Like UPCC [15], the Mashup-based Pearson Correlation Coefficient method (MPCC) uses PCC to calculate the similarities between Mashups and predicts Web API invocations based on similar Mashups.

  • PMF. Probabilistic Matrix Factorization (PMF) is one of the best-known matrix factorization models in collaborative filtering [8]. It assumes Gaussian noise on the observed data and places Gaussian priors on the latent matrices. The historical invocation records between Mashups and Web APIs are represented by a matrix \( R = \left[ {r_{ij} } \right]_{n \times k} \), where \( r_{ij} = 1 \) indicates that Web API A i is invoked by Mashup M j , and \( r_{ij} = 0 \) otherwise. Given the factorized representations of Mashup M j and Web API A i , the probability that A i is invoked by M j is predicted by \( \hat{r}_{ij} = A_{i}^{T} M_{j} \).

  • LDA-FMs. This method first derives the topic distributions of the description documents of Mashups and Web APIs via LDA, and then uses FMs to train on this topic information to predict the invocation probabilities of Web APIs and recommend Web APIs for the target Mashup. It also considers the co-occurrence and popularity of Web APIs.

  • HDP-FMs. The method proposed in this paper, which combines HDP and FMs to recommend Web APIs. It uses HDP to derive the latent topic probabilities of the documents of similar Mashups and similar Web APIs to support the model training of FMs, and it also considers the co-occurrence and popularity of Web APIs.
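The prediction step of the PMF baseline, \( \hat{r}_{ij} = A_{i}^{T} M_{j} \), reduces to a dot product over the learned latent factors, or a single matrix product for all pairs at once. A minimal sketch with randomly initialized (untrained, purely hypothetical) factor matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4                            # latent dimension (illustrative)
A = rng.normal(size=(5, k))      # latent factors of 5 Web APIs (hypothetical)
M = rng.normal(size=(3, k))      # latent factors of 3 Mashups (hypothetical)

# Predicted score that API i is invoked by Mashup j: r̂_ij = A_i · M_j.
# Computing A @ M.T yields the full 5 x 3 prediction matrix at once.
R_hat = A @ M.T
```

In a real PMF pipeline, `A` and `M` would be fitted to the observed invocation matrix by minimizing the regularized squared reconstruction error before this prediction step.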

3.4 Experimental Results

  (1) Recommendation Performance Comparison

Table 1 reports the MAE and RMSE of the compared recommendation methods. Our HDP-FMs greatly outperforms WPCC and MPCC, and consistently surpasses PMF and LDA-FMs. The reason is that HDP-FMs first uses HDP to derive the topics of Mashups and Web APIs, identifying more similar Mashups and similar Web APIs, and then exploits FMs to train on richer information, achieving more accurate prediction of Web API probability scores. Moreover, as the number of given score values increases from 10 to 30 and the training matrix density from 10% to 30%, the MAE and RMSE of HDP-FMs clearly decrease: more score values and a denser Mashup-Web API matrix yield better prediction accuracy.

Table 1. The MAE and RMSE performance comparison of multiple recommendation approaches
  (2) HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers

As noted above, HDP automatically finds the optimal number of topics, rather than requiring repeated model training as LDA does. We compare the performance of HDP-FMs with that of LDA-FMs under different topic numbers. In the experiment, we set the topic number of LDA-FMs to 3, 6, 12, and 24, denoted LDA-FMs-3/6/12/24, respectively. Figures 3 and 4 show the MAE and RMSE, respectively, at a training matrix density of 10%. The results indicate that HDP-FMs performs best, with LDA-FMs-12 coming closest. When the topic number becomes smaller (LDA-FMs-3, LDA-FMs-6) or larger (LDA-FMs-24), the performance of LDA-FMs consistently degrades. These observations verify that HDP-FMs outperforms LDA-FMs because it automatically obtains the optimal number of topics.

Fig. 3. The MAE of HDP-FMs and LDA-FMs

Fig. 4. The RMSE of HDP-FMs and LDA-FMs

  (3) Impacts of Top-A and Top-M in HDP-FMs

As described in Sect. 2.2.2, we use the Top-A similar Web APIs and Top-M similar Mashups derived from HDP as input variables to train the FMs for predicting the probability of Web APIs being invoked by Mashups. In this section, we investigate the impacts of Top-A and Top-M to find their optimal values. When varying one parameter, we fix the other at its best value, i.e. M = 10 when varying A, and A = 5 when varying M. Figures 5 and 6 show the MAE of HDP-FMs at a training matrix density of 10% and a given number of 30. Figure 5 indicates that the MAE of HDP-FMs is optimal at A = 5; as A increases from 5 to 25, the MAE constantly increases. Figure 6 shows that the MAE of HDP-FMs reaches its minimum at M = 10; as M decreases below 10 or increases above 10, the MAE consistently rises. These observations show that it is important to choose appropriate values of A and M in the HDP-FMs method.

Fig. 5. Impact of Top-A in HDP-FMs

Fig. 6. Impact of Top-M in HDP-FMs

4 Related Work

Service recommendation has become a hot topic in service-oriented computing. Traditional service recommendation addresses the quality of Mashup services to achieve high-quality recommendation. For instance, Picozzi [16] showed that the quality of single services can drive the production of recommendations, while Cappiello [17] analyzed the quality properties of Mashup components (APIs) and discussed information quality in Mashups [18]. Besides, collaborative filtering (CF) has been widely used in QoS-based service recommendation [15]: it calculates the similarity of users or services, predicts missing QoS values based on the QoS records of similar users or services, and recommends high-quality services to users.

According to existing results [19, 20], data sparsity and the long-tail problem lead to inaccurate and incomplete search results. To solve this problem, some researchers exploit matrix factorization to decompose historical QoS invocations or Mashup-Web API interactions for service recommendation [21, 22]. Zheng et al. [22] proposed a collaborative QoS prediction approach in which a neighborhood-integrated matrix factorization model is designed for personalized Web service QoS value prediction. Xu et al. [7] presented a novel social-aware service recommendation approach in which multi-dimensional social relationships among potential users, topics, Mashups, and services are described by a coupled matrix model. These methods focus on converting the QoS or Mashup-Web API rating matrix into lower-dimensional feature matrices and predicting unknown QoS values or the probability of Web APIs being invoked by Mashups.

Since matrix factorization relies on rich records of historical interactions, recent works have incorporated additional information into matrix factorization for more accurate service recommendation [4, 8,9,10]. Ma et al. [9] combined matrix factorization with geographical and social influence to recommend points of interest. Chen et al. [10] used location information and QoS of Web services to cluster users and services and made personalized service recommendations. Yao et al. [8] investigated the historical invocation relations between Web APIs and Mashups to infer implicit functional correlations among Web APIs, and incorporated these correlations into a matrix factorization model to improve service recommendation. Liu et al. [4] proposed collaborative topic regression, which combines probabilistic matrix factorization with probabilistic topic modeling, for recommending Web APIs.

The matrix factorization based methods above definitely boost the performance of service recommendation. However, few of them exploit the historical invocations between Mashups and Web APIs to derive latent topics, and none use FMs to train on these latent topics to predict the probability of Web APIs being invoked by Mashups. Motivated by these approaches, we integrate HDP and FMs to recommend Web APIs for Mashup development: we use the HDP model to derive latent topics from the description documents of Mashups and Web APIs to support the model training of FMs, and we exploit FMs to predict the probability of Web APIs being invoked by Mashups and to recommend high-quality Web APIs for Mashup development.

5 Conclusions and Future Work

This paper proposes a Web API recommendation approach for Mashup development based on HDP and FMs. The historical invocations between Mashups and Web APIs are modeled by HDP to derive their latent topics. FMs are then used to train on these latent topics, model multiple kinds of input information and their interactions, and predict the probability of Web APIs being invoked by Mashups. Comparative experiments on a ProgrammableWeb dataset demonstrate the effectiveness of the proposed method and show that it significantly improves the accuracy of Web API recommendation. In future work, we will investigate more useful, related latent factors and integrate them into our model for more accurate Web API recommendation.