1 Introduction

The aim of ubiquitous computing is to build personalized applications based on the context surrounding users in order to support everyday activities. However, an obstacle to achieving this goal is the Web’s information overload, which imposes a high cognitive load on users, especially when they look for topics of interest [11]. Recommender systems (RS) arose in response to this challenge. This research field still lacks a single way to represent collective intelligence and a standard for experimenting with its algorithms [4, 13]. In fact, there is no tool available to RS researchers for defining and testing algorithms independently of the recommendation domain. In recent years, however, there has been progress toward integrating this collective intelligence, in the form of new tools and frameworks. In this scenario, this paper proposes RBox as a more complete alternative.

2 Related Works

Over the last few years, few tools or frameworks have been aimed at creating or researching recommendation algorithms; others, such as IDEs that can generate code or components as RBox does, have also contributed. In short, RBox is software for creating recommender algorithms for Web 2.0 based on an event-driven approach.

We have compared twelve tools: (1) the C/Matlab Toolkit [9]; (2) Weka (footnote 1), which contains a popular collection of machine learning algorithms; the Apache Mahout project (footnote 2), which provides an open-source platform for collaborative filtering recommender systems; (3) RACOFI [10], a platform for multidimensional recommendation scores based on collaborative filtering; (4) SVDFeature (footnote 3), for recommender systems with collaborative filtering based on matrix factorization; (5) Crab (footnote 4), a Python framework for building recommendation engines based on collaborative filtering, intended to be integrated with scientific Python packages; (6) SUGGEST (footnote 5), which builds recommender systems for production environments but offers no experimentation features; (7) EasyRec (footnote 6), which provides only a service to be used in Web applications; (8) RecommenderLab [8], a framework for developing and testing algorithms aimed at e-commerce and highly dependent on context; (9) MyMediaLite [6], a library of APIs for experimenting with and using algorithms based on common recommendation scenarios; (10) LensKit [5], a platform for investigating recommender systems based on collaborative filtering; and (11) AIBench [7], an IDE and framework that speeds up the coding, implementation, testing, and optimization of artificial intelligence techniques.

Special mention is given to the twelfth tool, Synergy [14]. It proposes a data schema for working with generic collaborative filtering RS. Its key idea is to replace the matrix or tensor representation of the data used by most current algorithms with events such as tagging content, expressing opinions, and voting, among others. In general, none of these 12 tools can generate components that can be reused in other systems in a plug-in format, as RBox does.

3 RBox Basics

The field of recommender systems has experienced dramatic growth [1, 4], mainly due to the worldwide increase of social networks and electronic commerce, which are the basis of this kind of system. This has led to the research, development, and implementation of specific algorithms that extract collective intelligence [3] from specific domains in order to provide recommendations in those contexts.

As a result, two direct consequences can be observed. The first is that, due to the diversity of domains, the datasets belonging to each of them have specific representations. The second, which follows from the first, is that given this diversity of data representation schemes, there is no tool available to researchers in this area for defining and testing algorithms independently of their recommendation domain of origin. In recent years, however, there has been progress toward integrating this collective intelligence, which in practice has meant progress in developing tools and frameworks.

3.1 Design Issues

RBox is a desktop application for researchers that allows them to test and compare the results of different collaborative filtering recommendation algorithms during an investigation. It supports populating the dataset for experimentation, defining algorithms, parameterizing algorithms, running tests, and releasing plug-ins that package algorithms with their associated settings. Moreover, RBox is designed around the plug-in concept [12], using the java.util.ServiceLoader class included in Java 6. Under this model, RBox defines the software parts that let users supply both the algorithms and the precision metrics as interchangeable components used at runtime in the experiments. Using this scheme, RBox covers three key elements of RS research. First, it can define similarity algorithms, which compare the proximity of elements (e.g., users or items) in order to generate a list to recommend. Second, it can define the recommendation algorithms themselves. Finally, it can define evaluation metrics, which allow users to evaluate recommendation algorithms by means of error metrics and classification metrics. The first group (error metrics) requires algorithms to work with a type of event that has an associated numerical value (a rating). The second group (classification metrics) is more generic. Experimenters must implement the metric type they need according to the algorithm to be evaluated.
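
The plug-in mechanism described above can be illustrated with a minimal sketch built on java.util.ServiceLoader. The interface SimilarityAlgorithm, the class CosineSimilarity, and the method discover() are hypothetical names introduced here for illustration; they are assumptions and do not reproduce the actual RBox API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class PluginSketch {

    /** Hypothetical plug-in contract (an assumption, not RBox's real interface). */
    public interface SimilarityAlgorithm {
        String name();
        double similarity(double[] a, double[] b);
    }

    /** Example provider: cosine similarity between two rating vectors. */
    public static class CosineSimilarity implements SimilarityAlgorithm {
        @Override
        public String name() {
            return "cosine";
        }

        @Override
        public double similarity(double[] a, double[] b) {
            if (a.length != b.length) {
                throw new IllegalArgumentException("vectors must have equal length");
            }
            double dot = 0.0, normA = 0.0, normB = 0.0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            // A zero vector has no direction; report no similarity.
            if (normA == 0.0 || normB == 0.0) {
                return 0.0;
            }
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }
    }

    /**
     * Discovers every provider registered on the classpath. A provider is
     * registered by listing its class name in a resource file named
     * META-INF/services/PluginSketch$SimilarityAlgorithm.
     */
    public static List<SimilarityAlgorithm> discover() {
        List<SimilarityAlgorithm> found = new ArrayList<SimilarityAlgorithm>();
        for (SimilarityAlgorithm algorithm : ServiceLoader.load(SimilarityAlgorithm.class)) {
            found.add(algorithm);
        }
        return found;
    }

    public static void main(String[] args) {
        List<SimilarityAlgorithm> algorithms = discover();
        if (algorithms.isEmpty()) {
            // No provider file on the classpath: fall back to a direct instance.
            algorithms.add(new CosineSimilarity());
        }
        double[] u1 = {5, 3, 0};
        double[] u2 = {4, 0, 1};
        for (SimilarityAlgorithm algorithm : algorithms) {
            System.out.println(algorithm.name() + ": " + algorithm.similarity(u1, u2));
        }
    }
}
```

Because the host application depends only on the interface, a new similarity algorithm could be shipped as a separate JAR with its own provider file and picked up without recompiling the host.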

To allow the algorithms to operate regardless of data domain, a generic data model is used; it is event-driven [14] and allows data to be mapped from the original datasets.

3.2 Generic Data Model

The event-driven approach in RBox means that interactions are associated with a value that depends on the type of event concerned, as in Synergy [14]. In the case of collaborative tagging, the value is the tag itself. In the case of a rating, it is an integer that indicates the degree to which the user accepts the reviewed item. This approach arose from observing that interactions between users and content can be varied, and that there is no single way to generically evaluate different recommendation algorithms for a specific context. Thus, treating all interactions uniformly removes dependencies on any specific interaction dimension. In this model (Fig. 1), U is the set of users; C is the set of items; I is the set of possible interactions (types of interaction); E is the set of events; and V is the vector of values associated with an interaction. Using elementary set theory:

Fig. 1. ERD of data model (Source: [14], p. 4)

$$
\begin{aligned}
U &= \{u_1, u_2, \ldots, u_n\} \\
C &= \{c_1, c_2, \ldots, c_m\} \\
I &= \{i_1, i_2, \ldots, i_n\} \\
E &= \{(s, p, o, t) \mid s \in U,\ p \in I,\ o \in C,\ t \text{ is time}\} \\
V &= \{(e, a) \mid e \in E,\ a \text{ is some value}\}
\end{aligned}
$$

Each event is generated by a user, the Subject. Items (Objects) are what is recommended to the Subject. Each event has a type, which identifies the kind of interaction between the subject and the object. In addition, the interaction generates some value, so the event is associated with one or more values. For example, when a user rates a film, the event carries a numerical value within a given range (e.g., 0–1, 1–3, 1–5).
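
As an illustration of this model, the following minimal sketch represents an event as a (subject, interaction, object, time) tuple with its associated values, and treats ratings and tags uniformly. The class and method names (Event, numericValues, sampleEvents) and the sample identifiers are assumptions made for this example; they are not taken from RBox or Synergy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EventModelSketch {

    /** One element of E, (s, p, o, t), together with its values from V. */
    public static final class Event {
        public final String subject;      // s in U: the user who generated the event
        public final String interaction;  // p in I: the type of interaction
        public final String object;       // o in C: the item acted upon
        public final long time;           // t: when the event happened
        public final List<Object> values; // one or more values attached to the event

        public Event(String subject, String interaction, String object,
                     long time, Object... values) {
            this.subject = subject;
            this.interaction = interaction;
            this.object = object;
            this.time = time;
            this.values = Collections.unmodifiableList(
                    new ArrayList<Object>(Arrays.asList(values)));
        }
    }

    /** Collects the numeric values of all events of one interaction type. */
    public static List<Double> numericValues(List<Event> events, String interaction) {
        List<Double> result = new ArrayList<Double>();
        for (Event event : events) {
            if (!event.interaction.equals(interaction)) {
                continue;
            }
            for (Object value : event.values) {
                if (value instanceof Number) {
                    result.add(((Number) value).doubleValue());
                }
            }
        }
        return result;
    }

    /** Hypothetical sample data: two ratings and one tag on the same item. */
    public static List<Event> sampleEvents() {
        List<Event> events = new ArrayList<Event>();
        events.add(new Event("u1", "rating", "c1", 1000L, 4));
        events.add(new Event("u1", "tagging", "c1", 1001L, "sci-fi"));
        events.add(new Event("u2", "rating", "c1", 1002L, 5));
        return events;
    }

    public static void main(String[] args) {
        // Ratings and tags share one representation; only the type and value differ.
        System.out.println(numericValues(sampleEvents(), "rating")); // prints [4.0, 5.0]
    }
}
```

An algorithm written against such an event list does not need to know whether the events originally came from a movie rating matrix or a tagging log.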

4 Comparing RBox

Eight criteria have been established for benchmarking the reviewed tools against RBox. They characterize solutions to the problems of domain diversity and the lack of research environments for experimentation in RS. Table 1 summarizes the comparison.

Table 1. Comparison between RS tools and frameworks.

The criteria are: (1) Domain mappability is the ability of the tool to take recommendation algorithms defined for a specific domain and map them to a different one (e.g., from books to movies); (2) Experimentability is the feature that allows the tool to be used to evaluate and experiment with different algorithms by running tests, whose results make it possible to evaluate and compare the algorithms; (3) Maintainability refers to how easily the software can be corrected when flaws are found or extended with new features. Generally, this criterion should be supported by well-known design patterns, frameworks, and well-documented standard programming languages; (4) Flexibility (or extensibility) allows the use of various recommendation algorithms, even when new attributes or characteristics of the user’s interaction with items are included; (5) Configurability means adapting the operation of the software through its parameters, so that different algorithms can be configured according to the researcher’s particular needs; (6) Recommender as outcome means that the tool can generate plug-in components that can be reused both in an experimentation environment and in a real-time production environment; (7) Open Source means that the software code is available under open-source licenses; (8) Recommendation-oriented means that the tool is restricted to the construction of recommender algorithms instead of being a general-purpose tool.

The reader may note that mappability between domains applies to most of the tools. However, except for RBox, mappability in these cases is achieved through programming effort against APIs or frameworks, without guidance. In RBox, by contrast, there is a standardized way of mapping via a mapper. This component gathers data from a social network through a crawling process and translates it into the single event schema used by RBox. With this scheme, user interactions are managed as generic events, which enables algorithms to be evaluated independently of the domain.
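
The role of the mapper can be sketched as follows. This is only an illustration under stated assumptions: the Mapper interface, the two mapper classes, the BookTag record, and the comma-separated input format are all hypothetical, and the crawling step is left out. The sketch shows how records from two different domains could be translated into one generic event shape.

```java
public class MapperSketch {

    /** Generic event: subject, interaction type, object, time, and one value. */
    public static final class Event {
        public final String subject;
        public final String interaction;
        public final String object;
        public final long time;
        public final Object value;

        public Event(String subject, String interaction, String object,
                     long time, Object value) {
            this.subject = subject;
            this.interaction = interaction;
            this.object = object;
            this.time = time;
            this.value = value;
        }
    }

    /** Hypothetical mapper contract: one domain-specific record in, one event out. */
    public interface Mapper<T> {
        Event toEvent(T record);
    }

    /** Maps an assumed "user,movie,rating,timestamp" text line to a rating event. */
    public static final class MovieRatingMapper implements Mapper<String> {
        @Override
        public Event toEvent(String line) {
            String[] fields = line.split(",");
            if (fields.length != 4) {
                throw new IllegalArgumentException("expected 4 fields: " + line);
            }
            return new Event(fields[0].trim(), "rating", fields[1].trim(),
                    Long.parseLong(fields[3].trim()),
                    Double.valueOf(fields[2].trim()));
        }
    }

    /** Assumed record shape for a book-tagging site. */
    public static final class BookTag {
        public final String reader;
        public final String isbn;
        public final String tag;
        public final long taggedAt;

        public BookTag(String reader, String isbn, String tag, long taggedAt) {
            this.reader = reader;
            this.isbn = isbn;
            this.tag = tag;
            this.taggedAt = taggedAt;
        }
    }

    /** Maps a book tag to a tagging event whose value is the tag itself. */
    public static final class BookTagMapper implements Mapper<BookTag> {
        @Override
        public Event toEvent(BookTag record) {
            return new Event(record.reader, "tagging", record.isbn,
                    record.taggedAt, record.tag);
        }
    }

    public static void main(String[] args) {
        Event rating = new MovieRatingMapper().toEvent("u7, m42, 4.5, 1350000000");
        Event tag = new BookTagMapper().toEvent(new BookTag("r1", "isbn-1", "classic", 10L));
        System.out.println(rating.subject + " " + rating.interaction + " " + rating.object);
        System.out.println(tag.subject + " " + tag.interaction + " " + tag.object);
    }
}
```

In this arrangement, supporting a new domain means writing one more mapper, while the algorithms that consume events stay unchanged.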

5 Conclusions and Future Work

RBox puts forward an approach that makes a significant leap towards the aggregation of RS domains that are usually scattered. In this way, users can reuse recommendation algorithms across domains, experiment with new algorithms, and generate a plug-and-play version of these algorithms to be built into real domains. Thus, RBox users can focus their time and effort mainly on research, relieved of creating specific algorithms for particular domains. However, the new generation of recommender systems, the so-called Context-Aware Recommenders [2], requires further awareness or contextual information that is not provided by the current version of RBox. In fact, ubiquity requires more adaptive and personalized recommendations. Researchers are now combining ubiquitous computing and recommender systems in so-called ubiquitous recommendation systems [11], which will be the aim of the new version of RBox.