1 Introduction

Consumer behaviors and habits have changed drastically since the 1950s [21]. Today, people buy more, produce more garbage, and demand more [12]. Overconsumption is becoming a major concern and is one of the biggest obstacles to a sustainable world. ‘Fast fashion’, the clothing equivalent of fast food, has become a widespread phenomenon, and people continue to purchase large quantities of clothes. Because of this, textile recycling has become a pressing issue. Textile recycling has a long history, but it was earlier mostly concerned with economic benefits. Today, the environment has become the primary motivation for textile recycling. For every pound of recycled textile, more greenhouse gas emissions are prevented than for every pound of glass, plastic, and paper combined [18]. Moreover, Klepp and Laitala found that 20% of the clothes in Norwegians’ wardrobes were never or rarely used [8]. Consequently, clothing retailers are now starting to offer checkpoints where people can recycle their clothes in order to improve environmental sustainability.

It is important that people have incentives to use these recycling checkpoints. Such incentives can be generated by recommender systems and Internet of Things technology. Traditionally, recommender systems try to predict items that might be of interest to the users. A popular technique for recommender systems is known as content-based filtering [14]. A content-based recommender builds a user profile from the properties of items that the user has shown an interest in in the past, and then computes the similarity between this profile and items that the user has not yet seen. In previous studies, content-based recommender systems enabled with semantic web technologies have shown promising results by increasing the accuracy of the recommendations [4]. Linked Open Data (LOD) is a semantic web technology that forms a set of rules for publishing data so that the data become machine-readable and free to use for anyone [1].

In [9], we proposed a system called Connected Closet, a smart closet where clothing items enabled with radio-frequency identification (RFID) tags can be scanned in the user’s closet, tracking the usage history of the clothes. The system’s mobile application leverages data collected from the smart closet to guide users in their everyday lives and makes the users’ wardrobes explorable on a mobile device. In [10], we proposed an approach for recommending daily outfits to the users. In this paper, we propose a semantic content-based recommender system that leverages a set of context signals obtained from the system’s architecture to provide recycling suggestions to users of the system. Utilizing semantic web technology improves the recommender system’s accuracy. Moreover, it can improve the system’s transparency and increase the user’s trust and confidence in the system.

This paper is an extended version of [11]. We extend our earlier work by giving a more extensive description of the system architecture. Furthermore, we extend our experiments to include an additional LOD Knowledge Base for comparison. We also discuss additional related work.

1.1 Contributions

The main contributions of this paper may be summarized as follows:

  1. We propose a content-based recommender system that utilizes LOD to recommend clothing items to be recycled from a smart closet.

  2. We evaluate our proposed recommender system on a real-world dataset and compare it to a baseline that does not utilize semantic web technology.

The remainder of this paper is structured as follows. In Section 2, we describe the architecture of the system. Then, we describe our recommendation algorithm in Section 3. We evaluate our proposed approach in Section 4 and discuss related work in Section 5. We discuss the insights obtained in Section 6, and conclude with a summary and future work in Section 7.

2 System architecture

2.1 Contextual models

In this section, we describe the contextual signals obtained from the proposed system’s architecture depicted in Fig. 1. We describe how they are obtained and how they affect the recommendation algorithm for recycling recommendations. Figure 1 is a high-level overview of the system’s architecture, and more detailed explanations of the services running in the cloud are given in Section 2.1.4.

Fig. 1 Architecture for a sustainable wardrobe. The figure indicates where in the system the contextual signals are obtained

2.1.1 Usage-aware

The user’s clothing items are enabled with RFID tags. These tags can be manually scanned through an RFID reader connected to a tiny computer embedded in the user’s Closet. The computer will broadcast a message containing information about the scan to a set of services deployed in the cloud. Each scan is added to the set S.

Clothing items that are often checked out of the user’s closet will achieve a higher user rating. More formally, based on item usage, we calculate the user’s rating of a clothing item as follows:

$$ \hat{r}_{\mu}(u,i) = \lfloor{\frac{1}{2}|S_{u,i}|}\rfloor, $$
(1)

where \(S_{u,i}\) is the set of all scans of clothing item i done by user u. The cardinality of the set \(S_{u,i}\) is divided by 2 in order to exclude insertion scans from the rating.
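The rating in Eq. 1 can be sketched as follows. This is a minimal illustration, assuming that scans are available as (user, item) pairs; the function name and the sample data are hypothetical and not part of the described system.

```python
from collections import Counter


def usage_ratings(scans):
    """Compute the usage rating of Eq. 1 for every (user, item) pair.

    `scans` is an iterable of (user_id, item_id) tuples, one per RFID scan
    (an assumed representation of the scan set S).
    """
    counts = Counter(scans)
    # Floor division by 2 discounts insertion scans, as in Eq. 1.
    return {pair: n // 2 for pair, n in counts.items()}


# Hypothetical scans: the coat was scanned three times, the shirt once.
scans = [("u1", "coat"), ("u1", "coat"), ("u1", "coat"), ("u1", "shirt")]
ratings = usage_ratings(scans)
```

With these sample scans, the coat receives a rating of 1 and the shirt a rating of 0, since a single scan does not yet amount to a completed use.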

2.1.2 Taste-aware

In the Mobile Application, users can save their favorite outfits. An outfit is represented as a tuple (one top and one bottom). The user’s favorite outfits are added to the set \(O_{u}\). Items that occur in many outfit combinations will achieve a high rating. More formally, the user rating based on the user’s favorite outfits is defined as follows:

$$ \hat{r}_{\tau}(u,i) = |\{ (j,k) \in O_{u} \mid j = i \lor k = i \} | $$
(2)
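As an illustration of Eq. 2, the taste rating can be computed by counting the favorite outfits that contain a given item. The sketch below assumes outfits are stored as (top, bottom) tuples; the names and sample data are hypothetical.

```python
def taste_rating(favorite_outfits, item):
    """Eq. 2: number of favorite outfits (top, bottom) that contain `item`."""
    # set() mirrors the definition of O_u as a set, so duplicates count once.
    return sum(1 for top, bottom in set(favorite_outfits) if item in (top, bottom))


# Hypothetical favorite outfits of one user.
outfits = [("tee", "jeans"), ("tee", "skirt"), ("shirt", "jeans")]
tee_rating = taste_rating(outfits, "tee")
skirt_rating = taste_rating(outfits, "skirt")
coat_rating = taste_rating(outfits, "coat")
```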

2.1.3 Season-aware

The usage pattern of some items might only occur during a season, e.g., a winter coat will only be used in the winter. Such seasonal clothing items are assigned to a season {winter|spring|summer|fall}. If a seasonal clothing item is recommended for recycling, we check if the item was used during the last assigned season. This is done by looking up the latest item scans in S. If the item was used during the season, the item is removed from the recommended list and not displayed to the user.
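The season-aware filtering step could look roughly like the sketch below. It makes two simplifying assumptions that are not stated in the text: seasons are mapped to calendar months (Northern Hemisphere, meteorological seasons), and "the last assigned season" is approximated by looking at scans from the past 365 days that fall in the months of the item's season. All names are hypothetical.

```python
from datetime import date, timedelta

# Assumed month-to-season mapping (meteorological seasons, Northern Hemisphere).
SEASON_MONTHS = {
    "winter": {12, 1, 2},
    "spring": {3, 4, 5},
    "summer": {6, 7, 8},
    "fall": {9, 10, 11},
}


def used_in_last_season(scan_dates, season, today):
    """True if any scan in the past 365 days falls in the item's season."""
    window_start = today - timedelta(days=365)
    return any(
        window_start <= d <= today and d.month in SEASON_MONTHS[season]
        for d in scan_dates
    )


def filter_recommendations(recommended, item_seasons, scans_by_item, today):
    """Drop seasonal items that were used during their last season."""
    kept = []
    for item in recommended:
        season = item_seasons.get(item)  # None for non-seasonal items
        if season and used_in_last_season(scans_by_item.get(item, []), season, today):
            continue
        kept.append(item)
    return kept


today = date(2024, 6, 1)
recommended = ["coat", "tee", "scarf"]
item_seasons = {"coat": "winter", "scarf": "winter"}
scans_by_item = {"coat": [date(2024, 1, 10)], "scarf": [date(2022, 12, 5)]}
filtered = filter_recommendations(recommended, item_seasons, scans_by_item, today)
```

In this example the coat is removed because it was scanned last winter, while the scarf stays in the list because its only scan is more than a year old.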

2.1.4 Services

In Fig. 2, we show a more detailed overview of the services that are deployed in the cloud. The two services on the left, the Inventory service and the History service, handle RFID scans and keep track of the usage history of each item. The Catalog service handles all property information about each clothing item and is responsible for the semantic enrichment of the items. Moreover, the Closet service works as an orchestration layer, handling all the network calls between the services, and is responsible for providing the mobile application with data.

Fig. 2 Detailed architecture. A view of the system’s services that are deployed in the cloud. The figure shows how the services are connected and how they are connected to external services

3 Semantic content-based recommender system

In [9], we proposed a model for recycling recommendations that recommended the lowest rated items. However, considering how the ratings are obtained from the context signals, newly bought clothing items would always be recommended for recycling when using such a model. In this paper, we propose a content-based recommender system that recommends the items that are least similar to the user profile.

3.1 Vector space model

In our recommendation approach, we adopt the Vector Space Model, where we represent each clothing item as a vector. We then use the Bag of Concepts [19] approach to create the vectors using entities from Wikidata or DBpedia, which are Knowledge Bases published as LOD. As a weighting scheme, the Concept Frequency (CF) is used. Here, the term concept refers to the LOD entities, and the weight of a concept is determined by the number of references to the entity in an item description. The process of extracting entity references is described in Section 3.2.

The user profile is represented as a set of clothing item vectors:

$$ profile(u) = \{\vec{i}\ |\ \hat{r}(u,i) > \lambda \}, $$
(3)

where \(\hat {r}(u,i)\) is an aggregate of \(\hat {r}_{\mu }(u,i)\) and \(\hat {r}_{\tau }(u,i)\), and \(\vec {i}\) is the CF vector representing clothing item i. In the current prototype, the aggregation is computed by taking the sum of \(\hat {r}_{\mu }(u,i)\) and \(\hat {r}_{\tau }(u,i)\). In future work we will experiment with different types of aggregations to see if more accurate recommendations can be achieved by giving more importance to taste than usage, or vice versa.

The user’s clothing items are then sorted in ascending order into a ranked list using the following scoring function:

$$ \bar{r}(u,i) = \frac{\sum\limits_{\vec{j} \in profile(u)} dist(\vec{i},\vec{j})}{|profile(u)|}, $$
(4)

where \(dist(\vec {i},\vec {j})\) is a distance measure between the vectors representing the clothing items i and j. In our approach we use the Euclidean distance defined as follows:

$$ dist(\mathbf{q},\mathbf{p}) = \sqrt{\sum\limits_{i = 1}^{n} (q_{i} - p_{i})^{2}}, $$
(5)

where q and p are both vectors of n dimensions.
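Eqs. 3 to 5 can be put together in a short sketch. It assumes the sum aggregation of the current prototype and represents vectors as plain lists; the function names, the threshold value, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def euclidean(q, p):
    """Eq. 5: Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))


def build_profile(vectors, usage, taste, lam):
    """Eq. 3 with sum aggregation: vectors of items whose rating exceeds lam."""
    return [
        vec
        for item, vec in vectors.items()
        if usage.get(item, 0) + taste.get(item, 0) > lam
    ]


def mean_distance(vec, profile):
    """Eq. 4: average distance from an item vector to all profile vectors."""
    if not profile:
        raise ValueError("profile is empty; no item has a rating above lambda")
    return sum(euclidean(vec, j) for j in profile) / len(profile)


# Toy CF vectors and ratings (hypothetical).
vectors = {"tee": [1, 0, 0], "jeans": [1, 1, 0], "coat": [0, 0, 3]}
usage = {"tee": 2, "jeans": 1, "coat": 0}
taste = {"tee": 2, "jeans": 2, "coat": 1}

profile = build_profile(vectors, usage, taste, lam=2)
scores = {item: mean_distance(vec, profile) for item, vec in vectors.items()}
ranked = sorted(scores, key=scores.get)  # ascending by mean distance
```

In this toy example the profile consists of the tee and the jeans, and the coat ends up with the largest mean distance. Reading the recycling candidates from the far end of the ascending list is an assumption made here to match the stated goal of recommending the items least similar to the profile.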

3.2 Semantic item representation

Figure 3 shows an excerpt of the services deployed in the cloud. When new clothing items are fed into the system, they are submitted to the Catalog service together with a free text description. To represent the clothing items as vectors using LOD entities with CF, the following process is performed on the item’s text description (Fig. 3 shows where in the system each step is performed):

(1) Entity extraction: We extract LOD entities from Wikidata using the natural language processing API TextRazor and DBpedia entities using the Dandelion API. For disambiguating entities, each entity is ranked with a confidence score based on multiple signals in the text.

(2) Weighting: We generate vectors using the CF weighting scheme from the entities returned by Step 1.

(3) Storing: The vectors are then stored in the system’s graph database called Item storage.

Fig. 3 Excerpt of the system’s services deployed in the cloud. An illustration of the process of a new item being stored in the system

At the system’s current stage, the removal of stop entities and generic entities is not addressed; it will be included in the process in future research.
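The weighting step (Step 2) can be illustrated with a small sketch that turns lists of extracted entity identifiers into CF vectors. It assumes that entity extraction has already been performed and that each item comes with a list of entity mentions; no external API is called. The function name and the mention lists are hypothetical, although the identifiers are the Wikidata entities named in Fig. 4.

```python
from collections import Counter


def cf_vectors(item_entities):
    """Build Concept Frequency vectors over a shared entity vocabulary.

    `item_entities` maps an item id to the list of entity ids mentioned in
    its description (one list entry per mention).
    """
    vocabulary = sorted({e for ents in item_entities.values() for e in ents})
    vectors = {}
    for item, ents in item_entities.items():
        counts = Counter(ents)
        # One dimension per entity; the weight is the number of mentions.
        vectors[item] = [counts[e] for e in vocabulary]
    return vocabulary, vectors


# Hypothetical mention lists: Q552230 (Miu Miu), Q232191 (sweater), Q42329 (wool).
item_entities = {
    "sweater": ["Q552230", "Q232191", "Q42329", "Q42329"],
    "scarf": ["Q42329"],
}
vocabulary, vectors = cf_vectors(item_entities)
```

Using a shared, sorted vocabulary keeps all item vectors in the same dimensions, which the distance computation in Section 3.1 requires.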

4 Experiments

To demonstrate the validity of our approach, we perform an evaluation on a dataset collected from the Web.

4.1 Dataset

In order to evaluate our approach, we desired a dataset consisting of users with outfit ratings, the users’ usage history on clothing items, and information on which clothing items have been recycled. Because our system is not yet in full-scale production and there seems to be little available data in this domain, we could only obtain a dataset addressing outfit ratings.

The dataset was collected from the social media site Polyvore. Polyvore is a site where users can create fashion outfits. Other people can rate these outfits using a ‘like’ button on a unary rating scale. This mirrors the functionality found in the smart closet’s mobile app. The collected dataset consists of 260 rated outfits composed of 158 clothing items, with 7093 users and 19287 outfit ratings.

Figure 4 shows an example of one of the items in the collected dataset. The top shows the representation of the item using the classic Bag of Words approach with a Term Frequency (TF) vector. Below is the representation of the item after it has gone through the process described in Section 3.2 when using Wikidata as a Knowledge Base. The blue node (item) represents the clothing item, while the green nodes (labels starting with ‘Q’) represent the Wikidata entities describing the clothing item. To visualize the context signals, we have included one user’s interactions with the item. In the figure, \(\hat {r}_{\mu }(u,i)\) (USAGE_COUNT) and \(\hat {r}_{\tau }(u,i)\) (OUTFIT_LIKES) are represented as relations from the user node to the item node.

Fig. 4 Representation of a sample item. Text description: “Miu Miu’s vibrant Resort ‘17 collection is inspired by the ‘90s rave scene. Knitted in a kaleidoscope of hues, this cropped sweater has sumptuous touches of wool, mohair and […] Dry clean.. Made in Italy..”. Given this description, Wikidata entities such as Q42329 (wool), Q232191 (sweater), and Q552230 (Miu Miu) are extracted

4.2 Evaluation method

Due to the nature of the dataset, we can only consider the ratings obtained by the Taste-Aware context signal. For this reason, the other signals are neglected in this experiment. Moreover, the dataset does not contain clothing items that have been recycled by the users. This means that we need to make an assumption for when a clothing item is relevant for recycling. In this experiment, a clothing item that is relevant for recycling is a clothing item that occurs only once in the user’s favorite outfits, i.e., \(\hat {r}_{\tau }(u,i) = 1\).

4.2.1 Evaluation protocol

We evaluate all the users in the dataset who have at least one item i such that \(\hat {r}_{\tau }(u,i) > 3\). Moreover, we set λ = 2 in Eq. 3. We then assume that each user only owns items that occur at least once in their favorite outfits, i.e., \(\hat {r}_{\tau }(u,i) > 0\). For these items, we generate a recommended list for each user using Eq. 4.
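The selection rules of the protocol, together with the relevance assumption from Section 4.2, can be summarized in a brief sketch. It assumes the taste ratings are available as a nested dictionary; the function name, parameter names, and sample data are hypothetical.

```python
def evaluation_sets(taste, lam=2, min_top_rating=3):
    """Derive per-user evaluation sets from taste ratings.

    `taste` maps user -> {item: taste rating}. Users without any item rated
    above `min_top_rating` are skipped.
    """
    result = {}
    for user, ratings in taste.items():
        if not any(r > min_top_rating for r in ratings.values()):
            continue
        owned = {i for i, r in ratings.items() if r > 0}
        result[user] = {
            "owned": owned,                                      # rating > 0
            "profile": {i for i in owned if ratings[i] > lam},   # Eq. 3
            "relevant": {i for i in owned if ratings[i] == 1},   # recycling candidates
        }
    return result


# Hypothetical ratings: only u1 has an item rated above 3.
taste = {
    "u1": {"tee": 4, "jeans": 3, "coat": 1, "hat": 0},
    "u2": {"tee": 2, "coat": 1},
}
sets = evaluation_sets(taste)
```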

4.2.2 Evaluation metrics

To assess the quality of the recommendations we apply the traditional evaluation metrics Recall and Precision defined as follows:

$$ \text{Recall} = \frac{tp}{tp+fn} $$
(6)
$$ \text{Precision} = \frac{tp}{tp+fp}, $$
(7)

where tp is the number of relevant items that were recommended, fn is the number of relevant items that were not recommended, and fp is the number of non-relevant items that were recommended. We report Recall@N and Precision@N, which are the Recall and Precision in a ranked list when considering only the first N items.
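Recall@N and Precision@N from Eqs. 6 and 7 can be computed as in the following sketch. It assumes the ranked list is ordered with the strongest recycling candidates first and contains no duplicates; the function name and the sample data are hypothetical.

```python
def precision_recall_at_n(ranked, relevant, n):
    """Return (Precision@N, Recall@N) for a ranked list of unique items."""
    top_n = ranked[:n]
    tp = sum(1 for item in top_n if item in relevant)  # relevant and recommended
    fp = len(top_n) - tp                               # recommended, not relevant
    fn = len(relevant) - tp                            # relevant, not recommended
    precision = tp / (tp + fp) if top_n else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall


# Hypothetical ranked list and set of relevant items.
ranked = ["coat", "tee", "scarf", "jeans", "hat", "belt"]
relevant = {"coat", "scarf", "belt"}
precision, recall = precision_recall_at_n(ranked, relevant, n=5)
```

Here two of the five recommended items are relevant, and one relevant item (the belt) falls outside the top five, giving a Precision@5 of 2/5 and a Recall@5 of 2/3.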

4.2.3 Baseline method

As a baseline, we use the classic Bag of Words approach and represent the free text descriptions of the clothing items as TF vectors as opposed to CF in our proposed approach.
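For comparison with the CF vectors, a TF representation of a description can be sketched as below. The tokenization (lowercasing and keeping alphabetic runs) is an assumption made for illustration, since the exact preprocessing used for the baseline is not specified; the sample text is a shortened, hypothetical description.

```python
import re
from collections import Counter


def tf_vector(description):
    """Bag of Words: count how often each lowercase word occurs."""
    tokens = re.findall(r"[a-z]+", description.lower())
    return Counter(tokens)


tf = tf_vector("Miu Miu's vibrant cropped sweater. Dry clean.")
```

In this representation the brand name is split into two occurrences of the token ‘miu’, and a word such as ‘clean’ receives the same weight as ‘sweater’.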

4.3 Results

We report the results in Fig. 5. For the evaluation metrics, we focused on N = 5, since our system will display 5 recycling recommendations to the user. From the figure, we note that our proposed approach using the CF weighting scheme outperforms the baseline on both metrics. Moreover, our approach performed best when using LOD entities extracted from Wikidata.

Fig. 5 Experiment results of Recall@5 and Precision@5 for the baseline (TF) and our proposed approach (CF). The reported results are the average of Recall@5 and Precision@5 for each user evaluated in the experiment

5 Related work

With the advancements in IoT research and applications, the development of smart homes has gained pace. Smart home applications usually include many components, from more general ones (e.g. energy saving components) to more personalized ones (e.g. smart closets), and the aim is to provide more comfortable, secure, and environmentally friendly living conditions.

Recommender systems aim to deliver the most suitable items to the users based on personalized user profiles. Since recommender systems can be applied to almost any domain where personalization takes place, they are well suited to the smart home domain. As explained in detail in previous sections, this work combines a different recommender system approach with the smart closet application.

In the following subsections, we first review related work on smart closets combined with recommender systems, and then related work on the recommendation approach we have proposed.

Smart closets and fashion recommenders

Previous works on smart closets, such as [6, 25], have proposed similar architectures with RFID-enabled clothing items that are leveraged to generate outfit recommendations. Some other earlier works, including [13, 23], also leverage RFID technology, as this paper does, but their applications are limited to inventory overview. In this paper, we mainly focus on leveraging our proposed architecture to provide textile recycling suggestions.

Fashion recommender systems aim to help people with their daily clothing choices. They can be used to help elderly people as part of assisted living systems, or simply to make life easier for everyone. Fashion recommender systems can use many different methods and different properties of items and users.

In [7], the authors propose a recommender system based on fashion photographs. The system recommends tops or bottoms when a bottom or top is given. Topic modeling is used to learn the relations between items. Even though the basic idea of recommending matching outfits is similar to our work, the methods and the baseline used are different.

Shen et al. [22] propose a different way of recommending fashion items: scenario-oriented fashion recommendation. It aims to help people with outfit recommendations based on their style and the occasion for which the outfit is intended. In that work, semantic methods were used to match the outfits with the style of the user, in addition to the proposed “occasion network” model, which brings the scenario into the system.

Some fashion recommender systems incorporate data from third-party sources such as real-time weather. In [16], a mobile smart wardrobe system called DressRoom is proposed. With this application, users can match their own outfits and enter descriptions, receive recommendations based on location and real-time weather information, and share their choices with their friends. The recommender component of that work uses the Analytic Hierarchy Process (AHP). In [20], the authors propose an intelligent wardrobe for elderly people that operates based on real-time weather conditions. A simple database is used to keep track of the users’ clothes and to suggest existing items according to the weather conditions. Cheng et al. [3] also rely on the current weather conditions to generate recommendations. They propose an automatic personal wardrobe that classifies the user’s clothing items (based on photos of the items) according to weather suitability, color harmony, fabric material, and outline shape. Color and color harmony analysis play an especially important role in that work.

Another interesting approach to recommender systems in the context of smart living is described in [2]. The authors propose a model that uses case-based reasoning based on semantic data analysis (CBA-SDA) to recommend clothes and accessories to users. This model can also incorporate real-time weather data for more accurate recommendations. Semantic approaches in fashion recommender systems also include ontologies. In [24], the authors propose a fashion recommender system based on a fashion ontology. The proposed ontology includes definitions for many aspects of clothing items, including garments, materials, and colors. Rule-based inference is applied on top of this ontology, and a general-purpose personalization engine is used.

Recommendation approach

In recent years, LOD in recommender systems has been frequently researched and various applications have been proposed [4, 5]. The most common application is to calculate the semantic similarity of items based on the items’ relationships found in datasets published as LOD. Using concepts from a LOD Knowledge Base to model the user profile has shown promising results in past work, e.g., [17]. Many of these past works use DBpedia as a LOD Knowledge Base and focus on recommendation in the traditional domains where large datasets are available, such as movies, music, and books.

Our recommender system addresses a relatively unexplored domain and exploits a LOD Knowledge Base that has received little attention in previous research. To the best of our knowledge, recommending items that are no longer of interest to the user is a quite recent idea. Moreover, with the proposed approach built into the architecture of the Internet of Things wardrobe, the recommender system proposed in this paper is the first of its kind in the fashion domain.

6 Discussion

CF vs. TF

In this subsection, we highlight the advantages our proposed CF approach has over the TF baseline. These highlights should give an idea of why CF outperforms TF. A drawback of Bag of Words is that it considers all words in the text descriptions as equally important. Using the item in Fig. 4 as an example, Bag of Words represents it as an item containing the word ‘miu’ twice, when in fact ‘Miu Miu’ is the brand of the item and is of vital importance to its representation. For clothing items whose brand names consist of common terms, such as ‘Jean Shop’, the Bag of Words approach unsurprisingly does not perform well. In contrast, the Bag of Concepts approach is able to extract the ‘Miu Miu’ entity (Q552230 and http://dbpedia.org/resource/Miu_Miu) and thereby capture the semantic context of the item. Moreover, the Bag of Words approach will describe the item using words such as ‘clean’, which do not characterize the item in any way.

Wikidata vs. DBpedia

In total, using Wikidata as a Knowledge Base, we extracted 871 entities, whereas the number of extracted entities using DBpedia was 1490. This difference might indicate that the data collected using DBpedia was noisier than the data collected using Wikidata, which is a possible explanation for the lower accuracy for DBpedia.

In Table 1, we list all the LOD entities for one clothing item that was used in the experiments. This table supports our hypothesis by showing that the sample item is represented by irrelevant DBpedia entities, such as http://dbpedia.org/resource/Retail and http://dbpedia.org/resource/Eternity.

Table 1 Wikidata vs. DBpedia representation of one clothing item

Textile recycling recommendations

Recycling textile waste brings benefits in several areas, including environmental, economic, and social sustainability. Given that the technological advancements of today’s recommender systems already have a large impact on people’s pre-purchase behavior, it is reasonable to expect that they can also affect people’s post-purchase behavior. A system architecture and a recommender system as described in this paper are good candidates for influencing people’s post-purchase behavior for the social good, by recommending clothes that the system’s users can remove from their wardrobes.

7 Conclusion and future work

In this paper, we describe an Internet of Things wardrobe enabled with a proposed semantic content-based recommendation approach to recommend clothing items for recycling. We describe a set of context signals obtained from the wardrobe’s architecture and how they affect the recommendation list displayed to the user. Our evaluation shows that the approach outperforms a baseline in terms of accuracy, and that it performs best when semantically enriched with Wikidata entities. Moreover, previous research has shown that LOD can increase a recommender system’s transparency and increase the user’s trust in the system by computing convincing explanations for the recommendations [15]. The proposed approach opens up opportunities for this, which we plan to explore in later research. Hence, the approach appears to be a promising fit for the system.

Future work will also be devoted to improving the recommendation approach with a Concept Frequency - Inverse Document Frequency weighting scheme, and to developing further steps in the system for the users to act on the recommendations.