
1 Introduction and Related Work

Recommender systems (RSs) support users' decision-making by guiding them, in a personalized fashion, to a small subset of interesting products or services within massive corpora. In applications where visual factors are at play (e.g., fashion [22], food [14], or tourism [33]), customers' choices depend heavily on the visual appearance of products, which attracts attention, enhances emotions, and shapes their first impression. By incorporating this source of information when modeling users' preferences, visually-aware recommender systems (VRSs) have successfully extended the expressive power of pure collaborative recommender models [10, 12, 13, 17, 18].

Recommendation can hugely benefit from items' side information [4]. To date, several works have leveraged the high-level representational power of convolutional neural networks (CNNs) to extract item visual features, where the adopted CNN may be either pretrained on different datasets and tasks, e.g., [3, 11, 18, 26, 29], or trained end-to-end on the downstream recommendation task, e.g., [23, 38]. While the former family of VRSs builds upon a more convenient way of visually representing items (i.e., reusing the knowledge of pretrained models), such representations are not entirely in line with accurately estimating users' visual preferences. That is, CNN-extracted features cannot capture what each user enjoys about a product picture: a user might be attracted by the color and shape of a specific bag, but these aspects do not necessarily match what the pretrained CNN learned when classifying the product image as a bag.

Recently, there have been a few attempts to uncover users' personalized visual attitudes towards finer-grained item characteristics, e.g., [7,8,9, 21]. These solutions disentangle product images at (i) content level, by adopting item metadata and/or reviews [9, 31], (ii) region level, by pointing the user's interest towards parts of the image [8, 36] or video frames [7], and (iii) both content and region level [21]. It is worth mentioning that most of these approaches [7, 8, 21, 36] exploit attention mechanisms to weight the importance of the content or the region in driving the user's decisions.

Despite their superior performance, we recognize practical and conceptual limitations in adopting both content- and region-level item features, especially in the fashion domain. The former rely on additional side information (e.g., image tags or reviews), which is often hard to access and time-consuming to collect, while the latter ignore stylistic characteristics (e.g., color or texture) that can strongly influence the user's decision process [41].

Driven by these motivations, we propose a pipeline for visual recommendation that involves a set of visual features, i.e., the color, shape, and category of a fashion product, whose extraction is straightforward and always possible and which describe the items' content at a stylistic level. We use them as inputs to an attention- and neural-based visual recommender system, with the following purposes:

  • We disentangle the visual item representations at the stylistic content level (i.e., color, shape, and category) by making the attention mechanisms weight the importance of each feature on the user’s visual preference and making the neural architecture capture non-linearities in user/item interactions.

  • We reach a reasonable compromise between accuracy and beyond-accuracy performance, which we further justify through an ablation study to investigate the importance of attention (in all its configurations) on the recommendation performance. Notice that no ablation is performed on the content-style input features, as we learn to weight their contribution through the end-to-end attention network training procedure.

Fig. 1. Our proposed pipeline for visual recommendation, involving content-style item features, attention mechanisms, and a neural architecture.

2 Method

In the following, we present our visual recommendation pipeline (Fig. 1).

Preliminaries. We indicate with \(\mathcal {U}\) and \(\mathcal {I}\) the sets of users and items. Then, we adopt \(\mathbf {R}\) as the user/item interaction matrix, where \(r_{ui} \in \mathbf {R}\) is 1 for an interaction, 0 otherwise. As in latent factor models such as matrix factorization (MF) [25], we use \(\mathbf {p}_{u} \in \mathbb {R}^{1 \times h}\) and \(\mathbf {q}_{i} \in \mathbb {R}^{1 \times h}\) as user and item latent factors, respectively, where \(h \ll |\mathcal {U}|, |\mathcal {I}|\). Finally, we denote with \(\mathbf {f}_i \in \mathbb {R}^{1 \times v}\) the visual feature of item image i, usually the fully-connected layer activation of a pretrained convolutional neural network (CNN).

Content-Style Features. Let \(\mathcal {S}\) be the set of content-style features used to characterize item images. Although we adopt \(\mathcal {S} = \{\text {color}, \text {shape}, \text {category}\}\), for the sake of generality we indicate with \(\mathbf {f}_i^{s} \in \mathbb {R}^{1 \times v_s}\) the s-th content-style feature of item i. Since the features \(\mathbf {f}_i^{s}\) do not necessarily lie in the same latent space, we project them into a common latent space \(\mathbb {R}^{1 \times h}\), i.e., the same as that of \(\mathbf {p}_{u}\) and \(\mathbf {q}_{i}\). Thus, for each \(s \in \mathcal {S}\), we build an encoder function \(enc_s: \mathbb {R}^{1 \times v_s} \mapsto \mathbb {R}^{1 \times h}\), and encode the s-th content-style feature of item i as:

$$\begin{aligned} \mathbf {e}_i^s = enc_s(\mathbf {f}_i^s) \end{aligned}$$
(1)

where \(\mathbf {e}_i^s \in \mathbb {R}^{1 \times h}\), and \(enc_s\) is either trainable, e.g., a multi-layer perceptron (MLP), or handcrafted, e.g., principal-component analysis (PCA). In this work, we use an MLP-based encoder for the color feature, a CNN-based encoder for the shape, and PCA for the category.
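To make the encoding step concrete, the sketch below shows one possible PyTorch implementation of the three encoders described above. Layer widths, kernel sizes, the histogram length, and the use of ReLU activations are our illustrative assumptions, not the exact configuration used in our experiments.

```python
# A minimal sketch of the per-feature encoders enc_s (Eq. 1), assuming PyTorch.
# Layer sizes, the histogram length, and activations are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

h = 64  # common latent dimension, shared with p_u and q_i

# Trainable MLP encoder for the color feature (a flat vector, e.g., a color histogram).
color_encoder = nn.Sequential(
    nn.Linear(24, 128),  # 24 = assumed histogram length (8 bins x 3 channels)
    nn.ReLU(),
    nn.Linear(128, h),
)

# Trainable CNN encoder for the shape feature (assumed to be a single-channel texture/edge image).
shape_encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((4, 4)),
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, h),
)

# Handcrafted PCA encoder for the category feature (a pretrained-CNN activation);
# in practice it is fitted offline on the category features of the whole catalog.
category_encoder = PCA(n_components=h)

# Example usage on dummy inputs:
f_color = torch.rand(1, 24)         # f_i^color
f_shape = torch.rand(1, 1, 32, 32)  # f_i^shape
e_color = color_encoder(f_color)    # e_i^color in R^{1 x h}
e_shape = shape_encoder(f_shape)    # e_i^shape in R^{1 x h}
```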

Attention Network. We seek to produce recommendations conditioned on the visual preference of user u towards each content-style item characteristic. That is, the model should assign a different importance weight to each encoded feature \(\mathbf {e}_i^s\) based on the predicted user’s visual preference (\(\hat{r}_{u, i}\)). Inspired by previous works [7, 8, 21, 36], we use attention. Let \(ian(\cdot )\) be the function that aggregates the inputs to the attention network, \(\mathbf {p}_u\) and \(\mathbf {e}_i^s\), e.g., through element-wise multiplication. Given a user-item pair (ui), the network produces an attention weight vector \(\mathbf {a}_{u,i} = [a^{0}_{u, i}, a^{1}_{u, i}, \dots , a^{|\mathcal {S}| - 1}_{u, i}] \in \mathbb {R}^{1 \times |\mathcal {S}|}\), where \(a^{s}_{u, i}\) is calculated as:

$$\begin{aligned} a^{s}_{u, i} = \boldsymbol{\omega }_2(\boldsymbol{\omega }_1 \, ian(\mathbf {p}_u, \mathbf {e}^{s}_i) + \mathbf {b}_1) + \mathbf {b}_2 = \boldsymbol{\omega }_2(\boldsymbol{\omega }_1(\mathbf {p}_u \odot \mathbf {e}^{s}_i) + \mathbf {b}_1) + \mathbf {b}_2 \end{aligned}$$
(2)

where \(\odot \) is the Hadamard product (element-wise multiplication), while \(\boldsymbol{\omega }_{*}\) and \(\mathbf {b}_{*}\) are the weight matrices and biases of each attention layer, i.e., the network is implemented as a 2-layer MLP. Then, we normalize \(\mathbf {a}_{u, i}\) through the temperature-smoothed softmax function [20], so that \(\sum _s a_{u, i}^s = 1\), obtaining the normalized weight vector \(\boldsymbol{\alpha }_{u, i} = [\alpha ^{0}_{u, i}, \alpha ^{1}_{u, i}, \dots , \alpha ^{|\mathcal {S}| - 1}_{u, i}]\). We leverage the attention values to produce a unique, weighted stylistic representation for item i, conditioned on user u:

$$\begin{aligned} \mathbf {w}_i = \sum _{s \in \mathcal {S}} \alpha _{u, i}^s \mathbf {e}_i^s \end{aligned}$$
(3)

Finally, let \(oan(\cdot )\) be the function to aggregate the latent factor \(\mathbf {q}_i\) and the output of the attention network \(\mathbf {w}_i\) into a unique representation for item i, e.g., through addition. We calculate the final item representation \(\mathbf {q}'_i\) as:

$$\begin{aligned} \mathbf {q}'_i = oan(\mathbf {q}_i, \mathbf {w}_i) = \mathbf {q}_i + \mathbf {w}_i \end{aligned}$$
(4)
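As an illustration of Eqs. (2)-(4), the following sketch implements the attention network with the {Mult, Add} configuration adopted in this work; the attention hidden size and the temperature value are assumptions.

```python
# A minimal sketch of the attention network (Eqs. 2-4), assuming PyTorch.
# The attention hidden size and the softmax temperature are illustrative assumptions.
import torch
import torch.nn as nn

class StyleAttention(nn.Module):
    def __init__(self, h, attn_hidden=32, temperature=1.0):
        super().__init__()
        self.layer1 = nn.Linear(h, attn_hidden)  # omega_1, b_1
        self.layer2 = nn.Linear(attn_hidden, 1)  # omega_2, b_2
        self.temperature = temperature

    def forward(self, p_u, e_i, q_i):
        # p_u: (batch, h) user factors; e_i: (batch, |S|, h) encoded style features;
        # q_i: (batch, h) item factors.
        ian = p_u.unsqueeze(1) * e_i                        # Hadamard product (ian = Mult)
        a = self.layer2(self.layer1(ian))                   # unnormalized weights a_{u,i}^s (Eq. 2)
        alpha = torch.softmax(a / self.temperature, dim=1)  # temperature-smoothed softmax
        w_i = (alpha * e_i).sum(dim=1)                      # weighted style representation (Eq. 3)
        q_prime_i = q_i + w_i                               # final item representation (oan = Add, Eq. 4)
        return q_prime_i, alpha.squeeze(-1)
```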

Neural Inference. To capture non-linearities in user/item interactions, we adopt an MLP to run the prediction. Letting \(concat(\cdot )\) be the concatenation function and \(out(\cdot )\) a trainable MLP, we predict the rating \(\hat{r}_{u, i}\) for user u and item i as:

$$\begin{aligned} \hat{r}_{u, i} = out(concat(\mathbf {p}_u, \mathbf {q}'_i)) \end{aligned}$$
(5)
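One possible realization of Eq. (5) follows; the depth and width of \(out(\cdot )\) are assumptions.

```python
# A minimal sketch of the neural inference step (Eq. 5), assuming PyTorch.
# The depth and width of out(.) are illustrative assumptions.
import torch
import torch.nn as nn

h = 64
out_mlp = nn.Sequential(
    nn.Linear(2 * h, h),
    nn.ReLU(),
    nn.Linear(h, 1),
)

p_u = torch.rand(1, h)        # user latent factors
q_prime_i = torch.rand(1, h)  # attended item representation from Eq. 4
r_hat = out_mlp(torch.cat([p_u, q_prime_i], dim=-1))  # predicted preference score
```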

Objective Function and Training. We use Bayesian personalized ranking (BPR) [32]. Given a set \(\mathcal {T}\) of triples (user u, positive item p, negative item n), we seek to optimize the following objective function:

$$\begin{aligned} \max _{\boldsymbol{\Theta }} \sum _{(u, p, n) \in \mathcal {T}} \ln \sigma \left( \hat{r}_{u, p} - \hat{r}_{u, n} \right) - \lambda \Vert \boldsymbol{\Theta }\Vert ^2 \end{aligned}$$
(6)

where \(\boldsymbol{\Theta }\) and \(\lambda \) are the set of trainable weights and the regularization coefficient, respectively. We build \(\mathcal {T}\) from the training set by picking, for each randomly sampled (up) pair, a negative item n for u (i.e., an item u has not interacted with). Moreover, we adopt mini-batch Adam [24] as the optimization algorithm.
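The sketch below illustrates how the BPR objective in Eq. (6) can be minimized with mini-batch Adam; the model interface, the triple sampler, and the hyperparameter values are assumptions.

```python
# A minimal sketch of BPR training with mini-batch Adam (Eq. 6), assuming PyTorch.
# model(users, items), triple_loader, and all hyperparameter values are assumptions.
import torch
import torch.nn.functional as F

def bpr_loss(r_pos, r_neg, params, lam):
    # We minimize the negative objective: -ln sigma(r_up - r_un) + lambda * ||Theta||^2.
    ranking = -F.logsigmoid(r_pos - r_neg).mean()
    reg = lam * sum(p.pow(2).sum() for p in params)
    return ranking + reg

# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# for users, pos_items, neg_items in triple_loader:  # (u, p, n) triples sampled from T
#     r_pos = model(users, pos_items)
#     r_neg = model(users, neg_items)
#     loss = bpr_loss(r_pos, r_neg, list(model.parameters()), lam=1e-5)
#     optimizer.zero_grad()
#     loss.backward()
#     optimizer.step()
```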

3 Experiments

Datasets. We use two popular categories from the Amazon dataset [17, 28], i.e., Boys & Girls and Men. After downloading the available item images, we filter out the items and the users with fewer than 5 interactions [17, 18]. Boys & Girls contains 1,425 users, 5,019 items, and 9,213 interactions (sparsity is 0.00129), while Men contains 16,278 users, 31,750 items, and 113,106 interactions (sparsity is 0.00022). In both cases, we have, on average, \(>6\) interactions per user.
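For reference, the reported sparsity and per-user interaction figures can be reproduced from the quoted statistics (sparsity = interactions / (users × items)), as in the following check.

```python
# Sanity-check arithmetic for the reported dataset statistics.
datasets = {
    "Boys & Girls": dict(users=1_425, items=5_019, interactions=9_213),
    "Men": dict(users=16_278, items=31_750, interactions=113_106),
}
for name, d in datasets.items():
    sparsity = d["interactions"] / (d["users"] * d["items"])
    per_user = d["interactions"] / d["users"]
    print(f"{name}: sparsity={sparsity:.5f}, interactions/user={per_user:.1f}")
# Boys & Girls: sparsity=0.00129, interactions/user=6.5
# Men: sparsity=0.00022, interactions/user=6.9
```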

Feature Extraction and Encoding. Since we address a fashion recommendation task, we extract color, shape/texture, and fashion category from item images [34, 41]. Unlike previous works, we leverage such features because they are easy to extract, always accessible, and represent the content of item images at a stylistic level. We extract the color information through the 8-bin RGB color histogram, the shape/texture as done in [34], and the fashion category from a pretrained ResNet50 [6, 11, 15, 37], where “category” refers to the classification task on which the CNN is pretrained. As for feature encoding, we use a trainable MLP and CNN for color (a vector) and shape (an image), respectively. Conversely, following [30], we adopt PCA to compress the fashion category feature, which also puts it on an equal footing with the color and shape features, which do not benefit from a pretrained feature extractor.
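As a hedged sketch of the extraction step, the snippet below computes a per-channel 8-bin RGB histogram and a ResNet50 activation; whether the 8 bins are taken per channel or jointly, the exact image preprocessing, and the PCA dimensionality are assumptions, and the shape/texture extraction of [34] is not reproduced here.

```python
# A minimal sketch of the color and category feature extractors, assuming torchvision.
# Binning scheme, preprocessing, and PCA size are assumptions; shape/texture follows [34] (omitted).
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms

def color_histogram(img: Image.Image, bins: int = 8) -> np.ndarray:
    # 8-bin histogram per RGB channel, concatenated and L1-normalized (assumed variant).
    arr = np.asarray(img.convert("RGB"))
    hists = [np.histogram(arr[..., c], bins=bins, range=(0, 255))[0] for c in range(3)]
    hist = np.concatenate(hists).astype(np.float32)
    return hist / max(hist.sum(), 1.0)

# Category feature: penultimate-layer activation of an ImageNet-pretrained ResNet50.
resnet = models.resnet50(weights="IMAGENET1K_V1")
resnet.fc = torch.nn.Identity()  # drop the classification head to expose the 2048-d activation
resnet.eval()
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

with torch.no_grad():
    img = Image.new("RGB", (256, 256))                    # placeholder image
    category_feat = resnet(preprocess(img).unsqueeze(0))  # shape (1, 2048)

# The category feature is then compressed with PCA (Sect. 2), fitted on the whole catalog, e.g.:
# from sklearn.decomposition import PCA
# category_compressed = PCA(n_components=64).fit_transform(all_category_feats)
```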

Baselines. We compare our approach with pure collaborative and visual-based approaches, i.e., BPRMF [32] and NeuMF [19] for the former, and VBPR [18], DeepStyle [26], DVBPR [23], ACF [7], and VNPR [30] for the latter.

Evaluation and Reproducibility. For each user, we put the last interaction into the test set and the second-to-last into the validation set (i.e., temporal leave-one-out). Then, we measure model accuracy with the hit ratio (HR@k, the validation metric) and the normalized discounted cumulative gain (nDCG@k), as done in related works [7, 19, 39]. We also measure the fraction of the catalog covered by the recommendations (iCov@k), the expected free discovery (EFD@k) [35], and diversity with the 1’s complement of the Gini index (Gini@k) [16]. For the implementation, we use the Elliot framework [1, 2].
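Under leave-one-out evaluation, each user has a single relevant test item, so HR@k and nDCG@k reduce to the simple forms sketched below; the list-based interface is an assumption.

```python
# A minimal sketch of HR@k and nDCG@k under (temporal) leave-one-out evaluation,
# where each user has exactly one held-out relevant item.
import math

def hit_ratio_at_k(ranked_items, test_item, k):
    # 1 if the held-out item appears in the top-k list, 0 otherwise.
    return 1.0 if test_item in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, test_item, k):
    # With a single relevant item, nDCG@k reduces to 1 / log2(rank + 1).
    if test_item in ranked_items[:k]:
        rank = ranked_items.index(test_item) + 1
        return 1.0 / math.log2(rank + 1)
    return 0.0

# Both metrics are averaged over all users' top-k recommendation lists.
```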

3.1 Results

What Are the Accuracy and Beyond-Accuracy Recommendation Performances? Table 1 reports the accuracy and beyond-accuracy metrics on top-20 recommendation lists. On Amazon Boys & Girls, our solution and DeepStyle are the best and second-best models on accuracy and beyond-accuracy measures, respectively (e.g., 0.03860 vs. 0.03719 for the HR). In addition, our approach outperforms all the other baselines on novelty and diversity, covering a broader fraction of the catalog (e.g., \(iCov \simeq 90\%\)). As for Amazon Men, the proposed approach is still consistently the most accurate model, even beating BPRMF, whose accuracy is superior to that of all the other visual baselines. Considering that BPRMF covers only 0.6% of the item catalog, its superior accuracy evidently comes from recommending the most popular items [5, 27, 40]. Given that, we maintain that our solution remains competitive: it is the best on accuracy while also covering about 29% of the item catalog and supporting the discovery of new products (e.g., \(EFD = 0.01242\) is the second-best value). That is, the proposed method shows a competitive performance trade-off between accuracy and beyond-accuracy metrics.

Table 1. Accuracy and beyond-accuracy metrics on top-20 recommendation lists.
Table 2. Ablation study on different configurations of attention, ian, and oan.

How Is Performance Affected by Different Configurations of Attention, ian, and oan? Following [8, 21], we explore three aggregations for the inputs of the attention network (ian), i.e., element-wise multiplication, element-wise addition, and concatenation, and two aggregations for the output of the attention network (oan), i.e., element-wise addition and multiplication. Table 2 reports the HR, i.e., the validation metric, and the iCov, i.e., a beyond-accuracy metric. No ablation study is run on the content-style features, as their relative influence on recommendation is learned during training. First, we observe that attention mechanisms, i.e., all rows but No Attention, lead to better-tailored recommendations. Second, although the {Concat, Add} choice reaches the highest accuracy on Men, the {Mult, Add} combination we used in this work is the most competitive on both accuracy and beyond-accuracy metrics.
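For clarity, the aggregation variants compared in the ablation can be sketched as follows; the linear projection used to bring the concatenated input back to dimension h is an assumption.

```python
# A minimal sketch of the ian/oan aggregation variants in the ablation, assuming PyTorch.
# The linear projection for the concatenation variant is an illustrative assumption.
import torch
import torch.nn as nn

def ian(p_u, e_s, mode="mult", proj: nn.Linear = None):
    # Aggregates user factors p_u and one encoded style feature e_s before the attention MLP.
    if mode == "mult":
        return p_u * e_s
    if mode == "add":
        return p_u + e_s
    if mode == "concat":
        return proj(torch.cat([p_u, e_s], dim=-1))  # e.g., proj = nn.Linear(2 * h, h)
    raise ValueError(mode)

def oan(q_i, w_i, mode="add"):
    # Aggregates item factors q_i and the attended style representation w_i.
    return q_i + w_i if mode == "add" else q_i * w_i
```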

4 Conclusion and Future Work

Unlike previous works, we argue that in visual recommendation scenarios (e.g., fashion), items should be represented by easy-to-extract and always accessible visual characteristics that describe their content from a stylistic perspective (e.g., color and shape). In this work, we disentangled these features via attention to assign personalized, user-specific importance weights to each content-style feature. Results confirmed that our solution reaches a competitive trade-off between accuracy and beyond-accuracy performance against other baselines, and an ablation study justified the adopted architectural choices. We plan to extend the content-style features to other visual recommendation domains, such as food and social media. Another area where item content visual features can be beneficial is improving accessibility to extremely long-tail items (distant tails), for which traditional CF or hybrid approaches are not helpful due to the scarcity of interaction data.