Abstract
A single supermarket can carry thousands of stock keeping units (SKUs). The arrangement of these products is carefully planned and controlled to maximize sales. However, verifying that the real shelves match the ideal layout, a task called planogram compliance, is costly because store personnel must take an inventory of thousands of products. To automate this task, we have developed a retail product identification system that does not require fine-tuning on the supermarket’s products, generalizes well, and is scalable. In this chapter, we address product identification on grocery shelves by using a deep convolutional neural network to generate variable-length embeddings, with different embedding lengths corresponding to different levels of accuracy. For embedding generation, we created an in-house dataset containing more than 6,900 images, and we tested our model on a dataset collected in a real store with products in different rotations and positions. Our experimental results show the effectiveness of our approach. Furthermore, our solution is designed to run on low-power devices such as Intel’s Neural Compute Stick 2, on which our perception system achieved 5.8 frames per second (FPS).
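To make the idea of identification without fine-tuning concrete, the following is a minimal sketch of embedding-based matching, not the authors' implementation. It assumes that each product has a reference embedding stored in a gallery and that a query crop is assigned the label of the most similar reference under cosine similarity. The function names, the toy vectors, the SKU labels, and the use of prefix truncation (`dims`) to mimic shorter embeddings are all hypothetical choices made for illustration; the abstract does not say how the variable-length embeddings are produced.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def identify(query, gallery, dims=None):
    """Return (label, similarity) of the gallery entry closest to `query`.

    `gallery` maps a hypothetical SKU label to a reference embedding.
    `dims` optionally keeps only the first `dims` components; this is an
    assumed way of trading accuracy for speed, not the chapter's method.
    """
    if not gallery:
        raise ValueError("gallery is empty")
    q = l2_normalize(query[:dims])
    best_label, best_sim = None, -2.0  # cosine similarity is never below -1
    for label, ref in gallery.items():
        r = l2_normalize(ref[:dims])
        sim = sum(a * b for a, b in zip(q, r))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


# Toy 4-dimensional embeddings; in practice these would come from the CNN.
GALLERY = {
    "cereal_a": [0.9, 0.1, 0.0, 0.2],
    "soup_b": [0.1, 0.8, 0.3, 0.0],
    "tea_c": [0.0, 0.2, 0.9, 0.1],
}

label, score = identify([0.8, 0.2, 0.1, 0.1], GALLERY)
short_label, _ = identify([0.8, 0.2, 0.1, 0.1], GALLERY, dims=2)
```

Under this scheme, adding a new product only requires adding its reference embedding to the gallery, which is one way a system can avoid retraining when the assortment changes.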
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Sinha, S., Byrne, J. (2022). Robots Collecting Data: Robust Identification of Products. In: Villani, L., Natale, C., Beetz, M., Siciliano, B. (eds) Robotics for Intralogistics in Supermarkets and Retail Stores. Springer Tracts in Advanced Robotics, vol 148. Springer, Cham. https://doi.org/10.1007/978-3-031-06078-6_3
DOI: https://doi.org/10.1007/978-3-031-06078-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06077-9
Online ISBN: 978-3-031-06078-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)