Learning Discriminative and Shareable Features for Scene Classification

Zuo, Zhen; Wang, Gang; Shuai, Bing; Zhao, Lifan; Yang, Qingxiong; Jiang, Xudong

doi:10.1007/978-3-319-10590-1_36

Zhen Zuo¹⁹,
Gang Wang^19,20,
Bing Shuai¹⁹,
Lifan Zhao¹⁹,
Qingxiong Yang²¹ &
…
Xudong Jiang¹⁹

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 8689))

Included in the following conference series:

European Conference on Computer Vision

38k Accesses
35 Citations

Abstract

In this paper, we propose to learn a discriminative and shareable feature transformation filter bank to transform local image patches (represented as raw pixel values) into features for scene image classification. The learned filters are expected to: (1) encode common visual patterns of a flexible number of categories; (2) encode discriminative and class-specific information. For each category, a subset of the filters are activated in a data-adaptive manner, meanwhile sharing of filters among different categories is also allowed. Discriminative power of the filter bank is further enhanced by enforcing the features from the same category to be close to each other in the feature space, while features from different categories to be far away from each other. The experimental results on three challenging scene image classification datasets indicate that our features can achieve very promising performance. Furthermore, our features also show great complementary effect to the state-of-the-art ConvNets feature.

Download to read the full chapter text

Chapter PDF

Learning extremely shared middle-level image representation for scene classification

Article 23 December 2016

Robust Scene Classification with Cross-Level LLC Coding on CNN Features

On fusing the latent deep CNN feature for image classification

Article 15 June 2018

Keywords

References

Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)
Article Google Scholar
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893 (2005)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1106–1114 (2012)
Google Scholar
Le, Q.V., Karpenko, A., Ngiam, J., Ng, A.Y.: Ica with reconstruction cost for efficient overcomplete feature learning. In: NIPS, pp. 1017–1025 (2011)
Google Scholar
Zou, W.Y., Zhu, S.Y., Ng, A.Y., Yu, K.: Deep learning of invariant features via simulated fixations in video. In: NIPS, pp. 3212–3220 (2012)
Google Scholar
Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Computation (2006)
Google Scholar
Coates, A., Lee, H., Ng, A.Y.: An analysis of single-layer networks in unsupervised feature learning. In: International Conference on Artificial Intelligence and Statistics, pp. 215–223 (2011)
Google Scholar
Sohn, K., Jung, D.Y., Lee, H., Hero, A.O.: Efficient learning of sparse, distributed, convolutional feature representations for object recognition. In: ICCV, pp. 2643–2650 (2011)
Google Scholar
Zuo, Z., Wang, G.: Learning discriminative hierarchical features for object recognition. Signal Processing Letters 21(9), 1159–1163 (2014)
Article Google Scholar
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T.: Decaf: A deep convolutional activation feature for generic visual recognition. In: ICML, pp. 647–655 (2014)
Google Scholar
Deng, J., Berg, A.C., Li, K., Fei-Fei, L.: What does classifying more than 10,000 image categories tell us? In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 71–84. Springer, Heidelberg (2010)
Chapter Google Scholar
Oliva, A., Torralba, A.: Building the gist of a scene: The role of global image features in recognition. Progress in Brain Research 155, 23–36 (2006)
Article Google Scholar
Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7), 971–987 (2002)
Article Google Scholar
Jarrett, K., Kavukcuoglu, K., Ranzato, M., LeCun, Y.: What is the best multi-stage architecture for object recognition? In: ICCV, pp. 2146–2153 (2009)
Google Scholar
Taylor, G.W., Fergus, R., LeCun, Y., Bregler, C.: Convolutional learning of spatio-temporal features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part VI. LNCS, vol. 6316, pp. 140–153. Springer, Heidelberg (2010)
Chapter Google Scholar
Le, Q.V., Ranzato, M.A., Monga, R., Devin, M., Chen, K., Corrado, G.S., Dean, J., Ng, A.Y.: Building high-level features using large scale unsupervised learning. In: ICML (2012)
Google Scholar
Le, Q.V., Zou, W.Y., Yeung, S.Y., Ng, A.Y.: Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: CVPR, pp. 3361–3368 (2011)
Google Scholar
Shen, X., Xu, L., Zhang, Q., Jia, J.: Multi-modal and multi-spectral registration for natural images. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part IV. LNCS, vol. 8692, pp. 309–324. Springer, Heidelberg (2014)
Google Scholar
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229 (2013)
Google Scholar
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv preprint arXiv:1311.2524 (2013)
Google Scholar
Jiang, Z., Lin, Z., Davis, L.S.: Learning a discriminative dictionary for sparse coding via label consistent k-svd. In: CVPR, pp. 1697–1704 (2011)
Google Scholar
Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Supervised dictionary learning. In: NIPS (2008)
Google Scholar
Yang, M., Zhang, L., Feng, X., Zhang, D.: Fisher discrimination dictionary learning for sparse representation. In: ICCV, pp. 543–550 (2011)
Google Scholar
Kong, S., Wang, D.: A dictionary learning approach for classification: Separating the particularity and the commonality. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part I. LNCS, vol. 7572, pp. 186–199. Springer, Heidelberg (2012)
Chapter Google Scholar
Li, Q., Wu, J., Tu, Z.: Harvesting mid-level visual concepts from large-scale internet images. In: CVPR (2013)
Google Scholar
Doersch, C., Gupta, A., Efros, A.A.: Mid-level visual element discovery as discriminative mode seeking. In: NIPS, pp. 494–502 (2013)
Google Scholar
Sun, J., Ponce, J., et al.: Learning discriminative part detectors for image classification and cosegmentation. In: ICCV (2013)
Google Scholar
Juneja, M., Vedaldi, A., Jawahar, C., Zisserman, A.: Blocks that shout: Distinctive parts for scene classification. In: CVPR, pp. 923–930 (2013)
Google Scholar
Song, H.O., Zickler, S., Althoff, T., Girshick, R., Fritz, M., Geyer, C., Felzenszwalb, P., Darrell, T.: Sparselet models for efficient multiclass object detection. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 802–815. Springer, Heidelberg (2012)
Chapter Google Scholar
Song, H.O., Darrell, T., Girshick, R.B.: Discriminatively activated sparselets. In: ICML, pp. 196–204 (2013)
Google Scholar
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Article MATH MathSciNet Google Scholar
Wang, G., Forsyth, D., Hoiem, D.: Improved object categorization and detection using comparative object similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(10), 2442–2453 (2013)
Article Google Scholar
Wang, Z., Gao, S., Chia, L.-T.: Learning class-to-image distance via large margin and L1-norm regularization. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 230–244. Springer, Heidelberg (2012)
Chapter Google Scholar
Boiman, O., Shechtman, E., Irani, M.: In defense of nearest-neighbor based image classification. In: CVPR, pp. 1–8 (2008)
Google Scholar
McCann, S., Lowe, D.G.: Local naive bayes nearest neighbor for image classification. In: CVPR, pp. 3650–3656 (2012)
Google Scholar
Wang, Z., Feng, J., Yan, S., Xi, H.: Linear distance coding for image classification. IEEE Transactions on Image Processing 22(2), 537–548 (2013)
Article MathSciNet Google Scholar
Yao, B., Fei-Fei, L.: Action recognition with exemplar based 2.5D graph matching. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part IV. LNCS, vol. 7575, pp. 173–186. Springer, Heidelberg (2012)
Chapter Google Scholar
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR, vol. 2, pp. 2169–2178 (2006)
Google Scholar
Li, L.J., Fei-Fei, L.: What, where and who? classifying events by scene and object recognition. In: ICCV (2007)
Google Scholar
Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR (2009)
Google Scholar
Xiao, J., Hays, J., Ehinger, K.A., Oliva, A., Torralba, A.: Sun database: Large-scale scene recognition from abbey to zoo. In: CVPR, pp. 3485–3492 (2010)
Google Scholar
Wu, J., Rehg, J.M.: Centrist: A visual descriptor for scene categorization. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(8), 1489–1501 (2011)
Article Google Scholar
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: CVPR, pp. 3360–3367 (2010)
Google Scholar
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(9), 1627–1645 (2010)
Article Google Scholar
Pandey, M., Lazebnik, S.: Scene recognition and weakly supervised object localization with deformable part-based models. In: ICCV, pp. 1307–1314 (2011)
Google Scholar
Li, L.J., Su, H., Fei-Fei, L., Xing, E.P.: Object bank: A high-level image representation for scene classification & semantic feature sparsification. In: NIPS (2010)
Google Scholar
Singh, S., Gupta, A., Efros, A.A.: Unsupervised discovery of mid-level discriminative patches. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 73–86. Springer, Heidelberg (2012)
Chapter Google Scholar
Boureau, Y.L., Bach, F., Le Cun, Y., Ponce, J.: Learning mid-level features for recognition. In: CVPR (2010)
Google Scholar
Wang, X., Wang, B., Bai, X., Liu, W., Tu, Z.: Max-margin multiple-instance dictionary learning. In: ICML (2013)
Google Scholar
Gao, S., Tsang, I.H., Chia, L.T.: Laplacian sparse coding, hypergraph laplacian sparse coding, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1), 92–104 (2013)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Nanyang Technological University, Singapore
Zhen Zuo, Gang Wang, Bing Shuai, Lifan Zhao & Xudong Jiang
Advanced Digital Sciences Center, Sinapore
Gang Wang
City University of Hong Kong, China
Qingxiong Yang

Authors

Zhen Zuo
View author publications
You can also search for this author in PubMed Google Scholar
Gang Wang
View author publications
You can also search for this author in PubMed Google Scholar
Bing Shuai
View author publications
You can also search for this author in PubMed Google Scholar
Lifan Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Qingxiong Yang
View author publications
You can also search for this author in PubMed Google Scholar
Xudong Jiang
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
PSI, iMinds, KU Leuven, ESAT, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zuo, Z., Wang, G., Shuai, B., Zhao, L., Yang, Q., Jiang, X. (2014). Learning Discriminative and Shareable Features for Scene Classification. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8689. Springer, Cham. https://doi.org/10.1007/978-3-319-10590-1_36

Download citation

DOI: https://doi.org/10.1007/978-3-319-10590-1_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10589-5
Online ISBN: 978-3-319-10590-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Learning Discriminative and Shareable Features for Scene Classification

Abstract

Chapter PDF

Similar content being viewed by others

Learning extremely shared middle-level image representation for scene classification

Robust Scene Classification with Cross-Level LLC Coding on CNN Features

On fusing the latent deep CNN feature for image classification

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Learning Discriminative and Shareable Features for Scene Classification

Abstract

Chapter PDF

Similar content being viewed by others

Learning extremely shared middle-level image representation for scene classification

Robust Scene Classification with Cross-Level LLC Coding on CNN Features

On fusing the latent deep CNN feature for image classification

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation