Abstract
Subspace learning is one of the most traditional and important techniques in multimedia analysis. Numerous studies have introduced machine learning and statistical methods into multimedia subspace learning for semantic understanding and denoising, and have achieved remarkable results in multimedia applications such as content-based retrieval, data clustering, and face recognition. However, most of these studies are based on multimedia data of a single modality. With the rapid development of multimedia and information technology, multimedia data of different modalities often coexist, and the presence of one modality complements the other to some extent. Because different multimedia data are usually represented with heterogeneous low-level features, and because of the well-known semantic gap, it is both interesting and challenging to learn multimedia semantics through multi-feature subspace learning across modalities. In this paper, we analyze the sparse canonical correlation between feature matrices of different multimedia data and construct an isomorphic sparse multi-feature subspace. Moreover, we propose a subspace optimization strategy with correlation fusion, which explores both geometry-based content correlation and graph-based semantic correlation. Our algorithm has been applied to content-based multimodal retrieval and data classification. Comprehensive experiments demonstrate the superiority of our method over several existing algorithms.
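The abstract does not spell out how the sparse canonical correlation is computed, so the following is only a minimal sketch of one common way to obtain a sparse pair of canonical vectors between two feature matrices: alternating soft-thresholded updates in the spirit of penalized-matrix-decomposition approaches to sparse CCA. It is not the authors' algorithm. The function name `sparse_cca`, the parameters `lam_u` and `lam_v` (thresholds expressed as a fraction of the largest coefficient, a simplification chosen here instead of an explicit L1-norm bound), and the synthetic "image" and "audio" matrices are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of a toward zero by lam; entries below lam become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def unit_norm(a):
    """Scale a to unit L2 norm; an all-zero vector is returned unchanged."""
    n = np.linalg.norm(a)
    return a / n if n > 0 else a

def sparse_cca(X, Y, lam_u=0.3, lam_v=0.3, n_iter=100, tol=1e-8):
    """Return one pair of sparse projection vectors (u, v) for X and Y.

    X is (n_samples, p) and Y is (n_samples, q). lam_u and lam_v must lie
    in [0, 1); larger values give sparser vectors. Within-set covariances
    are treated as identity, a common simplification (assumed here).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    C = X.T @ Y                      # cross-covariance, up to a 1/n factor
    # Start from the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        a = C @ v
        u_new = unit_norm(soft_threshold(a, lam_u * np.max(np.abs(a))))
        b = C.T @ u_new
        v_new = unit_norm(soft_threshold(b, lam_v * np.max(np.abs(b))))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    return u, v

# Synthetic two-modality example: a shared latent signal z drives the first
# three "image" features and the first two "audio" features; the rest is noise.
rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)
X = 0.5 * rng.standard_normal((n, 20))
Y = 0.5 * rng.standard_normal((n, 15))
X[:, :3] += z[:, None]
Y[:, :2] += z[:, None]

u, v = sparse_cca(X, Y)
corr = np.corrcoef(X @ u, Y @ v)[0, 1]
print("nonzero entries in u:", np.count_nonzero(u))
print("nonzero entries in v:", np.count_nonzero(v))
print("correlation of projections:", round(float(corr), 3))
```

Projecting both modalities with `u` and `v` places them on a shared axis where they can be compared directly, which is the basic idea behind a common subspace for heterogeneous features. Further components could be obtained by deflating `C` and repeating, and the fixed relative threshold could be replaced by a search for a target L1 norm.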
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, H., Zhang, Y. (2013). Multi-feature Subspace Learning via Sparse Correlation Fusion and Embedding. In: Huet, B., Ngo, CW., Tang, J., Zhou, ZH., Hauptmann, A.G., Yan, S. (eds) Advances in Multimedia Information Processing – PCM 2013. PCM 2013. Lecture Notes in Computer Science, vol 8294. Springer, Cham. https://doi.org/10.1007/978-3-319-03731-8_55
DOI: https://doi.org/10.1007/978-3-319-03731-8_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03730-1
Online ISBN: 978-3-319-03731-8
eBook Packages: Computer Science, Computer Science (R0)