Abstract
We study how the vocabulary size of a bag-of-visual-words model influences classification accuracy in the image categorization task, in two cases. In the first, the test set and the training set contain the same objects (only the viewpoints differ in the test set). In the second, the objects in the test set do not appear at all in the training set (only other objects from the same category appear). To perform these tasks, we need to scale up the algorithms so that they can deal with millions of data points in hundreds of thousands of dimensions. We present k-means (used in the quantization step) and SVM (used in the classification step) algorithms extended to deal with very large datasets. These new incremental and parallel algorithms can be used on various distributed architectures, such as multi-threaded computers, clusters, or GPUs (graphics processing units). The efficiency of the approach is shown on the categorization of the 3D-Dataset from Savarese and Fei-Fei, which contains about 6700 images of 3D objects from 10 different classes. The resulting incremental and parallel SVM algorithm is several orders of magnitude faster than usual ones (such as LIBSVM, SVM-perf, or CB-SVM), and the incremental and parallel k-means is at least one order of magnitude faster than usual implementations.
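The quantization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes that "incremental" means the descriptors are processed chunk by chunk, and it relies on the fact that per-cluster sums and counts are additive, so each chunk could in principle be handled by a separate thread, node, or GPU block. All function names (`nearest`, `chunk_stats`, `kmeans_chunked`, `bow_histogram`) are hypothetical, and the code is plain Python for readability rather than speed.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest(point: Vector, centers: List[List[float]]) -> int:
    """Return the index of the center closest to `point` (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for j, c in enumerate(centers):
        d = sum((p - q) ** 2 for p, q in zip(point, c))
        if d < best_d:
            best, best_d = j, d
    return best


def chunk_stats(chunk: Sequence[Vector],
                centers: List[List[float]]) -> Tuple[List[List[float]], List[int]]:
    """Per-cluster coordinate sums and counts for one chunk.

    Chunks are independent of each other, so this is the part that
    could be distributed (an assumption of this sketch).
    """
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x in chunk:
        j = nearest(x, centers)
        counts[j] += 1
        for t in range(dim):
            sums[j][t] += x[t]
    return sums, counts


def kmeans_chunked(chunks: Sequence[Sequence[Vector]],
                   init_centers: Sequence[Vector],
                   iterations: int = 10) -> List[List[float]]:
    """Lloyd-style k-means that only ever holds one chunk's statistics at a time.

    `chunks` must be re-iterable (e.g. a list of lists), since every
    iteration makes one pass over all chunks.
    """
    centers = [list(c) for c in init_centers]
    k, dim = len(centers), len(centers[0])
    for _ in range(iterations):
        total_sums = [[0.0] * dim for _ in range(k)]
        total_counts = [0] * k
        for chunk in chunks:
            sums, counts = chunk_stats(chunk, centers)
            # Merge the partial statistics of this chunk.
            for j in range(k):
                total_counts[j] += counts[j]
                for t in range(dim):
                    total_sums[j][t] += sums[j][t]
        for j in range(k):
            if total_counts[j] > 0:  # keep the old center if a cluster is empty
                centers[j] = [s / total_counts[j] for s in total_sums[j]]
    return centers


def bow_histogram(descriptors: Sequence[Vector],
                  centers: List[List[float]]) -> List[int]:
    """Bag-of-visual-words vector: count of descriptors assigned to each visual word."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1
    return hist
```

In this sketch the vocabulary size is simply the number of initial centers, so varying it changes the dimension of the histogram that a classifier such as an SVM would then receive for each image.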
References
Blackford, L.S., Demmel, J., Dongarra, J., Duff, I., Hammarling, S., Henry, G., Heroux, M., Kaufman, L., Lumsdaine, A., Petitet, A., Pozo, R., Remington, K., Whaley, R.C.: An Updated Set of Basic Linear Algebra Subprograms (BLAS). ACM Trans. Math. Soft. 28(2), 135–151 (2002)
Boser, B., Guyon, I., Vapnik, V.: A Training Algorithm for Optimal Margin Classifiers. In: 5th ACM Annual Workshop on Computational Learning Theory, Pittsburgh, Pennsylvania, pp. 144–152 (1992)
Cauwenberghs, G., Poggio, T.: Incremental and Decremental Support Vector Machine Learning. In: Advances in Neural Information Processing Systems, vol. 13, pp. 409–415. MIT Press, Cambridge (2001)
Chang, C.C., Lin, C.J.: LIBSVM: a Library for Support Vector Machines (2001), Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Do, T.N., Fekete, J.D.: Large Scale Classification with Support Vector Machine Algorithms. In: 6th International Conference on Machine Learning and Applications, ICMLA 2007, pp. 7–12. IEEE Press, Ohio (2007)
Do, T.N., Pham, N.K., Poulet, F.: GPU-based Parallel SVM Algorithm. Journal of Frontiers of Computer Science and Technology 3(4), 368–377 (2009)
Do, T.N., Poulet, F.: Classifying one Billion Data with a New Distributed SVM Algorithm. In: 4th IEEE International Conference on Computer Science, Research, Innovation and Vision for the Future, RIVF 2006, Ho Chi Minh, Vietnam, pp. 59–66 (2006)
Fung, G., Mangasarian, O.: Incremental Support Vector Machine Classification. In: The 2nd SIAM Int. Conf. on Data Mining, SDM 2002, Arlington, Virginia, USA (2002)
Grid5000, https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home
Joachims, T.: A Support Vector Method for Multivariate Performance Measures. In: The International Conference on Machine Learning, ICML (2005)
Lowe, D.: Object Recognition from Local Scale-Invariant Features. In: The 7th International Conference on Computer Vision, ICCV 1999, vol. 2, pp. 1150–1157 (1999)
MacQueen, J.: Some Methods for Classification and Analysis of Multivariate Observations. In: The 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)
Mikolajczyk, K., Schmid, C.: Scale and Affine invariant interest point detectors. International Journal of Computer Vision 60(1), 63–86 (2004)
NVIDIA® CUDA™, CUDA Programming Guide 1.1 (2007)
NVIDIA® CUDA™, CUDA CUBLAS Library 1.1 (2007)
Platt, J.: Fast Training of Support Vector Machines Using Sequential Minimal Optimization. In: Advances in Kernel Methods – Support Vector Learning, pp. 185–208 (1999)
Poulet, F., Do, T.N.: Mining Very Large Datasets with Support Vector Machine Algorithms. In: Enterprise Information Systems, pp. 177–184. Kluwer Academic Publishers, Dordrecht (2004)
Savarese, S., Fei-Fei, L.: 3D Generic Object Categorization, Localization and Pose Estimation. In: International Conference on Computer Vision (2007)
Sivic, J., Zisserman, A.: Video Google: A Text Retrieval Approach to Object Matching in Videos. In: The International Conference on Computer Vision, pp. 1470–1477 (2003)
Suykens, J., Vandewalle, J.: Least Squares Support Vector Machines Classifiers. Neural Processing Letters 9(3), 293–300 (1999)
Syed, N., Liu, H., Sung, K.: Incremental Learning with Support Vector Machines. In: The Workshop on Support Vector Machines at the International Joint Conference on Artificial Intelligence, Stockholm, Sweden (1999)
Tong, S., Koller, D.: Support Vector Machine Active Learning with Applications to Text Classification. In: ICML 2000, The 17th Int. Conf. on Machine Learning, Stanford, USA, pp. 999–1006 (2000)
Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)
Wasson, S.: Nvidia’s GeForce 8800 graphics processor. Technical report, PC Hardware Explored (2006)
Yu, H., Yang, J., Han, J.: Classifying Large Data Sets Using SVM with Hierarchical Clusters. In: The 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 306–315 (2003)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Poulet, F., Pham, NK. (2010). High Dimensional Image Categorization. In: Cao, L., Feng, Y., Zhong, J. (eds) Advanced Data Mining and Applications. ADMA 2010. Lecture Notes in Computer Science, vol 6440. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17316-5_44
DOI: https://doi.org/10.1007/978-3-642-17316-5_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17315-8
Online ISBN: 978-3-642-17316-5
eBook Packages: Computer Science, Computer Science (R0)