Abstract
Information retrieval has become a significant research area because of the rapid growth in text data, document images, audio, and video files uploaded to the Internet in various forms. Information retrieval is used for browsing, searching, and retrieving documents from very large datasets. Most conventional image retrieval systems rely on some method of attaching metadata, such as titles or descriptions. There is a need for efficient machine learning and information retrieval algorithms to access the required documents from a large set of text documents. In this paper, we present a method for efficient image retrieval using text and visual features. The method applies visual semantic features to different queries through keyword expansion. We evaluated the method on sample and standard datasets, and the results show improved re-ranking and improved precision in document retrieval.
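As an illustration only, the idea of re-ranking results with both text and visual evidence, and then measuring precision, can be sketched as follows. This is not the authors' implementation: the linear score fusion, the weight `alpha`, the cosine similarity over feature vectors, and the names `rerank` and `precision_at_k` are all assumptions introduced for this sketch.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query_visual, candidates, alpha=0.5):
    """Re-rank candidates by a weighted sum of text and visual scores.

    candidates: list of (doc_id, text_score, visual_vector) tuples.
    alpha: hypothetical weight on the text score (assumed, not from the paper).
    """
    scored = [
        (doc_id, alpha * text_score + (1 - alpha) * cosine(query_visual, vec))
        for doc_id, text_score, vec in candidates
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / k


# Toy data, invented for this example.
candidates = [
    ("a", 0.9, [0.0, 1.0]),
    ("b", 0.5, [1.0, 0.0]),
    ("c", 0.1, [1.0, 0.0]),
]
ranking = rerank([1.0, 0.0], candidates, alpha=0.5)
p_at_2 = precision_at_k(ranking, {"a", "b"}, 2)
```

In this toy case the visual similarity moves items `b` and `c` ahead of `a`, which the text score alone would have ranked first. A real system would derive the text score from the expanded keyword query and the vectors from an image feature extractor.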
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Santhosh Ramchander, N., Hegde, N.P. (2022). An Efficient Information Retrieval Technique for Document Classification. In: Satapathy, S.C., Bhateja, V., Favorskaya, M.N., Adilakshmi, T. (eds) Smart Intelligent Computing and Applications, Volume 2. Smart Innovation, Systems and Technologies, vol 283. Springer, Singapore. https://doi.org/10.1007/978-981-16-9705-0_6
DOI: https://doi.org/10.1007/978-981-16-9705-0_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-9704-3
Online ISBN: 978-981-16-9705-0
eBook Packages: Intelligent Technologies and Robotics (R0)