A Text Preprocessing Approach for Efficacious Information Retrieval

Virmani, Deepali; Taneja, Shweta

doi:10.1007/978-981-10-8968-8_2


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 669)


Abstract

Information retrieval is the task of obtaining relevant information from a large collection of databases. Preprocessing plays an important role in extracting that relevant information. This paper proposes a text preprocessing approach called Text Preprocessing for Information Retrieval (TPIR). The approach works in two steps: first, a spell-check utility is used to enhance stemming; second, synonyms of similar tokens are combined. The technique is applied to a case study on the International Monetary Fund. The experimental results show the efficiency of the proposed approach in terms of complexity, time and performance.
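The two steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which is not available in this preview: the vocabulary, the synonym table, the similarity cutoff and the crude suffix-stripping stemmer below are all hypothetical stand-ins, and the spell check is approximated with Python's standard `difflib` module.

```python
import difflib
import re

# Hypothetical toy resources; a real system would use a full dictionary
# and a thesaurus rather than these hand-written tables.
VOCABULARY = ["fund", "funds", "monetary", "loan", "loans",
              "lending", "credit", "country", "countries", "economy"]
SYNONYMS = {"loan": "credit", "lend": "credit"}  # stem -> canonical stem


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def correct_spelling(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Step 1a: map a token to its closest vocabulary word, if any is close enough."""
    if token in vocabulary:
        return token
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def stem(token):
    """Step 1b: crude suffix stripping (an assumed stand-in for a real stemmer)."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def preprocess(text):
    """Spell-correct, stem, then merge synonyms into one canonical token."""
    result = []
    for token in tokenize(text):
        stemmed = stem(correct_spelling(token))
        result.append(SYNONYMS.get(stemmed, stemmed))  # Step 2: combine synonyms
    return result


if __name__ == "__main__":
    print(preprocess("Montary funds: loans and lending"))
```

In this sketch the misspelled token "lendng" would pass through the stemmer unchanged, whereas correcting it to "lending" first lets it be stemmed and then merged with "loan" under a single token. That is one plausible reading of how a spell check could enhance stemming.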



Author information

Authors and Affiliations

Department of Computer Science, Bhagwan Parshuram Institute of Technology, New Delhi, India
Deepali Virmani & Shweta Taneja


Corresponding author

Correspondence to Deepali Virmani.

Editor information

Editors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Bijaya Ketan Panigrahi
Department of Computer Science and Engineering, ABES Engineering College, Ghaziabad, India
Munesh C. Trivedi
Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology, Allahabad, Uttar Pradesh, India
Krishn K. Mishra
CSED, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
Shailesh Tiwari
Department of Computer Science and Engineering, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh, India
Pradeep Kumar Singh


Cite this paper

Virmani, D., Taneja, S. (2019). A Text Preprocessing Approach for Efficacious Information Retrieval. In: Panigrahi, B., Trivedi, M., Mishra, K., Tiwari, S., Singh, P. (eds) Smart Innovations in Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 669. Springer, Singapore. https://doi.org/10.1007/978-981-10-8968-8_2


DOI: https://doi.org/10.1007/978-981-10-8968-8_2
Published: 19 June 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8967-1
Online ISBN: 978-981-10-8968-8
eBook Packages: Intelligent Technologies and Robotics (R0)
