Abstract
In the last few decades, we have seen a tremendous increase in the amount of data available on the web. There have been significant advances in constructing knowledge bases consisting of relations extracted from text data. These relations are words in the text, often represented as pairs (Noun, Context), for example (Disease, Symptom), which can be classified into predefined categories to yield useful information. Categorization of relations using a tolerance rough set-based semi-supervised learning algorithm (TPL) has been successfully demonstrated in several works. However, the automatic selection of the hyperparameters of the TPL algorithm remains an unexplored problem. This paper proposes a genetic algorithm-based approach (TPL-GA) for optimizing the hyperparameters that are fundamental to the TPL algorithm. The proposed approach was tested on two standard datasets drawn from different domains and representing two different languages: English and Hindi.
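To make the idea of genetic-algorithm hyperparameter search concrete, the following is a minimal sketch of a generic genetic algorithm and not the TPL-GA procedure from the chapter. The abstract does not name the TPL hyperparameters, their ranges, or the fitness measure, so the two real-valued genes, the `BOUNDS` ranges, the `toy_fitness` objective, and the operator choices (tournament selection, uniform crossover, Gaussian mutation, elitism) are all assumptions made for illustration.

```python
import random

# Hypothetical search ranges; the abstract does not name the TPL hyperparameters.
BOUNDS = [(0.0, 1.0), (0.0, 1.0)]


def toy_fitness(params):
    """Stand-in objective peaking at (0.3, 0.7); a real run would score TPL output."""
    x, y = params
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)


def tournament(population, scores, rng, k=3):
    """Return the best of k randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(individual, rng, rate=0.2, sigma=0.1):
    """Add Gaussian noise to some genes, clamped to the search bounds."""
    child = []
    for gene, (lo, hi) in zip(individual, BOUNDS):
        if rng.random() < rate:
            gene = min(hi, max(lo, gene + rng.gauss(0.0, sigma)))
        child.append(gene)
    return child


def run_ga(fitness, pop_size=30, generations=40, seed=0):
    """Evolve a population and return the fittest individual found."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        # Elitism: carry the current best individual over unchanged.
        elite = population[max(range(pop_size), key=lambda i: scores[i])]
        children = [elite]
        while len(children) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    return max(population, key=fitness)
```

In an actual hyperparameter search, `toy_fitness` would be replaced by a function that runs the learner with the candidate settings and returns a quality score on held-out data; which score the chapter uses is not stated in the abstract.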
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Shubham Agrawal, Rashad Ahmed, Anand Kumar, M., Sheela Ramanna (2022). Categorizing Relations via Semi-supervised Learning Using a Hybrid Tolerance Rough Sets and Genetic Algorithm Approach. In: Gupta, D., Khamparia, A., Khanna, A., Castillo, O. (eds) Soft Computing for Data Analytics, Classification Model, and Control. Studies in Fuzziness and Soft Computing, vol 413. Springer, Cham. https://doi.org/10.1007/978-3-030-92026-5_6
DOI: https://doi.org/10.1007/978-3-030-92026-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92025-8
Online ISBN: 978-3-030-92026-5
eBook Packages: Intelligent Technologies and Robotics (R0)