Abstract
Representation learning computes vectorized representations of entities or relationships, and it is one of the most basic and essential natural language processing tasks. Current techniques for modeling computer domain knowledge have two flaws: (1) they neglect fine-grained knowledge hierarchies, and (2) they lack a unified reference standard for modeling domain information. The fine-grained knowledge hierarchy comprises knowledge domains, units, and topics. We use the Computer Science Guidelines as the standard for knowledge annotation and topic mapping of an unstructured, unlabeled corpus in the computer domain, and we organize the corpus into a computer domain knowledge system with a three-level hierarchy. We propose a knowledge representation method that incorporates both contextual semantic information and topic information; the method can be applied to discover connections between knowledge entities of different granularity. We compare it with several existing text representation methods. Experimental results on extracting knowledge representations in the computer domain show that combining contextual semantic information with topic information is more effective than using either alone.
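As a concrete illustration of such a fused representation, the minimal Python sketch below concatenates a contextual sentence embedding with a per-document LDA topic distribution. The model choices (bert-base-uncased, 10 LDA topics, mean pooling) and the concatenation-based fusion are illustrative assumptions for exposition, not the paper's exact configuration.

# Minimal sketch: fuse contextual semantic vectors with topic vectors by
# concatenation. Model and fusion choices here are assumptions, not the
# paper's exact method.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "A binary search tree supports logarithmic lookup.",
    "TCP provides reliable, ordered byte-stream delivery.",
    "Process scheduling allocates CPU time among tasks.",
]

# Contextual semantic vectors: mean-pooled BERT hidden states.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    enc = tok(corpus, padding=True, truncation=True, return_tensors="pt")
    hidden = bert(**enc).last_hidden_state             # (n, seq_len, 768)
    mask = enc["attention_mask"].unsqueeze(-1)         # mask out padding
    ctx = (hidden * mask).sum(1) / mask.sum(1)         # (n, 768)

# Topic vectors: per-document topic distributions from LDA over word counts.
counts = CountVectorizer(stop_words="english").fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=10, random_state=0)
topics = lda.fit_transform(counts)                     # (n, 10)

# Fused representation: both views of each document, side by side.
fused = np.concatenate([ctx.numpy(), topics], axis=1)  # (n, 778)
print(fused.shape)

The fused vectors can then feed any downstream similarity or clustering step; concatenation is only the simplest of several plausible fusion strategies (weighted sums or gated combinations are alternatives).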
Acknowledgment
The work described in this paper was partially supported by grants from the Guangzhou Education Scientific Research Project [No. 1201730714] and the Guangdong Basic and Applied Basic Research Foundation [No. 2022A1515011697].
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, L., Zhong, Q., Zhang, S. (2023). A Data-Based Approach for Computer Domain Knowledge Representation. In: Xiong, N., Li, M., Li, K., Xiao, Z., Liao, L., Wang, L. (eds) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 153. Springer, Cham. https://doi.org/10.1007/978-3-031-20738-9_93
DOI: https://doi.org/10.1007/978-3-031-20738-9_93
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20737-2
Online ISBN: 978-3-031-20738-9
eBook Packages: Intelligent Technologies and Robotics (R0)