Abstract
This paper first analyzes the deviation that arises when existing text similarity calculation methods are applied to short texts, and then proposes a similarity calculation method for short texts based on a language network and word semantic information. First, each short text is modeled as a language network, following the complex-network characteristics of human language. Next, the comprehensive eigenvalue of the words in the language network and the word similarity between different texts are analyzed to obtain the word semantics. The similarity between short texts is then calculated by combining the language network with the word semantics. Finally, the effectiveness of the proposed algorithm is verified through clustering experiments.
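The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the language network is taken to be a word co-occurrence graph with a hypothetical window of 2, normalised node degree stands in for the "comprehensive eigenvalue", and exact string match stands in for the word similarity measure. All function names are hypothetical and this is not the authors' implementation.

```python
from collections import defaultdict


def build_language_network(tokens, window=2):
    """Undirected co-occurrence graph: each word is a node, and an edge links
    two distinct words that occur within `window` positions of each other.
    (Assumption: the abstract does not define how the network is built.)"""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # touch the key so isolated words still become nodes
        for other in tokens[i + 1 : i + window + 1]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def word_weights(graph):
    """Normalised degree per word, used here as a stand-in for the
    'comprehensive eigenvalue' (assumption, not the paper's measure)."""
    total = sum(len(neighbours) for neighbours in graph.values())
    if total == 0:
        # No edges (e.g. a one-word text): fall back to uniform weights.
        return {word: 1.0 / len(graph) for word in graph}
    return {word: len(neighbours) / total for word, neighbours in graph.items()}


def word_similarity(a, b):
    """Placeholder word similarity: 1.0 for identical words, else 0.0.
    A lexical resource or corpus statistic could be plugged in here."""
    return 1.0 if a == b else 0.0


def short_text_similarity(text_a, text_b, window=2):
    """Weight each word by its network importance, match it to its most
    similar word in the other text, and average the two directions."""
    tokens_a = text_a.lower().split()
    tokens_b = text_b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    weights_a = word_weights(build_language_network(tokens_a, window))
    weights_b = word_weights(build_language_network(tokens_b, window))

    def directed(source, target):
        return sum(
            weight * max(word_similarity(word, other) for other in target)
            for word, weight in source.items()
        )

    return 0.5 * (directed(weights_a, weights_b) + directed(weights_b, weights_a))


if __name__ == "__main__":
    print(short_text_similarity("cheap flights to paris", "cheap hotels in paris"))
```

Because the weights of each text sum to 1 and the placeholder word similarity lies in [0, 1], the score stays in [0, 1]. The resulting pairwise scores could then feed a clustering algorithm, which is how the abstract says the method was evaluated.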
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhan, Z., Lin, F., Yang, X. (2014). Semantic Similarity Calculation of Short Texts Based on Language Network and Word Semantic Information. In: Wu, J., Chen, H., Wang, X. (eds) Advanced Computer Architecture. Communications in Computer and Information Science, vol 451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44491-7_17
DOI: https://doi.org/10.1007/978-3-662-44491-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44490-0
Online ISBN: 978-3-662-44491-7
eBook Packages: Computer Science, Computer Science (R0)