Abstract
Semantic dependency parsing is a deep semantic analysis task that relies on large-scale, consistently annotated corpora. This chapter presents a new Chinese semantic dependency scheme grounded in linguistic knowledge of Chinese. Chinese is a paratactic (meaning-combined) language with flexible syntactic structures and complex modifying relations among words. We therefore use dependency graphs rather than dependency trees as the target representation, which allows a node to have more than one incoming arc and allows dependency arcs to cross. Using this scheme, we annotated the dependency structures of 30,161 sentences containing 570,403 words. This chapter describes the scheme in detail, including its specifications and the process of creating the corpus. Measured with Fleiss’ kappa, inter-annotator agreement was 0.835 with unlabeled arcs as assignments and 0.686 with labeled arcs as assignments. This chapter also provides statistics on the annotated corpus.
1 Introduction
Sentence analysis based on dependency grammar has become an active research topic in natural language processing. The task has been studied extensively and has proven useful in several applications, including question answering (Cui et al. 2005; Punyakanok et al. 2004), semantic structure extraction (Johansson and Nugues 2007), and semantic role labeling (Hacioglu 2004; Pradhan et al. 2005).
Much work has focused on constructing dependency parsers. Dependency parsing technologies have so far been data driven, and large-scale corpora have been annotated to train automatic dependency parsers. The Prague Dependency Treebank (Böhmová et al. 2003), the first dependency structure annotation work, has been influential. Dependency treebanks of varying sizes have been built for at least 30 languages, either by hand or by automatically converting available phrase structure treebanks into dependency structures (Marimon and Bel 2014); examples include Chatterji et al. (2014), Haverinen et al. (2014), and De Marneffe and Manning (2008). Liu et al. (2006) created a Chinese syntactic dependency treebank (CDT) consisting of 60,000 sentences from the People’s Daily of the 1990s. Several studies on Chinese dependency parsing have used this corpus, such as Niu et al. (2009) and Li et al. (2012). Most studies on dependency analysis have been syntax-oriented. Semantic dependencies were seldom studied until the shared tasks of SemEval-2012 (Che et al. 2012) and SemEval-2014 (Oepen et al. 2014), which provided semantic dependencies annotated in Chinese and English for participants to build dependency parsing systems.
Unlike English, Chinese is an ideographic language of the Sino-Tibetan family (Lu 2001). It organizes sentences through logical connections among lexical meanings and the semantics of sub-sentences, so no formal markers or fixed syntactic structures are available. Because rich latent information is hidden behind the surface words, the semantic analysis of Chinese poses particular challenges. By contrast, English is a hypotactic language that organizes sentences through formal linguistic devices, wherein grammar prioritizes syntax and can even be detached from semantics.
Semantic dependency parsing aims to identify all word pairs that stand in a semantic relation and to connect each such pair with a dependency arc whose label indicates the relation. Semantic dependency has both similarities with and differences from syntactic dependency. Both are based on dependency grammar (Robinson 1970) and annotate each word in a sentence. Syntactic dependency gives a transparent encoding of the predicate-argument structure, while semantic dependency explicitly displays the semantics hidden behind predicate-argument structures.
The number of semantic dependency labels is more than five times that of syntactic dependency labels (see Note 1), which allows semantic labels to express different information about sentences. Syntactic dependency analyzes syntactic functions from the perspective of the grammar system (e.g., subject, predicate, and object), and for this task, dependency tree structures are sufficient. By contrast, semantic dependency involves semantic relations (e.g., agent, patient, and experiencer) between pairs of words. Given the characteristics of Chinese analyzed above, semantic relations between word pairs do not always form tree structures, and graphs describe semantics better than trees. This observation agrees with meaning-text theory (MTT), a theoretical framework for the description of natural languages (Žolkovskij and Mel’čuk 1967). MTT holds that trees are in some cases insufficient to express the complete meaning of a sentence, which our corpus annotation practice has confirmed.
Comparing the word pairs connected by dependency arcs, semantic dependency seeks to depict relations among content words, whereas syntactic dependency relies mostly on function words (e.g., coordinating conjunctions and prepositions). Figure 13.1 presents an example of this difference. In the prepositional phrase 在教室 zai jiaoshi “in the classroom,” the preposition 在 zai “in” is the head word in (a), whereas the head word in (b) is the content word 教室 jiaoshi “classroom.”
The rest of this chapter is organized as follows. Section 13.2 describes our dependency scheme in detail, and Sect. 13.3 introduces the origin of our corpus and the design of our annotation tool. Section 13.4 evaluates the inter-annotator agreement of the annotated corpus and describes the assessment method. Section 13.5 presents some statistics of the annotated corpus, followed by the conclusion in Sect. 13.6.
2 Annotation Scheme of the Semantic Dependency Graph
Dependency tree structures are the traditional basis for syntactic dependency analysis. However, dependency trees are not well suited to meaning representation, because preserving a legal tree structure forces some dependency arcs to be distorted or omitted. In large-scale real corpora, and given the paratactic character of Chinese, a word may be the argument of more than one predicate, which results in multiple incoming arcs. Therefore, we extended dependency tree structures to graphs.
2.1 Graph Structure of Semantic Dependency
Semantic dependency graphs (SDGs) are directed acyclic graphs. Nodes represent words, and labeled edges represent the semantic relations between words. Exactly one node has no head; it is the root of the entire graph. Graphs overcome the limitations of dependency trees by allowing certain nodes to have more than one head and by allowing arcs to cross. Figure 13.2 shows that the node 杯子 beizi “cup” has semantic relations with both 打 da “break” and 破 po “damaged,” which means that 杯子 beizi “cup” has two heads; in addition, the arc connecting 杯子 beizi “cup” and 破 po “damaged” crosses the arc connecting 他 ta “he” and 打 da “break.”
The dependency structure in traditional dependency grammar must be single-headed, connected, acyclic, and projective. Dependency graphs drop the single-headed and projective constraints and keep only connectedness and acyclicity, so they can be regarded as an extension of dependency grammar.
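The structural properties described in this section can be checked mechanically. The following is a minimal sketch, not the tooling used to build the corpus. It assumes arcs are stored as (head, dependent, label) triples over 1-based word positions. The example sentence, its word order, and the label name "Pat" are hypothetical, chosen only to reproduce the kind of multi-headed, crossing configuration discussed for 杯子 beizi “cup.”

```python
from collections import defaultdict

# Arcs are (head, dependent, label) triples over 1-based word positions.
# Hypothetical sentence: 他(1) 把(2) 杯子(3) 打(4) 破(5) 了(6).
# Arcs to function words are omitted to keep the sketch short.
ARCS = [
    (4, 1, "Agt"),    # 打 -> 他
    (4, 3, "Pat"),    # 打 -> 杯子 ("Pat" is an assumed label name)
    (5, 3, "Exp"),    # 破 -> 杯子
    (4, 5, "eResu"),  # 打 -> 破
]


def roots(arcs):
    """Nodes that appear in the graph but never as a dependent."""
    nodes = {n for h, d, _ in arcs for n in (h, d)}
    dependents = {d for _, d, _ in arcs}
    return sorted(nodes - dependents)


def multi_head_nodes(arcs):
    """Nodes with more than one distinct head (impossible in a tree)."""
    heads = defaultdict(set)
    for h, d, _ in arcs:
        heads[d].add(h)
    return sorted(d for d, hs in heads.items() if len(hs) > 1)


def crossing_pairs(arcs):
    """Pairs of arc spans (left, right) that cross each other."""
    spans = sorted({(min(h, d), max(h, d)) for h, d, _ in arcs})
    return [
        (a, b)
        for i, a in enumerate(spans)
        for b in spans[i + 1:]
        if a[0] < b[0] < a[1] < b[1]
    ]


def is_acyclic(arcs):
    """Kahn-style check: repeatedly remove nodes with no incoming arc."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for h, d, _ in arcs:
        children[h].append(d)
        indegree[d] += 1
        nodes.update((h, d))
    queue = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed == len(nodes)


print(roots(ARCS), multi_head_nodes(ARCS), crossing_pairs(ARCS), is_acyclic(ARCS))
```

For the hypothetical arcs above, node 4 is the only root, node 3 is reported as multi-headed, and the spans (1, 4) and (3, 5) cross, while the graph remains acyclic.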
2.2 Semantic Relation Set
Lu (2001) explained the parataxis network of Chinese grammar. We applied this semantic unit classification and semantic combination, as well as integrated the semantic characteristics, to construct a clear semantic relation scheme. At the same time, we also considered some of the semantic relation tags in HowNet (Dong and Dong 2006).
Semantic units are divided from high to low into event chains, events, arguments, concepts, and marks. Arguments refer to noun phrases related to certain predicates. Concepts are simple elements in basic human thought or content words in syntax. Marks represent the meaning attached to the entity information conveyed by speakers (e.g., speakers’ tones or moods). These semantic units correspond to compound sentences, simple sentences, chunks, content words, and function words. The meanings of sentences are expressed by event chains, which consist of multiple simple sentences. The meanings of simple sentences are expressed by arguments, while arguments are reflected by predicate, referential, or defining concepts. Marks are attached to concepts.
The meaning of a sentence consists of the meanings of its semantic units and of their combinations, which include semantic relations and semantic attachments. Semantic attachments are marks on semantic units, such as prepositions, mood words, and punctuation marks; they are listed in Table 13.1 as “semantic marks.” Semantic relations are classified into symmetric and asymmetric types. Symmetric relations include coordination, selection, and equivalence relations, while asymmetric relations include the following:
1. Cooperative relations occur between core and non-core roles. For example, in 工人修理管道 gongren_xiuli_guandao “workers repair the pipeline,” 管道 guandao “pipeline” serves as a non-core role and is the patient of 修理 xiuli “repair,” which is a verb that serves as a core role. Relations between predicates and nouns belong to cooperative relations. Semantic roles usually refer to cooperative relations. Table 13.1 presents the 32 semantic roles we defined, divided into 8 small categories.
2. Additional relations refer to the modifying relations among concepts within an argument, in which all semantic roles are available; for example, in 地下的管道 dixia_de_guandao “underground pipeline,” 地下 dixia “underground” is the modifier of 管道 guandao “pipeline,” which refers to a location relation.
3. Connectional relations are bridging relations between two events that are neither symmetric nor nested relations. For example, for the sentence “如果天气好, 我会去颐和园 ruguo_tianqi_hao, wo_hui_qu_yiheyuan ‘If the weather is good, I will go to the Summer Palace’,” the former event is the hypothesis of the latter. Fifteen event relations were defined by our scheme.
We analyzed how the elements of each sentence constitute the entire meaning of the sentence and used the results as the theoretical basis in designing the SDG corpus. Table 13.1 shows the entire semantic relations set, which includes five types of semantic relations, i.e., semantic roles, reverse relations, nested relations, event relations, and semantic marks.
2.3 Special Situations
1. Reverse relations. When a verb modifies a noun, a reverse relation is applied with the label r-XX (where XX is a single-level semantic relation). A reverse relation arises when a word pair with the same semantic relation appears in different sentences with different modifying orders; the r- prefix distinguishes the two orders, whose arcs point in opposite directions. For example, in Fig. 13.3a the semantic relation between the head word 男孩 nanhai “boy” and the kernel word 打 da “play” is r-agent, whereas in Fig. 13.3b the label agent is assigned between the kernel word 打 da “play” and its modifier 男孩 nanhai “boy.” The semantic tri-tuple of this word pair is (男孩 nanhai “boy,” 打 da “play,” r-agent) in Fig. 13.3a and (打 da “play,” 男孩 nanhai “boy,” agent) in Fig. 13.3b. Here, the first element of the tri-tuple is the head word, the second is the modifier (dependent) word, and the last is the semantic relation label.
2. Nested events. Two events have a nested relation when one event is regarded as a grammatical constituent of the other, so that the two events belong to two different semantic levels. For example, in the sentence in Fig. 13.4, the event 小孙女在玩计算机 xiao_sunnv_zai_wan_jisuanji “little granddaughter is playing on the computer” is regarded as the content of the action 看见 kanjian “see.” The prefix “d” is added to single-level semantic relations as a “distinctive” label. The tri-tuple for this sentence is (看见 kanjian “see,” 玩 wan “play,” d-content).
3. Quantitative phrases. English has no direct counterpart to Chinese quantifiers such as 个 ge, 本 ben, and 只 zhi. Here, a “quantitative word” refers to the combination of one numeral and one quantifier, such as 十个 shi_ge “ten,” and a “quantitative phrase” refers to the combination of a quantitative word and a noun, such as 十个人 shi_ge_ren “ten persons.” Because the numeral can sometimes be omitted, as in 这本书 zhe_ben_shu “this book,” our scheme labels the quantifier as the head word of the quantitative word and the numeral as the dependent word; the semantic relation between them is labeled “Quan” (quantity), a measurement role. When a quantitative word modifies a noun, the noun is the head word of the whole quantitative phrase and the quantifier is the dependent word. The semantic relation between the noun and the quantitative word is labeled “Qp” (quantity phrase). For example, for the quantitative phrase 五本书 wu_ben_shu “five books,” the semantic tri-tuples are (本 ben “ben,” 五 wu “five,” Quan) and (书 shu “book,” 本 ben “ben,” Qp).
4. Serial verb sentences. When several verbs occur in one sentence with neither a pause punctuation mark nor a conjunction between them, the sentence is called a serial verb sentence or compressed sentence; such a sentence in fact packs more than one event into a single sentence. In most cases, the front verb of a serial verb sentence is selected as the head word; in rare cases, such as manner serial verb sentences, the head word is the rear verb. According to the relations between the verbs, the semantic relations of serial verb sentences are classified as succession, purpose, manner, result, and so on. For instance, the head word of the Chinese sentence “他穿衣服走了。 ta_chuan_yifu_zou_le ‘He put on his clothes and left’.” is the front verb 穿 chuan “wear,” and the relation between the two events is labeled “eSucc” (successor event). The tri-tuple of the two verbs in this sentence is (穿 chuan “wear,” 走 zou “leave,” eSucc). Note that the subject 他 ta “he” has two parent nodes: the verb 穿 chuan “wear” and the verb 走 zou “leave.”
5. “De” structures with the omission of the head word. The Chinese word 的 de “De” is always used as an auxiliary word, and it is often taken as a dependency mark. Sometimes, however, the head word of the De structure is omitted. In this situation, our scheme labels 的 de “De” as the head word. For example, in the Chinese sentence “卖菜的走了。 mai_cai_de_zou_le ‘The man who sold vegetables left’.”, the head word 人 ren “person” of the De structure is omitted. Unlike the Abstract Meaning Representation (AMR) semantic labeling system (Li et al. 2016), our scheme does not add the omitted component to the sentence, so the auxiliary word 的 de “De” is treated as the head word of the De structure, and the tri-tuples are (走 zou “leave,” 的 de “De,” agent) and (的 de “De,” 卖 mai “sell,” r-agent). Because 的 de “De” is normally labeled as an auxiliary mark, an occurrence that is not annotated as a mark signals that a head word has been omitted.
6. Predicate-complement structures. The semantic relations between verbs in serial verb sentences can also be applied to the predicate-complement structure. For example, for the Chinese sentence “他走累了。 ta_zou_lei_le ‘He got tired from walking’.”, the semantic relation between the predicate 走 zou “walk” and the complement 累 lei “tired” is labeled “eResu” (result event), which means that the complement is the result of the verb.
7. Separable words. In Chinese, some words can be separated into two parts; these are called “separable words.” For example, the word 洗澡 xizao “take a bath” can be split into 洗个澡 xi_ge_zao “take a bath” by inserting the quantifier 个 ge “Ge.” In this case, the semantic relation between the two Chinese characters 洗 xi “take” and 澡 zao “bath” is labeled “mSepa” (separation mark).
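The tri-tuples given in the special situations above can be collected into a small data structure. This is an illustrative sketch only: the `Triple` type and the `describe` and `normalize` helpers are hypothetical names. The rule that labels beginning with a lowercase “e” or “m” followed by a capital letter are event relations or semantic marks is an assumption inferred from the examples in the text (eSucc, eResu, mSepa), not a documented convention of the scheme.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str       # first element of the tri-tuple: the head word
    dependent: str  # second element: the modifier (dependent) word
    label: str      # third element: the semantic relation label


# Tri-tuples quoted in the special situations of Sect. 13.2.3.
TRIPLES = [
    Triple("男孩", "打", "r-agent"),    # reverse relation (Fig. 13.3a)
    Triple("看见", "玩", "d-content"),  # nested event
    Triple("本", "五", "Quan"),         # numeral depends on quantifier
    Triple("书", "本", "Qp"),           # quantifier depends on noun
    Triple("穿", "走", "eSucc"),        # serial verb sentence
    Triple("走", "的", "agent"),        # De structure, head word omitted
    Triple("的", "卖", "r-agent"),
    Triple("走", "累", "eResu"),        # predicate-complement structure
]


def describe(label):
    """Split a label into (kind, base relation).

    The r- and d- prefixes come from the text; the e/m rule is an
    assumption based on labels such as eSucc and mSepa.
    """
    if label.startswith("r-"):
        return ("reverse", label[2:])
    if label.startswith("d-"):
        return ("nested", label[2:])
    if label[:1] == "e" and label[1:2].isupper():
        return ("event", label)
    if label[:1] == "m" and label[1:2].isupper():
        return ("mark", label)
    return ("role", label)


def normalize(triple):
    """Rewrite an r-XX triple as the XX triple with the arc reversed."""
    kind, base = describe(triple.label)
    if kind == "reverse":
        return Triple(triple.dependent, triple.head, base)
    return triple


for t in TRIPLES:
    print(t, describe(t.label), normalize(t))
```

Normalizing (男孩, 打, r-agent) yields (打, 男孩, agent), which mirrors the pair of tri-tuples given for Fig. 13.3a and Fig. 13.3b.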
3 Corpus
3.1 Corpus Origin
Our corpus contains more than 30,000 sentences chosen from newspapers, spoken sentences, and Sina Weibo microblogs. For the 10,068 newspaper sentences we selected, word segmentation and part-of-speech (POS) information were labeled using Chinese PropBank 6.01 (Xue and Palmer 2003). The remaining sentences, 10,038 spoken and 10,055 Sina Weibo sentences, had no annotated tags, so we annotated their morphological information before annotating semantic dependencies. The Chinese Treebank (CTB)-style POS tag set, which includes 33 tags, was derived from that of the Penn English Treebank and thus follows an Indo-European word class system.
Table 13.2 presents additional details on our annotated corpus, while Fig. 13.5 plots the number of sentences against sentence length. The spoken sentences come from sources with rich expressions (e.g., dialogues, Chinese-English bilingual sentences, and primary school texts). The sentences in the primary school texts were not all colloquial, as some of them used ornate expressions. The diversity of sources resulted in rich linguistic phenomena. Fan (1998) and Huang and Liao (2003) classified sentence patterns into single and compound sentences from a linguistic perspective. In our annotated corpus, single sentences were categorized into 8 patterns and compound sentences into 12 patterns, and every pattern was represented by sentences in the corpus.
3.2 Annotation Tool
We developed an online annotation tool to enable annotators to conveniently search, annotate, and revise. Figure 13.6 shows the annotation interface of the tool. On the annotation page, two buttons are used to switch to the word segmentation and POS tagging sub-pages. On the history page, sentences are displayed with dependency labels and relations. Annotators can click on a sentence, which will take them to a page to revise the annotation. On the search page, different keywords and their combinations can be used to search for sentences and corresponding annotation results. When annotators are confused about certain words or relations, they can search and learn from other labeling results. This online tool provides helpful functions for those involved in the annotation process.
4 Evaluation of the Corpus
The quality of an annotated corpus is crucial for automatic dependency parsing. To evaluate the quality of our corpus, we measured inter-annotator agreement, that is, the degree to which the same linguistic phenomena were labeled with the same dependency structures and relation labels. We employed three master’s students in linguistics to annotate the same small subcorpus blindly. The subcorpus comprised 422 sentences randomly selected from the 30,000 sentences collected. We evaluated agreement at two levels: dependency arcs only, and dependency arcs together with their relation labels. The average agreement over the three pairs of annotators was 88.78% for arcs only and 72.15% for arcs and relations together. The latter figure is lower because an item counts as an agreement only when both the dependency arc and the corresponding relation label match. Because hundreds of relations were defined, this lower figure is understandable. Table 13.3 shows the agreement results.
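The pairwise averaging described above can be sketched as follows. The chapter does not state the exact formula behind the percentage agreement, so this sketch assumes a Dice-style overlap between the arc sets of two annotators. The function names and the toy annotations are hypothetical and do not come from the corpus.

```python
from itertools import combinations


def dice(a, b):
    """Dice overlap of two sets (assumed agreement measure)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def pairwise_agreement(annotations, labeled):
    """Average agreement over all annotator pairs.

    annotations: one set of (head, dependent, label) arcs per annotator.
    labeled: if False, compare (head, dependent) only; if True, an arc
    agrees only when the relation label matches as well.
    """
    def view(arcs):
        return set(arcs) if labeled else {(h, d) for h, d, _ in arcs}

    scores = [dice(view(a), view(b)) for a, b in combinations(annotations, 2)]
    return sum(scores) / len(scores)


# Toy data: three annotators agree on every arc but not on every label.
ANNOTATOR_A = {(2, 1, "Agt"), (2, 3, "Pat")}
ANNOTATOR_B = {(2, 1, "Agt"), (2, 3, "Cont")}
ANNOTATOR_C = {(2, 1, "Exp"), (2, 3, "Pat")}
ALL = [ANNOTATOR_A, ANNOTATOR_B, ANNOTATOR_C]

unlabeled = pairwise_agreement(ALL, labeled=False)
labeled = pairwise_agreement(ALL, labeled=True)
print(f"arcs only: {unlabeled:.2%}, arcs and relations: {labeled:.2%}")
```

Even in this toy case the labeled score falls below the unlabeled one, since a labeled arc must match in head, dependent, and relation.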
In addition, we evaluated agreement using Fleiss’ kappa (Fleiss 1971). The degree of agreement among all annotators was computed in terms of Fleiss’ kappa (κ), as shown in Eq. (13.1); the equations below are given in the standard form of Fleiss (1971):

\[ \kappa = \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e} \tag{13.1} \]

The proportion of all assignments that were made to the jth assignment type was defined using Eq. (13.2), where N is the total number of words, n is the number of annotators in our resource building work, K is the total number of assignment types used by the annotators, and N × n is the total number of assignments made by all the annotators. The mean proportion of assignments over all assignment types was defined using Eq. (13.3):

\[ p_j = \frac{1}{N n} \sum_{i=1}^{N} n_{ij} \tag{13.2} \]

\[ \bar{P}_e = \sum_{j=1}^{K} p_j^{2} \tag{13.3} \]

The extent of the annotator pairs’ agreement for the ith word was defined using Eq. (13.4), where subscript i (1, …, N) indexes the words and subscript j (1, …, K) indexes the assignment types; thus, n_{ij} is the number of annotators who assigned the ith word to the jth assignment type, and n(n − 1)/2 is the number of annotator pairs. The mean agreement over all words was defined using Eq. (13.5):

\[ P_i = \frac{1}{n(n-1)/2} \sum_{j=1}^{K} \frac{n_{ij}(n_{ij}-1)}{2} = \frac{1}{n(n-1)} \sum_{j=1}^{K} n_{ij}(n_{ij}-1) \tag{13.4} \]

\[ \bar{P} = \frac{1}{N} \sum_{i=1}^{N} P_i \tag{13.5} \]
In this case, n is equal to 3 (i.e., the three annotators that participated in this experiment). The total number of sentences annotated was 422, which included 6634 words. We calculated two Fleiss’ kappa scores, one using arcs as assignments and the other using both arc and relation labels. For the two criteria, we had 48 and 1638 assignments, respectively. We achieved kappa scores of 0.835 and 0.686, respectively, for the two criteria. If all three annotators agreed on all the assignments, then the kappa score would be 1. Generally, when the kappa score is above 0.7, agreement is good, and when the kappa score is below 0.7 but above 0.4, agreement is reasonable. The kappa scores indicated that the three annotators mostly agreed when annotating the semantic dependency graph corpus.
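The quantities referred to as Eqs. (13.1) to (13.5) can be computed in a few lines. The sketch below follows the standard definition of Fleiss (1971) and is not the authors’ evaluation script. The input format (one row per word, one column per assignment type) and the toy counts are assumptions made for illustration.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of counts.

    counts[i][j] is the number of annotators who gave word i the
    assignment j; every row must sum to the same number of annotators n.
    The value is undefined (division by zero) when all assignments fall
    into a single category, because the chance agreement is then 1.
    """
    N = len(counts)     # number of words
    n = sum(counts[0])  # number of annotators
    K = len(counts[0])  # number of assignment types
    if any(sum(row) != n for row in counts):
        raise ValueError("every word needs exactly n assignments")

    # Proportion of all N * n assignments made to each type j.
    p = [sum(row[j] for row in counts) / (N * n) for j in range(K)]
    # Expected agreement by chance.
    P_e = sum(pj * pj for pj in p)
    # Per-word agreement: agreeing annotator pairs over all pairs.
    P_i = [sum(c * (c - 1) for c in row) / (n * (n - 1)) for row in counts]
    # Mean observed agreement.
    P_bar = sum(P_i) / N
    return (P_bar - P_e) / (1 - P_e)


# Toy table: 4 words, 3 annotators, 3 assignment types (hypothetical).
TOY_COUNTS = [
    [3, 0, 0],  # all three annotators agree
    [2, 1, 0],  # two agree, one differs
    [0, 3, 0],  # all three agree
    [1, 1, 1],  # no two annotators agree
]
toy_kappa = fleiss_kappa(TOY_COUNTS)
print(round(toy_kappa, 3))
```

With n = 3, each word contributes a per-word agreement of 1, 1/3, or 0, depending on whether three, two, or no annotators coincide, so a large number of assignment types mainly affects the score through the chance-agreement term.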
5 Corpus Statistics
We computed statistics on our annotated corpus. Table 13.4 lists the most and least frequent labels in the annotated corpus. The five least frequent labels were reverse or nested relations, which are uncommon linguistic phenomena. The most frequent labels are shown in the third and fourth columns. The mPunc (punctuation) label was excluded from this ranking: each sentence had at least one punctuation mark, so the total number of mPunc occurrences exceeded 30,161. Both Exp (experiencer) and Agt (agent) appeared in the top five because they correspond, at the syntactic level, to the subject-predicate structure, which appears frequently in languages. Two semantic marks, mAux (auxiliary mark) and mMod (modal mark), were also among the most frequent labels. Desc (description) appeared the most frequently, as it was used between most adjectives and nouns.
Figure 13.7 shows the number of relations and their frequencies by relation group; frequencies were summed within each group. We recorded 27 nested relations and 28 reverse relations in our annotated corpus. Reverse relations appeared the least among all groups, followed by nested relations. These two kinds of linguistic phenomena are not common in the Chinese language. The occurrence of event relations was directly related to the number of sub-sentences.
Table 13.5 shows the proportions of arcs that cross and of nodes with multiple heads. The statistics were computed over the entire annotated corpus of 30,161 sentences. The proportion of sentences with crossing arcs was 24.31%, while sentences with multiple heads accounted for 30.59%. Figure 13.8a shows an example of a sentence with crossing arcs, and Fig. 13.8b shows an example of a sentence with multiple heads. In example (a), the Agt arc from 哭 ku “cry” to 她 ta “she” crosses the Exp arc from 肿 zhong “swollen” to 眼睛 yanjing “eye”; in example (b), the node 妹妹 meimei “sister” has two parent nodes, 有 you “have” and 能干 nenggan “competent.” As these figures show, a substantial share of Chinese sentences exceed what dependency trees can express, so semantic dependency graphs are needed to describe their semantic structures.
6 Conclusion
This chapter proposed a scheme for Chinese semantic dependency in which each label reflects concrete semantic information. The SDG is a semantic representation that humans can understand both visually and logically. The semantic relations were designed from a linguistic perspective to suit the characteristics of the Chinese language. The scheme abstracts very little from semantic information, which distinguishes it from existing dependency schemes. Because it encodes semantics directly, it employs more relation labels than syntactic dependency schemes. To clarify the boundaries of the relation labels, we classified them into several hierarchies that represent different types of information, namely, main semantic roles, event relations, and semantic marks.
We annotated more than 30,000 sentences based on this scheme. The sentences were chosen from spoken sentences, newswire text, and Sina Weibo microblogs, covering both the common core of the language and more specialized domains. In constructing the corpus, we made full use of the other gold standard information labeled in the sentences to generate pre-annotation results by rules or by machine learning tools. Blind annotation experiments with three annotators were conducted to measure inter-annotator agreement using the widely used Fleiss’ kappa. We achieved kappa scores of 0.835 and 0.686 with unlabeled arcs and labeled arcs as assignments, respectively. These results indicate that the three annotators agreed in the great majority of cases while annotating the corpus, even though the semantic dependency scheme is somewhat complicated.
From the statistics and analysis of the annotated corpus, we conclude that although most Chinese sentences form projective dependency trees, non-projective trees and dependency graphs do occur, albeit in a smaller proportion. Thus, using semantic dependency graphs to describe semantic information is both necessary and reasonable.
Notes
1. CDT and Malt syntactic dependency have 13 and 12 labels, respectively. The Malt dependency corpus was acquired via automatic conversion from Penn Chinese Treebank phrase structure trees using Penn2Malt. Semantic dependency schemes have more than 50 labels, including those produced by Li et al. (2003) and Chen et al. (1999). Hundreds of labels are available in our BLCU-HIT Semantic Dependency Parsing (BH-SDP) system.
References
Böhmová, Alena, Jan Hajič, Eva Hajičová, and Barbora Hladká. 2003. The Prague dependency treebank: A three-level annotation scenario. In Treebanks: Building and using parsed corpora, ed. Anne Abeillé, Amsterdam: Kluwer, 103–127.
Chatterji, Sanjay, Tanaya Mukherjee Sarkar, Pragati Dhang, Samhita Deb, Sudeshna Sarkar, Jayshree Chakraborty, Anupam Basu. 2014. A dependency annotation scheme for Bangla treebank. Language Resources and Evaluation 48:443–477.
Che, Wanxiang, Meishan Zhang, Yanqiu Shao, and Ting Liu. 2012. SemEval-2012 task 5: Chinese semantic dependency parsing. In Proceedings of the First Joint Conference on Lexical and Computational Semantics (Vol. 1): Proceedings of the main conference and the shared task; (Vol. 2): Proceedings of the sixth international workshop on semantic evaluation, Montréal, Canada, 378–384. Available at https://aclanthology.info/papers/S12-1050/s12-1050. Accessed 8 March 2019.
Chen, Feng-Yi, Pi-Fang Tsai, Keh-jiann Chen, and Chu-Ren Huang 陈凤仪, 蔡碧芳, 陈克健, 黄居仁. 1999. Project Report: Sinica Treebank 中文句结构树资料库的构建. Computational Linguistics and Chinese Language Processing 中文计算语言学期刊 4(2):87–104.
Cui, Hang, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. Question answering passage retrieval using dependency relations. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval—SIGIR’05, ACM Press, New York, NY, 400–407. Available at https://www.researchgate.net/publication/221300315_Question_answering_passage_retrieval_using_dependency_relations. Accessed 8 March 2019.
De Marneffe, Marie-Catherine, and Christopher D. Manning. 2008. The Stanford typed dependencies representation. In Proceedings of the COLING 2008 Workshop on Cross-framework and Cross-domain Parser Evaluation, Manchester, United Kingdom, 1–8. Available at https://nlp.stanford.edu/pubs/dependencies-coling08.pdf. Accessed 8 March 2019.
Dong, Qiang, and Zhendong Dong. 2006. HowNet and computation of meaning. World Scientific Publishing Company.
Fan, Xiao 范晓. 1998. The sentence types of Chinese 汉语的句子类型. Shuhai Publishing House.
Fleiss, Joseph L. 1971. Measuring nominal scale agreement among many raters. Psychological Bulletin 76(5):378–382.
Hacioglu, Kadri. 2004. Semantic role labeling using dependency trees. In Proceedings of the 20th International Conference on Computational Linguistics—COLING ’04, Geneva, Switzerland, Article number 1273. 1–4. Available at https://dl.acm.org/citation.cfm?doid=1220355.1220541. Accessed 8 March 2019.
Haverinen, Katri, Jenna Nyblom, Timo Viljanen, Veronika Laippala, Samuel Kohonen, Anna Missilä, Stina Ojala, Tapio Salakoski, Filip Ginter. 2014. Building the essential resources for Finnish: The Turku dependency treebank. Language Resources and Evaluation 48:493–531.
Huang, Bo-rong, and Xu-dong Liao 黄伯荣, 廖旭东. 2003. Contemporary Chinese language 现代汉语. Higher Education Press.
Johansson, Richard, and Pierre Nugues. 2007. LTH: Semantic structure extraction using nonprojective dependency trees. In Proceedings of the 4th International Workshop on Semantic Evaluations, Prague, Czech Republic, 227–230. Available at https://dl.acm.org/citation.cfm?id=1621522. Accessed 8 March 2019.
Li, Mingqin, Juanzi Li, Zhendong Dong, Zuoying Wang, and Dajin Lu. 2003. Building a large Chinese corpus annotated with semantic dependency. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing (Vol. 17), Sapporo, Japan, 84–91. Available at http://aclweb.org/anthology/W03-1712. Accessed 8 March 2019.
Li, Zhenghua, Ting Liu, and Wanxiang Che. 2012. Exploiting multiple treebanks for parsing with quasi synchronous grammars. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers (Vol. 1), Jeju Island, Korea, 675–684. Available at http://ir.hit.edu.cn/~lzh/papers/zhenghua-P12-multi-treebanks.pdf. Accessed 8 March 2019.
Li, Bin, Lijun Wen, Weiguang Qu, Lijun Bu, and Nianwen Xue. 2016. Annotating the Little Prince with Chinese AMRs. In Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016), Berlin, Germany, 7–15. Available at http://aclweb.org/anthology/W16-1702. Accessed 8 March 2019.
Liu, Ting, Jinshan Ma, and Sheng Li 刘挺, 马金山, 李生. 2006. Chinese dependency parsing model based on lexical governing degree 基于词汇支配度的汉语依存分析模型. Journal of Software 软件学报 17(9):1876–1883.
Lu, Chuan 鲁川. 2001. The parataxis network of the Chinese grammar 汉语语法的意合网络. The Commercial Press.
Marimon, Montserrat, and Núria Bel. 2014. Dependency structure annotation in the IULA Spanish LSP treebank. Language Resources and Evaluation 49(2):433–454.
Niu, Zheng-Yu, Haifeng Wang, and Hua Wu. 2009. Exploiting heterogeneous treebanks for parsing. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, Association for Computational Linguistics, Suntec, Singapore, 46–54. Available at http://www.aclweb.org/anthology/P09-1006. Accessed 8 March 2019.
Oepen, Stephan, Marco Kuhlmann, Daniel Zeman, Yusuke Miyao, Dan Flickinger, Jan Hajič, Angelina Ivanova, and Yi Zhang. 2014. SemEval-2014 task 8: Broad-coverage semantic dependency parsing. In Proceedings of the Eighth International Workshop on Semantic Evaluation (SemEval-2014), Dublin City University, Dublin, Ireland, 63–72. Available at http://aclweb.org/anthology/S14-2008. Accessed 8 March 2019.
Pradhan, Sameer, Wayne Ward, Kadri Hacioglu, James H. Martin, Daniel Jurafsky. 2005. Semantic Role Labeling Using Different Syntactic Views. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Ann Arbor, Michigan, 581–588. Available at http://cemantix.org/papers/pradhan-acl-2005.pdf. Accessed 8 March 2019.
Punyakanok, Vasin, Dan Roth, and Wen-tau Yih. 2004. Mapping dependencies trees: An application to question answering. In Proceedings of International Symposium on Artificial Intelligence & Mathematics Fort, 1–10. Available at http://l2r.cs.uiuc.edu/~danr/Papers/PunyakanokRoYi04a.pdf. Accessed 8 March 2019.
Robinson, Jane J. 1970. Dependency structures and transformational rules. Language 46:259–285.
Xue, Nianwen, and Martha Palmer. 2003. Annotating the propositions in the Penn Chinese treebank. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, Sapporo, Japan, 47–54. Available at http://www.aclweb.org/anthology/W03-1707. Accessed 8 March 2019.
Žolkovskij, Aleksandr, and Igor A. Mel’čuk. 1967. O sisteme semantičeskogo sinteza. II: Pravila preobrazovanija [On a system of semantic synthesis (of texts). II: Paraphrasing rules]. Naučno-texničeskaja informacija 2, Informacionnye processy i sistemy, 17–27.
Acknowledgments
We appreciatively acknowledge the support of the National Natural Science Foundation of China (61872402), the Humanities and Social Science Project of the Ministry of Education (17YJAZH068), and the Science Foundation of Beijing Language and Culture University (supported by the Fundamental Research Funds for the Central Universities, 18ZDJ03).
© 2023 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Shao, Y., Che, W., Liu, T., Ding, Y. (2023). The Construction of a Chinese Semantic Dependency Graph Bank. In: Huang, CR., Hsieh, SK., Jin, P. (eds) Chinese Language Resources. Text, Speech and Language Technology, vol 49. Springer, Cham. https://doi.org/10.1007/978-3-031-38913-9_13
DOI: https://doi.org/10.1007/978-3-031-38913-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38912-2
Online ISBN: 978-3-031-38913-9
eBook Packages: Education, Education (R0)