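  The Chinese sentence structure treebank has been under development since 1997 (ROC year 86) by the Chinese Knowledge Information Processing (CKIP) group at Academia Sinica. Sentences are extracted from the Academia Sinica Balanced Corpus of Modern Chinese (Sinica Corpus), parsed automatically into structure trees using the representation of Information-based Case Grammar (ICG) as the basic framework, and then manually corrected and verified. The treebank is currently at version 3.0 and comprises 6 files, 61,087 Chinese trees, and 361,834 words. It is open for online search and data transfer, as a reference for scholars and experts studying Chinese syntax and semantic relations. In addition, 1,000 structure trees are available for download.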

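  The main purpose of building the Chinese sentence structure treebank (Sinica Treebank) is to give Chinese natural language processing research a corpus annotated with sentence structure. Grammatical knowledge can be extracted from the treebank, and extracting and understanding that knowledge in turn helps improve the parsing system.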

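  The grammatical structure of Chinese sentences is represented according to the Head-Driven Principle. When a Chinese sentence is parsed, the type of each phrase is determined by its head, and the syntactic and semantic information recorded on the head and on the other constituents is consulted to express the grammatical structure and the semantic role relations among the words of the sentence. We also propose three auxiliary principles: the small-and-beautiful word class principle (詞類小而美原則), the left-to-right association principle (由左至右聯併原則), and the flatness principle (扁平原則). For details of the representation principles and auxiliary principles, the notation, the semantic roles, and the phrase structures, see Feng-Yi Chen, Pi-Fang Tsai, Keh-Jiann Chen, and Chu-Ren Huang, 《中文句結構樹資料庫 (Sinica Treebank)的構建》 (The Construction of Sinica Treebank).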
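  The head-driven idea can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the bracket notation (`role:category(...)` for phrases, `role:category:word` for words, `|` between sister constituents), the tags `Nb` and `VC`, and the names `Node` and `parse_tree` are hypothetical and are not claimed to match the actual treebank files or any CKIP tool. The sketch only shows how a tree whose constituents carry semantic roles lets one follow the `Head` constituent down to the word that determines the phrase type.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    role: Optional[str]           # semantic role, e.g. "agent", "Head" (None at the root)
    label: str                    # phrase category or word class (illustrative tags)
    word: Optional[str] = None    # surface word, only set on leaves
    children: List["Node"] = field(default_factory=list)

    def head_child(self) -> Optional["Node"]:
        """Return the child constituent marked with the role 'Head', if any."""
        for child in self.children:
            if child.role == "Head":
                return child
        return None

    def lexical_head(self) -> Optional[str]:
        """Follow 'Head' constituents downward until a word is reached."""
        node = self
        while node.children:
            nxt = node.head_child()
            if nxt is None:
                return None
            node = nxt
        return node.word


def parse_tree(s: str) -> Node:
    """Parse the hypothetical bracket notation described above."""
    pos = 0

    def node() -> Node:
        nonlocal pos
        start = pos
        # Read "role:label" or "role:label:word" up to the next delimiter.
        while pos < len(s) and s[pos] not in "(|)":
            pos += 1
        parts = s[start:pos].split(":")
        if pos < len(s) and s[pos] == "(":
            # Phrase: optional role, then category, then bracketed children.
            role = parts[0] if len(parts) == 2 else None
            phrase = Node(role, parts[-1])
            pos += 1  # consume "("
            while True:
                phrase.children.append(node())
                if pos < len(s) and s[pos] == "|":
                    pos += 1  # consume "|" and read the next sister
                    continue
                break
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("missing ')' at position %d" % pos)
            pos += 1  # consume ")"
            return phrase
        # Leaf: optional role, then word class, then the word itself.
        if len(parts) < 2:
            raise ValueError("malformed leaf near position %d" % start)
        role = parts[0] if len(parts) == 3 else None
        return Node(role, parts[-2], parts[-1])

    root = node()
    if pos != len(s):
        raise ValueError("unexpected trailing text at position %d" % pos)
    return root


tree = parse_tree("S(agent:NP(Head:Nb:張三)|Head:VC:打|goal:NP(Head:Nb:李四))")
print([child.role for child in tree.children])  # ['agent', 'Head', 'goal']
print(tree.lexical_head())                      # 打
```

  In this toy sentence the verb is the head of S, so the sentence's lexical head is the verb, while the two noun phrases are attached to it as agent and goal.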

   
 
   
 

Li, Shih-Min, Su-Chu Lin, Chia-Hung Tai and Keh-Jiann Chen. 2006. "A Probe into Ambiguities of Determinative-Measure Compounds". International Journal of Computational Linguistics & Chinese Language Processing, Vol. 11, No. 3, pp. 245-280.

Li, Shih-Min, Su-Chu Lin and Keh-Jiann Chen. 2005. "A Probe into Ambiguities of Determinative-Measure Compounds". The 17th ROCLING Conference on Computational Linguistics and Speech Processing, September 15-16, 2005, National Cheng Kung University, Tainan, Taiwan, ROC.

Li, Shih-Min, Su-Chu Lin and Keh-Jiann Chen. 2005. "Feature Representations and Logical Compatibility between Temporal Adverbs and Aspects". International Journal of Computational Linguistics & Chinese Language Processing, Vol. 10, No. 4, pp. 445-457.

Chen, Keh-Jiann and Yu-Ming Hsieh. 2004. "Chinese Treebanks and Grammar Extraction". Proceedings of IJCNLP-04, pp. 560-565.

Li, Shih-Min, Su-Chu Lin and Keh-Jiann Chen. 2004. "Feature Representations and Logical Compatibility between Temporal Adverbs and Aspects". 5th Chinese Lexical Semantics Workshop (CLSW-5), Singapore (14-16 June, 2004) & Genting Highland, Malaysia (17-19 June, 2004).

Lin, Su-Chu, Shu-Ling Huang and Keh-Jiann Chen. 2004. "Taxonomy of Fine-grain Semantic Roles for Nominal Modifiers". 5th Chinese Lexical Semantics Workshop (CLSW-5), Singapore (14-16 June, 2004) & Genting Highland, Malaysia (17-19 June, 2004).

You, Jia-Ming and Keh-Jiann Chen. 2004. "Automatic Semantic Role Assignment for a Tree Structure". Proceedings of SIGHAN workshop.

Chen, Keh-Jiann, Chu-Ren Huang, Feng-Yi Chen, Chi-Ching Luo, Ming-Chung Chang, Chao-Jan Chen and Zhao-Ming Gao. 2003. "Sinica Treebank: Design Criteria, Representational Issues and Implementation". In Anne Abeillé (Ed.), Treebanks: Building and Using Parsed Corpora. Language and Speech series. Dordrecht: Kluwer, pp. 231-248.

Huang, Chu-Ren, Feng-Yi Chen, Keh-Jiann Chen, Zhao-Ming Gao and Kuang-Yu Chen. 2000. "Sinica Treebank: Design Criteria, Annotation Guidelines, and On-line Interface". Proceedings of the 2nd Chinese Language Processing Workshop (held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, ACL-2000), October 7, 2000, Hong Kong, pp. 29-37.

Chen, Keh-Jiann, et al. 1999. "The CKIP Chinese Treebank: Guidelines for Annotation". ATALA Workshop on Treebanks, Paris, June 18-19, 1999, pp. 85-96.

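Chen, Feng-Yi, Pi-Fang Tsai, Keh-Jiann Chen and Chu-Ren Huang (陳鳳儀、蔡碧芳、陳克健、黃居仁). 1999. 《中文句結構樹資料庫 (Sinica Treebank)的構建》 [The Construction of Sinica Treebank]. Computational Linguistics and Chinese Language Processing, Vol. 4, No. 2, pp. 87-104.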

Chen, Keh-Jiann. 1996. "A Model for Robust Chinese Parser". Computational Linguistics and Chinese Language Processing, Vol. 1, No. 1, pp. 183-204.

Chen, Keh-Jiann, Chu-Ren Huang, Li-Ping Chang and Hui-Li Hsu. 1996. "Sinica Corpus: Design Methodology for Balanced Corpora". Proceedings of the 11th Pacific Asia Conference on Language, Information, and Computation (PACLIC 11), Seoul, Korea, pp. 167-176.

Chen, Keh-Jiann and Chu-Ren Huang. 1994. "Features Constraints in Chinese Language Parsing". Proceedings of ICCPOL '94, pp. 223-228.

CKIP Group (中文詞知識庫小組). 1993. 中文詞類分析 [Analysis of Chinese Parts of Speech]. CKIP-93-05, CKIP Group (中文詞知識庫).

Chen, Keh-Jiann. 1992. "Design Concepts for Chinese Parsers". 3rd International Conference on Chinese Information Processing, pp. 1-22.

Lin, Fu-Wen (林甫雯). 1992. ICG中的論旨角色 [Thematic Roles in ICG]. CKIP-92-01, CKIP Group (中文詞知識庫).

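Chen, Keh-Jiann and Chu-Ren Huang (陳克健、黃居仁). 1989. "訊息為本的格位語法: 一個適用於表達中文的語法模式" [Information-based Case Grammar: A Grammar Formalism Suited to Representing Chinese]. Proceedings of ROCLING II, pp. 97-119.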

   
  Su-Chu Lin (林素朱): jess
   
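  Chinese Parser, Chinese Word Segmentation System, Sinica Corpus (Academia Sinica Balanced Corpus of Modern Chinese), E-HowNet (廣義知網)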
 
   
 
 
   
Copyright (c) CKIP Group (中文詞知識庫小組), Chinese Language Lab, Institute of Information Science, Academia Sinica. All rights reserved.