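Author: Un-Gian Iunn (楊允言)
Title: Processing Techniques for Written Taiwanese -- Tone Sandhi and POS Tagging (台語文處理技術:以變調及詞性標記為例)
Advisor: Cheng-Yan Kao (高成炎)
Degree: Doctoral
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 139
Keywords: Written Taiwanese (台語文); Tone Sandhi (變調); POS Tagging (詞類標記); Peh-Oe-Ji (白話字); Natural Language Processing (自然語言處理)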
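Abstract (Chinese version, translated):
Taiwanese is one of the world's important languages, yet it has not received the attention it deserves. In some respects, the characteristics of written Taiwanese differ considerably from those of Mandarin or English. This dissertation mainly discusses processing techniques for written Taiwanese.
Peh-oe-ji (POJ, romanized Taiwanese) is an important writing system for Taiwanese. We first introduce the character encoding of POJ and describe POJ with numeric tone marks as the internal representation for the different POJ character encodings. For POJ text search, we propose a two-stage search strategy and a method for approximate search of POJ syllables. We also describe a POJ display method, application programs related to POJ word processing, and a word segmentation method for Han-Romanization mixed-script Taiwanese text.
We propose an algorithm that handles tone sandhi with a rule-based approach. Each Taiwanese word is first translated into a Mandarin word to find its part-of-speech (POS) tag information; the POS tags and the tone sandhi rules then determine the tone after sandhi. We implemented a Taiwanese tone sandhi system, which achieves a tone sandhi accuracy of 97.4% on the training data and 89.0% on the test data.
In addition, we propose a POS tagging method. We first developed a word alignment checking program to align the words of two paragraph-aligned Taiwanese texts, then used an HMM probabilistic model to select the most appropriate corresponding Mandarin word, and then used an MEMM classifier to select its POS tag. Our method achieves an accuracy of 91.5%.
Over the past few years, we have built several useful online tools for written Taiwanese. We hope that these tools, together with our preliminary research results, will help research on written Taiwanese processing flourish.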
Abstract (English version):
Taiwan Southern Min (Taiwanese) is an important language that has received little attention worldwide. Written Taiwanese differs considerably from Mandarin and English in some respects. This dissertation focuses on processing techniques for written Taiwanese.
POJ (Peh-oe-ji, romanized Taiwanese) is an important script for Taiwanese. We introduce the character encoding of POJ and use numbered POJ as the interchange code among the various POJ encodings. We then propose a two-stage search strategy for POJ text search, together with query expansion for POJ syllables. We also describe a display method for POJ, POJ word-processing utilities, and a word segmentation method for the Han-Romanization (HR) mixed script.
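The ideas of a numbered-POJ interchange form, a two-stage search (string matching, then filtering), and toneless query expansion can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the function names, the restriction to whitespace- and hyphen-separated syllables without punctuation, and the filtering criterion (a hit must start and end on a syllable boundary). The tone-mark table covers only the five combining diacritics commonly used in POJ; unmarked syllables are treated as tone 4 when they end in p, t, k, or h and as tone 1 otherwise.

```python
import unicodedata

# Combining tone diacritics used in POJ, mapped to tone numbers.
TONE_MARKS = {
    "\u0301": "2",  # acute
    "\u0300": "3",  # grave
    "\u0302": "5",  # circumflex
    "\u0304": "7",  # macron
    "\u030d": "8",  # vertical line above
}


def syllable_to_numbered(syllable: str) -> str:
    """Convert one POJ syllable to numbered POJ (letters followed by a tone digit)."""
    tone = None
    letters = []
    # NFD separates base letters from combining marks, whatever the input form.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    base = unicodedata.normalize("NFC", "".join(letters))
    if tone is None:
        # Unmarked syllables: checked endings are tone 4, all others tone 1.
        tone = "4" if base[-1:].lower() in ("p", "t", "k", "h") else "1"
    return base + tone


def text_to_numbered(text: str) -> str:
    """Convert space-separated words whose syllables are joined by hyphens."""
    return " ".join(
        "-".join(syllable_to_numbered(s) for s in word.split("-"))
        for word in text.split()
    )


def two_stage_search(query: str, numbered_text: str) -> list[int]:
    """Stage 1: plain substring matching. Stage 2: keep only boundary-aligned hits."""
    hits = []
    start = numbered_text.find(query)
    while start != -1:
        end = start + len(query)
        before_ok = start == 0 or numbered_text[start - 1] in " -"
        after_ok = end == len(numbered_text) or numbered_text[end] in " -"
        if before_ok and after_ok:
            hits.append(start)
        start = numbered_text.find(query, start + 1)
    return hits


def toneless_expansion(syllable: str) -> list[str]:
    """Expand a numbered syllable into the same base with every tone number."""
    base = syllable.rstrip("0123456789")
    return [base + tone for tone in "1234578"]


text = text_to_numbered("lâng âng-sek")
print(text)                            # lang5 ang5-sek4
print(two_stage_search("ang5", text))  # [6]
```

In this sketch the filtering stage is what keeps the query `ang5` from matching inside `lang5`; a toneless query would run the same search once per variant returned by `toneless_expansion`.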
We propose a rule-based tone sandhi algorithm. We translate every Taiwanese word into Mandarin to obtain its POS information. Using the POS data and tone sandhi rules, we then tag each syllable with its post-sandhi tone marker. Finally, we implement a Taiwanese tone sandhi processing system, which achieves accuracy rates of 97.4% and 89.0% on the training and test data, respectively.
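A rough Python sketch of how POS information and tone sandhi rules might combine is given below. It is a simplification under stated assumptions, not the dissertation's rule set: the tone mapping is the general citation-to-sandhi pattern commonly described for Taiwanese (with tone 5 going to tone 7, as in southern varieties), the POS tag names are hypothetical, and the only boundary rule is that a word tagged as a noun, or the last word of the input, ends a tone sandhi group and keeps the citation tone on its final syllable.

```python
# General sandhi mapping for unchecked tones (tone 5 -> 7 is a dialect assumption).
SANDHI = {"1": "7", "2": "1", "3": "2", "5": "7", "7": "3"}

# Hypothetical tag set: words with these tags are assumed to end a tone sandhi group.
GROUP_FINAL_POS = {"N"}


def sandhi_tone(syllable: str) -> str:
    """Return a numbered-POJ syllable with its tone digit replaced by the sandhi tone."""
    base, tone = syllable[:-1], syllable[-1]
    if tone in ("4", "8"):
        if base.endswith("h"):
            # Checked syllables ending in -h: 4 -> 2 and 8 -> 3 (spelling kept as is).
            return base + ("2" if tone == "4" else "3")
        # Checked syllables ending in -p, -t, -k: 4 and 8 swap.
        return base + ("8" if tone == "4" else "4")
    return base + SANDHI.get(tone, tone)


def apply_sandhi(tagged_words: list[tuple[str, str]]) -> list[str]:
    """Apply sandhi to every syllable except the last syllable of group-final words."""
    result = []
    last_index = len(tagged_words) - 1
    for i, (word, pos) in enumerate(tagged_words):
        syllables = word.split("-")
        keeps_citation = i == last_index or pos in GROUP_FINAL_POS
        changed = [sandhi_tone(s) for s in syllables[:-1]]
        final = syllables[-1] if keeps_citation else sandhi_tone(syllables[-1])
        result.append("-".join(changed + [final]))
    return result


sentence = [("goa2", "PRON"), ("khi3", "V"), ("Tai5-pak4", "N")]
print(apply_sandhi(sentence))  # ['goa1', 'khi2', 'Tai7-pak4']
```

A fuller system would need many more boundary rules than this single noun heuristic; the sketch only shows where POS tags enter the decision.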
Additionally, we propose a POS tagging method. We develop a word alignment checker to align, word by word, two paragraph-aligned Taiwanese texts written in different scripts; select the most suitable Mandarin word using a hidden Markov model (HMM); and finally tag the word using a Maximum Entropy Markov Model (MEMM) classifier. The method achieves an accuracy rate of 91.5% on Taiwanese POS tagging.
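The HMM step of choosing one Mandarin word per Taiwanese word from a list of candidates can be sketched as Viterbi decoding over a candidate lattice. The Python example below is only an illustration under assumptions: the function name, the data layout, the floor value for unseen events, and all probabilities are made up, and romanized placeholder strings stand in for the Mandarin candidates. The later MEMM tagging stage is not shown.

```python
import math


def select_mandarin(words, candidates, trans_logp, emit_logp, floor=-20.0):
    """Pick the best Mandarin candidate sequence with Viterbi decoding.

    words:      Taiwanese words in sentence order
    candidates: dict mapping each Taiwanese word to its Mandarin candidates
    trans_logp: dict of log P(current | previous) keyed by (previous, current)
    emit_logp:  dict of log P(taiwanese | mandarin) keyed by (mandarin, taiwanese)
    floor:      log-probability assumed for unseen events
    """
    first = words[0]
    best = {c: emit_logp.get((c, first), floor) for c in candidates[first]}
    backpointers = []
    for word in words[1:]:
        new_best, pointers = {}, {}
        for cur in candidates[word]:
            # Best previous candidate for reaching `cur`.
            prev, score = max(
                ((p, best[p] + trans_logp.get((p, cur), floor)) for p in best),
                key=lambda item: item[1],
            )
            new_best[cur] = score + emit_logp.get((cur, word), floor)
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final candidate.
    path = [max(best, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy data: every number below is invented for the example.
words = ["goa2", "beh4", "khi3"]
candidates = {"goa2": ["wo"], "beh4": ["yao", "yu"], "khi3": ["qu", "qi"]}
trans_logp = {
    ("wo", "yao"): math.log(0.5), ("wo", "yu"): math.log(0.01),
    ("yao", "qu"): math.log(0.6), ("yao", "qi"): math.log(0.05),
    ("yu", "qu"): math.log(0.3), ("yu", "qi"): math.log(0.1),
}
emit_logp = {
    ("wo", "goa2"): math.log(1.0),
    ("yao", "beh4"): math.log(0.7), ("yu", "beh4"): math.log(0.3),
    ("qu", "khi3"): math.log(0.8), ("qi", "khi3"): math.log(0.2),
}
print(select_mandarin(words, candidates, trans_logp, emit_logp))  # ['wo', 'yao', 'qu']
```

Working in log space avoids numerical underflow on long sentences, and the floor value is a crude stand-in for proper smoothing.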
Over the past several years, we have built a number of useful online tools for written Taiwanese. We hope that these tools and our preliminary research results will promote further research on written Taiwanese processing.
Preface i
Acknowledgments iv
Abstract (in Chinese) xi
Abstract xiii
Abbreviations xxv
Chapter 1 Introduction 1
1.1 Background 1
1.1.1 Language Population in Taiwan 1
1.1.2 Southern Min Language Population 2
1.1.3 Another Investigation: the Taiwan Southern Min Viewers 3
1.1.4 The Confusing Name of This Language 5
1.2 Different Types of Written Taiwanese Scripts 6
1.2.1 The Han Characters Script 7
1.2.2 The Romanized Scripts 9
1.2.3 The Han-Romanization Mixed Script 10
1.2.4 Other Scripts 10
1.2.5 Target Scripts in This Dissertation 11
1.3 Issues Related to Written Taiwanese Processing 11
1.4 Organization of This Dissertation 12
Chapter 2 Resources and Survey of Written Taiwanese Processing 19
2.1 Digital Resources for Written Taiwanese 19
2.1.1 Fonts 19
2.1.2 Dictionary 20
2.1.3 Text Corpora 23
2.1.4 Electronic Books 27
2.2 Survey of Written Taiwanese Processing Techniques 28
2.2.1 Input Method 28
2.2.2 Word Segmentation 29
2.2.3 POS Tagging 30
2.2.4 Scripts Conversion 30
2.2.5 Text-to-Speech 30
2.2.6 Translation 33
2.2.7 Parsing 33
2.3 Summary 33
Chapter 3 Coding, I/O for POJ, and Text Processing 35
3.1 Character Code of POJ 35
3.2 Two Kinds of POJ Representation 39
3.3 Search Problem with POJ Text 41
3.3.1 Issues with POJ Text Search 41
3.3.2 Two-Stage Search Method: String Matching Then Filtering 42
3.3.3 Query Expansions: Toneless, Glottal Stop, Checked Syllable, and Vowel 44
3.3.4 Examples of Search Results 47
3.4 POJ Text Display 49
3.4.1 Issues with POJ Text Display 49
3.4.2 POJ and Numbered POJ Conversion Method 50
3.4.3 POJ Graph Display 52
3.4.4 Examples of Display Results 53
3.5 Some Text Processing Utilities for POJ 55
3.5.1 POJ Phoneme Segmentation and Spelling Checker 55
3.5.2 POJ Syllable/Word/Sentence Count 57
3.6 Word Segmentation for HR Mixed Script 58
Chapter 4 Tone Sandhi Problem and Algorithm 63
4.1 Tone Sandhi Problem of the Taiwanese Language 63
4.1.1 Types of the Taiwanese Language Tone Sandhi 64
4.1.2 Boundary of Tone Sandhi Group 68
4.2 Implementation of the Taiwanese Pronunciation System 68
4.2.1 System Diagram 68
4.2.2 Observation Data and Test Data 70
4.2.3 POS Tagging Set 71
4.2.4 Tone Sandhi Marks 73
4.3 Rule-based Tone Sandhi Algorithm 73
4.4 Results, Accuracy Rate and Discussion 78
4.4.1 Experiment Results 78
4.4.2 Accuracy Rate and Related Analysis 80
4.4.3 Discussion 83
4.5 Summary and Possible Direction 85
Chapter 5 POS Tagging Method 87
5.1 Problems of POS Tagging 87
5.2 POS Tagging Methods 88
5.2.1 Origin of the Corpus 89
5.2.2 Word for Word Alignment 89
5.2.3 Searching for the Corresponding Mandarin Candidate Words 90
5.2.4 Selecting the Best Mandarin Translation 91
5.2.5 Selecting the Most Appropriate POS According to the Corresponding Mandarin Word 92
5.3 Results 94
5.4 Error Analysis 99
5.4.1 Incorrect Corresponding Mandarin Word Selection 99
5.4.2 Absence of Appropriate Mandarin Words in the OTMD 100
5.4.3 Unknown Words from the Viewpoint of Mandarin 101
5.4.4 Propagation Error 101
5.4.5 Other Cases 101
5.4.6 Summary of Error Conditions 102
5.5 Discussion 103
5.5.1 Is Improvement Possible? 103
5.5.2 Hyphen Problems, Distinction between Taiwanese and Mandarin 104
5.5.3 The Distinction between Different Eras or Different Genres 105
5.6 Summary 106
Chapter 6 Conclusion and Future Work 109
6.1 Our Contributions to Written Taiwanese Resources and Processing 109
6.2 Future Work and Prospects for Written Taiwanese Processing Research 112
References 117
Appendix 127
A.1 Brief Introduction to the Phonemes of Taiwanese 127
A.1.1 Initials 127
A.1.2 Vowels 128
A.1.3 Tones 129
A.1.4 Compared with Mandarin 130
A.2 Examples of Written Taiwanese 132
A.3 Terminologies 136
A.4 Webpages Made by Author 138
A.5 Differences between POJ and TL 139