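Author: Hsi-Hso Wu (巫溪修)
Thesis title: The Design of an Error Tolerant Chinese Phrase Matching Scheme (容錯性中文語詞比對架構的設計)
Advisor: Min-Wen Du (杜敏文)
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Computer and Information Science (資訊科學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 1999
Academic year of graduation: 87 (ROC calendar)
Language: English
Number of pages: 30
Keywords (translated from Chinese): phrase table; phrase matching system; speech recognition; error tolerance; error correction capability
Keywords (English): Cartesian product; Covering; Approximate string matching; Chinese phrase matching; large phrase table; speech recognition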
Error tolerance, or error correction capability, is highly desirable in the design of a Chinese computer input method. It is especially useful for input methods based on speech recognition technology, where errors are inevitable. Practical applications such as natural language speech recognition often need to handle very large phrase tables. The problem we face is how to design a phrase matching system that corrects errors and still runs in real time over a very large phrase table.
This thesis develops an index scheme that supports error tolerant phrase matching over very large phrase tables. The approach is based on three concepts: (1) the Cartesian product file, (2) covering between buckets, and (3) gradual expansion of the search region. The experimental results show that real-time phrase matching that tolerates multiple errors is feasible with very large phrase tables.
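The record does not give the details of the index, so the following is only a minimal sketch of how a Cartesian product file combined with gradual expansion of the search region might look, under stated assumptions. Phrases are tuples of syllables, only substitution errors are considered, and the partition function `syllable_class`, the constant `NUM_CLASSES`, and every other name here are hypothetical rather than taken from the thesis. The "covering between buckets" concept is not modeled. A phrase's bucket key is the tuple of partition blocks of its syllables, so a phrase that differs from the query in k syllables lies in a bucket whose key differs in at most k positions. The search can therefore start at the query's own bucket and widen one step at a time.

```python
from collections import defaultdict
from itertools import combinations, product

NUM_CLASSES = 4  # hypothetical number of blocks in the syllable partition


def syllable_class(syllable):
    """Hypothetical partition of the syllable domain (not the thesis's own)."""
    return sum(ord(ch) for ch in syllable) % NUM_CLASSES


def bucket_key(phrase):
    """Cartesian-product bucket address: one partition block per syllable."""
    return tuple(syllable_class(s) for s in phrase)


def build_index(phrases):
    """Group phrases into buckets keyed by their bucket address."""
    index = defaultdict(list)
    for phrase in phrases:
        index[bucket_key(phrase)].append(phrase)
    return index


def keys_at_distance(key, r):
    """Yield every bucket key that differs from `key` in exactly r positions."""
    for positions in combinations(range(len(key)), r):
        alternatives = [
            [c for c in range(NUM_CLASSES) if c != key[p]] for p in positions
        ]
        for choice in product(*alternatives):
            new_key = list(key)
            for p, c in zip(positions, choice):
                new_key[p] = c
            yield tuple(new_key)


def mismatches(a, b):
    """Number of positions where two equal-length phrases differ."""
    return sum(x != y for x, y in zip(a, b))


def match(index, query, max_errors):
    """Return the stored phrases closest to `query`, within `max_errors`."""
    key = bucket_key(query)
    found = []  # (error count, phrase) pairs
    for r in range(min(max_errors, len(query)) + 1):
        # Widen the search region to buckets exactly r positions away.
        for k in keys_at_distance(key, r):
            for phrase in index.get(k, ()):
                d = mismatches(query, phrase)
                if d <= max_errors:
                    found.append((d, phrase))
        # Every phrase with at most r errors has been seen by now, so a
        # best match with at most r errors cannot be beaten or tied later.
        if found and min(d for d, _ in found) <= r:
            break
    if not found:
        return []
    best = min(d for d, _ in found)
    return sorted(p for d, p in found if d == best)


phrases = [("tai", "wan"), ("tai", "bei"), ("bei", "jing")]
index = build_index(phrases)
print(match(index, ("dai", "wan"), max_errors=1))  # [('tai', 'wan')]
```

Stopping as soon as the best match found has no more errors than the current radius keeps the number of buckets fetched small for clean input, while noisy input pays only for the extra rings it actually needs.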

Chapter 1. Introduction
1.1 The Problem
1.2 Assumptions and Considerations
1.3 Organization of the Thesis
Chapter 2. Phrase Processing and Buckets Constructing
2.1 Set of Syllables in Mandarin
2.2 Partition of Syllable Domain
2.3 Cartesian Product of Syllable Domain and Phrase Bucket
2.4 An Error Tolerant Phrase Matching Procedure
Chapter 3. Bucket Index Structure
3.1 Buckets and Partitions of Phrase Table
3.2 Buckets with Unknown X Components
3.3 Covering Relations Between Enlarged Buckets
3.4 Bucket Index Structure
3.5 Hash Function Calculation
3.6 Example of Bucket Searching
Chapter 4. Performance Analysis
4.1 Number of Inverted Files Required
4.2 Memory requirements
4.3 Execution Time Requirements
Chapter 5. Experimental Results
5.1 The Buckets Creation
5.2 Domain Partitioning
5.3 Maximum Bucket Size
5.4 Total Number of Buckets Fetched
Chapter 6. Conclusion
References

