Author: Hsu Kuo-Tung
Thesis title: Design of a Web-based Speaking Training System
Advisor: Herng-Yow Chen
Keywords: E-learning/distance learning; ASR; TTS; Alignment; Dynamic programming
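Web-based teaching systems now come in many kinds, and language-training systems are among the most popular. Most of these language-learning systems, however, provide training modes and materials only for listening, reading, and writing; few address the training of, or research on, speaking ability. This thesis therefore combines current automatic speech recognition and speech synthesis technologies and, using dynamic programming, designs a phoneme-based string matching and scoring mechanism that finds a learner's pronunciation errors and analyzes them in order to give the user appropriate suggestions and ways to improve. These scoring and analysis mechanisms form the basis of a complete web-based speaking training environment. Several modes are designed for different training purposes, using two-way, interactive learning to give users thorough training materials and the best possible learning outcome. Preliminary results have been applied in the speaking training prototype of the NCNU Web-based Multimedia English Classroom (http://english.csie.ncnu.edu.tw).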

With the rapid development of Internet and multimedia technologies, e-learning and distance-learning systems have advanced quickly in recent years. Language training is one of the most popular domains in distance-learning applications. To date, however, most systems are designed for listening, writing, and reading training; few are designed for speaking training. The objective of this study is to design a web-based speaking training system by integrating automatic speech recognition (ASR), text-to-speech (TTS) technologies, and a proposed phonetic-string-based alignment method for pronunciation scoring. The alignment method, based on dynamic programming, detects the possibly mispronounced parts of an utterance, and the scoring method identifies mispronunciations in students' speech. The error indications fed back to learners can then be used for further improvement.
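The alignment idea can be illustrated with a minimal sketch. It assumes a plain unit-cost edit distance over phoneme lists; the thesis's actual cost weights and scoring formula are not given in this record, so the `align` and `score` functions and the normalization used here are hypothetical.

```python
def align(ref, hyp):
    """Align a reference phoneme list with a recognized one.

    Returns (distance, ops), where ops lists match/substitute/delete/insert
    steps. Unit costs are an assumption made for illustration.
    """
    n, m = len(ref), len(hyp)
    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitute
                             dist[i - 1][j] + 1,         # phoneme dropped
                             dist[i][j - 1] + 1)         # phoneme added

    # Walk back from the bottom-right cell to recover the edit operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (0 if same else 1):
            ops.append(("match" if same else "substitute", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return dist[n][m], ops


def score(ref, hyp):
    """Hypothetical score in [0, 1]: 1 minus the length-normalized distance."""
    distance, _ = align(ref, hyp)
    return 1.0 - distance / max(len(ref), len(hyp), 1)


# Target word "think" versus a learner utterance recognized as "sink".
target = ["TH", "IH", "NG", "K"]
spoken = ["S", "IH", "NG", "K"]
distance, ops = align(target, spoken)
errors = [op for op in ops if op[0] != "match"]
print(distance, errors, score(target, spoken))
```

The non-match operations point at the phonemes a learner would be asked to practice again; in this example the only flagged step is the substitution of `TH` by `S`.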
The experimental results show that phonetic encoding schemes (e.g. CMUdict) are useful for reducing the distance between two different strings with similar pronunciation; hence, they can compensate for some ASR recognition errors. A prototype system has been implemented to assist students in speaking practice.
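A small sketch shows why comparing strings in the phonetic domain can absorb some recognition errors. The `PRONUNCIATIONS` table is a hand-written, hypothetical stand-in for a pronunciation dictionary (ARPAbet-style symbols with stress marks omitted), not data loaded from CMUdict, and `edit_distance` is a generic unit-cost Levenshtein distance rather than the thesis's own measure.

```python
# Hypothetical mini pronunciation table (ARPAbet-style, stress marks omitted).
PRONUNCIATIONS = {
    "right": ["R", "AY", "T"],
    "write": ["R", "AY", "T"],
    "night": ["N", "AY", "T"],
    "knight": ["N", "AY", "T"],
}


def to_phonemes(word):
    """Look up a word's phoneme list in the illustrative table."""
    return PRONUNCIATIONS[word.lower()]


def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j - 1] + (x != y),  # match or substitute
                            prev[j] + 1,             # delete from a
                            curr[j - 1] + 1))        # insert into a
        prev = curr
    return prev[-1]


# The learner says "right" but the recognizer outputs the homophone "write".
letter_distance = edit_distance("right", "write")
phoneme_distance = edit_distance(to_phonemes("right"), to_phonemes("write"))
print(letter_distance, phoneme_distance)
```

At the letter level the two words differ, so a naive comparison would penalize the learner for a recognizer confusion; at the phoneme level the distance is zero and no error is reported.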

Contents
List of Figures
List of Tables
Abstract
Chinese Abstract
Chapter 1 Introduction
1.1 Motivation
1.2 Introduction
1.3 Research Issues
1.4 Organization of this Thesis
Chapter 2 Technologies and Related Work
2.1 Automatic Speech Recognition
2.2 Text-to-Speech
2.3 Streaming Technology
2.4 Related Work
2.4.1 Automatic Detection and Correction of the Mispronunciation
2.4.2 Grading/Scoring for the Pronunciation
2.4.3 Feedback in Speaking Training Systems
Chapter 3 Phonetic-String-Based Alignment for Scoring and Analysis
3.1 Overview of Alignment Problem
3.1.1 Sequences Alignment
3.2 The Phonetic Domain String Alignment
3.2.1 Phonetic Encoding
  Phonetic Encoding Scheme
3.2.2 Phonetic Domain String Alignment
  Phonetic String Distance Measurement
  Phonetic String Similarity
  Phonetic String Alignment
3.2.3 Analysis Module
3.3 Summary
Chapter 4 System Architecture and Implementation
4.1 System Framework
4.1.1 Server Side
4.1.2 Client Side
4.2 Training Modes
4.2.1 Vocabulary Practice
4.2.2 Role-Playing Functionality
4.3 Experiments
4.4 Summary
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Appendix Publication List

