Graduate student (English name): Hsu Kuo-Tung
Thesis title (English): Design of a Web-based Speaking Training System
Advisor (English name): Herng-Yow, Chen
Keywords (English): E-learning/distance learning; ASR; TTS; Alignment; Dynamic programming
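Web-based instructional systems now come in many kinds, and language-training systems are among the most popular. Most of these language-learning systems, however, design their training modes and materials only for listening, reading, and writing skills; few address training in, or research on, speaking. This thesis therefore combines current speech recognition and speech synthesis technologies and, using dynamic programming, designs a phoneme-based string-matching scoring mechanism that finds learners' pronunciation errors and analyzes them in order to give users appropriate suggestions and ways to improve. These scoring and analysis mechanisms are used to build a complete web-based speech training environment. Several modes are designed for different training purposes, and two-way, interactive learning provides users with comprehensive training materials, with the aim of achieving the best learning results. The preliminary research results have been applied to the speech training prototype of the National Chi Nan University (暨大) web-based multimedia English classroom (http://english.csie.ncnu.edu.tw).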

With the rapid development of Internet and multimedia technologies, e-learning and distance-learning systems have advanced quickly in recent years. Language training is one of the popular application domains of distance learning. To date, however, most systems are designed for listening, writing, and reading training; few are designed for speaking training. The objective of this study is to design a web-based speaking training system by integrating automatic speech recognition (ASR), text-to-speech (TTS) technologies, and the proposed phonetic-string-based alignment method for pronunciation scoring. The alignment method, based on dynamic programming, is used to detect possibly mispronounced parts of an utterance. The scoring method identifies mispronunciations in the speech uttered by students, so that the error indications fed back to learners can guide further improvement.
The experimental results show that phonetic encoding schemes (e.g., CMUdict) are useful for reducing the distance between two different strings with similar pronunciations; hence, they can be used to compensate for some ASR recognition errors. A prototype system has been implemented to assist students in speaking practice.
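
The alignment idea described in the abstract can be illustrated with a minimal sketch. It assumes that a standard edit-distance (Levenshtein) alignment over phoneme sequences stands in for the thesis's alignment method, whose actual cost function and scoring formula are not given here. The names TOY_LEXICON, align, and score are hypothetical, the lexicon entries are hand-written ARPAbet-style transcriptions with stress marks omitted, and the normalized score is only one plausible choice.

```python
from typing import List, Optional, Tuple

# Hypothetical, hand-written ARPAbet-style entries (stress marks omitted).
# A real system would look these up in a pronunciation dictionary.
TOY_LEXICON = {
    "ship": ["SH", "IH", "P"],
    "sheep": ["SH", "IY", "P"],
    "think": ["TH", "IH", "NG", "K"],
    "sink": ["S", "IH", "NG", "K"],
}

Pair = Tuple[Optional[str], Optional[str]]


def align(ref: List[str], hyp: List[str]) -> Tuple[int, List[Pair]]:
    """Align two phoneme sequences with unit-cost edit distance.

    Returns the distance and the aligned pairs; None marks a gap
    (a phoneme that was deleted from ref or inserted in hyp).
    """
    n, m = len(ref), len(hyp)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(substitute, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))  # phoneme missing from hyp
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # extra phoneme in hyp
            j -= 1
    pairs.reverse()
    return dist[n][m], pairs


def score(ref: List[str], hyp: List[str]) -> float:
    """Assumed similarity score in [0, 1]: 1 minus the normalized distance."""
    distance, _ = align(ref, hyp)
    return 1.0 - distance / max(len(ref), len(hyp), 1)
```

For example, aligning the target "sheep" against a recognized "ship" with align(TOY_LEXICON["sheep"], TOY_LEXICON["ship"]) gives a distance of 1, and the only mismatched pair, ("IY", "IH"), points at the vowel as the likely mispronunciation. Comparing phonemes rather than letters is what allows differently spelled but similar-sounding recognition results to stay close to the target.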

Contents
List of Figures
List of Tables
Abstract
Chinese Abstract
Chapter 1 Introduction
1.1 Motivation
1.2 Introduction
1.3 Research Issues
1.4 Organization of this Thesis
Chapter 2 Technologies and Related Work
2.1 Automatic Speech Recognition
2.2 Text-to-Speech
2.3 Streaming Technology
2.4 Related Work
2.4.1 Automatic Detection and Correction of the Mispronunciation
2.4.2 Grading/Scoring for the Pronunciation
2.4.3 Feedback in Speaking Training Systems
Chapter 3 Phonetic-String-Based Alignment for Scoring and Analysis
3.1 Overview of Alignment Problem
3.1.1 Sequences Alignment
3.2 The Phonetic Domain String Alignment
3.2.1 Phonetic Encoding
Phonetic Encoding Scheme
3.2.2 Phonetic Domain String Alignment
Phonetic String Distance Measurement
Phonetic String Similarity
Phonetic String Alignment
3.2.3 Analysis Module
3.3 Summary
Chapter 4 System Architecture and Implementation
4.1 System Framework
4.1.1 Server Side
4.1.2 Client Side
4.2 Training Modes
4.2.1 Vocabulary Practice
4.2.2 Role-Playing Functionality
4.3 Experiments
4.4 Summary
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Appendix Publication List

