National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

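Graduate student: 魏綸毅
Graduate student (English): Lun-Yi Wei
Thesis title: 針對語調訓練之線上口說英文學習系統
Thesis title (English): Online Spoken English Learning System with Intonation Training
Advisor: 張克寧
Advisor (English): Keh-Ning Chang
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Number of pages: 34
Keywords (Chinese): 語音辨識; 音高追蹤; 語音文字比對; 重音
Keywords (English): speech recognition; pitch tracking; speech-text alignment; stress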
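English teaching has long been widespread. To prepare students for examinations, traditional English teaching emphasizes only grammar and vocabulary size. To better reflect how English is used in real life, TOEFL has now added a speaking test. In practice, however, English is not our mother tongue, and when we communicate with native speakers, incorrect pronunciation or intonation often leaves them unable to follow what we are saying. Communicating normally with English speakers requires sustained, hard effort. When we speak English, our intonation always sounds different from that of native speakers, and these differences are related to the correctness of pronunciation and intonation.
This thesis combines techniques such as speech recognition and pitch tracking, and uses a speech recognition engine to find the time correspondence between the text and the speech. Because pronunciation is not always accurate, the recognition result cannot always locate the time span of every word, so a speech-text alignment algorithm is then applied to recover the complete correspondence between the spoken audio and the text. Next, pitch tracking from speech analysis computes the pitch of the speech. Combining these techniques yields the pitch of each English word in the utterance, and stress is then determined from the average pitch, the maximum pitch, and the duration of each word.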
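As an illustration of the pitch-tracking step, the following is a minimal sketch of autocorrelation-based (ACF) pitch estimation for a single frame, the method listed under pitch tracking in the table of contents. The function name `acf_pitch`, the 75 to 500 Hz search range, and the 0.3 voicing threshold are assumptions made for this example and are not taken from the thesis.

```python
import math


def acf_pitch(frame, sample_rate, f_min=75.0, f_max=500.0, voicing=0.3):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Returns None for silent or weakly periodic (likely unvoiced) frames.
    f_min, f_max and voicing are illustrative defaults, not thesis values.
    """
    n = len(frame)
    if n == 0:
        return None
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove the DC offset
    energy = sum(v * v for v in x)           # autocorrelation at lag 0
    if energy == 0.0:
        return None                          # silent frame

    # Only search lags that correspond to plausible speech pitches.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(n - 1, int(sample_rate / f_min))

    best_lag, best_val = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        acf = sum(x[i] * x[i + lag] for i in range(n - lag))
        if acf > best_val:
            best_lag, best_val = lag, acf

    # Reject frames whose best peak is weak relative to the frame energy.
    if best_lag is None or best_val / energy < voicing:
        return None
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (100 ms frame).
tone = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(800)]
print(acf_pitch(tone, 8000))
```

For the synthetic tone the estimate should come out close to 200 Hz. Running the function over successive short frames of a recording gives a pitch contour, from which per-word pitch values can be collected once word boundaries are known.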
English teaching has been widespread for a long time. Traditional English teaching focuses on vocabulary and grammar in order to help students pass examinations. TOEFL has now added a speaking test to make English more practical in everyday life. Nevertheless, we are not native speakers, and native speakers may not easily understand what we say because of our intonation. Our research focuses on intonation correction.
We propose a system that can determine which words are stressed. By viewing the differences in stress between the learner's speech and the native speaker's, the learner can train his or her intonation until the differences are gone.
Our system employs speech recognition and speech analysis techniques. The native speaker's speech is fed into the speech recognition engine, and the alignment between speech and text is then performed. Next, the pitch of the speech is computed by pitch tracking, a speech analysis technique. Through these steps, the pitch values of each word are obtained. From them, the maximum pitch, the average pitch, and the duration of each word are derived. These three factors are used to decide the stress.
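The abstract does not say how the three factors are combined, so the following is only a sketch of one possible scoring scheme: each factor is divided by its sentence-level mean, the three ratios are averaged, and a word counts as stressed when its score exceeds a threshold. The names `WordSegment`, `stress_scores`, and `stressed_words`, the equal weights, and the 1.1 threshold are all hypothetical choices for this example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class WordSegment:
    text: str
    start: float                                  # word start time in seconds
    end: float                                    # word end time in seconds
    pitches: list = field(default_factory=list)   # voiced-frame pitch values (Hz)


def stress_features(seg):
    """Return (maximum pitch, average pitch, duration) for one word."""
    if seg.pitches:
        return (max(seg.pitches), mean(seg.pitches), seg.end - seg.start)
    return (0.0, 0.0, seg.end - seg.start)        # fully unvoiced word


def stress_scores(segments, weights=(1.0, 1.0, 1.0)):
    """Score each word relative to the sentence average of each factor.

    The equal weighting is an assumption, not the thesis's rule.
    """
    feats = [stress_features(s) for s in segments]
    norms = [mean(col) for col in zip(*feats)]    # per-factor sentence mean
    total = sum(weights)
    scores = []
    for row in feats:
        parts = [w * (f / n if n else 0.0)
                 for w, f, n in zip(weights, row, norms)]
        scores.append(sum(parts) / total)
    return scores


def stressed_words(segments, threshold=1.1):
    """Words whose score exceeds the (hypothetical) threshold."""
    scores = stress_scores(segments)
    return [s.text for s, sc in zip(segments, scores) if sc > threshold]


# Usage with made-up timings and pitch values for "I want that one".
sentence = [
    WordSegment("I", 0.00, 0.15, [180.0, 185.0]),
    WordSegment("want", 0.15, 0.45, [190.0, 200.0, 195.0]),
    WordSegment("that", 0.45, 0.95, [240.0, 260.0, 250.0]),
    WordSegment("one", 0.95, 1.20, [170.0, 165.0]),
]
print(stressed_words(sentence))
```

Running the same scoring on the native speaker's recording and on the learner's recording, then comparing the two lists of stressed words, gives the kind of difference view the abstract describes.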
List of Figures
List of Tables
Abstract
Abstract (Chinese)
Chapter 1 Introduction
1.1 Motivation
1.2 Web-Based Learning System
1.3 Speaking
1.4 Intonation Information
1.5 Pronunciation of Word
Chapter 2 Related Works
Chapter 3 Speech Analysis
3.1 Volume
3.2 Zero Crossing Rate
3.3 Endpoint Detection
3.4 Pitch Tracking
ACF (Autocorrelation Function)
Chapter 4 Speech-Text Alignment
4.1 Preprocessing
4.2 Phonetic Domain String Alignment
4.2.1 Phone Encoding
4.2.2 Dynamic Programming Alignment Module
4.2.3 Word-Based Timestamp Prediction
4.3 Summary
Chapter 5 System Architecture and Implementation
5.1 System Architecture
5.2 Implementation
5.2.1 Syllable Segmentation
5.2.2 Wave Information
5.2.3 Stress of a Sentence and a Word
Chapter 6 Conclusion
6.1 Conclusion
6.2 Future Work
References
Internet References
Appendix A
[Ambr02] Neri, Ambra; Cucchiarini, Catia; Strik, Helmer (2002) Feedback in Computer Assisted Pronunciation Training: Technology Push or Demand Pull? ICSLP-2002 Technical Program.
[Huan93] Huang, Xuedong; Alleva, Fileno; Hwang, Mei-Yuh; Rosenfeld, Ronald (1993) The SPHINX-II Speech Recognition System: An Overview. Computer Speech and Language, 7(2), pp. 137-148.
[Weit01] Chu, Wei-Ta (2001) Exploiting Computer Synchronization and Its Application for Navigated Hypermedia Documents. Master Thesis.
[Weit02] Chu, Wei-Ta; Chen, Herng-Yow (2002) Cross-Media Correlation: A Case Study of Navigated Hypermedia Documents. ACM MM, pp. 57-66.
[Weit04] Chu, Wei-Ta; Chen, Herng-Yow (2004) Toward Better Retrieval and Presentation by Exploiting Cross-Media Correlation. Springer Multimedia Systems, Vol. 10, pp. 183-198.
[Hall80] Hall, Patrick; Dowling, Geoff (1980) Approximate String Matching. Computing Surveys, 12(4), pp. 381-402.
[Gadd88] Gadd, T. N. (1988) Phonetic Retrieval of Written Text in Information System. Program: Automated Library and Information Systems, 22(3), pp. 222-237.
[Phil90] Philips, Lawrence (1990) Hanging on the Metaphone. Computer Language Magazine, Vol. 7, No. 12, pp. 38-43.
[Shne00] Shneiderman, Ben (2000) The Limits of Speech Recognition. Communications of the ACM, 43(9), pp. 63-65.
[Rone97] Ronen, Orith; Neumeyer, Leonardo; Franco, Horacio (1997) Automatic Detection of Mispronunciation for Language Instruction. Proc. of EUROSPEECH 97, pp. 645-648.