National Digital Library of Theses and Dissertations in Taiwan

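Graduate student: 林佳琪
Graduate student (English name): Chia-Chi Lin
Thesis title: 適用於DAISY數位有聲書之中英夾雜語音合成系統
Thesis title (English): Mixed Chinese-English Speech Synthesis System for DAISY Digital Talking Book Applications
Advisor: 廖元甫
Oral defense committee: 郭志忠, 王逸如
Oral defense date: 2012-07-31
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Computer and Communication Engineering (電腦與通訊研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2012
Academic year of graduation: 100 (ROC calendar)
Language: Chinese
Number of pages: 87
Keywords (Chinese): 語音合成 (speech synthesis); 中英夾雜 (mixed Chinese-English); 文脈訊息 (contextual information)
Keywords (English): Speech Synthesis; Mixed Chinese-English Speech; Context Dependent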
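The goal of this study is to design a mixed Chinese-English text-to-speech (TTS) synthesis system suitable for DAISY digital talking books. To this end, we first design a bilingual corpus and use the Extended SAM Phonetic Alphabet (X-SAMPA) to unify the Chinese and English phone sets. We then analyze the input text to extract context-dependent information, considering the contexts that can improve the naturalness of a whole passage: context above the sentence level, the interpretation of symbols, whether a sentence is pure Chinese, pure English, or mixed Chinese-English, and semantic analysis. From these we design the label files and the question set for the decision trees, and finally synthesize the speech.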
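One of the contexts listed above is whether a sentence is pure Chinese, pure English, or mixed. The following is a minimal sketch of such a check, assuming that "Chinese" means characters in the basic CJK Unified Ideographs block and "English" means ASCII letters. The function name `language_context` and the returned tags are hypothetical and are not taken from the thesis.

```python
import re

# Assumption: basic CJK Unified Ideographs block stands in for "Chinese".
_HAN = re.compile(r"[\u4e00-\u9fff]")
# Assumption: ASCII letters stand in for "English".
_LATIN = re.compile(r"[A-Za-z]")


def language_context(sentence: str) -> str:
    """Return 'zh', 'en', 'mixed', or 'other' for one sentence."""
    has_zh = bool(_HAN.search(sentence))
    has_en = bool(_LATIN.search(sentence))
    if has_zh and has_en:
        return "mixed"
    if has_zh:
        return "zh"
    if has_en:
        return "en"
    return "other"  # digits or symbols only


if __name__ == "__main__":
    for text in ["今天天氣很好", "This is a book.", "我喜歡DAISY有聲書"]:
        print(text, "->", language_context(text))
```

A tag like this could be attached to each sentence's label so that decision-tree questions can ask about it; how the thesis actually encodes this context is not stated in the abstract.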
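In the overall subjective evaluation, the acceptability of long synthesized passages scored 3.70, overall naturalness 3.19, and overall similarity 3.23. Broken down by language, naturalness scored 3.47 for Chinese, 2.93 for English, and 3.17 for mixed Chinese-English; similarity scored 3.53 for Chinese, 2.92 for English, and 3.23 for mixed Chinese-English. Finally, a preference test between the systems with and without semantic information gave a 50%:50% split, indicating that semantic information has little effect. In summary, our system is acceptable, but there is considerable room for improvement.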
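The overall naturalness and similarity scores reported above agree with an unweighted mean of the three per-language scores. The short check below uses only the numbers given in the abstract. Whether the thesis actually computed the overall scores this way is an assumption, and the helper name `overall` is hypothetical.

```python
from statistics import mean

# Per-language scores as reported in the abstract.
naturalness = {"zh": 3.47, "en": 2.93, "mixed": 3.17}
similarity = {"zh": 3.53, "en": 2.92, "mixed": 3.23}


def overall(scores: dict) -> float:
    """Unweighted mean of the per-language scores, rounded to 2 decimals."""
    return round(mean(list(scores.values())), 2)


if __name__ == "__main__":
    print("naturalness:", overall(naturalness))  # compare with reported 3.19
    print("similarity:", overall(similarity))    # compare with reported 3.23
```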


The goal of this study is to design a mixed Chinese-English speech synthesis system for DAISY Digital Talking Books. To improve the quality of synthesized speech, especially for long paragraphs or even whole stories, several key points are carefully considered: (1) design and collection of a suitable bilingual corpus, (2) unification of the English and Chinese transcriptions using the Extended SAM Phonetic Alphabet (X-SAMPA), and (3) extraction of meaningful linguistic and semantic cues beyond the sentence level.
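Point (2) can be illustrated with a small sketch: each language's native phone labels are mapped to X-SAMPA symbols, and the union of the mapped symbols forms the shared phone set. The two mapping tables below are hypothetical, partial examples (a few ARPAbet labels for English and a few Pinyin-based labels for Mandarin); they are not the phone set used in the thesis.

```python
# Hypothetical partial mappings, for illustration only.
ARPABET_TO_XSAMPA = {
    "M": "m", "N": "n", "S": "s", "F": "f", "L": "l",
    "SH": "S",   # postalveolar fricative, English only here
    "IY": "i", "UW": "u",
}
PINYIN_TO_XSAMPA = {
    "m": "m", "n": "n", "s": "s", "f": "f", "l": "l",
    "sh": "s`",   # retroflex fricative, Mandarin only here
    "x": "s\\",   # alveolo-palatal fricative, Mandarin only here
    "i": "i", "u": "u",
}


def unified_phone_set(*mappings: dict) -> set:
    """Union of all X-SAMPA symbols reachable from the given mappings."""
    phones = set()
    for mapping in mappings:
        phones.update(mapping.values())
    return phones


def shared_phones(a: dict, b: dict) -> set:
    """X-SAMPA symbols that both languages map onto."""
    return set(a.values()) & set(b.values())


if __name__ == "__main__":
    print(sorted(unified_phone_set(ARPABET_TO_XSAMPA, PINYIN_TO_XSAMPA)))
    print(sorted(shared_phones(ARPABET_TO_XSAMPA, PINYIN_TO_XSAMPA)))
```

Symbols that both languages map onto can share training data in a single model set, while language-specific symbols stay separate; which phones the thesis actually merges is not stated in the abstract.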
In a subjective listening assessment, the overall mean opinion scores (MOS) for acceptability, naturalness, and similarity are 3.70, 3.19, and 3.23, respectively. In detail, the naturalness scores for Chinese, English, and mixed Chinese-English speech are 3.47, 2.93, and 3.17, respectively, and the similarity scores for Chinese, English, and mixed Chinese-English speech are 3.53, 2.92, and 3.23, respectively.
In conclusion, our system is acceptable, but there is still considerable room for improvement.


Chinese Abstract
ABSTRACT
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Related Work
1.3 Research Method
1.4 Chapter Overview
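Chapter 2 Monolingual Single-Sentence Speech Synthesis System
2.1 HTS Speech Synthesis System Architecture
2.2 Analysis of Chinese and English as Individual Languages
2.2.1 English Characteristics and Phone Set
2.2.2 Chinese Characteristics and Phone Set
2.3 Commonly Used Within-Sentence Contextual Information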
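Chapter 3 Mixed Chinese-English Long-Passage Speech Synthesis System
3.1 Analysis and Integration of Bilingual Characteristics
3.2 Corpus Design
3.2.1 Corpus Selection and Recording
3.2.2 Coverage
3.2.2.1 Monophone Coverage
3.2.2.2 Diphone Coverage
3.3 Passage-Level Contextual Information
3.3.1 Context Above the Sentence Level
3.3.2 Semantic Context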
Chapter 4 System Implementation
4.1 System Architecture
4.1.1 Word Segmenter
4.1.2 Syntactic Parsing
4.1.3 Segmentation Information
4.1.4 Semantic Analysis
4.2 Context Extraction
4.3 Decision Trees
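Chapter 5 Evaluation, Analysis, and Discussion of Experimental Results
5.1 Experimental Setup
5.1.1 Training Corpus
5.1.2 Synthesis Corpus
5.1.3 Evaluation Method
5.1.4 Preference Test With and Without Semantic Information
5.2 Experimental Results and Analysis
5.2.1 Evaluation Results and Discussion
5.2.2 Results of the Preference Test With and Without Semantic Information
5.3 Discussion of Experiments
5.3.1 Pause Positions
5.3.2 Polyphonic Characters
5.3.3 Fluency
5.3.4 Tones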
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References
Appendix A: Penn English Treebank
Appendix B: Penn Chinese Treebank
Appendix C: Big5 Encoding
Appendix D: Decision Trees


