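Graduate student: 蔡幸峰
Graduate student (English name): Hsin-Feng Tsai
Thesis title: 一位元可調CELP語音編碼方法
Thesis title (English): A Bit Rate Scalable CELP Coding Method
Advisor: 陳進興
Advisor (English name): Chin-Hsing Chen
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Department of Electrical Engineering (Master's and Doctoral Program)
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2004
Graduation academic year: 92 (ROC calendar)
Language: English
Number of pages: 92
Keywords (translated from Chinese): scalable, speech
Keywords (English): speech, CELP, scalability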
Code-excited linear prediction (CELP) has been the dominant speech coding scheme in various standards since it was proposed in 1984. MPEG-4 not only adopts CELP as its speech coding scheme but also provides bit rate scalability, which lets a decoder reconstruct a higher-quality speech signal as it receives more parameters.
MPEG-4 improves the quality of the decoded signal layer by layer, using the concept of an "enhancement layer". To overcome the drawback that each enhancement layer adds a large bit rate step, DPWBC (差值位置波底基底編碼) was proposed. DPWBC uses a suitable basis waveform to encode the residual signal. It treats each run of consecutive positive or consecutive negative residual samples as one signal and decides whether to encode that signal according to its width: signals with larger width have higher priority for selection, while signals with smaller width are discarded.
This thesis proposes two further decision criteria for DPWBC: power-decision and energy-decision. Under the power-decision criterion, signals with higher power have higher priority for selection and signals with lower power are discarded; under the energy-decision criterion, signals with higher energy have higher priority for selection and signals with lower energy are discarded. In addition, the thesis tests other basis waveform shapes and different frame sizes.
Experimental results show that the energy-decision criterion performs best of the three criteria, and that the power-decision criterion performs comparably to the width-decision criterion.
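The three decision criteria can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it covers only the selection step, not the basis-waveform encoding. The names `Segment`, `split_same_sign_runs`, and `select_segments` are hypothetical, as is the fixed `keep` count. The sketch also assumes that energy means the sum of squared samples in a run, that power means energy divided by width, and that a zero sample ends the current run; the abstract does not state any of these definitions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    """One run of consecutive same-sign residual samples (hypothetical type)."""
    start: int     # index of the first sample in the run
    width: int     # number of samples in the run
    energy: float  # sum of squared samples (assumed definition)

    @property
    def power(self) -> float:
        # Energy per sample (assumed definition of "power").
        return self.energy / self.width


def split_same_sign_runs(residual: Sequence[float]) -> List[Segment]:
    """Split the residual into runs of consecutive positive or negative samples.

    Assumption: a zero sample belongs to no run and closes the current one.
    """
    samples = list(residual)
    segments: List[Segment] = []
    run_start, run_sign = 0, 0
    # A trailing zero acts as a sentinel that closes the final run.
    for i, x in enumerate(samples + [0.0]):
        sign = (x > 0) - (x < 0)
        if sign != run_sign:
            if run_sign != 0:
                energy = sum(s * s for s in samples[run_start:i])
                segments.append(Segment(run_start, i - run_start, energy))
            run_start, run_sign = i, sign
    return segments


def select_segments(residual: Sequence[float], keep: int,
                    criterion: str = "energy") -> List[Segment]:
    """Keep the `keep` highest-priority runs under the chosen criterion."""
    keys = {
        "width": lambda s: s.width,
        "power": lambda s: s.power,
        "energy": lambda s: s.energy,
    }
    ranked = sorted(split_same_sign_runs(residual),
                    key=keys[criterion], reverse=True)
    # Return the survivors in time order so they can be encoded in sequence.
    return sorted(ranked[:keep], key=lambda s: s.start)


if __name__ == "__main__":
    residual = [1.0, 1.0, 1.0, -1.5, 0.1, 0.1, 0.1, 0.1, 0.1]
    for criterion in ("width", "power", "energy"):
        chosen = select_segments(residual, keep=1, criterion=criterion)
        print(criterion, [(s.start, s.width) for s in chosen])
```

On this toy residual the three criteria each pick a different run: the widest run is a long stretch of small samples, the highest-power run is a single large sample, and the highest-energy run is a medium-length stretch of moderate samples.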
Abstract
Contents
Figure Captions
Table Captions
Chapter 1 Introduction
1.1 Motivation
1.2 Organization of the Thesis
Chapter 2 Overview of Speech Signals
2.1 Speech Production
2.2 Speech Category
2.3 What is Pitch
2.4 Pitch Detection
2.4.1 Auto-Correlation PDA
2.4.2 AMDF PDA
2.4.3 Normalized Auto-Correlation Method
2.5 Physiology of the Human Ear
2.6 Data Compression
2.7 Quality Measure of Speech Coder
Chapter 3 Speech Analysis and Speech Coding
3.1 Short Term Prediction
3.1.1 Source Filter Model
3.1.2 Solution of the LPC Equations
3.2 Long Term Prediction
3.2.1 Long Term Prediction Model
3.2.2 How to obtain and
3.3 LPC to LSF Transformation
3.3.1 What is LSF
3.3.2 LSF Properties
3.3.3 Quantization of LSF
3.4 Conventional Speech Coding
3.4.1 PCM
3.4.2 Log PCM
3.4.3 DPCM
3.5 Analysis-and-Synthesis Speech Coding
3.6 Analysis-by-Synthesis Speech Coding
3.6.1 Perceptual Weighting Filter
3.6.2 More about LTP
3.6.3 More about Weighting and LPC
3.6.4 Various Codebooks
Chapter 4 Multi-Bitrate Coding and Scalability
4.1 Multi-Bitrate Coding
4.2 Scalability
4.2.1 Bitrate Scalability
4.2.2 Bandwidth Scalability
4.3 Strategy for Scalability
4.3.1 DPWBC
4.3.2 Modified DPWBC
Chapter 5 Experimental Results, Discussion and Conclusion
5.1 Experimental Configurations
5.2 Experimental Results
5.3 Discussion and Conclusion
References