
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

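Graduate student: 黃彥彰
Graduate student (romanized): Yen-Chung Huang
Thesis title (Chinese): 一MPEG-4 CELP位元率可調混合編碼方法
Thesis title (English): A Hybrid Bit Rate Scalable Coding Method for MPEG-4 CELP
Advisor: 陳進興
Advisor (romanized): Jin-Xing Chen
Degree: Master's
Institution: National Cheng Kung University
Department: Department of Electrical Engineering (master's and doctoral program)
Discipline: Engineering
Field: Electrical and information engineering
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: English
Number of pages: 86
Keywords (Chinese): 位元率可調 (bit rate scalability); 語音 (speech)
Keywords (English): Speech; Scalability; MPEG-4 CELP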
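Abstract (translated from the Chinese):
Most current speech coders are built around code-excited linear prediction (CELP), and the trend is toward layered coding methods that improve signal quality. MPEG-4 CELP Bit Rate Scalability (MCBRS) is one example.
This thesis proposes a new hybrid bit rate scalable coding method. Its goal is the same as that of MCBRS: to improve speech quality layer by layer. The method has two modes. When the base-layer bit rate is high enough, differential position waveform-based coding (DPWBC) compresses the residual signal; when the base-layer bit rate is low, partial decoding of MCBRS (PDMCBRS) compresses the residual instead. DPWBC does not use LPC coefficients to synthesize the signal. Instead, it gathers statistics on the residual signal and finds suitable waveform bases with which to code it. The coded data comprise each waveform's position, sign, width, and amplitude. The waveforms are first grouped by energy and their coding order is rearranged, and then the position differences are coded, which substantially reduces the bit rate and improves coding accuracy. PDMCBRS keeps the original MCBRS coding method and adds the idea of partially decoding the bitstream. Both modes have a smaller bit rate step than MCBRS and therefore come closer to successive refinement.
Experimental results show that when the base-layer bit rate is high enough and the speech is noisy, DPWBC achieves a higher signal-to-noise ratio than MCBRS, while PDMCBRS matches the coding performance of MCBRS in all cases. The hybrid method that combines the two modes gives better speech quality than standard MCBRS, with a smaller bit rate step.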
Many speech coding standards are based upon code-excited linear prediction (CELP), and it is desirable to enhance signal performance by using layered coding methods that are compatible with this base coder.
In this thesis, a hybrid bit rate scalable coding method is proposed for MPEG-4 CELP. It addresses functionality similar to that of MPEG-4 CELP Bit Rate Scalability (MCBRS) and is likewise layered to enhance signal performance. It offers two coding modes for different situations: differential position waveform-based coding (DPWBC) when the base layer has a high coding bit rate, and partial decoding of MPEG-4 CELP Bit Rate Scalability (PDMCBRS) when the base layer has a low coding bit rate. Instead of using an analysis-by-synthesis coding method, DPWBC codes the waveform of the residual signal. In the residual signal, each run of consecutive positive (or negative) samples is regarded as one signal to be coded. Each signal is coded by its position, sign, width, and magnitude. The coding order of the signals is arranged according to each signal's group and position to reduce the coding bit rate and increase coding accuracy. The PDMCBRS method employs partial decoding of the MCBRS bitstream to enhance speech performance gradually. Both modes of the proposed method have a smaller bit rate step than MCBRS, so the concept of successive refinement is more closely approached.
Experiments show that DPWBC enhances performance more effectively than MCBRS at high bit rates and with noisy backgrounds, and that PDMCBRS matches the performance of MCBRS. The proposed method, which combines DPWBC and PDMCBRS, therefore performs better than MCBRS and has a smaller bit rate step of enhancement.
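The run-based description of DPWBC in the abstract can be illustrated with a small Python sketch. This is not the thesis's implementation. It only shows, under stated assumptions, how a residual might be split into runs of same-sign samples, each described by position, sign, width, and magnitude, and how positions might be stored as differences. The names `Run`, `segment_runs`, `position_deltas`, and `positions_from_deltas` are hypothetical, using the peak absolute value as the magnitude is an assumption, and the energy-based grouping mentioned in the abstract is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Run:
    position: int     # index of the first sample in the run
    sign: int         # +1 for a positive run, -1 for a negative run
    width: int        # number of samples in the run
    magnitude: float  # peak absolute value (an assumption, not from the thesis)


def segment_runs(residual: Sequence[float]) -> List[Run]:
    """Split a residual into maximal runs of same-sign samples; zeros end a run."""
    runs: List[Run] = []
    start, cur_sign = 0, 0
    # A trailing sentinel zero closes any run that reaches the end of the signal.
    for i, x in enumerate(list(residual) + [0.0]):
        s = (x > 0) - (x < 0)
        if s != cur_sign:
            if cur_sign != 0:
                peak = max(abs(v) for v in residual[start:i])
                runs.append(Run(start, cur_sign, i - start, peak))
            start, cur_sign = i, s
    return runs


def position_deltas(runs: Sequence[Run]) -> List[int]:
    """Store each run's position as the difference from the previous run's position."""
    deltas, prev = [], 0
    for r in sorted(runs, key=lambda r: r.position):
        deltas.append(r.position - prev)
        prev = r.position
    return deltas


def positions_from_deltas(deltas: Sequence[int]) -> List[int]:
    """Recover absolute positions from position differences (cumulative sum)."""
    positions, total = [], 0
    for d in deltas:
        total += d
        positions.append(total)
    return positions


if __name__ == "__main__":
    residual = [0.0, 0.5, 1.0, -0.2, -0.7, -0.1, 0.0, 0.3]
    runs = segment_runs(residual)
    for r in runs:
        print(r)
    print(position_deltas(runs))
```

Differences between neighbouring positions are usually smaller numbers than the absolute positions, which is one plausible reason differential position coding could save bits. The actual bit allocation and ordering rules are specific to the thesis and are not reproduced here.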
Abstract
Contents
Figure Captions
Table Captions
Chapter 1 Introduction
1.1 Motivation
1.2 Recent Works
1.3 Thesis Organization
Chapter 2 Background Review
2.1 Data Compression
2.2 Characteristics of Human Hearing and Vocal Systems
2.2.1 Hearing Characteristic
2.2.2 Human Speech Characteristic
2.3 Intelligibility and Quality Test Methods
2.4 Development of MPEG Audio
Chapter 3 MPEG-4 CELP
3.1 MPEG-4 Audio
3.1.1 Overview of MPEG-4 Audio
3.1.2 MPEG-4 Natural Audio Coding
3.1.3 Synthetic and SNHC Audio in MPEG-4
3.1.4 New Concepts in MPEG-4 Audio
3.1.5 MPEG-4 Audio Capabilities
3.2 What is CELP
3.2.1 Waveform Coding
3.2.2 Voice Coding
3.2.3 Hybrid Coding
3.3 MPEG-4 CELP
3.3.1 General Description of MPEG-4 CELP Decoder
3.3.2 Functionality of MPEG-4 CELP
3.3.3 Configuration and Features of the MPEG-4 CELP Coder
3.4 MPEG-4 CELP Syntax
3.4.1 MPEG-4 CELP Sequence Overview
3.4.2 MPEG-4 Audio Sequence Header
3.4.3 MPEG-4 CELP Sequence Header
3.4.4 MPEG-4 CELP Frame
3.4.5 Linear Predictive Coding (LPC) Coefficients
3.4.6 Excitation Coefficients
3.4.7 Scalability
3.5 MPEG-4 CELP Tools
3.5.1 MPEG CELP Encoder Tools
3.5.2 MPEG CELP Decoder Tools
Chapter 4 Bit Rate Scalability in MPEG-4 CELP
4.1 Bit Rate Scalability of MPEG-4 CELP
4.1.1 Bit Rate Scalable (BRS) Multi-Pulse Tool
4.1.2 Data Structure and Bit Assignment
4.2 A Hybrid Bit Rate Scalable Coding Method
4.2.1 Differential Position Waveform-Based Coding
4.2.2 Partial Decoding of MPEG-4 CELP Bit Rate Scalability
4.2.3 Determination of Conditions for Use of the Two Modes
Chapter 5 Experiment Results and Conclusion
5.1 Setting the Configuration
5.2 Experiment Results
5.3 Discussion
5.4 Conclusions
5.5 Future Work
References
[1]ITU-T, Recommendation G.729, “Coding of Speech at 8 kbps Using Conjugate-Structure Algebraic-Code-Excited Linear Prediction (CS-ACELP),” March 1996.
[2]ITU-T, Recommendation G.723.1, “Dual Rate Speech Coder for Multimedia Communications Transmitting at 5.3 and 6.3 kbps,” March 1996.
[3]ISO/IEC JTC1 SC29/WG11, ISO/IEC FCD 14496-3. Information Technology – Coding of Audio-Visual Objects – Part 3: Audio, Nov. 1998.
[4]H. C. Woo and J. D. Gibson, “Low Delay Tree Coding of Speech at 8 kbps,” IEEE Trans. on Speech and Audio Processing, Vol. 2, No. 3, pp. 361-370, July 1994.
[5]S. Yeldener, “A 4kb/s Toll Quality Harmonic Excitation Linear Predictive Speech Coder,” IEEE ICASSP, Vol. 1, pp. 481-484, 1999.
[6]L. Nishiguchi, K. Iijima and J. Matsumoto, “Harmonic Vector Excitation Coding of Speech at 2.0 kbps,” IEEE Workshop on Speech Coding, pp. 39-40, Sep. 1997.
[7]S. Ahmadi and A.S. Spanias, “New Algorithms for Sinusoidal Speech Coding at Low Bit Rates,” IEEE International Conference on Personal Wireless Communications, pp. 57-61, 1997.
[8]A. McCree, Kwan Truong, E. B. George, T. P. Barnwell and V. Viswanathan, “A 2.4 kb/s MELP Coder Candidate for The New U.S. Federal Standard,” IEEE ICASSP, Vol. 1, pp. 200-203, 1996.
[9]M. R. Nakhai and F. A. Marvasti, “A 4.1 kb/s Hybrid Speech Coder,” IEEE International Symposium on Circuits and Systems, Vol. 3, pp. 110-113, 1999.
[10]J. Stachurski and A. McCree, “A 4kb/s Hybrid MELP/CELP Coder with Alignment Phase Encoding and Zero-Phase Equalization,” IEEE ICASSP, Vol. 3, pp. 1379-1382, 2000.
[11]M. Schroeder and B. Atal, “Code Excited Linear Prediction: High Quality Speech at Low Bit Rates,” IEEE ICASSP, pp. 937-940, 1985.
[12]J. P. Ashley, E. M. Cruz-Zeno, U. Mittal and Weimon Peng, “Wideband Coding of Speech Using a Scalable Pulse Codebook,” IEEE Workshop on Speech Coding, pp. 148-150, 2000.
[13]R. Taori, R. J. Sluijter and A. J. Gerrits, “On Scalability in CELP Coding Systems,” IEEE Workshop on Speech Coding, pp. 67-68, 1997.
[14]Hui Dong and J. D. Gibson, “Universal Successive Refinement of CELP Speech Coders,” IEEE ICASSP, Vol. 2, pp. 713-716, 2001.
[15]Xuedong Huang, Alex Acero and Hsiao-Wuen Hon, Spoken Language Processing, Prentice-Hall, 2001.
[16]A. M. Kondoz, Digital Speech, John Wiley & Sons, 1994.
[17]K. R. Rao and J. J. Hwang, Techniques and Standards for Image, Video, and Audio Coding, Prentice Hall PTR, 1996.
[18]戴顯權, 資料壓縮 (Data Compression), 紳藍出版社, 2001.
[19]Touradj Ebrahimi et al., “MPEG-4 Natural Video Coding - An Overview,” http://leonardo.telecomitalialab.com/icjfiles/mpeg-4_si/.
[20]B. S. Atal and L. Hanauer, “Speech Analysis and Synthesis by Linear Prediction of the Speech Wave,” Journal of the Acoustical Society of America, pp. 637-655, 1971.
[21]J. D. Tardelli and E. W. Kreamer, “Vocoder Intelligibility and Quality Test Methods,” IEEE ICASSP, Vol. 2, pp. 1145-1148, 1996.
[22]ITU-T, “Methods for Subjective Determination of Transmission Quality,” Int. Telecommunication Union, 1996.
[23]T. Tremain, “The Government Standard Linear Predictive Coding Algorithm (LPC-10),” Speech Technology, Vol. 1, pp. 40-49, 1982.
[24]P. Kroon, E. F. Deprettere, and R. J. Sluyter, “Regular-Pulse Excitation – A Novel Approach to Effective and Efficient Multipulse Coding of Speech,” IEEE Trans. SP, Vol. ASSP-34, No. 5, pp. 1054-1063, Oct. 1986.
[25]T. Nomura, M. Iwadare, M. Serizawa, and K. Ozawa, “A Bitrate and Bandwidth Scalable CELP Coder,” IEEE ICASSP 98, Vol. 1, pp. 341-344, May 1998.
[26]N. Tanaka, T. Morii, K. Yoshida and K. Honma, “A Multi-Mode Variable Rate Speech Coder for CDMA Cellular Systems,” IEEE Vehicular Technology Conference 96, pp. 198-202, Apr. 1996.