
National Digital Library of Theses and Dissertations in Taiwan

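Graduate student: 吳東翰
Graduate student (romanized): WU, DONG-HAN
Thesis title: 使用語音活動檢測演算法減少語音編碼器的計算量
Thesis title (English): Reduced Computation of Speech Coder Using a Voice Activity Detection Algorithm
Advisor: 林榮三
Advisor (romanized): LIN, RONG-SAN
Oral examination committee: 魏永強, 王木良
Oral examination committee (romanized): WEI, YONG-CIANG; WANG, MU-LIANG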
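Oral defense date: 2017-01-24
Degree: Master's
Institution: Southern Taiwan University of Science and Technology (南臺科技大學)
Department: Department of Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2017
Graduation academic year: 105
Language: Chinese
Number of pages: 44
Keywords: VAD algorithm; reduction of computational complexity; G.723.1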
With the rapid growth of Internet use and multimedia technology, multimedia communication is now integrated into personal mobile devices. Because low-end mobile devices have limited processing speed, a speech coder with low computational complexity is needed to suit their hardware platforms and to integrate multimedia services. For an Internet or wireless speech communicator, heavy encoding computation consumes more power, which raises the price of the device and shortens battery life. To keep speech communication real-time and continuous, modern communication systems therefore benefit from a speech coder with reduced computational complexity. In this thesis, we use a Voice Activity Detection (VAD) algorithm solely to classify the frames of the speech signal into two types: active frames and inactive frames.

We analyzed the characteristics of inactive speech signals in our experiments and found that their encoding parameters are uniformly distributed over the codebook structure. Therefore, when the current frame is an inactive speech frame, its stochastic-codebook excitation signal is not encoded; instead, parameters drawn at random from the codebook take its place, which reduces the encoder's computation. The overall simulation results indicate that the average Perceptual Evaluation of Speech Quality (PESQ) score is degraded only slightly, by 0.023, and that the proposed method reduces total computational complexity by about 30% relative to the original G.723.1 encoder, with no perceptible loss of speech quality in subjective listening.
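The frame-level decision described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the thesis implementation: the energy-threshold detector `is_active`, the threshold value, the interleaved pulse-track layout, and the toy `search_excitation` routine are all hypothetical stand-ins for the G.723.1 VAD and ACELP codebook search. Only the 240-sample frame and 60-sample subframe sizes come from the G.723.1 coder itself.

```python
import random

FRAME_LEN = 240          # 30 ms at 8 kHz, the G.723.1 frame size
SUBFRAME_LEN = 60        # four subframes per frame
N_PULSES = 4             # illustrative pulse count per subframe (assumption)
ENERGY_THRESHOLD = 1e-3  # hypothetical VAD threshold on mean frame energy


def is_active(frame, threshold=ENERGY_THRESHOLD):
    """Toy energy-based VAD, a stand-in for the real detector."""
    energy = sum(x * x for x in frame) / len(frame)
    return energy > threshold


def search_excitation(target):
    """Toy stand-in for the codebook search: on each interleaved track,
    pick the largest-magnitude sample and copy its sign."""
    pulses = []
    for track in range(N_PULSES):
        positions = range(track, len(target), N_PULSES)
        best = max(positions, key=lambda i: abs(target[i]))
        pulses.append((best, 1 if target[best] >= 0 else -1))
    return pulses


def random_excitation(rng, subframe_len=SUBFRAME_LEN):
    """Draw pulse positions and signs uniformly at random, with no search."""
    pulses = []
    for track in range(N_PULSES):
        positions = range(track, subframe_len, N_PULSES)
        pulses.append((rng.choice(positions), rng.choice((-1, 1))))
    return pulses


def encode_frame(frame, rng):
    """Return (active_flag, per-subframe pulse lists) for one frame."""
    active = is_active(frame)
    subframes = [frame[i:i + SUBFRAME_LEN]
                 for i in range(0, len(frame), SUBFRAME_LEN)]
    if active:
        # Active speech: run the (toy) search on every subframe.
        return True, [search_excitation(sf) for sf in subframes]
    # Inactive speech: skip the search and substitute random parameters.
    return False, [random_excitation(rng, len(sf)) for sf in subframes]


if __name__ == "__main__":
    rng = random.Random(0)
    silence = [0.0] * FRAME_LEN
    flag, pulses = encode_frame(silence, rng)
    print(flag, pulses[0])
```

The saving comes from the inactive branch: `random_excitation` does a constant amount of work per pulse, whereas a real codebook search evaluates many candidate pulse combinations per subframe.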

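Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background of Speech Coding Technology
1.2 ITU-T Speech Coding Standards
1.3 Research Objectives
1.4 Thesis Outline
Chapter 2 The G.723.1 Speech Coder
2.1 CELP Coding Architecture
2.2 The ITU-T G.723.1 Speech Coder
2.3 Framer
2.4 High-Pass Filter
2.5 Linear Predictive Coding Analysis (LPC Analysis)
2.6 Line Spectral Pair Quantization (LSP Quantizer)
2.7 Formant Perceptual Weighting Filter (FPWF)
2.8 Open-Loop Pitch Estimation (Pitch Estimator)
2.9 Closed-Loop Adaptive Pitch Predictor (Pitch Predictor)
2.10 Harmonic Noise Shaping Filter (Harmonic Noise Shaping)
2.11 Algebraic Code-Excited Linear Prediction (ACELP)
2.12 Voice Activity Detection (VAD)
2.13 Perceptual Evaluation of Speech Quality (PESQ)
Chapter 3 Analysis of the Encoding-Parameter Characteristics of Inactive Speech Signals
3.1 Analysis of the Encoding Parameters of the Closed-Loop Adaptive Pitch Predictor
3.1.1 Analysis of the Fifth-Order Closed-Loop Adaptive Pitch Lag
3.1.2 Analysis of the Fifth-Order Closed-Loop Adaptive Pitch Gain
3.2 Analysis of the ACELP Codebook Parameters
3.2.1 Analysis of ACELP Excitation Pulse Positions and Signs
3.2.2 Analysis of the ACELP Excitation Pulse Gain
Chapter 4 Evaluation Experiments on Computational Complexity and Speech Quality
4.1 Effect of Random Closed-Loop Pitch Parameters on Speech Quality
4.2 Effect of Random ACELP Excitation Parameters on Speech Quality
4.3 Speech Quality When Inactive Speech Signals Use Random Encoding Parameters
Chapter 5 Conclusion
References
List of Symbols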
Computational Complexity Reduction of G.723.1 Coder Using a Voice Activity Detection Algorithm
Efficient Reduction Computational Complexity For Speech Coding
About the Author

