National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

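Graduate student: 邱鐘毅
Graduate student (English name): Jhong-Yi Ciou
Thesis title: 語音驗證系統之語音特徵分類改良
Thesis title (English): Voice Feature Classification for Voice Verification
Advisor: 歐陽彥杰
Advisor (English name): Yen-Chieh Ouyang
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Graduate Institute of Communication Engineering
Field: Engineering
Discipline: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: English
Number of pages: 80
Keywords (translated from Chinese): linear predictive coding parameters; Mel-frequency cepstral coefficients; K-means clustering; fuzzy C-means clustering; fuzzy subtractive clustering; cepstral mean and variance normalization
Keywords (English): Linear predictive coding (LPC); Mel-Frequency Cepstral Coefficients (MFCC); K-means clustering; Fuzzy C-means clustering; Fuzzy subtractive clustering; cepstral mean and variance normalization (CMVN)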
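As the range of applications for speech recognition systems grows, the accuracy of voice verification systems is receiving correspondingly more attention. This thesis proposes several methods for classifying speech features in order to improve verification accuracy, and compares the performance of the different classification methods.
In this work, we combine linear predictive coding (LPC) coefficients and their first-order delta-cepstral coefficients with Mel-frequency cepstral coefficients (MFCC) to form a new feature set. We then apply K-means clustering, fuzzy C-means clustering, and fuzzy subtractive clustering to find the best mean and variance of the features, and use cepstral mean and variance normalization (CMVN) to counter noise.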

The scope of applications for speech recognition systems has expanded, and the accuracy of voice verification systems is taken correspondingly more seriously. This thesis presents several classification methods for speech features that strengthen the system's accuracy, and compares the performance of the different methods.
In this thesis, we combine linear predictive coding (LPC) coefficients and their first-order delta cepstrum (delta-LPC) with Mel-frequency cepstral coefficients (MFCCs) to form a new feature set. We then use K-means clustering, fuzzy C-means clustering, and fuzzy subtractive clustering to find the individual mean and variance for every frame, and apply cepstral mean and variance normalization (CMVN) to reduce the influence of noise.
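
The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not say how the cluster statistics feed into CMVN, so the sketch assumes one possible reading, in which frames are grouped by K-means and each group is normalized with its own mean and variance. The function names (`cmvn`, `kmeans`, `clustered_cmvn`), the random stand-in features, and the use of a simple frame difference as the delta are all assumptions made for illustration. Fuzzy C-means and fuzzy subtractive clustering are not shown.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Scale each feature dimension to zero mean and (near) unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def kmeans(features, k, iters=20, seed=0):
    """Plain K-means; returns one cluster label per frame."""
    rng = np.random.default_rng(seed)
    # Pick k distinct frames as initial centers (fancy indexing copies them).
    centers = features[rng.choice(len(features), size=k, replace=False)]
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Distance from every frame to every center, shape (frames, k).
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def clustered_cmvn(features, k=2):
    """Assumed variant: normalize each K-means cluster with its own statistics."""
    labels = kmeans(features, k)
    out = np.empty_like(features, dtype=float)
    for j in range(k):
        mask = labels == j
        if mask.any():
            out[mask] = cmvn(features[mask])
    return out


# Stand-in data: 100 frames of hypothetical 12-dim LPC and 13-dim MFCC features.
rng = np.random.default_rng(42)
lpc = rng.normal(size=(100, 12))
mfcc = rng.normal(size=(100, 13))
delta_lpc = np.gradient(lpc, axis=0)  # simple frame difference as a delta proxy

combined = np.hstack([lpc, delta_lpc, mfcc])  # one 37-dim vector per frame
normalized = clustered_cmvn(combined, k=2)
```

In a real system the stand-in arrays would be replaced by features extracted from speech frames, and the number of clusters would be tuned experimentally.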

Abstract (Chinese)
Abstract
Chapter 1 Introduction
1.1 Motivation
1.2 Classification of Speaker Recognition
1.3 Speech Production
1.4 Model of Speaker Verification
1.5 Improvement
1.6 Organization of Thesis
Chapter 2 Fundamentals of Speaker Recognition
2.1 Speech Signal Pre-Processing
2.1.1 Signal Sampling Process
2.1.2 DC-Offset Removal of Signal
2.1.3 Bandpass Filter
2.1.4 Volume Normalization
2.1.5 Frame Blocking and Windowing
2.1.6 Endpoint Detection
2.1.7 Signal Pre-Emphasis Process
2.2 Parameter Extraction
2.2.1 Linear Predictive Cepstral, LPC
2.2.2 Delta-cepstrum Coefficient [19]
2.2.3 Cepstral Mean Subtraction
2.2.4 Cepstral Mean and Variance Normalization
2.2.5 Mel-Frequency Cepstrum Coefficients, MFCC
2.2.5.1 Fast Fourier Transform
2.2.5.2 Triangular Band-Pass Filter
2.2.5.3 Discrete Cosine Transform
2.2.5.4 Delta Cepstrum
2.2.6 Combined Features
2.3 Pattern Matching
2.3.1 Hidden Markov Models (HMM) [18]
2.3.2 The Forward Procedure [23]
2.3.3 The Backward Procedure [23]
2.3.4 Viterbi Algorithm
Chapter 3 The Use of Mixture Endpoint Detection
3.1 Entropy-based Endpoint Detection
3.2 Zero-Crossing Rate [22]
3.3 Combine Entropy and Zero-Crossing Rate together [23]
Chapter 4 Voice Feature Classification and Speaker Model
4.1 K-means Clustering [29] [30]
4.2 Fuzzy C-means Clustering [29]
4.3 Fuzzy Subtractive Clustering [29]
4.4 Speaker Model
4.5 Testing Process
Chapter 5 Experiment and Result
5.1 Setting the System Parameter
5.2 Testing
5.3 Result
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References

[1] J. M. Naik, “Speaker Verification: A Tutorial,” IEEE Communications Magazine, Vol. 28, No. 1, pp. 42-48, 1990.
[2] F. Bimbot, J. F. Bonastre, C. Fredouille, G. Gravier, M. C. Ivan, S. Meignier, T. Merlin, O. G. Javier, P. D. Dijana, and D. A. Reynolds, “A Tutorial on Text-independent Speaker Verification,” EURASIP Journal on Applied Signal Processing 2004:4, pp. 430-451, 2004.
[3] L. Rabiner, and B. H. Juang, “Fundamentals of Speech Recognition,” Prentice-Hall International, Inc., 1993.
[4] J. P. Campbell, Jr, “Speaker Recognition: A Tutorial,” IEEE Invited Paper, Proceedings of The IEEE, Vol. 85, No. 9, pp. 1-26, September 1997.
[5] T. F. Quatieri, and Massachusetts Institute of Technology Lincoln Laboratory, “Discrete-Time Speech Signal Processing Principles and Practice,” Pearson Education Taiwan Ltd, 2005.
[6] B. R. Wildermoth, “Text-independent Speaker Recognition Using Source Based Features,” Master of Philosophy, Griffith University, Australia, January 2001.
[7] B. R. Wildermoth, “Text-independent Speaker Recognition Using Source Based Features,” Master of Philosophy, Griffith University, Australia, January 2001.
[8] D. A. Reynolds and R. C. Rose, “Robust Text-independent Speaker Identification Using Gaussian Mixture Speaker Models,” IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, January 1995.
[9] K. S. R. Murty and B. Yegnanarayana, “Combining Evidence From Residual Phase and MFCC Features for Speaker Recognition,” IEEE Signal Processing Letters, Vol. 13, No. 1, pp. 52-55, January 2006.
[10] A. Mezghani, and D. O’Shaughnessy, “Speaker Verification Using a New Representation Based on a Combination of MFCC and Formants,” CCECE/CCGEI, Saskatoon, pp. 1461-1464, May 2005.
[11] K. Chen, “On the Use of Different Speech Representations for Speaker Modeling,” IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, Vol. 35, No. 3, pp. 301-314, August 2005.
[12] S. Haykin, “Communication Systems 4th Edition,” John Wiley & Sons, Inc., 2001.
[13] J. L. Shen, J. W. Hung, and L. S. Lee, “Robust Entropy-based Endpoint Detection for Speech Recognition in Noisy Environments,” Int. Conf. on Spoken Lang. Processing, pp. 1-4, 1998.
[14] Q. Li, Senior Member, IEEE, J. Zheng, A. Tsai, and Q. Z., Member, IEEE, “Robust Endpoint Detection and Energy Normalization for Real-Time Speech and Speaker Recognition,” IEEE Transactions on Speech and Audio Processing, Vol. 10, No. 3, pp. 146-157, March 2002.
[15] Y. Linde, A. Buzo and R.M. Gray, “An Algorithm for Vector Quantizer Design,” IEEE Trans. Comm., Vol. COM 28, pp. 84-95, Jan. 1980.
[16] J. G. Rodriguez, J. O. Garcia, Cesar Martin, and Luis Hernandez, “Increasing Robustness in GMM Speaker Recognition System for Noisy and Reverberant Speech with Low Complexity Microphone Arrays.”
[17] A. Acero and X. Huang, “Augmented Cepstral Normalization for Robust Speech Recognition.”
[18] M. Stengel, “Introduction to Graphical Models, Hidden Markov Models and Bayesian Networks,” Yoyohoshi, 441-8580 Japan, March 7th, 2003.
[19] 楊鎮光, “Visual Basic 語音辨識” (Visual Basic Speech Recognition), 松崗出版, pp. 3-34-36, June 2002.
[20] H. Matsumoto and M. Moroto, “Evaluation of Mel-LPC Cepstrum in A Large Vocabulary Continuous Speech Recognition,” Proc. ICASSP, vol. 1, pp. 117–120, 2001.
[21] J. L. Shen, J. W. Hung, and L. S. Lee, “Robust Entropy-based Endpoint Detection for Speech Recognition in Noisy Environments,” Int. Conf. on Spoken Lang. Processing, pp. 1-4, 1998.
[22] R. Jang (張智星), “Audio Signal Processing and Recognition” (語音處理與辨識), neural.cs.nthu.edu.tw/jang/books/audioSignalProcessing/index.asp
[23] 王小川, “語音訊號處理” (Speech Signal Processing), 全華出版, February 2005.
[24] D. A. Reynolds and R. C. Rose, “Robust Text-independent Speaker Identification Using Gaussian Mixture Speaker Models,” IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, January 1995.
[25] A. Martin, G. Doddington, T. Kamm, M. Ordowski, M. Przybocki, “The DET Curve in Assessment of Detection Task Performance,” IEEE.
[26] H. Matsumoto and M. Moroto, “Evaluation of Mel-LPC Cepstrum in A Large Vocabulary Continuous Speech Recognition,” IEEE, pp. 117-120, 2001.
[27] L. Rabiner and B. H. Juang, “Fundamentals of Speech Recognition,” Prentice-Hall, 1993.
[28] 謝忠穎, “An Improved Speaker Verification System Using Orthogonal GMM,” National Chung Hsing University, 2006.
[29] Khaled Hammouda and Fakhreddine Karray, “A Comparative Study of Data Clustering Techniques”, Canada
[30] Andrew Moore, “K-means and Hierarchical Clustering - Tutorial Slides”

