National Digital Library of Theses and Dissertations in Taiwan

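Author: 羅文輝 (Wen-Hui Lo)
Title: 稀少性的輸入資訊下所造成的分佈不匹配問題在語者確認上的可靠度分析
Title (English): Reliability Analysis Focusing on Sparse Input Data Caused Distribution Mismatch Problems for Speaker Verification
Advisor: 陳信宏 (Sin-Horng Chen)
Degree: Master's
Institution: National Chiao Tung University
Department: In-Service Master's Program, College of Electrical and Computer Engineering (Communication Engineering group)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Graduation academic year: 94 (ROC calendar)
Language: Chinese
Pages: 126
Keywords (Chinese): 分佈不匹配; 混合高斯; 稀少資料; 語者確認; 可靠度
Keywords (English): distribution mismatch; GMM; sparse data; speaker verification; reliability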
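In speech recognition, a model often has to be calibrated with only a small amount of data to make it more robust. Speaker verification likewise frequently requires speaker models to be trained or tested when very little data is available.
This study first points out that, with sparse input data, the distribution of the scores computed on a Gaussian mixture model (GMM) in speaker verification deviates from the originally assumed distribution. We call this phenomenon the "distribution mismatch" problem. To address it, we first propose approximating the score distribution with a truncated probability distribution function. Building on this and using order statistics, we then derive a graph-based joint probability model that can describe both the complete probability density function and the truncated one in probabilistic form.
We establish a joint probability density function of five random variables: the input data, the minimum of the data, the range of the data, the cumulative probability covered by that range (the coverage), and the data length. Combined with the sampling scheme of Gaussian quadrature, this yields the most accurate estimation formula for the fewest sampling points. The ultimate aim is to use this richer information to compensate for the larger standard error of estimation that conventional statistical inference suffers when data are sparse.
Finally, we perform hypothesis tests on the EER (equal error rate) using the average scores of speakers' utterances normalized against the UBM (universal background model). The experimental results show that hypothesis testing effectively reduces the verification errors caused by sampling error.
Another main result is establishing that, under sparse input, whether the originally assumed distribution actually appears is itself a random event rather than a deterministic assumption. Our conclusion is that when the number of input samples is below 20, the assumption that "the coverage of the input samples matches the originally assumed PDF" can only be fully captured as a probabilistic event, and this study derives the formula describing that event.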
ABSTRACT
Building a robust model from sparse input data is a common problem in speech recognition. The same problem arises in speaker verification, where training or testing must often be done with very little enrollment data.
We address a problem caused by sparse input data, which we call "distribution mismatch" (DM). DM arises because the input data used for GMM (Gaussian mixture model) score calculation do not cover the full support of the originally assumed probability distribution function (PDF). The PDF produced by sparse input may therefore differ from the originally assumed one, and we suggest modeling this situation with a truncated probability distribution function.
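As a minimal sketch of the truncation idea, assuming a single Gaussian score distribution and hypothetical truncation points `a` and `b` (the function names and bounds are illustrative and do not come from the thesis), the density of a doubly truncated normal can be obtained by renormalizing the original density by the probability mass that survives truncation:

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the untruncated normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncated_normal_pdf(x, a, b, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) restricted to [a, b] (hypothetical bounds).

    Outside [a, b] the density is zero; inside, the original density is
    divided by the probability mass covered by [a, b] so it integrates to 1.
    """
    if x < a or x > b:
        return 0.0
    covered_mass = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
    return normal_pdf(x, mu, sigma) / covered_mass


if __name__ == "__main__":
    # Sparse data that only spans [-1, 1] sees a taller, narrower density.
    print(normal_pdf(0.0), truncated_normal_pdf(0.0, -1.0, 1.0))
```

The narrower the interval covered by the data, the larger the renormalization factor, which is one way to picture the mismatch between the assumed and the observed score distribution.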
Our most important contribution regarding DM is a new graph-based joint PDF derived with order statistics; this joint PDF can represent either the truncated PDF or the original PDF.
We establish a joint PDF of five random variables: the input data, the minimum order statistic of the input data, the range of the input data, the coverage of the input data, and the sample size of the input data. It is evaluated with Gaussian quadrature integration.
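To make the notions of coverage and Gaussian quadrature concrete, the sketch below relies on a standard order-statistics result rather than on the thesis's own formulas: for n independent samples from any continuous distribution with CDF F, the coverage C = F(x_max) - F(x_min) is distributed like the range of n uniform U[0,1] samples, with density n(n-1)c^(n-2)(1-c). The function names and the choice n = 5 are assumptions for illustration. A 3-point Gauss-Legendre rule integrates polynomials up to degree 5 exactly, so for n = 5 three sampling points already reproduce the closed-form CDF.

```python
import math

# Nodes and weights of the 3-point Gauss-Legendre rule on [-1, 1].
GL3_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GL3_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def coverage_pdf(c, n):
    """Density of the coverage of n i.i.d. samples (range of n U[0,1] draws)."""
    if not 0.0 <= c <= 1.0:
        return 0.0
    return n * (n - 1) * c ** (n - 2) * (1.0 - c)


def coverage_cdf_closed_form(c, n):
    """P(C <= c), obtained by integrating coverage_pdf analytically."""
    return n * c ** (n - 1) - (n - 1) * c ** n


def gauss_legendre_3(f, a, b):
    """Integrate f over [a, b] with the 3-point Gauss-Legendre rule."""
    half_width = 0.5 * (b - a)
    midpoint = 0.5 * (a + b)
    return half_width * sum(
        w * f(midpoint + half_width * t) for t, w in zip(GL3_NODES, GL3_WEIGHTS)
    )


n = 5  # hypothetical sparse sample size
approx = gauss_legendre_3(lambda c: coverage_pdf(c, n), 0.0, 0.5)
exact = coverage_cdf_closed_form(0.5, n)

if __name__ == "__main__":
    # Probability that five samples cover at most half of the assumed PDF.
    print(approx, exact)
```

For larger n the integrand has a higher polynomial degree, so more quadrature points would be needed to keep the rule exact.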
In the final experiments, we apply a hypothesis test to the equal error rate (EER), using the average per-frame score of each sentence spoken by the client speaker, normalized to the universal background model (UBM), and the corresponding score for impostor speech, likewise normalized to the UBM.
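The following is only a rough sketch of a one-sided test of this kind, not the procedure used in the thesis. It assumes a handful of hypothetical UBM-normalized scores, a hypothetical decision threshold, and a normal approximation for the test statistic (with very small samples a t distribution would usually be the safer choice).

```python
import math
from statistics import mean, stdev


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def right_tailed_z_test(scores, threshold, alpha=0.05):
    """Test H0: mean score <= threshold against H1: mean score > threshold.

    Returns the z statistic, the p-value, and whether H0 is rejected.
    The normal approximation is an assumption of this sketch.
    """
    n = len(scores)
    standard_error = stdev(scores) / math.sqrt(n)
    z = (mean(scores) - threshold) / standard_error
    p_value = normal_sf(z)
    return z, p_value, p_value < alpha


# Hypothetical normalized scores for one claimed speaker.
client_scores = [0.9, 1.1, 1.0, 1.2, 0.8]
z, p, reject = right_tailed_z_test(client_scores, threshold=0.0)

if __name__ == "__main__":
    print(z, p, reject)
```

A left-tailed version follows by testing the negated scores against the negated threshold.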
The results give good evidence that the hypothesis test reduces the error probability in speaker verification. A further finding of this study is a particular effect caused by sparse input data.
We usually assume that the input random variable follows a certain probability distribution function, but when the input sample size is less than 20, agreement with this assumption is itself a probabilistic event. We derive the joint probability distribution function that describes this event.
1. Introduction
1.1. Research Background
1.2. Research Motivation
1.3. Research Method
1.4. Literature Review of Speaker Verification
  Conventional Speaker Verification
  Decision Criteria
  Likelihood Score Normalization
  Score Normalization for Impostor Models (UBM or Cohort Set)
2. Literature Review of Reliability
2.1. Reliability Analysis Based on Noise Effects
2.2. A Statistical View of Score Normalization in Speaker Verification
  Hard Decision
  Soft Decision
2.3. Lifetime Analysis of Industrial Products
2.4. Clinical Statistics in Medicine (Survival Analysis)
3. Introduction to Truncated Distributions
4. Derivation of Truncated Distributions
4.1. Maximum Likelihood Estimators for the Left-Truncated Normal Distribution
4.2. Maximum Likelihood Estimators for the Right-Truncated Normal Distribution
4.3. Maximum Likelihood Estimation for the Doubly Truncated Normal Distribution
  Probability Density Function
  Likelihood Function
5. Model Construction
5.1. Model Definition
5.2. A Worked Example of Coverage
5.3. Model Assumptions and Derivation of the Joint Probability Distribution Function $p(x \mid x_{1:n}, r, c, n)$
  Purpose of the Model
  Order Statistics
5.4. Computing the Probability Density Function of the Coverage, $p(c \mid n)$
5.5. Using the Distribution of the Range $\hat{r}$ under the Uniform Distribution U[0,1] as the Probability Density Function of the Coverage
5.6. Derivation of the Conditional Probability $p(r \mid c, n)$
  Analysis of the Conditional Probability Computation
5.7. Viewing the Range Formula from the Joint Probability Perspective
  Step A
  Explanation 1: The Physical Meaning of $\delta(g(x_n))$
  Establishing Endpoints to Reduce Computation Time
  Keeping the Roots That Satisfy the Constraints
  Executing Step C of the Previous Section
5.8. The Conditional Probability $p(x_{1:n} \mid r, n)$
5.9. Combining Slices for Interval Estimation
5.10. Using Gaussian Quadrature Once More
  Gauss-Legendre Integration
  First, Computing the Slice Positions
6. Experimental Design
6.1. Random Distribution Behavior of Sparse Data
6.2. Experimental Setup
6.3. Treating the Relative Scores from Client Self-Tests and Impostor Tests as Random Distributions
6.4. Problem Analysis
6.5. Experiment Case 1: Performance Test of the Baseline Configuration
6.6. Experiment Case 2: Treating Sparse Samples as a Truncated Probability Distribution
6.7. Using a Hypothesis Test to Assist the Decision
  Testing whether a known impostor is a client: right-tailed test
  Testing whether a known client is an impostor: left-tailed test
  Results with Hypothesis-Test Assistance
6.8. Experiment Case 3
  Computation: Weighted Summation
7. Conclusions and Future Work
8. References
R. Auckenthaler, M. Carey, and H. Lloyd-Thomas, "Score Normalization for Text-Independent Speaker Verification Systems", Digital Signal Processing, Vol. 10, pp. 42-54, 2000.

C. Barras and J.-L. Gauvain, "Feature and Score Normalization for Speaker Verification of Cellular Data", in Proceedings of ICASSP, May 2003.

J. Mariethoz and S. Bengio, "A Unified Framework for Score Normalization Techniques Applied to Text-Independent Speaker Verification", IEEE Signal Processing Letters, Vol. 12, No. 7, pp. 532-535, July 2005.

T. Ganchev, I. Potamitis, N. Fakotakis, "Noise-Source Modeling for Robust Speaker Verification in Adverse Environments", Competitive Environment, Renewable Energy, Distributed Generation, ISAP 2003, Lemnos, Greece, August 31 - September 3, 2003, paper ISAP03/137.

Jonas Richiardi, Plamen Prodanov, Andrzej Drygajlo, "Speaker Verification with Confidence and Reliability Measures", IEEE International Conference on Acoustics, Speech and Signal Processing, 2006.

U.V. Chaudhari, G.N. Ramaswamy, G. Potamianos, and C. Neti, "Audio-Visual Speaker Recognition Using Time-Varying Stream Reliability Prediction", Proc. International Conference on Acoustics, Speech and Signal Processing, Vol. V, pp. 712-715, Hong Kong, Apr. 2003.

Mijail Arcienega and Andrzej Drygajlo, "A Bayesian Network Approach for Combining Pitch and Reliable Spectral Envelope Features for Robust Speaker Verification", Lecture Notes in Computer Science, Springer, 2003.

K.Y. Leung, M.W. Mak, M.H. Siu, and S.Y. Kung, "Adaptive Articulatory Feature-Based Conditional Pronunciation Modeling for Speaker Verification", Speech Communication, Vol. 48, Issue 1, pp. 71-84, Jan. 2006.

Erhan Mengusoglu, "Confidence Measure Based Model Adaptation for Speaker Verification", Proc. 2nd IASTED International Conference on Communications, Internet and Information Technology, Scottsdale, AZ, USA, 17-19 November 2003.

顏月珠, "商用統計學" (Business Statistics), 三民書局, 1990 (ROC year 79).

Hamdy, H. I., "Bayesian Prediction Bounds for the Pareto Lifetime Model", Commun. Statist.-Theory Meth., Vol. 16, Issue 6, pp. 1761-1772, 1987.

Lagakos, S. W., Barraj, L. M., and Degruttola, V., "Nonparametric Analysis of Truncated Survival Data, with Application to AIDS", Biometrika 75, pp. 515-523, 1988.

A. Clifford Cohen, "Truncated and Censored Samples: Theory and Applications", Marcel Dekker, 1991.

Helmut Schneider, "Truncated and Censored Samples from Normal Populations", ISBN 0-8247-7591-0, Marcel Dekker, 1986.

H. A. David, "Order Statistics, 2nd Edition", Iowa State University, 1981.

N. Balakrishnan and A. Clifford Cohen, "Order Statistics and Inference: Estimation Methods", Academic Press, Inc., 1991.

Barry C. Arnold, N. Balakrishnan and H. N. Nagaraja, "A First Course in Order Statistics", John Wiley & Sons, Inc., 1992.

John P. Klein and Melvin L. Moeschberger, "Survival Analysis: Techniques for Censored and Truncated Data, 2nd Edition", Springer, 2003.

Vincent Wan, "Speaker Verification using Support Vector Machines", Ph.D. thesis, University of Sheffield, U.K., June 2003.

Chun-Nan Hsu, Hau-Chung Yu and Bo-Hou Yang, "Speaker Verification without Background Speaker Models", in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2003), Hong Kong, China, 2003.

C. T. Liao and H. K. Iyer, "A Tolerance Interval for the Normal Distribution with Several Variance Components", Statistica Sinica, Vol. 14, pp. 217-229, 2004.

H. H. Barrett and K. J. Myers, "The Dirac Delta and other Generalized Functions", in Foundations of Image Science, John Wiley & Sons, Inc., New Jersey, pp. 63-94, 2004.

Arfken, G., "Appendix 2: Gaussian Quadrature", Mathematical Methods for Physicists, 3rd Edition, Orlando, FL: Academic Press, pp. 968-974, 1985.

Taischi, Chi., "A Segment-Based Speaker Verification System Using SUMMIT", Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 1997.