National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


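Author: 丁川偉
Author (English): Chuan-Wei Ting
Thesis title: 因素分析模型於語音辨識之研究
Thesis title (English): Factor Analysis and Modeling for Speech Recognition
Advisor: 簡仁宗
Advisor (English): Jen-Tzung Chien
Degree: Doctoral
Institution: National Cheng Kung University (國立成功大學)
Department: Department of Computer Science and Information Engineering, Master's and Doctoral Program (資訊工程學系碩博士班)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation:
Language: English
Number of pages: 117
Keywords (Chinese): 語音、因素分析 (speech, factor analysis)
Keywords (English): speech recognition, factor analysis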


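Statistical analysis and machine learning play an important role in building speech recognition systems. In speech recognition, a mismatch between training and test environments is common, so system robustness has long been a hard problem in speech processing. Many methods addressing robustness have been proposed in the literature. Following the order of speech processing, they can generally be grouped into three levels: processing in the signal space, in the feature space, and in the model space. Signal-space methods suppress noise or enhance the speech portion of the noisy signal before recognition, so that training and test data both lie in a matched, clean signal space. Feature-space methods try to extract speech features that are less affected by speaker variation or the acoustic environment. Model-space methods take the trained model and adapt it to the test environment using adaptation data whose conditions are close to those of the test data.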
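This dissertation takes a machine learning perspective and proposes factor analysis (FA) models to address robustness in automatic speech recognition at several levels. Specifically, the work includes: (1) a novel subspace modeling and subspace selection method that improves noisy speech recognition by using fine-grained speech information extracted from the principal and minor subspaces of the speech signal; (2) a factor analyzed streamed hidden Markov model (FASHMM) framework that, in addition to transforming the original features, describes the features of each factor with a separate Markov model; (3) an adaptive FA-activated HMM topology mechanism with self-learning capability, which automatically finds speech variations in incoming data that are absent from the existing model and adds them to the model with different topological structures according to their characteristics, thereby updating the model; and (4) implementation and evaluation of the proposed methods on the TIMIT, AURORA2, and WSJ corpora.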
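In the speech recognition experiments, we used different setups to evaluate the proposed methods in speech enhancement, feature transformation and model streaming, model selection, and HMM topology. The results show that the proposed methods performed satisfactorily in each evaluation. For speech enhancement, we exploited the properties of factor analysis and minimized the energy of speech distortion separately in the two subspaces to obtain better speech quality; also based on factor analysis, we designed a rule for selecting the optimal subspace through hypothesis testing. Both techniques effectively raised the signal-to-noise ratio of noisy speech and the recognition rate on noisy speech. For feature streaming, we applied an FA feature transformation and described each common factor and specific factor with a separate Markov chain to obtain a streamed HMM; across implementations with different numbers of features and states, the recognition rate improved significantly. For HMM topology, we sequentially learned pronunciation variations from each training subset and adjusted the model topology accordingly. We also proposed an FA-based measure of similarity between HMM states, which improved adaptive learning of the HMM topology. The methods and analyses in this dissertation offer a useful reference for researchers in machine learning and speech recognition.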

Statistical analysis and machine learning play a crucial role in building flexible speech recognition systems. Robustness is an important issue in speech recognition because a mismatch between training and test environments always exists in real-world applications. Many methods have been proposed in the literature to deal with it. In general, they apply statistical learning in one of three spaces: the signal space, the feature space, or the model space. In the signal space, noise interference is reduced or the noisy speech is enhanced to resolve the mismatch before the recognition stage. In the feature space, the objective is to find a robust feature representation that is insensitive to variations due to noise, channels, and speakers. In the model space, the aim is to adapt the current model to the test conditions, or equivalently to capture the characteristics of adaptation data that is closer to the new environment.
Motivated by the machine learning perspective, this dissertation presents factor analysis (FA) approaches in the signal space, feature space, and model space for dealing with robustness in automatic speech recognition. More specifically, we present several studies: (1) a novel subspace modeling and selection approach that extracts delicate speech information from the principal subspace and the minor subspace of speech signals for noisy speech recognition; (2) an FA streamed hidden Markov model (FASHMM) framework in which the acoustic features are analyzed and transformed by the FA principle, and an individual Markov chain is applied to the transformed features corresponding to the same common factor; (3) a flexible FA-activated HMM topology with a self-learning capability that learns new pronunciation variations from continuously arriving input data; and (4) implementation and evaluation of the proposed methods on the TIMIT, AURORA2, and WSJ speech corpora.
We conducted several sets of experiments to evaluate speech recognition performance with the proposed methods, covering speech enhancement, acoustic feature streaming, model selection, and HMM topology. Experimental results showed that the proposed methods achieved the desired performance in the different evaluations. In speech enhancement, we minimized the energy of speech distortion in the principal subspace as well as in the minor subspace so as to estimate the clean speech with residual information. Following the FA principle, we selected the optimal subspace by solving hypothesis testing problems. These techniques increased the signal-to-noise ratio (SNR) and improved recognition accuracy in noisy speech recognition. In acoustic feature streaming, we performed the FA feature transformation and adopted an individual Markov chain for streamed HMM modeling. Speech recognition performance improved significantly across realizations of streaming with different numbers of features and states. In HMM topology, we sequentially learned the pronunciation variations and adapted the topology at different learning epochs. An FA similarity measure between two HMM states was proposed and shown to be effective in adaptive learning of the HMM topology. The methods proposed in this dissertation should be helpful to researchers working on related topics.
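To make the factor analysis idea in the abstract more concrete, the following is a minimal sketch of the textbook principal component method for fitting a k-factor model, together with the split of the eigenvector basis into a principal and a minor subspace. It is not the dissertation's algorithm: the function name `pc_factor_analysis`, the synthetic data, and the choice of two factors are all assumptions made for illustration.

```python
import numpy as np

def pc_factor_analysis(X, k):
    """Illustrative principal component method for a k-factor model.

    Returns loadings (d x k), specific variances (d,), and orthonormal
    bases of the principal (d x k) and minor (d x (d - k)) subspaces.
    """
    S = np.cov(X, rowvar=False)                # sample covariance, d x d
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings: top-k eigenvectors scaled by the square roots of their eigenvalues.
    loadings = eigvecs[:, :k] * np.sqrt(np.clip(eigvals[:k], 0.0, None))
    # Specific variances: what the k factors leave unexplained on the diagonal.
    psi = np.diag(S - loadings @ loadings.T)
    return loadings, psi, eigvecs[:, :k], eigvecs[:, k:]

# Synthetic data (an assumption): 6 observed features driven by 2 hidden factors.
rng = np.random.default_rng(0)
true_loadings = rng.normal(size=(6, 2))
factors = rng.normal(size=(2000, 2))
noise = 0.1 * rng.normal(size=(2000, 6))
X = factors @ true_loadings.T + noise

loadings, psi, principal, minor = pc_factor_analysis(X, k=2)
print("loadings shape:", loadings.shape)
print("specific variances:", np.round(psi, 4))
```

In this sketch the principal subspace carries most of the variance from the hidden factors, while the minor subspace holds the residual. How the dissertation actually estimates clean speech from both subspaces, and how it chooses the subspace dimension by hypothesis testing, is not specified in this record and is not reproduced here.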

Chinese Abstract
Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivations
1.2 Outline of This Dissertation
1.3 Contributions of This Dissertation
Chapter 2 Background Survey
2.1 Statistical Speech Recognition
2.2 Factor Analysis
2.2.1 Maximum Likelihood Estimation
2.2.2 Principal Component Method
2.2.3 Principal Factor Analysis
Chapter 3 Factor Analyzed Subspace Modeling and Selection
3.1 Subspace Modeling
3.1.1 Modeling of Noisy Signal
3.1.2 Estimation of Clean Signal
3.2 Subspace Selection
3.2.1 Selection via Testing Equivalence of Eigenvalues
3.2.2 Selection via Testing Diagonal Covariance Matrix
3.2.3 Illustration and Implementation
3.3 Experiments
3.3.1 Experimental Setup
3.3.2 Evaluation of Waveforms and SNRs of Enhanced Speech
3.3.3 Effects of Subspace Modeling on Noisy Speech Recognition
3.3.4 Effects of Subspace Selection on Noisy Speech Recognition
3.4 Summary
Appendix 3-A
Appendix 3-B
Chapter 4 Factor Analysis for Streamed Hidden Markov Modeling
4.1 Factor Analysis
4.1.1 Factor Analysis of Acoustic Features
4.1.2 FA Parameter Estimation
4.2 Streamed Hidden Markov Models
4.2.1 HMM Topologies
4.2.2 Factor Analysis Streamed HMM
4.2.3 FA Parameter Sharing
4.2.4 FASHMM Viterbi Algorithm
4.3 Experiments
4.3.1 Experimental Setup
4.3.2 Recognition Results of HMM, SFHMM and FASHMM
4.3.3 Evaluation of FA Parameter Sharing in FASHMM
4.3.4 Evaluation of FASHMM for Phone Recognition
4.3.5 Evaluation of FASHMM for Noisy Speech Recognition
4.4 Summary
Appendix 4-A
Appendix 4-B
Chapter 5 Adaptive Factor Analyzed HMM Topology
5.1 Related Works
5.1.1 Similarities between Two GMMs
5.1.2 HMM Topology Learning
5.2 Adaptive Factor Analyzed HMM Topology and Parameters
5.2.1 Adaptive HMM Topology in State Level
5.2.2 Adaptive HMM Topology in Gaussian Level
5.2.3 HMM Topologies and Adaptive Learning Algorithm
5.3 Experiments
5.3.1 Experimental Setup
5.3.2 Adaptive HMM Topology
5.3.3 Evaluation of AHMMT for Phone Recognition
5.3.4 Evaluation of AHMMT for Word Recognition
5.4 Summary
Chapter 6 Conclusions and Future Works
Bibliography
Author's Biographical Notes

