National Digital Library of Theses and Dissertations in Taiwan

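Author: 陳甲樺
Author (English): Chia-Hua Chen
Thesis title: 具遺失訊息下混合高斯分佈的精簡建模
Thesis title (English): Parsimonious Gaussian Mixture Modelling With Missing Information
Advisor: 林宗儀
Advisor (English): Tsung-I Lin
Oral defense committee: 吳宏達, 王婉倫
Oral defense date: 2011-06-18
Degree: Master's
Institution: National Chung Hsing University
Department: Department of Applied Mathematics
Discipline: Mathematics and Statistics
Field: Mathematics
Type of thesis: Academic thesis
Publication year: 2011
Graduation academic year: 99 (ROC calendar)
Language: Chinese
Number of pages: 35
Keywords (Chinese): 高斯混合模型, 遺失, 精簡, EM演算法, 貝氏訊息準則
Keywords (English): GMM, Missing, Parsimonious, EM algorithms, BIC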
Celeux and Govaert (1995, Pattern Recognition, 28, pp. 781-793) presented a new class of Gaussian mixture models (GMM) in which the within-group covariance matrices are structured parsimoniously in a geometrically interpretable way, an idea originally introduced by Banfield and Raftery (1993, Biometrics, 49, pp. 803-821). In this thesis, we establish computationally flexible EM-type algorithms for estimating the parameters of ten parsimonious forms of GMM when data are missing at random (MAR). For ease of computation and theoretical development, two auxiliary indicator matrices are incorporated into the estimation procedure to extract exactly the positions of the observed and missing components of each observation. Computational aspects, including the choice of starting values and the uncertainty of the clustering assessment, are also discussed. Candidate models are selected on the basis of the Bayesian information criterion (BIC), which is a reliable approximation to the Bayes factor. The practical usefulness of the proposed methodology is illustrated with real examples and with simulation studies under varying proportions of missing values.
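The role of the two indicator matrices can be illustrated with a small sketch. The code below is not taken from the thesis; it assumes missing entries are coded as NaN and that the two matrices are formed from rows of the identity matrix, and it uses hypothetical names (`selection_matrices`, `conditional_mean_fill`). It shows only the standard Gaussian conditioning step that an EM-type algorithm for incomplete data typically relies on, for a single component with known mean and covariance.

```python
import numpy as np

def selection_matrices(x):
    """Return (O, M) for one observation x with NaN marking missing entries.

    O keeps the rows of the identity matrix at observed positions and
    M keeps the rows at missing positions (an assumed construction).
    """
    p = x.shape[0]
    miss = np.isnan(x)
    identity = np.eye(p)
    return identity[~miss], identity[miss]

def conditional_mean_fill(x, mu, Sigma):
    """Replace missing entries of x by their conditional mean given the
    observed entries, under a single Gaussian N(mu, Sigma)."""
    O, M = selection_matrices(x)
    if M.shape[0] == 0:
        return x.copy()  # nothing is missing
    # nan_to_num avoids 0 * NaN = NaN when O multiplies the full vector
    x_o = O @ np.nan_to_num(x)
    mu_o, mu_m = O @ mu, M @ mu
    S_oo = O @ Sigma @ O.T  # covariance of the observed part
    S_mo = M @ Sigma @ O.T  # cross-covariance missing vs. observed
    x_m = mu_m + S_mo @ np.linalg.solve(S_oo, x_o - mu_o)
    # Map both sub-vectors back to their original positions
    return O.T @ x_o + M.T @ x_m

x = np.array([1.0, np.nan, 2.0])
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
filled = conditional_mean_fill(x, mu, Sigma)
```

In a mixture setting this conditioning would be carried out per component and weighted by the posterior membership probabilities; that part, and the ten covariance structures, are left out of the sketch.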

Table of Contents
1. Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization
2. Literature Review and Overview of Algorithms
3. Methodology
3.1 Multivariate Normal Distribution with Missing Information
3.2 Multivariate Normal Mixture Model with Missing Information
3.3 Ten Parsimonious Gaussian Mixture Models and Their Estimation
4. Simulation and Real Data Analysis
4.1 Simulation
4.2 Real Data Analysis
5. Conclusion

References

Appendix


Banfield, J.D., Raftery, A.E. (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49, 803–821.
Celeux, G., Govaert, G. (1995) Gaussian parsimonious clustering models. Pattern Recognition 28, 781–793.
Day, N.E. (1969) Estimating the components of a mixture of normal distributions. Biometrika 56, 463–474.
Dempster, A.P., Laird, N.M., Rubin, D.B. (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B 39, 1–38.
Diebolt, J., Robert, C.P. (1994) Estimation of finite mixture distributions through Bayesian sampling. Journal of the Royal Statistical Society. Series B 56, 363–375.
Fraley, C. and Raftery, A.E. (1998) How many clusters? Which clustering methods? Answers via model-based cluster analysis. Computer Journal, 41, 578–588.
Fraley, C., Raftery, A.E. (2002) Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97, 611–631.
Ghahramani, Z., Hinton, G.E. (1997) The EM algorithm for mixtures of factor analyzers (Tech. Report No. CRG-TR-96-1), University of Toronto.
Healy, M.J.R. (1968) Multivariate normal plotting. Applied Statistics 17, 157–161.
Jain, A.K., Duin, R.P.W., Mao, J. (2000) Statistical pattern recognition: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 4–37.
Kim, J.O., Curry, J. (1977) The treatment of missing data in multivariate analysis. Sociological Methods and Research 6, 215–240.
Lin, T.I. (2009) Maximum likelihood estimation for multivariate skew normal mixture models. Journal of Multivariate Analysis 100, 257–265.
Lin, T.I., Lee, J.C., Ho, H.J. (2006) On fast supervised learning for normal mixture models with missing information. Pattern Recognition 39, 1177–1187.
Lin, T.C., Lin, T.I. (2010) Supervised learning of multivariate skew normal mixture models with missing information. Computational Statistics 25, 183–201.
Liu, C.H., Rubin, D.B. (1994) The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81, 633–648.
Liu, C.H., Rubin, D.B. (1995) ML estimation of the t distribution using EM and its extensions, ECM and ECME. Statistica Sinica 5, 19–39.
Little, R.J.A., Rubin, D.B. (1987) Statistical Analysis with Missing Data. Wiley, New York.
Nishida, M., Kawahara, T. (2005) Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing. IEEE Transactions on Speech and Audio Processing 13(4).
McLachlan, G. J. and Basford, K. E. (1988) Mixture models: Inference and applications to clustering, New York: Marcel Dekker Inc.
McLachlan, G.J., Peel, D. (2000) Finite Mixture Models. John Wiley and Sons, New York.
McLachlan, G.J., Krishnan, T. (2008) The EM Algorithm and Extensions, 2nd edn, John Wiley and Sons, New York.
McNicholas, P.D., Murphy, T.B. (2008) Parsimonious Gaussian mixture models. Statistics and Computing 18, 285–296.
McNicholas, P.D. (2010) Model-based classification using latent Gaussian mixture models. Journal of Statistical Planning and Inference 140, 1175–1181.
McNicholas, P.D., Murphy, T.B., McDaid, A.F., Frost, D. (2010) Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models. Computational Statistics and Data Analysis 54, 711–723.
Meng, X.L., Rubin, D.B. (1993) Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80, 267–278.
Meng, X.L., van Dyk, D. (1997) The EM algorithm-an old folk-song sung to a fast new tune. Journal of the Royal Statistical Society. Series B 59, 511–567.
Schwarz, G. (1978) Estimating the dimension of a model. The Annals of Statistics 6, 461–464.
Tipping, M.E., Bishop, C.M. (1999) Mixtures of probabilistic principal component analyzers. Neural Computation 11, 443–482.
Ueda, N., Nakano, R., Ghahramani, Z., Hinton, G.E. (2000) SMEM algorithm for mixture models. Neural Computation 12, 2109–2128.
Zhao, J.H., Yu, P.L.H. (2008) Fast ML Estimation for the Mixture of Factor Analyzers via an ECM Algorithm. IEEE Transactions on Neural Networks 19, 1956–1961.
Zhao, J.H., Yu, P.L.H., Jiang, Q. (2008) ML estimation for factor analysis: EM or non-EM? Statistics and Computing 18, 109–123.

