Author: 徐偉舜 (Wei-Shun Hsu)
Title: 混合共同因子分析模型之貝氏推論 (Bayesian Inference for Mixtures of Common Factor Analyzers)
Advisor: 王婉倫 (Wan-Lun Wang)
Committee members: 林文欽 (Win-Chin Lin), 林宗儀 (Tsung-I Lin)
Oral defense date: 2013-06-06
Degree: Master's
Institution: 逢甲大學 (Feng Chia University)
Department: Department of Statistics, Master's Program in Statistics and Actuarial Science
Discipline: Mathematics and Statistics
Field: Statistics
Document type: Academic thesis
Year of publication: 2013
Graduating academic year: 101
Language: Chinese
Pages: 64
Keywords (Chinese): 共同因子負荷量; Gibbs 抽樣; 逆貝氏公式; 馬可夫鏈蒙地卡羅; 混合共同因子分析器; 混合因子分析器
Keywords (English): Common factor loadings; Gibbs sampler; Inverse Bayes formulae; Markov chain Monte Carlo; MCFA; MFA
Abstract (translated from Chinese):
The mixtures of factor analyzers (MFA) approach is a natural tool for model-based density estimation and clustering of high-dimensional data, especially when the sample size is small relative to the dimension of the variables. However, when the number of clusters is not small, the number of parameters in the component-covariance matrices of MFA becomes quite large, making estimation difficult. To further reduce the number of parameters, mixtures of common factor analyzers (MCFA), a parsimonious extension of MFA, have recently been developed for analyzing high-dimensional data. In this thesis, we adopt a Bayesian approach to fitting the MCFA; more specifically, parameter estimation and posterior inference are carried out through stochastic sampling from the posterior distributions of interest. Conjugate and weakly informative prior distributions are adopted for the model parameters to ensure proper posterior distributions. We employ an efficient Markov chain Monte Carlo (MCMC) technique that combines data augmentation, for imputing the latent variables, with Gibbs sampling for generating the parameters. Furthermore, to accelerate the convergence of the MCMC procedure, we also adopt the inverse Bayes formulae (IBF) coupled with the Gibbs sampler for model inference. Techniques for estimating the latent factors and classifying new subjects are also investigated. Simulation studies and real-data analyses show that our method gives satisfactory results in practical applications.
Abstract (English):
The mixtures of factor analyzers (MFA) approach is a natural tool for model-based density estimation and clustering of high-dimensional data, especially when the number of observations is not large relative to their dimension. However, the number of parameters in the component-covariance matrices of MFA is quite large when the number of clusters is not small. To further reduce the number of parameters, mixtures of common factor analyzers (MCFA) have recently been developed as a parsimonious extension of the MFA to analyze high-dimensional data. In this paper, we adopt a fully Bayesian approach to fitting the MCFA; more specifically, a treatment that carries out estimation and inference based on stochastic sampling of the posterior distributions of interest. Natural conjugate and weakly informative priors on the model parameters are introduced to ensure proper posterior distributions. We provide an efficient Markov chain Monte Carlo (MCMC) technique which incorporates data augmentation, for imputation of the latent variables, with a Gibbs sampler for generation of the parameters. Furthermore, in order to accelerate the convergence of the MCMC procedure, we also adopt the inverse Bayes formulae coupled with the Gibbs sampler to infer the MCFA. Techniques for the estimation of latent factors and the classification of new objects are also investigated. Simulation studies and real-data examples demonstrate that our methodology performs satisfactorily.
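To make the data-augmentation idea in the abstract concrete, here is a minimal toy sketch, not the thesis's MCFA algorithm: a Gibbs sampler for a two-component univariate Gaussian mixture that alternates between imputing the latent component labels and drawing the component means from their conjugate normal full conditionals. The variances and mixing weight are held fixed for brevity, and all names and settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two-component univariate Gaussian mixture (not the thesis data).
n = 300
z_true = rng.random(n) < 0.4
y = np.where(z_true, rng.normal(-2.0, 1.0, n), rng.normal(2.0, 1.0, n))

def gibbs(y, iters=2000, burn=500, prior_var=100.0):
    """Data-augmentation Gibbs sampler with unit component variances,
    a fixed mixing weight, and N(0, prior_var) priors on the means."""
    mu = np.array([-1.0, 1.0])  # starting values for the component means
    pi = 0.5                    # mixing weight, held fixed for simplicity
    draws = []
    for t in range(iters):
        # Step 1 (data augmentation): impute the latent labels from their
        # full conditional, a Bernoulli with posterior membership odds.
        log_p0 = np.log(pi) - 0.5 * (y - mu[0]) ** 2
        log_p1 = np.log(1.0 - pi) - 0.5 * (y - mu[1]) ** 2
        p0 = 1.0 / (1.0 + np.exp(log_p1 - log_p0))
        z = rng.random(y.size) >= p0  # True -> component 1
        # Step 2 (Gibbs update): draw each mean from its conjugate normal
        # full conditional given the imputed labels.
        for k, mask in enumerate([~z, z]):
            nk = mask.sum()
            post_var = 1.0 / (nk + 1.0 / prior_var)
            post_mean = post_var * y[mask].sum()
            mu[k] = rng.normal(post_mean, np.sqrt(post_var))
        if t >= burn:
            draws.append(mu.copy())
    return np.array(draws)

draws = gibbs(y)
est = draws.mean(axis=0)  # posterior-mean estimates of the two means
print(np.round(np.sort(est), 1))
```

The same alternation, imputing latent quantities and then drawing parameters from their full conditionals, underlies the MCMC scheme described above; the thesis additionally imputes latent factors, samples loadings and covariances, and accelerates convergence with the IBF-Gibbs combination.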
1 Introduction
1.1 Background
1.2 Motivation and objectives
1.3 Outline
2 Mixtures of factor analyzers (MFA)
2.1 The MFA model
2.2 Priors and full conditional posterior distributions for MFA
2.3 MCMC algorithm
2.4 IBF-Gibbs algorithm
3 Mixtures of common factor analyzers (MCFA)
3.1 The MCFA model
3.2 Bayesian model structure for MCFA
3.2.1 Prior distributions and hyperparameter settings
3.2.2 Posterior inference
3.3 Algorithms
3.3.1 MCMC procedure
3.3.2 IBF-Gibbs sampler
4 Simulation studies
4.1 Parameter-estimation performance
4.2 Accuracy of fitted values
4.3 Clustering performance
5 Real-data analysis
5.1 Analysis of the Italian wine data
5.2 Simulation study: wine experiments with noise variables
6 Analysis of gene microarray data
7 Conclusion
Appendix A Derivation of the full conditional posterior distributions for MFA
Appendix B Derivation of the full conditional posterior distributions for MCFA
Alon, U., Barkai, N., Notterman, D.A., Gish, K., Ybarra, S. et al. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences 96, 6745–6750.
Andrews, J.L. and McNicholas, P.D. (2010). Extending mixtures of multivariate
t-factor analyzers. Statistics and Computing 21, 361–373.
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B.N. Petrov and F. Csaki (eds.), Second International Symposium on Information Theory. Budapest: Akademiai Kiado, 267–281.
Auguie, B. (2013). gridExtra: Functions in Grid graphics. R package version 0.9.1.
Baek, J., McLachlan, G.J. and Flack, L.K. (2010). Mixtures of factor analyzers
with common factor loadings: applications to the clustering and visualization
of high-dimensional data. IEEE Transactions on Pattern Analysis and
Machine Intelligence 32, 1298–1309.
Baek, J. and McLachlan, G.J. (2011). Mixtures of common t-factor analyzers for clustering high-dimensional microarray data. Bioinformatics 27(9), 1269–1276.
Banfield, J.D. and Raftery, A.E. (1993). Model-based Gaussian and non-Gaussian clustering. Biometrics 49, 803–821.
Berger, J.O. (1985). Statistical Decision Theory and Bayesian Analysis. New York:
Springer.
Biernacki, C., Celeux, G. and Govaert, G. (2000). Assessing a Mixture Model for
Clustering with the Integrated Completed Likelihood. IEEE Transactions on
PAMI 22, 719–725.
Brooks, S.P. (2002). Discussion on the paper by Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). Journal of the Royal Statistical Society, Series B 64(3), 616–618.
Hurley, C. (2012). gclus: Clustering Graphics. R package version 1.3.1.
Diebolt, J. and Robert, C. (1994). Estimation of finite mixtures through Bayesian
Sampling. Journal of the Royal Statistical Society, Series B 56, 363–375.
Kim, D.K. and Taylor, J.M.G. (1995). The restricted EM algorithm for maximum likelihood estimation under linear restrictions on the parameters. Journal of the American Statistical Association 90, 707–716.
Sarkar, D. (2013). lattice: Lattice Graphics. R package version 0.20-15.
Flury, B.N. (1984). Common principal components in k groups. Journal of the American Statistical Association 79, 892–898.
Forina, M., Armanino, C., Castino, M. and Ubigli, M. (1986). Multivariate data analysis as a discriminating method of the origin of wines. Vitis 25, 189–201.
Fokoué, E. and Titterington, D.M. (2003). Mixtures of factor analysers. Bayesian estimation and inference by stochastic simulation. Machine Learning 50, 73–94.
Frühwirth-Schnatter, S. and Pyne, S. (2010). Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions. Biostatistics 11, 317–336.
Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models. Springer Series in Statistics. New York/Berlin/Heidelberg: Springer.
Galimberti, G., Montanari, A. and Viroli, C. (2009). Penalized factor mixture analysis for variable selection in clustered data. Computational Statistics & Data Analysis 53, 4301–4310.
Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and
the Bayesian restoration of images. IEEE Transactions on Pattern Analysis
and Machine Intelligence 6, 721–741.
Warnes, G.R., Bolker, B., Bonebakker, L., Gentleman, R., Huber, W., Liaw, A., Lumley, T., Maechler, M., Magnusson, A., Moeller, S., Schwartz, M. and Venables, B. (2013). gplots: Various R programming tools for plotting data. R package version 2.11.0.1.
Ghahramani, Z. and Beal, M.J. (2000). Variational inference for Bayesian mixtures
of factor analysers. In Advances in Neural Information Processing Systems 12,
Cambridge, MA: MIT Press.
Ghahramani, Z. and Jordan, M.I. (1994). Supervised learning from incomplete data via an EM approach. In: Cowan, J.D., Tesauro, G. and Alspector, J. (eds), Advances in Neural Information Processing Systems, vol 6. San Francisco: Morgan Kaufmann, 120–127.
Hubert, L. and Arabie, P. (1985). Comparing partitions. Journal of Classification 2, 193–218.
Hinton, G., Dayan, P. and Revow, M. (1997). Modeling the manifolds of images of handwritten digits. IEEE Transactions on Neural Networks 8, 65–73.
Lee, W.L., Chen, Y.C. and Hsieh, K.S. (2003). Ultrasonic liver tissues classification by fractal feature vector based on M-band wavelet transform. IEEE Transactions on Medical Imaging 22, 382–392.
Lin, T.I. (2009). Maximum likelihood estimation for multivariate skew normal
mixture models. Journal of Multivariate Analysis 100, 257–265.
Lin, T.I. (2010). Robust mixture modeling using multivariate skew t distributions.
Statistics and Computing 20, 343–356.
Lin, T.I., Lee, J.C. and Ho, H.J. (2006). On fast supervised learning for normal mixture models with missing information. Pattern Recognition 39, 1177–1187.
Lopes, H.F. and West, M. (2004). Bayesian model assessment in factor analysis.
Statistica Sinica 14, 41–67.
Martella, F. (2006). Classification of microarray data with factor mixture models.
Bioinformatics 22, 202–208.
McLachlan, G.J., Bean, R.W. and Peel, D. (2002). A mixture model-based approach
to the clustering of microarray expression data. Bioinformatics 18,
413–422.
McLachlan, G.J. and Peel, D. (2000). Finite Mixture Models. New York: Wiley.
McLachlan, G.J., Peel, D. and Bean, R.W. (2003). Modelling high-dimensional data by mixtures of factor analyzers. Computational Statistics & Data Analysis 41, 379–388.
McNicholas, P.D. and Murphy, T.B. (2008). Parsimonious Gaussian mixture models.
Statistics and Computing 18, 285–296.
Newton, M.A. and Raftery, A.E. (1994). Approximate Bayesian inference with
the weighted likelihood bootstrap (with discussion). Journal of the Royal
Statistical Society, Series B 56, 3–48.
Press, S.J. and Shigemasu, K. (1989). Bayesian inference in factor analysis. In Contributions to Probability and Statistics. New York: Springer-Verlag.
R Development Core Team (2009). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0, URL http://www.Rproject.org
Richardson, S. and Green, P.J. (1997). On Bayesian analysis of mixture models with an unknown number of components (with discussion). Journal of the Royal Statistical Society, Series B 59, 731–792.
Roberts, S., Husmeier, D., Rezek, I. and Penny, W. (1998). Bayesian approaches to Gaussian mixture modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 1133–1142.
Rubin, D.B. (1987). Using the SIR algorithm to simulate posterior distributions. Bayesian Statistics 3, 395–402.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6,
461–464.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P. and van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B 64, 583–639.
Brooks, S.P. and Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7, 434–456.
Tan, M., Tian, G.L. and Ng, K.W. (2003). A noniterative sampling method for computing posteriors in the structure of EM-type algorithms. Statistica Sinica 13, 625–639.
Utsugi, A. and Kumagai, T. (2001). Bayesian analysis of mixtures of factor analyzers.
Neural Computation 13, 993–1002.
Wang, W.L. and Fan, T.H. (2012). Bayesian analysis of multivariate t linear mixed models using a combination of IBF and Gibbs samplers. Journal of Multivariate Analysis 105, 300–310.
Wickham, H. and Chang, W. (2013). ggplot2: An implementation of the grammar of graphics. R package version 0.9.3.1.
Xie, B., Pan, W. and Shen, X. (2010). Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data. Bioinformatics 22, 2405–2412.
Zhang, Z., Chan, K.L., Wu, Y. and Chen, C. (2004). Learning a multivariate
Gaussian mixture model with the reversible jump MCMC algorithm. Statistics
and Computing 14, 343–355.
Di Zio, M., Guarnera, U. and Luzi, O. (2007). Imputation through finite Gaussian mixture models. Computational Statistics & Data Analysis 51, 5305–5316.