
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

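Graduate student: 張中瀚
Graduate student (English name): Chung-Hen Chang
Thesis title: 結合主成分分析與叢聚式迴歸於晶圓允收測試資料之建模
Thesis title (English): Combination of Principal Component Analysis and Clusterwise Regression for Modeling Wafer Acceptance Test Data
Advisor: 范治民
Degree: Master's
Institution: Yuan Ze University (元智大學)
Department: Department of Industrial Engineering and Management
Discipline: Engineering
Field: Industrial Engineering
Thesis type: Academic thesis
Year of publication: 2010
Graduation academic year: 98 (ROC calendar)
Language: Chinese
Number of pages: 121
Keywords (translated from Chinese): data clustering; Expectation-Maximization algorithm; principal component analysis; clusterwise regression; wafer acceptance test
Keywords (English): Clustering; Expectation Maximization; Principal Component Analysis; PCA; Clusterwise Regression; Wafer Acceptance Test; WAT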
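Modeling Wafer Acceptance Test (WAT) data is an emerging topic in fab-wide process monitoring in the semiconductor industry. WAT data usually follow a mixture of several linear models, but the indicator that determines which linear model applies to an observation (the model indicator) is often absent from the data fields under analysis. We call quantities that do not appear in the data fields "hidden factors". How to detect these hidden factors, and thereby estimate the linear models that have been mixed together in the WAT data, is the problem this thesis addresses.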
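For the problem of detecting hidden factors and estimating multiple mixed WAT linear models, EM-Based Clusterwise Regression (EMCR), which combines regression analysis with the iterative Expectation-Maximization (EM) algorithm, has already been proposed and has been validated on a small number of semiconductor industry cases. However, EMCR does not account for two characteristics of real semiconductor fabs: (1) the explanatory variables of WAT linear models are highly collinear, and (2) WAT data contain outliers. Both characteristics degrade the performance of EMCR.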
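As an illustration of the general idea behind EM-based clusterwise regression, the following is a minimal sketch of EM for a mixture of linear regressions with Gaussian noise. It is not the thesis's implementation: the function name `emcr`, the random initialization, the fixed iteration count, and the variance floor are all assumptions made for this example.

```python
import numpy as np

def emcr(X, y, k=2, n_iter=50, seed=0):
    """Sketch of EM for a k-component mixture of linear regressions.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Returns coefficients, noise scales, mixing proportions, memberships,
    and the log-likelihood recorded at each iteration.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])        # design matrix with intercept
    resp = rng.dirichlet(np.ones(k), size=n)    # random initial memberships (n, k)
    loglik = []
    for _ in range(n_iter):
        # M-step: one weighted least-squares fit per component,
        # using the memberships as regression weights.
        pis = resp.mean(axis=0)
        betas = np.empty((k, A.shape[1]))
        sigmas = np.empty(k)
        for j in range(k):
            w = resp[:, j]
            WA = A * w[:, None]
            betas[j] = np.linalg.solve(A.T @ WA, WA.T @ y)
            r = y - A @ betas[j]
            # Floor on the noise scale is an assumption to avoid degeneracy.
            sigmas[j] = max(np.sqrt(np.sum(w * r**2) / np.sum(w)), 1e-6)
        # E-step: posterior probability that each point belongs to each model.
        resid = y[:, None] - A @ betas.T         # (n, k)
        dens = pis * np.exp(-0.5 * (resid / sigmas) ** 2) / (np.sqrt(2 * np.pi) * sigmas)
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        loglik.append(np.log(total).sum())
    return betas, sigmas, pis, resp, np.array(loglik)
```

The memberships `resp` play the role of the estimated hidden factor: each row gives the probability that an observation was generated by each linear model. The recorded log-likelihood should not decrease from one iteration to the next, which is a convenient sanity check.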
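When the explanatory variables of a WAT linear model are highly collinear, the variance of the regression coefficients estimated by EMCR increases. This not only impairs the convergence of the EMCR iterations but also leads to misinterpretation of the effects of individual explanatory variables. To eliminate the effect of collinearity on EMCR estimation, this thesis adds a preprocessing step before EMCR: the highly correlated explanatory variables are transformed by Principal Component Analysis (PCA), and EMCR is run on the resulting principal components to obtain more stable estimates of the hidden factors and of the regression coefficient of each principal component. To keep the regression coefficients interpretable, a postprocessing step after EMCR converts the coefficients of the principal components back into coefficients of the corresponding explanatory variables. This thesis calls the method PCA-Enhanced EMCR (PEMCR). Simulation experiments show that when the explanatory variables in semiconductor yield analysis are highly linearly correlated, PEMCR improves convergence quality substantially over EMCR, by a factor of at least 5, while its convergence efficiency is comparable to that of EMCR.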
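The preprocessing and postprocessing steps described above can be sketched for a single linear model as principal component regression with a back-transformation of the coefficients. This is a sketch under stated assumptions, not the thesis's code: the name `pcr_fit`, the standardization of the variables, and the use of ordinary least squares in place of the EMCR step are choices made only for this example.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the leading principal components of X, then map the
    component coefficients back to the original explanatory variables.

    Hypothetical helper for illustration; X has shape (n, p).
    """
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd                            # standardized variables
    # Rows of Vt are the principal directions of Z.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T                      # loadings, shape (p, m)
    T = Z @ V                                    # component scores, shape (n, m)
    # Preprocessing result: regression on uncorrelated component scores.
    gamma, *_ = np.linalg.lstsq(T, y - y.mean(), rcond=None)
    # Postprocessing: coefficients on the original variable scale.
    beta = (V @ gamma) / sd
    intercept = y.mean() - mu @ beta
    return intercept, beta, gamma
```

When all components are kept, the back-transformed coefficients coincide with ordinary least squares; dropping the low-variance components is what stabilizes the estimates under collinearity. In a clusterwise setting the least-squares step would be replaced by the weighted fits inside EM.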
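When the data contain outliers, the regression parameters estimated by EMCR are noticeably biased, because EMCR overestimates the model membership weights of outliers. To address this, the thesis proposes a model regression weight. The main idea is to design a weight that is resistant to outliers and to multiply it by the model membership weight, so that outliers no longer receive excessive weight. The product of the two weights is called the double weight; because it is used only when estimating the parameters, it is also called the model regression weight. EMCR enhanced with this double weight is called DW-EMCR (Double Weighted EMCR). Simulation experiments show that, on data with outliers, DW-EMCR both increases the share of WAT variation explained by the regression models and improves computational efficiency.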
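The thesis also validates the methods on real case data that exhibit both collinearity and outliers. Because no true model is available for comparison with real fab data, EMCR is carried out with resampling. Regarding collinearity, it is not necessary to apply PCA to all of the variables before EMCR; applying PCA only to the few collinear factors and then running EMCR yields the regression model that explains the most WAT variation. This approach, in which only some factors are PCA-transformed before EMCR, is called partial PEMCR. Regarding outliers, the double weight greatly improves the stability of the parameter estimates (that is, it greatly narrows the 95% confidence intervals), and the improved stability helps engineers find effect factors that had been masked by outliers.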



In semiconductor manufacturing, building regression models on Wafer Acceptance Test (WAT) data is a cornerstone of fab-wide process control. Unfortunately, WAT data usually follow multiple linear models whose model indicator is a "hidden variable". The EM-Based Clusterwise Regression (EMCR) technique has been applied to modeling WAT data with multiple linear models. Although EMCR has been validated in a few semiconductor manufacturing cases, its performance degrades under two characteristics of WAT data: (1) high collinearity among the explanatory variables of a WAT regression model, and (2) outliers in the WAT data.
High collinearity inflates the variance of the EMCR regression coefficients, which can in turn mislead their interpretation. To remove the collinearity among explanatory variables, Principal Component Analysis (PCA) is integrated with EMCR; the resulting method is named PCA-Enhanced EMCR. Simulation studies show that PCA-Enhanced EMCR is at least 5 times better than EMCR in convergence quality.
Although EMCR uses weighted regression, it is not immune to outliers. In EMCR, the regression weight of each data point is its probabilistic membership in each individual regression model. An outlier may receive an excessive membership estimate and therefore bias the estimated regression coefficients. To cope with outliers, a new outlier-resistant weight is designed and combined with the probabilistic membership to form the regression weight. EMCR enhanced by this double-weighting method is called DW-EMCR. Simulation studies show that DW-EMCR improves not only the share of variation explained by the WAT regression model but also the convergence efficiency of EMCR.
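The double-weighting idea can be sketched for one regression model as an iteratively reweighted least-squares fit whose weight is the membership multiplied by an outlier-resistant weight. The abstract does not give the form of the resistant weight, so the Huber weight and the MAD-based scale used here are stand-ins chosen for illustration, and the names `huber_weight` and `double_weighted_fit` are hypothetical.

```python
import numpy as np

def huber_weight(resid, scale, c=1.345):
    """Huber weight: 1 for small residuals, decaying as c/|u| for large ones.
    Assumed stand-in for the thesis's outlier-resistant weight."""
    u = np.abs(resid) / scale
    return np.minimum(1.0, c / np.maximum(u, 1e-12))

def double_weighted_fit(A, y, membership, n_iter=30):
    """Weighted least squares with weight = membership * resistant weight.

    A is the design matrix (including the intercept column) and
    membership holds each point's probability of belonging to this model.
    """
    w = membership.astype(float).copy()          # start from membership only
    for _ in range(n_iter):
        WA = A * w[:, None]
        beta = np.linalg.solve(A.T @ WA, WA.T @ y)
        resid = y - A @ beta
        # Robust scale (MAD) from points that mostly belong to this model;
        # the 0.5 threshold is an assumption of this sketch.
        core = resid[membership > 0.5] if np.any(membership > 0.5) else resid
        scale = max(1.4826 * np.median(np.abs(core - np.median(core))), 1e-12)
        # Double weight: membership times outlier-resistant weight.
        w = membership * huber_weight(resid, scale)
    return beta, w
```

Inside a clusterwise regression loop, this fit would replace the membership-only weighted regression in the parameter-estimation step, while the memberships themselves would still be updated as before.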
A data set in which collinearity and outliers coexist was collected from a semiconductor foundry for validation. Because the true model is unknown, a resampling technique is applied for performance evaluation. For the collinearity problem, PCA-Enhanced EMCR, which conducts PCA on the highly correlated variables, indeed performs better than EMCR. For the outlier problem, DW-EMCR demonstrates its capability to narrow the confidence intervals of the regression coefficients, which in turn helps engineers discover important factors masked by outliers.
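One common way to obtain confidence intervals for regression coefficients by resampling is the percentile bootstrap, sketched below. The abstract does not state which resampling scheme the thesis uses, so the bootstrap itself, the names `bootstrap_ci` and `ols_fit`, and the default settings are assumptions for illustration; any fitting routine that returns a coefficient vector could be passed in place of `ols_fit`.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares with intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def bootstrap_ci(fit, X, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for each coefficient returned by fit.

    Rows are resampled with replacement, the model is refitted on each
    resample, and the empirical quantiles of the refitted coefficients
    give the interval bounds.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)              # resample row indices
        draws.append(fit(X[idx], y[idx]))
    draws = np.array(draws)
    lo, hi = np.percentile(draws, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return lo, hi
```

Comparing the interval widths `hi - lo` between two fitting methods on the same data is one way to express the kind of stability comparison described above.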


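Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Characteristics of the Research Problem: Multiple Linear Models with Hidden Classes
1.3 Related Literature
1.4 Research Objectives and Methods
1.5 Thesis Organization
Chapter 2 Clusterwise Regression of Wafer Acceptance Test Data Using the Expectation-Maximization Algorithm
2.1 Wafer Acceptance Test Data Exhibiting Mixture Models
2.2 Likelihood Function of the Mixture Model
2.3 Maximum Likelihood Estimation of the Mixture Model
Chapter 3 The Collinearity Problem and Its Solution
3.1 The Problem of Collinearity among the X Factors
3.2 Solution to the Collinearity Problem
Chapter 4 The Outlier Problem and Its Solution
4.1 The Problem of Outliers in the Data
4.2 Solution to the Outlier Problem
Chapter 5 Extensive Simulation and Performance Evaluation
5.1 Simulation and Evaluation Settings
5.2 Simulation Results for the Collinearity Problem
5.3 Simulation Results for the Outlier Problem
Chapter 6 Simulation with Real Semiconductor Fab Data
Chapter 7 Conclusion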


