National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 吳芷瑄
Author (English): Chih-Hsuan Wu
Title (Chinese): 自我更新過程之穩健迴歸分析方法
Title (English): Robust regression by self-updating process
Advisor: 陳定立
Advisor (English): Ting-Li Chen
Oral defense date: 2017-06-28
Degree: Master's
Institution: 國立臺灣大學 (National Taiwan University)
Department: 應用數學科學研究所 (Institute of Applied Mathematical Sciences)
Discipline: Mathematics and Statistics
Subject area: Other Mathematics and Statistics
Document type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (ROC calendar)
Language: English
Number of pages: 28
Keywords (Chinese): 穩健迴歸; 疊代過程; 均值偏移演算法
Keywords (English): robust regression; iterative process; mean-shift
The robust regression method introduced by Huber (1973), together with later related work, is now one of the standard approaches to regression analysis; its iterative implementation updates the weights according to the previously fitted regression line and uses them in the next weighted least-squares fit. The robust regression method proposed in this thesis is inspired by the self-updating process (SUP) clustering algorithm of Chen and Shiu (2007) and the mean-shift algorithm of Cheng (1995), and it reduces the influence of outliers on the regression fit. The method is likewise an iterative process: the weights of a local weighted least-squares fit are determined by the distances between data points, and the updated data then serve as the input of the next iteration, completing the self-updating process. Simulation studies demonstrate that the method performs well on three types of data: data with uniform noise, data with heavy-tailed errors, and data generated from multiple linear models. Beyond the simulations, a data set of Major League Baseball players' salaries is analyzed as a practical application of the method.
Robust regression based on an M-estimator has been developed by Huber since 1973. A common algorithm performs weighted least squares in which the weights are iteratively updated according to the newly fitted line. In this paper, we present an iterative process that reduces the effect of outliers. It is an extension of the SUP (self-updating process) clustering algorithm (Chen and Shiu, 2007) and mean-shift clustering (Cheng, 1995). In each iteration, the process updates the weights and moves the data points toward a locally fitted line. We also provide estimation protocols to apply after the process has converged. Simulation studies show that the proposed method outperforms the traditional approach on several types of data: (i) data with uniform noise, (ii) data with heavy-tailed noise, and (iii) data generated from multiple linear models. The convergence problem is discussed together with some simple examples. Finally, a real data set of MLB players' salaries is analyzed to demonstrate the method.
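As a concrete illustration of the iteration the abstract describes, the following is a minimal Python sketch for a single predictor. It assumes Gaussian weights on pairwise distances with bandwidth r, a fixed number of iterations T, and an orthogonal projection of each point onto its locally fitted line; these choices, and the name sup_regression_sketch, are illustrative assumptions rather than the exact algorithm given in Section 2.1 of the thesis.

import numpy as np

def sup_regression_sketch(x, y, r=1.0, T=20):
    # Treat the observations as points in the (x, y) plane.
    pts = np.column_stack([x, y]).astype(float)
    n = len(pts)
    for _ in range(T):
        # Design matrix for the local fits y = a + b * x at this iteration.
        X = np.column_stack([np.ones(n), pts[:, 0]])
        new_pts = np.empty_like(pts)
        for i in range(n):
            # Weights from pairwise distances between the current points
            # (Gaussian kernel with bandwidth r -- an assumed choice).
            d2 = np.sum((pts - pts[i]) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * r ** 2))
            # Locally weighted least squares around point i.
            XtW = X.T * w                       # same as X.T @ diag(w)
            a, b = np.linalg.solve(XtW @ X, XtW @ pts[:, 1])
            # Move point i onto its local line (orthogonal projection).
            xi, yi = pts[i]
            t = (xi + b * (yi - a)) / (1.0 + b ** 2)
            new_pts[i] = (t, a + b * t)
        pts = new_pts                           # the self-updating step
    # Crude final estimate: ordinary least squares on the converged points.
    X = np.column_stack([np.ones(n), pts[:, 0]])
    return np.linalg.lstsq(X, pts[:, 1], rcond=None)[0]  # (intercept, slope)

# Toy usage, loosely mimicking the "data with uniform noise" setting of Section 3.1.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 150)
y = 1.0 + 2.0 * x + rng.standard_normal(150)
y[:40] = rng.uniform(0, 40, 40)                 # scattered background noise
print(sup_regression_sketch(x, y, r=2.0, T=30))

The last ordinary least-squares line is only a crude stand-in for the estimation protocols of Section 2.2; in the sketch the converged points simply concentrate along locally fitted lines, and the roles of the thesis's parameters r, T, and p are examined in Section 2.3.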
Oral examination committee certification i
Chinese abstract ii
Abstract iii
1 Introduction 1
1.1 Robust regression 1
1.2 Clustering by self-updating process 2
1.3 Mean-shift clustering 3
1.4 The concept of our method 5
2 Regression by SUP 5
2.1 Algorithm 6
2.2 Estimation 8
2.3 The effect of parameters 9
2.3.1 The parameter r 9
2.3.2 The parameter T 10
2.3.3 The parameter p 11
3 Strengths of the algorithm 12
3.1 Data with uniform noise 12
3.2 Heavy-tailed noise 14
3.3 Multiple linear models 15
4 Discussion about convergence 17
4.1 Blurring 17
4.2 Non-blurring 18
5 Real data 21
6 Discussion and future work 25
Reference 26
[1] Peter J Huber. Robust regression: Asymptotics, conjectures and Monte Carlo. The Annals of Statistics, pages 799–821, 1973.

[2] Peter J Huber. Robust methods of estimation of regression coefficients. Statistics: A Journal of Theoretical and Applied Statistics, 8(1):41–53, 1977.

[3] Paul W Holland and Roy E Welsch. Robust regression using iteratively reweighted least-squares. Communications in Statistics - Theory and Methods, 6(9):813–827, 1977.

[4] Peter J Rousseeuw. Least median of squares regression. Journal of the American Statistical Association, 79(388):871–880, 1984.

[5] Alan M Gross. Confidence interval robustness with long-tailed symmetric distributions. Journal of the American Statistical Association, 71(354):409–416, 1976.

[6] James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. Oakland, CA, USA, 1967.

[7] Shokri Z Selim and Mohamed A Ismail. K-means-type algorithms: A generalized convergence theorem and characterization of local optimality. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(1):81–87, 1984.

[8] Ting-Li Chen and Shang-Ying Shiu. A new clustering algorithm based on self-updating process. JSM Proceedings, Statistical Computing Section, Salt Lake City, Utah, pages 2034–2038, 2007.

[9] Shang-Ying Shiu and Ting-Li Chen. On the strengths of the self-updating process clustering algorithm. Journal of Statistical Computation and Simulation, 86(5):1010–1031, 2016.

[10] Ting-Li Chen, Dai-Ni Hsieh, Hung Hung, I-Ping Tu, Pei-Shien Wu, Yi-Ming Wu, Wei-Hau Chang, Su-Yun Huang, et al. γ-SUP: A clustering algorithm for cryo-electron microscopy images of asymmetric particles. The Annals of Applied Statistics, 8(1):259–285, 2014.

[11] Ting-Li Chen. On the convergence and consistency of the blurring mean-shift process. Annals of the Institute of Statistical Mathematics, 67(1):157–176, 2015.

[12] Ting-Li Chen, Hironori Fujisawa, Su-Yun Huang, and Chii-Ruey Hwang. On the weak convergence and central limit theorem of blurring and nonblurring processes with application to robust location estimation. Journal of Multivariate Analysis, 143:165–184, 2016.

[13] Keinosuke Fukunaga and Larry Hostetler. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Transactions on Information Theory, 21(1):32–40, 1975.

[14] Yizong Cheng. Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8):790–799, 1995.

[15] Dorin Comaniciu and Peter Meer. Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5):603–619, 2002.

[16] Xiangru Li, Zhanyi Hu, and Fuchao Wu. A note on the convergence of the mean shift. Pattern Recognition, 40(6):1756–1762, 2007.

[17] Youness Aliyari Ghassabeh. A sufficient condition for the convergence of the mean shift algorithm with Gaussian kernel. Journal of Multivariate Analysis, 135:1–10, 2015.

[18] Ery Arias-Castro, David Mason, and Bruno Pelletier. On the estimation of the gradient lines of a density and the consistency of the mean-shift algorithm. Journal of Machine Learning Research, 2015.

[19] Miguel Á Carreira-Perpiñán and Christopher KI Williams. On the number of modes of a Gaussian mixture. In International Conference on Scale-Space Theories in Computer Vision, pages 625–640. Springer, 2003.

[20] Mitchell R Watnik. Pay for play: Are baseball salaries based on performance? Journal of Statistics Education, 6(2), 1998.

[21] Abbas Khalili and Jiahua Chen. Variable selection in finite mixture of regression models. Journal of the American Statistical Association, 102(479):1025–1038, 2007.

[22] Kuo-Jung Lee, Ray-Bing Chen, and Ying Nian Wu. Bayesian variable selection for finite mixture model of linear regressions. Computational Statistics and Data Analysis, 95:1–16, 2016.