臺灣博碩士論文加值系統

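Author: 陳春樹
Author (English): Chun-Shu Chen
Thesis title (English): Model Selection for Curve and Surface Fitting Using Generalized Degrees of Freedom
Advisors: 陳玉英, 黃信誠
Degree: Doctoral
Institution: 國立中央大學 (National Central University)
Department: 統計研究所 (Graduate Institute of Statistics)
Discipline: Mathematics and Statistics
Field: Statistics
Thesis type: Academic thesis
Publication year: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Pages: 79
Keywords (English): Mean squared prediction error; Stein's unbiased risk estimate; Variable selection; Smoothing spline; Spatial prediction; Squared error loss; Noise variance estimation; Selection variability; Nonparametric regression; Nonlinear estimate; Data perturbation; Bootstrap; Kriging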
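In data analysis, several statistical methods or models are usually available, and they generally differ in fitting ability and prediction accuracy, so choosing an appropriate one is an important problem. This thesis examines how to choose an appropriate method from a set of curve-fitting or surface-fitting methods, with particular emphasis on selection among nonlinear methods. The most difficult part is establishing a statistical mechanism that can measure nonlinear and complex fitting methods, so that the candidates can be compared fairly and the best method for analyzing the data can be selected.

The thesis builds on the generalized degrees of freedom proposed by Ye (1998). For curve fitting, many criteria already exist for choosing a fitting method, but most are suitable only for comparing linear methods. We propose a criterion that fairly assesses the performance of a set of smoothing-parameter selection methods for spline functions and selects the most appropriate one. We further extend the idea of generalized degrees of freedom to surface fitting for spatial prediction and propose a general criterion for geostatistical model selection. The proposed criterion can be used to choose among spatial prediction methods and also applies to regression variable selection with spatially correlated noise. The superiority of the new methods for curve and surface fitting is demonstrated through simulation experiments, and the methods are proved theoretically to be asymptotically optimal.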
In data analysis, a number of candidate statistical methods (models) are usually available, and different methods (models) generally perform differently in different situations. In this thesis, we focus on model selection in curve and surface fitting. We develop a general rule to fairly compare candidate curve- or surface-fitting methods, whether or not the fitting procedures are complex and whether the corresponding estimates are linear, nonlinear, or even discontinuous.

Based on the concept of generalized degrees of freedom (GDF) (Ye 1998), we propose an improved Cp method to select among a class of selection criteria in spline smoothing. In addition, we propose a general methodology for geostatistical model selection by further generalizing GDF to spatial prediction. The proposed method can be used not only to select among various spatial prediction methods but also for variable selection in spatial regression. The validity of the proposed model selection methods for curve and surface fitting is justified both numerically and theoretically.
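As a rough illustration of the GDF idea cited above, the sketch below estimates the generalized degrees of freedom of a fitting procedure by data perturbation: the data are perturbed with small Gaussian noise, the procedure is refitted, and the sensitivity of each fitted value to its own observation is summed. This is a minimal sketch in the spirit of Ye (1998), not the procedure developed in the thesis; the function names `gdf_data_perturbation` and `make_polynomial_fit`, the perturbation size `tau`, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def gdf_data_perturbation(fit, y, tau, n_rep=500, seed=0):
    """Monte Carlo estimate of generalized degrees of freedom.

    fit   : callable mapping a response vector to fitted values (hypothetical interface)
    y     : observed response vector
    tau   : standard deviation of the Gaussian perturbations (a tuning choice)
    n_rep : number of perturbed data sets
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    # Each row is one perturbation vector delta_t ~ N(0, tau^2 I).
    deltas = rng.normal(0.0, tau, size=(n_rep, n))
    # Refit the procedure on every perturbed data set.
    fits = np.array([fit(y + d) for d in deltas])
    # For each coordinate i, the least-squares slope of fitted value i on
    # perturbation i estimates the sensitivity of that fitted value to y_i.
    d_c = deltas - deltas.mean(axis=0)
    f_c = fits - fits.mean(axis=0)
    slopes = (d_c * f_c).sum(axis=0) / (d_c ** 2).sum(axis=0)
    # The GDF estimate is the sum of the estimated sensitivities.
    return float(slopes.sum())


def make_polynomial_fit(x, degree):
    """Return a linear fitting procedure: least-squares polynomial of given degree."""
    X = np.vander(x, degree + 1)
    H = X @ np.linalg.pinv(X)  # hat matrix; its trace equals degree + 1
    return lambda y: H @ y


# Simulated data, used only to exercise the estimator.
x = np.linspace(0.0, 1.0, 50)
noise = np.random.default_rng(1).normal(0.0, 0.3, size=x.size)
y = np.sin(2.0 * np.pi * x) + noise

gdf_quadratic = gdf_data_perturbation(
    make_polynomial_fit(x, 2), y, tau=0.15, n_rep=2000
)
print(f"estimated GDF for a quadratic fit: {gdf_quadratic:.2f}")
```

For a linear procedure such as the quadratic fit used here, the estimate should be close to the trace of the hat matrix (three parameters), which gives a simple sanity check. The same function accepts any `fit` callable, including nonlinear or data-adaptive procedures, which is the setting where GDF is most useful.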
Contents
1 Introduction
1.1 Model Selection
1.2 Curve Fitting
1.2.1 Cubic Splines
1.2.2 Choice of the Smoothing Parameter
1.3 Surface Fitting
1.3.1 Kriging
1.3.2 Thin-Plate Splines
1.4 Dissertation Organization
2 Model Selection and Generalized Degrees of Freedom
2.1 Model Selection in Regression
2.2 Generalized Degrees of Freedom
2.3 Estimating the Generalized Degrees of Freedom
2.3.1 Stein's Unbiased Risk Estimate
2.3.2 Parametric Bootstrap
2.3.3 Data Perturbation
3 An Improved Cp Criterion for Spline Smoothing
3.1 Introduction
3.2 The Proposed Method
3.2.1 Adaptive Cp
3.2.2 Asymptotic Optimality of Adaptive Cp
3.3 Simulation Study
3.4 Discussion
4 Geostatistical Model Selection
4.1 Introduction
4.2 Geostatistical Models and Spatial Prediction
4.3 Generalized Degrees of Freedom for Spatial Model Selection
4.3.1 Generalized Degrees of Freedom
4.3.2 Data Perturbation
4.3.3 Optimal Model Selection for Spatial Prediction
4.3.4 Estimation of Noise Variance
4.4 Simulation Study
4.5 Application
4.6 Discussion
5 Summary
Bibliography
Appendix
1. Akaike, H. (1973) Information theory and the maximum likelihood principle. In International Symposium on Information Theory. (V. Petrov and F. Csaki eds.). Akademiai Kiado, Budapest, 267-281.
2. Altman, N. (2000) Krige, smooth, both or neither? (with discussion). Australian & New Zealand Journal of Statistics, 42, 441-461.
3. Buja, A., Hastie, T., and Tibshirani, R. (1989) Linear smoothers and additive models (with discussion). The Annals of Statistics, 17, 453-555.
4. Cleveland, W. S. and Devlin, S. (1988) Locally weighted regression: an approach to regression analysis by local fitting. Journal of the American Statistical Association, 83, 596-610.
5. Cleveland, W. S. (1979) Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 74, 829-836.
6. Craven, P. and Wahba, G. (1979) Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403.
7. Cressie, N. (1990) Reply to a Letter by G. Wahba. The American Statistician, 44, 256-258.
8. Cressie, N. (1993) Statistics for Spatial Data (revised edition). Wiley: New York.
9. Cressie, N. and Lahiri, S. N. (1993) The asymptotic distribution of REML estimators. Journal of Multivariate Analysis, 45, 217-233.
10. Cressie, N. and Lahiri, S. N. (1996) Asymptotics for REML estimation of spatial covariance parameters. Journal of Statistical Planning and Inference, 50, 327-341.
11. Davis, B. M. (1987) Uses and abuses of cross-validation in geostatistics. Mathematical Geology, 19, 241-248.
12. Donoho, D. L. and Johnstone, I. M. (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81, 425-456.
13. Dubrule, O. (1983) Two methods with different objectives: splines and kriging. Mathematical Geology, 15, 245-257.
14. Durrett, R. (1995) Probability: Theory and Examples (second edition). Duxbury Press: Belmont.
15. Efron, B. (2001) Selection criteria for scatterplot smoothers. The Annals of Statistics, 29, 470-504.
16. Efron, B. (2004) The estimation of prediction error: covariance penalties and cross-validation. Journal of the American Statistical Association, 99, 619-632.
17. Eubank, R. (1999) Nonparametric Regression and Spline Smoothing (second edition). Marcel Dekker: New York.
18. Fan, J. and Gijbels, I. (1996) Local Polynomial Modelling and its Applications. Chapman and Hall: London.
19. George, E. I. and Foster, D. P. (1994) The risk inflation criterion for multiple regression. The Annals of Statistics, 22, 1947-1975.
20. Green, P. and Silverman, B. (1994) Nonparametric Regression and Generalized Linear Models. Chapman and Hall: London.
21. Gu, C. (2002) Smoothing Spline ANOVA Models. Springer: New York.
22. Hoeting, J. A., Davis, R. A., Merton, A. A., and Thompson, S. E. (2006) Model selection for geostatistical models. Ecological Applications, 16, 87-98.
23. Huber, P. J. (1981) Robust Statistics. Wiley: New York.
24. Hurvich, C., Simonoff, J., and Tsai, C. (1998) Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. Journal of the Royal Statistical Society, Series B, 60, 271-294.
25. Hutchinson, M. F. and Gessler, F. R. (1994) Splines - more than just a smooth interpolator. Geoderma, 62, 45-67.
26. Kohn, R., Ansley, C., and Tharm, D. (1991) The performance of cross-validation and maximum likelihood estimators of spline smoothing parameters. Journal of the American Statistical Association, 86, 1042-1050.
27. Kou, S. C. (2003) On the efficiency of selection criteria in spline regression. Probability Theory and Related Fields, 127, 153-176.
28. Kou, S. C. (2004) From finite sample to asymptotics: a geometric bridge for selection criteria in spline regression. The Annals of Statistics, 32, 2444-2468.
29. Kou, S. C. and Efron, B. (2002) Smoothers and the Cp, generalized maximum likelihood, and extended exponential criteria: a geometric approach. Journal of the American Statistical Association, 97, 766-782.
30. Laslett, G. M. (1994) Kriging and splines: an empirical comparison of their predictive performance in some applications. Journal of the American Statistical Association, 89, 391-400.
31. Laslett, G. M. and McBratney, A. B. (1990) Further comparison of spatial methods for predicting soil pH. Journal of the Soil Science Society of America, 54, 1553-1558.
32. Laslett, G. M., McBratney, A. B., Pahl, P. J., and Hutchinson, M. F. (1987) Comparison of several spatial prediction methods for soil pH. Journal of Soil Science, 38, 325-341.
33. Lehmann, E. L. (1994) Testing Statistical Hypotheses (second edition). Chapman & Hall: New York.
34. Li, K. C. (1986) Asymptotic optimality of C_L and generalized cross-validation in ridge regression with application to spline smoothing. The Annals of Statistics, 14, 1101-1112.
35. Loader, C. (1999) Local Regression and Likelihood. Springer-Verlag: New York.
36. Mallows, C. (1973) Some comments on Cp. Technometrics, 15, 661-675.
37. Mardia, K. V. and Marshall, R. J. (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika, 71, 135-146.
38. Matern, B. (1986) Spatial Variation (second edition). Lecture Notes in Statistics, Springer: New York.
39. Matheron, G. (1963) Principles of geostatistics. Economic Geology, 58, 1246-1266.
40. Matheron, G. (1981) Splines and kriging: their formal equivalence. In Down-to-Earth Statistics: Solutions Looking for Geological Problems (D. F. Merriam ed.). Syracuse University Geological Contributions, Syracuse, 77-95.
41. McGilchrist, C. A. (1989) Bias of ML and REML estimators in regression models with ARMA errors. Journal of Statistical Computation and Simulation, 32, 127-136.
42. Miller, A. J. (1990) Subset Selection in Regression. Chapman and Hall: London.
43. Patterson, H. D. and Thompson, R. (1971) Recovery of inter-block information when block sizes are unequal. Biometrika, 58, 545-554.
44. Schabenberger, O. and Gotway, C. A. (2005) Statistical Methods for Spatial Data Analysis. Chapman & Hall/CRC: Boca Raton.
45. Schwarz, G. (1978) Estimating the dimension of a model. The Annals of Statistics, 6, 461-464.
46. Sen, A. and Srivastava, M. S. (1990) Regression Analysis: Theory, Methods, and Applications. Springer-Verlag: New York.
47. Shen, X. and Huang, H.-C. (2006) Optimal model assessment, selection, and combination. Journal of the American Statistical Association, 101, 554-568.
48. Shen, X., Huang, H.-C., and Ye, J. (2004a) Comment on “The estimation of prediction error: covariance penalties and cross-validation” by B. Efron. Journal of the American Statistical Association, 99, 634-637.
49. Shen, X., Huang, H.-C., and Ye, J. (2004b) Adaptive model selection and assessment for exponential family models. Technometrics, 46, 306-317.
50. Shen, X. and Ye, J. (2002) Adaptive model selection. Journal of the American Statistical Association, 97, 210-221.
51. Stein, C. (1981) Estimation of the mean of a multivariate normal distribution. The Annals of Statistics, 9, 1135-1151.
52. Stein, M. L. (1999) Interpolation of Spatial Data. Springer: New York.
53. Stone, C. J. (1977) Consistent nonparametric regression (with discussion). The Annals of Statistics, 5, 595-645.
54. Stone, C. J. (1980) Optimal rates of convergence for nonparametric estimators. The Annals of Statistics, 8, 1348-1360.
55. Stone, C. J. (1982) Optimal global rates of convergence for nonparametric regression. The Annals of Statistics, 10, 1040-1053.
56. Voltz, M. and Webster, R. (1990) A comparison of kriging, cubic splines, and classification for predicting soil properties from sample information. Journal of Soil Science, 41, 473-490.
57. Wahba, G. (1985) A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem. The Annals of Statistics, 13, 1378-1402.
58. Wahba, G. (1990a) Spline Models for Observational Data. Society for Industrial and Applied Mathematics: Philadelphia.
59. Wahba, G. (1990b) Comment on Cressie. The American Statistician, 44, 255-256.
60. Wand, M. and Jones, M. C. (1995) Kernel Smoothing. Chapman and Hall: New York.
61. Wand, M. P. (2000) A comparison of regression spline smoothing procedures. Computational Statistics, 15, 443-462.
62. Wang, Y. (1998) Smoothing spline models with correlated random errors. Journal of the American Statistical Association, 93, 341-348.
63. Wecker, W. and Ansley, C. (1983) The signal extraction approach to nonlinear regression and spline smoothing. Journal of the American Statistical Association, 78, 81-89.
64. Ye, J. (1998) On measuring and correcting the effects of data mining and model selection. Journal of the American Statistical Association, 93, 120-131.
65. Zhang, C. (2003) Calibrating the degrees of freedom for automatic data smoothing and effective curve checking. Journal of the American Statistical Association, 98, 609-628.
66. Zhang, H. (2004) Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. Journal of the American Statistical Association, 99, 250-261.
67. Zhang, H. and Zimmerman, D. L. (2005) Toward reconciling two asymptotic frameworks in spatial statistics. Biometrika, 92, 921-936.