
臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

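Graduate student: 方偉泉
Graduate student (English name): Wei-Quan Fang
Thesis title (Chinese): 線性迴歸模型在量測誤差與共線性影響下之估計方法初探
Thesis title (English): A Note on Parameters Estimation in Linear Regression Models Subject to Measurement Error and Multicollinearity
Advisor: 吳裕振
Advisor (English name): Yuh-Jenn Wu
Degree: Doctoral
Institution: 中原大學 (Chung Yuan Christian University)
Department: Graduate Institute of Applied Mathematics
Discipline: Mathematics and Statistics
Field: Mathematics
Thesis type: Academic thesis
Year of publication: 2015
Graduation academic year: 104 (Republic of China calendar)
Language: English
Number of pages: 155
Keywords (Chinese): 線性迴歸 (linear regression); 共線性 (collinearity); 量測誤差 (measurement error)
Keywords (English): Linear Regression; Collinearity; Measurement Error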
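Abstract (translated from Chinese): In recent years, the analysis and application of big data have become ubiquitous and important techniques across many fields; from academic research to commercial and industrial development, knowledge and information seem to accumulate without end. In data analysis, more complex statistical methods are sometimes required (for example, when no parameter estimator with a unique solution can be used directly), partly because of imprecise measurements or unforeseen errors in data collection. In addition, a number of literature reviews to date have explained that, when collinearity is present in the data structure, statistical analysis can lead to seriously wrong conclusions. Investigators should therefore be alert to such situations when analyzing data and extracting information, and should choose statistical methods carefully when interpreting data so as to avoid erroneous conclusions. In this dissertation we study the problems of measurement error and collinearity in linear models. As a tool for explaining possible causal relationships, regression analysis has long played a key role. One main purpose of this work is to remind readers that the classical model estimation procedure (the least-squares method) may, in some applications, need to be corrected for bias in the estimated coefficients, a bias that arises from measurement error or collinearity. Accordingly, we propose two new estimation methods to correct the possible bias, and we suggest topics and research perspectives that may be suitable for further extension. We also hope that some of the estimation methods introduced here will contribute to the subsequent development of a more general theory for handling these biases in regression analysis.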

Abstract (English): In recent years, big data applications and analytics have become ubiquitous and continue to yield insights in both academia and industry. Data analysis in such research may sometimes require complex statistical methods for which, in general, no closed-form solution can be used directly, because of imprecise measurements and/or unexpected errors. In addition, a number of literature reviews to date have reported that statistical analyses may lead to undesirable conclusions when collinearity is present in the data structure. Investigators should therefore pay attention to such situations when mining data and extracting information. In this dissertation, we study the problems of measurement error and collinearity in linear models. As a tool for interpreting possible cause-effect relations, regression analysis has long played an important role. One of the main goals of this study is to remind readers that the classical estimation approach, the least-squares method, may need to be corrected for bias in the coefficient estimates in certain applications. Accordingly, we propose two new methods to correct such biases and give an outlook on further extensions. It is also our hope that some of the estimation approaches proposed in this dissertation will contribute to the subsequent development of a more general theory of bias correction in regression analysis.
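The bias that the abstract attributes to measurement error can be illustrated with a small simulation. The sketch below is not one of the estimators proposed in the dissertation; it only shows the well-known attenuation of the ordinary least-squares slope under classical measurement error in a simple linear model. All names and parameter values (`simulate`, `ols_slope`, `beta1 = 2.0`, `sigma_u = 1.0`, and so on) are assumptions chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (simple regression with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta0=1.0, beta1=2.0, sigma_u=1.0, sigma_e=0.5, seed=0):
    """Compare the OLS slope using the true covariate with the slope using a noisy one.

    Hypothetical setup: x ~ N(0, 1) is the true covariate, w = x + u with
    u ~ N(0, sigma_u^2) is the observed (mismeasured) covariate, and
    y = beta0 + beta1 * x + e with e ~ N(0, sigma_e^2).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [xi + rng.gauss(0.0, sigma_u) for xi in x]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma_e) for xi in x]
    return ols_slope(x, y), ols_slope(w, y)


slope_true_x, slope_noisy_w = simulate()
print("slope using true x: ", round(slope_true_x, 3))
print("slope using noisy w:", round(slope_noisy_w, 3))
```

Under these assumptions the slope fitted on the noisy covariate should shrink toward beta1 * var(x) / (var(x) + sigma_u^2), which is about half of beta1 here, while the slope fitted on the true covariate stays close to beta1. Correcting this kind of bias is the type of problem the dissertation addresses.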

Contents
List of Figures
List of Tables
Guide to Abbreviation
1. Introduction and General Overview
1.1: Classical Approach to Regression Problems
1.2: Influence of Collinearity in Linear Models
1.3: Measurement Error Problems in Statistical Analysis
1.4: Motivation
1.5: Outline of the Planned Dissertation
2. Review of Estimation of Linear Regression Models with Multicollinearity or Measurement Error
2.1: Principal Components Regression in Linear Models
2.2: Ridge Regression Methodologies in Linear Models
2.2.1: Ridge Parameter Selections on ORRE
2.2.2: Liu-Type Estimators for Collinear Problems
2.3: PCRR Estimation Approach in Linear Models
2.3.1: r-k Class Estimator for Collinear Problems
2.3.2: r-k-d class Estimator for Collinear Problems
2.4: Simulation Results and Some Remarks – Collinearity
2.4.1: Simulation Examples
2.4.2: More on RR Based Estimators
2.4.3: Certain General Remarks
2.5: Biases in the Presence of Measurement Errors
2.6: Maximum Likelihood Approach in Measurement Error Models
2.7: Moment Equation Approach in Measurement Error Models
2.7.1: Berkson-type Measurement Errors on SLSE
2.7.2: Classical Measurement Errors on SLSE
2.8: Classical Measurement Errors on Simulation Extrapolation
2.9: Wald-Type Estimation Approach in Measurement Error Models
2.10: Simulation Results and Some Remarks – Measurement Error
2.10.1: Simulation Examples
2.10.2: Estimations in the Presence of a Mixture Error
2.10.3: Certain General Remarks
3. More on Estimation in Linear Regression Models with Multicollinearity or Measurement Errors
3.1: Berkson-type Measurement Errors on SIMEX
3.2: Berkson-type Measurement Errors on WTE
3.3: Consistent Estimation with Multicollinearity and Measurement Error
3.3.1: Collinear Assumption in Linear Models
3.3.2: Adjusted Wald-Type Estimator in Linear Models
3.4: Simulation Results and Some Remarks
3.4.1: Simulation Examples
3.4.2: Certain General Remarks
4. Numerical Examples and Model Evaluation
4.1: Real Data Examples – An Air Pollution Study
4.2: Real Data Examples – A Liver Disorders Study
5. Summary and Concluding Remarks
5.1: Summary and Discussion
5.2: Future Work
Bibliography
Appendix



List of Figures
Figure 2-1 Condition number for parameters matrix in SLSE
Figure 4-1 Naive fitted lines for six estimates in air pollution study
Figure 4-2 Naive fitted lines for six estimates in liver disorder study



List of Tables
Table 2-1 EMSE for six estimates in case of α=0.7 without mismeasuring
Table 2-2 EMSE for six estimates in case of α=0.9 without mismeasuring
Table 2-3 EMSE for six estimates in case of α=0.99 without mismeasuring
Table 2-4 EMSE for six estimates in case of α=0.999 without mismeasuring
Table 2-5 EMSE for five estimators with Berkson-type measurement errors
Table 2-6 EMSE for five estimators with classical measurement errors
Table 3-1 EMSE for six estimators with α=0.01
Table 3-2 EMSE for six estimators with α=0.2
Table 3-3 EMSE for six estimators with α=0.5
Table 3-4 EMSE for six estimators with α=0.8
Table 4-1 Coefficient estimates for an air pollution study
Table 4-2 Coefficient estimates for a liver disorder study




