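Graduate student: Wen-Shin Lin (林汶鑫)
Thesis title: The Study on the Application of Bilinear Model Using Genetic Algorithms (遺傳演算法應用在雙線性模式之研究)
Advisor: Bo-Jein Kuo, Ph. D. (郭寶錚)
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Agronomy
Discipline: Agricultural Sciences
Field: General Agriculture
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: Chinese
Number of pages: 119
Keywords: partial least squares (PLS); outlier; genetic algorithms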
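With the rapid development of technology, instruments have become increasingly refined and data can be acquired more conveniently and precisely, so that more information can be obtained from the same sample of observations; one example is the use of near-infrared spectral data to predict the content of important chemical constituents of rice. As a result, when a calibration model is built there are often more explanatory variables than observations, and the explanatory variables may be highly correlated. In that situation, building the calibration model by multiple linear regression (MLR) often makes the regression coefficient estimates unstable. Many researchers have shown that building the calibration model by partial least squares (PLS) avoids the problem of highly correlated explanatory variables and improves the model's predictive ability.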
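To make the partial least squares idea concrete, the following is a minimal sketch of single-response PLS regression (PLS1) with NIPALS-style deflation, written in plain Python. It is not the thesis's implementation (the thesis appendix lists a SAS/IML program); the function names `pls1_fit` and `pls1_predict` and the toy data are assumptions made for illustration. The toy data has more predictors than observations and perfectly collinear columns, the setting in which ordinary MLR coefficients are not uniquely determined.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation. X is a list of rows, y a list of responses."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    W, P, q = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q_a = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q_a * t[i] for i in range(n)]
        W.append(w)
        P.append(p_load)
        q.append(q_a)
    return x_mean, y_mean, W, P, q

def pls1_predict(model, x):
    """Predict the response for one new row x using the fitted components."""
    x_mean, y_mean, W, P, q = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p_load, q_a in zip(W, P, q):
        t = sum(e[j] * w[j] for j in range(len(e)))
        e = [e[j] - t * p_load[j] for j in range(len(e))]
        y_hat += q_a * t
    return y_hat

# Hypothetical toy data: 5 observations, 8 perfectly collinear predictors (p > n).
z = [0.0, 1.0, 2.0, 3.0, 4.0]
X = [[zi * (j + 1) for j in range(8)] for zi in z]
y = [3.0 * zi + 1.0 for zi in z]
model = pls1_fit(X, y, n_components=1)
x_new = [5.0 * (j + 1) for j in range(8)]
y_new = pls1_predict(model, x_new)
```

Because the predictors here share a single latent direction, one PLS component is enough to reproduce the linear relation; real spectra would need the number of components chosen by validation.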
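However, the predictive ability of a calibration model is affected not only by high correlation in the data; abnormal observations can also seriously distort the model and reduce its predictive ability. In general, besides the influence of a single abnormal value, the model or the remaining observations may still be affected after two or more observations have been deleted; this is the so-called masking effect. To diagnose the masking effect, Cook and Weisberg (1982) proposed a method known as "multiple-case diagnostics", but it often requires lengthy and complicated computation.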
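Therefore, this thesis uses genetic algorithms, which imitate biological evolution, to avoid the influence of abnormal values and reduce the scale of computation, thereby improving both the predictive ability of the model and its ability to detect outliers. The trait studied is the crude protein content of brown rice flour. A near infrared reflectance spectroscopy (NIRS) instrument recorded the spectral reflectance (R) of the flour at 4 nm intervals over the 1100 nm to 2500 nm wavelength range, and the readings were transformed by log(1/R) into O.D. values. Each sample thus yields 351 absorbance values, which serve as the explanatory variables; the dependent variable is the crude protein content measured by chemical analysis, determined as the crude protein content in 100 mg of brown rice flour. The case studies show that a PLSR model built after running the genetic algorithm does improve stability and predictive ability in the comparison of predictive ability. In the study of outlier detection ability, adding artificially generated outliers to the observations clearly degraded the model's predictive ability, and the resulting calibration model could not detect all of the outliers. After running the genetic algorithm, however, the calibration model not only predicted markedly better but also detected all of the outliers. Taken together, the case studies indicate that a PLSR model built after running the genetic algorithm shows clear improvement in stability, predictive ability, and accurate outlier detection.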
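The spectral preprocessing described above can be sketched in a few lines. The wavelength grid from 1100 nm to 2500 nm in 4 nm steps gives the 351 values per spectrum stated in the abstract, and each reflectance reading R is converted to log(1/R). The function names are hypothetical, and the use of a base-10 logarithm is an assumption (the record only writes "log"), following the usual absorbance convention.

```python
import math

def wavelength_grid(start_nm=1100, stop_nm=2500, step_nm=4):
    """Wavelengths (nm) at which reflectance is recorded, endpoints included."""
    return list(range(start_nm, stop_nm + 1, step_nm))

def reflectance_to_od(reflectance):
    """Convert reflectance readings R to optical density log(1/R).

    Base 10 is an assumption; the source does not state the base.
    """
    return [math.log10(1.0 / r) for r in reflectance]

grid = wavelength_grid()
n_variables = len(grid)  # number of explanatory variables per sample
```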
Rapid advances in technology and increasingly precise instruments have made data easier to collect and more accurate, so that more information can be obtained from each observation. For instance, near-infrared reflectance spectra have been used to analyze the important chemical components of rice. The usual way to develop a calibration equation is multiple linear regression (MLR). However, when the number of variables exceeds the number of observations, multicollinearity among the explanatory variables arises and MLR often leads to unstable regression coefficient estimates. Many researchers have confirmed that the partial least-squares (PLS) method can resolve this kind of serious multicollinearity problem and increase predictive ability.
Besides multicollinearity among variables, abnormal observations also seriously affect the calibration equation and reduce its predictive ability. Moreover, even after two or more observations have been deleted, the model or the remaining observations may still be affected; this is the so-called masking effect. Cook and Weisberg (1982) proposed a method called multiple-case diagnostics to address this problem. However, the method often requires lengthy and complicated computation.
Therefore, it is important to avoid the effects of abnormal observations and to reduce computation time while increasing both predictive ability and outlier detection ability. The objective of this study is to solve these problems using Genetic Algorithms, which imitate natural genetics. We analyzed the protein content of brown rice flour, which was scanned by near infrared reflectance spectroscopy (NIRS) over the wavelength range from 1,100 nm to 2,500 nm in 4-nm increments, yielding 351 values per spectrum. Reflectance readings (R) were recorded and transformed to log (1/R). The protein content per 100 mg of brown rice flour was also measured by chemical analysis. The study showed that the PLSR model built with Genetic Algorithms increased stability and predictive ability. In the comparison of outlier detection ability, contaminated data containing artificially generated outliers clearly degraded the predictive ability of the calibration model, which also failed to detect all of the outliers. However, the calibration model built with Genetic Algorithms not only clearly retained its predictive ability but also detected all of the outliers.
In summary, according to the results of the two studies above, the PLSR model built with Genetic Algorithms improves stability, increases predictive ability, and successfully detects all of the outliers.
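The general idea of letting a genetic algorithm search over subsets of observations, rather than enumerating every multiple-case deletion, can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the thesis's SAS/IML program: the chromosome is a 0/1 mask over observations (1 = keep), the model is a simple one-predictor least-squares line instead of PLSR, the cost is the mean squared residual of the kept points plus a hypothetical penalty per dropped point, and the operators are tournament selection, one-point crossover, bit-flip mutation, and elitism.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def cost(mask, xs, ys, penalty):
    """Mean squared residual on kept points plus a penalty per dropped point."""
    kept = [i for i, bit in enumerate(mask) if bit]
    if len(kept) < 3:
        return float("inf")
    slope, intercept = fit_line([xs[i] for i in kept], [ys[i] for i in kept])
    mse = sum((ys[i] - (intercept + slope * xs[i])) ** 2 for i in kept) / len(kept)
    return mse + penalty * (len(mask) - len(kept))

def run_ga(xs, ys, penalty=0.5, pop_size=40, generations=60, p_mut=0.05, seed=0):
    """Search for a keep/drop mask with low cost; returns (mask, cost)."""
    rng = random.Random(seed)
    n = len(xs)

    def score(mask):
        return cost(mask, xs, ys, penalty)

    def pick(pop):
        # Binary tournament: the better of two random individuals.
        a, b = rng.sample(pop, 2)
        return a if score(a) <= score(b) else b

    # Seed the population with the "keep everything" mask plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best = min(pop, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            mum, dad = pick(pop), pick(pop)
            cut = rng.randrange(1, n)          # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = min(pop, key=score)
    return best, score(best)

# Hypothetical toy data: a straight line with two planted outliers.
xs = [float(x) for x in range(12)]
ys = [2.0 * x + 1.0 for x in xs]
ys[3] += 15.0
ys[8] += 15.0
mask, best_cost = run_ga(xs, ys)
baseline = cost([1] * len(xs), xs, ys, 0.5)
flagged = [i for i, bit in enumerate(mask) if bit == 0]  # candidate outliers
```

Because the all-ones mask is in the initial population and the best individual is always carried forward, the returned cost can never exceed the cost of fitting all observations. Whether `flagged` coincides with the planted outliers depends on the penalty, the seed, and the number of generations.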
Contents
Abstract (Chinese)
Abstract
Chapter 1 Introduction
Chapter 2 Literature Review
I. Partial least squares (PLS)
1. Introduction
2. Bilinear modelling (BLM)
3. Partial least squares regression (PLSR)
4. Validating the prediction model
II. Outlier detection
1. Introduction
2. Influential observations
2.1 Leverage
2.2 Residuals of the explanatory variables X
2.3 Residuals of the dependent variable y
2.4 Studentized residual
2.5 Cook's D_i
3. Robust procedure
4. Masking effect
III. Genetic algorithms
1. Introduction
2. Optimization and search methods
3. Genetic Algorithms (GAs)
3.1 Origin of genetic algorithms
3.2 How genetic algorithms operate
3.2.1 Basic assumptions
3.2.2 Encoding
3.2.3 Selection of the initial population
3.2.4 Fitness function
3.2.5 Genetic operators (GAs operators)
3.2.6 "Children" population
3.2.7 Stopping criterion
3.3 Mathematical theory of GAs: the Schema Theorem
3.4 Characteristics of GAs
3.5 Shortcomings of GAs and remedies
4. Applications and future development of genetic algorithms
Chapter 3 Genetic Algorithms in Near-Infrared Spectral Analysis
I. Statistical analysis methods and the meaning of the statistics
1. Introduction
2. MSC transformation
3. Statistics for judging model performance
II. Comparison of predictive ability
1. Analysis of experimental materials
2. Genetic algorithms
3. Results and discussion
III. Comparison of outlier detection ability
1. Analysis of experimental materials
2. Outlier detection methods
3. Genetic algorithms
4. Results and discussion
Chapter 4 General Discussion
References
Appendix: The Genetic Algorithms program written in SAS/IML
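王惠文. 1998. Partial Least-Squares Regression: Method and Applications (in Chinese). 國防工業出版社.
邵遵文. 2003. A comparison of the predictive ability of partial least squares, principal component regression, ridge regression, and forward selection (in Chinese). Master's thesis. Taichung: Department of Agronomy, National Chung Hsing University.
周鵬程. 1999. Genetic Algorithms: Principles and Applications, with Matlab (in Chinese). 全華科技圖書股份有限公司.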
Alander, J. T. 1992. On Optimal Population Size of Genetic Algorithms. In Proceedings of CompEuro 92. 65-70. IEEE Computer Society Press.
Belsley, D. A., E. Kuh, and R. E. Welsch. 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley.
Brereton, R. G. 2000. Introduction to multivariate calibration in analytical chemistry. Analyst 125:2125-2154.
Chatterjee, S., M. Laudato, and L. A. Lynch. 1996. Genetic algorithms: an introduction. Comput. Stat. Data Anal. 22:633-651.
Cook, R. D. 1977. Detection of influential observations in linear regression. Technometrics 19:15-18.
Cook, R. D., and S. Weisberg. 1982. Residuals and Influence in Regression. New York: Chapman and Hall.
Darwin, C. 1964. The Origin of Species. Cambridge: Harvard University Press.
De Jong, K. A. 1975. An Analysis of the Behavior of a Class of Genetic Adaptive Systems. Ph. D. Dissertation, Department of Computer and Communication Sciences. University of Michigan.
De Jong, K. A., and W. M. Spears. 1990. An analysis of the interacting roles of population size and crossover in genetic algorithms. In First International Conference on Parallel Problem Solving from Nature, 38-47. IEEE Society Press.
Delwiche, S. R., M. M. Bean, R. E. Miller, B. D. Webb, and P. C. Williams. 1995. Apparent amylose content of milled rice by near-infrared reflectance spectrophotometry. Cereal Chem. 72(2):182-187.
Garthwaite, P. H. 1994. An interpretation of partial least squares. J. Am. Stat. Assoc. 89: 122-127.
Geladi, P., D. MacDougall, and H. Martens. 1985. Linearization and scatter-correction for near-infrared reflectance spectra of meat. Appl. Spectrosc. 39(3): 491-500.
Goldberg, D. E. 1985. Optimal Initial Population Size for Binary-coded Genetic Algorithms (TCGA Report No. 85001). The Clearinghouse for Genetic Algorithms, Department of Engineering Mechanics. Tuscaloosa: University of Alabama.
Goldberg, D. E. 1989. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Company.
Goldberg, D. E., and K. Deb. 1991. A Comparative Analysis of Selection Schemes Used in Genetic Algorithms. In G. Rawlins, ed. Foundations of Genetic Algorithms. Morgan Kaufmann.
Grefenstette, J. J. 1986. Optimization of Control Parameters for Genetic Algorithms. IEEE Transactions on Systems, Man, and Cybernetics SMC-16(1):122-128.
Han, W., and Z. P. Liao. 1999. A Note on the Convergence of Genetic Algorithms. Earthquake Engineering and Engineering Vibration 19(4):13-16.
Helland, I. S. 1988. On the structure of partial least squares regression. Commun. Statist. Simula. Comput. 17:581-607.
Helland, I. S. 1990. Partial least squares regression and statistical models. Scand. J. Statist. 17:97-114.
Holland, J. H. 1973. Genetic Algorithms and the Optimal Allocation of Trials. SIAM J. Comput. 2(2):88-105.
Holland, J. H. 1975. Adaptation in Natural and Artificial Systems. Ann Arbor: The University of Michigan Press.
Holland, J. H. 1986. Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems. Machine learning Ⅱ. Morgan Kaufmann.
Isaksson, T. and T. Naes. 1988. The effect of multiplicative scatter correction (MSC) and linearity improvement in NIR spectroscopy. Appl. Spectrosc. 42: 1273-1284.
Lawrance, A. J. 1995. Deletion influence and masking in regression. J. R. Statist. Soc. B 57(1):181-189.
Martens, H., and S. Å. Jensen. 1983. Partial least squares regression: A new two-stage NIR calibration method. Prague June 1982 Elsevier Publ., Amsterdam, 607-647.
Martens, H., and T. Næs. 1989. Multivariate Calibration. 4th ed. New York: John Wiley.
Michalewicz, Z. 1992. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, New York.
Mitchell, M. 1997. An Introduction to Genetic Algorithms. The MIT Press, Cambridge, Massachusetts; London, England.
Osborne, B. G., and T. Fearn. 1986. Near infrared spectroscopy in food analysis. Longman scientific and technical. P.20-115.
Rousseeuw, P. J., and A. M. Leroy. 1987. Robust regression and outlier detection. New York: John Wiley.
SAS Institute Inc. 1999. SAS/IML User’s Guide, Version 8. SAS Institute Inc., Cary, NC, USA.
Schaffer, J. D., R. A. Caruana, L. J. Eshelman, and R. Das. 1989. A Study of Control Parameters affecting online Performance of Genetic Algorithms for Function Optimization. In J. D. Schaffer, ed., Proceedings of the Third International Conference on Genetic Algorithms. Morgan Kaufmann.
Schmitt, L. M. 2001. Theory of genetic algorithms. Theor. Comput. Sci. 259:1-61.
Spencer, H. 1863. The Principles of Biology. Williams/London.
Vose, M. D. 1999. The Simple Genetic Algorithm: Foundations and Theory. The MIT Press, Cambridge, Massachusetts; London, England.
Walczak, B. 1995. Outlier detection in multivariate calibration. Chemom. Intell. Lab. Syst. 28:259-272.
Walczak, B. 1995. Outlier detection in bilinear calibration. Chemom. Intell. Lab. Syst. 29:63-73.
Walczak, B., and D. L. Massart. 1998. Multiple outlier detection revisited. Chemom. Intell. Lab. Syst. 41:1-15.
Weisberg, S. 1985. Applied Linear Regression. John Wiley and Sons, New York.
Whitley, L. D. 1989. The genitor algorithm and selection pressure: Why rank-based allocation of reproductive trials is best? In J. D. Schaffer, ed., Proceedings of the third international conference on genetic algorithms. Morgan Kaufmann.
Wold, S., H. Antti, F. Lindgren, and J. Öhman. 1998. Orthogonal signal correction of near-infrared spectra. Chemom. Intell. Lab. Syst. 44:175-185.
Wold, S., M. Sjöström, and L. Eriksson. 2001. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 58:109-130.