National Digital Library of Theses and Dissertations in Taiwan
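Graduate student: 曹惠鈴
Graduate student (English): Huei-Ling Tsao
Thesis title: 運用混合式進化演算法於法則探勘之回應模型
Thesis title (English): Using a hybrid meta-evolutionary rule mining approach as a response model
Advisor: 陳大正
Advisor (English): Ta-Cheng Chen
Degree: Master's
Institution: National Formosa University (國立虎尾科技大學)
Department: Graduate Institute of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Publication year: 2007
Graduation academic year: 95 (ROC calendar)
Language: Chinese
Pages: 66
Keywords (translated from Chinese): classification rules; expert system; evolutionary algorithms; data mining
Keywords (English): Data mining; Meta-Evolutionary Algorithms; Classification rules; Expert System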
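Abstract (translated from Chinese): Data mining is a set of methods and tools frequently used to discover new knowledge in databases. This study proposes a hybrid evolutionary algorithm for classification problems that evaluates numerical samples and simultaneously extracts classification rules, each comprising predictor variables, the corresponding inequalities, and threshold values, in order to build a decision model with the highest prediction accuracy. Traditional statistical models and related techniques such as logistic regression and multiple regression are commonly used, but real-world problems are often highly nonlinear, which makes it difficult to develop a statistical model that covers all the independent variables. In recent years, nonlinear and complex machine learning methods such as neural networks and support vector machines have been shown to be more reliable than traditional statistical methods. Although the literature documents the benefits of neural networks and support vector machines, the main obstacle is that the classification rules they build and use are hard to understand. This study compares the proposed approach with various methods from the literature and with commercial software. The experimental data show that the rule mining classification approach improves prediction accuracy and yields simpler models. The classification rules extracted by the proposed method can be used to build expert-system-like models for prediction or classification problems.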
Abstract (English): Data mining usually refers to the methodologies and tools for efficiently discovering new knowledge in databases. Using data mining techniques, a response model can be built as a decision model for prediction or classification in a problem domain, much like an expert system. This study proposes a hybrid meta-evolutionary rule mining approach that assesses numerical data patterns in classification problems and simultaneously extracts decision rules, including the predictors, the corresponding inequalities, and the parameter values, so as to build a decision-making model with maximum prediction accuracy. Conventional statistical methods and related techniques, including logistic regression and multiple regression, have commonly been used. Because real-world problems are highly nonlinear in nature, it is hard to develop a comprehensive model that takes all the independent variables into account using these statistical approaches. Recently, nonlinear and complex machine learning approaches such as neural networks (NNs) and support vector machines (SVMs) have been demonstrated to be more reliable than the conventional statistical approaches. Although the usefulness of NNs and SVMs has been reported in the literature, the main obstacle lies in building and using the model, because its classification rules are hard to interpret. We compared our results against other methods in the literature, and we show experimentally that the proposed rule extraction approach is promising for improving prediction accuracy and enhancing model simplicity. In particular, the rules extracted by the proposed approach can be developed into a computer model for prediction or classification problems, like an expert system.
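To make the abstract's idea of a rule concrete, the following is a minimal sketch of a classification rule built from predictors, inequalities, and threshold values, scored by prediction accuracy. It is an illustration only: the names Condition, Rule, classify, and accuracy, the first-match rule ordering, and the toy data are all assumptions, not the thesis's actual encoding or search procedure. The evolutionary search itself is omitted.

```python
import operator
from dataclasses import dataclass
from typing import Sequence, Tuple

# Hypothetical encoding for illustration; the thesis's own gene
# representation is not described in this record.
OPS = {"<=": operator.le, ">": operator.gt}


@dataclass(frozen=True)
class Condition:
    feature: int      # index of the predictor variable
    op: str           # inequality, one of the keys of OPS
    threshold: float  # cut-off value compared against the predictor

    def holds(self, x: Sequence[float]) -> bool:
        return OPS[self.op](x[self.feature], self.threshold)


@dataclass(frozen=True)
class Rule:
    conditions: Tuple[Condition, ...]  # all conditions must hold (conjunction)
    label: str                         # class predicted when the rule fires

    def matches(self, x: Sequence[float]) -> bool:
        return all(c.holds(x) for c in self.conditions)


def classify(rules: Sequence[Rule], x: Sequence[float], default: str) -> str:
    """Return the label of the first matching rule, else the default class."""
    for rule in rules:
        if rule.matches(x):
            return rule.label
    return default


def accuracy(rules, X, y, default: str) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    hits = sum(classify(rules, x, default) == t for x, t in zip(X, y))
    return hits / len(y)


# Made-up toy samples with two numeric predictors (not a real data set).
X = [(1.4, 0.2), (1.5, 0.3), (4.5, 1.5), (4.7, 1.4), (5.8, 2.2), (6.0, 2.5)]
y = ["A", "A", "B", "B", "C", "C"]

rules = [
    Rule((Condition(0, "<=", 2.5),), "A"),
    Rule((Condition(0, ">", 2.5), Condition(1, "<=", 1.8)), "B"),
]

score = accuracy(rules, X, y, default="C")
```

In a search-based rule miner, a score like this would typically serve as the fitness of a candidate rule set, while the chosen features, operators, and thresholds are the quantities being optimized.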
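Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
1. Introduction
1.1 Research Background and Motivation
1.2 Research Scope and Objectives
1.3 Research Methods and Procedure
1.4 Research Content and Organization
2. Literature Review
2.1 Knowledge Discovery and Data Mining
2.1.1 Definition of Knowledge Discovery
2.1.2 Steps of the Knowledge Discovery and Data Mining Cycle
2.2 Rule Mining and Feature Selection
2.3 Immune Algorithm
2.3.1 The Immune System
2.3.3 Steps of the Immune Algorithm
2.4 Particle Swarm Optimization
2.4.1 Particle Swarm Optimization Update Rules
2.4.2 Steps of Particle Swarm Optimization
2.5 Binary Particle Swarm Optimization
2.5.1 Binary Particle Swarm Optimization Update Rules
2.5.2 Steps of Binary Particle Swarm Optimization
3. Research Method
3.1 Definition and Description of Classification Rules
3.2 Classification Rule Mining Steps
3.2.1 Data Preprocessing Procedure
3.2.2 Rule Mining Procedure
3.2.3 Training Data Correction Procedure
3.3 Gene-Encoded Rule Representation
3.4 Evaluation Mechanism
4. Analysis and Discussion of Experimental Results
4.1 Iris Data Set
4.1.1 Parameter Settings
4.1.2 Data Selection Rule (1)
4.1.3 Data Selection Rule (2)
4.2 Liver Disorders Data Set
4.2.1 Parameter Settings
4.2.2 Data Selection Rule
4.2.3 Type I and Type II Error Analysis
4.3 Financial Information Data Set
4.3.1 Parameter Settings
4.3.2 Data Selection Rule
4.3.3 Type I and Type II Error Analysis
5. Conclusions and Future Research
5.1 Conclusions
5.2 Future Research
References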