
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 伍碧那 (Bi-Na Wu)
Title: 適用於不同分類器的混合型離散化方法 (A hybrid discretization method for classification algorithms)
Advisor: 翁慈宗 (Tzu-Tsung Wong)
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Institute of Information Management (資訊管理研究所)
Discipline: Computer Science
Field: General Computer Science
Document type: Academic thesis
Year of publication: 2014
Graduating academic year: 102 (ROC calendar)
Language: Chinese
Pages: 49
Keywords (Chinese): 混合型離散化, 網路最佳化模型, 動態規劃, 分類器
Keywords (English): classifier, dynamic programming, hybrid discretization method, network optimization model
Usage statistics:
  • Cited: 2
  • Views: 178
  • Downloads: 0
Classification is a data-mining approach in which the class of each instance is predicted from its attribute values. Most data sets contain continuous attributes, and classifiers designed for discrete attributes generally require those attributes to be discretized first, so the choice of discretization method can affect a classifier's predictive performance. Hybrid discretization discretizes each continuous attribute individually, searching for the method that suits it best, and can yield higher classification accuracy than applying one discretization method uniformly to every attribute in a data set. Previous work on hybrid discretization, however, has focused mainly on the naïve Bayesian classifier and relies on classification results to decide which method is most suitable, so the discretization cannot be completed entirely during data preprocessing. The purpose of this study is therefore to build a hybrid discretization method that applies to other classifiers handling discrete attributes and that finishes all discretization in the preprocessing step.

This study draws on network optimization from operations research: the hybrid discretization problem is converted into a network optimization model, the associations among attributes and between attributes and the class serve as the evaluation measure, and dynamic programming finds an optimal path through the network, which corresponds to the most suitable hybrid discretization. For validation, 20 data sets were classified with decision trees, naïve Bayesian classifiers, and rule-based classifiers. Compared with unified discretization, hybrid discretization improved the classification accuracy on most data sets for the naïve Bayesian and rule-based classifiers, while for decision trees the two approaches performed about equally, indicating that the proposed method is a feasible way to select a hybrid discretization combination.
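Two of the uniform discretization methods the thesis considers (equal-width and equal-frequency) can be sketched in a few lines of NumPy. The function names and the toy data below are illustrative only, not code from the thesis:

```python
import numpy as np

def equal_width_bins(values, k):
    """Split the value range into k intervals of equal width and
    return the bin index of each value."""
    lo, hi = values.min(), values.max()
    edges = np.linspace(lo, hi, k + 1)
    # digitize against the interior cut points yields indices 0..k-1
    return np.digitize(values, edges[1:-1])

def equal_frequency_bins(values, k):
    """Split the values into k intervals holding roughly the same
    number of observations, using sample quantiles as cut points."""
    quantiles = np.quantile(values, np.linspace(0, 1, k + 1))
    return np.digitize(values, quantiles[1:-1])

values = np.array([1.0, 2.0, 2.5, 3.0, 10.0, 11.0])
print(equal_width_bins(values, 2))      # cut at the range midpoint (6.0)
print(equal_frequency_bins(values, 2))  # cut at the median (2.75)
```

The same skewed sample lands in different bins under the two methods, which is exactly why the choice of method per attribute can matter to a classifier.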
Discretization is one of the major approaches for preparing continuous attributes for classifiers. Hybrid discretization sets the discretization method for each continuous attribute individually. A previous study found that hybrid discretization improves the performance of the naïve Bayesian classifier more than unified discretization, but that approach determines the discretization method for each attribute based on whether classification accuracy improves. The objective of this study is to develop a hybrid discretization method applicable to various classifiers that determines the discretization method for each attribute in the data preprocessing step instead of relying on accuracy. This study first builds a network optimization model based on the associations among the attributes and the class. Dynamic programming is then employed to find the optimal solution for the network, and this solution indicates the discretization method for each continuous attribute. The classification tools for testing the method are decision trees, naïve Bayesian classifiers, and rule-based classifiers. Experimental results on 20 data sets show that the computational cost of the method is low and that, in general, hybrid discretization performs better than unified discretization with naïve Bayesian and rule-based classifiers, but not with decision trees.
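The dynamic-programming step described in the abstract can be pictured as a stage-wise best-path search: each continuous attribute is a stage, each candidate discretization method a node, and edge weights stand in for the association scores computed during preprocessing. The sketch below is a hypothetical reconstruction under those assumptions; the method names, the score layout, and the scores themselves are not taken from the thesis:

```python
# Hypothetical reconstruction: stages = continuous attributes, nodes = the
# four candidate discretization methods, edge weights = association scores.
METHODS = ["equal_width", "equal_freq", "proportional", "min_entropy"]

def best_assignment(scores):
    """scores[i][m][n] is the (assumed precomputed) association score for
    using method m on attribute i and method n on attribute i + 1.
    Returns the method sequence with the maximum total score."""
    k = len(METHODS)
    best = [0.0] * k   # best total score of any path ending at each node
    back = []          # chosen predecessor per node, one list per stage
    for stage in scores:
        new_best = [float("-inf")] * k
        choice = [0] * k
        for n in range(k):
            for m in range(k):
                cand = best[m] + stage[m][n]
                if cand > new_best[n]:
                    new_best[n], choice[n] = cand, m
        back.append(choice)
        best = new_best
    end = max(range(k), key=best.__getitem__)
    path = [end]
    for choice in reversed(back):  # walk predecessors back to the first attribute
        path.append(choice[path[-1]])
    return [METHODS[i] for i in reversed(path)]
```

With one 4×4 score matrix per consecutive pair of attributes, the returned list assigns one discretization method to each continuous attribute in a single preprocessing pass, which is the behavior the abstract claims for the network model.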
Abstract I
Acknowledgements V
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objectives 2
1.3 Organization of the Thesis 2
Chapter 2 Literature Review 3
2.1 Discretization Methods 3
2.2 Associations between Attributes and the Class 6
2.3 Dynamic Programming 9
Chapter 3 Research Method 11
3.1 Research Procedure 11
3.2 Ordering of Continuous Attributes 12
3.3 Discretization Methods 14
3.3.1 Equal-Width Discretization 15
3.3.2 Equal-Frequency Discretization 15
3.3.3 Proportional Discretization 16
3.3.4 Minimum-Entropy Discretization 16
3.4 Network Optimization Model 17
3.4.1 Constructing the Network Model 17
3.4.2 Association Measures 18
3.5 Dynamic Programming 21
3.6 Classifiers 25
3.6.1 Decision Trees 25
3.6.2 Naïve Bayesian Classifiers 25
3.6.3 Rule-Based Classifiers 26
3.6.4 K-Fold Cross-Validation 27
Chapter 4 Empirical Study 28
4.1 Data Sets 28
4.2 Dynamic Programming Results 29
4.3 Validation of Classification Results 31
4.3.1 Decision Trees 32
4.3.2 Naïve Bayesian Classifiers 34
4.3.3 Rule-Based Classifiers 36
4.3.4 Summary of Accuracy Validation 38
4.4 Classification Validation of Unified Discretization 39
4.4.1 Decision Trees 39
4.4.2 Naïve Bayesian Classifiers 40
4.4.3 Rule-Based Classifiers 41
4.5 Summary 43
Chapter 5 Conclusions and Future Work 45
5.1 Conclusions 45
5.2 Future Work 46
References 47

Ballesteros, A. J. T., Martínez, C. H., Riquelme, J. C., and Ruiz, R. (2013). Feature selection to enhance a two-stage evolutionary algorithm in product unit neural networks for complex classification problems. Neurocomputing, 114, 107–117.
Bellman, R. (1957). Dynamic Programming. Princeton, NJ: Princeton University Press.
Cannas, L. M., Dessi, N., and Pes, B. (2013). Assessing similarity of feature selection techniques in high-dimensional domains. Pattern Recognition Letters, 34, 1446–1453.
Concepción, M. Á. Á. D. L., Abril, L. G., Morillo, L. M. S., and Ramírez, J. A. O. (2013). An adaptive methodology to discretize and select features. Artificial Intelligence Research, 2(2), 77-86.
Fayyad, U. M. and Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. The 13th International Joint Conference on Artificial Intelligence (IJCAI), 1022-1029.
García, S., Luengo, J., Sáez, J. A., López, V., and Herrera, F. (2013). A survey of discretization techniques: taxonomy and empirical analysis in supervised learning. IEEE Transactions on Knowledge and Data Engineering, 25(4), 734-750.
Golding, D., Nelwamondo, F. V., and Marwala, T. (2013). A dynamic programming approach to missing data estimation using neural networks. Information Sciences, 237, 49–58.
Gu, Q., Li, Z., and Han, J. (2012). Generalized fisher score for feature selection. The 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, arXiv preprint arXiv:1202.3725.
Hu, Q., Pedrycz, W., Yu, D., and Lang, J. (2010). Selecting discrete and continuous features based on neighborhood decision error minimization. IEEE Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics, 40(1), 137-150.
Jiang, S. Y., Li, X., Zheng, Q., and Wang, L. X. (2009). Approximate equal frequency discretization method. 2009 WRI Global Congress on Intelligent Systems (GCIS '09), 3, 514-518.
Jung, Y. G., Kim, K. M., and Kwon, Y. M. (2012). Using weighted hybrid discretization method to analyze climate changes. Computer Applications for Graphics, Grid Computing, and Industrial Environment, Communications in Computer and Information Science, 351, Springer Berlin Heidelberg, 189–195.
Li, M., Deng, S. B., Feng, S., and Fan, J. (2011). An effective discretization based on Class-Attribute Coherence Maximization. Pattern Recognition Letters, 32, 1962–1973.
Liu, H., Sun, J., Liu, L., and Zhang, H. (2009). Feature selection with dynamic mutual information. Pattern Recognition, 42, 1330-1339.
Lustgarten, J. L., Visweswaran, S., Gopalakrishnan, V., and Cooper, G. F. (2011). Application of an efficient Bayesian discretization method to biomedical data. BMC Bioinformatics, 12, 309.
Park, C. E. and Lee, M. (2009). A SVM-based discretization method with application to associative classification. Expert Systems with Applications, 36, 4784–4787.
Pisica, I., Taylor, G., and Lipan, L. (2013). Feature selection filter for classification of power system operating states. Computers and Mathematics with Applications, 66, 1795–1807.
Sakar, C. O., Kursun, O., and Gurgen, F. (2012). A feature selection method based on kernel canonical correlation analysis and the minimum Redundancy–Maximum Relevance filter method. Expert Systems with Applications, 39, 3432–3437.
Sang, Y., Jin, Y., Li, K., and Qi, H. (2013). UniDis: a universal discretization technique. Journal of Intelligent Information Systems, 40, 327–348.
Shen, C. C. and Chen, Y. L. (2008). A dynamic-programming algorithm for hierarchical discretization of continuous attributes. European Journal of Operational Research, 184, 636–651.
Tian, D., Zeng, X. J., and Keane, J. (2011). Core-generating approximate minimum entropy discretization for rough set feature selection in pattern classification. International Journal of Approximate Reasoning, 52, 863–880.
Wong, T. T. (2012). A hybrid discretization method for naive Bayesian classifiers. Pattern Recognition, 45, 2321–2325.
Yu, L. and Liu, H. (2003). Feature selection for high-dimensional data: A fast correlation-based filter solution. Proceedings of the Twentieth International Conference on Machine Learning, Washington DC, 856-863.
Zhao, J., Han, C. Z., Wei, B., and Han, D. Q. (2012). A UMDA-based discretization method for continuous attributes. Advanced Materials Research, 403-408, 1834-1838.
Zou, L., Yan, D., Karimi, H. R., and Shi, P. (2013). An algorithm for discretization of real value attributes based on interval similarity. Journal of Applied Mathematics, 1-8.
