National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

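Graduate student: 陳鵬雄
Graduate student (romanized): CHEN, PENG-HSIUNG
Thesis title (English): An Improved Feature Selection Method toward Precise Disease Classification
Advisor: 薛幼苓
Advisor (romanized): HSUEH, YU-LING
Oral defense committee: 薛幼苓, 江振國, 陳奕中
Oral defense committee (romanized): HSUEH, YU-LING; CHIANG, CHEN-KUO; CHEN, YI-CHUNG
Oral defense date: 2016-07-28
Degree: Master's
Institution: National Chung Cheng University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: English
Number of pages: 35
Keywords (translated from Chinese): Data mining; Feature selection; Chest disease; Medical decision
Keywords (English): Data mining; Feature selection; Medical decision; Chest pain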
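In emergency medicine, judging a patient's condition correctly and quickly and deciding how to treat it is an important issue today. When an emergency room faces many patients at once and their number exceeds what the emergency physicians can handle, the patients must be quickly classified by severity so that treatment is as efficient as possible. In an emergency, however, the information a physician can obtain about a patient is quite limited, and many symptoms are very similar, so judging a patient's severity in real time is very difficult. This study therefore uses electronic health records and data mining techniques to build a classification model that helps physicians diagnose quickly. The data are emergency room visit records from Chiayi Christian Hospital between 2012 and 2013, in which the 15,775 chest pain patients include 338 critical chest pain patients. This study focuses on building the classification model more efficiently to help physicians identify critical chest pain patients, so we propose an improved fast clustering-based feature selection algorithm (i-FAST) that excludes irrelevant and redundant features. The algorithm has two parts. In the first part, we use the ReliefF algorithm to remove irrelevant features. In the second part, we use three mutual information measures, (1) symmetric uncertainty, (2) information gain, and (3) gain ratio, to compute the correlation between features; we then build a minimum spanning tree from each set of correlations and partition the tree to find representative features. Finally, the selected features are used with data mining techniques to build a classifier for chest pain patients.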
In a hospital emergency room, patients need to be diagnosed quickly so that doctors can decide on the required treatment. Doctors have to prioritize patients according to the urgency of their condition. However, it is hard to diagnose a disease immediately when patients arrive at the emergency room, because different diseases can present with similar symptoms. In this research, we use data mining techniques to analyze electronic health records (EHRs) to help doctors diagnose patients promptly. The dataset was collected from the emergency room of Chiayi Christian Hospital, Chiayi City, Taiwan, and contains medical records from 2012 to 2013. The objective is to build a classifier that identifies chest pain patients. For this purpose, we design a feature selection algorithm, the improved fast clustering-based feature subset selection algorithm (i-FAST), which can be combined with any existing classifier. i-FAST aims to remove irrelevant and redundant features and to find the important features for classifier construction. First, irrelevant features are removed by ReliefF. Second, the distances between features are calculated based on three mutual information measures: symmetric uncertainty, information gain, and gain ratio. We then construct a minimum spanning tree (MST) from the feature distances and partition the tree to select representative features. Finally, a classifier is built on the selected features to identify chest pain patients. The experimental results show that our classifier integrated with the i-FAST algorithm outperforms the classifier integrated with the FAST algorithm.
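As a rough illustration of the pipeline the abstract describes, the following Python sketch computes the three measures for discrete features and then clusters features with a minimum spanning tree. It is not the thesis's implementation. The ReliefF filtering step is omitted, and several choices are assumptions made only for this example: the distance between two features is taken as 1 minus their symmetric uncertainty, the tree is partitioned by cutting edges longer than a hypothetical `cut` threshold, and each cluster is represented by its most target-relevant feature. The function names are likewise invented here.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def information_gain(x, y):
    """IG(X; Y) = H(X) + H(Y) - H(X, Y), i.e. the mutual information."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def gain_ratio(x, y):
    """Information gain normalized by the entropy of the feature x."""
    hx = entropy(x)
    return information_gain(x, y) / hx if hx > 0 else 0.0


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), which lies in [0, 1]."""
    denom = entropy(x) + entropy(y)
    return 2.0 * information_gain(x, y) / denom if denom > 0 else 0.0


def minimum_spanning_tree(n, dist):
    """Prim's algorithm on the complete graph over nodes 0..n-1."""
    visited, edges = [0], []
    while len(visited) < n:
        i, j = min(
            ((a, b) for a in visited for b in range(n) if b not in visited),
            key=lambda e: dist(*e),
        )
        visited.append(j)
        edges.append((i, j))
    return edges


def select_features(features, target, cut=0.5):
    """Cluster features via an MST and keep one representative per cluster."""
    n = len(features)
    relevance = [symmetric_uncertainty(f, target) for f in features]

    # Assumption: distance = 1 - SU, so redundant features are close together.
    def dist(i, j):
        return 1.0 - symmetric_uncertainty(features[i], features[j])

    # Assumption: partition the tree by cutting edges longer than `cut`.
    kept = [(i, j) for i, j in minimum_spanning_tree(n, dist) if dist(i, j) <= cut]

    # Group the remaining forest into connected components (clusters).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in kept:
        parent[find(i)] = find(j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # Representative = the most target-relevant feature in each cluster.
    return sorted(max(c, key=lambda i: relevance[i]) for c in clusters.values())


# Toy data: f1 duplicates f0, and the target depends on both f0 and f2.
f0 = [0, 0, 1, 1, 0, 0, 1, 1]
f1 = list(f0)
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
target = [2 * a + b for a, b in zip(f0, f2)]
print(select_features([f0, f1, f2], target))  # [0, 2]
```

In the toy data the duplicated feature falls into the same cluster as the feature it copies, so only one of the two is kept. Gain ratio is asymmetric in its arguments, so using it as a feature-to-feature distance would need an extra convention that this sketch does not attempt.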
1. Introduction
2. Background and Related Work
3. Preliminary
3.1 Decision tree
3.2 Support vector machine
3.3 Bayesian network
3.4 FAST algorithm
4. The i-FAST Algorithm
4.1 Data preprocessing
4.2 Feature correlation measurement
4.3 The core of the i-FAST algorithm
5. Experiments
5.1 Dataset description
5.2 Evaluation measurement
5.3 Experimental results
5.3.1 Unbalanced vs. balanced datasets
5.3.2 Continuous vs. categorized features
5.3.3 Feature selection results
6. Conclusions and Future Work
Bibliography