National Digital Library of Theses and Dissertations in Taiwan

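Graduate student: 李建明
Graduate student (English name): Jian-Ming Li
Thesis title: 數量相關法則技術在疾病資料庫之應用
Thesis title (English): Mining Quantitative Association Rules in Disease Databases
Advisor: 陳志宏
Advisor (English name): Jyh-Horng Chen
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2000
Academic year of graduation: 88 (ROC calendar)
Language: Chinese
Number of pages: 65
Keywords (Chinese): 資料探勘, 相關法則, 關連式資料庫, DHP演算法, 布林演算法
Keywords (English): Data Mining, Association rules, Relational database, DHP algorithm, Boolean algorithm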
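As hospital management has been progressively computerized and medical databases have become widespread, data now accumulate far faster than under traditional storage methods. A great deal of known and unknown information is necessarily hidden in these data, waiting to be mined, yet traditional statistical methods are not suited to data of this volume. A technique called data mining is steadily maturing, and within it association rule mining is a tool dedicated to examining the relationships among items.
Research on association rules began with retailers looking for patterns in the items customers purchase and was later extended to many kinds of databases, relational databases in particular, whose attributes are both quantitative and categorical. To remedy the fact that earlier association rule techniques were ill-suited to relational databases, we propose a statistical method that partitions the values of each quantitative attribute into suitable intervals. Unlike earlier work, which divides the full range of an attribute into equal parts, we start from the distribution of the data and use the mean and standard deviation as the partition parameters. This reflects the bias of the database itself, improves on the poor performance of earlier methods on highly skewed data, and is simpler to apply. We then combine the DHP algorithm with the Boolean algorithm to derive quantitative association rules.
Finally, we apply the method to disease databases, evaluate the results, and compare them with the earlier equal-partition method. The comparison shows that our method produces less noise and is also simpler to carry out.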
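The contrast between equal partitioning of the range and partitioning driven by the mean and standard deviation can be sketched in Python as follows. This record does not state the exact cut points used in the thesis, so the boundaries at mean - σ, mean, and mean + σ are an assumption made purely for illustration. The function names and the sample readings are hypothetical as well.

```python
import statistics
from bisect import bisect_right


def equal_width_cuts(values, n_bins):
    """Interior cut points that split [min, max] into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def mean_std_cuts(values, ks=(-1.0, 0.0, 1.0)):
    """Interior cut points at mean + k * std for each k in ks.

    The choice of ks is an assumption; the thesis record only says that the
    mean and standard deviation serve as the partition parameters.
    """
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return sorted(mu + k * sigma for k in ks)


def assign_interval(value, cuts):
    """Index of the interval holding value; interval i covers [cuts[i-1], cuts[i])."""
    return bisect_right(cuts, value)


def interval_counts(values, cuts):
    """Number of values that fall into each of the len(cuts) + 1 intervals."""
    counts = [0] * (len(cuts) + 1)
    for v in values:
        counts[assign_interval(v, cuts)] += 1
    return counts


# Hypothetical right-skewed measurements, not taken from the thesis.
readings = [80, 82, 85, 88, 90, 92, 95, 98, 100, 105, 110, 300]

print("equal-width cuts:", equal_width_cuts(readings, 4))
print("mean/std cuts:   ", mean_std_cuts(readings))
```

Each quantitative value is then replaced by the index of its interval, so that an attribute-interval pair can be treated like an ordinary item during rule mining.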
With the computerization of medical information and the spread of medical databases, the amount of data is growing more rapidly than ever. Much known and unknown information must be hidden in these data. Traditional statistical approaches are not suited to processing such large amounts of data. A technique called “Data Mining” is emerging, within which “Association Rules” focus on the relationships among data items.
Association rule mining was first introduced to find patterns in the items a customer buys in a supermarket. It can also be extended to mine association rules from relational databases. A relational database has two kinds of attributes: quantitative and categorical. In this thesis, we introduce a statistical method that partitions the values of a quantitative attribute into a set of intervals. Unlike the previous method, which partitions the range of an attribute into equal parts, our method is based on the observed data distribution and uses the mean and standard deviation of each attribute as the two partition parameters. This choice reflects the bias of the database, so it can improve the effectiveness of analysis on highly skewed data. To demonstrate the feasibility of our method, we combine two effective rule-mining algorithms, the DHP algorithm and the Boolean algorithm. With this combination, we can mine association rules from a relational database.
Finally, we apply this approach to two disease databases. We present the experimental results and compare them with those of previous methods. The results show that our method generates less noise and is easier to carry out.
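To make the notion of a quantitative association rule concrete, the sketch below maps relational records with quantitative and categorical attributes to sets of items and then computes the support and confidence of one rule by brute-force counting. It is not the DHP algorithm or the Boolean algorithm named in the abstract. The records, attribute names, and cut points are hypothetical and do not come from the thesis.

```python
from bisect import bisect_right


def to_items(record, cuts_by_attr):
    """Turn one record (a dict) into a frozenset of (attribute, item value) pairs."""
    items = set()
    for attr, value in record.items():
        if attr in cuts_by_attr:
            # Quantitative attribute: replace the value by its interval index.
            items.add((attr, bisect_right(cuts_by_attr[attr], value)))
        else:
            # Categorical attribute: keep the value itself.
            items.add((attr, value))
    return frozenset(items)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) divided by support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base


# Hypothetical records and cut points, for illustration only.
records = [
    {"glucose": 150, "age": 60, "diabetic": "yes"},
    {"glucose": 160, "age": 55, "diabetic": "yes"},
    {"glucose": 90, "age": 30, "diabetic": "no"},
    {"glucose": 95, "age": 62, "diabetic": "no"},
    {"glucose": 170, "age": 35, "diabetic": "no"},
]
cuts_by_attr = {"glucose": [100, 140], "age": [40]}
transactions = [to_items(r, cuts_by_attr) for r in records]

# Rule: glucose in the highest interval (index 2) implies diabetic = yes.
antecedent = [("glucose", 2)]
consequent = [("diabetic", "yes")]
print("support:   ", support(antecedent + consequent, transactions))
print("confidence:", confidence(antecedent, consequent, transactions))
```

A rule is normally reported only when both values exceed user-chosen minimum thresholds; algorithms such as Apriori or DHP exist to avoid counting every candidate itemset in this exhaustive way.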
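Table of Contents
List of Figures
Abstract (Chinese)
Abstract
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Overview of Data Mining Applications
1.4 Discussion of Quantitative Association Rules
1.5 Related Work
1.6 Thesis Organization
Chapter 2 Analysis of the Overall System Architecture
2.1 System Workflow for Quantitative Association Rules
2.2 Medical Data Analysis
2.2.1 Analysis and Transformation Methods
2.3 Generation of Large Itemsets
2.3.1 Apriori Algorithm
2.3.2 DHP Algorithm
2.3.3 Multiple Minimum Supports Mechanism
2.4 Generation of Association Rules
2.4.1 Boolean Algorithm
2.4.2 Spurious Association Rules
Chapter 3 Application to Disease Databases
3.1 Experimental Design and Results
3.1.1 Diabetes Database
3.1.2 Thyroid Database
3.2 Comparison of Results
3.3 Discussion
Chapter 4 Conclusions and Future Prospects
4.1 Conclusions
4.2 Future Work
References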