臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

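Graduate student: 童俊宏
Graduate student (English name): Jiun-Hung Tung
Thesis title: 無候選型樣產生之頻繁樹狀結構探勘
Thesis title (English): MINT: Mining Frequent Rooted Induced Unordered Tree without Candidate Generation
Advisor: 張嘉惠
Advisor (English name): Chia-Hui Chang
Degree: Master's
Institution: 國立中央大學 (National Central University)
Department: 資訊工程研究所 (Graduate Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Graduation academic year: 94 (ROC calendar)
Language: Chinese
Number of pages: 36
Keywords (Chinese): 支持度 (support); 標準型式 (canonical form); 子樹 (subtree); 頻繁型樣 (frequent pattern)
Keywords (English): frequent pattern; subtree; canonical form; support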
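Tree mining is an important problem in the field of data mining, with applications in web log analysis, bioinformatics, and semi-structured documents. Previous work in this area, however, first generates candidate patterns and then tests whether each one is frequent, discarding those that are not. This approach spends a great deal of time and space on generating and testing candidates. In this thesis we therefore use the concept of local frequency to design an algorithm that generates no candidates when mining rooted, induced, unordered tree structures; we call this algorithm MINT. We use a data generator to produce synthetic datasets and, together with real web log data, compare MINT against HybridTreeMiner. The experimental results show that, even for a data type as complex as trees, searching for locally frequent items can still achieve good performance.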
Tree pattern mining is an important problem in data mining, with emerging applications that include web log analysis, bioinformatics, and semi-structured documents. However, most previous work follows the candidate-generation-and-test approach: candidate patterns are enumerated from shorter frequent patterns in the Apriori manner. Because this approach spends much time and space on candidate generation and testing, in this paper we adopt the idea of pattern growth to mine frequent rooted induced unordered trees without candidate generation. In the performance study, we use synthetic datasets and real-world application datasets to compare our algorithm with HybridTreeMiner. The experiments show that our algorithm is efficient and cost-effective.
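The two ideas named in the abstract, a canonical form for unordered trees and growth from locally frequent items, can be illustrated with a minimal sketch. It is not the MINT algorithm itself: the `(label, children)` tree representation and the helper names `canonical`, `nodes`, and `locally_frequent_children` are assumptions made for illustration, and the canonical string used here (sorted child encodings) is a generic choice that may differ from the canonical form defined in the thesis.

```python
from collections import Counter

# Assumed representation: a tree is a (label, children) pair, where
# children is a list of trees. Labels are assumed not to contain "(", ")" or ",".

def canonical(tree):
    """Return a string that is identical for every sibling reordering of the tree."""
    label, children = tree
    # Encode each child recursively, then sort so sibling order no longer matters.
    encoded = sorted(canonical(child) for child in children)
    return label + "(" + ",".join(encoded) + ")"

def nodes(tree):
    """Yield every node of the tree (as a subtree) in depth-first order."""
    yield tree
    for child in tree[1]:
        yield from nodes(child)

def locally_frequent_children(database, parent_label, min_support):
    """Return child labels found under parent_label in at least min_support trees.

    Only labels that actually occur in the data are counted, so nothing is
    generated first and rejected afterwards.
    """
    support = Counter()
    for tree in database:
        seen = set()
        for label, children in nodes(tree):
            if label == parent_label:
                seen.update(child[0] for child in children)
        support.update(seen)  # each tree contributes at most 1 per label
    return {lab: n for lab, n in support.items() if n >= min_support}

t1 = ("a", [("b", []), ("c", [("d", [])])])
t2 = ("a", [("c", [("d", [])]), ("b", [])])  # t1 with its siblings swapped
t3 = ("a", [("b", []), ("e", [])])

print(canonical(t1) == canonical(t2))
print(locally_frequent_children([t1, t2, t3], "a", 2))
```

In this toy database the one-node pattern `a` would be extended only with the child labels that reach the support threshold. A full miner would repeat that step on the occurrences of each grown pattern.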
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation and Objectives
1.2 Thesis Organization
Chapter 2 Problem Definition
Chapter 3 Related Work
3.1 The Unot Algorithm
3.2 The uFreqt Algorithm
3.3 The HybridTreeMiner Algorithm
3.4 The RootedTreeMiner Algorithm
3.5 Comparison of Related Work
Chapter 4 Algorithm
4.1 Challenges of Tree Mining
4.2 Algorithm Framework
4.2.1 Canonical Form
4.2.2 Pattern Enumeration
4.2.3 Label Range Computing for Extension Points
4.2.4 Tree Pattern Growth (Extension)
4.3 The Algorithm
Chapter 5 Experimental Results
5.1 Synthetic Datasets
5.1.1 Description of the Data Generator
5.1.2 Analysis of Experiments on Synthetic Datasets
5.2 Real Datasets
5.2.1 Analysis of Experiments on Real Datasets
Chapter 6 Conclusion
References
[1] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In proceedings of the 1994 International Conference on Very Large Data Bases (VLDB’94), Sept. 1994, 487-499.
[2] T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Sakamoto, and S. Arikawa, Efficient Substructure Discovery from Large Semi-structured Data. In proceedings of the 2nd SIAM International Conference on Data Mining, April 2002.
[3] T. Asai, H. Arimura, T. Uno, and S. Nakano: Discovering Frequent Substructures in Large Unordered Trees. In proceedings of 6th International Conference on Discovery Science, October 2003.
[4] Y. Chi, Y. Yang, and R. R. Muntz, Indexing and Mining Free Trees. In proceedings of the 3rd IEEE International Conference on Data Mining (ICDM’03), November 2003.
[5] Y. Chi, Y. Yang, and R. R. Muntz, HybridTreeMiner: An Efficient Algorithm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms. In proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM’04), June 2004.
[6] Y. Chi, Y. Yang, and R. R. Muntz, Canonical Forms for Labeled Trees and Their Applications in Frequent Subtree Mining. Journal of Knowledge and Information Systems (KAIS), August 2005, 203-234.
[7] Y. Chi, Y. Yang, Y. Xia, and R. R. Muntz: CMTreeMiner, Mining Closed and Maximal Frequent Subtrees from Databases of Labeled Rooted Trees. IEEE Transactions on Knowledge and Data Engineering, 17(2), February, 2005.
[8] J. Han, J. Pei, Y. Yin, and R. Mao, Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Journal of Data Mining and Knowledge Discovery, 8(1), 53-87, 2004.
[9] K. Y. Huang, C. H. Chang and K. Z. Lin, PROWL: An efficient frequent continuity mining algorithm on event sequences. In proceedings of 6th International Conference on Data Warehousing and Knowledge Discovery (DaWak), 2004.
[10] S. Nijssen and J. N. Kok: Efficient Discovery of Frequent Unordered Trees. 1st International Workshop on Mining Graphs, Trees and Sequences, 2003.
[11] H. Tan, T. S. Dillon, F. Hadzic, E. Chang, and L. Feng, IMB3-Miner: Mining Induced/Embedded Subtrees by Constraining the Level of Embedding. In proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2006), 450-461, April 9-12, 2006.
[12] C. Wang, M. Hong, J. Pei, H. Zhou, W. Wang, and B. Shi, Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining. In proceedings of PAKDD, 2004.
[13] Y. Xiao, J. F. Yao, Z. Li, and M. H. Dunham, Efficient Data Mining for Maximal Frequent Subtrees. In proceedings of the 3rd IEEE International Conference on Data Mining, 2003.
[14] J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang, H-Mine: Hyper-Structure Mining of Frequent Pattern in Large Database. In proceedings of International Conference on Data Mining (ICDM), 2001.
[15] J. Pei, J. Han, B. M. Asl, and H. Pinto, PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. In proceedings of 17th International Conference on Data Engineering (ICDE), 2001.
[16] J. Punin, M. Krishnamoorthy, M. Zaki, LOGML: Log markup language for web usage mining. In WEBKDD Workshop (with SIGKDD), August 2001.
[17] Y. Xiao, J. F. Yao, and G. Yang, Discovering Frequent Embedded Subtree Patterns from Large Databases of Unordered Labeled Trees. International Journal of Data Warehousing and Mining (IJDWM), 1(2), 44-66, April-June 2005.
[18] M. J. Zaki, C. C. Aggarwal, XRules: An Effective Structural Classifier for XML Data. In proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2003.
[19] M. J. Zaki, Efficiently Mining Frequent Trees in a Forest, Algorithms and Applications. IEEE Transactions on Knowledge and Data Engineering, 17(8), 1021-1035, August 2005.
[20] M. J. Zaki, Efficiently Mining Frequent Embedded Unordered Trees. Fundamenta Informaticae, 2005.