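Graduate student: Shin-Yi Wu (吳欣怡)
Thesis title: Incremental Mining of Event Sequences from Dynamic Business Environments (企業動態環境中事件序列之遞增式探掘)
Advisor: Rey-Long Liu (劉瑞瓏)
Degree: Master's
Institution: Chung Hua University (中華大學)
Department: Master's Program, Department of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Pages: 69
Chinese keywords: 遞增式探掘; 序列探掘; 企業環境資訊; 事件時間順序; 資料探掘
English keywords: incremental mining; sequences mining; business environmental information; temporal order of events; data mining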
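For most business managers, collecting and analyzing business environmental information is essential. Properly identifying the order in which the business-related information that managers care about appears lets them anticipate important events early and respond or take precautions in time. This study takes the Internet as a dynamic information source and, guided by the information needs of business managers, mines event sequences incrementally. Because network congestion and servers failing to provide normal service are common on the Internet, the articles gathered in each collection round are not necessarily ordered by time of occurrence. This breaks the event sequences that the mining mechanism has already mined and undermines the correctness of the mining results. Similar problems also arise in practice when data are recorded and collected. This study proposes a strategy for this problem: it mines incrementally and repairs broken event sequences, so that the correctness of the results is preserved without re-mining. To verify the practical contribution and performance of the mechanism, we collected a large number of articles from the Internet and compared the results of the proposed mechanism with those of a mining mechanism that does not handle broken sequences. The experimental results show that the proposed method achieves higher correctness and is also more efficient than traditional non-incremental mining mechanisms.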

Collecting and analyzing business environmental information is increasingly important for most businesses. Properly identifying the sequences of critical events helps managers recognize significant implications early and respond to them in a timely manner. However, because of network congestion and heavy load on the various information servers, the collected pieces of information do not always arrive in temporal order. This problem breaks previously mined sequences and thus reduces the correctness of the mining results. This thesis proposes an incremental mining technique, iSMART, to tackle the problem. Instead of re-mining the whole database each time an out-of-order event is collected, iSMART incrementally examines only the necessary parts of the database. To evaluate the performance of iSMART, we collected documents from the Internet and compared iSMART with traditional mining approaches that do not handle broken sequences. The experimental results indicate that iSMART achieves higher correctness and better efficiency.
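The idea of repairing mined sequences locally, rather than re-mining everything when a late event arrives, can be illustrated with a minimal sketch. This is not the iSMART algorithm, whose details this record does not give. It assumes a much simpler setting in which a "sequence" is just a pair of adjacent events in the time-ordered stream, and the names `IncrementalPairCounter`, `add`, `frequent`, and `recount` are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


class IncrementalPairCounter:
    """Illustrative only: counts adjacent event pairs in a time-ordered stream.

    Assumption (not from the thesis): a "sequence" here is a pair of events
    that are neighbors once all events are sorted by timestamp.
    """

    def __init__(self):
        self.events = []              # (timestamp, label), kept sorted
        self.pair_counts = Counter()  # (earlier label, later label) -> count

    def add(self, timestamp, label):
        # Find where the event belongs; a late arrival lands mid-list.
        i = bisect_right(self.events, (timestamp, label))
        prev = self.events[i - 1][1] if i > 0 else None
        nxt = self.events[i][1] if i < len(self.events) else None

        # The old neighbor pair (prev, nxt) is broken by the new event.
        if prev is not None and nxt is not None:
            self.pair_counts[(prev, nxt)] -= 1
            if self.pair_counts[(prev, nxt)] == 0:
                del self.pair_counts[(prev, nxt)]

        # Only the two pairs touching the new event need to be added.
        if prev is not None:
            self.pair_counts[(prev, label)] += 1
        if nxt is not None:
            self.pair_counts[(label, nxt)] += 1

        self.events.insert(i, (timestamp, label))

    def frequent(self, min_support):
        """Return the pairs whose count reaches min_support."""
        return {p: c for p, c in self.pair_counts.items() if c >= min_support}


def recount(events):
    """Non-incremental baseline: sort everything and count pairs from scratch."""
    ordered = [label for _, label in sorted(events)]
    return Counter(zip(ordered, ordered[1:]))


counter = IncrementalPairCounter()
arrivals = [(1, "A"), (2, "B"), (4, "A"), (5, "B"), (3, "C")]  # (3, "C") is late
for ts, label in arrivals:
    counter.add(ts, label)
```

In this sketch each arrival touches at most three pair counts, whereas `recount` re-sorts and rescans the whole event list. After the late event `(3, "C")` is added, the incremental counts agree with a full recount over the same events.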

Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
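Chapter 1 Introduction
1.1. Problem Definition and Motivation
1.2. Main Challenges
1.3. Main Objectives
1.4. Research Method and Thesis Organization
Chapter 2 Literature Review
2.1. Text Mining
2.1.1. Document Processing
2.1.2. Rule Induction
2.2. Sequence Mining
2.2.1. Apriori-based Methods
2.2.2. Research on Efficiency Improvements
2.2.3. Mining Frequent Episodes in Sequential Data
2.2.4. Incremental Sequence Mining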
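Chapter 3 Mining Dynamic Event Sequences
3.1. Article Collection and Monitoring Module
3.2. Article Preprocessing Module
3.3. Event Identification Module
3.4. Incremental Mining Module
Chapter 4 Incremental Mining Method for Dynamic Event Sequences
4.1. Basic Definitions
4.2. Incremental Mining of Dynamic Sequences
4.2.1. iSMART
4.2.2. Changed-S2-Detection
4.2.3. Update-Sequence
4.2.4. Remove-non-Frequent
4.3. Worked Example of iSMART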
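Chapter 5 Experiments
5.1. Data Collection
5.2. Experimental Design
5.2.1. Correctness Verification
5.2.2. Performance Verification
5.3. Experimental Results and Analysis
5.3.1. Correctness Verification Results
5.3.2. Performance Verification Results
Chapter 6 Conclusions and Future Work
6.1. Conclusions
6.2. Future Work
References
Appendix 1: Sample Internet Article
Appendix 2 (A): iSMART Mining Results (F1)
Appendix 2 (B): iSMART Mining Results (F2)
Appendix 2 (C): iSMART Mining Results (F3)
Appendix 2 (D): iSMART Mining Results (F4)
Appendix 3 (A): Mining Results of the Traditional Mining Mechanism (F1)
Appendix 3 (B): Mining Results of the Traditional Mining Mechanism (F2)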

