National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

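Author: 呂理瑋
Author (English): Li-Wei Lu
Thesis title (Chinese): 使用關聯式法則在指定的領域中搜尋文件
Thesis title (English): Using Association Rules to Perform Document Retrieval in the Pre-Specified Domain
Advisor: 楊東麟
Advisor (English): Don-Lin Yang
Degree: Master's
Institution: Feng Chia University (逢甲大學)
Department: Graduate Institute of Information Engineering (資訊工程所)
Discipline: Engineering
Subject category: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2005
Graduation academic year: 93 (ROC calendar)
Language: English
Number of pages: 59
Keywords (Chinese, translated): World Wide Web; negative association rules; association rules; search engine; data mining
Keywords (English): data mining; negative association rule mining; document retrieval; World Wide Web; association rule mining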
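Abstract (translated from the Chinese): The era of networked information at massive scale has arrived. Acquiring knowledge is no longer limited to reading books and attending traditional classroom lectures; it is gradually shifting to a new style of learning that combines computers, multimedia, and the Internet. Through search engines, the World Wide Web has become a database of vast and rich resources, and because of the nature of the Internet we can go online at any time and place to look for the data or documents we need. As a result, many people's learning habits are gradually changing, and they learn from the information available online. Although many search engine technologies have been developed to help us find information, they still leave much room for improvement. In this thesis, we apply data mining techniques to propose a new, flexible search method that analyzes online documents within a specified academic domain in order to find the scholarly articles that best match the user's needs. This method compensates for a weakness of current search engines, which return too much useless data.
Our method combines two data mining techniques: positive association rules and negative association rules. Association rules uncover the word-to-word relationships hidden in documents, yielding useful word sets that are then used as keyword sets to help the user search. Negative association rules uncover mutually exclusive relationships between words; these relationships are used to filter out useless documents so that the user gets more precise search results. Users can also rate the retrieved documents through a feedback mechanism, which brings the results closer to their needs and lets them use the material for self-directed learning. Finally, we implemented the core of the proposed method and validated it against the search functions of journal search engines such as IEEE and ACM and the scholarly search engines Google Scholar and Scopus. The results show that the method effectively helps users find more useful documents.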
Abstract (English): With advances in network and information technology, more and more information is available on the Internet. There are now many ways to obtain knowledge: not only by reading books or attending classes, but also through computers, multimedia, and the Internet. With search engines, the World Wide Web becomes a large database of rich resources. Because of the characteristics of the Internet, we can find study materials anytime and anywhere, and people's learning habits are gradually changing as a result. Although numerous search engines exist for Web search, there is still much room for improvement. In this thesis, we propose a flexible search method based on data mining techniques. We analyze the documents from a specified domain to retrieve research papers that are closer to our needs. A search engine can find a great deal of information, but its results often contain much redundant data; our method is designed to address this problem.
Our method uses two data mining techniques: association rule mining and negative association rule mining. We use association rule mining to find the relationships between words hidden in the articles, discovering useful word sets that help the user search for documents. We also use negative association rule mining to find mutually exclusive relations between words, which are used to filter out useless documents and yield more accurate search results. Users can rate the search results through a feedback mechanism, which improves our method and better meets users' needs. We have implemented the core of the proposed method and evaluated it on specialized Web sites such as IEEE, ACM, and Google Scholar. The results show that our method helps users find useful documents as needed.
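To make the two techniques in the abstract more concrete, the following is a minimal Python sketch, not the thesis's actual implementation. It assumes that each document is treated as a set of words, that positive rules are mined between single words using the standard support and confidence measures, and that a word pair counts as negatively associated when its observed co-occurrence falls below the value expected under independence by at least a threshold. All function names, thresholds, and the toy corpus are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def count_words(docs):
    """Count how many documents contain each word and each word pair."""
    word_count, pair_count = {}, {}
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        for w in words:
            word_count[w] = word_count.get(w, 0) + 1
        for pair in combinations(words, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1
    return word_count, pair_count


def mine_positive_rules(docs, min_support=0.4, min_confidence=0.7):
    """Return rules x -> y as tuples (x, y, support, confidence)."""
    n = len(docs)
    word_count, pair_count = count_words(docs)
    rules = []
    for (a, b), c in pair_count.items():
        if c / n < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = c / word_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, c / n, confidence))
    return rules


def mine_negative_pairs(docs, min_support=0.4, min_interest=0.2):
    """Return frequent word pairs that co-occur far less than expected.

    Assumption: "interest" is supp(a) * supp(b) - supp(a and b).
    """
    n = len(docs)
    word_count, pair_count = count_words(docs)
    frequent = sorted(w for w, c in word_count.items() if c / n >= min_support)
    negatives = []
    for a, b in combinations(frequent, 2):
        expected = (word_count[a] / n) * (word_count[b] / n)
        observed = pair_count.get((a, b), 0) / n
        if expected - observed >= min_interest:
            negatives.append((a, b))
    return negatives


def search(query, candidates, rules, negatives):
    """Expand the query with positive rules, then filter with negative pairs."""
    expanded = {query} | {y for x, y, _, _ in rules if x == query}
    excluded = set()
    for a, b in negatives:
        if a in expanded:
            excluded.add(b)
        if b in expanded:
            excluded.add(a)
    excluded -= expanded
    results = []
    for doc in candidates:
        words = set(doc.lower().split())
        if words & expanded and not words & excluded:
            results.append(doc)
    return results


# Hypothetical toy corpus standing in for documents from one domain.
corpus = [
    "association rule mining",
    "association rule support",
    "association rule confidence",
    "neural network training",
    "neural network mining",
    "neural network layers",
]
rules = mine_positive_rules(corpus)
negatives = mine_negative_pairs(corpus)
hits = search(
    "rule",
    ["rule pruning with support", "rule extraction from neural network", "image segmentation"],
    rules,
    negatives,
)
```

In this sketch the query "rule" is expanded with "association" through a positive rule, and candidates containing "neural" or "network" are dropped because those words are negatively associated with the expanded query terms. The user feedback mechanism mentioned in the abstract is not modeled here.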
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Organization
Chapter 2 Background and Related Work
2.1 Search Engine
2.2 Data Mining
2.2.1 Association Rules
2.2.2 Negative Association Rules
Chapter 3 System Architecture
3.1 Crawler
3.2 Language Handler
3.3 Word Association Miner
3.4 Database Connector
3.5 Learning Assistant
Chapter 4 System Working Process
4.1 Data Preprocessing
4.2 Keyword Mining
4.3 Information Searching
Chapter 5 Experimental Results
5.1 Implementation
5.2 Data Preprocess
5.3 Results and Discussion
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References
Acknowledgements