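Graduate student: 張琬婷
Graduate student (English name): Chang Wan Ting
Thesis title: 以網路探勘技術建立維基百科瀏覽輔助介面之初探
Thesis title (English): An Exploratory Study of Designing a Browsing Support Interface in Wikipedia based on Web Mining Technique
Advisor: 吳怡瑾
Advisor (English name): Wu I-Chin
Degree: Master's
Institution: Fu Jen Catholic University (輔仁大學)
Department: Department of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Academic year of graduation: 98 (ROC calendar)
Language: Chinese
Number of pages: 50
Chinese keywords: 決策導向任務; 階層式分群; 連結探勘; 文件摘要; Wikipedia
English keywords: Decision-making oriented task; Hierarchical agglomerative clustering; Link-based analysis; Text summarization; Wikipedia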
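Because websites are easy to build and web pages are easy to access, information on the Web has grown rapidly, and the Web has gradually become a primary source of information and knowledge for users. Wikipedia in particular has become an important web service for quickly obtaining definitions, explanations, and similar information. This study takes the Chinese Wikipedia as its subject and draws on link mining, clustering techniques, and text summarization as its main theoretical foundations. It attempts to build a topic-summary search support interface over Wikipedia's vast page collection, based on an analysis of page links and page content. The main steps are: (1) building on previous research methods, construct a link network map that considers both the link type and the link frequency of Wikipedia pages; (2) apply hierarchical agglomerative clustering (HAC) to analyze article topics and build a topic hierarchy tree that helps users quickly browse and query related topics. The study evaluates the effectiveness of the proposed method with a user search task design (task-oriented approach). The results show that the interface supported by a topic tree and summaries helps users find the information they need in less time than both our previously proposed link-based interface (WikiMap) and the traditional Wikipedia interface. The evaluation results also indicate directions for improving this exploratory interface in future work.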
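Step (1) weighs both the type and the frequency of links between pages. The record does not give the formula, so the following Python sketch only illustrates the idea under stated assumptions: the two link types (`mutual`, `one_way`), their weights in `TYPE_WEIGHT`, and the function name `link_strengths` are all hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical weights per link type; the thesis does not state its values.
TYPE_WEIGHT = {"mutual": 2.0, "one_way": 1.0}


def link_strengths(links):
    """Return {(a, b): strength} for unordered article pairs.

    `links` is a list of (source, target) pairs, one per hyperlink occurrence.
    """
    freq = Counter(links)  # how often each directed link occurs
    strengths = {}
    for (src, dst), count in freq.items():
        if src == dst:
            continue  # ignore self-links
        pair = tuple(sorted((src, dst)))
        if pair in strengths:
            continue  # already handled from the other direction
        reverse = freq.get((dst, src), 0)
        link_type = "mutual" if reverse else "one_way"
        # Assumed combination: type weight times total link frequency.
        strengths[pair] = TYPE_WEIGHT[link_type] * (count + reverse)
    return strengths


links = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C")]
network = link_strengths(links)
```

Here `network` maps the pair `("A", "B")` to 6.0 (mutual, three occurrences in total) and `("A", "C")` to 1.0 (one-way, one occurrence).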
Wikipedia, the largest multilingual online encyclopedia, is the result of collaboration by countless volunteers around the world. It allows users to contribute their knowledge as a Wiki community; thus, the number of articles in Wikipedia is constantly growing. As a result, it is difficult for users to find articles efficiently via hyperlinks. To tackle this problem, we implement a search-aided system for Wikipedia that is based on the theories and techniques of link-based analysis, text clustering, and summarization. First, we employ a link strength measure to establish a link-based network by analyzing the relationships between articles. Second, we adopt hierarchical agglomerative clustering (HAC) to identify topics from the constructed network. To help users find the articles they need efficiently, the interface shows an article's summary once the user clicks the article that he or she wants to read. Finally, we conduct a user task-oriented evaluation. The preliminary results show that the proposed interface achieves slightly better performance, with higher precision and shorter search time, than the WikiMap interface and the traditional Wikipedia interface, especially for decision-making oriented tasks. Our findings have implications for the design of effective search-aided interfaces, which are crucial for finding relevant information efficiently in link-based Web sites.
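The clustering step can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the thesis implementation: the abstract does not say which linkage criterion or similarity measure was used, so average linkage over cosine similarity of raw term-frequency vectors is assumed here, and whitespace tokenization stands in for the Chinese word segmentation a real system would need. The function names `cosine` and `hac` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def hac(docs):
    """Average-link agglomerative clustering (assumed linkage).

    Returns the merge hierarchy as a nested tuple of article titles.
    """
    vectors = {title: Counter(text.split()) for title, text in docs.items()}
    # Each cluster is (tree, member_titles); start with one cluster per article.
    clusters = [(title, [title]) for title in docs]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [cosine(vectors[a], vectors[b])
                        for a in clusters[i][1] for b in clusters[j][1]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        # Merge the most similar pair into one node of the topic tree.
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


docs = {
    "Taipei": "city taiwan capital city",
    "Kaohsiung": "city taiwan port",
    "Python": "programming language code",
}
tree = hac(docs)
```

On this sample, the two city articles merge first and the programming article joins at the root, giving `("Python", ("Taipei", "Kaohsiung"))`.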
List of Figures
List of Tables
Chapter 1 Introduction
  Section 1 Research Background and Motivation
  Section 2 Research Objectives
Chapter 2 Literature Review
  Section 1 Text Mining and Text Summarization
    1. Text Mining
    2. Text Summarization
  Section 2 Generating Concept Maps through Document Clustering
    1. Hierarchical Clustering
    2. Concept Map
  Section 3 Building a Network Relation Map through Link Analysis
Chapter 3 Research Questions and Process
  Section 1 Defining the Research Questions
  Section 2 Research Process
    1. Document Preprocessing
    2. Building the Network Relation Map through Document Link Strength Analysis
    3. Building the Topic Concept Map with Web Mining Techniques
Chapter 4 Building the Wikipedia Browsing Support Map with Web Mining Techniques
  Section 1 Document Preprocessing
    1. Removing Document Formatting
    2. Term Processing
  Section 2 Document Link Strength Analysis
  Section 3 Analyzing Documents with Clustering Techniques
    1. Term Vectors
    2. Performing Hierarchical Clustering
  Section 4 Generating Document Summaries
    1. Sentence Ranking
    2. Summary Generation
  Section 5 Description of the Wikipedia Browsing Support Map
    1. Building the Network Relation Map
    2. Building the Topic Concept Map with Clustering Techniques
Chapter 5 Experimental Design
  Section 1 Experimental Procedure
  Section 2 Experimental Interfaces
    1. Traditional Browsing Interface (baseline interface)
    2. Wiki Network Relation Map Browsing Interface (Interface A)
    3. Wiki Topic Summary Navigation Support Interface (Interface B)
  Section 3 Participants and Task Design
  Section 4 Experimental Assignment
  Section 5 Evaluation Metrics
    1. Evaluation of Experimental Results
    2. Questionnaire Evaluation
Chapter 6 Experimental Results
  Section 1 Item Search Task Evaluation
    1. Precision
    2. Time Spent on Item Search Tasks
  Section 2 Decision-Making Oriented Task Evaluation
    1. Ranking Correctness
    2. Time Spent on Decision-Making Oriented Tasks
  Section 3 Questionnaire Results
  Section 4 Discussion and Findings
Chapter 7 Conclusions and Future Work
References
Appendix 1 Questionnaire