Graduate student (English name): Chen-Wen Lin
Thesis title (English): A Classification Framework of Website Content Based on Folksonomy in Social Bookmarking
Advisor (English name): Shih-Ming Pi
Keywords (English): Folksonomy; Social bookmarking; Web 2.0; WordNet; Classification Method
Document classification techniques are widely applied in knowledge management and in enterprises. Automatic document classification has developed along two lines. The first is keyword-based: early work relied on TFIDF, and more recent work uses SVM. However, keyword-based automatic classification suffers from semantic problems. The second is semantic-based and aims to solve the problems of the keyword-based approach. Many researchers propose that ontology-based classification can solve the semantic problems of keyword-based methods, but it raises the questions of how to build the ontology and how to represent and define expert domain knowledge. Because of these difficulties, Folk Classification (folksonomy) emerged from the Web 2.0 concept; it is itself keyword-based. This paper proposes a Folk Classification Bookmark System that combines WordNet with the TFIDF classification method. We expect the system to solve both the semantic problem and the classification problem.
This research proposes a mechanism for the Folk Classification Bookmark System that integrates WordNet and TFIDF. Users define tags themselves. After the system separates the keywords from the tags, it looks up their synonyms in WordNet. Finally, the synonyms are classified with TFIDF, and users can query or browse by “keyword”.
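The pipeline described above (split tags into keywords, map synonyms to a common keyword, weight with TFIDF, browse by keyword) can be illustrated with a minimal sketch. It is not the thesis's implementation: the `SYNONYMS` table is a small hypothetical stand-in for a WordNet lookup, the function names are invented for illustration, and the TFIDF variant (relative term frequency times the natural log of inverse document frequency) is one common choice assumed here.

```python
import math
from collections import Counter

# Hypothetical stand-in for a WordNet synonym lookup: each word maps to one
# canonical keyword. A real system would query WordNet synsets instead.
SYNONYMS = {
    "picture": "photo",
    "image": "photo",
    "photo": "photo",
    "film": "movie",
    "movie": "movie",
}


def split_tags(tag_string):
    """Split a user's free-form tag string into lowercase keywords."""
    return [t.lower() for t in tag_string.replace(",", " ").split()]


def normalize(keyword):
    """Map a keyword to its canonical synonym; unknown words stay as-is."""
    return SYNONYMS.get(keyword, keyword)


def tfidf(bookmarks):
    """bookmarks: dict url -> list of keywords. Returns url -> {keyword: weight}."""
    n = len(bookmarks)
    df = Counter()  # number of bookmarks in which each keyword occurs
    for keywords in bookmarks.values():
        df.update(set(keywords))
    weights = {}
    for url, keywords in bookmarks.items():
        tf = Counter(keywords)
        total = len(keywords)
        weights[url] = {
            k: (count / total) * math.log(n / df[k]) for k, count in tf.items()
        }
    return weights


def build_index(raw_bookmarks):
    """Build keyword -> [(url, weight), ...], highest weight first."""
    normalized = {
        url: [normalize(k) for k in split_tags(tags)]
        for url, tags in raw_bookmarks.items()
    }
    index = {}
    for url, keyword_weights in tfidf(normalized).items():
        for keyword, weight in keyword_weights.items():
            index.setdefault(keyword, []).append((url, weight))
    for entries in index.values():
        entries.sort(key=lambda pair: (-pair[1], pair[0]))
    return index


def browse(index, keyword):
    """Return the URLs filed under a keyword or any of its synonyms."""
    return [url for url, _ in index.get(normalize(keyword.lower()), [])]


raw = {"a": "picture travel", "b": "image photo", "c": "film review"}
index = build_index(raw)
print(browse(index, "image"))  # bookmarks tagged picture, image, or photo
```

Because synonyms are merged before weighting, a query for "image" also reaches bookmarks that were tagged "picture" or "photo", which is the synonym problem the mechanism is meant to address.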
The results of this study show that Folk Classification achieves a data reduction rate of 30% or higher. The classification results improve the quality of the classified data and increase user satisfaction. In conclusion, this study proposes a Folk Classification mechanism and shows that it can mitigate the synonym and classification problems.
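The abstract does not define how the data reduction rate is computed. One plausible reading, assumed here purely for illustration, is the fraction of distinct tags eliminated when synonyms are merged into a single keyword. The function name and the sample tags below are hypothetical and are not data from the study.

```python
def reduction_rate(tags_before, tags_after):
    """Fraction of distinct tags removed by synonym merging (assumed definition)."""
    before = len(set(tags_before))
    after = len(set(tags_after))
    if before == 0:
        raise ValueError("tags_before must contain at least one tag")
    return 1 - after / before


# Illustrative tags only: ten distinct user tags collapse to seven keywords.
before = ["picture", "image", "photo", "film", "movie",
          "travel", "news", "blog", "java", "music"]
after = ["photo", "photo", "photo", "movie", "movie",
         "travel", "news", "blog", "java", "music"]
rate = reduction_rate(before, after)
print(f"{rate:.0%}")
```

Under this assumed definition, a rate of 30% means that roughly three in ten distinct tags were absorbed into an existing keyword.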
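Table of Contents
Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
Section 1. Research Background and Motivation
Section 2. Research Objectives and Questions
Section 3. Research Scope
Chapter 2. Literature Review
Section 1. Traditional Classification
Section 2. Web 2.0
Section 3. Folk Classification
1. Tagging
2. Folksonomy
Section 4. TFIDF
Section 5. WordNet
Chapter 3. Research Method
Section 1. Constructing the Conceptual Framework
Section 2. Developing the System Architecture
Section 3. Analyzing and Designing the System
Section 4. Building and Evaluating the System
Chapter 4. Prototype System Implementation and Evaluation
Section 1. Prototype System Implementation
1. Requirements Analysis
2. System Analysis and Design
3. System Testing
Section 2. System Evaluation
1. System Evaluation
2. Statistical Analysis
Chapter 5. Conclusions and Recommendations
Section 1. Research Conclusions
Section 2. Research Limitations
Section 3. Future Research Directions
References
Personal Information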

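List of Tables
Table 2-1. Comparison of Web 1.0 and Web 2.0
Table 2-2. Seven Characteristics of Web 2.0
Table 2-3. Advantages and Disadvantages of Folk Classification
Table 2-4. WordNet-Related Research
Table 3-1. Main Fields of the Personal Profile File
Table 3-2. Main Fields of the Classification Parameter File
Table 3-3. Main Fields of the Classification Data File
Table 3-4. User Behavior Definitions
Table 3-5. Main Data Flows of the Custom Bookmark Process
Table 3-6. Main JWordNet Functions
Table 3-7. Data Flow for Reading User Bookmarks
Table 3-8. Data Flow for Reading Classification Parameters
Table 3-9. Related-Semantics Query Parameters
Table 3-10. Data Flow for Accessing Classification Results
Table 4-1. Test Data
Table 4-2. Demographic Data of the System Experiment Users
Table 4-3. Satisfaction Survey Measurement Factors in This Study
Table 4-4. One-Sample Statistics
Table 4-5. One-Sample Test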

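List of Figures
Figure 2-1. KNN Classification Diagram
Figure 2-2. Genetic Algorithm Flowchart
Figure 2-3. Neural Network Architecture
Figure 2-4. Support Vector Machine Classification Diagram
Figure 2-5. Web 2.0 Concept Diagram
Figure 2-6. Tag Classification Results in flickr
Figure 2-7. Label Classification in Gmail
Figure 2-8. Classification Results in the "del.icio.us" Bookmarking Tool
Figure 3-1. System Development Research Process
Figure 3-2. System Conceptual Framework
Figure 3-3. System Architecture
Figure 3-4. WordNet Lexical Database Concept Diagram
Figure 3-5. User Bookmark Browsing Process
Figure 3-6. User Custom Bookmark Process
Figure 3-7. WordNet Semantic Analysis Process
Figure 3-8. Folk Classification Process
Figure 3-9. Detailed Flow of the Classification Weighting Mechanism
Figure 4-1. Implementation Flowchart
Figure 4-2. Development Environment and System Platform Architecture
Figure 4-3. Prototyping System Development Flowchart
Figure 4-4. System Evaluation Process
Figure 4-5. User Registration Screen
Figure 4-6. Bookmark Editing Screen
Figure 4-7. User Tag Classification Screen
Figure 4-8. Classification Result Comparison Screen
Figure 4-9. User Bookmark Reading Screen
Figure 4-10. Satisfaction Survey Screen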
