
National Digital Library of Theses and Dissertations in Taiwan


Graduate student: 陳勁瑋
Graduate student (English): Chen, Jin-Wei
Thesis title (Chinese): 利用文字探勘技術分析閱讀偏好與推薦文章
Thesis title (English): Preference Analysis and Articles Recommendation based on Text Mining Techniques
Advisor: 許中川
Advisor (English): Hsu, Chung-Chian
Oral defense committee: 陳重臣, 胡念祖
Oral defense committee (English): Chen, Jong-Chen; Hu, Nian-Ze
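Oral defense date: 2018-06-13
Degree: Master's
Institution: National Yunlin University of Science and Technology (國立雲林科技大學)
Department: Department of Information Management
Discipline: Computer science
Sub-discipline: General computer science
Thesis type: Academic thesis
Year of publication: 2018
Academic year of graduation: 106
Language: English
Number of pages: 37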
Keywords (Chinese): 文字探勘; 推薦系統; 使用者偏好; 資料檢索; 餘弦相似度
Keywords (English): Text mining; Recommendation system; User preferences; Information retrieval; Cosine similarity
Usage statistics:
  • Cited by: 3
  • Views: 789
  • Downloads: 255
  • Saved to bibliography lists: 2
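With the rapid growth of Internet technology, new articles of many types are published online every day, which makes it difficult for site administrators to analyze which types of articles their readers prefer. This study applies text mining and information retrieval to compute the TF-IDF of the terms in each article, uses those scores to select keywords that are representative for the recommender system, and builds a keyword dictionary. Once the dictionary is built, each user's reading preference is computed from the user's browsing records and the keyword vectors of the article collection, which reveals the tendencies in the articles the user has browsed recently. Finally, the cosine similarity between the user's preference and the keyword vectors of the article collection is computed, and articles with high similarity are recommended to the user. The experiments have two parts. The first uses the quantitative measure F1-measure to compare the recommendation results under different decay parameters. The second selects a representative user, analyzes the number of articles that user read in each month, and presents the recommendation results computed under each decay parameter, in order to help determine the most suitable decay parameter.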
In the information age, various types of new articles are posted on the Internet every day. The number of articles grows so rapidly that readers can hardly keep track of the articles they might be interested in. In this paper, we propose an approach that identifies readers' preferences and then recommends newly arriving articles accordingly. Specifically, we use information retrieval techniques to calculate the TF-IDF of the words in the articles, select representative words as keywords, and build a keyword dictionary. A reader's preference is constructed as a keyword vector based on the articles appearing in the reader's reading log. The topics of new articles are represented by corresponding keyword vectors, and the degree of match between a reader's preference and the topics of an article is measured by cosine similarity, on which the recommendation is based. The experiments are divided into two parts. The first uses the statistical indicator F1-measure to compare the results of different decay parameters. The second selects a representative user, analyzes the number of articles read in each month, and presents the recommendation results under various decay parameters.
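The pipeline the abstract describes can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation: the function names, the toy corpus, the particular TF-IDF variant, and the exponential form of the decay (weight = decay ** age in months) are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Map each article id to a {term: tf-idf} keyword vector.

    docs: dict of article id -> list of already-segmented tokens.
    Assumed variant: tf = count / article length, idf = ln(N / df).
    """
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency of each term
    vectors = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        vectors[doc_id] = {
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        }
    return vectors


def user_preference(reading_log, vectors, decay):
    """Sum the vectors of read articles, down-weighting older reads.

    reading_log: list of (article id, age in months).
    ASSUMPTION: exponential decay, weight = decay ** age.
    """
    pref = Counter()
    for doc_id, age in reading_log:
        weight = decay ** age
        for term, score in vectors[doc_id].items():
            pref[term] += weight * score
    return dict(pref)


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(pref, candidates, vectors, k):
    """Return the k candidate articles most similar to the preference."""
    ranked = sorted(candidates, key=lambda d: cosine(pref, vectors[d]),
                    reverse=True)
    return ranked[:k]


def f1_measure(recommended, relevant):
    """Harmonic mean of precision and recall of a recommendation list."""
    hits = len(set(recommended) & set(relevant))
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus: a1 and a2 share a topic, as do a3 and a4.
docs = {
    "a1": ["python", "code", "data"],
    "a2": ["python", "data", "model"],
    "a3": ["travel", "food", "city"],
    "a4": ["food", "recipe", "city"],
}
vectors = tf_idf_vectors(docs)
# The user read a3 six months ago and a1 this month.
pref = user_preference([("a3", 6), ("a1", 0)], vectors, decay=0.5)
top = recommend(pref, ["a2", "a4"], vectors, k=1)
```

With a decay parameter below 1, the recent read of a1 dominates the preference vector, so a2 ranks above a4. Setting decay to 1.0 disables the decay, which corresponds to the non-decay baseline that the experiments compare against.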
Abstract (Chinese)
Abstract
Table of contents
List of Tables
List of Figures
1. Introduction
1.1 Motivation
1.2 Objective
1.3 Organization
2. Literature review
2.1 Text segmentation
2.2 Term frequency analysis
2.3 Recommender system
2.3.1 Content-Based methods
2.3.2 Collaborative
2.3.3 Hybrid methods
2.4 Article recommendation
3. Methodology
3.1 Data processing
3.2 Term frequency analysis
3.2.1 Selection of part of speech
3.2.2 Synonym processing
3.2.3 Compute TF-IDF
3.3 Merge multi article datasets TF-IDF result
3.4 Transfer article content to vector
3.5 User preferences
3.5.1 Compute user preferences fraction
3.5.2 Decay user preference fraction by article reading date
3.5.3 Regularized term weight by term appear count in reading record
3.6 Article Recommender
3.6.1 Compute cosine similarity
3.6.2 Evaluation indicators
4. Experiments
4.1 Dataset
4.2 Unstructured natural language processing
4.3 Transfer article content to vector results
4.4 User preferences results
4.5 Recommended articles
4.6 Recommend new articles
4.6.1 Compare with different decay weight and non-decay results
4.6.2 In-depth analysis of user browsing pattern
5. Conclusions
References



