Author: 簡健宇 (John C. Chien)
Title: 應用特徵擷取、SVM和LSA於分析大量稀疏資料的推薦系統
Title (English): Apply Feature Extraction, SVM and LSA to Analyze Large-Scale Data for Recommendation Systems
Advisor: 洪政欣 (Hong, Jen-Shin)
Degree: Master's
Institution: 國立暨南國際大學 (National Chi Nan University)
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document type: Academic thesis
Year of publication: 2008
Graduating academic year: 96 (ROC calendar; 2007-08)
Language: Chinese
Number of pages: 60
Keywords (Chinese): 推薦系統; LSA; Collaborative Filtering 演算法
Keywords (English): recommendation system; latent semantic analysis; collaborative filtering; knowledge; data
Record statistics:
  • Cited by: 0
  • Views: 852
  • Downloads: 180
  • Bookmarked: 1
Current recommendation systems commonly analyze a user's purchasing preferences to predict whether the user has latent interest in other items, and then recommend those items to the user. KNN is one of the most common algorithms: it clusters users or items into groups and recommends the most popular items within a group to the users belonging to that group. In this study, we apply Latent Semantic Analysis, a well-known method from the field of information retrieval, to analyze the degree of association between many users and items. Compared with other recommendation systems, the proposed method spares the computer an exhaustive exploration of data associations and requires no pre-built item knowledge base; the analysis is carried out entirely by mathematical means. During the analysis, the method can flexibly adjust to the machine's actual hardware resources, extracting only the most important top-ranked information for processing. Preliminary experiments show encouraging prediction results, with accuracy above 90% on small data samples.
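The SVD-based prediction described above can be sketched in a few lines of NumPy. This is a minimal illustration on a hypothetical 5x4 customer-by-movie rating matrix (not the thesis's actual Netflix data); the rank parameter `k` plays the role of the "most important top-ranked information" retained during analysis:

```python
import numpy as np

# Hypothetical customer-by-movie rating matrix; 0 marks an unrated cell.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
    [5, 5, 0, 0],
], dtype=float)

def predict_with_svd(R, k):
    """Rank-k approximation of R via truncated SVD.

    Keeping only the top-k singular values captures the dominant
    latent customer-item associations; the reconstructed matrix
    supplies scores for the originally unrated cells.
    """
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# With k=2, the two dominant taste patterns are retained and every
# cell, rated or not, receives a predicted score.
R_hat = predict_with_svd(R, k=2)
```

Choosing a smaller `k` trades reconstruction fidelity for memory and computation, which matches the point above about adjusting the analysis to the machine's actual hardware resources.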

To further apply this method to very large data samples, we propose a solution built on it. First, an SVM classifier pre-classifies the large dataset, reducing the amount of data handled in each computation; we then filter for representative or discriminative features, retaining only the more important information for processing. This makes it possible to analyze and predict over large-scale data on a commodity computer.
Recommendation systems based on collaborative filtering predict customer preferences for items by learning from past customer-item pairs. A predominant approach to collaborative filtering is neighborhood-based (KNN-based), where a customer-item preference rating is inferred from the ratings of similar items or customers. In this research work, we apply latent semantic analysis, a technique from information retrieval, to discover the density of relations between customers and items. Unlike previous approaches, this method requires neither an exhaustive search for association rules nor a predefined knowledge base; the analysis is done by purely mathematical methods. During the computation, users can decide how much data to analyze depending on their actual circumstances. Preliminary experimental results show an encouraging accuracy of over 90%. To further handle large-scale, sparse data, we apply SVM and feature-selection techniques to reduce the dimensionality of the data representation. The SVM categorizes data samples into several subsets, each containing fewer samples. We apply feature selection within those smaller subsets to filter out less important or less relevant information, which helps the data fit the RAM of an ordinary PC.
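The feature-selection step mentioned above can be illustrated with a simple variance-based filter. This is only a stand-in sketch on a small hypothetical feature matrix; the thesis's actual selection criterion and the SVM partitioning step are not reproduced here:

```python
import numpy as np

def select_top_features(X, k):
    """Keep the k feature columns with the highest variance.

    Low-variance (non-discriminative) columns carry little information
    for distinguishing samples, so dropping them shrinks the data that
    must be held in RAM before the heavier analysis runs.
    """
    variances = X.var(axis=0)
    keep = np.sort(np.argsort(variances)[::-1][:k])  # top-k, in column order
    return keep, X[:, keep]

# Hypothetical 6-sample, 5-feature matrix: columns 0 and 4 are constant
# and column 3 is nearly constant, so they are the least discriminative.
X = np.array([
    [1, 0, 5, 0, 2],
    [1, 1, 0, 0, 2],
    [1, 0, 4, 1, 2],
    [1, 5, 0, 0, 2],
    [1, 0, 3, 0, 2],
    [1, 4, 1, 0, 2],
], dtype=float)

idx, X_small = select_top_features(X, k=2)  # keeps columns 1 and 2
```

In the pipeline described above, a filter of this kind would run separately inside each SVM-produced subset, so that each reduced block fits in main memory.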
Preface 3
Acknowledgement 3
論文摘要 4
Abstract 5
Content 6
List of Figures 9
1. Introduction 10
1.1. Motivation 11
1.2. Challenge 12
1.3. Thesis Structure 13
2. Related Work 15
2.1. Recommender System 15
2.2. Collaborative Filtering 16
2.3. Genetic Algorithm, GA 16
2.4. Latent Semantic Analysis, LSA 18
2.5. Singular Value Decomposition, SVD in Data Mining 20
2.6. Support Vector Machine, SVM 21
2.7. Feature Selection 22
2.8. Top K Nearest Neighbor, KNN 23
3. The Netflix Prize Competition 24
3.1. Introduction 24
3.2. Official Training Dataset 25
3.2.1. The movie_titles.txt File 26
3.2.2. The training_set.tar File 26
3.2.3. The qualifying.txt File 26
3.2.4. The probe.txt File 27
3.3. Target in Brief 27
3.4. Problem Set 28
3.5. Database Processing 28
4. Data Preparation and Variable Definitions 28
4.1. Data Preparation: Movie feature collection 29
4.2. Variable Definition: FM matrix / Movie representation with features 30
4.3. Variable Definition: CM matrix / Consumer rating record 31
4.4. Variable Definition: FC matrix / Consumer taste toward features 31
4.5. Variable Definition: NF, NM, NC 32
4.6. Sparseness of the Matrices 33
4.7. Analysis of the Processed Data 33
5. Approximation Matrices with SVD 34
5.1. Idea 34
5.2. Decompose singular values of FM and FC matrices 35
5.3. Analyze the Cells of and Matrices 38
5.4. Prediction Scores 38
5.5. Normalization: Scores to Rates 40
6. Alternative Computing 42
6.1. Movie Categories 43
6.2. Feature Domain Construction 44
6.3. The Intelligent Agent 46
6.4. The Idea 48
7. Additional Supporting Techniques 49
7.1. Affection Sensing 49
8. Other Related Works 50
8.1. L-R Method 51
8.2. LSA-LR Method 53
9. Experiment and Evaluation 55
9.1. Computing Environment and Hardware Configuration 55
9.2. Guideline 55
9.3. Experiment Results 57
10. Conclusion 58
11. Future Work 58
Reference 60