Graduate student (English name): Rai-Yi Du
Thesis title (English): A Study for the Collaborative Recommendation with Quantitative Association Rules
Advisor (English name): Ing-Long Wu
Keywords (English): Data Mining; Association Rules; Recommendation Systems; Collaborative Filtering
With the advance of technology and the growth of the Internet, Internet applications have spread quickly, and the Internet has become one of the main media through which people receive information. Compared with traditional media, both the volume of information on the Internet and the speed at which it spreads have grown explosively. To address the information overload caused by this rapid growth, many information technologies have been proposed and applied to both non-profit and commercial activities; among them, data mining and recommendation systems are the most widely used. Data mining refers to finding similar, related, or potentially needed products in large databases of historical transactions, and association rule mining is its most frequently used technique. However, traditional association rule mining can analyze only Boolean values and cannot handle quantitative data. It therefore ignores the important information carried by quantities, so the rules it finds may be inapplicable or biased.
A recommendation system not only helps users search for information but also recommends other products they may be interested in or need. There are two main approaches: content-based filtering and collaborative filtering. Content-based filtering is less widely adopted because of limitations such as restrictions on item format and the rarity of serendipitous discoveries. Collaborative filtering instead finds a group of users with common interests, called a neighborhood, analyzes the interests and preferences its members share, and recommends related items to members of that neighborhood who need them. Because it does not need to analyze the items themselves, it avoids the drawbacks of content-based filtering. Even so, collaborative filtering still suffers from problems such as sparsity and scalability. This study therefore combines quantitative association rule mining with collaborative filtering to address these two problems and to give users more precise recommendations. Experiments on real data show that the proposed method is feasible and yields better recommendations than traditional collaborative filtering.
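The neighborhood idea described in the abstract can be illustrated with a minimal sketch. This is not the method proposed in the thesis: the cosine similarity measure, the function names, and the toy purchase data are all assumptions chosen for illustration. Each user is represented by the quantities purchased per item, the most similar users form the neighborhood, and items the target user has not bought are ranked by similarity-weighted quantity.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors (dict: item -> quantity)."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighborhood(target, purchases, k):
    """Return up to k (user, similarity) pairs most similar to the target user."""
    scores = [
        (cosine(purchases[target], vec), user)
        for user, vec in purchases.items()
        if user != target
    ]
    scores.sort(reverse=True)
    # Users with zero similarity share nothing with the target, so drop them.
    return [(user, sim) for sim, user in scores[:k] if sim > 0]


def recommend(target, purchases, k=2, n=3):
    """Rank items the target has not bought by similarity-weighted quantity."""
    owned = purchases[target]
    weights = {}
    for user, sim in neighborhood(target, purchases, k):
        for item, qty in purchases[user].items():
            if item not in owned:
                weights[item] = weights.get(item, 0.0) + sim * qty
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Hypothetical purchase quantities, for illustration only.
purchases = {
    "alice": {"tea": 5, "milk": 1},
    "bob": {"tea": 4, "milk": 1, "honey": 2},
    "carol": {"beer": 6, "chips": 3},
    "dave": {"tea": 1, "milk": 5, "cereal": 3},
}

print(recommend("alice", purchases))
```

In this toy data, bob and dave both share tea and milk with alice, so 0/1 purchase indicators would rate them as equally similar to her. The quantities separate them: bob's buying pattern is much closer to alice's, which is the kind of information a purely Boolean analysis discards.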
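1. Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Thesis Organization
2. Literature Review
2.1 Applications of Recommendation Systems in E-Commerce
2.2 Recommendation Systems
2.2.1 Introduction
2.2.2 Recommendation System Architecture
2.3 Collaborative Filtering
2.3.1 Definition
2.3.2 How Collaborative Filtering Works
2.3.3 Advantages and Limitations of Collaborative Filtering
2.3.4 Evaluation Metrics for Collaborative Filtering Systems
2.4 Association Rule Mining
2.4.1 Definition of Association Rule Mining
2.4.2 Mining Procedure
3. Research Method
3.1 Research Framework
3.2 Data Preprocessing
3.3 Collaborative Filtering Method
4. Validation
4.1 Description of the Data
4.2 Demonstration of the Recommendation Procedure
4.2.1 Preprocessing
4.2.2 Generating Preference Neighborhoods and Recommendation Lists
4.3 Validation and Evaluation
4.3.1 Evaluation Metrics
4.3.2 Evaluation
4.3.3 Comparison with Other Methods
4.4 Analysis of Results and Improvements
5. Conclusions and Future Work
5.1 Conclusions
5.2 Directions for Future Research
References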
