National Digital Library of Theses and Dissertations in Taiwan

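Graduate student: 陸坤義
Thesis title: 應用分層隨機抽樣和動差保留法採掘重要關聯規則之方法
Thesis title (English): Mining Interesting Association Rules by Stratified Sampling and Moment-Preserving Thresholding
Advisor: 許玟斌
Degree: Master's
Institution: Tunghai University (東海大學)
Department: Department of Information Engineering and Science (資訊工程與科學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2003
Graduation academic year:
Language: Chinese
Number of pages:
Keywords (Chinese): 分層隨機抽樣、動差保留法、高頻項目集、關聯規則、Apriori演算法
Keywords (English): stratified sampling, moment-preserving thresholding approach, frequent itemsets, association rules, Apriori algorithm
Usage statistics:
Cited by: 4
Views: 328
Rating:
Downloads: 48
Bookmarked: 2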

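Data mining is currently a very active research area, and the question of how to generate association rules quickly has been widely discussed and studied. Generating association rules involves two phases. In the first phase, frequent itemsets whose support exceeds a user-specified threshold are found in the transaction database. In the second phase, the frequent itemsets are used to generate association rules with high confidence. Because databases are very large, generating frequent itemsets through repeated database access is time-consuming, so an efficient and effective algorithm is needed to reduce the cost. This study therefore builds on stratified sampling and moment-preserving thresholding, with the aim of reducing the time needed to mine frequent itemsets and association rules from a database. When computing itemset support, many algorithms [5][11][12][16] ignore purchase quantities, which introduces error into the support values and in turn reduces the usefulness of the resulting frequent itemsets. This study therefore uses the purchase quantity as a weight when computing support during frequent itemset mining, to avoid that error and improve decision support. The proposed algorithm has five steps. Step 1: generate a transaction database with a simulation program. Step 2: classify each transaction by profit using moment-preserving thresholding. Step 3: draw a sufficient sample from the classes produced in the previous step by stratified random sampling. Step 4: scan the sample with the Apriori algorithm to mine frequent itemsets. Step 5: generate association rules from the frequent itemsets. Simulation results show that the proposed algorithm mines frequent itemsets from the database effectively and quickly, and that the resulting association rules are more useful to decision makers.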
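The idea of weighting support by purchase quantity can be illustrated with a small Python sketch. The abstract does not give the thesis's actual formula, so the weighting below is an assumption made only for illustration: an itemset contributes the number of complete "bundles" bought in a transaction (the smallest quantity among its items), normalized by the largest item quantity in each transaction. The function names and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity (assumed non-empty)


def classic_support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions containing every item, ignoring quantities."""
    hits = sum(1 for t in transactions if all(t.get(i, 0) > 0 for i in itemset))
    return hits / len(transactions)


def bundle_count(itemset: FrozenSet[str], transaction: Transaction) -> int:
    """Complete copies of the itemset in one transaction (0 if an item is missing)."""
    return min(transaction.get(i, 0) for i in itemset)


def quantity_support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Quantity-weighted support (illustrative definition, not the thesis's formula).

    Numerator: bundles of the itemset summed over all transactions.
    Denominator: largest single-item quantity summed over all transactions,
    which keeps the result between 0 and 1.
    """
    total = sum(max(t.values()) for t in transactions)
    return sum(bundle_count(itemset, t) for t in transactions) / total


def quantity_confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
                        transactions: List[Transaction]) -> float:
    """Confidence of antecedent -> consequent using the weighted support."""
    base = quantity_support(antecedent, transactions)
    both = quantity_support(antecedent | consequent, transactions)
    return both / base if base else 0.0


# Hypothetical sample data: one large basket dominates the quantities.
transactions = [
    {"milk": 1, "bread": 1},
    {"milk": 5, "bread": 4},
    {"beer": 1},
    {"beer": 1, "milk": 1},
]
pair = frozenset({"milk", "bread"})
print(classic_support(pair, transactions))   # quantities ignored
print(quantity_support(pair, transactions))  # quantities used as weights
```

In this toy data the pair {milk, bread} appears in half of the transactions, but the large second basket raises its quantity-weighted support above that of {beer}, which appears equally often with small quantities.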

Data mining is an important research area, and association rule mining is among its most widely studied problems because of its broad range of applications. Association rule mining calls for an efficient algorithm and can be divided into two phases. Phase 1: mine frequent itemsets from the database. Phase 2: use the frequent itemsets to generate association rules. The proposed algorithm is based on stratified sampling and moment-preserving thresholding. Because it works on a reduced dataset, the proposed algorithm mines association rules efficiently. In addition, the purchase quantities of items are taken into account when counting support, which makes the resulting frequent itemsets and association rules more convincing. The proposed algorithm has five steps. Step 1: generate a transaction database with a simulator. Step 2: classify transactions by profit using moment-preserving thresholding. Step 3: draw a sample database by stratified sampling. Step 4: mine frequent itemsets in the sample database with the Apriori algorithm. Step 5: generate association rules from the frequent itemsets. Simulation results show that the proposed algorithm mines frequent itemsets efficiently and that the association rules generated with the proposed quantitative support counting method are more valuable.
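Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it uses the two-class (bilevel) form of moment-preserving thresholding, which picks a threshold so that a two-valued distribution keeps the first three moments of the profit values, and it allocates the sample proportionally across the two resulting strata. The function names, the proportional allocation, and the sample data are hypothetical.

```python
import math
import random
from typing import List, Sequence


def moment_preserving_threshold(values: Sequence[float]) -> float:
    """Bilevel moment-preserving threshold; values <= threshold form the low class."""
    n = len(values)
    m1 = sum(values) / n
    m2 = sum(v ** 2 for v in values) / n
    m3 = sum(v ** 3 for v in values) / n
    cd = m2 - m1 ** 2  # variance; zero when all values are equal
    if cd == 0:
        raise ValueError("values must not all be equal")
    # Representative values z0 < z1 are the roots of z^2 + c1*z + c0 = 0.
    c0 = (m1 * m3 - m2 ** 2) / cd
    c1 = (m1 * m2 - m3) / cd
    root = math.sqrt(max(c1 ** 2 - 4 * c0, 0.0))
    z0 = (-c1 - root) / 2
    z1 = (-c1 + root) / 2
    p0 = (z1 - m1) / (z1 - z0)  # fraction of data assigned to the low class
    ordered = sorted(values)
    k = min(n - 1, max(0, math.ceil(p0 * n) - 1))
    return ordered[k]  # empirical p0-quantile


def stratified_sample(profits: Sequence[float], fraction: float, seed: int = 0) -> List[int]:
    """Return indices of sampled transactions, drawn proportionally from each stratum."""
    t = moment_preserving_threshold(profits)
    strata = [
        [i for i, p in enumerate(profits) if p <= t],  # low-profit stratum
        [i for i, p in enumerate(profits) if p > t],   # high-profit stratum
    ]
    rng = random.Random(seed)
    chosen: List[int] = []
    for members in strata:
        if members:
            k = max(1, round(fraction * len(members)))  # at least one per stratum
            chosen.extend(rng.sample(members, k))
    return sorted(chosen)


# Hypothetical profits: 75 low-profit and 25 high-profit transactions.
profits = [1] * 75 + [5] * 25
sample = stratified_sample(profits, fraction=0.2, seed=1)
print(len(sample), sum(1 for i in sample if profits[i] == 5))
```

The sampled indices would then be passed to a frequent itemset miner such as Apriori (Step 4). Drawing from each stratum separately keeps the rarer high-profit transactions represented in the sample in proportion to their share of the database.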

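Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Data Mining
1.2 Data Mining Techniques
1.2.1 Market Basket Analysis
1.2.2 Classification
1.2.3 Clustering
1.3 Research Motivation and Objectives
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Apriori Algorithm
2.2 DHP Algorithm
2.3 Pincer-Search Algorithm
2.4 Sampling Algorithm
Chapter 3 Theoretical Framework
3.1 Research Steps
3.2 Theoretical Methods
3.2.1 Sampling Theory
3.2.2 Moment-Preserving Thresholding
3.2.3 Quantitative Support Counting
3.2.4 Quantitative Confidence Counting
Chapter 4 Simulation Experiments
4.1 Experimental Environment
4.2 Performance Evaluation of Traditional Support Counting
4.3 Performance Evaluation of QSC
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References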

[1] A. Tabatabai, “Edge Location and Data Compression for Digital Imagery,” Ph.D. dissertation, School of Elect. Engrg., Purdue University, Dec. 1981.
[2] C. C. Aggarwal and P. S. Yu, “A New Approach to Online Generation of Association Rules,” IEEE Trans. on Knowledge and Data Engineering, Vol. 13, No. 4, pp. 527-540, 2001.
[3] C. C. Aggarwal, C. Procopiuc, and P. S. Yu, “Finding Localized Associations in Market Basket Data,” IEEE Trans. on Knowledge and Data Engineering, Vol. 14, No. 1, pp. 51-62, 2002.
[4] D. I. Lin and Z. M. Kedem, “Pincer-Search: An Efficient Algorithm for Discovering the Maximum Frequent Set,” IEEE Trans. on Knowledge and Data Engineering, Vol. 14, No. 3, pp. 553-556, 2002.
[5] E. Cohen, M. Datar, S. Fujiwara, A. Gionis, P. Indyk, R. Motwani, J. D. Ullman, and C. Yang, “Finding Interesting Associations without Support Pruning,” IEEE Trans. on Knowledge and Data Engineering, Vol. 13, No. 1, pp. 64-78, 2001.
[6] F. Berzal and J. C. Cubero, “TBAR: An Efficient Method for Association Rule Mining in Relational Databases,” Data and Knowledge Engineering, Vol. 37, No. 1, pp. 47-64, 2001.
[7] G. Szego, “Orthogonal Polynomials,” Vol. 23, 4th ed., Amer. Math. Soc., Providence, R.I., 1975.
[8] H. Toivonen, “Sampling Large Databases for Association Rules,” Proc. Int’l Conf. Very Large Data Bases, pp. 134-145, 1996.
[9] J. Han and Y. Fu, “Mining Multiple-Level Association Rules in Large Databases,” IEEE Trans. on Knowledge and Data Engineering, Vol. 11, No. 5, pp. 798-805, 1999.
[10] J. R. Quinlan, “C4.5: Programs for Machine Learning,” Morgan Kaufmann, 1993.
[11] J. R. Quinlan, “Induction of Decision Trees,” Machine Learning, pp. 81-106, 1986.
[12] J. S. Park, M. S. Chen, and P. S. Yu, “An Effective Hash-Based Algorithm for Mining Association Rules,” Proc. ACM SIGMOD Int’l Conf. Management of Data, pp. 175-186, 1995.
[13] L. Breiman, J. Friedman, R. Olshen, and C. Stone, “Classification and Regression Trees,” Wadsworth, 1984.
[14] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proc. Int’l Conf. Very Large Data Bases, pp. 487-499, 1994.
[15] R. T. Ng and J. Han, “Efficient and Effective Clustering Methods for Spatial Data Mining,” Proc. Int’l Conf. Very Large Data Bases, pp. 144-155, 1994.
[16] T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: An Efficient Data Clustering Method for Large Databases,” Proc. ACM SIGMOD Int’l Conf. Management of Data, pp. 103-114, 1996.
[17] V. Ganti, J. Gehrke, and R. Ramakrishnan, “Mining Very Large Databases,” IEEE Computer, Vol. 32, No. 8, pp. 38-45, 1999.
[18] V. Ganti, R. Ramakrishnan, and J. Gehrke, “Clustering Large Datasets in Arbitrary Metric Spaces,” Proc. Int’l Conf. Data Engineering, pp. 502-511, 1999.
[19] W. H. Tsai, “Moment-Preserving Thresholding: A New Approach,” Computer Vision, Graphics, and Image Processing, Vol. 29, pp. 377-393, 1985.
[20] W. Mendenhall, L. Ott, and R. L. Scheaffer, “Elementary Survey Sampling,” Wadsworth, 1986.
