National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

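Graduate student: 沈劭庭
Graduate student (romanized): Shan-Ting Shen
Thesis title (Chinese): 基於網站功能導向建立多類別決策樹以過濾色情網站之研究
Thesis title (English): Filtering Porn Site by Constructing Decision Trees of Multiple Categories Based on Website’s Functional Orientation
Advisors: 楊維邦, 許志堅
Advisors (romanized): Wei-Pang Yang, Jyh-Jian Sheu
Degree: Master's
Institution: National Dong Hwa University (國立東華大學)
Department: Master's Program in Information Management (資訊管理碩士學位學程)
Discipline: Computer science
Subfield: General computer science
Thesis type: Academic thesis
Year of publication: 2011
Graduation academic year: 99 (ROC calendar)
Language: Chinese
Pages: 65
Keywords (translated from Chinese): pornographic web pages; pornographic web page filtering; document classification; decision tree
Keywords (English): porn sites; filtering of porn sites; document classification; decision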
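As the Internet has become widespread, browsing web pages has become one of users' main activities. However, the transparency, interactivity, and openness of the Internet allow pornographic pictures, information, and videos that were previously regarded as improper to be presented openly to users as web pages. Pornographic websites exhibit a variety of elements, such as the sale and download of pornographic discs, the distribution of pornographic pictures and information, and the relaying of sex-trade messages. To reduce the negative influence on underage users and the resulting social cost, this study proposes a filtering mechanism that handles multiple types of pornographic websites. Based on feature values extracted from web documents, pages are divided into four site-orientation categories: blog-oriented, selling-oriented, portal-oriented, and sex-trade-medium-oriented. The ID3 decision tree algorithm is used to build a separate decision tree for each page category, from which association rules are derived. A score is then assigned to each rule for each type of page, and finally the rule scores are used to filter unknown web pages.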
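
The per-category tree building described above can be illustrated with a minimal ID3 sketch. It is not the thesis's implementation: the feature names (`kw_in_title`, `kw_in_meta`), the toy rows, and the `porn`/`legal` labels are hypothetical, chosen only to show how ID3 picks the split with the highest information gain. In the thesis's setting, one such tree would presumably be trained per page category.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def id3(rows, labels, features):
    """Build a tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:          # pure node: stop splitting
        return labels[0]
    if not features:                   # no features left: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = id3([rows[i] for i in keep],
                                [labels[i] for i in keep],
                                [f for f in features if f != best])
    return tree

# Hypothetical toy data: does a keyword appear in a given HTML field?
rows = [
    {"kw_in_title": True,  "kw_in_meta": True},
    {"kw_in_title": True,  "kw_in_meta": False},
    {"kw_in_title": False, "kw_in_meta": True},
    {"kw_in_title": False, "kw_in_meta": False},
]
labels = ["porn", "porn", "legal", "legal"]
tree = id3(rows, labels, ["kw_in_title", "kw_in_meta"])
```

Each root-to-leaf path in the resulting nested dict reads as an if-then rule, which is the kind of rule the abstract says is later scored. How the thesis assigns those scores is not stated in this record, so that step is not sketched here.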

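This study confirms that pages of different types contain pornographic keywords in HTML fields in different proportions, and that the resulting rules differ accordingly. Judging a page with the rules of its own page type yields higher accuracy. In the experiments, the average accuracy reached 97.68%, the average F-measure 97.38%, and precision 98.26%. All filtering metrics show good results, indicating that categorizing pages by type before filtering does benefit filtering performance and supporting the effectiveness of the proposed pornographic web page filtering mechanism.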
The transparency, interactivity, and openness of the Internet have made improper pornographic photos, information, and videos easily accessible to users. To reduce the negative impact on minors, this thesis proposes an efficient filtering mechanism for porn sites. The method divides websites into four categories: blog-oriented, selling-oriented, portal-oriented, and sex-trade-oriented. The ID3 decision tree data mining algorithm is applied to build a decision tree for each category, from which association rules between a page's features and its type (porn or legal) are derived. The rules are then used to filter unknown websites.

According to the experimental results, the filter proposed in this study achieves an average accuracy of 97.68%, an F-measure of 97.38%, and a precision of 98.26%. The results show that differentiating the types of websites before filtering helps identify porn sites.
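
For readers unfamiliar with the reported metrics, the sketch below computes accuracy, precision, recall, and F-measure from confusion-matrix counts. It assumes the standard definitions with "porn" as the positive class and an equally weighted F-measure. The record does not show the thesis's exact formulas (section 4.2), and the counts used here are hypothetical, not the thesis's data.

```python
def filtering_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: porn pages correctly blocked     fp: legal pages wrongly blocked
    tn: legal pages correctly passed     fn: porn pages wrongly passed
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Equal-weight F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical counts for illustration only.
m = filtering_metrics(tp=90, fp=10, tn=80, fn=20)
```

A precision above the F-measure, as in the reported figures, implies under these definitions that recall is the lower of the two.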
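Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
Chapter 2 Literature Review
2.2.2 Document Classification Methods
2.3 Common Porn Site Filtering Mechanisms
2.3.1 Filtering Mechanisms for Pornographic Images
2.3.2 Dynamic and Static Filtering
2.3.3 Other Filtering Mechanisms
2.4 Decision Tree Algorithms
Chapter 3 Research Framework
3.1 Page Categorization Phase
3.1.1 Page Retrieval Module
3.1.2 Feature Extraction Module
3.1.3 Page Categorization Module
3.2 Training Phase
3.2.1 Decision Tree Construction Module
3.2.2 Rule Scoring Module
3.3 Execution Phase
3.3.1 Page Scoring Module
3.3.2 Score Adjustment Module
Chapter 4 Experimental Design
4.1 Dataset Selection
4.2 Filtering Performance Metrics
4.3 Analysis of Experimental Results
4.3.1 Experimental Results
4.3.2 Comparison of Filtering Results
Chapter 5 Conclusions and Future Work
References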