National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record

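Author: 黃鴻儒
Author (English): Hung-Ju Huang
Thesis title: 針對非離散及整合型資料於簡易貝氏分類器上的分析及應用
Thesis title (English): A Study of Naive Bayesian Classifiers for Nondiscrete and Aggregate Data
Advisors: 許鈞南, 李嘉晃
Advisors (English): Chun-Nan Hsu, Chia-Hoang Lee
Degree: Doctoral
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Computer and Information Science (資訊科學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2002
Graduation academic year: 90 (ROC calendar)
Language: English
Pages: 90
Keywords (Chinese): 簡易貝氏分類器; 連續型變數; 區間查詢; 同質集; 語者辨識
Keywords (English): Naive Bayesian classifier; Continuous variable; Interval query; Homologous set; Speaker recognition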
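Abstract: The naive Bayesian classifier is a simple and useful classification tool. It has been widely used in systems with discrete variables, but it runs into difficulties when handling nondiscrete variables such as continuous variables. This thesis discusses how to apply the naive Bayesian classifier to several common types of nondiscrete and aggregate data. For continuous variables, the thesis shows that, in general, discretizing a continuous variable works better than assuming that it follows a normal distribution. The thesis also explains why the various discretization methods proposed in earlier work perform about equally well for the naive Bayesian classifier. Based on this analysis, we propose a method called lazy discretization, which discretizes continuous variables dynamically according to the test data. This method not only discretizes continuous variables effectively, but also enables the naive Bayesian classifier to answer classification queries over set-valued, interval, and multi-interval data. For aggregate data, the thesis examines how to use the naive Bayesian classifier to classify homologous sets. We define a homologous set as one whose samples all come from the same unknown class; data of this kind are encountered in many applications. We examine in depth how to use the knowledge that every sample in a homologous set belongs to the same unknown class to improve the classification accuracy of the naive Bayesian classifier. We propose a method called the homologous naive Bayesian classifier, which extends the naive Bayesian classifier to take an entire homologous set as a single input object. Compared with the commonly used voting method and several other methods, this method is clearly superior, and it performs well even when the number of samples in the homologous set is still small. We have also applied the method successfully to speaker recognition, although its range of application is not limited to speaker recognition systems.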
Abstract (English): Naive Bayes is a simple and useful classification tool. It is most commonly used when all the variables are discrete, because it is difficult for naive Bayes to model complex probability densities over nondiscrete data such as continuous variables. This thesis describes how to use naive Bayes to classify several types of nondiscrete and aggregate data. We show that, in general, discretization of continuous variables can outperform parameter estimation that assumes a normal distribution. Based on our analysis, we explain why a wide variety of well-known discretization methods all perform well, with insignificant differences among them. Our analysis leads to a lazy discretization method, which dynamically discretizes continuous variables according to the test data. This method can be extended to allow naive Bayes to classify set-valued, interval, and multi-interval query data. We also address the problem of how to classify a set of query vectors that belong to the same unknown class. Sets of data known to be sampled from the same class arise in many application domains. We refer to these sets as homologous sets. We show how to take advantage of homologous sets in classification to improve accuracy over classifying each query vector individually. Our method, called homologous naive Bayes (HNB), uses a modified classification procedure that classifies multiple instances as a single unit. Compared with a voting method and several other variants of naive Bayes classification, HNB significantly outperforms these methods on a variety of test data sets, even when the number of query vectors in the homologous sets is small. We also report a successful application of HNB to speaker recognition.
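The contrast the abstract draws between voting and classifying a homologous set as a single unit can be illustrated with a small sketch. This is not the thesis's implementation: it assumes that treating the set as one unit means counting the class prior once and summing the per-vector log-likelihoods over the whole set, which is one natural reading of the abstract. The names `NaiveBayes`, `predict_by_voting`, and `predict_homologous`, the Laplace smoothing, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (illustrative sketch)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_count = Counter(y)
        self.total = len(y)
        n_features = len(X[0])
        # counts[c][j][v] = number of class-c training rows with feature j == v
        self.counts = {c: [Counter() for _ in range(n_features)]
                       for c in self.classes}
        # values[j] = set of distinct values seen for feature j
        self.values = [set() for _ in range(n_features)]
        for row, c in zip(X, y):
            for j, v in enumerate(row):
                self.counts[c][j][v] += 1
                self.values[j].add(v)
        return self

    def log_prior(self, c):
        return math.log(self.class_count[c] / self.total)

    def log_likelihood(self, row, c):
        # Sum of log P(feature_j = v | c), each estimated with add-one smoothing.
        ll = 0.0
        for j, v in enumerate(row):
            num = self.counts[c][j][v] + 1
            den = self.class_count[c] + len(self.values[j])
            ll += math.log(num / den)
        return ll

    def predict(self, row):
        # Ordinary naive Bayes decision for a single query vector.
        return max(self.classes,
                   key=lambda c: self.log_prior(c) + self.log_likelihood(row, c))


def predict_by_voting(nb, rows):
    """Classify each vector on its own, then take the majority label."""
    votes = Counter(nb.predict(r) for r in rows)
    return votes.most_common(1)[0][0]


def predict_homologous(nb, rows):
    """Assumed set-level rule: one prior plus the summed log-likelihoods."""
    def score(c):
        return nb.log_prior(c) + sum(nb.log_likelihood(r, c) for r in rows)
    return max(nb.classes, key=score)


# Toy data (hypothetical): one categorical feature, two classes.
X_train = [["x"], ["x"], ["y"], ["y"], ["y"], ["y"], ["y"], ["z"]]
y_train = ["A", "A", "A", "A", "B", "B", "B", "B"]
nb = NaiveBayes().fit(X_train, y_train)

query_set = [["y"], ["y"], ["x"]]          # assumed to share one unknown class
print(predict_by_voting(nb, query_set))    # prints B
print(predict_homologous(nb, query_set))   # prints A
```

On this toy data the two rules disagree. Each "y" vector, taken alone, leans slightly toward B, so B wins the vote two to one. The joint score favours A, because the single "x" vector is much less likely under B (smoothed probability 1/7) than under A (3/7), and that evidence outweighs the two weak votes. Whether the joint rule is the better choice on real data is an empirical question that the thesis addresses; the sketch only shows how the two decision rules can differ.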
Abstract
Chapter 1. Introduction
Chapter 2. Bayesian Classification
Chapter 3. Classifying Continuous Data
Chapter 4. Lazy Discretization
Chapter 5. Classifying Set and Interval Data
Chapter 6. Classifying Data from the Same Unknown Class
Chapter 7. Conclusions and Future Work
Bibliography
[1] C.-N. Hsu, H.-J. Huang, and T.-T. Wong, "Implications of the Dirichlet Assumption for Discretization of Continuous Attributes in Naive Bayesian Classifiers," Machine Learning.
[2] H.-J. Huang and C.-N. Hsu, "Bayesian Classification for Data from the Same Unknown Class," IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 32, no. 2, Apr. 2002.
[3] C.-N. Hsu, H.-J. Huang, and D. Schuschel, "The ANNIGMA Wrapper Approach to Fast Feature Selection for Neural Nets," IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 32, no. 2, Apr. 2002.
[4] H.-J. Huang and C.-N. Hsu, "Recognizing 100 Speakers using Homologous Naive Bayes," The Seventh Pacific Rim International Conference on Artificial Intelligence, 2002, Tokyo, Japan.
[5] H.-J. Huang and C.-N. Hsu, "Bayesian Classification for Set and Interval Data," International Computer Symposium, 2000, ChiaYi, Taiwan.
[6] C.-N. Hsu, H.-J. Huang, and T.-T. Wong, "Why Discretization Work on Naive Bayes," International Conference on Machine Learning, 2000, Stanford, CA, USA.