臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)
Author: 李櫂安 (LI, CHAO-AN)
Thesis Title: 基於連通域標示法之分群演算法
Thesis Title (English): A Clustering Algorithm Based on Connected Component Labeling
Advisor: 練光祐 (LIAN, KUANG-YOW)
Committee Members: 阮議聰 (JUAN, YEE-JSONG), 邱謙松 (CHIU, CHIAN-SONG), 曾傳蘆 (TSENG, CHWAN-LU), 黃正民 (HUANG, CHENG-MING)
Date of Oral Defense: 2019-07-25
Degree: Master's
University: 國立臺北科技大學 (National Taipei University of Technology)
Department: 電機工程系 (Department of Electrical Engineering)
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis Type: Academic thesis
Year of Publication: 2019
Academic Year: 107 (2018–2019)
Language: Chinese
Pages: 64
Keywords (Chinese): 非監督式分群演算法、資料區域化、數值二值化、連通域標示法
Keywords (English): Unsupervised Clustering Algorithm, Data Regionalization, Thresholding, Connected Component Labeling
Statistics:
  • Cited by: 0
  • Views: 203
  • Rating:
  • Downloads: 4
  • Bookmarked: 0
Abstract: To improve on the shortcomings of existing unsupervised clustering algorithms, this thesis proposes a novel clustering algorithm. Partitional algorithms group data by the distance between each data point and the cluster centers, but many data distributions are not suited to this approach. Density-based algorithms instead examine the neighborhood within a preset radius around each data point and check whether it contains a preset minimum number of points; if so, the neighborhood belongs to the same cluster, and if not, the point is treated as an outlier. Without prior knowledge of the data distribution, however, it is difficult to choose an appropriate radius and minimum point count. This thesis proposes an innovative clustering algorithm based on the concept of connected component labeling. The method first locates the density cores of the clusters through statistical preprocessing such as data regionalization and thresholding, and then applies connected component labeling, which effectively removes connected components with small areas. Instead of measuring distances to cluster centers, distances are measured to all coordinate points of each connected component, and data points lying outside the connected components are assigned to the nearest component. As a result, data can be assigned to the correct clusters even when the data set has a non-convex distribution, is compressed by outliers, has clusters that occupy regions of very different sizes, or has many sparsely distributed points between clusters.
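The pipeline described in the abstract (normalization, regionalization into grid cells, thresholding of cell densities, connected component labeling, removal of small components, and nearest-component assignment) can be illustrated with a short sketch. The following Python code is a minimal two-dimensional illustration written for this record, not the thesis's implementation: the grid size, the minimum component area, the use of Otsu's method on the nonzero cell counts, and the Manhattan distance between grid cells are all illustrative assumptions.

import numpy as np
from collections import deque

def otsu_threshold(values):
    # Split a 1-D sample into "low" and "high" values by maximizing the
    # between-class variance (Otsu's method on a 64-bin histogram).
    hist, edges = np.histogram(values, bins=64)
    centers = (edges[:-1] + edges[1:]) / 2
    total = hist.sum()
    best_t, best_var = centers[0], -1.0
    for i in range(1, len(hist)):
        w0, w1 = hist[:i].sum(), hist[i:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (hist[:i] * centers[:i]).sum() / w0
        m1 = (hist[i:] * centers[i:]).sum() / w1
        var = (w0 / total) * (w1 / total) * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[i]
    return best_t

def label_components(binary):
    # 4-connected component labeling on a boolean grid using BFS.
    labels = np.zeros(binary.shape, dtype=int)
    rows, cols = binary.shape
    current = 0
    for i in range(rows):
        for j in range(cols):
            if binary[i, j] and labels[i, j] == 0:
                current += 1
                labels[i, j] = current
                queue = deque([(i, j)])
                while queue:
                    r, c = queue.popleft()
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr, nc] and labels[nr, nc] == 0):
                            labels[nr, nc] = current
                            queue.append((nr, nc))
    return labels, current

def connected_component_clustering(points, grid=20, min_area=2):
    # 2-D only: illustrative version of the regionalization/CCL pipeline.
    points = np.asarray(points, dtype=float)
    # 1. Normalize each dimension to [0, 1] and map every point to a grid cell.
    mins, maxs = points.min(axis=0), points.max(axis=0)
    span = np.where(maxs > mins, maxs - mins, 1.0)
    cells = np.minimum(((points - mins) / span * grid).astype(int), grid - 1)
    # 2. Regionalization: count how many points fall into each cell.
    counts = np.zeros((grid, grid), dtype=int)
    for r, c in cells:
        counts[r, c] += 1
    # 3. Thresholding: keep only the dense cells (Otsu on the nonzero counts).
    binary = counts >= otsu_threshold(counts[counts > 0])
    # 4. Label the connected components and drop the small ones
    #    (but never drop every component).
    labels, n = label_components(binary)
    sizes = np.bincount(labels.ravel(), minlength=n + 1)[1:]
    if (sizes >= min_area).any():
        for k in np.where(sizes < min_area)[0] + 1:
            labels[labels == k] = 0
    # 5. Assign every point to the component whose labeled cell is nearest
    #    (Manhattan distance between grid cells).
    core = np.argwhere(labels > 0)
    assignment = np.empty(len(points), dtype=int)
    for idx, cell in enumerate(cells):
        nearest = core[np.abs(core - cell).sum(axis=1).argmin()]
        assignment[idx] = labels[nearest[0], nearest[1]]
    return assignment

# Example with two well-separated Gaussian blobs (hypothetical test data).
rng = np.random.default_rng(0)
data = np.vstack([rng.normal([0.0, 0.0], 0.3, size=(200, 2)),
                  rng.normal([3.0, 3.0], 0.3, size=(200, 2))])
print(np.unique(connected_component_clustering(data)))

For the two blobs above, the call should report two surviving component labels; a real application would follow the thesis's own regionalization, filtering, and merging rules rather than these illustrative defaults.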
Abstract (Chinese) i
Abstract (English) ii
Acknowledgements iii
Table of Contents iv
List of Figures vii
List of Tables xi
Chapter 1 Introduction 1
1.1 Research Motivation 1
1.2 Research Objectives 8
1.3 Literature Review 8
1.4 Basic Concepts of Clustering Algorithms 11
1.5 Research Contributions 11
1.6 Thesis Organization 11
Chapter 2 Related Techniques 13
2.1 Clustering Algorithms 13
2.1.1 K-means Clustering 13
2.1.2 PAM Clustering 16
2.1.3 CLARA Clustering 18
2.1.4 DBSCAN Clustering 18
2.1.5 OPTICS Clustering 21
2.2 Mean Filter 23
2.3 Otsu's Method 25
2.4 Connected Component Labeling 26
Chapter 3 System Flow 28
3.1 Overall Architecture 28
3.2 Simplified Flow 30
Chapter 4 Connected Component Clustering Algorithm 32
4.1 Data Normalization 32
4.2 Data Regionalization 33
4.3 Outlier Removal by Mean Filter 36
4.4 Thresholding 37
4.5 Connected Component Labeling 39
4.6 Assigning Each Data Point to the Nearest Region 40
4.7 Merging Nearby Regions 41
Chapter 5 System Optimization 43
5.1 Addressing Insufficient Memory 43
5.2 Reducing Excessive Computation Time 44
5.2.1 Choosing a Suitable Number of Intervals per Dimension 44
5.2.2 Skipping the Outlier-Removal Step 46
Chapter 6 Experimental Results 48
6.1 Two- and Three-Dimensional Data 48
6.2 High-Dimensional Data 55
6.2.1 Experimental Data Sets 55
6.2.2 Experimental Procedure 55
6.2.3 Experimental Results 57
Chapter 7 Conclusion and Future Work 62
7.1 Conclusion 62
7.2 Future Work 62
References 63