# 臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)



### Detailed Record (詳目顯示)


• Cited by: 0
• Views: 52
• Rating: (none)
• Downloads: 13
• Bookmarked: 0
摘要 (translated from Chinese): The main goal of this thesis is to transform real-world data into data that serves well as training data for supervised learning, a process also known as feature selection. Two steps in that process, anomaly detection and dimensionality reduction, are the focus of this work, and I approach them with manifold-learning algorithms. I begin by introducing principal component analysis, multidimensional scaling, split and combine multidimensional scaling, the diffusion map, the diffusion map with local scaling, locally linear embedding, Hessian locally linear embedding, and tangential locally linear embedding, along with preliminary proofs. I then use two smooth manifolds to demonstrate the differences among these methods and the effect of their parameters on the algorithms, and show how anomalies, including outliers and noise, influence the results. Finally, spectral clustering and the k-means algorithm are used to separate the desired data from the anomalies.
Abstract: The main goal of this thesis is to transform real-world data into well-behaved training data for supervised learning, a process known as feature selection. Two key steps in this process, namely anomaly detection and dimensionality reduction, are the focal points of this study, and I use manifold-learning algorithms to carry them out. The thesis begins by introducing principal component analysis, multidimensional scaling, split and combine multidimensional scaling, the diffusion map, the diffusion map with local scaling, locally linear embedding, Hessian locally linear embedding, and tangential locally linear embedding, together with some preliminary proofs. Next, I present two smooth manifolds to show the differences among these methods and the impact of their parameters on the algorithms. I then examine the influence of anomalous values, including outliers and noise, on the algorithms' results. Finally, spectral clustering and k-means clustering are applied to separate the desired data from the anomalies.
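The pipeline the abstract describes (embed a smooth manifold, reduce its dimension, inject anomalies, then split them off with clustering) can be sketched with off-the-shelf tools. This is an illustrative sketch, not the thesis's own code: the swiss-roll dataset, the scikit-learn estimators, the neighbor counts, and the outlier range are all assumptions made for the example.

```python
# Illustrative sketch of the abstract's pipeline (not the thesis code):
# reduce a smooth manifold with PCA and LLE, inject outliers, then try to
# separate inliers from anomalies with spectral clustering and k-means.
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import SpectralClustering, KMeans

rng = np.random.default_rng(0)

# A smooth 2-D manifold embedded in R^3, plus a few far-away outliers.
X, _ = make_swiss_roll(n_samples=500, random_state=0)
outliers = rng.uniform(-40, 40, size=(25, 3))
data = np.vstack([X, outliers])

# Linear reduction (PCA) vs. manifold learning (LLE), both to 2 dimensions.
X_pca = PCA(n_components=2).fit_transform(data)
X_lle = LocallyLinearEmbedding(n_neighbors=12,
                               n_components=2).fit_transform(data)

# Two-cluster separation: ideally inliers vs. injected outliers.
labels_sc = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                               n_neighbors=10,
                               random_state=0).fit_predict(data)
labels_km = KMeans(n_clusters=2, n_init=10,
                   random_state=0).fit_predict(X_pca)

print(X_pca.shape, X_lle.shape)  # (525, 2) (525, 2)
```

A neighborhood-graph affinity is used for the spectral step because, as the abstract's comparison suggests, distance-based clustering on a linear (PCA) embedding often fails to isolate outliers that a graph-based method can separate.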
Table of Contents:

- 摘要 (Chinese Abstract)
- Abstract
- Table of Contents
- List of Figures
- 1 Introduction
- 2 Methodology
  - 2.1 Notation
  - 2.2 Principal Component Analysis
  - 2.3 Multidimensional Scaling
    - 2.3.1 Classical Multidimensional Scaling
    - 2.3.2 Split and Combine Multidimensional Scaling
  - 2.4 Diffusion Map
    - 2.4.1 Connect to MDS
    - 2.4.2 Local Scale
  - 2.5 Locally Linear Embedding
    - 2.5.1 Original Method
    - 2.5.2 Hessian Method
    - 2.5.3 Tangential Locally Linear Embedding
- 3 Numerical Experiment
  - 3.1 Data
  - 3.2 PCA and SCMDS
    - 3.2.1 Special Result of SCMDS
    - 3.2.2 Result of PCA
  - 3.3 Diffusion Map
    - 3.3.1 Diffusion Map and Local Scale
  - 3.4 LLE
    - 3.4.1 Cost and Neighbors
- 4 Anomaly Detection
  - 4.1 Data with Noise
    - 4.1.1 Outlier
    - 4.1.2 Noise
  - 4.2 Spectral Clustering
    - 4.2.1 K-means Clustering
    - 4.2.2 PCA and LLE
    - 4.2.3 Diffusion Map
- 5 Conclusion and Future Work
- Reference
- Appendix A

1. On Some Nonlinear Dimensionality Reduction Methods and Their Improvements
2. Nonlinear Visualization of Massive Data Using Local Landmarks and Anisotropic Diffusion Maps
3. Short-Term Electricity Load Forecasting for Special Days Based on Diffusion-Map Clustering
4. High-Dimensional Anomaly Detection Using Principal Component Analysis and Spectral Clustering
5. An Active Learning Algorithm Based on Random Walks on Manifold Structures
6. License Plate Image Super-Resolution Based on Local Learning

No related journal articles.

1. Nonlinear Visualization of Massive Data Using Local Landmarks and Anisotropic Diffusion Maps
2. Operator Learning with Neural Networks
3. Strategies for Incremental Learning
4. Forecasting Taiwan's Confirmed Case Counts with the Neural ODE Family
5. Operator Learning with Neural Networks
6. Real-Time Update Methods for Dimensionality Reduction Algorithms
7. Function Approximation with Neural Networks
8. Analyzing Arterial Blood Pressure Waveforms with Manifold Learning
9. Numerical Simulation of the Cahn-Hilliard Equation on a Disk
10. Deep-Learning Lung Tumor Segmentation Based on Optimal Mass Transport Preprocessing
11. Development and Identity of the Chiang Lineage in the Taipei Basin: A Study Centered on the Chiang Pu-ting Family of the Baijie Plain
12. Studies on Chemical Reactions of Novel Amino Acids and Their Drug-Delivery Applications
13. A Study of the Association Between HLA Class I and Migraine Chronification and Medication Overuse in Migraine Patients
14. A Study of Marine Animal Sting and Envenomation Cases Reported to the Taiwan Poison Control Center: Causative Species, Clinical Symptoms, Severity, and Associated Sea Temperatures
15. A Numerical Study of Image Compression
