The goal of this work is to turn real-world data into training data that is well suited to supervised learning, a process also known as feature selection. I focus on two steps of that process, anomaly detection and dimensionality reduction, and approach both with manifold learning algorithms. I first introduce principal component analysis (PCA), multidimensional scaling (MDS), split and combine multidimensional scaling (SCMDS), diffusion map, diffusion map with local scaling, locally linear embedding (LLE), Hessian locally linear embedding, and tangential locally linear embedding, and give preliminary proofs. Next, I use two smooth manifolds to show how these methods differ and how their parameters affect the algorithms. I then examine how anomalies, including both outliers and noise, affect the results of the algorithms. Finally, I apply spectral clustering and K-means clustering to try to separate the desired data from the anomalies.
Table of Contents

摘要 (Abstract in Chinese)
Abstract
Table of Contents
List of Figures
1 Introduction
2 Methodology
  2.1 Notation
  2.2 Principal Component Analysis
  2.3 Multidimensional Scaling
    2.3.1 Classical Multidimensional Scaling
    2.3.2 Split and Combine Multidimensional Scaling
  2.4 Diffusion map
    2.4.1 Connect to MDS
    2.4.2 Local Scale
  2.5 Locally Linear Embedding
    2.5.1 Original method
    2.5.2 Hessian method
    2.5.3 Tangential Locally Linear Embedding
3 Numerical Experiment
  3.1 Data
  3.2 PCA and SCMDS
    3.2.1 Special result of SCMDS
    3.2.2 Result of PCA
  3.3 Diffusion Map
    3.3.1 Diffusion Map and Local Scale
  3.4 LLE
    3.4.1 Cost and Neighbors
4 Anomaly Detection
  4.1 Data with Noise
    4.1.1 Outlier
    4.1.2 Noise
  4.2 Spectral Clustering
    4.2.1 K-means Clustering
    4.2.2 PCA and LLE
    4.2.3 Diffusion Map
5 Conclusion and future work
Reference
Appendix A
