Dimensionality reduction is the process of reducing the number of variables in a dataset, ideally down to its intrinsic dimension, while retaining meaningful properties of the original data. It is usually a data preprocessing step before model training in data science. Specifically, it can be used for data visualization, cluster analysis, noise reduction, or as an intermediate step to facilitate other studies. In this thesis, we briefly present the derivations of two linear dimensionality reduction methods, principal component analysis and linear discriminant analysis, and of several nonlinear dimensionality reduction methods, including multidimensional scaling, isometric mapping, diffusion maps, the Laplacian eigenmap, locally linear embedding, and kernel PCA. Furthermore, we propose modifications to the Laplacian eigenmap and diffusion maps with the help of geodesic distances, and we present a method for selecting the target dimension of the reduction. Finally, we perform numerical experiments and compare the various dimensionality reduction techniques.