NCU Institutional Repository - theses and dissertations, past exam papers, journal articles, and research projects for download: Item 987654321/93100


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93100


    Title: 基於自注意力與擬合平面感知局部幾何之三維點雲分類網路;PointGPS - A 3D Point Cloud Classification Network Aware Local Geometry from Fitting Plane and Self-Attention
    Authors: 劉晉丞;Liu, Jin-Cheng
    Contributors: Department of Computer Science and Information Engineering
    Keywords: 3D Point Clouds
    Date: 2023-07-17
    Issue Date: 2024-09-19 16:42:06 (UTC+8)
    Publisher: National Central University
    Abstract: Three-dimensional point cloud data differs from ordinary two-dimensional images in data storage, data characteristics, and classification network architecture. 3D scanning devices are becoming more common, for example structured-light scanners in smartphones and the radar or lidar systems fitted to new-generation cars. As the volume of 3D point cloud data grows, conventional 2D classification architectures are not enough, and more accurate networks designed specifically for 3D point clouds are needed. PointNet was the first effective and accurate model for 3D point cloud classification, but it considers only the global features of the point cloud. PointNet++ overcomes this limitation by incorporating simple local features. To help a 3D point cloud classification network learn finer local geometric structure, this thesis proposes PointGPS, a 3D point cloud classification network architecture that perceives local geometric features through self-attention and plane fitting.

    The proposed architecture is based on PointMLP, which is in turn based on PointNet++. PointMLP improves accuracy by using residual MLP (Multi-Layer Perceptron) modules. The network begins with an embedding module that lifts the point cloud into higher-dimensional features. The features then pass through four stages, each consisting of a geometric feature mapping module with a feature extraction module before and after it. The geometric feature mapping module extracts features from the point cloud: farthest point sampling halves the number of points at each stage, and the neighbors around each sampled point are selected. The difference between each neighbor's high-dimensional features and those of the sampled point, together with the sampled point's own high-dimensional features, is used to train the model. In this module, we use Singular Value Decomposition (SVD) to fit a plane to the neighbors and self-attention to compute finer local geometric structure features. The feature extraction modules before and after it are MLP modules with residual connections. Finally, a max pooling layer reduces the features, which are then passed to a classifier made up of fully connected layers, batch normalization layers, and activation functions, with random dropping of weights (dropout) to improve the model's generalization to unseen data. With these design choices, the proposed model achieves a substantial improvement in accuracy.
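
    The sampling, grouping, and plane-fitting steps described in the abstract can be illustrated with a rough NumPy sketch. This is not the thesis's implementation. The function names, the brute-force neighbor search, the use of point 0 as the sampling seed, the neighbor-minus-center direction of the feature difference, and the concatenation of the two feature sets are all assumptions made for illustration.

```python
import numpy as np


def farthest_point_sampling(points, n_samples):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen.

    points: (n, 3) array. Returns n_samples indices into points.
    The seed (index 0) is an arbitrary, assumed choice.
    """
    n = points.shape[0]
    chosen = np.zeros(n_samples, dtype=np.int64)
    min_dist = np.full(n, np.inf)
    for i in range(1, n_samples):
        # Squared distance from every point to the most recently chosen point.
        diff = points - points[chosen[i - 1]]
        min_dist = np.minimum(min_dist, np.einsum("ij,ij->i", diff, diff))
        chosen[i] = int(np.argmax(min_dist))
    return chosen


def knn_indices(points, centers, k):
    """Indices of the k nearest points to each center (brute force, assumed)."""
    d2 = ((centers[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]


def group_features(feats, center_idx, nbr_idx):
    """Pair each neighbor's feature offset with the center's own features.

    feats: (n, c), center_idx: (m,), nbr_idx: (m, k). Returns (m, k, 2c).
    Neighbor-minus-center and concatenation are assumptions of this sketch.
    """
    center = feats[center_idx]                     # (m, c)
    rel = feats[nbr_idx] - center[:, None, :]      # (m, k, c)
    rep = np.broadcast_to(center[:, None, :], rel.shape)
    return np.concatenate([rel, rep], axis=-1)


def fit_plane_svd(neighbors):
    """Least-squares plane through a (k, 3) neighborhood via SVD.

    Returns the centroid, a unit normal (sign is arbitrary), and the
    singular values of the centered neighborhood.
    """
    centroid = neighbors.mean(axis=0)
    _, s, vt = np.linalg.svd(neighbors - centroid, full_matrices=False)
    normal = vt[-1]  # right singular vector with the smallest singular value
    return centroid, normal, s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((1024, 3))            # toy point cloud
    feats = rng.random((1024, 16))         # stand-in for embedded features
    idx = farthest_point_sampling(pts, 512)  # halve the point count
    nbrs = knn_indices(pts, pts[idx], 8)
    grouped = group_features(feats, idx, nbrs)
    centroid, normal, s = fit_plane_svd(pts[nbrs[0]])
    print(grouped.shape, normal, s)
```

    The right singular vector with the smallest singular value is the direction of least variance in the neighborhood, so it serves as the normal of the fitted plane. The size of the smallest singular value relative to the others shows how flat the neighborhood is. How PointGPS turns these quantities into network features is not stated in the abstract and is not modeled here.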
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
