NCU Institutional Repository (中大機構典藏): Item 987654321/95790


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/95790


    Title: 彩色手套影像下基於 EANet 的手部姿態預測方法;A Hand Pose Estimation Method Based on EANet with Colored Glove Images
    Authors: 徐嘉彤;Hsu, Chia-Tung
    Contributors: 資訊工程學系;Department of Computer Science and Information Engineering
    Keywords: 深度學習;電腦視覺;電腦圖學;影像處理;3D 手部姿態辨識;Deep Learning;Computer Vision;Computer Graphics;Image Processing;3D Hand Pose Estimation
    Date: 2024-08-12
    Issue Date: 2024-10-09 17:17:00 (UTC+8)
    Publisher: 國立中央大學;National Central University
    Abstract: 台灣的聽障人口數超過 13 萬人,手語是這些人的主要溝通方式。對於手語翻譯以及手語辨識等應用,準確的手部姿態預測模型至關重要。然而,由於雙手的互動與手部的遮擋,此任務對單一鏡頭的 RGB 影像是一大挑戰。因此,本研究旨在提升雙手手語場景的手部姿態預測結果。
    本論文提出了一種應用 Extract-and-adaptation network(EANet)與彩色手套的手部姿態預測方法,並針對彩色手套手語影像進行優化。我們使用將資料集渲染成彩色手套的方式增加手指的資訊,並採用基於 Transformer 架構的 EANet 進行模型訓練,再使用多種影像處理技術來優化手部關鍵點的預測結果。實驗結果顯示,該方法在彩色手套手語資料集上完整偵測雙手的穩定性高於 MediaPipe 55%,亦在測試資料集中得到比使用原始資料集訓練的 EANet 更好的結果。
    ;With over 130,000 hearing-impaired individuals in Taiwan, sign language serves as their primary mode of communication. Accurate hand pose estimation models are crucial for applications such as sign language translation and recognition. However, due to interactions between the two hands and mutual occlusion, this task poses a significant challenge for single-camera RGB images. This study therefore aims to improve hand pose estimation in two-hand sign language scenarios.
    This thesis proposes a hand pose estimation method that applies the Extract-and-adaptation network (EANet) together with colored gloves, optimized for sign language images captured with colored gloves. We enrich per-finger information by rendering the dataset with colored gloves, train a Transformer-based EANet on the rendered data, and then apply several image processing techniques to refine the predicted hand keypoints. Experimental results show that, on the colored-glove sign language dataset, our method is 55% more stable than MediaPipe at detecting both hands completely, and it also outperforms an EANet trained on the original dataset on the test set.
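    The abstract's core idea is that rendering gloves with a distinct color per finger injects an extra visual cue that a pose estimator can exploit. As a minimal illustrative sketch of how such a cue could be read back out of an image, the snippet below labels a pixel by its nearest glove color. The palette values, distance threshold, and function names are illustrative assumptions, not taken from the thesis:

    ```python
    # Hypothetical sketch: nearest-color labeling of glove pixels.
    # The per-finger palette below is an illustrative assumption; the
    # thesis does not specify its glove colors here.
    FINGER_COLORS = {
        "thumb":  (255, 0, 0),
        "index":  (0, 255, 0),
        "middle": (0, 0, 255),
        "ring":   (255, 255, 0),
        "pinky":  (255, 0, 255),
    }

    def label_pixel(rgb, palette=FINGER_COLORS, max_dist=120.0):
        """Return the finger whose glove color is nearest to `rgb` in
        Euclidean RGB distance, or None if no palette color is within
        `max_dist` (treated as background/skin)."""
        best, best_d = None, float("inf")
        for finger, ref in palette.items():
            d = sum((a - b) ** 2 for a, b in zip(rgb, ref)) ** 0.5
            if d < best_d:
                best, best_d = finger, d
        return best if best_d <= max_dist else None
    ```

    A real pipeline would apply this kind of segmentation per pixel (or learn the cue implicitly inside the network, as EANet presumably does), but the sketch shows why distinct glove colors disambiguate fingers even under the two-hand occlusion the abstract describes.
    
    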
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File        Description    Size    Format
    index.html                 0Kb     HTML    View/Open


    All items in NCUIR are protected by copyright, with all rights reserved.

