NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/95463


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/95463


    Title: 基於雙重注意力機制與人臉強化機制之人體姿態遷移; Enhancing Human Pose Transfer with Attention Mechanisms, Convolutional Block Attention Module and Facial Loss Optimization
    Author: 江俊辰; Chiang, Chun-Chen
    Contributor: Department of Computer Science and Information Engineering
    Keywords: Pose Transfer; Generative Adversarial Network
    Date: 2024-07-11
    Upload time: 2024-10-09 16:52:57 (UTC+8)
    Publisher: National Central University
    Abstract: Techniques for synthesizing objects and scenes apply to computer graphics, image reconstruction, photography, and the generation of visual data. View synthesis often faces challenges such as occlusion, lighting changes, and geometric distortion. These problems are especially pronounced for deformable subjects such as humans, and they significantly increase the complexity of view synthesis.
    In modern society, sports and dance not only improve physical health and quality of life but also serve as avenues for personal charm and artistic expression. For non-professionals, improving skills efficiently during leisure time is a major challenge. Pose transfer, a deep learning technique that transfers a person's pose to a provided reference movement, offers an innovative solution. It lets teachers and students compare movement differences intuitively, so that movements can be learned and corrected effectively even without an instructor present. This thesis presents a pose transfer system: the user supplies a reference image and selects one of the poses offered by the system, and the system automatically generates images of the person performing the corresponding movements, which the user can download locally as a video.
    Architecturally, our model builds on the Multi-scale Attention Guided Pose Transfer (MAGPT) model. We modify its Residual Block by adding a Convolutional Block Attention Module (CBAM) and changing the activation function from ReLU to Mish, in order to capture more features related to clothing and skin color. In addition, because the faces in images generated by the original architecture differ from those in the source image, we propose two facial feature loss functions, each of which helps the model learn more precise image features. With this architecture, a single reference image is enough to generate videos of the user performing different movements.
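To make the architectural change in the abstract concrete, the following is a minimal NumPy sketch of a residual block that uses Mish activations and applies CBAM (channel attention followed by spatial attention) to the residual branch. It is not the thesis's implementation: the MAGPT internals are not reproduced, normalization layers are omitted, and the function names, weight shapes, reduction ratio, and the exact placement of CBAM inside the block are all assumptions made for illustration. The shared MLP inside CBAM keeps ReLU, as in the usual CBAM formulation; whether the thesis changes that as well is not stated.

```python
import numpy as np

def mish(x):
    # Mish(x) = x * tanh(softplus(x)); logaddexp(0, x) is a stable softplus.
    return x * np.tanh(np.logaddexp(0.0, x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, w):
    """Naive zero-padded 'same' convolution.
    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k."""
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel tap (i, j).
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + h, j:j + wd])
    return out

def channel_attention(x, w1, w2):
    """CBAM channel attention: a shared MLP over avg- and max-pooled vectors."""
    avg = x.mean(axis=(1, 2))  # (C,)
    mx = x.max(axis=(1, 2))    # (C,)

    def mlp(v):
        # C -> C/r -> C with ReLU in between (assumed, as in standard CBAM).
        return w2 @ np.maximum(w1 @ v, 0.0)

    scale = sigmoid(mlp(avg) + mlp(mx))
    return x * scale[:, None, None]

def spatial_attention(x, w_sp):
    """CBAM spatial attention: 7x7 conv over channel-wise avg and max maps."""
    desc = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    attn = sigmoid(conv2d_same(desc, w_sp))           # (1, H, W)
    return x * attn

def residual_block_cbam(x, p):
    """Hypothetical residual block: conv -> Mish -> conv -> CBAM -> add -> Mish."""
    y = mish(conv2d_same(x, p["conv1"]))
    y = conv2d_same(y, p["conv2"])
    y = channel_attention(y, p["mlp1"], p["mlp2"])
    y = spatial_attention(y, p["spatial"])
    return mish(x + y)

def init_params(channels, reduction=4, seed=0):
    # Random weights for illustration only; shapes are assumptions.
    rng = np.random.default_rng(seed)
    c, r = channels, channels // reduction
    return {
        "conv1": rng.normal(0, 0.1, (c, c, 3, 3)),
        "conv2": rng.normal(0, 0.1, (c, c, 3, 3)),
        "mlp1": rng.normal(0, 0.1, (r, c)),
        "mlp2": rng.normal(0, 0.1, (c, r)),
        "spatial": rng.normal(0, 0.1, (1, 2, 7, 7)),
    }

x = np.random.default_rng(1).normal(size=(8, 16, 16))
out = residual_block_cbam(x, init_params(8))
print(out.shape)  # (8, 16, 16)
```

The block preserves the input shape, so it can stand in for a plain residual block of the same width. Unlike ReLU, Mish is smooth and lets small negative values pass, which is one common motivation for the swap; the abstract itself attributes the benefit to richer clothing and skin-color features.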
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     28


    All items in NCUIR are protected by the original copyright.
