具全局及局部特徵自注意力機制之高效通用型影像去模糊神經網路;An Efficient Universal Image Deblurring Neural Network with Global and Local Feature Self-Attention

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/95535

Title:	具全局及局部特徵自注意力機制之高效通用型影像去模糊神經網路;An Efficient Universal Image Deblurring Neural Network with Global and Local Feature Self-Attention
Authors:	林羿恒;Lin, Yz-Heng
Contributors:	Department of Electrical Engineering
Keywords:	Image Deblurring;Deep Learning;Feature Self-Attention Mechanism;Self-Attention;Transformer
Date:	2024-03-14
Issue Date:	2024-10-09 16:59:43 (UTC+8)
Publisher:	National Central University
Abstract:	Image deblurring has long been an important low-level computer vision task, in everyday and medical applications alike. Its goal is to restore a degraded image to its original clear, sharp state. Traditional deblurring algorithms fail to reconstruct the details of the original image when it is degraded by complex blur such as camera shake or defocus. Inspired by the success of Vision Transformer models across a range of tasks, this thesis proposes a new architecture that uses a feature self-attention mechanism to capture and remove blur features at both global and local scales. To reduce computational cost, many Vision Transformer methods split the image into multiple windows and model relationships only within each window. This restricts information exchange between windows and limits overall performance. We therefore propose a transformer module that partitions the image into horizontal and vertical stripes to capture long-range blur features and uses windows to capture short-range blur features. To further enlarge the transformer's receptive field, we also propose an effective algorithm that computes correlations between different windows. To evaluate the model's effectiveness and generality, we ran experiments on several general-image datasets and MRI datasets. The results show that, compared with several state-of-the-art image deblurring methods, the proposed architecture is competitive in restoration quality and achieves higher Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM).
Appears in Collections:	[Graduate Institute of Electrical Engineering] Electronic Thesis & Dissertation
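
The abstract describes attention computed within horizontal stripes, vertical stripes, and local windows. The following is a minimal NumPy sketch of that partitioning idea only, not the thesis's implementation. The function names, the stripe and window sizes, and the use of plain scaled dot-product attention with no learned projections are all assumptions made for illustration. The cross-window correlation algorithm is not specified in the abstract and is not shown.

```python
import numpy as np

def horizontal_stripes(x, stripe):
    """Group an (H, W, C) feature map into H//stripe full-width stripes.

    Each group holds stripe*W tokens, so attention inside a group
    spans the whole image width (long-range along rows).
    """
    H, W, C = x.shape
    assert H % stripe == 0, "height must be divisible by stripe size"
    return x.reshape(H // stripe, stripe * W, C)

def vertical_stripes(x, stripe):
    """Group an (H, W, C) feature map into W//stripe full-height stripes."""
    H, W, C = x.shape
    assert W % stripe == 0, "width must be divisible by stripe size"
    x = x.reshape(H, W // stripe, stripe, C).transpose(1, 0, 2, 3)
    return x.reshape(W // stripe, H * stripe, C)

def windows(x, win):
    """Group an (H, W, C) feature map into non-overlapping win x win windows."""
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0, "size must be divisible by window"
    x = x.reshape(H // win, win, W // win, win, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, win * win, C)

def self_attention(tokens):
    """Scaled dot-product self-attention within each group.

    Assumption: queries, keys, and values are the tokens themselves;
    a real transformer block would apply learned projections first.
    tokens has shape (groups, tokens_per_group, channels).
    """
    d = tokens.shape[-1]
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

# Hypothetical 8x8 feature map with 4 channels.
feat = np.random.default_rng(0).standard_normal((8, 8, 4))
long_range_rows = self_attention(horizontal_stripes(feat, 2))
long_range_cols = self_attention(vertical_stripes(feat, 2))
short_range = self_attention(windows(feat, 4))
```

In this sketch the stripe groups cover a full row or column extent of the image, while the window groups stay local. That is the global/local split the abstract refers to. How the three outputs are combined is left open here because the abstract does not say.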
