基於深度學習之文件影像陰影偵測及去除演算法;A Deep Learning-based Algorithm for Shadow Detection and Removal from Document Images

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93396

Title:	基於深度學習之文件影像陰影偵測及去除演算法;A Deep Learning-based Algorithm for Shadow Detection and Removal from Document Images
Author:	王譽鈞;Wang, Yuh-Jiun
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Deep Learning;Shadow Detection;Shadow Removal;cGAN;OCR
Date:	2023-08-01
Upload time:	2024-09-19 16:57:09 (UTC+8)
Publisher:	National Central University
Abstract:	As technology advances, almost everyone owns a smartphone and often uses it to photograph documents to record important information. During capture, however, light is frequently blocked by an object such as the photographer's hand or the phone itself, which casts unwanted shadows on the image. These shadows degrade the visual quality of the image and can make the text hard to read. To avoid them, users must shoot from a specific angle, or afterwards select the shadow areas in an image editing tool and adjust brightness and hue by hand, which is time-consuming and often gives unsatisfactory results. Document images are harder still, because the shadows must be removed while the text stays legible. This thesis proposes a deep learning-based algorithm that detects and removes shadows from document images. First, a conditional Generative Adversarial Network (cGAN) is trained to locate the shadow regions in an image and produce a shadow mask. Next, the main background color is estimated separately for the shadow and non-shadow regions. These colors are combined with the brightness information of the input image and the shadow mask from the first stage, and a second cGAN generates the restored, shadow-free image from this input. In the experiments, the proposed method removes shadows while keeping the text readable. Compared with the unprocessed input images, both PSNR and SSIM improve, and Optical Character Recognition (OCR) accuracy increases substantially.
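The abstract does not say how the main background color of each region is found, so the following is only an illustrative sketch of that intermediate step, not the thesis's implementation. It assumes the shadow mask is already available as a binary grid, takes the most populated quantized RGB bin as the "main" color, and uses the HSV value channel as a stand-in for brightness. The names dominant_color, background_colors, brightness, and the bin width step are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)


def dominant_color(pixels: Sequence[Pixel], step: int = 16) -> Pixel:
    """Return the centre of the most populated quantized RGB bin.

    Assumption: paper background covers more pixels than ink, so the
    histogram mode approximates the background color of the region.
    """
    if not pixels:
        raise ValueError("region is empty")
    bins = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    (qr, qg, qb), _count = bins.most_common(1)[0]
    half = step // 2
    return (qr * step + half, qg * step + half, qb * step + half)


def background_colors(
    image: List[List[Pixel]], mask: List[List[int]]
) -> Tuple[Pixel, Pixel]:
    """Split pixels by the shadow mask (1 = shadow, 0 = lit) and return
    (shadow background color, non-shadow background color)."""
    shadow: List[Pixel] = []
    lit: List[Pixel] = []
    for image_row, mask_row in zip(image, mask):
        for pixel, is_shadow in zip(image_row, mask_row):
            (shadow if is_shadow else lit).append(pixel)
    return dominant_color(shadow), dominant_color(lit)


def brightness(pixel: Pixel) -> int:
    """HSV value channel, used here as an assumed brightness measure."""
    return max(pixel)
```

The two colors, a per-pixel brightness map, and the mask could then be stacked as input channels for a second-stage generator. The exact input encoding used in the thesis is not given in the abstract.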
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses

All items in NCUIR are protected by the original copyright.
