使用YOLO架構在標準環境中進行動態舌頭影像偵測及切割;Detection and Segmentation of Dynamic Tongue Images Using YOLO Technique in a Standardized Environment


Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/85765

Title:	使用YOLO架構在標準環境中進行動態舌頭影像偵測及切割;Detection and Segmentation of Dynamic Tongue Images Using YOLO Technique in a Standardized Environment
Authors:	王珅愷;Wang, Shen-Kai
Contributors:	Institute of Biomedical Engineering
Keywords:	Image detection;Segmentation;YOLO;Detection
Date:	2021-08-10
Issue Date:	2021-12-07 11:22:38 (UTC+8)
Publisher:	National Central University
Abstract:	This study uses YOLOv4 to achieve real-time detection and segmentation of dynamic images, with tongue feature recognition as the demonstration target. Tracking a moving tongue is challenging because the pixel distributions of the surrounding lips and cheeks closely resemble that of the tongue, which is why I chose it as the test case for the detection method developed in this thesis. On the biomedical side, the work was carried out in cooperation with traditional Chinese medicine physicians and the smart healthcare laboratory at Landseed International Hospital (聯新國際醫院) in Taoyuan: the segmented, "de-identified" tongue images are handed to the physicians, who apply tongue diagnosis to identify symptoms and to label data for the algorithm. YOLOv4 differs from the R-CNN-based methods that are currently popular in computer vision. R-CNN-based methods first predict candidate locations for multiple objects and then classify the target at each location in turn; this yields high accuracy at the cost of high computational complexity and long running time. YOLOv4 instead encodes the input image into a representation that carries both class and location information before predicting, so a single prediction yields both the class and the location of each object, and computation time drops substantially. YOLOv4 is also markedly more accurate than YOLOv3. According to the YOLOv4 literature, on a Tesla V100 GPU YOLOv4 reaches 41.2% AP (average precision) at 54 FPS (frames per second), whereas on the same hardware R-CNN reaches 42.8% AP at only 9 FPS. I therefore adopted YOLOv4 as the base architecture for real-time image detection and segmentation. Beyond using YOLOv4 to locate the object, I use the coordinates it outputs to perform edge detection and segmentation. To detect the tongue at the angle required for Chinese medicine examination, I provide a standardized image acquisition box, add negative samples to the YOLO training, and construct a YOLOv4 variant with a double backbone structure. Preliminary results show 7 to 10 FPS on Windows, compiled with Visual Studio, on a GTX 1050 Ti with 16 GB of RAM. The detections better match the tongue angles that the physicians require, because images with skewed angles are removed from the data by the negative-sample backbone. In terms of accuracy, the confidence scores reported by YOLOv4 all exceed 90%.
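The speed and accuracy figures quoted in the abstract can be put side by side with a little arithmetic. The Python sketch below is illustrative only and is not taken from the thesis: the `Benchmark` class and its field names are hypothetical, and the only inputs are the FPS and AP values stated in the abstract. It converts FPS into per-frame processing time and computes the throughput ratio and AP gap between the two detectors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """One detector's reported figures (hypothetical helper, not from the thesis)."""
    name: str
    fps: float         # frames per second, as quoted in the abstract
    ap_percent: float  # average precision in percent, as quoted in the abstract

    @property
    def frame_time_ms(self) -> float:
        # Time budget per frame in milliseconds: 1000 ms divided by frames per second.
        return 1000.0 / self.fps


# Figures quoted in the abstract for a Tesla V100 GPU.
yolov4 = Benchmark("YOLOv4", fps=54, ap_percent=41.2)
rcnn = Benchmark("R-CNN", fps=9, ap_percent=42.8)

# How many times more frames YOLOv4 processes per second than R-CNN.
speedup = yolov4.fps / rcnn.fps

# How many AP percentage points YOLOv4 gives up in exchange.
ap_gap = rcnn.ap_percent - yolov4.ap_percent

# Per-frame time for the 7 to 10 FPS range reported on the GTX 1050 Ti.
thesis_frame_times_ms = [1000.0 / fps for fps in (7, 10)]

for b in (yolov4, rcnn):
    print(f"{b.name}: {b.fps:.0f} FPS, {b.frame_time_ms:.1f} ms/frame, {b.ap_percent:.1f}% AP")
print(f"Throughput ratio: {speedup:.1f}x, AP gap: {ap_gap:.1f} points")
print("GTX 1050 Ti frame time range (ms):",
      ", ".join(f"{t:.1f}" for t in thesis_frame_times_ms))
```

Read this way, the quoted figures amount to roughly six times the throughput for a loss of about 1.6 AP points, which is the trade-off the abstract gives for choosing YOLOv4.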
Appears in Collections:	[Institute of Biomedical Engineering] Electronic Thesis & Dissertation
