NCU Institutional Repository (中大機構典藏) - theses and dissertations, past exam questions, journal papers, and research projects for download: Item 987654321/89831
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89831


    Title: Multi-Scale Feature Fusion Network Combined with Self-Attention Module for Scene Text Detection
    Authors: Ho, Li-Chun
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Self-Attention Module;Multi-Scale Network;Scene Text Detection
    Date: 2022-07-28
    Issue Date: 2022-10-04 12:01:27 (UTC+8)
    Publisher: National Central University
    Abstract: Scene text detection has made breakthrough progress in recent years and has many applications, such as document text detection and license-plate recognition in parking lots. However, detecting arbitrarily shaped scene text, such as on signboards and billboards, still poses many problems: for example, many methods can neither fully delineate curved text nor effectively separate adjacent text instances. We therefore propose a more effective model, which fuses and exploits features more effectively and detects scene text of arbitrary shapes. Prediction is based on the central region of each text instance, and the predicted probability map is expanded through post-processing to obtain the entire text region. We propose a Multi-Scale Feature Fusion Network to extract and fuse features more effectively; it includes a Multi-Scale Attention Module (MSAM) that combines a Self-Attention Module (SAM) to refine features, and finally a Self-Attention Head (SAH) predicts the text probability map. Experiments confirm the effectiveness of this method, which achieves an F-score of 87.4 on the Total-Text dataset.
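    The center-region prediction plus expansion step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's actual post-processing: the binarization threshold, the fixed number of dilation steps, and the function name `expand_kernel` are all hypothetical, and the thesis may expand predicted regions by a different rule.

```python
import numpy as np

def expand_kernel(prob_map, thresh=0.5, steps=2):
    """Binarize a predicted text-center probability map, then grow the
    resulting kernel outward to approximate the full text region.
    (Hypothetical sketch: threshold and step count are assumptions.)"""
    kernel = (prob_map >= thresh).astype(np.uint8)
    h, w = kernel.shape
    for _ in range(steps):
        padded = np.pad(kernel, 1)
        # A 3x3 max filter is one step of binary dilation.
        kernel = np.max(
            [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)],
            axis=0)
    return kernel
```

    Growing a shrunken center-region prediction in this way helps keep adjacent text instances separated at prediction time while still recovering full text regions afterwards.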
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File: index.html  Size: 0 KB  Format: HTML


    All items in NCUIR are protected by copyright, with all rights reserved.

