Master's/Doctoral Thesis 109523056: Detailed Record




Author: Fong-Ci Jhou (周楓錡)    Department: Department of Communication Engineering
Thesis Title: Generating Mask with Meta-Learning Classifier Weight Transformer Network for Few-Shot Image Segmentation
Related Theses:
★ Implicit Representation with an Attention Mechanism for Image Reconstruction of 3D Human Models
★ 3D Face Reconstruction Based on a Weakly Supervised Deformable Model
Full-Text Access: Not available (permanently restricted)
Abstract (Chinese)
With the rapid development of modern hardware, artificial intelligence research has made breakthrough progress, and more and more fields are moving toward machines replacing or assisting humans. Most artificial intelligence and deep learning methods, however, require large amounts of training data and can be applied only to a single task, and such large training sets are often very difficult to obtain, as in the case of medical images. Few-shot image segmentation is one line of research in image processing that addresses this problem. Recent studies combine deep learning with meta-learning so that a model trained on only a few samples can segment the target in an image and adapt quickly to new tasks.
This thesis proposes Generating Mask with Meta-Learning Classifier Weight Transformer Network for Few-Shot Image Segmentation as the network architecture for meta-learned few-shot segmentation. A pre-trained classifier weight transformer generates a good prior mask, and a pre-trained feature extractor extracts features from the query and support images. The top-down path in the feature enrichment module then adaptively passes information from finer features to coarser features for query-image feature extraction, and finally a classification module produces the segmentation prediction for the query image. Using mean intersection over union (mIoU) as the evaluation metric, accuracy improves over the baseline by 1.7% in the 1-shot experiments and by 2.6% in the 5-shot experiments, demonstrating that the proposed architecture delivers the best few-shot image segmentation performance among the compared methods.
Keywords: meta-learning, few-shot image segmentation, semantic segmentation, few-shot learning
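As an aside on the prior-mask idea named above: in this thesis the prior mask is produced by a pre-trained classifier weight transformer, but the general mechanism, popularized by PFENet [29], scores each query location by its best cosine similarity to a foreground support location. The following is a minimal NumPy sketch of that general mechanism only, not the thesis's code; the function name, array shapes, and normalization details are illustrative assumptions.

```python
import numpy as np

def prior_mask(query_feat, support_feat, support_mask, eps=1e-7):
    """Cosine-similarity prior mask, a sketch in the spirit of PFENet [29].

    query_feat:   (C, Hq, Wq) high-level query features
    support_feat: (C, Hs, Ws) high-level support features
    support_mask: (Hs, Ws) binary foreground mask of the support image
    Returns a (Hq, Wq) prior in [0, 1]: high where a query location
    resembles some foreground support location.
    """
    C, Hq, Wq = query_feat.shape
    # Zero out background support features so they cannot match.
    sf = support_feat * support_mask[None]
    q = query_feat.reshape(C, -1)        # (C, Nq)
    s = sf.reshape(C, -1)                # (C, Ns)
    # L2-normalize channel vectors so the dot product is cosine similarity.
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + eps)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + eps)
    sim = q.T @ s                        # (Nq, Ns) pairwise similarities
    prior = sim.max(axis=1)              # best support match per query location
    # Min-max normalize to [0, 1] for use as a soft prior.
    prior = (prior - prior.min()) / (prior.max() - prior.min() + eps)
    return prior.reshape(Hq, Wq)
```

The resulting map is typically concatenated with the query features before the feature enrichment module, giving the decoder a rough hint of where the novel class is likely to appear.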
Abstract (English)
With the rapid development of hardware technology, many artificial intelligence studies have made breakthroughs, and more and more fields are developing in the direction of machines replacing or assisting humans. However, most artificial intelligence and deep learning methods require large amounts of training data and can only be applied to a single task, and such data are very difficult to obtain in domains such as medical imaging. Few-shot image segmentation is one of the studies in the image processing field that addresses this. Recent studies have used deep learning and meta-learning approaches to enable a trained model to segment targets in images from only a few training samples and to adapt quickly to new tasks.
In this thesis, we propose the Generating Mask with Meta-Learning Classifier Weight Transformer Network architecture for few-shot image segmentation. A pre-trained classifier weight transformer generates a prior mask, and a pre-trained backbone extracts features from the query and support images. The top-down path in the feature enrichment module then transfers information from finer features to coarser features for query-image feature extraction, and finally the classification module produces the segmentation prediction for the query image.
The experimental results show that, with mean intersection over union (mIoU) as the evaluation metric, the proposed method improves accuracy over the baseline by 1.7% in the 1-shot experiment and by 2.6% in the 5-shot experiment, demonstrating that the proposed architecture achieves the best few-shot segmentation performance among the compared methods.
Keywords: meta-learning, few-shot image segmentation, semantic segmentation, few-shot learning
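Both abstracts evaluate with mean intersection over union (mIoU). A minimal sketch of how mIoU can be computed for integer label maps follows; the function name and the convention of skipping classes absent from both maps are assumptions, as implementations differ on how absent classes and the background are handled.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes.

    pred, gt: integer label maps of the same shape.
    Classes absent from both prediction and ground truth are skipped
    rather than counted as IoU 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        g = gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:                 # class appears in neither map
            continue
        inter = np.logical_and(p, g).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

For example, with `pred = [[0, 0], [1, 1]]` and `gt = [[0, 1], [1, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is 7/12.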
Keywords (Chinese)
★ meta-learning (元學習)
★ few-shot image segmentation (少樣本圖像切割)
★ semantic segmentation (語意分割)
★ few-shot learning (少樣本學習)
Keywords (English)
★ meta-learning
★ few-shot image segmentation
★ semantic segmentation
★ few-shot learning
Table of Contents
Abstract (Chinese) I
Abstract (English) II
Acknowledgments III
Table of Contents V
List of Figures VII
List of Tables IX
Chapter 1: Introduction 1
1-1 Research Background 1
1-2 Research Motivation and Objectives 2
1-3 Thesis Organization 3
Chapter 2: Neural Networks and Deep Learning 4
2-1 Neural Networks 4
2-1-1 Artificial Neural Networks 4
2-1-2 Convolutional Neural Networks 6
2-2 Attention Mechanisms 11
2-2-1 Self-Attention 11
2-2-2 Multi-Head Self-Attention 13
Chapter 3: Background on Meta-Learning and Few-Shot Learning 15
3-1 Meta-Learning 15
3-1-1 Gradient-Optimization-Based Meta-Learning 16
3-1-2 Metric-Space-Based Meta-Learning 18
3-2 Few-Shot Learning 19
3-3 Image Segmentation 20
3-3-1 Semantic Segmentation 20
3-3-2 Few-Shot Image Segmentation 22
Chapter 4: Experimental Architecture and Design 24
4-1 System Architecture 24
4-1-1 First Architecture: CWT with Partial Network Updates 24
4-1-2 Second Architecture: Generating Mask with Meta-Learning Classifier Weight Transformer Network for Few-Shot Image Segmentation 28
4-2 Feature Extraction 31
4-3 Classifier Weight Transformer 35
4-4 Prior Mask Generation 36
4-5 Feature Enrichment 37
4-6 Loss Function 39
Chapter 5: Experimental Results and Analysis 40
5-1 Experimental Environment and Settings 40
5-2 Datasets 41
5-3 Evaluation Methods 42
5-4 Comparison and Analysis of Experimental Results 44
Chapter 6: Conclusion and Future Work 55
References 57
References
[1] I. Aizenberg, N. Aizenberg, C. Butakov, and E. Farberov, "Image recognition on the neural network based on multi-valued neurons," in Proc. 15th International Conference on Pattern Recognition (ICPR), 2000, vol. 2, pp. 989-992.
[2] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bulletin of Mathematical Biophysics, vol. 5, pp. 115-133, 1943.
[3] K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980.
[4] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[5] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in NIPS, 2012.
[6] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[7] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," arXiv preprint arXiv:1409.4842, 2014.
[8] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in CVPR, 2016.
[9] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
[10] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, et al., "Attention is all you need," arXiv preprint arXiv:1706.03762, 2017.
[11] C. Finn, P. Abbeel, and S. Levine, "Model-agnostic meta-learning for fast adaptation of deep networks," in ICML, 2017.
[12] J. Snell, K. Swersky, and R. Zemel, "Prototypical networks for few-shot learning," in NeurIPS, pp. 4077-4087, 2017.
[13] S. Gidaris and N. Komodakis, "Dynamic few-shot visual learning without forgetting," in CVPR, 2018.
[14] M. Goldblum, S. Reich, L. Fowl, R. Ni, V. Cherepanova, and T. Goldstein, "Unraveling meta-learning: Understanding feature representations for few-shot tasks," in ICML, 2020.
[15] J. Liu, L. Song, and Y. Qin, "Prototype rectification for few-shot learning," in ECCV, 2020.
[16] Y. Chen, Z. Liu, H. Xu, T. Darrell, and X. Wang, "Meta-Baseline: Exploring simple meta-learning for few-shot learning," in ICCV, pp. 9042-9051, 2021.
[17] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in CVPR, 2015.
[18] F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," in ICLR, 2016.
[19] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," TPAMI, 2018.
[20] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in MICCAI, 2015.
[21] W. Liu, A. Rabinovich, and A. C. Berg, "ParseNet: Looking wider to see better," arXiv preprint, 2015.
[22] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," in CVPR, 2017.
[23] A. Shaban, S. Bansal, Z. Liu, I. Essa, and B. Boots, "One-shot learning for semantic segmentation," in BMVC, 2017.
[24] N. Dong and E. P. Xing, "Few-shot semantic segmentation with prototype learning," in BMVC, 2018.
[25] K. Wang, J. Liew, Y. Zou, D. Zhou, and J. Feng, "PANet: Few-shot image semantic segmentation with prototype alignment," in ICCV, 2019.
[26] C. Zhang, G. Lin, F. Liu, R. Yao, and C. Shen, "CANet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning," in CVPR, 2019.
[27] G. Lin, A. Milan, C. Shen, and I. D. Reid, "RefineNet: Multi-path refinement networks for high-resolution semantic segmentation," in CVPR, 2017.
[28] Z. Lu, S. He, X. Zhu, L. Zhang, Y.-Z. Song, and T. Xiang, "Simpler is better: Few-shot semantic segmentation with classifier weight transformer," in ICCV, pp. 8721-8730, 2021.
[29] Z. Tian, H. Zhao, M. Shu, Z. Yang, R. Li, and J. Jia, "Prior guided feature enrichment network for few-shot segmentation," TPAMI, 2020.
[30] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in ECCV, 2014.
[31] K. Nguyen and S. Todorovic, "Feature weighting and boosting for few-shot segmentation," in ICCV, 2019.
[32] X. Luo, Z. Tian, T. Zhang, B. Yu, Y. Y. Tang, and J. Jia, "PFENet++: Boosting few-shot semantic segmentation with the noise-filtered context-aware prior mask," arXiv preprint arXiv:2109.13788, 2021.
Advisors: Pao-Chi Chang (張寶基) and Yung-Fang Chen (陳永芳)    Approval Date: 2022-08-04
