Abstract: | Object detection using deep learning has achieved notable success. Deep learning models such as Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs) have been used in this field. However, these conventional techniques do not take the relationships between objects into account. In contrast, Graph Neural Networks (GNNs) are better suited to computing relationships, including those between pixels in an image. This paper proposes a GNN-based deep learning model for object detection. The advantage of GNNs is that, in addition to the hidden features, they also compute the adjacency matrix, that is, the relationships among the hidden features. GNNs can also effectively preserve and learn features at different scales, which is a great advantage when detecting objects of different sizes, as with, for example, the Feature Pyramid Network (FPN). Based on the above, we propose a deep learning model built on GNNs and the FPN. The model uses a GNN as the generator and an FPN as the discriminator, and thereby obtains a better Average Precision (AP) for object detection. The proposed model reaches an AP of 50.1 and outperforms other models on the Microsoft Common Objects in Context (MS COCO) dataset. ;Object detection using deep learning has achieved great success recently. Several deep learning architectures are used in this field, such as Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs). However, traditional neural networks do not model relationships between objects. In contrast, Graph Neural Networks (GNNs) are better suited to computing adjacencies, which typically represent relationships between objects or pixels in an image. In this paper, we propose a GNN-based object detection model. GNNs are considered an efficient neural network framework because they learn over both nodes and edges. They can effectively preserve and learn the multi-scale features generated by the backbone, in our case ResNet101. Furthermore, we propose an adversarial model based on GNNs and the Feature Pyramid Network (FPN). Using a graph autoencoder as the generator and the FPN as the discriminator, we improve overall performance (50.1 AP) and outperform other GNN-based models on a non-trivial dataset, MS COCO 2017. |