Master's/Doctoral Thesis 107523055: Detailed Record




Name: Kuan-Chen Tai (戴鸛臻)
Department: Communication Engineering
Thesis Title: People Tracking Based on Siamese Network with Template Update for EAC Format of 360-degree Videos
  1. This electronic thesis is approved for immediate open access.
  2. The open-access electronic full text is licensed only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese) In 360-degree videos, the equi-angular cubemap projection (EAC) is a variant of the cubemap projection (CMP). Compared with CMP, EAC exhibits less geometric deformation and is therefore less prone to tracking errors. However, EAC images still suffer from content discontinuity between adjacent faces and from non-uniform geometric distortion, which severely degrades the accuracy of existing trackers on EAC images. Therefore, this thesis proposes a Siamese network based people tracking scheme for 360-degree videos in EAC format: a convolutional neural network extracts features from the target template and from the search window of the current frame, and the tracker matches these features to follow the target. To handle content discontinuity, this thesis adopts a face stitching scheme so that the tracker operates on continuous image content while avoiding additional geometric deformation. To cope with non-uniform geometric distortion, the timing of template update is predicted from the score map that the Siamese network computes on the current frame: Fisher's linear discriminant (FLD) reduces the dimensionality of the score map, and this projection, together with the mean and standard deviation of the score map, forms three features that a Bayes classifier uses to decide whether to update the template. Experimental results show that the proposed face stitching and template update schemes effectively improve the tracking accuracy of SiamFC.
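The abstract describes extracting CNN features from the target template and from the search window of the current frame, then matching them to locate the target. In SiamFC-style trackers this matching is a cross-correlation that yields a score map whose peak marks the target position. The following minimal NumPy sketch is illustrative only, not the thesis implementation, and uses hand-made toy features instead of CNN embeddings:

```python
import numpy as np

def score_map(template_feat, search_feat):
    """Slide the template embedding over the search-window embedding and
    cross-correlate, as in SiamFC; the response peak locates the target."""
    th, tw, _ = template_feat.shape
    sh, sw, _ = search_feat.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(search_feat[y:y + th, x:x + tw] * template_feat)
    return out

# Toy example: a bright 3x3 "target" patch embedded in an 8x8 feature map.
search = np.zeros((8, 8, 2))
search[2:5, 3:6, :] = 1.0
template = np.ones((3, 3, 2))
peak = np.unravel_index(np.argmax(score_map(template, search)), (6, 6))
# peak recovers the patch's top-left corner, (2, 3)
```

A real tracker computes both embeddings with a shared convolutional backbone and evaluates this correlation as a convolution on GPU; the explicit loop above is only for readability.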
Abstract (English) Variants of the cubemap projection (CMP) format, such as the equi-angular cubemap (EAC) of 360-degree videos, have less geometric deformation, which may reduce tracking errors. However, the accuracy and speed of most existing trackers degrade seriously in the face of the content discontinuity and non-uniform geometric deformation of the EAC format of 360-degree videos. Thus, this thesis proposes a Siamese network based people tracking scheme for 360-degree videos in EAC format. The tracker extracts features from the target template and the search window of the current frame with a convolutional neural network, and compares the features to predict the bounding box of the target. To be robust against the content discontinuity between inconsistent adjacent faces of EAC images, this thesis proposes an efficient face stitching scheme such that the tracker keeps tracking across adjacent faces while avoiding additional geometric deformation. By referring to the score map generated by the Siamese network, the proposed pre-trained Bayes classifier based template update mechanism determines the right timing of updates. The input feature vector of the Bayes classifier consists of the projection of the score map obtained by dimensionality reduction with Fisher's linear discriminant (FLD), the mean of the score map, and the standard deviation of the score map. Experimental results show that the proposed face stitching scheme and template update mechanism effectively improve the tracking accuracy of SiamFC.
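The template-update mechanism described above builds a three-dimensional feature vector from the score map (an FLD projection plus the score map's mean and standard deviation) and hands it to a Bayes classifier. The sketch below illustrates these steps under Gaussian class-conditional assumptions; the function names, the shrinkage constant, and the toy data in the comments are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

def fld_direction(X_pos, X_neg, shrink=1e-3):
    """Fisher's linear discriminant: w = Sw^{-1} (m_pos - m_neg).
    Rows of X_pos / X_neg are flattened training score maps of the two
    classes.  A small shrinkage toward the identity keeps the within-class
    scatter Sw invertible when the score-map dimension exceeds the number
    of training samples."""
    m_p, m_n = X_pos.mean(axis=0), X_neg.mean(axis=0)
    Sw = np.cov(X_pos, rowvar=False) + np.cov(X_neg, rowvar=False)
    Sw += shrink * np.eye(Sw.shape[0])
    return np.linalg.solve(Sw, m_p - m_n)

def update_features(score_map, w):
    """Three-feature vector: FLD projection, mean, standard deviation."""
    s = score_map.ravel()
    return np.array([s @ w, s.mean(), s.std()])

def gaussian_bayes_decision(f, mean_u, cov_u, prior_u, mean_k, cov_k, prior_k):
    """Decide 'update the template' (True) if the Gaussian class posterior
    of the update class dominates that of the keep class."""
    def loglik(x, m, c):
        d = x - m
        return -0.5 * (d @ np.linalg.solve(c, d) + np.log(np.linalg.det(c)))
    return loglik(f, mean_u, cov_u) + np.log(prior_u) > \
           loglik(f, mean_k, cov_k) + np.log(prior_k)
```

Here the class means, covariances, and priors would be estimated offline from labeled score maps, matching the pre-trained classifier the abstract mentions; the Gaussian form and the shrinkage value are modeling assumptions of this sketch.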
Keywords ★ people tracking
★ 360-degree videos
★ equi-angular cubemap (EAC)
★ Siamese neural network
★ Fisher linear discriminant (FLD)
★ Bayes classifier
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Overview
1.2 Motivation
1.3 Research Method
1.4 Thesis Organization
Chapter 2 Introduction to Siamese Network Based Visual Tracking and Cubemap Projection Based Visual Tracking
2.1 Siamese Network Based Visual Tracking
2.2 Template Update Techniques for Visual Tracking
2.3 Cubemap Projection Based 360-degree Visual Tracking
2.3.1 Principles of Cubemap Projection and Equi-Angular Cubemap Projection
2.3.2 Cubemap Projection Based Visual Tracking
2.4 Summary
Chapter 3 Dimensionality Reduction Based Feature Extraction
3.1 Principal Component Analysis
3.2 Fisher's Linear Discriminant
3.2.1 Applications of Linear Discriminant Analysis in Computer Vision
3.2.2 Shrinkage
3.3 Summary
Chapter 4 The Proposed People Tracking Scheme for the EAC Format of 360-degree Videos
4.1 System Architecture
4.2 Face Stitching and Face Switching for EAC Images
4.3 The Proposed Template Update Scheme
4.3.1 Feature Extraction from Score Maps
4.3.2 Bayes Classifier Based Decision on Template Update Timing
4.4 Summary
Chapter 5 Experimental Results and Discussion
5.1 Experimental Parameters and Test Sequence Specifications
5.2 Experimental Results of the Tracking System
5.2.1 Tracking Accuracy Based on Overlap Ratio
5.2.2 Tracking Accuracy Based on Location Error
5.2.3 Tracking Accuracy Based on Success Plots and Precision Plots
5.3 Summary
Chapter 6 Conclusion and Future Work
References
Publications
List of Symbols
Advisor: Chih-Wei Tang (唐之瑋)　Review Date: 2020-07-15
