Machine Learning Based Fast Inter Prediction Algorithm of VVC for 360-degree Videos

Name

Ying Lee (李穎)

Department

Department of Communication Engineering

Thesis title

Machine Learning Based Fast Inter Prediction Algorithm of VVC for 360-degree Videos
(original title: 基於機器學習之360度視訊的 VVC快速畫面間預測演算法)

Files

The thesis can be browsed in the system (open from 2024-8-1 onward)

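Abstract (translated from the Chinese)

VVC (Versatile Video Coding) lowers the bitrate needed to transmit high-resolution video, but its encoding time complexity is so high that it is hard to implement on real-time transmission devices. Fast algorithms for VVC encoding are therefore an important research direction in video coding. EAC (equi-angular cubemap) is one of the 360-degree video formats and carries less redundant information than the ERP (equirectangular projection) format. However, the existing fast mode decision and depth decision algorithms for VVC inter coding have not been designed for the EAC format. This thesis therefore proposes fast partition depth and mode decision algorithms for inter coding that are designed for the EAC format. They take into account the continuity of image content across the faces of the EAC format and the correlation between faces across frames, which improves the accuracy of the fast inter coding decisions. Both the depth decision and the mode decision are designed with the affine merge mode newly added in VVC in mind. In addition, unlike existing fast inter algorithms, this thesis uses an LNN (light-weighted neural network) as the classifier. It adapts to the diversity of video content better than rules of thumb and, unlike deep learning schemes, can perform classification using only the central processing unit (CPU). Experimental results show that, compared with VTM 7.0, the proposed scheme saves 21% of the encoding time on average with a BDBR increase of only 1.03%, and that it also saves more encoding time than existing schemes based on rules of thumb.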
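The claim that EAC carries less redundant information than ERP can be illustrated with a small sampling-density sketch. It is not taken from the thesis: the function names are hypothetical, and the formulas are only the standard geometric definitions of the projections (ERP rows are oversampled horizontally by 1/cos(latitude); EAC remaps a cubemap face coordinate u to (4/π)·atan(u) so that equal steps cover equal viewing angles).

```python
import math


def erp_horizontal_oversampling(latitude_deg):
    """ERP stores every latitude row with the same pixel count, although the
    circle at that latitude shrinks by cos(latitude). The ratio is the
    horizontal oversampling factor relative to the equator."""
    return 1.0 / math.cos(math.radians(latitude_deg))


def cube_to_eac(u):
    """Map a cubemap face coordinate u in [-1, 1] to the EAC coordinate."""
    return (4.0 / math.pi) * math.atan(u)


def eac_to_cube(v):
    """Inverse mapping: EAC coordinate v in [-1, 1] back to the cubemap one."""
    return math.tan(v * math.pi / 4.0)


def angle_per_step(to_cube, v, dv=1e-4):
    """Viewing angle (radians) covered by a step dv along one face axis.
    to_cube converts the stored coordinate to the plain cubemap coordinate."""
    return math.atan(to_cube(v + dv)) - math.atan(to_cube(v))


if __name__ == "__main__":
    plain_cubemap = lambda v: v  # a conventional cubemap stores u directly
    for name, to_cube in (("cubemap", plain_cubemap), ("EAC", eac_to_cube)):
        centre = angle_per_step(to_cube, 0.0)
        edge = angle_per_step(to_cube, 1.0 - 1e-4)
        print(name, "centre/edge angle ratio:", round(centre / edge, 3))
    print("ERP oversampling at 60 degrees:",
          round(erp_horizontal_oversampling(60.0), 3))
```

A ratio near 1 for a format means its samples cover the sphere evenly along that axis, while larger ratios or oversampling factors indicate pixels spent on redundant detail.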

Abstract (English)

VVC (versatile video coding) can reduce the bitrate of high-resolution videos before transmission. However, the encoding complexity of VVC is extremely high, which makes it hard to implement in real-time hardware. Fast algorithms for the VVC encoder are therefore important. The EAC (equi-angular cubemap) format carries less redundant information than the ERP (equirectangular projection) format. A survey of the existing literature found no fast inter mode or depth decision algorithm designed for the EAC format. Accordingly, this thesis proposes a fast inter prediction algorithm of VVC for the EAC format to speed up the mode decision and depth decision processes. It takes into account the inter prediction information of the faces and the continuity across face boundaries in the EAC format. Furthermore, both the fast inter mode decision and the depth decision consider the affine merge mode newly added in VVC. Compared with the classification models widely used in fast VVC inter coding algorithms, the LNN (light-weighted neural network) adapts to different videos and coding conditions better than rules of thumb, and it runs on the CPU alone, which is difficult for deep learning models. Experimental results show that the proposed method reduces the encoding time of VTM 7.0 by about 21% on average with a 1.03% increase in BDBR (Bjontegaard delta bit rate), and that it outperforms the rule-of-thumb methods.
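To make the idea of a CPU-only lightweight classifier concrete, here is a minimal sketch of a one-hidden-layer network that decides whether an encoder could stop checking further inter modes for a block. Everything specific in it is an assumption for illustration: this record does not state the thesis's input features, layer sizes, or thresholds, so the three feature meanings, the hand-picked weights in `EXAMPLE_PARAMS`, and the 0.9 threshold are hypothetical, and the weights are not trained.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def early_skip_probability(features, params):
    """Forward pass of a tiny network: dense -> ReLU -> dense -> sigmoid."""
    hidden = relu(dense(features, params["w1"], params["b1"]))
    (logit,) = dense(hidden, params["w2"], params["b2"])
    return sigmoid(logit)


def decide_early_skip(features, params, threshold=0.9):
    """Skip the remaining mode checks only when the network is confident."""
    return early_skip_probability(features, params) >= threshold


# Hypothetical, hand-picked parameters (NOT trained, NOT from the thesis).
# Assumed feature order: [normalised merge RD cost,
#                         share of neighbouring blocks coded as skip,
#                         1.0 if the block touches an EAC face boundary].
EXAMPLE_PARAMS = {
    "w1": [[-2.0, 3.0, -1.0], [1.0, -1.0, 0.5]],
    "b1": [0.5, 0.0],
    "w2": [[2.5, -1.5]],
    "b2": [-0.5],
}

if __name__ == "__main__":
    easy_block = [0.1, 0.9, 0.0]  # cheap merge, neighbours mostly skipped
    hard_block = [0.9, 0.1, 1.0]  # costly merge, on a face boundary
    for name, feats in (("easy", easy_block), ("hard", hard_block)):
        prob = early_skip_probability(feats, EXAMPLE_PARAMS)
        print(name, round(prob, 3), decide_early_skip(feats, EXAMPLE_PARAMS))
```

A network of this size needs only a handful of multiply-adds per block, which is why such a classifier can run inside the encoder loop without a GPU. Raising the threshold skips fewer mode checks, which trades smaller time savings for a lower risk of coding loss.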

Keywords

★ 360-degree videos
★ EAC (equi-angular cubemap)
★ VVC (versatile video coding)
★ inter frame coding
★ fast algorithm
★ LNN (light-weighted neural network)

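Table of contents

Abstract (Chinese)
Abstract
Acknowledgments
Contents
List of figures
List of tables
Chapter 1 Introduction
1.1 Preface
1.2 Research motivation
1.3 Research method
1.4 Thesis organization
Chapter 2 Overview of 360-degree video coding
2.1 Introduction to Versatile Video Coding (VVC)
2.2 Partitioning process and modes
2.3 Affine merge mode and regular merge mode in Versatile Video Coding
2.4 Quality measurement of 360-degree video and the EAC format
2.5 Summary
Chapter 3 Overview of fast intra and inter coding algorithms for Versatile Video Coding
3.1 Fast VVC mode decision and motion estimation algorithms
3.2 Fast VVC coding tree unit depth prediction algorithms
3.3 Overview of fast VVC intra and inter coding algorithms for 360-degree video
3.4 Summary
Chapter 4 The proposed fast VVC inter coding algorithm for 360-degree video
4.1 Dataset selection, VVC and 360-degree video settings, and overall architecture of the proposed scheme
4.2 Design for face boundaries and inter-frame correlation in the EAC format
4.3 The proposed fast mode decision scheme
4.4 The proposed fast depth decision algorithm
4.5 Summary
Chapter 5 Experimental results and analysis
5.1 Coding environment, parameter settings, and test videos
5.2 LNN accuracy and performance analysis of the individual schemes
5.3 Experimental results on VTM 7.0 and comparison with existing schemes
5.4 Summary
Chapter 6 Conclusion and future work
References
Publications
List of symbols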

Advisor

Chih-Wei Tang (唐之瑋)

Approval date

2021-7-19
