Master's/Doctoral Thesis 103523028 — Detailed Record




Name Yu-ting Liu (劉郁廷)    Department Communication Engineering
Thesis Title Recognition of Guitar Playing Techniques with Deep Belief Networks based on Spectral-Temporal Receptive Fields
(基於時頻感知域經由深度信念網路之吉他彈奏技巧辨識)
Related Theses
★ Satellite image super-resolution based on regional weighting ★ Adaptive high dynamic range image fusion algorithm extending the linear characteristics of exposure curves
★ Prediction of stock opening price movements ★ Complexity control of H.264 video coding implemented on a RISC architecture
★ Articulation disorder assessment based on convolutional recurrent neural networks ★ Few-shot image segmentation using mask generation with a meta-learning classification weight transfer network
★ Implicit representation with attention mechanisms for image-based 3D human model reconstruction ★ Object detection using adversarial graph neural networks
★ 3D face reconstruction based on weakly supervised learning of deformable models ★ Low-latency singing voice conversion for edge computing devices using unsupervised representation disentanglement learning
★ Human pose estimation from FMCW radar based on sequence-to-sequence models ★ Monocular semantic scene completion based on multi-level attention mechanisms
★ Contactless real-time vital sign monitoring with a single FMCW radar based on temporal convolutional networks ★ Video traffic characterization and management over video-on-demand networks
★ High-quality voice conversion based on linear predictive coding and pitch-synchronous frame processing ★ Pitch adjustment based on formant variation extracted through speech resampling
Files
  1. The author has agreed to make the electronic full text of this thesis available immediately (open access).
  2. The open-access electronic full text is licensed only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the relevant provisions of the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese) The guitar is a very common instrument, widely used in pop, rock, folk, and other genres, and learning it has become a hobby for many people. Different playing techniques produce different sounds and convey different emotions, which together build up a piece of music.
The differences between guitar playing techniques are quite subtle, so classifying and recognizing them is a challenging task. To listeners unfamiliar with the guitar, the techniques sound very similar, whereas experienced guitarists can tell them apart by ear alone.
To capture these subtle variations, this study uses deep belief networks (DBNs) to learn audio features, including Mel-frequency cepstral coefficients (MFCCs) and cortical spectro-temporal receptive field (STRF) features. With different initialization methods and newly proposed network architectures, the system learns to pick out the most discriminative features to improve recognition, and results on complete recordings are compared with results on the onset segments. Experiments show that the proposed method raises the recognition rate by up to 11.74% on the onset segments, while on complete recordings the recognition rate is even higher, reaching 98.19%. This indicates that well-chosen features and classifiers classify the data more accurately than simply relying on a large number of parameters.
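
The pipeline described above can be illustrated with a minimal sketch: clip-level MFCC features fed to a DBN-style classifier built from stacked RBMs. This is an illustration only, assuming librosa and scikit-learn are available; the file path, layer sizes, and training settings are hypothetical, and unlike the thesis's DBN this pipeline only pretrains the RBM layers without fine-tuning the whole network.

    # Minimal sketch (not the thesis implementation): MFCC features and a
    # DBN-style classifier built from stacked RBMs.
    import librosa
    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler

    def mfcc_vector(path, sr=22050, n_mfcc=20):
        """Load one guitar clip and summarize it as a fixed-length MFCC vector."""
        y, sr = librosa.load(path, sr=sr)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        # Mean and standard deviation over time give a simple clip-level descriptor.
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # X: (n_clips, n_features) feature matrix; y: technique labels (e.g. "bend", "slide").
    # Two RBMs pretrain the hidden layers; logistic regression acts as the output layer.
    dbn_like = Pipeline([
        ("scale", MinMaxScaler()),                      # RBMs expect inputs in [0, 1]
        ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=20)),
        ("rbm2", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    # Usage: dbn_like.fit(X_train, y_train); accuracy = dbn_like.score(X_test, y_test)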
Abstract (English) The guitar is a very common instrument that is widely used in pop, rock, folk, and other genres. Different playing techniques produce different sounds, express different emotions, and together shape the music. Some guitar playing techniques differ only slightly, which makes recognizing them a real challenge. This thesis proposes a guitar playing technique recognition system that includes a novel STRF-based feature extraction algorithm and a novel deep learning model called HCDBN. In experiments, the proposed system improves the recognition rate by 11.74% over the baseline system on the onset version of the dataset and achieves a 98.19% recognition rate on the whole-recording version. This thesis also builds an onset-detection-based guitar technique recognition system that can be applied to real-world guitar solo music.
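
As a rough illustration of the onset-detection front end mentioned above (not the thesis code), the sketch below segments a recording at detected onsets, assuming librosa; the input file name and the 0.2-second analysis window after each onset are hypothetical.

    # Minimal sketch: segment a guitar solo at detected onsets for later classification.
    import librosa

    y, sr = librosa.load("guitar_solo.wav", sr=22050)            # hypothetical input file
    onset_times = librosa.onset.onset_detect(y=y, sr=sr, units="time")

    segments = []
    for t in onset_times:
        start = int(t * sr)
        end = min(start + int(0.2 * sr), len(y))                 # short window after each onset
        segments.append(y[start:end])
    # Each segment would then go through feature extraction (MFCC/STRF) and the
    # trained classifier to label the playing technique at that onset.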
Keywords (Chinese) ★ auditory model
★ guitar playing technique
★ classification
★ recognition
★ neural network
★ deep learning
Keywords (English) ★ STRF
★ Guitar Playing Technique
★ Classification
★ Recognition
★ Neural Network
★ Deep Belief Network
Thesis Table of Contents
Abstract (Chinese) I
Abstract (English) II
Acknowledgments III
Table of Contents IV
List of Figures VI
List of Tables VIII
Chapter 1 Introduction 1
1.1 Background 1
1.2 Motivation and Objectives 2
1.3 Thesis Organization 2
Chapter 2 Auditory Perception Model 3
2.1 Auditory Perception Model 3
2.2 Early Cochlear Model 4
2.3 Cortical Model 6
2.4 STRF Feature Extraction 8
2.4.1 Scale Feature Extraction 8
2.4.2 Rate Feature Extraction 8
Chapter 3 Deep Belief Networks 11
3.1 Deep Belief Networks 11
3.2 Generative Restricted Boltzmann Machines 15
3.3 Discriminative Restricted Boltzmann Machines 19
3.4 Initialization 21
3.5 Softmax 22
Chapter 4 DBN Architectures and Experimental Results 24
4.1 System Architecture 24
4.1.1 Scheme 1: Original DBN 24
4.1.2 Scheme 2: LDDBN 26
4.1.3 Scheme 3: LFDBN 27
4.1.4 Scheme 4: HDDBN 28
4.1.5 Scheme 5: HCDBN 29
4.1.6 Architecture Comparison 31
4.2 Experimental Results 32
4.2.1 Results of the Five Architectures on the Split Dataset 35
4.2.2 STRF Feature Extraction on the Split Dataset 36
4.2.3 Observation 1: Spectrogram Variance of the Onset Segments 37
4.2.4 Results of the Five Architectures on the Whole Dataset 39
4.2.5 STRF Feature Extraction on the Whole Dataset 40
4.2.6 Observation 2: Spectrogram Variance of Complete Recordings 40
4.2.7 Results with 19 Subclasses 43
4.2.8 Summary of All Architecture, Feature, and Class Combinations on the Split Dataset 45
4.2.9 Summary of All Architecture, Feature, and Class Combinations on the Whole Dataset 48
4.2.10 Real-World Recordings 52
Chapter 5 Conclusions and Future Work 55
References 56
Advisors Pao-chi Chang, Jia-Ching Wang (張寶基、王家慶)    Date of Approval 2015-07-28
