Automatic scene classification is an active topic in machine learning research. While many works focus on vision-based approaches, relatively little attention has been paid to scene classification from audio. Audio-based scene classification, also known as acoustic scene classification (ASC), analyzes input audio data to automatically identify the environment in which the sound was recorded. ASC can be seen as an alternative to, or an extension of, vision-based scene classification when visual information is unavailable or the performance of a vision-based classifier is compromised. The audio-based approach has the benefit that, as long as the sound can be heard, a practical ASC system can still classify the scene, so it sidesteps the occlusion problem that affects vision-based approaches; for this reason it can be described as machine hearing. A number of approaches to ASC have been proposed. In recent years, there has been growing interest in adopting techniques from computer vision to analyze acoustic events. Deep learning has also attracted considerable attention and has shown promising results in many fields. This thesis proposes a deep learning based approach to ASC. Several ASC systems, including the proposed system, are implemented and discussed in the experiments. The results show that the proposed system outperforms the other systems discussed in this thesis.