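With faster computers, larger storage devices, advances in networking technology, and the emergence of various audio and video compression formats, video data has become increasingly pervasive in daily life. How to manage video databases effectively has therefore become an interesting topic worth investigating. Traditional methods for managing text information, however, are not suitable for video data. An effective video database must provide two functions: (1) effective video summarization, which helps users quickly understand video content, and (2) effective video retrieval, which helps users quickly locate the desired content in a large video database. A video summarization system focuses on extracting the semantic structure of a video, so that the result better matches human perception.

In this thesis, we focus on video summarization and propose a scene-indexed video summarization system. It uses the semantically distinct scenes in a video as the summary, helping users grasp the content quickly and making search more convenient.

A typical video summarization system consists of the following steps: shot detection, key-frame extraction, shot grouping, and finally scene detection. The steps are tightly linked, so a poor result in one step may degrade the result of the next.

The distinguishing feature of this thesis is its use of the self-organizing feature map network (SOM). Because the SOM has the topological property of preserving data features on the map, shots with similar features are mapped to nearby regions. A region growing algorithm then merges similar shots into groups, and a scene detection algorithm analyzes the groups and merges those with the same semantics into scenes. Finally, users can browse the video content through the resulting scene map or through a hierarchical tree representation.

In the experiments, we tested various types of videos and compared the system's results with those produced by different human testers.

Due to rapid advances in electronics hardware and networking technologies and the decreasing cost of storage, video data are becoming available at an ever-increasing rate. Traditional database management techniques for text documents cannot deal effectively with video data; therefore, methods and techniques for automatically analyzing video data have become a very attractive and challenging research topic. An efficient video database management system should have the following two functionalities: 1) video summarization, which makes the task of browsing video content easy, and 2) video retrieval, which retrieves video from a huge video database based on user queries. This thesis focuses on the development of a video summarization technique. The traditional way to browse video data is to use the “fast forward” and “rewind” function keys to manually locate the region of interest, which is very time-consuming. The goal of the new video summarization technique proposed in this thesis is to provide an effective table of contents that captures the semantic structure of a video document.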
Several different approaches to video summarization have been proposed, each with its own advantages and limitations. The proposed video summarization technique involves the following four steps: 1) shot detection, 2) key-frame extraction, 3) shot grouping, and 4) scene detection. The most appealing property of the proposed technique is its use of the self-organizing feature map (SOM). Since the SOM has the topology-preserving property, shots with similar features are grouped into the same cluster, and similar clusters are located near one another on the map. A region growing technique is then employed to merge similar shots into groups. After the group map has been constructed, an effective scene detection technique is adopted to merge groups with a similar semantic concept into a scene. The constructed scene map can either be used directly as the table of contents of a video document or be transformed into a hierarchy tree that represents its content. Via the scene map or the hierarchy tree, a user can browse the content effectively. The performance of the proposed technique is demonstrated by experiments on several different types of video documents.
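
The SOM mapping and region growing steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not specify the shot features, the map size, the training schedule, or the merging criterion, so the function names (train_som, grow_regions, group_shots), all parameter values, and the rule of merging adjacent map nodes whose weight vectors differ by less than a threshold are assumptions made purely for illustration. Each shot is assumed to be already reduced to a fixed-length feature vector, for example a key-frame color histogram.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return (row, col) of the map node whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = math.dist(w, x)
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(features, rows=5, cols=5, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM on shot feature vectors (assumed settings)."""
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total, step = epochs * len(features), 0
    for _ in range(epochs):
        for x in rng.sample(features, len(features)):  # one shuffled pass
            frac = step / total
            lr = lr0 * (1.0 - frac)              # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5  # shrinking neighbourhood width
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights

def grow_regions(weights, threshold):
    """Label map nodes by region growing: 4-connected neighbours are merged
    when their weight vectors are closer than `threshold` (assumed criterion)."""
    rows, cols = len(weights), len(weights[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            labels[r][c] = next_label
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and math.dist(weights[nr][nc], weights[cr][cc]) < threshold):
                        labels[nr][nc] = next_label
                        stack.append((nr, nc))
            next_label += 1
    return labels

def group_shots(features, weights, labels):
    """Assign each shot the region label of its best-matching map node."""
    groups = []
    for x in features:
        r, c = best_matching_unit(weights, x)
        groups.append(labels[r][c])
    return groups
```

Under these assumptions, a call such as group_shots(features, weights, grow_regions(weights, threshold=0.2)) returns one group label per shot. The subsequent scene detection step, which merges groups with similar semantics into scenes, is not sketched here because the abstract gives no detail about how it works.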