Visual Speech Recognition and Password Verification Using Local Spatiotemporal Features and Kernel Sparse Representation Classifier


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/68703

Title:	Visual Speech Recognition and Password Verification Using Local Spatiotemporal Features and Kernel Sparse Representation Classifier
Author:	Frisky, Aufaclav Zatu Kusuma (柯奧福)
Contributor:	Executive Master Program, Department of Computer Science and Information Engineering
Keywords:	Kernel Sparse Representation; Local Spatiotemporal Descriptor; Visual Speech Recognition; Lip Password Verification
Date:	2015-08-05
Upload time:	2015-09-23 14:17:59 (UTC+8)
Publisher:	National Central University
Abstract:	Visual speech recognition (VSR) plays an important role in many aspects of human life, with research directed at recognition systems for security, biometrics, and human-machine interaction. In this thesis, we propose two lip-based systems. The first is a letter recognition system that uses spatiotemporal feature descriptors. It adopts non-negative matrix factorization (NMF) to reduce the dimensionality of the features and a kernel sparse representation classifier for classification. Local texture and local temporal features represent the visual lip data. The lip images are first preprocessed by contrast enhancement, and the features are then extracted from them. In our experiments on the AVLetters database, the system achieves accuracies of 67.13%, 45.37%, and 63.12% in the semi-speaker-dependent, speaker-independent, and speaker-dependent settings, respectively. We also compare our method with other methods on the AVLetters 2 database. Using the same configuration, our method achieves an accuracy of 89.02% in the speaker-dependent case and 25.9% in the speaker-independent case, outperforming the other methods under that configuration.
In the second system, we propose a new approach to lip-based passwords for home entrance security, using a confidence point, within a home automation system. We also propose new features: a modified spatiotemporal descriptor that is normalized with L2-Hellinger and reduced in dimensionality with two-dimensional semi-non-negative matrix factorization (2D Semi-NMF). For classification, we propose a forward-backward kernel sparse representation classifier (FB-KSRC). Our experimental results indicate that the system is quite robust at classifying passwords. We evaluated it on the AVLetters 2 dataset: using ten visual passwords, each a combination of five letters from AVLetters 2, and testing all combinations, the system verifies passwords very well. In the complexity experiment, the classification time was also reasonable for a real-world implementation.
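The abstract names L2-Hellinger normalization but does not define it. A common reading, assumed here and not confirmed by this record, is to L1-normalize a non-negative histogram descriptor and then take the element-wise square root, which leaves the vector with unit L2 norm. The function name `l2_hellinger_normalize`, the `eps` guard, and the sample histogram below are hypothetical and serve only as a minimal sketch of that idea.

```python
import math


def l2_hellinger_normalize(hist, eps=1e-12):
    """L1-normalize a non-negative histogram, then take element-wise square roots.

    Assumed definition (not taken from the thesis): the result has unit L2 norm,
    because the squared entries sum to the L1-normalized total of 1.
    """
    if any(v < 0 for v in hist):
        raise ValueError("histogram bins must be non-negative")
    total = sum(hist) + eps  # eps avoids division by zero for an empty histogram
    return [math.sqrt(v / total) for v in hist]


# Hypothetical 4-bin descriptor histogram.
hist = [4.0, 0.0, 9.0, 3.0]
normalized = l2_hellinger_normalize(hist)
l2_norm = math.sqrt(sum(v * v for v in normalized))
print(normalized, l2_norm)
```

Under this reading, the square root damps bins with large counts, so a few dominant texture patterns weigh less in later distance or kernel computations.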
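Appears in collections:	[Executive Master Program, Department of Computer Science and Information Engineering] Theses and Dissertations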

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	720

All items in NCUIR are protected by original copyright.
