NCU Institutional Repository (中大機構典藏): Item 987654321/88368

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/88368


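    Title: A Hardware Implementation of Feature Extraction for Self-Defined Speaker Verification System (可自定義之語者驗證系統與其特徵擷取模組之硬體實現)
    Author: Wang, Chiao-Li (王喬立)
    Contributor: Department of Electrical Engineering
    Keywords: speech recognition; speaker verification; speech feature extraction; FPGA; SoC
    Date: 2022-04-15
    Upload time: 2022-07-14 00:36:31 (UTC+8)
    Publisher: National Central University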
    Abstract: In recent years, voice systems that use speech recognition to drive or control devices have become increasingly common in human-computer interaction. Among these, speaker verification, which analyzes speakers' voiceprints to identify the features that distinguish them, has been widely explored and its effectiveness has improved significantly. However, current approaches based on large, complex neural networks still have drawbacks: they either run only on high-specification edge devices, or they upload captured voice clips to the cloud for processing, which raises personal privacy concerns. Speaker verification that runs entirely on the local device is therefore an important task in speech-based human-computer interaction.
    This thesis proposes a self-defined speaker verification system and a hardware implementation of its feature extraction module. Based on a timing analysis of each module, the Mel-Frequency Cepstral Coefficients (MFCC) pre-processing module is implemented on the Programmable Logic side of the Xilinx ZCU104 development board, and the extracted speech features are sent back over the AXI bus to the Processing System side for post-processing. The MFCC hardware architecture consumes 4.26 W on the FPGA and, at an operating frequency of 150 MHz, processes a 2-second utterance within 53.6 ms. The subsequent post-processing retains high accuracy, so the overall system meets real-time requirements.
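
As a rough check on the real-time claim, the figures quoted in the abstract can be turned into a real-time factor, a cycle count, and an energy estimate. This is only a back-of-the-envelope sketch: the function name `real_time_factor` is illustrative, and the energy figure assumes the reported 4.26 W applies for the whole 53.6 ms window, which the abstract does not state.

```python
# Figures quoted in the abstract.
AUDIO_SECONDS = 2.0        # length of the input utterance
PROCESS_SECONDS = 53.6e-3  # reported MFCC processing time
CLOCK_HZ = 150e6           # reported operating frequency
POWER_WATTS = 4.26         # reported FPGA power consumption


def real_time_factor(process_s: float, audio_s: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return process_s / audio_s


rtf = real_time_factor(PROCESS_SECONDS, AUDIO_SECONDS)

# Clock cycles spent on one 2-second utterance at the reported frequency.
cycles = PROCESS_SECONDS * CLOCK_HZ

# Assumption: the 4.26 W figure holds for the entire processing window.
energy_joules = POWER_WATTS * PROCESS_SECONDS

print(f"real-time factor: {rtf:.4f}")
print(f"clock cycles:     {cycles:.3e}")
print(f"energy (approx.): {energy_joules:.3f} J")
```

A real-time factor of about 0.027 means feature extraction takes under 3% of the utterance duration, which leaves most of the time budget for post-processing on the Processing System side.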
    Appears in collections: [Graduate Institute of Electrical Engineering] Master's and doctoral theses

    Files in this item:

    File        Description  Size  Format  Views
    index.html  -            0Kb   HTML    56


    All items in NCUIR are protected by their original copyright.
