Blind source separation (BSS) of independent sources from their convolutive mixtures arises in many real-world applications. We aim to provide a high-performance, scalable VLSI architecture for convolutive BSS. On the algorithm side, we adopt the BSS structure proposed by Torkkola, whose learning rule resembles the least mean squares (LMS) algorithm. Because raising the clock rate and throughput of the hardware requires a shorter critical path, we adapt the delayed LMS (DLMS) adaptive filter, a close relative of the LMS adaptive filter, and apply it to convolutive BSS. This thesis proposes two VLSI architectures for the DLMS adaptive filter. The first, Proposed-I, substantially improves the adaptation delay: in most cases it holds the adaptation delay at 3 while keeping the critical path at one multiplication and one addition. The second, Proposed-II, shares multipliers to improve both the adaptation delay and the critical path: in most cases it needs only 6 adaptation delays, and because the critical path passes only through the precomputer bank, it is shorter than one multiplication time (1 Tm). Both architectures also reduce the effect of the number of delays and of the filter length. The proposed convolutive BSS design is built on these two DLMS architectures and covers both the feedforward and the feedback forms of the BSS algorithm. Because the DLMS adaptive filter is modular and its critical path length is preserved, the resulting VLSI architecture for convolutive blind source separation is both scalable and high-performance.
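To make the role of the adaptation delay concrete, the following is a minimal software sketch of the textbook DLMS update, w(n+1) = w(n) + mu * e(n-D) * x(n-D), in which the weights are updated with an error and input vector that are D samples old. It is an illustration only and not the Proposed-I or Proposed-II hardware architecture; the function name `dlms_filter` and its parameters (`num_taps`, `mu`, `delay`) are hypothetical names chosen for this example.

```python
from collections import deque


def dlms_filter(x, d, num_taps, mu, delay):
    """Run a delayed LMS adaptive filter (illustrative sketch).

    x        : input samples
    d        : desired (reference) samples, same length as x
    num_taps : filter length (assumed name)
    mu       : step size (assumed name)
    delay    : adaptation delay D; delay=0 gives ordinary LMS
    Returns the final weights and the list of error samples.
    """
    w = [0.0] * num_taps
    x_vec = [0.0] * num_taps   # tapped delay line, newest sample first
    pending = deque()          # (error, input vector) pairs not yet applied
    errors = []
    for x_n, d_n in zip(x, d):
        # Shift the new sample into the delay line.
        x_vec = [x_n] + x_vec[:-1]
        # Filter output and error computed with the current weights.
        y_n = sum(wi * xi for wi, xi in zip(w, x_vec))
        e_n = d_n - y_n
        errors.append(e_n)
        pending.append((e_n, x_vec))
        # Apply the update that is `delay` samples old, once available.
        if len(pending) > delay:
            e_old, x_old = pending.popleft()
            w = [wi + mu * e_old * xi for wi, xi in zip(w, x_old)]
    return w, errors
```

In hardware, the delay D corresponds to pipeline registers that shorten the critical path; in this software model it only shows the cost side, namely that each weight update uses stale error information, which is why keeping D small matters for convergence.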