Speech signal denoising with wavelets

Alak, Iman Khalil

URI: http://hdl.handle.net/20.500.12416/2099

Date: 2018-05-23

Abstract:

This study examines the performance of the wavelet transform for denoising speech signals. Wavelets are widely used in digital speech processing, especially in coding, enhancement, and noise removal. In many conditions, recognizing natural speech is challenging because of background noise. The goal of a speech denoising algorithm is to recover the original speech signal by removing noise with minimum distortion. Various methods exist for restoring speech from noisy recordings. Many denoising algorithms work in the frequency domain, where the power spectral density (PSD) of the noisy signal can be examined over short time frames. The short-time spectral frequency and amplitude of the clean speech are then estimated for each frame of the noisy signal, and the limitations of each method introduce estimation errors. Various spectral estimation techniques have been investigated for decades to reduce these errors. In this study, the discrete wavelet transform (DWT) is used to denoise a noisy input speech signal. Its performance is evaluated with different wavelet filters, such as Daubechies, Symlets, and Coiflets. The analysis was performed in MATLAB. The noisy input signals were speech recordings containing environmental background noise, such as babble noise (a crowd of people) or vehicle noise (cars, trains, planes, etc.). This noise was filtered from the speech signal by wavelet analysis. The noisy input signal was decomposed, and four threshold selection rules were applied to the wavelet coefficients: sqtwolog, heursure, rigrsure, and minimaxi, each with hard or soft thresholding. The reconstructed speech was compared with the original speech signal by measuring the signal-to-noise ratio (SNR) and mean squared error (MSE) between the noisy and output signals.
The contributions are a detailed comparison of the performance of different wavelet families against different background noise types and the identification of an effective method, the maximal overlap DWT (MODWT), for denoising noisy speech signals.
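
The pipeline the abstract describes (decompose, threshold the detail coefficients, reconstruct, then measure SNR and MSE) can be sketched in a few lines. The sketch below is illustrative only and is not the thesis code. It is written in Python rather than MATLAB, uses only the Haar wavelet (Daubechies db1), and uses only the universal threshold sqrt(2 ln n), which is the rule MATLAB calls sqtwolog. The function names, the median-based noise estimate, the three decomposition levels, and the synthetic sine-plus-noise test signal are all assumptions made for the example. The other threshold rules and the MODWT are not shown.

```python
import math
import random

def haar_dwt(x):
    """One level of the orthonormal Haar (db1) DWT; len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out

def threshold(coeffs, t, mode="soft"):
    """Hard: zero coefficients with |c| <= t. Soft: also shrink the rest by t."""
    if mode == "hard":
        return [c if abs(c) > t else 0.0 for c in coeffs]
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def universal_threshold(finest_detail, n):
    """sigma * sqrt(2 ln n); sigma is estimated as median(|d|) / 0.6745
    from the finest detail coefficients (an assumption of this sketch)."""
    mags = sorted(abs(d) for d in finest_detail)
    mid = len(mags) // 2
    median = mags[mid] if len(mags) % 2 else 0.5 * (mags[mid - 1] + mags[mid])
    return (median / 0.6745) * math.sqrt(2.0 * math.log(n))

def denoise(x, levels=3, mode="soft"):
    """Multi-level Haar denoising; len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    t = universal_threshold(details[0], len(x))
    details = [threshold(d, t, mode) for d in details]
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

def mse(ref, est):
    """Mean squared error between a reference and an estimate."""
    return sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)

def snr_db(ref, est):
    """SNR in dB, treating (ref - est) as the residual noise."""
    signal_power = sum(r * r for r in ref)
    noise_power = sum((r - e) ** 2 for r, e in zip(ref, est))
    return 10.0 * math.log10(signal_power / noise_power)

# Synthetic stand-in for speech: a slow sine plus white Gaussian noise.
random.seed(0)
n = 1024
clean = [math.sin(2.0 * math.pi * 4.0 * i / n) for i in range(n)]
noisy = [c + random.gauss(0.0, 0.3) for c in clean]
denoised = denoise(noisy, levels=3, mode="soft")

print("SNR noisy    (dB):", snr_db(clean, noisy))
print("SNR denoised (dB):", snr_db(clean, denoised))
print("MSE noisy        :", mse(clean, noisy))
print("MSE denoised     :", mse(clean, denoised))
```

The demo scores both signals against a clean reference, which is only available because the test signal is synthetic. Passing `mode="hard"` to `denoise` switches to hard thresholding, so the two variants can be compared on the same input.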
