Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information

Pejman Mowlaee Beikzadehmahaleh; Josef Kulmer

doi:10.1109/TASLP.2015.2439038

Institut für Signalverarbeitung und Sprachkommunikation (4420)

Publication: Contribution to journal › Article › Peer-reviewed

Abstract

In conventional single-channel speech enhancement, typically the noisy spectral amplitude is modified while the noisy phase is used to reconstruct the enhanced signal. Several recent attempts have shown the effectiveness of utilizing an improved spectral phase for phase-aware speech enhancement and consequently its positive impact on the perceived speech quality. In this paper, we present a harmonic phase estimation method relying on fundamental frequency and signal-to-noise ratio (SNR) information estimated from noisy speech. The proposed method relies on SNR-based time-frequency smoothing of the unwrapped phase obtained from the decomposition of the noisy phase. To incorporate the uncertainty in the estimated phase due to unreliable voicing decision and SNR estimate, we propose a binary hypothesis test assuming speech-present and speech-absent classes representing high and low SNRs. The effectiveness of the proposed phase estimation method is evaluated for both phase-only enhancement of noisy speech and in combination with an amplitude-only enhancement scheme. We show that by enhancing the noisy phase both perceived speech quality as well as speech intelligibility are improved as predicted by the instrumental metrics and justified by subjective listening tests.

Original language	English
Pages (from-to)	1521-1532
Journal	IEEE Transactions on Audio Speech and Language Processing
Volume	23
Issue number	9
DOIs	https://doi.org/10.1109/TASLP.2015.2439038
Publication status	Published - 2015

Fields of Expertise

Information, Communication & Computing

Access to Document

10.1109/TASLP.2015.2439038

FWF - Phase - Signalverarbeitung für die Sprachübertragung unter Berücksichtigung der Phase
Mowlaee Beikzadehmahaleh, P.
1/10/15 → 31/07/19
Project: Research project

Cite this

@article{6e6a0e7d1260475884912367608e341c,

title = "Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information",

abstract = "In conventional single-channel speech enhancement, typically the noisy spectral amplitude is modified while the noisy phase is used to reconstruct the enhanced signal. Several recent attempts have shown the effectiveness of utilizing an improved spectral phase for phase-aware speech enhancement and consequently its positive impact on the perceived speech quality. In this paper, we present a harmonic phase estimation method relying on fundamental frequency and signal-to-noise ratio (SNR) information estimated from noisy speech. The proposed method relies on SNR-based time-frequency smoothing of the unwrapped phase obtained from the decomposition of the noisy phase. To incorporate the uncertainty in the estimated phase due to unreliable voicing decision and SNR estimate, we propose a binary hypothesis test assuming speech-present and speech-absent classes representing high and low SNRs. The effectiveness of the proposed phase estimation method is evaluated for both phase-only enhancement of noisy speech and in combination with an amplitude-only enhancement scheme. We show that by enhancing the noisy phase both perceived speech quality as well as speech intelligibility are improved as predicted by the instrumental metrics and justified by subjective listening tests.",

author = "{Mowlaee Beikzadehmahaleh}, Pejman and Josef Kulmer",

year = "2015",

doi = "10.1109/TASLP.2015.2439038",

language = "English",

volume = "23",

pages = "1521--1532",

journal = "IEEE Transactions on Audio Speech and Language Processing",

issn = "1558-7924",

publisher = "Institute of Electrical and Electronics Engineers",

number = "9",

}

TY - JOUR

T1 - Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information

AU - Mowlaee Beikzadehmahaleh, Pejman

AU - Kulmer, Josef

PY - 2015

Y1 - 2015

N2 - In conventional single-channel speech enhancement, typically the noisy spectral amplitude is modified while the noisy phase is used to reconstruct the enhanced signal. Several recent attempts have shown the effectiveness of utilizing an improved spectral phase for phase-aware speech enhancement and consequently its positive impact on the perceived speech quality. In this paper, we present a harmonic phase estimation method relying on fundamental frequency and signal-to-noise ratio (SNR) information estimated from noisy speech. The proposed method relies on SNR-based time-frequency smoothing of the unwrapped phase obtained from the decomposition of the noisy phase. To incorporate the uncertainty in the estimated phase due to unreliable voicing decision and SNR estimate, we propose a binary hypothesis test assuming speech-present and speech-absent classes representing high and low SNRs. The effectiveness of the proposed phase estimation method is evaluated for both phase-only enhancement of noisy speech and in combination with an amplitude-only enhancement scheme. We show that by enhancing the noisy phase both perceived speech quality as well as speech intelligibility are improved as predicted by the instrumental metrics and justified by subjective listening tests.

AB - In conventional single-channel speech enhancement, typically the noisy spectral amplitude is modified while the noisy phase is used to reconstruct the enhanced signal. Several recent attempts have shown the effectiveness of utilizing an improved spectral phase for phase-aware speech enhancement and consequently its positive impact on the perceived speech quality. In this paper, we present a harmonic phase estimation method relying on fundamental frequency and signal-to-noise ratio (SNR) information estimated from noisy speech. The proposed method relies on SNR-based time-frequency smoothing of the unwrapped phase obtained from the decomposition of the noisy phase. To incorporate the uncertainty in the estimated phase due to unreliable voicing decision and SNR estimate, we propose a binary hypothesis test assuming speech-present and speech-absent classes representing high and low SNRs. The effectiveness of the proposed phase estimation method is evaluated for both phase-only enhancement of noisy speech and in combination with an amplitude-only enhancement scheme. We show that by enhancing the noisy phase both perceived speech quality as well as speech intelligibility are improved as predicted by the instrumental metrics and justified by subjective listening tests.

U2 - 10.1109/TASLP.2015.2439038

DO - 10.1109/TASLP.2015.2439038

M3 - Article

SN - 1558-7924

VL - 23

SP - 1521

EP - 1532

JO - IEEE Transactions on Audio Speech and Language Processing

JF - IEEE Transactions on Audio Speech and Language Processing

IS - 9

ER -
