Exploiting Temporal Correlation in Pitch-Adaptive Speech Enhancement

Publication: Contribution to journal, Article, peer-reviewed


This paper addresses the single-channel speech enhancement problem. We propose a pitch-adaptive short-time Fourier transform (PASTFT) framework to obtain a signal-dependent time-frequency representation of the input signal. We analyze the inter-frame correlation of successive speech DFT bins resulting from the PASTFT and harmonic signal modeling. This analysis reveals significant correlation once the phase progression introduced by the harmonic nature of the speech signal is taken into account. Hence, we model successive speech DFT bins as complex-valued autoregressive processes and propose to incorporate the harmonic phase progression into a state-transition model. We estimate the corresponding model parameters by exploiting circular statistics and assume that the additive noise DFT coefficients are uncorrelated over time. Based on this propagation model, we propose a pitch-adaptive complex-valued Kalman filter for speech enhancement. The effectiveness of the proposed speech enhancement method is demonstrated in terms of instrumental speech quality and intelligibility predictors. The results indicate a good balance between speech distortions and preservation of speech intelligibility of the input signal compared to the benchmark methods.
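The state-transition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a first-order complex autoregressive model for a single DFT bin, a known fundamental frequency, and known process and noise variances, and it omits the PASTFT analysis and the circular-statistics parameter estimation entirely. All function names and parameter values (`harmonic_phase_advance`, `kalman_track_bin`, `demo`, the hop size, the variances) are hypothetical and chosen only for illustration.

```python
import cmath
import math
import random


def harmonic_phase_advance(f0_hz, harmonic, hop, fs_hz):
    """Phase advance (radians) of one harmonic across a frame hop.

    Assumption: the harmonic is a stationary sinusoid at harmonic * f0_hz.
    """
    return 2.0 * math.pi * harmonic * f0_hz * hop / fs_hz


def kalman_track_bin(noisy, phase_advance, a_mag, q_var, r_var):
    """Scalar complex Kalman filter along the frames of one DFT bin.

    Assumed (illustrative) model:
      s[l] = a_mag * exp(1j * phase_advance) * s[l-1] + w[l],  var(w) = q_var
      y[l] = s[l] + n[l],                                      var(n) = r_var
    with noise n[l] uncorrelated over time.
    """
    transition = a_mag * cmath.exp(1j * phase_advance)
    s_est = noisy[0]   # initialise with the first noisy observation
    p_est = r_var      # its error variance equals the noise variance
    estimates = [s_est]
    for y in noisy[1:]:
        # Prediction: rotate the previous estimate by the harmonic phase advance.
        s_pred = transition * s_est
        p_pred = abs(transition) ** 2 * p_est + q_var
        # Update: blend the prediction with the new noisy observation.
        gain = p_pred / (p_pred + r_var)
        s_est = s_pred + gain * (y - s_pred)
        p_est = (1.0 - gain) * p_pred
        estimates.append(s_est)
    return estimates


def complex_gauss(rng, variance):
    """Circular complex Gaussian sample with the given total variance."""
    sigma = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def demo(seed=0, frames=200):
    """Simulate one bin under the assumed model; return (input MSE, output MSE)."""
    rng = random.Random(seed)
    phi = harmonic_phase_advance(f0_hz=120.0, harmonic=3, hop=128, fs_hz=16000.0)
    a_mag, q_var, r_var = 0.98, 0.01, 1.0
    transition = a_mag * cmath.exp(1j * phi)
    s = complex(1.0, 0.0)
    clean, noisy = [], []
    for _ in range(frames):
        s = transition * s + complex_gauss(rng, q_var)
        clean.append(s)
        noisy.append(s + complex_gauss(rng, r_var))
    est = kalman_track_bin(noisy, phi, a_mag, q_var, r_var)
    mse_in = sum(abs(y - c) ** 2 for y, c in zip(noisy, clean)) / frames
    mse_out = sum(abs(e - c) ** 2 for e, c in zip(est, clean)) / frames
    return mse_in, mse_out


if __name__ == "__main__":
    mse_in, mse_out = demo()
    print("noisy MSE:", mse_in, "filtered MSE:", mse_out)
```

In this sketch the phase rotation in `transition` is what makes successive bin values predictable from one another. With the rotation removed (setting `phase_advance` to zero while the signal still rotates), the prediction would point in the wrong direction and the filter would lose most of its benefit.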
Pages (from - to): 1-13
Journal: Speech Communication
Publication status: Published - 2019

ASJC Scopus subject areas

  • Software
  • Communication
  • Language and Linguistics
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Modelling and Simulation
  • Linguistics and Language
