Exploiting Temporal Correlation in Pitch-Adaptive Speech Enhancement

Research output: Contribution to journal › Article › peer-review

Abstract

This work addresses the single-channel speech enhancement problem. We propose a pitch-adaptive short-time Fourier transform (PASTFT) framework to obtain a signal-dependent time-frequency representation of the input signal. We analyze the inter-frame correlation of successive speech DFT bins resulting from the PASTFT and harmonic signal modeling. This analysis reveals significant correlation if the phase progression introduced by the harmonic nature of the speech signal is taken into account. Hence, we model successive speech DFT bins as complex-valued autoregressive processes and propose to incorporate the harmonic phase progression into a state-transition model. We estimate the corresponding model parameters by exploiting circular statistics and assume that the additive noise DFT coefficients are uncorrelated over time. Based on this propagation model, we propose a pitch-adaptive complex-valued Kalman filter for speech enhancement. The effectiveness of the proposed speech enhancement method is demonstrated in terms of instrumental speech quality and intelligibility predictors. The results indicate a good balance between speech distortions and preservation of speech intelligibility of the input signal compared to the benchmark methods.
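The following Python sketch illustrates the general idea behind the abstract for a single DFT bin. It is not the authors' algorithm: it omits the PASTFT and the pitch estimation, and it assumes a first-order complex autoregressive model s[l] = a·exp(jΔφ)·s[l-1] + w[l] observed in temporally white noise. The function names (circular_mean_phase_step, complex_kalman, simulate_bin) and the parameters a, dphi, q, and r are hypothetical and chosen only for illustration.

```python
import numpy as np


def circular_mean_phase_step(x):
    """Circular mean of the frame-to-frame phase increments of x.

    Returns the mean angle and the mean resultant length (0..1), which
    indicates how concentrated the increments are around that angle.
    """
    prod = x[1:] * np.conj(x[:-1])                   # phase of prod = phase increment
    unit = prod / np.maximum(np.abs(prod), 1e-12)    # project onto the unit circle
    resultant = unit.mean()
    return np.angle(resultant), np.abs(resultant)


def complex_kalman(y, a, dphi, q, r):
    """Scalar complex-valued Kalman filter for one DFT bin (illustrative only).

    Assumed state model:  s[l] = a * exp(1j * dphi) * s[l-1] + w[l], var(w) = q
    Assumed observation:  y[l] = s[l] + n[l],                        var(n) = r
    """
    A = a * np.exp(1j * dphi)            # state transition including phase progression
    s_est = np.zeros(len(y), dtype=complex)
    s, p = y[0], r                       # initialize at the first noisy observation
    s_est[0] = s
    for l in range(1, len(y)):
        s_pred = A * s                   # predict the next coefficient
        p_pred = abs(A) ** 2 * p + q     # predicted error variance
        k = p_pred / (p_pred + r)        # Kalman gain (real-valued in this scalar case)
        s = s_pred + k * (y[l] - s_pred) # update with the innovation
        p = (1.0 - k) * p_pred
        s_est[l] = s
    return s_est


def simulate_bin(n_frames, a, dphi, r, rng):
    """Generate a synthetic unit-variance AR(1) bin and its noisy observation."""
    A = a * np.exp(1j * dphi)
    q = 1.0 - a ** 2                     # keeps the clean process at unit variance

    def cnoise(var, size):
        # Circular complex Gaussian noise with total variance `var`.
        return np.sqrt(var / 2) * (rng.standard_normal(size)
                                   + 1j * rng.standard_normal(size))

    w = cnoise(q, n_frames)
    s = np.zeros(n_frames, dtype=complex)
    s[0] = cnoise(1.0, 1)[0]
    for l in range(1, n_frames):
        s[l] = A * s[l - 1] + w[l]
    y = s + cnoise(r, n_frames)
    return s, y
```

A typical use would generate a pair with simulate_bin, estimate the phase step with circular_mean_phase_step, and pass it to complex_kalman as dphi. Without the phase term in the state transition, successive coefficients of a harmonic signal would appear nearly uncorrelated, which is the observation the abstract builds on.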
Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: Speech Communication
Volume: 111
Publication status: Published - 2019

Keywords

  • Circular statistics
  • Kalman filter
  • Pitch-adaptive
  • Speech enhancement

ASJC Scopus subject areas

  • Software
  • Communication
  • Language and Linguistics
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Modelling and Simulation
  • Linguistics and Language
