High-rate data embedding in unvoiced speech

Konrad Hofbauer, Gernot Kubin

Research output: Chapter in Book/Report/Conference proceeding; Conference paper; peer-reviewed

Abstract

We propose a blind speech watermarking algorithm which allows high-rate embedding of digital side information into speech signals. We exploit the fact that the well-known LPC vocoder works very well for unvoiced speech. Using an auto-correlation based pitch tracking algorithm, a voiced/unvoiced segmentation is carried out. In the unvoiced segments, the linear prediction residual is replaced by a data sequence. This substitution does not cause perceptual degradation as long as the residual's power is matched. The signal is resynthesised using the unmodified LPC filter coefficients. The watermark is decoded by a linear prediction analysis of the received signal and the information is extracted from the sign of the residual. The watermark is nearly imperceptible and provides a channel capacity of up to 2000 bit/s in an 8 kHz-sampled speech signal.
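
The embedding and decoding steps described in the abstract can be illustrated with a minimal pure-Python sketch for a single frame. It is not the authors' implementation. The frame is assumed to be already classified as unvoiced (the pitch-tracking segmentation is omitted), and the function names, the LPC order of 10, the 160-sample frame and the one-bit-per-sample mapping are all illustrative assumptions rather than details from the paper. The optional `a` argument of `decode` is a debugging shortcut: a blind decoder re-estimates the LPC coefficients from the received signal, so some bit errors may occur in that mode.

```python
import math
import random

def lpc(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin; returns [1, a1, ..., ap]."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a

def residual(frame, a):
    """Analysis filter A(z): e[n] = sum_j a[j] * x[n-j], zero initial state."""
    return [sum(a[j] * frame[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(frame))]

def synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z), zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        out.append(e - sum(a[j] * out[n - j] for j in range(1, len(a)) if n - j >= 0))
    return out

def embed(frame, bits, order=10):
    """Replace the LP residual of an (assumed unvoiced) frame by a +/-gain data sequence."""
    a = lpc(frame, order)
    res = residual(frame, a)
    gain = math.sqrt(sum(e * e for e in res) / len(res))   # match the residual's power
    excitation = [gain if b else -gain for b in bits]      # assumed mapping: 1 -> +, 0 -> -
    return synthesize(excitation, a), a                    # unmodified LPC coefficients

def decode(received, order=10, a=None):
    """Read bits from the sign of the LP residual; re-estimates the LPC if `a` is None."""
    if a is None:
        a = lpc(received, order)
    return [1 if e > 0 else 0 for e in residual(received, a)]

if __name__ == "__main__":
    random.seed(0)
    frame = [random.gauss(0.0, 1.0) for _ in range(160)]   # white-noise stand-in for unvoiced speech
    bits = [random.randint(0, 1) for _ in range(160)]
    marked, _ = embed(frame, bits)
    errors = sum(b != d for b, d in zip(bits, decode(marked)))
    print("bit errors with blind decoding:", errors, "of", len(bits))
```

The sketch ignores channel distortion, frame synchronisation and filter state across frame boundaries, all of which a practical system would need to handle.
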
Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Pages: 241 - 244
Volume: 1
Publication status: Published - 2006
Event: 9th International Conference on Spoken Language Processing: Interspeech 2006 - Pittsburgh, United States
Duration: 17 Sept 2006 to 21 Sept 2006

Conference

Conference: 9th International Conference on Spoken Language Processing
Country/Territory: United States
City: Pittsburgh
Period: 17/09/06 to 21/09/06
