High-rate data embedding in unvoiced speech

Konrad Hofbauer; Gernot Kubin

Institute of Signal Processing and Speech Communication (4420)

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

We propose a blind speech watermarking algorithm which allows high-rate embedding of digital side information into speech signals. We exploit the fact that the well-known LPC vocoder works very well for unvoiced speech. Using an auto-correlation based pitch tracking algorithm, a voiced/unvoiced segmentation is carried out. In the unvoiced segments, the linear prediction residual is replaced by a data sequence. This substitution does not cause perceptual degradation as long as the residual's power is matched. The signal is resynthesised using the unmodified LPC filter coefficients. The watermark is decoded by a linear prediction analysis of the received signal and the information is extracted from the sign of the residual. The watermark is nearly imperceptible and provides a channel capacity of up to 2000 bit/s in an 8 kHz-sampled speech signal.
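The embedding step described in the abstract can be illustrated with a minimal sketch for a single frame that is assumed to be already classified as unvoiced. It is not the authors' implementation. The function names, the frame length of 160 samples, the predictor order of 10, and the mapping of one bit per sample are assumptions made for illustration, and white noise stands in for real unvoiced speech. Voiced/unvoiced segmentation is omitted. The decoder here reuses the encoder's LPC coefficients so that recovery is exact. The abstract instead describes a blind decoder that re-estimates the coefficients from the received signal, which this sketch does not model.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Unnormalised autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return a[1..order] of A(z) = 1 + sum_k a[k] z^-k (autocorrelation method)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error power
    return a[1:]


def prediction_sum(signal, coeffs, n):
    """sum_k a[k] * signal[n - k] over the available past (zero initial state)."""
    return sum(c * signal[n - 1 - k]
               for k, c in enumerate(coeffs) if n - 1 - k >= 0)


def lpc_residual(x, coeffs):
    """Inverse (analysis) filter: e[n] = x[n] + sum_k a[k] x[n - k]."""
    return [x[n] + prediction_sum(x, coeffs, n) for n in range(len(x))]


def lpc_synthesis(excitation, coeffs):
    """All-pole synthesis filter: y[n] = e[n] - sum_k a[k] y[n - k]."""
    y = []
    for n, e in enumerate(excitation):
        y.append(e - prediction_sum(y, coeffs, n))
    return y


def embed_frame(frame, bits, order=10):
    """Replace the residual of one unvoiced frame by a power-matched +/- sequence.

    Assumes len(bits) == len(frame), i.e. one bit per sample (an assumption
    of this sketch, not the bit allocation used in the paper).
    """
    coeffs = levinson_durbin(autocorrelation(frame, order), order)
    residual = lpc_residual(frame, coeffs)
    gain = math.sqrt(sum(e * e for e in residual) / len(residual))
    data = [gain if b else -gain for b in bits]   # same power as the residual
    return lpc_synthesis(data, coeffs), coeffs


def extract_frame(received, coeffs):
    """Read the bits back from the sign of the residual (coefficients known)."""
    return [1 if e > 0 else 0 for e in lpc_residual(received, coeffs)]


if __name__ == "__main__":
    rng = random.Random(0)
    frame = [rng.gauss(0.0, 1.0) for _ in range(160)]   # 20 ms at 8 kHz
    bits = [rng.randint(0, 1) for _ in range(len(frame))]
    marked, coeffs = embed_frame(frame, bits)
    print(extract_frame(marked, coeffs) == bits)
```

Because the synthesis filter is the exact inverse of the analysis filter when both start from a zero state, the sign of the recovered residual matches the embedded bits in this idealised setting. A blind decoder works from re-estimated coefficients and a real transmission channel, so its bit recovery would not be exact in the same way.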

Original language	English
Title of host publication	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Pages	241 - 244
Volume	1
Publication status	Published - 2006
Event	9th International Conference on Spoken Language Processing: Interspeech 2006 - Pittsburgh, United States; Duration: 17 Sept 2006 → 21 Sept 2006

Conference

Conference	9th International Conference on Spoken Language Processing
Country/Territory	United States
City	Pittsburgh
Period	17/09/06 → 21/09/06

Access to Document

Hofbauer_ICSLP_2006_revised.pdf (Submitted manuscript, 490 KB)

Cite this

@inproceedings{24e74057094b4cc0af6b18f087c392be,
  title = "High-rate data embedding in unvoiced speech",
  abstract = "We propose a blind speech watermarking algorithm which allows high-rate embedding of digital side information into speech signals. We exploit the fact that the well-known LPC vocoder works very well for unvoiced speech. Using an auto-correlation based pitch tracking algorithm, a voiced/unvoiced segmentation is carried out. In the unvoiced segments, the linear prediction residual is replaced by a data sequence. This substitution does not cause perceptual degradation as long as the residual's power is matched. The signal is resynthesised using the unmodified LPC filter coefficients. The watermark is decoded by a linear prediction analysis of the received signal and the information is extracted from the sign of the residual. The watermark is nearly imperceptible and provides a channel capacity of up to 2000 bit/s in an 8 kHz-sampled speech signal.",
  author = "Konrad Hofbauer and Gernot Kubin",
  year = "2006",
  language = "English",
  isbn = "978-160423449-7",
  volume = "1",
  pages = "241--244",
  booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  note = "9th International Conference on Spoken Language Processing : Interspeech 2006 ; Conference date: 17-09-2006 Through 21-09-2006",
}

TY - GEN
T1 - High-rate data embedding in unvoiced speech
AU - Hofbauer, Konrad
AU - Kubin, Gernot
PY - 2006
Y1 - 2006
N2 - We propose a blind speech watermarking algorithm which allows high-rate embedding of digital side information into speech signals. We exploit the fact that the well-known LPC vocoder works very well for unvoiced speech. Using an auto-correlation based pitch tracking algorithm, a voiced/unvoiced segmentation is carried out. In the unvoiced segments, the linear prediction residual is replaced by a data sequence. This substitution does not cause perceptual degradation as long as the residual's power is matched. The signal is resynthesised using the unmodified LPC filter coefficients. The watermark is decoded by a linear prediction analysis of the received signal and the information is extracted from the sign of the residual. The watermark is nearly imperceptible and provides a channel capacity of up to 2000 bit/s in an 8 kHz-sampled speech signal.
AB - We propose a blind speech watermarking algorithm which allows high-rate embedding of digital side information into speech signals. We exploit the fact that the well-known LPC vocoder works very well for unvoiced speech. Using an auto-correlation based pitch tracking algorithm, a voiced/unvoiced segmentation is carried out. In the unvoiced segments, the linear prediction residual is replaced by a data sequence. This substitution does not cause perceptual degradation as long as the residual's power is matched. The signal is resynthesised using the unmodified LPC filter coefficients. The watermark is decoded by a linear prediction analysis of the received signal and the information is extracted from the sign of the residual. The watermark is nearly imperceptible and provides a channel capacity of up to 2000 bit/s in an 8 kHz-sampled speech signal.
M3 - Conference paper
SN - 978-160423449-7
VL - 1
SP - 241
EP - 244
BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 9th International Conference on Spoken Language Processing
Y2 - 17 September 2006 through 21 September 2006
ER -
