Abstract
This paper addresses the problem of distant speech recognition in reverberant, noisy conditions using a microphone array. We present a prototype system that segments utterances in real time and generates robust ASR results offline. Segmentation is carried out by a voice activity detector based on deep belief networks, speaker localization by a position-pitch plane, and enhancement by a novel combination of convex-optimized beamforming and vector Taylor series compensation. Each component is compared with similar alternatives and justified in terms of word accuracy on a proposed database that simulates distant speech recognition in a home environment.
Original language | English |
---|---|
Title | 22nd European Signal Processing Conference |
Pages | 2380-2384 |
ISBN (electronic) | 9780992862619 |
Publication status | Published - 2014 |
Event | 22nd European Signal Processing Conference: EUSIPCO 2014 - Lisbon, Portugal Duration: 1 Sept. 2014 → 5 Sept. 2014 |
Conference
Conference | 22nd European Signal Processing Conference |
---|---|
Short title | EUSIPCO |
Country/Territory | Portugal |
Location | Lisbon |
Period | 1/09/14 → 5/09/14 |
Fields of Expertise
- Information, Communication & Computing
Treatment code (detailed classification)
- Basic - Fundamental (basic research)
- Application