Description
In the last decade, conversational speech has received a lot of attention among speech scientists. On the one hand, accurate automatic speech recognition (ASR) systems are essential for conversational dialogue systems, as these become more interactional and social rather than solely transactional. On the other hand, linguists study natural conversations because they reveal insights into how speech processing works that go beyond those gained from controlled experiments. Investigating conversational speech, however, requires not only applying existing methods to new data, but also developing new categories and new modeling techniques and including new knowledge sources. Whereas traditional models are trained on either text or acoustic information, I propose language models that incorporate information on the phonetic variation of words (i.e., pronunciation variation and prosody) and relate this information to the semantic context of the conversation and to the communicative functions in the conversation. This approach to language modeling is in line with the theoretical model proposed by Hawkins and Smith (2001), in which the perceptual system accesses meaning from speech by using the most salient sensory information from any combination of levels/layers of formal linguistic analysis. The overall aim of my research is to create cross-layer models for conversational speech. In this talk, I will illustrate general challenges for ASR with conversational speech, present results from my recent and ongoing projects on pronunciation and prosody modeling, and discuss directions for future research.
Period: 31 Oct 2019
Held at: Brno University of Technology, Czech Republic
Degree of Recognition: Regional
Documents & Links
Automatic detection of prosodic boundaries in two varieties of German
Research output: Contribution to conference › Abstract › peer-review
Prosodic Effects on Plosive Duration in German and Austrian German
Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review
On the use of acoustic features for automatic disambiguation of homophones in spontaneous German
Research output: Contribution to journal › Article › peer-review
Introduction, or: why rethink reduction?
Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review
FWF - CLCS_2 - Cross-layer prosodic models for conversational speech
Project: Research project
Prize: Fellowship awarded competitively