GRASS: The Graz Corpus of Read and Spontaneous Speech

Barbara Schuppler, Martin Hagmüller, Juan Andrés Morales Cordovilla, Hannes Pessentheiner

Publication: Contribution to book/report/conference proceedings; conference paper; peer-reviewed

Abstract

This paper describes the preparation, the speakers, the recordings, and the creation of the orthographic transcriptions of the first large-scale speech database for Austrian German. It contains approximately 1900 minutes of read and spontaneous speech produced by 38 speakers. The corpus consists of three components. First, the Conversation Speech (CS) component contains free conversations of one hour in length between friends, colleagues, couples, or family members. Second, the Commands Component (CC) contains commands and keywords that were either read or elicited by pictures. Third, the Read Speech (RS) component contains phonetically balanced sentences and digits. The speech of all components was recorded at super-wideband quality in a soundproof recording studio with head-mounted microphones, large-diaphragm microphones, a laryngograph, and a video camera. The orthographic transcriptions, which were created and subsequently corrected manually, contain approximately 290 000 word tokens from 15 000 different word types.
Original language: English
Title: 9th Conference on Language Resources and Evaluation Conference (LREC 2014)
Place of publication: Red Hook, NY
Publisher: Curran
Pages: 1465-1470
Volume: 2
ISBN (Print): 9781632666215
Publication status: Published - 2014
Event: 9th International Conference on Language Resources and Evaluation: LREC 2014 - Reykjavik, Iceland
Duration: 26 May 2014 to 31 May 2014

Conference

Conference: 9th International Conference on Language Resources and Evaluation
Short title: LREC 2014
Country/Territory: Iceland
City: Reykjavik
Period: 26/05/14 to 31/05/14

Fields of Expertise

  • Information, Communication & Computing

Treatment code (detailed classification)

  • Application
  • Basic - Fundamental (basic research)
