GRASS: The Graz Corpus of Read and Spontaneous Speech

Barbara Schuppler, Martin Hagmüller, Juan Andrés Morales Cordovilla, Hannes Pessentheiner

Publication: Contribution to book/report/conference proceedings; Conference paper; Peer-reviewed


This paper describes the preparation, the speakers, the recordings, and the creation of the orthographic transcriptions of the first large-scale speech database for Austrian German. It contains approximately 1900 minutes of read and spontaneous speech produced by 38 speakers. The corpus consists of three components. First, the Conversation Speech (CS) component contains free conversations, each one hour in length, between friends, colleagues, couples, or family members. Second, the Commands Component (CC) contains commands and keywords that were either read or elicited by pictures. Third, the Read Speech (RS) component contains phonetically balanced sentences and digits. The speech of all components was recorded at super-wideband quality in a soundproof recording studio with head-mounted microphones, large-diaphragm microphones, a laryngograph, and a video camera. The orthographic transcriptions, which were created and subsequently corrected manually, contain approximately 290 000 word tokens from 15 000 different word types.
Title: 9th Conference on Language Resources and Evaluation Conference (LREC 2014)
Place of publication: Red Hook, NY
Publisher: Curran
ISBN (Print): 9781632666215
Publication status: Published - 2014
Event: 9th International Conference on Language Resources and Evaluation: LREC 2014 - Reykjavik, Iceland
Duration: 26 May 2014 to 31 May 2014


Conference: 9th International Conference on Language Resources and Evaluation
Short title: LREC 2014

Fields of Expertise

  • Information, Communication & Computing

Treatment code (detailed classification)

  • Application
  • Basic - Fundamental (basic research)

