Abstract
This paper describes the preparation, the speakers, the recordings, and the creation of the orthographic transcriptions of the first large-scale speech database for Austrian German. It contains approximately 1900 minutes of read and spontaneous speech produced by 38 speakers. The corpus consists of three components. First, the Conversation Speech (CS) component contains free conversations of one hour in length between friends, colleagues, couples, or family members. Second, the Commands Component (CC) contains commands and keywords that were either read or elicited by pictures. Third, the Read Speech (RS) component contains phonetically balanced sentences and digits. The speech of all components was recorded at super-wideband quality in a soundproof recording studio with head-mounted microphones, large-diaphragm microphones, a laryngograph, and a video camera. The orthographic transcriptions, which were created and subsequently corrected manually, contain approximately 290 000 word tokens from 15 000 different word types.
Original language | English |
---|---|
Title | 9th Conference on Language Resources and Evaluation Conference (LREC 2014) |
Place of publication | Red Hook, NY |
Publisher | Curran |
Pages | 1465-1470 |
Volume | 2 |
ISBN (Print) | 9781632666215 |
Publication status | Published - 2014 |
Event | 9th International Conference on Language Resources and Evaluation: LREC 2014 - Reykjavik, Iceland. Duration: 26 May 2014 → 31 May 2014 |
Conference
Conference | 9th International Conference on Language Resources and Evaluation |
---|---|
Short title | LREC 2014 |
Country/Territory | Iceland |
City | Reykjavik |
Period | 26/05/14 → 31/05/14 |
Fields of Expertise
- Information, Communication & Computing
Treatment code (detailed classification)
- Application
- Basic - Fundamental (basic research)
Fingerprint
Explore the research topics of “GRASS: The Graz Corpus of Read and Spontaneous Speech”. Together they form a unique fingerprint.
Projects
- 1 Finished
- CLCS - Cross-layer Aussprachemodelle für Spontansprache
Schuppler, B. (Principal Investigator)
1/09/12 → 30/04/17
Project: Research project
Publications
- 1 Abstract
- 10 Years of GRASS development: Experiences from annotating a large corpus of conversational Austrian German
Schuppler, B., Kelterer, A. & Hagmüller, M., 2023. Publication: Conference contribution › Abstract