Fundamental Frequency Tracking in Diplophonic Voices

Philipp Aichinger, Martin Hagmüller, Imme Roesner, Berit Schneider-Stickler, J. Schoentgen, Franz Pernkopf

Research output: Contribution to journal › Article › peer-review


Background and objectives
Fundamental frequency (fo) extraction in disordered voices is a prerequisite for many types of clinical analyses. Special attention must be paid if multiple oscillators with different fos are active simultaneously. Two independent approaches to fo tracking in diplophonic voices are proposed and compared with a benchmark from the literature.

Material and methods
Six samples of sustained phonations were analyzed. High-speed videos were obtained in addition to audio recordings. Video-based fo tracks were obtained from cycle marks that indicate maximal vocal fold deflection in digital kymograms. Extraction based on audio waveform modeling involved candidate tracking, oscillator waveform synthesis and track selection. Extraction based on audio subband auto-correlation served as a benchmark.
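The step from cycle marks to an fo track can be illustrated with a minimal sketch. It assumes that each oscillator's cycle marks are available as a list of ascending time stamps in seconds and that fo is taken as the reciprocal of the interval between consecutive marks. The function name, the mark values and the midpoint time convention are hypothetical and are not taken from the article.

```python
def fo_track_from_cycle_marks(mark_times_s):
    """Convert ascending cycle-mark times (seconds) into (time, fo) pairs.

    Assumption: one mark per oscillation cycle, so the spacing between
    consecutive marks is one period and fo = 1 / period.
    """
    track = []
    for t0, t1 in zip(mark_times_s, mark_times_s[1:]):
        period = t1 - t0
        if period <= 0:
            raise ValueError("cycle marks must be strictly increasing")
        # Assign each estimate to the midpoint of its cycle.
        track.append(((t0 + t1) / 2.0, 1.0 / period))
    return track


# Hypothetical marks for two simultaneously active oscillators,
# e.g. one kymogram trace per oscillator.
marks_a = [i / 200.0 for i in range(6)]  # 5 ms period
marks_b = [i / 125.0 for i in range(6)]  # 8 ms period

track_a = fo_track_from_cycle_marks(marks_a)
track_b = fo_track_from_cycle_marks(marks_b)
```

Applying the function once per oscillator yields two simultaneous fo tracks, here close to 200 Hz and 125 Hz for the made-up marks.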

Results and discussion
Promising qualitative and quantitative agreement between the estimates based on audio waveform modeling and the kymogram-based tracks was observed. With reference to the kymogram-based tracks, extraction based on audio waveform modeling had a median total error rate of 1.9%, which is an improvement over the benchmark method (17.7%).
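How an error rate against a reference track might be computed can be sketched as follows. The abstract does not define the total error rate, so this is only an illustration under stated assumptions: estimate and reference are sampled on the same frames, a frame counts as an error when the estimate is missing or deviates from the reference by more than a relative tolerance, and the median is taken over per-sample rates. The function name, the 10% tolerance and all numbers are hypothetical.

```python
from statistics import median


def frame_error_rate(estimate_hz, reference_hz, rel_tol=0.1):
    """Fraction of frames where the estimate disagrees with the reference.

    Assumptions (not from the article): equal-length frame-aligned tracks,
    None marks a missing estimate, rel_tol is a relative deviation limit.
    """
    if len(estimate_hz) != len(reference_hz):
        raise ValueError("tracks must have the same number of frames")
    if not reference_hz:
        raise ValueError("tracks must not be empty")
    errors = 0
    for est, ref in zip(estimate_hz, reference_hz):
        if est is None or abs(est - ref) > rel_tol * ref:
            errors += 1
    return errors / len(reference_hz)


# Made-up tracks for one sample: one octave error and one missing frame.
reference = [200.0, 200.0, 202.0, 204.0]
estimate = [201.0, 100.0, 203.0, None]
rate = frame_error_rate(estimate, reference)

# Median over hypothetical per-sample rates.
median_rate = median([0.0, rate, 0.25])
```

The median is less sensitive than the mean to a single sample with a very high error rate, which may matter when only a few samples are analyzed.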

Conclusion
The results illustrate that fos of diplophonic voices may be validly obtained from kymogram cycle marks, as well as via audio waveform modeling. The acquisition of two simultaneous fo tracks in diplophonic voices may increase the validity of clinical voice analysis procedures in the future.

Original language: English
Pages (from-to): 69-81
Journal: Biomedical Signal Processing and Control
Publication status: Published - Aug 2017

