Acoustic Scene Classification for Mismatched Recording Devices Using Heated-Up Softmax and Spectrum Correction

Truc Nguyen, Michał Kośmider, Franz Pernkopf

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review


Deep neural networks (DNNs) are successful in applications where the inference and training distributions match. In real-world scenarios, DNNs have to cope with truly new data samples during inference, potentially coming from a shifted data distribution, which usually causes a drop in performance. Acoustic scene classification (ASC) with different recording devices is one such situation. Furthermore, an imbalance in the quality and amount of data recorded by different devices poses severe challenges. In this paper, we introduce two calibration methods to tackle these challenges. In particular, we apply scaling of the features to deal with the varying frequency response of the recording devices. Furthermore, to account for the shifted data distribution, a heated-up softmax is embedded to calibrate the predictions of the model. We use robust and resource-efficient models and show the efficiency of the heated-up softmax. Our ASC system reaches state-of-the-art performance on the development set of DCASE challenge 2019 task 1B with only ~70K parameters. It achieves 70.1% average classification accuracy for device B and device C. It performs on par with the best single-model system of the DCASE 2019 challenge and outperforms the baseline system by 28.7% (absolute).
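
The two calibration ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the parameterization `softmax(z / T)` and the estimation of per-frequency coefficients as a ratio of mean magnitudes between a reference device and a target device are assumptions made here for illustration. All function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature T > 0 (assumed form).

    T > 1 flattens the distribution (less confident predictions),
    T < 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def correction_coefficients(reference_spectra, device_spectra):
    """Per-frequency-bin scaling coefficients (assumed estimator).

    Each argument is a list of magnitude-spectrum frames, one list of
    bin values per frame. The coefficient for a bin is the mean
    reference magnitude divided by the mean device magnitude.
    """
    n_bins = len(reference_spectra[0])
    coeffs = []
    for k in range(n_bins):
        ref_mean = sum(f[k] for f in reference_spectra) / len(reference_spectra)
        dev_mean = sum(f[k] for f in device_spectra) / len(device_spectra)
        coeffs.append(ref_mean / dev_mean)
    return coeffs


def apply_correction(spectrum, coeffs):
    """Scale one magnitude-spectrum frame bin by bin."""
    return [x * c for x, c in zip(spectrum, coeffs)]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]
    print(softmax_with_temperature(logits, temperature=1.0))
    print(softmax_with_temperature(logits, temperature=4.0))

    reference = [[2.0, 4.0], [2.0, 4.0]]  # toy frames from a reference device
    device = [[1.0, 8.0], [1.0, 8.0]]     # toy frames from a mismatched device
    coeffs = correction_coefficients(reference, device)
    print(apply_correction(device[0], coeffs))
```

In this toy setup, raising the temperature lowers the largest class probability, and the corrected device frame matches the reference frame bin by bin. A real system would work on spectrogram arrays and would have to choose the temperature, for example on held-out data.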

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Number of pages: 5
ISBN (Electronic): 9781509066315
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing: ICASSP 2020 - Virtual, Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020


Conference: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2020
City: Virtual, Barcelona


  • Acoustic scene classification
  • calibration of confidence prediction
  • heated-up softmax
  • spectrum correction
  • temperature scaling

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
