Acoustic Scene Classification Using A Convolutional Neural Network Ensemble and Nearest Neighbor Filters

Thi Kim Truc Nguyen, Franz Pernkopf

Research output: Contribution to conference › Poster › peer-review

Abstract

This paper proposes Convolutional Neural Network (CNN) ensembles for acoustic scene classification in tasks 1A and 1B of the DCASE 2018 challenge. We introduce a nearest neighbor filter applied to spectrograms, which emphasizes and smooths similar patterns of sound events in a scene. We also propose a variety of CNN models for single-input (SI) and multi-input (MI) channels and three different methods for building a network ensemble. The experimental results show that for task 1A the MI-CNN structures, which combine log-mel features with their nearest neighbor filtered versions, are slightly more effective than the SI-CNN models using log-mel features only. For task 1B the opposite holds, with the SI-CNN models performing better. In addition, the ensemble methods improve the accuracy of the system significantly. The best ensemble method is ensemble selection, which achieves 69.3% for task 1A and 63.6% for task 1B, improving on the baseline system by 8.9% and 14.4%, respectively.
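The abstract does not specify how the nearest neighbor filter is computed, so the following is only a minimal sketch of one common form of the idea, not the authors' implementation. It assumes a spectrogram stored as a NumPy array of shape (bins, frames), cosine similarity between frames, and mean aggregation over the k most similar frames; the function name `nn_filter` and the parameter `k` are hypothetical.

```python
import numpy as np


def nn_filter(spec, k=5):
    """Replace each frame by the mean of its k most similar other frames.

    spec: 2-D array of shape (n_bins, n_frames), e.g. a log-mel spectrogram.
    k:    number of neighbors per frame (hypothetical default).
    Similarity is cosine similarity between frames; this choice and the
    mean aggregation are assumptions, not taken from the paper.
    """
    spec = np.asarray(spec, dtype=float)
    n_frames = spec.shape[1]
    k = min(k, n_frames - 1)
    if k < 1:
        return spec.copy()

    # Normalize each frame (column) to unit length, guarding against zeros.
    norms = np.linalg.norm(spec, axis=0, keepdims=True)
    unit = spec / np.maximum(norms, 1e-12)

    # Frame-by-frame cosine similarity; exclude each frame from its own
    # neighbor set by setting the diagonal to -inf.
    sim = unit.T @ unit
    np.fill_diagonal(sim, -np.inf)

    # Indices of the k most similar frames for every frame: (n_frames, k).
    neighbors = np.argsort(-sim, axis=1)[:, :k]

    # Gather neighbor frames, shape (n_bins, n_frames, k), and average them.
    return spec[:, neighbors].mean(axis=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_mel = rng.normal(size=(40, 100))  # stand-in for a real log-mel input
    filtered = nn_filter(log_mel, k=5)
    print(filtered.shape)
```

Under these assumptions, a multi-input model in the sense of the abstract would receive both `log_mel` and `filtered` as separate input channels, while a single-input model would receive `log_mel` only.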
Original language: English
Publication status: Published - 20 Nov 2018

ASJC Scopus subject areas

  • Engineering (all)
