Khaled Koutini, Hamid Eghbal-Zadeh, Gerhard Widmer, "CP-JKU submissions to DCASE'19: Acoustic Scene Classification and Audio Tagging with Receptive-Field-Regularized CNNs", Detection and Classification of Acoustic Scenes and Events 2019 Workshop, New York, 2019
CP-JKU submissions to DCASE'19: Acoustic Scene Classification and Audio Tagging with Receptive-Field-Regularized CNNs
In this report, we detail the CP-JKU submissions to the DCASE-2019 challenge Task 1 (acoustic scene classification) and Task 2 (audio tagging with noisy labels and minimal supervision). In all of our submissions, we use fully convolutional deep neural network architectures that are regularized with Receptive Field (RF) adjustments. We adjust the RF of variants of ResNet and DenseNet architectures to best fit the various audio processing tasks that use spectrogram features as input. Additionally, we propose novel CNN layers such as Frequency-Aware CNNs, and new noise compensation techniques such as Adaptive Weighting for Learning from Noisy Labels, to cope with the complexities of each task. We prepared all of our submissions without the use of any external data. Our focus in this year's submissions is to provide the best-performing single-model submission, using our proposed approaches.
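As a rough illustration of what adjusting the receptive field of a CNN means, the sketch below computes the receptive field along one axis of a stack of convolution layers from their kernel sizes and strides, using the standard recurrence. The abstract does not state how the RF is adjusted; replacing 3x3 kernels with 1x1 kernels in deeper layers is only one possible way, assumed here for the example. The function name `receptive_field` and both layer lists are hypothetical and are not the configurations used in the submissions.

```python
def receptive_field(layers):
    """Receptive field (in input bins) along one axis of a conv stack.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    Dilation and padding are ignored in this simplified sketch.
    """
    rf = 1    # a single output unit initially sees one input bin
    jump = 1  # distance in input bins between adjacent units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap widens the view by one jump
        jump *= stride             # striding spreads later units further apart
    return rf


# Hypothetical stacks (assumption, not taken from the report):
# the second replaces the deeper 3-wide kernels with 1-wide ones.
wide_stack = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1)]
narrow_stack = [(3, 1), (3, 2), (1, 1), (1, 2), (1, 1)]

print(receptive_field(wide_stack))    # 21
print(receptive_field(narrow_stack))  # 5
```

Under these assumptions, both stacks have the same depth, but the second one sees a much smaller region of the spectrogram per output unit.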