This paper presents a novel approach to improving the accuracy of deep learning models for acoustic event detection and classification in real-world environments. We introduce a method that uses activation maps to identify and address model overfitting, combined with an expert-knowledge-based event detection algorithm for data pre-processing. Our approach raised the F1 score from 0.65 for the baseline model to 0.96 for the optimized model. The method was evaluated on a diverse validation dataset of 100 samples for each of 8 classes of urban acoustic events, including gunshots, explosions, and screams. Gradient-weighted Class Activation Mapping (Grad-CAM) visualizations confirmed that the enhanced model focuses on relevant signal components and relies less on irrelevant background information. Additionally, the expert detection module enables efficient online processing by bypassing the classifier for non-event signals. This research demonstrates the effectiveness of combining explainable AI techniques with domain expertise to improve the robustness and efficiency of acoustic event classification systems.
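The idea of bypassing the classifier for non-event signals can be illustrated with a minimal sketch. The abstract does not describe the expert detection algorithm itself, so the short-time energy rule below is only a stand-in assumption, and the names `frame_energy_db`, `contains_event`, `process`, and the `margin_db` threshold are hypothetical.

```python
import numpy as np


def frame_energy_db(signal, frame_len=1024, hop=512):
    """Return the short-time energy of each frame in dB."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    energy = np.mean(frames ** 2, axis=1)
    return 10.0 * np.log10(energy + 1e-12)  # small offset avoids log(0)


def contains_event(signal, margin_db=10.0):
    """Hypothetical detector (assumption, not the paper's algorithm):
    flag the signal if any frame exceeds the median level by margin_db."""
    levels = frame_energy_db(signal)
    return bool(np.max(levels) - np.median(levels) > margin_db)


def process(signal, classifier):
    """Run the classifier only when the detector reports an event."""
    if not contains_event(signal):
        return None  # non-event signal: the classifier is bypassed
    return classifier(signal)
```

In an online setting, `process` would be called on each incoming buffer, so the cost of the (typically much heavier) classifier is paid only for buffers that the detector flags.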
Additional information
- DOI: 10.23919/spa61993.2024.10715605
- Category: Conference activity
- Type: publication in a peer-reviewed collective work (including conference proceedings)
- Language: English
- Year of publication: 2024