Sound Anomaly Transfer
Reference
- Müller, R., Ritz, F., Illium, S., and Linnhoff-Popien, C. 2020. Acoustic anomaly detection for machine sounds based on image transfer learning. arXiv preprint arXiv:2006.03429.
This study investigates acoustic anomaly detection in industrial machinery, aiming to identify malfunctions through sound analysis. The core methodology leverages transfer learning: deep neural networks originally trained for large-scale image classification (e.g., on ImageNet) are repurposed as feature extractors for audio data represented as mel-spectrograms.
The process involves:
- Converting audio signals from machinery into mel-spectrogram images.
- Feeding these spectrograms into various pretrained image classification networks (specifically comparing ResNet architectures against AlexNet and SqueezeNet) to extract deep feature representations.
- Training standard anomaly detection models – particularly Gaussian Mixture Models (GMMs) and One-Class Support Vector Machines (OC-SVMs) – on the features extracted from normal operation sounds.
- Classifying new sounds as anomalous if their extracted features deviate significantly from the learned normality model.
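The steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random feature vectors stand in for deep embeddings that would, in practice, be extracted by a pretrained ResNet from mel-spectrograms, and the percentile-based threshold is an assumed heuristic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for deep features of NORMAL machine sounds; in practice these
# would be embeddings from a pretrained image network (e.g., ResNet)
# applied to mel-spectrograms of the audio.
normal_features = rng.normal(loc=0.0, scale=1.0, size=(200, 16))

# Fit a Gaussian Mixture Model on normal-operation features only.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(normal_features)

# Derive a decision threshold from the normal data, e.g. the 1st percentile
# of log-likelihood scores (an assumed heuristic, not taken from the paper).
threshold = np.percentile(gmm.score_samples(normal_features), 1)

def is_anomalous(features):
    """Flag samples whose likelihood under the normality model is too low."""
    return gmm.score_samples(features) < threshold

# A mean-shifted feature vector stands in for an anomalous machine sound;
# it lies far from the learned normal distribution, so it is flagged.
anomalous = rng.normal(loc=5.0, scale=1.0, size=(1, 16))
print(is_anomalous(anomalous))
```

A One-Class SVM (`sklearn.svm.OneClassSVM`) could be swapped in for the GMM with the same fit-on-normal, score-new-samples structure.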
Key findings from the experiments, conducted across different machine types and noise conditions, include:
- The proposed transfer learning approach significantly outperformed baseline methods such as traditional convolutional autoencoders, especially in the presence of background noise.
- Features extracted using ResNet architectures consistently yielded superior anomaly detection performance compared to those from AlexNet and SqueezeNet.
- GMMs and OC-SVMs proved highly effective as anomaly detectors when applied to these transferred features.

This work demonstrates the surprising effectiveness of transferring knowledge from the visual domain to the acoustic domain for anomaly detection, offering a robust and readily implementable method for monitoring industrial equipment. [Müller et al. 2020]