Huang, Qiang, Xu, Yong, Jackson, P. J. B., Wang, Wenwu, and Plumbley, Mark D. “Fast Tagging of Natural Sounds Using Marginal Co-regularization.” Proceedings of ICASSP 2017 (2017).


Automatic and fast tagging of natural sounds in audio collections is a very challenging task due to wide acoustic variations, the large number of possible tags, and the incomplete and ambiguous tags provided by different labellers. To handle these problems, we use a co-regularization approach to learn a pair of classifiers on sound and text. The first classifier maps low-level audio features to a true tag list. The second classifier maps actively corrupted tags to the true tags, which reduces incorrect mappings caused by low-level acoustic variations in the first classifier and augments the tags with additional relevant tags. The classifiers are trained using marginal co-regularization, which draws the two classifiers into agreement through a joint optimization. We evaluate this approach on two sound datasets, Freefield1010 and Task 4 of DCASE 2016. The results show that marginal co-regularization outperforms the baseline GMM in both efficiency and effectiveness.
