Towards Light-weight, Real-time-capable Singing Voice Detection
Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013)
We present a study indicating that singing voice detection
(the problem of identifying those parts of a polyphonic
audio recording where one or several persons sing)
can be realised with substantially fewer (and less expensive)
features than used in current state-of-the-art methods.
Essentially, we show that MFCCs alone, if appropriately
optimised and used with a suitable classifier, are sufficient
to achieve detection results that seem on par with the
state of the art, at least as far as this can be ascertained
by direct, fair comparisons to existing systems. To make
this comparison, we select three relevant publications from
the literature where publicly accessible training/test data
were used, and where the experimental setup is described
in enough detail for us to perform fair comparison experiments.
The experiments show that our simple,
optimised MFCC-based classifier achieves at least comparable
identification results, but with (in some cases much)
less computational effort, and without any need for extensive
lookahead, thus paving the way to on-line, real-time
voice detection applications.