I have just released the two Demucs baselines for the bleeding and labelnoise sub-challenges. You can check out this repo to get started: GitHub - adefossez/sdx23: SDX23 startkit for the Demucs baselines.
On labelnoise, the baseline is at 5 dB overall, currently ranking third. Eval is still running for bleeding.
Actually, the time-frequency uncertainty is not the issue here. In particular, if you take a complex spectrogram model, it will also be able to surpass the Ideal Ratio Mask (IRM) oracle, as you can represent any possible output with a complex spectrogram. The limitation of the IRM is that it masks the input spectrogram, and in particular always reuses the complex phase of the input. This is bad for some instruments, like percussive sounds, where getting the phase wrong will make transients (the attack of a drum, for instance) sound hollow or empty. The phase will be wrong if, for instance, other instruments overlap in the frequency domain (which is likely, because percussive sounds cover all frequencies during the attack), as the input phase will then be a blend of the two instruments. Waveform models, or complex spectrogram models (but not masking ones), can actually predict the right phase and overcome the IRM oracle.
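To make this concrete, here is a minimal numpy sketch of the point above: the IRM scales magnitudes but inherits the mixture phase, while a complex-spectrogram prediction can in principle recover the true source exactly. A single FFT frame stands in for a full STFT, and all names are illustrative, not from any specific codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
drums = rng.standard_normal(1024)   # percussive-like source
other = rng.standard_normal(1024)   # overlapping accompaniment
mix = drums + other

MIX = np.fft.rfft(mix)
DRUMS = np.fft.rfft(drums)
OTHER = np.fft.rfft(other)

# Ideal Ratio Mask: a real-valued mask on magnitudes.
# Crucially, the phase comes from the mixture, not the target source.
irm = np.abs(DRUMS) / (np.abs(DRUMS) + np.abs(OTHER) + 1e-8)
irm_estimate = irm * np.abs(MIX) * np.exp(1j * np.angle(MIX))

# A complex-spectrogram (non-masking) model can output any complex
# value, so a perfect prediction recovers the source exactly.
oracle_complex = DRUMS

# Wherever the two sources overlap, the mixture phase is a blend of
# the two, so the IRM estimate cannot match the true drum signal.
irm_error = np.linalg.norm(np.fft.irfft(irm_estimate) - drums)
```

Running this shows a nonzero `irm_error` even with an oracle mask, while inverting `oracle_complex` gives back `drums` exactly.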
For leaderboard B, I used 150 extra tracks. At the end of the competition I realised it was fine to use the test set from MusDB as well, so I fine-tuned including that too (keeping only the validation set out).
If there were no time limit, I could only marginally improve Demucs, maybe by 0.1 dB. Also, a limiting factor at the moment, especially for the hybrid model, is not runtime but GPU memory during training.
I used a mixture of 4 models for my final submissions. For track A, it was a mixture of hybrid and non-hybrid models, trained with different seeds. For track B, it was all hybrid models trained with different seeds.
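A minimal sketch of this kind of ensembling: average the per-source waveform estimates of several models trained with different seeds. `ensemble_separate` and the toy stand-in "models" below are illustrative, not the actual submission code.

```python
import numpy as np

def ensemble_separate(models, mix):
    """Average per-source waveform estimates across an ensemble.

    Each model maps a mixture waveform to an array of shape
    (sources, time); the ensemble output is the elementwise mean.
    """
    outputs = [model(mix) for model in models]
    return np.mean(outputs, axis=0)

# Toy demo: four fake "models" returning constant offsets of the mix,
# standing in for checkpoints trained with different seeds.
mix = np.zeros(8)
fake_models = [lambda m, k=k: np.stack([m + k, m - k]) for k in range(4)]
est = ensemble_separate(fake_models, mix)  # shape (2, 8)
```

Averaging waveforms directly is the simplest combination rule; it works because the per-seed errors are partly decorrelated, so they tend to cancel in the mean.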
I will probably keep working on Demucs in the longer term, but for the immediate future I will take a break from source separation and mostly work on other deep learning problems. Also, while for the MDX challenge I mostly built on the existing Demucs architecture, I am not sure if the next iteration will be the same, or something completely different.
I will write the paper in September, and most likely release everything in October.
Is it possible to watch the webinar? Was it recorded? Thanks!
As I am currently on vacation, I cannot attend the town hall in person. I made a short video to present the key novelties of my submission to the challenge. Please ask any questions here; I will try to answer as soon as possible, given that I am still on vacation.
Sorry, I missed the info on the deadline, August 6. This is extremely short notice… To be fair, I think it is not realistic to ask for such a deadline. Of course I can just dump my code, but what is the point? The ISMIR workshop is in November, and I don't see the rush in getting some dirty code out when it could be well polished and accompanied by a research paper. The rules only said we needed to open source, not that we needed to open source within one week…
Hello @GiorgioFabbro, thank you for the clarification. Following my employer's policy, I might not be able to publish source code on a platform other than our facebookresearch GitHub. I should be able to push a new version of Demucs based on my findings in the competition to the Demucs repository (on GitHub, facebookresearch/demucs) and push a solution on your GitLab that mostly makes use of that code. You did not tell me the timeline for open sourcing. Depending on which solution I choose, I would need more or less time to obtain approval from my employer. I will also be on vacation starting on Friday until September, so the timeline is quite important for that too.
I have a few questions on how things will go after the competition. I saw the email with the README to add to our submission. What is the timeline to add the README and release code? Do we have control over how the code will be open sourced?
Thank you for the kind words, and thanks a lot for the organisation! It is definitely a great environment for pushing the limits; had it been for a paper, I would have stopped sooner.
Thanks for the clarification, and let's see if @shivam confirms.
Are all our submitted models evaluated for the final score? Or only our best model at the end of round 2? In other words, can submitting too many models (and in particular one with the best score on the 18 songs) end up being detrimental to the final score?