
⏰ Round 1 ends June 13th!

πŸ•΅οΈ Introduction

The Music Demixing (MDX) Challenge is an opportunity for researchers and machine learning enthusiasts to test their skills by creating a system able to perform audio source separation.

Such a system, given an audio signal as input (referred to as the "mixture"), decomposes it into its individual component signals.

Audio source separation comes in different flavors, depending on the signal the system operates on. Music source separation systems take a song as input and output one track for each instrument. Speech enhancement systems take noisy speech as input and separate the speech content from the noise.


Such a technology can be employed in many different areas, ranging from entertainment to hearing aids. For example, the original masters of old movies contain all the material (dialogue, music, and sound effects) mixed in mono or stereo: thanks to source separation, the individual components can be retrieved, allowing up-mixing to surround systems. Sony already restored two movies with this technology in its Columbia Classics collection. Karaoke systems can also benefit from audio source separation, as users can sing over any original song with the vocals suppressed, instead of picking from a set of "cover" songs specifically produced for karaoke.

The Music Demixing Challenge (MDX) focuses on music source separation and follows the long tradition of the SiSEC MUS campaigns (see the results of the 2018 edition, SiSEC MUS 2018). Participants will submit systems that separate a song into four stems: vocals, bass, drums, and other (the "other" stem contains the signals of all instruments other than the first three, e.g., guitar or piano).
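To make the task concrete, here is a minimal sketch (function and variable names are illustrative, not part of the challenge API) of the expected input/output convention: a stereo mixture in, one stereo track per stem out. In MUSDB18, the mixture is by construction the sum of the four stems.

```python
import numpy as np

STEMS = ("vocals", "bass", "drums", "other")

def separate(mixture: np.ndarray) -> dict:
    """Toy 'separator' used only to illustrate shapes: it splits the
    mixture equally across the four stems instead of actually separating.

    mixture: float array of shape (nb_samples, nb_channels).
    """
    return {stem: mixture / len(STEMS) for stem in STEMS}

# Build a toy song: in MUSDB18, mixture == vocals + bass + drums + other.
rng = np.random.default_rng(0)
stems = {s: rng.standard_normal((44100, 2)) for s in STEMS}
mixture = sum(stems.values())

estimates = separate(mixture)
print({s: e.shape for s, e in estimates.items()})
```

A real system replaces the body of `separate` with an actual model; the interface (one mixture in, four same-length stems out) is what the evaluation expects.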

πŸ“ Datasets

Training Data

Participants may either train their system exclusively on the training set of the MUSDB18-HQ dataset or use training data of their own choice. Depending on the data used, they are eligible for Leaderboard A or Leaderboard B, respectively.

Hidden Test Data

The test set of the MDX challenge will be closed: participants will not have access to it, not even outside the challenge itself; this allows a fair comparison of all submissions. The set was created by Sony Music Entertainment (Japan) Inc. (SMEJ) with the specific intent to use it for the evaluation of the MDX challenge. It is therefore confidential and will not be shared with anyone outside the organization of the MDX challenge. 

πŸ† Leaderboards

The MDX challenge will feature two leaderboards.

Leaderboard A

Participants in Leaderboard A are allowed to train their system exclusively on the training part of the MUSDB18-HQ dataset. This dataset has become the standard in the literature, as it is free to use and allows anyone to start training source separation models.

Participants who use the compressed version of the dataset (MUSDB18) are still eligible for Leaderboard A.

Leaderboard B

Participants in Leaderboard B, by contrast, are not constrained in their choice of training data and may use any available material.

💰 Prizes

The total prize pool is 10,000 CHF, split equally between the two leaderboards (5,000 CHF each).

Leaderboard A

  • 🥇 1st: 3500 CHF

  • 🥈 2nd: 1000 CHF

  • 🥉 3rd: 500 CHF

Leaderboard B

  • 🥇 1st: 3500 CHF

  • 🥈 2nd: 1000 CHF

  • 🥉 3rd: 500 CHF

Participants are eligible for prizes on both leaderboards.

💪 Getting Started

The starter kit of the competition is available at https://github.com/AIcrowd/music-demixing-challenge-starter-kit.

🚀 Baseline Systems

The MDX challenge will feature two baselines:

The first one is Open-Unmix (UMX), an open-source reference model for music source separation.

The second one is CrossNet-UMX (X-UMX), a recent model that extends UMX with connections bridging its per-source networks.

🖊 Evaluation Metric

As an evaluation metric, we are using the signal-to-distortion ratio (SDR), which is defined as

SDR_instr = 10 · log10( Σ_n ‖s_instr(n)‖² / Σ_n ‖s_instr(n) − ŝ_instr(n)‖² ),

where s_instr(n) is the waveform of the ground truth and ŝ_instr(n) denotes the waveform of the estimate. The higher the SDR score, the better the output of the system is.

In order to rank systems, we will use the average SDR per song, computed as

SDR_song = (SDR_bass + SDR_drums + SDR_other + SDR_vocals) / 4.

Finally, the overall score is obtained by averaging SDR_song over all songs in the hidden test set.

The following Python code shows how they are computed:

import numpy as np

nb_sources, nb_samples, nb_channels = 4, 100000, 2
references = np.random.rand(nb_sources, nb_samples, nb_channels)
estimates = np.random.rand(nb_sources, nb_samples, nb_channels)

def sdr(references, estimates):
    # Compute the per-instrument SDR (in dB) for one song.
    # references, estimates: shape (nb_sources, nb_samples, nb_channels)
    delta = 1e-7  # avoid numerical errors (division by / log of zero)
    num = np.sum(np.square(references), axis=(1, 2)) + delta
    den = np.sum(np.square(references - estimates), axis=(1, 2)) + delta
    return 10 * np.log10(num / den)

sdr_instr = sdr(references, estimates)
sdr_song = np.mean(sdr_instr)

print(f'SDR for individual instruments: {sdr_instr}')
print(f'SDR for full song: {sdr_song}')
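As an illustrative sanity check (not part of the official evaluation), two properties of this metric are easy to verify: a perfect estimate scores very high (limited only by the delta term), and outputting silence for every stem scores exactly 0 dB, since numerator and denominator then coincide.

```python
import numpy as np

def sdr(references, estimates):
    # Same metric as above: per-source SDR (in dB) for one song.
    delta = 1e-7  # avoid numerical errors
    num = np.sum(np.square(references), axis=(1, 2)) + delta
    den = np.sum(np.square(references - estimates), axis=(1, 2)) + delta
    return 10 * np.log10(num / den)

rng = np.random.default_rng(0)
references = rng.standard_normal((4, 100000, 2))

perfect = sdr(references, references)                 # denominator collapses to delta
silence = sdr(references, np.zeros_like(references))  # num == den -> 0 dB

print(perfect)  # every value far above 0 dB
print(silence)  # every value (numerically) 0 dB
```

This also shows why 0 dB is a useful mental baseline: any system whose output is worse than silence scores negative.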

Please note that the organizers (Sony and INRIA) will not get access to the submitted entries: everything is handled by AIcrowd, which guarantees the security of your submissions. However, the organizers plan to write an academic paper and, for this purpose, will get access to the outputs (i.e., the separations) of the top-10 entries on each leaderboard. For more information, please see the challenge rules.

🚟 ISMIR Workshop

We will host a satellite workshop of ISMIR 2021, which will give all participants the opportunity to come together and share their experiences from this challenge.

📅 Competition Timeline

The MDX challenge will take place in two rounds, which differ in the portion of the evaluation dataset used for ranking the systems.

To this end, we split the hidden dataset into three (roughly) equally sized parts. During Round 1, participants can see the scores of their submissions on the first third of the hidden dataset. During Round 2, participants can see their scores on the first and second thirds of the hidden dataset.

The ranking of the final leaderboard will be based on the scores on all songs of the hidden test set.

Here is a summary of the competition timeline:

📅 Round 1: May 3rd - June 13th, 12 PM UTC

📅 Round 2: June 14th - July 31st, 12 PM UTC

🥶 Team Freeze deadline: July 23rd, 12 PM UTC

Beginning of August

  • End of the challenge, Final leaderboard is made public with scores on all songs from the hidden test set.
  • Distribution of prizes based on this final leaderboard.

🔗 Links

📱 Challenge Organizers

  • Yuki Mitsufuji, Sony Group Corporation, R&D Center, Japan 

  • Giorgio Fabbro, Sony Group Corporation, R&D Center, Germany 

  • Stefan Uhlich, Sony Group Corporation, R&D Center, Germany 

  • Fabian-Robert StΓΆter, INRIA, France