
🚀 Getting Started Code with Random Predictions

💪 Baseline Code

❓  Have a question? Visit the discussion forum

💻 How to claim AWS credits

## 🕵️ Introduction

Your classifiers from Task 1 were a hit, and your neuroscientist colleagues are now using them to quantify social behaviors! However, other labs have started using your classifiers in the same experimental setting and found that the predicted behaviors disagreed with their own annotations. You realize that different labs are working from different "in-house" definitions of these three social behaviors. You resolve to create a new version of your classifiers to match the "style" of each new lab's annotations, so that you can determine where labs disagree in their definitions of behaviors.

Task 2 is an extension of Task 1, where the goal is to take models trained on Task 1 and adapt them to datasets annotated by other individuals, by learning to capture each annotator's particular annotation "style".

As in Task 1, your goal is to predict bouts of attack, mounting, and close investigation from hand-labeled examples. Training and test sets will be annotated by multiple individuals: for each test set item, be sure to make your behavior predictions in the style of the annotator specified in the annotator_id field!

If you want to try unsupervised feature learning or clustering, you can make use of the videos in our global test set, which is shared across challenge tasks. The global test set contains tracked poses from almost 300 additional videos of interacting mice. (Note that a portion of videos in the global test set are also used to evaluate your performance on this task.)

## 💾 Dataset

We provide frame-by-frame annotation data and animal pose estimates extracted from top-view videos of interacting mice recorded at 30Hz; raw videos will not be provided. Videos for all three challenge tasks use a standard resident-intruder assay format, in which a mouse in its home cage is presented with an unfamiliar intruder mouse, and the animals are allowed to freely interact for several minutes under experimenter supervision. The identity of the annotator for each video is provided in an annotator_id field for all training and test sequences.

## 📁 Files

The following files are available in the resources section. Note that a "sequence" is the same thing as one video: one continuous recording of social interactions between animals, with a duration between 1-2 and 10+ minutes, filmed at 30 frames per second.

• train.npy - Training set for the task, which follows the schema below:

```
{
  "vocabulary": ['attack', 'investigation', 'mount', 'other'],
  "sequences": {
    "<sequence_id>": {
      "keypoints": an ndarray of shape (frames, 2, 2, 7), where "frames" is the number of frames in the sequence. More details about the individual keypoints are provided in Task 1.
      "annotations": a list containing the behavior annotation for each frame, given as the index of the corresponding entry in "vocabulary" (so in this example, 0 = 'attack', 1 = 'investigation', etc.)
      "annotator_id": a unique ID for the individual who annotated this video.
    }
  }
}
```
• test.npy - Test set for the task, which follows the schema below (note that this is the same file for all three tasks):

```
{
  "<sequence_id>": {
    "keypoints": an ndarray of shape (frames, 2, 2, 7), where "frames" is the number of frames in the sequence. More details about the individual keypoints are provided in Task 1.
    "annotator_id": a unique ID for the annotator who labeled this sequence. For Task 2, this value indicates which annotator "style" should be used to predict the animals' actions on the provided sequence.
  }
}
```
• sample_submission.npy - Template for a sample submission for this task, which follows the schema below:

```
{
  "<sequence_id-1>": [0, 0, 1, 2, .....],
  "<sequence_id-2>": [0, 1, 2, 0, .....]
}
```


In sample_submission, each key in the dictionary is the unique sequence id of a video in the test set. The value for each key is expected to be a list of integers whose length equals the number of frames in that sequence, each integer being the index of the predicted behavior in the vocabulary field of train.npy.

Only sequences with annotator_id 1-5 will count towards your score for Task 2; however, you must submit predictions for all sequences for your entry to be parsed correctly.
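To make the expected format concrete, here is a minimal sketch of building a submission dictionary. The `build_random_submission` helper is hypothetical (it just emits random labels, like the getting-started baseline); the schema and file names follow the dataset description above, and a tiny synthetic stand-in is used here in place of the real test.npy:

```python
import numpy as np

def build_random_submission(test_data, vocabulary, seed=0):
    """Map each sequence id to one predicted label index per frame.

    Here the 'predictions' are uniform random label indices; a real
    entry would replace this with model output, per annotator style.
    """
    rng = np.random.default_rng(seed)
    submission = {}
    for seq_id, seq in test_data.items():
        n_frames = seq["keypoints"].shape[0]
        submission[seq_id] = rng.integers(0, len(vocabulary), size=n_frames).tolist()
    return submission

# In practice the data comes from the provided files, e.g.
#   test = np.load("test.npy", allow_pickle=True).item()
# Here we use a tiny synthetic stand-in with the same schema.
test = {
    "seq_0": {"keypoints": np.zeros((5, 2, 2, 7)), "annotator_id": 1},
    "seq_1": {"keypoints": np.zeros((3, 2, 2, 7)), "annotator_id": 2},
}
vocabulary = ["attack", "investigation", "mount", "other"]
submission = build_random_submission(test, vocabulary)
np.save("my_submission.npy", submission)
```

Each list in the submission must be exactly as long as the corresponding sequence, and every entry must be a valid index into the vocabulary.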

## 🚀 Submission

• Sample submission format is described in the dataset section above.

Make your first submission here 🚀 !!

To test out the system, you can start by uploading the provided sample_submission.npy. When you make your own submissions, they should follow the same format.

## 🖊 Evaluation Criteria

During evaluation, the F1 score is the primary score by which teams are judged. Precision is used as a tie-breaking secondary score when teams produce identical F1 scores. Both the F1 score and the precision are macro-averaged across behavior classes, and the 'other' class is excluded from the macro-averaging for both scores.
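The macro-averaging described above can be sketched as follows. This is a plain-Python illustration, not the official scoring script: per-class precision and F1 are computed for the three behavior classes only, and the 'other' class (index 3) is excluded simply by not listing it among the scored classes:

```python
def macro_scores(y_true, y_pred, classes):
    """Macro-averaged F1 and precision over the given class indices.

    Classes not listed (e.g. index 3, 'other') contribute nothing
    to either average, matching the evaluation description above.
    """
    f1s, precisions = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        f1s.append(f1)
    return sum(f1s) / len(f1s), sum(precisions) / len(precisions)

# Toy example: indices 0-2 are attack/investigation/mount; 3 is 'other'.
y_true = [0, 0, 1, 2, 3, 3]
y_pred = [0, 1, 1, 2, 3, 0]
macro_f1, macro_p = macro_scores(y_true, y_pred, classes=[0, 1, 2])
```

The same result can be obtained with scikit-learn's `f1_score`/`precision_score` using `average='macro'` and a `labels` list that omits the 'other' class.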

## 🏆 Prizes

The cash prize pool across the 3 tasks is $9000 USD total (sponsored by Amazon and Northwestern). Prizes will be awarded for all 3 tasks; for each task, the prize pool is as follows:
• 🥇 1st on the leaderboard: $1500 USD
• 🥈 2nd on the leaderboard: $1000 USD
• 🥉 3rd on the leaderboard: $500 USD

Please check out this post to see how to claim credits.