MABe 2022: Mouse Triplets

Getting Started - MABe 2022: Mouse Triplets Round 1

Explore the mouse tracking dataset and make your first submission with a simple PCA embedding. Also includes code for a cool animation for you to visualize the mice as they move around.

Problem Statement

Join the community!
chat on Discord

How to use this notebook 📝

  1. Copy the notebook. This is a shared template and any edits you make here will not be saved. You should copy it into your own drive folder. For this, click the "File" menu (top-left), then "Save a Copy in Drive". You can edit your copy however you like.
  2. Link it to your AIcrowd account. In order to submit your predictions to AIcrowd, you need to provide your account's API key.

Setup AIcrowd Utilities 🛠

In [ ]:
!pip install -U aicrowd-cli
%load_ext aicrowd.magic

Login to AIcrowd ㊗

In [ ]:
%aicrowd login

Install packages 🗃

Please add all package installations in this section.

In [ ]:
!pip install scikit-learn

Import necessary modules and packages 📚

In [ ]:
import os

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

Download the dataset 📲

In [ ]:
aicrowd_challenge_name = "mabe-2022-mouse-triplets"
if not os.path.exists('data'):
  os.mkdir('data')  # create the download directory on first run

# %aicrowd ds dl -c {aicrowd_challenge_name} -o data # Download all files
%aicrowd ds dl -c {aicrowd_challenge_name} -o data *submission_data* # download only the submission keypoint data
%aicrowd ds dl -c {aicrowd_challenge_name} -o data *user_train* # download data with the public task labels provided
In [ ]:
from google.colab import drive

Load Data

The dataset files are Python dictionaries. The data organization is described under Dataset Specifications.

In [ ]:
submission_clips = np.load('data/submission_data.npy',allow_pickle=True).item()
user_train = np.load('data/user_train.npy',allow_pickle=True).item()

Dataset Specifications 💾

We provide frame-by-frame animal pose estimates extracted from top-view videos of trios of interacting mice filmed at 30Hz; raw videos will not be provided for this stage of the competition. Animal poses are characterized by the tracked locations of body parts on each animal, termed "keypoints."

The following files are available in the resources section. A "sequence" is a continuous recording of social interactions between animals: sequences are 1 minute long (1800 frames at 30Hz) in the mouse dataset. The sequence_id is a random hash to anonymize experiment details.

  • user_train.npy - Training set for the task, which follows this schema:
    {
        "sequences" : {
            "<sequence_id>" : {
                "keypoints" : an ndarray of shape (1800, 3, 12, 2)
            }
        }
    }
  • submission_data.npy - Test set for the task, which follows this schema:
    {
        "sequences" : {
            "<sequence_id>" : {
                "keypoints" : an ndarray of shape (1800, 3, 12, 2)
            }
        }
    }
  • sample_submission.npy - Template for a sample submission for this task, which follows this schema:
    {
        "frame_number_map" : {
            "<sequence_id-1>": (start_frame_index, end_frame_index),
            "<sequence_id-2>": (start_frame_index, end_frame_index),
            ...
            "<sequence_id-n>": (start_frame_index, end_frame_index),
        },
        "embeddings" : [
            [0.321, 0.234, 0.186, 0.857, 0.482, 0.185],
            [0.184, 0.583, 0.475, 0.485, 0.275, 0.958],
            ...
        ]
    }

In sample_submission, each key in the frame_number_map dictionary is the unique sequence id of a video in the test set. The value for each key is the pair of start and end indices used to slice the embeddings numpy array to get that sequence's embeddings. The embeddings array is a 2D ndarray of floats of size total_frames by X, where X is the dimension of your learned embedding (6 in the above example; the maximum permitted embedding dimension is 128), and each row is the embedded value of one frame. total_frames is the sum of the frame counts of all sequences; the array should be the concatenation of the embeddings of all the clips.
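
To make the layout concrete, here is a minimal sketch that builds a toy submission and slices one clip back out of it. The sequence ids, clip lengths, and placeholder values are made up for illustration; only the keys frame_number_map and embeddings come from the format described above.

```python
import numpy as np

# Hypothetical clip lengths for illustration; real clips have 1800 frames each.
toy_clip_lengths = {"seq_a": 4, "seq_b": 3}
toy_embed_dim = 6  # must not exceed 128

toy_map = {}
chunks = []
start = 0
for seq_id, n_frames in toy_clip_lengths.items():
    end = start + n_frames
    toy_map[seq_id] = (start, end)
    # Placeholder rows; a real submission would compute these from keypoints.
    chunks.append(np.full((n_frames, toy_embed_dim), start, dtype=np.float32))
    start = end

# One row per frame, clips concatenated in the same order as the map.
toy_embeddings = np.concatenate(chunks, axis=0)
toy_submission = {"frame_number_map": toy_map, "embeddings": toy_embeddings}

# Recover the embeddings of one clip by slicing with its (start, end) pair.
s, e = toy_submission["frame_number_map"]["seq_b"]
seq_b_embeddings = toy_submission["embeddings"][s:e]
print(seq_b_embeddings.shape)
```

The end index is exclusive, so end minus start equals the number of frames in the clip.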

What does the data look like? 🔍

In [ ]:
print("Dataset keys - ", submission_clips.keys())
print("Number of submission sequences - ", len(submission_clips['sequences']))

Sample overview

In [ ]:
sequence_names = list(submission_clips["sequences"].keys())
sequence_key = sequence_names[0]
single_sequence = submission_clips["sequences"][sequence_key]["keypoints"]
print("Sequence name - ", sequence_key)
print("Single Sequence shape ", single_sequence.shape)
print(f"Number of Frames in {sequence_key} - ", len(single_sequence))

Keypoints are stored in an ndarray with the following properties:

  • Dimensions: (# frames) x (animal ID) x (body part) x (x, y coordinate).
  • Units: pixels; coordinates are relative to the entire image. Original image dimensions are 850 x 850 for the mouse dataset.

Body parts are ordered: 1) nose, 2) left ear, 3) right ear, 4) neck, 5) left forepaw, 6) right forepaw, 7) center back, 8) left hindpaw, 9) right hindpaw, 10) tail base, 11) tail middle, 12) tail tip.

The placement of these keypoints is illustrated below:

[Figure: diagram of keypoint locations]
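
As a small illustration of this layout, the sketch below indexes body parts by name. The BODY_PARTS list follows the ordering given above; the PART_INDEX helper and the random toy_sequence are assumptions made for the example and are not part of the dataset.

```python
import numpy as np

# Body-part names in the order listed above (index 0 = nose, ..., 11 = tail tip).
BODY_PARTS = ["nose", "left ear", "right ear", "neck",
              "left forepaw", "right forepaw", "center back",
              "left hindpaw", "right hindpaw",
              "tail base", "tail middle", "tail tip"]
PART_INDEX = {name: i for i, name in enumerate(BODY_PARTS)}

# Synthetic stand-in for one sequence: (frames, mice, body parts, xy) in pixels.
rng = np.random.default_rng(0)
toy_sequence = rng.uniform(0, 850, size=(1800, 3, 12, 2)).astype(np.float32)

# Nose trajectory of the first mouse: shape (frames, 2).
nose_track = toy_sequence[:, 0, PART_INDEX["nose"], :]

# Per-frame distance from mouse 0's nose to mouse 1's tail base.
tail_base_track = toy_sequence[:, 1, PART_INDEX["tail base"], :]
nose_to_tail = np.linalg.norm(nose_track - tail_base_track, axis=-1)
print(nose_track.shape, nose_to_tail.shape)
```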

Helper function for visualization 💁

Useful functions for interacting with the mouse tracking sequences

Don't forget to run the cell 😉

In [ ]:
import matplotlib.pyplot as plt
from matplotlib import animation
from matplotlib import colors
from matplotlib import rc

rc('animation', html='jshtml')

# Note: Image processing may be slow if too many frames are animated.

# Plotting constants
FRAME_WIDTH_TOP = 850   # original image width in pixels
FRAME_HEIGHT_TOP = 850  # original image height in pixels

M1_COLOR = 'lawngreen'
M2_COLOR = 'skyblue'
M3_COLOR = 'tomato'

PLOT_MOUSE_START_END = [(0, 1), (1, 3), (3, 2), (2, 0),        # head
                        (3, 6), (6, 9),                        # midline
                        (9, 10), (10, 11),                     # tail
                        (4, 5), (5, 8), (8, 9), (9, 7), (7, 4) # legs
                       ]

class_to_number = {s: i for i, s in enumerate(user_train['vocabulary'])}
number_to_class = {i: s for i, s in enumerate(user_train['vocabulary'])}

# Optional mapping from behavior name to label color. It is only used when
# annotations are passed to animate_pose_sequence and is not defined elsewhere
# in this notebook, so labels fall back to white unless you fill it in.
class_to_color = {}

def num_to_text(anno_list):
    return np.vectorize(number_to_class.get)(anno_list)

def set_figax():
    # Black background with the same size as the original video frame
    fig = plt.figure(figsize=(8, 8))
    img = np.zeros((FRAME_HEIGHT_TOP, FRAME_WIDTH_TOP, 3))
    ax = fig.add_subplot(111)
    ax.imshow(img)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    return fig, ax

def plot_mouse(ax, pose, color):
    # Draw each keypoint
    for j in range(10):
        ax.plot(pose[j, 0], pose[j, 1], 'o', color=color, markersize=3)
    # Draw a line for each point pair to form the shape of the mouse
    for pair in PLOT_MOUSE_START_END:
        line_to_plot = pose[pair, :]
        ax.plot(line_to_plot[:, 0], line_to_plot[:, 1], color=color, linewidth=1)

def animate_pose_sequence(video_name, seq, start_frame = 0, stop_frame = 100, skip = 0,
                          annotation_sequence = None):
    # Returns the animation of the keypoint sequence between start frame
    # and stop frame. Optionally can display annotations.
    image_list = []
    counter = 0
    if skip:
        anim_range = range(start_frame, stop_frame, skip)
    else:
        anim_range = range(start_frame, stop_frame)

    for j in anim_range:
        if counter%20 == 0:
            print("Processing frame ", j)
        fig, ax = set_figax()
        plot_mouse(ax, seq[j, 0, :, :], color=M1_COLOR)
        plot_mouse(ax, seq[j, 1, :, :], color=M2_COLOR)
        plot_mouse(ax, seq[j, 2, :, :], color=M3_COLOR)

        if annotation_sequence is not None:
            annot = annotation_sequence[j]
            annot = number_to_class[annot]
            plt.text(50, -20, annot, fontsize = 16,
                     bbox=dict(facecolor=class_to_color.get(annot, 'white'), alpha=0.5))

        ax.set_title(
            video_name + '\n frame {:03d}.png'.format(j))
        ax.axis('off')
        fig.tight_layout(pad=0)
        ax.margins(0)

        # Render the figure and keep the frame as an RGB array. buffer_rgba is
        # used because tostring_rgb is deprecated in recent Matplotlib releases.
        fig.canvas.draw()
        image_from_plot = np.asarray(fig.canvas.buffer_rgba())[:, :, :3].copy()
        image_list.append(image_from_plot)

        plt.close(fig)
        counter = counter + 1

    # Plot animation.
    fig = plt.figure(figsize=(8,8))
    plt.axis('off')
    im = plt.imshow(image_list[0])

    def animate(k):
        im.set_array(image_list[k])
        return im,
    ani = animation.FuncAnimation(fig, animate, frames=len(image_list), blit=True)
    return ani

Visualize the mouse movements🎥

Sample visualization for plotting pose gifs.

In [ ]:
# Note: fill_holes is defined in the "Extract PCA per frame" section below.
# Run that cell first, otherwise this cell raises a NameError.
sequence_names = list(user_train['sequences'].keys())
sequence_key = sequence_names[0]
single_sequence = user_train["sequences"][sequence_key]

keypoint_sequence = single_sequence['keypoints']
filled_sequence = fill_holes(keypoint_sequence)
masked_data = np.ma.masked_where(keypoint_sequence==0, keypoint_sequence)

annotation_sequence = None  # single_sequence['annotations']

ani = animate_pose_sequence(sequence_key,
                            filled_sequence,
                            start_frame = 0,
                            stop_frame = 1800,
                            skip = 10,
                            annotation_sequence = annotation_sequence)

# Display the animation on colab
ani

Simple Embedding : Framewise PCA

Each frame contains tracking data for multiple mice. In this simple submission, we'll run principal component analysis (PCA) on the keypoints of every frame and use the resulting PCA embeddings as our submission.

Seeding helper

It's good practice to seed before every run so that results are easy to reproduce.

In [ ]:
def seed_everything(seed):
  np.random.seed(seed)
  os.environ['PYTHONHASHSEED'] = str(seed)

seed_everything(42)  # any fixed integer works


Extract PCA per frame

First, we'll make a helper function to interpolate missing keypoint locations (identified as entries where the keypoint location is 0.)

In [ ]:
import copy

def fill_holes(data):
    clean_data = copy.deepcopy(data)
    # First frame: fill each missing keypoint with its first non-zero value
    for m in range(3):
        holes = np.where(clean_data[0,m,:,0]==0)
        if not holes:
            continue
        for h in holes[0]:
            sub = np.where(clean_data[:,m,h,0]!=0)
            if(sub and sub[0].size > 0):
                clean_data[0,m,h,:] = clean_data[sub[0][0],m,h,:]
            else:
                # keypoint is never tracked in this sequence
                return np.empty((0))
    # Remaining frames: carry the previous frame's value forward
    for fr in range(1,np.shape(clean_data)[0]):
        for m in range(3):
            holes = np.where(clean_data[fr,m,:,0]==0)
            if not holes:
                continue
            for h in holes[0]:
                clean_data[fr,m,h,:] = clean_data[fr-1,m,h,:]
    return clean_data
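
To see what this kind of filling does on a small input, here is a sketch of the same forward-fill idea for a single keypoint track of shape (frames, 2). The function name forward_fill_track and the toy values are hypothetical; this is a vectorized illustration, not a replacement for fill_holes.

```python
import numpy as np

def forward_fill_track(track):
    """Forward-fill missing (x == 0) samples of one keypoint track of shape (T, 2)."""
    track = np.asarray(track, dtype=np.float32)
    valid = track[:, 0] != 0
    if not valid.any():
        return track.copy()  # nothing to fill from
    # For every frame, the index of the most recent valid frame.
    idx = np.where(valid, np.arange(len(track)), 0)
    idx = np.maximum.accumulate(idx)
    # Leading gaps have no earlier value, so use the first valid frame instead.
    first_valid = np.argmax(valid)
    idx[:first_valid] = first_valid
    return track[idx]

toy_track = np.array([[0, 0], [5, 6], [0, 0], [0, 0], [7, 8]], dtype=np.float32)
toy_filled = forward_fill_track(toy_track)
print(toy_filled)
```

The leading gap is filled from the first tracked frame, and the two later gaps repeat the last tracked position until the keypoint reappears.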

Next we'll stack up all of the training sequences to create the data we'll use to fit our principal axes.

In [ ]:
# generate the training data for PCA by stacking the entries of user_train
sequence_keys = list(user_train['sequences'].keys())
num_total_frames = np.sum([seq["keypoints"].shape[0] for _, seq in user_train['sequences'].items()])
sequence_dim = np.shape(user_train['sequences'][sequence_keys[0]]['keypoints'])
keypoints_dim = sequence_dim[1]*sequence_dim[2]*sequence_dim[3]

pca_train = np.empty((num_total_frames, keypoints_dim, 3), dtype=np.float32)
start = 0
for k in sequence_keys:
  keypoints = fill_holes(user_train['sequences'][k]["keypoints"])
  if keypoints.size == 0:  # sometimes a mouse is missing the entire time
    continue

  end = start + len(keypoints)
  for center_mouse in range(3):   # we're going to do PCA three times, each time centered on one mouse (rotating to mouse-eye-view and centering might be better...)
    ctr = np.median(keypoints[:,center_mouse,:,:],axis=1)
    ctr = np.repeat(np.expand_dims(ctr,axis=1),3,axis=1)
    ctr = np.repeat(np.expand_dims(ctr,axis=2), 12, axis=2)
    keypoints_centered = keypoints - ctr
    keypoints_centered = keypoints_centered.reshape(keypoints_centered.shape[0], -1)

    pca_train[start:end,:, center_mouse] = keypoints_centered
  start = end

# drop the unfilled rows left over from any skipped sequences
pca_train = pca_train[:start]
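
The centering step can also be written with NumPy broadcasting instead of two np.repeat calls. The sketch below checks on synthetic data that both forms give the same result; toy_keypoints is a random stand-in for one sequence, not real tracking data.

```python
import numpy as np

rng = np.random.default_rng(0)
toy_keypoints = rng.uniform(0, 850, size=(50, 3, 12, 2)).astype(np.float32)
toy_center_mouse = 1

# Per-frame median position of the chosen mouse: shape (50, 2).
toy_ctr = np.median(toy_keypoints[:, toy_center_mouse, :, :], axis=1)

# Approach used in the cell above: tile the median to the full keypoint shape.
ctr_tiled = np.repeat(np.expand_dims(toy_ctr, axis=1), 3, axis=1)     # (50, 3, 2)
ctr_tiled = np.repeat(np.expand_dims(ctr_tiled, axis=2), 12, axis=2)  # (50, 3, 12, 2)
centered_repeat = toy_keypoints - ctr_tiled

# Broadcasting: add two singleton axes and let NumPy expand them.
centered_broadcast = toy_keypoints - toy_ctr[:, None, None, :]

# Flatten each frame to one feature vector of 3 * 12 * 2 = 72 values.
toy_flat = centered_broadcast.reshape(len(toy_keypoints), -1)
print(np.allclose(centered_repeat, centered_broadcast), toy_flat.shape)
```

Broadcasting avoids materializing the tiled array, which matters little here but keeps the intent easier to read.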

Now we'll fit a scaler transform to each mouse-centered dataset and compute the principal axes.

In [ ]:
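embed_size = 20
scaler_store = []
pca_store = []
for m in range(3):
  pca = PCA(n_components = embed_size)
  scaler = StandardScaler(with_std=False)
  # remove the per-feature mean, then fit the principal axes on the centered data
  scaler_store.append(scaler.fit(pca_train[:,:,m]))
  pca_store.append(pca.fit(scaler_store[m].transform(pca_train[:,:,m])))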
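
One way to judge whether the chosen number of components is reasonable is to look at the cumulative explained variance. The sketch below does this on synthetic data with a known low-dimensional structure; toy_frames and its 5 latent factors are assumptions for the example, and the numbers say nothing about the real mouse data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in: 500 frames of 72 features driven by 5 latent factors plus noise.
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 72))
toy_frames = (latent @ mixing + 0.01 * rng.normal(size=(500, 72))).astype(np.float32)

# Same two steps as above: mean-center, then fit 20 principal axes.
toy_scaler = StandardScaler(with_std=False).fit(toy_frames)
toy_pca = PCA(n_components=20).fit(toy_scaler.transform(toy_frames))

# Fraction of variance captured by the first k components, for k = 1..20.
cumulative = np.cumsum(toy_pca.explained_variance_ratio_)
toy_embedding = toy_pca.transform(toy_scaler.transform(toy_frames))
print(toy_embedding.shape)
print(cumulative[:5])
```

On the real data, the same check can be run on each entry of pca_store to see how much variance the 20 components retain.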

Finally, now that we've found our principal axes for each transform of the data (centering poses on each mouse), let's project all of our submission trajectories onto those axes.

In [ ]:
num_total_frames = np.sum([seq["keypoints"].shape[0] for _, seq in submission_clips['sequences'].items()])
embeddings_array = np.empty((num_total_frames, embed_size*3), dtype=np.float32)

frame_number_map = {}
start = 0
for sequence_key in submission_clips['sequences']:
  keypoints = fill_holes(submission_clips['sequences'][sequence_key]["keypoints"])
  if keypoints.size == 0:
    keypoints = submission_clips['sequences'][sequence_key]["keypoints"]
  embeddings = np.empty((len(keypoints),embed_size*3), dtype=np.float32)

  for center_mouse in range(3):   # now apply our three PCA transformations to the test data
    ctr = np.median(keypoints[:,center_mouse,:,:],axis=1)
    ctr = np.repeat(np.expand_dims(ctr,axis=1),3,axis=1)
    ctr = np.repeat(np.expand_dims(ctr,axis=2), 12, axis=2)
    keypoints_centered = keypoints - ctr
    keypoints_centered = keypoints_centered.reshape(keypoints_centered.shape[0], -1)

    x = scaler_store[center_mouse].transform(keypoints_centered)
    embeddings[:,(center_mouse*embed_size):((center_mouse+1)*embed_size)] = pca_store[center_mouse].transform(x)

  end = start + len(keypoints)
  embeddings_array[start:end] = embeddings
  frame_number_map[sequence_key] = (start, end)
  start = end
assert end == num_total_frames
submission_dict = {"frame_number_map": frame_number_map, "embeddings": embeddings_array}
In [ ]:
# Input and Embeddings shape
print("Input shape:", submission_clips['sequences'][sequence_key]["keypoints"].shape)
print("Embedding shape:", embeddings.shape)

Validate the submission ✅

The submission should follow these constraints:

  1. It should be a dictionary with the keys frame_number_map and embeddings.
  2. frame_number_map should have the same keys as submission_data.
  3. embeddings should be a 2D numpy array of dtype float32.
  4. The embedding size shouldn't exceed 128.
  5. The frame number map should match the clip lengths.
  6. The embeddings should contain no NaN or infinite values.

You can use the helper function below to check these

In [ ]:
def validate_submission(submission, submission_clips):
    if not isinstance(submission, dict):
      print("Submission should be dict")
      return False

    if 'frame_number_map' not in submission:
      print("Frame number map missing")
      return False

    if 'embeddings' not in submission:
        print('Embeddings array missing')
        return False
    elif not isinstance(submission['embeddings'], np.ndarray):
        print("Embeddings should be a numpy array")
        return False
    elif not len(submission['embeddings'].shape) == 2:
        print("Embeddings should be 2D array")
        return False
    elif not submission['embeddings'].shape[1] <= 128:
        print("Embeddings too large, max allowed is 128")
        return False
    elif not isinstance(submission['embeddings'][0, 0], np.float32):
        print(f"Embeddings are not float32")
        return False

    total_clip_length = 0
    for key in submission_clips['sequences']:
        start, end = submission['frame_number_map'][key]
        clip_length = submission_clips['sequences'][key]['keypoints'].shape[0]
        total_clip_length += clip_length
        if not end-start == clip_length:
            print(f"Frame number map for clip {key} doesn't match clip length")
            return False
    if not len(submission['embeddings']) == total_clip_length:
        print("Embeddings length doesn't match submission clips total length")
        return False

    if not np.isfinite(submission['embeddings']).all():
        print("Embeddings contain NaN or infinity")
        return False

    print("All checks passed")
    return True
In [ ]:
validate_submission(submission_dict, submission_clips)

Save the prediction as npy 📨

In [ ]:
np.save("submission.npy", submission_dict)
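
Because the submission is a Python dictionary, np.save stores it as a pickled object, and reading it back needs allow_pickle=True followed by .item(), just like the dataset files loaded earlier. The sketch below round-trips a toy dictionary through a temporary file; the file name and contents are placeholders.

```python
import os
import tempfile

import numpy as np

toy_saved = {
    "frame_number_map": {"seq_a": (0, 4)},
    "embeddings": np.zeros((4, 6), dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, "toy_submission.npy")
    np.save(toy_path, toy_saved)  # the dict is wrapped in a 0-d object array
    toy_reloaded = np.load(toy_path, allow_pickle=True).item()

print(toy_reloaded["embeddings"].dtype, toy_reloaded["frame_number_map"])
```

Reloading submission.npy the same way and passing the result to validate_submission is a cheap final check before submitting.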

Submit to AIcrowd 🚀

In [ ]:
%aicrowd submission create --description "PCA-v2" -c {aicrowd_challenge_name} -f submission.npy


8 months ago

NameError: name ‘fill_holes’ is not defined

Where can I find the function?

8 months ago

@victorkras it’s a few cells later - either move it up or run it before going back to the cell that gives the error. @annkenedy thank you for sharing - great to have a template for loading the data and making submissions :)
