
MABe Task 3: Learning New Behavior

[Task 3] Learning New Behavior [Getting Started Code]

Get started with the Learning New Behaviour task 💪

ashivani


🐀🐀🐀🐀🐀🐀🐀🐀🐀🐀🐀🐁🐁🐁🐁🐁🐁🐁🐁🐁🐁
🐀 MABe Few Shot Learning - Learning New Behaviors: Starter kit 🐁
🐀🐀🐀🐀🐀🐀🐀🐀🐀🐀🐀🐁🐁🐁🐁🐁🐁🐁🐁🐁🐁

How to use this notebook 📝

  1. Copy the notebook. This is a shared template and any edits you make here will not be saved. You should copy it into your own drive folder. For this, click the "File" menu (top-left), then "Save a Copy in Drive". You can edit your copy however you like.
  2. Link it to your AIcrowd account. In order to submit your predictions to AIcrowd, you need to provide your account's API key.

Setup AIcrowd Utilities 🛠

In [ ]:
!pip install -U aicrowd-cli==0.1 > /dev/null

Install packages 🗃

Please add all package installations in this section.

In [ ]:
!pip install pandas

Import necessary modules and packages 📚

In [ ]:
import numpy as np
import os
import pandas as pd

Download the dataset 📲

Please get your API key from https://www.aicrowd.com/participants/me

In [ ]:
API_KEY = ""
!aicrowd login --api-key $API_KEY
API Key valid
Saved API Key successfully!
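
If you would rather not paste the key into the notebook, one option is to read it from an environment variable. This is only a sketch: the variable name AICROWD_API_KEY and the helper read_api_key are assumptions made for this example, not something the AIcrowd CLI defines.

```python
import os

# AICROWD_API_KEY is a hypothetical variable name chosen for this sketch.
def read_api_key(env_var="AICROWD_API_KEY"):
    """Return the key stored in the environment, or an empty string."""
    return os.environ.get(env_var, "").strip()

API_KEY = read_api_key()
if not API_KEY:
    print("No API key found; set the environment variable or paste the key manually.")
```

The resulting API_KEY can then be passed to the login command exactly as in the cell above.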
In [ ]:
!aicrowd dataset download --challenge mabe-task-3-learning-new-behavior

Move the downloaded files into the data directory

In [ ]:
!rm -rf data
!mkdir data
 
!mv train.npy data/train.npy
!mv test-release.npy data/test.npy
!mv sample-submission.npy data/sample_submission.npy
mv: cannot stat 'sample-submission.npy': No such file or directory
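
The mv error above means one of the expected files was not in the working directory. A small helper that skips missing files and reports them can make this easier to spot. This is a sketch: move_if_present is a hypothetical helper, and the file names are simply the ones used in the cell above.

```python
from pathlib import Path
import shutil

def move_if_present(src, dst):
    """Move src to dst when src exists; return True if the file was moved."""
    src, dst = Path(src), Path(dst)
    if not src.exists():
        print(f"Skipping {src}: not found in {Path.cwd()}")
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return True

# Same source and target names as the shell commands above.
for name, target in [("train.npy", "data/train.npy"),
                     ("test-release.npy", "data/test.npy"),
                     ("sample-submission.npy", "data/sample_submission.npy")]:
    move_if_present(name, target)
```

If a file is reported as missing, list the working directory to see which name the download actually used.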

Load Data

The dataset files are Python dictionaries. This section describes how the data is organized.

In [ ]:
train = np.load('data/train.npy',allow_pickle=True).item()
test = np.load('data/test.npy',allow_pickle=True).item()
sample_submission = np.load('data/sample_submission.npy',allow_pickle=True).item()

Dataset Specifications 💾

  • train.npy - Training set for the task, which follows the following schema:

           

  • test-release.npy - Test set for the task, which follows the following schema :

           

  • sample_submission.npy - Template for a sample submission, which follows the following schema:
{
    "behavior-0" : {
        "<sequence_id-1>" : [0, 0, 1, 0, ...],
        "<sequence_id-2>" : [0, 1, 1, 0, ...]
    },
    ...
    "behavior-6" : { ... }
}

The top-level keys are the behavior names. Within each behavior, every key is the unique sequence id of a sequence in the test set, and its value is expected to hold a list of the corresponding per-frame annotations. Each annotation is the index of the corresponding annotation word in that behavior's vocabulary.
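
To inspect how a loaded file is organized, a small recursive summary can help. The sketch below is not part of the starter kit: describe is a hypothetical helper, and toy_train is a synthetic stand-in that only mimics the keys and array shapes printed by the cells in this notebook.

```python
import numpy as np

def describe(obj, indent=0, max_items=2):
    """Return lines summarizing a nested dict: keys, array shapes, list lengths."""
    pad = "  " * indent
    lines = []
    if isinstance(obj, dict):
        for i, (key, value) in enumerate(obj.items()):
            if i == max_items:
                lines.append(f"{pad}... ({len(obj) - max_items} more keys)")
                break
            lines.append(f"{pad}{key!r}:")
            lines.extend(describe(value, indent + 1, max_items))
    elif isinstance(obj, np.ndarray):
        lines.append(f"{pad}ndarray shape={obj.shape} dtype={obj.dtype}")
    elif isinstance(obj, (list, tuple)):
        lines.append(f"{pad}{type(obj).__name__} of length {len(obj)}")
    else:
        lines.append(f"{pad}{obj!r}")
    return lines

# Synthetic stand-in, not real data.
toy_train = {
    "behavior-0": {
        "vocabulary": {"other": 0, "behavior-0": 1},
        "sequences": {
            "seq-a": {
                "keypoints": np.zeros((5, 2, 2, 7)),
                "annotator_id": 0,
                "annotations": np.zeros(5, dtype=int),
            }
        },
    }
}
print("\n".join(describe(toy_train)))
```

Calling describe(train) or describe(test) on the loaded dictionaries should give the same kind of summary for the real files.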

What does the data look like? 🔍

Task 3 has 7 sets of new behaviors, each a binary classification task. The test set is shared across all behaviors, so you need to output labels for every behavior on all sequences in the test set.

In [ ]:
print("Dataset keys - ", train.keys())
print()
for behavior in train:
  print("Vocabulary - ", train[behavior]['vocabulary'])
  print("Number of train Sequences - ", len(train[behavior]['sequences']))
  print()
print("Number of test Sequences - ", len(test['sequences']))
Dataset keys -  dict_keys(['behavior-0', 'behavior-1', 'behavior-2', 'behavior-3', 'behavior-4', 'behavior-5', 'behavior-6'])

Vocabulary -  {'other': 0, 'behavior-0': 1}
Number of train Sequences -  3

Vocabulary -  {'other': 0, 'behavior-1': 1}
Number of train Sequences -  2

Vocabulary -  {'other': 0, 'behavior-2': 1}
Number of train Sequences -  2

Vocabulary -  {'other': 0, 'behavior-3': 1}
Number of train Sequences -  1

Vocabulary -  {'other': 0, 'behavior-4': 1}
Number of train Sequences -  4

Vocabulary -  {'other': 0, 'behavior-5': 1}
Number of train Sequences -  3

Vocabulary -  {'other': 0, 'behavior-6': 1}
Number of train Sequences -  2

Number of test Sequences -  458

Submission Format

The test set has 458 sequences, so you need to make 7 * 458 = 3206 sets of predictions, one per behavior and sequence.

In [ ]:
print("Sample Submission keys - ", sample_submission.keys())
for beh in sample_submission:
  print(f"Test videos for {beh}", len(sample_submission[beh]))
Sample Submission keys -  dict_keys(['behavior-0', 'behavior-1', 'behavior-2', 'behavior-3', 'behavior-4', 'behavior-5', 'behavior-6'])
Test videos for behavior-0 458
Test videos for behavior-1 458
Test videos for behavior-2 458
Test videos for behavior-3 458
Test videos for behavior-4 458
Test videos for behavior-5 458
Test videos for behavior-6 458

Sample overview

In [ ]:
behavior_0 = train['behavior-0']
sequence_names = list(behavior_0["sequences"].keys())
sequence_key = sequence_names[0]
single_sequence = behavior_0["sequences"][sequence_key]
print("Sequence name - ", sequence_key)
print("Single Sequence keys ", single_sequence.keys())
print(f"Number of Frames in {sequence_key} - ", len(single_sequence['annotations']))
print(f"Keypoints data shape of {sequence_key} - ", single_sequence['keypoints'].shape)
print(f"annotator_id of {sequence_key} - ", single_sequence['annotator_id'])
Sequence name -  267709a544
Single Sequence keys  dict_keys(['keypoints', 'annotator_id', 'annotations'])
Number of Frames in 267709a544 -  3632
Keypoints data shape of 267709a544 -  (3632, 2, 2, 7)
annotator_id of 267709a544 -  0
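
The keypoints array has shape (frames, 2, 2, 7). Judging by the plotting helper in this notebook, the axes are (frame, mouse, x/y, keypoint). As a rough illustration of turning this array into a per-frame feature, the sketch below computes the distance between the two mice's keypoint centroids. centroid_distance is a hypothetical helper and the input is synthetic.

```python
import numpy as np

def centroid_distance(keypoints):
    """Per-frame distance between the two mice's keypoint centroids.

    keypoints: array of shape (frames, 2, 2, 7) = (frame, mouse, x/y, keypoint).
    """
    centroids = keypoints.mean(axis=3)            # (frames, 2 mice, 2 coords)
    diff = centroids[:, 0, :] - centroids[:, 1, :]
    return np.linalg.norm(diff, axis=1)           # (frames,)

# Synthetic example: mouse 0 at the origin, mouse 1 shifted by (3, 4).
toy = np.zeros((4, 2, 2, 7))
toy[:, 1, 0, :] = 3.0
toy[:, 1, 1, :] = 4.0
print(centroid_distance(toy))  # every frame has distance 5
```

On real data this would be called as centroid_distance(single_sequence['keypoints']).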

Helper function for visualization 💁

Don't forget to run the cell 😉

In [ ]:
import matplotlib.pyplot as plt
from matplotlib import animation
from matplotlib import colors
from matplotlib import rc
 
rc('animation', html='jshtml')
 
# Note: Image processing may be slow if too many frames are animated.                
 
#Plotting constants
FRAME_WIDTH_TOP = 1024
FRAME_HEIGHT_TOP = 570
 
RESIDENT_COLOR = 'lawngreen'
INTRUDER_COLOR = 'skyblue'
 
PLOT_MOUSE_START_END = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4),
                        (3, 5), (4, 6), (5, 6), (1, 2)]
 
class_to_color = {'other': 'white', 'behavior-0' : 'red'}
 
class_to_number = dict(behavior_0['vocabulary'])
 
# Invert the vocabulary rather than relying on dictionary order
number_to_class = {i: s for s, i in behavior_0['vocabulary'].items()}
 
def num_to_text(anno_list):
  return np.vectorize(number_to_class.get)(anno_list)
 
def set_figax():
    fig = plt.figure(figsize=(6, 4))
 
    img = np.zeros((FRAME_HEIGHT_TOP, FRAME_WIDTH_TOP, 3))
 
    ax = fig.add_subplot(111)
    ax.imshow(img)
 
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    return fig, ax
 
def plot_mouse(ax, pose, color):
    # Draw each keypoint
    for j in range(7):
        ax.plot(pose[j, 0], pose[j, 1], 'o', color=color, markersize=5)
 
    # Draw a line for each point pair to form the shape of the mouse
 
    for pair in PLOT_MOUSE_START_END:
        line_to_plot = pose[pair, :]
        ax.plot(line_to_plot[:, 0], line_to_plot[
                :, 1], color=color, linewidth=1)
 
def animate_pose_sequence(video_name, keypoint_sequence, start_frame = 0, stop_frame = 100, 
                          annotation_sequence = None):
    # Returns the animation of the keypoint sequence between start frame
    # and stop frame. Optionally can display annotations.
    seq = keypoint_sequence.transpose((0,1,3,2))
 
    image_list = []
    
    counter = 0
    for j in range(start_frame, stop_frame):
        if counter%20 == 0:
          print("Processing frame ", j)
        fig, ax = set_figax()
        plot_mouse(ax, seq[j, 0, :, :], color=RESIDENT_COLOR)
        plot_mouse(ax, seq[j, 1, :, :], color=INTRUDER_COLOR)
        
        if annotation_sequence is not None:
          annot = annotation_sequence[j]
          annot = number_to_class[annot]
          plt.text(50, -20, annot, fontsize = 16, 
                   bbox=dict(facecolor=class_to_color[annot], alpha=0.5))
 
        ax.set_title(
            video_name + '\n frame {:03d}.png'.format(j))
 
        ax.axis('off')
        fig.tight_layout(pad=0)
        ax.margins(0)
 
        fig.canvas.draw()
        # buffer_rgba() is used because tostring_rgb() is deprecated in newer Matplotlib
        image_from_plot = np.asarray(fig.canvas.buffer_rgba())[..., :3].copy()
 
        image_list.append(image_from_plot)
 
        plt.close()
        counter = counter + 1
 
    # Plot animation.
    fig = plt.figure()
    plt.axis('off')
    im = plt.imshow(image_list[0])
 
    def animate(k):
        im.set_array(image_list[k])
        return im,
    ani = animation.FuncAnimation(fig, animate, frames=len(image_list), blit=True)
    return ani
 
def plot_annotation_strip(annotation_sequence, start_frame = 0, stop_frame = 100, title="Behavior Labels"):
  # Plot annotations as an annotation strip.
 
  # Map annotations to a number.
  annotation_num = []
  for item in annotation_sequence[start_frame:stop_frame]:
    annotation_num.append(class_to_number[item])
 
  all_classes = list(set(annotation_sequence[start_frame:stop_frame]))
 
  cmap = colors.ListedColormap(['white', 'red'])
  bounds=[-0.5,0.5,1.5]
  norm = colors.BoundaryNorm(bounds, cmap.N)
 
  height = 200
  arr_to_plot = np.repeat(np.array(annotation_num)[:,np.newaxis].transpose(),
                                                  height, axis = 0)
  
  fig, ax = plt.subplots(figsize = (16, 3))
  ax.imshow(arr_to_plot, interpolation = 'none',cmap=cmap, norm=norm)
 
  ax.set_yticks([])
  ax.set_xlabel('Frame Number')
  plt.title(title)
 
  import matplotlib.patches as mpatches
 
  legend_patches = []
  for item in all_classes:
    legend_patches.append(mpatches.Patch(color=class_to_color[item], label=item))
 
  plt.legend(handles=legend_patches,loc='center left', bbox_to_anchor=(1, 0.5))
 
  plt.tight_layout()

Visualize the mouse movements🎥

Sample visualization for plotting pose gifs.

In [ ]:
keypoint_sequence = single_sequence['keypoints']
annotation_sequence = single_sequence['annotations']

ani = animate_pose_sequence(sequence_key,
                            keypoint_sequence, 
                            start_frame = 3000,
                            stop_frame = 3100,
                            annotation_sequence = annotation_sequence)

# Display the animation on Colab
ani
Processing frame  3000
Processing frame  3020
Processing frame  3040
Processing frame  3060
Processing frame  3080
Out[ ]:

Show the annotation strip for the selected training sequence (set start_frame and stop_frame to cover the whole video)

In [ ]:
annotation_sequence = single_sequence['annotations']
text_sequence = num_to_text(annotation_sequence)
 
plot_annotation_strip(
    text_sequence,
    start_frame=0,
    stop_frame=len(annotation_sequence)
)
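
Besides plotting, it can be useful to list where each bout of a behavior starts and ends. The sketch below does this with a run-length style pass over the frame labels. annotation_bouts is a hypothetical helper, and the input is a toy list rather than real annotations.

```python
import numpy as np

def annotation_bouts(annotations, label=1):
    """Return (start, stop) frame pairs, stop exclusive, where annotations == label."""
    mask = (np.asarray(annotations) == label).astype(int)
    # Pad with zeros so bouts touching either end still produce an edge.
    edges = np.diff(np.concatenate(([0], mask, [0])))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), stops.tolist()))

print(annotation_bouts([0, 1, 1, 0, 0, 1]))  # [(1, 3), (5, 6)]
```

Applied to single_sequence['annotations'], the pairs can be used as start_frame and stop_frame values for the animation helper.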

Basic EDA 🤓

There are 7 new behaviors in Task 3, and each occurs at a different frequency per video.

Each sequence contains a different amount of each behavior. Here we compute the fraction of frames labeled with each class in every sequence, which can be used to split the dataset for validation in a stratified way.

In [ ]:
# Function for showing dataframes nicely on jupyter
from IPython.display import display, HTML
def pretty_print_dataframe(df):
  display(HTML(df.to_html()))
In [ ]:
def get_behavior_percentage_frames(behavior_ds): 
  vocabulary = behavior_ds['vocabulary']
  def get_percentage(sequence_key):
    anno_seq = behavior_ds['sequences'][sequence_key]['annotations']
    counts = {k: np.mean(np.array(anno_seq) == v) for k,v in vocabulary.items()}
    return counts

  anno_percentages = {k: get_percentage(k) for k in behavior_ds['sequences']}
  anno_perc_df = pd.DataFrame(anno_percentages).T
  return anno_perc_df

print("Percentage of frames in every sequence for every class")
for behavior in train:
  pretty_print_dataframe( get_behavior_percentage_frames(train[behavior]) )
Percentage of frames in every sequence for every class
            other     behavior-0
267709a544  0.962830  0.037170
ec99eb3873  0.955172  0.044828
f13cd749f5  0.972319  0.027681

            other     behavior-1
970e98dd6f  0.992356  0.007644
0541d73ea3  0.979073  0.020927

            other     behavior-2
fe309dcbb6  0.911718  0.088282
1ee5f8180a  0.802612  0.197388

            behavior-3  other
82570f095f  0.192552    0.807448

            other     behavior-4
6cfe21b22a  0.985842  0.014158
9270107ff4  0.994531  0.005469
1afb9c1657  0.988082  0.011918
1a2b77b3ec  0.992410  0.007590

            other     behavior-5
47c048f5c9  0.968244  0.031756
a21559612c  0.946409  0.053591
3d5e84d8dd  0.949045  0.050955

            other     behavior-6
af34820e44  0.915761  0.084239
713183649d  0.898037  0.101963
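
One simple way to use these fractions for a validation split is to sort the sequences of a behavior by positive fraction and hold out every second one, so both sides cover a similar range. This is only a sketch under that assumption: split_by_positive_fraction is a hypothetical helper, and with so few sequences per behavior any split will be noisy.

```python
def split_by_positive_fraction(fractions, every=2):
    """Split sequence ids into (train_ids, val_ids).

    fractions: dict mapping sequence id -> fraction of positive frames.
    Sequences are sorted by fraction and every `every`-th one goes to validation.
    """
    ordered = sorted(fractions, key=fractions.get)
    val_ids = ordered[every - 1::every]
    train_ids = [s for s in ordered if s not in val_ids]
    return train_ids, val_ids

# Positive fractions for behavior-4, copied from the output above.
behavior_4 = {"6cfe21b22a": 0.014158, "9270107ff4": 0.005469,
              "1afb9c1657": 0.011918, "1a2b77b3ec": 0.007590}
train_ids, val_ids = split_by_positive_fraction(behavior_4)
print("train:", train_ids)
print("val:  ", val_ids)
```

A behavior with a single training sequence, such as behavior-3, cannot be split this way; splitting that sequence by frame ranges would be one alternative.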
In [ ]:
percentages = []
for behavior in sorted(list(train.keys())):
  beh_sequences = train[behavior]
  all_annotations = []
  for sk in beh_sequences['sequences']:
    anno = beh_sequences['sequences'][sk]['annotations']
    all_annotations.extend(list(anno))
  percentages.append(np.mean(np.array(all_annotations) == 1))
pd.DataFrame({"Behavior": sorted(list(train.keys())),
              "Percentage Frames": percentages})
Out[ ]:
   Behavior    Percentage Frames
0  behavior-0  0.032486
1  behavior-1  0.014657
2  behavior-2  0.139881
3  behavior-3  0.192552
4  behavior-4  0.009029
5  behavior-5  0.047443
6  behavior-6  0.092949
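
Every behavior is rare, so a classifier trained on raw frames will mostly see the other class. One common remedy is to weight positive frames by (1 - p) / p, where p is the positive fraction. The sketch below computes that weight from the numbers above. The weighting scheme is an assumption for illustration, not something the challenge prescribes, and positive_class_weight is a hypothetical helper.

```python
# Fraction of positive frames per behavior, copied from the output above.
positive_fraction = {
    "behavior-0": 0.032486, "behavior-1": 0.014657, "behavior-2": 0.139881,
    "behavior-3": 0.192552, "behavior-4": 0.009029, "behavior-5": 0.047443,
    "behavior-6": 0.092949,
}

def positive_class_weight(p):
    """Weight for the positive class so both classes contribute equally: (1 - p) / p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return (1.0 - p) / p

for name, p in positive_fraction.items():
    print(f"{name}: positive weight ~ {positive_class_weight(p):.1f}")
```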

Generate predictions 💪

In [ ]:
# Generating Random Predictions
submission = {}
test = np.load('data/test.npy',allow_pickle=True).item()
for behavior in sample_submission:
  submission[behavior] = {}
  for sequence_id, sequence in test["sequences"].items():
    keypoint_sequence = sequence['keypoints']
    submission[behavior][sequence_id] = np.random.randint(2, size=len(sequence['keypoints']))
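
The cell above predicts 0 and 1 with equal probability, although the behaviors cover only a small share of frames in the training data. A slightly different random baseline draws labels at the observed positive rate instead. This is a sketch, still far from a real model: sample_with_prior is a hypothetical helper and 0.05 is a placeholder rate.

```python
import numpy as np

def sample_with_prior(n_frames, positive_fraction, rng):
    """Draw n_frames binary labels where 1 appears with probability positive_fraction."""
    return (rng.random(n_frames) < positive_fraction).astype(np.int64)

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
labels = sample_with_prior(10_000, 0.05, rng)
print(labels.mean())  # close to 0.05
```

In the loop above, np.random.randint(2, size=...) could be swapped for sample_with_prior(len(sequence['keypoints']), p, rng), with p taken from the training fractions of the current behavior.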

Validate the submission ✅

The submission should follow these constraints:

  1. It should be a dictionary.
  2. It should have the same keys as sample_submission.
  3. It should contain a dictionary for every behavior.
  4. Each array should have the same length as the corresponding array in sample_submission.
  5. All values should be integers.

You can use the helper function below to check these constraints.

In [ ]:
def validate_submission(submission, sample_submission):
    if not isinstance(submission, dict):
        print("Submission should be dict")
        return False
    
    if not submission.keys() == sample_submission.keys():
        print("Submission keys don't match")
        return False
    for behavior in submission:
      sb = submission[behavior]
      ssb = sample_submission[behavior]
      if not isinstance(sb, dict):
        print("Submission should be dict")
        return False

      if not sb.keys() == ssb.keys():
        print("Submission keys don't match")
        return False
      
      for key in sb:
        sv = sb[key]
        ssv = ssb[key]
        if not len(sv) == len(ssv):
          print(f"Submission length of {key} doesn't match")
          return False
      
      for key, sv in sb.items():
        if not all(isinstance(x, (np.int32, np.int64, int)) for x in list(sv)):
          print(f"Submission of {key} is not all integers")
          return False
    
    print("All tests passed")
    return True
In [ ]:
validate_submission(submission, sample_submission)
All tests passed
Out[ ]:
True
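
Models usually output per-frame scores rather than integers, and float arrays fail the integer check in the validator. A minimal sketch of the conversion, assuming scores in [0, 1] and a 0.5 threshold (both are assumptions, and to_label_array is a hypothetical helper):

```python
import numpy as np

def to_label_array(scores, threshold=0.5):
    """Convert per-frame scores in [0, 1] to an int64 array of 0/1 labels."""
    return (np.asarray(scores, dtype=float) >= threshold).astype(np.int64)

labels = to_label_array([0.1, 0.7, 0.5])
print(labels, labels.dtype)  # [0 1 1] int64
```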

Save the prediction as npy 📨

In [ ]:
np.save("submission.npy", submission)
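
np.save stores a dictionary as a pickled zero-dimensional object array, which is why the files in this notebook are loaded with allow_pickle=True followed by .item(). The sketch below checks the round trip on a toy dictionary written to a temporary directory. toy_submission is made-up data.

```python
import os
import tempfile
import numpy as np

toy_submission = {"behavior-0": {"seq-a": np.array([0, 1, 0])}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_submission.npy")
    np.save(path, toy_submission)                    # pickled 0-d object array
    loaded = np.load(path, allow_pickle=True).item() # back to a plain dict

print(loaded.keys())
```

Running the same load on submission.npy before uploading is a cheap way to confirm the file contains what you expect.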

Submit to AIcrowd 🚀

In [ ]:
!aicrowd submission create -c mabe-task-3-learning-new-behavior -f submission.npy
submission.npy ━━━━━━━━━━━━━━━━━━━━ 100.0% 457.6/457.6 MB 1.4 MB/s 0:00:00
                                                                    ╭─────────────────────────╮                                                                    
                                                                    │ Successfully submitted! │                                                                    
                                                                    ╰─────────────────────────╯                                                                    
                                                                          Important links                                                                          
┌──────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│  This submission │ https://www.aicrowd.com/challenges/multi-agent-behavior-representation-modeling-measurement-and-applications/submissions/125494              │
│                  │                                                                                                                                              │
│  All submissions │ https://www.aicrowd.com/challenges/multi-agent-behavior-representation-modeling-measurement-and-applications/submissions?my_submissions=true │
│                  │                                                                                                                                              │
│      Leaderboard │ https://www.aicrowd.com/challenges/multi-agent-behavior-representation-modeling-measurement-and-applications/leaderboards                    │
│                  │                                                                                                                                              │
│ Discussion forum │ https://discourse.aicrowd.com/c/multi-agent-behavior-representation-modeling-measurement-and-applications                                    │
│                  │                                                                                                                                              │
│   Challenge page │ https://www.aicrowd.com/challenges/multi-agent-behavior-representation-modeling-measurement-and-applications                                 │
└──────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
{'submission_id': 125494, 'created_at': '2021-03-07T21:00:53.434Z'}
