Challenge Unboxed📦: AI Blitz⚡X

By aryankargwal

👋Welcome back, readers, to Challenge Unboxed!

Challenge Unboxed is a series where we explore and dissect the methods and heuristics used in the winning solutions of a challenge hosted on our platform. This series aims to introduce you to new tools and approaches that will help you broaden your horizons in Machine Learning.

In this installment of Challenge Unboxed, we explore the unique methods employed by the winners of AI Blitz⚡X.

👀About the Challenge

We humans perceive up to 80 percent of all impressions through our sight. Since its early days, Machine Learning has tried to capture the human power of sight. At AIcrowd, we too have focused on Computer Vision problems for several of our AI Blitz puzzles. For Blitz X, we decided to go bigger!

Consisting of 5 wacky Computer Vision puzzles, Blitz X is our very own take on situations where Computer Vision has real-life applications. The challenge takes participants on a journey that gets harder with every puzzle.

Let us take a quick look at the puzzles:

  1. Docking ISS
  2. Tree Segmentation
  3. Starship Detection
  4. Iceberg Detection
  5. Cloud Removal

🏆Winning Heuristics

This challenge saw some fresh approaches to common tasks such as Image Segmentation and Object Detection. In this blog, we break down the solutions and notebooks of ksnxr, Eric Parsiot, and jinoooooooooo, who were among the leaderboard and community contribution winners.

🗻Xception-Style U-net Architecture for Iceberg Detection

In April of 1912, the RMS Titanic, a reportedly unsinkable behemoth of a ship, sank to the bottom of the ocean after striking an iceberg! As the planet heats, such events will continue to be a concern for modern container ships. However, with the introduction of DL-based Image Segmentation, Iceberg Detection is a task that an automated system can take on.

Working on top of the starter kit for the puzzle, which featured an implementation of a ResNet-18 based U-Net architecture, Eric submitted his own take on the implementation. He decided to go for a pre-trained network provided by the Keras library.

The core premise of a CNN is to learn feature maps of an image and to combine them into progressively more sophisticated feature maps, which is essentially what U-Net relies on during Segmentation. However, the choice of CNN affects the results by a huge margin. Eric’s architecture uses Xception Net as the CNN backbone, which scores higher on ImageNet than the ResNet-18 used in the starter kit.

Xception Net reaches a Top-1 accuracy of 0.790 and a Top-5 accuracy of 0.945 on the ImageNet benchmark, compared with ResNet-18’s Top-1 accuracy of 0.695 and Top-5 accuracy of 0.894. Top-1 accuracy is the standard measure: the model's answer (the class with the highest probability) must match the expected answer exactly. Top-5 accuracy only requires that any of the model's five most likely answers match the expected answer.
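To make the two metrics concrete, here is a minimal sketch in plain Python. The function name `top_k_accuracy` and the scores below are made up for illustration; they are not taken from Eric's notebook or from any benchmark.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical scores for 4 samples over 6 classes (illustration only).
scores = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # true class 0 is ranked 1st
    [0.10, 0.40, 0.30, 0.10, 0.05, 0.05],  # true class 2 is ranked 2nd
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # true class 5 is ranked last
    [0.05, 0.05, 0.10, 0.60, 0.15, 0.05],  # true class 3 is ranked 1st
]
labels = [0, 2, 5, 3]

top1 = top_k_accuracy(scores, labels, k=1)
top5 = top_k_accuracy(scores, labels, k=5)
print(top1, top5)
```

In this toy data, the second sample counts as a miss for Top-1 but as a hit for Top-5, which is why Top-5 numbers are always at least as high as Top-1 numbers.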

This approach helped Eric climb to the 3rd position in the puzzle, with a minuscule change to the starter code!

🌳Data augmentation using Albumentations for Tree Segmentation

We can learn more about the ecosystem by measuring the density and frequency of trees. As a result, better forest conservation policies can be developed. Deep Learning can assist in the automation of this task.

In the Tree Segmentation puzzle, the participants were tasked with segmenting tree masks from a set of aerial shots of forests. Seems like an easy Image Segmentation task, right? As it turned out, the puzzle only provides 5000 images in the training set, which is a relatively small amount of data for training a deep learning model.

The Community Contributor winner jinoooooooooo came up with a time-efficient approach to the solution. In his submission, he enlists the help of the Albumentations Python library. The library provides a huge array of 70 Image Augmentation techniques aimed at generating new training samples from the existing dataset.

Take Eric’s approach to Image Segmentation with an Xception Net-based U-Net as an example: the architecture contains more than 70 layers. Such Deep Neural Networks, especially Convolutional Neural Networks, tend to overfit in the absence of an adequate amount of training data. Overfitting occurs when a machine learning model fits its training data too closely and fails to generalize to previously unseen data.

Image Augmentation is the technique of making new samples using existing ones, by making small changes to them. For example, you could produce a new image that is a bit brighter; you could cut a portion from the original image; you could create a reflection of the original image, and so on.
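The three examples above (brightening, cropping, reflecting) can be sketched with plain NumPy. This is only an illustration of the idea, not the Albumentations API and not jinoooooooooo's code; the helper names and the tiny 4x4 image are hypothetical. Note that for segmentation, geometric changes such as a crop or a flip have to be applied to the mask as well, so that image and mask stay aligned.

```python
import numpy as np


def brighten(image, delta):
    """Return a brighter copy; values are clipped to the valid 0..255 range."""
    return np.clip(image.astype(np.int16) + delta, 0, 255).astype(np.uint8)


def crop(image, mask, top, left, height, width):
    """Cut the same window out of the image and its mask."""
    window = (slice(top, top + height), slice(left, left + width))
    return image[window], mask[window]


def horizontal_flip(image, mask):
    """Mirror image and mask left-to-right so they stay aligned."""
    return image[:, ::-1], mask[:, ::-1]


# Hypothetical 4x4 grayscale "aerial shot" and its binary tree mask.
image = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:, :2] = 1  # pretend the trees cover the left half

bright = brighten(image, 40)
crop_img, crop_mask = crop(image, mask, top=1, left=1, height=2, width=2)
flip_img, flip_mask = horizontal_flip(image, mask)

print(bright.max(), crop_img.shape, flip_mask[0])
```

Each call produces a new training sample from the same original, which is how a small dataset can be stretched into a much larger one.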

Albumentations helped jinoooooooooo multiply the initial amount of data several times over, which in turn helped him avoid overfitting! Can you think of any other Data Augmentation library? Start a discussion about it in the AIcrowd Discourse!

🚀Torchextractor for feature extraction for Docking ISS

Docking a spacecraft to a space station allows us to perform experiments and learn how to live and operate in zero-gravity environments. In April of 2021, a SpaceX Dragon spacecraft performed its docking autonomously, instead of relying on manual control for course correction.

In the Docking ISS puzzle, the participants faced a scenario in which the sensors have malfunctioned and the docking has to be done using only the image feed. The Challenge Winner ksnxr had a very interesting approach to the prompt, which won him not only the puzzle but also the challenge.

The puzzle presents the participants with a large dataset of over 20,000 samples. In an ironic contrast to the previous section, where too little data was the problem, such a large dataset can become a major hindrance because of the sheer computational cost. We need a method to condense this data, since it is not feasible to process it manually.

Feature extraction helps to pull the most useful information out of large datasets by selecting and combining variables into features, thereby decreasing the quantity of data. These features are simple to use while still describing the real dataset accurately and uniquely.

Ksnxr used the Python library called torchextractor. The library gave him the ability to add feature extraction to his model with a single line of code. The user only needs to name the neural network block whose features they want, and the library returns those features.
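The idea of naming a block and getting its intermediate output back can be illustrated with a toy network. The sketch below is a plain NumPy stand-in written for this post: it does not use torchextractor or PyTorch, and the class `TinyNet`, its block names, and its layer sizes are all hypothetical rather than taken from ksnxr's solution.

```python
import numpy as np


class TinyNet:
    """Toy network made of named blocks, each a fixed linear map plus ReLU."""

    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        # Hypothetical block names and sizes, chosen only for this sketch.
        self.blocks = {
            "block1": rng.standard_normal((8, 6)),
            "block2": rng.standard_normal((6, 4)),
            "head": rng.standard_normal((4, 2)),
        }

    def forward(self, x, capture=()):
        """Run all blocks in order and also return the outputs of the blocks named in `capture`."""
        features = {}
        for name, weights in self.blocks.items():
            x = np.maximum(x @ weights, 0.0)  # linear map followed by ReLU
            if name in capture:
                features[name] = x
        return x, features


net = TinyNet()
batch = np.ones((3, 8))  # 3 samples with 8 input values each
output, features = net.forward(batch, capture=("block1", "block2"))
print(output.shape, {name: f.shape for name, f in features.items()})
```

The captured arrays are much smaller than raw images would be, which is the point of feature extraction: later stages can work on these compact features instead of the full input.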

Ksnxr won both the top score for this puzzle and the first rank on the leaderboard for Blitz X!

Do you wanna put these new methods and resources to the test? How about checking out more beginner-friendly Computer Vision challenges on our platform through AI, Crowd-Sourced: Computer Vision for Beginners?

What do you wanna read next? Let us know in the comments below or tweet us @AIcrowdHQ!🐥
