Snake Species Identification Challenge
Classify images of snake species from around the world
Snakebite is the most deadly neglected tropical disease (NTD) and is responsible for a dramatic humanitarian crisis in global health.
Snakebite causes over 100,000 human deaths and 400,000 victims of disability and disfigurement globally every year. It affects poor and rural communities in developing countries, which host the highest venomous snake diversity and the highest burden of snakebite due to limited medical expertise and access to antivenoms.
Antivenoms can be life‐saving when correctly administered but this depends first on the correct taxonomic identification (i.e. family, genus, species) of the biting snake. Snake identification is challenging due to:
- their high diversity
- the incomplete or misleading information provided by snakebite victims
- healthcare professionals' lack of knowledge or resources in herpetology
In this challenge we want to explore how Machine Learning can help with snake identification, in order to potentially reduce erroneous and delayed healthcare actions.
Species richness of reptiles worldwide
In this challenge you will be provided with a dataset of RGB images of snakes, and their corresponding species (class) and geographic location (continent, country). The goal is to train a classification model.
The difficulty of the challenge stems from the dataset characteristics, as there might be high intraclass variance for certain classes and low interclass variance among others, as shown in the examples from the Datasets section. The images are also unevenly distributed across classes: the class with the most images has 17,749, while the class with the fewest has 552.
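One common way to cope with such an imbalance is to weight each class inversely to its frequency during training. The following is a minimal sketch, not part of the challenge materials: the function name `inverse_frequency_weights`, the class names and the middle count of 2,000 are illustrative assumptions, and only the 17,749 and 552 figures come from the description above.

```python
from typing import Dict


def inverse_frequency_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Return per-class weights proportional to 1 / class frequency.

    Weights are scaled so that a perfectly balanced dataset would give
    every class a weight of 1.0 (total / (n_classes * count)).
    """
    if not counts:
        raise ValueError("counts must not be empty")
    if any(c <= 0 for c in counts.values()):
        raise ValueError("every class needs at least one image")
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * c) for label, c in counts.items()}


# Toy example: class names and the middle count are hypothetical; the
# largest and smallest counts are the ones quoted in the description.
example_counts = {"largest_class": 17749, "mid_class": 2000, "smallest_class": 552}
weights = inverse_frequency_weights(example_counts)
imbalance_ratio = max(example_counts.values()) / min(example_counts.values())
```

With the quoted figures, the largest class has roughly 32 times as many images as the smallest, so under this scheme an image from the smallest class would count about 32 times as much in the loss as one from the largest.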
For now, we would like to make the barrier to entry much lower and demonstrate that an approach works well on 85 species and 187,720 images. The idea is then to renew the challenge every 4 months in order to get closer to our final goal: an algorithm that best predicts which antivenom, if any, should be given for a specific image.
Number of images per species in the dataset
Snakes are extremely diverse, and snake biologists continue to document and describe snake diversity, with an average of 30 new species described per year since the year 2000. Although most people probably think of snakes as a single “kind” of animal, humans are as evolutionarily close to whales as pythons are to rattlesnakes, so snakes are in fact very diverse! Taxonomically speaking, snakes are classified into 24 families, containing 528 genera and 3,709 species.
You can download the datasets in the Datasets section. You are provided with a Train.tar.gz file composed of 187,720 RGB images of varying size, split into 85 species.
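Once the archive is downloaded, it can be useful to check the class distribution before training. The sketch below assumes that each image is stored under a directory named after its class (for example `train/<class_id>/<image>.jpg`). This layout, the helper names and the list of extensions are assumptions rather than something the description specifies, so adjust them to the actual archive.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed image formats


def count_images_per_class(member_names: Iterable[str]) -> Counter:
    """Count image files per class, taking the parent directory as the class."""
    counts: Counter = Counter()
    for name in member_names:
        path = PurePosixPath(name)
        # Skip non-image files and files that sit at the archive root.
        if path.suffix.lower() in IMAGE_EXTENSIONS and len(path.parts) >= 2:
            counts[path.parent.name] += 1
    return counts


def count_images_in_archive(archive_path: str) -> Counter:
    """Read member names from a .tar.gz archive without extracting it."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = [m.name for m in archive.getmembers() if m.isfile()]
    return count_images_per_class(names)
```

If the assumed layout holds, `count_images_in_archive("Train.tar.gz")` should report 85 classes whose counts sum to 187,720; a different result would suggest the archive is organized differently.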
Several aspects of snake morphology make this challenge more difficult:
- Some species have patterns that vary depending on their age
- Some species have patterns that vary depending on their location
- Two species might look very similar, with one being venomous and the other not
The first iteration of the dataset contains few such species, but we will add more later.
- 1 travel grant to Geneva Health Forum 2020
- 1 co-authored paper releasing the dataset and describing the top solution(s) as baselines
This is the very first benchmarking challenge, meaning that it has no end date, but it will be updated every 3 months. Here are the first deadlines:
- January 21, 2019
- May 31, 2019: Qualifying Round 1 submission deadline
- July 31, 2019: Qualifying Round 2 submission deadline
- January 17, 2020: Qualifying Round 3 submission deadline
A puff adder, one of the most dangerous snakes in Africa