Challenge Summary: CLEF challenges
In this blog post, we cover some background about the CLEF community and summarize the different CLEF challenges on AIcrowd. This post also contains excerpts from our interview with the coordinators Ivan Eggel and Henning Müller.
The CLEF Initiative (Conference and Labs of the Evaluation Forum, formerly known as Cross-Language Evaluation Forum) promotes research and development by providing an infrastructure for:
- multilingual and multimodal system testing, tuning and evaluation
- investigation of the use of unstructured, semi-structured, highly-structured, and semantically enriched data in information access
- creation of reusable test collections for benchmarking
- exploration of new evaluation methodologies and innovative ways of using experimental data
- discussion of results, comparison of approaches, exchange of ideas, and transfer of knowledge
How did the CLEF community start?
“... CLEF community, it actually started historically as part of TREC. It was a task in TREC in 1999, and I think in 2000 it became independent and was moved to Europe from TREC, which has historically been in the US, because multilingual retrieval was considered to be more important for Europe. So, in that respect, they moved it with the European Conference on Digital Libraries, and for the past 10 years it's been an independent conference that attracts around 200-250 people every year...”
What was the motivation for splitting ImageCLEF and LifeCLEF into separate labs?
“... within CLEF there's a limitation for each of the labs in terms of the number of tasks that they can propose, so they limited it to three to four tasks maximum. At some point, LifeCLEF had several propositions for new tasks and they couldn't accommodate them, they could only accommodate one task of LifeCLEF. So, at some point they said, okay let's make it an independent lab even though they are linked, and are on the same web pages. But the idea was really to separate it to have more room and also focus on more than images, like videos and sounds... … ImageCLEF wouldn't make sense to have sounds because it's not linked to visual content, videos you could maybe justify, but not so much sounds. We had whale sounds, bird sounds, so that's more for the Signal Processing community than the Computer Vision community…”
And what kind of real-world applications is the community trying to target?
“... there is also a large part of people who work on biodiversity, who actually combine different sources because for them it's important to know what are invasive plants, and also how is bird life changing with global warming or with deforestation or like increased traffic, these kinds of things. So that's something pretty interesting… … people can take pictures of plants, send them, let them classify, but that also gives the organizers the possibility, with like GPS coordinates, and time of the year, to see are there any patterns changing, is like the flower season getting earlier and earlier or are specific plants growing higher and higher in the mountains because of climate change, these kinds of things… … people tend to take for example pictures of plants they do not know, or of plants that are very invasive, that are everywhere and by that, very likely if there's any weird species that start appearing in different areas, you will likely get pictures of those because that's the pattern that people use...”
The main objective of ImageCLEF is to advance the field of image retrieval and offer evaluation in various fields of image information retrieval.
ImageCLEF initially focused on text-based multilingual image retrieval, but it has since evolved into a broader multimedia effort that combines text and images and covers image classification and a variety of other tasks.
LifeCLEF is an offshoot of ImageCLEF that extends further into multimedia, covering sounds and videos as well as images, with every task focused on living organisms such as plants and animals.
It has traditionally involved people from the Information Retrieval community, and now also involves members from Multimedia Analysis, Machine Learning and Computer Vision, especially from Medical Imaging and Medical Text Analysis domains.
COLLABORATION WITH AIcrowd 🤝
ImageCLEF started in 2003 and has run every year since.
AIcrowd (crowdAI at that time) started hosting the CLEF challenges from 2018 onwards.
We asked the organizers about their motivation for selecting AIcrowd as the platform:
“...we had our own system but that was very old and that we wanted to replace for a couple of years. So, we took that opportunity and said like, okay, here we have a new system that's more modern, that allows us to also have a leaderboard, that goes beyond the actual challenge, these kinds of things that we didn't have before. So, we just had a submission system before and then we did offline evaluations and that was the main reason actually to switch over to AIcrowd...”
“...previously, the organizers had to download all the submissions and then do grading locally on their machines. Whereas right now in AIcrowd or crowdAI before, you have the continuous evaluation actually directly on the platform.”
As of December 2019, 19 CLEF challenges have been hosted on AIcrowd.
The following challenges have concluded:
- ImageCLEF 2018 Caption - Composing coherent captions for the entirety of an image
- ImageCLEF 2018 Caption - Identifying relevant concepts in a large corpus of medical images
- ImageCLEF 2018 VQA-Med - Visual question answering in the medical domain
- LifeCLEF 2018 Bird - Bird sounds recognition in soundscapes recordings
- LifeCLEF 2018 Bird - Bird sounds recognition from monodirectional recordings
- LifeCLEF 2018 Expert - Image-based identification of plant species
- ImageCLEF 2018 Tuberculosis - Scoring the severity of TB cases based on chest CT images
- ImageCLEF 2018 Tuberculosis - Classification of tuberculosis types using CT volumes
- ImageCLEF 2018 Tuberculosis - Tuberculosis multi-drug-resistance detection based on CT image analysis
- LifeCLEF 2019 Bird Recognition - Bird species recognition in soundscapes
- ImageCLEF 2019 VQA-Med - Visual question answering in the medical domain
- ImageCLEF 2019 Tuberculosis - CT report
- ImageCLEF 2019 Tuberculosis - Severity scoring
- ImageCLEF 2019 Caption - Detecting relevant concepts in a corpus of radiology images
- ImageCLEF 2019 Security - Retrieve hidden messages from forged stego images
- ImageCLEF 2019 Security - Identify forged images
- ImageCLEF 2019 Security - Identify stego images
- LifeCLEF 2019 Plant - Image-based plant identification on Amazonian flora
- LifeCLEF 2019 Geo - Location-based species prediction
- ImageCLEF 2019 Coral - Pixel-wise parsing
- ImageCLEF 2019 Coral - Annotation and Localisation
We asked the organizers to share a few things which they really liked about hosting the challenges on AIcrowd. Here’s what they had to say:
“... we wanted to run challenges beyond the actual deadlines. We want to have a deadline when people submit, so everybody's under the same time constraints, but then I think it's good to keep the datasets open. So, everybody who participated in the challenge could then publish afterwards on the data. That was something that AIcrowd offered us...”
“... obviously we did not want to build ourselves a new system from the bottom up, it's a huge amount of work and especially if you want to make it professionally and commercially. But still, I think it's worth the effort at the end because we can integrate it in a very nice and modern system, have the leaderboard there, everything is centralized and we can also upload the data there. So, thanks Mohanty for hosting our challenges...”
The organizers are equally appreciative of the quality of the solutions submitted to these challenges so far.
As Henning shared with us:
“... I have the impression that the quality of these submissions has always been fairly good. We feel that we have state-of-the-art results and the challenges are not easy, but that people can work on them with their limited resources. Often, it's interesting to see that it's not only the actual techniques that make the performance but also how they are employed. So quite often people use exactly the same techniques, like those groups who have best results and those groups who have really bad results, they just used in slightly different ways. So it's really about the fine-tuning and I think that's one of the main messages to get out...Sometimes it's surprising, some people use a lot of manual work, rule based systems instead of doing Machine Learning, because sometimes the datasets are not big enough to actually generalize well...sometimes we also have submissions with not so good results, but where we feel that the technique actually has a lot of potential...”
SOME FEEDBACK 😇
As part of our ongoing effort to improve the overall experience of our platform, we reached out to the organizers for their suggestions. Here is what they shared:
“... having a system that's easy and simple to use for all of the organizers...I think the easier it is to use the platform, the better it is... … another thing is advertisements. The more a platform is used, the more people have access to it, the more people would participate. And if we have more participants then we will also get more scientific output, because we can compare a large number of algorithms and how they work…”
The organizers were also very clear about the kind of solutions they would like to see come out of these challenges.
“... I don't want to go where it's competition for money, where it's not about learning something, but getting best results. I really want to focus what we do more on the scientific aspect. So, having papers, having their algorithms described. I don't want a black box where I don't know why it works and why it doesn't work. But I want to have details on the system. I think that's important... … that's also why I would like to have code because with code, if you have a description of the code, we could run it in a different scenario so we can see if it would still work on new data for example. Particularly if we have data that comes regularly, if we have code, we could rerun it on more data as it becomes available. That would be something that's extremely interesting...”
FUTURE PLANS 🔮
The organizers are busy working on future challenges that address some of the critical issues the world faces today.
“... I'm currently trying to get access to data for Covid imaging which has CTs, chest X-rays. There are different initiatives underway, but also trying to get solid data, where we can actually do something that would have a clinical impact. So, we would need a timeline, we need symptom onset etc., to see at what time, we can see what is on the images. But that's not easy to get. If we get that, that would be extremely interesting to run a challenge on that...”
The following are some of the upcoming CLEF challenges on AIcrowd:
- LifeCLEF 2020 Plant
- ImageCLEF 2020 Lifelog - LMRT
- ImageCLEF 2020 Lifelog - SPLL
- ImageCLEF 2020 Coral - Annotation and Localisation
- ImageCLEF 2020 Coral - Pixel-wise parsing
- LifeCLEF 2020 Geo
- LifeCLEF 2020 Bird - Monophone
- LifeCLEF 2020 Bird - Stereo
- ImageCLEF 2020 DrawnUI
- ImageCLEF 2020 VQA-Med - VQA
- ImageCLEF 2020 VQA-Med - VQG
- ImageCLEF 2020 Tuberculosis - CT report
- ImageCLEF 2020 Caption - Concept Detection
We hope you find these challenges interesting and take part in them.
More exciting challenges are sure to follow, so stay tuned.