Join us for the closing webinar where top participants will discuss their solutions!
🛠 Contribute: Found a typo? Or any other change in the description that you would like to see? Please consider sending us a pull request in the public repo of the challenge here.
Today, we are producing more information than ever before, but not all of it is true. Some of it is actually malicious and harmful, which makes it harder for us to trust any piece of information we come across. On top of that, bad actors can now use language modelling tools like OpenAI's GPT-2 to generate fake news. Ever since its initial release, there have been concerns about how it could be misused to generate misleading news articles, automate the production of abusive or fake content for social media, and automate the creation of spam and phishing content.
How do we figure out what is true and what is fake? Can we do something about it?
This challenge does exactly that! In this challenge, you differentiate real news from fake news generated by GPT-2. Given a dataset of various texts, can you predict whether each one is real or fake?
With fake news so rampant, our trust in our institutions is starting to waver, and this challenge initiates efforts to tackle SDG 16 - Trust in (Government) institutions.
Understand with code! Here is the getting-started code for you.
The dataset consists of around 387,000 pieces of text, sourced from various news articles on the web as well as texts generated by OpenAI's GPT-2 language model. The dataset is split into train, val, and test sets such that each set has an equal split of the two classes.
The following files are available in the resources section:

- train (232,003 samples): This CSV file contains 232,003 texts and their corresponding labels, i.e. whether the text is real or fake.
- val (38,666 samples): This CSV file contains 38,666 texts and their corresponding labels, i.e. whether the text is real or fake.
- test (115,999 samples): This CSV file contains 115,999 texts without their corresponding labels.
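As a sketch of how one might start on these files, here is a minimal TF-IDF + logistic regression baseline. The texts, labels, and two-column layout below are illustrative assumptions standing in for the actual train set, not the challenge data itself:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the real training texts and labels (assumed layout:
# one text column plus a real/fake label column).
texts = [
    "the government announced a new policy today",
    "aliens secretly control the stock market, sources say",
    "local council approves budget for road repairs",
    "miracle pill cures all diseases overnight",
]
labels = ["real", "fake", "real", "fake"]

# Vectorize the texts and fit a simple linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Predict labels for (here, the same) texts.
preds = model.predict(texts)
```

In practice you would read the train/val CSVs from the resources section into the `texts`/`labels` lists and predict on the test texts instead.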
- Prepare a CSV with the header [label] and a predicted value of real/fake for each text, in the same order as the test set.
- Name of the above file should be
- Sample submission format available as sample_submission.csv in the resources section.
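The steps above can be sketched with pandas as follows. The predictions and the output filename here are illustrative assumptions; use your model's test-set predictions and check sample_submission.csv for the exact expected format:

```python
import pandas as pd

# Hypothetical predictions for the test texts, in the same order
# as the test set.
preds = ["real", "fake", "real"]

# Single-column CSV with header [label], one row per test text.
submission = pd.DataFrame({"label": preds})
submission.to_csv("submission.csv", index=False)  # filename is an assumption
```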
Make your first submission here 🚀 !!
🖊 Evaluation Criteria
During evaluation, the F1 score is used, where F1 = 2 · (precision · recall) / (precision + recall).
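To see how your predictions would score locally, you can compute the same metric with scikit-learn. The labels below are illustrative, and treating "fake" as the positive class is an assumption; check the challenge page for the exact evaluation setup:

```python
from sklearn.metrics import f1_score

# Illustrative ground-truth and predicted labels.
y_true = ["real", "fake", "fake", "real"]
y_pred = ["real", "fake", "real", "real"]

# F1 = 2 * (precision * recall) / (precision + recall),
# computed with "fake" as the positive class (an assumption).
score = f1_score(y_true, y_pred, pos_label="fake")
# Here: precision = 1.0, recall = 0.5, so score = 2/3
```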
- 💪 Challenge Page: https://www.aicrowd.com/challenges/fnews
- 🗣️ Discussion Forum: https://www.aicrowd.com/challenges/fnews/discussion
- 🏆 Leaderboard: https://www.aicrowd.com/challenges/fnews/leaderboards