AI Blitz #9: 7 days left #educational #blitz Weight: 15.0


📝 Don't forget to participate in the Community Contribution Prize!


 

Introduction

 

In the previous puzzle, Emotional Detection, you performed a binary classification task. This puzzle levels up to multi-class classification. The input dataset consists of text taken from research papers, and you need to build a model that correctly assigns each text a label from 0 to 3.

To solve this challenge, you will use the concepts of LSTM and Vectorization with TensorFlow.

Now, what is LSTM?!


💪 Getting Started

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) that can learn from sequential data such as text. Unlike feedforward neural networks, an LSTM has feedback connections that let it retain information from earlier parts of a text while it processes the parts that follow. Read more about the concept of LSTM over here.
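To make the "feedback connection" idea concrete, here is a minimal sketch of a single LSTM cell in plain Python, using a scalar input and a scalar state. The function names, the `DEMO_PARAMS` weights, and the scalar simplification are all assumptions made for illustration; a real TensorFlow LSTM layer works on vectors and learns its weights during training.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """Run one LSTM time step with scalar input and scalar state.

    params maps a gate name to (w_x, w_h, b). The names are illustrative.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value for the cell state

    c = f * c_prev + i * g            # new cell state (the "memory")
    h = o * math.tanh(c)              # new hidden state (fed back next step)
    return h, c


def run_lstm(sequence, params):
    """Feed a sequence of numbers through the cell, starting from zero state."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c


# Hand-picked demo weights (hypothetical, not learned).
DEMO_PARAMS = {
    "forget": (0.0, 0.0, 2.0),
    "input": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 1.0),
    "candidate": (1.0, 0.5, 0.0),
}

if __name__ == "__main__":
    # The first input is still reflected in the state two steps later.
    print(run_lstm([1.0, 0.0, 0.0], DEMO_PARAMS))
    print(run_lstm([0.0, 0.0, 0.0], DEMO_PARAMS))
```

The two sequences differ only in their first element, yet their final hidden states differ: the cell state `c` carries that early information forward through the later steps.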

Word Vectorization is the second process used in this challenge. Simply put, it converts words into numbers. Why? Because a neural network operates on numbers rather than raw strings, and numeric representations make it possible to model word prediction, word similarity, and semantics. Learn more about the concept here.

To solve this challenge, you need to convert the text into tokens and encode them using Vectorization. After this, train a TensorFlow model with LSTM layers, then test it and submit the results to get your score.

AIcrowd's easy-to-use baseline has a breakdown of all the tools and code required to get started. Find the starter code-kit here.
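As a rough illustration of turning words into numbers, the sketch below builds a vocabulary from a few texts and maps each text to a fixed-length list of integer ids. It is plain Python rather than the TensorFlow layer the baseline uses; the helper names, the regex tokenizer, and the choice of 0 for padding and 1 for unknown words are assumptions of this sketch.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary (assumed convention)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(texts, max_tokens=10000):
    """Map the most frequent tokens to ids 2, 3, 4, ... by descending frequency."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    most_common = counts.most_common(max_tokens - 2)
    return {tok: idx for idx, (tok, _) in enumerate(most_common, start=2)}


def vectorize(text, vocab, seq_len=8):
    """Encode a text as exactly seq_len ids, truncating or padding as needed."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:seq_len]
    return ids + [PAD] * (seq_len - len(ids))


if __name__ == "__main__":
    corpus = ["Deep learning for robots", "Deep vision"]
    vocab = build_vocab(corpus)
    print(vectorize("Deep unseen words", vocab, seq_len=4))
```

Fixing the sequence length lets every text in a batch share the same shape, which is what a recurrent layer expects as input.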


💾 Dataset

The dataset is, again, fairly easy to understand! Every training/validation dataset has two columns: text and label. The text column holds the abstract of a research paper, and the label column represents the category that the research paper falls in.

text | label
Estimating 3D hand meshes from single RGB ...... Each technical component above meaningfully improves the accuracy in the ablation study. | 2
The emergence of collective ...... classes and overlapping structures of data. | 0

 

The label categories are as follows: Artificial Intelligence, Machine Learning, Robotics, Computer Vision.


📁 Files

The following files are available in the resources section:

  • train.csv - (31499 samples) This CSV file contains a text column with the text of the research paper and a label column with its category.

  • val.csv - (2699 samples) This CSV file contains a text column with the text of the research paper and a label column with its category.

  • test.csv - (10799 samples) This CSV file contains a text column with the text of the research paper and a label column for its category. This file also serves as sample_submission.csv.
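A minimal sketch of reading one of these files with Python's standard `csv` module is shown below. It assumes only what is stated above, namely a header row with `text` and `label` columns; the function names are hypothetical, and the local file path is an assumption about where you saved the download.

```python
import csv
from collections import Counter


def read_rows(lines):
    """Parse CSV lines with 'text' and 'label' columns into two lists."""
    texts, labels = [], []
    for row in csv.DictReader(lines):
        texts.append(row["text"])
        labels.append(int(row["label"]))
    return texts, labels


def load_split(path):
    """Load a split such as train.csv or val.csv from disk."""
    with open(path, newline="", encoding="utf-8") as f:
        return read_rows(f)


def label_counts(labels):
    """Count samples per label to check how balanced the classes are."""
    return Counter(labels)


if __name__ == "__main__":
    # Assumes train.csv was downloaded into the working directory.
    texts, labels = load_split("train.csv")
    print(len(texts), label_counts(labels))
```

Checking the label counts early is useful because the evaluation weights each class by how often it occurs.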


🚀 Submission

  • Create a submission directory.
  • Use test.csv and fill in the corresponding labels.
  • Save the filled-in test.csv in the submission directory under the name submission.csv.
  • Inside the submission directory, put the .ipynb notebook in which you trained the model and ran inference, saved as original_notebook.ipynb.

Overall, this is what your submission directory should look like:

submission
├── submission.csv
└── original_notebook.ipynb
  • Zip the submission directory!
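The steps above could be scripted roughly as follows. This is a sketch under assumptions: `build_submission` is a hypothetical helper, the column order `text, label` is assumed to mirror test.csv, and the archive is assumed to contain the `submission` directory itself rather than its files at the top level.

```python
import csv
import os
import shutil


def build_submission(rows, notebook_path, out_dir="submission"):
    """Write submission.csv, copy the notebook, and zip the directory.

    rows: iterable of (text, predicted_label) pairs.
    notebook_path: path to the notebook used for training and inference.
    Returns the path of the created zip archive.
    """
    out_dir = os.path.abspath(out_dir)
    os.makedirs(out_dir, exist_ok=True)

    csv_path = os.path.join(out_dir, "submission.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text", "label"])  # assumed to match test.csv
        writer.writerows(rows)

    shutil.copyfile(notebook_path, os.path.join(out_dir, "original_notebook.ipynb"))

    # Creates <out_dir>.zip next to the directory, with the directory inside it.
    parent, name = os.path.split(out_dir)
    return shutil.make_archive(out_dir, "zip", root_dir=parent, base_dir=name)
```

Calling `build_submission(zip(test_texts, predictions), "my_notebook.ipynb")` would then leave a `submission.zip` in the working directory, where `test_texts`, `predictions`, and the notebook name are placeholders for your own values.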

Make your first submission here 🚀!!


🖊 Evaluation Criteria

During evaluation, the weighted-average F1 score and the accuracy score are used to measure the performance of the model, where

\(\texttt{accuracy}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples}-1} 1(\hat{y}_i = y_i)\)
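To check a model locally before submitting, both metrics can be computed in a few lines of plain Python. The accuracy function follows the formula above. The weighted F1 function uses the usual definition (per-class F1 averaged with weights proportional to each class's number of true samples), which is assumed, not confirmed, to match the evaluator; the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


if __name__ == "__main__":
    y_true = [0, 1, 2, 3, 2, 0]
    y_pred = [0, 1, 2, 2, 2, 1]
    print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Because the F1 average is weighted by support, frequent categories influence the score more than rare ones.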


🔗 Links

📱 Contact
