AIcrowd | Sound Sentiment Prediction

🧩 Sound Sentiment Puzzle: Identify sentiments from audio clips of reviews

🕵🏼‍♀️ What is The Sound Sentiment Puzzle About?

We humans rely on our community's feedback and reviews for so many things. When our friends tell us about their visit to a new restaurant, we can gauge whether they had a positive or a negative experience. When our family talks about a new movie, we know whether they enjoyed it or not. But can machines identify sentiment from the sound clips of reviews?

In this puzzle, you will merge multiple domains of AI to build a model that can identify sentiment from sound clips.

💪🏼 What You’ll Learn

How to play with sound data
How to perform sentiment classification

Let’s get started! 🚀

✔ The Task

Given an audio clip of a review, identify whether its sentiment is positive, negative, or neutral.

👩🏽‍💻 Explore Dataset

The dataset contains 24,000 audio samples in total, divided into training, validation, and test sets. The label for each training and validation clip is given in train.csv and val.csv, keyed by the clip's id.

Training Set: 15000 samples
Validation Set: 2000 samples
Test Set: 7000 samples

Here are some details about the dataset:

The training and validation datasets each include a folder containing the wav audio files and a CSV file containing the sentiment label and the wav file id.

label: 2 = Positive, 1 = Neutral, 0 = Negative
wav_id: this refers to the name of the audio file in the respective folder.

label	wav_id
2	16
1	17
0	21
2	23
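
As a minimal sketch of how these labels might be read, the snippet below parses the four sample rows shown above with Python's standard csv module. It assumes a comma-delimited file with a header row; the helper name read_labels and the in-memory sample are illustrative only and not part of the starter kit.

```python
import csv
import io

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "Negative", 1: "Neutral", 2: "Positive"}


def read_labels(csv_file):
    """Return a dict mapping wav_id (int) to label (int) from a CSV file object."""
    reader = csv.DictReader(csv_file)
    return {int(row["wav_id"]): int(row["label"]) for row in reader}


# Stand-in for open("train.csv"): the four sample rows from the table.
# Assumption: the real file is comma-delimited with a header row.
sample = io.StringIO("label,wav_id\n2,16\n1,17\n0,21\n2,23\n")
labels = read_labels(sample)

for wav_id, label in labels.items():
    print(wav_id, LABEL_NAMES[label])
```

With the real data, the same function could be called on an opened train.csv or val.csv instead of the in-memory sample.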

🗂 Dataset Files

train.csv : (15000 samples) Contains the column wav_id, which corresponds with the audio id in train.zip, and the label column containing the sentiment labels: 2 = Positive, 1 = Neutral, 0 = Negative
train.zip : (15000 samples) Contains the review sound bites corresponding to the training dataset.
val.csv : (2000 samples) Contains the column wav_id, which corresponds with the audio id in val.zip, and the label column containing the sentiment labels.
val.zip : (2000 samples) Contains the review sound bites corresponding to the validation dataset.
test.zip : (7000 samples) This is your test dataset. Contains sound bites in wav format.
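
Once the archives are extracted, each clip can be read as a sequence of samples. The sketch below uses only Python's standard wave module and makes two assumptions that the description does not confirm: that clips are mono 16-bit PCM, and that a clip is named after its wav_id (for example 16.wav). To stay runnable without the dataset, it first writes a short synthetic tone and then reads it back.

```python
import math
import os
import struct
import tempfile
import wave


def write_tone(path, freq_hz=440.0, seconds=0.1, rate=8000):
    """Write a mono 16-bit PCM sine tone, used here as a stand-in clip."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(rate)
        w.writeframes(frames)


def load_wav(path):
    """Return (sample_rate, samples) for a mono 16-bit PCM wav file."""
    with wave.open(path, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, samples


tmp_dir = tempfile.mkdtemp()
clip_path = os.path.join(tmp_dir, "16.wav")  # hypothetical name: <wav_id>.wav
write_tone(clip_path)
rate, samples = load_wav(clip_path)
print(rate, len(samples), len(samples) / rate)  # sample rate, sample count, duration in seconds
```

If the real clips turn out to be stereo or use a different sample width, load_wav raises an error rather than silently misreading the data.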

🔬 Let's Solve This Puzzle

The starter kit breaks down every step, from downloading the dataset, loading the libraries, and processing the data to creating, training, and testing the model.

Click here to access the basic starter kit. It contains in-depth instructions to:

Download the necessary files
Set up the AIcrowd CLI environment that will help you make a submission directly via a notebook
Download the dataset and import libraries
Preprocess the dataset
Create the model
Set up the model
Train the model
Submit the result
Upload the results

Make your first submission using the starter kit. 🚀

🖊 Evaluation Criteria

The evaluation metrics for this puzzle are F1 score (primary score) and accuracy (secondary score).

$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
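
The formula above is defined per class, so a three-class problem needs a rule for combining the per-class scores. The description does not say which averaging the leaderboard uses; the sketch below assumes macro averaging (an unweighted mean over the three classes) purely for illustration, and the function names are hypothetical.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating that class as 'positive' and the rest as 'negative'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 (assumed averaging, not confirmed by the source)."""
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy labels: 2 = Positive, 1 = Neutral, 0 = Negative.
y_true = [2, 1, 0, 2, 0, 1]
y_pred = [2, 1, 0, 0, 0, 2]
print(macro_f1(y_true, y_pred), accuracy(y_true, y_pred))
```

Checking validation predictions this way before submitting gives a rough local estimate, though the official score may differ if a different averaging is used.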

🤫 Hint to get started

To solve this puzzle, convert the audio signals to images and use those images to train a convolutional neural network. Note: This is a simple approach 😄 You can find the code for this approach here.
You can also convert the sound to text and classify the text into the respective sentiments.
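
One common way to turn audio into an image is a spectrogram, a grid of signal magnitudes over time and frequency. The hint does not name the image representation, so treating it as a spectrogram is an assumption here. The sketch below computes one with a naive short-time DFT using only the standard library so that it runs anywhere; in practice a dedicated audio or signal-processing library would be much faster. The frame size and hop length are arbitrary illustrative values.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame holds frame_size // 2 + 1 magnitudes.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Stand-in signal: a 1000 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(512)]
spec = spectrogram(tone)

# The strongest frequency bin of the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin * rate / 64)
```

Each row of spec is one time slice, so the nested list can be treated as a single-channel image (often after taking a logarithm of the magnitudes) and fed to a convolutional network.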

📚 Resource Circle

Check out this blog post, which extracts features from sound and then classifies them into sentiments.

👯‍♀️ Get Help From Community

Hop over to the AIcrowd Blitz Discord server to see ongoing discussions about this puzzle.

🙋‍♀️ Subscription Queries

This is one of the many free Blitz puzzles you can access forever. To access more puzzles from various domains from the Blitz Library and receive a special new puzzle in your inbox every two weeks, you can subscribe to AIcrowd Blitz here.