🧩 Sound Sentiment Puzzle: Identify sentiments from audio clips of reviews

🕵🏼‍♀️ What Is the Sound Sentiment Puzzle About?

We humans rely on our community's feedback and reviews for so many things. When our friends tell us about their visit to the new restaurant, we can gauge whether they had a positive or a negative experience. When our family talks about the new movie, we know whether they enjoyed it or not. But can machines identify sentiment from sound clips of reviews?

In this puzzle, you will merge multiple domains of AI to build a model that can identify sentiment from sound clips.

💪🏼 What You’ll Learn

  1. How to play with sound data
  2. How to perform sentiment classification

Let’s get started! 🚀

✔ The Task

Given an audio clip of a review, identify whether its sentiment is positive, negative, or neutral.

👩🏽‍💻 Explore Dataset

The dataset contains 24,000 audio samples, divided into training, validation, and test sets. The label for each training and validation clip is listed in train.csv and val.csv, keyed by the clip's id.

  1. Training Set: 15000 samples
  2. Validation Set: 2000 samples
  3. Test Set: 7000 samples

Here are some details about the dataset:

The training and validation sets each include a folder containing the wav audio files and a CSV file containing the sentiment label and wav file id for every clip.

  1. label: 2 = Positive, 1 = Neutral, 0 = Negative
  2. wav_id: the name of the audio file in the respective folder.

label   wav_id
2       16
1       17
0       21
2       23
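As a rough illustration of the label file described above, the sketch below parses the sample rows with Python's standard `csv` module and counts clips per sentiment class. It assumes the CSV has a header row with the columns `label` and `wav_id` separated by commas; the helper names `load_labels` and `class_counts` are hypothetical and not part of the starter kit.

```python
import csv
import io
from collections import Counter

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "negative", 1: "neutral", 2: "positive"}


def load_labels(csv_file):
    """Return a dict mapping wav_id (str) to its integer label.

    Assumes a header row with the columns `label` and `wav_id`.
    """
    reader = csv.DictReader(csv_file)
    return {row["wav_id"].strip(): int(row["label"]) for row in reader}


def class_counts(labels):
    """Count how many clips fall into each named sentiment class."""
    return Counter(LABEL_NAMES[label] for label in labels.values())


# Tiny in-memory stand-in for train.csv, using the sample rows above.
sample = io.StringIO("label,wav_id\n2,16\n1,17\n0,21\n2,23\n")
labels = load_labels(sample)
counts = class_counts(labels)
print(labels)
print(counts)
```

To run this on the real file, pass an opened `train.csv` instead of the in-memory sample, for example `with open("train.csv", newline="") as f: labels = load_labels(f)`. Checking the class counts early shows whether the three sentiments are balanced.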

🗂 Dataset Files

  • train.csv : (15000 samples) Contains the column wav_id, which corresponds with the audio id in train.zip, and the label column containing the sentiment labels: 2 = Positive, 1 = Neutral, 0 = Negative
  • train.zip : (15000 samples) Contains the review sound bites corresponding to the training dataset.
  • val.csv : (2000 samples) Contains the column wav_id, which corresponds with the audio id in val.zip, and the label column containing the sentiment labels.
  • val.zip : (2000 samples) Contains the review sound bites corresponding to the validation dataset.
  • test.zip : (7000 samples) This is your test dataset. Contains sound bites in wav format.
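Before training, it can help to confirm that every `wav_id` in a CSV has a matching clip in the corresponding archive. The sketch below does this with the standard `zipfile` module. It assumes each clip is stored as `<wav_id>.wav` somewhere inside the zip; the internal folder layout is not described here, so the `train/` prefix in the demo is made up, and `clip_ids_in_archive` and `missing_clips` are hypothetical helper names.

```python
import io
import zipfile
from pathlib import PurePosixPath


def clip_ids_in_archive(archive):
    """Return the set of file stems for all .wav entries in a zip archive.

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return {
            PurePosixPath(name).stem
            for name in zf.namelist()
            if name.lower().endswith(".wav")
        }


def missing_clips(wav_ids, archive):
    """Return the wav_ids that have no matching <wav_id>.wav in the archive."""
    return sorted(set(wav_ids) - clip_ids_in_archive(archive))


# Demo on an in-memory archive; the folder name `train/` is an assumption.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for wav_id in ("16", "17", "21"):
        zf.writestr(f"train/{wav_id}.wav", b"")
buffer.seek(0)

missing = missing_clips(["16", "17", "21", "23"], buffer)
print(missing)  # ['23']
```

With the real data, the call would look like `missing_clips(labels.keys(), "train.zip")`, where `labels` maps ids to sentiment labels; an empty list means every labelled clip is present.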

🔬 Let's Solve This Puzzle

The starter kit walks through every step: downloading the dataset, loading the libraries, processing the data, and creating, training, and testing the model.

Click here to access the basic starter kit. It contains in-depth instructions to:

  1. Download the necessary files
  2. Set up the AIcrowd CLI environment, which lets you make a submission directly from a notebook
  3. Download the dataset and import the libraries
  4. Preprocess the dataset
  5. Create the model
  6. Set up the model
  7. Train the model
  8. Submit the result
  9. Upload the results

Make your first submission using the starter kit. 🚀

🖊 Evaluation Criteria

The evaluation metrics for this puzzle are F1 score (primary score) and accuracy (secondary score).
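To make the two metrics concrete, here is a minimal pure-Python sketch that computes accuracy and a macro-averaged F1 score over the three sentiment classes. The averaging mode used by the official evaluator is not stated here, so macro averaging is an assumption, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Made-up labels: 2 = Positive, 1 = Neutral, 0 = Negative.
y_true = [2, 1, 0, 2, 0, 1]
y_pred = [2, 1, 0, 0, 0, 2]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```

In this toy example four of six predictions are correct, and the per-class F1 scores are 0.8, 2/3, and 0.5 for classes 0, 1, and 2. Tracking both numbers on the validation set gives a local estimate before submitting.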

🀫 Hint to get started

  1. To solve this puzzle, you can convert the audio signals to images and train a convolutional neural network on those images. Note: this is a simple approach 😄 You can find the code for this approach here.
  2. You can also convert the sound to text and classify the text into the respective sentiments.
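One common way to turn audio into an image, as the first hint suggests, is a log-magnitude spectrogram computed with a short-time Fourier transform. The hint does not say which image representation the linked code uses, so this choice is an assumption. The sketch below uses only the standard library and a synthetic tone in place of a decoded review clip; the frame size, hop length, and function names are illustrative.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def log_spectrogram(samples, frame_size=256, hop=128):
    """Return a list of frames, each a list of log-magnitudes per frequency bin."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_size)
              for i in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_size)]
        spectrum = fft(chunk)[: frame_size // 2 + 1]  # keep non-negative frequencies
        frames.append([math.log(abs(c) + 1e-10) for c in spectrum])
    return frames


# A synthetic 500 Hz tone sampled at 8 kHz stands in for a real clip.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(2048)]
spec = log_spectrogram(tone)
peak_bins = [frame.index(max(frame)) for frame in spec]
print(len(spec), len(spec[0]), set(peak_bins))
```

The result is a 2-D grid (time frames by frequency bins) that can be saved as a grayscale image or fed to a convolutional network as a single-channel input. A pure-Python FFT is slow on full-length clips, so a numerical library would be the practical choice for the whole dataset.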

📚 Resource Circle

Check out this blog which extracts features from sound and then classifies them into sentiments.

👯‍♀️ Get Help From Community

Hop over to the AIcrowd Blitz Discord server to see ongoing discussions about this puzzle.

🙋‍♀️ Subscription Queries

This is one of the many free Blitz puzzles you can access forever. To access more puzzles from various domains from the Blitz Library and receive a special new puzzle in your inbox every two weeks, you can subscribe to AIcrowd Blitz here.

