AIcrowd | Speaker Identification

Round 1: Completed #educational Weight: 15.0

🎥 Introduction

The Minecraft community alone has registered 1 trillion views on YouTube; that is 1,000,000,000,000! With 500 hours of video being uploaded to YouTube every minute, automated subtitles or closed captions help us understand content from various parts of the world. Whether it's the latest episode of your favorite anime or the new trailer for Money Heist, these closed captions make videos more accessible.

Closed captions work smoothly with the help of YouTube's own transcriber, working overtime to deliver the most relevant captions in over 100 languages. Speaker Identification presents participants with a large array of unlabelled sentences taken from various videos of 10 machine learning YouTubers. Participants are then required to implement unsupervised learning methods to group these sentences into clusters, each representing one of the YouTubers.

Sounds a bit difficult? What if we tell you that even you have an identifiable way to speak?
A favorite preposition, an exclamation, or a unique sentence connector: all of these define your style of speaking. Your machine learning model's task is to identify those underlying patterns using unsupervised methods to solve this puzzle.

💪 Getting Started

Much like our other puzzles, we provide a starter kit that you can use as a reference to plan your data pre-processing and general approach. Given the abundance of data, participants may opt for a text-dependent approach to the puzzle. This can be achieved by using clustering algorithms to capture the "voice-print" of a data sample, identifying the words and phrases that a speaker uses repeatedly.

In the starter kit, we have set a baseline for the puzzle using K-Means clustering. In this algorithm, each data point is assigned to a cluster such that the sum of squared distances between the data points and their cluster centroids is minimized. K-Means is a common choice for unsupervised learning, and we hope it helps you find your own challenge-winning approach!
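To make the K-Means objective concrete, here is a minimal pure-Python sketch of the algorithm. It is not the starter kit's code: the function names (`bag_of_words`, `kmeans`), the word-count features, and the random initialization are all assumptions chosen for illustration. In practice a library implementation such as scikit-learn's `KMeans` with a TF-IDF vectorizer would be the more likely choice.

```python
import random
from collections import Counter


def bag_of_words(texts):
    """Turn each text into a word-count vector over a shared vocabulary.

    Hypothetical featurizer: lowercases and splits on whitespace only.
    """
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        vec = [0.0] * len(vocab)
        for word, count in Counter(t.lower().split()).items():
            vec[index[word]] = float(count)
        vectors.append(vec)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update.

    Returns (labels, inertia), where inertia is the sum of squared
    distances from each point to its assigned centroid.
    """
    rng = random.Random(seed)
    # Assumption: initialize centroids from k randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    inertia = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, inertia


# Usage: two well-separated groups of 2-D points.
labels, inertia = kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], k=2)
print(labels, inertia)
```

For the puzzle itself, the vectors from `bag_of_words` (or any other text features) would take the place of the toy points, with `k` set to the number of speakers you expect.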

💾 Dataset

The dataset for this challenge contains sentences from speeches, transcripts, and lectures given by some famous YouTubers. Each data point is a group of 2-5 sentences spoken by a single speaker. Since this is an unsupervised learning challenge, the labels for these sentences are not provided. Sentences from the same person need to be clustered together. Transcripts from over 10 different YouTubers are present in the dataset.

📁 Files

The following files are available in the resources section:

test.csv - This CSV file contains the sentences that need to be clustered together.
sample_submission.csv - This file contains random labels for the test data. Use it to check that your submission format is correct.

🚀 Submission

Creating a submission directory
Use sample_submission.csv to create your submission. The headers of the columns should be "id" and "prediction".
Save the CSV in the submission directory. The file should be named submission.csv.
Inside the submission directory, put the .ipynb notebook in which you trained the model and ran inference, saved as original_notebook.ipynb.

Overall, your submission directory should contain two files: submission.csv and original_notebook.ipynb.

Zip the submission directory!
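The packaging steps above can be sketched with the Python standard library. The helper name `build_submission`, its parameters, and the choice to place both files at the root of the archive are assumptions for illustration; check the starter kit for the exact layout the evaluator expects.

```python
import csv
import shutil
import zipfile
from pathlib import Path


def build_submission(ids, predictions, notebook_path, out_dir="submission"):
    """Write submission.csv, copy the notebook, and zip the directory.

    Hypothetical helper: returns the path of the created zip file.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # submission.csv with the required "id" and "prediction" headers.
    with open(out / "submission.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "prediction"])
        writer.writerows(zip(ids, predictions))

    # The notebook must be stored under this exact name.
    shutil.copy(notebook_path, out / "original_notebook.ipynb")

    # Assumption: files sit at the root of the archive.
    zip_path = out.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(out.iterdir()):
            zf.write(file, arcname=file.name)
    return zip_path
```

A call such as `build_submission(ids, cluster_labels, "my_notebook.ipynb")` would produce `submission.zip` next to the `submission` directory, where `ids` and `cluster_labels` stand in for your own data.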

Make your first submission here 🚀 !!

🖊 Evaluation Criteria

During evaluation, the Adjusted Rand Score will be used to measure how well your clusters agree with the true speaker labels.
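As a rough illustration of what this metric measures, the sketch below computes the adjusted Rand index from scratch using the standard pair-counting formula. The function name `adjusted_rand_index` is hypothetical, and the platform's own scoring code is not shown here; scikit-learn's `adjusted_rand_score` is the usual library implementation. Since the true labels are hidden, this is useful mainly for understanding the score or for experiments on data you have labelled yourself.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    n = len(labels_true)

    # Contingency counts: how many items share each (true, pred) pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs that fall together in each grouping.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: labelings trivially agree

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both
    return (sum_cells - expected) / (max_index - expected)


# Cluster ids are arbitrary: swapping the names still gives a perfect score.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
# A grouping that splits every true cluster scores below zero.
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The first example shows why your predicted cluster numbers do not need to match any particular speaker numbering: only the grouping matters.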

📱 Contact

Aditya Jha
Shubhamai