Challenges Entered
Shopping Session Dataset
Amazon KDD Cup 2022
A dataset and open-ended challenge for music recommendation research
Predict if users will skip or listen to the music they're streamed
Amazon KDD Cup '23: Multilingual Recommendation Ch
Spotify Million Playlist Dataset Challenge
Any way to get access to this dataset for teaching?
Almost 2 years ago
Hi tommcd09 and nathan_carter. From the terms and conditions, the "Challenge Result" is defined as:
"any result, outcome, creation, submission, or other output, arising from your use of the Spotify Data, whether or not publicly available or shared with Spotify."
The intention is that most non-commercial research and educational use cases fall under this definition, and should be allowed by the terms. Please note that redistributing the dataset is prohibited, access is only through the AICrowd website.
Thanks for your interest, I look forward to seeing the result of your and your students' work!
Adding Spotify Million Playlist Dataset in Kaggle(for computing)
Over 3 years ago
Hi md_sadakat_hussain_f,
Thanks for asking. I understand that it is a little extra work (and cost) to work on this dataset with your own machines. For free computing, I would also recommend Google Colab. For storage, you could keep the ZIP file in Google Drive (you get 15GB free), copy it to the Colab instance, and unzip it on the ephemeral storage (!unzip filename.zip), then process it there. It might be faster than uploading from your local machine each time, and easier to repeat if you script it.
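As a rough sketch of the copy-and-unzip step described above, something like the following could be scripted in a notebook cell. The function name `stage_and_unzip`, the file name `dataset.zip`, and the paths in the trailing comment are placeholders chosen for illustration, not names from the challenge; only the Python standard library is used, so the same function also runs outside Colab.

```python
import shutil
import zipfile
from pathlib import Path


def stage_and_unzip(zip_on_drive, work_dir):
    """Copy a ZIP archive to local scratch storage and extract it there.

    Hypothetical helper: the name and arguments are illustrative only.
    Returns the directory that holds the extracted files.
    """
    zip_on_drive = Path(zip_on_drive)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Work on a local copy so the archive is not read through the mounted Drive folder.
    local_zip = work_dir / zip_on_drive.name
    shutil.copy(zip_on_drive, local_zip)

    # Python equivalent of running `!unzip filename.zip` in a notebook cell.
    extract_dir = work_dir / zip_on_drive.stem
    with zipfile.ZipFile(local_zip) as archive:
        archive.extractall(extract_dir)
    return extract_dir


# In a Colab notebook, Drive would be mounted first (this module only exists in Colab);
# the paths below are assumptions about where the file was saved:
#   from google.colab import drive
#   drive.mount("/content/drive")
#   data_dir = stage_and_unzip("/content/drive/MyDrive/dataset.zip", "/content/data")
```

Because the Colab disk is ephemeral, the copy and extraction have to be repeated in each new session, which is why scripting it helps.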
Alternatively, you could follow these instructions from fast.ai (Steps 1-3) on setting up a GCP account and getting a VM and Jupyter notebook. You should get $300 credit for free as a new GCP user. Some VMs are less than $1/hour, and include a GPU and a 100GB persistent disk.
Of course there are many other options for low or no-cost computing on different platforms. Perhaps others can post what works for them here?
Please do respect the terms of the dataset license - you may not redistribute the dataset on Kaggle or other public platforms.
What kind of interactions are reflected in the sessions training set?
About 1 year ago
Hi,
Looking at the session_training.csv, it has the following columns:
I couldn't find any documentation on what an "interaction" is. Is it a purchase? Is it a click? Are all interactions the same, or are some of them different? This is important to know when building a model.
Thanks!