Now things are clear. Of all the problems given, I think this one is the most challenging.
Thanks! Got it now… though the code structure is quite rigid. Also, I am trying to understand the problem statement:
- We have training data whose input feature is text and whose output feature is labels. In your starter notebook, you use the emotion detection dataset as the training data and its labels as the target. My question is: is the emotion detection dataset the training dataset for our problem?
I am confused because the description section says:
" Working on the same Research Paper Dataset you used in the multi-class problem, you will be building a model using the word2vec approach using Tensorflow."
To be honest, the train, test, and validation splits for this problem are not clear.
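Regardless of which dataset ends up as the training data, the word2vec approach the description mentions boils down to training on (center word, context word) pairs. A minimal pure-Python sketch of generating those pairs (the window size of 2 is an arbitrary assumption, not something the challenge specifies):

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs for skip-gram word2vec training."""
    pairs = []
    for i, center in enumerate(tokens):
        # Context spans `window` tokens on each side, clipped to the sentence
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs
```

These pairs would then feed an embedding layer plus softmax (or negative sampling) in TensorFlow; the pair generation itself is framework-independent.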
We need to produce embeddings such that the F1 score on the test dataset improves. I can see that datasets.csv is the test data, with just 10 observations. Is this the complete data, or is there hidden data against which our solution must generalize?
The description also says "Each vector should only contain 512 elements." Does that mean we can't use any SOTA model embeddings here (like SBERT), which may have more than 512 elements?
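If the 512-element limit is the binding constraint, one workaround (assuming the rules allow pre-trained embeddings at all) is to project a wider embedding, such as SBERT's 768 dimensions, down to exactly 512. A numpy sketch; `fit_to_512` is a hypothetical helper, not part of the starter kit:

```python
import numpy as np

def fit_to_512(embeddings: np.ndarray, target_dim: int = 512) -> np.ndarray:
    """Force an (n, d) embedding matrix to exactly (n, target_dim).

    Wider inputs are reduced with PCA (via SVD); narrower ones are zero-padded.
    """
    n, d = embeddings.shape
    if d == target_dim:
        return embeddings
    if d > target_dim:
        # PCA on mean-centred data: keep the leading principal components
        centred = embeddings - embeddings.mean(axis=0, keepdims=True)
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        k = min(target_dim, vt.shape[0])  # SVD yields at most min(n, d) components
        reduced = centred @ vt[:k].T
        if k < target_dim:
            reduced = np.pad(reduced, ((0, 0), (0, target_dim - k)))
        return reduced
    # Narrower than the target: pad the missing columns with zeros
    return np.pad(embeddings, ((0, 0), (0, target_dim - d)))
```

Truncating the raw SBERT vector to its first 512 entries would also satisfy the shape constraint, but PCA keeps more of the variance.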
Are you going to use any other models, like a Decision Tree Classifier, in the "train_model" function? I mean, how exactly is the F1 score computed on the leaderboard?
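For reference, a macro-averaged F1 (which leaderboards often report) can be computed from the classifier's predictions like this; whether the evaluator actually uses macro averaging, rather than micro or weighted, is an assumption:

```python
def macro_f1(y_true, y_pred):
    """Macro F1: compute per-class F1 scores, then take their unweighted mean."""
    labels = set(y_true) | set(y_pred)
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

Any classifier trained on the embeddings (decision tree or otherwise) would be scored by comparing its predictions to the hidden labels with a function like this.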
I am still getting the DockerBuildError message. I am running your starter notebook as-is, but the error persists.
I suspect there is a bug in the system. I can see a lot of evaluation timed-out errors on submission (one of my submissions showed this too, and worse, I did not save it, because I was submitting from a Colab notebook and did a factory reset after the submission was made).
Can anyone fix the bug so that we can make our submissions? Also, can the timed-out submissions be re-run from your backend, so we know what we might have scored on them?
Thanks in advance!