# shivam

#### Name

Shivam Khandelwal

AIcrowd

Gurgaon, IN


#### Challenges Entered

##### Snake Species Identification Challenge
By Institute of Global Health LifeCLEF

Classify images of snake species from around the world

#### Latest submissions

• graded 60335 Sun, 5 Apr 2020 11:14:14
• graded 60334 Sun, 5 Apr 2020 10:26:23

##### AIcrowd Blitz - May 2020
By AIcrowd

5 Problems 15 Days. Can you solve it all?

#### Latest submissions

• graded 67402 Mon, 18 May 2020 15:43:30
• graded 66492 Thu, 14 May 2020 18:31:46

##### Food Recognition Challenge
By Seerave Foundation

A benchmark for image-based food recognition

#### Latest submissions

• graded 59791 Fri, 21 Feb 2020 14:31:46
• failed 59371 Tue, 11 Feb 2020 15:38:07
• failed 31084 Thu, 19 Dec 2019 00:46:56

##### MNIST
By AIcrowd

Recognise Handwritten Digits

#### Latest submissions

• graded 63126 Fri, 1 May 2020 14:11:03
• graded 62025 Tue, 28 Apr 2020 14:18:56
• graded 61850 Tue, 28 Apr 2020 10:50:00

##### AMLD 2020 - Transfer Learning for International Crisis Response
By DEEP

Help improve humanitarian crisis response through better NLP modeling

#### Latest submissions

 failed 32245 Sat, 28 Dec 2019 09:32:25
##### LifeCLEF 2020 Bird - Monophone
By LifeCLEF

Recognizing bird sounds in monophone soundscapes

#### Latest submissions

No submissions made in this challenge.
By LifeCLEF

#### Latest submissions

No submissions made in this challenge.
By LifeCLEF

#### Latest submissions

No submissions made in this challenge.
• gold-challenge-end Gold Medal winner for a challenge
challenge_round : 298
May 16, 2020
Gold 1
gold-challenge-end
May 16, 2020
Silver 0
Bronze 0

### Final countdown

Hi @fenway,

I looked into it, and your profile isn’t complete yet, which is why the files are not available for you to download.

Each time I filled out the profile info

Can you link me to the page where you filled in the profile information? I can then verify that everything is working as expected. Thanks.

### Have the dataset files been released?

3 months ago

Hi @houkal,

We are working with @kahst to make the dataset available soon.

Regards,
Shivam

### Submission not showing up

Yesterday

Hi @yankeesong,

It is because matplotlib is not installed in your current runtime environment.

You can install it by adding it to your environment.yml file. If you are an advanced Linux user and want to see all the ways you can configure the runtime environment, you can read this FAQ post.

You can view the logs for debug submissions (i.e. debug: true in aicrowd.json) yourself by clicking the link in your GitLab issue. It looks something like this:

Logs for Admin Reference : agent-logs | pod_spec-logs | error_trace-logs |

Let me know in case you have any follow-up questions.

### Submission not showing up

2 days ago

Hi @yankeesong, the submission is valid only when:

1. tag starts with prefix submission-
2. the tag's commit hash is new and differs from every previous submission, i.e. a second submission with the same commit is ignored

I think your tag push wasn’t considered a submission because of the 1st point above.
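
The two rules above can be sketched as a small check. This is only an illustration of the stated rules, not the actual AIcrowd implementation; the function name and the in-memory set of seen commits are hypothetical.

```python
def is_new_submission(tag, commit_hash, seen_commits):
    """Return True when a pushed tag would count as a fresh submission.

    Hypothetical helper: it only mirrors the two rules described in the post.
    """
    # Rule 1: the tag must start with the prefix "submission-".
    if not tag.startswith("submission-"):
        return False
    # Rule 2: the commit must not have been submitted before.
    if commit_hash in seen_commits:
        return False
    seen_commits.add(commit_hash)
    return True


seen = set()
print(is_new_submission("v0.1.0", "abc123", seen))         # wrong prefix
print(is_new_submission("submission-v1", "abc123", seen))  # valid
print(is_new_submission("submission-v2", "abc123", seen))  # same commit again
```

Under these assumptions, pushing a new tag on an unchanged commit never produces a second evaluation; a new commit is needed.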

### System confirmation for submissions

3 days ago

Hi participants,

I noticed a question from a participant regarding the system configuration for Snakes Challenge submissions.

Here is the configuration we use:

CPU: 3.92 cores
RAM: 12.3 GB
GPU (if requested via aicrowd.json): K80

I hope this helps!
Wishing you the best of luck for the challenge. We are excited to see awesome submissions.

### Read state_dict in my submission

3 days ago

Hi @yankeesong,

As @picekl mentioned, git-lfs is the way to go.

In case you have any follow-up questions, do let us know.

### Submission is taking really long

3 days ago

Hi @eric,

I have shared a response on the GitLab issue.

It seems there is a problem when you save your predictions to file: they are filled with NaN instead of floats.
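
A quick local check for this kind of problem could look like the sketch below. It uses only the standard library; the column names in the sample are made up for illustration and are not the challenge's real submission format.

```python
import csv
import io
import math


def count_nan_cells(csv_text):
    """Count cells that parse as a float NaN; non-numeric cells are skipped."""
    count = 0
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            try:
                if math.isnan(float(cell)):
                    count += 1
            except ValueError:
                # Headers, ids and other text cells are not numbers.
                continue
    return count


# Hypothetical predictions file with two broken rows.
sample = "id,score\na1,0.25\na2,nan\na3,NaN\n"
print(count_nan_cells(sample))
```

Running a check like this on the predictions file before pushing would catch NaN values without waiting for the evaluation to fail.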

### Submission is taking really long

11 days ago

Hi @eric,

Yes, I am stopping the running submission.

### Submission is taking really long

11 days ago

Hi @eric,

I am looking into your submission #67390 now and will update you asap.

### Hash id wrong format

15 days ago

Hi @yankeesong,

Please treat the column as “text” and not “number” in whichever software you use to view the train_labels.csv file.

The hash ids are random “text” fields, and the ID above in the dataset is “9990646e65” (text representation) and not “9.990646E+71” (numeric representation).

~❯ cat ~/Downloads/train_labels.csv | grep 99064
natrix-tessellata,Italy,9990646e65,Europe


I hope it helps, let us know in case you have any further query.

Wish you luck with the competition!
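
The difference between the two readings of the hash id can be reproduced with a short sketch. It reuses the row shown above; the column position of the hash id is taken from that row, and everything else is standard library.

```python
import csv
import io

# The row from train_labels.csv quoted in the post above.
row_text = "natrix-tessellata,Italy,9990646e65,Europe\n"

# The csv module keeps every field as text, so the id survives unchanged.
row = next(csv.reader(io.StringIO(row_text)))
hash_id = row[2]
print(hash_id)

# Parsing the same field as a number treats "e65" as an exponent,
# which is what spreadsheet software does when the column is numeric.
as_number = float(hash_id)
print(as_number)
```

The text value is the one to use when matching ids against image files.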

### New Submission does not appear in Leaderboard

18 days ago

Hi @eric,

#65930 and #65938 failed due to a SyntaxError, which @picekl has shared in the GitLab issue comments.

But #65941 and #65950 failed due to an issue on our end. I have requeued them now, and made an announcement about it here.

### Is it still possible to submit for the snake competition?

4 months ago

Yes, you can submit. The submissions won’t count toward the leaderboard ranking.

### Can I have an example of a code which is working to make a submission on gitlab?

4 months ago

Hi @ValAn, participants,

Congratulations to all for your participation.

There is no update right now. The organisers will be reaching out to the participants shortly with details about their travel grants, etc. and the post-challenge follow-up.

### Can I have an example of a code which is working to make a submission on gitlab?

4 months ago

Hi @amapic, I started the forced cudatoolkit=10.0 installation at the same time the above announcement was made, i.e. 14 hours ago.

Edit: I remember the conda environment issue you were facing, and it isn’t related to it.

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

Hi @ignasimg,

Thanks for the suggestions.
I completely agree that we need to improve our communication and the organisation of information to provide a seamless experience for participants.

We would be glad to hear back from you after the competition and look forward to your inputs.

I checked all the submissions and unfortunately multiple participants are facing the same issue, i.e. the GPU is being allocated but not used by submissions, due to a CUDA version mismatch.

To make the GPU work out of the box, we have introduced a forced installation, as below, in our snakes challenge evaluation process:

conda install cudatoolkit=10.0


This should fix the timing issues and we will continue monitoring all the submissions closely.

@ignasimg I have verified the disk performance and it was good. Unfortunately, on debugging, I found your submission faced the same issue, i.e. cudatoolkit=10.1, which may have given the impression that the disk was the bottleneck (but it was the GPU that wasn’t being utilised). The current submission should finish much sooner after the cudatoolkit version pinning.

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

@ValAn No, I can confirm the timeouts haven’t been changed between your previous and current runs. The only issue is that the timeout wasn’t implemented properly in the past, which may be why your previous (1 week old) submission escaped the timeout.

We can absolutely check why it is taking >8 hours instead of ~10 minutes locally. Can you help me with the following:

• Is the local run with a GPU? I can check whether your code is utilising the GPU (when allocated) or running only on the CPU for whatever reason.
• How many images are you using locally? The server/test dataset has 32428 images to be exact, which may be causing the longer run time.

I think the specs of the online environment would help a bit in case there is a significant difference from your local environment: 4 vCPUs, 16 GB memory, K80 GPU (when enabled)

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

Hi @amapic, let me get back to you on this after confirming with the organisers.

Meanwhile, let's create new questions instead of following up on this thread; it will make searching the Q&A simpler in future.

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

Hi @ValAn,

Submissions should ideally take a few hours to run, but we have set a hard timeout of 8 hours. If your solution crosses 8 hours, it is marked as failed.

Roughly how long do you expect your code to run? Is the local run time very different from the run time during the evaluation phase?

Otherwise you can enable the GPU (if you are not doing so already) to speed up computation and finish the evaluation in under 8 hours.

Please let us know in case you require more help with debugging your submission. We can try to see which step/part of the code is taking the most time if required.

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

@amapic This is happening because these packages are only available for Linux distributions, so installing them on Windows (I assume you are using Windows) fails. This is unfortunately a current limitation of conda.

Example:
https://anaconda.org/anaconda/ncurses has only osx & linux builds but no windows build

In such a scenario, I recommend removing the above packages from environment.yaml and continuing your conda env creation. These packages are often included as dependencies of the “main” dependencies, and conda should resolve a similar package for your system automatically.

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

Hi participants, @ValAn,

Yes, GPUs are available for snakes challenge submissions when gpu: true is set in aicrowd.json.

It needs to be 10.0 because the nodes on which your code runs currently have GKE version 1.12.x -> Nvidia driver 410.79 (based on) -> cuda 10.0 (based on).

We are looking forward to running future challenges on a higher CUDA version (GKE version). But to keep results, timings, etc. consistent, we do not want to change versions midway through the contest.
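
As a rough illustration of the flags mentioned in these posts, the sketch below parses a minimal aicrowd.json and reads the gpu setting. Only the gpu and debug keys come from this page; a real aicrowd.json contains other fields that are not shown here, and the helper name is hypothetical.

```python
import json

# Minimal stand-in for aicrowd.json, limited to the two keys named on this page.
config_text = '{"debug": true, "gpu": true}'
config = json.loads(config_text)


def wants_gpu(cfg):
    """Return True when the config requests a GPU (hypothetical helper)."""
    return bool(cfg.get("gpu", False))


print(wants_gpu(config))
```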

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

Hi @gokuleloop,

Thanks for pointing it out. We have updated the last date to Jan 17, 2020 on the website as well.

### Can I have an example of a code which is working to make a submission on gitlab?

5 months ago

Hi, git lfs migrate is for converting older commits to use LFS. This is useful in case you have lots of older commits (intended/unintended) and want those files to be tracked by LFS in future.

5 months ago

3 days ago

Hi @SergeKo,

### Organizing submissions

3 days ago

Hi, do you mean the submission selected on the leaderboard here?

This is picked based on the ranking method set by the challenge organiser, which is descending for both mean_auc and min_auc in this challenge.

Sorry in case I didn’t understand the question properly.
Please let us know in case you meant something else.
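
A descending ranking on both scores can be sketched as follows. The scores are invented, and treating mean_auc as the primary key with min_auc as the tie-breaker is an assumption; the post only says both are sorted in descending order.

```python
# Made-up submissions for illustration only.
submissions = [
    {"id": "A", "mean_auc": 0.91, "min_auc": 0.80},
    {"id": "B", "mean_auc": 0.93, "min_auc": 0.70},
    {"id": "C", "mean_auc": 0.91, "min_auc": 0.85},
]

# Assumption: mean_auc is compared first, min_auc breaks ties; higher is better.
ranked = sorted(
    submissions,
    key=lambda s: (s["mean_auc"], s["min_auc"]),
    reverse=True,
)

for position, sub in enumerate(ranked, start=1):
    print(position, sub["id"], sub["mean_auc"], sub["min_auc"])
```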

### Error Message not present in the evaluation script

4 days ago

Hi,

The full traceback is as follows.

You can try out your csv file with ic2020_drawn_ui_evaluator.py:

  File "ic2020_drawn_ui_evaluator.py", line 526, in <module>
File "ic2020_drawn_ui_evaluator.py", line 37, in _evaluate
File "ic2020_drawn_ui_evaluator.py", line 118, in load_predictions
_csv.Error: field larger than field limit (131072)
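
The error in the last line of the traceback comes from Python's csv module, which rejects any single field longer than its configured limit. The sketch below reproduces it locally with a deliberately oversized field; the two-column layout is made up and is not the challenge's submission format.

```python
import csv
import io

# Current per-field limit of the csv module.
limit = csv.field_size_limit()

# Build a row whose second field is one character over the limit.
oversized = "x" * (limit + 1)
csv_text = "id,value\n1," + oversized + "\n"

try:
    list(csv.reader(io.StringIO(csv_text)))
    too_large = False
    message = ""
except csv.Error as exc:
    too_large = True
    message = str(exc)

print(too_large, message)
```

A field this long in a run file often points to missing line breaks or delimiters, so checking the file's line structure is a reasonable first step.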


### Getting incomplete error message after submission

13 days ago

Hi @OG_SouL,

Thanks for notifying us about this. We didn’t realise that the “View” button next to the error message doesn’t display the full error message. We will start displaying it there properly.

Meanwhile for your case, the full error message is as follows:

Error : Incorrect localisation format (Line nbr 1). The format should be …<widget_ID><localisations_delimited_by_comma>…

### How long for EUA approval?

4 months ago

cc: @Ivan_Eggel for looking into it

12 days ago

# Getting Started Code for CRDSM Educational Challenge

### Author - Pulkit Gera

In [0]:
!pip install numpy
!pip install pandas
!pip install scikit-learn


The first step is to download our train and test data. We will train a classifier on the train data, make predictions on the test data, and then submit our predictions.

In [0]:
!rm -rf data
!mkdir data
!wget https://s3.eu-central-1.wasabisys.com/aicrowd-public-datasets/aicrowd_educational_crdsm/data/public/test.csv
!wget https://s3.eu-central-1.wasabisys.com/aicrowd-public-datasets/aicrowd_educational_crdsm/data/public/train.csv
!mv train.csv data/train.csv
!mv test.csv data/test.csv

--2020-05-16 21:33:33--  https://s3.eu-central-1.wasabisys.com/aicrowd-public-datasets/aicrowd_educational_crdsm/data/public/test.csv
Resolving s3.eu-central-1.wasabisys.com (s3.eu-central-1.wasabisys.com)... 130.117.252.12, 130.117.252.10, 130.117.252.13, ...
Connecting to s3.eu-central-1.wasabisys.com (s3.eu-central-1.wasabisys.com)|130.117.252.12|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 72142 (70K) [text/csv]
Saving to: ‘test.csv’

test.csv            100%[===================>]  70.45K   150KB/s    in 0.5s

2020-05-16 21:33:34 (150 KB/s) - ‘test.csv’ saved [72142/72142]

--2020-05-16 21:33:36--  https://s3.eu-central-1.wasabisys.com/aicrowd-public-datasets/aicrowd_educational_crdsm/data/public/train.csv
Resolving s3.eu-central-1.wasabisys.com (s3.eu-central-1.wasabisys.com)... 130.117.252.12, 130.117.252.10, 130.117.252.13, ...
Connecting to s3.eu-central-1.wasabisys.com (s3.eu-central-1.wasabisys.com)|130.117.252.12|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2543764 (2.4M) [text/csv]
Saving to: ‘train.csv’

train.csv           100%[===================>]   2.43M  1.47MB/s    in 1.6s

2020-05-16 21:33:39 (1.47 MB/s) - ‘train.csv’ saved [2543764/2543764]



## Import packages

In [0]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score,precision_score,recall_score,accuracy_score


• We use the pandas 🐼 library to load our data.
• Pandas loads the data into dataframes and helps us analyse it.
In [0]:
all_data = pd.read_csv('data/train.csv')


## Analyse Data

In [0]:
all_data.head()

Out[0]:
max_ndvi 20150720_N 20150602_N 20150517_N 20150501_N 20150415_N 20150330_N 20150314_N 20150226_N 20150210_N 20150125_N 20150109_N 20141117_N 20141101_N 20141016_N 20140930_N 20140813_N 20140626_N 20140610_N 20140525_N 20140509_N 20140423_N 20140407_N 20140322_N 20140218_N 20140202_N 20140117_N 20140101_N class
0 997.904 637.5950 658.668 -1882.030 -1924.36 997.904 -1739.990 630.087 -1628.240 -1325.64 -944.084 277.107 -206.7990 536.441 749.348 -482.993 492.001 655.770 -921.193 -1043.160 -1942.490 267.138 366.608 452.238 211.328 -2203.02 -1180.190 433.906 4
1 914.198 634.2400 593.705 -1625.790 -1672.32 914.198 -692.386 707.626 -1670.590 -1408.64 -989.285 214.200 -75.5979 893.439 401.281 -389.933 394.053 666.603 -954.719 -933.934 -625.385 120.059 364.858 476.972 220.878 -2250.00 -1360.560 524.075 4
2 3800.810 1671.3400 1206.880 449.735 1071.21 546.371 1077.840 214.564 849.599 1283.63 1304.910 542.100 922.6190 889.774 836.292 1824.160 1670.270 2307.220 1562.210 1566.160 2208.440 1056.600 385.203 300.560 293.730 2762.57 150.931 3800.810 4
3 952.178 58.0174 -1599.160 210.714 -1052.63 578.807 -1564.630 -858.390 729.790 -3162.14 -1521.680 433.396 228.1530 555.359 530.936 952.178 -1074.760 545.761 -1025.880 368.622 -1786.950 -1227.800 304.621 291.336 369.214 -2202.12 600.359 -1343.550 4
4 1232.120 72.5180 -1220.880 380.436 -1256.93 515.805 -1413.180 -802.942 683.254 -2829.40 -1267.540 461.025 317.5210 404.898 563.716 1232.120 -117.779 682.559 -1813.950 155.624 -1189.710 -924.073 432.150 282.833 298.320 -2197.36 626.379 -826.727 4

Here we use the describe function to get an understanding of the data. It shows us the distribution for all the columns. You can use more functions like info() to get useful info.

In [0]:
all_data.describe()
#all_data.info()

Out[0]:
max_ndvi 20150720_N 20150602_N 20150517_N 20150501_N 20150415_N 20150330_N 20150314_N 20150226_N 20150210_N 20150125_N 20150109_N 20141117_N 20141101_N 20141016_N 20140930_N 20140813_N 20140626_N 20140610_N 20140525_N 20140509_N 20140423_N 20140407_N 20140322_N 20140218_N 20140202_N 20140117_N 20140101_N class
count 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000 10545.000000
mean 7282.721268 5713.832981 4777.434284 4352.914883 5077.372030 2871.423540 4898.348680 3338.303406 4902.600296 4249.307925 5094.772928 2141.881486 3255.355465 2628.115168 2780.793602 2397.228981 1548.151856 3015.626776 4787.492858 3640.367446 3027.313647 3022.054677 2041.609136 2691.604363 2058.300423 6109.309315 2563.511596 2558.926018 0.550213
std 1603.782784 2283.945491 2735.244614 2870.619613 2512.162084 2675.074079 2578.318759 2421.309390 2691.397266 2777.809493 2777.504638 2149.931518 2596.151532 2256.234526 2446.439258 2387.652138 1034.798320 1670.965823 2745.333581 2298.281052 2054.223951 2176.307289 2020.499263 2408.279935 2212.018257 1944.613487 2336.052498 2413.851082 1.009424
min 563.444000 -433.735000 -1781.790000 -2939.740000 -3536.540000 -1815.630000 -5992.080000 -1677.600000 -2624.640000 -3403.050000 -3024.250000 -4505.720000 -1570.780000 -3305.070000 -1633.980000 -482.993000 -1137.170000 372.067000 -3765.860000 -1043.160000 -4869.010000 -1505.780000 -1445.370000 -4354.630000 -232.292000 -6807.550000 -2139.860000 -4145.250000 0.000000
25% 7285.310000 4027.570000 2060.600000 1446.940000 2984.370000 526.911000 2456.310000 1017.710000 2321.550000 1379.210000 2392.480000 559.867000 1068.940000 616.822000 947.793000 513.204000 718.068000 1582.530000 2003.930000 1392.390000 1405.020000 1010.180000 429.881000 766.451000 494.858000 5646.670000 689.922000 685.680000 0.000000
50% 7886.260000 6737.730000 5270.020000 4394.340000 5584.070000 1584.970000 5638.400000 2872.980000 5672.730000 4278.880000 6261.950000 1157.170000 2277.560000 1770.350000 1600.950000 1210.230000 1260.280000 2779.570000 5266.930000 3596.680000 2671.400000 2619.180000 1245.900000 1511.180000 931.713000 6862.060000 1506.570000 1458.870000 0.000000
75% 8121.780000 7589.020000 7484.110000 7317.950000 7440.210000 5460.080000 7245.040000 5516.610000 7395.610000 7144.480000 7545.880000 3006.960000 5290.800000 4513.960000 4066.930000 3963.590000 1994.910000 4255.580000 7549.430000 5817.750000 4174.010000 4837.610000 3016.520000 4508.510000 2950.880000 7378.020000 4208.730000 4112.550000 1.000000
max 8650.500000 8377.720000 8566.420000 8650.500000 8516.100000 8267.120000 8499.330000 8001.700000 8452.380000 8422.060000 8401.100000 8477.560000 8624.780000 7932.690000 8630.420000 8210.230000 5915.740000 7492.230000 8489.970000 7981.820000 8445.410000 7919.070000 8206.780000 8235.400000 8247.630000 8410.330000 8418.230000 8502.020000 5.000000

## Split Data into Train and Validation 🔪

• The next step is to think of a way to test how well our model is performing. We cannot use the given test data, as it does not contain the labels needed to verify our predictions.
• The workaround is to split the given training data into training and validation sets. Typically, validation sets give us an idea of how our model will perform on unseen data. It is like holding back a chunk of data while training our model and then using it for testing. It is a standard way to fine-tune the hyperparameters of a model.
• There are multiple ways to split a dataset into validation and training sets. Two popular alternatives are k-fold and leave-one-out cross-validation. 🧐
• Validation sets are also used to detect when your model is overfitting the train dataset.
In [0]:
X = all_data.drop(columns='class')
y = all_data['class']
# Validation testing
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

• We have decided to split the data with 20 % as validation and 80 % as training.
• This is of course the simplest way to validate your model: take a random chunk of the train set and set it aside solely for testing the trained model on unseen data. As mentioned in the previous block, you can experiment 🔬 with more sophisticated techniques to make your model better.
• Since our data is now split into train and validation sets, train_test_split also returns the corresponding labels (y_train, y_val) separated from the data.
• With this step we are all set to move to the next step with a prepared dataset.
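
The k-fold alternative mentioned above can be sketched in plain Python. This is a minimal illustration of the idea, not what the notebook uses: it does not shuffle, and the function name is hypothetical. scikit-learn provides the same idea, with shuffling, as sklearn.model_selection.KFold.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs so each sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print(len(train_idx), val_idx)
```

Setting k equal to the number of samples gives leave-one-out validation, where every fold holds back a single row.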

# TRAINING PHASE 🏋️

## Define the Model

• We have fixed our data and now we are ready to train our model.

• There are a ton of classifiers to choose from, such as Logistic Regression, SVM, Random Forests, Decision Trees, etc.🧐

• Remember that there are no hard and fast rules here. You can mix and match classifiers. It is advisable to read up on the numerous techniques and choose the best fit for your solution; experimentation is the key.

• A good model does not depend solely on the classifier but also on the features you choose. So make sure to analyse and understand your data well and move forward with a clear view of the problem at hand. You can gain important insight from here.🧐

In [0]:
# classifier = LogisticRegression()

classifier = SVC(gamma='auto')

# from sklearn import tree
# classifier = tree.DecisionTreeClassifier()

• To start you off, we have used a basic Support Vector Machine classifier here.
• But you can tune parameters and increase the performance. To see the list of parameters visit here.
• Do keep in mind that there exist sophisticated techniques for everything; the key, as mentioned earlier, is to search for them and experiment to fit your implementation.

To read more about other sklearn classifiers visit here 🧐. Try using other classifiers, such as Logistic Regression or MLP, and compare how the performance of your model changes.

## Train the Model

In [0]:
classifier.fit(X_train, y_train)

Out[0]:
SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)

Got a warning? Don't worry: a convergence warning just means the number of iterations is too low. This applies to iterative classifiers such as LogisticRegression or MLPClassifier; the SVC above runs with max_iter=-1, i.e. no iteration limit. Increase the number of iterations and see if the warning vanishes. Do remember that increasing iterations also increases the running time. (Hint: max_iter=500)

# Validation Phase 🤔

Wonder how well your model learned? Let's check it.

## Predict on Validation

Now we predict using our trained model on the validation set we created and evaluate our model on unseen data.

In [0]:
y_pred = classifier.predict(X_val)


## Evaluate the Performance

• We have used basic metrics to quantify the performance of our model.
• This is a crucial step: you should reason about the metrics and take hints from them to improve aspects of your model.
• Do read up on the meaning and use of different metrics. There are more metrics and measures, and you should learn to use them correctly with respect to the solution, dataset and other factors.
• F1 score is the metric for this challenge.
In [0]:
precision = precision_score(y_val,y_pred,average='micro')
recall = recall_score(y_val,y_pred,average='micro')
accuracy = accuracy_score(y_val,y_pred)
f1 = f1_score(y_val,y_pred,average='macro')

In [0]:
print("Accuracy of the model is :" ,accuracy)
print("Recall of the model is :" ,recall)
print("Precision of the model is :" ,precision)
print("F1 score of the model is :" ,f1)

Accuracy of the model is : 0.7140825035561877
Recall of the model is : 0.7140825035561877
Precision of the model is : 0.7140825035561877
F1 score of the model is : 0.138865836791148
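
The gap between accuracy and macro F1 above is worth a closer look. One plausible reading, not verified against the notebook's data, is that the classifier predicts the majority class for every row: assuming all six classes (0 to 5) appear in the validation labels, 2 × 0.714 / 1.714 ≈ 0.833 for class 0 and zero for the rest gives 0.833 / 6 ≈ 0.139. The sketch below shows the same effect on a made-up three-class example, using plain Python rather than sklearn.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score of a single class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Toy data: 70 % of rows are class 0 and the model predicts class 0 everywhere.
y_true = [0] * 7 + [1] * 2 + [2] * 1
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

Because macro F1 weights every class equally, a model that ignores the rare classes scores poorly even when its accuracy looks reasonable.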


# Testing Phase 😅

We are almost done. We trained and validated on the training data. Now it's time to predict on the test set and make a submission.

Load the test data on which the final submission is to be made.

In [0]:
test_data = pd.read_csv('data/test.csv')


## Predict Test Set

Time for the moment of truth! Predict on the test set and get ready to make the submission.

In [0]:
y_test = classifier.predict(test_data)


## Save the prediction to csv

In [0]:
df = pd.DataFrame(y_test,columns=['class'])
df.to_csv('submission.csv',index=False)


🚧 Note :

• Do take a look at the submission format.
• The submission file should contain a header.
• Follow all submission guidelines strictly to avoid inconvenience.

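## To download the generated csv in Colab run the below command

In [0]:
try:
    # google.colab can only be imported inside a Colab runtime
    from google.colab import files
    files.download('submission.csv')
except ImportError as e:
    print("Only for Colab")


### Well Done! 👍 We are all set to make a submission and see your name on the leaderboard. Let's navigate to the challenge page and make one.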

#### ImageCLEF 2020 VQA-Med - VQA

14 days ago

Hi,

You have marked the challenge as ImageCLEF 2020 VQA-Med in this question, but it doesn’t seem to have any retrieval type and run type fields?

Meanwhile, for this challenge, please check the submission instructions given on the challenge page. https://www.aicrowd.com/challenges/imageclef-2020-vqa-med-vqa#submission-instructions

Let us know in case you have any follow up questions.

### Submissions taking too long

14 days ago

I see that all the submissions made by you have failed, either due to image build failures caused by an improper Dockerfile or due to exceptions in your code.

I can exclude the failed submissions from your daily count so you can make a submission right now (given there are only a few hours left), but whether to consider those for the final leaderboard will be decided by the challenge organisers later.

### Submissions taking too long

14 days ago

Your submission 67274 went through without any problem as far as I can see, while 67214 took longer because the existing VMs were already busy evaluating other submissions. We didn’t anticipate the surge in submissions just before the round end, and I have increased the number of submissions evaluated in parallel (from 4 to 8), which should keep the queue clear.

I hope it helps.

### Round 2 End Time

16 days ago

Hi, the configured end time as of now is 17/05/2020, 00:00 UTC.

### Submissions taking too long

17 days ago

The issue is fixed now and you should be able to make a submission. Please remember to pull the latest commit from the mmdetection starter kit.

Explanation:

This basically happened because mmcv had a new release, 0.5.2, ~7 hours ago.

And mmdetection's requirement resolves to the latest release of mmcv.

Due to this, the mmdetection installation started failing. I have now pinned the mmcv version to 0.5.1 in the starter kit. https://gitlab.aicrowd.com/nikhil_rayaprolu/food-pytorch-baseline/commit/84eadc1ca353b5741423e0e1ea9f8db5d4bfd49f

Following this, submissions using this starter kit will go through as usual.
Thanks for notifying the issue to us!

### Submissions taking too long

17 days ago

Hi,

No worries. You can ping me at either place.

It isn’t happening due to the server side this time.

The issue happens when the Dockerfile tries to install the mmdetection package. I think it is due to a new release of a package it depends on (or similar). I am trying to debug it on my side and will let you know as soon as I find a fix for your Dockerfile.

https://gitlab.aicrowd.com/simon_mezgec/food-recognition-challenge-starter-kit/snippets/20588#L1854

### Submissions taking too long

18 days ago

Hi @simon_mezgec, your submission has been processed properly now, and I have made a post about the error here.

### Submissions taking too long

19 days ago

Sorry for the trouble. Submission 65790 is on its way to evaluation now too.

I will keep a close eye on new submissions to make sure this doesn't repeat.

### Submissions taking too long

19 days ago

We had an issue in the submissions queue due to which submissions got stuck.

We have manually cleaned up the ongoing submissions that got stuck and re-queued them now (to be exact: 65632, 65262, 65404, 65411).

Please let us know in case any other submission ID is stuck for you.

### New tag does not create an issue or evaluation

Hi @frgfm,

I have the same hypothesis as Mohanty shared above.

Can you share the exact output/error when you do git push? I can help based on the error.

Hypotheses in advance, depending on the exact error:

1. In case it is throwing Fatal: Maximum size, etc., then the reason would be that the file is already added and you need to migrate it from non-LFS to LFS (happens most of the time). Reference: How to upload large files (size) to your submission
2. If the error is Failed to push LFS/stuck in uploading etc., it can be due to an unstable/very slow internet connection on your side causing the upload to stop/time out midway (rare, but happens). Reference: Cannot upload my model's weights to GitLab - filesize too large

### New tag does not create an issue or evaluation

Hi @frgfm,

Welcome to the Food Recognition Challenge.

Solution
To immediately start and make a submission, please create a new commit (editing any file) and submit again using a submission- git tag.

Description
I went through your git history and it happened because you pushed v0.1.0 followed by submission-v1. What happened here is that we only accept submissions from git tags with the prefix submission-, due to which v0.1.0 failed to create a submission.

When you retried using submission-v1, the system looked into the history, found that the same commit hash had been sent previously (with v0.1.0), and didn’t trigger a submission. Ideally, I believe it should cache/check history only for submission- prefix tags, which didn’t happen here, and we will improve it on our side.

Sorry for the inconvenience caused.
Hoping to get exciting submissions from you in the challenge!

### Editing Docker file

Glad to know that we could help.

Congratulations on getting a good score on your first run, and all the best in improving your scores over time too. Wishing you luck!

### Editing Docker file

2 months ago

Regarding this, I guess the starter kit/baseline you followed didn’t respect requirements.txt (because a custom Dockerfile was being used, which has the highest precedence).

We will get it fixed in whichever starter kit you used for your submission (let us know the link). Sorry for the confusion this caused.

### Editing Docker file

2 months ago

Understood. So those lines are actually fine in the Dockerfile.

I debugged further to look into the issue your code (#59996) is facing and this is what I found:

Traceback (most recent call last):
File "run.py", line 332, in <module>
run()
[.... removed ....]
File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 1364, in __init__
name=self.name)
File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 504, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
AttributeError: module 'tensorflow' has no attribute 'placeholder'


This is happening because the version of tensorflow used in your submission differs from the one you may be using on your system: tf.placeholder is not available at the top level in tensorflow 2.x. This can be mitigated by using tensorflow v1.4 or by disabling v2 behaviour, etc. More: https://github.com/theislab/scgen/issues/14, https://stackoverflow.com/questions/37383812/tensorflow-module-object-has-no-attribute-placeholder

When running manually, the next issue I came across is that you need to add import skimage:

Traceback (most recent call last):
File "run.py", line 332, in <module>
run()
File "run.py", line 304, in run
predictions=evaluate_coco(model, image_ids=image_ids)
File "run.py", line 220, in evaluate_coco
File "run.py", line 162, in load_image
NameError: name 'skimage' is not defined


After that your submission can start running immediately.

Running COCO evaluation on 1959 images.
0
1
2
3
4
5
6
7
8
[....] (I didn't run further)


You can fix your code based on the above remarks and start submitting your solution.

In case you want to debug properly on your desktop directly and are comfortable with Docker, you can use aicrowd-repo2docker to generate the image and execute ./run.sh (More: Which docker image is used for my submissions?).

Let us know in case we can provide any other feedback. Also, it would be good to know which starter kit/initial repository you referred to for making the submission, so we can add some more debug/testing scripts to it.

All the best with the competition!

### Editing Docker file

2 months ago

Hi @hannan4252,

Can you tell us what you are trying to achieve with the above edit?
By default, you shouldn’t need to edit the above lines, and things should work out of the box.

### Not able to ssh to gitlab

2 months ago

Great, I believe you followed the steps in the starter kit or baseline which make you push the codebase to gitlab.aicrowd.com (by changing the remote), and you correctly ended up on gitlab.aicrowd.com.

Happy that things are working on your end now; excited to see your submissions in the challenge and on the leaderboard!

### Can I submit code in PyTorch?

2 months ago

@nofreewill42, in case you are not comfortable with a Dockerfile, you can still submit and specify your runtime using requirements.txt, environment.yml, etc., based on your preference for pip/conda/others. (Delete the Dockerfile in case you want to use any of these methods.)

### Not able to ssh to gitlab

2 months ago

To make submissions for the challenge you need to use gitlab.aicrowd.com and not gitlab.com. I guess there has been some confusion about this above.

The steps will be as follows:

2. Log in and start using the git repository via the gitlab.aicrowd.com domain.

2 months ago

Hi, sorry for the wrong error message in this case. Your submission timed out, i.e. it ran for more than 8 hours, due to which it was terminated.

It can happen for multiple reasons:

1. Code is too slow
2. Code needs a GPU while a GPU wasn’t requested in aicrowd.json
3. A GPU was requested and provided, but your code isn’t able to utilise it, either due to a code issue or a package issue.

In case you can identify one of these reasons in your case, you can submit your code again with a fix. Otherwise, you can share the submission ID which you would like us to look into. We can help you debug and share what went wrong.
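
For the second reason in the list above, a quick local check before tagging a submission can help. The sketch below only assumes that aicrowd.json sits at the repository root and that the GPU request is a top-level boolean key named gpu; the function name is hypothetical:

```python
import json
from pathlib import Path


def gpu_requested(config_path="aicrowd.json"):
    """Return True only if the config file exists and sets "gpu" to a truthy value."""
    path = Path(config_path)
    if not path.is_file():
        return False
    with path.open() as fh:
        config = json.load(fh)
    # Assumption: the GPU flag is a top-level key called "gpu".
    return bool(config.get("gpu", False))
```

Calling gpu_requested() from the repository root before pushing a tag tells you whether the evaluation would be scheduled with a GPU under that assumption.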

2 months ago

Hi @himanshu ,

Sorry to keep you waiting; the issue is now resolved and the datasets are available again on the website.
Thanks again for letting us know about the issue proactively.

Regards,
Shivam

2 months ago

Thanks for informing us; we are looking into it and will fix it ASAP.

2 months ago

Hi, can you share the error you are getting, and for which file?

Ideally the link shared here should work directly: https://www.aicrowd.com/challenges/food-recognition-challenge/dataset_files

### Submission struck

2 months ago

Hi @hannan4252, I see your submission #59955 is still running and is not stuck.

Side note: you have run your submission without a GPU. In case you want your submission to run with a GPU and the slow run is due to this, please enable the GPU using this guide.

### Local Testing for submission error

2 months ago

Sure, is it linux or ubuntu?

Ideally you should be able to test it using the docker ps command.
In case you want to install Docker locally, you can do so using this help article: https://docs.docker.com/install/

### Local Testing for submission error

2 months ago

Thanks, the command looks good; can you share the full traceback in that case?
And is “docker” running on your machine? I suspect that to be the reason so far.

### Local Testing for submission error

2 months ago

Hi, what command did you use to run locally?

### Evaluation Criteria

3 months ago

Yes @gloria_macia_munoz, you are correct about the image_id and score fields. We will also work toward adding this information to the starter kit so it is easier for newer participants.

### Evaluation Criteria

3 months ago

Yes, the structure you shared is correct. You can ignore the iscrowd field.

An example of the final structure required is as follows:

[
{
"image_id": 28902,
"category_id": 2738,
"score": 0.18888643674121008,
"segmentation": [
[
270,
195,
381,
823,
56,
819,
527,
[....]
]
],
"bbox": [
56,
165,
678,
658
]
}
[....]
]
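
Before submitting, each entry can be checked against the structure shown above. This is a minimal sketch rather than the evaluator's own validation: the function name is hypothetical, and the only rule beyond the example itself is the standard COCO convention that a polygon is a flat list of x, y pairs:

```python
from numbers import Real

REQUIRED_KEYS = ("image_id", "category_id", "score", "segmentation", "bbox")


def check_prediction(entry):
    """Return a list of human-readable problems found in one prediction dict."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in entry]
    if problems:
        return problems
    if not isinstance(entry["score"], Real):
        problems.append("score must be a number")
    if not all(isinstance(value, Real) for value in entry["bbox"]):
        problems.append("bbox must contain only numbers")
    for index, polygon in enumerate(entry["segmentation"]):
        # COCO polygons are flat [x1, y1, x2, y2, ...] lists.
        if len(polygon) % 2 != 0:
            problems.append(f"segmentation[{index}] has an odd number of coordinates")
    return problems
```

An empty list means the entry has the same fields as the example; it does not guarantee that the evaluator will accept it.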


Please let us know in case there is any follow-up question. All the best with the challenge!

### Instructions, EDA and baseline for Food Recognition Challenge

5 months ago

You will get an environment variable AICROWD_PREDICTIONS_OUTPUT_PATH containing the absolute path to the location at which the JSON file needs to be written.

Example from starter kit here.
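
In case the starter kit is not at hand, writing the file can be sketched as follows. The environment variable name is the one mentioned above; the function name and the local fallback path predictions.json are assumptions added only so the snippet also runs outside the evaluator:

```python
import json
import os


def write_predictions(predictions, default_path="predictions.json"):
    """Dump the predictions list as JSON to the path given by the evaluator."""
    # Falls back to a local file (assumption) when the variable is not set.
    output_path = os.getenv("AICROWD_PREDICTIONS_OUTPUT_PATH", default_path)
    with open(output_path, "w") as fh:
        json.dump(predictions, fh)
    return output_path
```

The function returns the path it wrote to, which is handy for logging where the output ended up.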

### Cannot upload my model's weights to GitLab - filesize too large

5 months ago

Thanks for the inputs, I have added git for windows in the FAQ above.

We had cases where people wanted to upload files in GBs, due to which timeout was increased/removed. I will go through the current value and set it to a better value.

### Instructions, EDA and baseline for Food Recognition Challenge

5 months ago

Hi @joao_schapke, please use the git lfs clone <repo> / git lfs pull command in your above repository, as Nikhil also mentioned. Do let us know how it goes and if the problem continues.

### Cannot upload my model's weights to GitLab - filesize too large

5 months ago

Are you facing the file size too large error, or is git-lfs getting stuck during upload?

In case of the file size too large error, please go through How to upload large files (size) to your submission.

### Instructions, EDA and baseline for Food Recognition Challenge

5 months ago

Thanks for notifying us about it. The Dockerfile for the baseline depended on the master branch of the https://github.com/open-mmlab/mmdetection repository, which is broken right now. We have updated the baseline repository to point to a stable release version now.

### Submission confusion. Am I dumb?

5 months ago

@shraddhaamohan Sorry for the confusion above. It looks like you were submitting the baseline solution as it is, and this is a bug in the baseline rather than in something you committed. We are updating the baseline with the above fix.

### Submission confusion. Am I dumb?

5 months ago

I can confirm that GPUs are available for evaluations if you have used gpu: true in your aicrowd.json, and they were not removed at any point. In case someone is facing issues launching a GPU in their submission, please share your submission ID with us so it can be investigated.

@shraddhaamohan, in your submission above, i.e. #27829, your assertion was assert torch.cuda.is_available()==True,"NO GPU AVAILABLE", which wasn’t showing the full issue.

I tried to debug it on your submitted code, and this was happening:

>>> import torch
>>> torch.backends.cudnn.enabled
True
>>> torch.cuda.is_available()
False

aicrowd@aicrowd-food-recognition-challenge-27829-38f8:~$ nvidia-smi -L
GPU 0: Tesla K80 (UUID: GPU-cd5d75c4-a9c5-13c5-bd7a-267d82ae4002)
aicrowd@aicrowd-food-recognition-challenge-27829-38f8:~$ nvidia-smi
Tue Dec 17 14:19:22 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   47C    P8    30W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


We further found this is happening because the underlying CUDA version we provide to submissions is 10.0, and submissions are evaluated with the docker image “nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04”, while your submission has a custom Dockerfile which was trying to run with pytorch/pytorch:1.3-cuda10.1-cudnn7-devel, leading to the above no-GPU-found assert.
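
This kind of mismatch can be spotted from the image tags alone. The sketch below assumes the CUDA version is embedded in the tag as in the two image names quoted above; the function names are hypothetical, and tags following another naming scheme would simply yield None:

```python
import re


def cuda_version(image):
    """Extract the CUDA version from tags like '...-cuda10.1-...' or 'nvidia/cuda:10.0-...'."""
    match = re.search(r"cuda[:-]?(\d+(?:\.\d+)?)", image)
    return match.group(1) if match else None


def same_cuda(submission_image, evaluator_image):
    """True only when both tags expose a CUDA version and the versions are equal."""
    ours = cuda_version(submission_image)
    theirs = cuda_version(evaluator_image)
    return ours is not None and ours == theirs
```

For the images in this thread, the submission's tag yields 10.1 and the evaluator's yields 10.0, so same_cuda returns False.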

Finally, the diff for your existing v/s working Dockerfile is as follows:

--- a/Dockerfile
+++ b/Dockerfile
@@ -1,5 +1,5 @@
-ARG PYTORCH="1.3"
-ARG CUDA="10.1"
+ARG PYTORCH="1.2"
+ARG CUDA="10.0"
ARG CUDNN="7"

FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
@@ -17,6 +17,7 @@ RUN conda install cython -y && conda clean --all

RUN git clone [removed-name] /[removed-name]
WORKDIR /[removed-name]
+RUN git reset --hard c68890db5910eed4fc8ec2acf4cdf1426cb038e9
RUN pip install --no-cache-dir -e .
RUN cd /


The repository you were cloning above was working the last time your docker image was built, i.e. Dec 10, and one of the commits currently in the master branch has broken pip install. We suggest pinning versions in your future submissions so an inconsistent state doesn’t occur on re-build/re-run.

I have shared the new error traceback in your submission’s gitlab issue above (after GPU assert went fine).

tl;dr I tried running your exact codebase with pytorch/pytorch:1.2-cuda10.0-cudnn7-devel base image & above Dockerfile diff. It seems to be working fine after it. Let us know in case there is any follow up doubt.

### Issues with submitting

6 months ago

Hi @shraddhaamohan, you are correct. A couple of user-submitted codes ran into errors but kept running forever (didn’t exit), due to which the pipeline was blocked. We will be adding a sensible overall timeout for the challenge so this blockage is taken care of automatically going forward.

### Issues with submitting

6 months ago

Yes, the test set provided to you in the resources section has the following description:

Set of test images for local debuggin (Note : These are the same ones that are provided in the validation set)

It is basically the validation set. The server runs your code with the [hidden] test set in a protected environment.

### Issues with submitting

6 months ago

We debugged on your submission. Your output.json contains prediction as follows (one of them for example):

  {
"image_id": 10752,
"category_id": 1040,
"bbox": [
5.0,
29.0,
466.0,
427.0
],
"score": 0.8330578207969666,
"area": 176875,
"segmentation": [
[
195.0,
455.5,
194.0,
455.5,
193.0,
455.5,
192.0,
455.5,
[.....]


The COCO tooling is not loading this generated output properly; the issue is due to bbox of size 4. Please try generating bbox of a different dimension. Related issue on Gitlab.

### Is GPU available?

6 months ago

Hi @kay,

This challenge now has GPUs enabled. I have requeued your submission above, which ran with a GPU. You can switch GPU usage on or off via aicrowd.json. All the best with the competition!

### Issues with submitting

6 months ago

Yes, gpu=False was working as expected in the meantime.

The GPU issue is resolved now, and the submissions with GPU are no longer in the pending state.

### Issues with submitting

6 months ago

It is stuck right now due to GPU node provisioning.
We hit our limits in the Food challenge, and higher limits have been requested from GCP. It will start evaluating shortly after this is resolved.

### Is GPU available?

6 months ago

I have raised the request to our team and we will update on the decision.

### Is GPU available?

6 months ago

No, this contest doesn’t have GPUs enabled as of now. We are willing to add GPUs in case they are required here. Please let us know in case your model requires one.

### Submissions taking too long

6 months ago

You are correct, the submissions were stuck in the Food Challenge. The submissions are going through now and you should get feedback on your submissions soon.

### I don't know how to submit

6 months ago

Hi,

You need to create a repository on Gitlab at https://gitlab.aicrowd.com/ by forking the food challenge starter kit. Any new tag in your repository starting with the “submission-” prefix counts as a submission.

Please go through the complete README present in the starter kit repository, especially the “submitting” section, to learn more. Let us know if you still face issues with the submission flow.

#### ImageCLEF 2020 Lifelog - LMRT

16 days ago

Hi, it works now!

17 days ago

cc: @Ivan_Eggel for looking into it.

The page referred above is: https://www.imageclef.org/system/files/ImageCLEF2020-test-topics.pdf

And the link is present in overview section here: https://www.aicrowd.com/challenges/imageclef-2020-lifelog-lmrt#topics-and%20ground%20truth%20release

### Problem: Registering for LMRT

3 months ago

@BIDAL-HCMUS, we add a challenge to your AIcrowd profile page after the 1st submission is made for that challenge. I hope this clarifies your doubt.

### Internal Server Error

16 days ago

Hi everyone,

The issue is now fixed and all the pending/failed submissions have been evaluated.

### Need to download datasets is necessary? or else any other way

25 days ago

Yes, you will need to download the dataset to train your model, which can finally predict values for the test dataset.

BUT you can use our starter kit present here: Baseline - FOODC and click “Open In Colab” to run it completely online. By using Colab you wouldn’t need to download/install/run anything on your system; you can do it all on an online server directly (available as a Python notebook).

Let me know if I have misunderstood the question or you need any further clarification.

### Image build failed errors for code based submissions

18 days ago

Hi participants,

We came to know about a higher number of image build failures in the last couple of days, which affected a few Food Recognition Challenge and Snake Species Identification Challenge submissions.

The error looks like:

Thin Pool has [...] free data blocks which is less than minimum required [...] free data blocks.
Create more free space in thin pool or use dm.min_free_space option to change behavior


The issue crept in because our docker space cleanup wasn’t working as expected, causing reduced disk space. This has been fixed now, but in case you continue to face this issue please let us know.

### AIcrowd Blitz ⚡- May 2020

29 days ago

AIcrowd is excited to announce the launch of AIcrowd Blitz, our fortnight-long marathon of interesting AI puzzles.

Whether you are an AI veteran or someone who is just finding their feet in the world of ML and AI, there is something for each one of you. And did we mention there are some cash prizes up for grabs too!!

Our problems have always been intriguing and this time would be no exception. So put on that puzzle hat and join us in this marathon.

What : AIcrowd Blitz

When : 2nd May’20 17:00 CEST - 16th May’20 17:00 CEST

Challenge Page: https://aicrowd.com/challenges/aicrowd-blitz-may-2020

Sneak Peek : We have taken some of the classic ML problems and given them a flavor of our own.

### Submission limit

20 days ago

Hi @aimk,

The submission limit varies from competition to competition.
You can check the submission limit for any challenge on the “new submissions” page.

It looks something like:

You have 100 submissions remaining.

Similarly, for challenges which have a daily limit, the message will be visible along with the time (i.e. when the limit will reset).

Let me know if you have any further queries.

### Can I access train file and test file in the same predict.py? As I see test file path is referring to production environment but train path is referring to local path?

6 months ago

Yes, you can access all the files at the same time during evaluation.

The starter kit has all the information about the environment variables, but let me clarify the environment variables available during evaluation here as well.

• AICROWD_TEST_DATA_PATH: Refers to testing_phase2_release.csv file which is used by evaluator to judge your models in testing phase (soon to be made public)
• AICROWD_TRAIN_DATA_PATH: Refers to /shared_data/data/training_data/ in which all of training related files are present.
• AICROWD_PREDICTIONS_OUTPUT_PATH: Refers to the path at which your code is expected to output final predictions

Now in your codebase, you can simply do something as follows to load both the files:

import os

AICROWD_TRAIN_DATA_PATH = os.getenv("AICROWD_TRAIN_DATA_PATH", "/shared_data/data/training_data/")
AICROWD_PREDICTIONS_OUTPUT_PATH = os.getenv("AICROWD_PREDICTIONS_OUTPUT_PATH", "random_prediction.csv")

# Do pre-processing, etc
[...]
# Make predictions
[...]
prediction_df.to_csv(AICROWD_PREDICTIONS_OUTPUT_PATH, index=False)


I hope the example clarifies your doubt.

### Where is the leaderboard? Submission confusion

23 days ago

Thanks for notifying.
Let me investigate where we went wrong and displayed the wrong link as you mentioned. Meanwhile, I will add a quick redirection so it doesn’t cause confusion to any other participant.

Regards,
Shivam

Update: Redirection is now active for all the problems.

### Where is the leaderboard? Submission confusion

23 days ago

We have the concept of problems and challenges, and a problem can be used as part of multiple challenges.

The link you are referring to above has its own leaderboard and submission queue, independent of the AIcrowd Blitz submission queue and leaderboard. This was the reason why your scores weren’t reflected.

I believe you ended up on the above link due to our recent email notification?

### Where is the leaderboard? Submission confusion

23 days ago

You probably made submissions directly to the problem instead of the ongoing AIcrowd Blitz competition, due to which you faced the missing name in the leaderboard, etc.

I have assigned your submissions manually to the AIcrowd Blitz challenge.

Sorry for the confusion caused.

Regards,
Shivam

### [Important] Dataset + Problem Update

23 days ago

#### PKHND

24 days ago

Hi @dills,

The leaderboards are now updated to reflect rankings properly.

Sorry for the inconvenience and wishing you best of luck with the challenge!

26 days ago

Hi @dills,

Thanks for pointing it out. We are working on a fix in our leaderboard computation for the tied-scores scenario.

It will get changed to “1” for everyone having 1.0 (N users), “N+1” for the next score, and so on.
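
The scheme described above is what is usually called standard competition ranking. A minimal sketch, assuming higher scores are better (the function name is hypothetical and this is not the leaderboard's actual code):

```python
def competition_ranks(scores):
    """Tied scores share the best rank; the next distinct score skips ahead."""
    ordered = sorted(scores, reverse=True)  # assumption: higher score is better
    first_position = {}
    for position, score in enumerate(ordered, start=1):
        # Remember only the first (best) position at which each score appears.
        first_position.setdefault(score, position)
    return [first_position[score] for score in scores]
```

With three participants at 1.0 followed by 0.9 and 0.8, the ranks come out as 1, 1, 1, 4, 5.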

### Approval of EUA

3 months ago

It should not take more than 1 or 2 days. (sharing based on similar question we had on forum in past)

### Possibility of mixed teams

3 months ago

cc: @Ivan_Eggel for clarification.

#### LifeCLEF 2020 Plant

3 months ago

I see some queries around dataset for this CLEF challenge.
Please let us know in case AIcrowd should host the dataset on our side, we can coordinate it over email quickly.

### Evaluation Error

5 months ago

Hi @maruthi0506, shared the error logs on your submission.

### Randomly failing image builds - what is going on?

5 months ago

Hi @bjoern.holzhauer, looking into it. The error seems to occur while apt packages are being installed; I will keep you updated here.

### Unable to push file that "exceeds maximum limit"

5 months ago

I assume the problem being mentioned here is that large files have already been committed, and now no matter what you do, git push rejects with the above error?

If this is the case, please use the git lfs migrate command, which amends your git tree and fixes this problem. You may need to force push once this is done. https://github.com/git-lfs/git-lfs/wiki/Tutorial#migrating-existing-repository-data-to-lfs

### Unable to push file that "exceeds maximum limit"

5 months ago

Hi @ngewkokyew,

I remember the workspace has an older git version (I assume) which doesn’t come with lfs; please install it using:

sudo apt-get update
sudo apt-get install git-lfs


### Submissions get killed without any error message

5 months ago

We are using the Kubernetes cluster from the organisers, which has 8G base machines, and AKS has quite a hard eviction policy, due to which it kills code as soon as it reaches 5.5G.

The best option might be to see if your RAM usage can be reduced by downcasting variables.

Meanwhile, @kelleni2, @laurashishodia is it possible to change the underlying nodes in the AKS cluster from Standard_F4s_v2 to some higher RAM nodes? I am seeing the OOM issue for multiple teams (3-4+ at least).

### Submissions get killed without any error message

5 months ago

Sorry for missing your query earlier.

Yes, the Killed refers to an OOMKill of your submission. It happens when your codebase breaches the memory limit, i.e. ~5.5G, during evaluation.
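
To get a rough idea of how close a step comes to that limit before submitting, the peak memory of a function call can be measured locally. This is only a sketch with hypothetical names: tracemalloc tracks allocations made through Python's allocator, so memory held by some native libraries may not be counted, and the 5.5 GiB constant simply restates the approximate figure above:

```python
import tracemalloc

# Approximate limit quoted above, expressed in bytes.
MEMORY_LIMIT_BYTES = int(5.5 * 1024 ** 3)


def peak_python_memory(func, *args, **kwargs):
    """Run func and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Wrapping the heaviest preprocessing step in peak_python_memory and comparing the peak with MEMORY_LIMIT_BYTES gives an early warning, although the number will usually underestimate the real process footprint.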

The training data on the server is the same as on the workspace, except for the row_id values, which were changed; I announced this on the day of the change.

Can you share your latest submission ID which you think is getting stuck only due to the OOM issue? I can debug it for you and share the part of the code which is causing high memory usage.

### Different submission tags but same commit tag

5 months ago

Hi @ngewkokyew,

Are you committing your code with changes? The submission.sh just creates a submission with the current git commit for you.

You will need to do:

[... make your changes ...]
git commit -m "Your commit message"
./submission.sh <solution tag>


5 months ago

Hi @ngewkokyew,

Please e-mail this issue to the Aridhia team (servicedesk@aridhia.com) with a description of the issue and your team name.

5 months ago

Hi @carlos.cortes, @all,

The scores are now updated for all the submissions and new ranks are available on the leaderboard.

We were following the approach of re-evaluating one submission at a time and, if it failed, providing feedback or a fix for it, and so on, which turned out to be quite slow. Right now, we have re-evaluated all the submissions, and submissions which have failed are being provided feedback or having automatic patches applied asynchronously.

All the best and, Merry Christmas!

### Evaluation Error

5 months ago

Sorry for the sed issue; we were trying to provide an automated patch to user code for row_id, which went wrong. I have undone this and requeued all the submissions affected by it.

### Evaluation Error

5 months ago

Hi @maruthi0506,

I can confirm the recent submissions failed due to OOM kill, when they touched memory usage ~5.5G.

Upon debugging #31962, I found it is happening due to Series.str.get_dummies used in the code, which is not a memory optimised function.
Point at which OOM is happening: https://gitlab.aicrowd.com/maruthi0506/dsai-challenge-solution/blob/master/predict.py#L279

This demonstrates what is happening in your submission, along with alternatives which you can use (variable names changed to avoid making public any information about the features used):

(suggested way #1, decently memory efficient)
>>> something_stack_2 = pd.get_dummies(something_stack)
>>> something_stack_2.info()
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 38981 entries, (0, 0) to (8690, 2)
Columns: 4589 entries,  to <removed>
dtypes: uint8(4589)
memory usage: 170.7 MB

(suggested way #2, most memory efficient, slower than #1)
>>> something_stack_2 = pd.get_dummies(something_stack, sparse=True)
>>> something_stack_2.info()
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 38981 entries, (0, 0) to (8690, 2)
Columns: 4589 entries,  to <removed>
dtypes: Sparse[uint8, 0](4589)
memory usage: 304.8 KB

(what your submission is doing -- ~5G was available at this time)
>>> something_stack_2 = something_stack.str.get_dummies()
Killed


NOTE: The only difference between the two approaches is that Series.str.get_dummies uses “|” as the separator by default. In case you were relying on it, you can do something like below:

>>> pd.get_dummies(pd.Series(np.concatenate(something_stack.str.split('|'))))


Let us know in case the problem continues after changing this (here and anywhere else it is used in your codebase); we will be happy to debug further accordingly.

### Evaluation error - row id mismatch

5 months ago

Hi @rachaas1,

Yes, the output file format generated by your code is wrong. prob_approval needs to be a float instead of arr[float]. I have shared the current output in the above link as a comment.

### Evaluation Error

5 months ago

The solution has 8GB RAM available.

Edit: out of 8GB, ~5.5GB is available for evaluation code

### Evaluation Error

5 months ago

Hi, it is getting killed while running, without a traceback.

Does it have any high RAM/CPU needs?

### Evaluation Error

5 months ago

Hi, looks like git-lfs isn’t installed on your system.

Can you try sudo apt-get install git-lfs. (more)

### Evaluation Error

5 months ago

Yes, just the nltk_data folder needs to be present.

Yes, the content remains same.

### [Announcement] row_id can be dynamic and different to workspace file

5 months ago

Hi everyone,

Please make sure that your submissions are creating the prediction file with the correct row_id.

The row_id was not being matched strictly until the previous evaluator version, and we have now added an assert for it. Because of this, submissions have failed with the error that the row_ids in the generated prediction file do not match those of the ground_truth.

Your solution needs to output the row_id from the testing data during evaluation, not hardcoded / sequential values (0,1,2…). Also note that row_id can be different and shuffled in the data present during evaluation v/s the workspace, to make sure submissions that just submit a predictions csv (instead of code) fail automatically.

We are trying to apply an automatic patch wherever possible, but it ultimately needs to be fixed in the submitted solutions. An example patch is present here.
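
In practice this means reading each row_id from the test file and echoing it into the output next to the prediction. The sketch below uses only the standard library; the helper names are hypothetical, the fallback file names are placeholders, and the assumption that the output consists of exactly the row_id and prob_approval columns is ours, so check the starter kit for the authoritative format:

```python
import csv
import os


def evaluation_paths():
    """Read the evaluator-provided paths; the fallbacks are local placeholders."""
    return (
        os.getenv("AICROWD_TEST_DATA_PATH", "test.csv"),
        os.getenv("AICROWD_PREDICTIONS_OUTPUT_PATH", "predictions.csv"),
    )


def write_row_predictions(test_path, output_path, predict_row):
    """Write one prediction per test row, copying row_id from the test data."""
    with open(test_path, newline="") as src, open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # Assumed output columns: row_id and prob_approval.
        writer = csv.DictWriter(dst, fieldnames=["row_id", "prob_approval"])
        writer.writeheader()
        for row in reader:
            # Never generate row_id yourself; reuse the value from the test file.
            writer.writerow({
                "row_id": row["row_id"],
                "prob_approval": float(predict_row(row)),
            })
```

Here predict_row stands for whatever callable turns one test row into a single probability; wrapping its result in float() also guards against emitting a one-element array instead of a scalar.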

### Evaluation Error

5 months ago

Hi @maruthi0506,

Yes, the row_id values, i.e. 5,6,7 in the test data provided to you on the workspace, can be anything, say 1000123, 1001010, 100001 (and in random order), in the test data present on the server going forward, so we know predictions are being carried out during evaluation.

To use nltk for the evaluation, you need to provide the nltk_data folder in your repository root, which can be done as follows (current working directory: your repository root):

python -c "import nltk; nltk.download('stopwords', download_dir='./nltk_data')"


cp -r ~/nltk_data nltk_data


#> git lfs install   (if not already using git lfs)
#> git lfs track "nltk_data/**"
#> git commit [...]


Please let us know in case you still face any issue.

### Submission Evaluation - Queued since 2 hours

5 months ago

Hi Shravan,

It looks like you have uploaded your prediction file, i.e. lgbm.csv, and are dumping it directly to the output path. We want your prediction model to run on the server itself, and not prediction files to be submitted as the solution. This is why your submission has failed.

The row_id values, i.e. 5,6,7 in the test data provided to you, can be anything, say 1000123, 1001010, 100001 (and in random order), in the test data present on the server going forward, so we know predictions are being carried out during evaluation.

### Evaluation Error

5 months ago

Hi everyone, please make sure that your submissions are creating the prediction file with the correct row_id. The row_id was not being matched strictly until the previous evaluator version, and we have now added an assert for it. Because of this, submissions have failed with the error that the row_ids in the generated prediction file do not match those of the ground_truth.

Your solution needs to output the row_id as shared in the test data, not hardcoded / sequential values (0,1,2…). Also note that row_id can be different in the data present during evaluation v/s the workspace, to make sure people aren’t hardcoding from that file.

We are trying to apply an automatic patch wherever possible, but it ultimately needs to be fixed in the submitted solutions.

### Submission Evaluation - Queued since 2 hours

5 months ago

Hi everyone, please make sure that your submissions are creating the prediction file with the correct row_id. The row_id was not being matched strictly until the previous evaluator version, and we have now added an assert for it. Because of this, submissions have failed with the error that the row_ids in the generated prediction file do not match those of the ground_truth.

Your solution needs to output the row_id as shared in the test data, not hardcoded / sequential values (0,1,2…). Also note that row_id can be different in the data present during evaluation v/s the workspace, to make sure people aren’t hardcoding from that file.

We are trying to apply an automatic patch wherever possible, but it ultimately needs to be fixed in the submitted solutions.

5 months ago

The submissions are being re-evaluated right now. Given we have a large number of submissions, i.e. 1000+ successful submissions, it will take a few more hours before all the submissions are re-evaluated with the new dataset.

### Submission Evaluation - Queued since 2 hours

5 months ago

This issue is resolved now, and your above submission has the latest feedback, i.e. against the newer dataset. Meanwhile, other submissions by you and other participants are still in the queue and are being re-evaluated right now.

### Submission Evaluation - Queued since 2 hours

5 months ago

Yes, I think workspaces will be available to you. Please go through the announcement made by Nick here. UPDATE / EXTENSION: DSAI Challenge: Leaderboard & Presentation deadlines

### Submission Evaluation - Queued since 2 hours

5 months ago

Yes, this condition was added in the newer version of the evaluator, which uses the updated splitting announced here. I am looking into this and will keep you updated here.

5 months ago

Hi @bzhousd,

Yes, the splitting approach is being changed.

### Different results from debug vs non-debug mode

5 months ago

Yes, the debug mode has small subset as described above.

No, the logs are not visible for successful submissions as of now in our design, but we will be glad to help in fetching logs in case of successful submission (debug mode I assume) if it is blocking you.

### Is the scoring function F1 or logloss?

5 months ago

Please confirm policy for final scoring i.e. all submissions will be considered or the one having best score on partial dataset?

### Log Loss and F1 on Leaderboard different from "PUBLIC_F1" and "PUBLIC_LOGLOSS"

5 months ago

Hi, you are looking at a debug submission score. http://gitlab.aicrowd.com/wangbot/dsai-challenge-solution/issues/23#note_34783

1. The submission is reflected back on AIcrowd.com / Leaderboard with lowest possible score for given competition.

### Test file changed

6 months ago

Can you tell us the submission ID and where you noticed the above file? Do you mean in the workspace?

### Unable to find a valid aicrowd.json file at the root of the repository

6 months ago

Hi @TayHaoZhe,

This is resolved now, and your shared commit ID is now running as submission ID 28656.
We had a limit on the maximum number of files we expected to be present at the repository root, which was an incorrect assumption.

### Example configuration to use CRAN packages in submission

6 months ago

Yes, all the dependencies need to be present in your repository.

### Example configuration to use CRAN packages in submission

6 months ago

Hi, I have shared the logs in your submission now. I guess you are missing the C50 package, which is why it is failing.

### Example configuration to use CRAN packages in submission

6 months ago

Can you tell your submission ID in which you tried using it? We can look into it.

### Update of "LogLoss" Score on Leader board

6 months ago

Hi @sweenke4,

Can you share which submission ID, according to you, has performed better than the one on the leaderboard?

### Example configuration to use CRAN packages in submission

6 months ago

Hi @TayHaoZhe,

There was a discussion here about installing tidyverse 1.3.0. There is an issue with the r-stringi package on conda-forge, in case you are trying that one.

### Example configuration to use CRAN packages in submission

6 months ago

Hi @ngewkokyew,

You can use the r-mice package from conda-forge, given it isn’t present in R channel.

To do so, in your environment.yml, make sure you have conda-forge under “channels”, and add r-mice under dependencies.

For installing locally on your system you can use: conda install -c conda-forge r-mice

### Submission Evaluation - Queued since 2 hours

6 months ago

There are ~12 submissions in the queue (plus 6 running) because of a stuck pipeline, which is being cleared right now. Your submission will be evaluated shortly. https://www.aicrowd.com/challenges/novartis-dsai-challenge/submissions

6 months ago

### Submission Evaluation - Queued since 2 hours

6 months ago

The pipeline was blocked by failed submissions which didn’t terminate with a non-zero exit code. We have cleared the pipeline and are adding a fix now so that it doesn’t happen again.

### Accessing the train file and test file in the same predict.py?

6 months ago

Yes, the evaluations run on separate servers from your workspaces.

### Is the scoring function F1 or logloss?

6 months ago

Hi, I will let @kelleni2 confirm on this from organisers point of view, given it is just configurable setting on our side.

### Accessing the train file and test file in the same predict.py?

6 months ago

Hi,

The default path can be anything you prefer, e.g. a workspace-based path for testing.

During evaluation, this environment variable will always be set, so the default value won’t be used.

### Test data matrix available

6 months ago

Hi, the process is extremely useful in the long run for multiple reasons. It guarantees the reproducibility of the results and the transparency needed. We also preserve your submissions as Docker images, which guarantees that the code can run forever on the current or a future dataset, even if any of its dependencies disappears from the public internet.

### Accessing the train file and test file in the same predict.py?

6 months ago

Hi @maruthi0506,

Your codebase needs to read this environment variable (it holds an absolute path) and just write the final predictions at that location. An example is already in the starter kit as well as in the comment above.

### Submission of only final predictions file

6 months ago

Hi, the process is extremely useful in the long run for multiple reasons. It guarantees the reproducibility of the results and the transparency needed. We also preserve your submissions as Docker images, which guarantees that the code can run forever, even if any of its dependencies disappears from the public internet.

Meanwhile, if you are facing any issues in setting up, please share them with us so that we can take care of them and make your participation smoother.

cc: @mohanty @kelleni2 if you have any additional points

### Submission limit per day

6 months ago

The limit is per team.

6 months ago

cc: @kelleni2

### Is the scoring function F1 or logloss?

6 months ago

It is your submission having best score on half of the test dataset.

We already have scores against full dataset for all of your submissions (hidden), so all submissions will be used.

### How to identify if my program running successfuly?

6 months ago

Hi, please copy and paste the exact command from the error message. You have to give permission for the .conda folder, not anaconda3.

### How to identify if my program running successfuly?

6 months ago

Hi, I just remembered that participants have sudo in their workspace. Can you instead try running the sudo chown... command yourself on the workspace? I believe it will fix the permission issue for you; follow it with the pip=10 installation.

### How to identify if my program running successfuly?

6 months ago

Hi, as I mentioned earlier you will need to get in touch with Aridhia first on Microsoft Teams to get your conda working i.e. fixing permission issue above.

After this, conda install pip=10 should resolve this issue. If not, we can debug further.

### How to identify if my program running successfuly?

6 months ago

Thanks for sharing the output. The pip package installation looks correct on your side.

I suspect the pip version on your side is 18.X or similar, which doesn’t work well with conda. (Github Issue)

Can you share the output of pip -V and, at the same time, try installing pip version 10.X with conda install pip=10? The exports generated afterwards should contain all of your pip packages as well.

Let me know if this resolves the issue you are facing.

### How to identify if my program running successfuly?

6 months ago

Hi, it looks like your conda installation has permission issues. Can you get in touch with the Aridhia team for a permission fix and share the above message with them?

Also, you haven’t shared the output of the two commands above, which I need to check whether your pip install [...] worked properly.

### How to identify if my program running successfuly?

6 months ago

Hi,

I don’t see pip packages in your environment.yml. Please make sure you have activated your conda environment when you did pip install.

Output from below commands will be useful to know more about the issue you are facing.

• which pip
• pip freeze

### How to identify if my program running successfuly?

6 months ago

Hi,

Logs shared for both #25991 and #25990.

Please check out this FAQ for debug mode which will speed up your debugging. Meanwhile also try running ./run.sh in local system (workspace) before submitting, to catch bugs without even making a submission (as submissions/day are limited).

### How to identify if my program running successfuly?

6 months ago

We provide feedback for all the submissions via Gitlab issues.

To clarify, the exact flow is as follows:

1. You make changes to your code, followed by git commit
2. ./submission.sh <xyz> to create a new git tag and push it to the repository
3. We create a new Gitlab issue in your repository when your submission tag is correct i.e. prefix of the Gitlab tag is submission- (prefix thing happen automatically in submission.sh for Novartis challenge)

Gitlab Issues Page: https://gitlab.aicrowd.com/shravankoninti/dsai-challenge-solution/issues

To run locally, you can simply call ./run.sh on your local system; it mimics what will happen on the online server (except for the runtime environment). When your code runs in the online environment, i.e. as a submission, we provide feedback by sharing logs. You can read more about it here.

6 months ago

This is happening because all of the submissions you have shared above have the same commit ID, i.e. 6b832bec, and the evaluation for this commit ID was already done in #25958. The subsequent tags are considered duplicates (cached for some time, not permanently).

Please make some change in your repository followed by git commit & submission.sh to trigger a new submission. Let us know in case any doubt still exists.

6 months ago

Can you share the AIcrowd submission ID (something like #2XXXX) or link to Gitlab issue? I can look into it and update you.

Meanwhile please try to create new post on forum for unrelated issues.

### Accessing the train file and test file in the same predict.py?

6 months ago

Yes, this is correct.

### Accessing the train file and test file in the same predict.py?

6 months ago

Sure. Can you point us to the file/link where you find wrong path?

### Is training data available during evaluation?

6 months ago

Hi all,

We are sorry that the announcement for this change didn’t go through. The testing data is available during evaluation, and the starter kit has been updated accordingly with an example.

It can be accessed via environment variable AICROWD_TRAIN_DATA_PATH which refers to same directory structure as /shared_data/data/training_data/ i.e. in which all of training related files are present.

Example to use it:

AICROWD_TEST_DATA_PATH = os.getenv("AICROWD_TEST_DATA_PATH", "/shared_data/data/testing_data/to_be_added_in_workspace.csv")
[...]


Please let us know in case there is any follow up question.

### Accessing the train file and test file in the same predict.py?

6 months ago

Yes, you can access all the files at the same time during evaluation.

The starter kit has all the information about the environment variables, but let me also clarify here which environment variables are available during evaluation.

• AICROWD_TEST_DATA_PATH: Refers to testing_phase2_release.csv file which is used by evaluator to judge your models in testing phase (soon to be made public)
• AICROWD_TRAIN_DATA_PATH: Refers to /shared_data/data/training_data/ in which all of training related files are present.
• AICROWD_PREDICTIONS_OUTPUT_PATH: Refers to the path at which your code is expected to output final predictions

Now in your codebase, you can simply do something as follows to load both the files:

import os

AICROWD_TRAIN_DATA_PATH = os.getenv("AICROWD_TRAIN_DATA_PATH", "/shared_data/data/training_data/")
AICROWD_PREDICTIONS_OUTPUT_PATH = os.getenv("AICROWD_PREDICTIONS_OUTPUT_PATH", "random_prediction.csv")

# Do pre-processing, etc
[...]
# Make predictions
[...]
prediction_df.to_csv(AICROWD_PREDICTIONS_OUTPUT_PATH, index=False)


I hope the example clarifies your doubt.

### Original Datasets for Train and Test

6 months ago

Hi,

Consider the files present in /shared_data/data/ on the workspace as the latest version, and their records as correct. The README in the starter kit contains numbers from the previous dataset version and may be wrong.

I am not sure about random_number_join.csv. @kelleni2 might be aware of it?

6 months ago

Hi @lcb,

I am sorry for the confusion. I see your submission ran in debug mode, under which we provide lowest score on the leaderboard.

Debug Mode FAQ

### How to access error logs?

6 months ago

We share error logs on a best-effort basis as comments directly in the Gitlab issues of your failed submissions, and it looks like you did get a response from our team later on in each Gitlab issue.

The submission which you made in debug mode actually had open “agent-logs”, i.e. the logs of your submitted code. http://gitlab.aicrowd.com/michal-pikusa/dsai-challenge/issues/5#note_30139

Unfortunately, the message said “Logs for Admin Reference” instead of “Logs for Participant Reference”, which would have caused the impression that the logs aren’t available to you directly. I have now updated the Gitlab issue comment content for this challenge so that this doesn’t cause any confusion going forward.

FAQ Section: Why Debug Mode

### Original Datasets for Train and Test

6 months ago

Are you referring to the workspace or the evaluator?

In the workspace those files are present in /shared_data/data/, while in the evaluator you can access them using the environment variable AICROWD_TEST_DATA_PATH.

### How to use conda-forge or CRAN for packages in evaluation?

6 months ago

The issue you are facing was multi-fold, which is why it took some time on our side as well to figure out the best solution for you.

1. The procedure you were following was correct except for the apt.txt file. This file expects just the names of the packages you want to install (although this isn’t the cause of the error). So it should be something like:
~❯ cat apt.txt
libicu-dev


But the error still continues as:

> library('tidyverse')
Error: package or namespace load failed for ‘tidyverse’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/srv/conda/envs/notebook/lib/R/library/stringi/libs/stringi.so':
libicui18n.so.64: cannot open shared object file: No such file or directory


We found that the issue is due to the dependency chain r-tidyverse -> r-stringr -> r-stringi, and the r-stringi package in the conda-forge channel is broken.

2. We checked the recommended channel, i.e. r, and it has working packages by default, so this should have worked in the first place.
    - r::r-tidyverse
    - r::r-stringi

3. [Solution; tl;dr] But I remember from our call that you needed version 1.3.0 specifically. So, these are the environment.yml entries you need for 1.3.0; they basically get r-stringr and r-stringi from the r channel instead of the conda-forge one.
    - conda-forge::r-tidyverse==1.3.0
    - r::r-stringr
    - r::r-stringi


Sorry that you had to go through a long debug cycle.

6 months ago

Hi @lcb,

The leaderboard gets updated in real time as soon as your submission fails or succeeds, and it contains your best score. The leaderboard is currently up to date as well.

Your submission #25650 has log loss 1000.0 and f1 score 0.0. This score is worse than that of your submission #24558, which has log loss 0.973 and f1 score 0.380. That is why the leaderboard didn’t change after your submission.
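As a rough illustration of why the leaderboard entry stayed the same, here is a minimal Python sketch. It is not the actual leaderboard code: the `Submission` type and `best_submission` helper are hypothetical, and ranking by log loss first (lower is better) with F1 as a tie-breaker is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    submission_id: int
    log_loss: float  # lower is better
    f1: float        # higher is better


def best_submission(submissions):
    """Pick the entry a best-score leaderboard would keep.

    Assumption: log loss is compared first, F1 only breaks ties.
    """
    return min(submissions, key=lambda s: (s.log_loss, -s.f1))


# The two scores quoted in this reply.
history = [
    Submission(24558, log_loss=0.973, f1=0.380),
    Submission(25650, log_loss=1000.0, f1=0.0),
]

best = best_submission(history)
print(best.submission_id)
```

Both metrics favour #24558 here, so the result does not depend on which of the two is treated as primary.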

Let us know in case you have any further doubt on this.

### What is being evaluated during submission?

6 months ago

Hi @wangbot,

Welcome to the challenge!

As described here in starter kit README, we use run.sh as the code entry point. You can modify it based on your requirement. https://gitlab.aicrowd.com/novartis/novartis-dsai-challenge-starter-kit#code-entrypoint

We have a debug mode which you can activate using debug: true in aicrowd.json. In this mode, you get complete access to the logs and can debug without needing our help to fetch them. NOTE: In debug mode the submission runs on an extremely small/partial dataset and your scores aren’t reflected back to the leaderboard. https://gitlab.aicrowd.com/novartis/novartis-dsai-challenge-starter-kit#aicrowdjson

Nevertheless, the AIcrowd team [and organisers] have access to all the logs, and we do share error tracebacks and relevant logs with you as comments in the Gitlab issue on a best-effort basis, which can take from a few minutes to a few hours.

I hope this clarifies any doubt you had. All the best with the competition!

### https://discourse.aicrowd.com/t/

4 months ago

(Replace this first paragraph with a brief description of your new category. This guidance will appear in the category selection area, so try to keep it below 200 characters.)

Use the following paragraphs for a longer description, or to establish category guidelines or rules:

• Why should people use this category? What is it for?

• How exactly is this different than the other categories we already have?

• What should topics in this category generally contain?

• Do we need this category? Can we merge with another category, or subcategory?

5 months ago

### Test Topic, Test Topic

5 months ago

Test Content, Test Content, Test Content, Test Content, Test Content, Test Content, Test Content, Test Content

5 months ago

Hi @bzhousd,

Your new submissions are failing because of the downcasting done on the “row_id” column. I have added an automated patch for your submissions which converts it, but it is important to include the change in your codebase so that it is fixed properly.

The changes are as follows in your codebase:

replace("EDA_simple.py", 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"]]', 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"] and c!="row_id"]')
replace("EDA_v3.py", 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"]]', 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"] and c!="row_id"]')
replace("EDA.py", 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"]]', 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"] and c!="row_id"]')
replace("EDA_v4.py", 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"] and c not in [\'drugkey\',\'indicationkey\']]', 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"] and c not in [\'drugkey\',\'indicationkey\', \'row_id\']]')
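The `replace` helper used in the lines above is not shown in this post. A minimal sketch of what such a helper could look like, assuming it performs a plain text substitution inside a single file (the implementation and the `EDA_demo.py` stand-in file are hypothetical; only the call shape and the two strings are taken from the lines above):

```python
import tempfile
from pathlib import Path


def replace(filename, old, new):
    """Replace every occurrence of `old` with `new` in `filename`.

    Hypothetical helper: returns how many occurrences were replaced.
    """
    path = Path(filename)
    text = path.read_text()
    count = text.count(old)
    if count:
        path.write_text(text.replace(old, new))
    return count


OLD = 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"]]'
NEW = 'int_cols = [c for c in df if df[c].dtype in ["int64", "int32"] and c!="row_id"]'

# Try the helper on a throwaway file instead of a real EDA script.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "EDA_demo.py"
    demo.write_text(OLD + "\n")
    replaced = replace(demo, OLD, NEW)
    patched = demo.read_text()

print(replaced)
print(patched)
```

The patched line no longer contains the old text, so applying the same substitution a second time changes nothing.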


### REAL Competition - Submission (and VMs?) stuck

6 months ago

Hi Emilio,

Sorry, the stuck submission ran into an underlying node issue multiple times and finally went through yesterday.

All the nodes in the Kubernetes cluster are stopped now and the submissions have gone through.

Cheers,
Shivam

### Due Date and Conference related information

4 months ago

Hi @student,

The deadline for AMLD participants especially was December 31.

### How to add SSH key to Gitlab?

5 months ago

It is best practice to use Git over SSH instead of Git over HTTP. In order to use SSH, you will need to:

1. Create an SSH key pair on your local computer.
2. Add the key to GitLab

## Creating your SSH key pair

ssh-keygen -t rsa -b 4096 -C "name@example.com"


Once you have the key on your system at a location of your choice, you must manually copy the public key and add it at https://gitlab.aicrowd.com/profile/keys.

### Which docker image is used for my submissions?

5 months ago

We use a custom fork of repo2docker for our image build process. Its functionality is exactly the same as upstream repo2docker, except for the base image. Our Dockerfile uses the following base image:

FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04


How to use above?
To install forked version of repo2docker, please install it via pypi using:

pip install aicrowd-repo2docker


git clone git@gitlab.aicrowd.com:<your-username>/<your-repository>.git
cd <your-repository>
pip install -U aicrowd-repo2docker
aicrowd-repo2docker \
--no-run \
--user-id 1001 \
--user-name aicrowd \
--image-name sample_aicrowd_build_45f36 \
--debug .


How to specify custom runtime?

Why fork repo2docker?
It is currently not possible to use custom base image in vanilla repo2docker, and this status is being tracked here.

### How to enable GPU for your submission?

5 months ago

GPUs are allotted to your submission on a need basis. If your submission uses a GPU and you want one allocated, please set gpu: true in your aicrowd.json.

Versions we support right now:

nvidia driver version: 410.79
cuda: 10.0


NOTE: Majority of the challenges haven GPU enabled, this information should be available directly on challenge page or starter kits. In case you are not sure the challenge you are participating has GPUs, please reach out to us on Discourse.

6 months ago

### Background

When we write code and run it, we often make errors. Our best friend in such cases is the logs, which let us debug and fix our code quickly.

We would like to share logs by default, but doing so is tricky because arbitrary code is executed as part of a submission in our competitions. The logs may unintentionally leak part or the whole of the datasets. As we all understand, these datasets are confidential, and knowledge of the testing dataset can be used to gain an undue advantage in a running competition.

Due to this, our default policy is to hide the logs. BUT we do keep a close eye on submissions which fail, manually verify them, and share the relevant traceback with the participants on a best-effort basis, which can take from a few minutes to multiple hours. This is surely an issue, and the process can’t be scaled. This is what led to an integration testing phase in our competitions, known as “debug” mode. I will explain below how to enable “debug” mode and what it does.

## Debug Mode

Debug mode, when enabled, runs your code against extremely small datasets (different from testing, a subpart of testing, a subpart of training, etc., depending on the competition), a different seed, and so on. In a nutshell, the data holds no value even when it is visible to participants.

Each competition has a different policy for which logs are visible and whether debug mode should exist. But in the majority of cases, we enable debug mode by default and show logs for the user’s submitted code (not the infrastructure/evaluator logs).

### How to use this?

When you submit a solution, the metadata for the competition, along with other information, is present in aicrowd.json. You can specify debug: true there to enable debug mode.

When enabled, the following happens with your submission:

1. It runs against a different, small dataset for quicker runs.
2. Logs are visible to you by default when the submission fails, under the heading “Logs for participants reference”.
3. The submission is reflected back on AIcrowd.com / Leaderboard with the lowest possible score for the given competition.
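If you prefer to toggle the flag from a script rather than editing the file by hand, a minimal Python sketch could look like the following. The `set_debug` helper is hypothetical, and the demo writes to a temporary stand-in file that contains only the `gpu` key; a real aicrowd.json holds other competition metadata, which the helper leaves untouched.

```python
import json
import tempfile
from pathlib import Path


def set_debug(config_path, enabled):
    """Set the `debug` flag in an aicrowd.json file, keeping all other keys."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["debug"] = bool(enabled)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo on a throwaway file so that no real submission config is modified.
with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "aicrowd.json"
    config_file.write_text(json.dumps({"gpu": False}))
    updated = set_debug(config_file, True)
    on_disk = json.loads(config_file.read_text())

print(on_disk)
```

Remember to switch the flag back to false before making a submission that should count on the leaderboard.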

### Still facing issues?

We keep the environment for debug mode and actual submissions exactly the same. But it is still possible that your code runs well in debug mode and doesn’t in an actual submission. In this case, we need to fall back to the traditional support method. The escalation path is: we automatically post your logs -> you can tag the competition organisers in the Gitlab issue -> let us know in the Discourse forum.

We wish you best of luck for participating in competition!

### Submission Error AMLD 2020

5 months ago

Hi @student,

Are the submissions you are referring to #57743 and #57745 respectively? Both of these submissions are part of the leaderboard calculation. Is it possible that you tried to view the leaderboard immediately, whereas we refresh it in ~30 seconds?

#### Flatland Challenge

5 months ago

Hi @a2821952,

Let us know in case it doesn’t work out for you.

### Evaluation process

6 months ago

1. Yes, the solutions are evaluated on same test samples but they are shuffled. https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/evaluators/service.py#L89
2. Video generation is done on a subset of all environments, and that subset remains the same for all evaluations. Is it possible that when you opened the leaderboard, the videos didn’t all start playing at the same time, leading to this perception?
3. This is the place where Flatland library is generating score and N+1 thing might not be the reason. I will let @mlerik investigate & comment on it.

### Evaluation time

6 months ago

@mugurelionut Please use 8 hours as the time limit, we have updated our evaluation phase to strictly enforce it going forward.

NOTE: Your submission having 28845 seconds as total execution time is safe with 8 hours time limit as well. Non-evaluation phase takes roughly 5-10 minutes which is included in timing visible above.

Total Time = Docker Image Building + Orchestration + Execution (8 hours enforced now)
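The arithmetic behind the note above can be checked with a few lines of Python. The numbers are the ones quoted in this reply; the variable names are only illustrative, and the 5-10 minute overhead is the rough estimate given above, not a measured value.

```python
EXECUTION_LIMIT_S = 8 * 60 * 60        # 8 hours = 28800 seconds
total_time_s = 28845                   # total time reported for the submission
overhead_range_s = (5 * 60, 10 * 60)   # image build + orchestration, roughly 5-10 minutes

# Execution time = total time minus the non-evaluation overhead.
execution_estimates = [total_time_s - overhead for overhead in overhead_range_s]
within_limit = all(t <= EXECUTION_LIMIT_S for t in execution_estimates)

print(execution_estimates)  # [28545, 28245]
print(within_limit)         # True
```

Even with the smaller overhead estimate, the execution part stays under the 28800-second limit.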

### Evaluation time

6 months ago

Thanks for sharing one example submission. Yes, it will be checked for all the submissions.

### Evaluation time

6 months ago

Can you share ID/link of the submission you are referring above?

The exact timeout from participants’ perspective is 8 hours, while we keep a margin of 2 hours (making it total of 10 hours) as overhead for docker image build, node provisioning, scheduling, etc. If any submission has run for longer time than this, we can look further why timeout wasn’t respected.

### Evaluation time

6 months ago

We have per step timeout of 15 minutes while the overall submission timeout is 8 hours.

### Mean Reward and Mean Normalized Reward

6 months ago

You can check out the information in the flatland-rl documentation here. https://flatlandrl-docs.aicrowd.com/09_faq.html#how-is-the-score-of-a-submission-computed

The scores of your submission are computed as follows:

1. Mean number of agents done, in other words how many agents reached their target in time.
2. Mean reward is just the mean of the cumulated reward.
3. If multiple participants have the same number of done agents, we compute a “normalized” reward as follows:

normalized_reward = cumulative_reward / (self.env._max_episode_steps + self.env.get_num_agents())

The mean number of agents done is the primary score value; only when it is tied do we use the “normalized” reward to determine the position on the leaderboard.
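To make the tie-breaking rule concrete, here is a small Python sketch. It is not the evaluator code: the entries and their numbers are made up, the helper names are hypothetical, and it assumes that a higher normalized reward ranks first. It also uses one episode per entry instead of a mean over many episodes.

```python
def normalized_reward(cumulative_reward, max_episode_steps, num_agents):
    """Same formula as above, with the environment values passed in explicitly."""
    return cumulative_reward / (max_episode_steps + num_agents)


def leaderboard_key(entry):
    # Primary: mean number of agents done. Tie-breaker: normalized reward.
    return (entry["mean_done"], entry["normalized_reward"])


# Made-up entries: A and B are tied on agents done, C has more agents done.
entries = [
    {"name": "A", "mean_done": 0.8, "normalized_reward": normalized_reward(-120.0, 100, 20)},
    {"name": "B", "mean_done": 0.8, "normalized_reward": normalized_reward(-60.0, 100, 20)},
    {"name": "C", "mean_done": 0.9, "normalized_reward": normalized_reward(-300.0, 100, 20)},
]

ranking = [e["name"] for e in sorted(entries, key=leaderboard_key, reverse=True)]
print(ranking)
```

C ranks first despite its lower reward because agents done is compared first; the normalized reward only decides the order between A and B.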

#### AMLD 2020 - D'Avatar Challenge

5 months ago

Hi @borisov,

The submission format you shared above is valid json.
You can also download sample output json from resources section on the contest page here.

### Evaluation failed

6 months ago

I went through the submission #25101. It failed because the underlying Kubernetes node faced an issue, causing your submission to terminate while it was in the extrinsic phase. I have requeued the submission now.

### Thanking the organizers

6 months ago

Hi @amirabdi, @imheyman.
Updates from the organisers are as follows

Winners of the stage 1 and 2 are the same.

• 1st entry: DMIRLAB
• 2nd entry: Maximilian Seitzer
• 3rd entry: Amir Abdi

There will be no best paper award: due to the lack of submitted reports and the insufficient quality of the reports, the jury could not decide to award a brilliancy prize.

### Evaluation result says file too large？

6 months ago

Hi @rolanchen ,

It seems the error message wasn’t good. I re-ran your submission with minor change to print the traceback. I think this should help you in debugging further, seems to be coming from your server.init()

Traceback (most recent call last):
File "/home/aicrowd/run.py", line 14, in <module>
train.main()
File "/home/aicrowd/train.py", line 159, in main
server.init()
File "/home/aicrowd/aiserver.py", line 52, in init
self.kick_cbuf = SharedCircBuf(self.instance_num, {'NAN':np.zeros([2,2])}, ['NAN'])
File "/home/aicrowd/lock_free_queue.py", line 86, in __init__
File "/home/aicrowd/lock_free_queue.py", line 25, in __init__
sary = multiprocessing.sharedctypes.RawArray('b', 8 * size)
File "/srv/conda/envs/notebook/lib/python3.7/multiprocessing/sharedctypes.py", line 61, in RawArray
obj = _new_value(type_)
File "/srv/conda/envs/notebook/lib/python3.7/multiprocessing/sharedctypes.py", line 41, in _new_value
wrapper = heap.BufferWrapper(size)
File "/srv/conda/envs/notebook/lib/python3.7/multiprocessing/heap.py", line 263, in __init__
block = BufferWrapper._heap.malloc(size)
File "/srv/conda/envs/notebook/lib/python3.7/multiprocessing/heap.py", line 242, in malloc
(arena, start, stop) = self._malloc(size)
File "/srv/conda/envs/notebook/lib/python3.7/multiprocessing/heap.py", line 134, in _malloc
arena = Arena(length)
File "/srv/conda/envs/notebook/lib/python3.7/multiprocessing/heap.py", line 77, in __init__
os.ftruncate(self.fd, size)
OSError: [Errno 27] File too large


### Evaluation result says file too large？

6 months ago

Also the error doesn’t seem to be size related:

On attempting to open files with sufficiently long file names, python throws IOError: [Errno 27] File too large.  This is misleading, and perhaps should be relabeled as 'File name too long.'


### Evaluation result says file too large？

6 months ago

Hi, it is acceptable. You can train models in the train/ folder up to 1000Gi in size.

Can you share submission id?

shivam has not provided any information yet.