### 🚀 Share your solutions! 🚀

11 months ago

Hello, I want to share my solution.

The competition was very interesting and unusual. It was also my first competition on the AIcrowd platform, and the guides, pages, and discussions were very helpful. Thanks to the organizers!

Actually my solution is very similar to xiaozhou_wang’s.

I have two strategies. The first is based on the idea of collecting samples from “hard” classes (carried over from Round 1). Suppose we have a trained model and know the validation F1 score for each of the six classes. We sum the class predictions with weights equal to 1 - f1_validation, then choose the samples with the largest weighted sums.


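    def choose_unlabelled_by_sum_probs(self, unlabelled_indices, unlabelled_preds, choose_size):
        assert len(unlabelled_indices) == len(unlabelled_preds)

        # Nothing to rank if everything fits into the budget.
        if len(unlabelled_indices) <= choose_size:
            return unlabelled_indices

        # Per-class validation F1 scores of the best model state.
        _, best_f1s = self.best_states['best_thrs_0']

        # Weighted sum of class predictions; weak classes (low F1) weigh more.
        # n_classes is defined elsewhere in the solution (six classes here).
        choose_scores = unlabelled_preds[:, 0] * (1 - best_f1s[0])
        for x in range(1, n_classes):
            choose_scores += unlabelled_preds[:, x] * (1 - best_f1s[x])
        sorted_indices = np.argsort(-choose_scores)  # descending by score
        return [unlabelled_indices[x] for x in sorted_indices[:choose_size]]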
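
The weighting is easier to see on toy numbers. Below is a minimal standalone sketch of the same scoring in plain Python; the helper names `weighted_scores` and `top_k`, the two-class predictions, and the F1 values are all made up for illustration and are not part of the solution code.

```python
def weighted_scores(preds, f1s):
    """Sum each row of class predictions with weights (1 - validation F1)."""
    weights = [1.0 - f1 for f1 in f1s]
    return [sum(p * w for p, w in zip(row, weights)) for row in preds]

def top_k(indices, scores, k):
    """Return the k indices with the largest scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [indices[i] for i in order[:k]]

# Toy setup (assumed values): class 0 is easy (F1 = 0.9), class 1 is hard (F1 = 0.5).
f1s = [0.9, 0.5]
preds = [
    [0.9, 0.1],  # confident in the easy class
    [0.1, 0.9],  # confident in the hard class
    [0.5, 0.5],
]
scores = weighted_scores(preds, f1s)
picked = top_k([10, 11, 12], scores, 2)
print(scores, picked)
```

The sample that the model assigns to the hard class gets the largest score, so it is purchased first.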


The second strategy is to collect the samples with the highest uncertainty. I treat a prediction of 0.5 as the most uncertain, so I sum 0.5 - |p - 0.5| over all classes, where p is the predicted probability for a class.

    def choose_unlabelled_by_uncertainty(self, unlabelled_indices, unlabelled_preds, choose_size):
        assert len(unlabelled_indices) == len(unlabelled_preds)

        # Nothing to rank if everything fits into the budget.
        if len(unlabelled_indices) <= choose_size:
            return unlabelled_indices

        # Not used by this strategy; left over from the first one.
        _, best_f1s = self.best_states['best_thrs_0']

        # Each class contributes 0.5 when p == 0.5 and 0 when p is 0 or 1.
        choose_scores = np.sum(0.5 - np.abs(unlabelled_preds - 0.5), axis=1)
        sorted_indices = np.argsort(-choose_scores)  # descending by score
        return [unlabelled_indices[x] for x in sorted_indices[:choose_size]]
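
As a quick check of the uncertainty score, here is a small plain-Python sketch on toy predictions. The helper name `uncertainty_scores` and all the numbers are assumptions for illustration only.

```python
def uncertainty_scores(preds):
    """Per-sample sum of 0.5 - |p - 0.5| over classes (larger = less certain)."""
    return [sum(0.5 - abs(p - 0.5) for p in row) for row in preds]

# Toy predictions for three samples and two classes (assumed values).
preds = [
    [0.99, 0.01],  # very confident on both classes
    [0.50, 0.60],  # close to 0.5 on both classes
    [0.90, 0.40],
]
scores = uncertainty_scores(preds)

# Rank sample ids by descending uncertainty and keep the top two.
sample_ids = [20, 21, 22]
order = sorted(range(len(scores)), key=lambda i: -scores[i])
picked = [sample_ids[i] for i in order[:2]]
print(scores, picked)
```

The confident sample scores close to zero and is left out, while the two samples with predictions near 0.5 are selected.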


I also considered the third strategy from the hosts, “match labels to target distribution”, but results were worse with it than without it. PS to the organizers: this code is still in my solution because I experimented with it, but it selects very few samples, and I don’t think it affects the score.

I tried several ratios between the first two strategies, but I didn’t see an obvious advantage for either of them. So in the end I used both strategies with equal budgets.

I saw the idea of “Active Learning” in one of the papers and decided to run several iterations (let’s say L). Each iteration has two steps:

1. Train a model on the currently labelled samples.
2. Take ~purchase_budget//L samples using the two strategies (the last batch can be bigger by 1).
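
The budget bookkeeping for these two steps could look roughly like the sketch below. It is only one possible reading: the helper names are hypothetical, and both the way the remainder is pushed into the last iterations and the way an odd sample goes to the second strategy are assumptions, not details confirmed above.

```python
def iteration_batches(purchase_budget, n_loops):
    """Split the purchase budget into n_loops near-equal batches.

    Assumption: any remainder is given to the last iterations,
    so late batches can be bigger by 1.
    """
    base, extra = divmod(purchase_budget, n_loops)
    return [base + (1 if i >= n_loops - extra else 0) for i in range(n_loops)]

def split_between_strategies(batch_size):
    """Equal budget for both strategies; the second takes the odd sample."""
    first = batch_size // 2
    return first, batch_size - first

# Hypothetical numbers: 1000 labels to buy over 3 iterations.
batches = iteration_batches(1000, 3)
for step, size in enumerate(batches, start=1):
    by_hard_class, by_uncertainty = split_between_strategies(size)
    # Here the real loop would train the model, predict on the unlabelled
    # pool, and purchase the selected samples.
    print(step, size, by_hard_class, by_uncertainty)
```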

The problem was choosing the number of iterations L. My approach is not as clever as xiaozhou_wang’s. I noticed that ~300 samples are enough for one iteration. Moreover, in my experiments more iterations sometimes made the result worse. I looked at the submissions table to estimate training and inference time, and arrived at the following formula (I have a pretraining phase, so the first iteration doesn’t need training):

    max_choose_size = min(len(unlabelled_dataset), purchase_budget)
    n_loops = max(1, min(1 + (compute_budget - 50) // 220, int_ceil(max_choose_size, 290)))
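
To get a feel for the formula, here is a worked sketch with made-up budgets. `int_ceil` is not shown in the post, so it is assumed here to be ceiling division; the wrapper `n_loops_for` and all input values are hypothetical.

```python
def int_ceil(a, b):
    """Assumed meaning of int_ceil: integer ceiling of a / b."""
    return -(-a // b)

def n_loops_for(compute_budget, purchase_budget, n_unlabelled):
    """Number of iterations, limited by both compute time and samples."""
    max_choose_size = min(n_unlabelled, purchase_budget)
    by_time = 1 + (compute_budget - 50) // 220      # first iteration needs no training
    by_samples = int_ceil(max_choose_size, 290)     # roughly 290 samples per iteration
    return max(1, min(by_time, by_samples))

# Hypothetical budgets (compute budget in the same units as the constants).
print(n_loops_for(1000, 1000, 10000))  # sample-limited: ceil(1000 / 290) = 4
print(n_loops_for(300, 3000, 10000))   # time-limited: 1 + 250 // 220 = 2
print(n_loops_for(30, 3000, 10000))    # very small budget: clamped to 1
```

With a generous compute budget the sample count is the limit, with a tight one the time term wins, and the outer `max` guarantees at least one iteration.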


For training I used efficientnet_b3 for 5 epochs with

    CosineAnnealingLR(optimizer, T_max=5, eta_min=1e-5)
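
For intuition about this schedule, the closed form documented for PyTorch's `CosineAnnealingLR` can be evaluated directly. This is a sketch in plain Python, not the training code; the base learning rate of 1e-3 is an assumption, since the post does not state it.

```python
import math

def cosine_lr(epoch, base_lr, t_max=5, eta_min=1e-5):
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    cos_term = 1 + math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * cos_term

BASE_LR = 1e-3  # assumed value, not given in the post
schedule = [cosine_lr(epoch, BASE_LR) for epoch in range(6)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

The rate starts at the base value and decays smoothly towards `eta_min`, which it would reach at epoch `T_max`.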


and the following augmentations:

    return A.Compose([
        A.OneOf([A.GaussianBlur(), A.MotionBlur()], p=0.5),
        A.ToGray(p=0.01),
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.5),
        A.RandomRotate90(p=0.5),
    ])


### Code for End of Competition Training pipelines

12 months ago

Each of the 5 training pipelines will run with its own budget, right?

### [Update] Round 2 of Data Purchasing Challenge is now live!

Hi, it seems there’s a bug in local_evaluation.py.
I think you should change

    time_available = COMPUTE_BUDGET - (time_started - time.time())

to

    time_available = COMPUTE_BUDGET - (time.time() - time_started)
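
A tiny sketch with fixed timestamps shows why the sign matters. The function names and the numbers are made up for illustration; only the two expressions come from local_evaluation.py as quoted above.

```python
def time_available_buggy(compute_budget, time_started, now):
    """Original expression: the elapsed time is subtracted with the wrong sign."""
    return compute_budget - (time_started - now)

def time_available_fixed(compute_budget, time_started, now):
    """Proposed expression: remaining budget = budget - elapsed time."""
    return compute_budget - (now - time_started)

# Hypothetical run: budget of 100 s, started at t = 1000 s, checked at t = 1030 s.
buggy = time_available_buggy(100.0, 1000.0, 1030.0)
fixed = time_available_fixed(100.0, 1000.0, 1030.0)
print(buggy, fixed)
```

With the original expression the available time grows as the run goes on, so the budget is never exhausted.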

### 0.9+ Baseline Solution for Part 1 of Challenge

Do you know how much of an accuracy boost “pseudolabel remaining dataset” gives?
I didn’t use it.

### Experiments with “unlabelled” data

I’ve checked it locally.
Using all 10K images is better than my chosen 3K by 0.006. Maybe I can recover some of that gap by changing the purchasing algorithm, but I still feel I need to tune my model.

### Experiments with “unlabelled” data

The scores I posted are from the leaderboard, and I can’t check 10K there…
Local scores are a little higher than the LB, but they correlate with it.
Yeah, maybe I’ll check it locally.

### Experiments with “unlabelled” data

Here are just my results. I used the same model, but different purchase modes.

1. Train with initial 5000 images only: LB 0.869
2. Add 3000 random images from unlabelled dataset: 0.881
3. “smart” purchasing (at least non random): 0.888

So we see that “smart” purchasing helps, but not by much, maybe ~0.01.
Tuning the model would probably do more to push the score further.

### First round doesn't matter?

If I understood correctly, the first round is preliminary and counts for little, while the second round is decisive, right?

### Size of Datasets

Ahh… I see, so AIcrowd runs the whole pipeline twice, and I can see logs only from the debug version.
Great, thanks!

### Size of Datasets

Hello!
During submission, both the training dataset and the unlabelled dataset contain only 100 samples.
This is probably the debug version.
Is that intentional?

### Potential loop hole in purchasing phase

I think local evaluation can be modified somehow,
maybe in the ZEWDPCProtectedDataset class, so that it doesn’t give you the label in a sample.