0 Follower

1 Following

Anchit

Anchit Gupta


Ratings Progression

Challenge Categories

Challenges Entered

Completed

ADDI Alzheimers Detection Challenge

ADDI

Machine Learning for detection of early onset of Alzheimers

Latest submissions

No submissions made in this challenge.

Completed

AI Blitz XIII

AIcrowd

5 Puzzles 21 Days. Can you solve it all?

Latest submissions

See All

graded

174305

Fri, 18 Feb 2022 09:30:54

Completed

Seismic Facies Identification Challenge

SEAM AI

3D Seismic Image Interpretation by Machine Learning

Latest submissions

No submissions made in this challenge.

Completed

Music Demixing Challenge ISMIR 2021

Sony Group Corporation

Latest submissions

No submissions made in this challenge.

Completed

Food Recognition Challenge

Seerave Foundation

A benchmark for image-based food recognition

Latest submissions

See All

graded	65838	Tue, 12 May 2020 16:38:10
graded	63982	Wed, 6 May 2020 15:15:57
graded	63734	Tue, 5 May 2020 08:29:00

Completed

AI Blitz #8

AIcrowd

5 Puzzles, 3 Weeks. Can you solve them all? 😉

Latest submissions

See All

graded	138074	Mon, 17 May 2021 06:07:10
graded	137954	Sun, 16 May 2021 16:33:00
graded	137921	Sun, 16 May 2021 15:38:08

Completed

Learning to Smell

Firmenich

Predicting smell of molecular compounds

Latest submissions

No submissions made in this challenge.

Completed

AI Blitz #6

AIcrowd

5 Problems 21 Days. Can you solve it all?

Latest submissions

No submissions made in this challenge.

Completed

AI Blitz 5 ⚡

AIcrowd

5 Puzzles, 3 Weeks | Can you solve them all?

Latest submissions

See All

graded	119139	Wed, 3 Feb 2021 06:46:10
graded	118824	Mon, 1 Feb 2021 13:44:30
graded	118823	Mon, 1 Feb 2021 13:41:53

Completed

AI Blitz #4

AIcrowd

5 PROBLEMS 3 WEEKS. CAN YOU SOLVE THEM ALL?

Latest submissions

No submissions made in this challenge.

Completed

AIcrowd Blitz⚡#2

AIcrowd

5 Problems 15 Days. Can you solve it all?

Latest submissions

See All

graded	73911	Fri, 24 Jul 2020 13:38:52
graded	73510	Wed, 22 Jul 2020 10:39:43
graded	73508	Wed, 22 Jul 2020 10:29:15

Practice

Latest submissions

See All

graded

60072

Fri, 27 Mar 2020 12:17:51

Completed

AIcrowd Blitz - May 2020

AIcrowd

5 Problems 15 Days. Can you solve it all?

Latest submissions

See All

graded	67341	Sat, 16 May 2020 19:52:02
graded	67339	Sat, 16 May 2020 19:50:34
graded	67335	Sat, 16 May 2020 19:48:32

Completed

AI for Good - AI Blitz #3

AIcrowd

AI for Good - ITU

5 PROBLEMS 3 WEEKS. CAN YOU SOLVE THEM ALL?

Latest submissions

See All

graded	78827	Sun, 30 Aug 2020 16:05:48
graded	78550	Sat, 29 Aug 2020 14:57:12
graded	78506	Sat, 29 Aug 2020 08:20:23

Completed

Latest submissions

No submissions made in this challenge.

Completed

Face De-Blurring

AIcrowd

Convert Blurred image Into Clear Image

Latest submissions

See All

graded

174305

Fri, 18 Feb 2022 09:30:54

Participant	Rating
akshay_goindani	0

bhookh_lagi_hai Food Recognition Challenge
View
Hard_Drive_Corrupted AIcrowd Blitz - May 2020
View
kant_breathe AIcrowd Blitz⚡#2
View
motherboard_corrupted AI for Good - AI Blitz #3
View
Hard_Drive_Corrupted AI Blitz #4
View
Hard_drive_corrupted AI Blitz 5 ⚡
View

Solve Sudoku

Is it possible for the same sudoku to have multiple solutions? My code produces a result for every test image, but the accuracy is still not 1. For an invalid input sudoku, the code reports that no solution is possible.
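One way to check whether a given puzzle admits more than one solution is to count completions with backtracking and stop as soon as a second one is found. The sketch below is only an illustration: the function name `count_solutions` and the 0-for-empty grid encoding are assumptions, not part of the challenge code, and it does not check that the given digits are themselves consistent.

```python
def count_solutions(grid, limit=2):
    """Count completions of a 9x9 sudoku grid (0 = empty), stopping at `limit`."""
    grid = [row[:] for row in grid]  # work on a copy so the caller's grid is untouched
    count = 0

    def allowed(r, c, d):
        # The digit must not already appear in the row, the column or the 3x3 box.
        if any(grid[r][j] == d for j in range(9)):
            return False
        if any(grid[i][c] == d for i in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(grid[i][j] != d
                   for i in range(br, br + 3)
                   for j in range(bc, bc + 3))

    def search():
        nonlocal count
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    for d in range(1, 10):
                        if allowed(r, c, d):
                            grid[r][c] = d
                            search()
                            grid[r][c] = 0
                            if count >= limit:
                                return  # enough solutions found, stop early
                    return  # first empty cell handled; nothing more to try here
        count += 1  # no empty cell left: one complete solution

    search()
    return count
```

A result of 1 means the puzzle is well posed; a result of 2 (the default limit) means at least two solutions exist, so a solver's output may legitimately differ from the reference answer.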

FOODC

RESNET 50 LB 0.566 notebook

About 4 years ago

Sharing this notebook so that more people can get a head start on using pre-trained models to get better results.

FOODC ResNet50 submission

This notebook follows the baseline notebook, but instead of training a network from scratch we fine-tune a pretrained ResNet50 model for 25 epochs.

Author - Pulkit Gera

To open this notebook in Google Colab, click below!

Download the files

These include the train and test images as well as the CSV files indexing them

In [0]:

!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/train_images.zip
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/test_images.zip
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/train.csv
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/test.csv

We create directories and unzip the images

In [0]:

!mkdir data
!mkdir data/test
!mkdir data/train
!unzip train_images -d data/train
!unzip test_images -d data/test

Import necessary packages

In [0]:

import torchvision.transforms as transforms
from torch.utils.data.sampler import SubsetRandomSampler
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import lr_scheduler
from torch.utils.data import TensorDataset, DataLoader, Dataset
import torchvision
from torchvision import models
import torch.optim as optim
import pandas as pd
import numpy as np
import cv2
import os
from sklearn import preprocessing
import matplotlib.pyplot as plt
%matplotlib inline
import time

Loading Data

In PyTorch we can either load our files with torchvision's built-in dataset classes or write a custom class to load the data. A custom class must implement the __init__, __len__ and __getitem__ methods. We create a custom dataset to suit our needs. More info on custom loaders can be read here

In [0]:

class FoodData(Dataset):
    def __init__(self,data_list,data_dir = './',transform=None,train=True):
        super().__init__()
        self.data_list = data_list
        self.data_dir = data_dir
        self.transform = transform
        self.train = train
    
    def __len__(self):
        return self.data_list.shape[0]
    
    def __getitem__(self,item):
        if self.train:
          img_name,label = self.data_list.iloc[item]
        else:
          img_name = self.data_list.iloc[item]['ImageId']
        img_path = os.path.join(self.data_dir,img_name)
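        img = cv2.imread(img_path,1)
        # OpenCV loads images as BGR; convert to RGB so the channel order
        # matches the ImageNet mean/std used in Normalize
        img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
        img = cv2.resize(img,(256,256))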
        if self.transform is not None:
            img = self.transform(img)
        if self.train:
          return {
              'gt' : img,
              'label' : torch.tensor(label)

          }
        else:
          return {
              'gt':img
          }

We first convert the class labels into integer encodings using a LabelEncoder. This step is necessary because the loss function expects integer class indices rather than strings.

In [0]:

train = pd.read_csv('train.csv')
le = preprocessing.LabelEncoder()
targets = le.fit_transform(train['ClassName'])
ntrain = train
ntrain['ClassName'] = targets
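To make the encoding step concrete, here is a small pure-Python sketch of the same idea: sort the distinct class names, map each one to its position, and keep the sorted list around so predictions can be mapped back to names. The helper names `fit_labels`, `encode` and `decode` are hypothetical and only mirror what LabelEncoder's fit_transform and inverse_transform do.

```python
def fit_labels(labels):
    """Return the sorted distinct classes and a name -> index mapping."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return classes, index


def encode(labels, index):
    """Replace each class name with its integer index."""
    return [index[name] for name in labels]


def decode(codes, classes):
    """Map integer indices back to class names (needed for the submission file)."""
    return [classes[code] for code in codes]


classes, index = fit_labels(["water", "apple", "water", "egg"])
codes = encode(["water", "apple", "water", "egg"], index)
names = decode(codes, classes)
```

The decode step is why the fitted encoder must be kept until the end of the notebook: the model outputs indices, while the submission expects class names.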

We load our train data and apply some necessary transforms: converting to a PIL image, converting to a tensor and normalizing across channels. We also apply augmentations such as random rotation, random horizontal flip and colour jitter; more can be found here. Augmentation is an important step because it effectively increases the variety of the training data, which usually helps the model generalize.

In [0]:

transforms_train = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomRotation(90),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(),
    transforms.ToTensor(),
    transforms.Normalize( mean = np.array([0.485, 0.456, 0.406]),
    std = np.array([0.229, 0.224, 0.225]))
])
train_path = 'data/train/train_images'
train_data = FoodData(data_list= ntrain,data_dir = train_path,transform = transforms_train)
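As a rough illustration of what the normalization step does to a single pixel: after the tensor conversion the values lie in [0, 1], and each channel is then shifted by its mean and divided by its standard deviation. The sketch below assumes RGB channel order and uses the same ImageNet statistics as the transform above; `normalize_pixel` is a hypothetical helper, not a torchvision function.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel means (R, G, B)
STD = (0.229, 0.224, 0.225)   # per-channel standard deviations (R, G, B)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))


grey = normalize_pixel(MEAN)              # a pixel equal to the mean maps to zeros
white = normalize_pixel((1.0, 1.0, 1.0))  # a white pixel maps to positive values
```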

EDA

Let us do some exploratory data analysis. The idea is to look at the class distribution and at what the images look like.

In [0]:

train = pd.read_csv('train.csv')
num = train['ClassName'].value_counts()
classes = train['ClassName'].unique()
print("Percentage of each class")
for cl in classes:
  print(cl,'\t',num[cl]/train.shape[0]*100,"%")

Percentage of each class
water 	 9.25667703528907 %
pizza-margherita-baked 	 1.179877721763381 %
broccoli 	 0.9009975329829454 %
salad-leaf-salad-green 	 5.738496192212807 %
egg 	 2.2417676713504235 %
butter 	 3.71125174300118 %
bread-white 	 6.382065858629196 %
apple 	 2.0486967714255067 %
dark-chocolate 	 0.9439021774107046 %
white-coffee-with-caffeine 	 1.3085916550466588 %
sweet-pepper 	 0.9009975329829454 %
mixed-salad-chopped-without-sauce 	 1.8127212270728308 %
tomato-sauce 	 1.179877721763381 %
cucumber 	 1.1476992384425615 %
cheese 	 1.4694840716507562 %
pasta-spaghetti 	 1.040437627373163 %
rice 	 2.7458972433765956 %
zucchini 	 0.9653544996245843 %
salmon 	 0.5470342164539311 %
mixed-vegetables 	 2.542100182344739 %
espresso-with-caffeine 	 2.0916014158532663 %
banana 	 1.9414351603561086 %
strawberries 	 0.9331760163037649 %
mayonnaise 	 0.4612249275984125 %
almonds 	 0.740105116378848 %
bread-wholemeal 	 4.269012120562051 %
wine-white 	 1.619650327147914 %
hard-cheese 	 1.2013300439772605 %
ham-raw 	 0.7079266330580285 %
tomato 	 3.8399656762844576 %
french-beans 	 0.8044620830204869 %
mandarine 	 0.740105116378848 %
wine-red 	 2.585004826772498 %
potatoes-steamed 	 1.673281132682613 %
croissant 	 0.8044620830204869 %
carrot 	 3.185669848761128 %
salami 	 0.5255818942400515 %
boisson-au-glucose-50g 	 0.9117236940898853 %
biscuits 	 0.7293789552719082 %
corn 	 0.39686796095677357 %
leaf-spinach 	 0.9331760163037649 %
tea-green 	 0.740105116378848 %
chips-french-fries 	 1.4587579105438164 %
parmesan 	 0.7293789552719082 %
beer 	 0.8580928885551861 %
bread-french-white-flour 	 0.6542958275233294 %
coffee-with-caffeine 	 4.043762737316314 %
chicken 	 1.1369730773356215 %
soft-cheese 	 0.5148557331331117 %
tea 	 1.8985305159283494 %
avocado 	 0.9439021774107046 %
bread-sourdough 	 0.6757481497372091 %
gruyere 	 0.7615574385927276 %
sauce-savoury 	 0.6542958275233294 %
honey 	 0.6972004719510887 %
mixed-nuts 	 0.868819049662126 %
jam 	 1.7483642604311918 %
bread-whole-wheat 	 0.7937359219135471 %
water-mineral 	 0.922449855196825 %
onion 	 0.4397726053845329 %
pickle 	 0.3003325109943151 %

We observe that water is the most common class (about 9.3% of the images), although the distribution is not heavily skewed. Let us plot some images of white-flour French bread and French fries to see what kind of images we have.

In [0]:

imgs = train.loc[train['ClassName'] == 'bread-french-white-flour']
plt.figure(figsize=(10,10))
for i in range(imgs[:16].shape[0]):
  path = imgs.iloc[i]['ImageId']
  image = cv2.imread(os.path.join(train_path,path),1)
  image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
  plt.subplot(4,4,i+1)
  plt.axis('off')
  plt.imshow(image)

In [0]:

imgs = train.loc[train['ClassName'] == 'chips-french-fries']
plt.figure(figsize=(10,10))
for i in range(imgs[:16].shape[0]):
  path = imgs.iloc[i]['ImageId']
  image = cv2.imread(os.path.join(train_path,path),1)
  image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
  plt.subplot(4,4,i+1)
  plt.axis('off')
  plt.imshow(image)

Split Data into Train and Validation

Now we want to see how well our model is performing, but we don't have the test labels to check. So we split our dataset into train and validation sets. We evaluate the classifier on the validation set to estimate how well it works on unseen data, which also lets us detect overfitting on the train set. There are many ways to do validation, such as k-fold and leave-one-out.

We also make dataloaders, which yield minibatches of the dataset during each epoch.

Although the data is not that imbalanced, one idea we can add is SMOTE, an oversampling technique for imbalanced classification that synthesizes additional samples for the minority classes. I haven't tried it here, but it is reported to work well. Read more about it here
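SMOTE itself is not shown here. As a simpler alternative for handling imbalance, one could weight classes by inverse frequency, for instance to pass as the weight argument of nn.CrossEntropyLoss. The sketch below is an assumption-laden illustration: `inverse_frequency_weights` is a hypothetical helper, and the formula n_samples / (n_classes * class_count) is just one common choice.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Toy example: "water" is three times as frequent as "pickle".
weights = inverse_frequency_weights(["water", "water", "water", "pickle"])
```

Rare classes receive weights above 1 and frequent classes weights below 1, so mistakes on rare classes contribute more to the loss.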

In [0]:

batch = 64
valid_size = 0.2
num = len(train_data)
# Dividing the indices for train and cross validation
indices = list(range(num))
np.random.shuffle(indices)
split = int(np.floor(valid_size*num))
train_idx,valid_idx = indices[split:], indices[:split]

#Create Samplers
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

train_loader = DataLoader(train_data, batch_size = batch, sampler = train_sampler)
valid_loader = DataLoader(train_data, batch_size = batch, sampler = valid_sampler)
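The index split used above can be sketched in isolation with the standard library only. This version adds a fixed seed so the split is reproducible, which the cell above does not do; the function name `split_indices` and the `seed` parameter are assumptions made for illustration.

```python
import random


def split_indices(n, valid_fraction=0.2, seed=0):
    """Shuffle range(n) and return (train_indices, valid_indices)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded shuffle for a reproducible split
    split = int(n * valid_fraction)       # number of validation samples
    return indices[split:], indices[:split]


train_idx_demo, valid_idx_demo = split_indices(10)
```

Because both index lists come from one shuffled permutation, no sample can end up in both the train and the validation subset.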

Here we load the test images. Note: this file does not have any labels with it

In [0]:

transforms_test = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
    transforms.Normalize( mean = np.array([0.485, 0.456, 0.406]),
    std = np.array([0.229, 0.224, 0.225]))
])
test_path = 'data/test/test_images'
test = pd.read_csv('test.csv')
test_data = FoodData(data_list= test,data_dir = test_path,transform = transforms_test,train=False)

test_loader = DataLoader(test_data, batch_size=batch, shuffle=False)

Here we check whether a GPU is available. If it is, we just need to move our data and model to the GPU for faster computation.

In [0]:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assuming that we are on a CUDA machine, this should print a CUDA device:

print(device)

cuda:0

Define the Model

We are going to use ResNet50. ResNets were motivated by the observation that, beyond a certain depth, plain networks become less accurate rather than more. A common explanation is that the gradients passed back become smaller and smaller, which leads to negligible updates in the early layers.
To combat that, ResNets add skip connections that carry a layer's input forward past one or more layers. This helps preserve the gradients and lets us build deeper models.
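The effect of a skip connection on the gradient can be illustrated with a deliberately simplified scalar model. If each layer computes f(x) with a small derivative d, the gradient through a plain stack is the product of the d values, whereas a residual layer computes x + f(x), whose derivative is 1 + d. The function names and the value 0.1 are assumptions chosen for illustration, not measurements from a real network.

```python
def plain_gradient(layer_derivatives):
    """Gradient through a stack of plain layers: product of the derivatives."""
    grad = 1.0
    for d in layer_derivatives:
        grad *= d
    return grad


def residual_gradient(layer_derivatives):
    """Gradient through residual layers y = x + f(x): product of (1 + derivative)."""
    grad = 1.0
    for d in layer_derivatives:
        grad *= 1.0 + d
    return grad


derivs = [0.1] * 50               # 50 layers, each with a small local derivative
plain = plain_gradient(derivs)    # shrinks towards zero as depth grows
resid = residual_gradient(derivs) # the identity path keeps the gradient from vanishing
```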

PyTorch provides a collection of pretrained ResNet architectures. We use ResNet50.
To read more about resnet and its variants, this is a good blog
More on pretrained models with pytorch here and making models here.

In [0]:

train_sampler.__len__()

Out[0]:

In [0]:

dataloaders = {}
dataset_sizes = {}
dataloaders['train'] = train_loader
dataloaders['val'] = valid_loader
dataset_sizes['train'] = len(train_sampler)
dataset_sizes['val'] = len(valid_sampler)

Train

All right, enough talk; time to train. We define the number of epochs and train the model. An epoch is one forward and backward pass over all the data points. An epoch consists of iterations, the number of which depends on the batch size. For each batch we compute the output, do a backward pass and let the optimizer take a step. This is the usual workflow for PyTorch training code.

Validate

After each epoch we run the same steps on the validation set, except for the backward pass and the optimizer step. If the validation accuracy improves, we keep a copy of the model weights. This acts as a form of early stopping, since the best weights are restored at the end of training.

In [0]:

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for data in dataloaders[phase]:
                inputs = data['gt'].squeeze(0).to(device)
                labels = data['label'].to(device)
                # inputs = inputs.to(device)
                # labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

In [0]:

import copy
model_ft = models.resnet50(pretrained=True)
# To only train the last layer
# for param in model_ft.parameters():
#     param.requires_grad = False
num_ftrs = model_ft.fc.in_features

# Alternatively, it can be generalized to nn.Linear(num_ftrs, len(class_names)).
model_ft.fc = nn.Linear(num_ftrs, 61)

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.Adam(model_ft.parameters(), lr=0.001)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

Here we define our model object along with our optimizer and loss function. Typically for multi-class classification we use Cross Entropy Loss. More about different types of losses here.
We use the popular Adam optimizer with a learning rate of 0.001, which is its default. There are other optimizers like SGD, RMSprop, Adamax, etc. You can have a detailed look at optimizers here

You may be wondering what a scheduler is. A scheduler provides a policy for decaying the learning rate. For example, we can decrease the learning rate after a fixed number of epochs, or reduce it when the validation accuracy stops improving. This often helps the model converge faster and to a better solution. Read more here
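For the step schedule configured above (step_size=7, gamma=0.1), the learning rate at a given epoch can be written in closed form as base_lr * gamma ** (epoch // step_size). The helper below is a hypothetical stand-alone sketch of that formula, not the scheduler's own code.

```python
def step_lr(base_lr, epoch, step_size=7, gamma=0.1):
    """Learning rate after `epoch` completed epochs under a step decay policy."""
    return base_lr * gamma ** (epoch // step_size)


# Learning rate at the start of each 7-epoch stage of a 25-epoch run.
schedule = [step_lr(0.001, epoch) for epoch in (0, 7, 14, 21)]
```

With 25 epochs this gives four stages: the rate starts at 0.001 and is multiplied by 0.1 at epochs 7, 14 and 21.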

In [0]:

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)

Epoch 0/24
----------
train Loss: 3.0510 Acc: 0.2419
val Loss: 2.6348 Acc: 0.3192

Epoch 1/24
----------

Predict on Validation

Now we run our trained model on the validation set and evaluate it

In [0]:

# model.load_state_dict(torch.load('best_model_so_far.pth'))
model_ft.eval()
correct = 0
total = 0
pred_list = []
correct_list = []
with torch.no_grad():
    for images in valid_loader:
        data = images['gt'].squeeze(0).to(device)
        target = images['label'].to(device)
        outputs = model_ft(data)
        _, predicted = torch.max(outputs.data, 1)
        total += target.size(0)
        pr = predicted.detach().cpu().numpy()
        for i in pr:
          pred_list.append(i)
        tg = target.detach().cpu().numpy()
        for i in tg:
          correct_list.append(i)
        correct += (predicted == target).sum().item()

print('Accuracy of the network on the 10000 test images: %f %%' % (
    100 * correct / total))

Accuracy of the network on the 10000 test images: 49.141631 %

Evaluate the Performance

We use the same metrics that will be used for the test set.
F1 score and Log Loss are the metrics for this challenge

In [0]:

from sklearn.metrics import f1_score,precision_score,log_loss   
print("F1 score :",f1_score(correct_list,pred_list,average='micro')*100)

F1 score : 49.141630901287556
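The micro-averaged F1 score printed above is identical to the validation accuracy, and this is expected: in single-label multi-class classification every wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so micro precision, micro recall and accuracy coincide. The sketch below checks this on toy labels; `micro_f1` is a hypothetical pure-Python helper, not the scikit-learn implementation.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for cls in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == cls and t == cls:
                tp += 1
            elif p == cls:
                fp += 1  # predicted this class, but the true class differs
            elif t == cls:
                fn += 1  # true class was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = micro_f1(y_true, y_pred)
```

A macro average (average='macro' in scikit-learn) would weight each class equally and would generally differ from accuracy.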

Predict on test set

Time for the moment of truth! We predict on the test set and make the submission.

In [0]:

# model.load_state_dict(torch.load('best_model_so_far.pth'))
model_ft.eval()

preds = []
with torch.no_grad():
    for images in test_loader:
        data = images['gt'].squeeze(0).to(device)
        outputs = model_ft(data)
        _, predicted = torch.max(outputs.data, 1)
        pr = predicted.detach().cpu().numpy()
        for i in pr:
          preds.append(i)

Save it in the correct format

In [0]:

# Create Submission file        
df = pd.DataFrame(le.inverse_transform(preds),columns=['ClassName'])
df.to_csv('submission.csv',index=False)

To download the generated CSV in Colab, run the command below

In [0]:

from google.colab import files
files.download('submission.csv')

Go to the platform, participate in the challenge and submit the submission.csv.

Resnet 50 lb 0.566

About 4 years ago

(topic withdrawn by author, will be automatically deleted in 24 hours unless flagged)

Anchit has not provided any information yet.

Notebooks

Create Notebook

Filters

Private

MaskRCNN + Augmentation

Augmentation to improve baseline score

Anchit
· Over 3 years ago

Open in Colab · View
