
ML Battleground

Submitted code for the Hey Barrels challenge.

Getting-started code for the Hey Barrels challenge.

By victorkras2008

Image regression with 'ktrain': predicting numerical targets from photos.



ML Battleground Hey-Barrels

See the ktrain example notebook on image regression, which predicts numerical targets from photos (e.g., age prediction).

Author: Victor Krasilnikov

Score = 0.156 / 0.096

STEP 0: Install packages

In [1]:
!pip install -q ktrain
  Building wheel for ktrain (setup.py) ... done
  Building wheel for langdetect (setup.py) ... done
  Building wheel for syntok (setup.py) ... done
  Building wheel for seqeval (setup.py) ... done
  Building wheel for keras-bert (setup.py) ... done
  Building wheel for sacremoses (setup.py) ... done
  Building wheel for keras-transformer (setup.py) ... done
  Building wheel for keras-pos-embd (setup.py) ... done
  Building wheel for keras-multi-head (setup.py) ... done
  Building wheel for keras-layer-normalization (setup.py) ... done
  Building wheel for keras-position-wise-feed-forward (setup.py) ... done
  Building wheel for keras-embed-sim (setup.py) ... done
  Building wheel for keras-self-attention (setup.py) ... done
ERROR: transformers 3.5.1 has requirement sentencepiece==0.1.91, but you'll have sentencepiece 0.1.95 which is incompatible.
In [2]:
!pip install git+https://gitlab.aicrowd.com/aicrowd/aicrowd-cli.git >/dev/null
%load_ext aicrowd.magic
  Running command git clone -q https://gitlab.aicrowd.com/aicrowd/aicrowd-cli.git /tmp/pip-req-build-_dvkcr50

STEP 1: Download the Data

In [3]:
API_KEY = '' # Please enter your API Key [https://www.aicrowd.com/participants/me]
%aicrowd login --api-key $API_KEY
API Key valid
Saved API Key successfully!
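
A key hardcoded in a shared notebook is easy to leak. A minimal sketch, assuming the key is kept in an environment variable instead (the name AICROWD_API_KEY and the helper read_api_key are hypothetical choices for this example, not something the AIcrowd CLI defines):

```python
import os


def read_api_key(env_var="AICROWD_API_KEY"):
    """Return the API key stored in `env_var`, or raise if it is missing.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With this in place, API_KEY = read_api_key() can replace the empty string literal before the login line.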
In [4]:
%aicrowd dataset list -c hey-barrels
%aicrowd dataset download -c hey-barrels -j 3

!unzip train.zip >/dev/null
!unzip test.zip >/dev/null
              Datasets for challenge #750                                       
┌───┬────────────────────────┬─────────────┬───────────┐                        
│ # │ Title                  │ Description │      Size │                        
├───┼────────────────────────┼─────────────┼───────────┤                        
│ 0 │ example_submission.csv │ -           │   7.45 KB │                        
│ 1 │ test.zip               │ -           │ 485.96 MB │                        
│ 2 │ train.zip              │ -           │ 494.61 MB │                        
└───┴────────────────────────┴─────────────┴───────────┘                        


In [6]:
!ls
example_submission.csv	sample_data  test  test.zip  train  train.zip
In [7]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline


import os
import pandas as pd
import numpy as np

import ktrain
from ktrain import vision as vis

print(ktrain.__version__)
0.25.4
In [8]:
TRAIN_DATA_DIR = "train/images"
TEST_DATA_DIR = "test"
In [9]:
submission = pd.read_csv("example_submission.csv")

train_df = pd.read_csv("train/meta-data.csv")
train_df
Out[9]:
filename barrels_count pigs_count
0 0001.png 6 10
1 0002.png 10 10
2 0003.png 7 8
3 0004.png 5 8
4 0005.png 12 7
... ... ... ...
495 0496.png 11 9
496 0497.png 8 13
497 0498.png 10 8
498 0499.png 6 8
499 0500.png 11 5

500 rows × 3 columns

In [11]:
# Models for image regression
vis.print_image_regression_models()
pretrained_resnet50: 50-layer Residual Network (pretrained on ImageNet)
resnet50: 50-layer Resididual Network (randomly initialized)
pretrained_mobilenet: MobileNet Neural Network (pretrained on ImageNet)
mobilenet: MobileNet Neural Network (randomly initialized)
pretrained_inception: Inception Version 3  (pretrained on ImageNet)
inception: Inception Version 3 (randomly initialized)
wrn22: 22-layer Wide Residual Network (randomly initialized)
default_cnn: a default LeNet-like Convolutional Neural Network

STEP 2: Train and Predict for 'barrels_count'

In [18]:
LABEL1 = 'barrels_count'

NET = 'pretrained_resnet50'
FREEZE = 15
EPOCHS = 5
SIZE = (224,224)

Create train and val data

In [13]:
# LABEL1
data_aug = vis.get_data_aug(horizontal_flip=True)
(train_data, val_data, preproc) = vis.data.images_from_df(train_df,
                                                     data_aug = data_aug, 
                                                     image_column="filename",
                                                     label_columns=[LABEL1],
                                                     directory=TRAIN_DATA_DIR ,
                                                     is_regression=True,
                                                     target_size=SIZE,
                                                     color_mode='rgb',
                                                     random_state=42)
/usr/local/lib/python3.7/dist-packages/ktrain/utils.py:580: UserWarning: Task is being treated as REGRESSION because either class_names argument was not supplied or is_regression=True. If this is incorrect, change accordingly.
  'either class_names argument was not supplied or is_regression=True. ' + \
Found 500 images belonging to 1 classes.
Found 450 validated image filenames.
Found 50 validated image filenames.
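
The log above shows 450 training and 50 validation images, i.e. a 10% hold-out of the 500 labelled files. As a rough illustration of such a seeded split (this is not ktrain's internal code, and holdout_split with its parameters is a hypothetical name):

```python
import random


def holdout_split(items, val_fraction=0.1, seed=42):
    """Shuffle `items` reproducibly and split off a validation subset."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    # The first n_val shuffled items become validation, the rest training.
    return shuffled[n_val:], shuffled[:n_val]


filenames = [f"{i:04d}.png" for i in range(1, 501)]
train_files, val_files = holdout_split(filenames)
print(len(train_files), len(val_files))  # 450 50
```

Fixing the seed keeps the validation set identical across runs, so validation scores for different settings stay comparable.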

Create a Model and Wrap in Learner

We use the image_regression_model function to create a ResNet50 model. By default, the model freezes all layers except the final randomly-initialized dense layer.

In [14]:
model = vis.image_regression_model(NET, train_data, val_data)
The normalization scheme has been changed for use with a pretrained_resnet50 model. If you decide to use a different model, please reload your dataset with a ktrain.vision.data.images_from* function.

Is Multi-Label? False
Is Regression? True
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94773248/94765736 [==============================] - 0s 0us/step
pretrained_resnet50 model created.

Estimate Learning Rate

We will select a learning rate associated with falling loss from the plot displayed.

In [15]:
# wrap model and data in Learner object
learner = ktrain.get_learner(model=model, train_data=train_data, val_data=val_data, 
                             workers=8, use_multiprocessing=False, batch_size=64)
In [16]:
learner.lr_find(max_epochs=10)
learner.lr_plot()
simulating training for different learning rates... this may take a few moments...
Epoch 1/10
7/7 [==============================] - 23s 200ms/step - loss: 49.5231 - mae: 6.1074
Epoch 2/10
7/7 [==============================] - 14s 171ms/step - loss: 51.9118 - mae: 6.2883
Epoch 3/10
7/7 [==============================] - 14s 169ms/step - loss: 43.0927 - mae: 5.6934
Epoch 4/10
7/7 [==============================] - 14s 169ms/step - loss: 25.3665 - mae: 4.1148
Epoch 5/10
7/7 [==============================] - 14s 170ms/step - loss: 54.3453 - mae: 6.2018
Epoch 6/10
7/7 [==============================] - 14s 201ms/step - loss: 60.9660 - mae: 6.7987
Epoch 7/10
7/7 [==============================] - 14s 170ms/step - loss: 2036.5121 - mae: 21.3089


done.
Please invoke the Learner.lr_plot() method to visually inspect the loss plot to help identify the maximal learning rate associated with falling loss.

From the plot above, we choose a learning rate of 1e-4.
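
Reading the rate off the plot is a judgment call. One common heuristic, sketched here under the assumption that the (learning rate, loss) pairs from the sweep are available as plain lists, is to take the rate where the loss falls most steeply. The function name and the sample numbers are made up for illustration; they are not the values behind the plot.

```python
import math


def steepest_descent_lr(lrs, losses):
    """Return the learning rate at the end of the steepest drop in loss.

    `lrs` must be strictly increasing. The slope is measured against
    log10(lr), since the sweep covers several orders of magnitude.
    """
    if len(lrs) != len(losses) or len(lrs) < 2:
        raise ValueError("need two or more matching (lr, loss) pairs")
    best_lr, best_slope = None, 0.0
    for i in range(1, len(lrs)):
        d_loss = losses[i] - losses[i - 1]
        d_log_lr = math.log10(lrs[i]) - math.log10(lrs[i - 1])
        slope = d_loss / d_log_lr
        if slope < best_slope:
            best_lr, best_slope = lrs[i], slope
    return best_lr  # None if the loss never decreases


# Illustrative sweep (hypothetical numbers).
lrs = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [50.0, 48.0, 30.0, 25.0, 60.0, 2000.0]
print(steepest_descent_lr(lrs, losses))
```

A heuristic like this is only a starting point; checking the result against the plot is still worthwhile.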

Train Model

We will begin by training the model for EPOCHS epochs using a 1cycle learning rate policy.
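
As a rough picture of what a 1cycle policy does: the learning rate ramps up to the maximum over the first half of training and back down over the second half. The sketch below is a simplified triangular version. The function name and the div factor of 10 are assumptions for illustration and may differ from what fit_onecycle does internally.

```python
def one_cycle_lr(step, total_steps, max_lr, div=10.0):
    """Triangular 1cycle schedule: max_lr/div -> max_lr -> max_lr/div."""
    if total_steps < 2 or not 0 <= step < total_steps:
        raise ValueError("need total_steps >= 2 and 0 <= step < total_steps")
    base_lr = max_lr / div
    half = (total_steps - 1) / 2
    # Distance from the midpoint scaled to [0, 1]: 1 at both ends, 0 at the peak.
    distance = abs(step - half) / half
    return max_lr - (max_lr - base_lr) * distance


# 5 epochs of 8 batches each, matching the run in this notebook.
schedule = [one_cycle_lr(s, 40, 1e-4) for s in range(40)]
print(min(schedule), max(schedule))
```

The low rate at the start acts as a warm-up, and the low rate at the end lets the weights settle.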

In [19]:
learner.freeze(FREEZE) # unfreeze all but the first FREEZE layers
In [20]:
learner.fit_onecycle(1e-4, EPOCHS)

begin training using onecycle policy with max lr of 0.0001...
Epoch 1/5
8/8 [==============================] - 38s 3s/step - loss: 53.4476 - mean_absolute_error: 6.1878 - val_loss: 9.4409 - val_mean_absolute_error: 2.6303
Epoch 2/5
8/8 [==============================] - 24s 3s/step - loss: 20.9102 - mean_absolute_error: 3.5075 - val_loss: 8.1250 - val_mean_absolute_error: 2.4088
Epoch 3/5
8/8 [==============================] - 30s 1s/step - loss: 16.1258 - mean_absolute_error: 3.2325 - val_loss: 7.4529 - val_mean_absolute_error: 2.2556
Epoch 4/5
8/8 [==============================] - 31s 1s/step - loss: 15.8710 - mean_absolute_error: 3.2218 - val_loss: 7.5739 - val_mean_absolute_error: 2.3220
Epoch 5/5
8/8 [==============================] - 30s 1s/step - loss: 11.7174 - mean_absolute_error: 2.6846 - val_loss: 6.9371 - val_mean_absolute_error: 2.2365
Out[20]:
<tensorflow.python.keras.callbacks.History at 0x7f6d745f63d0>
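
The loss and mean_absolute_error columns in the log can be read as follows, assuming the loss is mean squared error (this is consistent with the numbers above but is not stated in the notebook). A validation MAE of about 2.2 means the predicted barrel count is off by roughly two barrels per image on average. A plain-Python sketch of both metrics, with made-up sample values:

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalizes large misses heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences; in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical counts and predictions, for illustration only.
y_true = [6, 10, 7, 5]
y_pred = [8.0, 9.0, 7.0, 9.0]
print(mean_squared_error(y_true, y_pred))   # (4 + 1 + 0 + 16) / 4 = 5.25
print(mean_absolute_error(y_true, y_pred))  # (2 + 1 + 0 + 4) / 4 = 1.75
```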

Make Predictions

In [23]:
# get a Predictor instance that wraps model and Preprocessor object
predictor = ktrain.get_predictor(learner.model, preproc)
In [24]:
# Predict 'barrels_count'
preds = predictor.predict_folder(TEST_DATA_DIR)   
preds[0]
Found 500 images belonging to 1 classes.
Out[24]:
('test/0501.png', 9.280953)
In [25]:
submission[LABEL1] = [round(pred[1]) for pred in preds]
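
Two details of this step are worth knowing. Python's built-in round sends halves to the nearest even integer (2.5 becomes 2), and a regression model can in principle output a negative value, which is not a valid count. A small sketch that handles both; to_count is a hypothetical helper, not part of ktrain:

```python
import math


def to_count(value):
    """Convert a raw regression output to a non-negative integer count.

    Rounds halves up (unlike built-in round) and clips negatives to 0.
    """
    return max(0, int(math.floor(float(value) + 0.5)))


print([to_count(v) for v in (9.280953, 2.5, -0.7)])  # [9, 3, 0]
```

It would slot into the line above as [to_count(pred[1]) for pred in preds].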

STEP 3: Train and Predict for 'pigs_count'

In [26]:
LABEL2 = 'pigs_count'

NET = 'pretrained_resnet50'
FREEZE = 15
EPOCHS = 5
SIZE = (224,224)
In [27]:
# 1. Create train and val data for LABEL2
data_aug = vis.get_data_aug(horizontal_flip=True)
(train_data, val_data, preproc) = vis.data.images_from_df(train_df,
                                                     data_aug = data_aug,
                                                     image_column="filename",
                                                     label_columns=[LABEL2],
                                                     directory=TRAIN_DATA_DIR,
                                                     is_regression=True,
                                                     target_size=SIZE,
                                                     color_mode='rgb',
                                                     random_state=42)
# 2. Create the model
model = vis.image_regression_model(NET, train_data, val_data)

# 3. Wrap model and data in a Learner object
learner = ktrain.get_learner(model=model, train_data=train_data, val_data=val_data,
                             workers=8, use_multiprocessing=False, batch_size=64)
# 4. Freeze the first FREEZE layers and train with the 1cycle policy
learner.freeze(FREEZE)
learner.fit_onecycle(1e-4, EPOCHS)

# 5. Predict 'pigs_count' for the test images
predictor = ktrain.get_predictor(learner.model, preproc)
preds = predictor.predict_folder(TEST_DATA_DIR)
preds[0]

# 6. Store the rounded predictions in the submission table
submission[LABEL2] = [round(pred[1]) for pred in preds]
/usr/local/lib/python3.7/dist-packages/ktrain/utils.py:580: UserWarning: Task is being treated as REGRESSION because either class_names argument was not supplied or is_regression=True. If this is incorrect, change accordingly.
  'either class_names argument was not supplied or is_regression=True. ' + \
Found 500 images belonging to 1 classes.
Found 450 validated image filenames.
Found 50 validated image filenames.
The normalization scheme has been changed for use with a pretrained_resnet50 model. If you decide to use a different model, please reload your dataset with a ktrain.vision.data.images_from* function.

Is Multi-Label? False
Is Regression? True
pretrained_resnet50 model created.


begin training using onecycle policy with max lr of 0.0001...
Epoch 1/5
8/8 [==============================] - 37s 3s/step - loss: 50.1007 - mae: 6.0489 - val_loss: 20.4288 - val_mae: 3.7329
Epoch 2/5
8/8 [==============================] - 25s 1s/step - loss: 20.8619 - mae: 3.7150 - val_loss: 20.7194 - val_mae: 3.7453
Epoch 3/5
8/8 [==============================] - 31s 1s/step - loss: 16.9383 - mae: 3.2302 - val_loss: 33.2414 - val_mae: 4.8714
Epoch 4/5
8/8 [==============================] - 31s 1s/step - loss: 18.9736 - mae: 3.4415 - val_loss: 16.3823 - val_mae: 3.3656
Epoch 5/5
8/8 [==============================] - 31s 1s/step - loss: 17.4203 - mae: 3.4033 - val_loss: 16.2066 - val_mae: 3.3439
Found 500 images belonging to 1 classes.
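
STEP 2 and STEP 3 run the same pipeline with a different label column, so the steps can be folded into one function. The sketch below only reuses the ktrain calls already shown in this notebook; the function name train_and_predict and its parameter names are hypothetical. The imports sit inside the function so that defining it does not require ktrain to be installed.

```python
def train_and_predict(train_df, label, train_dir, test_dir,
                      net="pretrained_resnet50", freeze=15, epochs=5,
                      size=(224, 224), lr=1e-4, batch_size=64):
    """Train an image-regression model for `label` and predict on `test_dir`.

    Returns the list of (filename, prediction) tuples from predict_folder.
    """
    import ktrain
    from ktrain import vision as vis

    # Same augmentation and data loading as in the cells above.
    data_aug = vis.get_data_aug(horizontal_flip=True)
    train_data, val_data, preproc = vis.data.images_from_df(
        train_df, data_aug=data_aug, image_column="filename",
        label_columns=[label], directory=train_dir, is_regression=True,
        target_size=size, color_mode="rgb", random_state=42)

    # Build the model, freeze the first `freeze` layers, train with 1cycle.
    model = vis.image_regression_model(net, train_data, val_data)
    learner = ktrain.get_learner(model=model, train_data=train_data,
                                 val_data=val_data, workers=8,
                                 use_multiprocessing=False,
                                 batch_size=batch_size)
    learner.freeze(freeze)
    learner.fit_onecycle(lr, epochs)

    predictor = ktrain.get_predictor(learner.model, preproc)
    return predictor.predict_folder(test_dir)
```

A call such as train_and_predict(train_df, LABEL2, TRAIN_DATA_DIR, TEST_DATA_DIR) would then stand in for the whole STEP 3 cell, apart from writing the result into the submission table.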

STEP 4: Submit

In [28]:
submission.to_csv("KT_Reg_submission.csv", index=False)
submission
Out[28]:
filename barrels_count pigs_count
0 0501.png 9 6
1 0502.png 6 4
2 0503.png 9 7
3 0504.png 7 6
4 0505.png 10 6
... ... ... ...
495 0996.png 8 5
496 0997.png 9 5
497 0998.png 11 3
498 0999.png 12 6
499 1000.png 8 2

500 rows × 3 columns
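
Before submitting, it can help to sanity-check the file. A minimal sketch, assuming the expected layout is the one shown above (these three columns, with non-negative integer counts); check_submission is a hypothetical helper and the real grader may apply other rules:

```python
import csv
import io


def check_submission(csv_file, expected_rows=None):
    """Return a list of problems found in a submission CSV (empty if none)."""
    expected_cols = ["filename", "barrels_count", "pigs_count"]
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != expected_cols:
        return [f"unexpected columns: {reader.fieldnames}"]
    problems = []
    rows = list(reader)
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        for col in expected_cols[1:]:
            value = row[col]
            if value is None or not value.isdigit():
                problems.append(
                    f"line {line_no}: {col}={value!r} is not a non-negative integer")
    if expected_rows is not None and len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    return problems


# In-memory example with one invalid (negative) count.
sample = io.StringIO(
    "filename,barrels_count,pigs_count\n0501.png,9,6\n0502.png,6,-4\n")
print(check_submission(sample))
```

For the real file, open it with open("KT_Reg_submission.csv", newline="") and pass expected_rows=500.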

In [29]:
%aicrowd submission create -c hey-barrels -f KT_Reg_submission.csv
Submission limit reached for your account, it will reset at 2021-03-01 05:56:57 UTC
In [30]:
try:
    from google.colab import files
    files.download('KT_Reg_submission.csv')
except ImportError:
    print("Only for Colab")