
Scene Segmentation

Solution for 0.975 Scene Segmentation Using DeepLabV3Plus

A detailed solution for challenge Scene Segmentation

g_mothy

## Points covered in this Notebook 

1. Split the data into train & validation dataset

2. DeepLabV3+ Model

3. Save the best model based on F1 Score Metric

4. Analyse the predictions to understand the performance of the model.

Segmentation Model

Thanks to Shubhamaicrowd for the Starter Notebook. This notebook is an enhanced version inspired by it.

Note: The semantic segmentation dataset was generated using the Carla Simulator. The dataset contains 23 different classes.

Downloading Dataset

Installing aicrowd-cli

In [1]:
!pip install aicrowd-cli
%load_ext aicrowd.magic
Collecting aicrowd-cli
  Downloading aicrowd_cli-0.1.10-py3-none-any.whl (44 kB)
     |████████████████████████████████| 44 kB 2.4 MB/s 
Requirement already satisfied: click<8,>=7.1.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (7.1.2)
Collecting GitPython==3.1.18
  Downloading GitPython-3.1.18-py3-none-any.whl (170 kB)
     |████████████████████████████████| 170 kB 16.7 MB/s 
Collecting requests-toolbelt<1,>=0.9.1
  Downloading requests_toolbelt-0.9.1-py2.py3-none-any.whl (54 kB)
     |████████████████████████████████| 54 kB 2.7 MB/s 
Collecting rich<11,>=10.0.0
  Downloading rich-10.9.0-py3-none-any.whl (211 kB)
     |████████████████████████████████| 211 kB 36.7 MB/s 
Requirement already satisfied: toml<1,>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.10.2)
Collecting pyzmq==22.1.0
  Downloading pyzmq-22.1.0-cp37-cp37m-manylinux1_x86_64.whl (1.1 MB)
     |████████████████████████████████| 1.1 MB 36.0 MB/s 
Collecting requests<3,>=2.25.1
  Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB)
     |████████████████████████████████| 62 kB 786 kB/s 
Requirement already satisfied: tqdm<5,>=4.56.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (4.62.0)
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.7-py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 1.7 MB/s 
Requirement already satisfied: typing-extensions>=3.7.4.0 in /usr/local/lib/python3.7/dist-packages (from GitPython==3.1.18->aicrowd-cli) (3.7.4.3)
Collecting smmap<5,>=3.0.1
  Downloading smmap-4.0.0-py2.py3-none-any.whl (24 kB)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.10)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.24.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2021.5.30)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.6.1)
Collecting colorama<0.5.0,>=0.4.0
  Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Collecting commonmark<0.10.0,>=0.9.0
  Downloading commonmark-0.9.1-py2.py3-none-any.whl (51 kB)
     |████████████████████████████████| 51 kB 6.5 MB/s 
Installing collected packages: smmap, requests, gitdb, commonmark, colorama, rich, requests-toolbelt, pyzmq, GitPython, aicrowd-cli
  Attempting uninstall: requests
    Found existing installation: requests 2.23.0
    Uninstalling requests-2.23.0:
      Successfully uninstalled requests-2.23.0
  Attempting uninstall: pyzmq
    Found existing installation: pyzmq 22.2.1
    Uninstalling pyzmq-22.2.1:
      Successfully uninstalled pyzmq-22.2.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.26.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
Successfully installed GitPython-3.1.18 aicrowd-cli-0.1.10 colorama-0.4.4 commonmark-0.9.1 gitdb-4.0.7 pyzmq-22.1.0 requests-2.26.0 requests-toolbelt-0.9.1 rich-10.9.0 smmap-4.0.0
In [2]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/PzJXke3oUUuScT42c62VAXKiVmY7cgAF3qDjJE3WHr4
API Key valid
Saved API Key successfully!
In [3]:
!rm -rf data
!mkdir data
%aicrowd ds dl -c scene-segmentation -o data
In [4]:
!unzip data/train.zip -d data/train > /dev/null
!unzip data/test.zip -d data/test > /dev/null

Downloading & Importing Libraries

Here we are going to use segmentation_models.pytorch, a really popular library providing tons of different segmentation models for PyTorch, ranging from basic UNets to DeepLabV3+!

In [5]:
!pip install git+https://github.com/qubvel/segmentation_models.pytorch
Collecting git+https://github.com/qubvel/segmentation_models.pytorch
  Cloning https://github.com/qubvel/segmentation_models.pytorch to /tmp/pip-req-build-fjq1bk0_
  Running command git clone -q https://github.com/qubvel/segmentation_models.pytorch /tmp/pip-req-build-fjq1bk0_
Requirement already satisfied: torchvision>=0.5.0 in /usr/local/lib/python3.7/dist-packages (from segmentation-models-pytorch==0.2.0) (0.10.0+cu102)
Collecting pretrainedmodels==0.7.4
  Downloading pretrainedmodels-0.7.4.tar.gz (58 kB)
     |████████████████████████████████| 58 kB 4.2 MB/s 
Collecting efficientnet-pytorch==0.6.3
  Downloading efficientnet_pytorch-0.6.3.tar.gz (16 kB)
Collecting timm==0.4.12
  Downloading timm-0.4.12-py3-none-any.whl (376 kB)
     |████████████████████████████████| 376 kB 12.7 MB/s 
Requirement already satisfied: torch in /usr/local/lib/python3.7/dist-packages (from efficientnet-pytorch==0.6.3->segmentation-models-pytorch==0.2.0) (1.9.0+cu102)
Collecting munch
  Downloading munch-2.5.0-py2.py3-none-any.whl (10 kB)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from pretrainedmodels==0.7.4->segmentation-models-pytorch==0.2.0) (4.62.0)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch->efficientnet-pytorch==0.6.3->segmentation-models-pytorch==0.2.0) (3.7.4.3)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torchvision>=0.5.0->segmentation-models-pytorch==0.2.0) (1.19.5)
Requirement already satisfied: pillow>=5.3.0 in /usr/local/lib/python3.7/dist-packages (from torchvision>=0.5.0->segmentation-models-pytorch==0.2.0) (7.1.2)
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from munch->pretrainedmodels==0.7.4->segmentation-models-pytorch==0.2.0) (1.15.0)
Building wheels for collected packages: segmentation-models-pytorch, efficientnet-pytorch, pretrainedmodels
  Building wheel for segmentation-models-pytorch (setup.py) ... done
  Created wheel for segmentation-models-pytorch: filename=segmentation_models_pytorch-0.2.0-py3-none-any.whl size=88635 sha256=96b1752fbc41c6e7c82da9b1651dfccbd5c58c36d2eb3526876e99ba38f452ed
  Stored in directory: /tmp/pip-ephem-wheel-cache-dijt339v/wheels/fa/c5/a8/1e8af6cb04a0974db8a4a156ebd2fdd1d99ad2558d3fce49d4
  Building wheel for efficientnet-pytorch (setup.py) ... done
  Created wheel for efficientnet-pytorch: filename=efficientnet_pytorch-0.6.3-py3-none-any.whl size=12421 sha256=2dad6940fb666f15dc6330f52e76c1d68d13ef3deeb7dcbd3109801356510a24
  Stored in directory: /root/.cache/pip/wheels/90/6b/0c/f0ad36d00310e65390b0d4c9218ae6250ac579c92540c9097a
  Building wheel for pretrainedmodels (setup.py) ... done
  Created wheel for pretrainedmodels: filename=pretrainedmodels-0.7.4-py3-none-any.whl size=60965 sha256=3b669a317c54f1883cf5e2312b5347ca7bd0659ed1e94a0669b24527a0a7b792
  Stored in directory: /root/.cache/pip/wheels/ed/27/e8/9543d42de2740d3544db96aefef63bda3f2c1761b3334f4873
Successfully built segmentation-models-pytorch efficientnet-pytorch pretrainedmodels
Installing collected packages: munch, timm, pretrainedmodels, efficientnet-pytorch, segmentation-models-pytorch
Successfully installed efficientnet-pytorch-0.6.3 munch-2.5.0 pretrainedmodels-0.7.4 segmentation-models-pytorch-0.2.0 timm-0.4.12
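
With the library installed, every architecture is created through the same constructor pattern. A minimal sketch (the encoder choice below is just an example, not the configuration used later in this notebook):

In [ ]:
import segmentation_models_pytorch as smp

# Any smp architecture (Unet, DeepLabV3Plus, ...) takes the same arguments.
model = smp.Unet(
    encoder_name="resnet34",       # example backbone; many encoders are available
    encoder_weights="imagenet",    # pretrained encoder weights
    in_channels=3,                 # RGB input
    classes=23,                    # one output channel per class
)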
In [6]:
!pip install natsort
Requirement already satisfied: natsort in /usr/local/lib/python3.7/dist-packages (5.5.0)
In [7]:
# PyTorch
import torch
from torch import nn
import segmentation_models_pytorch as smp
from torch.utils.data import Dataset, DataLoader

# Reading the dataset, visualization and miscellaneous
from PIL import Image
import matplotlib.pyplot as plt
import os
import numpy as np
from natsort import natsorted
from tqdm.notebook import tqdm
import cv2
import pandas as pd

from sklearn.model_selection import train_test_split

Create Train & Validation Dataset

Prepare the train & validation dataset

In [8]:
IMAGE_PATH = './data/train/image/'
MASK_PATH = './data/train/segmentation/'
In [9]:
n_classes = 23 

def create_df():
    name = []
    for dirname, _, filenames in os.walk(IMAGE_PATH):
        for filename in filenames:
            name.append(filename.split('.')[0])
    
    return pd.DataFrame({'id': name}, index = np.arange(0, len(name)))

df = create_df()
print('Total Images: ', len(df))
Total Images:  4000
In [10]:
#split data
X_train, X_val = train_test_split(df['id'].values, test_size=0.20, random_state=19)

print('Train Size   : ', len(X_train))
print('Val Size     : ', len(X_val))
Train Size   :  3200
Val Size     :  800
In [11]:
!mkdir dataset
!mkdir dataset/train
!mkdir dataset/train/image
!mkdir dataset/train/segmentation

!mkdir dataset/val
!mkdir dataset/val/image
!mkdir dataset/val/segmentation
In [12]:
import shutil
 
source = "./data/train"
destination = "dataset/train"

for i in os.listdir(os.path.join(source,"image")):
    if i.split('.')[0] in X_train:
        shutil.copy(os.path.join(source,"image",i), os.path.join(destination,"image",i))

for i in os.listdir(os.path.join(source,"segmentation")):
    if i.split('.')[0] in X_train:
        shutil.copy(os.path.join(source,"segmentation",i), os.path.join(destination,"segmentation",i))
        
source = "./data/train"
destination = "dataset/val"

for i in os.listdir(os.path.join(source,"image")):
    if i.split('.')[0] in X_val:
        shutil.copy(os.path.join(source,"image",i), os.path.join(destination,"image",i))

for i in os.listdir(os.path.join(source,"segmentation")):
    if i.split('.')[0] in X_val:
        shutil.copy(os.path.join(source,"segmentation",i), os.path.join(destination,"segmentation",i))
In [13]:
img = Image.open(IMAGE_PATH + df['id'][100] + '.jpg')
mask = Image.open(MASK_PATH + df['id'][100] + '.png')
print('Image Size', np.asarray(img).shape)
print('Mask Size', np.asarray(mask).shape)


plt.imshow(img)
plt.imshow(mask, alpha=0.6)
plt.title('Picture with Mask Applied')
plt.show()
Image Size (512, 512, 3)
Mask Size (512, 512)

Creating the Dataloader

In this section, we will create the dataloaders our model will use to load batches of corresponding features and labels for training & testing.

In [14]:
class SemanticSegmentationDataset(Dataset):
    
    def __init__(self, img_directory=None, label_directory=None, train=True):
        self.img_directory = img_directory
        self.label_directory = label_directory            

        if img_directory is not None:
            self.img_list = natsorted(os.listdir(img_directory))

        if train:
            self.label_list = natsorted(os.listdir(label_directory))

        self.train = train

        self.labels = list(range(0, 23))

    def __len__(self):
        return len(self.img_list)

    def __getitem__(self, idx):

        # Reading the image
        img = Image.open(os.path.join(self.img_directory, self.img_list[idx]))
        img = img.convert("L")

        if self.train:

            # Reading the mask image
            mask = Image.open(os.path.join(self.label_directory, self.label_list[idx]))

            # mask.show()
            img = np.array(img, dtype=np.float32)
            mask = np.array(mask, dtype=np.float32)

            # Change image channel ordering
            img = img[np.newaxis, :, :]


            # Normalizing images
            img = torch.from_numpy(img)
            img = img.float()/255

            binary_mask = np.array([(mask == v) for v in list(self.labels)])
            binary_mask = np.stack(binary_mask, axis=-1).astype('float')

            mask_preprocessed = binary_mask.transpose(2, 0, 1)
            mask_preprocessed = torch.from_numpy(mask_preprocessed)

            return img, mask_preprocessed
        
        # If reading test dataset, only return image 
        else:
          
            img = np.array(img, dtype=np.float32)
            img = img[np.newaxis, :, :]
            # img = np.moveaxis(img, -1, 0)

            # Normalizing images
            img = torch.from_numpy(img)
            img = img.float()/255
          
            return img
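The per-class binary mask above is built with a Python list comprehension over all 23 classes. An equivalent (and typically faster) vectorized one-hot encoding, assuming the mask holds integer class IDs in [0, 22], might look like this:

In [ ]:
import numpy as np

mask = np.random.randint(0, 23, size=(512, 512))   # hypothetical integer mask
one_hot = np.eye(23, dtype=np.float32)[mask]       # shape (512, 512, 23)
one_hot = one_hot.transpose(2, 0, 1)               # shape (23, 512, 512)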
In [15]:
data_dir = "dataset"

# Creating the training dataset
train_dataset = SemanticSegmentationDataset(img_directory=os.path.join(data_dir,"train/image"), 
                                            label_directory=os.path.join(data_dir,"train/segmentation"))
train_loader = DataLoader(train_dataset, batch_size=4, num_workers=0, shuffle=False, drop_last=True)

# Creating the validation dataset
val_dataset = SemanticSegmentationDataset(img_directory=os.path.join(data_dir,"val/image"), 
                                            label_directory=os.path.join(data_dir,"val/segmentation"))
val_loader = DataLoader(val_dataset, batch_size=4, num_workers=0, shuffle=False, drop_last=True)
In [16]:
# Reading the image and corresponding segmentation
image_batch, segmentation_batch = next(iter(train_loader))

image_batch.shape, segmentation_batch.shape
Out[16]:
(torch.Size([4, 1, 512, 512]), torch.Size([4, 23, 512, 512]))

Visualizing Dataset

In [17]:
plt.rcParams["figure.figsize"] = (30,5)

# Going through each image and segmentation
for image, segmentation in zip(image_batch, segmentation_batch):

    # Change the channel ordering
    image = np.moveaxis(image.numpy()*255, 0, -1)

    # Showing the image
    plt.figure()
    plt.subplot(1,2,1)
    plt.imshow(image[:, :, 0])
    plt.subplot(1,2,2)
    plt.imshow(segmentation[9]*255)
    plt.show()
In [18]:
segmentation.shape
Out[18]:
torch.Size([23, 512, 512])

Creating the Model

Here we will set up the model architecture, optimizer and loss.

In [19]:
ENCODER = 'timm-efficientnet-b1'
ENCODER_WEIGHTS = 'imagenet'
ACTIVATION = "softmax2d" 
DEVICE = 'cuda'

# create segmentation model with pretrained encoder
model = smp.DeepLabV3Plus(
    encoder_name=ENCODER, 
    encoder_weights=ENCODER_WEIGHTS, 
    classes=len(train_dataset.labels),
    in_channels=1,
    activation=ACTIVATION,
)
Downloading: "https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/tf_efficientnet_b1_aa-ea7a6ee0.pth" to /root/.cache/torch/hub/checkpoints/tf_efficientnet_b1_aa-ea7a6ee0.pth

Hyperparameters & Metrics

In [21]:
# using DiceLoss
loss = smp.utils.losses.DiceLoss()

# using multiple metrics to train the model
metrics = [
    smp.utils.metrics.IoU(threshold=0.5),
    smp.utils.metrics.Fscore(threshold=0.5),
    smp.utils.metrics.Accuracy(threshold=0.5),
    smp.utils.metrics.Recall(threshold=0.5),
    smp.utils.metrics.Precision(threshold=0.5),
]

# Using SGD optimizer
#optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = torch.optim.Adam(params=model.parameters(), lr=0.01)
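
For reference, smp's DiceLoss is essentially 1 minus the Dice score, which measures the overlap between the soft prediction and the target mask. A minimal sketch of that score (an illustration, not the library implementation):

In [ ]:
def dice_score(pred, target, eps=1e-7):
    # pred, target: tensors of the same shape with values in [0, 1]
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)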

Training the Model

A previously saved model can be loaded here and training resumed.

In [22]:
# Setting up training epoch to train the model
train_epoch = smp.utils.train.TrainEpoch(
    model, 
    loss=loss,
    metrics=metrics, 
    optimizer=optimizer,
    device=DEVICE,
    verbose=True,
    )

val_epoch = smp.utils.train.ValidEpoch(
    model, 
    loss=loss,
    metrics=metrics, 
    device=DEVICE,
    verbose=True,
    )

The model is saved whenever the validation F1 score improves.

In [ ]:
max_score = 0

for i in range(0, 10):
    
    print('\nEpoch: {}'.format(i))
    train_logs = train_epoch.run(train_loader)
    valid_logs = val_epoch.run(val_loader) 
    
    # do something (save model, change lr, etc.)
    if max_score < valid_logs['fscore']:
        max_score = valid_logs['fscore']
        torch.save(model, 'best_model.pth')
        print('Model saved!')
        
    if i == 5:
        # reduce the learning rate after epoch 5 (applies to all parameters)
        optimizer.param_groups[0]['lr'] = 1e-5
        print('Decreased learning rate to 1e-5!')
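
The manual learning-rate drop at epoch 5 can also be expressed with one of PyTorch's built-in schedulers. A sketch using ReduceLROnPlateau, driven by the same validation F1 score:

In [ ]:
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', factor=0.1, patience=2)

# inside the epoch loop, after validation:
#     scheduler.step(valid_logs['fscore'])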

Generating Predictions

In [42]:
# Creating the testing dataset
data_dir_ = "./data"

test_dataset = SemanticSegmentationDataset(img_directory=os.path.join(data_dir_,"test/image"), train=False)
test_loader = DataLoader(test_dataset, batch_size=1, num_workers=2, shuffle=False, drop_last=False)

Load the best model

In [43]:
# load the best model
model = torch.load("./best_model.pth")
In [44]:
# Generating Model Predictions
!rm -rf segmentation
!mkdir segmentation

for n, batch in enumerate(tqdm(test_loader)):

    # Getting the predictions
    predictions = model.predict(batch.to(DEVICE)).cpu() 
  
    # Converting the predictions to the right format
    prediction_mask = predictions.squeeze().numpy()
    prediction_mask = np.transpose(prediction_mask, (1, 2, 0))

    # Getting individual channel and combining them into single image
    prediction_mask_gray = np.zeros((prediction_mask.shape[0],prediction_mask.shape[1]))
    for ii in range(prediction_mask.shape[2]):
        prediction_mask_gray = prediction_mask_gray + ii*prediction_mask[:,:,ii].round()


    # Saving the image
    prediction_mask_gray = Image.fromarray(prediction_mask_gray.astype(np.uint8))
    prediction_mask_gray.save(os.path.join("segmentation", f"{n}.png"))
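
The channel-combination loop above relies on the softmax rarely pushing two classes past 0.5 for the same pixel. An argmax over the class dimension expresses the same intent more directly and avoids any overlap ambiguity; a sketch:

In [ ]:
# Equivalent single-call alternative to the loop above:
prediction_mask_gray = prediction_mask.argmax(axis=-1).astype(np.uint8)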

Analyze the Predictions

In [30]:
# Creating the testing dataset
data_dir_ = "./data"

test_dataset = SemanticSegmentationDataset(img_directory=os.path.join(data_dir_,"test/image"),
                                           label_directory=os.path.join("segmentation"))
test_loader = DataLoader(test_dataset, batch_size=4, num_workers=0, shuffle=False, drop_last=False)
In [31]:
# Reading the image and corresponding segmentation
image_batch, segmentation_batch = next(iter(test_loader))

image_batch.shape, segmentation_batch.shape
Out[31]:
(torch.Size([4, 1, 512, 512]), torch.Size([4, 23, 512, 512]))
In [33]:
plt.rcParams["figure.figsize"] = (30,5)

# Going through each image and segmentation
for image, segmentation in zip(image_batch, segmentation_batch):

    # Change the channel ordering
    image = np.moveaxis(image.numpy()*255, 0, -1)

    # Showing the image
    plt.figure()
    plt.subplot(1,2,1)
    plt.imshow(image[:, :, 0])
    plt.subplot(1,2,2)
    plt.imshow(segmentation[1]*255)
    plt.show()