Loading

MASKD

[Getting Started Notebook] MASKD Challange

This is a Baseline Code to get you started with the challenge.

gauransh_k

AIcrowd-Logo

This dataset and notebook correspond to the MASKD Challenge being held on AIrowd.

Authors: Gauransh Kumar, Shraddhaa Mohan, Rohit Midha

Downloads and Installations

In [ ]:
# download the MaskRCNN repository
!git clone https://github.com/matterport/Mask_RCNN.git
Cloning into 'Mask_RCNN'...
remote: Enumerating objects: 956, done.
remote: Total 956 (delta 0), reused 0 (delta 0), pack-reused 956
Receiving objects: 100% (956/956), 125.23 MiB | 13.04 MiB/s, done.
Resolving deltas: 100% (565/565), done.
In [ ]:
# install
!cd Mask_RCNN; python setup.py install
WARNING:root:Fail load requirements file, so using default ones.
/usr/local/lib/python3.7/dist-packages/setuptools/dist.py:700: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
  % (opt, underscore_opt))
/usr/local/lib/python3.7/dist-packages/setuptools/dist.py:700: UserWarning: Usage of dash-separated 'license-file' will not be supported in future versions. Please use the underscore name 'license_file' instead
  % (opt, underscore_opt))
/usr/local/lib/python3.7/dist-packages/setuptools/dist.py:700: UserWarning: Usage of dash-separated 'requirements-file' will not be supported in future versions. Please use the underscore name 'requirements_file' instead
  % (opt, underscore_opt))
running install
running bdist_egg
running egg_info
creating mask_rcnn.egg-info
writing mask_rcnn.egg-info/PKG-INFO
writing dependency_links to mask_rcnn.egg-info/dependency_links.txt
writing top-level names to mask_rcnn.egg-info/top_level.txt
writing manifest file 'mask_rcnn.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'mask_rcnn.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/mrcnn
copying mrcnn/visualize.py -> build/lib/mrcnn
copying mrcnn/config.py -> build/lib/mrcnn
copying mrcnn/utils.py -> build/lib/mrcnn
copying mrcnn/model.py -> build/lib/mrcnn
copying mrcnn/__init__.py -> build/lib/mrcnn
copying mrcnn/parallel_model.py -> build/lib/mrcnn
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/mrcnn
copying build/lib/mrcnn/visualize.py -> build/bdist.linux-x86_64/egg/mrcnn
copying build/lib/mrcnn/config.py -> build/bdist.linux-x86_64/egg/mrcnn
copying build/lib/mrcnn/utils.py -> build/bdist.linux-x86_64/egg/mrcnn
copying build/lib/mrcnn/model.py -> build/bdist.linux-x86_64/egg/mrcnn
copying build/lib/mrcnn/__init__.py -> build/bdist.linux-x86_64/egg/mrcnn
copying build/lib/mrcnn/parallel_model.py -> build/bdist.linux-x86_64/egg/mrcnn
byte-compiling build/bdist.linux-x86_64/egg/mrcnn/visualize.py to visualize.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/mrcnn/config.py to config.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/mrcnn/utils.py to utils.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/mrcnn/model.py to model.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/mrcnn/__init__.py to __init__.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/mrcnn/parallel_model.py to parallel_model.cpython-37.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying mask_rcnn.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying mask_rcnn.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying mask_rcnn.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying mask_rcnn.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/mask_rcnn-2.1-py3.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing mask_rcnn-2.1-py3.7.egg
Copying mask_rcnn-2.1-py3.7.egg to /usr/local/lib/python3.7/dist-packages
Adding mask-rcnn 2.1 to easy-install.pth file

Installed /usr/local/lib/python3.7/dist-packages/mask_rcnn-2.1-py3.7.egg
Processing dependencies for mask-rcnn==2.1
Finished processing dependencies for mask-rcnn==2.1
In [ ]:
# install pycocotools
!pip uninstall -q pycocotools -y
!pip install -q git+https://github.com/waleedka/coco.git#subdirectory=PythonAPI
  Building wheel for pycocotools (setup.py) ... done
In [ ]:
!pip install numpy==1.17.5
!pip uninstall pycocotools
!pip install pycocotools --no-binary pycocotools
Requirement already satisfied: numpy==1.17.5 in /usr/local/lib/python3.7/dist-packages (1.17.5)
Found existing installation: pycocotools 2.0
Uninstalling pycocotools-2.0:
  Would remove:
    /usr/local/lib/python3.7/dist-packages/pycocotools-2.0.dist-info/*
    /usr/local/lib/python3.7/dist-packages/pycocotools/*
Proceed (y/n)? y
  Successfully uninstalled pycocotools-2.0
Collecting pycocotools
  Downloading pycocotools-2.0.4.tar.gz (106 kB)
     |████████████████████████████████| 106 kB 4.4 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from pycocotools) (1.17.5)
Requirement already satisfied: matplotlib>=2.1.0 in /usr/local/lib/python3.7/dist-packages (from pycocotools) (3.2.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (0.11.0)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (2.8.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (1.3.2)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools) (3.0.7)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib>=2.1.0->pycocotools) (1.16.0)
Building wheels for collected packages: pycocotools
  Building wheel for pycocotools (PEP 517) ... done
  Created wheel for pycocotools: filename=pycocotools-2.0.4-cp37-cp37m-linux_x86_64.whl size=265217 sha256=2490fad10dacb5f4bed898bebf44cbaa680f4d38639f7f5ed501c35b978bff47
  Stored in directory: /root/.cache/pip/wheels/a3/5f/fa/f011e578cc76e1fc5be8dce30b3eb9fd00f337e744b3bba59b
Successfully built pycocotools
Installing collected packages: pycocotools
Successfully installed pycocotools-2.0.4
In [ ]:
!pip install 'h5py==2.10.0' --force-reinstall
Collecting h5py==2.10.0
  Using cached h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl (2.9 MB)
Collecting six
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting numpy>=1.7
  Using cached numpy-1.21.5-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
Installing collected packages: six, numpy, h5py
  Attempting uninstall: six
    Found existing installation: six 1.16.0
    Uninstalling six-1.16.0:
      Successfully uninstalled six-1.16.0
  Attempting uninstall: numpy
    Found existing installation: numpy 1.17.5
    Uninstalling numpy-1.17.5:
      Successfully uninstalled numpy-1.17.5
  Attempting uninstall: h5py
    Found existing installation: h5py 2.10.0
    Uninstalling h5py-2.10.0:
      Successfully uninstalled h5py-2.10.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lucid 0.3.10 requires umap-learn, which is not installed.
tensorflow 1.15.2 requires gast==0.2.2, but you have gast 0.5.3 which is incompatible.
lucid 0.3.10 requires numpy<=1.19, but you have numpy 1.21.5 which is incompatible.
kapre 0.3.7 requires tensorflow>=2.0.0, but you have tensorflow 1.15.2 which is incompatible.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.27.1 which is incompatible.
google-colab 1.0.0 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.
Successfully installed h5py-2.10.0 numpy-1.21.5 six-1.16.0

Restart Runtime here!

In [ ]:
# install a stable keras version
!pip install -q keras==2.2.5
In [ ]:
!cd Mask_RCNN; wget -q https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5
In [ ]:
import sys
!{sys.executable} -m pip install aicrowd-cli
%load_ext aicrowd.magic
Requirement already satisfied: aicrowd-cli in /usr/local/lib/python3.7/dist-packages (0.1.14)
Requirement already satisfied: rich<11,>=10.0.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (10.16.2)
Requirement already satisfied: semver<3,>=2.13.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (2.13.0)
Requirement already satisfied: pyzmq==22.1.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (22.1.0)
Requirement already satisfied: python-slugify<6,>=5.0.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (5.0.2)
Requirement already satisfied: toml<1,>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.10.2)
Requirement already satisfied: requests<3,>=2.25.1 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (2.27.1)
Requirement already satisfied: tqdm<5,>=4.56.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (4.63.0)
Requirement already satisfied: requests-toolbelt<1,>=0.9.1 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.9.1)
Requirement already satisfied: GitPython==3.1.18 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (3.1.18)
Requirement already satisfied: click<8,>=7.1.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (7.1.2)
Requirement already satisfied: typing-extensions>=3.7.4.0 in /usr/local/lib/python3.7/dist-packages (from GitPython==3.1.18->aicrowd-cli) (3.10.0.2)
Requirement already satisfied: gitdb<5,>=4.0.1 in /usr/local/lib/python3.7/dist-packages (from GitPython==3.1.18->aicrowd-cli) (4.0.9)
Requirement already satisfied: smmap<6,>=3.0.1 in /usr/local/lib/python3.7/dist-packages (from gitdb<5,>=4.0.1->GitPython==3.1.18->aicrowd-cli) (5.0.0)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.7/dist-packages (from python-slugify<6,>=5.0.0->aicrowd-cli) (1.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.24.3)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.10)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.0.12)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2021.10.8)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.6.1)
Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (0.9.1)
Requirement already satisfied: colorama<0.5.0,>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (0.4.4)
In [ ]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/FK1bAw74uwATDWyHgyjRRE6iiyijafwB4F_B4ebWac4
API Key valid
Gitlab access token valid
Saved details successfully!

DOWNLOAD DATASET

In [ ]:
#Donwload the datasets
%aicrowd ds dl -c maskd
In [ ]:
!unzip -q train_images.zip
!unzip -q val_images.zip
!unzip -q test_images.zip

Imports

In [ ]:
%tensorflow_version 1.x
TensorFlow 1.x selected.
In [ ]:
import os
import sys
import itertools
import math
import logging
import json
import re
import random
import matplotlib

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.lines as lines
import imgaug.augmenters as iaa

from matplotlib.patches import Polygon
from collections import OrderedDict
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
from pycocotools import mask as maskUtils


# Root directory of the project
ROOT_DIR = os.path.abspath("/content/Mask_RCNN")

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn import utils
from mrcnn import visualize
from mrcnn.visualize import display_images
import mrcnn.model as modellib
from mrcnn.model import log
from mrcnn.config import Config

%matplotlib inline
Using TensorFlow backend.
In [ ]:
import tensorflow
print(tensorflow.__version__)
import keras
print(keras.__version__)
1.15.2
2.2.5

Explanatory Data Analysis

Let us Inspect the data we have first before moving on to building the model.

In [ ]:
##Create Mask Dataset Class
class MaskDataset(utils.Dataset):
    def load_dataset(self, dataset_dir,dtype,return_coco=True):
        """ Loads Mask dataset
            Params:
                - dataset_dir : root directory of the dataset (can point to the train/val folder)
                - dtype: specifies train or val
                - load_small : Boolean value which signals if the annotations for all the images need to be loaded into the memory,
                               or if only a small subset of the same should be loaded into memory
        """
        self.dtype = dtype
        if self.dtype=="train":
          annotation_path = os.path.join(dataset_dir, "train.json")
          image_dir = os.path.join(dataset_dir, "train_images")
        elif self.dtype=="val":
          annotation_path = os.path.join(dataset_dir, "val.json")
          image_dir = os.path.join(dataset_dir, "val_images")

        print("Annotation Path ", annotation_path)
        print("Image Dir ", image_dir)
        assert os.path.exists(annotation_path) and os.path.exists(image_dir)

        self.coco = COCO(annotation_path)
        self.image_dir = image_dir

        # Load all classes (Only Building in this version)
        classIds = self.coco.getCatIds()

        # Load all images
        image_ids = list(self.coco.imgs.keys())

        # register classes
        for _class_id in classIds:
            self.add_class("mask-detection", _class_id, self.coco.loadCats(_class_id)[0]["name"])

        # Register Images
        for _img_id in image_ids:
            assert(os.path.exists(os.path.join(image_dir, self.coco.imgs[_img_id]['file_name'])))
            self.add_image(
                "mask-detection", image_id=_img_id,
                path=os.path.join(image_dir, self.coco.imgs[_img_id]['file_name']),
                width=self.coco.imgs[_img_id]["width"],
                height=self.coco.imgs[_img_id]["height"],
                annotations=self.coco.loadAnns(self.coco.getAnnIds(
                                            imgIds=[_img_id],
                                            catIds=classIds,
                                            iscrowd=None)))

        if return_coco:
            return self.coco

    def load_mask(self, image_id):
        """ Loads instance mask for a given image
              This function converts mask from the coco format to a
              a bitmap [height, width, instance]
            Params:
                - image_id : reference id for a given image

            Returns:
                masks : A bool array of shape [height, width, instances] with
                    one mask per instance
                class_ids : a 1D array of classIds of the corresponding instance masks
        """

        image_info = self.image_info[image_id]
        assert image_info["source"] == "mask-detection"

        instance_masks = []
        class_ids = []
        annotations = self.image_info[image_id]["annotations"]
        # Build mask of shape [height, width, instance_count] and list
        # of class IDs that correspond to each channel of the mask.
        for annotation in annotations:
            class_id = self.map_source_class_id(
                "mask-detection.{}".format(annotation['category_id']))
            if class_id:
                m = self.annToMask(annotation,  image_info["height"],
                                                image_info["width"])
                # Some objects are so small that they're less than 1 pixel area
                # and end up rounded out. Skip those objects.
                if m.max() < 1:
                    continue

                # Ignore the notion of "is_crowd" as specified in the coco format
                # as we donot have the said annotation in the current version of the dataset

                instance_masks.append(m)
                class_ids.append(class_id)
        # Pack instance masks into an array
        if class_ids:
            mask = np.stack(instance_masks, axis=2)
            class_ids = np.array(class_ids, dtype=np.int32)
            return mask, class_ids
        else:
            # Call super class to return an empty mask
            return super(MaskDataset, self).load_mask(image_id)


    def image_reference(self, image_id):
        """Return a reference for a particular image

            Ideally you this function is supposed to return a URL
            but in this case, we will simply return the image_id
        """
        return "mask-detection::{}".format(image_id)
    # The following two functions are from pycocotools with a few changes.

    def annToRLE(self, ann, height, width):
        """
        Convert annotation which can be polygons, uncompressed RLE to RLE.
        :return: binary mask (numpy 2D array)
        """
        segm = ann['segmentation']
        if isinstance(segm, list):
            # polygon -- a single object might consist of multiple parts
            # we merge all parts into one mask rle code
            rles = maskUtils.frPyObjects(segm, height, width)
            rle = maskUtils.merge(rles)
        elif isinstance(segm['counts'], list):
            # uncompressed RLE
            rle = maskUtils.frPyObjects(segm, height, width)
        else:
            # rle
            rle = ann['segmentation']
        return rle

    def annToMask(self, ann, height, width):
        """
        Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
        :return: binary mask (numpy 2D array)
        """
        rle = self.annToRLE(ann, height, width)
        m = maskUtils.decode(rle)
        return m

The MaskRCNN repository works on configs. For trainin we create a config sa shown below.

In [ ]:
class MaskConfig(Config):
    """Configuration for training on data in MS COCO format.
    Derives from the base Config class and overrides values specific
    to the COCO dataset. Edit here to find optimum parameters
    """
    # Give the configuration a recognizable name
    NAME = "mask-detection"

    # We use a GPU with 12GB memory, which can fit two images.
    # Adjust down if you use a smaller GPU.
    IMAGES_PER_GPU = 2

    # Comment to train on 8 GPUs (default is 1)
    GPU_COUNT = 1

    BACKBONE = 'resnet50'
    
    # Number of classes (including background)
    NUM_CLASSES = 3  # 1 Background + 2 classes(mask/no_mask)

    STEPS_PER_EPOCH=150
    VALIDATION_STEPS=50
    MAX_GT_INSTANCES=35
    LEARNING_RATE=0.01
    IMAGE_MAX_DIM=256
    IMAGE_MIN_DIM=256
    MINI_MASK_SHAPE=(128,128)
In [ ]:
# Mask Dataset 
config = MaskConfig()
DATASET_DIR = "/content/"
In [ ]:
config.display()
Configurations:
BACKBONE                       resnet50
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     2
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 2
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  256
IMAGE_META_SIZE                15
IMAGE_MIN_DIM                  256
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [256 256   3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.01
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               35
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (128, 128)
NAME                           mask-detection
NUM_CLASSES                    3
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                150
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001


In [ ]:
dataset = MaskDataset()
dataset.load_dataset(DATASET_DIR, "train")
# Must call before using the dataset
dataset.prepare()

print("[INFO] Image Count: {}".format(len(dataset.image_ids)))
print("[INFO] Class Count: {}".format(dataset.num_classes))
for i, info in enumerate(dataset.class_info):
    print("{:3}. {:50}".format(i, info['name']))
Annotation Path  /content/train.json
Image Dir  /content/train_images
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
[INFO] Image Count: 679
[INFO] Class Count: 3
  0. BG                                                
  1. mask                                              
  2. no_mask                                           

Samples

Load and display some sample images and masks.

In [ ]:
image_ids = np.random.choice(dataset.image_ids, 4)
for image_id in image_ids:
    print("[INFO] Image ID: {}".format(image_id))
    image = dataset.load_image(image_id)
    mask, class_ids = dataset.load_mask(image_id)
    visualize.display_top_masks(image, mask, class_ids, dataset.class_names)
[INFO] Image ID: 7
[INFO] Image ID: 618
[INFO] Image ID: 249
[INFO] Image ID: 490

Bounding Boxes

Rather than using bounding box coordinates provided by the source datasets, we compute the bounding boxes from masks instead. This allows us to handle bounding boxes consistently regardless of the source dataset, and it also makes it easier to resize, rotate, or crop images because we simply generate the bounding boxes from the updated masks rather than computing bounding box transformation for each type of image transformation.

In [ ]:
# Load random image and mask.
image_id =  np.random.choice(dataset.image_ids, 1)[0]
image = dataset.load_image(image_id)
mask, class_ids = dataset.load_mask(image_id)
print("[INFO] Image Shape: {} \tClass ID : {}".format(mask.shape, class_ids))
# Compute Bounding box
bbox = utils.extract_bboxes(mask)

# Display image and additional stats
print("[INFO] Image ID: {} \tDataset Reference: {}".format(image_id, dataset.image_reference(image_id)))
log("[INFO] Image", image)
log("[INFO] Mask", mask)
log("[INFO] Class IDs", class_ids)
log("[INFO] BBOX", bbox)

# Display image and instances
visualize.display_instances(image, bbox, mask, class_ids, dataset.class_names)
[INFO] Image Shape: (1024, 932, 2) 	Class ID : [1 1]
[INFO] Image ID: 475 	Dataset Reference: mask-detection::475
[INFO] Image             shape: (1024, 932, 3)        min:    0.00000  max:  255.00000  uint8
[INFO] Mask              shape: (1024, 932, 2)        min:    0.00000  max:    1.00000  uint8
[INFO] Class IDs         shape: (2,)                  min:    1.00000  max:    1.00000  int32
[INFO] BBOX              shape: (2, 4)                min:   86.00000  max:  759.00000  int32