🚀 Make your first submission using the starter-kit

👥 Competitions are more fun with friends. Find your teammates!

🕵️ Introduction

Enabling quick and precise search among millions of items on marketplaces is a key feature for e-commerce. Common text-based search engines often require several iterations and can prove unsuccessful unless exact product names are known. Image-based search provides a powerful alternative and can be particularly handy when a customer observes the desired product in real life, in movies, or in online media.

Recent progress in computer vision now provides rich and precise descriptors for visual content. The goal of this challenge is to benchmark and advance existing computer vision methods for the task of image-based product search. Our evaluation targets a real-world scenario where we use over 40k images for 9k products from real marketplaces. Example products include sandals and sunglasses, and their successful matching requires overcoming visual variations in images due to changing viewpoints, background clutter, and varying image quality and resolution.



The challenge is organized in conjunction with the Machines Can See Summit (MCS), which will be held in Dubai at the beginning of April 2023. The winners of the challenge will be invited to present their solutions at MCS 2023.

We hope this challenge will help advance novel algorithms for image retrieval and practical applications of computer vision to e-commerce.

🔎 Problem Statement

In this challenge, we separate product images into user photos and seller photos. User photos are typically snapshots of products taken with a phone camera in cluttered scenes. Such images differ substantially from seller photos, which are intended to represent products on marketplaces. We provide object bounding boxes to indicate the desired products in user photos and use these images and boxes as search queries. Given a search query, the goal of the algorithm is to find the correct product matches in the gallery of seller photos.

📁 Datasets

Test set

To simplify debugging and to enable local validation, we provide a development test set with images and ground truth labels. The local development test set and the public leaderboard test set share the same format, as described below.

Test set format

The test set contains two files: gallery.csv and queries.csv.

gallery.csv defines the database of images from marketplaces. Each row contains the following information:

  • seller_img_id - unique int32 identifier of the product image, used in the result ranking NumPy array;
  • img_path - path to the product image in the "data" folder.

queries.csv defines a set of user images that will be used as queries to search the database. Each row contains the following information:

  • user_img_id - unique int32 identifier of the user image, used in the result ranking NumPy array;
  • img_path - path to the user image in the "data" folder;
  • bbox_x, bbox_y, bbox_w, bbox_h - bounding box coordinates of the product in the user image.
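
As a rough illustration of how the query file might be consumed, the sketch below reads rows in the queries.csv layout described above and converts each bounding box into corner coordinates. It assumes that the file has a header row with the listed column names and that bbox_x, bbox_y denote the top-left corner in pixels; neither detail is confirmed by this page. The sample row and the helper names `read_queries` and `bbox_to_corners` are hypothetical.

```python
import csv
import io


def bbox_to_corners(x, y, w, h):
    """Convert (x, y, width, height) to (left, top, right, bottom).

    Assumption: (x, y) is the top-left corner of the box, in pixels.
    """
    return (x, y, x + w, y + h)


def read_queries(csv_file):
    """Parse rows of a queries.csv-like file into a list of dictionaries."""
    queries = []
    for row in csv.DictReader(csv_file):
        box = tuple(
            int(float(row[key]))
            for key in ("bbox_x", "bbox_y", "bbox_w", "bbox_h")
        )
        queries.append(
            {
                "user_img_id": int(row["user_img_id"]),
                "img_path": row["img_path"],
                "crop_box": bbox_to_corners(*box),
            }
        )
    return queries


# Hypothetical sample standing in for the real queries.csv.
sample = io.StringIO(
    "user_img_id,img_path,bbox_x,bbox_y,bbox_w,bbox_h\n"
    "0,queries/0.jpg,10,20,100,50\n"
)
queries = read_queries(sample)
print(queries[0]["crop_box"])
```

The (left, top, right, bottom) tuple is the box layout that Pillow's `Image.crop` accepts, so the query region could be cropped before feature extraction if that library is used.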

Training set

We do not provide a training dataset for this competition. Participants are invited to use publicly available data under common-use licenses or other public research datasets such as Products10K.

If you collect your own dataset and use it for training, we will ask you to make this dataset available to other participants in the relevant competition thread by March 9.

🚀 Submission

Your submission should return a NumPy array of shape N x 1000, where N is the number of query images and each row r holds the top-1000 list of gallery images, sorted by decreasing similarity to the query image r, r = 1…N.
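
One possible way to produce such a ranking array is sketched below. It assumes that each query and gallery image has already been mapped to a non-zero embedding vector by some model, and that cosine similarity is the similarity measure; both are illustrative choices, not requirements stated by the challenge. The function name `rank_gallery` and its parameters are hypothetical, and the starter kit may expect a different interface.

```python
import numpy as np


def rank_gallery(query_emb, gallery_emb, gallery_ids, top_k=1000):
    """Return an (N, k) int32 array of gallery ids, most similar first.

    query_emb:   (N, D) array of query embeddings (assumed non-zero rows)
    gallery_emb: (M, D) array of gallery embeddings (assumed non-zero rows)
    gallery_ids: length-M sequence of seller_img_id values
    """
    # L2-normalise rows so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    similarity = q @ g.T  # shape (N, M)

    # Keep at most top_k columns; a tiny gallery may hold fewer images.
    k = min(top_k, g.shape[0])
    order = np.argsort(-similarity, axis=1)[:, :k]
    return np.asarray(gallery_ids, dtype=np.int32)[order]


# Toy data: 2 queries, 3 gallery images, 2-dimensional embeddings.
query_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
gallery_emb = np.array([[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]])
ranking = rank_gallery(query_emb, gallery_emb, [10, 20, 30], top_k=2)
print(ranking)
```

A full sort with `np.argsort` keeps the sketch short; for a gallery of tens of thousands of images, a partial sort such as `np.argpartition` followed by sorting only the top-1000 candidates may be faster.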

Make your first submission using the starter kit 🚀!

🖊 Evaluation Metric

Submissions will be evaluated by the mean Average Precision (mAP) score for the retrieval task, where AP is defined as

\(AP@n = \frac{1}{GTP} \sum_{k=1}^{n} P@k \times rel@k\)

where GTP refers to the total number of ground truth positives, n refers to the total number of products you are interested in, P@k refers to the precision@k, and rel@k is a relevance function. The relevance function is an indicator function that equals 1 if the product at rank k is relevant and 0 otherwise. In our case, n = 1000.

The example below illustrates the AP calculation for a given query, Q, with GTP = 3.


The overall AP for this query is 0.7. Note that since there are only three GTP, AP@5 equals the overall AP.
For another query, Q, we could get a perfect AP of 1 if the returned list G' ranks every ground truth positive ahead of all non-relevant images.

Source: Breaking Down Mean Average Precision (mAP)
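
The AP definition in terms of GTP, P@k and rel@k can be sketched in a few lines of Python. The relevance pattern used below (relevant results at ranks 1, 4 and 5) is an assumption chosen so that the result matches the 0.7 quoted above; the figure from the original example is not reproduced here, so the actual pattern may differ. The function names are hypothetical and this is not the official evaluation code.

```python
def average_precision(relevance, gtp, n=1000):
    """AP@n for one query.

    relevance: ranked list of 0/1 flags, where 1 means the result at
               that rank is a correct match (rel@k)
    gtp:       total number of ground truth positives for the query
    """
    hits = 0
    total = 0.0
    for k, rel in enumerate(relevance[:n], start=1):
        if rel:
            hits += 1
            total += hits / k  # P@k, counted only where rel@k = 1
    return total / gtp if gtp else 0.0


def mean_average_precision(per_query):
    """mAP over an iterable of (relevance, gtp) pairs, one per query."""
    scores = [average_precision(rel, gtp) for rel, gtp in per_query]
    return sum(scores) / len(scores) if scores else 0.0


# Assumed pattern: hits at ranks 1, 4 and 5 with GTP = 3.
ap_example = average_precision([1, 0, 0, 1, 1], gtp=3)
# All three positives ranked first.
ap_perfect = average_precision([1, 1, 1, 0, 0], gtp=3)
print(round(ap_example, 3), ap_perfect)
```

For the assumed pattern the sum is 1/1 + 2/4 + 3/5 = 2.1, and dividing by GTP = 3 gives 0.7. When all positives are ranked first, every P@k term equals 1 and the AP is 1.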

📅 Timeline

Here's the timeline of the challenge:

  • Challenge start: January 16, 2023
  • Deadline for publication of training datasets by participants: March 9, 2023
  • Challenge end: March 16, 2023
  • Presentation of winning solutions at MCS: Beginning of April 2023

💰 Prizes

This challenge has a Leaderboard Prize Pool of USD 15,000.

Leaderboard Prizes
The leaderboard's top three teams or participants will receive the following prizes.

  • 🥇 1st on the leaderboard: USD 8000
  • 🥈 2nd on the leaderboard: USD 5000
  • 🥉 3rd on the leaderboard: USD 2000

The winners of the challenge will be invited to present their solutions at the Machines Can See summit in Dubai and will be awarded a travel grant.

📱 F.A.Q

Q: How many product bounding boxes should I expect for one query image? For example, if more than one product is depicted in the image.
A: Each query image has exactly one bounding box corresponding to the query object.

Q: Should products that differ only by color be considered the same or different? For example, the same bags with different leather colors.
A: The same. Two images of products should be considered a correct match if the products differ only by color.

Q: Can participants use re-ranking techniques in their solutions?
A: Yes, re-ranking can be used provided your submission respects the runtime constraints defined by the challenge.

For any further questions, please post on the Discourse Forum or send an email to: help@aicrowd.com