
# SemTab 2020 Columns-Property Annotation (CPA) Challenge

This is a task of the ISWC 2020 SemTab challenge (Semantic Web Challenge on Tabular Data to Knowledge Graph Matching). The task is to annotate column pairs (two ordered columns) within a table with properties defined by a knowledge graph (KG) such as DBpedia and Wikidata.

Each submission should contain at most one property annotation per column pair, which is identified by a table ID, a head column ID, and a tail column ID. Each column pair should be annotated with a property that is as fine-grained as possible while still being correct. Both object properties and data properties are possible. The annotation should be the property's full URI; case is NOT sensitive.

Briefly, each line of the submission file should include "table ID", "head column ID", "tail column ID", and "property URI". The header should be excluded from the submission file. Here is a one-line example for DBpedia: "50245608_0_871275842592178099","0","1","http://dbpedia.org/ontology/releaseDate". Another example for Wikidata: "KIN0LD6C","0","1","http://www.wikidata.org/prop/direct/P131".

Notes:

1) Table ID is the filename of the table data, but does not include the extension.

2) Column ID is the position of the column in the table file, starting from 0, i.e., the first column’s ID is 0.

3) At most one property should be annotated for one column pair.

4) One submission file should have NO duplicate annotations for one column pair.

5) Annotations for column pairs outside the targets are ignored.
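The format rules above can be sketched in Python as follows. This is only an illustrative sketch, not an official tool: the function name `write_submission` and the input tuples are hypothetical, and quoting every field is an assumption based on the example lines shown above. The sketch writes no header, keeps only the first annotation for each column pair, and skips pairs with an empty property.

```python
import csv
import io


def write_submission(annotations, out):
    """Write (table_id, head_col, tail_col, property_uri) tuples as CSV lines.

    Hypothetical helper: no header row, every field quoted, at most one
    annotation per column pair, and empty annotations left out.
    """
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    seen = set()
    for table_id, head, tail, uri in annotations:
        key = (table_id, int(head), int(tail))
        if not uri or key in seen:
            continue  # skip empty annotations and repeated column pairs
        seen.add(key)
        writer.writerow([table_id, head, tail, uri])


# Illustrative input; the second and third rows are made up for the demo.
rows = [
    ("KIN0LD6C", 0, 1, "http://www.wikidata.org/prop/direct/P131"),
    ("KIN0LD6C", 0, 1, "http://www.wikidata.org/prop/direct/P17"),  # same pair again
    ("KIN0LD6C", 0, 2, ""),  # empty annotation
]
buf = io.StringIO()
write_submission(rows, buf)
submission_text = buf.getvalue()
print(submission_text, end="")
```

In practice `out` would be a file opened with `newline=""` rather than an in-memory buffer.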

## Datasets

Table set for Round #1: Tables, Target Column Pairs, KG: Wikidata

Table set for Round #2: Tables, Target Column Pairs

Table set for Round #3: Tables, Target Column Pairs

Table set for Round #4: Tables, Target Column Pairs

Data Description: The tables for Round #1 are generated from Wikidata (Version: March 5, 2020). Each table is stored in its own CSV file, and each line corresponds to a table row. Note that the first row may be either the table header or content. The target column pairs for annotation are saved in a separate CSV file.
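As a rough sketch of how this data might be loaded in Python, the snippet below reads one table and one target file. The helper names are hypothetical, and the layout of the target file (table ID, head column ID, tail column ID, no header) is an assumption that should be checked against the downloaded files. The table contents are made up for the demo.

```python
import csv
import tempfile
from pathlib import Path


def load_table(path):
    """Return (table_id, rows); the table ID is the filename without its extension."""
    path = Path(path)
    with path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    # The first row may be a header or content, so it is kept as-is.
    return path.stem, rows


def load_targets(path):
    """Assumed layout: table ID, head column ID, tail column ID per line."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        return [(t, int(h), int(c)) for t, h, c in csv.reader(f)]


# Demo with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    table_file = Path(tmp) / "KIN0LD6C.csv"
    table_file.write_text("col0,col1\nvalue a,value b\n", encoding="utf-8")
    targets_file = Path(tmp) / "targets.csv"
    targets_file.write_text('"KIN0LD6C","0","1"\n', encoding="utf-8")

    table_id, rows = load_table(table_file)
    targets = load_targets(targets_file)

print(table_id, len(rows), targets)
```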

## Evaluation Criteria

Precision, Recall and F1_Score will be calculated:

$$Precision = {correct\ annotations\ \# \over annotations\ \#}$$

$$Recall = {correct\ annotations\ \# \over target\ column\ pairs\ \#}$$

$$F1\_Score = {2 \times Precision \times Recall \over Precision + Recall}$$

Notes:

1) # denotes the number.

2) F1_Score is used as the primary score; Precision is used as the secondary score.

3) An empty annotation for a column pair still counts as an annotated column pair; we suggest excluding column pairs with empty annotations from the submission file.
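The three measures can be reproduced locally with a short Python sketch. This is not the official evaluator: the function `score` and the dictionaries are hypothetical, and it assumes a single ground-truth property per target column pair. It compares URIs case-insensitively, ignores annotations outside the targets, and counts an empty annotation as a (wrong) annotation.

```python
def score(submitted, ground_truth):
    """Both arguments map (table_id, head_col, tail_col) to a property URI."""
    # Annotations for column pairs outside the targets are ignored.
    in_target = {k: v for k, v in submitted.items() if k in ground_truth}
    correct = sum(
        1 for k, v in in_target.items() if v.lower() == ground_truth[k].lower()
    )
    precision = correct / len(in_target) if in_target else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: 4 target pairs, 3 in-target annotations, 2 of them correct.
P = "http://www.wikidata.org/prop/direct/"
ground_truth = {
    ("T1", 0, 1): P + "P131",
    ("T1", 0, 2): P + "P17",
    ("T2", 0, 1): P + "P19",
    ("T2", 0, 2): P + "P20",
}
submitted = {
    ("T1", 0, 1): P.upper() + "P131",  # correct, case differs
    ("T1", 0, 2): P + "P17",           # correct
    ("T2", 0, 1): P + "P131",          # wrong
    ("T9", 0, 1): P + "P131",          # not a target, ignored
}
precision, recall, f1 = score(submitted, ground_truth)
print(precision, recall, f1)
```

Here Precision is 2/3, Recall is 2/4, and F1_Score is 4/7.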

## Submission

1. Each participant may make at most 5 submissions per day in Rounds #1 and #2.

## Tentative Dates

1. Round #1: 26 May to 20 July

2. Round #2: 25 July to 30 Aug

3. Round #3: 3 September to 17 September

4. Round #4: 20 September to 4 October

## Rules

1. Selected systems with the best results will be invited to present their results during the ISWC conference and the Ontology Matching workshop.

2. Participants are encouraged to submit a system paper describing their tool and the obtained results. Papers will be published online as a volume of CEUR-WS as well as indexed on DBLP. By submitting a paper, the authors accept the CEUR-WS and DBLP publishing rules.