HardTable Column Property Annotation by Wikidata (HardTable-CPA-WD)
This is a task of Round 2 of the ISWC 2021 “Semantic Web Challenge on Tabular Data to Knowledge Graph Matching”. The goal is to annotate the relationships between columns in a table with Wikidata properties (version: 20210628). Click here for the official challenge website.
The task is to annotate each column pair with a property of Wikidata.
Each submission should contain annotations of target column pairs. Note that the order of the two columns matters. The annotated property should start with the prefix http://www.wikidata.org/prop/direct/. The comparison is NOT case-sensitive.
The submission file should be in CSV format. Each line should contain the annotation of one column pair, which is identified by a table ID, the first column ID and the second column ID. That is, each line should have four fields: “Table ID”, “Column ID 1”, “Column ID 2” and “Property IRI”. Each column pair should be annotated with at most one property. Headers should be excluded from the submission file. Here is an example: "OHGI1JNY","0","1","http://www.wikidata.org/prop/direct/P702". Please use the prefix http://www.wikidata.org/prop/direct/ instead of https://www.wikidata.org/wiki/, which is the prefix of the Wikidata page URL.
1) Table ID does not include filename extension; make sure you remove the .csv extension from the filename.
2) Column ID is the position of the column in the table file, starting from 0, i.e., first column’s ID is 0.
3) One submission file should have NO duplicate lines for one column pair.
4) Annotations for column pairs out of the targets are ignored.
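The format rules above can be sketched in Python as follows. This is a minimal illustration, not official challenge tooling; the helper names to_submission_rows and write_submission and the layout of the input dictionary are assumptions made for the example.

```python
import csv
import io

# Required property prefix, as stated in the task description.
PREFIX = "http://www.wikidata.org/prop/direct/"


def to_submission_rows(annotations):
    """Build submission rows from a hypothetical dict that maps
    (table_id, col1, col2) to a property ID such as "P702" or a full IRI."""
    unique = {}
    for (table_id, col1, col2), prop in annotations.items():
        # Note 1: the table ID must not carry the .csv extension.
        if table_id.lower().endswith(".csv"):
            table_id = table_id[:-4]
        # Prepend the required prefix unless the value already has it.
        iri = prop if prop.lower().startswith(PREFIX.lower()) else PREFIX + prop
        # Note 3: keying by the column pair keeps at most one line per pair.
        unique[(table_id, int(col1), int(col2))] = iri
    return [[t, str(c1), str(c2), iri] for (t, c1, c2), iri in unique.items()]


def write_submission(rows, fh):
    """Write the rows as quoted CSV without a header line."""
    writer = csv.writer(fh, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)


buf = io.StringIO()
write_submission(to_submission_rows({("OHGI1JNY.csv", 0, 1): "P702"}), buf)
print(buf.getvalue(), end="")
# prints "OHGI1JNY","0","1","http://www.wikidata.org/prop/direct/P702"
```

Because the order of the two columns matters, the pair (0, 1) and the pair (1, 0) are kept as separate keys in this sketch.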
Data Description: One table is stored in one CSV file. Each line corresponds to a table row. The first row may be either the table header or content. The target column pairs for annotation are saved in a CSV file.
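As a rough sketch of how a table file might be read, the Python snippet below extracts the values of one column pair using 0-based column positions. The function name read_column_pair is hypothetical, and the sketch makes no attempt to detect whether the first row is a header, since the description leaves that open.

```python
import csv
import io


def read_column_pair(fh, col1, col2):
    """Return (value in col1, value in col2) for every row of a table file.

    Column IDs are 0-based positions. Rows too short to contain both
    columns are skipped. The first row is returned like any other row,
    because it may be either a header or content.
    """
    reader = csv.reader(fh)
    needed = max(col1, col2)
    return [(row[col1], row[col2]) for row in reader if len(row) > needed]


# Illustrative in-memory table; real tables would be opened from their CSV files.
sample = io.StringIO("col0,col1,col2\nBerlin,Germany,3645000\n")
pairs = read_column_pair(sample, 0, 1)
```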
Precision, Recall and F1 Score are calculated as follows:
Precision = (Correct Annotations #) / (Submitted Annotations #)
Recall = (Correct Annotations #) / (Ground Truth Annotations #)
F1 Score = (2 * Precision * Recall) / (Precision + Recall)
1) # denotes the number.
2) F1 Score is used as the primary score; Precision is used as the secondary score.
3) Each target column pair has exactly one ground truth annotation, i.e., # ground truth annotations = # target column pairs.
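The metrics above can be illustrated with a short Python sketch. It is not the official evaluator: the function name score and the dictionary inputs are assumptions, and it simply applies the stated rules that out-of-target annotations are ignored and that property comparison is not case-sensitive.

```python
def score(submitted, ground_truth):
    """Compute (precision, recall, f1) for column pair annotations.

    Both arguments are hypothetical dicts that map
    (table_id, col1, col2) to a property IRI.
    """
    # Annotations for column pairs outside the targets are ignored.
    in_target = {k: v for k, v in submitted.items() if k in ground_truth}
    # Compare IRIs without regard to case.
    correct = sum(
        1 for k, v in in_target.items() if v.lower() == ground_truth[k].lower()
    )
    precision = correct / len(in_target) if in_target else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


P = "http://www.wikidata.org/prop/direct/"
truth = {("T1", 0, 1): P + "P702", ("T1", 0, 2): P + "P31"}
guess = {("T1", 0, 1): P + "p702", ("T9", 0, 1): P + "P17"}
precision, recall, f1 = score(guess, truth)
```

In this example one of the two target pairs is annotated correctly and the annotation for table T9 is ignored, so precision is 1.0 and recall is 0.5.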
IBM Research has promised to contribute to the prizes!
Each team can submit 10 times per day.
The prize winners will be announced during the ISWC conference (October 24 - 28, 2021). We will take into account all evaluation rounds, especially those running until the conference dates.
Participants are encouraged to submit a system paper describing their tool and the obtained results. Papers will be published online as a volume of CEUR-WS as well as indexed on DBLP. By submitting a paper, the authors accept the CEUR-WS and DBLP publishing rules.
Please see additional information at our official website.