ISWC 2019 Column-Type Annotation (CTA) Challenge
This is a task of the ISWC 2019 “Semantic Web Challenge on Tabular Data to Knowledge Graph Matching”. The goal is to annotate an entity column (i.e., a column composed of phrases) in a table with classes of the DBpedia ontology. Click here for the official challenge website.
The task is to annotate each of the given entity columns with classes of the DBpedia ontology. Annotation classes must come from the DBpedia ontology (excluding owl:Thing and owl:Agent). Each column can be annotated with multiple classes: a class that is as fine-grained as possible while being correct for all cells of the column is regarded as a perfect annotation; a class that is an ancestor of the perfect annotation is regarded as an okay annotation; all other classes are regarded as wrong annotations. Class names are matched case-insensitively.
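The perfect/okay/wrong distinction can be illustrated with a minimal Python sketch. The `PARENT` map below is a small hand-written fragment of a class hierarchy used only for illustration, and the names `ancestors` and `judge` are hypothetical; this is not the organizers' evaluator.

```python
from typing import Dict, Set

DBO = "http://dbpedia.org/ontology/"

# Illustrative child -> parent fragment (an assumption, not the full ontology).
PARENT: Dict[str, str] = {
    DBO + "Country": DBO + "PopulatedPlace",
    DBO + "PopulatedPlace": DBO + "Place",
}


def ancestors(cls: str, parent: Dict[str, str]) -> Set[str]:
    """Return all strict ancestors of cls by walking the parent map."""
    seen: Set[str] = set()
    current = parent.get(cls)
    while current is not None and current not in seen:
        seen.add(current)
        current = parent.get(current)
    return seen


def judge(annotation: str, perfect: str, parent: Dict[str, str]) -> str:
    """Classify one annotation as 'perfect', 'okay' or 'wrong'."""
    # Comparison is case-insensitive, so lowercase everything first.
    lowered = {k.lower(): v.lower() for k, v in parent.items()}
    a, p = annotation.lower(), perfect.lower()
    if a == p:
        return "perfect"
    if a in ancestors(p, lowered):
        return "okay"
    return "wrong"


if __name__ == "__main__":
    for candidate in ("Country", "place", "Person"):
        print(candidate, judge(DBO + candidate, DBO + "Country", PARENT))
```

With `Country` as the perfect class, `Place` is judged okay because it is an ancestor, while an unrelated class such as `Person` is judged wrong.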
Each submission should be a CSV file. Each line should identify a column by its table ID and column ID, followed by its class annotations. That is, each line should contain three fields: “Table ID”, “Column ID” and “DBpedia classes”. The header row should be excluded from the submission file. Annotation classes should be separated by spaces, and their order does not matter. Here is an example line: "9206866_1_8114610355671172497","0","http://dbpedia.org/ontology/Country http://dbpedia.org/ontology/PopulatedPlace http://dbpedia.org/ontology/Place"
1) Table ID does not include the file name extension; make sure you remove the .csv extension from the filename.
2) Column ID is the position of the column in the input, starting from 0, i.e., first column’s ID is 0.
3) In Round 1, only perfect annotations score; in Round 2, both perfect annotations and okay annotations score.
4) One submission file should have NO duplicate lines (annotations) for one target column.
5) Annotations for columns out of the target columns are ignored.
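The submission format and notes above can be sketched in Python with the standard `csv` module. The helper names `submission_row` and `write_submission` are hypothetical, and quoting every field is an assumption based on the example line; the organizers may accept other CSV quoting styles.

```python
import csv
import io
import os
from typing import Iterable, List, TextIO


def submission_row(table_filename: str, column_id: int, classes: Iterable[str]) -> List[str]:
    """Build one submission line: table ID (no .csv), column ID, space-separated classes."""
    table_id = os.path.splitext(os.path.basename(table_filename))[0]
    return [table_id, str(column_id), " ".join(classes)]


def write_submission(rows: Iterable[List[str]], out: TextIO) -> None:
    """Write rows without a header, keeping only the first line per target column."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    seen = set()
    for row in rows:
        key = (row[0], row[1])  # (table ID, column ID)
        if key in seen:
            continue  # skip duplicate lines for the same column
        seen.add(key)
        writer.writerow(row)


if __name__ == "__main__":
    dbo = "http://dbpedia.org/ontology/"
    row = submission_row(
        "9206866_1_8114610355671172497.csv",
        0,
        [dbo + "Country", dbo + "PopulatedPlace", dbo + "Place"],
    )
    buffer = io.StringIO()
    write_submission([row, row], buffer)
    print(buffer.getvalue(), end="")
```

Passing the same row twice still yields a single output line, which matches the rule that a target column must not appear more than once.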
Data Description: Each table is stored in one CSV file, and each line corresponds to a table row. Note that the first row may be either the table header or content. The target columns for annotation are listed in a CSV file.
Evaluation Criteria [Round 1]
Precision, Recall and F1 Score will be calculated for ranking:
Precision = (# perfect annotations) / (# submitted annotations)
Recall = (# perfect annotations) / (# ground truth annotations)
F1 Score = (2 * Precision * Recall) / (Precision + Recall)
1) # denotes the number.
2) In the ground truth file, each target column is annotated with exactly one perfect class, so # ground truth annotations = # target columns.
3) F1 Score is used as the primary score; Precision is used as the secondary score.
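As a worked illustration of the Round 1 formulas, the following Python sketch computes the three scores from raw counts. The function name `round1_scores` and the choice to return 0.0 when a denominator is zero are assumptions made for this example, not part of the official evaluator.

```python
from typing import Tuple


def round1_scores(n_perfect: int, n_submitted: int, n_ground_truth: int) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) from annotation counts."""
    # Guard against empty submissions or empty ground truth (assumed convention).
    precision = n_perfect / n_submitted if n_submitted else 0.0
    recall = n_perfect / n_ground_truth if n_ground_truth else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts: 60 perfect out of 80 submitted, 100 target columns.
    p, r, f1 = round1_scores(60, 80, 100)
    print(f"Precision={p:.4f} Recall={r:.4f} F1={f1:.4f}")
```

With these hypothetical counts, Precision is 60/80 = 0.75, Recall is 60/100 = 0.60, and F1 is 2 * 0.75 * 0.60 / 1.35, roughly 0.667.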
Evaluation Criteria [Round 2]
Two metrics, Average Hierarchical Score (AH-Score) and Average Perfect Score (AP-Score), are calculated for ranking:
AH-Score = (1 * (# perfect annotations) + 0.5 * (# okay annotations) - 1 * (# wrong annotations)) / (# target columns)
AP-Score = (# perfect annotations) / (# total annotated classes)
1) # denotes the number.
2) AH-Score is used as the primary score; AP-Score is used as the secondary score.
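The Round 2 metrics can be sketched the same way. This example assumes that "# total annotated classes" equals the sum of perfect, okay and wrong annotations in the submission; that reading, the function name `round2_scores`, and the zero-denominator handling are assumptions rather than the official definition.

```python
from typing import Tuple


def round2_scores(n_perfect: int, n_okay: int, n_wrong: int, n_target_columns: int) -> Tuple[float, float]:
    """Return (ah_score, ap_score) from annotation counts."""
    # Assumption: total annotated classes = perfect + okay + wrong.
    n_total = n_perfect + n_okay + n_wrong
    ah_score = (
        (1.0 * n_perfect + 0.5 * n_okay - 1.0 * n_wrong) / n_target_columns
        if n_target_columns
        else 0.0
    )
    ap_score = n_perfect / n_total if n_total else 0.0
    return ah_score, ap_score


if __name__ == "__main__":
    # Hypothetical counts: 50 perfect, 40 okay, 10 wrong over 100 target columns.
    ah, ap = round2_scores(50, 40, 10, 100)
    print(f"AH-Score={ah:.2f} AP-Score={ap:.2f}")
```

With these hypothetical counts, AH-Score is (50 + 20 - 10) / 100 = 0.60 and AP-Score is 50 / 100 = 0.50. Because wrong annotations carry a penalty of 1, AH-Score can be negative for a submission dominated by wrong classes.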
SIRIUS and IBM Research sponsor the prizes for the best systems.
The prize winners will be announced during the ISWC conference (on October 30, 2019). We will take into account all evaluation rounds, especially those running until the conference dates.
Participants are encouraged to submit a system paper describing their tool and the obtained results. Papers will be published online as a volume of CEUR-WS as well as indexed on DBLP. By submitting a paper, the authors accept the CEUR-WS and DBLP publishing rules.
Please see our official website for additional information.