SemTab: Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, 2023, Location: Athens, Greece

Publication date: 2023-11-02
Pages: 21 - 37
Publisher: CEUR Workshop Proceedings

Author:

Dasoulas, Ioannis ; Yang, Duo ; Duan, Xuemin ; Dimou, Anastasia

Keywords:

STG/21/058#56765259, 4609 Information systems

Abstract:

An abundance of tabular data exists and is used by a wide range of applications. However, a large portion of these data lacks the semantic information that users and machines need to understand them properly. This lack of semantic understanding impedes the use of tables in data analytics pipelines. Solutions for semantically interpreting tables exist, but they focus on specific annotation tasks and table types and rely on large knowledge bases, which makes them difficult to reuse in real-world settings. Thus, more robust systems that produce more precise annotations and adapt to different table types are needed. The Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab) was introduced to benchmark semantic table interpretation systems by evaluating them over diverse datasets and tasks. In this paper, we introduce TorchicTab, a versatile semantic table interpretation system able to annotate tables with varied structures by using either an external knowledge graph, such as Wikidata, or annotated tables with pre-defined terms for training. We evaluate the proposed system on the different annotation tasks of the SemTab challenge. The results show that our system can produce accurate annotations for different tasks across varied datasets.