A Benchmark Study on Knowledge Graphs Enrichment and Pruning Methods in the Presence of Noisy Relationships

Stefano Faralli; Andrea Lenzi; Paola Velardi

doi:10.1613/jair.1.14494

Published: Sep 13, 2023

Keywords:

knowledge representation, knowledge discovery, ontologies

Stefano Faralli

Sapienza University of Rome

https://orcid.org/0000-0003-3684-8815

Andrea Lenzi

Sapienza University of Rome

https://orcid.org/0000-0002-8997-9862

Paola Velardi

Sapienza University of Rome

https://orcid.org/0000-0003-0884-1499

Abstract

In the past few years, knowledge graphs (KGs), as a form of structured human intelligence, have attracted considerable research attention from academia and industry. In this very active field of study, a widely explored problem is link prediction: the task of predicting whether two nodes should be connected, based on node attributes and local or global graph connectivity properties. The state of the art in this area is represented by techniques based on graph embeddings. However, KGs, especially those acquired using automated or partly automated techniques, are often riddled with noise, e.g., wrong relationships, which makes the problem of link deletion as important as that of link prediction. In this paper, we address three main research questions. The first concerns the true effectiveness of different knowledge graph embedding models in the presence of an increasing number of wrong links. The second is to assess whether methods that predict unknown relationships effectively work equally well at recognizing incorrect relations. The third is to verify whether any systems are robust enough to maintain primacy in all experimental conditions. To answer these research questions, we performed a systematic benchmark study whose experimental setting includes ten state-of-the-art models, three common KG datasets with different structural properties, and three downstream tasks: the widely explored tasks of link prediction and triple classification, and the less popular task of link deletion. Comparative studies often yield contradictory results, where the same systems score better or worse depending on the experimental context. In our work, to facilitate the discovery of clear performance patterns and their interpretation, we select and/or aggregate performance data to highlight each specific comparison dimension: dataset complexity, type of task, category of models, and robustness against noise.
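The relationship between noise, link prediction, and link deletion described in the abstract can be made concrete with a toy sketch. It is not the authors' benchmark code. The tail-corruption noise scheme, the hand-set two-dimensional embeddings, the deletion threshold, and the names `inject_noise`, `transe_score`, `predict_tail`, and `deletion_candidates` are all assumptions made for illustration. A TransE-style translational score is used only because it is a familiar embedding model; the abstract does not name the ten models evaluated.

```python
import math
import random

def inject_noise(triples, entities, noise_ratio, seed=0):
    """Add wrong links by corrupting tails (an assumed noise scheme).

    Returns the noisy triple list and the set of injected wrong triples.
    """
    rng = random.Random(seed)
    known = set(triples)
    wrong = set()
    target = int(noise_ratio * len(triples))
    attempts = 0
    while len(wrong) < target and attempts < 1000:
        attempts += 1
        h, r, _ = rng.choice(triples)
        t = rng.choice(entities)
        if t != h and (h, r, t) not in known:
            wrong.add((h, r, t))
    return list(triples) + sorted(wrong), wrong

def transe_score(emb, rel, triple):
    """TransE-style plausibility: negative distance between h + r and t."""
    h, r, t = triple
    translated = [a + b for a, b in zip(emb[h], rel[r])]
    return -math.dist(translated, emb[t])

def predict_tail(emb, rel, h, r, entities):
    """Link prediction: pick the highest-scoring candidate tail for (h, r, ?)."""
    return max((e for e in entities if e != h),
               key=lambda e: transe_score(emb, rel, (h, r, e)))

def deletion_candidates(emb, rel, triples, threshold):
    """Link deletion: flag existing triples whose score falls below a threshold."""
    return [tr for tr in triples if transe_score(emb, rel, tr) < threshold]

# Hypothetical toy graph and hand-set embeddings (not learned, not from the paper).
entities = ["rome", "italy", "paris", "france"]
triples = [("rome", "capital_of", "italy"), ("paris", "capital_of", "france")]
emb = {"rome": (0, 0), "italy": (1, 0), "paris": (0, 5), "france": (1, 5)}
rel = {"capital_of": (1, 0)}

noisy, wrong = inject_noise(triples, entities, noise_ratio=0.5, seed=0)
predicted = predict_tail(emb, rel, "rome", "capital_of", entities)
flagged = deletion_candidates(emb, rel, noisy, threshold=-1.0)
```

In this toy setting the embeddings are fixed by hand, so the injected link cannot distort them. In a benchmark of the kind the abstract describes, embeddings would be trained on the noisy graph, which is why growing noise can degrade both prediction and deletion.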

Issue

Vol. 78 (2023)

Section

Articles

Author Biographies

Stefano Faralli, Sapienza University of Rome

Assistant Professor (Computer Science)

Computer Science Department

Andrea Lenzi, Sapienza University of Rome

PhD student (Computer Science)

Department of Computer Science

Paola Velardi, Sapienza University of Rome

Full Professor (Computer Science)

Department of Computer Science
