Enhancing Semantic Web Entity Matching Process Using Transformer Neural Networks and Pre-Trained Language Models

Mourad Jabrane

Sultan Moulay Slimane University, Laboratory of Process Engineering, Computer Science and Mathematics, Bd Beni Amir, BP 77, Khouribga, Morocco

Abdelfattah Toulaoui

Sultan Moulay Slimane University, Laboratory of Process Engineering, Computer Science and Mathematics, Bd Beni Amir, BP 77, Khouribga, Morocco

Imad Hafidi

Sultan Moulay Slimane University, Laboratory of Process Engineering, Computer Science and Mathematics, Bd Beni Amir, BP 77, Khouribga, Morocco

keywords: Entity matching, record linkage, linked data, deep learning, transformer neural networks

Entity matching (EM) is a critical yet complex component of data cleaning and integration. Recent advances in EM have been driven predominantly by deep learning (DL) methods. These methods primarily improve matching accuracy on structured data that adheres to a high-quality, well-defined schema. However, such schema-centric DL strategies struggle with the semantic web's linked data, which tends to be voluminous, semi-structured, diverse, and often noisy. To address this, we introduce a novel approach that is loosely schema-aware and leverages recent developments in DL, specifically transformer neural networks and pre-trained language models. We evaluated our approach on six datasets: two tabular datasets and four RDF datasets from the semantic web. The findings demonstrate the effectiveness of our model in handling noisy and varied data.

reference: Vol. 43, 2024, No. 6, pp. 1397–1415

doi: 10.31577/cai_2024_6_1397

Computing and Informatics

formerly Computers and Artificial Intelligence
