Integration of Heterogeneous Data Sources in an Ontological Knowledge Base

keywords: Data management, data integration, semantics, ontologies, conversion, RDF, integrity constraints, semantic optimization, SPARQL
In this paper, we present X2R, a system for integrating heterogeneous data sources in an ontological knowledge base. The main goal of the system is to create a unified view of the information stored in relational, XML, and LDAP data sources within an organization, expressed in RDF using a common ontology and valid according to a prescribed set of integrity constraints. X2R supports a wide range of source schemas and target ontologies by allowing the user to define potentially complex transformations of data between the original data source and the unified knowledge base. X2R provides a rich set of integrity constraint primitives to ensure the quality of the unified data set. These constraints are also leveraged in a novel approach to the semantic optimization of SPARQL queries.
mathematics subject classification 2000: 68P15, 68T30, 68U35
reference: Vol. 31, 2012, No. 1, pp. 189–223