Managing Uncertain Mediated Schema and Semantic Mappings Automatically in Dataspace Support Platforms
keywords: Schema matching, mediated schema, semantic mappings, reliability degrees, dataspace
Unlike existing heterogeneous data integration systems, which must be fully integrated before they can be used, a Dataspace Support Platform is a self-sustaining system that automatically returns best-effort results to the user regardless of how well integrated its sources are. A Dataspace Support Platform must therefore support uncertainty in both the mediated schema and the schema mappings. This paper proposes a novel approach to automatically providing reliable mediated schemas and reliable semantic mappings in Dataspace Support Platforms. Our aim is to improve the system's best-effort results by leading it to consider as much as possible of the information available in any connected source. We first extract graph representations from the source schemas. We then introduce algorithms that automatically extract a set of mediated schemas from the graph representations, and a set of semantic mappings between a source and a target mediated schema. Finally, we assign reliability degrees to the generated mediated schemas and to the semantic mappings: the higher the reliability degree of a given mediated schema or semantic mapping, the more consistent it is with the source. Experimental results show that our system is faster than existing systems and that, although completely automatic, it produces reliable mediated schemas and reliable semantic mappings that are as accurate as those produced by semi-automatic systems.
reference: Vol. 32, 2013, No. 1, pp. 175–202