Search-Based Evolution of XML Schemas

keywords: XML-based applications; DTD; LL parsing
Schemas make an XML-based application more reliable: by defining the specific format of the data that the application manipulates, they help to avoid failures. In practice, when an application evolves, new requirements for the data may be established, which creates the need for schema evolution. In some cases no schema exists and one must be generated. To reduce maintenance and reengineering costs, automatic evolution of schemas is highly desirable. However, no existing algorithm solves the problem satisfactorily. To help in this task, this paper introduces a search-based approach that exploits the correspondence between schemas and context-free grammars. The approach is supported by a tool named EXS, which implements grammatical inference algorithms based on LL(1) parsing. Given a grammar (corresponding to a schema) and a new word (an XML document), EXS modifies the original grammar to infer a new grammar that i) continues to generate the same words as before and ii) generates the new word. If no initial grammar is available, EXS can also generate a grammar from scratch from a set of samples.
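The two requirements on the inferred grammar can be illustrated with a minimal Python sketch. It is not the EXS algorithm and uses neither search nor LL(1) parsing. It assumes a toy grammar representation in which each element name maps to the set of child-tag sequences it may contain, and the function names `accepts` and `evolve` are hypothetical. The sketch only adds alternatives, so every previously accepted document remains accepted (condition i) and the new document becomes accepted (condition ii). Starting from an empty grammar corresponds to generating a grammar from scratch from samples.

```python
from xml.etree import ElementTree as ET

# Assumed toy representation (not the EXS data structure):
# grammar maps an element name to a set of allowed child-tag sequences,
# i.e. one production alternative per tuple.

def accepts(grammar, elem):
    """Return True if the element tree is generated by the grammar."""
    children = tuple(child.tag for child in elem)
    if children not in grammar.get(elem.tag, set()):
        return False
    return all(accepts(grammar, child) for child in elem)

def evolve(grammar, elem):
    """Return a new grammar that keeps all old alternatives and also
    accepts the given document. Naive: it never generalizes."""
    new = {tag: set(alts) for tag, alts in grammar.items()}
    stack = [elem]
    while stack:
        node = stack.pop()
        # Record the observed child sequence as one more alternative.
        new.setdefault(node.tag, set()).add(tuple(c.tag for c in node))
        stack.extend(node)  # iterating an Element yields its children
    return new

old_doc = ET.fromstring("<book><title/><author/></book>")
new_doc = ET.fromstring("<book><title/><author/><year/></book>")

g0 = evolve({}, old_doc)   # grammar built from scratch from one sample
g1 = evolve(g0, new_doc)   # grammar evolved to cover the new document
```

Because this sketch memorizes each observed child sequence verbatim, the resulting grammar grows with every new sample. An inference approach such as the one described in the paper would instead be expected to look for more compact grammars.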
mathematics subject classification 2000: 68N30
reference: Vol. 31, 2012, No. 3, pp. 573–595