A Double Scoring Method for XML Element Retrieval
keywords: Ranking strategies, indexing units, XML retrieval, BM25F
Efficient retrieval of XML elements and documents is essential to the effective application of the XML format. The ranking function BM25F scores a document over several fields that may differ in importance; these fields, known as selected fields, give substantial improvements over the baseline BM25. BM25F has performed well in past evaluations; however, two issues require further attention. First, which elements should be treated as fields? Second, what is an appropriate weight for each field? Previously, document fields were selected manually, and the weight of each chosen field was tuned before being assigned. This paper introduces two automatic methods that enable the extraction of fields in document-centric XML documents and the assignment of weights to the selected fields. Our experiments show an improvement of up to 28% over BM25 and up to 15% over BM25F at iP[0.01] in INEX evaluations.
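To make the role of fields and field weights concrete, the following is a minimal sketch of a commonly cited BM25F formulation, in which per-field term frequencies are length-normalized, multiplied by a field weight, and summed before saturation. It is not the method proposed in this paper: the function name `bm25f_score`, the parameter defaults `k1=1.2` and `b=0.75`, the single shared `b` for all fields, and the particular IDF variant are all assumptions made for illustration.

```python
import math


def bm25f_score(query_terms, doc_fields, weights, avg_len, doc_freq, n_docs,
                k1=1.2, b=0.75):
    """Illustrative BM25F score for one document.

    doc_fields: field name -> list of tokens in that field
    weights:    field name -> field weight (assumed 1.0 if missing)
    avg_len:    field name -> average field length over the collection
    doc_freq:   term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        # Combine the fields into one weighted pseudo term frequency.
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            # Per-field length normalization (same b for every field here).
            norm = 1.0 - b + b * len(tokens) / avg_len[field]
            pseudo_tf += weights.get(field, 1.0) * tf / norm
        if pseudo_tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        # A non-negative IDF variant; other variants are in common use.
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        # Saturation is applied once, after the fields are combined.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score
```

With all field weights set to 1.0 the fields contribute equally; raising the weight of a field such as a title increases the score of documents whose query terms occur in that field. Choosing which elements act as fields and what weights they receive is exactly the part that this sketch leaves to the caller.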
reference: Vol. 32, 2013, No. 2, pp. 411–440