Learning k-Nearest Neighbors Classifier from Distributed Data

keywords: Learning, k-classifier, decomposable algorithms, vertically and horizontally distributed data
Most learning algorithms assume that all the relevant data are available at a single computer site. In emerging networked environments, learning tasks encounter situations in which the relevant data exist in a number of geographically distributed databases connected by communication networks. These databases cannot be moved to other network sites because of security, size, privacy, or data-ownership considerations. In this paper we show how a k-nearest neighbors classifier algorithm can be adapted for distributed data situations. The objective of our algorithms is to achieve the learning objectives for any data distribution encountered across the network by exchanging local summaries among the participating nodes.
mathematics subject classification 2000: 68T05
reference: Vol. 27, 2008, No. 3, pp. 355–376
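
The idea of exchanging local summaries instead of raw data can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes horizontally distributed data (each site holds complete rows), Euclidean distance, and a coordinator that only ever sees (distance, label) pairs. The function names, site contents, and query point are hypothetical. The sketch relies on the fact that the k globally nearest neighbors of a query are always contained in the union of each site's k locally nearest neighbors.

```python
import heapq
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def local_summary(points, labels, query, k):
    """Run at one site: return its k nearest (distance, label) pairs.

    Only distances and labels leave the site, never the stored points.
    """
    pairs = ((dist(p, query), lab) for p, lab in zip(points, labels))
    return heapq.nsmallest(k, pairs)


def classify(summaries, k):
    """Run at the coordinator: merge the summaries, then majority-vote."""
    merged = (pair for summary in summaries for pair in summary)
    nearest = heapq.nsmallest(k, merged)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two hypothetical sites, each holding a horizontal fragment of the data.
site_a = ([(0.0, 0.0), (0.1, 0.2), (5.0, 5.0)], ["red", "red", "blue"])
site_b = ([(4.8, 5.1), (5.2, 4.9), (0.2, 0.1)], ["blue", "blue", "red"])

query = (0.0, 0.1)
k = 3

summaries = [local_summary(X, y, query, k) for X, y in (site_a, site_b)]
predicted = classify(summaries, k)

# Reference answer from pooling all rows at one site, for comparison only.
pooled_X = site_a[0] + site_b[0]
pooled_y = site_a[1] + site_b[1]
centralized = classify([local_summary(pooled_X, pooled_y, query, k)], k)
```

In this sketch each site sends at most k pairs per query, so communication grows with the number of sites and k rather than with the size of the local databases. Vertically distributed data, where each site holds only some attributes of every row, would need a different summary (for example, per-site partial squared distances) and is not covered here.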