An Intelligent Genetic Algorithm for Mining Classification Rules in Large Datasets
keywords: Classification, genetic algorithm (GA), knowledge discovery, scalability
A genetic algorithm is a popular approach to classification: it creates a random population of candidate solutions and evolves them over several generations into a suitably accurate solution for a given problem. During each generation, the genetic algorithm accesses the training data set only to calculate the fitness of the population members; no other knowledge about the problem domain is extracted from the training data. Even the domain knowledge stored in the chromosomes of the population may be lost in later generations through genetic operations. Genetic operations such as crossover and mutation are probability based and do not depend on domain knowledge. As a result, the genetic algorithm converges slowly. This paper proposes a genetic algorithm that tries to gain as much knowledge as possible between generations and stores it in the form of knowledge chromosomes. The gained knowledge is used to make predictions about the search space and to guide the search toward areas with potential solutions in subsequent generations. This makes the genetic algorithm converge quickly, which in turn reduces the learning cost. The experiments show that the run time is reduced considerably compared with the state-of-the-art evolutionary algorithm.
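The general idea of carrying knowledge between generations can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper, whose knowledge chromosomes are not specified in this abstract. As an assumption made purely for illustration, the "knowledge" here is a per-gene probability vector learned from the fittest individuals and used to bias mutation, and the fitness function is a toy bit-matching score standing in for rule accuracy on a training set. All names (`update_knowledge`, `guided_mutation`, `run_ga`) and parameter values are hypothetical.

```python
import random


def fitness(chrom, target):
    """Toy fitness: number of genes that match the target pattern."""
    return sum(1 for g, t in zip(chrom, target) if g == t)


def update_knowledge(knowledge, population, scores, top_k, rate=0.3):
    """Shift each gene's estimated P(gene == 1) toward its frequency in the top_k individuals."""
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    elite = [chrom for _, chrom in ranked[:top_k]]
    for i in range(len(knowledge)):
        freq = sum(chrom[i] for chrom in elite) / len(elite)
        knowledge[i] = (1 - rate) * knowledge[i] + rate * freq
    return knowledge


def guided_mutation(chrom, knowledge, rng, p_mut=0.1):
    """With probability p_mut per gene, resample the gene from the knowledge vector."""
    child = []
    for i, gene in enumerate(chrom):
        if rng.random() < p_mut:
            gene = 1 if rng.random() < knowledge[i] else 0
        child.append(gene)
    return child


def crossover(a, b, rng):
    """Single-point crossover (still purely probabilistic, as in a plain GA)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def run_ga(target, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    n = len(target)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    knowledge = [0.5] * n  # no prior knowledge: each gene equally likely 0 or 1
    best = max(population, key=lambda c: fitness(c, target))

    for _ in range(generations):
        scores = [fitness(c, target) for c in population]
        # Knowledge is extracted between generations, not only used for fitness.
        knowledge = update_knowledge(knowledge, population, scores, top_k=max(1, pop_size // 5))

        def pick():
            # Binary tournament selection.
            a, b = rng.sample(range(pop_size), 2)
            return population[a] if scores[a] >= scores[b] else population[b]

        children = [
            guided_mutation(crossover(pick(), pick(), rng), knowledge, rng)
            for _ in range(pop_size - 1)
        ]
        best = max(population + [best], key=lambda c: fitness(c, target))
        population = children + [list(best)]  # elitism keeps the best individual

    best = max(population + [best], key=lambda c: fitness(c, target))
    return best, knowledge


if __name__ == "__main__":
    target = [1, 0] * 10
    best, knowledge = run_ga(target)
    print("best fitness:", fitness(best, target), "of", len(target))
```

In this sketch the knowledge vector persists even when crossover or mutation destroys a good gene combination in an individual, which is the kind of loss the abstract describes. A mutation rate of zero reduces `guided_mutation` to the identity, so the guidance can be switched off for comparison against an unguided run.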
reference: Vol. 32, 2013, No. 1, pp. 1–22