Numerical Cruncher



Clustering


GRASP
[Greedy Randomized Adaptive Search Procedure]



GRASP is a technique of the eighties aimed to solve complex combinatorial optimization problems. It directs most of its efforts to build high quality solutions which are processed afterwards to get even better solutions.

GRASP algorithms are iterative algorithms in which each iteration includes a solution construction phase and postprocessing one where the solution found in the first stage is optimized.

An analogy between GRASP and Linear Programming can be found. In Linear Programming, you build a feasible solution first, and you apply Simplex algorithm later. However, in GRASP, the initial solution quality is considered of prime importance.

The basic structure of a GRASP algorithm is the following:

This algorithm can be extended with a mutation operator:

Greedy algorithm

A simplified version of Batchelor and Wilkins' algorithm can be used as greedy algorithm. Cluster centers will be training set patterns chosen so that, at any step of the algorithm, the pattern added as new cluster center will be the furthest pattern from the actual cluster centroids. The first centroid must be chosen arbitrarily, obviously.

Greedy randomized algorithm

A candidate list [RCL: Restricted Candidate List] is used during the construction of a greedy randomized solution. This list contains the better candidates to become part of a solution. A candidate is extracted from this list and is added to the current solution randomly.

A greedy randomized algorithm for the clustering problem could use as RCL the set of furthest patterns from the actual set of cluster centroids. The size of this list could be, for example, equal to 5% of available pattern samples.

Thus, the randomized greedy algorithm is analogous to the classical greedy algorithm, keeping in mind that you choose randomly one of the furthest patterns from the set of established cluster centroids (and not simply the furthest).

Local search technique

The local search strategy is used to improve the solution obtained with the greedy randomized algorithm. While the actual solution is not a local optimum, it is replaced with one of its neighbours.

The well-known K-MEANS algorithm can be used as local search technique for the clustering problem. Remember that its convergence depends on its initial cluster configuration.

Mutation (extended GRASP algorithm)

A solution to the clustering problem can be considered as a vector where the i-th component represents the cluster the i-th pattern is assigned to. An uniform mutation operator can be applied to such a solution if each component of the solution vector is altered with a given probability P. The components of the vector will remain unaltered with a probability 1-P. In Numerical Cruncher, P=0.3 (a really strong mutation!) so that a good search space exploration can be performed.