org.wdssii.decisiontree
Class QuinlanC45AxialDecisionTreeCreator

java.lang.Object
  extended by org.wdssii.decisiontree.QuinlanC45AxialDecisionTreeCreator
All Implemented Interfaces:
DecisionTreeCreator

public class QuinlanC45AxialDecisionTreeCreator
extends Object
implements DecisionTreeCreator

C4.5 learning algorithm to create an axial decision tree. Reference: J. R. Quinlan, "Improved use of continuous attributes in C4.5," Journal of Artificial Intelligence Research, 4:77-90, 1996.

Usage:

   float[][] data = new float[numTraining][numAttr];
   int[] categories = new int[numTraining];
   // populate arrays
   ...
   QuinlanC45AxialDecisionTreeCreator classifier = new QuinlanC45AxialDecisionTreeCreator(0.1f); // pruning fraction (float literal)
   DecisionTree tree = classifier.learn(data, categories);
 

Author:
lakshman

Nested Class Summary
static class QuinlanC45AxialDecisionTreeCreator.TreeCreationException
           
 
Field Summary
private  FitnessFunction fitness
          By default, InformationGain is used.
private  int maxDepth
          Maximum depth the tree may reach. The deeper the tree, the less well it generalizes.
private  int numCategories
          Number of classes (categories).
private  int populationToConsiderSplitting
          How many members can be in a population before it is split?
private  float pruningFraction
          Fraction of the training data set held out for pruning so that the learned tree does not overfit.
 
Constructor Summary
QuinlanC45AxialDecisionTreeCreator()
           
QuinlanC45AxialDecisionTreeCreator(float pruningFraction)
           
 
Method Summary
private  AxialTreeNode buildTree(float[][] inputData, int[] targetClass, int[] toConsider, int depth)
          Helper method that creates a sub-tree and returns a node
private  int getMostLikelyCategory(float[][] inputData, int[] targetClass, int[] toConsider)
           
 int getNumCategories()
          Number of categories in the previously learnt data set.
 AxialDecisionTree learn(float[][] inputData, int[] targetClass)
           
private  AxialTreeNode pruneTree(AxialTreeNode node, float[][] inputData, int[] targetClass, int[] toConsider)
          removes nodes that do not perform well on the validation dataset.
private  int[][] split(float[][] inputData, int[] toConsider, int bestAttribute, float thresh)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

populationToConsiderSplitting

private int populationToConsiderSplitting
How many members can be in a population before it is split?


pruningFraction

private float pruningFraction
Fraction of the training data set held out for pruning so that the learned tree does not overfit. A value of 0.1f is often a reasonable choice. The pruning points are the last few instances of the training data set, so if the training data is ordered in any way, pass in a randomized (shuffled) sample instead.
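
Because the pruning points are taken from the end of the training data, ordered data should be shuffled before it is passed to learn. The following is a minimal sketch of one way to do that, keeping each row paired with its category. The class name ShuffleTrainingData and the method shuffleInUnison are hypothetical helpers written for this illustration; they are not part of org.wdssii.decisiontree.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of the library.
public class ShuffleTrainingData {

    /**
     * Shuffles the rows of data and the matching entries of categories
     * in place with a Fisher-Yates shuffle, so that row i and label i
     * stay paired.
     */
    public static void shuffleInUnison(float[][] data, int[] categories, long seed) {
        if (data.length != categories.length) {
            throw new IllegalArgumentException("data and categories must have the same length");
        }
        Random rng = new Random(seed);
        for (int i = data.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // random index in 0..i
            float[] tmpRow = data[i];
            data[i] = data[j];
            data[j] = tmpRow;
            int tmpCat = categories[i];
            categories[i] = categories[j];
            categories[j] = tmpCat;
        }
    }

    public static void main(String[] args) {
        float[][] data = { {0f, 0f}, {1f, 10f}, {2f, 20f}, {3f, 30f} };
        int[] categories = { 0, 1, 2, 3 };
        shuffleInUnison(data, categories, 42L);
        for (int i = 0; i < data.length; i++) {
            System.out.println(Arrays.toString(data[i]) + " -> " + categories[i]);
        }
    }
}
```

The shuffled arrays can then be passed to learn as in the usage example at the top of this page, so that the final pruningFraction of the rows forms a random validation sample.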


maxDepth

private int maxDepth
Maximum depth the tree may reach. The deeper the tree, the less well it generalizes.


numCategories

private int numCategories
Number of classes (categories).


fitness

private FitnessFunction fitness
By default, InformationGain is used.
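
For orientation, the textbook definition of information gain for a two-way split is the entropy of the parent population minus the size-weighted entropies of the two child populations. The sketch below computes that quantity from per-class counts. It is an assumption that the library's InformationGain follows this definition exactly; the class InformationGainSketch and its methods are hypothetical and do not implement the FitnessFunction interface.

```java
// Hypothetical illustration of the textbook formula, not the library's code.
public class InformationGainSketch {

    /** Shannon entropy, in bits, of a population described by per-class counts. */
    public static double entropy(int[] classCounts) {
        int total = 0;
        for (int c : classCounts) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int c : classCounts) {
            if (c > 0) { // classes with no examples contribute nothing
                double p = (double) c / total;
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    /** Entropy of the parent minus the size-weighted entropy of the two children. */
    public static double informationGain(int[] leftCounts, int[] rightCounts) {
        int n = leftCounts.length;
        int[] parent = new int[n];
        int nLeft = 0;
        int nRight = 0;
        for (int k = 0; k < n; k++) {
            parent[k] = leftCounts[k] + rightCounts[k];
            nLeft += leftCounts[k];
            nRight += rightCounts[k];
        }
        int total = nLeft + nRight;
        if (total == 0) {
            return 0.0;
        }
        return entropy(parent)
                - ((double) nLeft / total) * entropy(leftCounts)
                - ((double) nRight / total) * entropy(rightCounts);
    }

    public static void main(String[] args) {
        // A split that separates two equally sized classes perfectly.
        System.out.println(informationGain(new int[] {4, 0}, new int[] {0, 4}));
        // A split that leaves both children as mixed as the parent.
        System.out.println(informationGain(new int[] {2, 2}, new int[] {2, 2}));
    }
}
```

A perfect split of two balanced classes yields a gain of one bit, while a split that leaves the class mix unchanged yields zero.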

Constructor Detail

QuinlanC45AxialDecisionTreeCreator

public QuinlanC45AxialDecisionTreeCreator(float pruningFraction)

QuinlanC45AxialDecisionTreeCreator

public QuinlanC45AxialDecisionTreeCreator()
Method Detail

learn

public AxialDecisionTree learn(float[][] inputData,
                               int[] targetClass)
                        throws QuinlanC45AxialDecisionTreeCreator.TreeCreationException,
                               IllegalArgumentException
Specified by:
learn in interface DecisionTreeCreator
Parameters:
inputData - an array where each row corresponds to a single instance (to be classified) and the columns hold the attributes of that instance
targetClass - an array where each row corresponds to a single instance, specifically the actual classification of that instance. The class needs to be a number 0,1,2,...,N-1 where N is the number of classes. Some of these classes may have no examples.
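
Since targetClass must hold the integers 0,1,2,...,N-1, labels stored in another form have to be mapped to indices first. The following is a minimal sketch assuming the raw labels are strings; the class CategoryEncoder and its encode method are hypothetical helpers and not part of the library.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the library.
public class CategoryEncoder {

    /**
     * Maps each label to an index 0..N-1 in order of first appearance.
     * The supplied map is filled in so that indices can be translated
     * back to labels after classification.
     */
    public static int[] encode(String[] labels, Map<String, Integer> labelToIndex) {
        int[] targetClass = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            Integer idx = labelToIndex.get(labels[i]);
            if (idx == null) {
                idx = labelToIndex.size(); // next unused index
                labelToIndex.put(labels[i], idx);
            }
            targetClass[i] = idx;
        }
        return targetClass;
    }

    public static void main(String[] args) {
        String[] labels = { "hail", "rain", "hail", "snow" };
        Map<String, Integer> labelToIndex = new LinkedHashMap<>();
        int[] targetClass = encode(labels, labelToIndex);
        System.out.println(Arrays.toString(targetClass)); // prints [0, 1, 0, 2]
        System.out.println("N = " + labelToIndex.size()); // prints N = 3
    }
}
```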
Returns:
the learned decision tree
Throws:
QuinlanC45AxialDecisionTreeCreator.TreeCreationException
IllegalArgumentException

pruneTree

private AxialTreeNode pruneTree(AxialTreeNode node,
                                float[][] inputData,
                                int[] targetClass,
                                int[] toConsider)
removes nodes that do not perform well on the validation dataset.


buildTree

private AxialTreeNode buildTree(float[][] inputData,
                                int[] targetClass,
                                int[] toConsider,
                                int depth)
Helper method that creates a sub-tree and returns a node

Returns:
the root node of the created sub-tree

split

private int[][] split(float[][] inputData,
                      int[] toConsider,
                      int bestAttribute,
                      float thresh)

getMostLikelyCategory

private int getMostLikelyCategory(float[][] inputData,
                                  int[] targetClass,
                                  int[] toConsider)

getNumCategories

public int getNumCategories()
Number of categories in the previously learnt data set.