Coding With Supervised Learners

The GSupervisedLearner class is declared in Learner.h. All classes that inherit from GSupervisedLearner must implement a method named

	void train(GMatrix& features, GMatrix& labels);

and one named

	void predict(const double* pIn, double* pOut);

As you might expect, the train method trains the model, and the predict method uses a trained model to make a prediction.
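To make the train/predict contract concrete, here is a minimal, self-contained sketch. It does not use the GClasses headers: `Matrix`, `SupervisedLearner`, and `MeanLearner` are hypothetical stand-ins invented for illustration (a vector of rows takes the place of GMatrix, and the matrices are passed by const reference for simplicity). The toy learner ignores the features and always predicts the mean label vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for GMatrix (assumption): one inner vector per row.
using Matrix = std::vector<std::vector<double>>;

// Hypothetical interface that mirrors the two methods described above.
class SupervisedLearner {
public:
	virtual ~SupervisedLearner() = default;
	virtual void train(const Matrix& features, const Matrix& labels) = 0;
	virtual void predict(const double* pIn, double* pOut) = 0;
};

// Toy baseline: ignores the features and predicts the mean of the training labels.
class MeanLearner : public SupervisedLearner {
public:
	void train(const Matrix& features, const Matrix& labels) override {
		assert(features.size() == labels.size()); // one label row per feature row
		assert(!labels.empty());
		m_mean.assign(labels[0].size(), 0.0);
		for (const auto& row : labels)
			for (std::size_t j = 0; j < m_mean.size(); j++)
				m_mean[j] += row[j];
		for (double& v : m_mean)
			v /= static_cast<double>(labels.size());
	}

	// pOut must have room for labelDims() doubles.
	void predict(const double* /*pIn*/, double* pOut) override {
		for (std::size_t j = 0; j < m_mean.size(); j++)
			pOut[j] = m_mean[j];
	}

	std::size_t labelDims() const { return m_mean.size(); }

private:
	std::vector<double> m_mean;
};
```

A caller would train once, then allocate an output buffer with one slot per label dimension before each call to predict.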

The train method expects two matrices to be passed in as parameters. The first parameter contains the features (or input patterns), and the second parameter contains the corresponding labels (or target outputs). These two matrices are expected to have the same number of rows.

If your data is stored in one table that contains both features and labels, then you will need to divide it into two separate matrices before you call the train method. Here is an example that loads an ARFF file, swaps the first column with the last one, then splits the data into a feature matrix and a label matrix. In this case, the last two columns are used for the label matrix:

	GMatrix data;
	data.loadArff("mydata.arff");
	data.swapColumns(0, data.cols() - 1); // column indexes start at 0, so 0 is the first column
	GDataColSplitter splitter(data, 2); // the last 2 columns become the labels
	GMatrix& features = splitter.features();
	GMatrix& labels = splitter.labels();
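The effect of the swap and the split can be illustrated without the library. The following sketch is only an approximation of what the example above does: `Matrix`, `swapColumns`, and `splitColumns` are hypothetical helpers written for this illustration, not the GMatrix or GDataColSplitter implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for GMatrix (assumption): one inner vector per row.
using Matrix = std::vector<std::vector<double>>;

// Swaps columns a and b in every row (zero-based indexes).
void swapColumns(Matrix& data, std::size_t a, std::size_t b) {
	for (auto& row : data) {
		assert(a < row.size() && b < row.size());
		std::swap(row[a], row[b]);
	}
}

// Copies the last labelDims columns into a label matrix and the
// remaining leading columns into a feature matrix.
std::pair<Matrix, Matrix> splitColumns(const Matrix& data, std::size_t labelDims) {
	Matrix features, labels;
	for (const auto& row : data) {
		assert(labelDims <= row.size());
		const auto cut = row.end() - static_cast<std::ptrdiff_t>(labelDims);
		features.emplace_back(row.begin(), cut);
		labels.emplace_back(cut, row.end());
	}
	return {std::move(features), std::move(labels)};
}
```

Both resulting matrices have the same number of rows as the original table, which is what the train method expects.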

Notice that you are not restricted to one-dimensional labels. Our supervised learning algorithms can implicitly handle labels of arbitrary dimensionality. This is particularly convenient when you need to predict things like pixel colors (which are generally composed of 3 channel values), points in n-dimensional space, or control vectors for systems with several knobs and levers.

With the features and labels in separate matrices, training a model is as simple as calling the train method:

	GDecisionTree model;
	model.train(features, labels);

or

	GKNN model;
	model.setNeighborCount(3);
	model.train(features, labels);

For a full list of our supervised learning algorithms, see the class hierarchy in the API docs. Expand GTransducer to show the classes that inherit from it. Then expand GSupervisedLearner (which inherits from GTransducer) to show the classes that inherit from it. Also expand GIncrementalLearner.

To make a prediction using a trained model, pass one row of features to the predict method, and the predicted label vector will come out. (pOut must point to an array of doubles big enough to hold the label vector.) Example:

	double pOut[2]; // one element per label dimension (2 label columns here)
	model.predict(features[10], pOut);

Note that some learning algorithms may not implicitly support all types of data. This problem can be solved by wrapping the learning algorithm in a filter. A filter is a class that converts the data to a suitable type before passing it to the learning algorithm. Perhaps the easiest filter to use is GAutoFilter. Example:

	GNaiveBayes model;
	GAutoFilter af(&model, false);
	af.train(features, labels); // It is okay if features and/or labels contain continuous values,
	                            // even though naive Bayes only supports categorical values. The
	                            // GAutoFilter class will take care of type conversions as needed.

	...

	af.predict(pattern, prediction);
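The wrapping idea behind a filter can be sketched in a few lines. This is not how GAutoFilter is implemented. It is a toy under stated assumptions: `LookupLearner` is a hypothetical learner that only accepts category indexes, and `DiscretizeFilter` is a hypothetical wrapper that converts continuous features into equal-width bins before forwarding train and predict calls to the learner it wraps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Stand-in for GMatrix (assumption): one inner vector per row.
using Matrix = std::vector<std::vector<double>>;

// Toy categorical learner (hypothetical): remembers the last label vector
// seen for each category index stored in the first feature.
class LookupLearner {
public:
	void train(const Matrix& features, const Matrix& labels) {
		assert(features.size() == labels.size());
		for (std::size_t i = 0; i < features.size(); i++)
			m_table[static_cast<int>(features[i][0])] = labels[i];
	}
	void predict(const double* pIn, double* pOut) const {
		auto it = m_table.find(static_cast<int>(pIn[0]));
		assert(it != m_table.end()); // unseen categories are not handled in this sketch
		std::copy(it->second.begin(), it->second.end(), pOut);
	}
private:
	std::map<int, std::vector<double>> m_table;
};

// Toy filter (hypothetical): maps each continuous feature to one of `bins`
// equal-width bins, then passes the converted data to the wrapped learner.
template <typename Learner>
class DiscretizeFilter {
public:
	DiscretizeFilter(Learner* pLearner, std::size_t bins)
	: m_pLearner(pLearner), m_bins(bins) { assert(bins >= 1); }

	void train(const Matrix& features, const Matrix& labels) {
		assert(!features.empty());
		// Record the range of each column so predict can reuse the same bins.
		m_min = features[0];
		m_max = features[0];
		for (const auto& row : features)
			for (std::size_t j = 0; j < m_min.size(); j++) {
				m_min[j] = std::min(m_min[j], row[j]);
				m_max[j] = std::max(m_max[j], row[j]);
			}
		Matrix converted;
		for (const auto& row : features)
			converted.push_back(convert(row.data()));
		m_pLearner->train(converted, labels);
	}

	void predict(const double* pIn, double* pOut) const {
		const std::vector<double> converted = convert(pIn);
		m_pLearner->predict(converted.data(), pOut);
	}

private:
	// Converts one row of continuous values to bin indexes, clamping
	// values that fall outside the range seen during training.
	std::vector<double> convert(const double* pIn) const {
		std::vector<double> out(m_min.size());
		for (std::size_t j = 0; j < out.size(); j++) {
			const double range = m_max[j] - m_min[j];
			double t = range > 0.0 ? (pIn[j] - m_min[j]) / range : 0.0;
			t = std::min(1.0, std::max(0.0, t));
			const std::size_t bin = std::min(
				m_bins - 1, static_cast<std::size_t>(t * static_cast<double>(m_bins)));
			out[j] = static_cast<double>(bin);
		}
		return out;
	}

	Learner* m_pLearner;
	std::size_t m_bins;
	std::vector<double> m_min, m_max;
};
```

The key design point is that the caller talks only to the wrapper, so the same conversion is applied at training time and at prediction time.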
