Train Classifier

From Spin Help
[Figure: Train Classifier dialog (TrainClassifierDialog.jpg)]

Trains a support vector machine (SVM) classifier to divide objects into two classes based on a set of features. It accepts multiple feature files (*.FET) as input and outputs an SVM model (*.SVM) that can then be used to classify other feature files. The objects in the training feature files must already be assigned to their classes.

Algorithm

This function trains a support vector machine (SVM) classifier using the libsvm library implementation. The algorithm seeks a maximum-margin hyperplane that separates the two classes of objects in a higher-dimensional feature space. It is very sensitive to the values of "cost" and "gamma", so using cross validation to select near-optimal values is highly recommended. A weighting can also be applied to one class of objects. This makes that class more important, so the classifier will attempt to include its objects at the cost of more false identifications in the other class.
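For readers who want the math behind "cost", "gamma" and the class weights, the following is a sketch of the standard libsvm C-SVC formulation with a radial basis function kernel. It assumes Spin uses that formulation unchanged, which this page does not confirm. The symbols are notation chosen here for illustration: C is the cost, \gamma is the gamma value, and w_true and w_false are the two class weights.

```latex
% Weighted soft-margin objective (standard C-SVC form; assumed, not confirmed for Spin)
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \tfrac{1}{2}\,\mathbf{w}^{\top}\mathbf{w}
  + C_{\mathrm{true}} \sum_{i:\, y_i = +1} \xi_i
  + C_{\mathrm{false}} \sum_{i:\, y_i = -1} \xi_i
\quad \text{subject to} \quad
  y_i\bigl(\mathbf{w}^{\top}\phi(\mathbf{x}_i) + b\bigr) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0

% Each class weight scales the cost applied to errors on that class
C_{\mathrm{true}} = w_{\mathrm{true}}\, C,
\qquad
C_{\mathrm{false}} = w_{\mathrm{false}}\, C

% Radial basis function kernel controlled by gamma
K(\mathbf{x}, \mathbf{x}') = \exp\!\bigl(-\gamma\, \lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\bigr)
```

Under this reading, a larger cost penalizes training errors more heavily, a larger gamma makes the kernel more local, and raising the weight of one class makes errors on that class more expensive than errors on the other.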

Train SVM Classifier Input

Select Training Data – Selects the feature files (*.FET) for the training data. Multiple files can be selected in the dialog by holding down shift.

Cost – Sets the cost value for the SVM. This value is critical for the SVM to function properly. It is highly recommended to run cross validation to determine this value, unless you already know a value that works well for your data.
Gamma – Sets the gamma value for the SVM. This value is critical for the SVM to function properly. It is highly recommended to run cross validation to determine this value, unless you already know a value that works well for your data.
Weight True – Sets the weighting assigned to the “true” objects, or objects assigned to class 2. This is used to weight this class more heavily if it is more important or if there is an unequal number of objects in the two classes.
Weight False – Sets the weighting assigned to the “false” objects, or objects assigned to class 1. This is used to weight this class more heavily if it is more important or if there is an unequal number of objects in the two classes.
Scale Features – Scales all the features from 0 to 1. This is done so all features are equally weighted in the classifier. Recommended.
Kernel Type – Sets the kernel used in the SVM. Radial Basis Function recommended.
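As an illustration of what the Scale Features option describes, here is a minimal Python sketch of scaling each feature to the range 0 to 1. It is not Spin's implementation. The function names and the sample values are hypothetical, and the handling of constant features is an assumption.

```python
def fit_min_max(rows):
    """Return one (min, max) pair per feature column of the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale_rows(rows, ranges):
    """Map every feature value to [0, 1] using the given (min, max) pairs."""
    scaled = []
    for row in rows:
        scaled_row = []
        for value, (lo, hi) in zip(row, ranges):
            span = hi - lo
            # Assumption: a constant feature (span of zero) is mapped to 0.0.
            scaled_row.append((value - lo) / span if span else 0.0)
        scaled.append(scaled_row)
    return scaled


# Hypothetical training data: three objects, two features on very different scales.
training = [[2.0, 100.0], [4.0, 300.0], [6.0, 200.0]]
ranges = fit_min_max(training)
scaled_training = scale_rows(training, ranges)
```

Without scaling, the second feature in this example would dominate any distance-based kernel simply because its numbers are larger. The same ranges computed from the training data would normally be reused when scaling the data to be classified.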


Cost Range – Sets the range of cost values that are tested in cross validation. 100 values are tested between the min and max on an equally divided log scale (e.g. 10^-5, 10^-4, 10^-3, 10^-2, etc.). A smaller range will thus result in denser testing of an area.
Gamma Range – Sets the range of gamma values that are tested in cross validation. 100 values are tested between the min and max on an equally divided log scale (e.g. 10^-5, 10^-4, 10^-3, 10^-2, etc.). A smaller range will thus result in denser testing of an area.
Run – Runs cross validation. Splits the data in half, trains with a given cost and gamma on one half, and tests on the other half. This is repeated for all combinations of cost and gamma specified, and the combination with the best results is selected as optimal. This provides an automated means of selecting near-optimal cost and gamma values and prevents a choice that would over-constrain the classifier. Can take 10 minutes or more for large datasets on slow computers. Calculation time increases towards the end of the progress bar; the program is not frozen, so be patient.
Reset – Resets the min and max gamma ranges to their default values.
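The cross validation options above amount to a grid search over log-spaced cost and gamma values. The Python sketch below shows the general idea only and is not Spin's code. The names log_grid, grid_search and toy_score are hypothetical. In the real tool the score would come from training on one half of the data and testing on the other, whereas here a made-up scoring function stands in for that step so the example is self-contained. The grid size of 11 per parameter is also an arbitrary choice for the example.

```python
import math


def log_grid(minimum, maximum, count):
    """Return `count` values from minimum to maximum, evenly spaced in log10."""
    if count < 2:
        return [minimum]
    lo, hi = math.log10(minimum), math.log10(maximum)
    step = (hi - lo) / (count - 1)
    return [10 ** (lo + i * step) for i in range(count)]


def grid_search(costs, gammas, score):
    """Evaluate every cost/gamma pair and return (best_score, cost, gamma)."""
    best = None
    for cost in costs:
        for gamma in gammas:
            result = score(cost, gamma)
            if best is None or result > best[0]:
                best = (result, cost, gamma)
    return best


def toy_score(cost, gamma):
    # Hypothetical stand-in for "train on one half, test on the other half".
    # It peaks at cost = 10 and gamma = 0.01 purely for demonstration.
    return -abs(math.log10(cost) - 1) - abs(math.log10(gamma) + 2)


costs = log_grid(1e-5, 1e5, 11)    # 10^-5, 10^-4, ..., 10^5
gammas = log_grid(1e-5, 1e5, 11)
best_score, best_cost, best_gamma = grid_search(costs, gammas, toy_score)
```

Narrowing the min and max while keeping the number of grid points fixed reduces the spacing between tested values, which is why a smaller range gives denser testing around a promising region.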

Save Model File As – Selects the file name under which the model file is saved. This file is later loaded to run the classifier.

Train SVM Classifier Output

This algorithm outputs a single SVM model file (*.SVM) that contains the trained classifier. This model file can be used to classify other feature files with the Run Classifier dialog.

Location