7.5.4.5   Advanced Training Parameters and Features

This selection allows you to specify the classifier as well as the classification parameters precisely.

The following table can help you choose suitable classifier parameters; the code sketches after the table illustrate how these parameters map onto the underlying HALCON operators.

For each parameter, the applicable classification methods are given in parentheses, followed by the possible values (Selection) and an explanation.

Classifier Type (Method: All)
Selection: MLP (Multilayer Perceptron), SVM (Support Vector Machine), kNN (k-Nearest Neighbor), Box (Hyperboxes); default = MLP if no classifier has been trained before, otherwise the last classifier that was trained.
Explanation: Choose a method for the classification of the symbols. The last classifier that was trained is loaded automatically. Its settings are preserved even if one of the pretrained classifiers is used in between.

Preprocessing (Method: MLP, SVM)
Selection: None, Normalization, Principal Components, Canonical Variates; default = None.
Explanation: Preprocessing either normalizes the value range of a feature or transforms the feature space to reduce the number of dimensions. Principal Components and Canonical Variates reduce the amount of data without losing much information; Canonical Variates additionally optimizes the separability of the classes after the data reduction.

Components (Method: MLP, SVM)
Selection: Number of components; default = 10.
Explanation: Dimension of the reduced feature space when preprocessing with Principal Components or Canonical Variates. The value is ignored if Preprocessing is set to None or Normalization.

Hidden Units (Method: MLP)
Selection: Number of hidden units (1...150); default = Auto.
Explanation: Number of neurons in the hidden (middle) layer of the MLP. The more input data you use, the higher this value should be. In many cases, very small values of Hidden Units already lead to very good classification results. If Hidden Units is chosen too large, the MLP learns the training data very well but does not generalize well to unknown data. In Auto mode, the number of hidden units is estimated by a heuristic based on the number of characters; if the number of characters is very high, the estimate may also be too high, which leads to a very slow training.

Iterations (Method: MLP)
Selection: Maximum number of iterations for training an MLP classifier; default = 200.
Explanation: Select a sufficient number of iterations to create an MLP classifier that performs well in the subsequent application.

Weight Tolerance (Method: MLP)
Selection: MLP training continues as long as the weights change more than this value between iterations; default = 1.
Explanation: Choose a realistic value for Weight Tolerance. If the value is too small, the training takes longer. If it is set very high, the training aborts quickly and the classification results will not be very useful either.

Error Tolerance (Method: MLP)
Selection: MLP training continues as long as the error changes more than this value between iterations; default = 0.01.
Explanation: Choose a realistic value for Error Tolerance. If the value is too small, the training takes longer. If it is set very high, almost any error is accepted, so the results may not be useful either.

Mode (Method: SVM)
Selection: One vs. All, One vs. One; default = One vs. All.
Explanation: The voting method used to combine the binary support vector machine classifiers. One vs. All creates one classifier per class, in which that class is compared to the rest of the training data; during testing, the class with the largest output is chosen. One vs. One creates a binary classifier for each pair of classes; during testing, a vote is cast and the class with the majority of the votes is selected.

Specialization (Gamma) (Method: SVM)
Selection: Gamma parameter of the radial basis kernel function: 0.01, 0.02, 0.05, 0.1, 0.5; default = 0.02.
Explanation: Specifies the amount of influence of a support vector on its surroundings. A large value means a small region of influence; in the extreme, every training vector becomes a support vector and the training and classification times grow. A value that is too small leads to few support vectors.

Regularization (Nu) (Method: SVM)
Selection: Regularization constant of the SVM; default = 0.05.
Explanation: Nu controls the trade-off between the number of training errors and the number of support vectors. A typical strategy is to start with a small pair of Specialization (Gamma) and Regularization (Nu) values and to increase both successively as long as the recognition rate increases.

Number of Trees (Method: kNN)
Selection: Number of search trees; default = 4.
Explanation: The number of trees used by the kNN classifier. If more trees are used, the classification becomes more robust, but the runtime increases as well.
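
As an illustration, the following HDevelop sketch shows how the MLP parameters above map onto the operators create_ocr_class_mlp and trainf_ocr_class_mlp. The character size (8x10 pixels), the character set, and the training file name 'train.trf' are placeholder assumptions; the numeric values correspond to the defaults listed in the table.

    * Sketch: create an MLP OCR classifier. The character size,
    * the 'default' feature set, and the character set below are
    * placeholders for your own data.
    Characters := ['0','1','2','3','4','5','6','7','8','9']
    * 80 hidden units; Preprocessing 'none', so the number of
    * components (10) is ignored; 42 is the random seed.
    create_ocr_class_mlp (8, 10, 'constant', 'default', Characters, 80, 'none', 10, 42, OCRHandle)
    * Train with the defaults from the table: at most 200 iterations,
    * weight tolerance 1, error tolerance 0.01.
    trainf_ocr_class_mlp (OCRHandle, 'train.trf', 200, 1, 0.01, Error, ErrorLog)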
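
A corresponding sketch for the SVM and kNN variants, under the same placeholder assumptions. At the operator level, the Mode values One vs. All and One vs. One appear as 'one-versus-all' and 'one-versus-one'.

    * Sketch: SVM OCR classifier with a radial basis kernel;
    * Gamma = 0.02 and Nu = 0.05 are the defaults from the table.
    Characters := ['0','1','2','3','4','5','6','7','8','9']
    create_ocr_class_svm (8, 10, 'constant', 'default', Characters, 'rbf', 0.02, 0.05, 'one-versus-all', 'none', 10, OCRHandleSVM)
    trainf_ocr_class_svm (OCRHandleSVM, 'train.trf', 0.001, 'default')
    * Following the strategy described above, one would recreate the
    * classifier with larger Gamma/Nu pairs as long as the
    * recognition rate improves.
    * Sketch: kNN OCR classifier with 4 search trees (the default).
    create_ocr_class_knn (8, 10, 'constant', 'default', Characters, 4, OCRHandleKNN)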

Other features that can be chosen are pixel, pixel_invar, pixel_binary, gradient_8dir, projection_horizontal, projection_horizontal_invar, projection_vertical, projection_vertical_invar, ratio, anisometry, width, height, zoom_factor, foreground, foreground_grid_9, foreground_grid_16, compactness, convexity, moments_region_2nd_invar, moments_region_2nd_rel_invar, moments_region_3rd_invar, moments_central, moments_gray_plane, phi, num_connect, num_holes, cooc, num_runs, and chord_histo.
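
These feature names can be combined freely in the Features parameter of the creation operators. The following sketch shows one example combination (an illustration, not a recommendation); the character size, character set, and number of hidden units are again placeholder assumptions.

    * Replace the 'default' features with a custom feature tuple.
    Features := ['ratio','pixel_invar','convexity']
    create_ocr_class_mlp (8, 10, 'constant', Features, ['A','B','C'], 40, 'none', 10, 42, OCRHandle)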

For more information about the effect of these parameters, please refer to the reference documentation of the HALCON operators create_ocr_class_mlp, create_ocr_class_svm, create_ocr_class_knn, and create_ocr_class_box.