create_class_gmm (Operator)

Name

create_class_gmm - Create a Gaussian Mixture Model for classification.

Signature

create_class_gmm( : : NumDim, NumClasses, NumCenters, CovarType, Preprocessing, NumComponents, RandSeed : GMMHandle)

Description

create_class_gmm creates a Gaussian Mixture Model (GMM) for classification. NumDim specifies the number of dimensions of the feature space, and NumClasses specifies the number of classes. A GMM consists of NumCenters Gaussian centers per class. NumCenters can either give the exact number of centers to be used or, depending on the number of values passed, specify lower and upper bounds for the number of centers:

exactly one parameter: The parameter determines the exact number of centers to be used for all classes.
exactly two parameters: The first parameter determines the minimum number of centers, the second determines the maximum number of centers for all classes.
exactly 2 * NumClasses parameters: Alternating, every first parameter determines the minimum number of centers per class and every second parameter determines the maximum number of centers per class.
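The three ways of passing NumCenters can be illustrated with a small Python sketch. It is not part of HALCON: the helper name center_bounds is hypothetical, and the validation rules (minimum >= 1, maximum >= minimum) are assumptions made for the illustration.

```python
def center_bounds(num_centers, num_classes):
    """Expand a NumCenters-style value into one (min, max) pair per class.

    Hypothetical helper for illustration only; it mirrors the three cases
    described above and is not HALCON code.
    """
    values = [num_centers] if isinstance(num_centers, int) else list(num_centers)
    if len(values) == 1:
        # One value: exact number of centers for all classes.
        pairs = [(values[0], values[0])] * num_classes
    elif len(values) == 2:
        # Two values: common minimum and maximum for all classes.
        pairs = [(values[0], values[1])] * num_classes
    elif len(values) == 2 * num_classes:
        # 2 * NumClasses values: alternating minimum and maximum per class.
        pairs = [(values[2 * i], values[2 * i + 1]) for i in range(num_classes)]
    else:
        raise ValueError("expected 1, 2, or 2 * num_classes values")
    for lower, upper in pairs:
        # Assumed sanity checks for this sketch.
        if lower < 1 or upper < lower:
            raise ValueError("bounds must satisfy 1 <= min <= max")
    return pairs


bounds = center_bounds([1, 5, 2, 4], num_classes=2)
```

With two classes, the value [1,5,2,4] is read as 1 to 5 centers for the first class and 2 to 4 centers for the second.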

When upper and lower bounds are specified, the optimum number of centers is determined with the help of the Minimum Message Length (MML) criterion. In general, we recommend starting the training with a generous (too high) maximum number of centers and the expected number of centers as the minimum.

Each center $j$ is described by the parameters center $\mu_j$, covariance matrix $\Sigma_j$, and mixing coefficient $P_j$. These parameters are calculated from the training data by means of the Expectation Maximization (EM) algorithm. A GMM can approximate an arbitrary probability density, provided that enough centers are used. The covariance matrices have the dimensions NumDim x NumDim (NumComponents x NumComponents if preprocessing is used) and are symmetric. Further constraints can be given by CovarType:

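In the following, $\mu_j$ denotes the center, $\Sigma_j$ the covariance matrix of center $j$, and $d$ the dimension of the feature vector $x$.

For CovarType = 'spherical', $\Sigma_j$ is a scalar multiple of the identity matrix, $\Sigma_j = \sigma_j^2 I$. The center density function is

$$p(x|j) = \frac{1}{(2\pi\sigma_j^2)^{d/2}} \exp\left(-\frac{\|x-\mu_j\|^2}{2\sigma_j^2}\right)$$

For CovarType = 'diag', $\Sigma_j$ is a diagonal matrix, $\Sigma_j = \mathrm{diag}(\sigma_{j,1}^2, \ldots, \sigma_{j,d}^2)$. The center density function is

$$p(x|j) = \frac{1}{\left((2\pi)^d \prod_{i=1}^{d} \sigma_{j,i}^2\right)^{1/2}} \exp\left(-\sum_{i=1}^{d} \frac{(x_i-\mu_{j,i})^2}{2\sigma_{j,i}^2}\right)$$

For CovarType = 'full', $\Sigma_j$ is a positive definite matrix. The center density function is

$$p(x|j) = \frac{1}{(2\pi)^{d/2} |\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_j)^T \Sigma_j^{-1} (x-\mu_j)\right)$$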
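The three center density functions can be checked numerically with a short Python sketch. It uses the standard Gaussian density for each covariance type and is an illustration only, not HALCON's implementation. The function names are hypothetical, and the 'full' case is restricted to two dimensions so that the matrix inverse can be written in closed form.

```python
import math


def spherical_density(x, mu, sigma2):
    """Gaussian density with covariance sigma2 * I (illustrative helper)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-sq_dist / (2.0 * sigma2)) / (2.0 * math.pi * sigma2) ** (d / 2.0)


def diag_density(x, mu, variances):
    """Gaussian density with a diagonal covariance matrix (illustrative helper)."""
    norm = math.sqrt(math.prod(2.0 * math.pi * v for v in variances))
    expo = sum((xi - mi) ** 2 / (2.0 * v) for xi, mi, v in zip(x, mu, variances))
    return math.exp(-expo) / norm


def full_density_2d(x, mu, cov):
    """Gaussian density with a full 2x2 covariance matrix (illustrative helper)."""
    (a, b), (c, e) = cov
    det = a * e - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (e * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))


x, mu = [1.0, 2.0], [0.0, 0.0]
p_spherical = spherical_density(x, mu, 2.0)
p_diag = diag_density(x, mu, [2.0, 2.0])
p_full = full_density_2d(x, mu, [[2.0, 0.0], [0.0, 2.0]])
```

For a covariance matrix that is a multiple of the identity, all three functions return the same value, which reflects that 'spherical' is a special case of 'diag', and 'diag' a special case of 'full'.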

The complexity of the calculations increases from CovarType = 'spherical' over CovarType = 'diag' to CovarType = 'full'. At the same time, the flexibility of the centers increases. In general, 'spherical' therefore needs higher values for NumCenters than 'full'.
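One way to see the difference in flexibility is to count the free parameters of the covariance matrix of a single center. The following Python sketch does this; the helper name is hypothetical, and the counts follow from the matrix structure (one scalar, one value per dimension, or a full symmetric matrix), not from any HALCON internals.

```python
def covariance_parameters(num_dim, covar_type):
    """Number of free covariance parameters per center (illustrative helper)."""
    if covar_type == 'spherical':
        return 1                                # a single variance
    if covar_type == 'diag':
        return num_dim                          # one variance per dimension
    if covar_type == 'full':
        return num_dim * (num_dim + 1) // 2     # symmetric d x d matrix
    raise ValueError("unknown covariance type: %r" % covar_type)


counts = {t: covariance_parameters(10, t) for t in ('spherical', 'diag', 'full')}
```

For a 10-dimensional feature space this gives 1, 10, and 55 parameters per center, respectively.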

The procedure for using a GMM is as follows: First, a GMM is created with create_class_gmm. Then, training vectors are added with add_sample_class_gmm; afterwards, they can be written to disk with write_samples_class_gmm. With train_class_gmm, the classifier center parameters (defined above) are determined. Furthermore, they can be saved with write_class_gmm for later classifications.

From the mixing probabilities $P_j$ and the center density functions $p(x|j)$, the probability density function p(x) can be calculated by:

$$p(x) = \sum_{j=1}^{n} P_j \, p(x|j)$$

where $n$ is the number of centers.

The probability density function p(x) can be evaluated with evaluate_class_gmm for a feature vector x. classify_class_gmm sorts the values of p(x) obtained for the individual classes and thereby determines the most probable class of the feature vector.
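The relation between the mixture density and the class decision can be sketched in Python. This is a simplified one-dimensional illustration under stated assumptions: each class is given as a list of (mixing coefficient, mean, variance) triples, class priors are ignored, and the function names are hypothetical. It is not the implementation behind evaluate_class_gmm or classify_class_gmm.

```python
import math


def gaussian_1d(x, mu, var):
    """One-dimensional Gaussian center density."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, centers):
    """p(x) as the sum of mixing coefficient times center density."""
    return sum(p * gaussian_1d(x, mu, var) for p, mu, var in centers)


def most_probable_class(x, class_models):
    """Return the index of the class with the highest p(x) and all densities."""
    densities = [mixture_density(x, centers) for centers in class_models]
    best = max(range(len(densities)), key=densities.__getitem__)
    return best, densities


# Assumed toy models: class 0 has two centers, class 1 has one center.
class_models = [
    [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)],
    [(1.0, 5.0, 1.0)],
]
class_near_zero, _ = most_probable_class(0.0, class_models)
class_near_five, _ = most_probable_class(5.0, class_models)
```

In this toy setup, a feature value near 0 is assigned to class 0 and a value near 5 to class 1, because the respective mixture density is larger there.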

The parameters Preprocessing and NumComponents can be used to preprocess the training data and reduce its dimensions. These parameters are explained in the description of the operator create_class_mlp.

create_class_gmm initializes the coordinates of the centers with random numbers. To ensure that the results of training the classifier with train_class_gmm are reproducible, the seed value of the random number generator is passed in RandSeed.
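The effect of a fixed seed can be illustrated with a generic Python sketch. It uses Python's own random number generator, which is an assumption for the illustration; the generator and the value range used inside HALCON are not specified here, and the helper name is hypothetical.

```python
import random


def random_centers(num_centers, num_dim, rand_seed):
    """Draw initial center coordinates from a seeded generator (illustrative)."""
    rng = random.Random(rand_seed)  # private generator, independent of global state
    return [[rng.random() for _ in range(num_dim)] for _ in range(num_centers)]


first_run = random_centers(2, 3, 42)
second_run = random_centers(2, 3, 42)
```

Both calls use the same seed and therefore yield identical initial coordinates, which is what makes a subsequent training run repeatable.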

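Execution Information

Multithreading type: reentrant (runs in parallel with non-exclusive operators).
Multithreading scope: global (may be called from any thread).
Processed without parallelization.

This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.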

Parameters

NumDim (input_control) integer → (integer)

Number of dimensions of the feature space.

Default value: 3

Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100

Restriction: NumDim >= 1

NumClasses (input_control) integer → (integer)

Number of classes of the GMM.

Default value: 5

Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Restriction: NumClasses >= 1

NumCenters (input_control) integer(-array) → (integer)

Number of centers per class.

Default value: 1

Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30

Restriction: NumCenters >= 1

CovarType (input_control) string → (string)

Type of the covariance matrices.

Default value: 'spherical'

List of values: 'diag', 'full', 'spherical'

Preprocessing (input_control) string → (string)

Type of preprocessing used to transform the feature vectors.

Default value: 'normalization'

List of values: 'canonical_variates', 'none', 'normalization', 'principal_components'

NumComponents (input_control) integer → (integer)

Preprocessing parameter: Number of transformed features (ignored for Preprocessing = 'none' and Preprocessing = 'normalization').

Default value: 10

Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100

Restriction: NumComponents >= 1

RandSeed (input_control) integer → (integer)

Seed value of the random number generator that is used to initialize the GMM with random values.

Default value: 42

GMMHandle (output_control) class_gmm → (handle)

GMM handle.

Example (HDevelop)

* Classification with Gaussian Mixture Models
create_class_gmm (NumDim, NumClasses, [1,5], 'full', 'none',\
                  NumComponents, 42, GMMHandle)
* Add the training data
for J := 0 to NumData-1 by 1
    * Features := [...]
    * ClassID := [...]
    add_sample_class_gmm (GMMHandle, Features, ClassID, Randomize)
endfor
* Train the GMM
train_class_gmm (GMMHandle, 100, 0.001, 'training', 0.0001, Centers, Iter)
* Classify unknown data in 'Features'
classify_class_gmm (GMMHandle, Features, 1, ID, Prob, Density, KSigmaProb)

Result

If the parameters are valid, the operator create_class_gmm returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.

Possible Successors

add_sample_class_gmm, add_samples_image_class_gmm

Alternatives

create_class_mlp, create_class_svm

See also

clear_class_gmm, train_class_gmm, classify_class_gmm, evaluate_class_gmm, classify_image_class_gmm

References

Christopher M. Bishop: “Neural Networks for Pattern Recognition”; Oxford University Press, Oxford; 1995.
Mario A.T. Figueiredo: “Unsupervised Learning of Finite Mixture Models”; IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 3; March 2002.

Module

Foundation
