select_feature_set_gmm (Operator)

Name

select_feature_set_gmm: Selects an optimal combination of features to classify the provided data.

Signature

select_feature_set_gmm( : : ClassTrainDataHandle, SelectionMethod, GenParamName, GenParamValue : GMMHandle, SelectedFeatureIndices, Score)

Herror T_select_feature_set_gmm(const Htuple ClassTrainDataHandle, const Htuple SelectionMethod, const Htuple GenParamName, const Htuple GenParamValue, Htuple* GMMHandle, Htuple* SelectedFeatureIndices, Htuple* Score)

void SelectFeatureSetGmm(const HTuple& ClassTrainDataHandle, const HTuple& SelectionMethod, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* GMMHandle, HTuple* SelectedFeatureIndices, HTuple* Score)

HTuple HClassGmm::SelectFeatureSetGmm(const HClassTrainData& ClassTrainDataHandle, const HString& SelectionMethod, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)

HTuple HClassGmm::SelectFeatureSetGmm(const HClassTrainData& ClassTrainDataHandle, const HString& SelectionMethod, const HString& GenParamName, double GenParamValue, HTuple* Score)

HTuple HClassGmm::SelectFeatureSetGmm(const HClassTrainData& ClassTrainDataHandle, const char* SelectionMethod, const char* GenParamName, double GenParamValue, HTuple* Score)

HTuple HClassGmm::SelectFeatureSetGmm(const HClassTrainData& ClassTrainDataHandle, const wchar_t* SelectionMethod, const wchar_t* GenParamName, double GenParamValue, HTuple* Score)   (Windows only)

HClassGmm HClassTrainData::SelectFeatureSetGmm(const HString& SelectionMethod, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* SelectedFeatureIndices, HTuple* Score) const

HClassGmm HClassTrainData::SelectFeatureSetGmm(const HString& SelectionMethod, const HString& GenParamName, double GenParamValue, HTuple* SelectedFeatureIndices, HTuple* Score) const

HClassGmm HClassTrainData::SelectFeatureSetGmm(const char* SelectionMethod, const char* GenParamName, double GenParamValue, HTuple* SelectedFeatureIndices, HTuple* Score) const

HClassGmm HClassTrainData::SelectFeatureSetGmm(const wchar_t* SelectionMethod, const wchar_t* GenParamName, double GenParamValue, HTuple* SelectedFeatureIndices, HTuple* Score) const   (Windows only)

static void HOperatorSet.SelectFeatureSetGmm(HTuple classTrainDataHandle, HTuple selectionMethod, HTuple genParamName, HTuple genParamValue, out HTuple GMMHandle, out HTuple selectedFeatureIndices, out HTuple score)

HTuple HClassGmm.SelectFeatureSetGmm(HClassTrainData classTrainDataHandle, string selectionMethod, HTuple genParamName, HTuple genParamValue, out HTuple score)

HTuple HClassGmm.SelectFeatureSetGmm(HClassTrainData classTrainDataHandle, string selectionMethod, string genParamName, double genParamValue, out HTuple score)

HClassGmm HClassTrainData.SelectFeatureSetGmm(string selectionMethod, HTuple genParamName, HTuple genParamValue, out HTuple selectedFeatureIndices, out HTuple score)

HClassGmm HClassTrainData.SelectFeatureSetGmm(string selectionMethod, string genParamName, double genParamValue, out HTuple selectedFeatureIndices, out HTuple score)

def select_feature_set_gmm(class_train_data_handle: HHandle, selection_method: str, gen_param_name: MaybeSequence[str], gen_param_value: MaybeSequence[Union[int, str, float]]) -> Tuple[HHandle, Sequence[str], Sequence[float]]

Description

select_feature_set_gmm selects an optimal subset from a set of features to solve a given classification problem. The classification problem has to be specified with annotated training data in ClassTrainDataHandle and is solved with a Gaussian Mixture Model (GMM) classifier. Details of the properties of this classifier can be found in create_class_gmm.

The result of the operator is a trained classifier that is returned in GMMHandle. Additionally, the list of indices or names of the selected features is returned in SelectedFeatureIndices. To use this classifier, calculate all features listed in SelectedFeatureIndices for new input data and pass them to the classifier.

A possible application of this operator can be a comparison of different parameter sets for certain feature extraction techniques. Another application is to search for a feature that is discriminating between different classes.

To define the features that should be selected from ClassTrainDataHandle, the dimensions of the feature vectors in ClassTrainDataHandle can be grouped into subfeatures by calling set_feature_lengths_class_train_data. A subfeature can contain several consecutive elements of a feature vector. select_feature_set_gmm decides for each of these subfeatures whether it is better to use it for the classification or to leave it out.

The indices of the selected subfeatures are returned in SelectedFeatureIndices. If names were set in set_feature_lengths_class_train_data, these names are returned instead of the indices. If set_feature_lengths_class_train_data was not called for ClassTrainDataHandle before, each element of the feature vector is considered a subfeature.
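The grouping described above can be pictured with a small, library-independent Python sketch. It is an illustration only and does not call HALCON: the helper names subfeature_slices and project are hypothetical, and representing unnamed subfeatures by their index as a string is an assumption based on the string-array type of SelectedFeatureIndices.

```python
from itertools import accumulate


def subfeature_slices(lengths, names=None):
    """Map each subfeature to the slice of the feature vector it occupies."""
    ends = list(accumulate(lengths))
    starts = [0] + ends[:-1]
    # Without names, fall back to the subfeature index (as a string, by assumption).
    keys = list(names) if names else [str(i) for i in range(len(lengths))]
    return {key: slice(start, end) for key, start, end in zip(keys, starts, ends)}


def project(vector, slices, selected):
    """Keep only the elements that belong to the selected subfeatures."""
    reduced = []
    for key in selected:
        reduced.extend(vector[slices[key]])
    return reduced


# Layout of the HDevelop example: a 3-element and a 2-element subfeature.
slices = subfeature_slices([3, 2], ['Good Feature', 'Bad Feature'])
reduced = project([1, 1, 1, 2, 1], slices, ['Good Feature'])

# Without explicit lengths, every vector element is its own subfeature.
default_slices = subfeature_slices([1, 1, 1, 1, 1])
```

A projection of this kind is what "calculate all selected features and pass them to the classifier" amounts to when new data still contains every subfeature.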

The selection method SelectionMethod is either a greedy search 'greedy' (iteratively add the feature with the highest gain) or the dynamically oscillating search 'greedy_oscillating' (add the feature with the highest gain and then test whether any of the already added features can be left out without great loss). The method 'greedy' is generally preferable, since it is faster. The method 'greedy_oscillating' should be chosen only when the subfeatures are low-dimensional or redundant.
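The difference between the two strategies can be sketched in plain Python. This is a minimal illustration of the general idea, not HALCON's implementation: the function greedy_select, its tolerance parameter (standing in for "without great loss"), the rule that dropped features are not reconsidered, and the toy score table are all assumptions made for the example.

```python
def greedy_select(candidates, score, oscillating=False, tolerance=0.0):
    """Forward selection; with oscillating=True, also drop redundant earlier picks."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        # Forward step: find the candidate with the highest resulting score.
        top = max(remaining, key=lambda c: score(selected + [c]))
        top_score = score(selected + [top])
        if top_score <= best:
            break  # no candidate improves the score any further
        remaining.remove(top)
        selected.append(top)
        best = top_score
        if oscillating:
            # Backward step: test whether an earlier pick can be left out.
            for old in selected[:-1]:
                reduced = [f for f in selected if f != old]
                reduced_score = score(reduced)
                if reduced_score >= best - tolerance:
                    selected, best = reduced, reduced_score
    return selected, best


# Toy scores: 'a' is the best single feature, but 'b' and 'c' together
# make it redundant.
table = {
    frozenset("a"): 5, frozenset("b"): 3, frozenset("c"): 3,
    frozenset("ab"): 6, frozenset("ac"): 6, frozenset("bc"): 9,
    frozenset("abc"): 9,
}
toy_score = lambda subset: table[frozenset(subset)]

plain = greedy_select("abc", toy_score)
oscillated = greedy_select("abc", toy_score, oscillating=True)
```

On this toy table both variants reach the same score, but the oscillating variant ends with a smaller feature set because it removes the feature that became redundant. The extra backward evaluations are also why it is slower.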

The optimization criterion is the classification rate of a two-fold cross-validation of the training data. The best achieved value is returned in Score.
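To make the criterion concrete, the following Python sketch computes a two-fold cross-validation classification rate for a chosen set of vector columns. Several things here are assumptions for illustration: a simple nearest-mean classifier stands in for the GMM, the data is split into first and second half (how the operator actually splits the data is not stated here), and the function names are hypothetical. The samples are the ones from the HDevelop example in this page.

```python
def nearest_mean_classifier(train):
    """Stand-in classifier (NOT a GMM): one mean vector per class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    means = {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

    def classify(features):
        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(features, means[label]))
        return min(means, key=squared_distance)

    return classify


def two_fold_rate(samples, columns):
    """Train on one half, test on the other, swap, and return the hit rate."""
    reduced = [([f[i] for i in columns], label) for f, label in samples]
    half = len(reduced) // 2
    folds = [reduced[:half], reduced[half:]]  # assumed split: first/second half
    correct = 0
    for k in (0, 1):
        classify = nearest_mean_classifier(folds[1 - k])
        correct += sum(classify(f) == label for f, label in folds[k])
    return correct / len(reduced)


samples = [
    ([1, 1, 1, 2, 1], 0), ([2, 2, 2, 2, 1], 1),
    ([1, 1, 1, 3, 4], 0), ([2, 2, 2, 3, 4], 1),
    ([0, 0, 1, 5, 6], 0), ([2, 3, 2, 5, 6], 1),
    ([0, 0, 1, 5, 6], 0), ([2, 3, 2, 5, 6], 1),
    ([0, 0, 1, 5, 6], 0), ([2, 3, 2, 5, 6], 1),
]
rate_good = two_fold_rate(samples, [0, 1, 2])  # columns of 'Good Feature'
rate_bad = two_fold_rate(samples, [3, 4])      # columns of 'Bad Feature'
```

A rate computed this way plays the role of the score function in a feature search: the subset with the highest rate wins.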

The following generic parameters can be set in GenParamName and GenParamValue:

'min_centers'

Minimum number of clusters to represent a class in the training data.

Possible values: '1', '2'

Default: '1'

'max_center'

Maximum number of clusters to represent a class in the training data.

Possible values: '1', '5', '10'

Default: '1'

'covar_type'

Type of the covariance to represent the size of a cluster.

Possible values: 'spherical', 'diag', 'full'

Default: 'spherical'

'random_seed'

Random seed.

Default: '42'

'threshold'

Training threshold.

Default: '0.001'

'regularize'

Regularization value.

Default: '0.0001'

'randomize'

Randomize the input vector.

Default: '0'

'class_priors'

Mode to determine the a-priori probabilities of the classes.

Possible values: 'training', 'uniform'

Default: 'training'

A more exact description of these parameters can be found in create_class_gmm and train_class_gmm.
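Generic parameters are passed as two parallel tuples, one of names and one of values. The following Python sketch shows one way to build such a pair from a dictionary and to catch misspelled names early. The helper to_gen_params is hypothetical and not part of HALCON; the allowed names are copied from the value list of GenParamName in this page, and passing numeric values as numbers rather than strings is an assumption.

```python
# Parameter names as listed for GenParamName in this reference page.
ALLOWED_NAMES = (
    'class_priors', 'covar_type', 'max_center', 'min_centers',
    'random_seed', 'randomize', 'regularize', 'threshold',
)


def to_gen_params(options):
    """Turn a dict of options into parallel name and value lists."""
    unknown = sorted(set(options) - set(ALLOWED_NAMES))
    if unknown:
        raise ValueError('unknown generic parameter(s): ' + ', '.join(unknown))
    names = list(options)
    values = [options[name] for name in names]
    return names, values


gen_param_name, gen_param_value = to_gen_params(
    {'covar_type': 'full', 'min_centers': 1, 'random_seed': 42}
)
```

The two resulting lists line up index by index, which is the shape GenParamName and GenParamValue expect. An empty dictionary yields two empty lists, matching the default [].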

Attention

This operator may take considerable time, depending on the size of the data set in the training file and the number of features.

Note that this operator should not be called if only a small set of training data is available. Due to the risk of overfitting, select_feature_set_gmm may deliver a classifier with a very high score that nevertheless performs poorly when tested.

Execution Information

This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.

Parameters

ClassTrainDataHandle (input_control)  class_train_data: HClassTrainData, HTuple / HHandle / Htuple (handle) (IntPtr) (HHandle) (handle)

Handle of the training data.

SelectionMethod (input_control)  string: HTuple / str / Htuple (string) (string) (HString) (char*)

Method to perform the selection.

Default: 'greedy'

List of values: 'greedy', 'greedy_oscillating'

GenParamName (input_control)  string(-array): HTuple / MaybeSequence[str] / Htuple (string) (string) (HString) (char*)

Names of generic parameters to configure the classifier.

Default: []

List of values: 'class_priors', 'covar_type', 'max_center', 'min_centers', 'random_seed', 'randomize', 'regularize', 'threshold'

GenParamValue (input_control)  number(-array): HTuple / MaybeSequence[Union[int, str, float]] / Htuple (real / integer / string) (double / int / long / string) (double / Hlong / HString) (double / Hlong / char*)

Values of generic parameters to configure the classifier.

Default: []

Suggested values: 1, 2, 3, 'spherical', 'diag', 'full', 42, 0.001, 0.0001, 0

GMMHandle (output_control)  class_gmm: HClassGmm, HTuple / HHandle / Htuple (handle) (IntPtr) (HHandle) (handle)

A trained GMM classifier using only the selected features.

SelectedFeatureIndices (output_control)  string-array: HTuple / Sequence[str] / Htuple (string) (string) (HString) (char*)

The selected feature set, given as indices or names.

Score (output_control)  real-array: HTuple / Sequence[float] / Htuple (real) (double) (double) (double)

The achieved score using two-fold cross-validation.

Example (HDevelop)

* Find out which of the two features distinguishes two Classes
NameFeature1 := 'Good Feature'
NameFeature2 := 'Bad Feature'
LengthFeature1 := 3
LengthFeature2 := 2
* Create training data
create_class_train_data (LengthFeature1+LengthFeature2,\
  ClassTrainDataHandle)
* Define the features which are in the training data
set_feature_lengths_class_train_data (ClassTrainDataHandle, [LengthFeature1,\
  LengthFeature2], [NameFeature1, NameFeature2])
* Add training data
*                                                         |Feat1| |Feat2|
add_sample_class_train_data (ClassTrainDataHandle, 'row', [1,1,1,  2,1  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,2,2,  2,1  ], 1)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [1,1,1,  3,4  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,2,2,  3,4  ], 1)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [0,0,1,  5,6  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,3,2,  5,6  ], 1)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [0,0,1,  5,6  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,3,2,  5,6  ], 1)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [0,0,1,  5,6  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,3,2,  5,6  ], 1)
* Add more data
* ...
* Select the better feature with a GMM
select_feature_set_gmm (ClassTrainDataHandle, 'greedy', [], [], GMMHandle,\
  SelectedFeatureGMM, Score)
* Use the classifier
* ...

Result

If the parameters are valid, the operator select_feature_set_gmm returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.

Possible Predecessors

create_class_train_data, add_sample_class_train_data, set_feature_lengths_class_train_data

Possible Successors

classify_class_gmm

Alternatives

select_feature_set_mlp, select_feature_set_knn, select_feature_set_svm

See also

create_class_gmm, gray_features, region_features

Module

Foundation