select_feature_set_trainf_knnT_select_feature_set_trainf_knnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnselect_feature_set_trainf_knn (算子)
名称
select_feature_set_trainf_knnT_select_feature_set_trainf_knnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnselect_feature_set_trainf_knn — 选择一个最佳的特征组合来对 OCR 数据进行分类。
签名
void SelectFeatureSetTrainfKnn(const HTuple& TrainingFile, const HTuple& FeatureList, const HTuple& SelectionMethod, const HTuple& Width, const HTuple& Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* OCRHandle, HTuple* FeatureSet, HTuple* Score)
HTuple HOCRKnn::SelectFeatureSetTrainfKnn(const HTuple& TrainingFile, const HTuple& FeatureList, const HString& SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
HTuple HOCRKnn::SelectFeatureSetTrainfKnn(const HString& TrainingFile, const HString& FeatureList, const HString& SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
HTuple HOCRKnn::SelectFeatureSetTrainfKnn(const char* TrainingFile, const char* FeatureList, const char* SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
HTuple HOCRKnn::SelectFeatureSetTrainfKnn(const wchar_t* TrainingFile, const wchar_t* FeatureList, const wchar_t* SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
(
Windows only)
static void HOperatorSet.SelectFeatureSetTrainfKnn(HTuple trainingFile, HTuple featureList, HTuple selectionMethod, HTuple width, HTuple height, HTuple genParamName, HTuple genParamValue, out HTuple OCRHandle, out HTuple featureSet, out HTuple score)
HTuple HOCRKnn.SelectFeatureSetTrainfKnn(HTuple trainingFile, HTuple featureList, string selectionMethod, int width, int height, HTuple genParamName, HTuple genParamValue, out HTuple score)
HTuple HOCRKnn.SelectFeatureSetTrainfKnn(string trainingFile, string featureList, string selectionMethod, int width, int height, HTuple genParamName, HTuple genParamValue, out HTuple score)
def select_feature_set_trainf_knn(training_file: MaybeSequence[str], feature_list: MaybeSequence[str], selection_method: str, width: int, height: int, gen_param_name: Sequence[str], gen_param_value: Sequence[Union[int, str, float]]) -> Tuple[HHandle, Sequence[str], Sequence[float]]
描述
select_feature_set_trainf_knnselect_feature_set_trainf_knnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnselect_feature_set_trainf_knn selects an optimal combination of
features, to classify the data given in the training file
TrainingFileTrainingFileTrainingFileTrainingFiletrainingFiletraining_file with a k-Nearest Neighbor classifier,
for details see create_ocr_class_knncreate_ocr_class_knnCreateOcrClassKnnCreateOcrClassKnnCreateOcrClassKnncreate_ocr_class_knn。
Possible features are all OCR features listed and explained in
create_ocr_class_knncreate_ocr_class_knnCreateOcrClassKnnCreateOcrClassKnnCreateOcrClassKnncreate_ocr_class_knn。All candidates which should be tested can be
specified in FeatureListFeatureListFeatureListFeatureListfeatureListfeature_list. A subset of these features is
returned as selected features in FeatureSetFeatureSetFeatureSetFeatureSetfeatureSetfeature_set。
select_feature_set_trainf_knnselect_feature_set_trainf_knnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnselect_feature_set_trainf_knn is specialized on OCR problems and
only supports the features in the list mentioned before.
In order to use other features, please use the more general operator
select_feature_set_knnselect_feature_set_knnSelectFeatureSetKnnSelectFeatureSetKnnSelectFeatureSetKnnselect_feature_set_knn。
The selection method
SelectionMethodSelectionMethodSelectionMethodSelectionMethodselectionMethodselection_method is either a greedy search 'greedy'"greedy""greedy""greedy""greedy""greedy"
(iteratively add the feature with highest gain)
or the dynamically oscillating search 'greedy_oscillating'"greedy_oscillating""greedy_oscillating""greedy_oscillating""greedy_oscillating""greedy_oscillating"
(add the feature with highest gain and test then if any of the already added
features can be left out without great loss).
The method 'greedy'"greedy""greedy""greedy""greedy""greedy" is generally preferable, since it is faster.
Only in cases when a large training set is available
the method 'greedy_oscillating'"greedy_oscillating""greedy_oscillating""greedy_oscillating""greedy_oscillating""greedy_oscillating" might return better results.
The optimization criterion is the classification rate of a two-fold
cross-validation of the training data. The best achieved value
is returned in ScoreScoreScoreScorescorescore。
The k-NN classifier can be parametrized using the following values in
GenParamNameGenParamNameGenParamNameGenParamNamegenParamNamegen_param_name and GenParamValueGenParamValueGenParamValueGenParamValuegenParamValuegen_param_value:
- 'num_neighbors'"num_neighbors""num_neighbors""num_neighbors""num_neighbors""num_neighbors":
-
The number of minimally evaluated
nodes, increase this value for high dimensional data.
建议值:
'1'"1""1""1""1""1", '2'"2""2""2""2""2", '5'"5""5""5""5""5",
'10'"10""10""10""10""10"
默认值:
'1'"1""1""1""1""1"
- 'num_trees'"num_trees""num_trees""num_trees""num_trees""num_trees":
-
Number of search trees in the k-NN
classifier
建议值:
'1'"1""1""1""1""1", '4'"4""4""4""4""4", '10'"10""10""10""10""10"
默认值:
'4'"4""4""4""4""4"
注意
This operator may take considerable time, depending on the size of the
data set in the training file, and the number of features.
Please note, that this operator should not be called, if only a small
set of training data is available. Due to the risk of overfitting the
operator select_feature_set_trainf_knnselect_feature_set_trainf_knnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnselect_feature_set_trainf_knn may deliver a classifier with
a very high score. However, the classifier may perform poorly when tested.
执行信息
- 多线程类型:可重入(与非独占算子并行运行)。
- 多线程作用域:全局(可从任何线程调用)。
- 在内部数据级别上自动并行化。
此算子返回一个句柄。请注意,即使该句柄被用作特定算子的输入参数,这些算子仍可能改变此句柄类型的实例状态。
参数
TrainingFileTrainingFileTrainingFileTrainingFiletrainingFiletraining_file (输入控制) filename.read(-array) → HTupleMaybeSequence[str]HTupleHtuple (string) (string) (HString) (char*)
Names of the training files.
默认值:
''
""
""
""
""
""
File extension:
.trf, .otr
FeatureListFeatureListFeatureListFeatureListfeatureListfeature_list (输入控制) string(-array) → HTupleMaybeSequence[str]HTupleHtuple (string) (string) (HString) (char*)
List of features that should be considered for selection.
默认值:
['zoom_factor','ratio','width','height','foreground','foreground_grid_9','foreground_grid_16','anisometry','compactness','convexity','moments_region_2nd_invar','moments_region_2nd_rel_invar','moments_region_3rd_invar','moments_central','phi','num_connect','num_holes','projection_horizontal','projection_vertical','projection_horizontal_invar','projection_vertical_invar','chord_histo','num_runs','pixel','pixel_invar','pixel_binary','gradient_8dir','cooc','moments_gray_plane']
["zoom_factor","ratio","width","height","foreground","foreground_grid_9","foreground_grid_16","anisometry","compactness","convexity","moments_region_2nd_invar","moments_region_2nd_rel_invar","moments_region_3rd_invar","moments_central","phi","num_connect","num_holes","projection_horizontal","projection_vertical","projection_horizontal_invar","projection_vertical_invar","chord_histo","num_runs","pixel","pixel_invar","pixel_binary","gradient_8dir","cooc","moments_gray_plane"]
["zoom_factor","ratio","width","height","foreground","foreground_grid_9","foreground_grid_16","anisometry","compactness","convexity","moments_region_2nd_invar","moments_region_2nd_rel_invar","moments_region_3rd_invar","moments_central","phi","num_connect","num_holes","projection_horizontal","projection_vertical","projection_horizontal_invar","projection_vertical_invar","chord_histo","num_runs","pixel","pixel_invar","pixel_binary","gradient_8dir","cooc","moments_gray_plane"]
["zoom_factor","ratio","width","height","foreground","foreground_grid_9","foreground_grid_16","anisometry","compactness","convexity","moments_region_2nd_invar","moments_region_2nd_rel_invar","moments_region_3rd_invar","moments_central","phi","num_connect","num_holes","projection_horizontal","projection_vertical","projection_horizontal_invar","projection_vertical_invar","chord_histo","num_runs","pixel","pixel_invar","pixel_binary","gradient_8dir","cooc","moments_gray_plane"]
["zoom_factor","ratio","width","height","foreground","foreground_grid_9","foreground_grid_16","anisometry","compactness","convexity","moments_region_2nd_invar","moments_region_2nd_rel_invar","moments_region_3rd_invar","moments_central","phi","num_connect","num_holes","projection_horizontal","projection_vertical","projection_horizontal_invar","projection_vertical_invar","chord_histo","num_runs","pixel","pixel_invar","pixel_binary","gradient_8dir","cooc","moments_gray_plane"]
["zoom_factor","ratio","width","height","foreground","foreground_grid_9","foreground_grid_16","anisometry","compactness","convexity","moments_region_2nd_invar","moments_region_2nd_rel_invar","moments_region_3rd_invar","moments_central","phi","num_connect","num_holes","projection_horizontal","projection_vertical","projection_horizontal_invar","projection_vertical_invar","chord_histo","num_runs","pixel","pixel_invar","pixel_binary","gradient_8dir","cooc","moments_gray_plane"]
值列表:
'anisometry'"anisometry""anisometry""anisometry""anisometry""anisometry", 'chord_histo'"chord_histo""chord_histo""chord_histo""chord_histo""chord_histo", 'compactness'"compactness""compactness""compactness""compactness""compactness", 'convexity'"convexity""convexity""convexity""convexity""convexity", 'cooc'"cooc""cooc""cooc""cooc""cooc", 'default'"default""default""default""default""default", 'foreground'"foreground""foreground""foreground""foreground""foreground", 'foreground_grid_16'"foreground_grid_16""foreground_grid_16""foreground_grid_16""foreground_grid_16""foreground_grid_16", 'foreground_grid_9'"foreground_grid_9""foreground_grid_9""foreground_grid_9""foreground_grid_9""foreground_grid_9", 'gradient_8dir'"gradient_8dir""gradient_8dir""gradient_8dir""gradient_8dir""gradient_8dir", 'height'"height""height""height""height""height", 'moments_central'"moments_central""moments_central""moments_central""moments_central""moments_central", 'moments_gray_plane'"moments_gray_plane""moments_gray_plane""moments_gray_plane""moments_gray_plane""moments_gray_plane", 'moments_region_2nd_invar'"moments_region_2nd_invar""moments_region_2nd_invar""moments_region_2nd_invar""moments_region_2nd_invar""moments_region_2nd_invar", 'moments_region_2nd_rel_invar'"moments_region_2nd_rel_invar""moments_region_2nd_rel_invar""moments_region_2nd_rel_invar""moments_region_2nd_rel_invar""moments_region_2nd_rel_invar", 'moments_region_3rd_invar'"moments_region_3rd_invar""moments_region_3rd_invar""moments_region_3rd_invar""moments_region_3rd_invar""moments_region_3rd_invar", 'num_connect'"num_connect""num_connect""num_connect""num_connect""num_connect", 'num_holes'"num_holes""num_holes""num_holes""num_holes""num_holes", 'num_runs'"num_runs""num_runs""num_runs""num_runs""num_runs", 'phi'"phi""phi""phi""phi""phi", 'pixel'"pixel""pixel""pixel""pixel""pixel", 'pixel_binary'"pixel_binary""pixel_binary""pixel_binary""pixel_binary""pixel_binary", 'pixel_invar'"pixel_invar""pixel_invar""pixel_invar""pixel_invar""pixel_invar", 'projection_horizontal'"projection_horizontal""projection_horizontal""projection_horizontal""projection_horizontal""projection_horizontal", 'projection_horizontal_invar'"projection_horizontal_invar""projection_horizontal_invar""projection_horizontal_invar""projection_horizontal_invar""projection_horizontal_invar", 'projection_vertical'"projection_vertical""projection_vertical""projection_vertical""projection_vertical""projection_vertical", 'projection_vertical_invar'"projection_vertical_invar""projection_vertical_invar""projection_vertical_invar""projection_vertical_invar""projection_vertical_invar", 'ratio'"ratio""ratio""ratio""ratio""ratio", 'width'"width""width""width""width""width", 'zoom_factor'"zoom_factor""zoom_factor""zoom_factor""zoom_factor""zoom_factor"
SelectionMethodSelectionMethodSelectionMethodSelectionMethodselectionMethodselection_method (输入控制) string → HTuplestrHTupleHtuple (string) (string) (HString) (char*)
Method to perform the selection.
默认值:
'greedy'
"greedy"
"greedy"
"greedy"
"greedy"
"greedy"
值列表:
'greedy'"greedy""greedy""greedy""greedy""greedy", 'greedy_oscillating'"greedy_oscillating""greedy_oscillating""greedy_oscillating""greedy_oscillating""greedy_oscillating"
WidthWidthWidthWidthwidthwidth (输入控制) integer → HTupleintHTupleHtuple (integer) (int / long) (Hlong) (Hlong)
Width of the rectangle to which the gray values
of the segmented character are zoomed.
默认值:
15
HeightHeightHeightHeightheightheight (输入控制) integer → HTupleintHTupleHtuple (integer) (int / long) (Hlong) (Hlong)
Height of the rectangle to which the gray values
of the segmented character are zoomed.
默认值:
16
GenParamNameGenParamNameGenParamNameGenParamNamegenParamNamegen_param_name (输入控制) string-array → HTupleSequence[str]HTupleHtuple (string) (string) (HString) (char*)
Names of generic parameters to configure the selection
process and the classifier.
默认值:
[]
值列表:
'num_neighbors'"num_neighbors""num_neighbors""num_neighbors""num_neighbors""num_neighbors"
GenParamValueGenParamValueGenParamValueGenParamValuegenParamValuegen_param_value (输入控制) number-array → HTupleSequence[Union[int, str, float]]HTupleHtuple (real / integer / string) (double / int / long / string) (double / Hlong / HString) (double / Hlong / char*)
Values of generic parameters to configure the selection
process and the classifier.
默认值:
[]
建议值:
1, 2, 3
OCRHandleOCRHandleOCRHandleOCRHandleOCRHandleocrhandle (输出控制) ocr_knn → HOCRKnn, HTupleHHandleHTupleHtuple (handle) (IntPtr) (HHandle) (handle)
Trained OCR-k-NN classifier.
ScoreScoreScoreScorescorescore (输出控制) real-array → HTupleSequence[float]HTupleHtuple (real) (double) (double) (double)
Achieved score using tow-fold cross-validation.
结果
如果参数有效,算子
select_feature_set_trainf_knnselect_feature_set_trainf_knnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnSelectFeatureSetTrainfKnnselect_feature_set_trainf_knn 返回值 2 ( H_MSG_TRUE )。如有必要,则抛出异常。
替代
select_feature_set_trainf_svmselect_feature_set_trainf_svmSelectFeatureSetTrainfSvmSelectFeatureSetTrainfSvmSelectFeatureSetTrainfSvmselect_feature_set_trainf_svm,
select_feature_set_trainf_mlpselect_feature_set_trainf_mlpSelectFeatureSetTrainfMlpSelectFeatureSetTrainfMlpSelectFeatureSetTrainfMlpselect_feature_set_trainf_mlp
另见
select_feature_set_knnselect_feature_set_knnSelectFeatureSetKnnSelectFeatureSetKnnSelectFeatureSetKnnselect_feature_set_knn
模块
光学字符识别/光学字符验证