create_text_model_reader (Operator)
Name
create_text_model_reader: Create a text model.
Signature
void CreateTextModelReader(const HTuple& Mode, const HTuple& OCRClassifier, HTuple* TextModel)
void HTextModel::HTextModel(const HString& Mode, const HTuple& OCRClassifier)
void HTextModel::HTextModel(const HString& Mode, const HString& OCRClassifier)
void HTextModel::HTextModel(const char* Mode, const char* OCRClassifier)
void HTextModel::HTextModel(const wchar_t* Mode, const wchar_t* OCRClassifier) (Windows only)
void HTextModel::CreateTextModelReader(const HString& Mode, const HTuple& OCRClassifier)
void HTextModel::CreateTextModelReader(const HString& Mode, const HString& OCRClassifier)
void HTextModel::CreateTextModelReader(const char* Mode, const char* OCRClassifier)
void HTextModel::CreateTextModelReader(const wchar_t* Mode, const wchar_t* OCRClassifier) (Windows only)
Description
create_text_model_reader creates a TextModel, which
describes the text to be segmented with find_text.
The parameter Mode determines which text segmentation
approach is used. Possible values are 'auto' and 'manual'.
Typically, Mode should be set to 'auto'
because this mode is more stable and requires less configuration effort.
Note that in this case an OCR classifier must also be passed in
OCRClassifier. Mode must be set to 'manual' only if one of the
following restrictions applies:
- Text with strong local variations of the polarity must be segmented.
  For example, engraved text often shows such variations because of
  reflections.
- No suitable OCR classifier is available (see below).
If Mode = 'auto', find_text is able to extract text
of arbitrary size. It is possible to restrict the search to characters with
specific attributes, see set_text_model_param for details.
In particular, if the text to be segmented contains dot printed characters,
the text model parameter 'dot_print' must be set to 'true'.
Furthermore, an OCR classifier must be passed in OCRClassifier.
This OCR classifier must be based on a convolutional neural network (CNN) or
a multilayer perceptron (MLP). Moreover, it is strongly recommended to
use a CNN-based OCR classifier with a rejection class or an MLP-based
classifier that has been trained with regularization parameters (see
set_regularization_params_ocr_class_mlp) and provides a rejection
class (see set_rejection_params_ocr_class_mlp).
A suitable OCR classifier can either be read with read_ocr_class_cnn
or read_ocr_class_mlp, or be created with
create_ocr_class_mlp. It is also possible to pass a string containing
the path to a pretrained OCR classifier or an OCR classifier that has
been stored with write_ocr_class_mlp.
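The 'auto' mode setup for dot printed text described above might look like the following HDevelop sketch. It is an illustration only: the image name 'dot_print_sample' is a hypothetical placeholder, and the choice of 'DotPrint_Rej.omc' (taken from the suggested values of OCRClassifier) is an assumption about which pretrained font fits the data.

```hdevelop
* Assumption: 'dot_print_sample' is a placeholder image name.
read_image (Image, 'dot_print_sample')
* Pass the pretrained classifier by file name ...
create_text_model_reader ('auto', 'DotPrint_Rej.omc', TextModel)
* ... and tell the text model that the characters are dot printed.
set_text_model_param (TextModel, 'dot_print', 'true')
find_text (Image, TextModel, TextResultID)
get_text_result (TextResultID, 'class', Class)
```

Instead of a file name, a classifier handle can be passed, for example one obtained with read_ocr_class_mlp ('DotPrint_Rej.omc', OCRHandle).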
To enable text segmentation when Mode = 'manual',
reasonable parameters for the text model, including the expected
character height and width, must be set using
set_text_model_param. In this case, the value of
OCRClassifier is ignored.
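A minimal HDevelop sketch of the 'manual' mode follows. The parameter names 'manual_char_height', 'manual_char_width', and 'manual_stroke_width', the result name 'manual_all_lines', the numeric values, and the image name are assumptions that do not appear on this page; check them against the reference of set_text_model_param and get_text_object for the HALCON version in use.

```hdevelop
* Assumption: 'engraved_sample' is a placeholder image name.
read_image (Image, 'engraved_sample')
* OCRClassifier is ignored in 'manual' mode, so an empty tuple is passed.
create_text_model_reader ('manual', [], TextModel)
* Assumed parameter names and example values (in pixels).
set_text_model_param (TextModel, 'manual_char_height', 40)
set_text_model_param (TextModel, 'manual_char_width', 25)
set_text_model_param (TextModel, 'manual_stroke_width', 4)
find_text (Image, TextModel, TextResultID)
* Only the segmented regions are returned; no classification takes place.
get_text_object (Characters, TextResultID, 'manual_all_lines')
```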
The parameters of the TextModel can be set and queried with
set_text_model_param and get_text_model_param.
Since memory is allocated for the text model during the call of
create_text_model_reader and during the following operations, the
model should be freed explicitly with the operator clear_text_model as
soon as it is no longer used.
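As a sketch of the cleanup step, assuming TextModel and TextResultID were created earlier by create_text_model_reader and find_text (clear_text_result is not mentioned on this page and is included here as the matching call for the text result):

```hdevelop
* Free the text result first, then the text model itself.
clear_text_result (TextResultID)
clear_text_model (TextModel)
```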
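Execution Information
- Multithreading type: reentrant (runs in parallel with non-exclusive operators).
- Multithreading scope: global (may be called from any thread).
- Processed without parallelization.
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.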
Parameters
Mode (input_control) string → HTuple (string)
The mode of the text model.
Default value: 'auto'
List of values: 'auto', 'manual'
OCRClassifier (input_control) string → HTuple (string / integer)
OCR classifier.
Default value: 'Universal_Rej.occ'
Suggested values:
'Document_Rej.omc', 'Document_0-9_Rej.omc', 'Document_0-9A-Z_Rej.omc', 'Document_A-Z+_Rej.omc',
'DotPrint_Rej.omc', 'DotPrint_0-9_Rej.omc', 'DotPrint_0-9+_Rej.omc', 'DotPrint_0-9A-Z_Rej.omc', 'DotPrint_A-Z+_Rej.omc',
'HandWritten_0-9_Rej.omc',
'Industrial_Rej.omc', 'Industrial_0-9_Rej.omc', 'Industrial_0-9+_Rej.omc', 'Industrial_0-9A-Z_Rej.omc', 'Industrial_A-Z+_Rej.omc',
'OCRA_Rej.omc', 'OCRA_0-9_Rej.omc', 'OCRA_0-9A-Z_Rej.omc', 'OCRA_A-Z+_Rej.omc',
'OCRB_Rej.omc', 'OCRB_0-9_Rej.omc', 'OCRB_0-9A-Z_Rej.omc', 'OCRB_A-Z+_Rej.omc', 'OCRB_passport_Rej.omc',
'Pharma_Rej.omc', 'Pharma_0-9_Rej.omc', 'Pharma_0-9+_Rej.omc', 'Pharma_0-9A-Z_Rej.omc',
'SEMI_Rej.omc',
'Universal_Rej.occ', 'Universal_0-9_Rej.occ', 'Universal_0-9+_Rej.occ', 'Universal_0-9A-Z_Rej.occ', 'Universal_0-9A-Z+_Rej.occ', 'Universal_A-Z+_Rej.occ'
TextModel (output_control) text_model → HTextModel, HTuple (handle)
New text model.
Example (HDevelop)
read_image (Image, 'numbers_scale')
create_text_model_reader ('auto', 'Document_Rej.omc', TextModel)
* Optionally specify text properties
set_text_model_param (TextModel, 'min_char_height', 20)
find_text (Image, TextModel, TextResultID)
* Return character regions and corresponding classification results
get_text_object (Characters, TextResultID, 'all_lines')
get_text_result (TextResultID, 'class', Class)
Result
create_text_model_reader returns the value 2 (H_MSG_TRUE).
Possible Successors
set_text_model_param,
get_text_model_param,
find_text
See also
clear_text_model
Module
OCR/OCV