set_text_model_param - Set parameters of a text model.
set_text_model_param( : : TextModel, GenParamName, GenParamValue : )
set_text_model_param sets parameters of a text model. The list of
allowed parameter values for GenParamName differs, depending on
which Mode was set when creating the text model with
create_text_model_reader. In the following, first the parameter values for text models with
Mode = 'auto' are listed, and then those
for text models with Mode = 'manual'.
The name and value of a parameter must be given in GenParamName
and GenParamValue. The following values are possible:
Parameters of text models with Mode = 'auto'
Segmentation behavior
'min_contrast': The minimal contrast the characters have to their surrounding background.
List of values: integer or float value between 1 and 255 for byte images and between 1 and 65535 for uint2 images
Default value: 15
'polarity': 'dark_on_light' if the text to be segmented is darker than its background, 'light_on_dark' if the text to be segmented is lighter than its background, and 'both' if both kinds of text are to be segmented.
List of values: 'dark_on_light', 'light_on_dark', 'both'
Default value: 'both'
'eliminate_border_blobs': 'true' if regions that are touching the border of the image domain should be discarded, otherwise 'false'.
List of values: 'true', 'false'
Default value: 'false'
'add_fragments': 'true' if fragments, such as the dot on the 'i', should be added to the segmented characters, otherwise 'false'. Be aware that this can cause noise to be added to the segmented characters.
List of values: 'true', 'false'
Default value: 'true'
'separate_touching_chars': Controls the handling of pairs or small groups of neighboring characters that are segmented as one single region. When selecting 'standard' or 'enhanced', such regions are detected and separated into two or more single characters. While the 'enhanced' method yields more accurate results, the 'standard' method is less complex and thus faster. If 'separate_touching_chars' is set to 'false', no separation of touching characters is performed.
Remark: If 'enhanced' is selected, the file find_text_support.hotc from the ocr subdirectory of the root directory of the HALCON installation is needed. It is also possible to place this file in the current working directory.
List of values: 'false', 'standard', 'enhanced'
Default value: 'standard'
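As a minimal sketch of how the segmentation parameters above might be combined, the following HDevelop lines configure an 'auto' text model for dark text on a light background. The chosen values are illustrative assumptions, not recommendations taken from this reference.

```hdevelop
* Illustrative values only; tune them for the actual images.
read_image (Image, 'numbers_scale')
create_text_model_reader ('auto', 'Document_Rej.omc', TextModel)
* Require a contrast of at least 20 gray values to the background.
set_text_model_param (TextModel, 'min_contrast', 20)
* Only segment text that is darker than its background.
set_text_model_param (TextModel, 'polarity', 'dark_on_light')
* Discard regions touching the border of the image domain.
set_text_model_param (TextModel, 'eliminate_border_blobs', 'true')
* Skip the separation of touching characters.
set_text_model_param (TextModel, 'separate_touching_chars', 'false')
find_text (Image, TextModel, TextResultID)
```

Restricting 'polarity' to one kind of text is a design choice for images where the polarity is known in advance; 'both' remains the default.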
Character size
'min_char_height': The minimal height of the characters in pixels. If text of arbitrary height is to be segmented, 'auto' may be passed. Note that 'min_char_height' refers to characters only. The height of punctuation marks or separators is not restricted by 'min_char_height'.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
'max_char_height': The maximal height of the characters in pixels. If text of arbitrary height is to be segmented, 'auto' may be passed. Note that 'max_char_height' refers to characters only. The height of punctuation marks or separators is not restricted by 'max_char_height'.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
'min_char_width': The minimal width of the characters in pixels. If text of arbitrary width is to be segmented, 'auto' may be passed. Note that 'min_char_width' refers to characters only. The width of punctuation marks or separators is not restricted by 'min_char_width'.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
'max_char_width': The maximal width of the characters in pixels. If text of arbitrary width is to be segmented, 'auto' may be passed. Note that 'max_char_width' refers to characters only. The width of punctuation marks or separators is not restricted by 'max_char_width'.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
'min_stroke_width': The minimal stroke width of the characters in pixels. If the minimal stroke width is to be estimated automatically within the text segmentation process, 'auto' may be passed. Note that 'min_stroke_width' refers to characters only. The stroke width of punctuation marks or separators is not restricted by 'min_stroke_width'.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
'max_stroke_width': The maximal stroke width of the characters in pixels. If the maximal stroke width is to be estimated automatically within the text segmentation process, 'auto' may be passed. Note that 'max_stroke_width' refers to characters only. The stroke width of punctuation marks or separators is not restricted by 'max_stroke_width'.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
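Because GenParamName and GenParamValue accept tuples, several size limits can be set in one call. The sketch below assumes characters between 20 and 60 pixels high; these numbers are placeholders chosen for illustration.

```hdevelop
create_text_model_reader ('auto', 'Document_Rej.omc', TextModel)
* Set both height limits at once by passing tuples of names and values.
set_text_model_param (TextModel, ['min_char_height','max_char_height'], [20,60])
* Leave the minimal stroke width to be estimated automatically.
set_text_model_param (TextModel, 'min_stroke_width', 'auto')
* Read a value back to check what the model currently stores.
get_text_model_param (TextModel, 'min_char_height', MinCharHeight)
```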
Special characters
'return_punctuation': 'true' if small punctuation marks that lie close to the base line of the corresponding text line (e.g., dots or commas) are to be returned. 'false' if no such punctuation marks should be returned.
List of values: 'true', 'false'
Default value: 'true'
'return_separators': 'true' if separators such as a minus or the equality sign should be returned as well. 'false' if no separators should be returned.
List of values: 'true', 'false'
Default value: 'true'
Handling of dot prints
'dot_print': 'true' if the text to be segmented contains dot printed characters, otherwise 'false'.
List of values: 'true', 'false'
Default value: 'false'
'dot_print_tight_char_spacing': 'true' if the gap between adjacent characters is smaller than the largest gap between two dots within a single character, otherwise 'false'. If 'dot_print' is set to 'false', this parameter does not have any effect. In cases where the minimal gap size between characters is exactly known, 'dot_print_min_char_gap' can be set instead. In this case the value of 'dot_print_tight_char_spacing' is ignored.
List of values: 'true', 'false'
Default value: 'false'
'dot_print_min_char_gap': The minimal gap size between two characters in pixels. This parameter can be used to improve the text result in cases where the minimal gap size between characters is smaller than the maximal gap size between dots within characters. If the minimal character gap size is not known or is bigger than the maximal dot gap size, 'auto' may be passed. If 'dot_print' is set to 'false', this parameter does not have any effect. In cases where the minimal gap size between characters is not known but the characters are printed close to each other, 'dot_print_tight_char_spacing' might be used instead.
List of values: integer or float value greater than or equal to 0
Default value: 'auto'
'dot_print_max_dot_gap': The maximal gap size between two
dots within a character in pixels. If arbitrary dot printed characters are
to be segmented, 'auto' may be passed.
If 'dot_print' is set to 'false', this parameter
does not have any effect. In cases where the maximal dot gap size is
larger than or equal to the minimal gap size between characters,
'dot_print_tight_char_spacing' or 'dot_print_min_char_gap'
should be set accordingly. Setting 'dot_print_max_dot_gap' can
reduce the runtime of find_text significantly.
List of values: integer or float value greater than or equal to 1
Default value: 'auto'
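The dot print parameters interact, so a short sketch may help. It assumes a dot printed label whose dots are at most 3 pixels apart and whose characters are printed close together. The image name 'dot_print_sample' is hypothetical, and the use of the pretrained classifier 'DotPrint_Rej.omc' is an assumption about the installation.

```hdevelop
* 'dot_print_sample' is a hypothetical image name.
read_image (Image, 'dot_print_sample')
* Assumes the pretrained dot print classifier is available.
create_text_model_reader ('auto', 'DotPrint_Rej.omc', TextModel)
* Without 'dot_print' = 'true' the other dot print parameters have no effect.
set_text_model_param (TextModel, 'dot_print', 'true')
* Assumed maximal gap between dots; a known value can reduce the runtime.
set_text_model_param (TextModel, 'dot_print_max_dot_gap', 3)
* Character gaps are assumed to be smaller than the largest dot gap.
set_text_model_param (TextModel, 'dot_print_tight_char_spacing', 'true')
find_text (Image, TextModel, TextResultID)
```

If the minimal character gap were known exactly, 'dot_print_min_char_gap' could be set instead, in which case 'dot_print_tight_char_spacing' is ignored.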
Line structures
'text_line_structure': To simplify the search for specific structures (e.g., dates or serial numbers) within the segmented text, it is possible to define text line structures. For each text line the distances between the characters are calculated, and based on these distances, the text line is divided into text blocks. Short characters such as '.', '_' and '-' are ignored in this process and treated as spaces. Furthermore, it is possible to define user-specific separators, which are also ignored. See the description of 'text_line_separators' for details. It is then tested whether any of the user-defined text line structures fit the resulting text blocks.
For example, if the text to be found is a date with two characters each for month, day, and year, the structure would be '2 2 2'. If the year may consist of two or four characters, the structure would be '2 2 2-4', indicating that the last character block consists of two to four characters. It is possible to provide more than one structure to match by appending an index to the parameter name, e.g., 'text_line_structure_0', 'text_line_structure_1'. If 'text_line_structure' is set to an empty string '', the text to be found may have any structure.
Please note that every text line structure that is found is
saved as a unique text line within the text result. Hence, when calling
get_text_object, a 'line' then refers to a valid text line
structure. If the whole text line containing the text line structure is
to be returned instead, it is possible to set 'return_whole_line'
accordingly.
Default value: ''
'text_line_separators': A string containing the list of characters which are to be ignored in the process of finding text line structures, see 'text_line_structure' for further details. Please note that user-specific separators need to be valid characters within the used OCR classifier. For example, if ':' and '\' are to be ignored, ':\\' should be passed. Please note that '\' escapes any special symbol to treat it as a literal, and hence '\\' needs to be passed to use '\' as a separator.
List of values: '/', ':', ':\\', '\\/:', ...
Default value: ''
'return_whole_line': 'false' if only the segmented text line structures are to be returned as text lines. 'true' if each whole text line containing a text line structure is to be returned in text lines.
List of values: 'true', 'false'
Default value: 'false'
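The date example above can be sketched as follows. The second structure ('8', a single block of eight characters) and the choice of '/' as a separator are hypothetical additions for illustration; '/' is assumed to be a valid character of the classifier in use.

```hdevelop
create_text_model_reader ('auto', 'Document_Rej.omc', TextModel)
* Accept dates whose year has two to four characters.
set_text_model_param (TextModel, 'text_line_structure_0', '2 2 2-4')
* Hypothetical second pattern: one block of eight characters.
set_text_model_param (TextModel, 'text_line_structure_1', '8')
* Treat '/' like a space when dividing a line into blocks.
set_text_model_param (TextModel, 'text_line_separators', '/')
* Return the whole text line that contains a matching structure.
set_text_model_param (TextModel, 'return_whole_line', 'true')
```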
OCR classifier
'ocr_classifier': The OCR classifier used within
find_text for text segmentation and classification. An initial
classifier is set when the text model is created. See
create_text_model_reader for more information about the required
OCR classifier.
'num_classes': The number of best classes to be stored
for each character (e.g., if 'num_classes' is set to 2,
find_text returns the classification results with the highest
and second highest confidence).
If 'num_classes' exceeds the number of classes of the
classifier stored in the text model, 'num_classes' is decreased
accordingly. The actual number of classes can be queried by
get_text_result. For classifiers with a rejection class,
'num_classes' should be at least 2 in order to be able
to use the second best result if a character is classified as rejection
class.
List of values: integer greater than or equal to 1
Default value: 2
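A minimal sketch of requesting more candidate classes per character follows. The result name 'num_classes' passed to get_text_result is an assumption based on the statement above that the actual number can be queried there; check the get_text_result reference before relying on it.

```hdevelop
read_image (Image, 'numbers_scale')
create_text_model_reader ('auto', 'Document_Rej.omc', TextModel)
* Keep the three best classes for each character.
set_text_model_param (TextModel, 'num_classes', 3)
find_text (Image, TextModel, TextResultID)
* Assumed result name: query how many classes were actually stored.
get_text_result (TextResultID, 'num_classes', NumClasses)
```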
Parameters of text models with Mode = 'manual'
'manual_char_height': Height of the characters in pixels. Refers to an uppercase character. Default value: 30px
'manual_char_width': Width of the characters in pixels. Refers to an uppercase character. Default value: 20px
'manual_stroke_width': Stroke width of the characters in pixels. Default value: 4.0px
'manual_base_line_tolerance': Maximum base line deviation of the characters (in percent of 'manual_char_height'). Default value: 0.15
'manual_polarity': 'dark_on_light' if the text to be segmented is darker than its background, otherwise 'light_on_dark'. Default value: 'dark_on_light'
'manual_uppercase_only': 'true' if the text to be segmented contains uppercase characters or numbers only, otherwise 'false'. Default value: 'false'
'manual_is_dotprint': 'true' if the text to be segmented is a dot print, otherwise 'false'. Default value: 'false'
'manual_is_imprinted': 'true' if the text to be segmented suffers from local changes of polarity due to reflections, otherwise 'false'. Default value: 'false'
'manual_eliminate_horizontal_lines': 'true' if there are longer horizontal structures close to the text to be segmented, otherwise 'false'. Default value: 'false'
'manual_eliminate_border_blobs': 'true' if regions that are touching the border of the image domain should be discarded, otherwise 'false'. Default value: 'false'
'manual_max_line_num': Maximum number of lines to be found.
Zero or negative values indicate no limitation. Setting
'manual_max_line_num' to a low value may strongly improve the
runtime of find_text. Default value: no limitation
'manual_return_punctuation': 'true' if punctuation marks (e.g., dots or commas) should be added to the segmented characters. Default value: 'true'
'manual_return_separators': 'true' if separators such as a minus or the equality sign should be added to the segmented characters. Default value: 'true'
'manual_add_fragments': 'true' if fragments, such as the dot on the 'i', should be added to the segmented characters. Be aware that this can cause noise to be added to the segmented characters. Default value: 'true'
'manual_fragment_size_min': Minimum area of fragment regions that are added if 'manual_add_fragments' is set to 'true'. Default value: 1
'manual_text_line_structure': Specifies the structure of the text to be found to reduce the search space and to avoid false positives. The structure is a string that contains the number of characters for every character block and spaces between these character blocks. For example, if the text to be found is a date with two characters each for month, day, and year, the structure would be '2 2 2'. If the year may also consist of four characters, the structure would be '2 2 2-4', indicating that the last character block consists of two to four characters. It is possible to provide more than one structure to match by appending an index to the parameter name, e.g., 'manual_text_line_structure_0', 'manual_text_line_structure_1'. If 'manual_text_line_structure' is set to an empty string '', the text to be found may have any structure. Default value: ''
'manual_persistence': 'true' if selected intermediate
results should be kept with the output result of find_text.
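For text models with Mode = 'manual', a configuration might look like the sketch below. Passing an empty tuple as the classifier to create_text_model_reader and the concrete sizes are assumptions made for illustration; the sizes should come from measuring an uppercase character in the actual image.

```hdevelop
read_image (Image, 'numbers_scale')
* Assumption: in 'manual' mode no OCR classifier is passed.
create_text_model_reader ('manual', [], TextModel)
* Character geometry in pixels, measured on an uppercase character.
set_text_model_param (TextModel, ['manual_char_height','manual_char_width','manual_stroke_width'], [40,25,5])
set_text_model_param (TextModel, 'manual_polarity', 'dark_on_light')
* The text is assumed to contain uppercase characters or numbers only.
set_text_model_param (TextModel, 'manual_uppercase_only', 'true')
* Limiting the number of lines may improve the runtime of find_text.
set_text_model_param (TextModel, 'manual_max_line_num', 2)
* Only accept lines that look like a date.
set_text_model_param (TextModel, 'manual_text_line_structure', '2 2 2-4')
find_text (Image, TextModel, TextResultID)
```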
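This operator modifies the state of the following input parameter: TextModel.
During execution of this operator, access to the value of this parameter must be synchronized if it is used across multiple threads.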
TextModel (input_control, state is modified) text_model → (handle)
Text model.
GenParamName (input_control) string(-array) → (string)
Names of the parameters to be set.
Default value: 'min_contrast'
Suggested values: 'add_fragments', 'dot_print', 'dot_print_max_dot_gap', 'dot_print_min_char_gap', 'dot_print_tight_char_spacing', 'eliminate_border_blobs', 'max_char_height', 'max_char_width', 'max_stroke_width', 'min_char_height', 'min_char_width', 'min_contrast', 'min_stroke_width', 'num_classes', 'ocr_classifier', 'polarity', 'return_punctuation', 'return_separators', 'return_whole_line', 'separate_touching_chars', 'text_line_separators', 'text_line_structure', 'text_line_structure_0', 'text_line_structure_1', 'text_line_structure_2', 'manual_add_fragments', 'manual_base_line_tolerance', 'manual_char_height', 'manual_char_width', 'manual_eliminate_border_blobs', 'manual_eliminate_horizontal_lines', 'manual_fragment_size_min', 'manual_is_dotprint', 'manual_is_imprinted', 'manual_max_line_num', 'manual_persistence', 'manual_polarity', 'manual_return_punctuation', 'manual_return_separators', 'manual_stroke_width', 'manual_text_line_structure', 'manual_text_line_structure_0', 'manual_text_line_structure_1', 'manual_text_line_structure_2', 'manual_uppercase_only'
GenParamValue (input_control) string(-array) → (integer / real / string)
Values of the parameters to be set.
Default value: 10
Suggested values: 'true', 'false', 'dark_on_light', 'light_on_dark', 'both', 'auto', 'standard', 'enhanced'
read_image (Image, 'numbers_scale')
create_text_model_reader ('auto', 'Document_Rej.omc', TextModel)
* Optionally specify text properties
set_text_model_param (TextModel, 'min_char_height', 20)
find_text (Image, TextModel, TextResultID)
* Return character regions and corresponding classification results
get_text_object (Characters, TextResultID, 'all_lines')
get_text_result (TextResultID, 'class', Class)
If the input parameters are set correctly, the operator
set_text_model_param returns the value 2 (H_MSG_TRUE). Otherwise, an exception is raised.
OCR/OCV