train_dl_classifier_batch — Perform a training step of a deep-learning-based classifier on a batch of images.
train_dl_classifier_batch is obsolete and is only provided for reasons of backward compatibility. New applications should use the
common CNN-based operator train_dl_model_batch.
train_dl_classifier_batch(BatchImages : : DLClassifierHandle, BatchLabels : DLClassifierTrainResultHandle)
train_dl_classifier_batch performs a training step of the
deep-learning-based classifier contained in DLClassifierHandle.
The classifier handle DLClassifierHandle has to be read previously
using read_dl_classifier. In order to apply training steps, the classes have to be specified using
set_dl_classifier_param. Other hyperparameters such as the learning rate and the momentum are also
important for a successful training. They are set using
set_dl_classifier_param.
The training step is performed on a single batch of images from the
training dataset, namely the images BatchImages with their labels
BatchLabels. The number of images within the batch needs to be
a multiple of 'batch_size', where the parameter
'batch_size' is limited by the amount of available GPU memory.
In order to process more images in one training step, the classifier
parameter 'batch_size_multiplier' can be set to a value greater
than 1. The number of images being passed to the training operator
needs to be equal to 'batch_size' times
'batch_size_multiplier'. Note that a training step calculated
for a batch and a 'batch_size_multiplier' greater than 1 is
an approximation of a training step calculated for the same batch but with
a 'batch_size_multiplier' equal to 1 and an accordingly
greater 'batch_size'. As an example, the loss calculated
with a 'batch_size' of 4 and a 'batch_size_multiplier'
of 2 is usually not equal to the loss calculated with a
'batch_size' of 8 and a 'batch_size_multiplier' of 1,
although the same number of images is used for training in both cases.
However, the approximation generally delivers comparably good results,
so it can be utilized if you wish to train with a larger number of images
than your GPU allows.
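This approximation can be illustrated with a short sketch (plain NumPy; the model and function names are made up for illustration and are not part of the HALCON API). Accumulating and averaging gradients over several sub-batches and then applying one update reproduces the full-batch gradient exactly for a model without batch-dependent layers; layers such as batch normalization are what break the exact equality in practice:

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of the mean squared error 0.5 * mean((X @ w - y)^2) w.r.t. w."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # features of 8 stand-in "images"
y = rng.normal(size=8)
w = np.zeros(3)

# Variant A: one step with batch_size = 8, batch_size_multiplier = 1.
grad_full = mse_grad(w, X, y)

# Variant B: batch_size = 4, batch_size_multiplier = 2 --
# accumulate the gradients of two sub-batches, then average them.
grad_acc = np.zeros(3)
for Xs, ys in ((X[:4], y[:4]), (X[4:], y[4:])):
    grad_acc += mse_grad(w, Xs, ys)
grad_acc /= 2

# For this model the two gradients agree exactly; with batch-dependent
# layers (e.g. batch normalization) they would only be approximately equal.
print(np.allclose(grad_full, grad_acc))  # True
```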
In some rare cases the approximation with a 'batch_size' of 1
and an accordingly large 'batch_size_multiplier' does not
show the expected performance which for example can happen when the
pretrained network 'pretrained_dl_classifier_resnet50.hdl' is used.
Setting the 'batch_size' to a value greater than 1 can help to
solve this issue.
Note that the images in BatchImages must fulfill certain conditions
regarding, for example, the image size and gray value range, depending on
the chosen network. Please have a look at read_dl_classifier and
set_dl_classifier_param for more information.
The labels in BatchLabels can be handed over as an array of
strings, or as an array of indices corresponding to the position of the
label within the array of classes (counting from 0) set before via
'classes' with set_dl_classifier_param. Information about the results
of the training step, such as the value of the loss, is stored in
DLClassifierTrainResultHandle and
can be accessed using get_dl_classifier_train_result.
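The two accepted label encodings can be illustrated with a short sketch (plain Python; the class list is a made-up example, not part of the HALCON API). An index label i simply refers to position i, counting from 0, in the list previously set via 'classes':

```python
# Hypothetical class list, as it would have been set via the
# 'classes' parameter of set_dl_classifier_param.
classes = ['apple', 'peach', 'pear']

# Labels handed over as strings ...
batch_labels = ['peach', 'apple', 'peach']
# ... and the equivalent labels as indices (counting from 0).
batch_label_ids = [classes.index(label) for label in batch_labels]

print(batch_label_ids)  # [1, 0, 1]
```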
Note that an epoch generally consists of a large number of batches and
that a successful training involves many epochs. Therefore
train_dl_classifier_batch has to be applied several times
with different batches.
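The resulting loop structure can be sketched as follows (Python pseudostructure; the batching helper is illustrative, and the commented-out call stands in for train_dl_classifier_batch rather than being a real HALCON binding):

```python
import random

def make_batches(samples, batch_size, batch_size_multiplier=1):
    """Split a dataset into batches of batch_size * batch_size_multiplier
    samples each; a trailing remainder smaller than that is dropped."""
    n = batch_size * batch_size_multiplier
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]

samples = list(range(100))        # stand-ins for (image, label) pairs
random.seed(0)

for epoch in range(2):            # a real training involves many epochs
    random.shuffle(samples)       # reshuffle the training set per epoch
    for batch in make_batches(samples, batch_size=4, batch_size_multiplier=2):
        pass  # here: train_dl_classifier_batch(BatchImages, Handle, BatchLabels, ...)
```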
For a more detailed explanation, we refer to
Legacy / DL Classification.
During training, a nonlinear optimization algorithm minimizes the value of the loss function. The latter is determined based on the prediction of the neural network on the current batch of images. The algorithm used for the optimization is stochastic gradient descent (SGD). It updates the layers' weights $w$ of the previous iteration $t$ to the new values at iteration $t+1$ as follows:

$$v^{t+1} = \mu\, v^{t} - \lambda\, \nabla L\bigl(f(x; w^{t}), y\bigr), \qquad w^{t+1} = w^{t} + v^{t+1}.$$

Here, $\lambda$ is the learning rate, $\mu$ stands for the momentum, and $f(x; w)$ for the classification result of the deep-learning-based classifier, which depends on the network weights $w$ and the input batch $x$. The variable $v$ is used to involve the influence of the momentum $\mu$. The loss function used here is the Multinomial Logistic Loss in combination with a quadratic regularization term $K$:

$$L\bigl(f(x; w), y\bigr) = -\frac{1}{m} \sum_{i=1}^{m} y_i^{\top} \ln\bigl(f(x_i; w)\bigr) + K(w).$$

Here, $y_i$ is a one-hot encoded target vector that encodes the
label of the $i$-th image $x_i$ of the batch
containing $m$-many images,
and $\ln\bigl(f(x_i; w)\bigr)$ shall be understood to be
a vector such that $\ln$ is applied on each component of
$f(x_i; w)$.
The regularization term $K$ is a weighted
$\ell_2$-norm involving all weights $w_k$ except for biases:

$$K(w) = \frac{\alpha}{2} \sum_{k} w_k^{2}.$$

Its influence can be controlled through $\alpha$.
In the above formula, $\alpha$ denotes the hyperparameter
'weight_prior' that can be set with
set_dl_classifier_param. In order to gain more insight, you can retrieve the current value of the
total loss function as well as individual contributions using
get_dl_classifier_train_result.
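The update rule and the loss above can be spelled out numerically in a short sketch (plain NumPy for a small softmax classifier; all names are illustrative, and weight_prior plays the role of the regularization weight α):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grad(W, X, Y, weight_prior):
    """Multinomial logistic loss of f(x; w) = softmax(x @ W) plus the
    quadratic regularization term K(w) = alpha/2 * sum(w_k^2)."""
    m = len(X)
    P = softmax(X @ W)                          # f(x_i; w) for the batch
    data_loss = -np.sum(Y * np.log(P)) / m      # -1/m * sum_i y_i^T ln f(x_i; w)
    reg_loss = 0.5 * weight_prior * np.sum(W ** 2)
    grad = X.T @ (P - Y) / m + weight_prior * W
    return data_loss + reg_loss, grad

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))                 # batch of m = 8 feature vectors
Y = np.eye(2)[rng.integers(0, 2, size=8)]   # one-hot encoded targets y_i
W = np.zeros((3, 2))
v = np.zeros_like(W)                        # momentum variable v
lr, momentum, weight_prior = 0.1, 0.9, 1e-4

losses = []
for _ in range(50):
    loss, grad = loss_and_grad(W, X, Y, weight_prior)
    losses.append(loss)
    v = momentum * v - lr * grad            # v^{t+1} = mu*v^t - lambda*grad L
    W = W + v                               # w^{t+1} = w^t + v^{t+1}

print(losses[0] > losses[-1])  # True: the loss decreases over the iterations
```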
For an explanation of the concept of deep-learning-based classification, see the introduction of the chapter Deep Learning / Classification. The workflow involving this legacy operator is described in the chapter Legacy / DL Classification.
The operator train_dl_classifier_batch internally calls functions
that might not be deterministic. Therefore, results from multiple calls of
train_dl_classifier_batch can slightly differ, although the same
input values have been used.
To run this operator, cuDNN and cuBLAS are required. For further details, please refer to the paragraph “Requirements for deep learning and deep-learning-based methods” in the Installation Guide.
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
BatchImages (input object) (multichannel-)image(-array) → object (real)
Images comprising the batch.
DLClassifierHandle (input control) dl_classifier → (handle)
Handle of the deep-learning-based classifier.
BatchLabels (input control) string-array → (string / integer)
Corresponding labels for each of the images.
Default value: []
List of values: []
DLClassifierTrainResultHandle (output control) dl_classifier_train_result → (handle)
Handle of the training results from the deep-learning-based classifier.
If the parameters are valid, the operator
train_dl_classifier_batch returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
read_dl_classifier,
set_dl_classifier_param,
get_dl_classifier_param,
get_dl_classifier_train_result,
apply_dl_classifier,
clear_dl_classifier_train_result,
clear_dl_classifier
train_dl_model_batch,
train_class_mlp,
train_class_svm
Deep Learning Training