- Resource: ModelEvaluation
- ClassificationEvaluationMetrics
- ConfidenceMetricsEntry
- ConfusionMatrix
- Row
- TranslationEvaluationMetrics
- Methods
Resource: ModelEvaluation
Evaluation results of a model.
JSON representation

{
  "name": string,
  "annotationSpecId": string,
  "createTime": string,
  "evaluatedExampleCount": number,

  // Union field metrics can be only one of the following:
  "classificationEvaluationMetrics": {
    object(ClassificationEvaluationMetrics)
  },
  "translationEvaluationMetrics": {
    object(TranslationEvaluationMetrics)
  }
  // End of list of possible types for union field metrics.
}
| Fields | |
|---|---|
| `name` | Output only. Resource name of the model evaluation. Format: `projects/{project_id}/locations/{location_id}/models/{model_id}/modelEvaluations/{model_evaluation_id}` |
| `annotationSpecId` | Output only. The ID of the annotation spec that the model evaluation applies to. The ID is empty for the overall model evaluation. NOTE: Currently there is no way to obtain the displayName of the annotation spec from its ID. To see the display names, review the model evaluations in the UI. |
| `createTime` | Output only. Timestamp when this model evaluation was created. A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: `"2014-10-02T15:01:23.045123456Z"`. |
| `evaluatedExampleCount` | Output only. The number of examples used for model evaluation. |
| Union field `metrics`. Output only. Problem-type-specific evaluation metrics. `metrics` can be only one of the following: | |
| `classificationEvaluationMetrics` | Evaluation metrics for models that classify items. |
| `translationEvaluationMetrics` | Evaluation metrics for translation models. |
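Only one member of the `metrics` union field is populated on any given evaluation. The following is a minimal sketch of reading it, assuming the evaluation has already been retrieved and parsed from JSON into a dict; `describe_evaluation` is a hypothetical helper, not part of the API.

```python
# Hypothetical helper: inspect which member of the `metrics` union field is set
# on a ModelEvaluation that has been parsed from JSON into a dict.
def describe_evaluation(evaluation: dict) -> str:
    if "classificationEvaluationMetrics" in evaluation:
        m = evaluation["classificationEvaluationMetrics"]
        return f"classification metrics, auPrc={m.get('auPrc')}"
    if "translationEvaluationMetrics" in evaluation:
        m = evaluation["translationEvaluationMetrics"]
        return f"translation metrics, bleuScore={m.get('bleuScore')}"
    return "no metrics set"
```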
ClassificationEvaluationMetrics
Model evaluation metrics for classification problems. Visible only in v1beta1.
JSON representation

{
  "auPrc": number,
  "baseAuPrc": number,
  "confidenceMetricsEntry": [
    {
      object(ConfidenceMetricsEntry)
    }
  ],
  "confusionMatrix": {
    object(ConfusionMatrix)
  },
  "annotationSpecId": [
    string
  ]
}
| Fields | |
|---|---|
| `auPrc` | Output only. The area under the precision-recall curve metric. |
| `baseAuPrc` | Output only. The area under the precision-recall curve metric, based on priors. |
| `confidenceMetricsEntry[]` | Output only. Metrics that have confidence thresholds. The precision-recall curve can be derived from them (see the sketch after this table). |
| `confusionMatrix` | Output only. Confusion matrix of the evaluation. Only set for MULTICLASS classification problems where the number of labels is no more than 10. Only set for model-level evaluation, not for evaluation per label. |
| `annotationSpecId[]` | Output only. The annotation spec IDs used for this evaluation. |
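As a minimal sketch of how the precision-recall curve might be assembled from the confidence metrics entries above: `precision_recall_points` is a hypothetical helper that assumes `metrics` is a ClassificationEvaluationMetrics object parsed from JSON into a dict.

```python
from typing import List, Tuple

# Hypothetical helper: collect (threshold, recall, precision) triples from a
# ClassificationEvaluationMetrics dict; together they trace the PR curve.
def precision_recall_points(metrics: dict) -> List[Tuple[float, float, float]]:
    points = []
    for entry in metrics.get("confidenceMetricsEntry", []):
        points.append((
            entry.get("confidenceThreshold", 0.0),
            entry.get("recall", 0.0),
            entry.get("precision", 0.0),
        ))
    # Sort by confidence threshold so the curve sweeps from the most permissive
    # cutoff to the strictest one.
    return sorted(points)
```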
ConfidenceMetricsEntry
Metrics for a single confidence threshold.
JSON representation

{
  "confidenceThreshold": number,
  "recall": number,
  "precision": number,
  "f1Score": number,
  "recallAt1": number,
  "precisionAt1": number,
  "f1ScoreAt1": number
}
| Fields | |
|---|---|
| `confidenceThreshold` | Output only. The confidence threshold value used to compute the metrics. |
| `recall` | Output only. Recall under the given confidence threshold. |
| `precision` | Output only. Precision under the given confidence threshold. |
| `f1Score` | Output only. The harmonic mean of recall and precision. |
| `recallAt1` | Output only. The recall when only considering the label that has the highest prediction score, and is not below the confidence threshold, for each example. |
| `precisionAt1` | Output only. The precision when only considering the label that has the highest prediction score, and is not below the confidence threshold, for each example. |
| `f1ScoreAt1` | Output only. The harmonic mean of recallAt1 and precisionAt1. |
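For reference, the harmonic mean used for f1Score and f1ScoreAt1 can be computed as below; this is a minimal sketch of the standard formula, not code taken from the API.

```python
# Harmonic mean of precision and recall, i.e. the F1 score:
#   F1 = 2 * precision * recall / (precision + recall)
def harmonic_mean(precision: float, recall: float) -> float:
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```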
ConfusionMatrix
Confusion matrix of the model running the classification.
JSON representation

{
  "annotationSpecId": [
    string
  ],
  "row": [
    {
      object(Row)
    }
  ]
}
| Fields | |
|---|---|
| `annotationSpecId[]` | Output only. IDs of the annotation specs used in the confusion matrix. |
| `row[]` | Output only. Rows in the confusion matrix. The number of rows is equal to the size of annotationSpecId. |
Row
Output only. A row in the confusion matrix.
JSON representation

{
  "exampleCount": [
    number
  ]
}
| Fields | |
|---|---|
| `exampleCount[]` | Output only. Value of the specific cell in the confusion matrix. The number of values in each row is equal to the size of annotationSpecId. |
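A minimal sketch of walking the matrix, assuming it has been parsed from JSON into a dict. The reference above does not state the orientation; the common convention, assumed here, is that row i holds the examples whose true label is annotationSpecId[i], with the predicted label varying across the columns. `print_confusion_matrix` is a hypothetical helper.

```python
# Hypothetical helper: print a ConfusionMatrix dict as a labelled grid.
# Assumption (not stated in this reference): row i corresponds to the true
# label annotationSpecId[i], and column j to the predicted label
# annotationSpecId[j].
def print_confusion_matrix(matrix: dict) -> None:
    spec_ids = matrix.get("annotationSpecId", [])
    for spec_id, row in zip(spec_ids, matrix.get("row", [])):
        counts = row.get("exampleCount", [])
        cells = ", ".join(f"{pred}: {count}" for pred, count in zip(spec_ids, counts))
        print(f"true={spec_id} -> {cells}")
```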
TranslationEvaluationMetrics
Evaluation metrics for the dataset.
JSON representation

{
  "bleuScore": number,
  "baseBleuScore": number
}
| Fields | |
|---|---|
| `bleuScore` | Output only. BLEU score. |
| `baseBleuScore` | Output only. BLEU score for the base model. |
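One simple way to read these two numbers is to take their difference as the custom model's gain over the base model; the helper below is a hypothetical sketch assuming `metrics` is a TranslationEvaluationMetrics object parsed from JSON into a dict.

```python
# Hypothetical helper: BLEU improvement of the custom model over the base model.
def bleu_gain(metrics: dict) -> float:
    return metrics.get("bleuScore", 0.0) - metrics.get("baseBleuScore", 0.0)
```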
Methods

| Methods | |
|---|---|
| `get` | Gets a model evaluation. |
| `list` | Lists model evaluations. |
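As a minimal sketch of calling the list method over REST, assuming the standard v1beta1 endpoint and an OAuth access token (for example from `gcloud auth print-access-token`); the project, location, and model IDs are placeholders, and the name of the response field is an assumption noted in the comments.

```python
import requests

ACCESS_TOKEN = "..."  # placeholder; e.g. from `gcloud auth print-access-token`
PARENT = "projects/my-project/locations/us-central1/models/my-model-id"  # placeholder IDs

# List model evaluations for a model via the v1beta1 REST surface.
resp = requests.get(
    f"https://automl.googleapis.com/v1beta1/{PARENT}/modelEvaluations",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
resp.raise_for_status()

# The list response is assumed to nest evaluations under "modelEvaluation".
for evaluation in resp.json().get("modelEvaluation", []):
    print(evaluation["name"], evaluation.get("evaluatedExampleCount"))
```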