Metric

The metric used for running evaluations.

Fields
aggregationMetrics[] enum (AggregationMetric)

Optional. The aggregation metrics to use.

metric_spec Union type
The spec for the metric. It can be either a pre-defined metric or an inline metric spec. metric_spec can be only one of the following:
predefinedMetricSpec object (PredefinedMetricSpec)

The spec for a pre-defined metric.

llmBasedMetricSpec object (LLMBasedMetricSpec)

Spec for an LLM-based metric.

pointwiseMetricSpec object (PointwiseMetricSpec)

Spec for pointwise metric.

pairwiseMetricSpec object (PairwiseMetricSpec)

Spec for pairwise metric.

exactMatchSpec object (ExactMatchSpec)

Spec for exact match metric.

bleuSpec object (BleuSpec)

Spec for bleu metric.

rougeSpec object (RougeSpec)

Spec for rouge metric.

JSON representation
{
  "aggregationMetrics": [
    enum (AggregationMetric)
  ],

  // metric_spec
  "predefinedMetricSpec": {
    object (PredefinedMetricSpec)
  },
  "llmBasedMetricSpec": {
    object (LLMBasedMetricSpec)
  },
  "pointwiseMetricSpec": {
    object (PointwiseMetricSpec)
  },
  "pairwiseMetricSpec": {
    object (PairwiseMetricSpec)
  },
  "exactMatchSpec": {
    object (ExactMatchSpec)
  },
  "bleuSpec": {
    object (BleuSpec)
  },
  "rougeSpec": {
    object (RougeSpec)
  }
  // Union type
}
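As an illustrative sketch (field values are examples, not defaults; the AVERAGE aggregation value is assumed, since the AggregationMetric enum values are not listed in this section), a Metric configured to compute BLEU might look like:

```json
{
  "aggregationMetrics": ["AVERAGE"],
  "bleuSpec": {
    "useEffectiveOrder": true
  }
}
```

Note that only one member of the metric_spec union type is set; supplying more than one is not allowed.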

PredefinedMetricSpec

The spec for a pre-defined metric.

Fields
metricSpecName string

Required. The name of a pre-defined metric, such as "instruction_following_v1" or "text_quality_v1".

metricSpecParameters object (Struct format)

Optional. The parameters needed to run the pre-defined metric.

JSON representation
{
  "metricSpecName": string,
  "metricSpecParameters": {
    object
  }
}
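For example, referencing the "instruction_following_v1" metric named above might look like the following sketch (the empty parameters object is illustrative; the parameters a given pre-defined metric accepts are not documented here):

```json
{
  "metricSpecName": "instruction_following_v1",
  "metricSpecParameters": {}
}
```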

LLMBasedMetricSpec

Specification for an LLM-based metric.

Fields
rubrics_source Union type
Source of the rubrics to be used for evaluation. rubrics_source can be only one of the following:
rubricGroupKey string

Use a pre-defined group of rubrics associated with the input. Refers to a key in the rubricGroups map of EvaluationInstance.

rubricGenerationSpec object (RubricGenerationSpec)

Dynamically generate rubrics using this specification.

predefinedRubricGenerationSpec object (PredefinedMetricSpec)

Dynamically generate rubrics using a predefined spec.

metricPromptTemplate string

Required. Template for the prompt sent to the judge model.

systemInstruction string

Optional. System instructions for the judge model.

judgeAutoraterConfig object (AutoraterConfig)

Optional. Configuration for the judge LLM (autorater).

additionalConfig object (Struct format)

Optional. Additional configuration for the metric.

JSON representation
{

  // rubrics_source
  "rubricGroupKey": string,
  "rubricGenerationSpec": {
    object (RubricGenerationSpec)
  },
  "predefinedRubricGenerationSpec": {
    object (PredefinedMetricSpec)
  }
  // Union type
  "metricPromptTemplate": string,
  "systemInstruction": string,
  "judgeAutoraterConfig": {
    object (AutoraterConfig)
  },
  "additionalConfig": {
    object
  }
}
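A minimal sketch of an LLMBasedMetricSpec that generates rubrics dynamically (the prompt strings and the {response} placeholder are illustrative assumptions, not documented template syntax):

```json
{
  "rubricGenerationSpec": {
    "rubricContentType": "NL_QUESTION_ANSWER"
  },
  "metricPromptTemplate": "Evaluate the response against each rubric: {response}",
  "systemInstruction": "You are a careful, impartial grader."
}
```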

RubricGenerationSpec

Specification for how rubrics should be generated.

Fields
promptTemplate string

Template for the prompt used to generate rubrics.

rubricContentType enum (RubricContentType)

The type of rubric content to be generated.

rubricTypeOntology[] string

Optional. A pre-defined list of allowed types for generated rubrics. If this field is provided, include_rubric_type is implied to be true, and the generated rubric types are chosen from this ontology.

modelConfig object (AutoraterConfig)

Configuration for the model used in rubric generation. Configs including sampling count and base model can be specified here. Flipping is not supported for rubric generation.

JSON representation
{
  "promptTemplate": string,
  "rubricContentType": enum (RubricContentType),
  "rubricTypeOntology": [
    string
  ],
  "modelConfig": {
    object (AutoraterConfig)
  }
}
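A sketch of a RubricGenerationSpec that constrains generated rubric types to a fixed ontology (the ontology labels are hypothetical examples, not documented values):

```json
{
  "rubricContentType": "PROPERTY",
  "rubricTypeOntology": ["FACTUALITY", "STYLE", "SAFETY"]
}
```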

RubricContentType

Specifies the type of rubric content to generate.

Enums
RUBRIC_CONTENT_TYPE_UNSPECIFIED The content type to generate is not specified.
PROPERTY Generate rubrics based on properties.
NL_QUESTION_ANSWER Generate rubrics in a natural-language question-answer format.
PYTHON_CODE_ASSERTION Generate rubrics in a unit test format.

PointwiseMetricSpec

Spec for pointwise metric.

Fields
customOutputFormatConfig object (CustomOutputFormatConfig)

Optional. CustomOutputFormatConfig allows customization of metric output. By default, metrics return a score and explanation. When this config is set, the default output is replaced with either the raw output string or a parsed output based on a user-defined schema. If a custom format is chosen, the score and explanation fields in the corresponding metric result will be empty.

metricPromptTemplate string

Required. Metric prompt template for pointwise metric.

systemInstruction string

Optional. System instructions for pointwise metric.

JSON representation
{
  "customOutputFormatConfig": {
    object (CustomOutputFormatConfig)
  },
  "metricPromptTemplate": string,
  "systemInstruction": string
}
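A sketch of a PointwiseMetricSpec that requests raw judge output (the prompt string and {response} placeholder are illustrative assumptions):

```json
{
  "customOutputFormatConfig": {
    "returnRawOutput": true
  },
  "metricPromptTemplate": "Rate the response from 1 to 5: {response}"
}
```

Because returnRawOutput is set, the score and explanation fields in the metric result would be empty, per the field description above.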

CustomOutputFormatConfig

Spec for custom output format configuration.

Fields
custom_output_format_config Union type
Custom output format configuration. custom_output_format_config can be only one of the following:
returnRawOutput boolean

Optional. Whether to return raw output.

JSON representation
{

  // custom_output_format_config
  "returnRawOutput": boolean
  // Union type
}

PairwiseMetricSpec

Spec for pairwise metric.

Fields
candidateResponseFieldName string

Optional. The field name of the candidate response.

baselineResponseFieldName string

Optional. The field name of the baseline response.

customOutputFormatConfig object (CustomOutputFormatConfig)

Optional. CustomOutputFormatConfig allows customization of metric output. When this config is set, the default output is replaced with the raw output string. If a custom format is chosen, the pairwiseChoice and explanation fields in the corresponding metric result will be empty.

metricPromptTemplate string

Required. Metric prompt template for pairwise metric.

systemInstruction string

Optional. System instructions for pairwise metric.

JSON representation
{
  "candidateResponseFieldName": string,
  "baselineResponseFieldName": string,
  "customOutputFormatConfig": {
    object (CustomOutputFormatConfig)
  },
  "metricPromptTemplate": string,
  "systemInstruction": string
}
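A sketch of a PairwiseMetricSpec (the field names and prompt placeholders are hypothetical examples, not defaults):

```json
{
  "candidateResponseFieldName": "candidate_response",
  "baselineResponseFieldName": "baseline_response",
  "metricPromptTemplate": "Which response better answers the prompt: {candidate_response} or {baseline_response}?"
}
```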

ExactMatchSpec

This type has no fields.

Spec for exact match metric - returns 1 if the prediction and reference match exactly, and 0 otherwise.
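The exact match computation can be sketched in a few lines; this is an illustration of the definition above, not the service's implementation (in particular, no normalization such as trimming or lowercasing is assumed here):

```python
def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 when prediction and reference are identical strings,
    otherwise 0.0. No normalization is applied in this sketch."""
    return 1.0 if prediction == reference else 0.0
```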

BleuSpec

Spec for bleu score metric - calculates the precision of n-grams in the prediction as compared to the reference - returns a score ranging from 0 to 1.

Fields
useEffectiveOrder boolean

Optional. Whether to use effective order to compute the bleu score.

JSON representation
{
  "useEffectiveOrder": boolean
}
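A simplified sentence-level BLEU sketch illustrates what effective order does: when the candidate is too short to contain any n-grams of a given order, that order is skipped instead of contributing a zero precision that forces the whole score to zero. This is an illustration under whitespace tokenization, not the service's implementation:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of the token list."""
    return [tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


def bleu(candidate, reference, max_n=4, use_effective_order=True):
    """Sentence-level BLEU sketch: clipped n-gram precision up to max_n,
    geometric mean, and a brevity penalty for short candidates."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        if not cand_counts:
            if use_effective_order:
                continue  # candidate too short for this order: skip it
            return 0.0  # without effective order, a missing order zeroes BLEU
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum((cand_counts & ref_counts).values())  # clipped counts
        precisions.append(overlap / sum(cand_counts.values()))
    if not precisions or min(precisions) == 0.0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / len(precisions)
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_avg)
```

With effective order enabled, a two-word candidate against a six-word reference still earns a nonzero (brevity-penalized) score; with it disabled, the absent 3-grams and 4-grams drive the score to zero.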

RougeSpec

Spec for rouge score metric - calculates the recall of n-grams in the prediction as compared to the reference - returns a score ranging from 0 to 1.

Fields
rougeType string

Optional. Supported rouge types are rougen (where n is 1-9, e.g. rouge1), rougeL, and rougeLsum.

useStemmer boolean

Optional. Whether to use a stemmer to compute the rouge score.

splitSummaries boolean

Optional. Whether to split summaries while using rougeLsum.

JSON representation
{
  "rougeType": string,
  "useStemmer": boolean,
  "splitSummaries": boolean
}
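The ROUGE-N recall described above (the fraction of reference n-grams that also appear in the prediction, with clipped counts) can be sketched as follows; this illustration assumes whitespace tokenization and no stemming, so it is not the service's implementation:

```python
from collections import Counter


def rouge_n_recall(prediction, reference, n=1):
    """ROUGE-N recall sketch: clipped count of reference n-grams found
    in the prediction, divided by the total reference n-gram count."""

    def counts(text):
        toks = text.split()
        return Counter(tuple(toks[i : i + n]) for i in range(len(toks) - n + 1))

    ref_counts = counts(reference)
    if not ref_counts:
        return 0.0  # reference too short to contain any n-grams
    overlap = sum((counts(prediction) & ref_counts).values())
    return overlap / sum(ref_counts.values())
```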