REST Resource: projects.locations.evaluationItems

Resource: EvaluationItem

An EvaluationItem is a single evaluation request or result. The content of an EvaluationItem is immutable: it cannot be updated once created. EvaluationItems can be deleted when no longer needed.

Fields
name string

Identifier. The resource name of the EvaluationItem. Format: projects/{project}/locations/{location}/evaluationItems/{evaluationItem}

displayName string

Required. The display name of the EvaluationItem.

metadata value (Value format)

Optional. Metadata for the EvaluationItem.

labels map (key: string, value: string)

Optional. Labels for the EvaluationItem.

evaluationItemType enum (EvaluationItemType)

Required. The type of the EvaluationItem.

createTime string (Timestamp format)

Output only. Timestamp when this item was created.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

error object (Status)

Output only. Error for the evaluation item.

payload Union type
The request or response for the EvaluationItem. payload can be only one of the following:
evaluationRequest object (EvaluationRequest)

The request to evaluate.

evaluationResponse object (EvaluationResult)

Output only. The response from evaluation.

gcsUri string

The Cloud Storage object where the request or response is stored.

JSON representation
{
  "name": string,
  "displayName": string,
  "metadata": value,
  "labels": {
    string: string,
    ...
  },
  "evaluationItemType": enum (EvaluationItemType),
  "createTime": string,
  "error": {
    object (Status)
  },

  // payload
  "evaluationRequest": {
    object (EvaluationRequest)
  },
  "evaluationResponse": {
    object (EvaluationResult)
  },
  "gcsUri": string
  // Union type
}
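As a minimal sketch of how the payload union constrains a request body, the Python helper below assembles an EvaluationItem dict with exactly one payload member. The helper name `build_evaluation_item`, the sample display name, and the `gs://example-bucket/...` URI are hypothetical and used only for illustration; the helper builds a dict and does not call the API.

```python
from typing import Any, Dict, Optional

# Payload member names taken from the JSON representation above.
# evaluationResponse is documented as output only, so this sketch
# does not let a client set it.
CLIENT_PAYLOAD_FIELDS = ("evaluationRequest", "gcsUri")


def build_evaluation_item(
    display_name: str,
    item_type: str,
    payload_field: str,
    payload: Any,
    labels: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an EvaluationItem body with one payload member."""
    if payload_field not in CLIENT_PAYLOAD_FIELDS:
        raise ValueError(f"unsupported payload field: {payload_field!r}")
    body: Dict[str, Any] = {
        "displayName": display_name,
        "evaluationItemType": item_type,
        # Taking a single (field, value) pair means the union can never
        # end up with two members set.
        payload_field: payload,
    }
    if labels:
        body["labels"] = dict(labels)
    return body


# Placeholder values for illustration only.
item = build_evaluation_item(
    display_name="sample-request",
    item_type="REQUEST",
    payload_field="gcsUri",
    payload="gs://example-bucket/request.json",
    labels={"team": "eval"},
)
```

Passing the payload as a single field/value pair is one way to make an invalid union unrepresentable on the client side; the service remains the authority on what it accepts.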

EvaluationRequest

Single evaluation request.

Fields
prompt object (EvaluationPrompt)

Required. The request/prompt to evaluate.

goldenResponse object (CandidateResponse)

Optional. The ideal response or ground truth.

rubrics map (key: string, value: object (RubricGroup))

Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group.

candidateResponses[] object (CandidateResponse)

Optional. Responses from the model under test and from other baseline models, for comparison.

JSON representation
{
  "prompt": {
    object (EvaluationPrompt)
  },
  "goldenResponse": {
    object (CandidateResponse)
  },
  "rubrics": {
    string: {
      object (RubricGroup)
    },
    ...
  },
  "candidateResponses": [
    {
      object (CandidateResponse)
    }
  ]
}
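The following Python sketch shows one way an EvaluationRequest with a text prompt, a golden response, and two candidate responses might be assembled. The helper names, the candidate names (`golden`, `model-under-test`, `baseline`), and the sample texts are hypothetical; only the JSON field names come from the representation above.

```python
import json
from typing import Any, Dict, List, Optional


def text_response(candidate: str, text: str) -> Dict[str, str]:
    """CandidateResponse that uses the `text` member of its data union."""
    return {"candidate": candidate, "text": text}


def build_evaluation_request(
    prompt_text: str,
    candidate_responses: List[Dict[str, Any]],
    golden_response: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble an EvaluationRequest dict."""
    request: Dict[str, Any] = {"prompt": {"text": prompt_text}}
    # Optional fields are left out entirely when they are not provided.
    if golden_response is not None:
        request["goldenResponse"] = golden_response
    if candidate_responses:
        request["candidateResponses"] = list(candidate_responses)
    return request


request = build_evaluation_request(
    prompt_text="What is the capital of France?",
    candidate_responses=[
        text_response("model-under-test", "Paris."),
        text_response("baseline", "The capital of France is Paris."),
    ],
    golden_response=text_response("golden", "Paris"),
)
print(json.dumps(request, indent=2))
```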

EvaluationPrompt

Prompt to be evaluated.

Fields
data Union type
Prompt can be in one of the following formats. data can be only one of the following:
text string

Text prompt.

value value (Value format)

Fields and values that can be used to populate the prompt template.

promptTemplateData object (PromptTemplateData)

Prompt template data.

JSON representation
{

  // data
  "text": string,
  "value": value,
  "promptTemplateData": {
    object (PromptTemplateData)
  }
  // Union type
}
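Because data is a union, a consumer may want to check which member is set before reading it. The Python sketch below is one possible check on a decoded JSON dict; `selected_member` is a hypothetical helper, not part of any client library.

```python
from typing import Any, Mapping, Optional, Sequence

# Member names of the EvaluationPrompt data union, from the JSON representation above.
PROMPT_DATA_MEMBERS = ("text", "value", "promptTemplateData")


def selected_member(message: Mapping[str, Any], members: Sequence[str]) -> Optional[str]:
    """Return the single union member present in `message`, or None if none is set.

    Raises ValueError when more than one member is present.
    """
    present = [name for name in members if name in message]
    if len(present) > 1:
        raise ValueError(f"only one of {list(members)} may be set, got {present}")
    return present[0] if present else None


print(selected_member({"text": "Summarize this article."}, PROMPT_DATA_MEMBERS))  # prints: text
```

The same helper works for the other unions in this resource by passing a different tuple of member names.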

PromptTemplateData

Message that holds a prompt template and the values used to populate it.

Fields
values map (key: string, value: object (Content))

The values for fields in the prompt template.

JSON representation
{
  "values": {
    string: {
      object (Content)
    },
    ...
  }
}
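A minimal Python sketch of filling the values map follows. Content is not defined in this section, so the shape used here (a `role` plus a list of `parts` with a `text` entry) is an assumption, and the helper names and template keys (`article`, `audience`) are hypothetical.

```python
from typing import Any, Dict


def text_content(text: str, role: str = "user") -> Dict[str, Any]:
    """Build a text-only Content dict.

    Assumption: Content has `role` and `parts`; this section does not define it.
    """
    return {"role": role, "parts": [{"text": text}]}


def build_prompt_template_data(**fields: str) -> Dict[str, Any]:
    """Hypothetical helper: map each template field name to a text Content."""
    return {"values": {key: text_content(value) for key, value in fields.items()}}


# An EvaluationPrompt that uses the promptTemplateData member of its data union.
prompt = {
    "promptTemplateData": build_prompt_template_data(
        article="Sample article text.",
        audience="general readers",
    )
}
```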

CandidateResponse

Responses from model or agent.

Fields
candidate string

Required. The name of the candidate that produced the response.

data Union type
The response from the model or agent. data can be only one of the following:
text string

Text response.

value value (Value format)

Fields and values that can be used to populate the response template.

JSON representation
{
  "candidate": string,

  // data
  "text": string,
  "value": value
  // Union type
}

RubricGroup

A group of rubrics, used for grouping rubrics based on a metric or a version.

Fields
groupId string

Unique identifier for the group.

displayName string

Human-readable name for the group. This should be unique within a given context if used for display or selection. Examples: "Instruction Following V1", "Content Quality - Summarization Task".

rubrics[] object (Rubric)

Rubrics that are part of this group.

JSON representation
{
  "groupId": string,
  "displayName": string,
  "rubrics": [
    {
      object (Rubric)
    }
  ]
}

EvaluationResult

Evaluation result.

Fields
evaluationRequest string

Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluationItem}

evaluationRun string

Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluationRun}

request object (EvaluationRequest)

Required. The request that was evaluated.

metric string

Required. The metric that was evaluated.

candidateResults[] object (CandidateResult)

Optional. The results for the metric.

metadata value (Value format)

Optional. Metadata about the evaluation result.

JSON representation
{
  "evaluationRequest": string,
  "evaluationRun": string,
  "request": {
    object (EvaluationRequest)
  },
  "metric": string,
  "candidateResults": [
    {
      object (CandidateResult)
    }
  ],
  "metadata": value
}
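The evaluationRequest and evaluationRun fields are resource names with the formats given above. As an illustrative sketch, the Python code below splits such names into their components with regular expressions; the function names and the sample project, location, and IDs are hypothetical.

```python
import re
from typing import Dict

# Patterns derived from the documented formats:
#   projects/{project}/locations/{location}/evaluationItems/{evaluationItem}
#   projects/{project}/locations/{location}/evaluationRuns/{evaluationRun}
_ITEM_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/evaluationItems/(?P<evaluation_item>[^/]+)$"
)
_RUN_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/evaluationRuns/(?P<evaluation_run>[^/]+)$"
)


def _parse(pattern: "re.Pattern[str]", name: str) -> Dict[str, str]:
    match = pattern.match(name)
    if match is None:
        raise ValueError(f"unexpected resource name: {name!r}")
    return match.groupdict()


def parse_evaluation_item_name(name: str) -> Dict[str, str]:
    """Split an EvaluationItem resource name into its components."""
    return _parse(_ITEM_RE, name)


def parse_evaluation_run_name(name: str) -> Dict[str, str]:
    """Split an EvaluationRun resource name into its components."""
    return _parse(_RUN_RE, name)


# Placeholder identifiers for illustration only.
parts = parse_evaluation_item_name(
    "projects/example-project/locations/us-central1/evaluationItems/123"
)
```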

CandidateResult

Result for a single candidate.

Fields
candidate string

Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest.

metric string

Required. The metric that was evaluated.

explanation string

Optional. The explanation for the metric.

rubricVerdicts[] object (RubricVerdict)

Optional. The rubric verdicts for the metric.

additionalResults value (Value format)

Optional. Additional results for the metric.

result Union type
The result for the metric. result can be only one of the following:
score number

Optional. The score for the metric.

JSON representation
{
  "candidate": string,
  "metric": string,
  "explanation": string,
  "rubricVerdicts": [
    {
      object (RubricVerdict)
    }
  ],
  "additionalResults": value,

  // result
  "score": number
  // Union type
}
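To compare candidates, a consumer might collect the score of each CandidateResult in an EvaluationResult. The Python sketch below does this on a decoded JSON dict; `scores_by_candidate`, the metric name, and the sample values are hypothetical. Since score is optional, candidates without one are skipped rather than given a default.

```python
from typing import Any, Dict, Mapping


def scores_by_candidate(evaluation_result: Mapping[str, Any]) -> Dict[str, float]:
    """Map candidate name to score for every CandidateResult that has a score."""
    scores: Dict[str, float] = {}
    for candidate_result in evaluation_result.get("candidateResults", []):
        # `score` is an optional union member, so it may be absent.
        if "score" in candidate_result:
            scores[candidate_result["candidate"]] = candidate_result["score"]
    return scores


# Illustrative data only; names and numbers are made up.
result = {
    "metric": "example_metric",
    "candidateResults": [
        {"candidate": "model-under-test", "metric": "example_metric", "score": 0.8},
        {"candidate": "baseline", "metric": "example_metric", "score": 0.5},
        {"candidate": "no-score", "metric": "example_metric", "explanation": "not scored"},
    ],
}
print(scores_by_candidate(result))
```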

RubricVerdict

Represents the verdict of an evaluation against a single rubric.

Fields
evaluatedRubric object (Rubric)

Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated.

verdict boolean

Required. Outcome of the evaluation against the rubric, represented as a boolean. true indicates a "Pass", false indicates a "Fail".

reasoning string

Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict.

JSON representation
{
  "evaluatedRubric": {
    object (Rubric)
  },
  "verdict": boolean,
  "reasoning": string
}
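Since verdict is a boolean where true means "Pass", a pass rate over a candidate's rubric verdicts can be computed directly. The Python sketch below is one way to do that; `rubric_pass_rate` and the sample reasoning strings are hypothetical. It treats a missing verdict key as false, on the assumption that a JSON serializer may omit default-valued fields.

```python
from typing import Any, Mapping, Optional


def rubric_pass_rate(candidate_result: Mapping[str, Any]) -> Optional[float]:
    """Fraction of rubric verdicts that passed, or None if there are no verdicts."""
    verdicts = candidate_result.get("rubricVerdicts", [])
    if not verdicts:
        return None
    # Assumption: a verdict dict without a "verdict" key counts as a fail.
    passed = sum(1 for verdict in verdicts if verdict.get("verdict", False) is True)
    return passed / len(verdicts)


# Illustrative data only.
candidate_result = {
    "candidate": "model-under-test",
    "rubricVerdicts": [
        {"verdict": True, "reasoning": "Follows the requested format."},
        {"verdict": True, "reasoning": "Cites the source passage."},
        {"verdict": False, "reasoning": "Exceeds the length limit."},
    ],
}
rate = rubric_pass_rate(candidate_result)  # 2 of 3 verdicts passed
```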

EvaluationItemType

The type of the EvaluationItem.

Enums
EVALUATION_ITEM_TYPE_UNSPECIFIED The default value. This value is unused.
REQUEST The EvaluationItem is a request to evaluate.
RESULT The EvaluationItem is the result of evaluation.
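The Python sketch below illustrates one possible client-side sanity check that an item's type and payload agree. The pairing it uses (REQUEST with evaluationRequest, RESULT with evaluationResponse, and gcsUri accepted for either) is an assumption inferred from the field descriptions, not a rule stated by this reference; `payload_matches_type` is a hypothetical helper.

```python
from typing import Any, Mapping

# Assumed pairing between item type and inline payload member.
EXPECTED_PAYLOAD = {
    "REQUEST": "evaluationRequest",
    "RESULT": "evaluationResponse",
}


def payload_matches_type(item: Mapping[str, Any]) -> bool:
    """Return True if the payload member looks consistent with evaluationItemType."""
    item_type = item.get("evaluationItemType", "EVALUATION_ITEM_TYPE_UNSPECIFIED")
    expected = EXPECTED_PAYLOAD.get(item_type)
    if expected is None:
        # Unspecified or unknown type: nothing to match against.
        return False
    if "gcsUri" in item:
        # The payload lives in Cloud Storage, so it cannot be checked here.
        return True
    return expected in item


ok = payload_matches_type(
    {"evaluationItemType": "REQUEST", "evaluationRequest": {"prompt": {"text": "Hi"}}}
)
```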

Methods

create

Creates an Evaluation Item.

delete

Deletes an Evaluation Item.

get

Gets an Evaluation Item.

list

Lists Evaluation Items.