Resource: EvaluationItem
An EvaluationItem is a single evaluation request or result. Its content is immutable: it cannot be updated once created. EvaluationItems can be deleted when no longer needed.
name string
Identifier. The resource name of the EvaluationItem. Format: projects/{project}/locations/{location}/evaluationItems/{evaluationItem}
displayName string
Required. The display name of the EvaluationItem.
Optional. Metadata for the EvaluationItem.
labels map (key: string, value: string)
Optional. Labels for the EvaluationItem.
Required. The type of the EvaluationItem.
Output only. Timestamp when this item was created.
Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
Output only. Error for the evaluation item.
payload Union type
payload can be only one of the following:
The request to evaluate.
Output only. The response from evaluation.
gcsUri string
The Cloud Storage object where the request or response is stored.
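The creation timestamp described above is RFC 3339 and may carry up to nine fractional digits or a non-"Z" offset, which not every parser accepts directly. The following is a minimal sketch of one way to normalize such values in Python; `parse_rfc3339` is a hypothetical helper name, not part of any client library, and it truncates nanoseconds to microseconds because `datetime` stores no finer precision.

```python
import re
from datetime import datetime, timezone

# Date, time, optional 1-9 fractional digits, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(ts: str) -> datetime:
    """Parse an RFC 3339 timestamp into a timezone-aware UTC datetime.

    Hypothetical helper for illustration. Fractional digits beyond
    microseconds are truncated, not rounded.
    """
    m = _RFC3339.match(ts)
    if m is None:
        raise ValueError(f"not an RFC 3339 timestamp: {ts!r}")
    date, time, frac, offset = m.groups()
    # Pad or cut the fraction to exactly six digits for %f.
    micros = (frac or "").ljust(6, "0")[:6]
    # strptime's %z understands "+00:00" style offsets.
    offset = "+00:00" if offset == "Z" else offset
    dt = datetime.strptime(
        f"{date}T{time}.{micros}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z"
    )
    return dt.astimezone(timezone.utc)
```

Converting everything to UTC mirrors the Z-normalized form that generated output uses, so parsed values compare consistently regardless of the offset in the input.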
| JSON representation |
|---|
{ "name": string, "displayName": string, "metadata": value, "labels": { string: string, ... }, "evaluationItemType": enum ( |
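The resource name format given for the name field can be built and split with plain string handling. The sketch below assumes each path segment contains no "/" characters; the function names are hypothetical and chosen only for this illustration.

```python
import re

# Pattern for projects/{project}/locations/{location}/evaluationItems/{evaluationItem}.
# Assumption: no segment contains a "/" character.
_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/evaluationItems/(?P<evaluation_item>[^/]+)$"
)


def evaluation_item_name(project: str, location: str, evaluation_item: str) -> str:
    """Build a resource name from its three components."""
    return (
        f"projects/{project}/locations/{location}"
        f"/evaluationItems/{evaluation_item}"
    )


def parse_evaluation_item_name(name: str) -> dict:
    """Split a resource name into its components, or raise ValueError."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not an EvaluationItem resource name: {name!r}")
    return m.groupdict()
```

For example, `parse_evaluation_item_name(evaluation_item_name("p", "l", "e"))` round-trips to the original three components.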
EvaluationRequest
Single evaluation request.
Required. The request/prompt to evaluate.
Optional. The ideal response or ground truth.
Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group.
Optional. Responses from model under test and other baseline models for comparison.
| JSON representation |
|---|
{ "prompt": { object ( |
EvaluationPrompt
Prompt to be evaluated.
data Union type
data can be only one of the following:
text string
Text prompt.
Fields and values that can be used to populate the prompt template.
Prompt template data.
| JSON representation |
|---|
{
// data
"text": string,
"value": value,
"promptTemplateData": {
object ( |
PromptTemplateData
CandidateResponse
Responses from model or agent.
candidate string
Required. The name of the candidate that produced the response.
data Union type
data can be only one of the following:
text string
Text response.
Fields and values that can be used to populate the response template.
| JSON representation |
|---|
{
"candidate": string,
// data
"text": string,
"value": value
// Union type
} |
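Because data is a union, a CandidateResponse body should carry either text or value, not both. The following sketch builds such a body as a plain dictionary and rejects ambiguous input. The helper name is hypothetical, and requiring exactly one member (rather than allowing none) is an assumption of this example, not something the reference states.

```python
from typing import Any

# Sentinel so that None or an empty string can still be passed explicitly.
_UNSET: Any = object()


def make_candidate_response(
    candidate: str, *, text: Any = _UNSET, value: Any = _UNSET
) -> dict:
    """Build a CandidateResponse-shaped dict with one union member set.

    Hypothetical helper: enforces that exactly one of text/value is given.
    """
    if not candidate:
        raise ValueError("candidate is required")
    provided = {
        key: val
        for key, val in (("text", text), ("value", value))
        if val is not _UNSET
    }
    if len(provided) != 1:
        raise ValueError("exactly one of 'text' or 'value' must be set")
    return {"candidate": candidate, **provided}
```

A call such as `make_candidate_response("model-a", text="Paris")` yields a dictionary that can be serialized with `json.dumps` as one entry of a request's candidate responses.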
RubricGroup
A group of rubrics, used for grouping rubrics based on a metric or a version.
groupId string
Unique identifier for the group.
displayName string
Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task".
Rubrics that are part of this group.
| JSON representation |
|---|
{
"groupId": string,
"displayName": string,
"rubrics": [
{
object ( |
EvaluationResult
Evaluation result.
evaluationRequest string
Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluationItem}
evaluationRun string
Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluationRun}
Required. The request that was evaluated.
metric string
Required. The metric that was evaluated.
Optional. The results for the metric.
Optional. Metadata about the evaluation result.
| JSON representation |
|---|
{ "evaluationRequest": string, "evaluationRun": string, "request": { object ( |
CandidateResult
Result for a single candidate.
candidate string
Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest.
metric string
Required. The metric that was evaluated.
explanation string
Optional. The explanation for the metric.
Optional. The rubric verdicts for the metric.
Optional. Additional results for the metric.
result Union type
result can be only one of the following:
score number
Optional. The score for the metric.
| JSON representation |
|---|
{
"candidate": string,
"metric": string,
"explanation": string,
"rubricVerdicts": [
{
object ( |
RubricVerdict
Represents the verdict of an evaluation against a single rubric.
Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated.
verdict boolean
Required. Outcome of the evaluation against the rubric, represented as a boolean. true indicates a "Pass", false indicates a "Fail".
reasoning string
Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict.
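Since each verdict is a boolean pass or fail, a list of RubricVerdict objects is easy to summarize on the client side. The sketch below computes the fraction of passing verdicts from dictionaries shaped like the JSON representation. This aggregation is purely illustrative: the reference does not say that the service derives a metric score this way, and `rubric_pass_rate` is a hypothetical name.

```python
from typing import Optional


def rubric_pass_rate(verdicts: list) -> Optional[float]:
    """Return the fraction of verdicts whose "verdict" field is True.

    Each element is expected to be a dict with a boolean "verdict" key.
    Returns None for an empty list, since no rate is defined.
    """
    if not verdicts:
        return None
    passed = sum(1 for v in verdicts if v["verdict"] is True)
    return passed / len(verdicts)
```

Returning `None` for an empty list keeps "no rubrics were evaluated" distinguishable from "every rubric failed", which would otherwise both read as 0.0.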
| JSON representation |
|---|
{
"evaluatedRubric": {
object ( |
EvaluationItemType
The type of the EvaluationItem.
| Enums | |
|---|---|
| EVALUATION_ITEM_TYPE_UNSPECIFIED | The default value. This value is unused. |
| REQUEST | The EvaluationItem is a request to evaluate. |
| RESULT | The EvaluationItem is the result of evaluation. |
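Client code that branches on the item type can mirror these values in a local enum. The sketch below is one possible Python mapping, assuming the type arrives as its string name under the `evaluationItemType` key shown in the JSON representation; the `item_type_of` helper and the choice to treat a missing key as unspecified are assumptions of this example.

```python
from enum import Enum


class EvaluationItemType(Enum):
    """Local mirror of the EvaluationItemType enum values."""

    EVALUATION_ITEM_TYPE_UNSPECIFIED = "EVALUATION_ITEM_TYPE_UNSPECIFIED"
    REQUEST = "REQUEST"
    RESULT = "RESULT"


def item_type_of(item: dict) -> EvaluationItemType:
    """Read the type from an EvaluationItem-shaped dict.

    A missing key is treated as the unspecified default (an assumption);
    an unknown string raises ValueError from the Enum lookup.
    """
    raw = item.get("evaluationItemType", "EVALUATION_ITEM_TYPE_UNSPECIFIED")
    return EvaluationItemType(raw)
```

With this in place, a caller can write `if item_type_of(item) is EvaluationItemType.RESULT:` instead of comparing raw strings.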
| Methods | |
|---|---|
| | Creates an Evaluation Item. |
| | Deletes an Evaluation Item. |
| | Gets an Evaluation Item. |
| | Lists Evaluation Items. |