Resource: EvaluationItem
EvaluationItem is a single evaluation request or result. The content of an EvaluationItem is immutable - it cannot be updated once created. EvaluationItems can be deleted when no longer needed.
name
string
Identifier. The resource name of the EvaluationItem. Format: projects/{project}/locations/{location}/evaluationItems/{evaluationItem}
displayName
string
Required. The display name of the EvaluationItem.
metadata
value
Optional. Metadata for the EvaluationItem.
labels
map (key: string, value: string)
Optional. Labels for the EvaluationItem.
evaluationItemType
enum (EvaluationItemType)
Required. The type of the EvaluationItem.
Output only. Timestamp when this item was created.
Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
Output only. Error for the evaluation item.
payload
Union type
payload can be only one of the following:
The request to evaluate.
Output only. The response from evaluation.
gcsUri
string
The Cloud Storage object where the request or response is stored.
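The creation timestamp described above can be read with a small parser. The following is a minimal sketch that handles the three example forms given for the RFC 3339 field; `parse_timestamp` is a hypothetical helper name, not part of any client library, and nanosecond digits are truncated because Python datetimes only hold microseconds.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the RFC 3339 forms shown in the field description: date, "T", time,
# optional fractional seconds (up to 9 digits), then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp into a timezone-aware UTC datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    year, month, day, hour, minute, second, fraction, zone = match.groups()
    # Pad the fraction to 9 digits, then keep the first 6 (microseconds).
    micros = int((fraction or "").ljust(9, "0")[:6])
    if zone == "Z":
        tz = timezone.utc
    else:
        sign = 1 if zone[0] == "+" else -1
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
        tz = timezone(sign * offset)
    local = datetime(
        int(year), int(month), int(day),
        int(hour), int(minute), int(second), micros, tzinfo=tz,
    )
    # Normalize to UTC so values with different offsets compare directly.
    return local.astimezone(timezone.utc)
```

Normalizing to UTC mirrors the Z-normalized form that generated output uses, so parsed values from requests and responses can be compared without further conversion.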
JSON representation |
---|
{ "name": string, "displayName": string, "metadata": value, "labels": { string: string, ... }, "evaluationItemType": enum ( |
EvaluationRequest
Single evaluation request.
Required. The request/prompt to evaluate.
Optional. The ideal response or ground truth.
Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group.
Optional. Responses from the model under test and from other baseline models, for comparison.
JSON representation |
---|
{ "prompt": { object ( |
EvaluationPrompt
Prompt to be evaluated.
data
Union type
data can be only one of the following:
text
string
Text prompt.
value
value
Fields and values that can be used to populate the prompt template.
promptTemplateData
object (PromptTemplateData)
Prompt template data.
JSON representation |
---|
{
// data
"text": string,
"value": value,
"promptTemplateData": {
object ( |
PromptTemplateData
CandidateResponse
Responses from model or agent.
candidate
string
Required. The name of the candidate that produced the response.
data
Union type
data can be only one of the following:
text
string
Text response.
value
value
Fields and values that can be used to populate the response template.
JSON representation |
---|
{
"candidate": string,
// data
"text": string,
"value": value
// Union type
} |
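The CandidateResponse shape above can be assembled and checked client-side before it is sent. This is a minimal sketch: `make_candidate_response` is a hypothetical helper, and requiring exactly one union member is an assumption, since the reference only states that `text` and `value` are mutually exclusive.

```python
from typing import Any, Optional


def make_candidate_response(
    candidate: str, text: Optional[str] = None, value: Any = None
) -> dict:
    """Build a CandidateResponse dict with one member of the data union set."""
    if not candidate:
        raise ValueError("candidate is required")
    # None is treated as "not set" for both union members.
    provided = {
        key: val for key, val in (("text", text), ("value", value))
        if val is not None
    }
    # Assumption: a response should carry data, so an empty union is rejected
    # along with the case where both members are supplied.
    if len(provided) != 1:
        raise ValueError("set exactly one of 'text' or 'value'")
    return {"candidate": candidate, **provided}
```

For example, `make_candidate_response("model-a", text="Paris")` yields a dict with only the `candidate` and `text` keys, which serializes directly to the JSON representation.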
RubricGroup
A group of rubrics, used for grouping rubrics based on a metric or a version.
groupId
string
Unique identifier for the group.
displayName
string
Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task".
Rubrics that are part of this group.
JSON representation |
---|
{
"groupId": string,
"displayName": string,
"rubrics": [
{
object ( |
EvaluationResult
Evaluation result.
evaluationRequest
string
Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluationItem}
evaluationRun
string
Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluationRun}
Required. The request that was evaluated.
metric
string
Required. The metric that was evaluated.
Optional. The results for the metric.
Optional. Metadata about the evaluation result.
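The `evaluationRequest` and `evaluationRun` fields both hold resource names in the formats given above. The sketch below builds and parses those names; the helper names are hypothetical, and the assumption that each path segment contains no `/` is not stated in the reference.

```python
import re

# Assumption: individual IDs never contain "/".
_ITEM_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/evaluationItems/(?P<evaluation_item>[^/]+)$"
)


def evaluation_item_name(project: str, location: str, evaluation_item: str) -> str:
    """Format an EvaluationItem resource name."""
    return f"projects/{project}/locations/{location}/evaluationItems/{evaluation_item}"


def evaluation_run_name(project: str, location: str, evaluation_run: str) -> str:
    """Format an evaluation run resource name."""
    return f"projects/{project}/locations/{location}/evaluationRuns/{evaluation_run}"


def parse_evaluation_item_name(name: str) -> dict:
    """Split an EvaluationItem resource name into its components."""
    match = _ITEM_NAME.match(name)
    if match is None:
        raise ValueError(f"not an EvaluationItem resource name: {name!r}")
    return match.groupdict()
```

Parsing the name up front makes it easy to reject an evaluation run name passed where an EvaluationItem name is expected.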
JSON representation |
---|
{ "evaluationRequest": string, "evaluationRun": string, "request": { object ( |
CandidateResult
Result for a single candidate.
candidate
string
Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest.
metric
string
Required. The metric that was evaluated.
explanation
string
Optional. The explanation for the metric.
Optional. The rubric verdicts for the metric.
Optional. Additional results for the metric.
result
Union type
result can be only one of the following:
score
number
Optional. The score for the metric.
JSON representation |
---|
{
"candidate": string,
"metric": string,
"explanation": string,
"rubricVerdicts": [
{
object ( |
RubricVerdict
Represents the verdict of an evaluation against a single rubric.
Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated.
verdict
boolean
Required. Outcome of the evaluation against the rubric, represented as a boolean. true indicates a "Pass", false indicates a "Fail".
reasoning
string
Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict.
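Because each verdict is a plain boolean with optional reasoning, a list of RubricVerdict objects is easy to summarize on the client. The sketch below is illustrative only: `pass_rate` and `failure_reasons` are hypothetical helpers, and the reference does not say that the service computes such an aggregate itself.

```python
from typing import Iterable, List, Mapping


def pass_rate(rubric_verdicts: Iterable[Mapping]) -> float:
    """Return the fraction of verdicts that are true ("Pass")."""
    verdicts = [bool(item["verdict"]) for item in rubric_verdicts]
    if not verdicts:
        raise ValueError("no rubric verdicts to aggregate")
    return sum(verdicts) / len(verdicts)


def failure_reasons(rubric_verdicts: Iterable[Mapping]) -> List[str]:
    """Collect the reasoning text of failed verdicts, skipping missing ones."""
    return [
        item["reasoning"]
        for item in rubric_verdicts
        # reasoning is optional, so only keep entries that actually have it.
        if not item["verdict"] and item.get("reasoning")
    ]
```

Reading `reasoning` with `.get` reflects that the field is optional, while `verdict` is accessed directly because it is required.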
JSON representation |
---|
{
"evaluatedRubric": {
object ( |
EvaluationItemType
The type of the EvaluationItem.
Enums | |
---|---|
EVALUATION_ITEM_TYPE_UNSPECIFIED | The default value. This value is unused. |
REQUEST | The EvaluationItem is a request to evaluate. |
RESULT | The EvaluationItem is the result of evaluation. |
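Client code can mirror the enum above to branch on item type safely. This is a minimal sketch that assumes the enum is serialized as its string name under the `evaluationItemType` key, and that an absent field should be read as the unspecified default; `item_type_of` is a hypothetical helper.

```python
from enum import Enum


class EvaluationItemType(str, Enum):
    """Local mirror of the EvaluationItemType values listed above."""

    EVALUATION_ITEM_TYPE_UNSPECIFIED = "EVALUATION_ITEM_TYPE_UNSPECIFIED"
    REQUEST = "REQUEST"
    RESULT = "RESULT"


def item_type_of(item: dict) -> EvaluationItemType:
    """Read the type of an EvaluationItem dict parsed from JSON."""
    # Assumption: a missing field means the unspecified default value.
    raw = item.get("evaluationItemType", "EVALUATION_ITEM_TYPE_UNSPECIFIED")
    # Raises ValueError for names that are not part of the enum.
    return EvaluationItemType(raw)
```

Failing loudly on unknown names keeps a newly added server-side value from being silently treated as a request or a result.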
Methods | |
---|---|
create | Creates an Evaluation Item. |
delete | Deletes an Evaluation Item. |
get | Gets an Evaluation Item. |
list | Lists Evaluation Items. |