Package google.cloud.aiplatform.v1

Index

EvaluationService

Vertex AI Online Evaluation Service.

EvaluateInstances

rpc EvaluateInstances(EvaluateInstancesRequest) returns (EvaluateInstancesResponse)

Evaluates instances based on a given metric.

GenAiTuningService

A service for creating and managing GenAI Tuning Jobs.

CancelTuningJob

rpc CancelTuningJob(CancelTuningJobRequest) returns (Empty)

Cancels a TuningJob. Starts asynchronous cancellation on the TuningJob. The server makes a best effort to cancel the job, but success is not guaranteed. Clients can use GenAiTuningService.GetTuningJob or other methods to check whether the cancellation succeeded or whether the job completed despite cancellation. On successful cancellation, the TuningJob is not deleted; instead it becomes a job with a TuningJob.error value with a google.rpc.Status.code of 1, corresponding to Code.CANCELLED, and TuningJob.state is set to CANCELLED.

IAM Permissions

Requires the following IAM permission on the name resource:

  • aiplatform.tuningJobs.cancel

For more information, see the IAM documentation.
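The cancellation semantics above can be sketched in plain Python: building the resource name documented by CancelTuningJobRequest, and checking the fields a successfully cancelled job ends up with. This is an illustration only; the actual RPC would be issued through a generated client such as google.cloud.aiplatform_v1.GenAiTuningServiceClient, and the `JOB_STATE_CANCELLED` enum string is assumed here (the text above says only that state is set to CANCELLED).

```python
def tuning_job_name(project: str, location: str, tuning_job: str) -> str:
    """Resource name format documented by CancelTuningJobRequest.name."""
    return f"projects/{project}/locations/{location}/tuningJobs/{tuning_job}"


def is_cancelled(job: dict) -> bool:
    """A successfully cancelled TuningJob is not deleted; its state becomes
    CANCELLED and its error carries google.rpc.Code.CANCELLED (code 1)."""
    return (job.get("state") == "JOB_STATE_CANCELLED"
            and job.get("error", {}).get("code") == 1)


name = tuning_job_name("my-project", "us-central1", "123")
# → "projects/my-project/locations/us-central1/tuningJobs/123"
```

Because cancellation is best-effort and asynchronous, a client would poll GetTuningJob with this name until `is_cancelled` holds or the job reaches a terminal state on its own.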

CreateTuningJob

rpc CreateTuningJob(CreateTuningJobRequest) returns (TuningJob)

Creates a TuningJob. A newly created TuningJob is scheduled to run immediately.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • aiplatform.tuningJobs.create

For more information, see the IAM documentation.

GetTuningJob

rpc GetTuningJob(GetTuningJobRequest) returns (TuningJob)

Gets a TuningJob.

IAM Permissions

Requires the following IAM permission on the name resource:

  • aiplatform.tuningJobs.get

For more information, see the IAM documentation.

ListTuningJobs

rpc ListTuningJobs(ListTuningJobsRequest) returns (ListTuningJobsResponse)

Lists TuningJobs in a Location.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • aiplatform.tuningJobs.list

For more information, see the IAM documentation.

RebaseTunedModel

rpc RebaseTunedModel(RebaseTunedModelRequest) returns (Operation)

Rebases a TunedModel.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • aiplatform.tuningJobs.create

For more information, see the IAM documentation.

PredictionService

A service for online predictions and explanations.

ChatCompletions

rpc ChatCompletions(ChatCompletionsRequest) returns (HttpBody)

Exposes an OpenAI-compatible endpoint for chat completions.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

GenerateContent

rpc GenerateContent(GenerateContentRequest) returns (GenerateContentResponse)

Generate content with multimodal inputs.

IAM Permissions

Requires the following IAM permission on the model resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

Predict

rpc Predict(PredictRequest) returns (PredictResponse)

Perform an online prediction.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

ServerStreamingPredict

rpc ServerStreamingPredict(StreamingPredictRequest) returns (StreamingPredictResponse)

Perform a server-side streaming online prediction request for Vertex LLM streaming.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

StreamDirectPredict

rpc StreamDirectPredict(StreamDirectPredictRequest) returns (StreamDirectPredictResponse)

Perform a streaming online prediction request to a gRPC model server for Vertex first-party products and frameworks.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

StreamDirectRawPredict

rpc StreamDirectRawPredict(StreamDirectRawPredictRequest) returns (StreamDirectRawPredictResponse)

Perform a streaming online prediction request to a gRPC model server for custom containers.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

StreamGenerateContent

rpc StreamGenerateContent(GenerateContentRequest) returns (GenerateContentResponse)

Generate content with multimodal inputs with streaming support.

IAM Permissions

Requires the following IAM permission on the model resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

StreamingPredict

rpc StreamingPredict(StreamingPredictRequest) returns (StreamingPredictResponse)

Perform a streaming online prediction request for Vertex first-party products and frameworks.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

StreamingRawPredict

rpc StreamingRawPredict(StreamingRawPredictRequest) returns (StreamingRawPredictResponse)

Perform a streaming online prediction request through gRPC.

IAM Permissions

Requires the following IAM permission on the endpoint resource:

  • aiplatform.endpoints.predict

For more information, see the IAM documentation.

BleuInput

Input for bleu metric.

Fields
metric_spec BleuSpec

Required. Spec for bleu score metric.

instances[] BleuInstance

Required. Repeated bleu instances.

BleuInstance

Spec for bleu instance.

Fields
prediction string

Required. Output of the evaluated model.

reference string

Required. Ground truth used to compare against the prediction.

BleuMetricValue

Bleu metric value for an instance.

Fields
score float

Output only. Bleu score.

BleuResults

Results for bleu metric.

Fields
bleu_metric_values[] BleuMetricValue

Output only. Bleu metric values.

BleuSpec

Spec for bleu score metric - calculates the precision of n-grams in the prediction as compared to the reference - returns a score ranging from 0 to 1.

Fields
use_effective_order bool

Optional. Whether to use effective order when computing the bleu score.

Blob

Content blob.

When possible, send content as text directly rather than as raw bytes.

Fields
mime_type string

Required. The IANA standard MIME type of the source data.

data bytes

Required. Raw bytes.

CancelTuningJobRequest

Request message for GenAiTuningService.CancelTuningJob.

Fields
name string

Required. The name of the TuningJob to cancel. Format: projects/{project}/locations/{location}/tuningJobs/{tuning_job}

Candidate

A response candidate generated from the model.

Fields
index int32

Output only. Index of the candidate.

content Content

Output only. Content parts of the candidate.

avg_logprobs double

Output only. Average log probability score of the candidate.

logprobs_result LogprobsResult

Output only. Log-likelihood scores for the response tokens and top tokens.

finish_reason FinishReason

Output only. The reason why the model stopped generating tokens. If empty, the model has not stopped generating the tokens.

safety_ratings[] SafetyRating

Output only. List of ratings for the safety of a response candidate.

There is at most one rating per category.

citation_metadata CitationMetadata

Output only. Source attribution of the generated content.

grounding_metadata GroundingMetadata

Output only. Metadata specifying the sources used to ground generated content.

finish_message string

Output only. Describes in more detail the reason the model stopped generating tokens. This is only populated when finish_reason is set.

FinishReason

The reason why the model stopped generating tokens. If empty, the model has not stopped generating the tokens.

Enums
FINISH_REASON_UNSPECIFIED The finish reason is unspecified.
STOP Token generation reached a natural stopping point or a configured stop sequence.
MAX_TOKENS Token generation reached the configured maximum output tokens.
SAFETY Token generation stopped because the content potentially contains safety violations. NOTE: When streaming, content is empty if content filters block the output.
RECITATION Token generation stopped because the content potentially contains copyright violations.
OTHER All other reasons that stopped the token generation.
BLOCKLIST Token generation stopped because the content contains forbidden terms.
PROHIBITED_CONTENT Token generation stopped for potentially containing prohibited content.
SPII Token generation stopped because the content potentially contains Sensitive Personally Identifiable Information (SPII).
MALFORMED_FUNCTION_CALL The function call generated by the model is invalid.

ChatCompletionsRequest

Request message for [PredictionService.ChatCompletions]

Fields
endpoint string

Required. The name of the endpoint requested to serve the prediction. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

http_body HttpBody

Optional. The prediction input. Supports HTTP headers and arbitrary data payload.

Citation

Source attributions for content.

Fields
start_index int32

Output only. Start index into the content.

end_index int32

Output only. End index into the content.

uri string

Output only. URL reference of the attribution.

title string

Output only. Title of the attribution.

license string

Output only. License of the attribution.

publication_date Date

Output only. Publication date of the attribution.

CitationMetadata

A collection of source attributions for a piece of content.

Fields
citations[] Citation

Output only. List of citations.

CoherenceInput

Input for coherence metric.

Fields
metric_spec CoherenceSpec

Required. Spec for coherence score metric.

instance CoherenceInstance

Required. Coherence instance.

CoherenceInstance

Spec for coherence instance.

Fields
prediction string

Required. Output of the evaluated model.

CoherenceResult

Spec for coherence result.

Fields
explanation string

Output only. Explanation for coherence score.

score float

Output only. Coherence score.

confidence float

Output only. Confidence for coherence score.

CoherenceSpec

Spec for coherence score metric.

Fields
version int32

Optional. Which version to use for evaluation.

Content

The base structured datatype containing multi-part content of a message.

A Content includes a role field designating the producer of the Content and a parts field containing the multi-part data of the message turn.

Fields
role string

Optional. The producer of the content. Must be either 'user' or 'model'.

Useful to set for multi-turn conversations, otherwise can be left blank or unset.

parts[] Part

Required. Ordered Parts that constitute a single message. Parts may have different IANA MIME types.
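The role and parts fields above combine into a conversation history; a minimal sketch of a multi-turn `contents` list in REST/JSON form, with illustrative text values:

```python
# Each Content carries a producer role ('user' or 'model') and ordered parts.
contents = [
    {"role": "user", "parts": [{"text": "What is the capital of France?"}]},
    {"role": "model", "parts": [{"text": "The capital of France is Paris."}]},
    {"role": "user", "parts": [{"text": "And its population?"}]},
]

# In a multi-turn conversation the two producers alternate.
roles = [c["role"] for c in contents]
assert all(r in ("user", "model") for r in roles)
```

For a single-turn query, the list would contain just one user Content; role can then be left unset.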

CreateTuningJobRequest

Request message for GenAiTuningService.CreateTuningJob.

Fields
parent string

Required. The resource name of the Location to create the TuningJob in. Format: projects/{project}/locations/{location}

tuning_job TuningJob

Required. The TuningJob to create.

DynamicRetrievalConfig

Describes the options to customize dynamic retrieval.

Fields
mode Mode

The mode of the predictor to be used in dynamic retrieval.

dynamic_threshold float

Optional. The threshold to be used in dynamic retrieval. If not set, a system default value is used.

Mode

The mode of the predictor to be used in dynamic retrieval.

Enums
MODE_UNSPECIFIED Always trigger retrieval.
MODE_DYNAMIC Run retrieval only when system decides it is necessary.

EncryptionSpec

Represents a customer-managed encryption key spec that can be applied to a top-level resource.

Fields
kms_key_name string

Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.

EvaluateInstancesRequest

Request message for EvaluationService.EvaluateInstances.

Fields
location string

Required. The resource name of the Location to evaluate the instances. Format: projects/{project}/locations/{location}

Union field metric_inputs. Instances and specs for evaluation metric_inputs can be only one of the following:
exact_match_input ExactMatchInput

Auto metric instances. Instances and metric spec for exact match metric.

bleu_input BleuInput

Instances and metric spec for bleu metric.

rouge_input RougeInput

Instances and metric spec for rouge metric.

fluency_input FluencyInput

LLM-based metric instance. General text generation metrics, applicable to other categories. Input for fluency metric.

coherence_input CoherenceInput

Input for coherence metric.

safety_input SafetyInput

Input for safety metric.

groundedness_input GroundednessInput

Input for groundedness metric.

fulfillment_input FulfillmentInput

Input for fulfillment metric.

summarization_quality_input SummarizationQualityInput

Input for summarization quality metric.

pairwise_summarization_quality_input PairwiseSummarizationQualityInput

Input for pairwise summarization quality metric.

summarization_helpfulness_input SummarizationHelpfulnessInput

Input for summarization helpfulness metric.

summarization_verbosity_input SummarizationVerbosityInput

Input for summarization verbosity metric.

question_answering_quality_input QuestionAnsweringQualityInput

Input for question answering quality metric.

pairwise_question_answering_quality_input PairwiseQuestionAnsweringQualityInput

Input for pairwise question answering quality metric.

question_answering_relevance_input QuestionAnsweringRelevanceInput

Input for question answering relevance metric.

question_answering_helpfulness_input QuestionAnsweringHelpfulnessInput

Input for question answering helpfulness metric.

question_answering_correctness_input QuestionAnsweringCorrectnessInput

Input for question answering correctness metric.

pointwise_metric_input PointwiseMetricInput

Input for pointwise metric.

pairwise_metric_input PairwiseMetricInput

Input for pairwise metric.

tool_call_valid_input ToolCallValidInput

Tool call metric instances. Input for tool call valid metric.

tool_name_match_input ToolNameMatchInput

Input for tool name match metric.

tool_parameter_key_match_input ToolParameterKeyMatchInput

Input for tool parameter key match metric.

tool_parameter_kv_match_input ToolParameterKVMatchInput

Input for tool parameter key value match metric.
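Since metric_inputs is a union, a request carries exactly one metric at a time. A sketch of the request message fields in JSON form using the bleu_input member; the location path and instance strings are illustrative:

```python
# EvaluateInstancesRequest with the bleu_input member of the
# metric_inputs union set.
request = {
    "location": "projects/my-project/locations/us-central1",
    "bleuInput": {
        "metricSpec": {"useEffectiveOrder": True},
        "instances": [
            {"prediction": "the cat sat on the mat",
             "reference": "a cat sat on the mat"},
        ],
    },
}

# Exactly one member of the metric_inputs union may be set per request.
union_members = [k for k in request if k.endswith("Input")]
assert len(union_members) == 1
```

The response mirrors this shape: a bleu_input request yields a bleu_results member with one BleuMetricValue per instance, in order.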

EvaluateInstancesResponse

Response message for EvaluationService.EvaluateInstances.

Fields
Union field evaluation_results. Evaluation results are served in the same order as the instances presented in the request. evaluation_results can be only one of the following:
exact_match_results ExactMatchResults

Auto metric evaluation results. Results for exact match metric.

bleu_results BleuResults

Results for bleu metric.

rouge_results RougeResults

Results for rouge metric.

fluency_result FluencyResult

LLM-based metric evaluation result. General text generation metrics, applicable to other categories. Result for fluency metric.

coherence_result CoherenceResult

Result for coherence metric.

safety_result SafetyResult

Result for safety metric.

groundedness_result GroundednessResult

Result for groundedness metric.

fulfillment_result FulfillmentResult

Result for fulfillment metric.

summarization_quality_result SummarizationQualityResult

Summarization only metrics. Result for summarization quality metric.

pairwise_summarization_quality_result PairwiseSummarizationQualityResult

Result for pairwise summarization quality metric.

summarization_helpfulness_result SummarizationHelpfulnessResult

Result for summarization helpfulness metric.

summarization_verbosity_result SummarizationVerbosityResult

Result for summarization verbosity metric.

question_answering_quality_result QuestionAnsweringQualityResult

Question answering only metrics. Result for question answering quality metric.

pairwise_question_answering_quality_result PairwiseQuestionAnsweringQualityResult

Result for pairwise question answering quality metric.

question_answering_relevance_result QuestionAnsweringRelevanceResult

Result for question answering relevance metric.

question_answering_helpfulness_result QuestionAnsweringHelpfulnessResult

Result for question answering helpfulness metric.

question_answering_correctness_result QuestionAnsweringCorrectnessResult

Result for question answering correctness metric.

pointwise_metric_result PointwiseMetricResult

Generic metrics. Result for pointwise metric.

pairwise_metric_result PairwiseMetricResult

Result for pairwise metric.

tool_call_valid_results ToolCallValidResults

Tool call metrics. Results for tool call valid metric.

tool_name_match_results ToolNameMatchResults

Results for tool name match metric.

tool_parameter_key_match_results ToolParameterKeyMatchResults

Results for tool parameter key match metric.

tool_parameter_kv_match_results ToolParameterKVMatchResults

Results for tool parameter key value match metric.

ExactMatchInput

Input for exact match metric.

Fields
metric_spec ExactMatchSpec

Required. Spec for exact match metric.

instances[] ExactMatchInstance

Required. Repeated exact match instances.

ExactMatchInstance

Spec for exact match instance.

Fields
prediction string

Required. Output of the evaluated model.

reference string

Required. Ground truth used to compare against the prediction.

ExactMatchMetricValue

Exact match metric value for an instance.

Fields
score float

Output only. Exact match score.

ExactMatchResults

Results for exact match metric.

Fields
exact_match_metric_values[] ExactMatchMetricValue

Output only. Exact match metric values.

ExactMatchSpec

This type has no fields.

Spec for exact match metric - returns 1 if prediction and reference match exactly, otherwise 0.

FileData

URI based data.

Fields
mime_type string

Required. The IANA standard MIME type of the source data.

file_uri string

Required. URI.

FluencyInput

Input for fluency metric.

Fields
metric_spec FluencySpec

Required. Spec for fluency score metric.

instance FluencyInstance

Required. Fluency instance.

FluencyInstance

Spec for fluency instance.

Fields
prediction string

Required. Output of the evaluated model.

FluencyResult

Spec for fluency result.

Fields
explanation string

Output only. Explanation for fluency score.

score float

Output only. Fluency score.

confidence float

Output only. Confidence for fluency score.

FluencySpec

Spec for fluency score metric.

Fields
version int32

Optional. Which version to use for evaluation.

FulfillmentInput

Input for fulfillment metric.

Fields
metric_spec FulfillmentSpec

Required. Spec for fulfillment score metric.

instance FulfillmentInstance

Required. Fulfillment instance.

FulfillmentInstance

Spec for fulfillment instance.

Fields
prediction string

Required. Output of the evaluated model.

instruction string

Required. Inference instruction prompt to compare prediction with.

FulfillmentResult

Spec for fulfillment result.

Fields
explanation string

Output only. Explanation for fulfillment score.

score float

Output only. Fulfillment score.

confidence float

Output only. Confidence for fulfillment score.

FulfillmentSpec

Spec for fulfillment metric.

Fields
version int32

Optional. Which version to use for evaluation.

FunctionCall

A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values.

Fields
name string

Required. The name of the function to call. Matches [FunctionDeclaration.name].

args Struct

Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details.

FunctionCallingConfig

Function calling config.

Fields
mode Mode

Optional. Function calling mode.

allowed_function_names[] string

Optional. Function names to call. Only set when the Mode is ANY. Function names should match [FunctionDeclaration.name]. With mode set to ANY, model will predict a function call from the set of function names provided.

Mode

Function calling mode.

Enums
MODE_UNSPECIFIED Unspecified function calling mode. This value should not be used.
AUTO Default model behavior, model decides to predict either function calls or natural language response.
ANY Model is constrained to always predicting function calls only. If "allowed_function_names" are set, the predicted function calls will be limited to any one of "allowed_function_names", else the predicted function calls will be any one of the provided "function_declarations".
NONE Model will not predict any function calls. Model behavior is same as when not passing any function declarations.
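A ToolConfig using mode ANY can be sketched in REST/JSON form; the function name is illustrative:

```python
# FunctionCallingConfig with mode ANY: the model is constrained to emit a
# function call, and allowed_function_names narrows it to the listed names.
tool_config = {
    "functionCallingConfig": {
        "mode": "ANY",
        # Only honored when mode is ANY.
        "allowedFunctionNames": ["get_current_weather"],
    }
}

assert tool_config["functionCallingConfig"]["mode"] == "ANY"
```

With mode AUTO (the default), allowedFunctionNames would be omitted and the model decides between a function call and a natural-language response.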

FunctionDeclaration

Structured representation of a function declaration as defined by the OpenAPI 3.0 specification. Included in this declaration are the function name and parameters. This FunctionDeclaration is a representation of a block of code that can be used as a Tool by the model and executed by the client.

Fields
name string

Required. The name of the function to call. Must start with a letter or an underscore, and must contain only characters a-z, A-Z, 0-9, underscores, dots, or dashes, with a maximum length of 64.

description string

Optional. Description and purpose of the function. Model uses it to decide how and whether to call the function.

parameters Schema

Optional. Describes the parameters to this function in JSON Schema Object format. Reflects the Open API 3.03 Parameter Object. Key: the name of the parameter (string). Parameter names are case sensitive. Value: the Schema defining the type used for the parameter. For a function with no parameters, this can be left unset. Parameter names must start with a letter or an underscore and must contain only characters a-z, A-Z, 0-9, or underscores, with a maximum length of 64. Example with one required and one optional parameter:

type: OBJECT
properties:
  param1:
    type: STRING
  param2:
    type: INTEGER
required:
  - param1
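The example in the field description corresponds to this Schema object in JSON form (parameter names are illustrative):

```python
# FunctionDeclaration.parameters as a JSON Schema object with one required
# parameter (param1) and one optional parameter (param2).
parameters = {
    "type": "OBJECT",
    "properties": {
        "param1": {"type": "STRING"},
        "param2": {"type": "INTEGER"},
    },
    "required": ["param1"],
}

# Every required name must be a declared property.
assert set(parameters["required"]) <= set(parameters["properties"])
```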

response Schema

Optional. Describes the output from this function in JSON Schema format. Reflects the Open API 3.03 Response Object. The Schema defines the type used for the response value of the function.

FunctionResponse

The result output of a [FunctionCall]: a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function, used as context for the model. This should contain the result of a [FunctionCall] made based on model prediction.

Fields
name string

Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name].

response Struct

Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output.
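A sketch of the round trip in JSON form: a FunctionCall emitted by the model and the matching FunctionResponse sent back. Per the field docs, the response payload should use an "output" key (and optionally "error"); the function name and values are illustrative.

```python
# FunctionCall predicted by the model.
function_call = {"name": "get_current_weather", "args": {"city": "Paris"}}

# FunctionResponse returned by the client after executing the call.
function_response = {
    "name": function_call["name"],  # must match FunctionCall.name
    "response": {"output": {"temperature_c": 18}},
}
```

If neither "output" nor "error" is present, the whole "response" object is treated as the function output.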

GcsDestination

The Google Cloud Storage location where the output is to be written to.

Fields
output_uri_prefix string

Required. Google Cloud Storage URI of the output directory. If the URI doesn't end with '/', a '/' is automatically appended. The directory is created if it doesn't exist.

GenerateContentRequest

Request message for [PredictionService.GenerateContent].

Fields
model string

Required. The fully qualified name of the publisher model or tuned model endpoint to use.

Publisher model format: projects/{project}/locations/{location}/publishers/*/models/*

Tuned model endpoint format: projects/{project}/locations/{location}/endpoints/{endpoint}

contents[] Content

Required. The content of the current conversation with the model.

For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains conversation history + latest request.

tools[] Tool

Optional. A list of Tools the model may use to generate the next response.

A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model.

tool_config ToolConfig

Optional. Tool config. This config is shared for all tools provided in the request.

labels map<string, string>

Optional. The labels with user-defined metadata for the request. It is used for billing and reporting only.

Label keys and values can be no longer than 63 characters (Unicode codepoints) and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter.
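A partial client-side check of the documented label-key rules can be sketched as follows; it covers only length and the leading-letter requirement, not the full character-class rules (lowercase letters, digits, underscores, dashes, plus international characters):

```python
def label_key_plausible(key: str) -> bool:
    """Partial validation: <= 63 characters and starts with a letter."""
    return 0 < len(key) <= 63 and key[:1].isalpha()


assert label_key_plausible("team")
assert not label_key_plausible("1team")   # must start with a letter
assert not label_key_plausible("x" * 64)  # too long
```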

safety_settings[] SafetySetting

Optional. Per request settings for blocking unsafe content. Enforced on GenerateContentResponse.candidates.

generation_config GenerationConfig

Optional. Generation config.

system_instruction Content

Optional. User-provided system instructions for the model. Note: only text should be used in parts, and content in each part will be in a separate paragraph.
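The fields above combine into a request body; a minimal sketch in JSON form, with an illustrative publisher-model path and values:

```python
# GenerateContentRequest body with model target, system instruction,
# conversation contents, and generation config.
request = {
    "model": ("projects/my-project/locations/us-central1"
              "/publishers/google/models/example-model"),
    "systemInstruction": {"parts": [{"text": "Answer concisely."}]},
    "contents": [
        {"role": "user", "parts": [{"text": "Hello"}]},
    ],
    "generationConfig": {"temperature": 0.2, "maxOutputTokens": 256},
}
```

Only model and contents are required; the remaining fields are optional and may be omitted.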

GenerateContentResponse

Response message for [PredictionService.GenerateContent].

Fields
candidates[] Candidate

Output only. Generated candidates.

model_version string

Output only. The model version used to generate the response.

prompt_feedback PromptFeedback

Output only. Content filter results for a prompt sent in the request. Note: Sent only in the first stream chunk. Only happens when no candidates were generated due to content violations.

usage_metadata UsageMetadata

Usage metadata about the response(s).

PromptFeedback

Content filter results for a prompt sent in the request.

Fields
block_reason BlockedReason

Output only. Blocked reason.

safety_ratings[] SafetyRating

Output only. Safety ratings.

block_reason_message string

Output only. A readable block reason message.

BlockedReason

Blocked reason enumeration.

Enums
BLOCKED_REASON_UNSPECIFIED Unspecified blocked reason.
SAFETY Candidates blocked due to safety.
OTHER Candidates blocked due to other reason.
BLOCKLIST Candidates blocked due to the terms which are included from the terminology blocklist.
PROHIBITED_CONTENT Candidates blocked due to prohibited content.

UsageMetadata

Usage metadata about response(s).

Fields
prompt_token_count int32

Number of tokens in the request. When cached_content is set, this is still the total effective prompt size, meaning it includes the number of tokens in the cached content.

candidates_token_count int32

Number of tokens in the response(s).

total_token_count int32

Total token count for prompt and response candidates.

GenerationConfig

Generation config.

Fields
stop_sequences[] string

Optional. Stop sequences.

response_mime_type string

Optional. Output response MIME type of the generated candidate text. Supported MIME types:

  • text/plain: (default) Text output.
  • application/json: JSON response in the candidates.

The model needs to be prompted to output the appropriate response type; otherwise the behavior is undefined. This is a preview feature.

temperature float

Optional. Controls the randomness of predictions.

top_p float

Optional. If specified, nucleus sampling will be used.

top_k float

Optional. If specified, top-k sampling will be used.

candidate_count int32

Optional. Number of candidates to generate.

max_output_tokens int32

Optional. The maximum number of output tokens to generate per message.

response_logprobs bool

Optional. If true, export the logprobs results in response.

logprobs int32