Batch predictions let you send a large number of multimodal prompts in a single batch request.
For more information about the batch workflow and how to format your input data, see Get batch predictions for Gemini.
Supported models

| Model | Version |
| --- | --- |
| Gemini 1.5 Flash | `gemini-1.5-flash-002`, `gemini-1.5-flash-001` |
| Gemini 1.5 Pro | `gemini-1.5-pro-002`, `gemini-1.5-pro-001` |
| Gemini 1.0 Pro | `gemini-1.0-pro-001`, `gemini-1.0-pro-002` |
Example syntax
Syntax to send a batch prediction API request.
curl
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/batchPredictionJobs \
  -d '{
    "displayName": "...",
    "model": "publishers/google/models/${MODEL_ID}",
    "inputConfig": {
      "instancesFormat": "bigquery",
      "bigquerySource": {
        "inputUri": "..."
      }
    },
    "outputConfig": {
      "predictionsFormat": "bigquery",
      "bigqueryDestination": {
        "outputUri": "..."
      }
    }
  }'
Parameters
See examples for implementation details.
Request body
| Parameters | |
| --- | --- |
| `displayName` | A name you choose for your job. |
| `model` | The model to use for batch prediction. |
| `inputConfig` | The data format. For Gemini batch prediction, BigQuery input is supported. |
| `outputConfig` | The output configuration, which determines the model output location. |
inputConfig

| Parameters | |
| --- | --- |
| `instancesFormat` | The prompt input format. Use `bigquery`. |
| `bigquerySource.inputUri` | The input source URI. This is a BigQuery table URI in the form `bq://PROJECT_ID.DATASET.TABLE`. |
outputConfig

| Parameters | |
| --- | --- |
| `predictionsFormat` | The output format of the prediction. It must match the input format. Use `bigquery`. |
| `bigqueryDestination.outputUri` | The BigQuery URI of the target output table, in the form `bq://PROJECT_ID.DATASET.TABLE`. |
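As a sketch of how these parameters fit together, the following Python snippet assembles the request body described above. The `build_batch_request` function and the placeholder project, dataset, and table names are illustrative, not part of the Vertex AI API.

```python
# Sketch: assemble the batchPredictionJobs request body from the
# parameters documented above. build_batch_request and the placeholder
# values are illustrative, not part of the Vertex AI API.
def build_batch_request(display_name, model_id, input_uri, output_uri,
                        instances_format="bigquery"):
    return {
        "displayName": display_name,
        "model": f"publishers/google/models/{model_id}",
        "inputConfig": {
            "instancesFormat": instances_format,
            "bigquerySource": {"inputUri": input_uri},
        },
        "outputConfig": {
            # The output format must match the input format.
            "predictionsFormat": instances_format,
            "bigqueryDestination": {"outputUri": output_uri},
        },
    }

body = build_batch_request(
    "my-batch-job",
    "gemini-1.5-flash-002",
    "bq://my-project.my_dataset.input_table",
    "bq://my-project.my_dataset.output_table",
)
```

The resulting dictionary can be serialized to JSON and sent as the request body shown in the examples that follow.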
Examples
Request a batch response
Batch requests for multimodal models accept BigQuery storage sources and Cloud Storage sources. To learn more, see Get batch predictions for Gemini.
Depending on the number of input items that you submitted, a batch generation task can take some time to complete.
REST
To test a multimodal prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: The ID of your Google Cloud project.
- BP_JOB_NAME: A name you choose for your job.
- INPUT_URI: The input source URI. This is a BigQuery table URI in the form `bq://PROJECT_ID.DATASET.TABLE`, or your Cloud Storage bucket URI.
- INPUT_SOURCE: The input source type. Options are `bigquerySource` and `gcsSource`.
- INSTANCES_FORMAT: The input instances format. Can be `jsonl` or `bigquery`.
- OUTPUT_URI: The URI of the output or target output table, in the form `bq://PROJECT_ID.DATASET.TABLE`. If the table doesn't already exist, it is created for you.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs
Request JSON body:
{
  "displayName": "BP_JOB_NAME",
  "model": "publishers/google/models/gemini-1.0-pro-002",
  "inputConfig": {
    "instancesFormat": "INSTANCES_FORMAT",
    "INPUT_SOURCE": {
      "inputUri": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "bigquery",
    "bigqueryDestination": {
      "outputUri": "OUTPUT_URI"
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named `request.json`, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs"
PowerShell
Save the request body in a file named `request.json`, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/{PROJECT_ID}/locations/us-central1/batchPredictionJobs/{BATCH_JOB_ID}",
  "displayName": "My first batch prediction",
  "model": "projects/{PROJECT_ID}/locations/us-central1/models/gemini-1.0-pro-002",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "bq://{PROJECT_ID}.mydataset.batch_predictions_input"
    }
  },
  "modelParameters": {},
  "outputConfig": {
    "predictionsFormat": "bigquery",
    "bigqueryDestination": {
      "outputUri": "bq://{PROJECT_ID}.mydataset.batch_predictions_output"
    }
  },
  "state": "JOB_STATE_PENDING",
  "createTime": "2023-07-12T20:46:52.148717Z",
  "updateTime": "2023-07-12T20:46:52.148717Z",
  "modelVersionId": "1"
}
The response includes a unique identifier for the batch job. You can poll for the status of the batch job using the BATCH_JOB_ID until the job state is `JOB_STATE_SUCCEEDED`. For example:
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs/BATCH_JOB_ID
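The polling step above can be sketched in Python. The `fetch_state` callable is an assumption standing in for whatever wraps the GET request and extracts the job's `state` field; the function and parameter names are illustrative, not part of the Vertex AI API.

```python
import time

# Sketch: poll until the batch job reaches a terminal state. fetch_state
# is a caller-supplied callable (e.g. wrapping the GET request above)
# that returns the job's "state" field -- an assumption, not an API name.
TERMINAL_STATES = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}

def wait_for_job(fetch_state, interval_s=30, max_polls=120):
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval_s)  # batch jobs can take a while to finish
    raise TimeoutError("batch job did not reach a terminal state")
```

Checking for failure and cancellation states as well as `JOB_STATE_SUCCEEDED` avoids polling forever when the job can no longer succeed.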
Retrieve batch output
When a batch prediction task completes, the output is stored in the BigQuery table that you specified in your request.
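As a minimal sketch of working with that output, the snippet below pulls the generated text out of one output-table row, assuming the table carries the model's JSON response in a `response` column (an assumption; verify the column names against your table's actual schema).

```python
import json

# Sketch: extract the generated text from one batch output-table row.
# The row is assumed to hold the model's JSON response in a "response"
# column (an assumption; check your output table's schema).
def extract_text(response_json):
    response = json.loads(response_json)
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```

This mirrors the structure of a Gemini response, where generated text lives under `candidates[0].content.parts`.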
What's next
- Learn how to tune a Gemini model in Overview of model tuning for Gemini.
- Learn more about how to Get batch predictions for Gemini.