Get batch predictions for Gemini

Batch predictions let you send a large number of multimodal prompts in a single batch request.

For more information about the batch workflow and how to format your input data, see Get batch predictions for Gemini.

Supported Models:

Model Version
Gemini 1.5 Flash gemini-1.5-flash-001
Gemini 1.5 Pro gemini-1.5-pro-001
Gemini 1.0 Pro gemini-1.0-pro-001
gemini-1.0-pro-002

Example syntax

Syntax to send a batch prediction API request.

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \

https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/batchPredictionJobs \
-d '{
    "displayName": "...",
    "model": "publishers/google/models/${MODEL_ID}",
    "inputConfig": {
      "instancesFormat":"bigquery",
      "bigquerySource":{
        "inputUri" : "..."
      }
    },
    "outputConfig": {
      "predictionsFormat":"bigquery",
      "bigqueryDestination":{
        "outputUri": "..."
        }
    }
}'

Parameters

See examples for implementation details.

Body request

Parameters

displayName

A name you choose for your job.

model

The model to use for batch prediction.

inputConfig

The data format. For Gemini batch prediction, BigQuery input is supported.

outputConfig

The output configuration which determines model output location.

inputConfig

Parameters

instancesFormat

The prompt input format. Use bigquery.

bigquerySource.inputUri

The input source URI. This is a BigQuery table URI in the form bq://PROJECT_ID.DATASET.TABLE.

outputConfig

Parameters

predictionsFormat

The output format of the prediction. It must match the input format. Use bigquery.

bigqueryDestination.outputUri

The BigQuery URI of the target output table, in the form bq://PROJECT_ID.DATASET.TABLE. If the table doesn't already exist, then it is created for you.

Examples

Request a batch response

Batch requests for multimodal models only accept BigQuery storage sources. To learn more, see the following:

Depending on the number of input items that you submitted, a batch generation task can take some time to complete.

REST

To test a multimodal prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: The name of your Google Cloud project.
  • BP_JOB_NAME: A name you choose for your job.
  • INPUT_URI: The input source URI. This is a BigQuery table URI in the form bq://PROJECT_ID.DATASET.TABLE.
  • OUTPUT_URI: The BigQuery URI of the target output table, in the form bq://PROJECT_ID.DATASET.TABLE. If the table doesn't already exist, then it is created for you.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs

Request JSON body:

{
    "displayName": "BP_JOB_NAME",
    "model": "publishers/google/models/gemini-1.0-pro-002",
    "inputConfig": {
      "instancesFormat":"bigquery",
      "bigquerySource":{
        "inputUri" : "INPUT_URI"
      }
    },
    "outputConfig": {
      "predictionsFormat":"bigquery",
      "bigqueryDestination":{
        "outputUri": "OUTPUT_URI"
        }
    }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/{PROJECT_ID}/locations/us-central1/batchPredictionJobs/{BATCH_JOB_ID}",
  "displayName": "My first batch prediction",
  "model": "projects/{PROJECT_ID}/locations/us-central1/models/gemini-1.0-pro-002",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "bq://{PROJECT_ID}.mydataset.batch_predictions_input"
    }
  },
  "modelParameters": {},
  "outputConfig": {
    "predictionsFormat": "bigquery",
    "bigqueryDestination": {
      "outputUri": "bq://{PROJECT_ID}.mydataset.batch_predictions_output"
    }
  },
  "state": "JOB_STATE_PENDING",
  "createTime": "2023-07-12T20:46:52.148717Z",
  "updateTime": "2023-07-12T20:46:52.148717Z",
  "modelVersionId": "1"
}

The response includes a unique identifier for the batch job. You can poll for the status of the batch job using the BATCH_JOB_ID until the job state is JOB_STATE_SUCCEEDED. For example:

curl \
  -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs/BATCH_JOB_ID

Retrieve batch output

When a batch prediction task completes, the output is stored in the BigQuery table that you specified in your request.

What's next