This page describes how to get batch predictions using BigQuery.
1. Prepare your inputs
BigQuery storage input
- Your service account must have the appropriate BigQuery
permissions. To grant the service account the BigQuery User role,
use the
gcloud projects add-iam-policy-binding
command as follows:
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.user"
Replace the following values:
* PROJECT_ID: The project that your service account was created in.
* SERVICE_ACCOUNT_ID: The ID for the service account.
- A request column is required and must be valid JSON. This JSON data represents your input for the model.
- The content in the request column must match the structure of a GenerateContentRequest.
- Your input table can have columns other than request. These columns can have any BigQuery data type except for the following: array, struct, range, datetime, and geography. These columns are ignored for content generation but are included in the output table.
Example input (JSON)
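As an illustration, the following sketch builds a JSON value of the shape the request column expects, a GenerateContentRequest-style object; the prompt text is only a placeholder:

```python
import json

# A value for the `request` column: a GenerateContentRequest-shaped
# JSON object. The prompt text here is only a placeholder.
request_row = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Explain how AI works in a few words."}],
        }
    ]
}

# The column must hold valid JSON, so serialize before loading into BigQuery.
request_json = json.dumps(request_row)
print(request_json)
```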
2. Submit a batch job
You can create a batch job through the Google Cloud console, the Google Gen AI SDK, or the REST API.
The job and your table must be in the same region.
Console
- In the Vertex AI section of the Google Cloud console, go to the Batch Inference page.
- Click Create.
REST
To create a batch prediction job, use the
projects.locations.batchPredictionJobs.create
method.
Before using any of the request data, make the following replacements:
- LOCATION: A region that supports Gemini models.
- PROJECT_ID: Your project ID.
- MODEL_PATH: The publisher model name, for example, publishers/google/models/gemini-2.0-flash-001; or the tuned endpoint name, for example, projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID, where MODEL_ID is the model ID of the tuned model.
- INPUT_URI: The BigQuery table where your batch prediction input is located, such as bq://myproject.mydataset.input_table. The dataset must be located in the same region as the batch prediction job. Multi-region datasets are not supported.
- OUTPUT_FORMAT: To output to a BigQuery table, specify bigquery. To output to a Cloud Storage bucket, specify jsonl.
- DESTINATION: For BigQuery, specify bigqueryDestination. For Cloud Storage, specify gcsDestination.
- OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For Cloud Storage, specify outputUriPrefix.
- OUTPUT_URI: For BigQuery, specify the table location, such as bq://myproject.mydataset.output_result. The region of the output BigQuery dataset must be the same as the Vertex AI batch prediction job. For Cloud Storage, specify the bucket and directory location, such as gs://mybucket/path/to/output.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs
Request JSON body:
{
  "displayName": "my-bigquery-batch-prediction-job",
  "model": "MODEL_PATH",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "OUTPUT_FORMAT",
    "DESTINATION": {
      "OUTPUT_URI_FIELD_NAME": "OUTPUT_URI"
    }
  }
}
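The request body above can also be assembled programmatically before sending it. This sketch fills in the BigQuery-to-BigQuery case; the project, dataset, and model values are placeholders:

```python
import json

def build_batch_job_body(model_path: str, input_uri: str, output_uri: str) -> dict:
    """Builds the batchPredictionJobs.create request body for the
    BigQuery-input, BigQuery-output case."""
    return {
        "displayName": "my-bigquery-batch-prediction-job",
        "model": model_path,
        "inputConfig": {
            "instancesFormat": "bigquery",
            "bigquerySource": {"inputUri": input_uri},
        },
        "outputConfig": {
            "predictionsFormat": "bigquery",
            "bigqueryDestination": {"outputUri": output_uri},
        },
    }

# Placeholder values; substitute your own project, dataset, and model.
body = build_batch_job_body(
    "publishers/google/models/gemini-2.0-flash-001",
    "bq://myproject.mydataset.input_table",
    "bq://myproject.mydataset.output_result",
)
print(json.dumps(body, indent=2))
```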
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
The response includes a unique identifier for the batch job. You can poll for the status of the batch job using the BATCH_JOB_ID. For more information, see Monitor the job status.
Note: Custom service accounts, live progress reporting, CMEK, and VPC Service Controls are not supported.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
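With those environment variables set, a job can be submitted through the Gen AI SDK. The following is a sketch, not the canonical sample: it assumes the google-genai batches API (client.batches.create with a CreateBatchJobConfig), and the table and model names are placeholders.

```python
import os

# Placeholder BigQuery URIs; substitute your own project and dataset.
INPUT_URI = "bq://myproject.mydataset.input_table"
OUTPUT_URI = "bq://myproject.mydataset.output_result"

def submit_batch_job(model: str = "gemini-2.0-flash-001") -> str:
    # Imported inside the function so the sketch reads without the SDK installed.
    from google import genai
    from google.genai.types import CreateBatchJobConfig

    client = genai.Client()  # picks up the GOOGLE_CLOUD_* environment variables
    job = client.batches.create(
        model=model,
        src=INPUT_URI,
        config=CreateBatchJobConfig(dest=OUTPUT_URI),
    )
    return job.name  # resource name, used later to poll the job status

# Only attempt the call when the environment is actually configured.
if os.environ.get("GOOGLE_GENAI_USE_VERTEXAI") == "True":
    print(submit_batch_job())
```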
3. Monitor the job status and progress
After the job is submitted, you can check the status of your batch job by using the API, the SDK, or the Google Cloud console.
Console
- Go to the Batch Inference page.
- Select your batch job to monitor its progress.
REST
To monitor a batch prediction job, use the
projects.locations.batchPredictionJobs.get
method and view the CompletionStats
field in the response.
Before using any of the request data, make the following replacements:
- LOCATION: A region that supports Gemini models.
- PROJECT_ID: Your project ID.
- BATCH_JOB_ID: Your batch job ID.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs/BATCH_JOB_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs/BATCH_JOB_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs/BATCH_JOB_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
The status of a given batch job can be any of the following:
- JOB_STATE_PENDING: Queued for capacity. The job can be in the queue state for up to 72 hours before entering the running state.
- JOB_STATE_RUNNING: The input file was successfully validated, and the batch is currently running.
- JOB_STATE_SUCCEEDED: The batch has been completed, and the results are ready.
- JOB_STATE_FAILED: The input file failed the validation process, or the batch could not be completed within the 24-hour window after entering the running state.
- JOB_STATE_CANCELLING: The batch is being cancelled.
- JOB_STATE_CANCELLED: The batch was cancelled.
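The succeeded, failed, and cancelled states are terminal, so a polling loop should stop on them. A minimal sketch, where the caller-supplied fetch_state function standing in for the batchPredictionJobs.get call is hypothetical:

```python
import time

# States after which a batch job will not change again.
TERMINAL_STATES = {
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
}

def wait_for_job(fetch_state, poll_seconds: float = 30.0) -> str:
    """Polls until the job reaches a terminal state.

    `fetch_state` is a caller-supplied function that performs the
    batchPredictionJobs.get call and returns the job's state string.
    """
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)

# Demo with a stubbed fetch_state that runs once and then succeeds.
states = iter(["JOB_STATE_RUNNING", "JOB_STATE_SUCCEEDED"])
final = wait_for_job(lambda: next(states), poll_seconds=0.0)
print(final)  # JOB_STATE_SUCCEEDED
```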
4. Retrieve batch output
When a batch prediction task completes, the output is stored in the BigQuery table that you specified in your request.
For successful rows, model responses are stored in the response
column. Otherwise, error details are stored in the status
column for further inspection.
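Pulling the generated text out of a response value is a small JSON exercise. This sketch assumes the candidates structure shown in the successful example below; the row value here is abbreviated from it:

```python
import json

def extract_text(response_json: str) -> str:
    """Returns the first candidate's text from a `response` column value."""
    response = json.loads(response_json)
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)

# Abbreviated row value mirroring the structure of the successful example.
row_response = json.dumps({
    "candidates": [
        {"content": {"role": "model",
                     "parts": [{"text": "In a medium bowl, whisk together..."}]}}
    ]
})
print(extract_text(row_response))  # In a medium bowl, whisk together...
```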
Output example
Successful example
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": "In a medium bowl, whisk together the flour, baking soda, baking powder."
}
]
},
"finishReason": "STOP",
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.14057204,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.14270912
}
]
}
],
"usageMetadata": {
"promptTokenCount": 8,
"candidatesTokenCount": 396,
"totalTokenCount": 404
}
}
Failed example
Request
{"contents":[{"parts":{"text":"Explain how AI works in a few words."},"role":"tester"}]}
Response
Bad Request: {"error": {"code": 400, "message": "Please use a valid role: user, model.", "status": "INVALID_ARGUMENT"}}