The AI.GENERATE function
This document describes the AI.GENERATE function, which lets you analyze text in a BigQuery standard table. For each row in the table, the function generates a STRUCT that contains a STRING value.
The function works by sending requests to a Vertex AI Gemini model, and then returning that model's response.
You can use the AI.GENERATE function to perform tasks such as classification and sentiment analysis.
Prompt design can strongly affect the responses returned by the model. For more information, see Introduction to prompting.
Syntax
AI.GENERATE(
  [ prompt => ] 'prompt',
  connection_id => 'connection',
  endpoint => 'endpoint'
  [, model_params => model_params]
  [, output_schema => 'field_name1 data_type1, field_name2 data_type2, ...']
)
Arguments
AI.GENERATE takes the following arguments:
- prompt: a STRING value or STRUCT of STRING values that specifies the prompt to send to the model. Prompt content must be either a literal string or the name of a string column in the table that you are using with the function. For example, if you are running the function on a table that contains a city column, then ('Tell me about ', city) is a valid prompt. If prompt is a STRUCT, then the strings are concatenated together. For example, ('Tell me about ', city) is equivalent to CONCAT('Tell me about ', city). The prompt must be the first argument that you specify.
- connection_id: a STRING value specifying the connection to use to communicate with the model, in the format [PROJECT_ID].[LOCATION].[CONNECTION_ID]. For example, myproject.us.myconnection.

  Replace the following:

  - PROJECT_ID: the project ID of the project that contains the connection.
  - LOCATION: the location used by the connection. The connection must be in the same location as the dataset that contains the model.
  - CONNECTION_ID: the connection ID, for example myconnection.

    You can get this value by viewing the connection details in the Google Cloud console and copying the value in the last section of the fully qualified connection ID that is shown in Connection ID. For example, projects/myproject/locations/connection_location/connections/myconnection.

  You need to grant the Vertex AI User role to the connection's service account in the project where you run the AI.GENERATE function.
- endpoint: a STRING value that specifies the Vertex AI endpoint to use for the model. Only Gemini models are supported. If you specify the model name, BigQuery ML automatically identifies and uses the full endpoint of the model.
- model_params: a JSON literal that provides additional parameters to the model. The model_params value must conform to the generateContent request body format. You can provide a value for any field in the request body except for the contents field; the contents field is populated with the prompt argument value.
- output_schema: a STRING value that specifies the schema of the output, in the form field_name1 data_type1, field_name2 data_type2, .... Supported data types include STRING, INT64, FLOAT64, BOOL, ARRAY, and STRUCT.

  For Gemini 1.5 models, only specify a FLOAT64 data type if you are certain that the return value won't be a round number. These models can sometimes return INT values rather than FLOAT values for round numbers, for example 2 instead of 2.0, and this can cause a parsing error in the query.
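
As an illustration of how these arguments fit together, the following sketch passes the prompt by name, uses a fully qualified connection ID, and requests three scalar output fields. The table mydataset.cities, the connection myproject.us.myconnection, and the field names country, population, and is_capital are placeholders chosen for this example, not values defined by the function.

```sql
-- Sketch only: table, connection, and output field names are placeholders.
SELECT
  city,
  generated.country,
  generated.population,
  generated.is_capital
FROM (
  SELECT
    city,
    AI.GENERATE(
      -- The prompt is the first argument; the STRUCT strings are concatenated.
      prompt => ('Tell me about ', city),
      -- Fully qualified form: PROJECT_ID.LOCATION.CONNECTION_ID.
      connection_id => 'myproject.us.myconnection',
      endpoint => 'gemini-2.0-flash',
      -- Only scalar types are used here to keep the schema simple.
      output_schema => 'country STRING, population INT64, is_capital BOOL'
    ) AS generated
  FROM mydataset.cities
);
```

Keeping the struct in a subquery column (generated here) makes it possible to read several output fields from a single model call per row.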
Output
AI.GENERATE returns a STRUCT value for each row in the table. The struct contains the following fields:
- result: a STRING value containing the model's response to the prompt. The result is NULL if the request fails or is filtered by responsible AI. If you specify an output_schema, then result is replaced by your custom schema.
- full_response: a STRING value containing the JSON response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element. The safety attributes are in the safety_ratings element.
- status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
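
The fields described above can be inspected side by side. The following is a minimal sketch, assuming a table mydataset.cities with a city column and a connection named us.test_connection. The JSON path used to reach the text element is an assumption based on the general layout of a generateContent response and might need adjusting for your model.

```sql
-- Sketch only: the table, connection, and JSON path are assumptions.
WITH generated_rows AS (
  SELECT
    city,
    AI.GENERATE(
      ('Give a short, one sentence description of ', city),
      connection_id => 'us.test_connection',
      endpoint => 'gemini-2.0-flash') AS generated
  FROM mydataset.cities
)
SELECT
  city,
  generated.result,
  -- Treat both an empty string and NULL as "no error reported".
  IFNULL(generated.status, '') = '' AS succeeded,
  generated.status,
  -- Assumed location of the generated text inside the raw JSON response.
  JSON_VALUE(
    generated.full_response,
    '$.candidates[0].content.parts[0].text') AS text_from_full_response
FROM generated_rows;
```

Rows where succeeded is FALSE, or where result is NULL, are the ones worth checking against the status value.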
Examples
The following examples assume that your connection and input tables are in your default project.
Describe cities
Suppose you have the following table called mydataset.cities with a single city column:
+---------+
| city    |
+---------+
| Seattle |
| Beijing |
| Paris   |
| London  |
+---------+
To generate a short description of each city, you can call the AI.GENERATE function and select the result field in the output by running the following query:
SELECT
  city,
  AI.GENERATE(
    ('Give a short, one sentence description of ', city),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.0-flash').result
FROM mydataset.cities;
The result is similar to the following:
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| city    | result                                                                                                                                                     |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Seattle | Seattle is a vibrant city nestled between mountains and water, renowned for its coffee culture, tech industry, and rainy weather.                          |
| Beijing | Beijing is a vibrant metropolis where ancient history meets modern innovation, offering a captivating blend of cultural treasures and bustling urban life. |
| Paris   | Paris is a romantic city renowned for its iconic landmarks, elegant architecture, and vibrant culture.                                                     |
| London  | London, a vibrant global metropolis brimming with history, culture, and innovation.                                                                        |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
Use structured output
Suppose you have the following table called mydataset.states with a single state column of US states:
+------------+
| state      |
+------------+
| Washington |
| Oregon     |
| California |
| Hawaii     |
+------------+
The following query generates state capitals for a list of states. The query uses the output_schema argument to set two custom fields in the output struct: state and capital.
SELECT
  state,
  AI.GENERATE(
    ('What is the capital of ', state, '?'),
    connection_id => 'us.example_connection',
    endpoint => 'gemini-2.0-flash',
    output_schema => 'state STRING, capital STRING').capital
FROM mydataset.states;
The result is similar to the following:
+------------+------------+
| state      | capital    |
+------------+------------+
| Washington | Olympia    |
| Oregon     | Salem      |
| California | Sacramento |
| Hawaii     | Honolulu   |
+------------+------------+
The following query shows how to set the model_params argument to specify a label for the request:
SELECT
  state,
  AI.GENERATE(
    ('What is the capital of ', state, '?'),
    connection_id => 'us.example_connection',
    endpoint => 'gemini-2.0-flash',
    model_params => JSON '{"labels":{"key": "my_key", "value": "useful_value"}}',
    output_schema => 'state STRING, capital STRING').capital
FROM mydataset.states;
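
Because model_params follows the generateContent request body format, other request fields can be set in the same way. The following sketch lowers the sampling temperature through the generation configuration. The generation_config key and its temperature field are assumptions taken from the generateContent request body (where the REST reference spells the field generationConfig), so verify the accepted spelling against that reference before relying on it.

```sql
-- Sketch only: the generation_config key name is an assumption (see above).
SELECT
  state,
  AI.GENERATE(
    ('What is the capital of ', state, '?'),
    connection_id => 'us.example_connection',
    endpoint => 'gemini-2.0-flash',
    -- A temperature of 0 asks the model for its least random output.
    model_params => JSON '{"generation_config": {"temperature": 0}}',
    output_schema => 'state STRING, capital STRING').capital
FROM mydataset.states;
```

A low temperature tends to suit lookup-style prompts like this one, where a single factual answer is expected for each row.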
Locations
You can run AI.GENERATE in all of the regions that support Gemini models, and also in the US and EU multi-regions.
Quotas
See Vertex AI and Cloud AI service functions quotas and limits.
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.