The AI.GENERATE_TABLE function
This document describes the AI.GENERATE_TABLE function, which lets you perform generative natural language tasks on text from BigQuery standard tables and specify a schema that formats the model's response.
The function works by sending requests to a BigQuery ML remote model that represents a Vertex AI model, and then returning that model's response. Only the following Gemini models are supported:
- gemini-2.0-flash-001
- gemini-1.5-flash-001
- gemini-1.5-flash-002
- gemini-1.5-pro-001
- gemini-1.5-pro-002
Several of the AI.GENERATE_TABLE
function's arguments provide the
parameters that shape the Vertex AI model's response.
You can use the AI.GENERATE_TABLE
function to perform tasks such as
classification, sentiment analysis, image captioning, and transcription. For
more information on the types of tasks the Vertex AI
models can perform, see
Vertex AI Gemini API model use cases.
Prompt design can strongly affect the responses returned by the Vertex AI model. For more information, see Introduction to prompting.
Syntax when using standard tables
AI.GENERATE_TABLE(
  MODEL `project_id.dataset.model`,
  { TABLE `project_id.dataset.table` | (query_statement) },
  STRUCT(
    output_schema AS output_schema
    [, max_output_tokens AS max_output_tokens]
    [, top_p AS top_p]
    [, temperature AS temperature]
    [, stop_sequences AS stop_sequences]
    [, ground_with_google_search AS ground_with_google_search]
    [, safety_settings AS safety_settings])
)
Arguments
AI.GENERATE_TABLE
takes the following arguments:
- project_id: your project ID.
- dataset: the BigQuery dataset that contains the model.
- model: the name of the remote model. For more information, see The CREATE MODEL statement for remote models over LLMs.
- table: the name of the BigQuery table that contains the prompt data. The text in the column that's named prompt is sent to the model. If your table doesn't have a prompt column, use a SELECT statement for this argument to provide an alias for an existing table column. An error occurs if no prompt column is available.
- query_statement: the GoogleSQL query that generates the prompt data. The prompt value itself can be pulled from a column, or you can specify it as a struct value with an arbitrary number of string and column name subfields. For example, SELECT ('Analyze the sentiment in ', feedback_column, ' using the following categories: positive, negative, neutral') AS prompt.
- output_schema: a schema to use to format the model's response. The output_schema value must be a SQL schema definition, similar to that used in the CREATE TABLE statement. The following data types are supported:
  - INT64
  - FLOAT64
  - BOOL
  - STRING
  - ARRAY
  - STRUCT
When using the output_schema argument to generate structured data based on prompts from a table, it is important to understand the prompt data in order to specify an appropriate schema. For example, say you are analyzing movie review content from a table that has the following fields:
- movie_id
- review
- prompt

Then you might create prompt text by running a query similar to the following:

UPDATE `mydataset.movie_review`
SET prompt = CONCAT('Extract the key words and key sentiment from the text below: ', review)
WHERE review IS NOT NULL;

And you might specify an output_schema value similar to "keywords ARRAY<STRING>, sentiment STRING" AS output_schema.
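With that prompt column and schema in place, a call over the table might look similar to the following sketch. The model name mydataset.gemini_model is an assumed remote model; substitute your own model and dataset names:

```sql
-- Sketch: generate structured keywords and sentiment for each review.
-- Assumes the hypothetical mydataset.movie_review table from above and a
-- remote model named mydataset.gemini_model.
SELECT movie_id, keywords, sentiment
FROM AI.GENERATE_TABLE(
  MODEL `mydataset.gemini_model`,
  (SELECT movie_id, prompt
   FROM mydataset.movie_review
   WHERE prompt IS NOT NULL),
  STRUCT("keywords ARRAY<STRING>, sentiment STRING" AS output_schema));
```

Because the function returns all columns from the input query, movie_id is carried through alongside the generated keywords and sentiment columns.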
- max_output_tokens: an INT64 value that sets the maximum number of tokens that can be generated in the response. A token might be smaller than a word and is approximately four characters; 100 tokens correspond to approximately 60-80 words. This value must be in the range [1,8192]. Specify a lower value for shorter responses and a higher value for longer responses. The default is 128.
- top_p: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 0.95. Tokens are selected from the most to least probable until the sum of their probabilities equals the top_p value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1, and the top_p value is 0.5, then the model selects either A or B as the next token by using the temperature value and doesn't consider C.
- temperature: a FLOAT64 value in the range [0.0,2.0] that controls the degree of randomness in token selection. Lower temperature values are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperature values can lead to more diverse or creative results. A temperature value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 1.0.
- stop_sequences: an ARRAY<STRING> value that removes the specified strings if they are included in responses from the model. Strings are matched exactly, including capitalization. The default is an empty array.
- ground_with_google_search: a BOOL value that determines whether the Vertex AI model uses Grounding with Google Search when generating responses. Grounding lets the model use additional information from the internet when generating a response, in order to make model responses more specific and factual. The default is FALSE.
- safety_settings: an ARRAY<STRUCT<STRING AS category, STRING AS threshold>> value that configures content safety thresholds to filter responses. The first element in the struct specifies a harm category, and the second element specifies a corresponding blocking threshold. The model filters out content that violates these settings. You can only specify each category once. For example, you can't specify both STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category, 'BLOCK_MEDIUM_AND_ABOVE' AS threshold) and STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category, 'BLOCK_ONLY_HIGH' AS threshold). If there is no safety setting for a given category, the BLOCK_MEDIUM_AND_ABOVE safety setting is used.
  Supported categories are as follows:
  - HARM_CATEGORY_HATE_SPEECH
  - HARM_CATEGORY_DANGEROUS_CONTENT
  - HARM_CATEGORY_HARASSMENT
  - HARM_CATEGORY_SEXUALLY_EXPLICIT
  Supported thresholds are as follows:
  - BLOCK_NONE (Restricted)
  - BLOCK_LOW_AND_ABOVE
  - BLOCK_MEDIUM_AND_ABOVE (Default)
  - BLOCK_ONLY_HIGH
  - HARM_BLOCK_THRESHOLD_UNSPECIFIED
  For more information, see Harm categories and How to configure content filters.
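The optional arguments above can be combined in a single STRUCT. The following sketch sets a token limit, low-randomness sampling, and a safety threshold; mydataset.gemini_model and the prompt text are assumptions for illustration:

```sql
-- Sketch: combining optional generation parameters in one call.
SELECT category, sentiment
FROM AI.GENERATE_TABLE(
  MODEL `mydataset.gemini_model`,
  (SELECT 'Classify this review: great battery life, mediocre camera.' AS prompt),
  STRUCT(
    "category STRING, sentiment STRING" AS output_schema,
    256 AS max_output_tokens,
    0.2 AS temperature,  -- low temperature for a more deterministic answer
    0.95 AS top_p,
    [STRUCT('HARM_CATEGORY_HATE_SPEECH' AS category,
            'BLOCK_LOW_AND_ABOVE' AS threshold)] AS safety_settings));
```

Categories that aren't listed in safety_settings, such as HARM_CATEGORY_HARASSMENT here, fall back to the BLOCK_MEDIUM_AND_ABOVE threshold.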
Example
The following example shows a request that provides a SQL schema to format the model's response:
SELECT address, age, is_married, name, phone_number, weight_in_pounds
FROM AI.GENERATE_TABLE(
  MODEL `mydataset.gemini_model`,
  (
    SELECT 'John Smith is a 20-year old single man living at 1234 NW 45th St, Kirkland WA, 98033. He has two phone numbers 123-123-1234, and 234-234-2345. He is 200.5 pounds.' AS prompt
  ),
  STRUCT("address STRING, age INT64, is_married BOOL, name STRING, phone_number ARRAY<STRING>, weight_in_pounds FLOAT64" AS output_schema));
The results look similar to the following:
+-------------------------------------+-----+------------+------------+---------------+------------------+
| address                             | age | is_married | name       | phone_number  | weight_in_pounds |
+-------------------------------------+-----+------------+------------+---------------+------------------+
| 1234 NW 45th St, Kirkland WA, 98033 |  20 | No         | John Smith | 123-123-1234  |            200.5 |
|                                     |     |            |            | 234-234-2345  |                  |
+-------------------------------------+-----+------------+------------+---------------+------------------+
Details
The model and input table must be in the same region.
Output
AI.GENERATE_TABLE returns the following columns:
- All columns in the input table.
- All columns specified in the output_schema argument.
- full_response: the JSON response from the projects.locations.endpoints.generateContent call to the model. The generated data is in the text element. If you specify the ground_with_google_search or safety_settings arguments, this column also includes elements that contain the related output.
- status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
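Because each row carries its own status value, you can check a saved result set for partial failures after a job finishes. A minimal sketch, assuming the function's output was written to a hypothetical table named mydataset.generated_results:

```sql
-- Sketch: summarize per-row outcomes of a previous AI.GENERATE_TABLE run.
-- An empty status means the row succeeded; any other value is an API error.
SELECT status, COUNT(*) AS row_count
FROM mydataset.generated_results
GROUP BY status;
```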
Locations
AI.GENERATE_TABLE
must run in the same
region or multi-region as the remote model that the
function references.
You can create remote models over Gemini models in the
supported regions
for the given Gemini model, and also in the US
and EU
multi-regions.
Quotas
See Vertex AI and Cloud AI service functions quotas and limits.
Known issues
This section contains information about known issues.

Resource exhausted errors
Sometimes after a query job that uses this function finishes successfully, some returned rows contain the following error message:
A retryable error occurred: RESOURCE EXHAUSTED error from <remote endpoint>
This issue occurs because BigQuery query jobs finish successfully
even if the function fails for some of the rows. The function fails when the
volume of API calls to the remote endpoint exceeds the quota limits for that
service. This issue occurs most often when you are running multiple parallel
batch queries. BigQuery retries these calls, but if the retries
fail, the resource exhausted
error message is returned.
To iterate through inference calls until all rows are successfully processed, you can use the BigQuery remote inference SQL scripts or the BigQuery remote inference pipeline Dataform package.
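As a minimal alternative sketch, you can also retry only the failed rows yourself by re-running the function over rows with a non-empty status and merging the successes back. All table and model names here (mydataset.review_results, mydataset.gemini_model) are assumptions based on the earlier movie-review example:

```sql
-- Sketch: retry failed rows from a saved result table, then merge
-- successful retries back into the results.
MERGE mydataset.review_results AS r
USING (
  SELECT *
  FROM AI.GENERATE_TABLE(
    MODEL `mydataset.gemini_model`,
    (SELECT movie_id, prompt
     FROM mydataset.review_results
     WHERE status != ''),          -- only rows that previously failed
    STRUCT("keywords ARRAY<STRING>, sentiment STRING" AS output_schema))
) AS retry
ON r.movie_id = retry.movie_id
WHEN MATCHED AND retry.status = '' THEN
  UPDATE SET keywords = retry.keywords,
             sentiment = retry.sentiment,
             status = retry.status;
```

Running this statement repeatedly until no rows have a non-empty status approximates what the linked SQL scripts automate.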
What's next
- Get step-by-step instructions on how to generate structured data using your own data.
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.