The model endpoint management preview lets you register a model endpoint, manage model endpoint metadata in your database cluster, and then interact with the models using SQL queries. It provides the `google_ml_integration` extension, which includes functions to add and register model endpoint metadata, and then use the models to generate vector embeddings or invoke predictions.
Some example model types that you can register using model endpoint management are as follows:
- Vertex AI text embedding models
- Embedding models provided by third-party providers
- Custom-hosted text embedding models
- Generic models with a JSON-based API, for example, the `gemini-pro` model from the Vertex AI Model Garden
How it works
You can use model endpoint management to register a model endpoint that complies with the following:
- Model input and output supports JSON format.
- Model can be called using the REST protocol.
When you register a model endpoint with model endpoint management, each endpoint is registered with a unique model ID that you provide as a reference to the model. You can use this model ID to query models:
- Generate embeddings to translate text prompts into numerical vectors. You can store generated embeddings as vector data when the `pgvector` extension is enabled in the database.
- Invoke predictions to call a model using SQL within a transaction.
Your applications can access model endpoint management using the `google_ml_integration` extension. This extension provides the following functions:
- The `google_ml.create_model()` SQL function, which registers the model endpoint that is used in the prediction or embedding function.
- The `google_ml.create_sm_secret()` SQL function, which uses secrets stored in Google Cloud Secret Manager, where the API keys are stored.
- The `google_ml.embedding()` SQL function, a prediction function that generates text embeddings.
- The `google_ml.predict_row()` SQL function, which generates predictions when you call generic models that support JSON input and output formats.
- Other helper functions that handle generating custom URLs, generating HTTP headers, or passing transform functions for your generic models.
- Functions to manage the registered model endpoints and secrets.
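Taken together, a typical flow is to register an endpoint once and then reference it by model ID. The following sketch, using a Vertex AI text embedding model with built-in support, shows the idea; the model ID is your choice, and the exact set of required parameters is covered in the reference section:

```sql
-- Register a built-in Vertex AI text embedding endpoint.
CALL
google_ml.create_model(
    model_id => 'textembedding-gecko@002',
    model_provider => 'google',
    model_qualified_name => 'textembedding-gecko@002',
    model_type => 'text-embedding');

-- Reference the endpoint by its model ID to generate an embedding.
SELECT
google_ml.embedding(
    model_id => 'textembedding-gecko@002',
    content => 'example text to embed');
```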
Key concepts
Before you start using model endpoint management, understand the concepts required to connect to and use the models.
Model provider
The model provider indicates the supported model hosting providers. The following table shows the model provider value that you must set based on the provider you use:
Model provider | Set in function as… |
---|---|
Vertex AI | `google` |
Other models | `custom` |
The default model provider is `custom`.
The supported authentication method differs based on the provider type. Vertex AI models use the Distributed Cloud service account to authenticate, while other providers can use Secret Manager to authenticate.
Model type
Model type indicates the type of the AI model. The extension supports text embedding as
well as any generic model type. The supported model type you can set when
registering a model endpoint are text-embedding
and generic
. Setting model type is
optional when registering generic model endpoints as generic
is the default model type.
- Text embedding models with built-in support
  - Model endpoint management provides built-in support for all versions of the `textembedding-gecko` model by Vertex AI. To register these model endpoints, use the `google_ml.create_model()` function. Distributed Cloud automatically sets up default transform functions for these models.
  - The model type for these models is `text-embedding`.
- Other text embedding models
  - For other text embedding models, you need to create transform functions to handle the input and output formats that the model supports. Optionally, you can use the HTTP header generation function to generate custom headers required by your model.
  - The model type for these models is `text-embedding`.
- Generic models
  - Model endpoint management also supports registering all model types apart from text embedding models. To invoke predictions for generic models, use the `google_ml.predict_row()` function. You can set model endpoint metadata, such as a request endpoint and HTTP headers, that is specific to your model.
  - You cannot pass transform functions when registering a generic model endpoint. Ensure that when you invoke predictions, the input to the function is in JSON format, and that you parse the JSON output to derive the final output.
  - The model type for these models is `generic`.
Authentication
Auth types indicate the authentication type that you can use to connect to model endpoint management using the `google_ml_integration` extension. Setting authentication is optional and is required only if you need to authenticate to access your model.
For Vertex AI models, the Distributed Cloud service account is used for authentication. For other models, an API key or a bearer token stored as a secret in Secret Manager can be used with the `google_ml.create_sm_secret()` SQL function.
The following table shows the auth types that you can set:
Authentication method | Set in function as… | Model provider |
---|---|---|
Distributed Cloud service agent | `alloydb_service_agent_iam` | Vertex AI provider |
Secret Manager | `secret_manager` | Third-party providers |
Prediction functions
The `google_ml_integration` extension includes the following prediction functions:
- `google_ml.embedding()`
  - Used to call a registered text embedding model endpoint to generate embeddings. It includes built-in support for the `textembedding-gecko` model by Vertex AI.
  - For text embedding models without built-in support, the input and output parameters are unique to a model and need to be transformed for the function to call the model. Create a transform input function to transform the prediction function input into the model-specific input, and a transform output function to transform the model-specific output into the prediction function output.
- `google_ml.predict_row()`
  - Used to call a registered generic model endpoint, as long as it supports a JSON-based API, to invoke predictions.
Transform functions
Transform functions modify the input to a format that the model understands, and convert the model response to the format that the prediction function expects. Transform functions are used when registering `text-embedding` model endpoints without built-in support. The signature of the transform functions depends on the prediction function for the model type.
You cannot use transform functions when registering a `generic` model endpoint.
The following shows the signatures of the transform functions for text embedding models:
-- Define custom model-specific input and output transform functions.
CREATE OR REPLACE FUNCTION input_transform_function(model_id VARCHAR(100), input_text TEXT) RETURNS JSON;
CREATE OR REPLACE FUNCTION output_transform_function(model_id VARCHAR(100), response_json JSON) RETURNS real[];
HTTP header generation function
The HTTP header generation function generates output as JSON key-value pairs that are used as HTTP headers. The signature of the prediction function determines the signature of the header generation function.
The following example shows the signature for the `google_ml.embedding()` prediction function:
CREATE OR REPLACE FUNCTION generate_headers(model_id VARCHAR(100), input TEXT) RETURNS JSON;
For the `google_ml.predict_row()` prediction function, the signature is as follows:
CREATE OR REPLACE FUNCTION generate_headers(model_id VARCHAR(100), input JSON) RETURNS JSON;
Register a model
To invoke predictions or generate embeddings using a model, register the model endpoint with model endpoint management.
For more information about the `google_ml.create_model()` function, see Model endpoint management reference.
Before you register a model endpoint with model endpoint management, you must enable the `google_ml_integration` extension and set up authentication based on the model provider, if your model endpoint requires authentication.
Make sure that you access your database with the `postgres` default username.
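For example, on a deployment where you can set database flags directly with superuser access, enabling the extension might look like the following sketch; the exact mechanism for setting the `google_ml_integration.enable_model_support` flag depends on how your cluster is managed:

```sql
-- Enable model endpoint support for the database, then reconnect
-- so the setting takes effect for new sessions.
ALTER DATABASE postgres SET google_ml_integration.enable_model_support = 'on';

-- Install the extension in the database that you connect to.
CREATE EXTENSION IF NOT EXISTS google_ml_integration;
```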
Set up authentication
The following sections show how to set up authentication before adding a Vertex AI model endpoint or model endpoints by other providers.
Set up authentication for Vertex AI
To use the Google Vertex AI model endpoints, you must add Vertex AI permissions to the service account that you used while installing AlloyDB Omni.
Set up authentication for other model providers
For all models except Vertex AI models, you can store your API keys or bearer tokens in Secret Manager. This step is optional if your model endpoint doesn't handle authentication through Secret Manager—for example, if your model endpoint uses HTTP headers to pass authentication information or doesn't use authentication at all.
This section explains how to set up authentication if you are using Secret Manager.
To create and use an API key or a bearer token, complete the following steps:
1. Create the secret in Secret Manager. The secret name and the secret path are used in the `google_ml.create_sm_secret()` SQL function.
2. Grant permissions to the Distributed Cloud cluster to access the secret:

   gcloud secrets add-iam-policy-binding 'SECRET_ID' \
       --member="serviceAccount:SERVICE_ACCOUNT_ID" \
       --role="roles/secretmanager.secretAccessor"

   Replace the following:
   - `SECRET_ID`: the secret ID in Secret Manager.
   - `SERVICE_ACCOUNT_ID`: the ID of the service account that you created in the previous step. Ensure that this is the same account that you used during AlloyDB Omni installation, including the full `PROJECT_ID.iam.gserviceaccount.com` suffix, for example, `my-service@my-project.iam.gserviceaccount.com`.

   You can also grant this role to the service account at the project level.
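After the permissions are granted, register the secret in the database so that model endpoints can reference it by secret ID. The secret ID, project, and version values here are hypothetical placeholders:

```sql
-- Register a Secret Manager secret under an ID that model
-- endpoints can reference at registration time.
CALL
google_ml.create_sm_secret(
    secret_id => 'my-model-key',
    secret_path => 'projects/my-project/secrets/my-model-key/versions/1');
```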
Generate embeddings
This section describes a preview that lets you experiment with registering an AI model endpoint and invoking predictions with model endpoint management.
After the model endpoints are registered with model endpoint management, you can reference them using the model ID to generate embeddings.
Before you begin
Make sure that you have registered your model endpoint with model endpoint management.
Generate embeddings
Use the `google_ml.embedding()` SQL function to call a registered model endpoint with the text embedding model type to generate embeddings.
To call the model and generate embeddings, use the following SQL query:
SELECT
google_ml.embedding(
model_id => 'MODEL_ID',
content => 'CONTENT');
Replace the following:
- `MODEL_ID`: the model ID that you defined when registering the model endpoint.
- `CONTENT`: the text to translate into a vector embedding.
Examples
This section lists examples of generating embeddings using registered model endpoints.
Text embedding models with built-in support
To generate embeddings for a registered `textembedding-gecko@002` model endpoint, run the following statement:
SELECT
google_ml.embedding(
model_id => 'textembedding-gecko@002',
content => 'AlloyDB is a managed, cloud-hosted SQL database service');
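Because the function returns an array of real values, you can also store the result as vector data when the `pgvector` extension is enabled. The following sketch assumes a hypothetical `documents` table, a 768-dimension embedding model, and that your pgvector version supports casting a `real[]` to `vector`:

```sql
-- Assumes: CREATE EXTENSION vector; and a registered endpoint.
CREATE TABLE documents (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  body TEXT,
  embedding vector(768)
);

-- Embed the text at insert time and store it as a vector.
INSERT INTO documents (body, embedding)
VALUES (
  'AlloyDB is a managed, cloud-hosted SQL database service',
  google_ml.embedding(
      model_id => 'textembedding-gecko@002',
      content => 'AlloyDB is a managed, cloud-hosted SQL database service')::vector);
```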
Invoke predictions
This section describes a preview that lets you experiment with registering an AI model endpoint and invoking predictions with model endpoint management.
After the model endpoints are registered with model endpoint management, you can reference them using the model ID to invoke predictions.
Before you begin
Make sure that you have registered your model endpoint with model endpoint management.
Invoke predictions for generic models
Use the `google_ml.predict_row()` SQL function to call a registered generic model endpoint to invoke predictions. You can use the `google_ml.predict_row()` function with any model type.
SELECT
google_ml.predict_row(
model_id => 'MODEL_ID',
request_body => 'REQUEST_BODY');
Replace the following:
- `MODEL_ID`: the model ID that you defined when registering the model endpoint.
- `REQUEST_BODY`: the parameters to the prediction function, in JSON format.
Examples
This section lists examples of invoking predictions using registered model endpoints.
To generate predictions for a registered `gemini-pro` model endpoint, run the following statement:
SELECT
json_array_elements(
google_ml.predict_row(
model_id => 'gemini-pro',
request_body => '{
"contents": [
{
"role": "user",
"parts": [
{
"text": "For TPCH database schema as mentioned here https://www.tpc.org/TPC_Documents_Current_Versions/pdf/TPC-H_v3.0.1.pdf , generate a SQL query to find all supplier names which are located in the India nation."
}
]
}
]
}'))-> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text';
Model endpoint management API reference
This section lists parameters for the different functions provided by the `google_ml_integration` extension to register and manage model endpoints and secrets with model endpoint management.
You must set the `google_ml_integration.enable_model_support` database flag to `on` before you can start using the extension.
Models
Use this reference to understand parameters for functions that let you manage model endpoints.
google_ml.create_model() function
The following shows how to call the `google_ml.create_model()` SQL function used to register model endpoint metadata:
CALL
google_ml.create_model(
model_id => 'MODEL_ID',
model_request_url => 'REQUEST_URL',
model_provider => 'PROVIDER_ID',
model_type => 'MODEL_TYPE',
model_qualified_name => 'MODEL_QUALIFIED_NAME',
model_auth_type => 'AUTH_TYPE',
model_auth_id => 'AUTH_ID',
generate_headers_fn => 'GENERATE_HEADER_FUNCTION',
model_in_transform_fn => 'INPUT_TRANSFORM_FUNCTION',
model_out_transform_fn => 'OUTPUT_TRANSFORM_FUNCTION');
Parameter | Required | Description |
---|---|---|
`MODEL_ID` | Required for all model endpoints | A unique ID for the model endpoint that you define. |
`REQUEST_URL` | Optional for text embedding model endpoints with built-in support | The model-specific endpoint when adding other text embedding and generic model endpoints. For AlloyDB for PostgreSQL, provide an `https` URL. The request URL that the function generates for built-in model endpoints refers to your cluster's project and region or location. If you want to refer to another project, ensure that you specify the `model_request_url` explicitly. For custom-hosted model endpoints, ensure that the model endpoint is accessible from the network where Distributed Cloud is located. |
`PROVIDER_ID` | Required for text embedding model endpoints with built-in support | The provider of the model endpoint. The default value is `custom`. Set to `google` for Vertex AI model endpoints, or `custom` for other providers. |
`MODEL_TYPE` | Optional for generic model endpoints | The model type. Set to `text-embedding` for text embedding model endpoints, or `generic` for all other model endpoints. |
`MODEL_QUALIFIED_NAME` | Required for text embedding models with built-in support; optional for other model endpoints | The fully qualified name for text embedding models with built-in support. |
`AUTH_TYPE` | Optional unless the model endpoint has a specific authentication requirement | The authentication type used by the model endpoint. Set it to either `alloydb_service_agent_iam` for Vertex AI models or `secret_manager` for other providers, if they use Secret Manager for authentication. You don't need to set this value if you are using authentication headers. |
`AUTH_ID` | Don't set for Vertex AI model endpoints; required for all other model endpoints that store secrets in Secret Manager | The secret ID that you set and subsequently use when registering a model endpoint. |
`GENERATE_HEADER_FUNCTION` | Optional | The name of the function that generates custom headers. The signature of this function depends on the prediction function that you use. |
`INPUT_TRANSFORM_FUNCTION` | Optional for text embedding model endpoints with built-in support; don't set for generic model endpoints | The function to transform input of the corresponding prediction function to the model-specific input. |
`OUTPUT_TRANSFORM_FUNCTION` | Optional for text embedding model endpoints with built-in support; don't set for generic model endpoints | The function to transform model-specific output to the prediction function output. |
google_ml.alter_model() function
The following shows how to call the `google_ml.alter_model()` SQL function used to update model endpoint metadata:
CALL
google_ml.alter_model(
model_id => 'MODEL_ID',
model_request_url => 'REQUEST_URL',
model_provider => 'PROVIDER_ID',
model_type => 'MODEL_TYPE',
model_qualified_name => 'MODEL_QUALIFIED_NAME',
model_auth_type => 'AUTH_TYPE',
model_auth_id => 'AUTH_ID',
generate_headers_fn => 'GENERATE_HEADER_FUNCTION',
model_in_transform_fn => 'INPUT_TRANSFORM_FUNCTION',
model_out_transform_fn => 'OUTPUT_TRANSFORM_FUNCTION');
For information about the values that you must set for each parameter, see the `google_ml.create_model()` function reference.
google_ml.drop_model() function
The following shows how to call the `google_ml.drop_model()` SQL function used to drop a model endpoint:
CALL google_ml.drop_model('MODEL_ID');
Parameter | Description |
---|---|
`MODEL_ID` | A unique ID for the model endpoint that you defined. |
google_ml.list_model() function
The following shows how to call the `google_ml.list_model()` SQL function used to list model endpoint information:
SELECT google_ml.list_model('MODEL_ID');
Parameter | Description |
---|---|
`MODEL_ID` | A unique ID for the model endpoint that you defined. |
google_ml.model_info_view view
The following shows how to query the `google_ml.model_info_view` view to list model endpoint information for all model endpoints:
SELECT * FROM google_ml.model_info_view;
Secrets
Use this reference to understand parameters for functions that let you manage secrets.
google_ml.create_sm_secret() function
The following shows how to call the `google_ml.create_sm_secret()` SQL function used to add a secret created in Secret Manager:
CALL
google_ml.create_sm_secret(
    secret_id => 'SECRET_ID',
    secret_path => 'projects/PROJECT_ID/secrets/SECRET_MANAGER_SECRET_ID/versions/VERSION_NUMBER');
Parameter | Description |
---|---|
`SECRET_ID` | The secret ID that you set and subsequently use when registering a model endpoint. |
`PROJECT_ID` | The ID of your Google Cloud project that contains the secret. This project can be different from the project that contains your AlloyDB for PostgreSQL cluster. For AlloyDB Omni, this is the ID of your Google Cloud project that contains the secret. |
`SECRET_MANAGER_SECRET_ID` | The secret ID set in Secret Manager when you created the secret. |
`VERSION_NUMBER` | The version number of the secret. |
google_ml.alter_sm_secret() function
The following shows how to call the `google_ml.alter_sm_secret()` SQL function used to update secret information:
CALL
google_ml.alter_sm_secret(
    secret_id => 'SECRET_ID',
    secret_path => 'projects/PROJECT_ID/secrets/SECRET_MANAGER_SECRET_ID/versions/VERSION_NUMBER');
For information about the values that you must set for each parameter, see the `google_ml.create_sm_secret()` function reference.
google_ml.drop_sm_secret() function
The following shows how to call the `google_ml.drop_sm_secret()` SQL function used to drop a secret:
CALL google_ml.drop_sm_secret('SECRET_ID');
Parameter | Description |
---|---|
`SECRET_ID` | The secret ID that you set and subsequently used when registering a model endpoint. |
Prediction functions
Use this reference to understand parameters for functions that let you generate embeddings or invoke predictions.
google_ml.embedding() function
The following shows how to generate embeddings:
SELECT
google_ml.embedding(
    model_id => 'MODEL_ID',
    content => 'CONTENT');
Parameter | Description |
---|---|
`MODEL_ID` | A unique ID for the model endpoint that you define. |
`CONTENT` | The text to translate into a vector embedding. |
google_ml.predict_row() function
The following shows how to invoke predictions:
SELECT
google_ml.predict_row(
model_id => 'MODEL_ID',
request_body => 'REQUEST_BODY');
Parameter | Description |
---|---|
`MODEL_ID` | A unique ID for the model endpoint that you define. |
`REQUEST_BODY` | The parameters to the prediction function, in JSON format. |
Transform functions
Use this reference to understand parameters for input and output transform functions.
Input transform function
The following shows the signature of the input transform function for text embedding model endpoints:
CREATE OR REPLACE FUNCTION INPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), input_text TEXT) RETURNS JSON;
Parameter | Description |
---|---|
`INPUT_TRANSFORM_FUNCTION` | The function to transform input of the corresponding prediction function to the model endpoint-specific input. |
Output transform function
The following shows the signature of the output transform function for text embedding model endpoints:
CREATE OR REPLACE FUNCTION OUTPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), response_json JSON) RETURNS real[];
Parameter | Description |
---|---|
`OUTPUT_TRANSFORM_FUNCTION` | The function to transform model endpoint-specific output to the prediction function output. |
Transform functions example
To better understand how to create transform functions for your model endpoint, consider a custom-hosted text embedding model endpoint that requires JSON input and output.
The following example cURL request creates embeddings based on the prompt and the model endpoint:
curl -m 100 -X POST https://cymbal.com/models/text/embeddings/v1 \
  -H "Content-Type: application/json" \
  -d '{"prompt": ["AlloyDB Embeddings"]}'
The following example response is returned:
[[0.3522231, -0.35932037, 0.10156056, 0.17734447, -0.11606089, -0.17266059,
0.02509351, 0.20305622, -0.09787305, -0.12154685, -0.17313677, -0.08075467,
0.06821183, -0.06896557, 0.1171584, -0.00931572, 0.11875633, -0.00077482,
0.25604948, 0.0519384, 0.2034983, -0.09952664, 0.10347155, -0.11935943,
-0.17872004, -0.08706985, -0.07056875, -0.05929353, 0.4177883, -0.14381726,
0.07934926, 0.31368294, 0.12543282, 0.10758053, -0.30210832, -0.02951015,
0.3908268, -0.03091059, 0.05302926, -0.00114946, -0.16233777, 0.1117468,
-0.1315904, 0.13947351, -0.29569918, -0.12330773, -0.04354299, -0.18068913,
0.14445548, 0.19481727]]
Based on this input and response, we can infer the following:
- The model expects JSON input through the `prompt` field. This field accepts an array of inputs. Because the `google_ml.embedding()` function is a row-level function, it expects one text input at a time. Thus, you need to create an input transform function that builds an array with a single element.
- The response from the model is an array of embeddings, one for each prompt input to the model. Because the `google_ml.embedding()` function is a row-level function, it returns a single output at a time. Thus, you need to create an output transform function that extracts the embedding from the array.
The following example shows the input and output transform functions that are used for this model endpoint when it is registered with model endpoint management:
Input transform function
CREATE OR REPLACE FUNCTION cymbal_text_input_transform(model_id VARCHAR(100), input_text TEXT)
RETURNS JSON
LANGUAGE plpgsql
AS $$
DECLARE
  transformed_input JSON;
BEGIN
  -- Wrap the single text input in the JSON array that the model expects.
  SELECT json_build_object('prompt', json_build_array(input_text))::JSON INTO transformed_input;
  RETURN transformed_input;
END;
$$;
Output transform function
CREATE OR REPLACE FUNCTION cymbal_text_output_transform(model_id VARCHAR(100), response_json JSON)
RETURNS REAL[]
LANGUAGE plpgsql
AS $$
DECLARE
  transformed_output REAL[];
BEGIN
  -- Extract the single embedding from the array of embeddings.
  SELECT ARRAY(SELECT json_array_elements_text(response_json->0)) INTO transformed_output;
  RETURN transformed_output;
END;
$$;
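With both transform functions in place, the endpoint can be registered by passing the function names as `model_in_transform_fn` and `model_out_transform_fn` to `google_ml.create_model()`, after which it behaves like any other text embedding endpoint. A call like the following might then work; the model ID is an illustrative placeholder:

```sql
-- The input transform wraps the text in {"prompt": [...]}, and the
-- output transform extracts the first embedding from the response.
SELECT
google_ml.embedding(
    model_id => 'cymbal-text-embedding',
    content => 'AlloyDB Embeddings');
```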
HTTP header generation function
The following shows the signature for the header generation function that can be used with the `google_ml.embedding()` prediction function when registering other text embedding model endpoints:
CREATE OR REPLACE FUNCTION GENERATE_HEADERS(model_id VARCHAR(100), input_text TEXT) RETURNS JSON;
For the `google_ml.predict_row()` prediction function, the signature is as follows:
CREATE OR REPLACE FUNCTION GENERATE_HEADERS(model_id VARCHAR(100), input JSON) RETURNS JSON;
Parameter | Description |
---|---|
`GENERATE_HEADERS` | The function to generate custom headers. You can also pass the authorization header generated by the header generation function while registering the model endpoint. |
Header generation function example
To better understand how to create a function that generates output in JSON key value pairs that are used as HTTP headers, consider a custom-hosted text embedding model endpoint.
The following example cURL request passes the `version` HTTP header, which is used by the model endpoint:
curl -m 100 -X POST https://cymbal.com/models/text/embeddings/v1 \
-H "Content-Type: application/json" \
-H "version: 2024-01-01" \
-d '{"prompt": ["AlloyDB Embeddings"]}'
The model endpoint expects the version value through the `version` HTTP header. The following example shows the header generation function that is used for this text embedding model endpoint when it is registered with model endpoint management:
CREATE OR REPLACE FUNCTION header_gen_fn(model_id VARCHAR(100), input_text TEXT)
RETURNS JSON
LANGUAGE plpgsql
AS $$
BEGIN
RETURN json_build_object('version', '2024-01-01')::JSON;
END;
$$;
Header generation function using API Key
The following examples show how to set up authentication using the API key.
Embedding model
CREATE OR REPLACE FUNCTION header_gen_func(
model_id VARCHAR(100),
input_text TEXT
)
RETURNS JSON
LANGUAGE plpgsql
AS $$
#variable_conflict use_variable
BEGIN
RETURN json_build_object('Authorization', 'API_KEY')::JSON;
END;
$$;
Replace `API_KEY` with the API key of the model provider.
Generic model
CREATE OR REPLACE FUNCTION header_gen_func(
model_id VARCHAR(100),
response_json JSON
)
RETURNS JSON
LANGUAGE plpgsql
AS $$
#variable_conflict use_variable
DECLARE
transformed_output REAL[];
BEGIN
-- code to add Auth token to API request
RETURN json_build_object('x-api-key', 'API_KEY', 'model-version', '2023-06-01')::JSON;
END;
$$;
Replace `API_KEY` with the API key of the model provider.
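To make the extension use a header generation function, pass its name as `generate_headers_fn` when registering the endpoint. The following sketch registers a generic endpoint with the `header_gen_func` function from the example above; the model ID and request URL are placeholders:

```sql
-- Register a generic endpoint whose requests carry the headers
-- produced by header_gen_func (for example, an API key header).
CALL
google_ml.create_model(
    model_id => 'my-generic-model',
    model_request_url => 'https://example.com/models/v1/predict',
    model_provider => 'custom',
    model_type => 'generic',
    generate_headers_fn => 'header_gen_func');
```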
Request URL generation
Use the request URL generation function to infer the request URLs for the model endpoints with built-in support. The following shows the signature for this function:
CREATE OR REPLACE FUNCTION GENERATE_REQUEST_URL(provider google_ml.model_provider, model_type google_ml.MODEL_TYPE, model_qualified_name VARCHAR(100), model_region VARCHAR(100) DEFAULT NULL)
Parameter | Description |
---|---|
`GENERATE_REQUEST_URL` | The function that generates the request URL for model endpoints with built-in support. |
Supported models
You can use model endpoint management to register any text embedding or generic model endpoint. Model endpoint management also includes pre-registered Vertex AI models and models with built-in support.
Pre-registered Vertex AI models
Model type | Model ID | Extension version |
---|---|---|
`generic` | | Version 1.4.2 and later |
`text_embedding` | | Version 1.3 and later |
Models with built-in support
Vertex AI
Qualified model name | Model type |
---|---|
`text-embedding-gecko@001` | `text-embedding` |
`text-embedding-gecko@003` | `text-embedding` |
`text-embedding-004` | `text-embedding` |
`text-embedding-preview-0815` | `text-embedding` |
`text-multilingual-embedding-002` | `text-embedding` |