[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated (UTC): 2025-09-04."],[[["\u003cp\u003eModel endpoint management is a preview feature that allows users to register and manage AI model endpoints within AlloyDB Omni, enabling interaction with models through SQL queries.\u003c/p\u003e\n"],["\u003cp\u003eThe \u003ccode\u003egoogle_ml_integration\u003c/code\u003e extension provides functions to register model metadata, generate vector embeddings, and invoke predictions using registered models, supporting various model types and providers.\u003c/p\u003e\n"],["\u003cp\u003eSupported model providers include Vertex AI, Hugging Face, Anthropic, OpenAI, and custom-hosted models, with authentication methods varying by provider, including AlloyDB service accounts and Secret Manager.\u003c/p\u003e\n"],["\u003cp\u003eModel endpoint management supports text embedding models, with built-in support for specific Vertex AI and OpenAI models, as well as generic models with JSON-based APIs.\u003c/p\u003e\n"],["\u003cp\u003eTransform and HTTP header generation functions are available for advanced customization when registering text-embedding models, allowing users to modify input and output formats and define custom HTTP headers.\u003c/p\u003e\n"]]],[],null,["# Register and call remote AI models in AlloyDB Omni overview\n\nSelect a documentation version: 15.5.2\n\n- [15.5.5](/alloydb/omni/15.5.5/docs/model-endpoint-overview)\n- [15.5.4](/alloydb/omni/15.5.4/docs/model-endpoint-overview)\n- [15.5.2](/alloydb/omni/15.5.2/docs/model-endpoint-overview)\n\n\u003cbr /\u003e\n\n|\n| **Preview: Model endpoint management**\n|\n|\n| This feature is subject to the 
\"Pre-GA Offerings Terms\" in the General Service Terms section\n| of the [Service Specific Terms](/terms/service-terms#1).\n|\n| Pre-GA features are available \"as is\" and might have limited support.\n|\n| For more information, see the\n| [launch stage descriptions](/products#product-launch-stages).\n\nThis page describes a preview that lets you experiment with registering an AI model endpoint\nand invoking predictions with Model endpoint management in AlloyDB Omni. To use AI models in\nproduction environments, see [Build generative AI applications using\nAlloyDB AI](/alloydb/omni/docs/ai/overview-ai) and [Work with vector embeddings](/alloydb/docs/ai/work-with-embeddings).\n\nTo register remote model endpoints with AlloyDB, see [Register and call remote AI models in AlloyDB](/alloydb/docs/ai/model-endpoint-overview).\n\nOverview\n--------\n\nThe *Model endpoint management* preview lets you register a model endpoint, manage model endpoint metadata in your\ndatabase cluster, and then interact with the models using SQL queries. 
It provides the [`google_ml_integration`](/alloydb/omni/docs/reference/model-endpoint) extension, which\nincludes functions to add and register the model endpoint metadata related to the models, and then use the\nmodels to generate vector embeddings or invoke predictions.\n\nExamples of model types that you can register using model endpoint management include the following:\n\n- [Vertex AI](/vertex-ai/docs) text embedding models\n- Embedding models provided by third-party providers, such as Anthropic, Hugging Face, or OpenAI\n- Custom-hosted text embedding models\n- Generic models with a JSON-based API, for example, the `facebook/bart-large-mnli` model hosted on Hugging Face or the `gemini-pro` model from the Vertex AI Model Garden\n\nHow it works\n------------\n\nYou can use model endpoint management to register a model endpoint that meets the following requirements:\n\n- Model input and output support the JSON format.\n- Model can be called using the REST protocol.\n\nWhen you [register a model endpoint with the model endpoint management](/alloydb/omni/15.5.2/docs/model-endpoint-register-model), it registers each endpoint with a unique model ID that you provide as a reference to the model. You can use this model ID to query models:\n\n- Generate embeddings to translate text prompts to numerical vectors. You can\n store generated embeddings as vector data when the `pgvector` extension is enabled in the database.\n\n- Invoke predictions to call a model using SQL within a transaction.\n\nYour applications can access the model endpoint management using the `google_ml_integration`\nextension. 
This extension provides the following functions:\n\n- The `google_ml.create_model()` SQL function, which registers the model endpoint that is used in the prediction or embedding function.\n- The `google_ml.create_sm_secret()` SQL function, which uses secrets in the Google Cloud Secret Manager, where the API keys are stored.\n- The `google_ml.embedding()` SQL function, which is a prediction function that generates text embeddings.\n- The `google_ml.predict_row()` SQL function, which generates predictions when you call generic models that support JSON input and output format.\n- Other helper functions that handle generating a custom URL, generating HTTP headers, or passing transform functions for your generic models.\n- Functions to manage the registered model endpoints and secrets.\n\nKey concepts\n------------\n\nBefore you start using the model endpoint management, understand the concepts required to connect to and use the models.\n\n### Model provider {: #model-provider}\n\n*Model provider* indicates the supported model hosting providers. The following\ntable shows the model provider value you must set based on the model provider\nyou use:\n\nThe default model provider is `custom`.\n\nThe supported authentication method differs based on the provider type. Vertex AI models use the AlloyDB service account to authenticate, while other providers can use Secret Manager to authenticate.\n\n### Model type\n\n*Model type* indicates the type of the AI model. The extension supports text embedding as\nwell as any generic model type. The supported model types that you can set when\nregistering a model endpoint are `text-embedding` and `generic`. 
Setting model type is\noptional when registering generic model endpoints as `generic` is the default model type.\n\nText embedding models with built-in support\n: The\n model endpoint management provides built-in support for all versions of the\n `textembedding-gecko` model by Vertex AI and the\n `text-embedding-ada-002` model by OpenAI. To register these model endpoints,\n use the `google_ml.create_model()` function. AlloyDB\n automatically sets up default transform functions for these models.\n: The model type for these models is `text-embedding`.\n\nOther text embedding models\n: For other text embedding models, you\n need to create transform functions to handle the input and output formats that\n the model supports. Optionally, you can use the HTTP header generation function\n that generates custom headers required by your model.\n: The model type for these models is `text-embedding`.\n\nGeneric models\n: The model endpoint management also supports\n registering of all other model types apart from text embedding models. To\n invoke predictions for generic models, use the\n `google_ml.predict_row()` function. You can set model endpoint metadata,\n such as a request endpoint and HTTP headers that are specific to your model.\n: You cannot\n pass transform functions when you are registering a generic model endpoint. Ensure that\n when you invoke predictions the input to the function is in the JSON format, and\n that you parse the JSON output to derive the final output.\n: The model type for these models is `generic`.\n\n### Authentication\n\n*Auth types* indicate the authentication type that you can use to connect to the\nmodel endpoint management using the `google_ml_integration` extension. Setting\nauthentication is optional and is required only if you need to authenticate to access your model.\n\nFor Vertex AI models, the AlloyDB service account is used for authentication. 
For other models,\nan API key or bearer token that is stored as a secret in\nSecret Manager can be used with the `google_ml.create_sm_secret()` SQL\nfunction.\n\nThe following table shows the auth types that you can set:\n\n### Prediction functions\n\nThe `google_ml_integration` extension includes the following prediction functions:\n\n`google_ml.embedding()`\n: Used to call a registered text embedding model endpoint to\n generate embeddings. It includes built-in support for the\n `textembedding-gecko` model by Vertex AI and the `text-embedding-ada-002` model by\n OpenAI.\n: For text embedding models without built-in support, the input and output\n parameters are unique to a model and need to be transformed for the function\n to call the model. Create a transform input function to transform the input of the\n prediction function to the model-specific input, and a transform output\n function to transform the model-specific output to the prediction function output.\n\n`google_ml.predict_row()`\n: Used to call a registered generic model endpoint to invoke predictions, as long as it\n supports a JSON-based API.\n\n### Transform functions\n\nTransform functions modify the input to a format that the model understands, and\nconvert the model response to the format that the prediction function expects. The\ntransform functions are used when registering a `text-embedding` model endpoint without\nbuilt-in support. 
The signature of the transform functions depends on the\nprediction function for the model type.\n\nYou cannot use transform functions when registering a `generic` model endpoint.\n\nThe following shows the signatures of the transform functions for text\nembedding models: \n\n -- Define custom model-specific input/output transform functions.\n CREATE OR REPLACE FUNCTION input_transform_function(model_id VARCHAR(100), input_text TEXT) RETURNS JSON;\n\n CREATE OR REPLACE FUNCTION output_transform_function(model_id VARCHAR(100), response_json JSON) RETURNS real[];\n\nFor more information about how to create transform functions, see [Transform functions example](/alloydb/docs/reference/model-endpoint#transform-function).\n\n### HTTP header generation function\n\nThe HTTP header generation function generates its output as JSON key-value pairs\nthat are used as HTTP headers. The signature of the prediction function determines\nthe signature of the header generation function.\n\nThe following example shows the header generation function signature for the `google_ml.embedding()` prediction function. \n\n CREATE OR REPLACE FUNCTION generate_headers(model_id VARCHAR(100), input TEXT) RETURNS JSON;\n\nFor the `google_ml.predict_row()` prediction function, the signature is as follows: \n\n CREATE OR REPLACE FUNCTION generate_headers(model_id VARCHAR(100), input JSON) RETURNS JSON;\n\nFor more information about how to create a header generation function, see [HTTP header generation function](/alloydb/docs/reference/model-endpoint#header-gen-function).\n\nWhat's next\n-----------\n\n- [Register a model endpoint with model endpoint management](/alloydb/omni/15.5.2/docs/model-endpoint-register-model).\n- Learn about the [model endpoint management reference](/alloydb/docs/reference/model-endpoint-reference)."]]