Invoke predictions with model endpoint management

This page describes a preview that lets you experiment with registering AI models and invoking predictions with Model endpoint management. To use AI models in production environments, see Build generative AI applications using AlloyDB AI.

After you add and register models in Model endpoint management, you can reference them by model ID to invoke predictions.

Before you begin

Make sure that you have registered your model with Model endpoint management. For more information, see Register a model with model endpoint management.

Invoke predictions for generic models

To invoke predictions, call a registered generic model with the google_ml.predict_row() SQL function. You can use the google_ml.predict_row() function with any model type.

    SELECT
      google_ml.predict_row(
        model_id => 'MODEL_ID',
        request_body => 'REQUEST_BODY');

Replace the following:

  • MODEL_ID: the model ID you defined when registering the model.
  • REQUEST_BODY: the parameters to the prediction function, in JSON format.
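
Because REQUEST_BODY must be valid JSON embedded in a SQL string literal, it can help to build it in client code rather than by hand. The following Python sketch serializes a dictionary with json.dumps and pairs it with a parameterized statement. The helper name build_predict_call, the psycopg-style %s placeholders, and the ::json cast are assumptions for illustration and are not part of Model endpoint management.

```python
import json

# SQL text with driver-side placeholders. %s is the psycopg-style paramstyle
# and ::json assumes request_body is a JSON parameter; both are assumptions,
# so adjust them to your PostgreSQL driver and setup.
PREDICT_SQL = (
    "SELECT google_ml.predict_row("
    "model_id => %s, request_body => %s::json)"
)


def build_predict_call(model_id, body):
    """Return (sql, params) for a google_ml.predict_row() call.

    Hypothetical helper: it only prepares the statement and its parameters.
    It does not connect to a database.
    """
    # json.dumps produces well-formed JSON (balanced braces, escaped quotes).
    return PREDICT_SQL, (model_id, json.dumps(body))


sql, params = build_predict_call("MODEL_ID", {"inputs": "text to classify"})
print(sql)
print(params)
```

Passing the model ID and request body as parameters, instead of concatenating them into the SQL text, also avoids quoting problems when the prompt itself contains single quotes.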


This section lists some examples of invoking predictions using registered models.

To generate predictions for a registered gemini-pro model, run the following statement:

    SELECT
        json_array_elements(
            google_ml.predict_row(
                model_id => 'gemini-pro',
                request_body => '{
            "contents": [{
                    "role": "user",
                    "parts": [{
                            "text": "For TPCH database schema as mentioned here , generate a SQL query to find all supplier names which are located in the India nation."
                    }]
            }]
            }'))-> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text';
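
The trailing -> operators walk into the JSON response one level at a time. The Python sketch below mirrors that path on a hand-written sample so that the traversal is easier to follow. The sample is a hypothetical, heavily trimmed response object whose shape is inferred only from the path in the statement; a real response contains additional fields.

```python
import json

# Hypothetical response object. Only the keys named in the SQL path are
# included, and the text value is a placeholder, not real model output.
sample = json.loads("""
{
  "candidates": [
    {"content": {"parts": [{"text": "PLACEHOLDER_GENERATED_TEXT"}]}}
  ]
}
""")


def extract_text(response):
    """Follow 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text'."""
    return response["candidates"][0]["content"]["parts"][0]["text"]


print(extract_text(sample))
```

Each string key selects an object field and each 0 selects the first array element, which matches how the -> operator treats text and integer operands in PostgreSQL.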

To generate predictions for a registered facebook/bart-large-mnli model on Hugging Face, run the following statement:

    SELECT
      google_ml.predict_row(
        model_id => 'facebook/bart-large-mnli',
        request_body =>
          '{
           "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
           "parameters": {"candidate_labels": ["refund", "legal", "faq"]}}');
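
Hand-written JSON inside a SQL string is easy to break with a missing brace or quote. One way to catch this before you run the statement is to parse the request body in client code first. The following is a minimal Python sketch; the function name is_valid_json_body is an assumption for illustration, and the input text is shortened from the example.

```python
import json


def is_valid_json_body(body_text):
    """Return True if body_text parses as a JSON object."""
    try:
        parsed = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    # The examples on this page send a JSON object at the top level.
    return isinstance(parsed, dict)


body = """{
  "inputs": "I would like to get reimbursed!",
  "parameters": {"candidate_labels": ["refund", "legal", "faq"]}
}"""
print(is_valid_json_body(body))       # True
print(is_valid_json_body(body[:-1]))  # False: closing brace removed
```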