Model Armor integration with Vertex AI

Model Armor can be integrated directly with Vertex AI by using either floor settings or templates. This integration screens prompts sent to Gemini models and the responses they return, and inspects or blocks them when they violate the configured filter thresholds. It provides prompt and response protection for the generateContent method of the Gemini API in Vertex AI.

Note the following:

  • Enable Cloud Logging to get visibility into the sanitization results of prompts and responses.
  • The supported locations for this integration are us-central1, us-east4, us-west1, and europe-west4.
  • While in Preview, there is no cost for using this integration. For pricing information, see Model Armor pricing.

Before you begin

Grant the Model Armor user permission to the Vertex AI service account.

gcloud projects add-iam-policy-binding PROJECT_ID --member='serviceAccount:service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com' --role='roles/modelarmor.user'
  

Replace the following:

  • PROJECT_ID is your Google Cloud project ID.
  • PROJECT_NUMBER is your Google Cloud project number.
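
Optionally, verify that the role was granted by listing the project's IAM policy, filtered to the Model Armor user role. The following command only reads the policy; its output should include the Vertex AI service account.

gcloud projects get-iam-policy PROJECT_ID --flatten="bindings[].members" --filter="bindings.role:roles/modelarmor.user" --format="value(bindings.members)"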

Configure floor settings

You use floor settings to configure minimum detection thresholds for Model Armor templates. These settings ensure that all new and modified templates meet the floor policy requirements.

Before configuring the floor settings, consider the following:

  • Floor settings can be set at the organization, folder, or project level. To set floor settings at the organization or folder level, you must use the API.
  • The user interface is available only at the project level and lets you inherit the organization-level or folder-level settings.

To configure floor settings, see Configure floor settings.
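
As an illustration of what a floor setting configuration can look like, the following request patches the project-level floor setting over REST. Treat this as a minimal sketch: the floorSetting resource, the update_mask value, the filterConfig and enableFloorSettingEnforcement fields, and the aiPlatformFloorSetting field that turns on Vertex AI inspection and logging are assumptions based on the Model Armor API, and additional fields may be required. Confirm the exact fields and values in Configure floor settings before you rely on them.

curl -X PATCH -H "Content-Type: application/json" -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://modelarmor.googleapis.com/v1/projects/PROJECT_ID/locations/global/floorSetting?update_mask=filter_config" -d '{
  "filterConfig": {
    "raiSettings": {
      "raiFilters": [
        { "filterType": "HARASSMENT", "confidenceLevel": "MEDIUM_AND_ABOVE" },
        { "filterType": "DANGEROUS", "confidenceLevel": "MEDIUM_AND_ABOVE" }
      ]
    },
    "piAndJailbreakFilterSettings": { "filterEnforcement": "ENABLED", "confidenceLevel": "MEDIUM_AND_ABOVE" }
  },
  "enableFloorSettingEnforcement": true,
  "aiPlatformFloorSetting": { "inspectAndBlock": true, "enableCloudLogging": true }
}'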

After you configure floor settings to enable Vertex AI sanitization, Model Armor sanitizes all generateContent API calls to the project's Gemini endpoints using the specified filter settings.

The following code sample shows how to use the generateContent method.

curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent" -d '{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": ""
        }
      ]
    }
  ],
  "generationConfig": {
    "responseModalities": ["TEXT"],
    "temperature": 0.2,
    "maxOutputTokens": 1024,
    "topP": 0.8
  }
}'

Replace the following:

  • PROJECT_ID is your Google Cloud project ID.
  • LOCATION is the Google Cloud location of the Gemini endpoint. The supported locations are us-central1, us-east4, us-west1, and europe-west4.

The following code sample shows the response from the generateContent method.

{
  "promptFeedback": {
    "blockReason": "MODEL_ARMOR",
    "blockReasonMessage": "Blocked by Floor Setting. The prompt violated Responsible AI Safety settings (Harassment, Dangerous), Prompt Injection and Jailbreak filters."
  },
  "usageMetadata": {
    "trafficType": "ON_DEMAND"
  },
  "modelVersion": "gemini-2.0-flash-001",
  "createTime": "2025-03-26T13:14:36.961184Z",
  "responseId": "vP3jZ6DVOqLKnvgPqZL-8Ao"
}
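
If you script these calls, you can detect a block by reading the promptFeedback.blockReason field, which is set to MODEL_ARMOR when Model Armor blocks the prompt. A minimal shell sketch, assuming the JSON response is saved to a hypothetical response.json file and that jq is installed:

jq -r '.promptFeedback.blockReason // "NOT_BLOCKED"' response.json
# Prints MODEL_ARMOR when the prompt was blocked, or NOT_BLOCKED when the field is absent.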

Configure Model Armor templates

Model Armor can also be integrated with Vertex AI by using Model Armor templates. Templates let you configure how Model Armor screens prompts and responses by defining the security filter configurations to apply.

You must create templates first, and then use these templates with Gemini's generateContent method. For more information about templates, see Create and manage Model Armor templates.
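
As a rough sketch of what creating a template can look like over REST, the following request creates a template with Responsible AI and prompt injection and jailbreak filters. The regional endpoint, the templateId query parameter, and the filterConfig field names shown here are assumptions based on the Model Armor API; see Create and manage Model Armor templates for the authoritative request.

curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates?templateId=TEMPLATE_ID" -d '{
  "filterConfig": {
    "raiSettings": {
      "raiFilters": [
        { "filterType": "HARASSMENT", "confidenceLevel": "MEDIUM_AND_ABOVE" },
        { "filterType": "DANGEROUS", "confidenceLevel": "MEDIUM_AND_ABOVE" }
      ]
    },
    "piAndJailbreakFilterSettings": { "filterEnforcement": "ENABLED", "confidenceLevel": "MEDIUM_AND_ABOVE" }
  }
}'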

After configuring the Model Armor template, pass its template ID as a parameter when you call the Gemini API generateContent method. Vertex AI routes the request to Model Armor for processing.

The following code sample shows the request to the generateContent method.

curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer $(gcloud auth print-access-token)" "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent" -d '{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": ""
        }
      ]
    }
  ],
  "generationConfig": {
    "responseModalities": ["TEXT"],
    "temperature": 0.2,
    "maxOutputTokens": 1024,
    "topP": 0.8
  },
  "model_armor_config": {
    "prompt_template_name": "projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID",
    "response_template_name": "projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID"
  }
}'

Replace the following:

  • PROJECT_ID is your Google Cloud project ID.
  • LOCATION is the Google Cloud location of the Gemini endpoint. The supported locations are us-central1, us-east4, us-west1, and europe-west4.
  • TEMPLATE_ID is the Model Armor template ID.

The following code sample shows the response from the generateContent method.

{
  "promptFeedback": {
    "blockReason": "MODEL_ARMOR",
    "blockReasonMessage": "Blocked by Floor Setting. The prompt violated Responsible AI Safety settings (Harassment, Dangerous), Prompt Injection and Jailbreak filters."
  },
  "usageMetadata": {
    "trafficType": "ON_DEMAND"
  },
  "modelVersion": "gemini-2.0-flash-001",
  "createTime": "2025-03-26T13:14:36.961184Z",
  "responseId": "vP3jZ6DVOqLKnvgPqZL-8Ao"
}