Model Armor integration with Google Cloud services

Model Armor integrates with various Google Cloud services:

Google Kubernetes Engine (GKE) and Service Extensions
Vertex AI
Gemini Enterprise

GKE and Service Extensions

Model Armor can be integrated with GKE through Service Extensions. Service Extensions let you route traffic through internal (Google Cloud) or external (user-managed) services for processing. You can configure a service extension on application load balancers, including GKE inference gateways, to screen traffic to and from a GKE cluster. This ensures that interactions with the AI models that pass through the load balancer are protected by Model Armor. For more information, see Integration with GKE.

Vertex AI

Model Armor can be directly integrated into Vertex AI using either floor settings or templates. The integration screens Gemini model requests and responses and blocks those that violate floor settings. It protects prompts and responses in the Gemini API in Vertex AI for the generateContent method. To see the sanitization results for prompts and responses, you need to enable Cloud Logging. For more information, see Integration with Vertex AI.

Gemini Enterprise

Model Armor can be directly integrated with Gemini Enterprise using templates. Gemini Enterprise routes the interactions between users and agents and the underlying LLMs through Model Armor. This means prompts from users or agents, and the responses generated by the LLMs, are inspected by Model Armor before being presented to the user. For more information, see Integration with Gemini Enterprise.

Before you begin

Enable APIs

You must enable the Model Armor API before you can use Model Armor.

Console

Enable the Model Armor API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
Select the project where you want to activate Model Armor.

gcloud

To enable the Model Armor API using the Google Cloud CLI, follow these steps:

In the Google Cloud console, activate Cloud Shell.

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Run the following command to set the API endpoint for the Model Armor service.
```
gcloud config set api_endpoint_overrides/modelarmor "https://modelarmor.LOCATION.rep.googleapis.com/"
```
Replace LOCATION with the region where you want to use Model Armor.

Run the following command to enable Model Armor.

```
gcloud services enable modelarmor.googleapis.com --project=PROJECT_ID
```

Replace PROJECT_ID with the ID of the project.
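
The two gcloud commands above can be wrapped in a small script so that the endpoint override and the API enablement always use the same values. The following is a minimal sketch, not an official tool: the function names model_armor_endpoint and setup_model_armor are illustrative, and the only gcloud invocations are the two shown above.

```shell
# Sketch only: the function names are illustrative, not part of gcloud.

# Print the regional Model Armor endpoint for the given location.
model_armor_endpoint() {
  printf 'https://modelarmor.%s.rep.googleapis.com/\n' "$1"
}

# Point gcloud at the regional endpoint, then enable the API.
# $1: project ID, $2: location (region)
setup_model_armor() {
  gcloud config set api_endpoint_overrides/modelarmor "$(model_armor_endpoint "$2")" &&
    gcloud services enable modelarmor.googleapis.com --project="$1"
}

# Example call (replace the placeholders with your own values):
# setup_model_armor PROJECT_ID LOCATION
```

Building the endpoint in one place avoids a mismatch between the region in the override URL and the region you intend to use.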

Options when integrating Model Armor

Model Armor offers the following integration options. Each option provides different features and capabilities.

Integration option	Policy enforcer/detector	Configure detections	Inspect only	Inspect and block	Model and cloud coverage
REST API	Detector	Only using templates	Yes	Yes	All models and all clouds
Vertex AI (Preview)	Inline enforcement	Using floor settings or templates	Yes	Yes	Gemini (non-streaming) on Google Cloud
Google Kubernetes Engine	Inline enforcement	Only using templates	Yes	Yes	Models with OpenAI format on Google Cloud
Gemini Enterprise	Inline enforcement	Only using templates	Yes	Yes	All models and all clouds

For the REST API integration option, Model Armor functions only as a detector using templates. It identifies and reports potential policy violations based on predefined templates, but it does not block them itself. Your application calls the Model Armor API, and the response describes potential threats or policy violations found in the AI/LLM interaction. Your application then uses that result to allow or block the action according to your own custom logic.
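
Because the REST API only detects, the allow-or-block decision lives in your application. The following sketch shows one way that decision could look in a shell wrapper. It rests on assumptions that are not stated in this document: that the response body is JSON and that a detected violation is reported as "filterMatchState": "MATCH_FOUND". The function name decide_action and the returned strings are hypothetical; check the API reference for the actual response schema before relying on this.

```shell
# Sketch only: decide_action and its return strings are hypothetical.
# Assumption: a violation appears in the JSON response as
#   "filterMatchState": "MATCH_FOUND"
#
# $1: JSON response body returned by the Model Armor API
# $2: mode, "block" (default) or "inspect" (log only, never block)
decide_action() {
  response="$1"
  mode="${2:-block}"
  if printf '%s' "$response" |
    grep -Eq '"filterMatchState"[[:space:]]*:[[:space:]]*"MATCH_FOUND"'; then
    if [ "$mode" = "block" ]; then
      echo "block"
    else
      echo "allow-and-log"
    fi
  else
    echo "allow"
  fi
}
```

The mode argument mirrors the "Inspect only" and "Inspect and block" columns in the table: in inspect mode the application records the finding but still forwards the prompt or response.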

With the Vertex AI integration option, Model Armor provides inline enforcement using floor settings or templates. This means Model Armor actively enforces policies by intervening directly in the process without requiring modifications to your application code.

Similar to Vertex AI, the GKE and Gemini Enterprise integration options also offer inline enforcement, but only using templates. This means that Model Armor can enforce policies directly within the inference gateway, and within user or agent interactions in Gemini Enterprise instances, without requiring modifications to your application code.

Model Armor and Gemini Enterprise integration sanitizes only the initial user prompt and the final agent or model response. Any intermediate steps that occur between the initial user prompt and the final response generation are not covered by this integration.