Model Armor can be integrated with Google Kubernetes Engine (GKE) through Service Extensions. Service Extensions let you add custom logic to the network traffic processing path. A traffic extension is a type of service extension that lets you call an external service to process traffic. Traffic extensions can be attached to various Google Cloud services, including load balancers. You can configure a traffic extension on an Application Load Balancer, including GKE Inference Gateway, to screen traffic to and from a GKE cluster. As a result, every prompt and response that passes through that load balancer is screened by Model Armor. For more information, see Configure a traffic extension to call a Model Armor service.
How it works
- You configure a service extension on a load balancer that routes traffic to an LLM hosted in your GKE cluster. This configuration specifies that Model Armor should be used to screen prompts and responses.
- When prompts and responses reach the load balancer, the service extension calls the Model Armor service.
- Model Armor applies the configured security policies to the prompts and responses, and identifies and blocks content that violates them, such as malicious or harmful content.
- Only prompts that pass the Model Armor checks are forwarded to the GKE cluster, and only responses that pass the checks are returned to the client.
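
The flow above can be sketched as a small, self-contained simulation. This is not the Model Armor API or a Service Extensions configuration: `screen`, `handle_request`, `echo_model`, the phrase blocklist, and the 200/403 status values are all hypothetical stand-ins, chosen only to illustrate the order in which prompts and responses are checked.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stand-in for Model Armor policies: a simple phrase blocklist.
# The real service applies the security policies you configure.
BLOCKED_PHRASES = ("ignore previous instructions",)


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


def screen(text: str) -> Verdict:
    """Hypothetical screening call, standing in for the Model Armor service."""
    lowered = text.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            return Verdict(False, f"matched blocked phrase: {phrase!r}")
    return Verdict(True)


def handle_request(prompt: str, model: Callable[[str], str]) -> Tuple[int, str]:
    """Mimic the load balancer path: screen prompt, call model, screen response.

    The status values are illustrative assumptions, not documented behavior.
    """
    # The prompt is screened before it reaches the model.
    if not screen(prompt).allowed:
        return 403, "Prompt blocked"

    # Only a prompt that passed screening is forwarded to the backend.
    response = model(prompt)

    # The response is screened before it is returned to the client.
    if not screen(response).allowed:
        return 403, "Response blocked"

    return 200, response


def echo_model(prompt: str) -> str:
    """Hypothetical LLM backend that echoes the prompt."""
    return f"echo: {prompt}"
```

In this sketch, `handle_request("Hello", echo_model)` forwards the prompt and returns the model output, while a prompt containing a blocked phrase is rejected before `echo_model` is ever called.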