This page describes in detail how to sanitize prompts and responses with Model Armor. Model Armor offers a set of filters to protect your AI applications, checking prompts and responses against the configured screening confidence levels.
Before you begin
Create a template following the instructions in Create templates.
Obtain the required permissions
To get the permissions that you need to sanitize prompts and responses, ask your administrator to grant you the following IAM roles on Model Armor:
- Model Armor User (roles/modelarmor.user)
- Model Armor Viewer (roles/modelarmor.viewer)
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Enable APIs
You must enable the Model Armor API before you can use Model Armor.
Console
Enable the Model Armor API.
Select the project where you want to activate Model Armor.
gcloud
Follow these steps to enable the Model Armor API using the Google Cloud CLI:
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Run the following command to set the API endpoint for the Model Armor service.
gcloud config set api_endpoint_overrides/modelarmor "https://modelarmor.LOCATION.rep.googleapis.com/"
Replace LOCATION with the region where you want to use Model Armor.
Run the following command to enable Model Armor.
gcloud services enable modelarmor.googleapis.com --project=PROJECT_ID
Replace PROJECT_ID with the ID of the project.
In the project containing the Sensitive Data Protection template, grant the DLP User role (roles/dlp.user) and DLP Reader role (roles/dlp.reader) to the service agent created as part of the Advanced Sensitive Data Protection step of Create templates. Skip this step if the Sensitive Data Protection template is in the same project as the Model Armor template.
gcloud projects add-iam-policy-binding SDP_PROJECT_ID \
  --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-modelarmor.iam.gserviceaccount.com \
  --role=roles/dlp.user

gcloud projects add-iam-policy-binding SDP_PROJECT_ID \
  --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-modelarmor.iam.gserviceaccount.com \
  --role=roles/dlp.reader
Replace the following:
- SDP_PROJECT_ID: the ID of the project that the advanced Sensitive Data Protection template belongs to.
- PROJECT_NUMBER: the number of the project that the Model Armor template belongs to.
Sanitize prompts
Sanitize prompts to prevent malicious inputs and help ensure safe and appropriate prompts are sent to your LLMs.
Text prompts
Model Armor sanitizes text prompts by analyzing the text and applying different filters to identify and mitigate potential threats.
REST
Use the following command to sanitize a text prompt in Model Armor.
curl -X POST \
-d '{"userPromptData":{"text":"[UNSAFE TEXT]"}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"
Replace the following:
- PROJECT_ID: the ID of the project for the template.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
This results in the following response. Note that MATCH_FOUND is reported in the dangerous category.
{ "sanitizationResult": { "filterMatchState": "MATCH_FOUND", "invocationResult": "SUCCESS", "filterResults": { "csam": { "csamFilterFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, "malicious_uris": { "maliciousUriFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, "rai": { "raiFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "MATCH_FOUND", "raiFilterTypeResults": { "sexually_explicit": { "matchState": "NO_MATCH_FOUND" }, "hate_speech": { "matchState": "NO_MATCH_FOUND" }, "harassment": { "matchState": "NO_MATCH_FOUND" }, "dangerous": { "matchState": "MATCH_FOUND" } } } }, "pi_and_jailbreak": { "piAndJailbreakFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "MATCH_FOUND" } }, "sdp": { "sdpFilterResult": { "inspectResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } } } } } }
Python
To run this code, set up a Python development environment and install the Model Armor Python SDK.
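Here is a minimal Python sketch of the same request. It assumes the google-cloud-modelarmor package; the project, location, and template values are placeholders you must replace.

from google.api_core.client_options import ClientOptions
from google.cloud import modelarmor_v1

# Placeholder values; replace with your project, location, and template.
PROJECT_ID = "my-project"
LOCATION = "us-central1"
TEMPLATE_ID = "my-template"

# Model Armor uses regional endpoints, so point the client at the
# endpoint for the template's location.
client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options=ClientOptions(
        api_endpoint=f"modelarmor.{LOCATION}.rep.googleapis.com"
    ),
)

# Wrap the prompt text in a DataItem and reference the template to apply.
request = modelarmor_v1.SanitizeUserPromptRequest(
    name=f"projects/{PROJECT_ID}/locations/{LOCATION}/templates/{TEMPLATE_ID}",
    user_prompt_data=modelarmor_v1.DataItem(text="[UNSAFE TEXT]"),
)

response = client.sanitize_user_prompt(request=request)
print(response.sanitization_result)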
This results in the following response.
sanitization_result {
  filter_match_state: MATCH_FOUND
  filter_results {
    key: "rai"
    value {
      rai_filter_result {
        execution_state: EXECUTION_SUCCESS
        match_state: MATCH_FOUND
        rai_filter_type_results {
          key: "dangerous"
          value {
            confidence_level: HIGH
            match_state: MATCH_FOUND
          }
        }
      }
    }
  }
  filter_results {
    key: "pi_and_jailbreak"
    value {
      pi_and_jailbreak_filter_result {
        execution_state: EXECUTION_SUCCESS
        match_state: MATCH_FOUND
        confidence_level: HIGH
      }
    }
  }
  filter_results {
    key: "malicious_uris"
    value {
      malicious_uri_filter_result {
        execution_state: EXECUTION_SUCCESS
        match_state: NO_MATCH_FOUND
      }
    }
  }
  filter_results {
    key: "csam"
    value {
      csam_filter_filter_result {
        execution_state: EXECUTION_SUCCESS
        match_state: NO_MATCH_FOUND
      }
    }
  }
  invocation_result: SUCCESS
}
Sanitize text prompts with multi-language detection enabled
Enable multi-language detection on a per-request basis by setting the enableMultiLanguageDetection flag to true for each individual request. Optionally, you can specify the source language for more accurate results. If the source language is not specified, it is automatically detected to provide multi-language support. A Python sketch follows the REST example below.
Use the following command to sanitize a text prompt in Model Armor with multi-language detection enabled at a request level.
curl -X POST \
-d '{"userPromptData":{"text":"[UNSAFE TEXT]"}, "multiLanguageDetectionMetadata": {"enableMultiLanguageDetection": true, "sourceLanguage": "jp"}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"
Replace the following:
- PROJECT_ID: the ID of the project for the template.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
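If you use the Python client library, the request message carries the same metadata. The following is a hedged sketch: the message and field names (MultiLanguageDetectionMetadata, multi_language_detection_metadata, and so on) are assumed to mirror the REST fields above; verify them against the SDK reference. Project, location, and template values are placeholders.

from google.api_core.client_options import ClientOptions
from google.cloud import modelarmor_v1

client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options=ClientOptions(
        api_endpoint="modelarmor.us-central1.rep.googleapis.com"  # placeholder location
    ),
)

# Field names assumed to mirror the REST fields shown above.
request = modelarmor_v1.SanitizeUserPromptRequest(
    name="projects/my-project/locations/us-central1/templates/my-template",  # placeholders
    user_prompt_data=modelarmor_v1.DataItem(text="[UNSAFE TEXT]"),
    multi_language_detection_metadata=modelarmor_v1.MultiLanguageDetectionMetadata(
        enable_multi_language_detection=True,
        source_language="jp",
    ),
)
response = client.sanitize_user_prompt(request=request)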
File-based prompts
To sanitize a prompt that is stored in a file, provide the file content in base64 format. Model Armor doesn't automatically detect the file type. You must explicitly set the byteDataType field to indicate the file format. If the field is missing or not specified, the request fails. The possible byteDataType values are PLAINTEXT_UTF8, PDF, WORD_DOCUMENT, EXCEL_DOCUMENT, POWERPOINT_DOCUMENT, TXT, and CSV.
REST
curl -X POST \
-d "$(jq -n \
  --arg data "$(base64 -w 0 -i sample.pdf)" \
  '{userPromptData: {byteItem: {byteDataType: "FILE_TYPE", byteData: $data}}}')" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
- FILE_TYPE: the format of the input file.
Python
To run this code, set up a Python development environment and install the Model Armor Python SDK.
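Here is a minimal Python sketch of a file-based request. It assumes the google-cloud-modelarmor package and a local sample.pdf; note that, unlike the REST call, the client library takes raw bytes, so no explicit base64 step is needed. Project, location, and template values are placeholders.

from google.api_core.client_options import ClientOptions
from google.cloud import modelarmor_v1

client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options=ClientOptions(
        api_endpoint="modelarmor.us-central1.rep.googleapis.com"  # placeholder location
    ),
)

# Read the file as raw bytes; the client library handles encoding.
with open("sample.pdf", "rb") as f:
    file_bytes = f.read()

# The file type must be set explicitly; Model Armor doesn't detect it.
request = modelarmor_v1.SanitizeUserPromptRequest(
    name="projects/my-project/locations/us-central1/templates/my-template",  # placeholders
    user_prompt_data=modelarmor_v1.DataItem(
        byte_item=modelarmor_v1.ByteDataItem(
            byte_data_type=modelarmor_v1.ByteDataItem.ByteItemType.PDF,
            byte_data=file_bytes,
        )
    ),
)
response = client.sanitize_user_prompt(request=request)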
Basic Sensitive Data Protection configuration
Model Armor integrates with Sensitive Data Protection to help prevent accidental exposure of private information. Create a template with basic Sensitive Data Protection settings enabled. Basic Sensitive Data Protection helps you screen for a fixed set of Sensitive Data Protection infoTypes.
The following Sensitive Data Protection infoTypes are scanned in the prompt for all regions:
- CREDIT_CARD_NUMBER: A credit card number is 12 to 19 digits long. It is used for payment transactions globally.
- FINANCIAL_ACCOUNT_NUMBER: A number referring to a specific financial account, for example, a bank account number or a retirement account number.
- GCP_CREDENTIALS: Google Cloud service account credentials. Credentials that can be used to authenticate with API client libraries and service accounts.
- GCP_API_KEY: Google Cloud API key. An encrypted string that is used when calling Google Cloud APIs that don't need to access private user data.
- PASSWORD: Clear text passwords in configs, code, and other text.
The following additional Sensitive Data Protection infoTypes are scanned in the prompt for US-based regions:
- US_SOCIAL_SECURITY_NUMBER: A United States Social Security number (SSN) is a 9-digit number issued to US citizens, permanent residents, and temporary residents. This detector won't match against numbers with all zeros in any digit group (that is, 000-##-####, ###-00-####, or ###-##-0000), against numbers with 666 in the first digit group, or against numbers whose first digit is 9.
- US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER: A United States Individual Taxpayer Identification Number (ITIN) is a type of Tax Identification Number (TIN) issued by the Internal Revenue Service (IRS). An ITIN is a tax processing number only available for certain nonresident and resident aliens, their spouses, and dependents who cannot get a Social Security Number (SSN).
Here's an example basic Sensitive Data Protection configuration:
gcloud
gcloud model-armor templates create TEMPLATE_ID \
  --location=LOCATION \
  --project=PROJECT_ID \
  --basic-config-filter-enforcement=enabled
Replace the following:
- TEMPLATE_ID: the ID of the template.
- LOCATION: the location of the template.
- PROJECT_ID: the ID of the project that the template belongs to.
REST
export FILTER_CONFIG_SDP_BASIC='{
  "filterConfig": {
    "sdpSettings": {
      "basicConfig": {
        "filterEnforcement": "ENABLED"
      }
    }
  }
}'

curl -X PATCH \
-d "$FILTER_CONFIG_SDP_BASIC" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID?updateMask=filterConfig.sdpSettings.basicConfig.filterEnforcement"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
Python
To run this code, set up a Python development environment and install the Model Armor Python SDK.
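Here is a minimal Python sketch of the equivalent template creation. The nested message names (SdpFilterSettings, SdpBasicConfig) are taken from the generated client and should be verified against the SDK reference; project, location, and template values are placeholders.

from google.api_core.client_options import ClientOptions
from google.cloud import modelarmor_v1

client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options=ClientOptions(
        api_endpoint="modelarmor.us-central1.rep.googleapis.com"  # placeholder location
    ),
)

# Enable the basic Sensitive Data Protection filter on a new template.
request = modelarmor_v1.CreateTemplateRequest(
    parent="projects/my-project/locations/us-central1",  # placeholders
    template_id="my-template",
    template=modelarmor_v1.Template(
        filter_config=modelarmor_v1.FilterConfig(
            sdp_settings=modelarmor_v1.SdpFilterSettings(
                basic_config=modelarmor_v1.SdpBasicConfig(
                    filter_enforcement=modelarmor_v1.SdpBasicConfig.SdpBasicConfigEnforcement.ENABLED
                )
            )
        )
    ),
)
template = client.create_template(request=request)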
Use the template you created to screen your prompts. Here's an example:
curl -X POST \
-d '{"userPromptData":{"text":"can you remember my ITIN : ###-##-####"}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
This example returns the following response:
{ "sanitizationResult": { "filterMatchState": "MATCH_FOUND", "invocationResult": "SUCCESS", "filterResults": [ { "csamFilterFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, { "sdpFilterResult": { "inspectResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "MATCH_FOUND", "findings": [ { "infoType": "US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER", "likelihood": "LIKELY", "location": { "byteRange": { "start": "26", "end": "37" }, "codepointRange": { "start": "26", "end": "37" } } } ] } } } ] } }
Advanced Sensitive Data Protection configuration
Model Armor screens LLM prompts and responses using the advanced Sensitive Data Protection configuration setting. This lets you use Sensitive Data Protection capabilities beyond the infoTypes offered in the basic Sensitive Data Protection setting.
To use the Sensitive Data Protection advanced filter in Model Armor, the Sensitive Data Protection templates must be in the same cloud location as the Model Armor template.
gcloud
gcloud model-armor templates create TEMPLATE_ID \
  --location=LOCATION \
  --advanced-config-inspect-template="path/to/template"
Replace the following:
TEMPLATE_ID
: the ID of the template.LOCATION
: the location of the template.
REST
export FILTER_CONFIG_SDP_ADV='{
  "filterConfig": {
    "sdpSettings": {
      "advancedConfig": {
        "deidentifyTemplate": "projects/PROJECT_ID/locations/LOCATION/deidentifyTemplates/deidentify-ip-address",
        "inspectTemplate": "projects/PROJECT_ID/locations/LOCATION/inspectTemplates/inspect-ip-address"
      }
    }
  }
}'

curl -X PATCH \
-d "$FILTER_CONFIG_SDP_ADV" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID?updateMask=filterConfig.sdpSettings.advancedConfig"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
This example returns the following response:
{ "name": "projects/PROJECT_ID/locations/LOCATION/templates/all-filters-test", "createTime": "2024-12-16T17:08:19.626693819Z", "updateTime": "2024-12-16T17:08:19.626693819Z", "filterConfig": { "sdpSettings": { "advancedConfig": { "deidentifyTemplate": "projects/PROJECT_ID/locations/LOCATION/deidentifyTemplates/deidentify-ip-address", "inspectTemplate": "projects/PROJECT_ID/locations/LOCATION/inspectTemplates/inspect-ip-address" } } } }
Python
To run this code, set up a Python development environment and install the Model Armor Python SDK.
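Here is a minimal Python sketch of creating a template with the advanced configuration. The message names (SdpFilterSettings, SdpAdvancedConfig) are assumptions to verify against the SDK reference; project, location, and template values are placeholders, and the inspect and de-identify template paths mirror the REST example above.

from google.api_core.client_options import ClientOptions
from google.cloud import modelarmor_v1

client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options=ClientOptions(
        api_endpoint="modelarmor.us-central1.rep.googleapis.com"  # placeholder location
    ),
)

# Point the template at existing Sensitive Data Protection inspect and
# de-identify templates in the same location.
request = modelarmor_v1.CreateTemplateRequest(
    parent="projects/my-project/locations/us-central1",  # placeholders
    template_id="my-template",
    template=modelarmor_v1.Template(
        filter_config=modelarmor_v1.FilterConfig(
            sdp_settings=modelarmor_v1.SdpFilterSettings(
                advanced_config=modelarmor_v1.SdpAdvancedConfig(
                    inspect_template="projects/my-project/locations/us-central1/inspectTemplates/inspect-ip-address",
                    deidentify_template="projects/my-project/locations/us-central1/deidentifyTemplates/deidentify-ip-address",
                )
            )
        )
    ),
)
template = client.create_template(request=request)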
Use the template you created to screen your prompts. Here's an example:
curl -X POST \
-d '{"userPromptData":{"text":"is there anything malicious running on 1.1.1.1?"}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
This example returns the following response:
{ "sanitizationResult": { "filterMatchState": "MATCH_FOUND", "invocationResult": "SUCCESS", "filterResults": [ { "csamFilterFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, { "sdpFilterResult": { "deidentifyResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "MATCH_FOUND", "data": { "text": "is there anything malicious running on [IP_ADDRESS]?" }, "transformedBytes": "7", "infoTypes": ["IP_ADDRESS"] } } } ] } }
Sanitize model responses
LLMs can sometimes generate harmful responses. To reduce the risks associated with using LLMs in your applications, it is important to sanitize their responses.
Here's an example command to sanitize a model response in Model Armor.
REST
curl -X POST \
-d '{"text":"IP address of the current network is ##.##.##.##"}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeModelResponse"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.

This example returns the following response:
{ "sanitizationResult": { "filterMatchState": "MATCH_FOUND", "invocationResult": "SUCCESS", "filterResults": { "rai": { "raiFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "MATCH_FOUND", "raiFilterTypeResults": { "dangerous": { "confidenceLevel": "MEDIUM_AND_ABOVE", "matchState": "MATCH_FOUND" }, "sexually_explicit": { "matchState": "NO_MATCH_FOUND" }, "hate_speech": { "matchState": "NO_MATCH_FOUND" }, "harassment": { "matchState": "NO_MATCH_FOUND" } } } }, "pi_and_jailbreak": { "piAndJailbreakFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, "csam": { "csamFilterFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, "malicious_uris": { "maliciousUriFilterResult": { "executionState": "EXECUTION_SUCCESS", "matchState": "NO_MATCH_FOUND" } }, } } }
Python
To run this code, set up a Python development environment and install the Model Armor Python SDK.
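Here is a minimal Python sketch of the same call. It assumes the google-cloud-modelarmor package; the project, location, and template values are placeholders.

from google.api_core.client_options import ClientOptions
from google.cloud import modelarmor_v1

client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options=ClientOptions(
        api_endpoint="modelarmor.us-central1.rep.googleapis.com"  # placeholder location
    ),
)

# Screen the model's output before returning it to the user.
request = modelarmor_v1.SanitizeModelResponseRequest(
    name="projects/my-project/locations/us-central1/templates/my-template",  # placeholders
    model_response_data=modelarmor_v1.DataItem(
        text="IP address of the current network is ##.##.##.##"
    ),
)
response = client.sanitize_model_response(request=request)
print(response.sanitization_result)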
Sanitize model response with multi-language detection enabled
Enable multi-language detection on a per-request basis by setting the enableMultiLanguageDetection flag to true for each individual response. Optionally, you can specify the source language for more accurate results. If the source language is not specified, it is automatically detected to provide multi-language support.
curl -X POST \
-d '{"modelResponseData":{"text":"[UNSAFE TEXT]"}, "multiLanguageDetectionMetadata": {"enableMultiLanguageDetection": true, "sourceLanguage": "jp"}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeModelResponse"
Replace the following:
- PROJECT_ID: the ID of the project that the template belongs to.
- LOCATION: the location of the template.
- TEMPLATE_ID: the ID of the template.
What's next
- Learn more about Model Armor.
- Learn about Model Armor floor settings.
- Learn about Model Armor templates.
- Troubleshoot Model Armor issues.