Sanitize prompts and responses

Model Armor screens prompts and responses against the confidence levels configured for each filter in your template. This page describes how to sanitize prompts and responses in detail.

Before you begin, create a template following the instructions in Create templates.
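
The examples on this page use shell variables for the project, location, and template. If you want to run them as written, export the variables first. The values shown here are examples only; the template ID matches the one created in Create templates:

export PROJECT_ID=my-project            # example project ID; use your own
export LOCATION=us-central1             # example location
export TEMPLATE_ID=ma-template-id-1234  # template created earlier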

Sanitize prompts

Model Armor sanitizes prompts in text and file-based formats.

Text prompts

Use the following command to sanitize a text prompt with Model Armor. This example uses the template (ma-template-id-1234) created in step 7 - Advanced Sensitive Data Protection of Create templates.

curl -X POST \
-d  "{user_prompt_data: { text: 'How do I make a bomb?' } }" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.$LOCATION.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION/templates/$TEMPLATE_ID:sanitizeUserPrompt"

Replace the following:

  • PROJECT_ID: the ID of the project that the template belongs to.
  • LOCATION: the location of the template.
  • TEMPLATE_ID: the ID of the template.

The gcloud auth print-access-token command supplies the access token for the specified account.

This command returns the following response. Note that the overall filterMatchState is MATCH_FOUND, reflecting matches in the Responsible AI dangerous category and in the prompt injection and jailbreak filter.

{
 "sanitizationResult": {
   "filterMatchState": "NO_MATCH_FOUND",
   "invocationResult": "SUCCESS",
   "filterResults": {
     "csam": {
       "csamFilterFilterResult": {
         "executionState": "EXECUTION_SUCCESS",
         "matchState": "NO_MATCH_FOUND"
       }
     },
     "malicious_uris": {
       "maliciousUriFilterResult": {
         "executionState": "EXECUTION_SUCCESS",
         "matchState": "NO_MATCH_FOUND"
       }
     },
     "rai": {
       "raiFilterResult": {
         "executionState": "EXECUTION_SUCCESS",
         "matchState": "MATCH_FOUND",
         "raiFilterTypeResults": {
           "sexually_explicit": {
             "matchState": "NO_MATCH_FOUND"
           },
           "hate_speech": {
             "matchState": "NO_MATCH_FOUND"
           },
           "harassment": {
             "matchState": "NO_MATCH_FOUND"
           },
           "dangerous": {
             "matchState": "MATCH_FOUND"
           }
         }
       }
     },
     "pi_and_jailbreak": {
       "piAndJailbreakFilterResult": {
         "executionState": "EXECUTION_SUCCESS",
         "matchState": "MATCH_FOUND"
       }
     },
     "sdp": {
       "sdpFilterResult": {
         "inspectResult": {
           "executionState": "EXECUTION_SUCCESS",
           "matchState": "NO_MATCH_FOUND"
         }
       }
     }
   }
 }
}
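
If you're scripting against the API, the top-level filterMatchState field tells you whether any filter matched. For example, with jq and the response saved to response.json (a hypothetical file name):

jq -r '.sanitizationResult.filterMatchState' response.json
# Prints MATCH_FOUND or NO_MATCH_FOUND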

Basic Sensitive Data Protection configuration

Create a template with basic Sensitive Data Protection settings enabled. Basic Sensitive Data Protection helps you screen for the following Sensitive Data Protection infoTypes:

  • CREDIT_CARD_NUMBER: A credit card number is 12 to 19 digits long and is used for payment transactions globally.
  • US_SOCIAL_SECURITY_NUMBER: A United States Social Security number (SSN) is a 9-digit number issued to US citizens, permanent residents, and temporary residents. This detector won't match against numbers with all zeros in any digit group (that is, 000-##-####, ###-00-####, or ###-##-0000), against numbers with 666 in the first digit group, or against numbers whose first digit is 9.
  • FINANCIAL_ACCOUNT_NUMBER: A number referring to a specific financial account—for example, a bank account number or a retirement account number.
  • US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER: A United States Individual Taxpayer Identification Number (ITIN) is a type of Tax Identification Number (TIN) issued by the Internal Revenue Service (IRS). An ITIN is a tax processing number only available for certain nonresident and resident aliens, their spouses, and dependents who cannot get a Social Security Number (SSN).
  • GCP_CREDENTIALS: Google Cloud service account credentials. Credentials that can be used to authenticate with Google API client libraries and service accounts.
  • GCP_API_KEY: Google Cloud API key. An encrypted string that is used when calling Google Cloud APIs that don't need to access private user data.

Here's an example basic Sensitive Data Protection configuration:

gcloud

gcloud alpha model-armor templates create template-name \
--location=location \
--basic-config-filter-enforcement=enabled

REST APIs

export FILTER_CONFIG_SDP_BASIC='{
  "filterConfig": {
    "sdpSettings": {
      "basicConfig": {
        "filterEnforcement": "ENABLED"
      }
    }
  }
}'

curl -X POST \
-d "$FILTER_CONFIG_SDP_BASIC" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.$LOCATION.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION/templates?template_id=sdp_basic"

Replace the following:

  • PROJECT_ID: the ID of the project that the template belongs to.
  • LOCATION: the location of the template.

The template ID is set to sdp_basic by the template_id query parameter, and the gcloud auth print-access-token command supplies the access token for the specified account.
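
To confirm the template was created, you can fetch it back. Assuming standard GET semantics on the templates resource, the following returns its configuration:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.$LOCATION.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION/templates/sdp_basic"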

Use the template you created to screen your prompts. Here's an example:

curl -X POST \
-d  "{ user_prompt_data: { 'text': 'can you remember my ITIN : 988-86-1234'} }" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.us-central1.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/us-central1/templates/sdp_basic:sanitizeUserPrompt"

This example returns the following response:

{
  "sanitizationResult": {
    "filterMatchState": "MATCH_FOUND",
    "invocationResult": "SUCCESS",
    "filterResults": [
      {
        "csamFilterFilterResult": {
          "executionState": "EXECUTION_SUCCESS",
          "matchState": "NO_MATCH_FOUND"
        }
      },
      {
        "sdpFilterResult": {
          "inspectResult": {
            "executionState": "EXECUTION_SUCCESS",
            "matchState": "MATCH_FOUND",
            "findings": [
              {
                "infoType": "US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER",
                "likelihood": "LIKELY",
                "location": {
                  "byteRange": {
                    "start": "26",
                    "end": "37"
                  },
                  "codepointRange": {
                    "start": "26",
                    "end": "37"
                  }
                }
              }
            ]
          }
        }
      }
    ]
  }
}
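
To list only the Sensitive Data Protection findings from a response in this shape, you can iterate the filterResults array with jq; response.json is again a hypothetical file holding the response:

jq -r '.sanitizationResult.filterResults[]
  | .sdpFilterResult.inspectResult.findings[]?
  | "\(.infoType) \(.likelihood) bytes \(.location.byteRange.start)-\(.location.byteRange.end)"' response.json
# US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER LIKELY bytes 26-37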

Advanced Sensitive Data Protection configuration

Model Armor lets you screen LLM prompts and responses with Sensitive Data Protection templates through the advanced Sensitive Data Protection configuration setting. This gives you access to Sensitive Data Protection capabilities beyond the infoTypes offered in the basic setting.

To use the advanced Sensitive Data Protection filter in Model Armor, the Sensitive Data Protection templates must be in the same cloud location as the Model Armor template, for example, us-central1.
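
The advanced configuration below references two Sensitive Data Protection templates named inspect-ip-address and deidentify-ip-address; these names are examples. As a sketch, you could create them with the Sensitive Data Protection (DLP) REST API, assuming an inspect template that matches the IP_ADDRESS infoType and a de-identify template that replaces matches with the infoType name:

curl -X POST \
-d '{"templateId": "inspect-ip-address", "inspectTemplate": {"inspectConfig": {"infoTypes": [{"name": "IP_ADDRESS"}]}}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://dlp.googleapis.com/v2/projects/$PROJECT_ID/locations/$LOCATION/inspectTemplates"

curl -X POST \
-d '{"templateId": "deidentify-ip-address", "deidentifyTemplate": {"deidentifyConfig": {"infoTypeTransformations": {"transformations": [{"primitiveTransformation": {"replaceWithInfoTypeConfig": {}}}]}}}}' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://dlp.googleapis.com/v2/projects/$PROJECT_ID/locations/$LOCATION/deidentifyTemplates"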

gcloud

gcloud alpha model-armor templates create template-name \
--location=location \
--advanced-config-inspect-template="path/to/template"

REST APIs

# Use double quotes so that $PROJECT_ID and $LOCATION expand.
export FILTER_CONFIG_SDP_ADV="{
  \"filterConfig\": {
    \"sdpSettings\": {
      \"advancedConfig\": {
        \"deidentifyTemplate\": \"projects/$PROJECT_ID/locations/$LOCATION/deidentifyTemplates/deidentify-ip-address\",
        \"inspectTemplate\": \"projects/$PROJECT_ID/locations/$LOCATION/inspectTemplates/inspect-ip-address\"
      }
    }
  }
}"

curl -X POST \
-d "$FILTER_CONFIG_SDP_ADV" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.$LOCATION.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION/templates?template_id=sdp_advanced"

# Result of CreateTemplate
{
  "name": "projects/$PROJECT_ID/locations/$LOCATION/templates/sdp_advanced",
  "createTime": "2024-12-16T17:08:19.626693819Z",
  "updateTime": "2024-12-16T17:08:19.626693819Z",
  "filterConfig": {
    "sdpSettings": {
      "advancedConfig": {
        "deidentifyTemplate": "projects/$PROJECT_ID/locations/$LOCATION/deidentifyTemplates/deidentify-ip-address",
        "inspectTemplate": "projects/$PROJECT_ID/locations/$LOCATION/inspectTemplates/inspect-ip-address"
      }
    }
  },
  "serviceAgentEmail": "service-PROJECT_NUMBER@gcp-sa-modelarmor.iam.gserviceaccount.com"
}

Replace the following:

  • PROJECT_ID: the ID of the project that the template belongs to.
  • LOCATION: the location of the template.

In the project that contains the Sensitive Data Protection template, grant the DLP User role (roles/dlp.user) and the DLP Reader role (roles/dlp.reader) to the service agent created as part of step 7 - Advanced Sensitive Data Protection of Create templates. You can skip this step if the Sensitive Data Protection template is in the same project as the Model Armor template.

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:service-$PROJECT_NUMBER@gcp-sa-modelarmor.iam.gserviceaccount.com --role roles/dlp.user

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:service-$PROJECT_NUMBER@gcp-sa-modelarmor.iam.gserviceaccount.com --role roles/dlp.reader

Replace the following:

  • PROJECT_ID: the ID of the project that the Sensitive Data Protection template belongs to.
  • PROJECT_NUMBER: the number of the project that contains the Model Armor template.
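
One way to verify the bindings is to list the roles granted to the service agent:

gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:service-$PROJECT_NUMBER@gcp-sa-modelarmor.iam.gserviceaccount.com" \
  --format="value(bindings.role)"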

Use the template you created to screen your prompts. Here's an example:

curl -X POST \
-d  "{ user_prompt_data: { 'text': 'is there anything malicious running on 1.1.1.1?'} }" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"

Replace the following:

  • PROJECT_ID: the ID of the project that the template belongs to.
  • LOCATION: the location of the template.
  • TEMPLATE_ID: the ID of the template.

The gcloud auth print-access-token command supplies the access token for the specified account.

This example returns the following response:

{
  "sanitizationResult": {
    "filterMatchState": "MATCH_FOUND",
    "invocationResult": "SUCCESS",
    "filterResults": [
      {
        "csamFilterFilterResult": {
          "executionState": "EXECUTION_SUCCESS",
          "matchState": "NO_MATCH_FOUND"
        }
      },
      {
        "sdpFilterResult": {
          "deidentifyResult": {
            "executionState": "EXECUTION_SUCCESS",
            "matchState": "MATCH_FOUND",
            "data": {
              "text": "is there anything malicious running on [IP_ADDRESS]?"
            },
            "transformedBytes": "7"
          }
        }
      }
    ]
  }
}
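
Similarly, you can pull the de-identified text out of the response with jq; response.json is a hypothetical file holding the response:

jq -r '.sanitizationResult.filterResults[]
  | .sdpFilterResult.deidentifyResult.data.text // empty' response.json
# is there anything malicious running on [IP_ADDRESS]?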

File-based prompts

Use the following command to sanitize a file-based user prompt with Model Armor. Files must be passed as Base64-encoded content.

curl -X POST \
-d "$(jq -n \
  --arg data "$(base64 -w 0 -i sample.pdf)" \
  '{userPromptData: {byteItem: {byteDataType: "PDF", byteData: $data}}}')" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.LOCATION.rep.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/templates/TEMPLATE_ID:sanitizeUserPrompt"

Replace the following:

  • PROJECT_ID: the ID of the project that the template belongs to.
  • LOCATION: the location of the template.
  • TEMPLATE_ID: the ID of the template.

The gcloud auth print-access-token command supplies the access token for the specified account.
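
Note that the -w 0 flag of base64 (which disables line wrapping) is specific to GNU coreutils, so the command above runs as written on Linux. On macOS, base64 does not wrap its output by default and takes the input file with -i, so an equivalent encoding step would be:

jq -n \
  --arg data "$(base64 -i sample.pdf)" \
  '{userPromptData: {byteItem: {byteDataType: "PDF", byteData: $data}}}'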

Sanitize model responses

Here's an example command to sanitize a model response with Model Armor.

curl -X POST \
-d  "{model_response_data: { text: 'It might hurt and cause pain' } }" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://modelarmor.us-central1.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/us-central1/templates/ma-template-id-1234:sanitizeModelResponse"

This example returns the following response:

{
  "sanitizationResult": {
    "filterMatchState": "MATCH_FOUND",
    "invocationResult": "SUCCESS",
    "filterResults": {
      "rai": {
        "raiFilterResult": {
          "executionState": "EXECUTION_SUCCESS",
          "matchState": "MATCH_FOUND",
          "raiFilterTypeResults": {
            "dangerous": {
              "confidenceLevel": "MEDIUM_AND_ABOVE",
              "matchState": "MATCH_FOUND"
            },
            "sexually_explicit": {
              "matchState": "NO_MATCH_FOUND"
            },
            "hate_speech": {
              "matchState": "NO_MATCH_FOUND"
            },
            "harassment": {
              "matchState": "NO_MATCH_FOUND"
            }
          }
        }
      },
      "pi_and_jailbreak": {
        "piAndJailbreakFilterResult": {
          "executionState": "EXECUTION_SUCCESS",
          "matchState": "NO_MATCH_FOUND"
          }
      },
"csam": {
        "csamFilterFilterResult": {
          "executionState": "EXECUTION_SUCCESS",
          "matchState": "NO_MATCH_FOUND"
        }
      },
      "malicious_uris": {
        "maliciousUriFilterResult": {
          "executionState": "EXECUTION_SUCCESS",
          "matchState": "NO_MATCH_FOUND"
        }
      },
    }
  }
}
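
In a serving path, a typical pattern is to call sanitizeModelResponse before returning the model output and only release it when no filter matched. Here's a minimal sketch, assuming the variables exported earlier and a hypothetical MODEL_OUTPUT variable holding the LLM response:

# Build the request body safely with jq, then gate on filterMatchState.
BODY=$(jq -n --arg text "$MODEL_OUTPUT" '{model_response_data: {text: $text}}')
RESULT=$(curl -s -X POST \
  -d "$BODY" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://modelarmor.$LOCATION.rep.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION/templates/$TEMPLATE_ID:sanitizeModelResponse")

if [ "$(jq -r '.sanitizationResult.filterMatchState' <<< "$RESULT")" = "NO_MATCH_FOUND" ]; then
  printf '%s\n' "$MODEL_OUTPUT"   # no filters matched; return the response
else
  echo "Response blocked by Model Armor policy."   # handle per your own policy
fi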

What's next