Sensitive Data Protection's built-in infoType detectors are effective at finding common types of sensitive data. Custom infoType detectors enable you to fully customize your own sensitive data detector. Inspection rules help refine the scan results that the Sensitive Data Protection returns by modifying the detection mechanism of a given infoType detector.
If you want to exclude or include more values from the results that are returned by a built-in infoType detector, you can create a new custom infoType from scratch and define all the criteria that Sensitive Data Protection should look for. Alternatively, you can refine the findings that Sensitive Data Protection's built-in or custom detectors return according to criteria that you specify. You can do this by adding inspection rules that can help reduce noise, increase precision and recall, or adjust likelihood certainty of scan findings.
This topic discusses how to use the two types of inspection rules to either exclude certain findings or add additional findings, all according to custom criteria that you specify. Presented in this topic are several scenarios in which you might want to alter an existing infoType detector.
The two types of inspection rules are:
- Exclusion rules, which help exclude false or unwanted findings.
- Hotword rules, which help detect additional findings.
Exclusion rules
Exclusion rules are useful in situations like the following:
- You want to exclude duplicate scan matches in results that are caused by overlapping infoType detectors. For example, you're scanning for email addresses and phone numbers, but you are receiving two hits for email addresses with phone numbers in them, such as "206-555-0764@example.org."
- You're experiencing noise in your scan results. For example, you're seeing the same dummy email address (such as example@example.com") or domain (such as "example.com") returned an inordinate number of times by a scan for legitimate email addresses.
- You have a list of terms, phrases, or combination of characters that you want to exclude from results.
- You want to exclude an entire column of data from results.
- You want to exclude findings that are near a string that matches a regular expression.
Exclusion rules API overview
Sensitive Data Protection defines an exclusion rule in the
ExclusionRule
object. Within ExclusionRule
you specify one of the following:
- A
Dictionary
object, which contains a list of strings to exclude from the results. - A
Regex
object, which defines a regular expression pattern. Strings that match the pattern are excluded from the results. - An
ExcludeInfoTypes
object, which contains an array of infoType detectors. If a finding is matched by any of the infoType detectors listed here, the finding is excluded from the results. An
ExcludeByHotword
object, which contains the following:- A regular expression that defines the hotword.
- A proximity value that defines how near the hotword must be to the finding.
If the finding is within the set proximity, that finding is excluded from the results. For tables, this exclusion rule type lets you exclude an entire column of data from the results.
Exclusion rule example scenarios
Each of the following JSON snippets illustrates how to configure Sensitive Data Protection for the given scenario.
Omit specific email address from EMAIL_ADDRESS detector scan
The following JSON snippet and code in several languages illustrates how to
indicate to Sensitive Data Protection using an
InspectConfig
that it should avoid matching on
"example@example.com" in a scan that uses the infoType detector EMAIL_ADDRESS
:
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
...
"inspectConfig":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"ruleSet":[
{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"rules":[
{
"exclusionRule":{
"dictionary":{
"wordList":{
"words":[
"example@example.com"
]
}
},
"matchingType": "MATCHING_TYPE_FULL_MATCH"
}
}
]
}
]
}
...
Omit email addresses ending with a specific domain from EMAIL_ADDRESS detector scan
The following JSON snippet and code in several languages illustrates how to
indicate to Sensitive Data Protection using an
InspectConfig
that it should avoid matching on any email addresses that end with
"@example.com" in a scan that uses the infoType detector EMAIL_ADDRESS
:
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
...
"inspectConfig":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"ruleSet":[
{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"rules":[
{
"exclusionRule":{
"regex":{
"pattern":".+@example.com"
},
"matchingType": "MATCHING_TYPE_FULL_MATCH"
}
}
]
}
]
}
...
Omit scan matches that include the substring "TEST"
The following JSON snippet and code in several languages illustrates how to
indicate to Sensitive Data Protection using an
InspectConfig
that it should exclude any findings that include the token "TEST" from the
specified list of infoTypes.
Note that this matches on "TEST" as a token, not a substring, so that although something like "TEST@email.com" will match, "TESTER@email.com" will not. If matching on a substring is desired, use a regex in the exclusion rule instead of a dictionary.
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
...
"inspectConfig":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
},
{
"name":"DOMAIN_NAME"
},
{
"name":"PHONE_NUMBER"
},
{
"name":"PERSON_NAME"
}
],
"ruleSet":[
{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
},
{
"name":"DOMAIN_NAME"
},
{
"name":"PHONE_NUMBER"
},
{
"name":"PERSON_NAME"
}
],
"rules":[
{
"exclusionRule":{
"dictionary":{
"wordList":{
"words":[
"TEST"
]
}
},
"matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
}
}
]
}
]
}
...
Omit scan matches that include the substring "Jimmy" from a custom infoType detector scan
The following JSON snippet and code in several languages illustrates how to
indicate to Sensitive Data Protection using an
InspectConfig
that it should avoid matching on the name "Jimmy" in a scan that
uses the specified custom regex detector:
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
...
"inspectConfig":{
"customInfoTypes":[
{
"infoType":{
"name":"CUSTOM_NAME_DETECTOR"
},
"regex":{
"pattern":"[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}"
}
}
],
"ruleSet":[
{
"infoTypes":[
{
"name":"CUSTOM_NAME_DETECTOR"
}
],
"rules":[
{
"exclusionRule":{
"dictionary":{
"wordList":{
"words":[
"Jimmy"
]
}
},
"matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
}
}
]
}
]
}
...
Omit scan matches from a PERSON_NAME detector scan that overlap with a custom detector
In this scenario, the user does not want a match from a Sensitive Data Protection
scan using the PERSON_NAME
built-in detector returned if the match would also
be matched in a scan using the custom regex detector defined in the first part
of the snippet.
The following JSON snippet and code in several languages specifies both a custom
regex detector and an exclusion rule in the
InspectConfig
.
The custom regex detector specifies the names to exclude from results. The
exclusion rule specifies that if any results returned from a scan for
PERSON_NAME
would also be matched by the custom regex detector, they are
omitted. Note that VIP_DETECTOR
in this case is marked as
EXCLUSION_TYPE_EXCLUDE
, so it will not produce results itself. It will only
affect results produced by the PERSON_NAME
detector.
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
...
"inspectConfig":{
"infoTypes":[
{
"name":"PERSON_NAME"
}
],
"customInfoTypes":[
{
"infoType":{
"name":"VIP_DETECTOR"
},
"regex":{
"pattern":"Dana Williams|Quinn Garcia"
},
"exclusionType":"EXCLUSION_TYPE_EXCLUDE"
}
],
"ruleSet":[
{
"infoTypes":[
{
"name":"PERSON_NAME"
}
],
"rules":[
{
"exclusionRule":{
"excludeInfoTypes":{
"infoTypes":[
{
"name":"VIP_DETECTOR"
}
]
},
"matchingType": "MATCHING_TYPE_FULL_MATCH"
}
}
]
}
]
}
...
Omit matches on PERSON_NAME detector if also matched by EMAIL_ADDRESS detector
The following JSON snippet and code in several languages illustrate how to
indicate to Sensitive Data Protection using an
InspectConfig
that it should only return one match in the case that matches for the
PERSON_NAME
detector overlap with matches for the EMAIL_ADDRESS
detector.
Doing this is to avoid the situation where an email address such as
"james@example.com" matches on both the PERSON_NAME
and EMAIL_ADDRESS
detectors.
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
...
"inspectConfig":{
"infoTypes":[
{
"name":"PERSON_NAME"
},
{
"name":"EMAIL_ADDRESS"
}
],
"ruleSet":[
{
"infoTypes":[
{
"name":"PERSON_NAME"
}
],
"rules":[
{
"exclusionRule":{
"excludeInfoTypes":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
]
},
"matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
}
}
]
}
]
}
...
Omit matches on domain names that are part of email addresses in a DOMAIN_NAME detector scan
The following JSON snippet and code in several languages illustrate how to
indicate to Sensitive Data Protection using an
InspectConfig
that it should only return matches for a DOMAIN_NAME
detector scan if the
match does not overlap with a match in an EMAIL_ADDRESS
detector scan. In
this scenario, the main scan is a DOMAIN_NAME
detector scan. The user does not
want a domain name match returned in findings if the domain name is used in an
email address:
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.