Replaces an identifier with a surrogate using Format Preserving
Encryption (FPE) with the FFX mode of operation. When used in the
``ReidentifyContent`` API method, it serves the opposite function
by reversing the surrogate back into the original identifier. The
identifier must be encoded as ASCII. For a given crypto key and
context, the same identifier will be replaced with the same surrogate.
Identifiers must be at least two characters long. If the identifier
is the empty string, it will be skipped. See
https://cloud.google.com/dlp/docs/pseudonymization to learn more.
Note: We recommend using ``CryptoDeterministicConfig`` for all use
cases that do not require preserving the input alphabet space and
size and that warrant referential integrity.
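
The input rules above can be checked on the client before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named ``classify_identifier`` that is not part of the DLP client library. It only mirrors the rules stated here. This section does not say how the service treats a one-character identifier, so the helper merely flags that case.

```python
def classify_identifier(identifier: str) -> str:
    """Classify an identifier against the documented input rules.

    Hypothetical helper for illustration only; not part of the DLP API.
    """
    if identifier == "":
        # Documented behaviour: the empty string is skipped.
        return "skipped"
    if not identifier.isascii():
        # Documented rule: the identifier must be encoded as ASCII.
        return "not_ascii"
    if len(identifier) < 2:
        # Documented rule: at least two characters are required.
        return "too_short"
    return "ok"
```

Running such a check first makes it possible to report invalid inputs locally instead of relying on a service-side error.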
.. attribute:: crypto_key
Required. The key used by the encryption algorithm.
Choose the alphabet that the data being transformed is made up of.
A custom alphabet is supported by mapping its characters to the
alphanumeric characters that the FFX mode natively supports. This
mapping happens before and after encryption or decryption. Each
character listed must appear only once. The number of characters
must be in the range [2, 95]. The alphabet must be encoded as ASCII.
The order of characters does not matter.
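
The alphabet constraints above lend themselves to a simple client-side check. The sketch below assumes a hypothetical function ``validate_custom_alphabet`` (not part of the DLP client library) and tests only the three rules stated in this section: ASCII encoding, uniqueness, and a length in [2, 95]. It does not check which specific characters the service accepts.

```python
from typing import List


def validate_custom_alphabet(alphabet: str) -> List[str]:
    """Return a list of rule violations; an empty list means no issue found.

    Hypothetical helper for illustration only; not part of the DLP API.
    """
    problems = []
    if not alphabet.isascii():
        problems.append("alphabet must be encoded as ASCII")
    if len(set(alphabet)) != len(alphabet):
        problems.append("each character must appear only once")
    if not 2 <= len(alphabet) <= 95:
        problems.append("number of characters must be in the range [2, 95]")
    return problems
```

For example, ``validate_custom_alphabet("0123456789-")`` returns an empty list, while ``validate_custom_alphabet("aa")`` reports the duplicate character.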
The custom infoType to annotate the surrogate with. This
annotation will be applied to the surrogate by prefixing it
with the name of the custom infoType followed by the number of
characters comprising the surrogate. The following scheme
defines the format:
``info_type_name(surrogate_character_count):surrogate``. For
example, if the name of the custom infoType is
``MY_TOKEN_INFO_TYPE`` and the surrogate is ``abc``, the full
replacement value will be ``MY_TOKEN_INFO_TYPE(3):abc``. This
annotation identifies the surrogate when inspecting content
using the custom infoType `SurrogateType
</dlp/docs/reference/rest/v2/InspectConfig#surrogatetype>`__.
This facilitates reversal of the surrogate when it occurs in
free text. In order for inspection to work properly, the name
of this infoType must not occur naturally anywhere in your
data; otherwise, inspection may find a surrogate that does not
correspond to an actual identifier. Therefore, choose your
custom infoType name carefully after considering what your
data looks like. One way to select a name that has a high
chance of yielding reliable detection is to include one or
more Unicode characters that are highly improbable to exist in
your data. For example, assuming your data is entered from a
regular ASCII keyboard, the symbol with the code point
U+29DD might be used like so: ⧝MY_TOKEN_TYPE
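
The annotation scheme described above can be illustrated locally. The sketch below assumes two hypothetical helpers, ``annotate_surrogate`` and ``parse_annotation``, that simply build and read back the ``info_type_name(surrogate_character_count):surrogate`` format. It illustrates the string layout only and does not show how the service performs inspection.

```python
import re
from typing import Optional


def annotate_surrogate(info_type_name: str, surrogate: str) -> str:
    """Prefix a surrogate with the infoType name and its character count."""
    return f"{info_type_name}({len(surrogate)}):{surrogate}"


def parse_annotation(text: str, info_type_name: str) -> Optional[str]:
    """Return the surrogate if text starts with a well-formed annotation.

    Hypothetical helper for illustration only; not part of the DLP API.
    """
    match = re.match(re.escape(info_type_name) + r"\((\d+)\):", text)
    if match is None:
        return None
    count = int(match.group(1))
    # The declared count tells us where the surrogate ends in free text.
    surrogate = text[match.end():match.end() + count]
    if len(surrogate) != count:
        return None
    return surrogate


annotated = annotate_surrogate("MY_TOKEN_INFO_TYPE", "abc")
```

Because the character count is part of the prefix, the surrogate can be recovered even when other text follows it directly, which is what makes reversal in free text feasible.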