Deidentify(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Create a de-identified copy of a storage bucket. Only compatible with Cloud Storage buckets.
A TransformationDetail will be created for each transformation.
Compatible with: Inspection of Cloud Storage
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
Attributes

Name | Description |
---|---|
transformation_config |
google.cloud.dlp_v2.types.TransformationConfig
User specified deidentify templates and configs for structured, unstructured, and image files. |
transformation_details_storage_config |
google.cloud.dlp_v2.types.TransformationDetailsStorageConfig
Config for storing transformation details. This field specifies how detailed metadata about each transformation performed during de-identification is stored. The metadata is stored separately from the de-identified content and provides a granular record of both successful transformations and any failures. Enable this configuration if you need comprehensive information about the status, outcome, and specifics of each transformation. The details are captured in a [TransformationDetails][google.privacy.dlp.v2.TransformationDetails] message for each operation.
Key use cases:
- **Auditing and compliance**
- Provides a verifiable audit trail of de-identification
activities, which is crucial for meeting regulatory
requirements and internal data governance policies.
- Logs what data was transformed, what transformations
were applied, when they occurred, and their success
status. This helps demonstrate accountability and due
diligence in protecting sensitive data.
- **Troubleshooting and debugging**
- Offers detailed error messages and context if a
transformation fails. This information is useful for
diagnosing and resolving issues in the
de-identification pipeline.
- Helps pinpoint the exact location and nature of
failures, speeding up the debugging process.
- **Process verification and quality assurance**
- Allows users to confirm that de-identification rules
and transformations were applied correctly and
consistently across the dataset as intended.
- Helps in verifying the effectiveness of the chosen
de-identification strategies.
- **Data lineage and impact analysis**
- Creates a record of how data elements were modified,
contributing to data lineage. This is useful for
understanding the provenance of de-identified data.
- Aids in assessing the potential impact of
de-identification choices on downstream analytical
processes or data usability.
- **Reporting and operational insights**
- You can analyze the metadata stored in a queryable
BigQuery table to generate reports on transformation
success rates, common error types, processing volumes
(e.g., transformedBytes), and the types of
transformations applied.
- These insights can inform optimization of
de-identification configurations and resource
planning.
To take advantage of these benefits, set this configuration.
The stored details include a description of the
transformation, success or error codes, error messages, the
number of bytes transformed, the location of the transformed
content, and identifiers for the job and source data.
|
cloud_storage_output |
str
Required. User-settable Cloud Storage bucket and folders to store de-identified files. This field must be set for Cloud Storage de-identification. The output Cloud Storage bucket must be different from the input bucket. De-identified files will overwrite files in the output path. Form of: gs://bucket/folder/ or gs://bucket. This field is a member of `oneof`_ ``output``.
|
file_types_to_transform |
MutableSequence[google.cloud.dlp_v2.types.FileType]
List of user-specified file type groups to transform. If specified, only files of these types are transformed. If empty, all supported files are transformed. Supported types may be added automatically over time. If this field contains a file type that the Deidentify action doesn't support, the job fails and is not created or started. Currently the only file types supported are: IMAGES, TEXT_FILES, CSV, TSV. |
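
The constraints on ``cloud_storage_output`` and ``file_types_to_transform`` described above can be checked locally before a job is submitted. The following is a minimal sketch, not part of the client library: the helper names ``bucket_of`` and ``check_deidentify_settings`` are hypothetical, the bucket-name pattern is a simplification of the real Cloud Storage naming rules, and the file type names are taken from the wording of the field description, which may not match the enum member names in ``google.cloud.dlp_v2.types.FileType``.

```python
import re

# Assumption: names as written in the field description above; the library's
# FileType enum members may be spelled differently.
SUPPORTED_FILE_TYPES = {"IMAGES", "TEXT_FILES", "CSV", "TSV"}

# Accepts gs://bucket or gs://bucket/folder/. The bucket-name rule here is a
# simplified approximation of the Cloud Storage naming requirements.
_GCS_PATH = re.compile(
    r"^gs://(?P<bucket>[a-z0-9][a-z0-9._-]{1,220}[a-z0-9])(?P<path>/.*)?$"
)


def bucket_of(gcs_path: str) -> str:
    """Return the bucket name from a gs:// path, or raise ValueError."""
    match = _GCS_PATH.match(gcs_path)
    if match is None:
        raise ValueError(f"not of the form gs://bucket or gs://bucket/folder/: {gcs_path!r}")
    return match.group("bucket")


def check_deidentify_settings(input_path, cloud_storage_output, file_types_to_transform=()):
    """Hypothetical pre-flight check mirroring the documented constraints."""
    # The output bucket must be different from the input bucket.
    if bucket_of(cloud_storage_output) == bucket_of(input_path):
        raise ValueError("output bucket must be different from the input bucket")
    # An unsupported file type causes the job to fail, so reject it early.
    unsupported = sorted(set(file_types_to_transform) - SUPPORTED_FILE_TYPES)
    if unsupported:
        raise ValueError(f"unsupported file types: {unsupported}")


# An empty file type list means all supported files are transformed.
check_deidentify_settings("gs://raw-data/logs/", "gs://deid-data/logs/")
check_deidentify_settings("gs://raw-data", "gs://deid-data/out/", ["CSV", "TSV"])
```

A check like this only catches the two documented failure modes early. The service remains the authority on which paths and file types are accepted.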