Class Deidentify (3.30.0)

Deidentify(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Create a de-identified copy of a storage bucket. Only compatible with Cloud Storage buckets.

A TransformationDetail will be created for each transformation.

Compatible with: Inspection of Cloud Storage

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

Attributes

Name Description
transformation_config google.cloud.dlp_v2.types.TransformationConfig
User specified deidentify templates and configs for structured, unstructured, and image files.
transformation_details_storage_config google.cloud.dlp_v2.types.TransformationDetailsStorageConfig
Config for storing transformation details.

This field specifies the configuration for storing detailed metadata about each transformation performed during a de-identification process. The metadata is stored separately from the de-identified content itself and provides a granular record of both successful transformations and any failures that occurred. Enabling this configuration is essential for users who need to access comprehensive information about the status, outcome, and specifics of each transformation. The details are captured in the [TransformationDetails][google.privacy.dlp.v2.TransformationDetails] message for each operation.

Key use cases:

- **Auditing and compliance**

  - Provides a verifiable audit trail of de-identification activities, which is crucial for meeting regulatory requirements and internal data governance policies.
  - Logs what data was transformed, what transformations were applied, when they occurred, and their success status. This helps demonstrate accountability and due diligence in protecting sensitive data.

- **Troubleshooting and debugging**

  - Offers detailed error messages and context if a transformation fails. This information is useful for diagnosing and resolving issues in the de-identification pipeline.
  - Helps pinpoint the exact location and nature of failures, speeding up the debugging process.

- **Process verification and quality assurance**

  - Allows users to confirm that de-identification rules and transformations were applied correctly and consistently across the dataset as intended.
  - Helps in verifying the effectiveness of the chosen de-identification strategies.

- **Data lineage and impact analysis**

  - Creates a record of how data elements were modified, contributing to data lineage. This is useful for understanding the provenance of de-identified data.
  - Aids in assessing the potential impact of de-identification choices on downstream analytical processes or data usability.

- **Reporting and operational insights**

  - You can analyze the metadata stored in a queryable BigQuery table to generate reports on transformation success rates, common error types, processing volumes (e.g., transformedBytes), and the types of transformations applied.
  - These insights can inform optimization of de-identification configurations and resource planning.

To take advantage of these benefits, set this configuration. The stored details include a description of the transformation, success or error codes, error messages, the number of bytes transformed, the location of the transformed content, and identifiers for the job and source data.
cloud_storage_output str
Required. User-settable Cloud Storage bucket and folders to store de-identified files. This field must be set for Cloud Storage de-identification. The output Cloud Storage bucket must be different from the input bucket. De-identified files will overwrite files in the output path. Form of: gs://bucket/folder/ or gs://bucket. This field is a member of oneof_ output.
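
The constraints on cloud_storage_output (a gs:// path, and an output bucket that differs from the input bucket) can be checked before a job is submitted. The following is a minimal sketch using only the standard library; ``check_output_path`` and its regular expression are hypothetical helpers for illustration, not part of the google-cloud-dlp client, and the service performs its own validation.

```python
import re

# Hypothetical pre-flight check, not part of google-cloud-dlp.
# Matches "gs://bucket" or "gs://bucket/any/prefix/".
_GCS_PATH = re.compile(r"^gs://(?P<bucket>[^/]+)(?:/(?P<prefix>.*))?$")


def check_output_path(input_url: str, output_url: str) -> str:
    """Return the output bucket name, or raise ValueError if the path is unusable."""
    src = _GCS_PATH.match(input_url)
    out = _GCS_PATH.match(output_url)
    if src is None or out is None:
        raise ValueError("expected gs://bucket or gs://bucket/folder/")
    # The documentation requires the output bucket to differ from the input bucket.
    if out.group("bucket") == src.group("bucket"):
        raise ValueError("output bucket must differ from the input bucket")
    return out.group("bucket")
```

Because de-identified files overwrite existing files in the output path, it may also be worth pointing each job at a dedicated folder.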
file_types_to_transform MutableSequence[google.cloud.dlp_v2.types.FileType]
List of user-specified file type groups to transform. If specified, only the files with these file types will be transformed. If empty, all supported files will be transformed. Supported types may be automatically added over time. If this field contains a file type that the Deidentify action does not support, the job fails and is not created or started. Currently the only file types supported are: IMAGES, TEXT_FILES, CSV, TSV.
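
As an illustration of how the attributes above fit together, the sketch below assembles a plain dictionary keyed by the field names in this list and rejects file types outside the currently supported set. ``build_deidentify_mapping`` is a hypothetical helper, and the ``table`` key with ``project_id``, ``dataset_id``, and ``table_id`` is an assumption about the shape of TransformationDetailsStorageConfig that should be checked against its own reference page.

```python
# File type groups this page lists as currently supported by the Deidentify action.
SUPPORTED_FILE_TYPES = {"IMAGES", "TEXT_FILES", "CSV", "TSV"}


def build_deidentify_mapping(output_url, file_types, details_table=None):
    """Hypothetical helper: build a dict keyed by the Deidentify field names."""
    unsupported = sorted(set(file_types) - SUPPORTED_FILE_TYPES)
    if unsupported:
        # The service would reject the job; failing early gives a clearer message.
        raise ValueError(f"unsupported file types: {unsupported}")
    mapping = {
        "cloud_storage_output": output_url,
        # An empty list means all supported file types are transformed.
        "file_types_to_transform": list(file_types),
    }
    if details_table is not None:
        # Assumed shape: a BigQuery table reference under the "table" key.
        mapping["transformation_details_storage_config"] = {"table": details_table}
    return mapping
```

With google-cloud-dlp installed, a dictionary like this would presumably be passed as the ``mapping`` argument shown in the constructor signature. Whether file type names are accepted as strings or need converting to ``FileType`` enum members is not stated here, so treat that as something to verify.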