De-identification is the process of removing identifying information from data.
The Cloud Healthcare API detects sensitive data in
DICOM instances
and FHIR resources, such as protected
health information (PHI), and then uses a de-identification transformation to
mask, delete, or otherwise obscure the data. De-identification has multiple
use cases, including:
When sharing health information with non-privileged parties
When creating datasets from multiple sources and analyzing them
When anonymizing data so that it can be used in machine learning models
De-identification overview
De-identification works at the following levels:
At the dataset level. De-identification occurs on all data in DICOM
stores and FHIR stores in the dataset. If a dataset
contains both DICOM instances and FHIR resources, you can de-identify all of
the instances and resources at the same time.
To de-identify sensitive data at the dataset level, call the Cloud Healthcare API
datasets.deidentify
method.
At the FHIR store level. De-identification occurs on all data in a
specific FHIR store in a dataset.
To de-identify sensitive data at the FHIR store level, call the Cloud Healthcare API
fhirStores.deidentify
method.
At the DICOM store level. De-identification occurs on all data in a specific
DICOM store in a dataset.
To de-identify sensitive data at the DICOM store level, call the Cloud Healthcare API
dicomStores.deidentify
method.
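The three methods above share one URL pattern, which the following minimal sketch illustrates. It assumes the v1 REST endpoint at healthcare.googleapis.com and uses hypothetical project, location, dataset, and store IDs. The helper names (dataset_name, deidentify_url) are illustrative and not part of any client library. The sketch only builds the URL to POST to. It does not authenticate or send a request.

```python
# Sketch only: builds the REST URLs for the three de-identify methods.
# BASE_URL assumes the v1 REST API; all IDs passed in are hypothetical.
BASE_URL = "https://healthcare.googleapis.com/v1"

STORE_TYPES = ("fhirStores", "dicomStores")


def dataset_name(project: str, location: str, dataset: str) -> str:
    """Return the full resource name of a dataset."""
    return f"projects/{project}/locations/{location}/datasets/{dataset}"


def deidentify_url(project, location, dataset, store_type=None, store=None):
    """Return the URL that a de-identify request is POSTed to.

    Leave store_type as None for the dataset level (datasets.deidentify).
    Pass "fhirStores" or "dicomStores" plus a store ID for the store level.
    """
    name = dataset_name(project, location, dataset)
    if store_type is None:
        return f"{BASE_URL}/{name}:deidentify"
    if store_type not in STORE_TYPES:
        raise ValueError(f"unsupported store type: {store_type}")
    if not store:
        raise ValueError("a store ID is required at the store level")
    return f"{BASE_URL}/{name}/{store_type}/{store}:deidentify"


if __name__ == "__main__":
    # Hypothetical IDs, one call per level.
    print(deidentify_url("my-project", "us-central1", "my-dataset"))
    print(deidentify_url("my-project", "us-central1", "my-dataset",
                         "fhirStores", "my-fhir-store"))
    print(deidentify_url("my-project", "us-central1", "my-dataset",
                         "dicomStores", "my-dicom-store"))
```

In each case the source resource is identified by the URL path. The destination goes in the request body.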
De-identification doesn't impact the original dataset, FHIR store, DICOM store,
or the original data. Depending on how you configure the de-identification, the
operation behaves as follows:
If you are de-identifying data at the dataset level, de-identified copies
of the original data are written to a new dataset called the destination dataset.
If you are de-identifying data at the DICOM or FHIR store level, de-identified
copies of the original data are written to an existing DICOM or FHIR
store in an existing dataset. The output DICOM store and FHIR store are called
the destination DICOM store and destination FHIR store, respectively.
The source dataset, FHIR store, or DICOM store and the destination
dataset, FHIR store, or DICOM store must reside in
the same Google Cloud location. De-identifying data across
multiple Google Cloud locations is not supported.
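Because a cross-location request is not supported, it can help to compare the two resource names on the client before calling the API. The following is a minimal sketch of such a check, assuming resource names in the form projects/PROJECT/locations/LOCATION/datasets/DATASET with an optional fhirStores or dicomStores suffix. The function names and example IDs are hypothetical, and the check is a convenience only. The API itself remains the authority on whether a request is valid.

```python
import re

# Matches a dataset name, optionally followed by a FHIR or DICOM store.
_NAME_RE = re.compile(
    r"^projects/([^/]+)/locations/([^/]+)/datasets/([^/]+)"
    r"(?:/(fhirStores|dicomStores)/([^/]+))?$"
)


def location_of(resource_name: str) -> str:
    """Extract the Google Cloud location from a dataset or store name."""
    match = _NAME_RE.match(resource_name)
    if match is None:
        raise ValueError(f"unrecognized resource name: {resource_name}")
    return match.group(2)


def check_same_location(source: str, destination: str) -> None:
    """Raise ValueError if source and destination are in different locations."""
    src, dst = location_of(source), location_of(destination)
    if src != dst:
        raise ValueError(
            f"source is in {src} but destination is in {dst}; "
            "de-identifying across locations is not supported"
        )


if __name__ == "__main__":
    # Hypothetical resource names in the same location: no error is raised.
    check_same_location(
        "projects/my-project/locations/us-central1/datasets/source-dataset",
        "projects/my-project/locations/us-central1/datasets/dest-dataset",
    )
```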
De-identification location
When the Cloud Healthcare API de-identifies data, the data might be processed in a location that is different from where the source and destination FHIR or DICOM stores reside.
After de-identification finishes, the data is stored in the same Google Cloud location as the source FHIR store or DICOM store.
To ensure data is processed in the same location as the source FHIR or DICOM store, you can specify
the useRegionalDataProcessing option in
DeidentifyConfig.
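As an illustration, the following sketch builds a request body for a store-level de-identify call with useRegionalDataProcessing set. It assumes the body carries the destination in a destinationStore field and the DeidentifyConfig in a config field, as in the v1 REST reference. The helper name and the store path are hypothetical. A real configuration would usually also include FHIR, DICOM, or text settings, which are omitted here.

```python
import json


def store_deidentify_body(destination_store: str, regional: bool = True) -> dict:
    """Build a request body for a store-level de-identify call.

    destination_store is the full resource name of an existing store.
    Field names are assumed from the v1 REST reference; other
    DeidentifyConfig fields are intentionally left out of this sketch.
    """
    return {
        "destinationStore": destination_store,
        "config": {
            # Ask for processing in the same location as the source store.
            "useRegionalDataProcessing": regional,
        },
    }


if __name__ == "__main__":
    # Hypothetical destination FHIR store.
    body = store_deidentify_body(
        "projects/my-project/locations/us-central1/"
        "datasets/dest-dataset/fhirStores/dest-fhir-store"
    )
    print(json.dumps(body, indent=2))
```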
De-identifying data in the Google Cloud console
You can de-identify data for a dataset, FHIR store, or DICOM store from within the Google Cloud console. For more information, see "De-identifying data in the Google Cloud console (DICOM)" at /healthcare-api/docs/how-tos/dicom-deidentify#de-identifying_data_in_the and "De-identifying data in the Google Cloud console (FHIR)" at /healthcare-api/docs/how-tos/fhir-deidentify#de-identifying_data_in_the.
Disclaimer: This operation processes data using a mixture of rules-based and heuristic methods. Results may differ between resources and datasets due to factors like input data quality/consistency, heuristic-based algorithms, and others. This feature is not guaranteed to satisfy any specific legal, regulatory, or compliance requirements, including requirements for the de-identification of data. It is the user's responsibility to ensure that they set the appropriate configuration parameters for the operation and evaluate the end result to determine whether it is acceptable for their use cases and any legal, regulatory, or compliance requirements they may have.
Last updated 2025-08-25 UTC.