Image inspection and redaction

Sensitive Data Protection can inspect for and redact sensitive text and objects in images, according to criteria that you specify.

Using infoType detectors, Sensitive Data Protection inspects a base64-encoded image and detects sensitive data within the image. Sensitive Data Protection can then return information about the location of sensitive data within the image or redact the sensitive data by masking it with an opaque rectangle.

Inspection and redaction are two distinct operations:

  • Inspection: Sensitive Data Protection inspects the submitted base64-encoded image for the specified infoTypes. It returns the detected infoTypes, along with one or more sets of pixel coordinates and dimensions. Each set of pixel coordinate and dimension values indicates the top-left corner and the dimensions of a bounding box, respectively. Each bounding box corresponds to all or part of a Sensitive Data Protection finding.
  • Redaction: Sensitive Data Protection inspects the submitted base64-encoded image for the specified infoTypes. Sensitive Data Protection redacts any sensitive data findings by masking them with opaque rectangles. It returns the redacted base64-encoded image in the same image format as the original image. You can also configure the color of the redaction boxes in the request.

About image inspection

The Sensitive Data Protection inspection service accepts a base64-encoded image and then searches the image for any data that matches its inspection criteria. Sensitive Data Protection returns the locations of any sensitive data that it detects.

Consider the following image.

Figure: Original image that contains sensitive objects.

The image inspection process is as follows:

  1. You send a content.inspect request to the DLP API. The request contains the base64-encoded image and the inspection configuration, which contains your detection criteria.
  2. Sensitive Data Protection scans the image using the inspection configuration and identifies any matches.
  3. Sensitive Data Protection returns the coordinates and dimensions of the regions within the image where it found sensitive data according to your detection criteria.
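
As a rough illustration of step 1, the following Python sketch assembles a JSON body for a content.inspect request. It assumes the REST request shape with an `item.byteItem` that carries a bytes type and base64 data, plus an `inspectConfig.infoTypes` list. The helper name `build_inspect_request`, the placeholder image bytes, and the chosen infoTypes are illustrative only. The sketch builds the body and doesn't send it.

```python
import base64
import json


def build_inspect_request(image_bytes, info_type_names, bytes_type="IMAGE_PNG"):
    """Build a JSON-serializable body for a content.inspect request.

    Hypothetical helper: the field names follow the REST request shape as
    assumed in this sketch, and nothing is sent over the network.
    """
    return {
        "item": {
            "byteItem": {
                "type": bytes_type,
                # The image travels as a base64-encoded string.
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }
        },
        "inspectConfig": {
            # Detection criteria: the infoTypes to look for in the image.
            "infoTypes": [{"name": name} for name in info_type_names],
        },
    }


# Placeholder bytes stand in for a real image file read from disk.
fake_image = b"\x89PNG\r\n\x1a\n placeholder"
body = build_inspect_request(fake_image, ["PHONE_NUMBER", "EMAIL_ADDRESS"])
print(json.dumps(body, indent=2))
```

In practice you would read the bytes from an image file and send the body with an authenticated HTTP client or a client library.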

The returned coordinates indicate where to find the sensitive data. Sensitive Data Protection often uses multiple bounding boxes to mark a single instance of sensitive data in the image.
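
Because one finding can span several boxes, it can help to collapse them into a single enclosing rectangle per finding. The sketch below is one way to do that, under two assumptions: that the response nests boxes under `result.findings[].location.contentLocations[].imageLocation.boundingBoxes[]`, and that each box has `top`, `left`, `width`, and `height` in pixels with (0,0) at the upper-left corner. The function name `enclosing_boxes` and the sample response are made up for illustration.

```python
def enclosing_boxes(inspect_response):
    """Return one (info_type, left, top, width, height) tuple per finding.

    Hypothetical helper: merges all bounding boxes of a finding into the
    smallest rectangle that contains them.
    """
    merged = []
    # A response with no matches has no findings, so default to an empty list.
    findings = inspect_response.get("result", {}).get("findings", [])
    for finding in findings:
        boxes = [
            box
            for loc in finding.get("location", {}).get("contentLocations", [])
            for box in loc.get("imageLocation", {}).get("boundingBoxes", [])
        ]
        if not boxes:
            continue
        # JSON responses can omit zero-valued fields, so default each to 0.
        left = min(b.get("left", 0) for b in boxes)
        top = min(b.get("top", 0) for b in boxes)
        right = max(b.get("left", 0) + b.get("width", 0) for b in boxes)
        bottom = max(b.get("top", 0) + b.get("height", 0) for b in boxes)
        name = finding["infoType"]["name"]
        merged.append((name, left, top, right - left, bottom - top))
    return merged


# Made-up response: one phone number split across two adjacent boxes.
sample_response = {
    "result": {
        "findings": [
            {
                "infoType": {"name": "PHONE_NUMBER"},
                "location": {
                    "contentLocations": [
                        {
                            "imageLocation": {
                                "boundingBoxes": [
                                    {"top": 10, "left": 20, "width": 30, "height": 12},
                                    {"top": 10, "left": 55, "width": 40, "height": 12},
                                ]
                            }
                        }
                    ]
                },
            }
        ]
    }
}

print(enclosing_boxes(sample_response))  # [('PHONE_NUMBER', 20, 10, 75, 12)]
print(enclosing_boxes({}))               # []
```

Merging is a design choice, not a requirement. If the boxes of a finding sit on different lines of text, the enclosing rectangle can cover non-sensitive content, so keeping the individual boxes might be preferable.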

If Sensitive Data Protection doesn't find any data in the image that corresponds to your detection criteria, it returns a successful HTTP 200 response that contains no findings.

About image redaction

Image redaction is similar to image inspection, with one additional step. After Sensitive Data Protection identifies the locations of sensitive data within the image, instead of returning the coordinates of the areas that contain the data, it places opaque rectangles on those areas, returning a redacted, base64-encoded image.

Figure: Redacted image with sensitive data obscured.

The image redaction process is as follows:

  1. You send an image.redact request to the DLP API. The request contains the base64-encoded image and the image redaction configuration, which contains your detection criteria.
  2. Sensitive Data Protection scans the image using the image redaction configuration and identifies any matches.
  3. Sensitive Data Protection redacts all detected sensitive data by covering it with an opaque rectangle. It then encodes the image in base64 and returns the redacted image in the response.
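
The steps above can be sketched in Python as a request builder and a response decoder. This is a minimal sketch, assuming the REST request carries a top-level `byteItem`, an `inspectConfig`, and a list of `imageRedactionConfigs` that each pair an infoType with a `redactionColor` whose red, green, and blue values range from 0 to 1, and assuming the response returns the image in a `redactedImage` field. The helper names and the simulated response are hypothetical, and no request is sent.

```python
import base64


def build_redact_request(image_bytes, info_type_names,
                         rgb=(0.0, 0.0, 0.0), bytes_type="IMAGE_PNG"):
    """Build a JSON-serializable body for an image.redact request.

    Hypothetical helper: field names follow the REST request shape as
    assumed in this sketch.
    """
    red, green, blue = rgb
    info_types = [{"name": name} for name in info_type_names]
    return {
        "byteItem": {
            "type": bytes_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
        "inspectConfig": {"infoTypes": info_types},
        # One redaction config per infoType, all using the same box color.
        "imageRedactionConfigs": [
            {
                "infoType": info_type,
                "redactionColor": {"red": red, "green": green, "blue": blue},
            }
            for info_type in info_types
        ],
    }


def decode_redacted_image(redact_response):
    """Decode the base64 image from a redact response into raw bytes."""
    return base64.b64decode(redact_response["redactedImage"])


# Placeholder bytes stand in for a real image file.
fake_image = b"\x89PNG\r\n\x1a\n placeholder"
request_body = build_redact_request(fake_image, ["PHONE_NUMBER"], rgb=(0.5, 0.5, 0.5))

# Simulated response: echoes the input, as if nothing had been redacted.
simulated_response = {"redactedImage": request_body["byteItem"]["data"]}
redacted_bytes = decode_redacted_image(simulated_response)
print(len(redacted_bytes), "bytes decoded")
```

The decoded bytes can be written to a file opened in binary mode to view the redacted image.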

If Sensitive Data Protection doesn't find any data in the image that corresponds to your detection criteria, it returns the base64-encoded image unchanged.

What's next