Resource: DataScan
Represents a user-visible job that provides insights for the related data source.
For example:
- Data quality: generates queries based on the rules and runs against the data to get data quality check results. For more information, see Auto data quality overview.
- Data profile: analyzes the data in tables and generates insights about the structure, content, and relationships (such as null percent, cardinality, and min/max/mean). For more information, see About data profiling.
- Data discovery: scans data in Cloud Storage buckets to extract and then catalog metadata. For more information, see Discover and catalog Cloud Storage data.
- Data documentation: analyzes the table details and generates insights including descriptions and sample SQL queries for the table. For more information, see Generate data insights in BigQuery.
| JSON representation |
|---|
{ "name": string, "uid": string, "description": string, "displayName": string, "labels": { string: string, ... }, "state": enum ( |
| Fields | |
|---|---|
name |
Output only. Identifier. The relative resource name of the scan, of the form: |
uid |
Output only. System-generated globally unique ID for the scan. This ID will be different if the scan is deleted and re-created with the same name. |
description |
Optional. Description of the scan.
|
displayName |
Optional. User-friendly display name.
|
labels |
Optional. User-defined labels for the scan. An object containing a list of |
state |
Output only. Current state of the DataScan. |
createTime |
Output only. The time when the scan was created. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
updateTime |
Output only. The time when the scan was last updated. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
data |
Required. The data source for DataScan. |
executionSpec |
Optional. DataScan execution settings. If not specified, the fields in it will use their default values. |
executionStatus |
Output only. Status of the data scan execution. |
type |
Output only. The type of DataScan. |
Union field spec. Settings for the data scan. The settings are required and immutable. After you configure the settings for one type of data scan, you can't change the data scan to a different type. spec can be only one of the following: |
|
dataQualitySpec |
Settings for a data quality scan. |
dataProfileSpec |
Settings for a data profile scan. |
dataDiscoverySpec |
Settings for a data discovery scan. |
dataDocumentationSpec |
Settings for a data documentation scan. |
Union field result. The result of the data scan. result can be only one of the following: |
|
dataQualityResult |
Output only. The result of a data quality scan. |
dataProfileResult |
Output only. The result of a data profile scan. |
dataDiscoveryResult |
Output only. The result of a data discovery scan. |
dataDocumentationResult |
Output only. The result of a data documentation scan. |
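
As a rough illustration of the field rules above, the following Python sketch assembles a DataScan body from the writable fields named in this section and sets exactly one member of the `spec` union. The helper `build_data_scan` and every sample value (project, dataset, table, labels) are hypothetical, and the empty `dataProfileSpec` object is only a placeholder for settings that this section does not describe.

```python
import json

# Members of the `spec` union listed above; a DataScan carries exactly one.
SPEC_FIELDS = (
    "dataQualitySpec",
    "dataProfileSpec",
    "dataDiscoverySpec",
    "dataDocumentationSpec",
)


def build_data_scan(data, spec_field, spec, description=None,
                    display_name=None, labels=None, execution_spec=None):
    """Assemble a DataScan body from writable fields only (hypothetical helper)."""
    if spec_field not in SPEC_FIELDS:
        raise ValueError(f"unknown spec field: {spec_field!r}")
    # `data` is required; the single spec member selects the scan type.
    body = {"data": data, spec_field: spec}
    # Optional fields are left out when unset.
    if description is not None:
        body["description"] = description
    if display_name is not None:
        body["displayName"] = display_name
    if labels:
        body["labels"] = dict(labels)
    if execution_spec is not None:
        body["executionSpec"] = execution_spec
    # Output-only fields (name, uid, state, createTime, ...) are never set here.
    return body


scan = build_data_scan(
    data={"resource": "//bigquery.googleapis.com/projects/my-project"
                      "/datasets/my_dataset/tables/my_table"},
    spec_field="dataProfileSpec",
    spec={},  # placeholder; real settings are not covered in this section
    display_name="Example profile scan",
    labels={"team": "analytics"},
)
print(json.dumps(scan, indent=2))
```

Taking the spec as a single `(spec_field, spec)` pair makes it impossible to set two union members by accident.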
DataSource
The data source for DataScan.
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field source. The source is required and immutable. Once set, it cannot be changed. source can be only one of the following: |
|
entity |
Immutable. The Dataplex Universal Catalog entity that represents the data source (e.g. BigQuery table) for DataScan, of the form: |
resource |
Immutable. The service-qualified full resource name of the cloud resource for a DataScan job to scan against. The field can be one of the following: a Cloud Storage bucket for DataDiscoveryScan (format: //storage.googleapis.com/projects/PROJECT_ID/buckets/BUCKET_ID), or a BigQuery table of type "TABLE" for DataProfileScan, DataQualityScan, or DataDocumentationScan (format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID). |
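
The two `resource` formats above can be produced with small string helpers. This is a minimal sketch: the function names are hypothetical, the identifiers are passed through without validation, and `data_source` simply enforces that only one member of the `source` union is set.

```python
def bigquery_table_resource(project_id, dataset_id, table_id):
    """Full resource name of a BigQuery table, in the format given above."""
    return (f"//bigquery.googleapis.com/projects/{project_id}"
            f"/datasets/{dataset_id}/tables/{table_id}")


def storage_bucket_resource(project_id, bucket_id):
    """Full resource name of a Cloud Storage bucket, in the format given above."""
    return f"//storage.googleapis.com/projects/{project_id}/buckets/{bucket_id}"


def data_source(entity=None, resource=None):
    """Build a DataSource object with exactly one of `entity` or `resource`."""
    if (entity is None) == (resource is None):
        raise ValueError("set exactly one of 'entity' or 'resource'")
    if entity is not None:
        return {"entity": entity}
    return {"resource": resource}


# Sample identifiers are illustrative only.
source = data_source(
    resource=bigquery_table_resource("my-project", "my_dataset", "my_table")
)
print(source)
```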
ExecutionSpec
DataScan execution settings.
| JSON representation |
|---|
{ "trigger": { object ( |
| Fields | |
|---|---|
trigger |
Optional. Spec related to how often and when a scan should be triggered. If not specified, the default is |
Union field When an option is selected for incremental scan, it cannot be unset or changed. If not specified, a data scan will run for all data in the table. |
|
field |
Immutable. The unnested field (of type Date or Timestamp) that contains values which monotonically increase over time. If not specified, a data scan will run for all data in the table. |
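
A small sketch of how an ExecutionSpec object might be assembled from the two fields above. The helper name and the column name `event_time` are hypothetical. Leaving out `field` corresponds to the documented behavior of scanning all data in the table.

```python
def build_execution_spec(trigger=None, incremental_field=None):
    """Assemble an ExecutionSpec object; both parts are optional."""
    spec = {}
    if trigger is not None:
        spec["trigger"] = trigger
    if incremental_field is not None:
        # Immutable once set: choose a Date or Timestamp column whose
        # values increase over time.
        spec["field"] = incremental_field
    return spec


# Full-table scan with default trigger settings.
full_scan = build_execution_spec()

# Incremental scan keyed on a hypothetical timestamp column.
incremental_scan = build_execution_spec(
    trigger={"onDemand": {}},
    incremental_field="event_time",
)
print(full_scan, incremental_scan)
```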
Trigger
DataScan scheduling and trigger settings.
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field If not specified, the default is |
|
onDemand |
The scan runs once via the dataScans.run API. |
schedule |
The scan is scheduled to run periodically. |
OnDemand
This type has no fields.
The scan runs once via the dataScans.run API.
Schedule
The scan is scheduled to run periodically.
| JSON representation |
|---|
{ "cron": string } |
| Fields | |
|---|---|
cron |
Required. Cron schedule for running scans periodically. To explicitly set a time zone in the cron tab, prefix the cron tab with "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} must be a valid string from the IANA time zone database. This field is required for Schedule scans. |
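
The time zone prefix described above can be illustrated with a short sketch that builds the `cron` string and wraps it in a Trigger object. The helper names, the sample schedule, and the sample time zone are hypothetical. Separating the prefix from the cron expression with a single space is an assumption here, and the time zone is not checked against the IANA database.

```python
ALLOWED_PREFIXES = ("CRON_TZ", "TZ")


def cron_with_timezone(cron, iana_time_zone=None, prefix="CRON_TZ"):
    """Return the cron string, optionally prefixed with a time zone setting."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"prefix must be one of {ALLOWED_PREFIXES}")
    if iana_time_zone is None:
        return cron
    # Assumption: one space separates the prefix from the cron expression.
    return f"{prefix}={iana_time_zone} {cron}"


def schedule_trigger(cron, iana_time_zone=None):
    """Trigger object for a scan that runs periodically."""
    return {"schedule": {"cron": cron_with_timezone(cron, iana_time_zone)}}


# Trigger object for a scan that runs once via dataScans.run (OnDemand has no fields).
ON_DEMAND_TRIGGER = {"onDemand": {}}

# Hypothetical schedule: every day at 03:00 in the given time zone.
trigger = schedule_trigger("0 3 * * *", "America/New_York")
print(trigger)
```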
ExecutionStatus
Status of the data scan execution.
| JSON representation |
|---|
{ "latestJobStartTime": string, "latestJobEndTime": string, "latestJobCreateTime": string } |
| Fields | |
|---|---|
latestJobStartTime |
Optional. The time when the latest DataScanJob started. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
latestJobEndTime |
Optional. The time when the latest DataScanJob ended. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
latestJobCreateTime |
Optional. The time when the DataScanJob execution was created. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
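
The timestamp fields above are RFC 3339 strings that may carry 0, 3, 6, or 9 fractional digits. The sketch below parses such strings with the standard library and computes the duration of the latest job. The function names and sample timestamps are illustrative only, and digits beyond microsecond precision are truncated because `datetime` cannot hold them.

```python
import re
from datetime import datetime, timedelta, timezone

_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2})[Tt](\d{2}:\d{2}:\d{2})"
    r"(?:\.(\d+))?([Zz]|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value):
    """Parse an RFC 3339 timestamp into a timezone-aware datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    date_part, time_part, fraction, offset = match.groups()
    parsed = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    # Keep at most six fractional digits; pad shorter fractions with zeros.
    micros = int((fraction or "")[:6].ljust(6, "0"))
    if offset in ("Z", "z"):
        tz = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        tz = timezone(sign * timedelta(hours=int(offset[1:3]),
                                       minutes=int(offset[4:6])))
    return parsed.replace(microsecond=micros, tzinfo=tz)


def latest_job_duration(execution_status):
    """Return end minus start for the latest job, or None if either is missing."""
    start = execution_status.get("latestJobStartTime")
    end = execution_status.get("latestJobEndTime")
    if start is None or end is None:
        return None
    return parse_timestamp(end) - parse_timestamp(start)


# Sample ExecutionStatus with made-up timestamps.
status = {
    "latestJobStartTime": "2024-01-01T00:00:00Z",
    "latestJobEndTime": "2024-01-01T00:01:30.500Z",
}
print(latest_job_duration(status))
```

All three fields are optional, so the helper returns `None` rather than raising when a scan has not yet run.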
Methods |
|
|---|---|
|
Creates a DataScan resource. |
|
Deletes a DataScan resource. |
|
Generates recommended data quality rules based on the results of a data profiling scan. |
|
Gets a DataScan resource. |
|
Gets the access control policy for a resource. |
|
Lists DataScans. |
|
Updates a DataScan resource. |
|
Runs an on-demand execution of a DataScan. |
|
Sets the access control policy on the specified resource. |
|
Returns permissions that a caller has on the specified resource. |