Resource: DataScan
Represents a user-visible job that provides insights for the related data source.
For example:
- Data Quality: generates queries based on the rules and runs them against the data to get data quality check results.
- Data Profile: analyzes the data in table(s) and generates insights about the structure, content, and relationships (such as null percent, cardinality, and min/max/mean).
JSON representation |
---|
{ "name": string, "uid": string, "description": string, "displayName": string, "labels": { string: string, ... }, "state": enum ( |
Fields | |
---|---|
name |
Output only. Identifier. The relative resource name of the scan, of the form: |
uid |
Output only. System generated globally unique ID for the scan. This ID will be different if the scan is deleted and re-created with the same name. |
description |
Optional. Description of the scan.
|
displayName |
Optional. User friendly display name.
|
labels |
Optional. User-defined labels for the scan. An object containing a list of |
state |
Output only. Current state of the DataScan. |
create |
Output only. The time when the scan was created. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
update |
Output only. The time when the scan was last updated. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
data |
Required. The data source for DataScan. |
executionSpec |
Optional. DataScan execution settings. If not specified, the fields in it will use their default values. |
executionStatus |
Output only. Status of the data scan execution. |
type |
Output only. The type of DataScan. |
Union field spec . Data scan related setting. The settings are required and immutable. After you configure the settings for one type of data scan, you can't change the data scan to a different type of data scan. spec can be only one of the following: |
|
data |
Settings for a data quality scan. |
data |
Settings for a data profile scan. |
data |
Settings for a data discovery scan. |
Union field result . The result of the data scan. result can be only one of the following: |
|
data |
Output only. The result of a data quality scan. |
data |
Output only. The result of a data profile scan. |
data |
Output only. The result of a data discovery scan. |
DataSource
The data source for DataScan.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field source . The source is required and immutable. Once it is set, it cannot be changed. source can be only one of the following: |
|
entity |
Immutable. The Dataplex entity that represents the data source (e.g. BigQuery table) for DataScan, of the form: |
resource |
Immutable. The service-qualified full resource name of the cloud resource for a DataScan job to scan against. The field can be a BigQuery table of type "TABLE" for DataProfileScan/DataQualityScan, in the format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID |
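The resource name format above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper names `bigquery_table_resource` and `data_source_for_table` and the identifiers `my-project`, `sales`, and `orders` are hypothetical and used only for illustration. It builds a DataSource object that sets only the `resource` member of the `source` union.

```python
def bigquery_table_resource(project_id: str, dataset_id: str, table_id: str) -> str:
    """Return the service-qualified full resource name of a BigQuery table."""
    parts = (("project_id", project_id), ("dataset_id", dataset_id), ("table_id", table_id))
    for label, value in parts:
        # A slash would silently change the path structure, so reject it early.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty string without '/'")
    return (
        "//bigquery.googleapis.com/"
        f"projects/{project_id}/datasets/{dataset_id}/tables/{table_id}"
    )


def data_source_for_table(project_id: str, dataset_id: str, table_id: str) -> dict:
    """Build a DataSource object with only the `resource` union member set."""
    return {"resource": bigquery_table_resource(project_id, dataset_id, table_id)}


# Hypothetical identifiers, for illustration only.
source = data_source_for_table("my-project", "sales", "orders")
print(source["resource"])
```

Because `source` is a union field, the sketch never sets `entity` and `resource` together.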
ExecutionSpec
DataScan execution settings.
JSON representation |
---|
{ "trigger": { object ( |
Fields | |
---|---|
trigger |
Optional. Spec related to how often and when a scan should be triggered. If not specified, the default is |
Union field When an option is selected for incremental scan, it cannot be unset or changed. If not specified, a data scan will run for all data in the table. |
|
field |
Immutable. The unnested field (of type Date or Timestamp) that contains values which monotonically increase over time. If not specified, a data scan will run for all data in the table. |
Trigger
DataScan scheduling and trigger settings.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field If not specified, the default is |
|
onDemand |
The scan runs once via dataScans.run API. |
schedule |
The scan is scheduled to run periodically. |
OnDemand
This type has no fields.
The scan runs once via dataScans.run API.
Schedule
The scan is scheduled to run periodically.
JSON representation |
---|
{ "cron": string } |
Fields | |
---|---|
cron |
Required. Cron schedule for running scans periodically. To explicitly set a timezone in the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from IANA time zone database (wikipedia). For example, This field is required for Schedule scans. |
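As a rough illustration of how the `cron` string and its optional time zone prefix fit into the surrounding objects, the sketch below builds an ExecutionSpec with a scheduled trigger. The helper name `schedule_trigger`, the cron expression `0 3 * * *`, and the zone `America/New_York` are assumptions chosen for the example. The five-field check reflects standard cron syntax and is also an assumption, since this section does not state which cron dialect is accepted.

```python
from typing import Optional


def schedule_trigger(cron_fields: str, time_zone: Optional[str] = None) -> dict:
    """Build a Trigger object with the `schedule` union member set."""
    # Assumption: a standard five-field cron expression (minute hour day month weekday).
    if len(cron_fields.split()) != 5:
        raise ValueError("expected a five-field cron expression")
    # The time zone, if given, is applied as a "CRON_TZ=" prefix on the cron string.
    cron = f"CRON_TZ={time_zone} {cron_fields}" if time_zone else cron_fields
    return {"schedule": {"cron": cron}}


# Assumed example values: run daily at 03:00 New York time.
execution_spec = {"trigger": schedule_trigger("0 3 * * *", "America/New_York")}
print(execution_spec)
```

The sketch does not check the zone name against the IANA database; a caller that needs that guarantee would have to add its own lookup.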
ExecutionStatus
Status of the data scan execution.
JSON representation |
---|
{ "latestJobStartTime": string, "latestJobEndTime": string, "latestJobCreateTime": string } |
Fields | |
---|---|
latestJobStartTime |
Optional. The time when the latest DataScanJob started. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
latestJobEndTime |
Optional. The time when the latest DataScanJob ended. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
latestJobCreateTime |
Optional. The time when the DataScanJob execution was created. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
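The timestamps above carry up to nine fractional digits, while Python's `datetime` stores only microseconds, so a client has to decide how to handle the extra precision. The sketch below is one possible approach that truncates to microseconds. The helper name `parse_zulu` and the sample timestamp values are assumptions made for illustration, not values returned by the service.

```python
import re
from datetime import datetime, timezone

# RFC3339 UTC "Zulu" timestamp with an optional fraction of one to nine digits.
_RFC3339_ZULU = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$")


def parse_zulu(ts: str) -> datetime:
    """Parse a Zulu timestamp, truncating nanoseconds to microseconds."""
    match = _RFC3339_ZULU.match(ts)
    if match is None:
        raise ValueError(f"not an RFC3339 Zulu timestamp: {ts!r}")
    base, fraction = match.groups()
    # Pad to nine digits, then keep the first six (microsecond precision).
    micros = int((fraction or "").ljust(9, "0")[:6])
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


# Hypothetical ExecutionStatus payload, for illustration only.
status = {
    "latestJobStartTime": "2014-10-02T15:01:23Z",
    "latestJobEndTime": "2014-10-02T15:03:53.045123456Z",
}
elapsed = parse_zulu(status["latestJobEndTime"]) - parse_zulu(status["latestJobStartTime"])
print(elapsed.total_seconds())
```

Truncation rather than rounding keeps the parsed value from ever landing after the reported time, which matters when comparing a job's end time against another event.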
Methods |
|
---|---|
|
Creates a DataScan resource. |
|
Deletes a DataScan resource. |
|
Generates recommended data quality rules based on the results of a data profiling scan. |
|
Gets a DataScan resource. |
|
Gets the access control policy for a resource. |
|
Lists DataScans. |
|
Updates a DataScan resource. |
|
Runs an on-demand execution of a DataScan. |
|
Sets the access control policy on the specified resource. |
|
Returns permissions that a caller has on the specified resource. |