DataQualityRule(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A rule captures data quality intent about a data source.
This message has oneof_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
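The mutual exclusivity described above can be checked on a plain mapping before it is passed to the constructor. The following is a minimal sketch using only the standard library. `which_rule_type` and `RULE_TYPE_MEMBERS` are hypothetical names, not part of the client library, and the member names are copied from the Attributes table on this page.

```python
# Member fields of the rule_type oneof, as listed in the Attributes table.
RULE_TYPE_MEMBERS = (
    "range_expectation",
    "non_null_expectation",
    "set_expectation",
    "regex_expectation",
    "uniqueness_expectation",
    "statistic_range_expectation",
    "row_condition_expectation",
    "table_condition_expectation",
    "sql_assertion",
)


def which_rule_type(mapping):
    """Return the single rule_type member present in mapping, or None.

    Hypothetical pre-check: raises ValueError when more than one member is
    present, because only one member can remain set on the message.
    """
    present = [name for name in RULE_TYPE_MEMBERS if name in mapping]
    if len(present) > 1:
        raise ValueError(f"multiple rule_type members set: {present}")
    return present[0] if present else None


# Illustrative mapping; the column name and dimension are made up.
rule = {"column": "price", "dimension": "VALIDITY", "non_null_expectation": {}}
print(which_rule_type(rule))
```

A check like this only guards the input mapping. The message itself does not need it, since setting one member clears the others.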
Attributes
Name | Description |
---|---|
range_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.RangeExpectation
Row-level rule which evaluates whether each column value lies between a specified range. This field is a member of oneof_ rule_type.
|
non_null_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.NonNullExpectation
Row-level rule which evaluates whether each column value is non-null. This field is a member of oneof_ rule_type.
|
set_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.SetExpectation
Row-level rule which evaluates whether each column value is contained by a specified set. This field is a member of oneof_ rule_type.
|
regex_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.RegexExpectation
Row-level rule which evaluates whether each column value matches a specified regex. This field is a member of oneof_ rule_type.
|
uniqueness_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.UniquenessExpectation
Row-level rule which evaluates whether each column value is unique. This field is a member of oneof_ rule_type.
|
statistic_range_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.StatisticRangeExpectation
Aggregate rule which evaluates whether the column aggregate statistic lies between a specified range. This field is a member of oneof_ rule_type.
|
row_condition_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.RowConditionExpectation
Row-level rule which evaluates whether each row in a table passes the specified condition. This field is a member of oneof_ rule_type.
|
table_condition_expectation |
google.cloud.dataplex_v1.types.DataQualityRule.TableConditionExpectation
Aggregate rule which evaluates whether the provided expression is true for a table. This field is a member of oneof_ rule_type.
|
sql_assertion |
google.cloud.dataplex_v1.types.DataQualityRule.SqlAssertion
Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails. This field is a member of oneof_ rule_type.
|
column |
str
Optional. The unnested column which this rule is evaluated against. |
ignore_null |
bool
Optional. Rows with null values automatically fail a rule unless ignore_null is true, in which case such null rows are trivially considered passing.
This field is only valid for the following types of rules:
- RangeExpectation
- RegexExpectation
- SetExpectation
- UniquenessExpectation
|
dimension |
str
Required. The dimension a rule belongs to. Results are also aggregated at the dimension level. Supported dimensions are **["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "FRESHNESS", "VOLUME"]** |
threshold |
float
Optional. The minimum ratio of **passing_rows / total_rows** required to pass this rule, with a range of [0.0, 1.0]. 0 indicates default value (i.e. 1.0). This field is only valid for row-level type rules. |
name |
str
Optional. A mutable name for the rule.
- The name must contain only letters (a-z, A-Z), numbers (0-9), or hyphens (-).
- The maximum length is 63 characters.
- Must start with a letter.
- Must end with a number or a letter.
|
description |
str
Optional. Description of the rule. The maximum length is 1,024 characters. |
suspended |
bool
Optional. Whether the rule is active or suspended. Default is false. |
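The constraints on dimension, name, description, and threshold listed in the table can be checked locally before a rule is built. The sketch below uses only the standard library. `check_rule_fields`, `effective_threshold`, and `NAME_PATTERN` are hypothetical helpers, and treating dimension values as case-sensitive is an assumption rather than documented behavior.

```python
import re

# Supported dimensions, copied from the description of the dimension field.
SUPPORTED_DIMENSIONS = {
    "COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY",
    "UNIQUENESS", "FRESHNESS", "VOLUME",
}

# Starts with a letter, ends with a letter or digit, contains only letters,
# digits, and hyphens in between, and is at most 63 characters long.
NAME_PATTERN = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def check_rule_fields(mapping):
    """Return a list of problems found in a rule mapping (empty if none)."""
    problems = []
    # Assumption: dimension values are matched case-sensitively.
    if mapping.get("dimension") not in SUPPORTED_DIMENSIONS:
        problems.append("dimension is required and must be a supported value")
    name = mapping.get("name", "")
    if name and not NAME_PATTERN.fullmatch(name):
        problems.append("name violates the naming constraints")
    if len(mapping.get("description", "")) > 1024:
        problems.append("description is longer than 1,024 characters")
    threshold = mapping.get("threshold", 0.0)
    if not 0.0 <= threshold <= 1.0:
        problems.append("threshold must be within [0.0, 1.0]")
    return problems


def effective_threshold(threshold):
    """Map the documented special case: 0 means the default value, 1.0."""
    return 1.0 if threshold == 0 else threshold


# Illustrative mapping; the name and threshold are made up.
print(check_rule_fields(
    {"dimension": "VALIDITY", "name": "price-check-1", "threshold": 0.95}
))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a rule definition at once.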
Classes
NonNullExpectation
NonNullExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether each column value is non-null.
RangeExpectation
RangeExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether each column value lies between a specified range.
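As a rough illustration of how a row-level range check interacts with ignore_null, the sketch below models the decision for a single value in plain Python. It is not the service's implementation. `range_row_passes` and its parameter names are hypothetical, and inclusive bounds are an assumption made for the example.

```python
def range_row_passes(value, min_value=None, max_value=None, ignore_null=False):
    """Model of a row-level range check for one column value.

    Assumptions: bounds are inclusive, and a bound of None means the range
    is open on that side. Null (None) values fail unless ignore_null is set.
    """
    if value is None:
        # Null rows fail by default and trivially pass with ignore_null.
        return ignore_null
    if min_value is not None and value < min_value:
        return False
    if max_value is not None and value > max_value:
        return False
    return True


# Made-up column values checked against the range [0, 10].
values = [5, 11, None]
print([range_row_passes(v, 0, 10) for v in values])
print([range_row_passes(v, 0, 10, ignore_null=True) for v in values])
```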
RegexExpectation
RegexExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether each column value matches a specified regex.
RowConditionExpectation
RowConditionExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether each row passes the specified condition.
The SQL expression needs to use BigQuery standard SQL syntax and should produce a boolean value per row as the result.
Example: col1 >= 0 AND col2 < 10
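To make the per-row evaluation concrete, the sketch below runs the example condition against an in-memory SQLite table and compares the passing ratio to a threshold, as described for the threshold field. SQLite is only a local stand-in here and differs from BigQuery standard SQL in general. The table `t`, its rows, and the threshold of 0.5 are made up for illustration.

```python
import sqlite3

# Local stand-in table; the service would evaluate against the data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 5), (3, 9), (-2, 4), (7, 12)],
)

# The example condition from the text; it yields one boolean per row.
condition = "col1 >= 0 AND col2 < 10"

# Count the rows for which the condition is true, and all rows.
passing_rows, total_rows = conn.execute(
    f"SELECT COALESCE(SUM(CASE WHEN {condition} THEN 1 ELSE 0 END), 0), "
    "COUNT(*) FROM t"
).fetchone()

# Illustrative threshold: minimum ratio of passing_rows / total_rows.
threshold = 0.5
rule_passed = (passing_rows / total_rows) >= threshold
print(passing_rows, total_rows, rule_passed)
```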
SetExpectation
SetExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether each column value is contained by a specified set.
SqlAssertion
SqlAssertion(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.
The SQL statement must use BigQuery standard SQL syntax, and must not contain any semicolons.
You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see `Data reference parameter <https://cloud.google.com/dataplex/docs/auto-data-quality-overview#data-reference-parameter>`__.
Example: SELECT * FROM ${data()} WHERE price < 0
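The pass/fail logic of an assertion can be sketched locally with SQLite. This is only an approximation under stated assumptions: SQLite stands in for BigQuery, the `orders` table and its rows are made up, and replacing ${data()} with a table name is a simplification of what the service does, since no precondition filters are applied here.

```python
import sqlite3

# Local stand-in for the source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 9.99), (2, -1.0), (3, 20.0)],
)

# The example statement from the text.
statement = "SELECT * FROM ${data()} WHERE price < 0"

# The statement must not contain any semicolons.
if ";" in statement:
    raise ValueError("statement must not contain semicolons")

# Simplification: substitute the data reference parameter with a table name.
resolved = statement.replace("${data()}", "orders")

# Any returned row represents an invalid state, so the rule fails.
failing_rows = conn.execute(resolved).fetchall()
assertion_passed = len(failing_rows) == 0
print(failing_rows, assertion_passed)
```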
StatisticRangeExpectation
StatisticRangeExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether the column aggregate statistic lies between a specified range.
TableConditionExpectation
TableConditionExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether the provided expression is true.
The SQL expression needs to use BigQuery standard SQL syntax and should produce a scalar boolean result.
Example: MIN(col1) >= 0
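Unlike a row condition, a table condition produces a single boolean for the whole table. The sketch below evaluates the example expression with SQLite as a local stand-in for BigQuery. The table `t` and its values are made up, and wrapping the expression in a SELECT is an assumption about how such an expression could be evaluated, not a description of the service.

```python
import sqlite3

# Local stand-in table with made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(0,), (4,), (9,)])

# The example expression from the text; it aggregates over all rows.
expression = "MIN(col1) >= 0"

# SQLite returns 1 or 0 for a boolean expression, so convert to bool.
(result,) = conn.execute(f"SELECT {expression} FROM t").fetchone()
table_condition_passed = bool(result)
print(table_condition_passed)
```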
UniquenessExpectation
UniquenessExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Evaluates whether the column has duplicates.