Metadata for a dataset used for AutoML Tables.
column_spec_id of the primary table's column that should be used as the training and prediction target. This column must be non-nullable and have one of the following data types (otherwise model creation will error):
* CATEGORY
* FLOAT64
Furthermore, if the type is CATEGORY, then at most 100 unique values may exist in that column across all rows. NOTE: Updates of this field will instantly affect any other users concurrently working with the dataset.
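The target column constraints above can be checked locally before a model is requested. The following is a minimal sketch only, not part of the AutoML client library: `check_target_column` and `MAX_CATEGORY_VALUES` are hypothetical names, the type is passed as a plain string, and "non-nullable" is approximated by looking for `None` in the supplied values rather than by inspecting the column schema.

```python
# Hypothetical local pre-check; none of these names come from the AutoML API.
MAX_CATEGORY_VALUES = 100  # limit stated in the field description


def check_target_column(values, type_code):
    """Return a list of reasons the column would be rejected as a target."""
    problems = []

    # Only CATEGORY and FLOAT64 are accepted as target types.
    if type_code not in ("CATEGORY", "FLOAT64"):
        problems.append(f"unsupported type: {type_code}")

    # Assumption: nullability is approximated by the presence of None values.
    if any(v is None for v in values):
        problems.append("column contains null values")

    # CATEGORY targets may have at most 100 distinct values across all rows.
    if type_code == "CATEGORY":
        distinct = {v for v in values if v is not None}
        if len(distinct) > MAX_CATEGORY_VALUES:
            problems.append(
                f"{len(distinct)} distinct values exceeds {MAX_CATEGORY_VALUES}"
            )

    return problems
```

An empty list means the column passed these checks; the service may still apply validation that this sketch does not cover.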
column_spec_id of the primary table column which specifies a possible ML use of the row, i.e. the column will be used to split the rows into TRAIN, VALIDATE and TEST sets. Required type: STRING. This column, if set, must either have all of TRAIN, VALIDATE, TEST among its values, or only have TEST, UNASSIGNED values. In the latter case the rows with the UNASSIGNED value will be assigned by AutoML. Note that if a given ML use distribution makes it impossible to create a "good" model, that call will error describing the issue. If neither this column_spec_id nor the primary table's time_column_spec_id is set, then all rows are treated as UNASSIGNED. NOTE: Updates of this field will instantly affect any other users concurrently working with the dataset.
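The two accepted shapes of the ML use column can be illustrated with a small local check. This is a sketch under one reading of the rule above, not the service's own validation: `ml_use_mode` and the mode labels "explicit" and "auto" are hypothetical names, and the function assumes the first case applies whenever TRAIN, VALIDATE and TEST are all present.

```python
# Hypothetical helper; the names and return labels are illustrative only.
EXPLICIT_SPLITS = {"TRAIN", "VALIDATE", "TEST"}
AUTO_ASSIGNABLE = {"TEST", "UNASSIGNED"}


def ml_use_mode(values):
    """Classify the values of an ML use column.

    Returns "explicit" when TRAIN, VALIDATE and TEST all occur, and "auto"
    when only TEST and/or UNASSIGNED occur (AutoML assigns the UNASSIGNED
    rows). Raises ValueError for any other combination.
    """
    present = set(values)

    if EXPLICIT_SPLITS <= present:
        return "explicit"

    if present and present <= AUTO_ASSIGNABLE:
        return "auto"

    raise ValueError(f"unsupported ML use values: {sorted(present)}")
```

For example, a column holding only TRAIN and TEST fits neither shape and is rejected by this check, because VALIDATE is missing and TRAIN is not allowed in the second case.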
The timestamp when the target_column_correlations field and all descendant ColumnSpec.data_stats and ColumnSpec.top_correlated_columns fields were last (re-)generated. Any changes made to the dataset afterwards are not reflected in these fields' values. The regeneration happens in the background on a best-effort basis.
Classes
TargetColumnCorrelationsEntry
API documentation for automl_v1beta1.types.TablesDatasetMetadata.TargetColumnCorrelationsEntry class.