The ML.TRANSFORM function
This document describes the ML.TRANSFORM function, which you can use
to preprocess feature data. This function processes input data by
applying the data transformations captured in the
TRANSFORM clause
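of an existing model. The statistics that were calculated for data
transformation during model training are applied to the input data of the function.
For more information about which models support feature preprocessing, see
End-to-end user journey for each model.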
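To illustrate what "applying training-time statistics" means, the following hand-written sketch imitates a max-abs style scaling in plain GoogleSQL. It is only a conceptual analogy and not how BigQuery ML implements its preprocessing; the inline `training` and `new_rows` values are made up for the illustration.

```sql
-- Conceptual sketch only: the statistic (the maximum absolute value) is
-- computed once from the training rows and then reused for new rows.
WITH
  training AS (
    SELECT x FROM UNNEST([10.0, -40.0, 20.0]) AS x  -- made-up training values
  ),
  stats AS (
    SELECT MAX(ABS(x)) AS max_abs FROM training     -- 40.0
  ),
  new_rows AS (
    SELECT x FROM UNNEST([20.0, 60.0]) AS x         -- made-up input values
  )
SELECT
  new_rows.x,
  new_rows.x / stats.max_abs AS x_scaled            -- 0.5 and 1.5
FROM new_rows
CROSS JOIN stats
ORDER BY new_rows.x;
```

The point of the sketch is that the divisor comes from the training rows, not from the rows being transformed, so the same input value always maps to the same output for a given model.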
Syntax
ML.TRANSFORM(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) }
)
Arguments
ML.TRANSFORM takes the following arguments:
PROJECT_ID: the project that contains the resource.
DATASET: the BigQuery dataset that contains the resource.
MODEL_NAME: the name of a model. The model must have been created by using a CREATE MODEL statement that includes a TRANSFORM clause to manually preprocess feature data. You can check whether a model uses a TRANSFORM clause by using the bq show command to look at the model's metadata. If the model was trained using a TRANSFORM clause, the model metadata contains a section about the transform columns. The function returns an error if you specify a model that was trained without a TRANSFORM clause.
TABLE_NAME: the name of the input table that contains the feature data to preprocess.
If you specify a value for the TABLE_NAME argument, the input column names in the table must match the input column names in the model's TRANSFORM clause, and their types should be compatible according to BigQuery implicit coercion rules. You can get the input column names and data types from the model's metadata, in the section about the feature columns.
QUERY_STATEMENT: a query that generates the feature data to preprocess. For the supported SQL syntax of the QUERY_STATEMENT clause, see GoogleSQL query syntax.
If you specify a value for the QUERY_STATEMENT argument, the input column names from the query must match the input column names in the model's TRANSFORM clause, and their types should be compatible according to BigQuery implicit coercion rules. You can get the input column names and data types from the model's metadata, in the section about the feature columns.
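As a rough sketch of the QUERY_STATEMENT form, the following query renames and casts source columns so that their names and types line up with the model's input columns. The table `mydataset.raw_penguins` and its `bill_length` and `flipper_length` string columns are hypothetical, and the model is assumed to expect the six input columns shown.

```sql
SELECT *
FROM
  ML.TRANSFORM(
    MODEL `mydataset.mymodel`,
    (
      SELECT
        species,
        island,
        -- Rename and cast so that name and type match the model's inputs.
        SAFE_CAST(bill_length AS FLOAT64) AS culmen_length_mm,
        SAFE_CAST(flipper_length AS FLOAT64) AS flipper_length_mm,
        sex,
        body_mass_g
      FROM `mydataset.raw_penguins`  -- hypothetical source table
    ));
```

An explicit cast is used here on the assumption that the source columns are strings; SAFE_CAST returns NULL instead of raising an error when a value can't be converted.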
Output
ML.TRANSFORM returns the columns specified in the model's TRANSFORM clause.
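Because the function returns an ordinary table of transformed columns, one possible pattern is to save the output for later use. The following sketch assumes a model named `mydataset.mymodel` that was trained with a TRANSFORM clause; the destination table name `mydataset.penguins_preprocessed` is hypothetical.

```sql
-- Materialize the preprocessed feature columns into a table.
CREATE OR REPLACE TABLE `mydataset.penguins_preprocessed` AS
SELECT *
FROM
  ML.TRANSFORM(
    MODEL `mydataset.mymodel`,
    TABLE `bigquery-public-data.ml_datasets.penguins`);
```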
Example
The following example returns feature data that has been preprocessed by
using the TRANSFORM clause included in the model named mydataset.mymodel
in your default project.
Create the model that contains the TRANSFORM clause:
```sql
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    species,
    island,
    ML.MAX_ABS_SCALER(culmen_length_mm) OVER () AS culmen_length_mm,
    ML.MAX_ABS_SCALER(flipper_length_mm) OVER () AS flipper_length_mm,
    sex,
    body_mass_g)
  OPTIONS (
    model_type = 'linear_reg',
    input_label_cols = ['body_mass_g'])
AS (
  SELECT *
  FROM `bigquery-public-data.ml_datasets.penguins`
  WHERE body_mass_g IS NOT NULL
);
```

Return feature data preprocessed by the model's TRANSFORM clause:

```sql
SELECT
  *
FROM
  ML.TRANSFORM(
    MODEL `mydataset.mymodel`,
    TABLE `bigquery-public-data.ml_datasets.penguins`);
```

The result is similar to the following:

```
+-------------------------------------+--------+---------------------+---------------------+--------+-----------------+-------------+
| species                             | island | culmen_length_mm    | flipper_length_mm   | sex    | culmen_depth_mm | body_mass_g |
+-------------------------------------+--------+---------------------+---------------------+--------+-----------------+-------------+
| Adelie Penguin (Pygoscelis adeliae) | Dream  | 0.61409395973154368 | 0.79653679653679654 | Female | 18.4            | 3475.0      |
| Adelie Penguin (Pygoscelis adeliae) | Dream  | 0.66778523489932884 | 0.79653679653679654 | Male   | 19.1            | 4650.0      |
+-------------------------------------+--------+---------------------+---------------------+--------+-----------------+-------------+
```

What's next
For information about feature preprocessing, see Feature preprocessing overview.