You can create predictions using an existing dataset (for example, the one you
were using for backtesting). However, in a production environment, we recommend
that you create a new dataset for each prediction run.
As a customer, you're responsible for tracking lineage from dataset to model.
To ensure that the data remains unchanged, we recommend that you create a
BigQuery table snapshot of your BigQuery tables after they pass data
validation, and then reference the snapshot in the AML AI dataset. AML AI
operations read the BigQuery tables each time an operation uses the AML AI
dataset. If you reference regularly updated tables, changes to the underlying
BigQuery tables could therefore impact tuning, training, backtesting, and
predictions.
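The snapshot recommendation above can be sketched as follows. This is a minimal illustration, not an official procedure: it only builds BigQuery `CREATE SNAPSHOT TABLE ... CLONE ...` statements, one per validated table, so that each prediction run gets its own frozen copy. The helper name `snapshot_ddl`, the project and dataset names, the table list, and the run-label suffix are all hypothetical placeholders; substitute the tables that actually make up your AML AI input data.

```python
def snapshot_ddl(project, source_dataset, snapshot_dataset, tables, run_label):
    """Build one CREATE SNAPSHOT TABLE statement per source table.

    All arguments are illustrative placeholders. `run_label` is appended to
    each snapshot name so every prediction run references its own snapshot.
    """
    statements = []
    for table in tables:
        source = f"`{project}.{source_dataset}.{table}`"
        snapshot = f"`{project}.{snapshot_dataset}.{table}_{run_label}`"
        statements.append(f"CREATE SNAPSHOT TABLE {snapshot} CLONE {source}")
    return statements


# Hypothetical names; replace with your own project, datasets, and tables.
for sql in snapshot_ddl(
    project="my-project",
    source_dataset="aml_input",
    snapshot_dataset="aml_snapshots",
    tables=["party", "transaction"],
    run_label="20250829",
):
    print(sql)
```

You could run each generated statement in the BigQuery console or through a client library after data validation passes, and then point the AML AI dataset at the snapshot tables rather than at the live tables.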
Before creating prediction results, you must
create a BigQuery dataset
for these outputs. Any BigQuery dataset can be used for
prediction outputs, as long as the
correct permissions are granted
and the dataset is in the same project where the API is enabled and in the same
location as the AML AI instance.
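As a rough illustration of the placement requirements above, the following sketch compares the project and location of an output dataset with those of an AML AI instance before you attempt to create prediction results. The `Placement` type and the `output_dataset_problems` function are hypothetical helpers, not part of any Google Cloud library. The sketch assumes the API is enabled in the instance's project, treats location names as case-insensitive, and does not check permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Project and location of a resource (hypothetical helper type)."""
    project: str
    location: str


def output_dataset_problems(instance, output_dataset):
    """Return a list of placement mismatches; an empty list means none found."""
    problems = []
    if instance.project != output_dataset.project:
        problems.append(
            f"project mismatch: {output_dataset.project} != {instance.project}"
        )
    # Assumption: location names are compared case-insensitively.
    if instance.location.lower() != output_dataset.location.lower():
        problems.append(
            f"location mismatch: {output_dataset.location} != {instance.location}"
        )
    return problems


# Hypothetical values for illustration only.
instance = Placement(project="my-project", location="us-central1")
dataset = Placement(project="my-project", location="europe-west2")
print(output_dataset_problems(instance, dataset))
```

Returning a list of problems rather than raising on the first mismatch lets you report every placement issue in one pass.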
After you snapshot the tables, follow the guidance under [Prepare Data for AML AI](/financial-services/anti-money-laundering/docs/understand-data-model-requirements) to prepare your BigQuery tables, and then [create a separate AML AI dataset for prediction](/financial-services/anti-money-laundering/docs/create-and-manage-datasets) using the snapshotted tables. To create the BigQuery datasets and tables, you can use the commands in [Prepare BigQuery datasets and tables](/financial-services/anti-money-laundering/docs/prepare-bq-datasets-tables).

AML AI generates prediction outputs (risk scores and explainability) in BigQuery when you [create a prediction results resource](/financial-services/anti-money-laundering/docs/reference/rest/v1/projects.locations.instances.predictionResults/create).

After you have the dataset for prediction, a [trained model resource](/financial-services/anti-money-laundering/docs/reference/rest/v1/projects.locations.instances.models), and a BigQuery dataset for output, you can create prediction results. To do this, see [Create and manage prediction results](/financial-services/anti-money-laundering/docs/create-and-manage-prediction-results).

Note: Before you begin, you need a [model](/financial-services/anti-money-laundering/docs/overview-model-preparation), and you must [register all parties](/financial-services/anti-money-laundering/docs/register-parties) that appear in the dataset you are using for prediction. The operations described on this page take several hours to complete. To check the status of an operation, see [Manage long-running operations](/financial-services/anti-money-laundering/docs/manage-long-running-operations).

Last updated 2025-08-29 UTC.