This page briefly covers concepts behind model training. An AML AI
model resource represents a trained model that can be used to generate risk
scores and explainability.
When to train or re-train
AML AI trains a model as part of creating a
Model resource. The model
must be trained before it can be evaluated (that is, backtested) or used to
generate prediction results.
For best performance and to maintain the most up-to-date models, consider
monthly re-training. However, a given engine version supports generating
prediction results for 12 months after a newer minor engine version is
released.
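The 12-month window can be worked out with simple date arithmetic. The following is a minimal sketch, not part of AML AI: the function names are hypothetical, the release date is a value you supply yourself, and it assumes that "12 months" means 12 calendar months, with the day clamped to the end of shorter months.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month
    (for example, Feb 29 plus 12 months gives Feb 28).
    """
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def prediction_support_ends(newer_minor_release: date, window_months: int = 12) -> date:
    """Assumed end of the prediction window for the older engine version.

    `newer_minor_release` is the release date of the newer minor engine
    version (an input you look up yourself; hypothetical here).
    """
    return add_months(newer_minor_release, window_months)
```

For example, `prediction_support_ends(date(2024, 5, 15))` returns `date(2025, 5, 15)` under these assumptions.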
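Note: Training is a billable operation requiring significant compute resources
and may take days to complete. For more information, see the Pricing page.
How to train
To train a model (that is, create a model), see
Create and manage models.
In particular, you need to select the following:
The data to use for training:
Specify a dataset and an end time within the date range of the dataset.
Training uses labels and features based on complete calendar months up to,
but not including, the month of the selected end time. For more information,
see Dataset time ranges.
An engine config created using a consistent dataset:
See Configure an engine.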
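The calendar-month rule for the end time can be illustrated with a short sketch. The function name is hypothetical, and the sketch assumes the end time is already expressed in the time zone used by your dataset. It only identifies the last complete month that falls before the month of the end time.

```python
from datetime import date, datetime


def last_complete_month(end_time: datetime) -> date:
    """Return the first day of the last full calendar month before
    the month that contains `end_time`.

    The month containing `end_time` is itself excluded, even when
    `end_time` falls on the last day of that month.
    """
    if end_time.month == 1:
        # January end time: the last complete month is December of the prior year.
        return date(end_time.year - 1, 12, 1)
    return date(end_time.year, end_time.month - 1, 1)
```

For example, an end time of 2025-03-15 gives February 2025 as the last month used, and an end time of 2025-01-01 gives December 2024.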
Training output
Training generates a Model resource, which can be used to do the following:
Create backtest results, which are used to evaluate model performance using
currently-known true positives
Create prediction results, which are used once you are ready to start
reviewing new cases for potential money laundering
The model metadata contains the missingness metric, which can be used to assess
dataset consistency (for example, by comparing the missingness values of
feature families from different operations).
Metric name: Missingness
Metric description: Share of missing values across all features in each
feature family. Ideally, all AML AI feature families should have a
Missingness close to 0. Exceptions may occur where the data underlying
those feature families is unavailable for integration. A significant change
in this value for any feature family between tuning, training, evaluation,
and prediction can indicate inconsistency in the datasets used.
Metric name: Importance
Metric description: A metric that shows the importance of a feature family
to the model. Higher values indicate more significant use of the feature
family in the model. A feature family that is not used in the model has zero
importance. Importance values can be used to decide which family skew
results to act on first. For example, a given skew value is more urgent to
resolve for a family with higher importance to the model.
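Comparing missingness between operations can be sketched as follows. This is an illustration only: it assumes you have already extracted the per-family missingness values from the exported metadata of two operations into plain dictionaries, and the function name, family names, and the 0.05 tolerance are hypothetical choices rather than AML AI defaults.

```python
def missingness_shifts(reference, candidate, tolerance=0.05):
    """Return feature families whose missingness changed noticeably.

    `reference` and `candidate` map a feature family name to its
    missingness (a share between 0 and 1) for two operations, for
    example training and prediction. Families that appear in only one
    mapping are reported with a value of None.
    """
    shifts = {}
    for family in sorted(set(reference) | set(candidate)):
        ref = reference.get(family)
        cand = candidate.get(family)
        if ref is None or cand is None:
            shifts[family] = None  # family missing from one operation
            continue
        delta = cand - ref
        if abs(delta) > tolerance:
            shifts[family] = round(delta, 6)
    return shifts


# Hypothetical values for two operations.
training = {"family_a": 0.01, "family_b": 0.02}
prediction = {"family_a": 0.02, "family_b": 0.40}
flagged = missingness_shifts(training, prediction)
```

Here only `family_b` is flagged, because its missingness rose by more than the chosen tolerance. A family flagged this way is a prompt to check the underlying datasets, not proof of a problem.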
Model metadata does not contain recall metrics from a test set. To generate
recall measurements for a specific time period (for example, the test set), see
Evaluate a model.