Understand data scope and duration
AML AI is set up to assess money laundering risk for one
line of business (LoB). An LoB is associated with either your retail
customers or your commercial customers.
When creating a dataset for use with an LoB, you will need to include several
tables. Each table should cover a sufficient time range. This page gives an
overview of the tables you will need and shows how to determine the time range
that each should cover.
Tables to use
The BigQuery dataset used with AML AI should contain the following tables:
Party: All parties relevant to that LoB
Retail LoB: All retail banking customers that have held accounts at
any point in the required time range
Commercial LoB: All commercial banking customers (legal and natural
entities) that have held accounts at any point in the required time range
AccountPartyLink: Full history of which accounts were
held by which parties. This should cover all accounts for products and
services when any party in the Party table was the primary account holder at
any point in the required time range.
Transaction: All transactions for accounts in the
AccountPartyLink table for the required time range.
RiskCaseEvent: All risk case events (see event type
values) for any risk case and party in the Party table with an
AML_PROCESS_START (start of investigation) in the required time range. This
table may include events that have an event time earlier or later than the
required time range.
PartySupplementaryData: (If used) For 0 to 100 unique
party_supplementary_data_id values, include a full history of the values of
these fields for all parties in the Party table for the required time range.
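As a quick sanity check before running an operation, the table list above can be compared against what a dataset actually contains. The following is a minimal sketch in plain Python. The helper name `missing_tables` is hypothetical, and the table names are copied from this page, so they are an assumption about how the tables are named in your own BigQuery dataset.

```python
# Table names as written on this page (assumption: your dataset may use
# different identifiers). PartySupplementaryData is optional ("if used").
REQUIRED_TABLES = {"Party", "AccountPartyLink", "Transaction", "RiskCaseEvent"}
OPTIONAL_TABLES = {"PartySupplementaryData"}


def missing_tables(present):
    """Return the required table names that are absent from `present`, sorted."""
    return sorted(REQUIRED_TABLES - set(present))


# Hypothetical dataset that lacks the AccountPartyLink table.
print(missing_tables(["Party", "Transaction", "RiskCaseEvent"]))
# ['AccountPartyLink']
```

The list passed in could come from any source, for example the table IDs returned by the BigQuery client's `list_tables` call. This check covers only the presence of tables, not their content or time range.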
Using additional data
See Supplementary data if you have additional data on parties
(not otherwise covered in the schema) that is relevant to identifying money
laundering risk.
Dataset time range
The time range that any table in a dataset should cover can be worked out as
follows for any given operation. You will need to know:
The end time. This is the latest time from which labels are used and from which data is used to generate features for tuning.
The Engine Version you will use (see the list of engine versions).
The operation you will conduct: tune, train, predict or backtest.
For predict or backtest operations, the number of periods for which you will conduct the operation, to be specified in the API call.
First, work out the number of periods the operation will use. This is
the number of consecutive months, ending with the last full calendar month before
the specified end time, for which AML AI will evaluate model features.
For predict and backtest operations, this is the number of prediction
periods or backtest periods specified in the API call.
For other operations this depends on the Engine Version and the operation.
For example, v004.004 Engine Versions use 18 periods for tuning and 15 for
training.
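To make the definition of periods concrete, the sketch below lists the calendar months that count as periods for a given end time. The function name `period_months` and the example end time are hypothetical. The count of 18 is the tuning value quoted above for v004.004 Engine Versions. The code assumes that the last full calendar month is the month immediately before the one containing the end time.

```python
from datetime import date


def period_months(end_time: date, num_periods: int) -> list[tuple[int, int]]:
    """Return (year, month) for each period, oldest first."""
    # Index months as year * 12 + (month - 1). The last full calendar month
    # before end_time is the month preceding the one that contains end_time.
    last_full = end_time.year * 12 + (end_time.month - 1) - 1
    first = last_full - num_periods + 1
    return [(idx // 12, idx % 12 + 1) for idx in range(first, last_full + 1)]


# Hypothetical end time; 18 tuning periods as quoted for v004.004.
periods = period_months(date(2025, 1, 15), 18)
print(periods[0], periods[-1])  # (2023, 7) (2024, 12)
```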
Next you should work out the lookback window for each table. This is the maximum
number of months of data needed from that table for AML AI to calculate model
features for a given period.
For example, for v004.004 Engine Versions, this is 13 months for Transaction
and AccountPartyLink tables, 12 months for the RiskCaseEvent table and 0
months for Party and PartySupplementaryData tables.
The dataset will need to cover the lookback window for all of the periods used
by the chosen operation. You can calculate the number of full calendar months of
data prior to the end time that you will need for a given operation with the
following formula:
number of periods + lookback window - 1
For example, for v004.00X Engine Versions conducting tuning, you require:
18 + 13 - 1 = 30 months of data from the Transaction and AccountPartyLink
tables,
18 + 12 - 1 = 29 months of data from the RiskCaseEvent table, as well as
any more recent events for risk cases in the table,
and 18 + 0 - 1 = 17 months of data from the Party and PartySupplementaryData
tables.
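The arithmetic above can be reproduced with a short script. This is a minimal sketch rather than an official tool. The function names and the example end time are hypothetical, and the period and lookback values are the ones quoted on this page for tuning with v004.004 Engine Versions. Turning a month count into a start date assumes that the required data begins on the first day of the earliest full calendar month.

```python
from datetime import date

# Values quoted in the text for v004.004 tuning; other versions may differ.
TUNING_PERIODS = 18
LOOKBACK_MONTHS = {
    "Transaction": 13,
    "AccountPartyLink": 13,
    "RiskCaseEvent": 12,
    "Party": 0,
    "PartySupplementaryData": 0,
}


def months_required(num_periods: int, lookback_window: int) -> int:
    """Apply the formula: number of periods + lookback window - 1."""
    return num_periods + lookback_window - 1


def earliest_month_start(end_time: date, months: int) -> date:
    """First day of the month `months` calendar months before end_time's month."""
    idx = end_time.year * 12 + (end_time.month - 1) - months
    return date(idx // 12, idx % 12 + 1, 1)


end_time = date(2025, 1, 15)  # hypothetical end time
for table, lookback in LOOKBACK_MONTHS.items():
    needed = months_required(TUNING_PERIODS, lookback)
    print(table, needed, earliest_month_start(end_time, needed))
# Transaction 30 2022-07-01
# AccountPartyLink 30 2022-07-01
# RiskCaseEvent 29 2022-08-01
# Party 17 2023-08-01
# PartySupplementaryData 17 2023-08-01
```

Note that the RiskCaseEvent table may also need events later than the end time for risk cases that started in range, so the computed start date is only a lower bound on its coverage.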
Last updated 2025-08-29 UTC.