Configure external datasets
This page describes an optional step to configure external datasets for
the Cortex Framework Data Foundation deployment. Some advanced
use cases might require external datasets to complement an enterprise system of
record. In addition to external exchanges consumed from
BigQuery sharing (formerly Analytics Hub),
some datasets might need custom or tailored methods to ingest data
and join them with the reporting models.
To enable the following external datasets, set k9.deployDataset to True
for each Dataset that you want deployed.
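As a minimal sketch of what setting such a flag could look like, the following assumes that the deployment configuration is a JSON document in which the dotted name k9.deployDataset maps to nested keys. The helper set_flag is hypothetical and not part of Cortex Framework, and deployDataset is kept as the placeholder used above.

```python
import json


def set_flag(config: dict, dotted_key: str, value) -> dict:
    """Set a nested key such as 'k9.deployDataset' in a config dictionary."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Create intermediate sections that do not exist yet.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


# Hypothetical starting configuration; a real configuration file has more keys.
config = {"k9": {}}
set_flag(config, "k9.deployDataset", True)

# Python's True is serialized as JSON true.
print(json.dumps(config, indent=2))
```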
Configure the Directed Acyclic Graphs (DAGs) for the supported external datasets
following these steps:
Holiday Calendar: This DAG retrieves the special dates from
PyPI Holidays.
Note: If using sample data, keep default values.
In holiday_calendar.ini, adjust the list of countries, the list of years,
and the other DAG parameters used to retrieve holidays.
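The following sketch illustrates how a list of countries and a list of years could be read from an INI file and passed to the holidays package. The section name, the key names, and the sample values are assumptions made for illustration; check holiday_calendar.ini for the names that the DAG actually uses.

```python
import configparser

# Hypothetical file contents: the section and key names are illustrative only.
SAMPLE_INI = """
[holiday_calendar]
countries = US, DE, BR
years = 2023, 2024
"""


def parse_list(raw: str) -> list[str]:
    """Split a comma-separated setting into trimmed, non-empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def load_settings(text: str) -> tuple[list[str], list[int]]:
    """Read the country codes and years from INI-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["holiday_calendar"]
    countries = parse_list(section["countries"])
    years = [int(year) for year in parse_list(section["years"])]
    return countries, years


def fetch_holidays(countries: list[str], years: list[int]) -> dict:
    """Return {country: {date: holiday name}} using the holidays package."""
    import holidays  # third-party; imported lazily so parsing works without it

    return {c: dict(holidays.country_holidays(c, years=years)) for c in countries}


countries, years = load_settings(SAMPLE_INI)
print(countries, years)
```

Calling fetch_holidays(countries, years) requires the holidays package to be installed; the parsing step does not.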
Trends: This DAG retrieves Interest Over Time for a specific set
of terms from Google Search trends.
The terms can be configured in trends.ini.
After an initial run, adjust the start_date to 'today 7-d' in
trends.ini.
Familiarize yourself with the results returned for the
different terms so that you can tune the parameters.
We recommend splitting large lists of terms across multiple copies of this DAG
that run at different times.
For more information about the underlying library being used, see Pytrends.
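The recommendation to split large lists across several copies of the DAG can be sketched as follows. The batch size, the start hour, the gap between runs, and the sample terms are all illustrative assumptions, and the helper functions are hypothetical rather than part of the Trends DAG.

```python
from datetime import time


def partition_terms(terms: list[str], batch_size: int) -> list[list[str]]:
    """Split the term list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [terms[i:i + batch_size] for i in range(0, len(terms), batch_size)]


def staggered_starts(batch_count: int, first_hour: int = 1,
                     gap_hours: int = 2) -> list[time]:
    """Assign each batch its own start time, spaced gap_hours apart."""
    return [time(hour=(first_hour + i * gap_hours) % 24)
            for i in range(batch_count)]


# Illustrative search terms only.
terms = ["retail", "logistics", "inventory", "supply chain",
         "forecast", "pricing", "returns"]
batches = partition_terms(terms, batch_size=3)

# Each (start time, batch) pair would correspond to one copy of the DAG.
for batch, start in zip(batches, staggered_starts(len(batches))):
    print(start.strftime("%H:%M"), batch)
```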
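Weather: By default, this DAG uses the publicly available test dataset
bigquery-public-data.geo_openstreetmap.planet_layers.
The query also relies on a NOAA dataset that is only available
through Sharing: noaa_global_forecast_system.
This dataset needs to be created in the same region as the other datasets before you run the deployment. If the datasets aren't available in your region, you can continue
with the following instructions to transfer the data into the chosen region:
Go to the Sharing (Analytics Hub) page.
Click Search listings.
Search for NOAA Global Forecast System.
Click Subscribe.
When prompted, keep noaa_global_forecast_system as the name
of the dataset. If needed, adjust the name of the dataset and
table in the FROM clauses in weather_daily.sql.
Repeat the listing search for the OpenStreetMap Public Dataset.
Adjust the FROM clauses that contain
bigquery-public-data.geo_openstreetmap.planet_layers in
postcode.sql.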
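Adjusting the FROM clauses amounts to swapping one table reference for another. The sketch below shows the idea on a one-line sample query; it assumes that the reference is backtick-quoted in the SQL text, and the target project and dataset names are hypothetical.

```python
def retarget_table(sql: str, old_ref: str, new_ref: str) -> str:
    """Replace every backtick-quoted reference to old_ref with new_ref."""
    old, new = f"`{old_ref}`", f"`{new_ref}`"
    if old not in sql:
        # Fail loudly rather than silently leaving the query unchanged.
        raise ValueError(f"{old} not found in the SQL text")
    return sql.replace(old, new)


# Illustrative query text only; the real postcode.sql is longer.
sample_sql = "SELECT * FROM `bigquery-public-data.geo_openstreetmap.planet_layers`"

# Hypothetical project and dataset located in the target region.
updated = retarget_table(
    sample_sql,
    "bigquery-public-data.geo_openstreetmap.planet_layers",
    "my-project.osm_copy.planet_layers",
)
print(updated)
```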
Sustainability and ESG insights: Cortex Framework combines
SAP supplier performance data with advanced ESG insights to compare
delivery performance, sustainability and risks more holistically across
global operations. For more information,
see the Dun & Bradstreet data source.
General considerations
Sharing
is only supported in EU and US locations,
and some datasets, such as NOAA Global Forecast, are only offered
in a single multi-region location.
If you are targeting a location that is different
from the one available for the required dataset, we recommend creating
a scheduled query
to copy the new records from the Sharing
linked dataset, followed by a transfer service
to copy those new records into a dataset located
in the same location or region as the rest of your deployment.
You then need to adjust the SQL files so that they reference the copied dataset.
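One way to copy only the new records in the scheduled query is to filter on a watermark column. The sketch below builds such a statement as a string. It assumes that the staging table already exists with the same schema as the source and that the watermark column is a TIMESTAMP; the table names and the column name are hypothetical.

```python
def incremental_copy_sql(source: str, staging: str, watermark_column: str) -> str:
    """Build a statement that appends rows newer than the staging table's latest row."""
    return (
        f"INSERT INTO `{staging}`\n"
        f"SELECT * FROM `{source}`\n"
        f"WHERE {watermark_column} > (\n"
        # Fall back to the epoch when the staging table is still empty.
        f"  SELECT IFNULL(MAX({watermark_column}), TIMESTAMP '1970-01-01 00:00:00+00')\n"
        f"  FROM `{staging}`\n"
        f")"
    )


# Hypothetical table and column names, for illustration only.
query = incremental_copy_sql(
    source="my-project.noaa_global_forecast_system.forecast_table",
    staging="my-project.weather_staging.forecast_table",
    watermark_column="creation_time",
)
print(query)
```

The transfer service would then copy the staging dataset into the location used by the rest of the deployment.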
Before copying these DAGs to Cloud Composer, add the following required
Python modules as dependencies:
pytrends~=4.9.2
holidays
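As a quick local check before uploading the DAGs, the following sketch reports which of the required modules are not installed in the current Python environment. It uses only the standard library; the helper missing_packages is hypothetical, and the check does not replace adding the packages to the Cloud Composer environment.

```python
from importlib import metadata


def missing_packages(names: list[str]) -> list[str]:
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# The two modules that the DAGs require.
print(missing_packages(["pytrends", "holidays"]))
```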