Configure external datasets
This page describes an optional step to configure external datasets for the Cortex Framework Data Foundation deployment. Some advanced use cases might require external datasets to complement an enterprise system of record. In addition to external exchanges consumed from BigQuery sharing (formerly Analytics Hub), some datasets might need custom or tailored methods to ingest data and join them with the reporting models.
To enable the following external datasets, set `k9.deployDataset` to `True` if you want the dataset to be deployed.
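As a minimal sketch, the flag in the deployment configuration file might look like the following. The surrounding keys are omitted and the exact nesting is an assumption; verify it against the `config.json` shipped with your Data Foundation version:

```json
{
  "k9": {
    "deployDataset": true
  }
}
```

Note that JSON spells the boolean as lowercase `true`.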
Configure the Directed Acyclic Graphs (DAGs) for the supported external datasets by following these steps:
1. **Holiday Calendar**: This DAG retrieves special dates from PyPi Holidays. If you are using sample data, keep the default values.
   - Adjust the list of countries, the list of years, and other DAG parameters to retrieve holidays in `holiday_calendar.ini`.
2. **Trends**: This DAG retrieves *Interest Over Time* for a specific set of terms from Google Search trends. The terms can be configured in `trends.ini`.
   - After an initial run, adjust `start_date` in `trends.ini` to `'today 7-d'`.
   - Familiarize yourself with the results from the different terms in order to tune the parameters.
   - We recommend partitioning large lists into multiple copies of this DAG running at different times.
   - For more information about the underlying library, see Pytrends.
3. **Weather**: By default, this DAG uses the publicly available test dataset `bigquery-public-data.geo_openstreetmap.planet_layers`. The query also relies on a NOAA dataset that is only available through Sharing: `noaa_global_forecast_system`.

   **This dataset needs to be created in the same region as the other datasets before executing the deployment.** If the datasets aren't available in your region, follow these instructions to transfer the data into the chosen region:
   1. Go to the **Sharing (Analytics Hub)** page.
   2. Click **Search listings**.
   3. Search for **NOAA Global Forecast System**.
   4. Click **Subscribe**.
   5. When prompted, keep `noaa_global_forecast_system` as the name of the dataset. If needed, adjust the name of the dataset and table in the `FROM` clauses in `weather_daily.sql`.
   6. Repeat the listing search for the dataset **OpenStreetMap Public Dataset**.
   7. Adjust the `FROM` clauses containing `bigquery-public-data.geo_openstreetmap.planet_layers` in `postcode.sql`.
4. **Sustainability and ESG insights**: Cortex Framework combines SAP supplier performance data with advanced ESG insights to compare delivery performance, sustainability, and risk more holistically across global operations. For more information, see the Dun & Bradstreet data source.
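For illustration, the `FROM`-clause adjustment described in the Weather steps might look like the following in `postcode.sql`, where the target project and dataset names are placeholders for your own region-local copy:

```sql
-- Original reference to the public multi-region dataset:
--   FROM `bigquery-public-data.geo_openstreetmap.planet_layers`
-- Adjusted to a copy transferred into your deployment's region
-- (project and dataset names are placeholders):
FROM `my-project.my_region_dataset.planet_layers`
```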
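The country and year lists for the Holiday Calendar DAG live in plain INI values. A minimal sketch of reading such values with Python's `configparser` follows; the section and key names here are hypothetical, so check them against the actual `holiday_calendar.ini` shipped with the Data Foundation:

```python
import configparser

# Hypothetical excerpt mirroring the shape of holiday_calendar.ini.
# The real section and key names may differ -- adjust to match the file.
SAMPLE = """
[holiday]
countries = US,GB,DE
years = 2024,2025
"""

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE)

# Comma-separated INI values are read as strings and split manually.
countries = [c.strip() for c in cfg["holiday"]["countries"].split(",")]
years = [int(y.strip()) for y in cfg["holiday"]["years"].split(",")]
print(countries, years)
```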
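The partitioning recommendation for the Trends DAG can be sketched as a simple batching helper. The default batch size of 5 is an assumption based on the per-request keyword limit commonly cited for Pytrends; verify it against the library's documentation:

```python
def partition_terms(terms, batch_size=5):
    """Split a large term list into fixed-size batches, e.g. one batch
    per copy of the Trends DAG scheduled at a different time."""
    return [terms[i:i + batch_size] for i in range(0, len(terms), batch_size)]
```

Each resulting batch can then be assigned to its own copy of the DAG, spreading requests out over time.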
General considerations
- Sharing is only supported in EU and US locations, and some datasets, such as NOAA Global Forecast, are only offered in a single multi-region location.

  If you are targeting a location different from the one available for the required dataset, we recommend creating a scheduled query to copy the new records from the Sharing linked dataset, followed by a transfer service to copy those records into a dataset in the same location or region as the rest of your deployment. You then need to adjust the SQL files.
- Before copying these DAGs to Cloud Composer, add the required Python modules as dependencies:
Required modules:
pytrends~=4.9.2
holidays
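One way to stage these dependencies is to pin them in a requirements file and pass it to the Composer environment. The environment name and location below are placeholders, and the `gcloud` step is shown commented out:

```shell
# Pin the required modules in a requirements file.
cat > /tmp/cortex-dag-requirements.txt <<'EOF'
pytrends~=4.9.2
holidays
EOF
cat /tmp/cortex-dag-requirements.txt

# Then install them into the Cloud Composer environment (not run here;
# environment name and location are placeholders):
#   gcloud composer environments update MY_ENV --location us-central1 \
#     --update-pypi-packages-from-file /tmp/cortex-dag-requirements.txt
```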
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated (UTC): 2025-08-18.