Use query templates
BigQuery data clean room query templates accelerate time to insight and provide additional layers of security and control to minimize data exfiltration concerns. By predefining and limiting the queries that can be executed in data clean rooms, you can do the following:
Help prevent the leakage of sensitive data. When subscribers that are running queries in a clean room are given greater flexibility to explore, data owners feel there is a higher risk of accidental or intentional exposure of sensitive information.
Simplify onboarding and adoption for less-technical users. Many data providers expect subscribers of clean rooms to be less technically proficient, especially in writing privacy-centric SQL queries and understanding how to appropriately allocate privacy budgets.
Guarantee consistent analytical outcomes for data subscribers. It becomes more difficult to enforce specific data analysis rules and verify compliance with privacy regulations when there is no way to control the queries that are being executed in a data clean room.
Query templates let data owners and contributors create predefined and approved queries that are specifically tailored to the use cases of the data clean room, and to publish these queries for consumption by subscribers. Predefined queries use table-valued functions (TVFs) in BigQuery to allow an entire table or specific fields to be passed as input parameters and return a table as the output.
Limitations
- Query templates only support a maximum of two data references—that is, the
data used to define the TVF's query and the data parameter input accepted by
the TVF.
- Multiple tables or views can be referenced within the TVF's query definition, but they must all belong to the same data owner or party.
- Query template TVFs only support
TABLE
andVIEW
fixed types. - Query template definitions are subject to the same limitations as TVFs.
Before you begin
Enable the Analytics Hub API for your Google Cloud project by following these steps:
Console
Go to the Analytics Hub API page.
In the Google Cloud console toolbar, select your project.
If the API is not already enabled, click Enable.
bq
Run the gcloud services enable
command:
gcloud services enable analyticshub.googleapis.com
Required roles
To get the permissions that you need to perform the tasks in this document, ask your administrator to grant you the following IAM roles:
-
Create or delete a TVF in a data clean room:
-
Analytics Hub Publisher (
roles/analyticshub.publisher
) on the project -
Analytics Hub Subscriber (
roles/analyticshub.subscriber
) on the project
-
Analytics Hub Publisher (
-
Authorize a TVF:
BigQuery Data Owner (
roles/bigquery.dataOwner
) on the project -
Add, update, or delete a TVF listing in a data clean room:
-
Analytics Hub Publisher (
roles/analyticshub.publisher
) on the project -
Analytics Hub Subscriber (
roles/analyticshub.subscriber
) on the project
-
Analytics Hub Publisher (
-
Create a query template:
-
Analytics Hub Publisher (
roles/analyticshub.publisher
) on the project -
Analytics Hub Subscriber (
roles/analyticshub.subscriber
) on the project
-
Analytics Hub Publisher (
-
Approve a query template:
-
Analytics Hub Publisher (
roles/analyticshub.publisher
) on the project -
BigQuery Data Owner (
roles/bigquery.dataOwner
) on the project
-
Analytics Hub Publisher (
-
Subscribe to a data clean room with query templates:
-
Analytics Hub Subscriber (
roles/analyticshub.subscriber
) on the project -
Analytics Hub Subscription Owner (
roles/analyticshub.subscriptionOwner
) on the project where you want to subscribe to the data clean room
-
Analytics Hub Subscriber (
-
Execute the queries defined in query templates:
-
BigQuery Data Viewer (
roles/bigquery.dataViewer
) on the project -
BigQuery User (
roles/bigquery.user
) on the project
-
BigQuery Data Viewer (
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to perform the tasks in this document. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to perform the tasks in this document:
-
Create or delete a TVF in a data clean room:
-
bigquery.routines.create
on the project -
bigquery.routines.update
on the project -
bigquery.routines.delete
on the project
-
-
Authorize a TVF:
bigquery.datasets.update
on the datasets that the routine accesses -
Create a query template:
-
analyticshub.listings.subscribe
on the project -
analyticshub.queryTemplates.create
on the project
-
-
Approve a query template:
-
bigquery.routines.create
on the project -
bigquery.datasets.update
on the datasets that the routine accesses -
analyticshub.listings.create
on the project -
analyticshub.queryTemplates.approve
on the project
-
You might also be able to get these permissions with custom roles or other predefined roles.
Add an existing TVF to a data clean room
You can add an existing TVF to a data clean room using the Analytics Hub API.
Use the
projects.locations.dataExchanges.listings.create
method.
The following example shows how to call the
projects.locations.dataExchanges.listings.create
method using the curl
command:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -H 'x-goog-user-project:DCR_PROJECT_ID' -X POST https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/listings?listingId=LISTING_ID -d '{"bigqueryDataset":{"dataset":"projects/PROJECT_ID/datasets/DATASET_ID","selectedResources":[{"routine":"projects/PROJECT_ID/datasets/DATASET_ID/tables/ROUTINE_ID"}],},"displayName":LISTING_NAME"}'
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.PROJECT_ID
: the project ID of the project where the source dataset was contained.DATASET_ID
: the source dataset ID.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.LISTING_ID
: the listing ID.LISTING_NAME
: the listing name.ROUTINE_ID
: the routine ID.
Query template roles
There are three main roles for using data clean room query templates. Each role has specific workflows, which are described later in this document.
Template creator: a user that defines the queries to be executed within the clean room. This role is analogous to any of the following roles: Analytics Hub Admin, Analytics Hub Publisher, or Analytics Hub Listing Admin. For more information, see Template creator workflows.
Template approver: the owner of the data who must approve the query template's references before the template is available to use. This role is analogous to any of the following roles: Analytics Hub Admin, Analytics Hub Publisher, or Analytics Hub Listing Admin. For more information, see Template approver workflows.
Template subscriber: a user that subscribes to the clean room and can only run the queries that are approved in the template. This role is analogous to the Analytics Hub Subscriber. role. For more information, see Template subscriber workflows.
Template creator workflows
As a query template creator, you can do the following:
- Create a query template.
- Update a query template.
- Submit a query template for review.
- Delete a query template.
Add a listing to a data clean room
Before creating a query template, you must add data to a data clean room. To create a listing in the data clean room, follow these steps:
Go to the Sharing (Analytics Hub) page.
Click the display name of the data clean room that you want to create the query template in.
Click Add data and follow the steps to create a view with analysis rules configured. For detailed instructions, see Create a listing (add data).
- To add data from other parties, share the clean room with another trusted contributor. This data contributor must also add data to the clean room to be eligible for use in a query template.
Set data egress controls for the listing.
Set the metadata controls for the listing. If you only want to share the schemas and descriptions of the data added in the previous step (and not the shared data itself), select Exclude access to listing from linked dataset.
Review the listing details.
Click Add data. The metadata of the view that is created for your data is now added to the clean room.
Create a query template
Select one of the following options:
Console
Go to the Sharing (Analytics Hub) page.
Click the display name of the data clean room where you want to create the query template.
In the clean room, go to the Templates tab.
Click Create Template.
Enter a template name and description.
Click Next.
You can see the schemas of the views added to the clean room, and you can propose a query definition.
- Be sure to define the query using the supported
CREATE TABLE FUNCTION
syntax. Pass your entire table or view with fixed definitions. You must define the full table path reference, including the project ID and dataset ID, from the data added to the clean room. For example:
query_template1(t1 TABLE<year INT64>) AS (SELECT * FROM `project_id.dataset_id.table_id` WHERE year = table_id.year)
If you applied privacy analysis rules to the data, be sure this TVF includes privacy-specific SQL syntax, for example,
SELECT WITH AGGREGATION_THRESHOLD
.
- Be sure to define the query using the supported
Review the template details.
To save the template without submitting for review, click Save. The query template now has the status of DRAFT.
You can update the query template or submit the query template for review.
API
The following example shows how to create a query template with a CURL
command:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -H 'x-goog-user-project:DCR_PROJECT_ID' -X POST https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/queryTemplates?queryTemplateId=QUERY_TEMPLATE_ID -d 'query_template { name: "query_template1", display_name: "query_template1", routine { definition_body: "QUERY_TEMPLATE_ID(TVF arguments) AS (TVF_DEFINITION)" } }'
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.QUERY_TEMPLATE_ID
: the query template ID.TVF_DEFINITION
: the TVF definition.
The following code sample shows a definition_body
example for the API call.
You must define the full table path reference, including the project ID and
dataset ID, from the data added to the clean room.
query_template1(t1 TABLE<year INT64>) AS (SELECT * FROM `project_id.dataset_id.table_id` WHERE year = table_id.year)
The definition_body
is analogous to the definition of a routine. The
preceding definition_body
translates to the following routine:
CREATE OR REPLACE TABLE FUNCTION <approvers_dataset>.query_template1(t1 TABLE, y INT64)
AS (SELECT * FROM t1 WHERE year > y)
You can update the query template or submit the query template for review.
Update a query template
You can only update a query template if it is in the DRAFT status. If the query template has already been submitted for review, you can no longer modify it.
To update a query template, select one of the following options:
Console
Go to the Sharing (Analytics Hub) page.
Click the display name of the data clean room that contains the query template.
In the clean room, go to the Templates tab.
In the row for the template you want to update, click Actions > Edit template.
Update the template details and query definition as needed.
Click Next.
Review the query template and click Save to save the changes without submitting the template for review.
API
The following example shows how to update a query template with a CURL
command:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -H 'x-goog-user-project:DCR_PROJECT_ID' -X PATCH https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/queryTemplates/QUERY_TEMPLATE_ID?updateMask=display_name -d 'query_template { display_name: "query_template1" }'
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.QUERY_TEMPLATE_ID
: the query template ID.
Submit a query template for review
Select one of the following options:
Console
Go to the Sharing (Analytics Hub) page.
Click the display name of the data clean room that contains the query template.
In the clean room, go to the Templates tab.
In the row for the template you want to submit for review, click Actions > Submit for review. The template now has the status of Requires review.
API
The following example shows how to submit a query template for review with a
CURL
command:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -H 'x-goog-user-project:DCR_PROJECT_ID' -X POST https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/queryTemplates/QUERY_TEMPLATE_ID:submit
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.QUERY_TEMPLATE_ID
: the query template ID.
Delete a query template
You can only delete a query template if it is in the DRAFT status. If the query template has already been submitted for review, you can no longer delete it.
Select one of the following options:
Console
Go to the Sharing (Analytics Hub) page.
Click the display name of the data clean room that contains the query template.
In the clean room, go to the Templates tab.
In the row for the template you want to delete, click Actions > Delete template.
API
The following example shows how to delete a query template with a CURL
command:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -H 'x-goog-user-project:DCR_PROJECT_ID' -X DELETE https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/queryTemplates?queryTemplateId=QUERY_TEMPLATE_ID
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.QUERY_TEMPLATE_ID
: the query template ID.
Template approver workflows
As a query template approver, you can approve a query template.
When a TVF references data that you don't own (for example, the other contributor's data), the query template can only be approved by that data's owner. If you are creating a TVF that only references your data (for sharing in one direction), you can approve the query template yourself.
Approve a query template
Select one of the following options:
Console
Go to the Sharing (Analytics Hub) page.
Click the display name of the data clean room that contains the query template.
In the clean room, go to the Templates tab.
In the row for the template that requires your review, click Approval Status > Requires review.
Click Approve.
Select the template location. This location is where the TVF is created for sharing.
Review the proposed query template.
Click Approve if the query template is approved for use within the clean room.
API
Create the routine out of the query template using a
jobserver.query
call:curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -L -X POST https://bigquery.googleapis.com/bigquery/v2/projects/ROUTINE_PROJECT_ID/queries --data '{"query":"ROUTINE_CREATION_QUERY","useLegacySql":false}'
Replace the following:
ROUTINE_PROJECT_ID
: the project ID of the project where the routine was created.ROUTINE_CREATION_QUERY
: the query to create the routine.
Add the routine you created to the data clean room:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -H 'x-goog-user-project:DCR_PROJECT_ID' -X POST https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/listings?listingId=LISTING_ID -d '{"bigqueryDataset":{"dataset":"projects/PROJECT_ID/datasets/DATASET_ID","selectedResources":[{"routine":"projects/PROJECT_ID/datasets/DATASET_ID/tables/ROUTINE_ID"}],},"displayName":"LISTING_NAME"}'
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.LISTING_ID
: the listing ID.PROJECT_ID
: the project ID of the project where the source dataset was contained.DATASET_ID
: the source dataset ID.ROUTINE_ID
: the routine ID.LISTING_NAME
: the listing name.
Update the query template status to
APPROVED
:curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -L -X POST https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID/queryTemplates/QUERY_TEMPLATE_ID:approve --data '{}'
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.QUERY_TEMPLATE_ID
: the query template ID.
Reject a query template
In the Google Cloud console, you can reject a query template in the following ways:
- Don't approve the submitted query template review.
- Delete the query template to remove it from the clean room.
Template subscriber workflows
A query template subscriber can view and subscribe to a data clean room. If only the query template is added to the clean room, subscribing to the clean room only grants access to the corresponding TVF, not to the underlying shared data.
Subscribe to a query template
Select one of the following options:
Console
You subscribe to a query template by subscribing to the data clean room. Access is granted to all the listings that have the Exclude access to listing from linked dataset setting disabled.
To subscribe to a query template, follow these steps:
Go to the BigQuery page.
Navigate to the linked dataset you created when subscribing to the clean room.
Open the routine or TVF shared in the linked dataset.
Click Invoke table function.
Replace the parameter with the accepted input, which is the table name or field.
Click Run.
If you can't view the TVF nested as a child element of the linked dataset in the Explorer panel, you can query the TVF directly on the linked dataset:
SELECT * FROM `myproject.dcr_linked_dataset.mytvf`(TABLE myTable);
API
Use the
projects.locations.dataExchanges.subscribe
method.
The following example shows how to call the projects.locations.dataExchanges.subscribe
method using the curl
command:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" -L -X POST https://analyticshub.googleapis.com/v1/projects/DCR_PROJECT_ID/locations/LOCATION/dataExchanges/CLEAN_ROOM_ID:subscribe --data '{"destination":"projects/SUBSCRIBER_PROJECT_ID/locations/LOCATION","subscription":"SUBSCRIPTION"}'
Replace the following:
DCR_PROJECT_ID
: the project ID of the project where the data clean room was created.LOCATION
: the location of the data clean room.CLEAN_ROOM_ID
: the data clean room ID.SUBSCRIBER_PROJECT_ID
: the project ID of the subscriber project.SUBSCRIPTION
: the name of your subscription.
After you've subscribed to the query template, you can query the TVF directly on the linked dataset:
SELECT * FROM `myproject.dcr_linked_dataset.mytvf`(TABLE myTable);
Example scenarios
Query templates can be used to facilitate different forms of data collaboration within a data clean room. The following sections describe example scenarios.
Single-direction sharing scenario
A data publisher creates a query template to verify that subscribing partners can only run the queries that are defined by the publisher. The query template creators ultimately self-approve the query templates, since no other contributor is added to the clean room.
In this scenario, user A is a data clean room owner who creates a data clean
room called campaign_analysis
and adds a dataset named my_campaign
with a
campaigns
table. User A configures an aggregation threshold policy and
metadata controls to verify that only the metadata schema is visible and
subscribers cannot access the source data. User A then creates a query template
by defining a table-valued function from the campaigns
table, restricting all
the subscribers of the linked dataset to only execute the TVF.
This is the TVF syntax:
campaigns_template(t1 TABLE campaign_ID <STRING> ) AS (
SELECT WITH AGGREGATION_THRESHOLD company_id, company, sum(impressions) FROM myproject.my_campaign.campaigns
group by company_id, company
);
Since user A has the appropriate permissions for the campaigns table with the BigQuery Data Owner role, user A can immediately self-approve the query template after submitting it for review.
Multiple party collaboration sharing
A clean room owner invites a trusted contributor to propose queries to be run against each other's data. Both parties can safely propose queries by viewing metadata schemas only, without accessing the underlying shared data. When a query definition references data that does not belong to the template proposer, the template can only be approved by that data's owner.
In this scenario, user A invites user B, who is a data clean room contributor,
to the campaign_analysis
clean room. User B wants to propose a query template
to join their own data to the campaigns
table by viewing the table's metadata
schema.
This is the TVF syntax:
campaigns_template(t1 TABLE campaign_ID <STRING> ) AS (
SELECT WITH AGGREGATION_THRESHOLD company_id, company, sum(impressions) FROM my_project.my_campaign.campaigns
group by company_id, company
);
Since user B didn't add and doesn't own the campaigns
table, once the query
template is submitted for approval, only user A can approve. To use the query
template, user B has to subscribe to the clean room and invoke the TVF. User B
passes their own table with a field called campaign_ID
as the table parameter,
and they can execute the private SQL defined in the query template. It's
useful to note here that user B doesn't need to add their data to the clean room.
User B also adds a dataset named my_transactions
to the clean room that has a
transactions
table and a products
table. User B configures aggregation
threshold policies and metadata controls to verify that only the metadata
schema is visible and subscribers cannot access the source data.
User A can now propose various query templates to join their own data to the transactions table by viewing the table's metadata schema. The following are examples of TVF syntax:
transactions_template(t1 TABLE user_ID <STRING> ) AS (
SELECT WITH AGGREGATION_THRESHOLD company_id, company, campaign_id, sku, category, date, sum(amount) FROM my_project.my_transactions.transactions
group by company_id, company, campaign_id, sku, category, date
);
transactions_template_with_join(t1 TABLE user_ID <STRING> ) AS (
SELECT WITH AGGREGATION_THRESHOLD t.company_id, t.company, t.campaign_id, t.sku, t.date, p.product_name, p.product_category, sum(t.amount) FROM myproject.my_transactions.transactions t
left join my_project.my_transactions.products p
on t.product_id = p.product_id
group by t.company_id, t.company, t.campaign_id, t.sku, t.date, p.product_name, p.product_category
);
Since user A didn't add and doesn't own the transactions
and products
tables, once the query template is submitted for approval, only user B can
approve. To use the query template, user A has to subscribe to the clean room
and invoke the TVF. User A passes their own table with a field named user_ID
as the table parameter, and they can execute the privacy SQL defined in the
query template. It's useful to note here that user A doesn't need to add their
data to the clean room.
Pricing
Data contributors using query templates are only charged for data storage.
Subscribers using query templates are only charged for compute (analysis) when they run queries.
What's next
- To learn more about data clean rooms, see Share sensitive data with data clean rooms.
- To learn more about subscriptions, see Subscribe to a data clean room.
- To learn more about TVFs, see Table functions.
- To learn more about data egress, see Data egress options (BigQuery shared datasets only).