sql="""WITH TopTermsByDate AS ( SELECT DISTINCT refresh_date AS date, term FROM `bigquery-public-data.google_trends.top_terms` ), DistinctDates AS ( SELECT DISTINCT date FROM TopTermsByDate )SELECT DATE_DIFF(Dates2.date, Date1Terms.date, DAY) AS days_apart, COUNT(DISTINCT (Dates2.date || Date1Terms.date)) AS num_date_pairs, COUNT(Date1Terms.term) AS num_date1_terms, SUM(IF(Date2Terms.term IS NOT NULL, 1, 0)) AS overlap_terms, SAFE_DIVIDE( SUM(IF(Date2Terms.term IS NOT NULL, 1, 0)), COUNT(Date1Terms.term) ) AS pct_overlap_termsFROM TopTermsByDate AS Date1TermsCROSS JOIN DistinctDates AS Dates2LEFT JOIN TopTermsByDate AS Date2Terms ON Dates2.date = Date2Terms.date AND Date1Terms.term = Date2Terms.termWHERE Date1Terms.date <= Dates2.dateGROUP BY days_apartORDER BY days_apart;"""pct_overlap_terms_by_days_apart=client.query(sql).to_dataframe()pct_overlap_terms_by_days_apart.head()
[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-03。"],[],[],null,["# Explore and visualize data in BigQuery from within JupyterLab\n=============================================================\n\n\n| Vertex AI Workbench managed notebooks is\n| [deprecated](/vertex-ai/docs/deprecations). On\n| April 14, 2025, support for\n| managed notebooks will end and the ability to create managed notebooks instances\n| will be removed. Existing instances will continue to function\n| but patches, updates, and upgrades won't be available. To continue using\n| Vertex AI Workbench, we recommend that you\n| [migrate\n| your managed notebooks instances to Vertex AI Workbench instances](/vertex-ai/docs/workbench/managed/migrate-to-instances).\n\n\u003cbr /\u003e\n\nThis page shows you some examples of how to explore and visualize data\nthat is stored in BigQuery from within the JupyterLab interface\nof your Vertex AI Workbench managed notebooks instance.\n\nBefore you begin\n----------------\n\nIf you haven't already,\n[create\na managed notebooks instance](/vertex-ai/docs/workbench/managed/create-instance#create).\n\nOpen JupyterLab\n---------------\n\n1. In the Google Cloud console, go to the **Managed notebooks** page.\n\n [Go to Managed notebooks](https://console.cloud.google.com/vertex-ai/workbench/managed)\n2. Next to your managed notebooks instance's name,\n click **Open JupyterLab**.\n\n Your managed notebooks instance opens JupyterLab.\n\nRead data from BigQuery\n-----------------------\n\nIn the next two sections, you read data from BigQuery\nthat you will use to visualize later. These steps are identical to those\nin [Query data in BigQuery from\nwithin JupyterLab](/vertex-ai/docs/workbench/managed/bigquery), so if you've completed\nthem already, you can skip to\n[Get a summary of data in a BigQuery table](#summary).\n\n### Query data by using the %%bigquery magic command\n\nIn this section, you write SQL directly in notebook cells and read data from\nBigQuery into the Python notebook.\n\nMagic commands that use a single or double percentage character (`%` or `%%`)\nlet you use minimal syntax to interact with BigQuery within the\nnotebook. The BigQuery client library for Python is automatically\ninstalled in a managed notebooks instance. Behind the scenes, the `%%bigquery` magic\ncommand uses the BigQuery client library for Python to run the\ngiven query, convert the results to a pandas DataFrame, optionally save the\nresults to a variable, and then display the results.\n\n**Note** : As of version 1.26.0 of the `google-cloud-bigquery` Python package,\nthe [BigQuery Storage API](/bigquery/docs/reference/storage)\nis used by default to download results from the `%%bigquery` magics.\n\n1. To open a notebook file, select **File \\\u003e New \\\u003e\n Notebook**.\n\n \u003cbr /\u003e\n\n2. In the **Select Kernel** dialog, select **Python (Local)** , and then click\n **Select**.\n\n Your new IPYNB file opens.\n3. 
To get the number of regions by country in the `international_top_terms`\n dataset, enter the following statement:\n\n ```sql\n %%bigquery\n SELECT\n country_code,\n country_name,\n COUNT(DISTINCT region_code) AS num_regions\n FROM\n `bigquery-public-data.google_trends.international_top_terms`\n WHERE\n refresh_date = DATE_SUB(CURRENT_DATE, INTERVAL 1 DAY)\n GROUP BY\n country_code,\n country_name\n ORDER BY\n num_regions DESC;\n ```\n4. Click play_circle_filled **Run cell**.\n\n The output is similar to the following: \n\n ```\n Query complete after 0.07s: 100%|██████████| 4/4 [00:00\u003c00:00, 1440.60query/s]\n Downloading: 100%|██████████| 41/41 [00:02\u003c00:00, 20.21rows/s]\n country_code country_name num_regions\n 0 TR Turkey 81\n 1 TH Thailand 77\n 2 VN Vietnam 63\n 3 JP Japan 47\n 4 RO Romania 42\n 5 NG Nigeria 37\n 6 IN India 36\n 7 ID Indonesia 34\n 8 CO Colombia 33\n 9 MX Mexico 32\n 10 BR Brazil 27\n 11 EG Egypt 27\n 12 UA Ukraine 27\n 13 CH Switzerland 26\n 14 AR Argentina 24\n 15 FR France 22\n 16 SE Sweden 21\n 17 HU Hungary 20\n 18 IT Italy 20\n 19 PT Portugal 20\n 20 NO Norway 19\n 21 FI Finland 18\n 22 NZ New Zealand 17\n 23 PH Philippines 17\n ...\n ```\n | **Note:** Your results might differ from what is above as the `google_trends` dataset being queried is refreshed with new data on an ongoing basis.\n5. In the next cell (below the output from the previous cell), enter the\n following command to run the same query, but this time save the results to\n a new pandas DataFrame that's named `regions_by_country`. You provide that\n name by using an argument with the `%%bigquery` magic command.\n\n ```sql\n %%bigquery regions_by_country\n SELECT\n country_code,\n country_name,\n COUNT(DISTINCT region_code) AS num_regions\n FROM\n `bigquery-public-data.google_trends.international_top_terms`\n WHERE\n refresh_date = DATE_SUB(CURRENT_DATE, INTERVAL 1 DAY)\n GROUP BY\n country_code, country_name\n ORDER BY\n num_regions DESC;\n ```\n\n **Note:** For more information about available arguments for the\n `%%bigquery` command, see the [client library magics documentation](/python/docs/reference/bigquery/latest/magics).\n6. Click play_circle_filled **Run cell**.\n\n7. In the next cell, enter the following command to look at the first few\n rows of the query results that you just read in:\n\n regions_by_country.head()\n\n8. Click play_circle_filled **Run cell**.\n\n The pandas DataFrame `regions_by_country` is ready to plot.\n\n### Query data by using the BigQuery client library directly\n\n\\\u003c\n\nGet a summary of data in a BigQuery table\n-----------------------------------------\n\n\nIn this section, you use a notebook shortcut to get summary statistics and\nvisualizations for all fields of a BigQuery table. This can\nbe a fast way to profile your data before exploring further.\n\nThe BigQuery client library provides a magic command,\n`%bigquery_stats`, that you can call with a specific table name to provide an\noverview of the table and detailed statistics on each of the table's\ncolumns.\n\n1. In the next cell, enter the following code to run that analysis on the US\n [`top_terms` table](https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=google_trends&t=top_terms&page=table):\n\n %bigquery_stats bigquery-public-data.google_trends.top_terms\n\n2. Click play_circle_filled **Run cell**.\n\n After running for some time, an image appears with various statistics on\n each of the 7 variables in the `top_terms` table. 
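Because the client library exposes the full query API, you can also attach a job configuration when you run a query, for example to cap the number of bytes a query is allowed to bill. The following is a minimal sketch, not part of the original walkthrough: the query, the roughly 1 GB cap, and the `terms_per_day` variable name are illustrative, and it assumes the same public `top_terms` table used earlier.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Illustrative job configuration: fail fast if the query would bill more
# than ~1 GB, and reuse cached results when they are available.
job_config = bigquery.QueryJobConfig(
    maximum_bytes_billed=10**9,
    use_query_cache=True,
)

sql = """
SELECT refresh_date, COUNT(DISTINCT term) AS num_terms
FROM `bigquery-public-data.google_trends.top_terms`
GROUP BY refresh_date
ORDER BY refresh_date DESC;
"""

# Run the query with the job configuration and read the results into a
# pandas DataFrame, as in the previous cell.
terms_per_day = client.query(sql, job_config=job_config).to_dataframe()
terms_per_day.head()
```

If the query would bill more than the configured cap, the job fails with an error instead of running, which can be a useful guardrail while you explore large public datasets.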
Get a summary of data in a BigQuery table
-----------------------------------------

In this section, you use a notebook shortcut to get summary statistics and visualizations for all fields of a BigQuery table. This can be a fast way to profile your data before exploring it further.

The BigQuery client library provides a magic command, `%bigquery_stats`, that you can call with a specific table name to provide an overview of the table and detailed statistics on each of the table's columns.

1. In the next cell, enter the following code to run that analysis on the US [`top_terms` table](https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=google_trends&t=top_terms&page=table):

       %bigquery_stats bigquery-public-data.google_trends.top_terms

2. Click **Run cell**.

   After running for some time, an image appears with various statistics on each of the 7 variables in the `top_terms` table. The following image shows part of some example output:

   | **Note:** Your results might differ from this example output because the `google_trends` dataset is refreshed with new data on an ongoing basis.

Visualize BigQuery data
-----------------------

In this section, you use plotting capabilities to visualize the results from the queries that you previously ran in your Jupyter notebook.

1. In the next cell, enter the following code to use the pandas `DataFrame.plot()` method to create a bar chart that visualizes the results of the query that returns the number of regions by country:

       regions_by_country.plot(kind="bar", x="country_name", y="num_regions", figsize=(15, 10))

2. Click **Run cell**.

   The chart is similar to the following:

3. In the next cell, enter the following code to use the pandas `DataFrame.plot()` method to create a scatter plot that visualizes the results from the query for the percentage of overlap in the top search terms by days apart:

       pct_overlap_terms_by_days_apart.plot(
           kind="scatter",
           x="days_apart",
           y="pct_overlap_terms",
           # Scale each point by the number of date pairs at that distance.
           s=pct_overlap_terms_by_days_apart["num_date_pairs"] * 20,
           figsize=(15, 10),
       )

4. Click **Run cell**.

   The chart is similar to the following. The size of each point reflects the number of date pairs that are that many days apart in the data. For example, there are more pairs that are 1 day apart than 30 days apart because the top search terms are surfaced daily over about a month's time.

For more information about data visualization, see the [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/visualization.html).
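As a final, optional refinement, note that `DataFrame.plot()` returns a matplotlib `Axes` object, so you can label and save the charts with standard matplotlib calls. The following is a minimal sketch rather than part of the original steps; it assumes the `regions_by_country` DataFrame from earlier on this page, and the axis labels, title, and output filename are illustrative.

```python
import matplotlib.pyplot as plt

# DataFrame.plot() returns a matplotlib Axes object that you can customize.
ax = regions_by_country.plot(
    kind="bar", x="country_name", y="num_regions", figsize=(15, 10), legend=False
)
ax.set_xlabel("Country")
ax.set_ylabel("Number of regions")
ax.set_title("Regions per country in international_top_terms")

plt.tight_layout()                      # avoid clipped axis labels
plt.savefig("regions_by_country.png")   # illustrative output filename
```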