GoogleSQL for Bigtable supports built-in table functions.
This topic includes functions that produce columns of a table.
You can only use these functions in the FROM
clause.
Function list
Name | Summary |
---|---|
Temporal filters
|
Access temporal elements of a table by using the Bigtable table name as a function. |
UNPACK
|
Expands the timestamped values in an input row into multiple rows, each row representing a different timestamp; and moves the timestamps into a `_timestamp` column. |
Temporal filters
my_table(
with_history => true,
after => TIMESTAMP(),
after_or_equal => TIMESTAMP(),
before => TIMESTAMP(),
before_or_equal => TIMESTAMP(),
latest_n => 4
)
Description
To apply a temporal filter, use the table name as the function name. The function can be called with named arguments.
Each Bigtable cell contains a unique timestamped version of the data for that row and column. Temporal filters expose the timestamp associated with each cell. They let you query time series data and filter for timestamps and versions.
Arguments
with_history
:BOOL
WhenTRUE
the columns are expanded to include timestamp. More specifically, the columns becomeMAP<KEY_TYPE, ARRAY<STRUCT<TIMESTAMP, VALUE_TYPE>>>
. Default toFALSE
.
The following parameters must also include with_history => true
:
after
:TIMESTAMP
Values with timestamps after the input, exclusive.after_or_equal
:TIMESTAMP
Values with timestamps after the input, inclusive.before
:TIMESTAMP
Values with timestamps before the input, exclusive.before_or_equal
:TIMESTAMP
Values with timestamps before the input, inclusive.latest_n
:INT64
The number of timestamped values to return per column qualifier (map key). Must be greater than or equal to 1.
The following parameters must not include with_history => true
:
as_of
:TIMESTAMP
Returns the latest values with timestamps less than or equal to the provided timestamp.
Return type
TABLE
Example
The following query reads the stats_summary
column in a table called
test_table with with_history
:
SELECT stats_summary FROM test_table(with_history => true) LIMIT 2
/*-------------------------------------------------------------------------+
| stats_summary |
+-------------------------------------------------------------------------+
| { |
| "connected_cell" : [ |
| {timestamp: 2025-03-28T14:15:00.57-04:00, value:"2"}, |
| {timestamp: 2025-03-28T14:12:56.762-04:00, value:"1"} |
| ], |
| "connected_wifi" : [ |
| {timestamp:2025-03-28T14:15:00.57-04:00, value:"5"}, |
| {timestamp:2025-03-28T14:12:56.762-04:00, value:"1"} |
| ], |
| "os_build" : [ |
| {timestamp:2025-03-28T14:15:00.57-04:00, value:"PQ2A.190405.003"}, |
| {timestamp:2025-03-28T14:12:56.762-04:00, value:"PQ2A.190405.003"} |
| ] |
| } |
+-------------------------------------------------------------------------+
| { |
| "connected_cell" : [ |
| {timestamp:2025-03-28T14:15:17.371-04:00, value:"2"}, |
| {timestamp:2025-03-28T14:13:42.018-04:00, value:"1"} |
| ], |
| "connected_wifi" : [ |
| {timestamp:2025-03-28T14:15:17.371-04:00, value:"5"}, |
| {timestamp:2025-03-28T14:13:42.018-04:00, value:"1"} |
| ], |
| "os_build" : [ |
| {timestamp:2025-03-28T14:15:17.371-04:00, value:"PQ2A.190405.004"}, |
| {timestamp:2025-03-28T14:13:42.018-04:00, value:"PQ2A.190405.004"} |
| ] |
| } |
+-------------------------------------------------------------------------*/
You can filter for a column qualifier within a timestamp range:
SELECT stats_summary['connected_wifi'] as connected_wifi
FROM test_table(
with_history => true,
after => TIMESTAMP("2025-03-28 14:13:40-0400"),
before_or_equal => TIMESTAMP("2025-03-28 14:15:10-04:00"))
LIMIT 2
/*--------------------------------------------------------+
| connected_wifi |
+--------------------------------------------------------+
| [{timestamp:2025-03-28T14:15:00.57-04:00, value:"5"}] |
+--------------------------------------------------------+
| [{timestamp:2025-03-28T14:13:42.018-04:00, value:"1"}] |
+--------------------------------------------------------*/
You can also filter with as_of
, which doesn't return timestamps:
SELECT stats_summary FROM test_table(as_of => TIMESTAMP("2025-03-28 14:15:10-04:00")) LIMIT 2;
/*----------------------------------+
| stats_summary |
+----------------------------------+
| { |
| "connected_cell" : "2", |
| "connected_wifi" : "5", |
| "os_build" : "PQ2A.190405.003" |
| } |
+----------------------------------+
| { |
| "connected_cell" : "1", |
| "connected_wifi" : "1", |
| "os_build" : "PQ2A.190405.004" |
| } |
+----------------------------------*/
UNPACK
UNPACK(TABLE table(with_history => true))
Description
This function takes a input table and expands it so that each input timestamped
value is expanded into its own row representing that timestamp. The timestamps
are moved to a separate _timestamp
column.
The UNPACK
function has the following behavior and restrictions:
- Input column families
MAP<key, ARRAY<STRUCT<timestamp, value>>>
are transformed toMAP<key, value>
. - Input column qualifiers
ARRAY<STRUCT<timestamp, value>>>
are transformed tovalue
. - Other column types are unchanged.
- The input table must use the Temporal filter
with_history => true
in order to expand the timestamps. - The columns must be selected in the subquery in order to be unpacked and selected.
- The new
_timestamp
column doesn't need to be selected in order for the timestamps to be unpacked.
Return type
TABLE
Example
All the examples are based on this table:
SELECT
_key,
report
FROM
table(with_history => true)
/*-------------------------------------------------------------------+
| _key | report |
+-------------------------------------------------------------------+
| Dept1 | "expenses": [ |
| | { timestamp: 2025-04-01T21:00:00Z, value: "3" }, |
| | { timestamp: 2025-04-01T18:00:00Z, value: "2" } |
| | ], |
| | "sales": [ |
| | { timestamp: 2025-04-01T22:00:00Z, value: "12" }, |
| | { timestamp: 2025-04-01T20:00:00Z, value: "10" } |
| | ] |
+------------------------------------------------------------------+
| Dept2 | "expenses": [ |
| | { timestamp: 2025-04-01T19:00:00Z, value: "2" }, |
| | { timestamp: 2025-04-01T18:00:00Z, value: "1" } |
| | ], |
| | "sales": [ |
| | { timestamp: 2025-04-01T23:00:00Z, value: "9" }, |
| | { timestamp: 2025-04-01T19:00:00Z, value: "5" } |
| | ] |
+-------------------------------------------------------------------*/
You can unpack the timestamps from the report
column. Each row is expanded
into multiple rows, one per timestamp:
SELECT *
FROM UNPACK((
SELECT
_key,
report
FROM
table(with_history => true)))
/*----------------------------------------------------------------+
| _key | report | _timestamp |
+----------------------------------------------------------------+
| Dept1 | { sales: "12" } | 2025-04-01T22:00:00Z |
| Dept1 | { expenses: "3" } | 2025-04-01T21:00:00Z |
| Dept1 | { expenses: "2" } | 2025-04-01T18:00:00Z |
| Dept1 | { sales: "10" } | 2025-04-01T20:00:00Z |
| Dept2 | { expenses: "1" } | 2025-04-01T18:00:00Z |
| Dept2 | { expenses: "2", sales: "5" } | 2025-04-01T19:00:00Z |
| Dept2 | { sales: "9" } | 2025-04-01T23:00:00Z |
+----------------------------------------------------------------*/
You can unpack the column qualifiers individually. Only the timestamps from the
sales
column qualifier are unpacked:
SELECT
_key,
_timestamp,
sales
FROM
UNPACK((
SELECT
_key,
report['sales'] as sales
FROM
table(with_history => true)
))
/*--------------------------------------+
| _key | _timestamp | sales |
+--------------------------------------+
| Dept1 | 2025-04-01T20:00:00Z | "10" |
| Dept1 | 2025-04-01T22:00:00Z | "12" |
| Dept2 | 2025-04-01T19:00:00Z | "5" |
| Dept2 | 2025-04-01T23:00:00Z | "9" |
+--------------------------------------*/
You can unpack the entire column family and only select certain column
qualifiers. This results in NULL
values where the sales
column doesn't have
a value for the timestamp:
SELECT
_key,
_timestamp,
report['sales'] as sales,
FROM
UNPACK((
SELECT
_key,
report
FROM
table(with_history => true)))
/*--------------------------------------+
| _key | _timestamp | sales |
+--------------------------------------+
| Dept1 | 2025-04-01T22:00:00Z | "12" |
| Dept1 | 2025-04-01T21:00:00Z | NULL |
| Dept1 | 2025-04-01T18:00:00Z | NULL |
| Dept1 | 2025-04-01T20:00:00Z | "10" |
| Dept2 | 2025-04-01T18:00:00Z | NULL |
| Dept2 | 2025-04-01T19:00:00Z | "5" |
| Dept2 | 2025-04-01T23:00:00Z | "9" |
+--------------------------------------*/
You don't have to select the _timestamp
column. The timestamps are still
unpacked without selecting the column:
SELECT
_key,
expenses
FROM
UNPACK((
SELECT
_key,
report['sales'] as sales,
report['expenses'] as expenses
FROM
table(with_history => true)))
/*------------------+
| _key | expenses |
+------------------+
| Dept1 | NULL |
| Dept1 | NULL |
| Dept1 | 3 |
| Dept1 | 2 |
| Dept2 | NULL |
| Dept2 | 2 |
| Dept2 | 1 |
+------------------*/
You can unpack an input table that doesn't have any timestamps, but it doesn't transform any of the columns or unpack any timestamps because there are no timestamps in the input table to unpack:
SELECT
_key,
_timestamp
FROM
UNPACK((
SELECT
_key
FROM
table(with_history => true)))
/*--------------------+
| _key | _timestamp |
+--------------------+
| Dept1 | NULL |
| Dept2 | NULL |
+--------------------*/
ORDER BY
within the UNPACK
subquery is ignored. In order to correctly order
the result, the ORDER BY
must be on the outer query:
SELECT
_key,
_timestamp,
expenses,
constant
FROM
UNPACK((
SELECT
_key,
report['expenses'] as expenses,
"5" as constant
FROM
table(with_history => true)
ORDER BY _key)) -- Does NOT affect the ordering
ORDER BY expenses -- Does affect the ordering
/*----------------------------------------------------+
| _key | _timestamp | expenses | constant |
+----------------------------------------------------+
| Dept2 | 2025-04-01T18:00:00Z | "1" | 5 |
| Dept1 | 2025-04-01T18:00:00Z | "2" | 5 |
| Dept2 | 2025-04-01T19:00:00Z | "2" | 5 |
| Dept1 | 2025-04-01T21:00:00Z | "3" | 5 |
+----------------------------------------------------*/