Functions, operators, and conditionals

This topic is a compilation of functions, operators, and conditional expressions.

To learn more about how to call functions, function call rules, the SAFE prefix, and special types of arguments, see Function calls.


OPERATORS AND CONDITIONALS

Operators

GoogleSQL for BigQuery supports operators. Operators are represented by special characters or keywords; they do not use function call syntax. An operator manipulates any number of data inputs, also called operands, and returns a result.

Common conventions:

  • Unless otherwise specified, all operators return NULL when one of the operands is NULL.
  • All operators will throw an error if the computation result overflows.
  • For all floating point operations, +/-inf and NaN may only be returned if one of the operands is +/-inf or NaN. In other cases, an error is returned.

Operator precedence

The following table lists all GoogleSQL operators from highest to lowest precedence, i.e., the order in which they will be evaluated within a statement.

Order of Precedence Operator Input Data Types Name Operator Arity
1 Field access operator STRUCT
JSON
Field access operator Binary
  Array subscript operator ARRAY Array position. Must be used with OFFSET or ORDINAL—see Array Functions . Binary
  JSON subscript operator JSON Field name or array position in JSON. Binary
2 + All numeric types Unary plus Unary
  - All numeric types Unary minus Unary
  ~ Integer or BYTES Bitwise not Unary
3 * All numeric types Multiplication Binary
  / All numeric types Division Binary
  || STRING, BYTES, or ARRAY<T> Concatenation operator Binary
4 + All numeric types, DATE with INT64 , INTERVAL Addition Binary
  - All numeric types, DATE with INT64 , INTERVAL Subtraction Binary
5 << Integer or BYTES Bitwise left-shift Binary
  >> Integer or BYTES Bitwise right-shift Binary
6 & Integer or BYTES Bitwise and Binary
7 ^ Integer or BYTES Bitwise xor Binary
8 | Integer or BYTES Bitwise or Binary
9 (Comparison Operators) = Any comparable type. See Data Types for a complete list. Equal Binary
  < Any comparable type. See Data Types for a complete list. Less than Binary
  > Any comparable type. See Data Types for a complete list. Greater than Binary
  <= Any comparable type. See Data Types for a complete list. Less than or equal to Binary
  >= Any comparable type. See Data Types for a complete list. Greater than or equal to Binary
  !=, <> Any comparable type. See Data Types for a complete list. Not equal Binary
  [NOT] LIKE STRING and BYTES Value does [not] match the pattern specified Binary
  Quantified LIKE STRING and BYTES Checks a search value for matches against several patterns. Binary
  [NOT] BETWEEN Any comparable types. See Data Types for a complete list. Value is [not] within the range specified Binary
  [NOT] IN Any comparable types. See Data Types for a complete list. Value is [not] in the set of values specified Binary
  IS [NOT] NULL All Value is [not] NULL Unary
  IS [NOT] TRUE BOOL Value is [not] TRUE. Unary
  IS [NOT] FALSE BOOL Value is [not] FALSE. Unary
10 NOT BOOL Logical NOT Unary
11 AND BOOL Logical AND Binary
12 OR BOOL Logical OR Binary

Operators with the same precedence are left associative. This means that those operators are grouped together starting from the left and moving right. For example, the expression:

x AND y AND z

is interpreted as

( ( x AND y ) AND z )

The expression:

x * y / z

is interpreted as:

( ( x * y ) / z )

All comparison operators have the same priority, but comparison operators are not associative. Therefore, parentheses are required in order to resolve ambiguity. For example:

(x < y) IS FALSE

Operator list

Name Summary
Field access operator Gets the value of a field.
Array subscript operator Gets a value from an array at a specific position.
Struct subscript operator Gets the value of a field at a selected position in a struct.
JSON subscript operator Gets a value of an array element or field in a JSON expression.
Arithmetic operators Performs arithmetic operations.
Date arithmetics operators Performs arithmetic operations on dates.
Datetime subtraction Computes the difference between two datetimes as an interval.
Interval arithmetic operators Adds an interval to a datetime or subtracts an interval from a datetime.
Bitwise operators Performs bit manipulation.
Logical operators Tests for the truth of some condition and produces TRUE, FALSE, or NULL.
Comparison operators Compares operands and produces the results of the comparison as a BOOL value.
EXISTS operator Checks if a subquery produces one or more rows.
IN operator Checks for an equal value in a set of values.
IS operators Checks for the truth of a condition and produces either TRUE or FALSE.
IS DISTINCT FROM operator Checks if values are considered to be distinct from each other.
LIKE operator Checks if values are like or not like one another.
Quantified LIKE operator Checks a search value for matches against several patterns.
Concatenation operator Combines multiple values into one.

Field access operator

expression.fieldname[. ...]

Description

Gets the value of a field. Alternatively known as the dot operator. Can be used to access nested fields. For example, expression.fieldname1.fieldname2.

Input values:

  • STRUCT
  • JSON

Return type

  • For STRUCT: SQL data type of fieldname. If a field is not found in the struct, an error is thrown.
  • For JSON: JSON. If a field is not found in a JSON value, a SQL NULL is returned.

Example

In the following example, the expression is t.customer and the field access operations are .address and .country. An operation is an application of an operator (.) to specific operands (in this case, address and country, or more specifically, t.customer and address, for the first operation, and t.customer.address and country for the second operation).

WITH orders AS (
  SELECT STRUCT(STRUCT('Yonge Street' AS street, 'Canada' AS country) AS address) AS customer
)
SELECT t.customer.address.country FROM orders AS t;

/*---------*
 | country |
 +---------+
 | Canada  |
 *---------*/

Array subscript operator

array_expression[array_subscript_specifier]

array_subscript_specifier:
  { index | position_keyword(index) }

position_keyword:
  { OFFSET | SAFE_OFFSET | ORDINAL | SAFE_ORDINAL }

Description

Gets a value from an array at a specific position.

Input values:

  • array_expression: The input array.
  • position_keyword(index): Determines where the index for the array should start and how out-of-range indexes are handled. The index is an integer that represents a specific position in the array.
    • OFFSET(index): The index starts at zero. Produces an error if the index is out of range. To produce NULL instead of an error, use SAFE_OFFSET(index). This position keyword produces the same result as index by itself.
    • SAFE_OFFSET(index): The index starts at zero. Returns NULL if the index is out of range.
    • ORDINAL(index): The index starts at one. Produces an error if the index is out of range. To produce NULL instead of an error, use SAFE_ORDINAL(index).
    • SAFE_ORDINAL(index): The index starts at one. Returns NULL if the index is out of range.
  • index: An integer that represents a specific position in the array. If used by itself without a position keyword, the index starts at zero and produces an error if the index is out of range. To produce NULL instead of an error, use the SAFE_OFFSET(index) or SAFE_ORDINAL(index) position keyword.

Return type

T where array_expression is ARRAY<T>.

Examples

In following query, the array subscript operator is used to return values at specific position in item_array. This query also shows what happens when you reference an index (6) in an array that is out of range. If the SAFE prefix is included, NULL is returned, otherwise an error is produced.

WITH Items AS (SELECT ["coffee", "tea", "milk"] AS item_array)
SELECT
  item_array,
  item_array[0] AS item_index,
  item_array[OFFSET(0)] AS item_offset,
  item_array[ORDINAL(1)] AS item_ordinal,
  item_array[SAFE_OFFSET(6)] AS item_safe_offset
FROM Items

/*---------------------+------------+-------------+--------------+------------------*
 | item_array          | item_index | item_offset | item_ordinal | item_safe_offset |
 +---------------------+------------+-------------+--------------+------------------+
 | [coffee, tea, milk] | coffee     | coffee      | coffee       | NULL             |
 *----------------------------------+-------------+--------------+------------------*/

When you reference an index that is out of range in an array, and a positional keyword that begins with SAFE is not included, an error is produced. For example:

WITH Items AS (SELECT ["coffee", "tea", "milk"] AS item_array)
SELECT
  item_array[6] AS item_offset
FROM Items

-- Error. Array index 6 is out of bounds.
WITH Items AS (SELECT ["coffee", "tea", "milk"] AS item_array)
SELECT
  item_array[OFFSET(6)] AS item_offset
FROM Items

-- Error. Array index 6 is out of bounds.

Struct subscript operator

struct_expression[struct_subscript_specifier]

struct_subscript_specifier:
  { index | position_keyword(index) }

position_keyword:
  { OFFSET | ORDINAL }

Description

Gets the value of a field at a selected position in a struct.

Input types

  • struct_expression: The input struct.
  • position_keyword(index): Determines where the index for the struct should start and how out-of-range indexes are handled. The index is an integer literal or constant that represents a specific position in the struct.
    • OFFSET(index): The index starts at zero. Produces an error if the index is out of range. Produces the same result as index by itself.
    • ORDINAL(index): The index starts at one. Produces an error if the index is out of range.
  • index: An integer literal or constant that represents a specific position in the struct. If used by itself without a position keyword, the index starts at zero and produces an error if the index is out of range.

Examples

In following query, the struct subscript operator is used to return values at specific locations in item_struct using position keywords. This query also shows what happens when you reference an index (6) in an struct that is out of range.

WITH Items AS (SELECT STRUCT<INT64, STRING, BOOL>(23, "tea", FALSE) AS item_struct)
SELECT
  item_struct[0] AS field_index,
  item_struct[OFFSET(0)] AS field_offset,
  item_struct[ORDINAL(1)] AS field_ordinal
FROM Items

/*-------------+--------------+---------------*
 | field_index | field_offset | field_ordinal |
 +-------------+--------------+---------------+
 | 23          | 23           | 23            |
 *-------------+--------------+---------------*/

When you reference an index that is out of range in a struct, an error is produced. For example:

WITH Items AS (SELECT STRUCT<INT64, STRING, BOOL>(23, "tea", FALSE) AS item_struct)
SELECT
  item_struct[6] AS field_offset
FROM Items

-- Error. Field ordinal 6 is out of bounds in STRUCT
WITH Items AS (SELECT STRUCT<INT64, STRING, BOOL>(23, "tea", FALSE) AS item_struct)
SELECT
  item_struct[OFFSET(6)] AS field_offset
FROM Items

-- Error. Field ordinal 6 is out of bounds in STRUCT

JSON subscript operator

json_expression[array_element_id]
json_expression[field_name]

Description

Gets a value of an array element or field in a JSON expression. Can be used to access nested data.

Input values:

  • JSON expression: The JSON expression that contains an array element or field to return.
  • [array_element_id]: An INT64 expression that represents a zero-based index in the array. If a negative value is entered, or the value is greater than or equal to the size of the array, or the JSON expression doesn't represent a JSON array, a SQL NULL is returned.
  • [field_name]: A STRING expression that represents the name of a field in JSON. If the field name is not found, or the JSON expression is not a JSON object, a SQL NULL is returned.

Return type

JSON

Example

In the following example:

  • json_value is a JSON expression.
  • .class is a JSON field access.
  • .students is a JSON field access.
  • [0] is a JSON subscript expression with an element offset that accesses the zeroth element of an array in the JSON value.
  • ['name'] is a JSON subscript expression with a field name that accesses a field.
SELECT json_value.class.students[0]['name'] AS first_student
FROM
  UNNEST(
    [
      JSON '{"class" : {"students" : [{"name" : "Jane"}]}}',
      JSON '{"class" : {"students" : []}}',
      JSON '{"class" : {"students" : [{"name" : "John"}, {"name": "Jamie"}]}}'])
    AS json_value;

/*-----------------*
 | first_student   |
 +-----------------+
 | "Jane"          |
 | NULL            |
 | "John"          |
 *-----------------*/

Arithmetic operators

All arithmetic operators accept input of numeric type T, and the result type has type T unless otherwise indicated in the description below:

Name Syntax
Addition X + Y
Subtraction X - Y
Multiplication X * Y
Division X / Y
Unary Plus + X
Unary Minus - X

NOTE: Divide by zero operations return an error. To return a different result, consider the IEEE_DIVIDE or SAFE_DIVIDE functions.

Result types for Addition, Subtraction and Multiplication:

INPUTINT64NUMERICBIGNUMERICFLOAT64
INT64INT64NUMERICBIGNUMERICFLOAT64
NUMERICNUMERICNUMERICBIGNUMERICFLOAT64
BIGNUMERICBIGNUMERICBIGNUMERICBIGNUMERICFLOAT64
FLOAT64FLOAT64FLOAT64FLOAT64FLOAT64

Result types for Division:

INPUTINT64NUMERICBIGNUMERICFLOAT64
INT64FLOAT64NUMERICBIGNUMERICFLOAT64
NUMERICNUMERICNUMERICBIGNUMERICFLOAT64
BIGNUMERICBIGNUMERICBIGNUMERICBIGNUMERICFLOAT64
FLOAT64FLOAT64FLOAT64FLOAT64FLOAT64

Result types for Unary Plus:

INPUTINT64NUMERICBIGNUMERICFLOAT64
OUTPUTINT64NUMERICBIGNUMERICFLOAT64

Result types for Unary Minus:

INPUTINT64NUMERICBIGNUMERICFLOAT64
OUTPUTINT64NUMERICBIGNUMERICFLOAT64

Date arithmetics operators

Operators '+' and '-' can be used for arithmetic operations on dates.

date_expression + int64_expression
int64_expression + date_expression
date_expression - int64_expression

Description

Adds or subtracts int64_expression days to or from date_expression. This is equivalent to DATE_ADD or DATE_SUB functions, when interval is expressed in days.

Return Data Type

DATE

Example

SELECT DATE "2020-09-22" + 1 AS day_later, DATE "2020-09-22" - 7 AS week_ago

/*------------+------------*
 | day_later  | week_ago   |
 +------------+------------+
 | 2020-09-23 | 2020-09-15 |
 *------------+------------*/

Datetime subtraction

date_expression - date_expression
timestamp_expression - timestamp_expression
datetime_expression - datetime_expression

Description

Computes the difference between two datetime values as an interval.

Return Data Type

INTERVAL

Example

SELECT
  DATE "2021-05-20" - DATE "2020-04-19" AS date_diff,
  TIMESTAMP "2021-06-01 12:34:56.789" - TIMESTAMP "2021-05-31 00:00:00" AS time_diff

/*-------------------+------------------------*
 | date_diff         | time_diff              |
 +-------------------+------------------------+
 | 0-0 396 0:0:0     | 0-0 0 36:34:56.789     |
 *-------------------+------------------------*/

Interval arithmetic operators

Addition and subtraction

date_expression + interval_expression = DATETIME
date_expression - interval_expression = DATETIME
timestamp_expression + interval_expression = TIMESTAMP
timestamp_expression - interval_expression = TIMESTAMP
datetime_expression + interval_expression = DATETIME
datetime_expression - interval_expression = DATETIME

Description

Adds an interval to a datetime value or subtracts an interval from a datetime value.

Example

SELECT
  DATE "2021-04-20" + INTERVAL 25 HOUR AS date_plus,
  TIMESTAMP "2021-05-02 00:01:02.345" - INTERVAL 10 SECOND AS time_minus;

/*-------------------------+--------------------------------*
 | date_plus               | time_minus                     |
 +-------------------------+--------------------------------+
 | 2021-04-21 01:00:00     | 2021-05-02 00:00:52.345+00     |
 *-------------------------+--------------------------------*/

Multiplication and division

interval_expression * integer_expression = INTERVAL
interval_expression / integer_expression = INTERVAL

Description

Multiplies or divides an interval value by an integer.

Example

SELECT
  INTERVAL '1:2:3' HOUR TO SECOND * 10 AS mul1,
  INTERVAL 35 SECOND * 4 AS mul2,
  INTERVAL 10 YEAR / 3 AS div1,
  INTERVAL 1 MONTH / 12 AS div2

/*----------------+--------------+-------------+--------------*
 | mul1           | mul2         | div1        | div2         |
 +----------------+--------------+-------------+--------------+
 | 0-0 0 10:20:30 | 0-0 0 0:2:20 | 3-4 0 0:0:0 | 0-0 2 12:0:0 |
 *----------------+--------------+-------------+--------------*/

Bitwise operators

All bitwise operators return the same type and the same length as the first operand.

Name Syntax Input Data Type Description
Bitwise not ~ X Integer or BYTES Performs logical negation on each bit, forming the ones' complement of the given binary value.
Bitwise or X | Y X: Integer or BYTES
Y: Same type as X
Takes two bit patterns of equal length and performs the logical inclusive OR operation on each pair of the corresponding bits. This operator throws an error if X and Y are bytes of different lengths.
Bitwise xor X ^ Y X: Integer or BYTES
Y: Same type as X
Takes two bit patterns of equal length and performs the logical exclusive OR operation on each pair of the corresponding bits. This operator throws an error if X and Y are bytes of different lengths.
Bitwise and X & Y X: Integer or BYTES
Y: Same type as X
Takes two bit patterns of equal length and performs the logical AND operation on each pair of the corresponding bits. This operator throws an error if X and Y are bytes of different lengths.
Left shift X << Y X: Integer or BYTES
Y: INT64
Shifts the first operand X to the left. This operator returns 0 or a byte sequence of b'\x00' if the second operand Y is greater than or equal to the bit length of the first operand X (for example, 64 if X has the type INT64). This operator throws an error if Y is negative.
Right shift X >> Y X: Integer or BYTES
Y: INT64
Shifts the first operand X to the right. This operator does not do sign bit extension with a signed type (i.e., it fills vacant bits on the left with 0). This operator returns 0 or a byte sequence of b'\x00' if the second operand Y is greater than or equal to the bit length of the first operand X (for example, 64 if X has the type INT64). This operator throws an error if Y is negative.

Logical operators

GoogleSQL supports the AND, OR, and NOT logical operators. Logical operators allow only BOOL or NULL input and use three-valued logic to produce a result. The result can be TRUE, FALSE, or NULL:

x y x AND y x OR y
TRUE TRUE TRUE TRUE
TRUE FALSE FALSE TRUE
TRUE NULL NULL TRUE
FALSE TRUE FALSE TRUE
FALSE FALSE FALSE FALSE
FALSE NULL FALSE NULL
NULL TRUE NULL TRUE
NULL FALSE FALSE NULL
NULL NULL NULL NULL
x NOT x
TRUE FALSE
FALSE TRUE
NULL NULL

Examples

The examples in this section reference a table called entry_table:

/*-------*
 | entry |
 +-------+
 | a     |
 | b     |
 | c     |
 | NULL  |
 *-------*/
SELECT 'a' FROM entry_table WHERE entry = 'a'

-- a => 'a' = 'a' => TRUE
-- b => 'b' = 'a' => FALSE
-- NULL => NULL = 'a' => NULL

/*-------*
 | entry |
 +-------+
 | a     |
 *-------*/
SELECT entry FROM entry_table WHERE NOT (entry = 'a')

-- a => NOT('a' = 'a') => NOT(TRUE) => FALSE
-- b => NOT('b' = 'a') => NOT(FALSE) => TRUE
-- NULL => NOT(NULL = 'a') => NOT(NULL) => NULL

/*-------*
 | entry |
 +-------+
 | b     |
 | c     |
 *-------*/
SELECT entry FROM entry_table WHERE entry IS NULL

-- a => 'a' IS NULL => FALSE
-- b => 'b' IS NULL => FALSE
-- NULL => NULL IS NULL => TRUE

/*-------*
 | entry |
 +-------+
 | NULL  |
 *-------*/

Comparison operators

Compares operands and produces the results of the comparison as a BOOL value. These comparison operators are available:

Name Syntax Description
Less Than X < Y Returns TRUE if X is less than Y. This operator supports specifying collation.
Less Than or Equal To X <= Y Returns TRUE if X is less than or equal to Y. This operator supports specifying collation.
Greater Than X > Y Returns TRUE if X is greater than Y. This operator supports specifying collation.
Greater Than or Equal To X >= Y Returns TRUE if X is greater than or equal to Y. This operator supports specifying collation.
Equal X = Y Returns TRUE if X is equal to Y. This operator supports specifying collation.
Not Equal X != Y
X <> Y
Returns TRUE if X is not equal to Y. This operator supports specifying collation.
BETWEEN X [NOT] BETWEEN Y AND Z

Returns TRUE if X is [not] within the range specified. The result of X BETWEEN Y AND Z is equivalent to Y <= X AND X <= Z but X is evaluated only once in the former. This operator supports specifying collation.

LIKE X [NOT] LIKE Y See the `LIKE` operator for details.
IN Multiple See the `IN` operator for details.

The following rules apply to operands in a comparison operator:

  • The operands must be comparable.
  • A comparison operator generally requires both operands to be of the same type.
  • If the operands are of different types, and the values of those types can be converted to a common type without loss of precision, they are generally coerced to that common type for the comparison.
  • A literal operand is generally coerced to the same data type of a non-literal operand that is part of the comparison.
  • Struct operands support only these comparison operators: equal (=), not equal (!= and <>), and IN.

The following rules apply when comparing these data types:

  • FLOAT64: All comparisons with NaN return FALSE, except for != and <>, which return TRUE.
  • BOOL: FALSE is less than TRUE.
  • STRING: Strings are compared codepoint-by-codepoint, which means that canonically equivalent strings are only guaranteed to compare as equal if they have been normalized first.
  • JSON: You can't compare JSON, but you can compare the values inside of JSON if you convert the values to SQL values first. For more information, see JSON functions.
  • NULL: Any operation with a NULL input returns NULL.
  • STRUCT: When testing a struct for equality, it's possible that one or more fields are NULL. In such cases:

    • If all non-NULL field values are equal, the comparison returns NULL.
    • If any non-NULL field values are not equal, the comparison returns FALSE.

    The following table demonstrates how STRUCT data types are compared when they have fields that are NULL valued.

    Struct1 Struct2 Struct1 = Struct2
    STRUCT(1, NULL) STRUCT(1, NULL) NULL
    STRUCT(1, NULL) STRUCT(2, NULL) FALSE
    STRUCT(1,2) STRUCT(1, NULL) NULL

EXISTS operator

EXISTS ( subquery )

Description

Returns TRUE if the subquery produces one or more rows. Returns FALSE if the subquery produces zero rows. Never returns NULL. To learn more about how you can use a subquery with EXISTS, see EXISTS subqueries.

Examples

In this example, the EXISTS operator returns FALSE because there are no rows in Words where the direction is south:

WITH Words AS (
  SELECT 'Intend' as value, 'east' as direction UNION ALL
  SELECT 'Secure', 'north' UNION ALL
  SELECT 'Clarity', 'west'
 )
SELECT EXISTS ( SELECT value FROM Words WHERE direction = 'south' ) as result;

/*--------*
 | result |
 +--------+
 | FALSE  |
 *--------*/

IN operator

The IN operator supports the following syntax:

search_value [NOT] IN value_set

value_set:
  {
    (expression[, ...])
    | (subquery)
    | UNNEST(array_expression)
  }

Description

Checks for an equal value in a set of values. Semantic rules apply, but in general, IN returns TRUE if an equal value is found, FALSE if an equal value is excluded, otherwise NULL. NOT IN returns FALSE if an equal value is found, TRUE if an equal value is excluded, otherwise NULL.

  • search_value: The expression that is compared to a set of values.
  • value_set: One or more values to compare to a search value.

    • (expression[, ...]): A list of expressions.
    • (subquery): A subquery that returns a single column. The values in that column are the set of values. If no rows are produced, the set of values is empty.
    • UNNEST(array_expression): An UNNEST operator that returns a column of values from an array expression. This is equivalent to:

      IN (SELECT element FROM UNNEST(array_expression) AS element)
      

This operator supports collation, but these limitations apply:

  • [NOT] IN UNNEST does not support collation.
  • If collation is used with a list of expressions, there must be at least one item in the list.

Semantic rules

When using the IN operator, the following semantics apply in this order:

  • Returns FALSE if value_set is empty.
  • Returns NULL if search_value is NULL.
  • Returns TRUE if value_set contains a value equal to search_value.
  • Returns NULL if value_set contains a NULL.
  • Returns FALSE.

When using the NOT IN operator, the following semantics apply in this order:

  • Returns TRUE if value_set is empty.
  • Returns NULL if search_value is NULL.
  • Returns FALSE if value_set contains a value equal to search_value.
  • Returns NULL if value_set contains a NULL.
  • Returns TRUE.

The semantics of:

x IN (y, z, ...)

are defined as equivalent to:

(x = y) OR (x = z) OR ...

and the subquery and array forms are defined similarly.

x NOT IN ...

is equivalent to:

NOT(x IN ...)

The UNNEST form treats an array scan like UNNEST in the FROM clause:

x [NOT] IN UNNEST(<array expression>)

This form is often used with array parameters. For example:

x IN UNNEST(@array_parameter)

See the Arrays topic for more information on how to use this syntax.

IN can be used with multi-part keys by using the struct constructor syntax. For example:

(Key1, Key2) IN ( (12,34), (56,78) )
(Key1, Key2) IN ( SELECT (table.a, table.b) FROM table )

See the Struct Type topic for more information.

Return Data Type

BOOL

Examples

You can use these WITH clauses to emulate temporary tables for Words and Items in the following examples:

WITH Words AS (
  SELECT 'Intend' as value UNION ALL
  SELECT 'Secure' UNION ALL
  SELECT 'Clarity' UNION ALL
  SELECT 'Peace' UNION ALL
  SELECT 'Intend'
 )
SELECT * FROM Words;

/*----------*
 | value    |
 +----------+
 | Intend   |
 | Secure   |
 | Clarity  |
 | Peace    |
 | Intend   |
 *----------*/
WITH
  Items AS (
    SELECT STRUCT('blue' AS color, 'round' AS shape) AS info UNION ALL
    SELECT STRUCT('blue', 'square') UNION ALL
    SELECT STRUCT('red', 'round')
  )
SELECT * FROM Items;

/*----------------------------*
 | info                       |
 +----------------------------+
 | {blue color, round shape}  |
 | {blue color, square shape} |
 | {red color, round shape}   |
 *----------------------------*/

Example with IN and an expression:

SELECT * FROM Words WHERE value IN ('Intend', 'Secure');

/*----------*
 | value    |
 +----------+
 | Intend   |
 | Secure   |
 | Intend   |
 *----------*/

Example with NOT IN and an expression:

SELECT * FROM Words WHERE value NOT IN ('Intend');

/*----------*
 | value    |
 +----------+
 | Secure   |
 | Clarity  |
 | Peace    |
 *----------*/

Example with IN, a scalar subquery, and an expression:

SELECT * FROM Words WHERE value IN ((SELECT 'Intend'), 'Clarity');

/*----------*
 | value    |
 +----------+
 | Intend   |
 | Clarity  |
 | Intend   |
 *----------*/

Example with IN and an UNNEST operation:

SELECT * FROM Words WHERE value IN UNNEST(['Secure', 'Clarity']);

/*----------*
 | value    |
 +----------+
 | Secure   |
 | Clarity  |
 *----------*/

Example with IN and a struct:

SELECT
  (SELECT AS STRUCT Items.info) as item
FROM
  Items
WHERE (info.shape, info.color) IN (('round', 'blue'));

/*------------------------------------*
 | item                               |
 +------------------------------------+
 | { {blue color, round shape} info } |
 *------------------------------------*/

IS operators

IS operators return TRUE or FALSE for the condition they are testing. They never return NULL, even for NULL inputs, unlike the IS_INF and IS_NAN functions defined in Mathematical Functions. If NOT is present, the output BOOL value is inverted.

Function Syntax Input Data Type Result Data Type Description
X IS TRUE BOOL BOOL Evaluates to TRUE if X evaluates to TRUE. Otherwise, evaluates to FALSE.
X IS NOT TRUE BOOL BOOL Evaluates to FALSE if X evaluates to TRUE. Otherwise, evaluates to TRUE.
X IS FALSE BOOL BOOL Evaluates to TRUE if X evaluates to FALSE. Otherwise, evaluates to FALSE.
X IS NOT FALSE BOOL BOOL Evaluates to FALSE if X evaluates to FALSE. Otherwise, evaluates to TRUE.
X IS NULL Any value type BOOL Evaluates to TRUE if X evaluates to NULL. Otherwise evaluates to FALSE.
X IS NOT NULL Any value type BOOL Evaluates to FALSE if X evaluates to NULL. Otherwise evaluates to TRUE.
X IS UNKNOWN BOOL BOOL Evaluates to TRUE if X evaluates to NULL. Otherwise evaluates to FALSE.
X IS NOT UNKNOWN BOOL BOOL Evaluates to FALSE if X evaluates to NULL. Otherwise, evaluates to TRUE.

IS DISTINCT FROM operator

expression_1 IS [NOT] DISTINCT FROM expression_2

Description

IS DISTINCT FROM returns TRUE if the input values are considered to be distinct from each other by the DISTINCT and GROUP BY clauses. Otherwise, returns FALSE.

a IS DISTINCT FROM b being TRUE is equivalent to:

  • SELECT COUNT(DISTINCT x) FROM UNNEST([a,b]) x returning 2.
  • SELECT * FROM UNNEST([a,b]) x GROUP BY x returning 2 rows.

a IS DISTINCT FROM b is equivalent to NOT (a = b), except for the following cases:

  • This operator never returns NULL so NULL values are considered to be distinct from non-NULL values, not other NULL values.
  • NaN values are considered to be distinct from non-NaN values, but not other NaN values.

You can use this operation with fields in a complex data type, but not on the complex data types themselves. These complex data types cannot be compared directly:

  • STRUCT
  • ARRAY

Input values:

  • expression_1: The first value to compare. This can be a groupable data type, NULL or NaN.
  • expression_2: The second value to compare. This can be a groupable data type, NULL or NaN.
  • NOT: If present, the output BOOL value is inverted.

Return type

BOOL

Examples

These return TRUE:

SELECT 1 IS DISTINCT FROM 2
SELECT 1 IS DISTINCT FROM NULL
SELECT 1 IS NOT DISTINCT FROM 1
SELECT NULL IS NOT DISTINCT FROM NULL

These return FALSE:

SELECT NULL IS DISTINCT FROM NULL
SELECT 1 IS DISTINCT FROM 1
SELECT 1 IS NOT DISTINCT FROM 2
SELECT 1 IS NOT DISTINCT FROM NULL

LIKE operator

expression_1 [NOT] LIKE expression_2

Description

LIKE returns TRUE if the string in the first operand expression_1 matches a pattern specified by the second operand expression_2, otherwise returns FALSE.

NOT LIKE returns TRUE if the string in the first operand expression_1 does not match a pattern specified by the second operand expression_2, otherwise returns FALSE.

Expressions can contain these characters:

  • A percent sign (%) matches any number of characters or bytes.
  • An underscore (_) matches a single character or byte.
  • You can escape \, _, or % using two backslashes. For example, \\%. If you are using raw strings, only a single backslash is required. For example, r'\%'.

This operator supports collation, but caveats apply:

  • Each % character in expression_2 represents an arbitrary string specifier. An arbitrary string specifier can represent any sequence of 0 or more characters.
  • A character in the expression represents itself and is considered a single character specifier unless:

    • The character is a percent sign (%).

    • The character is an underscore (_) and the collator is not und:ci.

  • These additional rules apply to the underscore (_) character:

    • If the collator is not und:ci, an error is produced when an underscore is not escaped in expression_2.

    • If the collator is not und:ci, the underscore is not allowed when the operands have collation specified.

    • Some compatibility composites, such as the fi-ligature () and the telephone sign (), will produce a match if they are compared to an underscore.

    • A single underscore matches the idea of what a character is, based on an approximation known as a grapheme cluster.

  • For a contiguous sequence of single character specifiers, equality depends on the collator and its language tags and tailoring.

    • By default, the und:ci collator does not fully normalize a string. Some canonically equivalent strings are considered unequal for both the = and LIKE operators.

    • The LIKE operator with collation has the same behavior as the = operator when there are no wildcards in the strings.

    • Character sequences with secondary or higher-weighted differences are considered unequal. This includes accent differences and some special cases.

      For example there are three ways to produce German sharp ß:

      • \u1E9E
      • \U00DF
      • ss

      \u1E9E and \U00DF are considered equal but differ in tertiary. They are considered equal with und:ci collation but different from ss, which has secondary differences.

    • Character sequences with tertiary or lower-weighted differences are considered equal. This includes case differences and kana subtype differences, which are considered equal.

  • There are ignorable characters defined in Unicode. Ignorable characters are ignored in the pattern matching.

Return type

BOOL

Examples

The following examples illustrate how you can check to see if the string in the first operand matches a pattern specified by the second operand.

-- Returns TRUE
SELECT 'apple' LIKE 'a%';
-- Returns FALSE
SELECT '%a' LIKE 'apple';
-- Returns FALSE
SELECT 'apple' NOT LIKE 'a%';
-- Returns TRUE
SELECT '%a' NOT LIKE 'apple';
-- Produces an error
SELECT NULL LIKE 'a%';
-- Produces an error
SELECT 'apple' LIKE NULL;

The following example illustrates how to search multiple patterns in an array to find a match with the LIKE operator:

WITH Words AS
 (SELECT 'Intend with clarity.' as value UNION ALL
  SELECT 'Secure with intention.' UNION ALL
  SELECT 'Clarity and security.')
SELECT value
FROM Words WHERE
  EXISTS(
    SELECT value FROM UNNEST(['%ity%', '%and%']) AS pattern
    WHERE value LIKE pattern
  );

/*------------------------+
 | value                  |
 +------------------------+
 | Intend with clarity.   |
 | Clarity and security.  |
 +------------------------*/

The following examples illustrate how collation can be used with the LIKE operator.

-- Returns FALSE
'Foo' LIKE '%foo%'
-- Returns TRUE
COLLATE('Foo', 'und:ci') LIKE COLLATE('%foo%', 'und:ci');
-- Returns TRUE
COLLATE('Foo', 'und:ci') = COLLATE('foo', 'und:ci');
-- Produces an error
COLLATE('Foo', 'und:ci') LIKE COLLATE('%foo%', 'binary');
-- Produces an error
COLLATE('Foo', 'und:ci') LIKE COLLATE('%f_o%', 'und:ci');
-- Returns TRUE
COLLATE('Foo_', 'und:ci') LIKE COLLATE('%foo\\_%', 'und:ci');

There are two capital forms of ß. We can use either SS or as upper case. While the difference between ß and is case difference (tertiary difference), the difference between sharp s and ss is secondary and considered not equal using the und:ci collator. For example:

-- Returns FALSE
'MASSE' LIKE 'Maße';
-- Returns FALSE
COLLATE('MASSE', 'und:ci') LIKE '%Maße%';
-- Returns FALSE
COLLATE('MASSE', 'und:ci') = COLLATE('Maße', 'und:ci');

The kana differences in Japanese are considered as tertiary or quaternary differences, and should be considered as equal in the und:ci collator with secondary strength.

  • '\u3042' is 'あ' (hiragana)
  • '\u30A2' is 'ア' (katakana)

For example:

-- Returns FALSE
'\u3042' LIKE '%\u30A2%';
-- Returns TRUE
COLLATE('\u3042', 'und:ci') LIKE COLLATE('%\u30A2%', 'und:ci');
-- Returns TRUE
COLLATE('\u3042', 'und:ci') = COLLATE('\u30A2', 'und:ci');

When comparing two strings, the und:ci collator compares the collation units based on the specification of the collation. Even though the number of code points is different, the two strings are considered equal when the collation units are considered the same.

  • '\u0041\u030A' is 'Å' (two code points)
  • '\u0061\u030A' is 'å' (two code points)
  • '\u00C5' is 'Å' (one code point)

In the following examples, the difference between '\u0061\u030A' and '\u00C5' is tertiary.

-- Returns FALSE
'\u0061\u030A' LIKE '%\u00C5%';
-- Returns TRUE
COLLATE('\u0061\u030A', 'und:ci') LIKE '%\u00C5%';
-- Returns TRUE
COLLATE('\u0061\u030A', 'und:ci') = COLLATE('\u00C5', 'und:ci');

In the following example, '\u0083' is a NO BREAK HERE character and is ignored.

-- Returns FALSE
'\u0083' LIKE '';
-- Returns TRUE
COLLATE('\u0083', 'und:ci') LIKE '';

Quantified LIKE operator

The quantified LIKE operator supports the following syntax:

search_value [NOT] LIKE quantifier patterns

quantifier:
 { ANY | SOME | ALL }

patterns:
  {
    pattern_expression_list
    | pattern_array
  }

pattern_expression_list:
  (expression[, ...])

pattern_array:
  UNNEST(array_expression)

Description

Checks search_value for matches against several patterns. Each comparison is case-sensitive. Wildcard searches are supported. Semantic rules apply, but in general, LIKE returns TRUE if a matching pattern is found, FALSE if a matching pattern is not found, or otherwise NULL. NOT LIKE returns FALSE if a matching pattern is found, TRUE if a matching pattern is not found, or otherwise NULL.

  • search_value: The value to search for matching patterns. This value can be a STRING or BYTES type.
  • patterns: The patterns to look for in the search value. Each pattern must resolve to the same type as search_value.

    • pattern_expression_list: A list of one or more patterns that match the search_value type.

    • pattern_array: An UNNEST operation that returns a column of values with the same type as search_value from an array expression.

    The regular expressions that are supported by the LIKE operator are also supported by patterns in the quantified LIKE operator.

  • quantifier: Condition for pattern matching.

    • ANY: Checks if the set of patterns contains at least one pattern that matches the search value.

    • SOME: Synonym for ANY.

    • ALL: Checks if every pattern in the set of patterns matches the search value.

Collation caveats

Collation is supported, but with the following caveats:

  • The collation caveats that apply to the LIKE operator also apply to the quantified LIKE operator.
  • If a collation-supported input contains no collation specification or an empty collation specification and another input contains an explicitly defined collation, the explicitly defined collation is used for all of the inputs.
  • All inputs with a non-empty, explicitly defined collation specification must have the same type of collation specification, otherwise an error is thrown.

Semantics rules

When using the quantified LIKE operator with ANY or SOME, the following semantics apply in this order:

  • Returns FALSE if patterns is empty.
  • Returns NULL if search_value is NULL.
  • Returns TRUE if search_value matches at least one value in patterns.
  • Returns NULL if a pattern in patterns is NULL and other patterns in patterns don't match.
  • Returns FALSE.

When using the quantified LIKE operator with ALL, the following semantics apply in this order:

  • For pattern_array, returns FALSE if patterns is empty.
  • Returns NULL if search_value is NULL.
  • Returns TRUE if search_value matches all values in patterns.
  • Returns NULL if a pattern in patterns is NULL and other patterns in patterns don't match.
  • Returns FALSE.

When using the quantified NOT LIKE operator with ANY or SOME, the following semantics apply in this order:

  • For pattern_array, returns TRUE if patterns is empty.
  • Returns NULL if search_value is NULL.
  • Returns TRUE if search_value doesn't match at least one value in patterns.
  • Returns NULL if a pattern in patterns is NULL and other patterns in patterns don't match.
  • Returns FALSE.

When using the quantified NOT LIKE operator with ALL, the following semantics apply in this order:

  • For pattern_array, returns TRUE if patterns is empty.
  • Returns NULL if search_value is NULL.
  • Returns TRUE if search_value matches none of the values in patterns.
  • Returns NULL if a pattern in patterns is NULL and other patterns in patterns don't match.
  • Returns FALSE.

Details

Some computation limitations apply. For more information, see Quotas and limits.

Return Data Type

BOOL

Examples

The following example checks to see if the Intend% or %intention% pattern exists in a value and produces that value if either pattern is found:

WITH Words AS
 (SELECT 'Intend with clarity.' as value UNION ALL
  SELECT 'Secure with intention.' UNION ALL
  SELECT 'Clarity and security.')
SELECT * FROM Words WHERE value LIKE ANY ('Intend%', '%intention%');

/*------------------------+
 | value                  |
 +------------------------+
 | Intend with clarity.   |
 | Secure with intention. |
 +------------------------*/

The following example checks to see if the %ity% pattern exists in a value and produces that value if the pattern is found.

Example with LIKE ALL:

WITH Words AS
 (SELECT 'Intend with clarity.' as value UNION ALL
  SELECT 'Secure with intention.' UNION ALL
  SELECT 'Clarity and security.')
SELECT * FROM Words WHERE value LIKE ALL ('%ity%');

/*-----------------------+
 | value                 |
 +-----------------------+
 | Intend with clarity.  |
 | Clarity and security. |
 +-----------------------*/

The following example checks to see if the %ity% pattern exists in a value produces that value if the pattern is not found:

WITH Words AS
 (SELECT 'Intend with clarity.' as value UNION ALL
  SELECT 'Secure with intention.' UNION ALL
  SELECT 'Clarity and security.')
SELECT * FROM Words WHERE value NOT LIKE ('%ity%');

/*------------------------+
 | value                  |
 +------------------------+
 | Secure with intention. |
 +------------------------*/

You can pass in an array for patterns. For example:

WITH Words AS
 (SELECT 'Intend with clarity.' as value UNION ALL
  SELECT 'Secure with intention.' UNION ALL
  SELECT 'Clarity and security.')
SELECT * FROM Words WHERE value LIKE ANY UNNEST(['%ion%', '%and%']);

/*------------------------+
 | value                  |
 +------------------------+
 | Secure with intention. |
 | Clarity and security.  |
 +------------------------*/

The following queries illustrate some of the semantic rules for the quantified LIKE operator:

SELECT
  NULL LIKE ANY ('a', 'b'), -- NULL
  'a' LIKE ANY ('a', 'c'), -- TRUE
  'a' LIKE ANY ('b', 'c'), -- FALSE
  'a' LIKE ANY ('a', NULL), -- TRUE
  'a' LIKE ANY ('b', NULL), -- NULL
  NULL NOT LIKE ANY ('a', 'b'), -- NULL
  'a' NOT LIKE ANY ('a', 'b'), -- TRUE
  'a' NOT LIKE ANY ('a', '%a%'), -- FALSE
  'a' NOT LIKE ANY ('a', NULL), -- NULL
  'a' NOT LIKE ANY ('b', NULL); -- TRUE
SELECT
  NULL LIKE SOME ('a', 'b'), -- NULL
  'a' LIKE SOME ('a', 'c'), -- TRUE
  'a' LIKE SOME ('b', 'c'), -- FALSE
  'a' LIKE SOME ('a', NULL), -- TRUE
  'a' LIKE SOME ('b', NULL), -- NULL
  NULL NOT LIKE SOME ('a', 'b'), -- NULL
  'a' NOT LIKE SOME ('a', 'b'), -- TRUE
  'a' NOT LIKE SOME ('a', '%a%'), -- FALSE
  'a' NOT LIKE SOME ('a', NULL), -- NULL
  'a' NOT LIKE SOME ('b', NULL); -- TRUE
SELECT
  NULL LIKE ALL ('a', 'b'), -- NULL
  'a' LIKE ALL ('a', '%a%'), -- TRUE
  'a' LIKE ALL ('a', 'c'), -- FALSE
  'a' LIKE ALL ('a', NULL), -- NULL
  'a' LIKE ALL ('b', NULL), -- FALSE
  NULL NOT LIKE ALL ('a', 'b'), -- NULL
  'a' NOT LIKE ALL ('b', 'c'), -- TRUE
  'a' NOT LIKE ALL ('a', 'c'), -- FALSE
  'a' NOT LIKE ALL ('a', NULL), -- FALSE
  'a' NOT LIKE ALL ('b', NULL); -- NULL

The following queries illustrate some of the semantic rules for the quantified LIKE operator and collation:

SELECT
  COLLATE('a', 'und:ci') LIKE ALL ('a', 'A'), -- TRUE
  'a' LIKE ALL (COLLATE('a', 'und:ci'), 'A'), -- TRUE
  'a' LIKE ALL ('%A%', COLLATE('a', 'und:ci')); -- TRUE
-- ERROR: BYTES and STRING values can't be used together.
SELECT b'a' LIKE ALL (COLLATE('a', 'und:ci'), 'A');

Concatenation operator

The concatenation operator combines multiple values into one.

Function Syntax Input Data Type Result Data Type
STRING || STRING [ || ... ] STRING STRING
BYTES || BYTES [ || ... ] BYTES BYTES
ARRAY<T> || ARRAY<T> [ || ... ] ARRAY<T> ARRAY<T>

Conditional expressions

GoogleSQL for BigQuery supports conditional expressions. Conditional expressions impose constraints on the evaluation order of their inputs. In essence, they are evaluated left to right, with short-circuiting, and only evaluate the output value that was chosen. In contrast, all inputs to regular functions are evaluated before calling the function. Short-circuiting in conditional expressions can be exploited for error handling or performance tuning.

Expression list

Name Summary
CASE expr Compares the given expression to each successive WHEN clause and produces the first result where the values are equal.
CASE Evaluates the condition of each successive WHEN clause and produces the first result where the condition evaluates to TRUE.
COALESCE Produces the value of the first non-NULL expression, if any, otherwise NULL.
IF If an expression evaluates to TRUE, produces a specified result, otherwise produces the evaluation for an else result.
IFNULL If an expression evaluates to NULL, produces a specified result, otherwise produces the expression.
NULLIF Produces NULL if the first expression that matches another evaluates to TRUE, otherwise returns the first expression.

CASE expr

CASE expr
  WHEN expr_to_match THEN result
  [ ... ]
  [ ELSE else_result ]
  END

Description

Compares expr to expr_to_match of each successive WHEN clause and returns the first result where this comparison evaluates to TRUE. The remaining WHEN clauses and else_result aren't evaluated.

If the expr = expr_to_match comparison evaluates to FALSE or NULL for all WHEN clauses, returns the evaluation of else_result if present; if else_result isn't present, then returns NULL.

Consistent with equality comparisons elsewhere, if both expr and expr_to_match are NULL, then expr = expr_to_match evaluates to NULL, which returns else_result. If a CASE statement needs to distinguish a NULL value, then the alternate CASE syntax should be used.

expr and expr_to_match can be any type. They must be implicitly coercible to a common supertype; equality comparisons are done on coerced values. There may be multiple result types. result and else_result expressions must be coercible to a common supertype.

This expression supports specifying collation.

Return Data Type

Supertype of result[, ...] and else_result.

Example

WITH Numbers AS (
  SELECT 90 as A, 2 as B UNION ALL
  SELECT 50, 8 UNION ALL
  SELECT 60, 6 UNION ALL
  SELECT 50, 10
)
SELECT
  A,
  B,
  CASE A
    WHEN 90 THEN 'red'
    WHEN 50 THEN 'blue'
    ELSE 'green'
    END
    AS result
FROM Numbers

/*------------------*
 | A  | B  | result |
 +------------------+
 | 90 | 2  | red    |
 | 50 | 8  | blue   |
 | 60 | 6  | green  |
 | 50 | 10 | blue   |
 *------------------*/

CASE

CASE
  WHEN condition THEN result
  [ ... ]
  [ ELSE else_result ]
  END

Description

Evaluates the condition of each successive WHEN clause and returns the first result where the condition evaluates to TRUE; any remaining WHEN clauses and else_result aren't evaluated.

If all conditions evaluate to FALSE or NULL, returns evaluation of else_result if present; if else_result isn't present, then returns NULL.

For additional rules on how values are evaluated, see the three-valued logic table in Logical operators.

condition must be a boolean expression. There may be multiple result types. result and else_result expressions must be implicitly coercible to a common supertype.

This expression supports specifying collation.

Return Data Type

Supertype of result[, ...] and else_result.

Example

WITH Numbers AS (
  SELECT 90 as A, 2 as B UNION ALL
  SELECT 50, 6 UNION ALL
  SELECT 20, 10
)
SELECT
  A,
  B,
  CASE
    WHEN A > 60 THEN 'red'
    WHEN B = 6 THEN 'blue'
    ELSE 'green'
    END
    AS result
FROM Numbers

/*------------------*
 | A  | B  | result |
 +------------------+
 | 90 | 2  | red    |
 | 50 | 6  | blue   |
 | 20 | 10 | green  |
 *------------------*/

COALESCE

COALESCE(expr[, ...])

Description

Returns the value of the first non-NULL expression, if any, otherwise NULL. The remaining expressions aren't evaluated. An input expression can be any type. There may be multiple input expression types. All input expressions must be implicitly coercible to a common supertype.

Return Data Type

Supertype of expr[, ...].

Examples

SELECT COALESCE('A', 'B', 'C') as result

/*--------*
 | result |
 +--------+
 | A      |
 *--------*/
SELECT COALESCE(NULL, 'B', 'C') as result

/*--------*
 | result |
 +--------+
 | B      |
 *--------*/

IF

IF(expr, true_result, else_result)

Description

If expr evaluates to TRUE, returns true_result, else returns the evaluation for else_result. else_result isn't evaluated if expr evaluates to TRUE. true_result isn't evaluated if expr evaluates to FALSE or NULL.

expr must be a boolean expression. true_result and else_result must be coercible to a common supertype.

Return Data Type

Supertype of true_result and else_result.

Example

WITH Numbers AS (
  SELECT 10 as A, 20 as B UNION ALL
  SELECT 50, 30 UNION ALL
  SELECT 60, 60
)
SELECT
  A,
  B,
  IF(A < B, 'true', 'false') AS result
FROM Numbers

/*------------------*
 | A  | B  | result |
 +------------------+
 | 10 | 20 | true   |
 | 50 | 30 | false  |
 | 60 | 60 | false  |
 *------------------*/

IFNULL

IFNULL(expr, null_result)

Description

If expr evaluates to NULL, returns null_result. Otherwise, returns expr. If expr doesn't evaluate to NULL, null_result isn't evaluated.

expr and null_result can be any type and must be implicitly coercible to a common supertype. Synonym for COALESCE(expr, null_result).

Return Data Type

Supertype of expr or null_result.

Examples

SELECT IFNULL(NULL, 0) as result

/*--------*
 | result |
 +--------+
 | 0      |
 *--------*/
SELECT IFNULL(10, 0) as result

/*--------*
 | result |
 +--------+
 | 10     |
 *--------*/

NULLIF

NULLIF(expr, expr_to_match)

Description

Returns NULL if expr = expr_to_match evaluates to TRUE, otherwise returns expr.

expr and expr_to_match must be implicitly coercible to a common supertype, and must be comparable.

This expression supports specifying collation.

Return Data Type

Supertype of expr and expr_to_match.

Example

SELECT NULLIF(0, 0) as result

/*--------*
 | result |
 +--------+
 | NULL   |
 *--------*/
SELECT NULLIF(10, 0) as result

/*--------*
 | result |
 +--------+
 | 10     |
 *--------*/

FUNCTIONS

AEAD encryption functions

GoogleSQL for BigQuery supports the following AEAD encryption functions. For a description of how the AEAD encryption functions work, see AEAD encryption concepts.

Function list

Name Summary
AEAD.DECRYPT_BYTES Uses the matching key from a keyset to decrypt a BYTES ciphertext.
AEAD.DECRYPT_STRING Uses the matching key from a keyset to decrypt a BYTES ciphertext into a STRING plaintext.
AEAD.ENCRYPT Encrypts STRING plaintext, using the primary cryptographic key in a keyset.
DETERMINISTIC_DECRYPT_BYTES Uses the matching key from a keyset to decrypt a BYTES ciphertext, using deterministic AEAD.
DETERMINISTIC_DECRYPT_STRING Uses the matching key from a keyset to decrypt a BYTES ciphertext into a STRING plaintext, using deterministic AEAD.
DETERMINISTIC_ENCRYPT Encrypts STRING plaintext, using the primary cryptographic key in a keyset, using deterministic AEAD encryption.
KEYS.ADD_KEY_FROM_RAW_BYTES Adds a key to a keyset, and return the new keyset as a serialized BYTES value.
KEYS.KEYSET_CHAIN Produces a Tink keyset that is encrypted with a Cloud KMS key.
KEYS.KEYSET_FROM_JSON Converts a STRING JSON keyset to a serialized BYTES value.
KEYS.KEYSET_LENGTH Gets the number of keys in the provided keyset.
KEYS.KEYSET_TO_JSON Gets a JSON STRING representation of a keyset.
KEYS.NEW_KEYSET Gets a serialized keyset containing a new key based on the key type.
KEYS.NEW_WRAPPED_KEYSET Creates a new keyset and encrypts it with a Cloud KMS key.
KEYS.REWRAP_KEYSET Re-encrypts a wrapped keyset with a new Cloud KMS key.
KEYS.ROTATE_KEYSET Adds a new primary cryptographic key to a keyset, based on the key type.
KEYS.ROTATE_WRAPPED_KEYSET Rewraps a keyset and rotates it.

AEAD.DECRYPT_BYTES

AEAD.DECRYPT_BYTES(keyset, ciphertext, additional_data)

Description

Uses the matching key from keyset to decrypt ciphertext and verifies the integrity of the data using additional_data. Returns an error if decryption or verification fails.

keyset is a serialized BYTES value returned by one of the KEYS functions or a STRUCT returned by KEYS.KEYSET_CHAIN. keyset must contain the key that was used to encrypt ciphertext, and the key must be in an 'ENABLED' state, or else the function returns an error. AEAD.DECRYPT_BYTES identifies the matching key in keyset by finding the key with the key ID that matches the one encrypted in ciphertext.

ciphertext is a BYTES value that is the result of a call to AEAD.ENCRYPT where the input plaintext was of type BYTES.

If ciphertext includes an initialization vector (IV), it should be the first bytes of ciphertext. If ciphertext includes an authentication tag, it should be the last bytes of ciphertext. If the IV and authentic tag are one (SIV), it should be the first bytes of ciphertext. The IV and authentication tag commonly require 16 bytes, but may vary in size.

additional_data is a STRING or BYTES value that binds the ciphertext to its context. This forces the ciphertext to be decrypted in the same context in which it was encrypted. This function casts any STRING value to BYTES. This must be the same as the additional_data provided to AEAD.ENCRYPT to encrypt ciphertext, ignoring its type, or else the function returns an error.

Return Data Type

BYTES

Example

This example creates a table of unique IDs with associated plaintext values and keysets. Then it uses these keysets to encrypt the plaintext values as BYTES and store them in a new table. Finally, it uses AEAD.DECRYPT_BYTES to decrypt the encrypted values and display them as plaintext.

The following statement creates a table CustomerKeysets containing a column of unique IDs, a column of AEAD_AES_GCM_256 keysets, and a column of favorite animals.

CREATE TABLE aead.CustomerKeysets AS
SELECT
  1 AS customer_id,
  KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset,
  b'jaguar' AS favorite_animal
UNION ALL
SELECT
  2 AS customer_id,
  KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset,
  b'zebra' AS favorite_animal
UNION ALL
SELECT
  3 AS customer_id,
  KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset,
  b'nautilus' AS favorite_animal;

The following statement creates a table EncryptedCustomerData containing a column of unique IDs and a column of ciphertext. The statement encrypts the plaintext favorite_animal using the keyset value from CustomerKeysets corresponding to each unique ID.

CREATE TABLE aead.EncryptedCustomerData AS
SELECT
  customer_id,
  AEAD.ENCRYPT(keyset, favorite_animal, CAST(CAST(customer_id AS STRING) AS BYTES))
   AS encrypted_animal
FROM
  aead.CustomerKeysets AS ck;

The following query uses the keysets in the CustomerKeysets table to decrypt data in the EncryptedCustomerData table.

SELECT
  ecd.customer_id,
  AEAD.DECRYPT_BYTES(
    (SELECT ck.keyset
     FROM aead.CustomerKeysets AS ck
     WHERE ecd.customer_id = ck.customer_id),
    ecd.encrypted_animal,
    CAST(CAST(customer_id AS STRING) AS BYTES)
  ) AS favorite_animal
FROM aead.EncryptedCustomerData AS ecd;

AEAD.DECRYPT_STRING

AEAD.DECRYPT_STRING(keyset, ciphertext, additional_data)

Description

Like AEAD.DECRYPT_BYTES, but where additional_data is of type STRING.

Return Data Type

STRING

AEAD.ENCRYPT

AEAD.ENCRYPT(keyset, plaintext, additional_data)

Description

Encrypts plaintext using the primary cryptographic key in keyset. The algorithm of the primary key must be AEAD_AES_GCM_256. Binds the ciphertext to the context defined by additional_data. Returns NULL if any input is NULL.

keyset is a serialized BYTES value returned by one of the KEYS functions or a STRUCT returned by KEYS.KEYSET_CHAIN.

plaintext is the STRING or BYTES value to be encrypted.

additional_data is a STRING or BYTES value that binds the ciphertext to its context. This forces the ciphertext to be decrypted in the same context in which it was encrypted. plaintext and additional_data must be of the same type. AEAD.ENCRYPT(keyset, string1, string2) is equivalent to AEAD.ENCRYPT(keyset, CAST(string1 AS BYTES), CAST(string2 AS BYTES)).

The output is ciphertext BYTES. The ciphertext contains a Tink-specific prefix indicating the key used to perform the encryption.

Return Data Type

BYTES

Example

The following query uses the keysets for each customer_id in the CustomerKeysets table to encrypt the value of the plaintext favorite_animal in the PlaintextCustomerData table corresponding to that customer_id. The output contains a column of customer_id values and a column of corresponding ciphertext output as BYTES.

WITH CustomerKeysets AS (
  SELECT 1 AS customer_id, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset UNION ALL
  SELECT 2, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') UNION ALL
  SELECT 3, KEYS.NEW_KEYSET('AEAD_AES_GCM_256')
), PlaintextCustomerData AS (
  SELECT 1 AS customer_id, 'elephant' AS favorite_animal UNION ALL
  SELECT 2, 'walrus' UNION ALL
  SELECT 3, 'leopard'
)
SELECT
  pcd.customer_id,
  AEAD.ENCRYPT(
    (SELECT keyset
     FROM CustomerKeysets AS ck
     WHERE ck.customer_id = pcd.customer_id),
    pcd.favorite_animal,
    CAST(pcd.customer_id AS STRING)
  ) AS encrypted_animal
FROM PlaintextCustomerData AS pcd;

DETERMINISTIC_DECRYPT_BYTES

DETERMINISTIC_DECRYPT_BYTES(keyset, ciphertext, additional_data)

Description

Uses the matching key from keyset to decrypt ciphertext and verifies the integrity of the data using additional_data. Returns an error if decryption fails.

keyset is a serialized BYTES value or a STRUCT value returned by one of the KEYS functions. keyset must contain the key that was used to encrypt ciphertext, the key must be in an 'ENABLED' state, and the key must be of type DETERMINISTIC_AEAD_AES_SIV_CMAC_256, or else the function returns an error. DETERMINISTIC_DECRYPT_BYTES identifies the matching key in keyset by finding the key with the key ID that matches the one encrypted in ciphertext.

ciphertext is a BYTES value that is the result of a call to DETERMINISTIC_ENCRYPT where the input plaintext was of type BYTES.

The ciphertext must follow Tink's wire format. The first byte of ciphertext should contain a Tink key version followed by a 4 byte key hint. If ciphertext includes an initialization vector (IV), it should be the next bytes of ciphertext. If ciphertext includes an authentication tag, it should be the last bytes of ciphertext. If the IV and authentic tag are one (SIV), it should be the first bytes of ciphertext. The IV and authentication tag commonly require 16 bytes, but may vary in size.

additional_data is a STRING or BYTES value that binds the ciphertext to its context. This forces the ciphertext to be decrypted in the same context in which it was encrypted. This function casts any STRING value to BYTES. This must be the same as the additional_data provided to DETERMINISTIC_ENCRYPT to encrypt ciphertext, ignoring its type, or else the function returns an error.

Return Data Type

BYTES

Example

This example creates a table of unique IDs with associated plaintext values and keysets. Then it uses these keysets to encrypt the plaintext values as BYTES and store them in a new table. Finally, it uses DETERMINISTIC_DECRYPT_BYTES to decrypt the encrypted values and display them as plaintext.

The following statement creates a table CustomerKeysets containing a column of unique IDs, a column of DETERMINISTIC_AEAD_AES_SIV_CMAC_256 keysets, and a column of favorite animals.

CREATE TABLE deterministic.CustomerKeysets AS
SELECT
  1 AS customer_id,
  KEYS.NEW_KEYSET('DETERMINISTIC_AEAD_AES_SIV_CMAC_256') AS keyset,
  b'jaguar' AS favorite_animal
UNION ALL
SELECT
  2 AS customer_id,
  KEYS.NEW_KEYSET('DETERMINISTIC_AEAD_AES_SIV_CMAC_256') AS keyset,
  b'zebra' AS favorite_animal
UNION ALL
SELECT
  3 AS customer_id,
  KEYS.NEW_KEYSET('DETERMINISTIC_AEAD_AES_SIV_CMAC_256') AS keyset,
  b'nautilus' AS favorite_animal;

The following statement creates a table EncryptedCustomerData containing a column of unique IDs and a column of ciphertext. The statement encrypts the plaintext favorite_animal using the keyset value from CustomerKeysets corresponding to each unique ID.

CREATE TABLE deterministic.EncryptedCustomerData AS
SELECT
  customer_id,
  DETERMINISTIC_ENCRYPT(ck.keyset, favorite_animal, CAST(CAST(customer_id AS STRING) AS BYTES))
   AS encrypted_animal
FROM
  deterministic.CustomerKeysets AS ck;

The following query uses the keysets in the CustomerKeysets table to decrypt data in the EncryptedCustomerData table.

SELECT
  ecd.customer_id,
  DETERMINISTIC_DECRYPT_BYTES(
    (SELECT ck.keyset
     FROM deterministic.CustomerKeysets AS ck
     WHERE ecd.customer_id = ck.customer_id),
    ecd.encrypted_animal,
    CAST(CAST(ecd.customer_id AS STRING) AS BYTES)
  ) AS favorite_animal
FROM deterministic.EncryptedCustomerData AS ecd;

DETERMINISTIC_DECRYPT_STRING

DETERMINISTIC_DECRYPT_STRING(keyset, ciphertext, additional_data)

Description

Like DETERMINISTIC_DECRYPT_BYTES, but where plaintext is of type STRING.

Return Data Type

STRING

DETERMINISTIC_ENCRYPT

DETERMINISTIC_ENCRYPT(keyset, plaintext, additional_data)

Description

Encrypts plaintext using the primary cryptographic key in keyset using deterministic AEAD. The algorithm of the primary key must be DETERMINISTIC_AEAD_AES_SIV_CMAC_256. Binds the ciphertext to the context defined by additional_data. Returns NULL if any input is NULL.

keyset is a serialized BYTES value or a STRUCT value returned by one of the KEYS functions.

plaintext is the STRING or BYTES value to be encrypted.

additional_data is a STRING or BYTES value that binds the ciphertext to its context. This forces the ciphertext to be decrypted in the same context in which it was encrypted. plaintext and additional_data must be of the same type. DETERMINISTIC_ENCRYPT(keyset, string1, string2) is equivalent to DETERMINISTIC_ENCRYPT(keyset, CAST(string1 AS BYTES), CAST(string2 AS BYTES)).

The output is ciphertext BYTES. The ciphertext contains a Tink-specific prefix indicating the key used to perform the encryption. Given an identical keyset and plaintext, this function returns the same ciphertext each time it is invoked (including across queries).

Return Data Type

BYTES

Example

The following query uses the keysets for each customer_id in the CustomerKeysets table to encrypt the value of the plaintext favorite_animal in the PlaintextCustomerData table corresponding to that customer_id. The output contains a column of customer_id values and a column of corresponding ciphertext output as BYTES.

WITH CustomerKeysets AS (
  SELECT 1 AS customer_id,
  KEYS.NEW_KEYSET('DETERMINISTIC_AEAD_AES_SIV_CMAC_256') AS keyset UNION ALL
  SELECT 2, KEYS.NEW_KEYSET('DETERMINISTIC_AEAD_AES_SIV_CMAC_256') UNION ALL
  SELECT 3, KEYS.NEW_KEYSET('DETERMINISTIC_AEAD_AES_SIV_CMAC_256')
), PlaintextCustomerData AS (
  SELECT 1 AS customer_id, 'elephant' AS favorite_animal UNION ALL
  SELECT 2, 'walrus' UNION ALL
  SELECT 3, 'leopard'
)
SELECT
  pcd.customer_id,
  DETERMINISTIC_ENCRYPT(
    (SELECT keyset
     FROM CustomerKeysets AS ck
     WHERE ck.customer_id = pcd.customer_id),
    pcd.favorite_animal,
    CAST(pcd.customer_id AS STRING)
  ) AS encrypted_animal
FROM PlaintextCustomerData AS pcd;

KEYS.ADD_KEY_FROM_RAW_BYTES

KEYS.ADD_KEY_FROM_RAW_BYTES(keyset, key_type, raw_key_bytes)

Description

Returns a serialized keyset as BYTES with the addition of a key to keyset based on key_type and raw_key_bytes.

The primary cryptographic key remains the same as in keyset. The expected length of raw_key_bytes depends on the value of key_type. The following are supported key_types:

  • 'AES_CBC_PKCS': Creates a key for AES decryption using cipher block chaining and PKCS padding. raw_key_bytes is expected to be a raw key BYTES value of length 16, 24, or 32; these lengths have sizes of 128, 192, and 256 bits, respectively. GoogleSQL AEAD functions do not support keys of these types for encryption; instead, prefer 'AEAD_AES_GCM_256' or 'AES_GCM' keys.
  • 'AES_GCM': Creates a key for AES decryption or encryption using Galois/Counter Mode. raw_key_bytes must be a raw key BYTES value of length 16 or 32; these lengths have sizes of 128 and 256 bits, respectively. When keys of this type are inputs to AEAD.ENCRYPT, the output ciphertext does not have a Tink-specific prefix indicating which key was used as input.

Return Data Type

BYTES

Example

The following query creates a table of customer IDs along with raw key bytes, called CustomerRawKeys, and a table of unique IDs, called CustomerIds. It creates a new 'AEAD_AES_GCM_256' keyset for each customer_id; then it adds a new key to each keyset, using the raw_key_bytes value corresponding to that customer_id. The output is a table where each row contains a customer_id and a keyset in BYTES, which contains the raw key added using KEYS.ADD_KEY_FROM_RAW_BYTES.

WITH CustomerRawKeys AS (
  SELECT 1 AS customer_id, b'0123456789012345' AS raw_key_bytes UNION ALL
  SELECT 2, b'9876543210543210' UNION ALL
  SELECT 3, b'0123012301230123'
), CustomerIds AS (
  SELECT 1 AS customer_id UNION ALL
  SELECT 2 UNION ALL
  SELECT 3
)
SELECT
  ci.customer_id,
  KEYS.ADD_KEY_FROM_RAW_BYTES(
    KEYS.NEW_KEYSET('AEAD_AES_GCM_256'),
    'AES_CBC_PKCS',
    (SELECT raw_key_bytes FROM CustomerRawKeys AS crk
     WHERE crk.customer_id = ci.customer_id)
  ) AS keyset
FROM CustomerIds AS ci;

The output keysets each contain two things: the primary cryptographic key created using KEYS.NEW_KEYSET('AEAD_AES_GCM_256'), and the raw key added using KEYS.ADD_KEY_FROM_RAW_BYTES. If a keyset in the output is used with AEAD.ENCRYPT, GoogleSQL uses the primary cryptographic key created using KEYS.NEW_KEYSET('AEAD_AES_GCM_256') to encrypt the input plaintext. If the keyset is used with AEAD.DECRYPT_STRING or AEAD.DECRYPT_BYTES, GoogleSQL returns the resulting plaintext if either key succeeds in decrypting the ciphertext.

KEYS.KEYSET_CHAIN

KEYS.KEYSET_CHAIN(kms_resource_name, first_level_keyset)

Description

Can be used in place of the keyset argument to the AEAD and deterministic encryption functions to pass a Tink keyset that is encrypted with a Cloud KMS key. This function lets you use other AEAD functions without including plaintext keys in a query.

This function takes the following arguments:

  • kms_resource_name: A STRING literal that contains the resource path to the Cloud KMS key that's used to decrypt first_level_keyset. This key must reside in the same Cloud region where this function is executed. A Cloud KMS key looks like this:

    gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key
    
  • first_level_keyset: A BYTES literal that represents a keyset or wrapped keyset.

Return Data Type

STRUCT

Example

This example creates a table of example data, then shows how to encrypt that data using a wrapped (encrypted) keyset. Finally it shows how to query the encrypted version of the data.

The following statement creates a table RawCustomerData containing a column of customer ids and a column of favorite animals.

CREATE TABLE aead.RawCustomerData AS
SELECT
  1 AS customer_id,
  b'jaguar' AS favorite_animal
UNION ALL
SELECT
  2 AS customer_id,
  b'zebra' AS favorite_animal
UNION ALL
SELECT
  3 AS customer_id,
  b'zebra' AS favorite_animal;

The following statement creates a table EncryptedCustomerData containing a column of unique IDs and a column of ciphertext. The statement encrypts the plaintext favorite_animal using the first_level_keyset provided.

DECLARE kms_resource_name STRING;
DECLARE first_level_keyset BYTES;
SET kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';
SET first_level_keyset = b'\012\044\000\107\275\360\176\264\206\332\235\215\304...';

CREATE TABLE aead.EncryptedCustomerData AS
SELECT
  customer_id,
  AEAD.ENCRYPT(
    KEYS.KEYSET_CHAIN(kms_resource_name, first_level_keyset),
    favorite_animal,
    CAST(CAST(customer_id AS STRING) AS BYTES)
  ) AS encrypted_animal
FROM
  aead.RawCustomerData;

The following query uses the first_level_keyset to decrypt data in the EncryptedCustomerData table.

DECLARE kms_resource_name STRING;
DECLARE first_level_keyset BYTES;
SET kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';
SET first_level_keyset = b'\012\044\000\107\275\360\176\264\206\332\235\215\304...';

SELECT
  customer_id,
  AEAD.DECRYPT_BYTES(
    KEYS.KEYSET_CHAIN(kms_resource_name, first_level_keyset),
    encrypted_animal,
    CAST(CAST(customer_id AS STRING) AS BYTES)
  ) AS favorite_animal
FROM
  aead.EncryptedCustomerData;

The previous two steps also work with the DETERMINISTIC_ENCRYPT and DETERMINISTIC_DECRYPT_BYTES functions. The wrapped keyset must be created using the DETERMINISTIC_AEAD_AES_SIV_CMAC_256 type.

The following statement creates a table EncryptedCustomerData containing a column of unique IDs and a column of ciphertext. The statement encrypts the plaintext favorite_animal using the first_level_keyset provided. You can see that the ciphertext for favorite_animal is the same for customers 2 and 3 since their plaintext favorite_animal is the same.

DECLARE kms_resource_name STRING;
DECLARE first_level_keyset BYTES;
SET kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';
SET first_level_keyset = b'\012\044\000\107\275\360\176\264\206\332\235\215\304...';

CREATE TABLE daead.EncryptedCustomerData AS
SELECT
  customer_id,
  DETERMINISTC_ENCRYPT(
    KEYS.KEYSET_CHAIN(kms_resource_name, first_level_keyset),
    favorite_animal,
    CAST(CAST(customer_id AS STRING) AS BYTES)
  ) AS encrypted_animal
FROM
  daead.RawCustomerData;

The following query uses the first_level_keyset to decrypt data in the EncryptedCustomerData table.

DECLARE kms_resource_name STRING;
DECLARE first_level_keyset BYTES;
SET kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';
SET first_level_keyset = b'\012\044\000\107\275\360\176\264\206\332\235\215\304...';

SELECT
  customer_id,
  DETERMINISTIC_DECRYPT_BYTES(
    KEYS.KEYSET_CHAIN(kms_resource_name, first_level_keyset),
    encrypted_animal,
    CAST(CAST(customer_id AS STRING) AS BYTES)
  ) AS favorite_animal
FROM dead.EncryptedCustomerData;

KEYS.KEYSET_FROM_JSON

KEYS.KEYSET_FROM_JSON(json_keyset)

Description

Returns the input json_keyset STRING as serialized BYTES, which is a valid input for other KEYS and AEAD functions. The JSON STRING must be compatible with the definition of the google.crypto.tink.Keyset protocol buffer message: the JSON keyset should be a JSON object containing objects and name-value pairs corresponding to those in the "keyset" message in the google.crypto.tink.Keyset definition. You can convert the output serialized BYTES representation back to a JSON STRING using KEYS.KEYSET_TO_JSON.

Return Data Type

BYTES

Example

KEYS.KEYSET_FROM_JSON takes JSON-formatted STRING values like the following:

{
  "key":[
      {
        "keyData":{
          "keyMaterialType":"SYMMETRIC",
          "typeUrl":"type.googleapis.com/google.crypto.tink.AesGcmKey",
          "value":"GiD80Z8kL6AP3iSNHhqseZGAIvq7TVQzClT7FQy8YwK3OQ=="
        },
        "keyId":3101427138,
        "outputPrefixType":"TINK",
        "status":"ENABLED"
      }
    ],
  "primaryKeyId":3101427138
}

The following query creates a new keyset from a JSON-formatted STRING json_keyset:

SELECT KEYS.KEYSET_FROM_JSON(json_keyset);

This returns the json_keyset serialized as BYTES, like the following:

\x08\x9d\x8e\x85\x82\x09\x12d\x0aX\x0a0
type.googleapis.com/google.crypto.tink.AesGcmKey\x12\"\x1a qX\xe4IG\x87\x1f\xde
\xe3)+e\x98\x0a\x1c}\xfe\x88<\x12\xeb\xc1t\xb8\x83\x1a\xcd\xa8\x97\x84g\x18\x01
\x10\x01\x18\x9d\x8e\x85\x82\x09 \x01

KEYS.KEYSET_LENGTH

KEYS.KEYSET_LENGTH(keyset)

Description

Returns the number of keys in the provided keyset.

Return Data Type

INT64

Example

This example references a JSON-formatted STRING called json_keyset that contains two keys:

{
   "primaryKeyId":1354994251,
   "key":[
      {
         "keyData":{
            "keyMaterialType":"SYMMETRIC",
            "typeUrl":"type.googleapis.com/google.crypto.tink.AesGcmKey",
            "value":"GiD9sxQRgFj4aYN78vaIlxInjZkG/uvyWSY9a8GN+ELV2Q=="
         },
         "keyId":1354994251,
         "outputPrefixType":"TINK",
         "status":"ENABLED"
      }
   ],
   "key":[
      {
         "keyData":{
            "keyMaterialType":"SYMMETRIC",
            "typeUrl":"type.googleapis.com/google.crypto.tink.AesGcmKey",
            "value":"PRn76sxQRgFj4aYN00vaIlxInjZkG/uvyWSY9a2bLRm"
         },
         "keyId":852264701,
         "outputPrefixType":"TINK",
         "status":"DISABLED"
      }
   ]
}

The following query converts json_keyset to a keyset and then returns the number of keys in the keyset:

SELECT KEYS.KEYSET_LENGTH(KEYS.KEYSET_FROM_JSON(json_keyset)) as key_count;

/*-----------*
 | key_count |
 +-----------+
 | 2         |
 *-----------*/

KEYS.KEYSET_TO_JSON

KEYS.KEYSET_TO_JSON(keyset)

Description

Returns a JSON STRING representation of the input keyset. The returned JSON STRING is compatible with the definition of the google.crypto.tink.Keyset protocol buffer message. You can convert the JSON STRING representation back to BYTES using KEYS.KEYSET_FROM_JSON.

Return Data Type

STRING

Example

The following query returns a new 'AEAD_AES_GCM_256' keyset as a JSON-formatted STRING.

SELECT KEYS.KEYSET_TO_JSON(KEYS.NEW_KEYSET('AEAD_AES_GCM_256'));

The result is a STRING like the following.

{
  "key":[
      {
        "keyData":{
          "keyMaterialType":"SYMMETRIC",
          "typeUrl":"type.googleapis.com/google.crypto.tink.AesGcmKey",
          "value":"GiD80Z8kL6AP3iSNHhqseZGAIvq7TVQzClT7FQy8YwK3OQ=="
        },
        "keyId":3101427138,
        "outputPrefixType":"TINK",
        "status":"ENABLED"
      }
    ],
  "primaryKeyId":3101427138
}

KEYS.NEW_KEYSET

KEYS.NEW_KEYSET(key_type)

Description

Returns a serialized keyset containing a new key based on key_type. The returned keyset is a serialized BYTES representation of google.crypto.tink.Keyset that contains a primary cryptographic key and no additional keys. You can use the keyset with the AEAD.ENCRYPT, AEAD.DECRYPT_BYTES, and AEAD.DECRYPT_STRING functions for encryption and decryption, as well as with the KEYS group of key- and keyset-related functions.

key_type is a STRING literal representation of the type of key to create. key_type cannot be NULL. key_type can be:

  • AEAD_AES_GCM_256: Creates a 256-bit key with the pseudo-random number generator provided by boringSSL. The key uses AES-GCM for encryption and decryption operations.
  • DETERMINISTIC_AEAD_AES_SIV_CMAC_256: Creates a 512-bit AES-SIV-CMAC key, which contains a 256-bit AES-CTR key and 256-bit AES-CMAC key. The AES-SIV-CMAC key is created with the pseudo-random number generator provided by boringSSL. The key uses AES-SIV for encryption and decryption operations.

Return Data Type

BYTES

Example

The following query creates a keyset for each row in CustomerIds, which can subsequently be used to encrypt data. Each keyset contains a single encryption key with randomly-generated key data. Each row in the output contains a customer_id and an 'AEAD_AES_GCM_256' key in BYTES.

SELECT customer_id, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset
FROM (
  SELECT 1 AS customer_id UNION ALL
  SELECT 2 UNION ALL
  SELECT 3
) AS CustomerIds;

KEYS.NEW_WRAPPED_KEYSET

KEYS.NEW_WRAPPED_KEYSET(kms_resource_name, key_type)

Description

Creates a new keyset and encrypts it with a Cloud KMS key. Returns the wrapped keyset as a BYTES representation of google.crypto.tink.Keyset that contains a primary cryptographic key and no additional keys.

This function takes the following arguments:

  • kms_resource_name: A STRING literal representation of the Cloud KMS key. kms_resource_name cannot be NULL. The Cloud KMS key must reside in the same Cloud region where this function is executed. A Cloud KMS key looks like this:

    gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key
    
  • key_type: A STRING literal representation of the keyset type. key_type cannot be NULL but can be one of the following values:

    • AEAD_AES_GCM_256: Creates a 256-bit key with the pseudo-random number generator provided by boringSSL. The key uses AES-GCM for encryption and decryption operations.

    • DETERMINISTIC_AEAD_AES_SIV_CMAC_256: Creates a 512-bit AES-SIV-CMAC key, which contains a 256-bit AES-CTR key and 256-bit AES-CMAC key. The AES-SIV-CMAC key is created with the pseudo-random number generator provided by boringSSL. The key uses AES-SIV for encryption and decryption operations.

Return Data Type

BYTES

Example

Put the following variables above each example query that you run:

DECLARE kms_resource_name STRING;
SET kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';

The following query creates a wrapped keyset, which contains the ciphertext produced by encrypting a Tink keyset with the specified Cloud KMS key. If you run the query multiple times, it generates multiple wrapped keysets, and each wrapped keyset is unique to each query that is run.

SELECT KEYS.NEW_WRAPPED_KEYSET(kms_resource_name, 'AEAD_AES_GCM_256');

Multiple calls to this function with the same arguments in one query returns the same value. For example, the following query only creates one wrapped keyset and returns it for each row in a table called my_table.

SELECT
  *,
  KEYS.NEW_WRAPPED_KEYSET(kms_resource_name, 'AEAD_AES_GCM_256')
FROM my_table

KEYS.REWRAP_KEYSET

KEYS.REWRAP_KEYSET(source_kms_resource_name, target_kms_resource_name, wrapped_keyset)

Description

Re-encrypts a wrapped keyset with a new Cloud KMS key. Returns the wrapped keyset as a BYTES representation of google.crypto.tink.Keyset that contains a primary cryptographic key and no additional keys.

When this function is used, a wrapped keyset is decrypted by source_kms_resource_name and then re-encrypted by target_kms_resource_name. During this process, the decrypted keyset is never visible to customers.

This function takes the following arguments:

  • source_kms_resource_name: A STRING literal representation of the Cloud KMS key you want to replace. This key must reside in the same Cloud region where this function is executed. A Cloud KMS key looks like this:

    gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key
    
  • target_kms_resource_name: A STRING literal representation of the new Cloud KMS key that you want to use.

  • wrapped_keyset: A BYTES literal representation of the keyset that you want to re-encrypt.

Return Data Type

BYTES

Example

Put the following variables above each example query that you run:

DECLARE source_kms_resource_name STRING;
DECLARE target_kms_resource_name STRING;
DECLARE wrapped_keyset BYTES;
SET source_kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';
SET target_kms_resource_name = 'gcp-kms://projects/my-project/locations/another-location/keyRings/my-key-ring/cryptoKeys/my-other-crypto-key';
SET wrapped_keyset = b'\012\044\000\107\275\360\176\264\206\332\235\215\304...';

The following query rewraps a wrapped keyset. If you run the query multiple times, it generates multiple wrapped keysets, and each wrapped keyset is unique to each query that is run.

SELECT KEYS.REWRAP_KEYSET(source_kms_resource_name, target_kms_resource_name, wrapped_keyset);

Multiple calls to this function with the same arguments in one query returns the same value. For example, the following query only creates one wrapped keyset and returns it for each row in a table called my_table.

SELECT
  *,
  KEYS.REWRAP_KEYSET(source_kms_resource_name, target_kms_resource_name, wrapped_keyset)
FROM my_table

KEYS.ROTATE_KEYSET

KEYS.ROTATE_KEYSET(keyset, key_type)

Description

Adds a new key to keyset based on key_type. This new key becomes the primary cryptographic key of the new keyset. Returns the new keyset serialized as BYTES.

The old primary cryptographic key from the input keyset remains an additional key in the returned keyset.

The new key_type must match the key type of existing keys in the keyset.

Return Data Type

BYTES

Example

The following statement creates a table containing a column of unique customer_id values and 'AEAD_AES_GCM_256' keysets. Then, it creates a new primary cryptographic key within each keyset in the source table using KEYS.ROTATE_KEYSET. Each row in the output contains a customer_id and an 'AEAD_AES_GCM_256' keyset in BYTES.

WITH ExistingKeysets AS (
SELECT 1 AS customer_id, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset
    UNION ALL
  SELECT 2, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') UNION ALL
  SELECT 3, KEYS.NEW_KEYSET('AEAD_AES_GCM_256')
)
SELECT customer_id, KEYS.ROTATE_KEYSET(keyset, 'AEAD_AES_GCM_256') AS keyset
FROM ExistingKeysets;

KEYS.ROTATE_WRAPPED_KEYSET

KEYS.ROTATE_WRAPPED_KEYSET(kms_resource_name, wrapped_keyset, key_type)

Description

Takes an existing wrapped keyset and returns a rotated and rewrapped keyset. The returned wrapped keyset is a BYTES representation of google.crypto.tink.Keyset.

When this function is used, the wrapped keyset is decrypted, the new key is added, and then the keyset is re-encrypted. The primary cryptographic key from the input wrapped_keyset remains as an additional key in the returned keyset. During this rotation process, the decrypted keyset is never visible to customers.

This function takes the following arguments:

  • kms_resource_name: A STRING literal representation of the Cloud KMS key that was used to wrap the wrapped keyset. The Cloud KMS key must reside in the same Cloud region where this function is executed. A Cloud KMS key looks like this:

    gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key
    
  • wrapped_keyset: A BYTES literal representation of the existing keyset that you want to work with.

  • key_type: A STRING literal representation of the keyset type. This must match the key type of existing keys in wrapped_keyset.

Return Data Type

BYTES

Example

Put the following variables above each example query that you run:

DECLARE kms_resource_name STRING;
DECLARE wrapped_keyset BYTES;
SET kms_resource_name = 'gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key';
SET wrapped_keyset = b'\012\044\000\107\275\360\176\264\206\332\235\215\304...';

The following query rotates a wrapped keyset. If you run the query multiple times, it generates multiple wrapped keysets, and each wrapped keyset is unique to each query that is run.

SELECT KEYS.ROTATE_WRAPPED_KEYSET(kms_resource_name, wrapped_keyset, 'AEAD_AES_GCM_256');

Multiple calls to this function with the same arguments in one query returns the same value. For example, the following query only creates one wrapped keyset and returns it for each row in a table called my_table.

SELECT
  *,
  KEYS.ROTATE_WRAPPED_KEYSET(kms_resource_name, wrapped_keyset, 'AEAD_AES_GCM_256')
FROM my_table

Aggregate functions

GoogleSQL for BigQuery supports the following general aggregate functions. To learn about the syntax for aggregate function calls, see Aggregate function calls.

Function list

Name Summary
ANY_VALUE Gets an expression for some row.
ARRAY_AGG Gets an array of values.
ARRAY_CONCAT_AGG Concatenates arrays and returns a single array as a result.
AVG Gets the average of non-NULL values.
BIT_AND Performs a bitwise AND operation on an expression.
BIT_OR Performs a bitwise OR operation on an expression.
BIT_XOR Performs a bitwise XOR operation on an expression.
COUNT Gets the number of rows in the input, or the number of rows with an expression evaluated to any value other than NULL.
COUNTIF Gets the count of TRUE values for an expression.
GROUPING Checks if a groupable value in the GROUP BY clause is aggregated.
LOGICAL_AND Gets the logical AND of all non-NULL expressions.
LOGICAL_OR Gets the logical OR of all non-NULL expressions.
MAX Gets the maximum non-NULL value.
MAX_BY Synonym for ANY_VALUE(x HAVING MAX y).
MIN Gets the minimum non-NULL value.
MIN_BY Synonym for ANY_VALUE(x HAVING MIN y).
STRING_AGG Concatenates non-NULL STRING or BYTES values.
SUM Gets the sum of non-NULL values.

ANY_VALUE

ANY_VALUE(
  expression
  [ HAVING { MAX | MIN } expression2 ]
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns expression for some row chosen from the group. Which row is chosen is nondeterministic, not random. Returns NULL when the input produces no rows. Returns NULL when expression or expression2 is NULL for all rows in the group.

ANY_VALUE behaves as if IGNORE NULLS is specified; rows for which expression is NULL are not considered and won't be selected.

If the HAVING clause is included in the ANY_VALUE function, the OVER clause can't be used with this function.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

To learn more about the OVER clause and how to use it, see Window function calls.

Supported Argument Types

Any

Returned Data Types

Matches the input data type.

Examples

SELECT ANY_VALUE(fruit) as any_value
FROM UNNEST(["apple", "banana", "pear"]) as fruit;

/*-----------*
 | any_value |
 +-----------+
 | apple     |
 *-----------*/
SELECT
  fruit,
  ANY_VALUE(fruit) OVER (ORDER BY LENGTH(fruit) ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS any_value
FROM UNNEST(["apple", "banana", "pear"]) as fruit;

/*--------+-----------*
 | fruit  | any_value |
 +--------+-----------+
 | pear   | pear      |
 | apple  | pear      |
 | banana | apple     |
 *--------+-----------*/
WITH
  Store AS (
    SELECT 20 AS sold, "apples" AS fruit
    UNION ALL
    SELECT 30 AS sold, "pears" AS fruit
    UNION ALL
    SELECT 30 AS sold, "bananas" AS fruit
    UNION ALL
    SELECT 10 AS sold, "oranges" AS fruit
  )
SELECT ANY_VALUE(fruit HAVING MAX sold) AS a_highest_selling_fruit FROM Store;

/*-------------------------*
 | a_highest_selling_fruit |
 +-------------------------+
 | pears                   |
 *-------------------------*/
WITH
  Store AS (
    SELECT 20 AS sold, "apples" AS fruit
    UNION ALL
    SELECT 30 AS sold, "pears" AS fruit
    UNION ALL
    SELECT 30 AS sold, "bananas" AS fruit
    UNION ALL
    SELECT 10 AS sold, "oranges" AS fruit
  )
SELECT ANY_VALUE(fruit HAVING MIN sold) AS a_lowest_selling_fruit FROM Store;

/*-------------------------*
 | a_lowest_selling_fruit  |
 +-------------------------+
 | oranges                 |
 *-------------------------*/

ARRAY_AGG

ARRAY_AGG(
  [ DISTINCT ]
  expression
  [ { IGNORE | RESPECT } NULLS ]
  [ ORDER BY key [ { ASC | DESC } ] [, ... ] ]
  [ LIMIT n ]
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns an ARRAY of expression values.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

If this function is used with the OVER clause, it's part of a window function call. In a window function call, aggregate function clauses can't be used. To learn more about the OVER clause and how to use it, see Window function calls.

An error is raised if an array in the final query result contains a NULL element.

Supported Argument Types

All data types except ARRAY.

Returned Data Types

ARRAY

If there are zero input rows, this function returns NULL.

Examples

SELECT ARRAY_AGG(x) AS array_agg FROM UNNEST([2, 1,-2, 3, -2, 1, 2]) AS x;

/*-------------------------*
 | array_agg               |
 +-------------------------+
 | [2, 1, -2, 3, -2, 1, 2] |
 *-------------------------*/
SELECT ARRAY_AGG(DISTINCT x) AS array_agg
FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x;

/*---------------*
 | array_agg     |
 +---------------+
 | [2, 1, -2, 3] |
 *---------------*/
SELECT ARRAY_AGG(x IGNORE NULLS) AS array_agg
FROM UNNEST([NULL, 1, -2, 3, -2, 1, NULL]) AS x;

/*-------------------*
 | array_agg         |
 +-------------------+
 | [1, -2, 3, -2, 1] |
 *-------------------*/
SELECT ARRAY_AGG(x ORDER BY ABS(x)) AS array_agg
FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x;

/*-------------------------*
 | array_agg               |
 +-------------------------+
 | [1, 1, 2, -2, -2, 2, 3] |
 *-------------------------*/
SELECT ARRAY_AGG(x LIMIT 5) AS array_agg
FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x;

/*-------------------*
 | array_agg         |
 +-------------------+
 | [2, 1, -2, 3, -2] |
 *-------------------*/
WITH vals AS
  (
    SELECT 1 x UNION ALL
    SELECT -2 x UNION ALL
    SELECT 3 x UNION ALL
    SELECT -2 x UNION ALL
    SELECT 1 x
  )
SELECT ARRAY_AGG(DISTINCT x ORDER BY x) as array_agg
FROM vals;

/*------------*
 | array_agg  |
 +------------+
 | [-2, 1, 3] |
 *------------*/
WITH vals AS
  (
    SELECT 1 x, 'a' y UNION ALL
    SELECT 1 x, 'b' y UNION ALL
    SELECT 2 x, 'a' y UNION ALL
    SELECT 2 x, 'c' y
  )
SELECT x, ARRAY_AGG(y) as array_agg
FROM vals
GROUP BY x;

/*---------------*
 | x | array_agg |
 +---------------+
 | 1 | [a, b]    |
 | 2 | [a, c]    |
 *---------------*/
SELECT
  x,
  ARRAY_AGG(x) OVER (ORDER BY ABS(x)) AS array_agg
FROM UNNEST([2, 1, -2, 3, -2, 1, 2]) AS x;

/*----+-------------------------*
 | x  | array_agg               |
 +----+-------------------------+
 | 1  | [1, 1]                  |
 | 1  | [1, 1]                  |
 | 2  | [1, 1, 2, -2, -2, 2]    |
 | -2 | [1, 1, 2, -2, -2, 2]    |
 | -2 | [1, 1, 2, -2, -2, 2]    |
 | 2  | [1, 1, 2, -2, -2, 2]    |
 | 3  | [1, 1, 2, -2, -2, 2, 3] |
 *----+-------------------------*/

ARRAY_CONCAT_AGG

ARRAY_CONCAT_AGG(
  expression
  [ ORDER BY key [ { ASC | DESC } ] [, ... ] ]
  [ LIMIT n ]
)

Description

Concatenates elements from expression of type ARRAY, returning a single array as a result.

This function ignores NULL input arrays, but respects the NULL elements in non-NULL input arrays. An error is raised, however, if an array in the final query result contains a NULL element. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

ARRAY

Returned Data Types

ARRAY

Examples

SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x)) AS array_concat_agg FROM (
  SELECT [NULL, 1, 2, 3, 4] AS x
  UNION ALL SELECT NULL
  UNION ALL SELECT [5, 6]
  UNION ALL SELECT [7, 8, 9]
);

/*-----------------------------------*
 | array_concat_agg                  |
 +-----------------------------------+
 | [NULL, 1, 2, 3, 4, 5, 6, 7, 8, 9] |
 *-----------------------------------*/
SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x ORDER BY ARRAY_LENGTH(x))) AS array_concat_agg FROM (
  SELECT [1, 2, 3, 4] AS x
  UNION ALL SELECT [5, 6]
  UNION ALL SELECT [7, 8, 9]
);

/*-----------------------------------*
 | array_concat_agg                  |
 +-----------------------------------+
 | [5, 6, 7, 8, 9, 1, 2, 3, 4]       |
 *-----------------------------------*/
SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x LIMIT 2)) AS array_concat_agg FROM (
  SELECT [1, 2, 3, 4] AS x
  UNION ALL SELECT [5, 6]
  UNION ALL SELECT [7, 8, 9]
);

/*--------------------------*
 | array_concat_agg         |
 +--------------------------+
 | [1, 2, 3, 4, 5, 6]       |
 *--------------------------*/
SELECT FORMAT("%T", ARRAY_CONCAT_AGG(x ORDER BY ARRAY_LENGTH(x) LIMIT 2)) AS array_concat_agg FROM (
  SELECT [1, 2, 3, 4] AS x
  UNION ALL SELECT [5, 6]
  UNION ALL SELECT [7, 8, 9]
);

/*------------------*
 | array_concat_agg |
 +------------------+
 | [5, 6, 7, 8, 9]  |
 *------------------*/

AVG

AVG(
  [ DISTINCT ]
  expression
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns the average of non-NULL values in an aggregated group.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

This function can be used with the AGGREGATION_THRESHOLD clause.

If this function is used with the OVER clause, it's part of a window function call. In a window function call, aggregate function clauses can't be used. To learn more about the OVER clause and how to use it, see Window function calls.

AVG can be used with differential privacy. For more information, see Differentially private aggregate functions.

Caveats:

  • If the aggregated group is empty or the argument is NULL for all rows in the group, returns NULL.
  • If the argument is NaN for any row in the group, returns NaN.
  • If the argument is [+|-]Infinity for any row in the group, returns either [+|-]Infinity or NaN.
  • If there is numeric overflow, produces an error.
  • If a floating-point type is returned, the result is non-deterministic, which means you might receive a different result each time you use this function.

Supported Argument Types

  • Any numeric input type
  • INTERVAL

Returned Data Types

INPUTINT64NUMERICBIGNUMERICFLOAT64INTERVAL
OUTPUTFLOAT64NUMERICBIGNUMERICFLOAT64INTERVAL

Examples

SELECT AVG(x) as avg
FROM UNNEST([0, 2, 4, 4, 5]) as x;

/*-----*
 | avg |
 +-----+
 | 3   |
 *-----*/
SELECT AVG(DISTINCT x) AS avg
FROM UNNEST([0, 2, 4, 4, 5]) AS x;

/*------*
 | avg  |
 +------+
 | 2.75 |
 *------*/
SELECT
  x,
  AVG(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS avg
FROM UNNEST([0, 2, NULL, 4, 4, 5]) AS x;

/*------+------*
 | x    | avg  |
 +------+------+
 | NULL | NULL |
 | 0    | 0    |
 | 2    | 1    |
 | 4    | 3    |
 | 4    | 4    |
 | 5    | 4.5  |
 *------+------*/

BIT_AND

BIT_AND(
  expression
)

Description

Performs a bitwise AND operation on expression and returns the result.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

  • INT64

Returned Data Types

INT64

Examples

SELECT BIT_AND(x) as bit_and FROM UNNEST([0xF001, 0x00A1]) as x;

/*---------*
 | bit_and |
 +---------+
 | 1       |
 *---------*/

BIT_OR

BIT_OR(
  expression
)

Description

Performs a bitwise OR operation on expression and returns the result.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

  • INT64

Returned Data Types

INT64

Examples

SELECT BIT_OR(x) as bit_or FROM UNNEST([0xF001, 0x00A1]) as x;

/*--------*
 | bit_or |
 +--------+
 | 61601  |
 *--------*/

BIT_XOR

BIT_XOR(
  [ DISTINCT ]
  expression
)

Description

Performs a bitwise XOR operation on expression and returns the result.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

  • INT64

Returned Data Types

INT64

Examples

SELECT BIT_XOR(x) AS bit_xor FROM UNNEST([5678, 1234]) AS x;

/*---------*
 | bit_xor |
 +---------+
 | 4860    |
 *---------*/
SELECT BIT_XOR(x) AS bit_xor FROM UNNEST([1234, 5678, 1234]) AS x;

/*---------*
 | bit_xor |
 +---------+
 | 5678    |
 *---------*/
SELECT BIT_XOR(DISTINCT x) AS bit_xor FROM UNNEST([1234, 5678, 1234]) AS x;

/*---------*
 | bit_xor |
 +---------+
 | 4860    |
 *---------*/

COUNT

1.

COUNT(*)
[OVER over_clause]

2.

COUNT(
  [ DISTINCT ]
  expression
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

  1. Returns the number of rows in the input.
  2. Returns the number of rows with expression evaluated to any value other than NULL.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

This function can be used with the AGGREGATION_THRESHOLD clause.

To learn more about the OVER clause and how to use it, see Window function calls.

This function with DISTINCT supports specifying collation.

COUNT can be used with differential privacy. For more information, see Differentially private aggregate functions.

Supported Argument Types

expression can be any data type. If DISTINCT is present, expression can only be a data type that is groupable.

Return Data Types

INT64

Examples

You can use the COUNT function to return the number of rows in a table or the number of distinct values of an expression. For example:

SELECT
  COUNT(*) AS count_star,
  COUNT(DISTINCT x) AS count_dist_x
FROM UNNEST([1, 4, 4, 5]) AS x;

/*------------+--------------*
 | count_star | count_dist_x |
 +------------+--------------+
 | 4          | 3            |
 *------------+--------------*/
SELECT
  x,
  COUNT(*) OVER (PARTITION BY MOD(x, 3)) AS count_star,
  COUNT(DISTINCT x) OVER (PARTITION BY MOD(x, 3)) AS count_dist_x
FROM UNNEST([1, 4, 4, 5]) AS x;

/*------+------------+--------------*
 | x    | count_star | count_dist_x |
 +------+------------+--------------+
 | 1    | 3          | 2            |
 | 4    | 3          | 2            |
 | 4    | 3          | 2            |
 | 5    | 1          | 1            |
 *------+------------+--------------*/
SELECT
  x,
  COUNT(*) OVER (PARTITION BY MOD(x, 3)) AS count_star,
  COUNT(x) OVER (PARTITION BY MOD(x, 3)) AS count_x
FROM UNNEST([1, 4, NULL, 4, 5]) AS x;

/*------+------------+---------*
 | x    | count_star | count_x |
 +------+------------+---------+
 | NULL | 1          | 0       |
 | 1    | 3          | 3       |
 | 4    | 3          | 3       |
 | 4    | 3          | 3       |
 | 5    | 1          | 1       |
 *------+------------+---------*/

If you want to count the number of distinct values of an expression for which a certain condition is satisfied, this is one recipe that you can use:

COUNT(DISTINCT IF(condition, expression, NULL))

Here, IF will return the value of expression if condition is TRUE, or NULL otherwise. The surrounding COUNT(DISTINCT ...) will ignore the NULL values, so it will count only the distinct values of expression for which condition is TRUE.

For example, to count the number of distinct positive values of x:

SELECT COUNT(DISTINCT IF(x > 0, x, NULL)) AS distinct_positive
FROM UNNEST([1, -2, 4, 1, -5, 4, 1, 3, -6, 1]) AS x;

/*-------------------*
 | distinct_positive |
 +-------------------+
 | 3                 |
 *-------------------*/

Or to count the number of distinct dates on which a certain kind of event occurred:

WITH Events AS (
  SELECT DATE '2021-01-01' AS event_date, 'SUCCESS' AS event_type
  UNION ALL
  SELECT DATE '2021-01-02' AS event_date, 'SUCCESS' AS event_type
  UNION ALL
  SELECT DATE '2021-01-02' AS event_date, 'FAILURE' AS event_type
  UNION ALL
  SELECT DATE '2021-01-03' AS event_date, 'SUCCESS' AS event_type
  UNION ALL
  SELECT DATE '2021-01-04' AS event_date, 'FAILURE' AS event_type
  UNION ALL
  SELECT DATE '2021-01-04' AS event_date, 'FAILURE' AS event_type
)
SELECT
  COUNT(DISTINCT IF(event_type = 'FAILURE', event_date, NULL))
    AS distinct_dates_with_failures
FROM Events;

/*------------------------------*
 | distinct_dates_with_failures |
 +------------------------------+
 | 2                            |
 *------------------------------*/

COUNTIF

COUNTIF(
  [ DISTINCT ]
  expression
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns the count of TRUE values for expression. Returns 0 if there are zero input rows, or if expression evaluates to FALSE or NULL for all rows.

Since expression must be a BOOL, the form COUNTIF(DISTINCT ...) is generally not useful: there is only one distinct value of TRUE. So COUNTIF(DISTINCT ...) will return 1 if expression evaluates to TRUE for one or more input rows, or 0 otherwise. Usually when someone wants to combine COUNTIF and DISTINCT, they want to count the number of distinct values of an expression for which a certain condition is satisfied. One recipe to achieve this is the following:

COUNT(DISTINCT IF(condition, expression, NULL))

Note that this uses COUNT, not COUNTIF; the IF part has been moved inside. To learn more, see the examples for COUNT.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

This function can be used with the AGGREGATION_THRESHOLD clause.

To learn more about the OVER clause and how to use it, see Window function calls.

Supported Argument Types

BOOL

Return Data Types

INT64

Examples

SELECT COUNTIF(x<0) AS num_negative, COUNTIF(x>0) AS num_positive
FROM UNNEST([5, -2, 3, 6, -10, -7, 4, 0]) AS x;

/*--------------+--------------*
 | num_negative | num_positive |
 +--------------+--------------+
 | 3            | 4            |
 *--------------+--------------*/
SELECT
  x,
  COUNTIF(x<0) OVER (ORDER BY ABS(x) ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS num_negative
FROM UNNEST([5, -2, 3, 6, -10, NULL, -7, 4, 0]) AS x;

/*------+--------------*
 | x    | num_negative |
 +------+--------------+
 | NULL | 0            |
 | 0    | 1            |
 | -2   | 1            |
 | 3    | 1            |
 | 4    | 0            |
 | 5    | 0            |
 | 6    | 1            |
 | -7   | 2            |
 | -10  | 2            |
 *------+--------------*/

GROUPING

GROUPING(groupable_value)

Description

If a groupable item in the GROUP BY clause is aggregated (and thus not grouped), this function returns 1. Otherwise, this function returns 0.

Definitions:

  • groupable_value: An expression that represents a value that can be grouped in the GROUP BY clause.

Details:

The GROUPING function is helpful if you need to determine which rows are produced by which grouping sets. A grouping set is a group of columns by which rows can be grouped together. So, if you need to filter rows by a few specific grouping sets, you can use the GROUPING function to identify which grouping sets grouped which rows by creating a matrix of the results.

In addition, you can use the GROUPING function to determine the type of NULL produced by the GROUP BY clause. In some cases, the GROUP BY clause produces a NULL placeholder. This placeholder represents all groupable items that are aggregated (not grouped) in the current grouping set. This is different from a standard NULL, which can also be produced by a query.

For more information, see the following examples.

Returned Data Type

INT64

Examples

In the following example, it's difficult to determine which rows are grouped by the grouping value product_type or product_name. The GROUPING function makes this easier to determine.

Pay close attention to what's in the product_type_agg and product_name_agg column matrix. This determines how the rows are grouped.

product_type_agg product_name_agg Notes
1 0 Rows are grouped by product_name.
0 1 Rows are grouped by product_type.
0 0 Rows are grouped by product_type and product_name.
1 1 Grand total row.
WITH
  Products AS (
    SELECT 'shirt' AS product_type, 't-shirt' AS product_name, 3 AS product_count UNION ALL
    SELECT 'shirt', 't-shirt', 8 UNION ALL
    SELECT 'shirt', 'polo', 25 UNION ALL
    SELECT 'pants', 'jeans', 6
  )
SELECT
  product_type,
  product_name,
  SUM(product_count) AS product_sum,
  GROUPING(product_type) AS product_type_agg,
  GROUPING(product_name) AS product_name_agg,
FROM Products
GROUP BY GROUPING SETS(product_type, product_name, ())
ORDER BY product_name;

/*--------------+--------------+-------------+------------------+------------------+
 | product_type | product_name | product_sum | product_type_agg | product_name_agg |
 +--------------+--------------+-------------+------------------+------------------+
 | NULL         | NULL         | 42          | 1                | 1                |
 | shirt        | NULL         | 36          | 0                | 1                |
 | pants        | NULL         | 6           | 0                | 1                |
 | NULL         | jeans        | 6           | 1                | 0                |
 | NULL         | polo         | 25          | 1                | 0                |
 | NULL         | t-shirt      | 11          | 1                | 0                |
 +--------------+--------------+-------------+------------------+------------------*/

In the following example, it's difficult to determine if NULL represents a NULL placeholder or a standard NULL value in the product_type column. The GROUPING function makes it easier to determine what type of NULL is being produced. If product_type_is_aggregated is 1, the NULL value for the product_type column is a NULL placeholder.

WITH
  Products AS (
    SELECT 'shirt' AS product_type, 't-shirt' AS product_name, 3 AS product_count UNION ALL
    SELECT 'shirt', 't-shirt', 8 UNION ALL
    SELECT NULL, 'polo', 25 UNION ALL
    SELECT 'pants', 'jeans', 6
  )
SELECT
  product_type,
  product_name,
  SUM(product_count) AS product_sum,
  GROUPING(product_type) AS product_type_is_aggregated
FROM Products
GROUP BY GROUPING SETS(product_type, product_name)
ORDER BY product_name;

/*--------------+--------------+-------------+----------------------------+
 | product_type | product_name | product_sum | product_type_is_aggregated |
 +--------------+--------------+-------------+----------------------------+
 | shirt        | NULL         | 11          | 0                          |
 | NULL         | NULL         | 25          | 0                          |
 | pants        | NULL         | 6           | 0                          |
 | NULL         | jeans        | 6           | 1                          |
 | NULL         | polo         | 25          | 1                          |
 | NULL         | t-shirt      | 11          | 1                          |
 +--------------+--------------+-------------+----------------------------*/

LOGICAL_AND

LOGICAL_AND(
  expression
)

Description

Returns the logical AND of all non-NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

This function can be used with the AGGREGATION_THRESHOLD clause.

Supported Argument Types

BOOL

Return Data Types

BOOL

Examples

LOGICAL_AND returns FALSE because not all of the values in the array are less than 3.

SELECT LOGICAL_AND(x < 3) AS logical_and FROM UNNEST([1, 2, 4]) AS x;

/*-------------*
 | logical_and |
 +-------------+
 | FALSE       |
 *-------------*/

LOGICAL_OR

LOGICAL_OR(
  expression
)

Description

Returns the logical OR of all non-NULL expressions. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

This function can be used with the AGGREGATION_THRESHOLD clause.

Supported Argument Types

BOOL

Return Data Types

BOOL

Examples

LOGICAL_OR returns TRUE because at least one of the values in the array is less than 3.

SELECT LOGICAL_OR(x < 3) AS logical_or FROM UNNEST([1, 2, 4]) AS x;

/*------------*
 | logical_or |
 +------------+
 | TRUE       |
 *------------*/

MAX

MAX(
  expression
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns the maximum non-NULL value in an aggregated group.

Caveats:

  • If the aggregated group is empty or the argument is NULL for all rows in the group, returns NULL.
  • If the argument is NaN for any row in the group, returns NaN.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

To learn more about the OVER clause and how to use it, see Window function calls.

This function supports specifying collation.

Supported Argument Types

Any orderable data type except for ARRAY.

Return Data Types

The data type of the input values.

Examples

SELECT MAX(x) AS max
FROM UNNEST([8, 37, 55, 4]) AS x;

/*-----*
 | max |
 +-----+
 | 55  |
 *-----*/
SELECT x, MAX(x) OVER (PARTITION BY MOD(x, 2)) AS max
FROM UNNEST([8, NULL, 37, 55, NULL, 4]) AS x;

/*------+------*
 | x    | max  |
 +------+------+
 | NULL | NULL |
 | NULL | NULL |
 | 8    | 8    |
 | 4    | 8    |
 | 37   | 55   |
 | 55   | 55   |
 *------+------*/

MAX_BY

MAX_BY(
  x, y
)

Description

Synonym for ANY_VALUE(x HAVING MAX y).

Return Data Types

Matches the input x data type.

Examples

WITH fruits AS (
  SELECT "apple"  fruit, 3.55 price UNION ALL
  SELECT "banana"  fruit, 2.10 price UNION ALL
  SELECT "pear"  fruit, 4.30 price
)
SELECT MAX_BY(fruit, price) as fruit
FROM fruits;

/*-------*
 | fruit |
 +-------+
 | pear  |
 *-------*/

MIN

MIN(
  expression
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns the minimum non-NULL value in an aggregated group.

Caveats:

  • If the aggregated group is empty or the argument is NULL for all rows in the group, returns NULL.
  • If the argument is NaN for any row in the group, returns NaN.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

To learn more about the OVER clause and how to use it, see Window function calls.

This function supports specifying collation.

Supported Argument Types

Any orderable data type except for ARRAY.

Return Data Types

The data type of the input values.

Examples

SELECT MIN(x) AS min
FROM UNNEST([8, 37, 4, 55]) AS x;

/*-----*
 | min |
 +-----+
 | 4   |
 *-----*/
SELECT x, MIN(x) OVER (PARTITION BY MOD(x, 2)) AS min
FROM UNNEST([8, NULL, 37, 4, NULL, 55]) AS x;

/*------+------*
 | x    | min  |
 +------+------+
 | NULL | NULL |
 | NULL | NULL |
 | 8    | 4    |
 | 4    | 4    |
 | 37   | 37   |
 | 55   | 37   |
 *------+------*/

MIN_BY

MIN_BY(
  x, y
)

Description

Synonym for ANY_VALUE(x HAVING MIN y).

Return Data Types

Matches the input x data type.

Examples

WITH fruits AS (
  SELECT "apple"  fruit, 3.55 price UNION ALL
  SELECT "banana"  fruit, 2.10 price UNION ALL
  SELECT "pear"  fruit, 4.30 price
)
SELECT MIN_BY(fruit, price) as fruit
FROM fruits;

/*--------*
 | fruit  |
 +--------+
 | banana |
 *--------*/

STRING_AGG

STRING_AGG(
  [ DISTINCT ]
  expression [, delimiter]
  [ ORDER BY key [ { ASC | DESC } ] [, ... ] ]
  [ LIMIT n ]
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns a value (either STRING or BYTES) obtained by concatenating non-NULL values. Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

If a delimiter is specified, concatenated values are separated by that delimiter; otherwise, a comma is used as a delimiter.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

If this function is used with the OVER clause, it's part of a window function call. In a window function call, aggregate function clauses can't be used. To learn more about the OVER clause and how to use it, see Window function calls.

Supported Argument Types

Either STRING or BYTES.

Return Data Types

Either STRING or BYTES.

Examples

SELECT STRING_AGG(fruit) AS string_agg
FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit;

/*------------------------*
 | string_agg             |
 +------------------------+
 | apple,pear,banana,pear |
 *------------------------*/
SELECT STRING_AGG(fruit, " & ") AS string_agg
FROM UNNEST(["apple", "pear", "banana", "pear"]) AS fruit;

/*------------------------------*
 | string_agg                   |
 +------------------------------+
 | apple & pear & banana & pear |
 *------------------------------*/
SELECT STRING_AGG(DISTINCT fruit, " & ") AS string_agg
FROM UNNEST(["apple", "pear", "banana", "pear"]) AS fruit;

/*-----------------------*
 | string_agg            |
 +-----------------------+
 | apple & pear & banana |
 *-----------------------*/
SELECT STRING_AGG(fruit, " & " ORDER BY LENGTH(fruit)) AS string_agg
FROM UNNEST(["apple", "pear", "banana", "pear"]) AS fruit;

/*------------------------------*
 | string_agg                   |
 +------------------------------+
 | pear & pear & apple & banana |
 *------------------------------*/
SELECT STRING_AGG(fruit, " & " LIMIT 2) AS string_agg
FROM UNNEST(["apple", "pear", "banana", "pear"]) AS fruit;

/*--------------*
 | string_agg   |
 +--------------+
 | apple & pear |
 *--------------*/
SELECT STRING_AGG(DISTINCT fruit, " & " ORDER BY fruit DESC LIMIT 2) AS string_agg
FROM UNNEST(["apple", "pear", "banana", "pear"]) AS fruit;

/*---------------*
 | string_agg    |
 +---------------+
 | pear & banana |
 *---------------*/
SELECT
  fruit,
  STRING_AGG(fruit, " & ") OVER (ORDER BY LENGTH(fruit)) AS string_agg
FROM UNNEST(["apple", NULL, "pear", "banana", "pear"]) AS fruit;

/*--------+------------------------------*
 | fruit  | string_agg                   |
 +--------+------------------------------+
 | NULL   | NULL                         |
 | pear   | pear & pear                  |
 | pear   | pear & pear                  |
 | apple  | pear & pear & apple          |
 | banana | pear & pear & apple & banana |
 *--------+------------------------------*/

SUM

SUM(
  [ DISTINCT ]
  expression
)
[ OVER over_clause ]

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]
  [ window_frame_clause ]

Description

Returns the sum of non-NULL values in an aggregated group.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

This function can be used with the AGGREGATION_THRESHOLD clause.

To learn more about the OVER clause and how to use it, see Window function calls.

SUM can be used with differential privacy. For more information, see Differentially private aggregate functions.

Caveats:

  • If the aggregated group is empty or the argument is NULL for all rows in the group, returns NULL.
  • If the argument is NaN for any row in the group, returns NaN.
  • If the argument is [+|-]Infinity for any row in the group, returns either [+|-]Infinity or NaN.
  • If there is numeric overflow, produces an error.
  • If a floating-point type is returned, the result is non-deterministic, which means you might receive a different result each time you use this function.

Supported Argument Types

  • Any supported numeric data type
  • INTERVAL

Return Data Types

INPUTINT64NUMERICBIGNUMERICFLOAT64INTERVAL
OUTPUTINT64NUMERICBIGNUMERICFLOAT64INTERVAL

Examples

SELECT SUM(x) AS sum
FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x;

/*-----*
 | sum |
 +-----+
 | 25  |
 *-----*/
SELECT SUM(DISTINCT x) AS sum
FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x;

/*-----*
 | sum |
 +-----+
 | 15  |
 *-----*/
SELECT
  x,
  SUM(x) OVER (PARTITION BY MOD(x, 3)) AS sum
FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x;

/*---+-----*
 | x | sum |
 +---+-----+
 | 3 | 6   |
 | 3 | 6   |
 | 1 | 10  |
 | 4 | 10  |
 | 4 | 10  |
 | 1 | 10  |
 | 2 | 9   |
 | 5 | 9   |
 | 2 | 9   |
 *---+-----*/
SELECT
  x,
  SUM(DISTINCT x) OVER (PARTITION BY MOD(x, 3)) AS sum
FROM UNNEST([1, 2, 3, 4, 5, 4, 3, 2, 1]) AS x;

/*---+-----*
 | x | sum |
 +---+-----+
 | 3 | 3   |
 | 3 | 3   |
 | 1 | 5   |
 | 4 | 5   |
 | 4 | 5   |
 | 1 | 5   |
 | 2 | 7   |
 | 5 | 7   |
 | 2 | 7   |
 *---+-----*/
SELECT SUM(x) AS sum
FROM UNNEST([]) AS x;

/*------*
 | sum  |
 +------+
 | NULL |
 *------*/

Approximate aggregate functions

GoogleSQL for BigQuery supports approximate aggregate functions. To learn about the syntax for aggregate function calls, see Aggregate function calls.

Approximate aggregate functions are scalable in terms of memory usage and time, but produce approximate results instead of exact results. These functions typically require less memory than exact aggregation functions like COUNT(DISTINCT ...), but also introduce statistical uncertainty. This makes approximate aggregation appropriate for large data streams for which linear memory usage is impractical, as well as for data that is already approximate.

The approximate aggregate functions in this section work directly on the input data, rather than an intermediate estimation of the data. These functions do not allow users to specify the precision for the estimation with sketches. If you would like to specify precision with sketches, see:

Function list

Name Summary
APPROX_COUNT_DISTINCT Gets the approximate result for COUNT(DISTINCT expression).
APPROX_QUANTILES Gets the approximate quantile boundaries.
APPROX_TOP_COUNT Gets the approximate top elements and their approximate count.
APPROX_TOP_SUM Gets the approximate top elements and sum, based on the approximate sum of an assigned weight.

APPROX_COUNT_DISTINCT

APPROX_COUNT_DISTINCT(
  expression
)

Description

Returns the approximate result for COUNT(DISTINCT expression). The value returned is a statistical estimate, not necessarily the actual value.

This function is less accurate than COUNT(DISTINCT expression), but performs better on huge input.

Supported Argument Types

Any data type except:

  • ARRAY
  • STRUCT
  • INTERVAL

Returned Data Types

INT64

Examples

SELECT APPROX_COUNT_DISTINCT(x) as approx_distinct
FROM UNNEST([0, 1, 1, 2, 3, 5]) as x;

/*-----------------*
 | approx_distinct |
 +-----------------+
 | 5               |
 *-----------------*/

APPROX_QUANTILES

APPROX_QUANTILES(
  [ DISTINCT ]
  expression, number
  [ { IGNORE | RESPECT } NULLS ]
)

Description

Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + 1 elements, sorted in ascending order, where the first element is the approximate minimum and the last element is the approximate maximum.

Returns NULL if there are zero input rows or expression evaluates to NULL for all rows.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

  • expression: Any supported data type except:

    • ARRAY
    • STRUCT
    • INTERVAL
  • number: INT64 literal or query parameter.

Returned Data Types

ARRAY<T> where T is the type specified by expression.

Examples

SELECT APPROX_QUANTILES(x, 2) AS approx_quantiles
FROM UNNEST([1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

/*------------------*
 | approx_quantiles |
 +------------------+
 | [1, 5, 10]       |
 *------------------*/
SELECT APPROX_QUANTILES(x, 100)[OFFSET(90)] AS percentile_90
FROM UNNEST([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) AS x;

/*---------------*
 | percentile_90 |
 +---------------+
 | 9             |
 *---------------*/
SELECT APPROX_QUANTILES(DISTINCT x, 2) AS approx_quantiles
FROM UNNEST([1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

/*------------------*
 | approx_quantiles |
 +------------------+
 | [1, 6, 10]       |
 *------------------*/
SELECT FORMAT("%T", APPROX_QUANTILES(x, 2 RESPECT NULLS)) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

/*------------------*
 | approx_quantiles |
 +------------------+
 | [NULL, 4, 10]    |
 *------------------*/
SELECT FORMAT("%T", APPROX_QUANTILES(DISTINCT x, 2 RESPECT NULLS)) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;

/*------------------*
 | approx_quantiles |
 +------------------+
 | [NULL, 6, 10]    |
 *------------------*/

APPROX_TOP_COUNT

APPROX_TOP_COUNT(
  expression, number
)

Description

Returns the approximate top elements of expression as an array of STRUCTs. The number parameter specifies the number of elements returned.

Each STRUCT contains two fields. The first field (named value) contains an input value. The second field (named count) contains an INT64 specifying the number of times the value was returned.

Returns NULL if there are zero input rows.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

  • expression: Any data type that the GROUP BY clause supports.
  • number: INT64 literal or query parameter.

Returned Data Types

ARRAY<STRUCT>

Examples

SELECT APPROX_TOP_COUNT(x, 2) as approx_top_count
FROM UNNEST(["apple", "apple", "pear", "pear", "pear", "banana"]) as x;

/*-------------------------*
 | approx_top_count        |
 +-------------------------+
 | [{pear, 3}, {apple, 2}] |
 *-------------------------*/

NULL handling

APPROX_TOP_COUNT does not ignore NULLs in the input. For example:

SELECT APPROX_TOP_COUNT(x, 2) as approx_top_count
FROM UNNEST([NULL, "pear", "pear", "pear", "apple", NULL]) as x;

/*------------------------*
 | approx_top_count       |
 +------------------------+
 | [{pear, 3}, {NULL, 2}] |
 *------------------------*/

APPROX_TOP_SUM

APPROX_TOP_SUM(
  expression, weight, number
)

Description

Returns the approximate top elements of expression, based on the sum of an assigned weight. The number parameter specifies the number of elements returned.

If the weight input is negative or NaN, this function returns an error.

The elements are returned as an array of STRUCTs. Each STRUCT contains two fields: value and sum. The value field contains the value of the input expression. The sum field is the same type as weight, and is the approximate sum of the input weight associated with the value field.

Returns NULL if there are zero input rows.

To learn more about the optional aggregate clauses that you can pass into this function, see Aggregate function calls.

Supported Argument Types

  • expression: Any data type that the GROUP BY clause supports.
  • weight: One of the following:

    • INT64
    • NUMERIC
    • BIGNUMERIC
    • FLOAT64
  • number: INT64 literal or query parameter.

Returned Data Types

ARRAY<STRUCT>

Examples

SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([
  STRUCT("apple" AS x, 3 AS weight),
  ("pear", 2),
  ("apple", 0),
  ("banana", 5),
  ("pear", 4)
]);

/*--------------------------*
 | approx_top_sum           |
 +--------------------------+
 | [{pear, 6}, {banana, 5}] |
 *--------------------------*/

NULL handling

APPROX_TOP_SUM does not ignore NULL values for the expression and weight parameters.

SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([STRUCT("apple" AS x, NULL AS weight), ("pear", 0), ("pear", NULL)]);

/*----------------------------*
 | approx_top_sum             |
 +----------------------------+
 | [{pear, 0}, {apple, NULL}] |
 *----------------------------*/
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([STRUCT("apple" AS x, 0 AS weight), (NULL, 2)]);

/*-------------------------*
 | approx_top_sum          |
 +-------------------------+
 | [{NULL, 2}, {apple, 0}] |
 *-------------------------*/
SELECT APPROX_TOP_SUM(x, weight, 2) AS approx_top_sum FROM
UNNEST([STRUCT("apple" AS x, 0 AS weight), (NULL, NULL)]);

/*----------------------------*
 | approx_top_sum             |
 +----------------------------+
 | [{apple, 0}, {NULL, NULL}] |
 *----------------------------*/

Array functions

GoogleSQL for BigQuery supports the following array functions.

Function list

Name Summary
ARRAY Produces an array with one element for each row in a subquery.
ARRAY_CONCAT Concatenates one or more arrays with the same element type into a single array.
ARRAY_LENGTH Gets the number of elements in an array.
ARRAY_REVERSE Reverses the order of elements in an array.
ARRAY_TO_STRING Produces a concatenation of the elements in an array as a STRING value.
GENERATE_ARRAY Generates an array of values in a range.
GENERATE_DATE_ARRAY Generates an array of dates in a range.
GENERATE_TIMESTAMP_ARRAY Generates an array of timestamps in a range.

ARRAY

ARRAY(subquery)

Description

The ARRAY function returns an ARRAY with one element for each row in a subquery.

If subquery produces a SQL table, the table must have exactly one column. Each element in the output ARRAY is the value of the single column of a row in the table.

If subquery produces a value table, then each element in the output ARRAY is the entire corresponding row of the value table.

Constraints

  • Subqueries are unordered, so the elements of the output ARRAY are not guaranteed to preserve any order in the source table for the subquery. However, if the subquery includes an ORDER BY clause, the ARRAY function will return an ARRAY that honors that clause.
  • If the subquery returns more than one column, the ARRAY function returns an error.
  • If the subquery returns an ARRAY typed column or ARRAY typed rows, the ARRAY function returns an error that GoogleSQL does not support ARRAYs with elements of type ARRAY.
  • If the subquery returns zero rows, the ARRAY function returns an empty ARRAY. It never returns a NULL ARRAY.

Return type

ARRAY

Examples

SELECT ARRAY
  (SELECT 1 UNION ALL
   SELECT 2 UNION ALL
   SELECT 3) AS new_array;

/*-----------*
 | new_array |
 +-----------+
 | [1, 2, 3] |
 *-----------*/

To construct an ARRAY from a subquery that contains multiple columns, change the subquery to use SELECT AS STRUCT. Now the ARRAY function will return an ARRAY of STRUCTs. The ARRAY will contain one STRUCT for each row in the subquery, and each of these STRUCTs will contain a field for each column in that row.

SELECT
  ARRAY
    (SELECT AS STRUCT 1, 2, 3
     UNION ALL SELECT AS STRUCT 4, 5, 6) AS new_array;

/*------------------------*
 | new_array              |
 +------------------------+
 | [{1, 2, 3}, {4, 5, 6}] |
 *------------------------*/

Similarly, to construct an ARRAY from a subquery that contains one or more ARRAYs, change the subquery to use SELECT AS STRUCT.

SELECT ARRAY
  (SELECT AS STRUCT [1, 2, 3] UNION ALL
   SELECT AS STRUCT [4, 5, 6]) AS new_array;

/*----------------------------*
 | new_array                  |
 +----------------------------+
 | [{[1, 2, 3]}, {[4, 5, 6]}] |
 *----------------------------*/

ARRAY_CONCAT

ARRAY_CONCAT(array_expression[, ...])

Description

Concatenates one or more arrays with the same element type into a single array.

The function returns NULL if any input argument is NULL.

Return type

ARRAY

Examples

SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) as count_to_six;

/*--------------------------------------------------*
 | count_to_six                                     |
 +--------------------------------------------------+
 | [1, 2, 3, 4, 5, 6]                               |
 *--------------------------------------------------*/

ARRAY_LENGTH

ARRAY_LENGTH(array_expression)

Description

Returns the size of the array. Returns 0 for an empty array. Returns NULL if the array_expression is NULL.

Return type

INT64

Examples

WITH items AS
  (SELECT ["coffee", NULL, "milk" ] as list
  UNION ALL
  SELECT ["cake", "pie"] as list)
SELECT ARRAY_TO_STRING(list, ', ', 'NULL'), ARRAY_LENGTH(list) AS size
FROM items
ORDER BY size DESC;

/*--------------------+------*
 | list               | size |
 +--------------------+------+
 | coffee, NULL, milk | 3    |
 | cake, pie          | 2    |
 *--------------------+------*/

ARRAY_REVERSE

ARRAY_REVERSE(value)

Description

Returns the input ARRAY with elements in reverse order.

Return type

ARRAY

Examples

WITH example AS (
  SELECT [1, 2, 3] AS arr UNION ALL
  SELECT [4, 5] AS arr UNION ALL
  SELECT [] AS arr
)
SELECT
  arr,
  ARRAY_REVERSE(arr) AS reverse_arr
FROM example;

/*-----------+-------------*
 | arr       | reverse_arr |
 +-----------+-------------+
 | [1, 2, 3] | [3, 2, 1]   |
 | [4, 5]    | [5, 4]      |
 | []        | []          |
 *-----------+-------------*/

ARRAY_TO_STRING

ARRAY_TO_STRING(array_expression, delimiter[, null_text])

Description

Returns a concatenation of the elements in array_expression as a STRING. The value for array_expression can either be an array of STRING or BYTES data types.

If the null_text parameter is used, the function replaces any NULL values in the array with the value of null_text.

If the null_text parameter is not used, the function omits the NULL value and its preceding delimiter.

Return type

STRING

Examples

WITH items AS
  (SELECT ['coffee', 'tea', 'milk' ] as list
  UNION ALL
  SELECT ['cake', 'pie', NULL] as list)

SELECT ARRAY_TO_STRING(list, '--') AS text
FROM items;

/*--------------------------------*
 | text                           |
 +--------------------------------+
 | coffee--tea--milk              |
 | cake--pie                      |
 *--------------------------------*/
WITH items AS
  (SELECT ['coffee', 'tea', 'milk' ] as list
  UNION ALL
  SELECT ['cake', 'pie', NULL] as list)

SELECT ARRAY_TO_STRING(list, '--', 'MISSING') AS text
FROM items;

/*--------------------------------*
 | text                           |
 +--------------------------------+
 | coffee--tea--milk              |
 | cake--pie--MISSING             |
 *--------------------------------*/

GENERATE_ARRAY

GENERATE_ARRAY(start_expression, end_expression[, step_expression])

Description

Returns an array of values. The start_expression and end_expression parameters determine the inclusive start and end of the array.

The GENERATE_ARRAY function accepts the following data types as inputs:

  • INT64
  • NUMERIC
  • BIGNUMERIC
  • FLOAT64

The step_expression parameter determines the increment used to generate array values. The default value for this parameter is 1.

This function returns an error if step_expression is set to 0, or if any input is NaN.

If any argument is NULL, the function will return a NULL array.

Return Data Type

ARRAY

Examples

The following returns an array of integers, with a default step of 1.

SELECT GENERATE_ARRAY(1, 5) AS example_array;

/*-----------------*
 | example_array   |
 +-----------------+
 | [1, 2, 3, 4, 5] |
 *-----------------*/

The following returns an array using a user-specified step size.

SELECT GENERATE_ARRAY(0, 10, 3) AS example_array;

/*---------------*
 | example_array |
 +---------------+
 | [0, 3, 6, 9]  |
 *---------------*/

The following returns an array using a negative value, -3 for its step size.

SELECT GENERATE_ARRAY(10, 0, -3) AS example_array;

/*---------------*
 | example_array |
 +---------------+
 | [10, 7, 4, 1] |
 *---------------*/

The following returns an array using the same value for the start_expression and end_expression.

SELECT GENERATE_ARRAY(4, 4, 10) AS example_array;

/*---------------*
 | example_array |
 +---------------+
 | [4]           |
 *---------------*/

The following returns an empty array, because the start_expression is greater than the end_expression, and the step_expression value is positive.

SELECT GENERATE_ARRAY(10, 0, 3) AS example_array;

/*---------------*
 | example_array |
 +---------------+
 | []            |
 *---------------*/

The following returns a NULL array because end_expression is NULL.

SELECT GENERATE_ARRAY(5, NULL, 1) AS example_array;

/*---------------*
 | example_array |
 +---------------+
 | NULL          |
 *---------------*/

The following returns multiple arrays.

SELECT GENERATE_ARRAY(start, 5) AS example_array
FROM UNNEST([3, 4, 5]) AS start;

/*---------------*
 | example_array |
 +---------------+
 | [3, 4, 5]     |
 | [4, 5]        |
 | [5]           |
 +---------------*/

GENERATE_DATE_ARRAY

GENERATE_DATE_ARRAY(start_date, end_date[, INTERVAL INT64_expr date_part])

Description

Returns an array of dates. The start_date and end_date parameters determine the inclusive start and end of the array.

The GENERATE_DATE_ARRAY function accepts the following data types as inputs:

  • start_date must be a DATE.
  • end_date must be a DATE.
  • INT64_expr must be an INT64.
  • date_part must be either DAY, WEEK, MONTH, QUARTER, or YEAR.

The INT64_expr parameter determines the increment used to generate dates. The default value for this parameter is 1 day.

This function returns an error if INT64_expr is set to 0.

Return Data Type

ARRAY containing 0 or more DATE values.

Examples

The following returns an array of dates, with a default step of 1.

SELECT GENERATE_DATE_ARRAY('2016-10-05', '2016-10-08') AS example;

/*--------------------------------------------------*
 | example                                          |
 +--------------------------------------------------+
 | [2016-10-05, 2016-10-06, 2016-10-07, 2016-10-08] |
 *--------------------------------------------------*/

The following returns an array using a user-specified step size.

SELECT GENERATE_DATE_ARRAY(
 '2016-10-05', '2016-10-09', INTERVAL 2 DAY) AS example;

/*--------------------------------------*
 | example                              |
 +--------------------------------------+
 | [2016-10-05, 2016-10-07, 2016-10-09] |
 *--------------------------------------*/

The following returns an array using a negative value, -3 for its step size.

SELECT GENERATE_DATE_ARRAY('2016-10-05',
  '2016-10-01', INTERVAL -3 DAY) AS example;

/*--------------------------*
 | example                  |
 +--------------------------+
 | [2016-10-05, 2016-10-02] |
 *--------------------------*/

The following returns an array using the same value for the start_dateand end_date.

SELECT GENERATE_DATE_ARRAY('2016-10-05',
  '2016-10-05', INTERVAL 8 DAY) AS example;

/*--------------*
 | example      |
 +--------------+
 | [2016-10-05] |
 *--------------*/

The following returns an empty array, because the start_date is greater than the end_date, and the step value is positive.

SELECT GENERATE_DATE_ARRAY('2016-10-05',
  '2016-10-01', INTERVAL 1 DAY) AS example;

/*---------*
 | example |
 +---------+
 | []      |
 *---------*/

The following returns a NULL array, because one of its inputs is NULL.

SELECT GENERATE_DATE_ARRAY('2016-10-05', NULL) AS example;

/*---------*
 | example |
 +---------+
 | NULL    |
 *---------*/

The following returns an array of dates, using MONTH as the date_part interval:

SELECT GENERATE_DATE_ARRAY('2016-01-01',
  '2016-12-31', INTERVAL 2 MONTH) AS example;

/*--------------------------------------------------------------------------*
 | example                                                                  |
 +--------------------------------------------------------------------------+
 | [2016-01-01, 2016-03-01, 2016-05-01, 2016-07-01, 2016-09-01, 2016-11-01] |
 *--------------------------------------------------------------------------*/

The following uses non-constant dates to generate an array.

SELECT GENERATE_DATE_ARRAY(date_start, date_end, INTERVAL 1 WEEK) AS date_range
FROM (
  SELECT DATE '2016-01-01' AS date_start, DATE '2016-01-31' AS date_end
  UNION ALL SELECT DATE "2016-04-01", DATE "2016-04-30"
  UNION ALL SELECT DATE "2016-07-01", DATE "2016-07-31"
  UNION ALL SELECT DATE "2016-10-01", DATE "2016-10-31"
) AS items;

/*--------------------------------------------------------------*
 | date_range                                                   |
 +--------------------------------------------------------------+
 | [2016-01-01, 2016-01-08, 2016-01-15, 2016-01-22, 2016-01-29] |
 | [2016-04-01, 2016-04-08, 2016-04-15, 2016-04-22, 2016-04-29] |
 | [2016-07-01, 2016-07-08, 2016-07-15, 2016-07-22, 2016-07-29] |
 | [2016-10-01, 2016-10-08, 2016-10-15, 2016-10-22, 2016-10-29] |
 *--------------------------------------------------------------*/

GENERATE_TIMESTAMP_ARRAY

GENERATE_TIMESTAMP_ARRAY(start_timestamp, end_timestamp,
                         INTERVAL step_expression date_part)

Description

Returns an ARRAY of TIMESTAMPS separated by a given interval. The start_timestamp and end_timestamp parameters determine the inclusive lower and upper bounds of the ARRAY.

The GENERATE_TIMESTAMP_ARRAY function accepts the following data types as inputs:

  • start_timestamp: TIMESTAMP
  • end_timestamp: TIMESTAMP
  • step_expression: INT64
  • Allowed date_part values are: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR, or DAY.

The step_expression parameter determines the increment used to generate timestamps.

Return Data Type

An ARRAY containing 0 or more TIMESTAMP values.

Examples

The following example returns an ARRAY of TIMESTAMPs at intervals of 1 day.

SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-05 00:00:00', '2016-10-07 00:00:00',
                                INTERVAL 1 DAY) AS timestamp_array;

/*--------------------------------------------------------------------------*
 | timestamp_array                                                          |
 +--------------------------------------------------------------------------+
 | [2016-10-05 00:00:00+00, 2016-10-06 00:00:00+00, 2016-10-07 00:00:00+00] |
 *--------------------------------------------------------------------------*/

The following example returns an ARRAY of TIMESTAMPs at intervals of 1 second.

SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-05 00:00:00', '2016-10-05 00:00:02',
                                INTERVAL 1 SECOND) AS timestamp_array;

/*--------------------------------------------------------------------------*
 | timestamp_array                                                          |
 +--------------------------------------------------------------------------+
 | [2016-10-05 00:00:00+00, 2016-10-05 00:00:01+00, 2016-10-05 00:00:02+00] |
 *--------------------------------------------------------------------------*/

The following example returns an ARRAY of TIMESTAMPS with a negative interval.

SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-06 00:00:00', '2016-10-01 00:00:00',
                                INTERVAL -2 DAY) AS timestamp_array;

/*--------------------------------------------------------------------------*
 | timestamp_array                                                          |
 +--------------------------------------------------------------------------+
 | [2016-10-06 00:00:00+00, 2016-10-04 00:00:00+00, 2016-10-02 00:00:00+00] |
 *--------------------------------------------------------------------------*/

The following example returns an ARRAY with a single element, because start_timestamp and end_timestamp have the same value.

SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-05 00:00:00', '2016-10-05 00:00:00',
                                INTERVAL 1 HOUR) AS timestamp_array;

/*--------------------------*
 | timestamp_array          |
 +--------------------------+
 | [2016-10-05 00:00:00+00] |
 *--------------------------*/

The following example returns an empty ARRAY, because start_timestamp is later than end_timestamp.

SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-06 00:00:00', '2016-10-05 00:00:00',
                                INTERVAL 1 HOUR) AS timestamp_array;

/*-----------------*
 | timestamp_array |
 +-----------------+
 | []              |
 *-----------------*/

The following example returns a null ARRAY, because one of the inputs is NULL.

SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-05 00:00:00', NULL, INTERVAL 1 HOUR)
  AS timestamp_array;

/*-----------------*
 | timestamp_array |
 +-----------------+
 | NULL            |
 *-----------------*/

The following example generates ARRAYs of TIMESTAMPs from columns containing values for start_timestamp and end_timestamp.

SELECT GENERATE_TIMESTAMP_ARRAY(start_timestamp, end_timestamp, INTERVAL 1 HOUR)
  AS timestamp_array
FROM
  (SELECT
    TIMESTAMP '2016-10-05 00:00:00' AS start_timestamp,
    TIMESTAMP '2016-10-05 02:00:00' AS end_timestamp
   UNION ALL
   SELECT
    TIMESTAMP '2016-10-05 12:00:00' AS start_timestamp,
    TIMESTAMP '2016-10-05 14:00:00' AS end_timestamp
   UNION ALL
   SELECT
    TIMESTAMP '2016-10-05 23:59:00' AS start_timestamp,
    TIMESTAMP '2016-10-06 01:59:00' AS end_timestamp);

/*--------------------------------------------------------------------------*
 | timestamp_array                                                          |
 +--------------------------------------------------------------------------+
 | [2016-10-05 00:00:00+00, 2016-10-05 01:00:00+00, 2016-10-05 02:00:00+00] |
 | [2016-10-05 12:00:00+00, 2016-10-05 13:00:00+00, 2016-10-05 14:00:00+00] |
 | [2016-10-05 23:59:00+00, 2016-10-06 00:59:00+00, 2016-10-06 01:59:00+00] |
 *--------------------------------------------------------------------------*/

OFFSET and ORDINAL

For information about using OFFSET and ORDINAL with arrays, see Array subscript operator and Accessing array elements.

Bit functions

GoogleSQL for BigQuery supports the following bit functions.

Function list

Name Summary
BIT_COUNT Gets the number of bits that are set in an input expression.

BIT_COUNT

BIT_COUNT(expression)

Description

The input, expression, must be an integer or BYTES.

Returns the number of bits that are set in the input expression. For signed integers, this is the number of bits in two's complement form.

Return Data Type

INT64

Example

SELECT a, BIT_COUNT(a) AS a_bits, FORMAT("%T", b) as b, BIT_COUNT(b) AS b_bits
FROM UNNEST([
  STRUCT(0 AS a, b'' AS b), (0, b'\x00'), (5, b'\x05'), (8, b'\x00\x08'),
  (0xFFFF, b'\xFF\xFF'), (-2, b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFE'),
  (-1, b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF'),
  (NULL, b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF')
]) AS x;

/*-------+--------+---------------------------------------------+--------*
 | a     | a_bits | b                                           | b_bits |
 +-------+--------+---------------------------------------------+--------+
 | 0     | 0      | b""                                         | 0      |
 | 0     | 0      | b"\x00"                                     | 0      |
 | 5     | 2      | b"\x05"                                     | 2      |
 | 8     | 1      | b"\x00\x08"                                 | 1      |
 | 65535 | 16     | b"\xff\xff"                                 | 16     |
 | -2    | 63     | b"\xff\xff\xff\xff\xff\xff\xff\xfe"         | 63     |
 | -1    | 64     | b"\xff\xff\xff\xff\xff\xff\xff\xff"         | 64     |
 | NULL  | NULL   | b"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" | 80     |
 *-------+--------+---------------------------------------------+--------*/

Conversion functions

GoogleSQL for BigQuery supports conversion functions. These data type conversions are explicit, but some conversions can happen implicitly. You can learn more about implicit and explicit conversion here.

Function list

Name Summary
CAST Convert the results of an expression to the given type.
PARSE_BIGNUMERIC Converts a STRING value to a BIGNUMERIC value.
PARSE_NUMERIC Converts a STRING value to a NUMERIC value.
SAFE_CAST Similar to the CAST function, but returns NULL when a runtime error is produced.

CAST

CAST(expression AS typename [format_clause])

Description

Cast syntax is used in a query to indicate that the result type of an expression should be converted to some other type.

When using CAST, a query can fail if GoogleSQL is unable to perform the cast. If you want to protect your queries from these types of errors, you can use SAFE_CAST.

Casts between supported types that do not successfully map from the original value to the target domain produce runtime errors. For example, casting BYTES to STRING where the byte sequence is not valid UTF-8 results in a runtime error.

Some casts can include a format clause, which provides instructions for how to conduct the cast. For example, you could instruct a cast to convert a sequence of bytes to a BASE64-encoded string instead of a UTF-8-encoded string.

The structure of the format clause is unique to each type of cast and more information is available in the section for that cast.

Examples

The following query results in "true" if x is 1, "false" for any other non-NULL value, and NULL if x is NULL.

CAST(x=1 AS STRING)

CAST AS ARRAY

CAST(expression AS ARRAY<element_type>)

Description

GoogleSQL supports casting to ARRAY. The expression parameter can represent an expression for these data types:

  • ARRAY

Conversion rules

From To Rule(s) when casting x
ARRAY ARRAY Must be the exact same array type.

CAST AS BIGNUMERIC

CAST(expression AS BIGNUMERIC)

Description

GoogleSQL supports casting to BIGNUMERIC. The expression parameter can represent an expression for these data types:

  • INT64
  • FLOAT64
  • NUMERIC
  • BIGNUMERIC
  • STRING

Conversion rules

From To Rule(s) when casting x
FLOAT64 BIGNUMERIC The floating point number will round half away from zero. Casting a NaN, +inf or -inf will return an error. Casting a value outside the range of BIGNUMERIC returns an overflow error.
STRING BIGNUMERIC The numeric literal contained in the string must not exceed the maximum precision or range of the BIGNUMERIC type, or an error will occur. If the number of digits after the decimal point exceeds 38, then the resulting BIGNUMERIC value will round half away from zero to have 38 digits after the decimal point.

CAST AS BOOL

CAST(expression AS BOOL)

Description

GoogleSQL supports casting to BOOL. The expression parameter can represent an expression for these data types:

  • INT64
  • BOOL
  • STRING

Conversion rules

From To Rule(s) when casting x
INT64 BOOL Returns FALSE if x is 0, TRUE otherwise.
STRING BOOL Returns TRUE if x is "true" and FALSE if x is "false"
All other values of x are invalid and throw an error instead of casting to a boolean.
A string is case-insensitive when converting to a boolean.

CAST AS BYTES

CAST(expression AS BYTES [format_clause])

Description

GoogleSQL supports casting to BYTES. The expression parameter can represent an expression for these data types:

  • BYTES
  • STRING

Format clause

When an expression of one type is cast to another type, you can use the format clause to provide instructions for how to conduct the cast. You can use the format clause in this section if expression is a STRING.

Conversion rules

From To Rule(s) when casting x
STRING BYTES Strings are cast to bytes using UTF-8 encoding. For example, the string "©", when cast to bytes, would become a 2-byte sequence with the hex values C2 and A9.

CAST AS DATE

CAST(expression AS DATE [format_clause])

Description

GoogleSQL supports casting to DATE. The expression parameter can represent an expression for these data types:

  • STRING
  • DATETIME
  • TIMESTAMP

Format clause

When an expression of one type is cast to another type, you can use the format clause to provide instructions for how to conduct the cast. You can use the format clause in this section if expression is a STRING.

Conversion rules

From To Rule(s) when casting x
STRING DATE When casting from string to date, the string must conform to the supported date literal format, and is independent of time zone. If the string expression is invalid or represents a date that is outside of the supported min/max range, then an error is produced.
TIMESTAMP DATE Casting from a timestamp to date effectively truncates the timestamp as of the default time zone.

CAST AS DATETIME

CAST(expression AS DATETIME [format_clause])

Description

GoogleSQL supports casting to DATETIME. The expression parameter can represent an expression for these data types:

  • STRING
  • DATETIME
  • TIMESTAMP

Format clause

When an expression of one type is cast to another type, you can use the format clause to provide instructions for how to conduct the cast. You can use the format clause in this section if expression is a STRING.

Conversion rules

From To Rule(s) when casting x
STRING DATETIME When casting from string to datetime, the string must conform to the supported datetime literal format, and is independent of time zone. If the string expression is invalid or represents a datetime that is outside of the supported min/max range, then an error is produced.
TIMESTAMP DATETIME Casting from a timestamp to datetime effectively truncates the timestamp as of the default time zone.

CAST AS FLOAT64

CAST(expression AS FLOAT64)

Description

GoogleSQL supports casting to floating point types. The expression parameter can represent an expression for these data types:

  • INT64
  • FLOAT64
  • NUMERIC
  • BIGNUMERIC
  • STRING

Conversion rules

From To Rule(s) when casting x
INT64 FLOAT64 Returns a close but potentially not exact floating point value.
NUMERIC FLOAT64 NUMERIC will convert to the closest floating point number with a possible loss of precision.
BIGNUMERIC FLOAT64 BIGNUMERIC will convert to the closest floating point number with a possible loss of precision.
STRING FLOAT64 Returns x as a floating point value, interpreting it as having the same form as a valid floating point literal. Also supports casts from "[+,-]inf" to [,-]Infinity, "[+,-]infinity" to [,-]Infinity, and "[+,-]nan" to NaN. Conversions are case-insensitive.

CAST AS INT64

CAST(expression AS INT64)

Description

GoogleSQL supports casting to integer types. The expression parameter can represent an expression for these data types:

  • INT64
  • FLOAT64
  • NUMERIC
  • BIGNUMERIC
  • BOOL
  • STRING

Conversion rules

From To Rule(s) when casting x
FLOAT64 INT64 Returns the closest integer value.
Halfway cases such as 1.5 or -0.5 round away from zero.
BOOL INT64 Returns 1 if x is TRUE, 0 otherwise.
STRING INT64 A hex string can be cast to an integer. For example, 0x123 to 291 or -0x123 to -291.

Examples

If you are working with hex strings (0x123), you can cast those strings as integers:

SELECT '0x123' as hex_value, CAST('0x123' as INT64) as hex_to_int;

/*-----------+------------*
 | hex_value | hex_to_int |
 +-----------+------------+
 | 0x123     | 291        |
 *-----------+------------*/
SELECT '-0x123' as hex_value, CAST('-0x123' as INT64) as hex_to_int;

/*-----------+------------*
 | hex_value | hex_to_int |
 +-----------+------------+
 | -0x123    | -291       |
 *-----------+------------*/

CAST AS INTERVAL

CAST(expression AS INTERVAL)

Description

GoogleSQL supports casting to INTERVAL. The expression parameter can represent an expression for these data types:

  • STRING

Conversion rules

From To Rule(s) when casting x
STRING INTERVAL When casting from string to interval, the string must conform to either ISO 8601 Duration standard or to interval literal format 'Y-M D H:M:S.F'. Partial interval literal formats are also accepted when they are not ambiguous, for example 'H:M:S'. If the string expression is invalid or represents an interval that is outside of the supported min/max range, then an error is produced.

Examples

SELECT input, CAST(input AS INTERVAL) AS output
FROM UNNEST([
  '1-2 3 10:20:30.456',
  '1-2',
  '10:20:30',
  'P1Y2M3D',
  'PT10H20M30,456S'
]) input

/*--------------------+--------------------*
 | input              | output             |
 +--------------------+--------------------+
 | 1-2 3 10:20:30.456 | 1-2 3 10:20:30.456 |
 | 1-2                | 1-2 0 0:0:0        |
 | 10:20:30           | 0-0 0 10:20:30     |
 | P1Y2M3D            | 1-2 3 0:0:0        |
 | PT10H20M30,456S    | 0-0 0 10:20:30.456 |
 *--------------------+--------------------*/

CAST AS NUMERIC

CAST(expression AS NUMERIC)

Description

GoogleSQL supports casting to NUMERIC. The expression parameter can represent an expression for these data types:

  • INT64
  • FLOAT64
  • NUMERIC
  • BIGNUMERIC
  • STRING

Conversion rules

From To Rule(s) when casting x
FLOAT64 NUMERIC The floating point number will round half away from zero. Casting a NaN, +inf or -inf will return an error. Casting a value outside the range of NUMERIC returns an overflow error.
STRING NUMERIC The numeric literal contained in the string must not exceed the maximum precision or range of the NUMERIC type, or an error will occur. If the number of digits after the decimal point exceeds nine, then the resulting NUMERIC value will round half away from zero. to have nine digits after the decimal point.

CAST AS RANGE

CAST(expression AS RANGE)

Description

GoogleSQL supports casting to RANGE. The expression parameter can represent an expression for these data types:

  • STRING

Conversion rules

From To Rule(s) when casting x
STRING RANGE When casting from string to range, the string must conform to the supported range literal format. If the string expression is invalid or represents a range that is outside of the supported subtype min/max range, then an error is produced.

Examples

SELECT CAST(
  '[2020-01-01, 2020-01-02)'
  AS RANGE<DATE>) AS string_to_range

/*----------------------------------------*
 | string_to_range                        |
 +----------------------------------------+
 | [DATE '2020-01-01', DATE '2020-01-02') |
 *----------------------------------------*/
SELECT CAST(
  '[2014-09-27 12:30:00.45, 2016-10-17 11:15:00.33)'
  AS RANGE<DATETIME>) AS string_to_range

/*------------------------------------------------------------------------*
 | string_to_range                                                        |
 +------------------------------------------------------------------------+
 | [DATETIME '2014-09-27 12:30:00.45', DATETIME '2016-10-17 11:15:00.33') |
 *------------------------------------------------------------------------*/
SELECT CAST(
  '[2014-09-27 12:30:00+08, 2016-10-17 11:15:00+08)'
  AS RANGE<TIMESTAMP>) AS string_to_range

-- Results depend upon where this query was executed.
/*---------------------------------------------------------------------------*
 | string_to_range                                                           |
 +---------------------------------------------------------------------------+
 | [TIMESTAMP '2014-09-27 12:30:00+08', TIMESTAMP '2016-10-17 11:15:00 UTC') |
 *---------------------------------------------------------------------------*/
SELECT CAST(
  '[UNBOUNDED, 2020-01-02)'
  AS RANGE<DATE>) AS string_to_range

/*--------------------------------*
 | string_to_range                |
 +--------------------------------+
 | [UNBOUNDED, DATE '2020-01-02') |
 *--------------------------------*/
SELECT CAST(
  '[2020-01-01, NULL)'
  AS RANGE<DATE>) AS string_to_range

/*--------------------------------*
 | string_to_range                |
 +--------------------------------+
 | [DATE '2020-01-01', UNBOUNDED) |
 *--------------------------------*/

CAST AS STRING

CAST(expression AS STRING [format_clause [AT TIME ZONE timezone_expr]])

Description

GoogleSQL supports casting to STRING. The expression parameter can represent an expression for these data types:

  • INT64
  • FLOAT64
  • NUMERIC
  • BIGNUMERIC
  • BOOL
  • BYTES
  • TIME
  • DATE
  • DATETIME
  • TIMESTAMP
  • RANGE
  • INTERVAL
  • STRING

Format clause

When an expression of one type is cast to another type, you can use the format clause to provide instructions for how to conduct the cast. You can use the format clause in this section if expression is one of these data types:

  • INT64
  • FLOAT64
  • NUMERIC
  • BIGNUMERIC
  • BYTES
  • TIME
  • DATE
  • DATETIME
  • TIMESTAMP

The format clause for STRING has an additional optional clause called AT TIME ZONE timezone_expr, which you can use to specify a specific time zone to use during formatting of a TIMESTAMP. If this optional clause is not included when formatting a TIMESTAMP, your current time zone is used.

For more information, see the following topics:

Conversion rules

From To Rule(s) when casting x
FLOAT64 STRING Returns an approximate string representation. A returned NaN or 0 will not be signed.
BOOL STRING Returns "true" if x is TRUE, "false" otherwise.
BYTES STRING Returns x interpreted as a UTF-8 string.
For example, the bytes literal b'\xc2\xa9', when cast to a string, is interpreted as UTF-8 and becomes the unicode character "©".
An error occurs if x is not valid UTF-8.
TIME STRING Casting from a time type to a string is independent of time zone and is of the form HH:MM:SS.
DATE STRING Casting from a date type to a string is independent of time zone and is of the form YYYY-MM-DD.
DATETIME STRING Casting from a datetime type to a string is independent of time zone and is of the form YYYY-MM-DD HH:MM:SS.
TIMESTAMP STRING When casting from timestamp types to string, the timestamp is interpreted using the default time zone, UTC. The number of subsecond digits produced depends on the number of trailing zeroes in the subsecond part: the CAST function will truncate zero, three, or six digits.
INTERVAL STRING Casting from an interval to a string is of the form Y-M D H:M:S.

Examples

SELECT CAST(CURRENT_DATE() AS STRING) AS current_date

/*---------------*
 | current_date  |
 +---------------+
 | 2021-03-09    |
 *---------------*/
SELECT CAST(CURRENT_DATE() AS STRING FORMAT 'DAY') AS current_day

/*-------------*
 | current_day |
 +-------------+
 | MONDAY      |
 *-------------*/
SELECT CAST(
  TIMESTAMP '2008-12-25 00:00:00+00:00'
  AS STRING FORMAT 'YYYY-MM-DD HH24:MI:SS TZH:TZM') AS date_time_to_string

-- Results depend upon where this query was executed.
/*------------------------------*
 | date_time_to_string          |
 +------------------------------+
 | 2008-12-24 16:00:00 -08:00   |
 *------------------------------*/
SELECT CAST(
  TIMESTAMP '2008-12-25 00:00:00+00:00'
  AS STRING FORMAT 'YYYY-MM-DD HH24:MI:SS TZH:TZM'
  AT TIME ZONE 'Asia/Kolkata') AS date_time_to_string

-- Because the time zone is specified, the result is always the same.
/*------------------------------*
 | date_time_to_string          |
 +------------------------------+
 | 2008-12-25 05:30:00 +05:30   |
 *------------------------------*/
SELECT CAST(INTERVAL 3 DAY AS STRING) AS interval_to_string

/*--------------------*
 | interval_to_string |
 +--------------------+
 | 0-0 3 0:0:0        |
 *--------------------*/
SELECT CAST(
  INTERVAL "1-2 3 4:5:6.789" YEAR TO SECOND
  AS STRING) AS interval_to_string

/*--------------------*
 | interval_to_string |
 +--------------------+
 | 1-2 3 4:5:6.789    |
 *--------------------*/

CAST AS STRUCT

CAST(expression AS STRUCT)

Description

GoogleSQL supports casting to STRUCT. The expression parameter can represent an expression for these data types:

  • STRUCT

Conversion rules

From To Rule(s) when casting x
STRUCT STRUCT Allowed if the following conditions are met:
  1. The two structs have the same number of fields.
  2. The original struct field types can be explicitly cast to the corresponding target struct field types (as defined by field order, not field name).

CAST AS TIME

CAST(expression AS TIME [format_clause])

Description

GoogleSQL supports casting to TIME. The expression parameter can represent an expression for these data types:

  • STRING
  • TIME
  • DATETIME
  • TIMESTAMP

Format clause

When an expression of one type is cast to another type, you can use the format clause to provide instructions for how to conduct the cast. You can use the format clause in this section if expression is a STRING.

Conversion rules

From To Rule(s) when casting x
STRING TIME When casting from string to time, the string must conform to the supported time literal format, and is independent of time zone. If the string expression is invalid or represents a time that is outside of the supported min/max range, then an error is produced.

CAST AS TIMESTAMP

CAST(expression AS TIMESTAMP [format_clause [AT TIME ZONE timezone_expr]])

Description

GoogleSQL supports casting to TIMESTAMP. The expression parameter can represent an expression for these data types:

  • STRING
  • DATETIME
  • TIMESTAMP

Format clause

When an expression of one type is cast to another type, you can use the format clause to provide instructions for how to conduct the cast. You can use the format clause in this section if expression is a STRING.

The format clause for TIMESTAMP has an additional optional clause called AT TIME ZONE timezone_expr, which you can use to specify a specific time zone to use during formatting. If this optional clause is not included, your current time zone is used.

Conversion rules

From To Rule(s) when casting x
STRING TIMESTAMP When casting from string to a timestamp, string_expression must conform to the supported timestamp literal formats, or else a runtime error occurs. The string_expression may itself contain a time zone.

If there is a time zone in the string_expression, that time zone is used for conversion, otherwise the default time zone, UTC, is used. If the string has fewer than six digits, then it is implicitly widened.

An error is produced if the string_expression is invalid, has more than six subsecond digits (i.e., precision greater than microseconds), or represents a time outside of the supported timestamp range.
DATE TIMESTAMP Casting from a date to a timestamp interprets date_expression as of midnight (start of the day) in the default time zone, UTC.
DATETIME TIMESTAMP Casting from a datetime to a timestamp interprets datetime_expression in the default time zone, UTC.

Most valid datetime values have exactly one corresponding timestamp in each time zone. However, there are certain combinations of valid datetime values and time zones that have zero or two corresponding timestamp values. This happens in a time zone when clocks are set forward or set back, such as for Daylight Savings Time. When there are two valid timestamps, the earlier one is used. When there is no valid timestamp, the length of the gap in time (typically one hour) is added to the datetime.

Examples

The following example casts a string-formatted timestamp as a timestamp:

SELECT CAST("2020-06-02 17:00:53.110+00:00" AS TIMESTAMP) AS as_timestamp

-- Results depend upon where this query was executed.
/*-----------------------------*
 | as_timestamp                |
 +-----------------------------+
 | 2020-06-03 00:00:53.110 UTC |
 *-----------------------------*/

The following examples cast a string-formatted date and time as a timestamp. These examples return the same output as the previous example.

SELECT CAST('06/02/2020 17:00:53.110' AS TIMESTAMP FORMAT 'MM/DD/YYYY HH24:MI:SS.FF3' AT TIME ZONE 'UTC') AS as_timestamp
SELECT CAST('06/02/2020 17:00:53.110' AS TIMESTAMP FORMAT 'MM/DD/YYYY HH24:MI:SS.FF3' AT TIME ZONE '+00') AS as_timestamp
SELECT CAST('06/02/2020 17:00:53.110 +00' AS TIMESTAMP FORMAT 'MM/DD/YYYY HH24:MI:SS.FF3 TZH') AS as_timestamp

PARSE_BIGNUMERIC

PARSE_BIGNUMERIC(string_expression)

Description

Converts a STRING to a BIGNUMERIC value.

The numeric literal contained in the string must not exceed the maximum precision or range of the BIGNUMERIC type, or an error occurs. If the number of digits after the decimal point exceeds 38, then the resulting BIGNUMERIC value rounds half away from zero to have 38 digits after the decimal point.


-- This example shows how a string with a decimal point is parsed.
SELECT PARSE_BIGNUMERIC("123.45") AS parsed;

/*--------*
 | parsed |
 +--------+
 | 123.45 |
 *--------*/

-- This example shows how a string with an exponent is parsed.
SELECT PARSE_BIGNUMERIC("123.456E37") AS parsed;

/*-----------------------------------------*
 | parsed                                  |
 +-----------------------------------------+
 | 123400000000000000000000000000000000000 |
 *-----------------------------------------*/

-- This example shows the rounding when digits after the decimal point exceeds 38.
SELECT PARSE_BIGNUMERIC("1.123456789012345678901234567890123456789") as parsed;

/*------------------------------------------*
 | parsed                                   |
 +------------------------------------------+
 | 1.12345678901234567890123456789012345679 |
 *------------------------------------------*/

This function is similar to using the CAST AS BIGNUMERIC function except that the PARSE_BIGNUMERIC function only accepts string inputs and allows the following in the string:

  • Spaces between the sign (+/-) and the number
  • Signs (+/-) after the number

Rules for valid input strings:

Rule Example Input Output
The string can only contain digits, commas, decimal points and signs. "- 12,34567,89.0" -123456789
Whitespaces are allowed anywhere except between digits. " - 12.345 " -12.345
Only digits and commas are allowed before the decimal point. " 12,345,678" 12345678
Only digits are allowed after the decimal point. "1.234 " 1.234
Use E or e for exponents. After the e, digits and a leading sign indicator are allowed. " 123.45e-1" 12.345
If the integer part is not empty, then it must contain at least one digit. " 0,.12 -" -0.12
If the string contains a decimal point, then it must contain at least one digit. " .1" 0.1
The string cannot contain more than one sign. " 0.5 +" 0.5

Return Data Type

BIGNUMERIC

Examples

This example shows an input with spaces before, after, and between the sign and the number:

SELECT PARSE_BIGNUMERIC("  -  12.34 ") as parsed;

/*--------*
 | parsed |
 +--------+
 | -12.34 |
 *--------*/

This example shows an input with an exponent as well as the sign after the number:

SELECT PARSE_BIGNUMERIC("12.34e-1-") as parsed;

/*--------*
 | parsed |
 +--------+
 | -1.234 |
 *--------*/

This example shows an input with multiple commas in the integer part of the number:

SELECT PARSE_BIGNUMERIC("  1,2,,3,.45 + ") as parsed;

/*--------*
 | parsed |
 +--------+
 | 123.45 |
 *--------*/

This example shows an input with a decimal point and no digits in the whole number part:

SELECT PARSE_BIGNUMERIC(".1234  ") as parsed;

/*--------*
 | parsed |
 +--------+
 | 0.1234 |
 *--------*/

Examples of invalid inputs

This example is invalid because the whole number part contains no digits:

SELECT PARSE_BIGNUMERIC(",,,.1234  ") as parsed;

This example is invalid because there are whitespaces between digits:

SELECT PARSE_BIGNUMERIC("1  23.4 5  ") as parsed;

This example is invalid because the number is empty except for an exponent:

SELECT PARSE_BIGNUMERIC("  e1 ") as parsed;

This example is invalid because the string contains multiple signs:

SELECT PARSE_BIGNUMERIC("  - 12.3 - ") as parsed;

This example is invalid because the value of the number falls outside the range of BIGNUMERIC:

SELECT PARSE_BIGNUMERIC("12.34E100 ") as parsed;

This example is invalid because the string contains invalid characters:

SELECT PARSE_BIGNUMERIC("$12.34") as parsed;

PARSE_NUMERIC

PARSE_NUMERIC(string_expression)

Description

Converts a STRING to a NUMERIC value.

The numeric literal contained in the string must not exceed the maximum precision or range of the NUMERIC type, or an error occurs. If the number of digits after the decimal point exceeds nine, then the resulting NUMERIC value rounds half away from zero to have nine digits after the decimal point.


-- This example shows how a string with a decimal point is parsed.
SELECT PARSE_NUMERIC("123.45") AS parsed;

/*--------*
 | parsed |
 +--------+
 | 123.45 |
 *--------*/

-- This example shows how a string with an exponent is parsed.
SELECT PARSE_NUMERIC("12.34E27") as parsed;

/*-------------------------------*
 | parsed                        |
 +-------------------------------+
 | 12340000000000000000000000000 |
 *-------------------------------*/

-- This example shows the rounding when digits after the decimal point exceeds 9.
SELECT PARSE_NUMERIC("1.0123456789") as parsed;

/*-------------*
 | parsed      |
 +-------------+
 | 1.012345679 |
 *-------------*/

This function is similar to using the CAST AS NUMERIC function except that the PARSE_NUMERIC function only accepts string inputs and allows the following in the string:

  • Spaces between the sign (+/-) and the number
  • Signs (+/-) after the number

Rules for valid input strings:

Rule Example Input Output
The string can only contain digits, commas, decimal points and signs. "- 12,34567,89.0" -123456789
Whitespaces are allowed anywhere except between digits. " - 12.345 " -12.345
Only digits and commas are allowed before the decimal point. " 12,345,678" 12345678
Only digits are allowed after the decimal point. "1.234 " 1.234
Use E or e for exponents. After the e, digits and a leading sign indicator are allowed. " 123.45e-1" 12.345
If the integer part is not empty, then it must contain at least one digit. " 0,.12 -" -0.12
If the string contains a decimal point, then it must contain at least one digit. " .1" 0.1
The string cannot contain more than one sign. " 0.5 +" 0.5

Return Data Type

NUMERIC

Examples

This example shows an input with spaces before, after, and between the sign and the number:

SELECT PARSE_NUMERIC("  -  12.34 ") as parsed;

/*--------*
 | parsed |
 +--------+
 | -12.34 |
 *--------*/

This example shows an input with an exponent as well as the sign after the number:

SELECT PARSE_NUMERIC("12.34e-1-") as parsed;

/*--------*
 | parsed |
 +--------+
 | -1.234 |
 *--------*/

This example shows an input with multiple commas in the integer part of the number:

SELECT PARSE_NUMERIC("  1,2,,3,.45 + ") as parsed;

/*--------*
 | parsed |
 +--------+
 | 123.45 |
 *--------*/

This example shows an input with a decimal point and no digits in the whole number part:

SELECT PARSE_NUMERIC(".1234  ") as parsed;

/*--------*
 | parsed |
 +--------+
 | 0.1234 |
 *--------*/

Examples of invalid inputs

This example is invalid because the whole number part contains no digits:

SELECT PARSE_NUMERIC(",,,.1234  ") as parsed;

This example is invalid because there are whitespaces between digits:

SELECT PARSE_NUMERIC("1  23.4 5  ") as parsed;

This example is invalid because the number is empty except for an exponent:

SELECT PARSE_NUMERIC("  e1 ") as parsed;

This example is invalid because the string contains multiple signs:

SELECT PARSE_NUMERIC("  - 12.3 - ") as parsed;

This example is invalid because the value of the number falls outside the range of BIGNUMERIC:

SELECT PARSE_NUMERIC("12.34E100 ") as parsed;

This example is invalid because the string contains invalid characters:

SELECT PARSE_NUMERIC("$12.34") as parsed;

SAFE_CAST

SAFE_CAST(expression AS typename [format_clause])

Description

When using CAST, a query can fail if GoogleSQL is unable to perform the cast. For example, the following query generates an error:

SELECT CAST("apple" AS INT64) AS not_a_number;

If you want to protect your queries from these types of errors, you can use SAFE_CAST. SAFE_CAST replaces runtime errors with NULLs. However, during static analysis, impossible casts between two non-castable types still produce an error because the query is invalid.

SELECT SAFE_CAST("apple" AS INT64) AS not_a_number;

/*--------------*
 | not_a_number |
 +--------------+
 | NULL         |
 *--------------*/

Some casts can include a format clause, which provides instructions for how to conduct the cast. For example, you could instruct a cast to convert a sequence of bytes to a BASE64-encoded string instead of a UTF-8-encoded string.

The structure of the format clause is unique to each type of cast and more information is available in the section for that cast.

If you are casting from bytes to strings, you can also use the function, SAFE_CONVERT_BYTES_TO_STRING. Any invalid UTF-8 characters are replaced with the unicode replacement character, U+FFFD.

Other conversion functions

You can learn more about these conversion functions elsewhere in the documentation:

Conversion function From To
ARRAY_TO_STRING ARRAY STRING
BOOL JSON BOOL
DATE Various data types DATE
DATE_FROM_UNIX_DATE INT64 DATE
DATETIME Various data types DATETIME
FLOAT64 JSON FLOAT64
FROM_BASE32 STRING BYTEs
FROM_BASE64 STRING BYTES
FROM_HEX STRING BYTES
INT64 JSON INT64
PARSE_DATE STRING DATE
PARSE_DATETIME STRING DATETIME
PARSE_JSON STRING JSON
PARSE_TIME STRING TIME
PARSE_TIMESTAMP STRING TIMESTAMP
SAFE_CONVERT_BYTES_TO_STRING BYTES STRING
STRING TIMESTAMP STRING
STRING JSON STRING
TIME Various data types TIME
TIMESTAMP Various data types TIMESTAMP
TIMESTAMP_MICROS INT64 TIMESTAMP
TIMESTAMP_MILLIS INT64 TIMESTAMP
TIMESTAMP_SECONDS INT64 TIMESTAMP
TO_BASE32 BYTES STRING
TO_BASE64 BYTES STRING
TO_HEX BYTES STRING
TO_JSON All data types JSON
TO_JSON_STRING All data types STRING

Date functions

GoogleSQL for BigQuery supports the following date functions.

Function list

Name Summary
CURRENT_DATE Returns the current date as a DATE value.
DATE Constructs a DATE value.
DATE_ADD Adds a specified time interval to a DATE value.
DATE_DIFF Gets the number of unit boundaries between two DATE values at a particular time granularity.
DATE_FROM_UNIX_DATE Interprets an INT64 expression as the number of days since 1970-01-01.
DATE_SUB Subtracts a specified time interval from a DATE value.
DATE_TRUNC Truncates a DATE value.
EXTRACT Extracts part of a date from a DATE value.
FORMAT_DATE Formats a DATE value according to a specified format string.
LAST_DAY Gets the last day in a specified time period that contains a DATE value.
PARSE_DATE Converts a STRING value to a DATE value.
UNIX_DATE Converts a DATE value to the number of days since 1970-01-01.

CURRENT_DATE

CURRENT_DATE()
CURRENT_DATE(time_zone_expression)
CURRENT_DATE

Description

Returns the current date as a DATE object. Parentheses are optional when called with no arguments.

This function supports the following arguments:

  • time_zone_expression: A STRING expression that represents a time zone. If no time zone is specified, the default time zone, UTC, is used. If this expression is used and it evaluates to NULL, this function returns NULL.

The current date is recorded at the start of the query statement which contains this function, not when this specific function is evaluated.

Return Data Type

DATE

Examples

The following query produces the current date in the default time zone:

SELECT CURRENT_DATE() AS the_date;

/*--------------*
 | the_date     |
 +--------------+
 | 2016-12-25   |
 *--------------*/

The following queries produce the current date in a specified time zone:

SELECT CURRENT_DATE('America/Los_Angeles') AS the_date;

/*--------------*
 | the_date     |
 +--------------+
 | 2016-12-25   |
 *--------------*/
SELECT CURRENT_DATE('-08') AS the_date;

/*--------------*
 | the_date     |
 +--------------+
 | 2016-12-25   |
 *--------------*/

The following query produces the current date in the default time zone. Parentheses are not needed if the function has no arguments.

SELECT CURRENT_DATE AS the_date;

/*--------------*
 | the_date     |
 +--------------+
 | 2016-12-25   |
 *--------------*/

When a column named current_date is present, the column name and the function call without parentheses are ambiguous. To ensure the function call, add parentheses; to ensure the column name, qualify it with its range variable. For example, the following query will select the function in the the_date column and the table column in the current_date column.

WITH t AS (SELECT 'column value' AS `current_date`)
SELECT current_date() AS the_date, t.current_date FROM t;

/*------------+--------------*
 | the_date   | current_date |
 +------------+--------------+
 | 2016-12-25 | column value |
 *------------+--------------*/

DATE

DATE(year, month, day)
DATE(timestamp_expression)
DATE(timestamp_expression, time_zone_expression)
DATE(datetime_expression)

Description

Constructs or extracts a date.

This function supports the following arguments:

  • year: The INT64 value for year.
  • month: The INT64 value for month.
  • day: The INT64 value for day.
  • timestamp_expression: A TIMESTAMP expression that contains the date.
  • time_zone_expression: A STRING expression that represents a time zone. If no time zone is specified with timestamp_expression, the default time zone, UTC, is used.
  • datetime_expression: A DATETIME expression that contains the date.

Return Data Type

DATE

Example

SELECT
  DATE(2016, 12, 25) AS date_ymd,
  DATE(DATETIME '2016-12-25 23:59:59') AS date_dt,
  DATE(TIMESTAMP '2016-12-25 05:30:00+07', 'America/Los_Angeles') AS date_tstz;

/*------------+------------+------------*
 | date_ymd   | date_dt    | date_tstz  |
 +------------+------------+------------+
 | 2016-12-25 | 2016-12-25 | 2016-12-24 |
 *------------+------------+------------*/

DATE_ADD

DATE_ADD(date_expression, INTERVAL int64_expression date_part)

Description

Adds a specified time interval to a DATE.

DATE_ADD supports the following date_part values:

  • DAY
  • WEEK. Equivalent to 7 DAYs.
  • MONTH
  • QUARTER
  • YEAR

Special handling is required for MONTH, QUARTER, and YEAR parts when the date is at (or near) the last day of the month. If the resulting month has fewer days than the original date's day, then the resulting date is the last date of that month.

Return Data Type

DATE

Example

SELECT DATE_ADD(DATE '2008-12-25', INTERVAL 5 DAY) AS five_days_later;

/*--------------------*
 | five_days_later    |
 +--------------------+
 | 2008-12-30         |
 *--------------------*/

DATE_DIFF

DATE_DIFF(end_date, start_date, granularity)

Description

Gets the number of unit boundaries between two DATE values (end_date - start_date) at a particular time granularity.

Definitions

  • start_date: The starting DATE value.
  • end_date: The ending DATE value.
  • granularity: The date part that represents the granularity. This can be:

    • DAY
    • WEEK This date part begins on Sunday.
    • WEEK(<WEEKDAY>): This date part begins on WEEKDAY. Valid values for WEEKDAY are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
    • ISOWEEK: Uses ISO 8601 week boundaries. ISO weeks begin on Monday.
    • MONTH, except when the first two arguments are TIMESTAMP values.
    • QUARTER
    • YEAR
    • ISOYEAR: Uses the ISO 8601 week-numbering year boundary. The ISO year boundary is the Monday of the first week whose Thursday belongs to the corresponding Gregorian calendar year.

Details

If end_date is earlier than start_date, the output is negative.

Return Data Type

INT64

Example

SELECT DATE_DIFF(DATE '2010-07-07', DATE '2008-12-25', DAY) AS days_diff;

/*-----------*
 | days_diff |
 +-----------+
 | 559       |
 *-----------*/
SELECT
  DATE_DIFF(DATE '2017-10-15', DATE '2017-10-14', DAY) AS days_diff,
  DATE_DIFF(DATE '2017-10-15', DATE '2017-10-14', WEEK) AS weeks_diff;

/*-----------+------------*
 | days_diff | weeks_diff |
 +-----------+------------+
 | 1         | 1          |
 *-----------+------------*/

The example above shows the result of DATE_DIFF for two days in succession. DATE_DIFF with the date part WEEK returns 1 because DATE_DIFF counts the number of date part boundaries in this range of dates. Each WEEK begins on Sunday, so there is one date part boundary between Saturday, 2017-10-14 and Sunday, 2017-10-15.

The following example shows the result of DATE_DIFF for two dates in different years. DATE_DIFF with the date part YEAR returns 3 because it counts the number of Gregorian calendar year boundaries between the two dates. DATE_DIFF with the date part ISOYEAR returns 2 because the second date belongs to the ISO year 2015. The first Thursday of the 2015 calendar year was 2015-01-01, so the ISO year 2015 begins on the preceding Monday, 2014-12-29.

SELECT
  DATE_DIFF('2017-12-30', '2014-12-30', YEAR) AS year_diff,
  DATE_DIFF('2017-12-30', '2014-12-30', ISOYEAR) AS isoyear_diff;

/*-----------+--------------*
 | year_diff | isoyear_diff |
 +-----------+--------------+
 | 3         | 2            |
 *-----------+--------------*/

The following example shows the result of DATE_DIFF for two days in succession. The first date falls on a Monday and the second date falls on a Sunday. DATE_DIFF with the date part WEEK returns 0 because this date part uses weeks that begin on Sunday. DATE_DIFF with the date part WEEK(MONDAY) returns 1. DATE_DIFF with the date part ISOWEEK also returns 1 because ISO weeks begin on Monday.

SELECT
  DATE_DIFF('2017-12-18', '2017-12-17', WEEK) AS week_diff,
  DATE_DIFF('2017-12-18', '2017-12-17', WEEK(MONDAY)) AS week_weekday_diff,
  DATE_DIFF('2017-12-18', '2017-12-17', ISOWEEK) AS isoweek_diff;

/*-----------+-------------------+--------------*
 | week_diff | week_weekday_diff | isoweek_diff |
 +-----------+-------------------+--------------+
 | 0         | 1                 | 1            |
 *-----------+-------------------+--------------*/

DATE_FROM_UNIX_DATE

DATE_FROM_UNIX_DATE(int64_expression)

Description

Interprets int64_expression as the number of days since 1970-01-01.

Return Data Type

DATE

Example

SELECT DATE_FROM_UNIX_DATE(14238) AS date_from_epoch;

/*-----------------*
 | date_from_epoch |
 +-----------------+
 | 2008-12-25      |
 *-----------------+*/

DATE_SUB

DATE_SUB(date_expression, INTERVAL int64_expression date_part)

Description

Subtracts a specified time interval from a DATE.

DATE_SUB supports the following date_part values:

  • DAY
  • WEEK. Equivalent to 7 DAYs.
  • MONTH
  • QUARTER
  • YEAR

Special handling is required for MONTH, QUARTER, and YEAR parts when the date is at (or near) the last day of the month. If the resulting month has fewer days than the original date's day, then the resulting date is the last date of that month.

Return Data Type

DATE

Example

SELECT DATE_SUB(DATE '2008-12-25', INTERVAL 5 DAY) AS five_days_ago;

/*---------------*
 | five_days_ago |
 +---------------+
 | 2008-12-20    |
 *---------------*/

DATE_TRUNC

DATE_TRUNC(date_expression, date_part)

Description

Truncates a DATE value to the granularity of date_part. The DATE value is always rounded to the beginning of date_part, which can be one of the following:

  • DAY: The day in the Gregorian calendar year that contains the DATE value.
  • WEEK: The first day of the week in the week that contains the DATE value. Weeks begin on Sundays. WEEK is equivalent to WEEK(SUNDAY).
  • WEEK(WEEKDAY): The first day of the week in the week that contains the DATE value. Weeks begin on WEEKDAY. WEEKDAY must be one of the following: SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, or SATURDAY.
  • ISOWEEK: The first day of the ISO 8601 week in the ISO week that contains the DATE value. The ISO week begins on Monday. The first ISO week of each ISO year contains the first Thursday of the corresponding Gregorian calendar year.
  • MONTH: The first day of the month in the month that contains the DATE value.
  • QUARTER: The first day of the quarter in the quarter that contains the DATE value.
  • YEAR: The first day of the year in the year that contains the DATE value.
  • ISOYEAR: The first day of the ISO 8601 week-numbering year in the ISO year that contains the DATE value. The ISO year is the Monday of the first week whose Thursday belongs to the corresponding Gregorian calendar year.

Return Data Type

DATE

Examples

SELECT DATE_TRUNC(DATE '2008-12-25', MONTH) AS month;

/*------------*
 | month      |
 +------------+
 | 2008-12-01 |
 *------------*/

In the following example, the original date falls on a Sunday. Because the date_part is WEEK(MONDAY), DATE_TRUNC returns the DATE for the preceding Monday.

SELECT date AS original, DATE_TRUNC(date, WEEK(MONDAY)) AS truncated
FROM (SELECT DATE('2017-11-05') AS date);

/*------------+------------*
 | original   | truncated  |
 +------------+------------+
 | 2017-11-05 | 2017-10-30 |
 *------------+------------*/

In the following example, the original date_expression is in the Gregorian calendar year 2015. However, DATE_TRUNC with the ISOYEAR date part truncates the date_expression to the beginning of the ISO year, not the Gregorian calendar year. The first Thursday of the 2015 calendar year was 2015-01-01, so the ISO year 2015 begins on the preceding Monday, 2014-12-29. Therefore the ISO year boundary preceding the date_expression 2015-06-15 is 2014-12-29.

SELECT
  DATE_TRUNC('2015-06-15', ISOYEAR) AS isoyear_boundary,
  EXTRACT(ISOYEAR FROM DATE '2015-06-15') AS isoyear_number;

/*------------------+----------------*
 | isoyear_boundary | isoyear_number |
 +------------------+----------------+
 | 2014-12-29       | 2015           |
 *------------------+----------------*/

EXTRACT

EXTRACT(part FROM date_expression)

Description

Returns the value corresponding to the specified date part. The part must be one of:

  • DAYOFWEEK: Returns values in the range [1,7] with Sunday as the first day of the week.
  • DAY
  • DAYOFYEAR
  • WEEK: Returns the week number of the date in the range [0, 53]. Weeks begin with Sunday, and dates prior to the first Sunday of the year are in week 0.
  • WEEK(<WEEKDAY>): Returns the week number of the date in the range [0, 53]. Weeks begin on WEEKDAY. Dates prior to the first WEEKDAY of the year are in week 0. Valid values for WEEKDAY are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
  • ISOWEEK: Returns the ISO 8601 week number of the date_expression. ISOWEEKs begin on Monday. Return values are in the range [1, 53]. The first ISOWEEK of each ISO year begins on the Monday before the first Thursday of the Gregorian calendar year.
  • MONTH
  • QUARTER: Returns values in the range [1,4].
  • YEAR
  • ISOYEAR: Returns the ISO 8601 week-numbering year, which is the Gregorian calendar year containing the Thursday of the week to which date_expression belongs.

Return Data Type

INT64

Examples

In the following example, EXTRACT returns a value corresponding to the DAY date part.

SELECT EXTRACT(DAY FROM DATE '2013-12-25') AS the_day;

/*---------*
 | the_day |
 +---------+
 | 25      |
 *---------*/

In the following example, EXTRACT returns values corresponding to different date parts from a column of dates near the end of the year.

SELECT
  date,
  EXTRACT(ISOYEAR FROM date) AS isoyear,
  EXTRACT(ISOWEEK FROM date) AS isoweek,
  EXTRACT(YEAR FROM date) AS year,
  EXTRACT(WEEK FROM date) AS week
FROM UNNEST(GENERATE_DATE_ARRAY('2015-12-23', '2016-01-09')) AS date
ORDER BY date;

/*------------+---------+---------+------+------*
 | date       | isoyear | isoweek | year | week |
 +------------+---------+---------+------+------+
 | 2015-12-23 | 2015    | 52      | 2015 | 51   |
 | 2015-12-24 | 2015    | 52      | 2015 | 51   |
 | 2015-12-25 | 2015    | 52      | 2015 | 51   |
 | 2015-12-26 | 2015    | 52      | 2015 | 51   |
 | 2015-12-27 | 2015    | 52      | 2015 | 52   |
 | 2015-12-28 | 2015    | 53      | 2015 | 52   |
 | 2015-12-29 | 2015    | 53      | 2015 | 52   |
 | 2015-12-30 | 2015    | 53      | 2015 | 52   |
 | 2015-12-31 | 2015    | 53      | 2015 | 52   |
 | 2016-01-01 | 2015    | 53      | 2016 | 0    |
 | 2016-01-02 | 2015    | 53      | 2016 | 0    |
 | 2016-01-03 | 2015    | 53      | 2016 | 1    |
 | 2016-01-04 | 2016    | 1       | 2016 | 1    |
 | 2016-01-05 | 2016    | 1       | 2016 | 1    |
 | 2016-01-06 | 2016    | 1       | 2016 | 1    |
 | 2016-01-07 | 2016    | 1       | 2016 | 1    |
 | 2016-01-08 | 2016    | 1       | 2016 | 1    |
 | 2016-01-09 | 2016    | 1       | 2016 | 1    |
 *------------+---------+---------+------+------*/

In the following example, date_expression falls on a Sunday. EXTRACT calculates the first column using weeks that begin on Sunday, and it calculates the second column using weeks that begin on Monday.

WITH table AS (SELECT DATE('2017-11-05') AS date)
SELECT
  date,
  EXTRACT(WEEK(SUNDAY) FROM date) AS week_sunday,
  EXTRACT(WEEK(MONDAY) FROM date) AS week_monday FROM table;

/*------------+-------------+-------------*
 | date       | week_sunday | week_monday |
 +------------+-------------+-------------+
 | 2017-11-05 | 45          | 44          |
 *------------+-------------+-------------*/

FORMAT_DATE

FORMAT_DATE(format_string, date_expr)

Description

Formats the date_expr according to the specified format_string.

See Supported Format Elements For DATE for a list of format elements that this function supports.

Return Data Type

STRING

Examples

SELECT FORMAT_DATE('%x', DATE '2008-12-25') AS US_format;

/*------------*
 | US_format  |
 +------------+
 | 12/25/08   |
 *------------*/
SELECT FORMAT_DATE('%b-%d-%Y', DATE '2008-12-25') AS formatted;

/*-------------*
 | formatted   |
 +-------------+
 | Dec-25-2008 |
 *-------------*/
SELECT FORMAT_DATE('%b %Y', DATE '2008-12-25') AS formatted;

/*-------------*
 | formatted   |
 +-------------+
 | Dec 2008    |
 *-------------*/

LAST_DAY

LAST_DAY(date_expression[, date_part])

Description

Returns the last day from a date expression. This is commonly used to return the last day of the month.

You can optionally specify the date part for which the last day is returned. If this parameter is not used, the default value is MONTH. LAST_DAY supports the following values for date_part:

  • YEAR
  • QUARTER
  • MONTH
  • WEEK. Equivalent to 7 DAYs.
  • WEEK(<WEEKDAY>). <WEEKDAY> represents the starting day of the week. Valid values are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
  • ISOWEEK. Uses ISO 8601 week boundaries. ISO weeks begin on Monday.
  • ISOYEAR. Uses the ISO 8601 week-numbering year boundary. The ISO year boundary is the Monday of the first week whose Thursday belongs to the corresponding Gregorian calendar year.

Return Data Type

DATE

Example

These both return the last day of the month:

SELECT LAST_DAY(DATE '2008-11-25', MONTH) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-30 |
 *------------*/
SELECT LAST_DAY(DATE '2008-11-25') AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-30 |
 *------------*/

This returns the last day of the year:

SELECT LAST_DAY(DATE '2008-11-25', YEAR) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-12-31 |
 *------------*/

This returns the last day of the week for a week that starts on a Sunday:

SELECT LAST_DAY(DATE '2008-11-10', WEEK(SUNDAY)) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-15 |
 *------------*/

This returns the last day of the week for a week that starts on a Monday:

SELECT LAST_DAY(DATE '2008-11-10', WEEK(MONDAY)) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-16 |
 *------------*/

PARSE_DATE

PARSE_DATE(format_string, date_string)

Description

Converts a string representation of date to a DATE object.

format_string contains the format elements that define how date_string is formatted. Each element in date_string must have a corresponding element in format_string. The location of each element in format_string must match the location of each element in date_string.

-- This works because elements on both sides match.
SELECT PARSE_DATE('%A %b %e %Y', 'Thursday Dec 25 2008');

-- This produces an error because the year element is in different locations.
SELECT PARSE_DATE('%Y %A %b %e', 'Thursday Dec 25 2008');

-- This produces an error because one of the year elements is missing.
SELECT PARSE_DATE('%A %b %e', 'Thursday Dec 25 2008');

-- This works because %F can find all matching elements in date_string.
SELECT PARSE_DATE('%F', '2000-12-30');

When using PARSE_DATE, keep the following in mind:

  • Unspecified fields. Any unspecified field is initialized from 1970-01-01.
  • Case insensitivity. Names, such as Monday, February, and so on, are case insensitive.
  • Whitespace. One or more consecutive white spaces in the format string matches zero or more consecutive white spaces in the date string. In addition, leading and trailing white spaces in the date string are always allowed -- even if they are not in the format string.
  • Format precedence. When two (or more) format elements have overlapping information (for example both %F and %Y affect the year), the last one generally overrides any earlier ones.

Return Data Type

DATE

Examples

This example converts a MM/DD/YY formatted string to a DATE object:

SELECT PARSE_DATE('%x', '12/25/08') AS parsed;

/*------------*
 | parsed     |
 +------------+
 | 2008-12-25 |
 *------------*/

This example converts a YYYYMMDD formatted string to a DATE object:

SELECT PARSE_DATE('%Y%m%d', '20081225') AS parsed;

/*------------*
 | parsed     |
 +------------+
 | 2008-12-25 |
 *------------*/

UNIX_DATE

UNIX_DATE(date_expression)

Description

Returns the number of days since 1970-01-01.

Return Data Type

INT64

Example

SELECT UNIX_DATE(DATE '2008-12-25') AS days_from_epoch;

/*-----------------*
 | days_from_epoch |
 +-----------------+
 | 14238           |
 *-----------------*/

Datetime functions

GoogleSQL for BigQuery supports the following datetime functions.

All outputs are automatically formatted as per ISO 8601, separating date and time with aT.

Function list

Name Summary
CURRENT_DATETIME Returns the current date and time as a DATETIME value.
DATETIME Constructs a DATETIME value.
DATETIME_ADD Adds a specified time interval to a DATETIME value.
DATETIME_DIFF Gets the number of unit boundaries between two DATETIME values at a particular time granularity.
DATETIME_SUB Subtracts a specified time interval from a DATETIME value.
DATETIME_TRUNC Truncates a DATETIME value.
EXTRACT Extracts part of a date and time from a DATETIME value.
FORMAT_DATETIME Formats a DATETIME value according to a specified format string.
LAST_DAY Gets the last day in a specified time period that contains a DATETIME value.
PARSE_DATETIME Converts a STRING value to a DATETIME value.

CURRENT_DATETIME

CURRENT_DATETIME([time_zone])
CURRENT_DATETIME

Description

Returns the current time as a DATETIME object. Parentheses are optional when called with no arguments.

This function supports an optional time_zone parameter. See Time zone definitions for information on how to specify a time zone.

The current date and time is recorded at the start of the query statement which contains this function, not when this specific function is evaluated.

Return Data Type

DATETIME

Example

SELECT CURRENT_DATETIME() as now;

/*----------------------------*
 | now                        |
 +----------------------------+
 | 2016-05-19T10:38:47.046465 |
 *----------------------------*/

When a column named current_datetime is present, the column name and the function call without parentheses are ambiguous. To ensure the function call, add parentheses; to ensure the column name, qualify it with its range variable. For example, the following query will select the function in the now column and the table column in the current_datetime column.

WITH t AS (SELECT 'column value' AS `current_datetime`)
SELECT current_datetime() as now, t.current_datetime FROM t;

/*----------------------------+------------------*
 | now                        | current_datetime |
 +----------------------------+------------------+
 | 2016-05-19T10:38:47.046465 | column value     |
 *----------------------------+------------------*/

DATETIME

1. DATETIME(year, month, day, hour, minute, second)
2. DATETIME(date_expression[, time_expression])
3. DATETIME(timestamp_expression [, time_zone])

Description

  1. Constructs a DATETIME object using INT64 values representing the year, month, day, hour, minute, and second.
  2. Constructs a DATETIME object using a DATE object and an optional TIME object.
  3. Constructs a DATETIME object using a TIMESTAMP object. It supports an optional parameter to specify a time zone. If no time zone is specified, the default time zone, UTC, is used.

Return Data Type

DATETIME

Example

SELECT
  DATETIME(2008, 12, 25, 05, 30, 00) as datetime_ymdhms,
  DATETIME(TIMESTAMP "2008-12-25 05:30:00+00", "America/Los_Angeles") as datetime_tstz;

/*---------------------+---------------------*
 | datetime_ymdhms     | datetime_tstz       |
 +---------------------+---------------------+
 | 2008-12-25T05:30:00 | 2008-12-24T21:30:00 |
 *---------------------+---------------------*/

DATETIME_ADD

DATETIME_ADD(datetime_expression, INTERVAL int64_expression part)

Description

Adds int64_expression units of part to the DATETIME object.

DATETIME_ADD supports the following values for part:

  • MICROSECOND
  • MILLISECOND
  • SECOND
  • MINUTE
  • HOUR
  • DAY
  • WEEK. Equivalent to 7 DAYs.
  • MONTH
  • QUARTER
  • YEAR

Special handling is required for MONTH, QUARTER, and YEAR parts when the date is at (or near) the last day of the month. If the resulting month has fewer days than the original DATETIME's day, then the result day is the last day of the new month.

Return Data Type

DATETIME

Example

SELECT
  DATETIME "2008-12-25 15:30:00" as original_date,
  DATETIME_ADD(DATETIME "2008-12-25 15:30:00", INTERVAL 10 MINUTE) as later;

/*-----------------------------+------------------------*
 | original_date               | later                  |
 +-----------------------------+------------------------+
 | 2008-12-25T15:30:00         | 2008-12-25T15:40:00    |
 *-----------------------------+------------------------*/

DATETIME_DIFF

DATETIME_DIFF(end_datetime, start_datetime, granularity)

Description

Gets the number of unit boundaries between two DATETIME values (end_datetime - start_datetime) at a particular time granularity.

Definitions

  • start_datetime: The starting DATETIME value.
  • end_datetime: The ending DATETIME value.
  • granularity: The datetime part that represents the granularity. This can be:

    • MICROSECOND
    • MILLISECOND
    • SECOND
    • MINUTE
    • HOUR
    • DAY
    • WEEK: This date part begins on Sunday.
    • WEEK(<WEEKDAY>): This date part begins on WEEKDAY. Valid values for WEEKDAY are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
    • ISOWEEK: Uses ISO 8601 week boundaries. ISO weeks begin on Monday.
    • MONTH, except when the first two arguments are TIMESTAMP values.
    • QUARTER
    • YEAR
    • ISOYEAR: Uses the ISO 8601 week-numbering year boundary. The ISO year boundary is the Monday of the first week whose Thursday belongs to the corresponding Gregorian calendar year.

Details

If end_datetime is earlier than start_datetime, the output is negative. Produces an error if the computation overflows, such as if the difference in microseconds between the two DATETIME values overflows.

Return Data Type

INT64

Example

SELECT
  DATETIME "2010-07-07 10:20:00" as first_datetime,
  DATETIME "2008-12-25 15:30:00" as second_datetime,
  DATETIME_DIFF(DATETIME "2010-07-07 10:20:00",
    DATETIME "2008-12-25 15:30:00", DAY) as difference;

/*----------------------------+------------------------+------------------------*
 | first_datetime             | second_datetime        | difference             |
 +----------------------------+------------------------+------------------------+
 | 2010-07-07T10:20:00        | 2008-12-25T15:30:00    | 559                    |
 *----------------------------+------------------------+------------------------*/
SELECT
  DATETIME_DIFF(DATETIME '2017-10-15 00:00:00',
    DATETIME '2017-10-14 00:00:00', DAY) as days_diff,
  DATETIME_DIFF(DATETIME '2017-10-15 00:00:00',
    DATETIME '2017-10-14 00:00:00', WEEK) as weeks_diff;

/*-----------+------------*
 | days_diff | weeks_diff |
 +-----------+------------+
 | 1         | 1          |
 *-----------+------------*/

The example above shows the result of DATETIME_DIFF for two DATETIMEs that are 24 hours apart. DATETIME_DIFF with the part WEEK returns 1 because DATETIME_DIFF counts the number of part boundaries in this range of DATETIMEs. Each WEEK begins on Sunday, so there is one part boundary between Saturday, 2017-10-14 00:00:00 and Sunday, 2017-10-15 00:00:00.

The following example shows the result of DATETIME_DIFF for two dates in different years. DATETIME_DIFF with the date part YEAR returns 3 because it counts the number of Gregorian calendar year boundaries between the two DATETIMEs. DATETIME_DIFF with the date part ISOYEAR returns 2 because the second DATETIME belongs to the ISO year 2015. The first Thursday of the 2015 calendar year was 2015-01-01, so the ISO year 2015 begins on the preceding Monday, 2014-12-29.

SELECT
  DATETIME_DIFF('2017-12-30 00:00:00',
    '2014-12-30 00:00:00', YEAR) AS year_diff,
  DATETIME_DIFF('2017-12-30 00:00:00',
    '2014-12-30 00:00:00', ISOYEAR) AS isoyear_diff;

/*-----------+--------------*
 | year_diff | isoyear_diff |
 +-----------+--------------+
 | 3         | 2            |
 *-----------+--------------*/

The following example shows the result of DATETIME_DIFF for two days in succession. The first date falls on a Monday and the second date falls on a Sunday. DATETIME_DIFF with the date part WEEK returns 0 because this time part uses weeks that begin on Sunday. DATETIME_DIFF with the date part WEEK(MONDAY) returns 1. DATETIME_DIFF with the date part ISOWEEK also returns 1 because ISO weeks begin on Monday.

SELECT
  DATETIME_DIFF('2017-12-18', '2017-12-17', WEEK) AS week_diff,
  DATETIME_DIFF('2017-12-18', '2017-12-17', WEEK(MONDAY)) AS week_weekday_diff,
  DATETIME_DIFF('2017-12-18', '2017-12-17', ISOWEEK) AS isoweek_diff;

/*-----------+-------------------+--------------*
 | week_diff | week_weekday_diff | isoweek_diff |
 +-----------+-------------------+--------------+
 | 0         | 1                 | 1            |
 *-----------+-------------------+--------------*/

DATETIME_SUB

DATETIME_SUB(datetime_expression, INTERVAL int64_expression part)

Description

Subtracts int64_expression units of part from the DATETIME.

DATETIME_SUB supports the following values for part:

  • MICROSECOND
  • MILLISECOND
  • SECOND
  • MINUTE
  • HOUR
  • DAY
  • WEEK. Equivalent to 7 DAYs.
  • MONTH
  • QUARTER
  • YEAR

Special handling is required for MONTH, QUARTER, and YEAR parts when the date is at (or near) the last day of the month. If the resulting month has fewer days than the original DATETIME's day, then the result day is the last day of the new month.

Return Data Type

DATETIME

Example

SELECT
  DATETIME "2008-12-25 15:30:00" as original_date,
  DATETIME_SUB(DATETIME "2008-12-25 15:30:00", INTERVAL 10 MINUTE) as earlier;

/*-----------------------------+------------------------*
 | original_date               | earlier                |
 +-----------------------------+------------------------+
 | 2008-12-25T15:30:00         | 2008-12-25T15:20:00    |
 *-----------------------------+------------------------*/

DATETIME_TRUNC

DATETIME_TRUNC(datetime_expression, date_time_part)

Description

Truncates a DATETIME value to the granularity of date_time_part. The DATETIME value is always rounded to the beginning of date_time_part, which can be one of the following:

  • MICROSECOND: If used, nothing is truncated from the value.
  • MILLISECOND: The nearest lessor or equal millisecond.
  • SECOND: The nearest lessor or equal second.
  • MINUTE: The nearest lessor or equal minute.
  • HOUR: The nearest lessor or equal hour.
  • DAY: The day in the Gregorian calendar year that contains the DATETIME value.
  • WEEK: The first day of the week in the week that contains the DATETIME value. Weeks begin on Sundays. WEEK is equivalent to WEEK(SUNDAY).
  • WEEK(WEEKDAY): The first day of the week in the week that contains the DATETIME value. Weeks begin on WEEKDAY. WEEKDAY must be one of the following: SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, or SATURDAY.
  • ISOWEEK: The first day of the ISO 8601 week in the ISO week that contains the DATETIME value. The ISO week begins on Monday. The first ISO week of each ISO year contains the first Thursday of the corresponding Gregorian calendar year.
  • MONTH: The first day of the month in the month that contains the DATETIME value.
  • QUARTER: The first day of the quarter in the quarter that contains the DATETIME value.
  • YEAR: The first day of the year in the year that contains the DATETIME value.
  • ISOYEAR: The first day of the ISO 8601 week-numbering year in the ISO year that contains the DATETIME value. The ISO year is the Monday of the first week whose Thursday belongs to the corresponding Gregorian calendar year.

Return Data Type

DATETIME

Examples

SELECT
  DATETIME "2008-12-25 15:30:00" as original,
  DATETIME_TRUNC(DATETIME "2008-12-25 15:30:00", DAY) as truncated;

/*----------------------------+------------------------*
 | original                   | truncated              |
 +----------------------------+------------------------+
 | 2008-12-25T15:30:00        | 2008-12-25T00:00:00    |
 *----------------------------+------------------------*/

In the following example, the original DATETIME falls on a Sunday. Because the part is WEEK(MONDAY), DATE_TRUNC returns the DATETIME for the preceding Monday.

SELECT
 datetime AS original,
 DATETIME_TRUNC(datetime, WEEK(MONDAY)) AS truncated
FROM (SELECT DATETIME(TIMESTAMP "2017-11-05 00:00:00+00", "UTC") AS datetime);

/*---------------------+---------------------*
 | original            | truncated           |
 +---------------------+---------------------+
 | 2017-11-05T00:00:00 | 2017-10-30T00:00:00 |
 *---------------------+---------------------*/

In the following example, the original datetime_expression is in the Gregorian calendar year 2015. However, DATETIME_TRUNC with the ISOYEAR date part truncates the datetime_expression to the beginning of the ISO year, not the Gregorian calendar year. The first Thursday of the 2015 calendar year was 2015-01-01, so the ISO year 2015 begins on the preceding Monday, 2014-12-29. Therefore the ISO year boundary preceding the datetime_expression 2015-06-15 00:00:00 is 2014-12-29.

SELECT
  DATETIME_TRUNC('2015-06-15 00:00:00', ISOYEAR) AS isoyear_boundary,
  EXTRACT(ISOYEAR FROM DATETIME '2015-06-15 00:00:00') AS isoyear_number;

/*---------------------+----------------*
 | isoyear_boundary    | isoyear_number |
 +---------------------+----------------+
 | 2014-12-29T00:00:00 | 2015           |
 *---------------------+----------------*/

EXTRACT

EXTRACT(part FROM datetime_expression)

Description

Returns a value that corresponds to the specified part from a supplied datetime_expression.

Allowed part values are:

  • MICROSECOND
  • MILLISECOND
  • SECOND
  • MINUTE
  • HOUR
  • DAYOFWEEK: Returns values in the range [1,7] with Sunday as the first day of of the week.
  • DAY
  • DAYOFYEAR
  • WEEK: Returns the week number of the date in the range [0, 53]. Weeks begin with Sunday, and dates prior to the first Sunday of the year are in week 0.
  • WEEK(<WEEKDAY>): Returns the week number of datetime_expression in the range [0, 53]. Weeks begin on WEEKDAY. datetimes prior to the first WEEKDAY of the year are in week 0. Valid values for WEEKDAY are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
  • ISOWEEK: Returns the ISO 8601 week number of the datetime_expression. ISOWEEKs begin on Monday. Return values are in the range [1, 53]. The first ISOWEEK of each ISO year begins on the Monday before the first Thursday of the Gregorian calendar year.
  • MONTH
  • QUARTER
  • YEAR
  • ISOYEAR: Returns the ISO 8601 week-numbering year, which is the Gregorian calendar year containing the Thursday of the week to which date_expression belongs.
  • DATE
  • TIME

Returned values truncate lower order time periods. For example, when extracting seconds, EXTRACT truncates the millisecond and microsecond values.

Return Data Type

INT64, except in the following cases:

  • If part is DATE, returns a DATE object.
  • If part is TIME, returns a TIME object.

Examples

In the following example, EXTRACT returns a value corresponding to the HOUR time part.

SELECT EXTRACT(HOUR FROM DATETIME(2008, 12, 25, 15, 30, 00)) as hour;

/*------------------*
 | hour             |
 +------------------+
 | 15               |
 *------------------*/

In the following example, EXTRACT returns values corresponding to different time parts from a column of datetimes.

WITH Datetimes AS (
  SELECT DATETIME '2005-01-03 12:34:56' AS datetime UNION ALL
  SELECT DATETIME '2007-12-31' UNION ALL
  SELECT DATETIME '2009-01-01' UNION ALL
  SELECT DATETIME '2009-12-31' UNION ALL
  SELECT DATETIME '2017-01-02' UNION ALL
  SELECT DATETIME '2017-05-26'
)
SELECT
  datetime,
  EXTRACT(ISOYEAR FROM datetime) AS isoyear,
  EXTRACT(ISOWEEK FROM datetime) AS isoweek,
  EXTRACT(YEAR FROM datetime) AS year,
  EXTRACT(WEEK FROM datetime) AS week
FROM Datetimes
ORDER BY datetime;

/*---------------------+---------+---------+------+------*
 | datetime            | isoyear | isoweek | year | week |
 +---------------------+---------+---------+------+------+
 | 2005-01-03T12:34:56 | 2005    | 1       | 2005 | 1    |
 | 2007-12-31T00:00:00 | 2008    | 1       | 2007 | 52   |
 | 2009-01-01T00:00:00 | 2009    | 1       | 2009 | 0    |
 | 2009-12-31T00:00:00 | 2009    | 53      | 2009 | 52   |
 | 2017-01-02T00:00:00 | 2017    | 1       | 2017 | 1    |
 | 2017-05-26T00:00:00 | 2017    | 21      | 2017 | 21   |
 *---------------------+---------+---------+------+------*/

In the following example, datetime_expression falls on a Sunday. EXTRACT calculates the first column using weeks that begin on Sunday, and it calculates the second column using weeks that begin on Monday.

WITH table AS (SELECT DATETIME(TIMESTAMP "2017-11-05 00:00:00+00", "UTC") AS datetime)
SELECT
  datetime,
  EXTRACT(WEEK(SUNDAY) FROM datetime) AS week_sunday,
  EXTRACT(WEEK(MONDAY) FROM datetime) AS week_monday
FROM table;

/*---------------------+-------------+---------------*
 | datetime            | week_sunday | week_monday   |
 +---------------------+-------------+---------------+
 | 2017-11-05T00:00:00 | 45          | 44            |
 *---------------------+-------------+---------------*/

FORMAT_DATETIME

FORMAT_DATETIME(format_string, datetime_expression)

Description

Formats a DATETIME object according to the specified format_string. See Supported Format Elements For DATETIME for a list of format elements that this function supports.

Return Data Type

STRING

Examples

SELECT
  FORMAT_DATETIME("%c", DATETIME "2008-12-25 15:30:00")
  AS formatted;

/*--------------------------*
 | formatted                |
 +--------------------------+
 | Thu Dec 25 15:30:00 2008 |
 *--------------------------*/
SELECT
  FORMAT_DATETIME("%b-%d-%Y", DATETIME "2008-12-25 15:30:00")
  AS formatted;

/*-------------*
 | formatted   |
 +-------------+
 | Dec-25-2008 |
 *-------------*/
SELECT
  FORMAT_DATETIME("%b %Y", DATETIME "2008-12-25 15:30:00")
  AS formatted;

/*-------------*
 | formatted   |
 +-------------+
 | Dec 2008    |
 *-------------*/

LAST_DAY

LAST_DAY(datetime_expression[, date_part])

Description

Returns the last day from a datetime expression that contains the date. This is commonly used to return the last day of the month.

You can optionally specify the date part for which the last day is returned. If this parameter is not used, the default value is MONTH. LAST_DAY supports the following values for date_part:

  • YEAR
  • QUARTER
  • MONTH
  • WEEK. Equivalent to 7 DAYs.
  • WEEK(<WEEKDAY>). <WEEKDAY> represents the starting day of the week. Valid values are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
  • ISOWEEK. Uses ISO 8601 week boundaries. ISO weeks begin on Monday.
  • ISOYEAR. Uses the ISO 8601 week-numbering year boundary. The ISO year boundary is the Monday of the first week whose Thursday belongs to the corresponding Gregorian calendar year.

Return Data Type

DATE

Example

These both return the last day of the month:

SELECT LAST_DAY(DATETIME '2008-11-25', MONTH) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-30 |
 *------------*/
SELECT LAST_DAY(DATETIME '2008-11-25') AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-30 |
 *------------*/

This returns the last day of the year:

SELECT LAST_DAY(DATETIME '2008-11-25 15:30:00', YEAR) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-12-31 |
 *------------*/

This returns the last day of the week for a week that starts on a Sunday:

SELECT LAST_DAY(DATETIME '2008-11-10 15:30:00', WEEK(SUNDAY)) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-15 |
 *------------*/

This returns the last day of the week for a week that starts on a Monday:

SELECT LAST_DAY(DATETIME '2008-11-10 15:30:00', WEEK(MONDAY)) AS last_day

/*------------*
 | last_day   |
 +------------+
 | 2008-11-16 |
 *------------*/

PARSE_DATETIME

PARSE_DATETIME(format_string, datetime_string)

Description

Converts a string representation of a datetime to a DATETIME object.

format_string contains the format elements that define how datetime_string is formatted. Each element in datetime_string must have a corresponding element in format_string. The location of each element in format_string must match the location of each element in datetime_string.

-- This works because elements on both sides match.
SELECT PARSE_DATETIME("%a %b %e %I:%M:%S %Y", "Thu Dec 25 07:30:00 2008");

-- This produces an error because the year element is in different locations.
SELECT PARSE_DATETIME("%a %b %e %Y %I:%M:%S", "Thu Dec 25 07:30:00 2008");

-- This produces an error because one of the year elements is missing.
SELECT PARSE_DATETIME("%a %b %e %I:%M:%S", "Thu Dec 25 07:30:00 2008");

-- This works because %c can find all matching elements in datetime_string.
SELECT PARSE_DATETIME("%c", "Thu Dec 25 07:30:00 2008");

The format string fully supports most format elements, except for %P.

PARSE_DATETIME parses string according to the following rules:

  • Unspecified fields. Any unspecified field is initialized from 1970-01-01 00:00:00.0. For example, if the year is unspecified then it defaults to 1970.
  • Case insensitivity. Names, such as Monday and February, are case insensitive.
  • Whitespace. One or more consecutive white spaces in the format string matches zero or more consecutive white spaces in the DATETIME string. Leading and trailing white spaces in the DATETIME string are always allowed, even if they are not in the format string.
  • Format precedence. When two or more format elements have overlapping information, the last one generally overrides any earlier ones, with some exceptions. For example, both %F and %Y affect the year, so the earlier element overrides the later. See the descriptions of %s, %C, and %y in Supported Format Elements For DATETIME.
  • Format divergence. %p can be used with am, AM, pm, and PM.

Return Data Type

DATETIME

Examples

The following examples parse a STRING literal as a DATETIME.

SELECT PARSE_DATETIME('%Y-%m-%d %H:%M:%S', '1998-10-18 13:45:55') AS datetime;

/*---------------------*
 | datetime            |
 +---------------------+
 | 1998-10-18T13:45:55 |
 *---------------------*/
SELECT PARSE_DATETIME('%m/%d/%Y %I:%M:%S %p', '8/30/2018 2:23:38 pm') AS datetime;

/*---------------------*
 | datetime            |
 +---------------------+
 | 2018-08-30T14:23:38 |
 *---------------------*/

The following example parses a STRING literal containing a date in a natural language format as a DATETIME.

SELECT PARSE_DATETIME('%A, %B %e, %Y','Wednesday, December 19, 2018')
  AS datetime;

/*---------------------*
 | datetime            |
 +---------------------+
 | 2018-12-19T00:00:00 |
 *---------------------*/

Debugging functions

GoogleSQL for BigQuery supports the following debugging functions.

Function list

Name Summary
ERROR Produces an error with a custom error message.

ERROR

ERROR(error_message)

Description

Returns an error.

Definitions

  • error_message: A STRING value that represents the error message to produce. Any whitespace characters beyond a single space are trimmed from the results.

Details

ERROR is treated like any other expression that may result in an error: there is no special guarantee of evaluation order.

Return Data Type

GoogleSQL infers the return type in context.

Examples

In the following example, the query returns an error message if the value of the row does not match one of two defined values.

SELECT
  CASE
    WHEN value = 'foo' THEN 'Value is foo.'
    WHEN value = 'bar' THEN 'Value is bar.'
    ELSE ERROR(CONCAT('Found unexpected value: ', value))
  END AS new_value
FROM (
  SELECT 'foo' AS value UNION ALL
  SELECT 'bar' AS value UNION ALL
  SELECT 'baz' AS value);

-- Found unexpected value: baz

In the following example, GoogleSQL may evaluate the ERROR function before or after the x > 0 condition, because GoogleSQL generally provides no ordering guarantees between WHERE clause conditions and there are no special guarantees for the ERROR function.

SELECT *
FROM (SELECT -1 AS x)
WHERE x > 0 AND ERROR('Example error');

In the next example, the WHERE clause evaluates an IF condition, which ensures that GoogleSQL only evaluates the ERROR function if the condition fails.

SELECT *
FROM (SELECT -1 AS x)
WHERE IF(x > 0, true, ERROR(FORMAT('Error: x must be positive but is %t', x)));

-- Error: x must be positive but is -1

Differentially private aggregate functions

GoogleSQL for BigQuery supports differentially private aggregate functions. For an explanation of how aggregate functions work, see Aggregate function calls.

You can only use differentially private aggregate functions with differentially private queries in a differential privacy clause.

Function list

Name Summary
AVG DIFFERENTIAL_PRIVACY-supported AVG.

Gets the differentially-private average of non-NULL, non-NaN values in a query with a DIFFERENTIAL_PRIVACY clause.
COUNT DIFFERENTIAL_PRIVACY-supported COUNT.

Signature 1: Gets the differentially-private count of rows in a query with a DIFFERENTIAL_PRIVACY clause.

Signature 2: Gets the differentially-private count of rows with a non-NULL expression in a query with a DIFFERENTIAL_PRIVACY clause.
PERCENTILE_CONT DIFFERENTIAL_PRIVACY-supported PERCENTILE_CONT.

Computes a differentially-private percentile across privacy unit columns in a query with a DIFFERENTIAL_PRIVACY clause.
SUM DIFFERENTIAL_PRIVACY-supported SUM.

Gets the differentially-private sum of non-NULL, non-NaN values in a query with a DIFFERENTIAL_PRIVACY clause.

AVG (DIFFERENTIAL_PRIVACY)

WITH DIFFERENTIAL_PRIVACY ...
  AVG(
    expression,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the average of non-NULL, non-NaN values in the expression. This function first computes the average per privacy unit column, and then computes the final result by averaging these averages.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support the following arguments:

  • expression: The input expression. This can be any numeric input type, such as INT64.
  • contribution_bounds_per_group: The contribution bounds named argument. Perform clamping per each group separately before performing intermediate grouping on the privacy unit column.

Return type

FLOAT64

Examples

The following differentially private query gets the average number of each item requested per professor. Smaller aggregations might not be included. This query references a table called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    AVG(quantity, contribution_bounds_per_group => (0,100)) average_quantity
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | pencil   | 38.5038356810269 |
 | pen      | 13.4725028762032 |
 *----------+------------------*/
-- Without noise, using the epsilon parameter.
-- (this un-noised version is for demonstration only)
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=1e20, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    AVG(quantity) average_quantity
FROM professors
GROUP BY item;

-- These results will not change when you run the query.
/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | scissors | 8                |
 | pencil   | 40               |
 | pen      | 18.5             |
 *----------+------------------*/

COUNT (DIFFERENTIAL_PRIVACY)

  • Signature 1: Returns the number of rows in a differentially private FROM clause.
  • Signature 2: Returns the number of non-NULL values in an expression.

Signature 1

WITH DIFFERENTIAL_PRIVACY ...
  COUNT(
    *,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the number of rows in the differentially private FROM clause. The final result is an aggregation across a privacy unit column.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support the following argument:

  • contribution_bounds_per_group: The contribution bounds named argument. Perform clamping per each group separately before performing intermediate grouping on the privacy unit column.

Return type

INT64

Examples

The following differentially private query counts the number of requests for each item. This query references a table called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    COUNT(*, contribution_bounds_per_group=>(0, 100)) times_requested
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
/*----------+-----------------*
 | item     | times_requested |
 +----------+-----------------+
 | pencil   | 5               |
 | pen      | 2               |
 *----------+-----------------*/
-- Without noise, using the epsilon parameter.
-- (this un-noised version is for demonstration only)
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=1e20, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    COUNT(*, contribution_bounds_per_group=>(0, 100)) times_requested
FROM professors
GROUP BY item;

-- These results will not change when you run the query.
/*----------+-----------------*
 | item     | times_requested |
 +----------+-----------------+
 | scissors | 1               |
 | pencil   | 4               |
 | pen      | 3               |
 *----------+-----------------*/

Signature 2

WITH DIFFERENTIAL_PRIVACY ...
  COUNT(
    expression,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the number of non-NULL expression values. The final result is an aggregation across a privacy unit column.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support these arguments:

  • expression: The input expression. This expression can be any numeric input type, such as INT64.
  • contribution_bounds_per_group: The contribution bounds named argument. Perform clamping per each group separately before performing intermediate grouping on the privacy unit column.

Return type

INT64

Examples

The following differentially private query counts the number of requests made for each type of item. This query references a table called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    COUNT(item, contribution_bounds_per_group => (0,100)) times_requested
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
/*----------+-----------------*
 | item     | times_requested |
 +----------+-----------------+
 | pencil   | 5               |
 | pen      | 2               |
 *----------+-----------------*/
-- Without noise, using the epsilon parameter.
-- (this un-noised version is for demonstration only)
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=1e20, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    COUNT(item, contribution_bounds_per_group => (0,100)) times_requested
FROM professors
GROUP BY item;

-- These results will not change when you run the query.
/*----------+-----------------*
 | item     | times_requested |
 +----------+-----------------+
 | scissors | 1               |
 | pencil   | 4               |
 | pen      | 3               |
 *----------+-----------------*/

PERCENTILE_CONT (DIFFERENTIAL_PRIVACY)

WITH DIFFERENTIAL_PRIVACY ...
  PERCENTILE_CONT(
    expression,
    percentile,
    contribution_bounds_per_row => (lower_bound, upper_bound)
  )

Description

Takes an expression and computes a percentile for it. The final result is an aggregation across privacy unit columns.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support these arguments:

  • expression: The input expression. This can be most numeric input types, such as INT64. NULL values are always ignored.
  • percentile: The percentile to compute. The percentile must be a literal in the range [0, 1].
  • contribution_bounds_per_row: The contribution bounds named argument. Perform clamping per each row separately before performing intermediate grouping on the privacy unit column.

NUMERIC and BIGNUMERIC arguments are not allowed. If you need them, cast them as the FLOAT64 data type first.

Return type

FLOAT64

Examples

The following differentially private query gets the percentile of items requested. Smaller aggregations might not be included. This query references a view called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    PERCENTILE_CONT(quantity, 0.5, contribution_bounds_per_row => (0,100)) percentile_requested
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
 /*----------+----------------------*
  | item     | percentile_requested |
  +----------+----------------------+
  | pencil   | 72.00011444091797    |
  | scissors | 8.000175476074219    |
  | pen      | 23.001075744628906   |
  *----------+----------------------*/

SUM (DIFFERENTIAL_PRIVACY)

WITH DIFFERENTIAL_PRIVACY ...
  SUM(
    expression,
    [contribution_bounds_per_group => (lower_bound, upper_bound)]
  )

Description

Returns the sum of non-NULL, non-NaN values in the expression. The final result is an aggregation across privacy unit columns.

This function must be used with the DIFFERENTIAL_PRIVACY clause and can support these arguments:

  • expression: The input expression. This can be any numeric input type, such as INT64. NULL values are always ignored.
  • contribution_bounds_per_group: The contribution bounds named argument. Perform clamping per each group separately before performing intermediate grouping on the privacy unit column.

Return type

One of the following supertypes:

  • INT64
  • FLOAT64

Examples

The following differentially private query gets the sum of items requested. Smaller aggregations might not be included. This query references a view called professors.

-- With noise, using the epsilon parameter.
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=10, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    SUM(quantity, contribution_bounds_per_group => (0,100)) quantity
FROM professors
GROUP BY item;

-- These results will change each time you run the query.
-- Smaller aggregations might be removed.
/*----------+-----------*
 | item     | quantity  |
 +----------+-----------+
 | pencil   | 143       |
 | pen      | 59        |
 *----------+-----------*/
-- Without noise, using the epsilon parameter.
-- (this un-noised version is for demonstration only)
SELECT
  WITH DIFFERENTIAL_PRIVACY
    OPTIONS(epsilon=1e20, delta=.01, max_groups_contributed=1, privacy_unit_column=id)
    item,
    SUM(quantity) quantity
FROM professors
GROUP BY item;

-- These results will not change when you run the query.
/*----------+----------*
 | item     | quantity |
 +----------+----------+
 | scissors | 8        |
 | pencil   | 144      |
 | pen      | 58       |
 *----------+----------*/

Clamp values in a differentially private aggregate function

In differentially private queries, aggregation clamping is used to limit the contribution of outliers. You can clamp explicitly or implicitly as follows:

Implicitly clamp values

If you don't include the contribution bounds named argument with the DIFFERENTIAL_PRIVACY clause, clamping is implicit, which means bounds are derived from the data itself in a differentially private way.

Implicit bounding works best when computed using large datasets. For more information, see Implicit bounding limitations for small datasets.

Details

In differentially private aggregate functions, explicit clamping is optional. If you don't include this clause, clamping is implicit, which means bounds are derived from the data itself in a differentially private way. The process is somewhat random, so aggregations with identical ranges can have different bounds.

Implicit bounds are determined for each aggregation. So if some aggregations have a wide range of values, and others have a narrow range of values, implicit bounding can identify different bounds for different aggregations as appropriate. Implicit bounds might be an advantage or a disadvantage depending on your use case. Different bounds for different aggregations can result in lower error. Different bounds also means that different aggregations have different levels of uncertainty, which might not be directly comparable. Explicit bounds, on the other hand, apply uniformly to all aggregations and should be derived from public information.

When clamping is implicit, part of the total epsilon is spent picking bounds. This leaves less epsilon for aggregations, so these aggregations are noisier.

Example

The following anonymized query clamps each aggregate contribution for each differential privacy ID and within a derived range from the data itself. As long as all or most values fall within this range, your results will be accurate. This query references a view called view_on_professors.

--Without noise (this un-noised version is for demonstration only)
SELECT WITH DIFFERENTIAL_PRIVACY
  OPTIONS (
    epsilon = 1e20,
    delta = .01,
    privacy_unit_column=id
  )
  item,
  AVG(quantity) average_quantity
FROM view_on_professors
GROUP BY item;

/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | scissors | 8                |
 | pencil   | 72               |
 | pen      | 18.5             |
 *----------+------------------*/

Explicitly clamp values

contribution_bounds_per_group => (lower_bound,upper_bound)
contribution_bounds_per_row => (lower_bound,upper_bound)

Use the contribution bounds named argument to explicitly clamp values per group or per row between a lower and upper bound in a DIFFERENTIAL_PRIVACY clause.

Input values:

  • contribution_bounds_per_row: Contributions per privacy unit are clamped on a per-row (per-record) basis. This means the following:
    • Upper and lower bounds are applied to column values in individual rows produced by the input subquery independently.
    • The maximum possible contribution per privacy unit (and per grouping set) is the product of the per-row contribution limit and max_groups_contributed differential privacy parameter.
  • contribution_bounds_per_group: Contributions per privacy unit are clamped on a unique set of entity-specified GROUP BY keys. The upper and lower bounds are applied to values per group after the values are aggregated per privacy unit.
  • lower_bound: Numeric literal that represents the smallest value to include in an aggregation.
  • upper_bound: Numeric literal that represents the largest value to include in an aggregation.

NUMERIC and BIGNUMERIC arguments are not allowed.

Details

In differentially private aggregate functions, clamping explicitly clamps the total contribution from each privacy unit column to within a specified range.

Explicit bounds are uniformly applied to all aggregations. So even if some aggregations have a wide range of values, and others have a narrow range of values, the same bounds are applied to all of them. On the other hand, when implicit bounds are inferred from the data, the bounds applied to each aggregation can be different.

Explicit bounds should be chosen to reflect public information. For example, bounding ages between 0 and 100 reflects public information because the age of most people generally falls within this range.

Examples

The following anonymized query clamps each aggregate contribution for each differential privacy ID and within a specified range (0 and 100). As long as all or most values fall within this range, your results will be accurate. This query references a view called view_on_professors.

--Without noise (this un-noised version is for demonstration only)
SELECT WITH DIFFERENTIAL_PRIVACY
  OPTIONS (
    epsilon = 1e20,
    delta = .01,
    privacy_unit_column=id
  )
  item,
  AVG(quantity, contribution_bounds_per_group=>(0,100)) AS average_quantity
FROM view_on_professors
GROUP BY item;

/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | scissors | 8                |
 | pencil   | 40               |
 | pen      | 18.5             |
 *----------+------------------*/

Notice what happens when most or all values fall outside of the clamped range. To get accurate results, ensure that the difference between the upper and lower bound is as small as possible, and that most inputs are between the upper and lower bound.

--Without noise (this un-noised version is for demonstration only)
SELECT WITH DIFFERENTIAL_PRIVACY
  OPTIONS (
    epsilon = 1e20,
    delta = .01,
    privacy_unit_column=id
  )
  item,
  AVG(quantity, contribution_bounds_per_group=>(50,100)) AS average_quantity
FROM view_on_professors
GROUP BY item;

/*----------+------------------*
 | item     | average_quantity |
 +----------+------------------+
 | scissors | 54               |
 | pencil   | 58               |
 | pen      | 51               |
 *----------+------------------*/

DLP encryption functions

GoogleSQL for BigQuery supports the following DLP functions that allow interoperable encryption and decryption between BigQuery and Cloud Data Loss Prevention (Cloud DLP), using AES-SIV. To use DLP functions, you need a new cryptographic key and then use that key to get a wrapped key.

Function list

Name Summary
DLP_DETERMINISTIC_ENCRYPT Encrypts data with a DLP compatible algorithm.
DLP_DETERMINISTIC_DECRYPT Decrypts DLP-encrypted data.
DLP_KEY_CHAIN Gets a data encryption key that is wrapped by Cloud Key Management Service.

DLP_DETERMINISTIC_ENCRYPT

DLP_DETERMINISTIC_ENCRYPT(key, plaintext, surrogate)
DLP_DETERMINISTIC_ENCRYPT(key, plaintext, surrogate, context)

Description

This function derives a data encryption key from key and context, and then encrypts plaintext. You can use surrogate to prepend the encryption result. To use DLP functions, you need a new cryptographic key and then use that key to get a wrapped key.

Definitions

  • key: A serialized BYTES value that is returned by DLP_KEY_CHAIN. key must be set to ENABLED in Cloud KMS. For information about how to generate a wrapped key, see gcloud kms encrypt.
  • plaintext: The STRING value to encrypt.
  • surrogate: A STRING value that you can prepend to output. If you don't want to use surrogate, pass an empty string (enclosed in "").
  • context: A user-provided STRING value that is used with a Cloud KMS key to derive a data encryption key. For more information, see CryptoDeterministicConfig:context.

Return data type

STRING

Examples

In the following query, the wrapped key is presented in a BYTES literal format:

SELECT
  DLP_DETERMINISTIC_ENCRYPT(
    DLP_KEY_CHAIN(
      'gcp-kms://projects/myproject/locations/us/keyRings/kms-test/cryptoKeys/test-KEK',
      b'\012\044\000\325\155\264\153\246\071\172\130\372\305\103\047\342\356\061\077\014\030\126\147\041\126\150\012\036\020\202\215\044\267\310\331\014\116\233\022\071\000\363\344\230\067\274\007\340\273\016\212\151\226\064\200\377\303\207\103\147\052\267\035\350\004\147\365\251\271\133\062\251\246\152\177\017\005\270\044\141\211\116\337\043\035\263\122\340\110\333\266\220\377\247\204\215\233'),
    'Plaintext',
    '',
    'aad') AS results;

/*--------------------------------------*
 | results                              |
 +--------------------------------------+
 | AWDeSznl9C7+NzTaCgiqiEAZ8Y55fZSuvCQ= |
 *--------------------------------------*/

In the following query, the wrapped key is presented in the base64 format:

DECLARE DLP_KEY_VALUE BYTES;

SET DLP_KEY_VALUE =
  FROM_BASE64(
    'CiQA1W20a6Y5elj6xUMn4u4xPwwYVmchVmgKHhCCjSS3yNkMTpsSOQDz5Jg3vAfguw6KaZY0gP/Dh0NnKrcd6ARn9am5WzKppmp/DwW4JGGJTt8jHbNS4EjbtpD/p4SNmw==');

SELECT
  DLP_DETERMINISTIC_ENCRYPT(
    DLP_KEY_CHAIN(
      'gcp-kms://projects/myproject/locations/us/keyRings/kms-test/cryptoKeys/test-Kek',
      DLP_KEY_VALUE),
    'Plaintext',
    'your_surrogate',
    'aad') AS results;

/*---------------------------------------------------------*
 | results                                                 |
 +---------------------------------------------------------+
 | your_surrogate(36):AWDeSznl9C7+NzTaCgiqiEAZ8Y55fZSuvCQ= |
 *---------------------------------------------------------*/

DLP_DETERMINISTIC_DECRYPT

DLP_DETERMINISTIC_DECRYPT(key, ciphertext, surrogate)
DLP_DETERMINISTIC_DECRYPT(key, ciphertext, surrogate, context)

Description

This function decrypts ciphertext using an encryption key derived from key and context. You can use surrogate to prepend the decryption result. To use DLP functions, you need a new cryptographic key and then use that key to get a wrapped key.

Definitions

  • key: A serialized BYTES value returned by DLP_KEY_CHAIN. key must be set to ENABLED in Cloud KMS. For information about how to generate a wrapped key, see gcloud kms encrypt.
  • ciphertext: The STRING value to decrypt.
  • surrogate: A STRING value that you can prepend to output. If you don't want to use surrogate, pass an empty string (enclosed in "").
  • context: A STRING value that is used with a Cloud KMS key to derive a data encryption key. For more information, see CryptoDeterministicConfig:context.

Return data type

STRING

Examples

In the following query, the wrapped key is presented in a BYTES literal format:

SELECT
  DLP_DETERMINISTIC_DECRYPT(
    DLP_KEY_CHAIN(
      'gcp-kms://projects/myproject/locations/us/keyRings/kms-test/cryptoKeys/test-Kek',
      b'\012\044\000\325\155\264\153\246\071\172\130\372\305\103\047\342\356\061\077\014\030\126\147\041\126\150\012\036\020\202\215\044\267\310\331\014\116\233\022\071\000\363\344\230\067\274\007\340\273\016\212\151\226\064\200\377\303\207\103\147\052\267\035\350\004\147\365\251\271\133\062\251\246\152\177\017\005\270\044\141\211\116\337\043\035\263\122\340\110\333\266\220\377\247\204\215\233'),
    'AWDeSznl9C7+NzTaCgiqiEAZ8Y55fZSuvCQ=',
    '',
    'aad') AS results;

/*--------------------------------------*
 | results                              |
 +--------------------------------------+
 | Plaintext                            |
 *--------------------------------------*/

In the following query, the wrapped key is presented in the base64 format:

DECLARE DLP_KEY_VALUE BYTES;

SET DLP_KEY_VALUE =
  FROM_BASE64(
    'CiQA1W20a6Y5elj6xUMn4u4xPwwYVmchVmgKHhCCjSS3yNkMTpsSOQDz5Jg3vAfguw6KaZY0gP/Dh0NnKrcd6ARn9am5WzKppmp/DwW4JGGJTt8jHbNS4EjbtpD/p4SNmw==');

SELECT
  DLP_DETERMINISTIC_DECRYPT(
    DLP_KEY_CHAIN(
      'gcp-kms://projects/myproject/locations/us/keyRings/kms-test/cryptoKeys/test-Kek',
      DLP_KEY_VALUE),
    'your_surrogate(36):AWDeSznl9C7+NzTaCgiqiEAZ8Y55fZSuvCQ=',
    'your_surrogate',
    'aad') AS results;

/*--------------------------------------*
 | results                              |
 +--------------------------------------+
 | Plaintext                            |
 *--------------------------------------*/

DLP_KEY_CHAIN

DLP_KEY_CHAIN(kms_resource_name, wrapped_key)

Description

You can use this function instead of the key argument for DLP deterministic encryption functions. This function lets you use the AES-SIV encryption functions without including plaintext keys in a query. To use DLP functions, you need a new cryptographic key and then use that key to get a wrapped key.

Definitions

  • kms_resource_name: A STRING literal that contains the resource path to the Cloud KMS key. kms_resource_name cannot be NULL and must reside in the same Cloud region where this function is executed. This argument is used to derive the data encryption key in the DLP_DETERMINISTIC_DECRYPT and DLP_DETERMINISTIC_ENCRYPT functions. A Cloud KMS key looks like this:

    gcp-kms://projects/my-project/locations/us/keyRings/my-key-ring/cryptoKeys/my-crypto-key
    
  • wrapped_key: A BYTES literal that represents a secret text chosen by the user. This secret text can be 16, 24, or 32 bytes. For information about how to generate a wrapped key, see gcloud kms encrypt.

Return data type

STRUCT

Examples

In the following query, the wrapped key is presented in a BYTES literal format:

SELECT
  DLP_DETERMINISTIC_ENCRYPT(
    DLP_KEY_CHAIN(
      'gcp-kms://projects/myproject/locations/us/keyRings/kms-test/cryptoKeys/test-Kek',
      b'\012\044\000\325\155\264\153\246\071\172\130\372\305\103\047\342\356\061\077\014\030\126\147\041\126\150\012\036\020\202\215\044\267\310\331\014\116\233\022\071\000\363\344\230\067\274\007\340\273\016\212\151\226\064\200\377\303\207\103\147\052\267\035\350\004\147\365\251\271\133\062\251\246\152\177\017\005\270\044\141\211\116\337\043\035\263\122\340\110\333\266\220\377\247\204\215\233'),
    'Plaintext',
    '',
    'aad') AS results;

/*--------------------------------------*
 | results                              |
 +--------------------------------------+
 | AWDeSznl9C7+NzTaCgiqiEAZ8Y55fZSuvCQ= |
 *--------------------------------------*/

In the following query, the wrapped key is presented in the base64 format:

DECLARE DLP_KEY_VALUE BYTES;

SET DLP_KEY_VALUE =
  FROM_BASE64(
    'CiQA1W20a6Y5elj6xUMn4u4xPwwYVmchVmgKHhCCjSS3yNkMTpsSOQDz5Jg3vAfguw6KaZY0gP/Dh0NnKrcd6ARn9am5WzKppmp/DwW4JGGJTt8jHbNS4EjbtpD/p4SNmw==');

SELECT
  DLP_DETERMINISTIC_ENCRYPT(
    DLP_KEY_CHAIN(
      'gcp-kms://projects/myproject/locations/us/keyRings/kms-test/cryptoKeys/test-Kek',
      DLP_KEY_VALUE),
    'Plaintext',
    '',
    'aad') AS results;

/*--------------------------------------*
 | results                              |
 +--------------------------------------+
 | AWDeSznl9C7+NzTaCgiqiEAZ8Y55fZSuvCQ= |
 *--------------------------------------*/

Geography functions

GoogleSQL for BigQuery supports geography functions. Geography functions operate on or generate GoogleSQL GEOGRAPHY values. The signature of most geography functions starts with ST_. GoogleSQL for BigQuery supports the following functions that can be used to analyze geographical data, determine spatial relationships between geographical features, and construct or manipulate GEOGRAPHYs.

All GoogleSQL geography functions return NULL if any input argument is NULL.

Categories

The geography functions are grouped into the following categories based on their behavior:

Category Functions Description
Constructors ST_GEOGPOINT
ST_MAKELINE
ST_MAKEPOLYGON
ST_MAKEPOLYGONORIENTED
Functions that build new geography values from coordinates or existing geographies.
Parsers ST_GEOGFROM
ST_GEOGFROMGEOJSON
ST_GEOGFROMTEXT
ST_GEOGFROMWKB
ST_GEOGPOINTFROMGEOHASH
Functions that create geographies from an external format such as WKT and GeoJSON.
Formatters ST_ASBINARY
ST_ASGEOJSON
ST_ASTEXT
ST_GEOHASH
Functions that export geographies to an external format such as WKT.
Transformations ST_BOUNDARY
ST_BUFFER
ST_BUFFERWITHTOLERANCE
ST_CENTROID
ST_CENTROID_AGG (Aggregate)
ST_CLOSESTPOINT
ST_CONVEXHULL
ST_DIFFERENCE
ST_EXTERIORRING
ST_INTERIORRINGS
ST_INTERSECTION
ST_LINEINTERPOLATEPOINT
ST_LINESUBSTRING
ST_SIMPLIFY
ST_SNAPTOGRID
ST_UNION
ST_UNION_AGG (Aggregate)
Functions that generate a new geography based on input.
Accessors ST_DIMENSION
ST_DUMP
ST_ENDPOINT
ST_GEOMETRYTYPE
ST_ISCLOSED
ST_ISCOLLECTION
ST_ISEMPTY
ST_ISRING
ST_NPOINTS
ST_NUMGEOMETRIES
ST_NUMPOINTS
ST_POINTN
ST_STARTPOINT
ST_X
ST_Y
Functions that provide access to properties of a geography without side-effects.
Predicates ST_CONTAINS
ST_COVEREDBY
ST_COVERS
ST_DISJOINT
ST_DWITHIN
ST_EQUALS
ST_INTERSECTS
ST_INTERSECTSBOX
ST_TOUCHES
ST_WITHIN
Functions that return TRUE or FALSE for some spatial relationship between two geographies or some property of a geography. These functions are commonly used in filter clauses.
Measures ST_ANGLE
ST_AREA
ST_AZIMUTH
ST_BOUNDINGBOX
ST_DISTANCE
ST_EXTENT (Aggregate)
ST_HAUSDORFFDISTANCE
ST_LINELOCATEPOINT
ST_LENGTH
ST_MAXDISTANCE
ST_PERIMETER
Functions that compute measurements of one or more geographies.
Clustering ST_CLUSTERDBSCAN Functions that perform clustering on geographies.
S2 functions S2_CELLIDFROMPOINT
S2_COVERINGCELLIDS
Functions for working with S2 cell coverings of GEOGRAPHY.

Function list

Name Summary
S2_CELLIDFROMPOINT Gets the S2 cell ID covering a point GEOGRAPHY value.
S2_COVERINGCELLIDS Gets an array of S2 cell IDs that cover a GEOGRAPHY value.
ST_ANGLE Takes three point GEOGRAPHY values, which represent two intersecting lines, and returns the angle between these lines.
ST_AREA Gets the area covered by the polygons in a GEOGRAPHY value.
ST_ASBINARY Converts a GEOGRAPHY value to a BYTES WKB geography value.
ST_ASGEOJSON Converts a GEOGRAPHY value to a STRING GeoJSON geography value.
ST_ASTEXT Converts a GEOGRAPHY value to a STRING WKT geography value.
ST_AZIMUTH Gets the azimuth of a line segment formed by two point GEOGRAPHY values.
ST_BOUNDARY Gets the union of component boundaries in a GEOGRAPHY value.
ST_BOUNDINGBOX Gets the bounding box for a GEOGRAPHY value.
ST_BUFFER Gets the buffer around a GEOGRAPHY value, using a specific number of segments.
ST_BUFFERWITHTOLERANCE Gets the buffer around a GEOGRAPHY value, using tolerance.
ST_CENTROID Gets the centroid of a GEOGRAPHY value.
ST_CENTROID_AGG Gets the centroid of a set of GEOGRAPHY values.
ST_CLOSESTPOINT Gets the point on a GEOGRAPHY value which is closest to any point in a second GEOGRAPHY value.
ST_CLUSTERDBSCAN Performs DBSCAN clustering on a group of GEOGRAPHY values and produces a 0-based cluster number for this row.
ST_CONTAINS Checks if one GEOGRAPHY value contains another GEOGRAPHY value.
ST_CONVEXHULL Returns the convex hull for a GEOGRAPHY value.
ST_COVEREDBY Checks if all points of a GEOGRAPHY value are on the boundary or interior of another GEOGRAPHY value.
ST_COVERS Checks if all points of a GEOGRAPHY value are on the boundary or interior of another GEOGRAPHY value.
ST_DIFFERENCE Gets the point set difference between two GEOGRAPHY values.
ST_DIMENSION Gets the dimension of the highest-dimensional element in a GEOGRAPHY value.
ST_DISJOINT Checks if two GEOGRAPHY values are disjoint (do not intersect).
ST_DISTANCE Gets the shortest distance in meters between two GEOGRAPHY values.
ST_DUMP Returns an array of simple GEOGRAPHY components in a GEOGRAPHY value.
ST_DWITHIN Checks if any points in two GEOGRAPHY values are within a given distance.
ST_ENDPOINT Gets the last point of a linestring GEOGRAPHY value.
ST_EQUALS Checks if two GEOGRAPHY values represent the same GEOGRAPHY value.
ST_EXTENT Gets the bounding box for a group of GEOGRAPHY values.
ST_EXTERIORRING Returns a linestring GEOGRAPHY value that corresponds to the outermost ring of a polygon GEOGRAPHY value.
ST_GEOGFROM Converts a STRING or BYTES value into a GEOGRAPHY value.
ST_GEOGFROMGEOJSON Converts a STRING GeoJSON geometry value into a GEOGRAPHY value.
ST_GEOGFROMTEXT Converts a STRING WKT geometry value into a GEOGRAPHY value.
ST_GEOGFROMWKB Converts a BYTES or hexadecimal-text STRING WKT geometry value into a GEOGRAPHY value.
ST_GEOGPOINT Creates a point GEOGRAPHY value for a given longitude and latitude.
ST_GEOGPOINTFROMGEOHASH Gets a point GEOGRAPHY value that is in the middle of a bounding box defined in a STRING GeoHash value.
ST_GEOHASH Converts a point GEOGRAPHY value to a STRING GeoHash value.
ST_GEOMETRYTYPE Gets the Open Geospatial Consortium (OGC) geometry type for a GEOGRAPHY value.
ST_HAUSDORFFDISTANCE Gets the discrete Hausdorff distance between two geometries.
ST_INTERIORRINGS Gets the interior rings of a polygon GEOGRAPHY value.
ST_INTERSECTION Gets the point set intersection of two GEOGRAPHY values.
ST_INTERSECTS Checks if at least one point appears in two GEOGRAPHY values.
ST_INTERSECTSBOX Checks if a GEOGRAPHY value intersects a rectangle.
ST_ISCLOSED Checks if all components in a GEOGRAPHY value are closed.
ST_ISCOLLECTION Checks if the total number of points, linestrings, and polygons is greater than one in a GEOGRAPHY value.
ST_ISEMPTY Checks if a GEOGRAPHY value is empty.
ST_ISRING Checks if a GEOGRAPHY value is a closed, simple linestring.
ST_LENGTH Gets the total length of lines in a GEOGRAPHY value.
ST_LINEINTERPOLATEPOINT Gets a point at a specific fraction in a linestring GEOGRAPHY value.
ST_LINELOCATEPOINT Gets a section of a linestring GEOGRAPHY value between the start point and a point GEOGRAPHY value.
ST_LINESUBSTRING Gets a segment of a single linestring at a specific starting and ending fraction.
ST_MAKELINE Creates a linestring GEOGRAPHY value by concatenating the point and linestring vertices of GEOGRAPHY values.
ST_MAKEPOLYGON Constructs a polygon GEOGRAPHY value by combining a polygon shell with polygon holes.
ST_MAKEPOLYGONORIENTED Constructs a polygon GEOGRAPHY value, using an array of linestring GEOGRAPHY values. The vertex ordering of each linestring determines the orientation of each polygon ring.
ST_MAXDISTANCE Gets the longest distance between two non-empty GEOGRAPHY values.
ST_NPOINTS An alias of ST_NUMPOINTS.
ST_NUMGEOMETRIES Gets the number of geometries in a GEOGRAPHY value.
ST_NUMPOINTS Gets the number of vertices in the a GEOGRAPHY value.
ST_PERIMETER Gets the length of the boundary of the polygons in a GEOGRAPHY value.
ST_POINTN Gets the point at a specific index of a linestring GEOGRAPHY value.
ST_SIMPLIFY Converts a GEOGRAPHY value into a simplified GEOGRAPHY value, using tolerance.
ST_SNAPTOGRID Produces a GEOGRAPHY value, where each vertex has been snapped to a longitude/latitude grid.
ST_STARTPOINT Gets the first point of a linestring GEOGRAPHY value.
ST_TOUCHES Checks if two GEOGRAPHY values intersect and their interiors have no elements in common.
ST_UNION Gets the point set union of multiple GEOGRAPHY values.
ST_UNION_AGG Aggregates over GEOGRAPHY values and gets their point set union.
ST_WITHIN Checks if one GEOGRAPHY value contains another GEOGRAPHY value.
ST_X Gets the longitude from a point GEOGRAPHY value.
ST_Y Gets the latitude from a point GEOGRAPHY value.

S2_CELLIDFROMPOINT

S2_CELLIDFROMPOINT(point_geography[, level => cell_level])

Description

Returns the S2 cell ID covering a point GEOGRAPHY.

  • The optional INT64 parameter level specifies the S2 cell level for the returned cell. Naming this argument is optional.

This is advanced functionality for interoperability with systems utilizing the S2 Geometry Library.

Constraints

  • Returns the cell ID as a signed INT64 bit-equivalent to unsigned 64-bit integer representation.
  • Can return negative cell IDs.
  • Valid S2 cell levels are 0 to 30.
  • level defaults to 30 if not explicitly specified.
  • The function only supports a single point GEOGRAPHY. Use the SAFE prefix if the input can be multipoint, linestring, polygon, or an empty GEOGRAPHY.
  • To compute the covering of a complex GEOGRAPHY, use S2_COVERINGCELLIDS.

Return type

INT64

Example

WITH data AS (
  SELECT 1 AS id, ST_GEOGPOINT(-122, 47) AS geo
  UNION ALL
  -- empty geography is not supported
  SELECT 2 AS id, ST_GEOGFROMTEXT('POINT EMPTY') AS geo
  UNION ALL
  -- only points are supported
  SELECT 3 AS id, ST_GEOGFROMTEXT('LINESTRING(1 2, 3 4)') AS geo
)
SELECT id,
       SAFE.S2_CELLIDFROMPOINT(geo) cell30,
       SAFE.S2_CELLIDFROMPOINT(geo, level => 10) cell10
FROM data;

/*----+---------------------+---------------------*
 | id | cell30              | cell10              |
 +----+---------------------+---------------------+
 | 1  | 6093613931972369317 | 6093613287902019584 |
 | 2  | NULL                | NULL                |
 | 3  | NULL                | NULL                |
 *----+---------------------+---------------------*/

S2_COVERINGCELLIDS

S2_COVERINGCELLIDS(
    geography
    [, min_level => cell_level]
    [, max_level => cell_level]
    [, max_cells => max_cells]
    [, buffer => buffer])

Description

Returns an array of S2 cell IDs that cover the input GEOGRAPHY. The function returns at most max_cells cells. The optional arguments min_level and max_level specify minimum and maximum levels for returned S2 cells. The array size is limited by the optional max_cells argument. The optional buffer argument specifies a buffering factor in meters; the region being covered is expanded from the extent of the input geography by this amount.

This is advanced functionality for interoperability with systems utilizing the S2 Geometry Library.

Constraints

  • Returns the cell ID as a signed INT64 bit-equivalent to unsigned 64-bit integer representation.
  • Can return negative cell IDs.
  • Valid S2 cell levels are 0 to 30.
  • max_cells defaults to 8 if not explicitly specified.
  • buffer should be nonnegative. It defaults to 0.0 meters if not explicitly specified.

Return type

ARRAY<INT64>

Example

WITH data AS (
  SELECT 1 AS id, ST_GEOGPOINT(-122, 47) AS geo
  UNION ALL
  SELECT 2 AS id, ST_GEOGFROMTEXT('POINT EMPTY') AS geo
  UNION ALL
  SELECT 3 AS id, ST_GEOGFROMTEXT('LINESTRING(-122.12 47.67, -122.19 47.69)') AS geo
)
SELECT id, S2_COVERINGCELLIDS(geo, min_level => 12) cells
FROM data;

/*----+--------------------------------------------------------------------------------------*
 | id | cells                                                                                |
 +----+--------------------------------------------------------------------------------------+
 | 1  | [6093613931972369317]                                                                |
 | 2  | []                                                                                   |
 | 3  | [6093384954555662336, 6093390709811838976, 6093390735581642752, 6093390740145045504, |
 |    |  6093390791416217600, 6093390812891054080, 6093390817187069952, 6093496378892222464] |
 *----+--------------------------------------------------------------------------------------*/

ST_ANGLE

ST_ANGLE(point_geography_1, point_geography_2, point_geography_3)

Description

Takes three point GEOGRAPHY values, which represent two intersecting lines. Returns the angle between these lines. Point 2 and point 1 represent the first line and point 2 and point 3 represent the second line. The angle between these lines is in radians, in the range [0, 2pi). The angle is measured clockwise from the first line to the second line.

ST_ANGLE has the following edge cases:

  • If points 2 and 3 are the same, returns NULL.
  • If points 2 and 1 are the same, returns NULL.
  • If points 2 and 3 are exactly antipodal, returns NULL.
  • If points 2 and 1 are exactly antipodal, returns NULL.
  • If any of the input geographies are not single points or are the empty geography, then throws an error.

Return type

FLOAT64

Example

WITH geos AS (
  SELECT 1 id, ST_GEOGPOINT(1, 0) geo1, ST_GEOGPOINT(0, 0) geo2, ST_GEOGPOINT(0, 1) geo3 UNION ALL
  SELECT 2 id, ST_GEOGPOINT(0, 0), ST_GEOGPOINT(1, 0), ST_GEOGPOINT(0, 1) UNION ALL
  SELECT 3 id, ST_GEOGPOINT(1, 0), ST_GEOGPOINT(0, 0), ST_GEOGPOINT(1, 0) UNION ALL
  SELECT 4 id, ST_GEOGPOINT(1, 0) geo1, ST_GEOGPOINT(0, 0) geo2, ST_GEOGPOINT(0, 0) geo3 UNION ALL
  SELECT 5 id, ST_GEOGPOINT(0, 0), ST_GEOGPOINT(-30, 0), ST_GEOGPOINT(150, 0) UNION ALL
  SELECT 6 id, ST_GEOGPOINT(0, 0), NULL, NULL UNION ALL
  SELECT 7 id, NULL, ST_GEOGPOINT(0, 0), NULL UNION ALL
  SELECT 8 id, NULL, NULL, ST_GEOGPOINT(0, 0))
SELECT ST_ANGLE(geo1,geo2,geo3) AS angle FROM geos ORDER BY id;

/*---------------------*
 | angle               |
 +---------------------+
 | 4.71238898038469    |
 | 0.78547432161873854 |
 | 0                   |
 | NULL                |
 | NULL                |
 | NULL                |
 | NULL                |
 | NULL                |
 *---------------------*/

ST_AREA

ST_AREA(geography_expression[, use_spheroid])

Description

Returns the area in square meters covered by the polygons in the input GEOGRAPHY.

If geography_expression is a point or a line, returns zero. If geography_expression is a collection, returns the area of the polygons in the collection; if the collection does not contain polygons, returns zero.

The optional use_spheroid parameter determines how this function measures distance. If use_spheroid is FALSE, the function measures distance on the surface of a perfect sphere.

The use_spheroid parameter currently only supports the value FALSE. The default value of use_spheroid is FALSE.

Return type

FLOAT64

ST_ASBINARY

ST_ASBINARY(geography_expression)

Description

Returns the WKB representation of an input GEOGRAPHY.

See ST_GEOGFROMWKB to construct a GEOGRAPHY from WKB.

Return type

BYTES

ST_ASGEOJSON

ST_ASGEOJSON(geography_expression)

Description

Returns the RFC 7946 compliant GeoJSON representation of the input GEOGRAPHY.

A GoogleSQL GEOGRAPHY has spherical geodesic edges, whereas a GeoJSON Geometry object explicitly has planar edges. To convert between these two types of edges, GoogleSQL adds additional points to the line where necessary so that the resulting sequence of edges remains within 10 meters of the original edge.

See ST_GEOGFROMGEOJSON to construct a GEOGRAPHY from GeoJSON.

Return type

STRING

ST_ASTEXT

ST_ASTEXT(geography_expression)

Description

Returns the WKT representation of an input GEOGRAPHY.

See ST_GEOGFROMTEXT to construct a GEOGRAPHY from WKT.

Return type

STRING

ST_AZIMUTH

ST_AZIMUTH(point_geography_1, point_geography_2)

Description

Takes two point GEOGRAPHY values, and returns the azimuth of the line segment formed by points 1 and 2. The azimuth is the angle in radians measured between the line from point 1 facing true North to the line segment from point 1 to point 2.

The positive angle is measured clockwise on the surface of a sphere. For example, the azimuth for a line segment:

  • Pointing North is 0
  • Pointing East is PI/2
  • Pointing South is PI
  • Pointing West is 3PI/2

ST_AZIMUTH has the following edge cases:

  • If the two input points are the same, returns NULL.
  • If the two input points are exactly antipodal, returns NULL.
  • If either of the input geographies are not single points or are the empty geography, throws an error.

Return type

FLOAT64

Example

WITH geos AS (
  SELECT 1 id, ST_GEOGPOINT(1, 0) AS geo1, ST_GEOGPOINT(0, 0) AS geo2 UNION ALL
  SELECT 2, ST_GEOGPOINT(0, 0), ST_GEOGPOINT(1, 0) UNION ALL
  SELECT 3, ST_GEOGPOINT(0, 0), ST_GEOGPOINT(0, 1) UNION ALL
  -- identical
  SELECT 4, ST_GEOGPOINT(0, 0), ST_GEOGPOINT(0, 0) UNION ALL
  -- antipode
  SELECT 5, ST_GEOGPOINT(-30, 0), ST_GEOGPOINT(150, 0) UNION ALL
  -- nulls
  SELECT 6, ST_GEOGPOINT(0, 0), NULL UNION ALL
  SELECT 7, NULL, ST_GEOGPOINT(0, 0))
SELECT ST_AZIMUTH(geo1, geo2) AS azimuth FROM geos ORDER BY id;

/*--------------------*
 | azimuth            |
 +--------------------+
 | 4.71238898038469   |
 | 1.5707963267948966 |
 | 0                  |
 | NULL               |
 | NULL               |
 | NULL               |
 | NULL               |
 *--------------------*/

ST_BOUNDARY

ST_BOUNDARY(geography_expression)

Description

Returns a single GEOGRAPHY that contains the union of the boundaries of each component in the given input GEOGRAPHY.

The boundary of each component of a GEOGRAPHY is defined as follows:

  • The boundary of a point is empty.
  • The boundary of a linestring consists of the endpoints of the linestring.
  • The boundary of a polygon consists of the linestrings that form the polygon shell and each of the polygon's holes.

Return type

GEOGRAPHY

ST_BOUNDINGBOX

ST_BOUNDINGBOX(geography_expression)

Description

Returns a STRUCT that represents the bounding box for the specified geography. The bounding box is the minimal rectangle that encloses the geography. The edges of the rectangle follow constant lines of longitude and latitude.

Caveats:

  • Returns NULL if the input is NULL or an empty geography.
  • The bounding box might cross the antimeridian if this allows for a smaller rectangle. In this case, the bounding box has one of its longitudinal bounds outside of the [-180, 180] range, so that xmin is smaller than the eastmost value xmax.

Return type

STRUCT<xmin FLOAT64, ymin FLOAT64, xmax FLOAT64, ymax FLOAT64>.

Bounding box parts:

  • xmin: The westmost constant longitude line that bounds the rectangle.
  • xmax: The eastmost constant longitude line that bounds the rectangle.
  • ymin: The minimum constant latitude line that bounds the rectangle.
  • ymax: The maximum constant latitude line that bounds the rectangle.

Example

WITH data AS (
  SELECT 1 id, ST_GEOGFROMTEXT('POLYGON((-125 48, -124 46, -117 46, -117 49, -125 48))') g
  UNION ALL
  SELECT 2 id, ST_GEOGFROMTEXT('POLYGON((172 53, -130 55, -141 70, 172 53))') g
  UNION ALL
  SELECT 3 id, ST_GEOGFROMTEXT('POINT EMPTY') g
  UNION ALL
  SELECT 4 id, ST_GEOGFROMTEXT('POLYGON((172 53, -141 70, -130 55, 172 53))', oriented => TRUE)
)
SELECT id, ST_BOUNDINGBOX(g) AS box
FROM data

/*----+------------------------------------------*
 | id | box                                      |
 +----+------------------------------------------+
 | 1  | {xmin:-125, ymin:46, xmax:-117, ymax:49} |
 | 2  | {xmin:172, ymin:53, xmax:230, ymax:70}   |
 | 3  | NULL                                     |
 | 4  | {xmin:-180, ymin:-90, xmax:180, ymax:90} |
 *----+------------------------------------------*/

See ST_EXTENT for the aggregate version of ST_BOUNDINGBOX.

ST_BUFFER

ST_BUFFER(
    geography,
    buffer_radius
    [, num_seg_quarter_circle => num_segments]
    [, use_spheroid => boolean_expression]
    [, endcap => endcap_style]
    [, side => line_side])

Description

Returns a GEOGRAPHY that represents the buffer around the input GEOGRAPHY. This function is similar to ST_BUFFERWITHTOLERANCE, but you specify the number of segments instead of providing tolerance to determine how much the resulting geography can deviate from the ideal buffer radius.

  • geography: The input GEOGRAPHY to encircle with the buffer radius.
  • buffer_radius: FLOAT64 that represents the radius of the buffer around the input geography. The radius is in meters. Note that polygons contract when buffered with a negative buffer_radius. Polygon shells and holes that are contracted to a point are discarded.
  • num_seg_quarter_circle: (Optional) FLOAT64 specifies the number of segments that are used to approximate a quarter circle. The default value is 8.0. Naming this argument is optional.
  • endcap: (Optional) STRING allows you to specify one of two endcap styles: ROUND and FLAT. The default value is ROUND. This option only affects the endcaps of buffered linestrings.
  • side: (Optional) STRING allows you to specify one of three possibilities for lines: BOTH, LEFT, and RIGHT. The default is BOTH. This option only affects how linestrings are buffered.
  • use_spheroid: (Optional) BOOL determines how this function measures distance. If use_spheroid is FALSE, the function measures distance on the surface of a perfect sphere. The use_spheroid parameter currently only supports the value FALSE. The default value of use_spheroid is FALSE.

Return type

Polygon GEOGRAPHY

Example

The following example shows the result of ST_BUFFER on a point. A buffered point is an approximated circle. When num_seg_quarter_circle = 2, there are two line segments in a quarter circle, and therefore the buffered circle has eight sides and ST_NUMPOINTS returns nine vertices. When num_seg_quarter_circle = 8, there are eight line segments in a quarter circle, and therefore the buffered circle has thirty-two sides and ST_NUMPOINTS returns thirty-three vertices.

SELECT
  -- num_seg_quarter_circle=2
  ST_NUMPOINTS(ST_BUFFER(ST_GEOGFROMTEXT('POINT(1 2)'), 50, 2)) AS eight_sides,
  -- num_seg_quarter_circle=8, since 8 is the default
  ST_NUMPOINTS(ST_BUFFER(ST_GEOGFROMTEXT('POINT(100 2)'), 50)) AS thirty_two_sides;

/*-------------+------------------*
 | eight_sides | thirty_two_sides |
 +-------------+------------------+
 | 9           | 33               |
 *-------------+------------------*/

ST_BUFFERWITHTOLERANCE

ST_BUFFERWITHTOLERANCE(
    geography,
    buffer_radius,
    tolerance_meters => tolerance
    [, use_spheroid => boolean_expression]
    [, endcap => endcap_style]
    [, side => line_side])

Returns a GEOGRAPHY that represents the buffer around the input GEOGRAPHY. This function is similar to ST_BUFFER, but you provide tolerance instead of segments to determine how much the resulting geography can deviate from the ideal buffer radius.

  • geography: The input GEOGRAPHY to encircle with the buffer radius.
  • buffer_radius: FLOAT64 that represents the radius of the buffer around the input geography. The radius is in meters. Note that polygons contract when buffered with a negative buffer_radius. Polygon shells and holes that are contracted to a point are discarded.
  • tolerance_meters: FLOAT64 specifies a tolerance in meters with which the shape is approximated. Tolerance determines how much a polygon can deviate from the ideal radius. Naming this argument is optional.
  • endcap: (Optional) STRING allows you to specify one of two endcap styles: ROUND and FLAT. The default value is ROUND. This option only affects the endcaps of buffered linestrings.
  • side: (Optional) STRING allows you to specify one of three possible line styles: BOTH, LEFT, and RIGHT. The default is BOTH. This option only affects the endcaps of buffered linestrings.
  • use_spheroid: (Optional) BOOL determines how this function measures distance. If use_spheroid is FALSE, the function measures distance on the surface of a perfect sphere. The use_spheroid parameter currently only supports the value FALSE. The default value of use_spheroid is FALSE.

Return type

Polygon GEOGRAPHY

Example

The following example shows the results of ST_BUFFERWITHTOLERANCE on a point, given two different values for tolerance but with the same buffer radius of 100. A buffered point is an approximated circle. When tolerance_meters=25, the tolerance is a large percentage of the buffer radius, and therefore only five segments are used to approximate a circle around the input point. When tolerance_meters=1, the tolerance is a much smaller percentage of the buffer radius, and therefore twenty-four edges are used to approximate a circle around the input point.

SELECT
  -- tolerance_meters=25, or 25% of the buffer radius.
  ST_NumPoints(ST_BUFFERWITHTOLERANCE(ST_GEOGFROMTEXT('POINT(1 2)'), 100, 25)) AS five_sides,
  -- tolerance_meters=1, or 1% of the buffer radius.
  st_NumPoints(ST_BUFFERWITHTOLERANCE(ST_GEOGFROMTEXT('POINT(100 2)'), 100, 1)) AS twenty_four_sides;

/*------------+-------------------*
 | five_sides | twenty_four_sides |
 +------------+-------------------+
 | 6          | 24                |
 *------------+-------------------*/

ST_CENTROID

ST_CENTROID(geography_expression)

Description

Returns the centroid of the input GEOGRAPHY as a single point GEOGRAPHY.

The centroid of a GEOGRAPHY is the weighted average of the centroids of the highest-dimensional components in the GEOGRAPHY. The centroid for components in each dimension is defined as follows:

  • The centroid of points is the arithmetic mean of the input coordinates.
  • The centroid of linestrings is the centroid of all the edges weighted by length. The centroid of each edge is the geodesic midpoint of the edge.
  • The centroid of a polygon is its center of mass.

If the input GEOGRAPHY is empty, an empty GEOGRAPHY is returned.

Constraints

In the unlikely event that the centroid of a GEOGRAPHY cannot be defined by a single point on the surface of the Earth, a deterministic but otherwise arbitrary point is returned. This can only happen if the centroid is exactly at the center of the Earth, such as the centroid for a pair of antipodal points, and the likelihood of this happening is vanishingly small.

Return type

Point GEOGRAPHY

ST_CENTROID_AGG

ST_CENTROID_AGG(geography)

Description

Computes the centroid of the set of input GEOGRAPHYs as a single point GEOGRAPHY.

The centroid over the set of input GEOGRAPHYs is the weighted average of the centroid of each individual GEOGRAPHY. Only the GEOGRAPHYs with the highest dimension present in the input contribute to the centroid of the entire set. For example, if the input contains both GEOGRAPHYs with lines and GEOGRAPHYs with only points, ST_CENTROID_AGG returns the weighted average of the GEOGRAPHYs with lines, since those have maximal dimension. In this example, ST_CENTROID_AGG ignores GEOGRAPHYs with only points when calculating the aggregate centroid.

ST_CENTROID_AGG ignores NULL input GEOGRAPHY values.

See ST_CENTROID for the non-aggregate version of ST_CENTROID_AGG and the definition of centroid for an individual GEOGRAPHY value.

Return type

Point GEOGRAPHY

Example

The following queries compute the aggregate centroid over a set of GEOGRAPHY values. The input to the first query contains only points, and therefore each value contribute to the aggregate centroid. Also notice that ST_CENTROID_AGG is not equivalent to calling ST_CENTROID on the result of ST_UNION_AGG; duplicates are removed by the union, unlike ST_CENTROID_AGG. The input to the second query has mixed dimensions, and only values with the highest dimension in the set, the lines, affect the aggregate centroid.

SELECT ST_CENTROID_AGG(points) AS st_centroid_agg,
ST_CENTROID(ST_UNION_AGG(points)) AS centroid_of_union
FROM UNNEST([ST_GEOGPOINT(1, 5),
             ST_GEOGPOINT(1, 2),
             ST_GEOGPOINT(1, -1),
             ST_GEOGPOINT(1, -1)]) points;

/*---------------------------+-------------------*
 | st_centroid_agg           | centroid_of_union |
 +---------------------------+-------------------+
 | POINT(1 1.24961422620969) | POINT(1 2)        |
 *---------------------------+-------------------*/
SELECT ST_CENTROID_AGG(points) AS st_centroid_agg
FROM UNNEST([ST_GEOGPOINT(50, 26),
             ST_GEOGPOINT(34, 33.3),
             ST_GEOGFROMTEXT('LINESTRING(0 -1, 0 1)'),
             ST_GEOGFROMTEXT('LINESTRING(0 1, 0 3)')]) points;

/*-----------------*
 | st_centroid_agg |
 +-----------------+
 | POINT(0 1)      |
 *-----------------*/

ST_CLOSESTPOINT

ST_CLOSESTPOINT(geography_1, geography_2[, use_spheroid])

Description

Returns a GEOGRAPHY containing a point on geography_1 with the smallest possible distance to geography_2. This implies that the distance between the point returned by ST_CLOSESTPOINT and geography_2 is less than or equal to the distance between any other point on geography_1 and geography_2.

If either of the input GEOGRAPHYs is empty, ST_CLOSESTPOINT returns NULL.

The optional use_spheroid parameter determines how this function measures distance. If use_spheroid is FALSE, the function measures distance on the surface of a perfect sphere.

The use_spheroid parameter currently only supports the value FALSE. The default value of use_spheroid is FALSE.

Return type

Point GEOGRAPHY

ST_CLUSTERDBSCAN

ST_CLUSTERDBSCAN(geography_column, epsilon, minimum_geographies)
OVER over_clause

over_clause:
  { named_window | ( [ window_specification ] ) }

window_specification:
  [ named_window ]
  [ PARTITION BY partition_expression [, ...] ]
  [ ORDER BY expression [ { ASC | DESC }  ] [, ...] ]

Performs DBSCAN clustering on a column of geographies. Returns a 0-based cluster number.

To learn more about the OVER clause and how to use it, see Window function calls.

Input parameters

  • geography_column: A column of GEOGRAPHYs that is clustered.
  • epsilon: The epsilon that specifies the radius, measured in meters, around a core value. Non-negative FLOAT64 value.
  • minimum_geographies: Specifies the minimum number of geographies in a single cluster. Only dense input forms a cluster, otherwise it is classified as noise. Non-negative INT64 value.

Geography types and the DBSCAN algorithm

The DBSCAN algorithm identifies high-density clusters of data and marks outliers in low-density areas of noise. Geographies passed in through geography_column are classified in one of three ways by the DBSCAN algorithm:

  • Core value: A geography is a core value if it is within epsilon distance of minimum_geographies geographies, including itself. The core value starts a new cluster, or is added to the same cluster as a core value within epsilon distance. Core values are grouped in a cluster together with all other core and border values that are within epsilon distance.
  • Border value: A geography is a border value if it is within epsilon distance of a core value. It is added to the same cluster as a core value within epsilon distance. A border value may be within epsilon distance of more than one cluster. In this case, it may be arbitrarily assigned to either cluster and the function will produce the same result in subsequent calls.
  • Noise: A geography is noise if it is neither a core nor a border value. Noise values are assigned to a NULL cluster. An empty GEOGRAPHY is always classified as noise.

Constraints

  • The argument minimum_geographies is a non-negative INT64and epsilon is a non-negative FLOAT64.
  • An empty geography cannot join any cluster.
  • Multiple clustering assignments could be possible for a border value. If a geography is a border value, ST_CLUSTERDBSCAN will assign it to an arbitrary valid cluster.

Return type

INT64 for each geography in the geography column.

Examples

This example performs DBSCAN clustering with a radius of 100,000 meters with a minimum_geographies argument of 1. The geographies being analyzed are a mixture of points, lines, and polygons.

WITH Geos as
  (SELECT 1 as row_id, ST_GEOGFROMTEXT('POINT EMPTY') as geo UNION ALL
    SELECT 2, ST_GEOGFROMTEXT('MULTIPOINT(1 1, 2 2, 4 4, 5 2)') UNION ALL
    SELECT 3, ST_GEOGFROMTEXT('POINT(14 15)') UNION ALL
    SELECT 4, ST_GEOGFROMTEXT('LINESTRING(40 1, 42 34, 44 39)') UNION ALL
    SELECT 5, ST_GEOGFROMTEXT('POLYGON((40 2, 40 1, 41 2, 40 2))'))
SELECT row_id, geo, ST_CLUSTERDBSCAN(geo, 1e5, 1) OVER () AS cluster_num FROM
Geos ORDER BY row_id

/*--------+-----------------------------------+-------------*
 | row_id |                geo                | cluster_num |
 +--------+-----------------------------------+-------------+
 | 1      | GEOMETRYCOLLECTION EMPTY          | NULL        |
 | 2      | MULTIPOINT(1 1, 2 2, 5 2, 4 4)    | 0           |
 | 3      | POINT(14 15)                      | 1           |
 | 4      | LINESTRING(40 1, 42 34, 44 39)    | 2           |
 | 5      | POLYGON((40 2, 40 1, 41 2, 40 2)) | 2           |
 *--------+-----------------------------------+-------------*/

ST_CONTAINS

ST_CONTAINS(geography_1, geography_2)

Description

Returns TRUE if no point of geography_2 is outside geography_1, and the interiors intersect; returns FALSE otherwise.

NOTE: A GEOGRAPHY does not contain its own boundary. Compare with ST_COVERS.

Return type

BOOL

Example

The following query tests whether the polygon POLYGON((1 1, 20 1, 10 20, 1 1)) contains each of the three points (0, 0), (1, 1), and (10, 10), which lie on the exterior, the boundary, and the interior of the polygon respectively.

SELECT
  ST_GEOGPOINT(i, i) AS p,
  ST_CONTAINS(ST_GEOGFROMTEXT('POLYGON((1 1, 20 1, 10 20, 1 1))'),
              ST_GEOGPOINT(i, i)) AS `contains`
FROM UNNEST([0, 1, 10]) AS i;

/*--------------+----------*
 | p            | contains |
 +--------------+----------+
 | POINT(0 0)   | FALSE    |
 | POINT(1 1)   | FALSE    |
 | POINT(10 10) | TRUE     |
 *--------------+----------*/

ST_CONVEXHULL

ST_CONVEXHULL(geography_expression)

Description

Returns the convex hull for the input GEOGRAPHY. The convex hull is the smallest convex GEOGRAPHY that covers the input. A GEOGRAPHY is convex if for every pair of points in the GEOGRAPHY, the geodesic edge connecting the points are also contained in the same GEOGRAPHY.

In most cases, the convex hull consists of a single polygon. Notable edge cases include the following:

  • The convex hull of a single point is also a point.
  • The convex hull of two or more collinear points is a linestring as long as that linestring is convex.
  • If the input GEOGRAPHY spans more than a hemisphere, the convex hull is the full globe. This includes any input that contains a pair of antipodal points.
  • ST_CONVEXHULL returns NULL if the input is either NULL or the empty GEOGRAPHY.

Return type

GEOGRAPHY

Examples

The convex hull returned by ST_CONVEXHULL can be a point, linestring, or a polygon, depending on the input.

WITH Geographies AS
 (SELECT ST_GEOGFROMTEXT('POINT(1 1)') AS g UNION ALL
  SELECT ST_GEOGFROMTEXT('LINESTRING(1 1, 2 2)') AS g UNION ALL
  SELECT ST_GEOGFROMTEXT('MULTIPOINT(2 11, 4 12, 0 15, 1 9, 1 12)') AS g)
SELECT
  g AS input_geography,
  ST_CONVEXHULL(g) AS convex_hull
FROM Geographies;

/*-----------------------------------------+--------------------------------------------------------*
 |             input_geography             |                      convex_hull                       |
 +-----------------------------------------+--------------------------------------------------------+
 | POINT(1 1)                              | POINT(0.999999999999943 1)                             |
 | LINESTRING(1 1, 2 2)                    | LINESTRING(2 2, 1.49988573656168 1.5000570914792, 1 1) |
 | MULTIPOINT(1 9, 4 12, 2 11, 1 12, 0 15) | POLYGON((1 9, 4 12, 0 15, 1 9))                        |
 *-----------------------------------------+--------------------------------------------------------*/

ST_COVEREDBY

ST_COVEREDBY(geography_1, geography_2)

Description

Returns FALSE if geography_1 or geography_2 is empty. Returns TRUE if no points of geography_1 lie in the exterior of geography_2.

Given two GEOGRAPHYs a and b, ST_COVEREDBY(a, b) returns the same result as ST_COVERS(b, a). Note the opposite order of arguments.

Return type

BOOL

ST_COVERS

ST_COVERS(geography_1, geography_2)

Description

Returns FALSE if geography_1 or geography_2 is empty. Returns TRUE if no points of geography_2 lie in the exterior of geography_1.

Return type

BOOL

Example

The following query tests whether the polygon POLYGON((1 1, 20 1, 10 20, 1 1)) covers each of the three points (0, 0), (1, 1), and (10, 10), which lie on the exterior, the boundary, and the interior of the polygon respectively.

SELECT
  ST_GEOGPOINT(i, i) AS p,
  ST_COVERS(ST_GEOGFROMTEXT('POLYGON((1 1, 20 1, 10 20, 1 1))'),
            ST_GEOGPOINT(i, i)) AS `covers`
FROM UNNEST([0, 1, 10]) AS i;

/*--------------+--------*
 | p            | covers |
 +--------------+--------+
 | POINT(0 0)   | FALSE  |
 | POINT(1 1)   | TRUE   |
 | POINT(10 10) | TRUE   |
 *--------------+--------*/

ST_DIFFERENCE

ST_DIFFERENCE(geography_1, geography_2)

Description

Returns a GEOGRAPHY that represents the point set difference of geography_1 and geography_2. Therefore, the result consists of the part of geography_1 that does not intersect with geography_2.

If geometry_1 is completely contained in geometry_2, then ST_DIFFERENCE returns an empty GEOGRAPHY.

Constraints

The underlying geometric objects that a GoogleSQL GEOGRAPHY represents correspond to a closed point set. Therefore, ST_DIFFERENCE is the closure of the point set difference of geography_1 and geography_2. This implies that if geography_1 and geography_2 intersect, then a portion of the boundary of geography_2 could be in the difference.

Return type

GEOGRAPHY

Example

The following query illustrates the difference between geog1, a larger polygon POLYGON((0 0, 10 0, 10 10, 0 0)) and geog1, a smaller polygon POLYGON((4 2, 6 2, 8 6, 4 2)) that intersects with geog1. The result is geog1 with a hole where geog2 intersects with it.

SELECT
  ST_DIFFERENCE(
      ST_GEOGFROMTEXT('POLYGON((0 0, 10 0, 10 10, 0 0))'),
      ST_GEOGFROMTEXT('POLYGON((4 2, 6 2, 8 6, 4 2))')
  );

/*--------------------------------------------------------*
 | difference_of_geog1_and_geog2                          |
 +--------------------------------------------------------+
 | POLYGON((0 0, 10 0, 10 10, 0 0), (8 6, 6 2, 4 2, 8 6)) |
 *--------------------------------------------------------*/

ST_DIMENSION

ST_DIMENSION(geography_expression)

Description

Returns the dimension of the highest-dimensional element in the input GEOGRAPHY.

The dimension of each possible element is as follows:

  • The dimension of a point is 0.
  • The dimension of a linestring is 1.
  • The dimension of a polygon is 2.

If the input GEOGRAPHY is empty, ST_DIMENSION returns -1.

Return type

INT64

ST_DISJOINT

ST_DISJOINT(geography_1, geography_2)

Description

Returns TRUE if the intersection of geography_1 and geography_2 is empty, that is, no point in geography_1 also appears in geography_2.

ST_DISJOINT is the logical negation of ST_INTERSECTS.

Return type

BOOL

ST_DISTANCE

ST_DISTANCE(geography_1, geography_2[, use_spheroid])

Description

Returns the shortest distance in meters between two non-empty GEOGRAPHYs.

If either of the input GEOGRAPHYs is empty, ST_DISTANCE returns NULL.

The optional use_spheroid parameter determines how this function measures distance. If use_spheroid is FALSE, the function measures distance on the surface of a perfect sphere. If use_spheroid is TRUE, the function measures distance on the surface of the WGS84 spheroid. The default value of use_spheroid is FALSE.

Return type

FLOAT64

ST_DUMP

ST_DUMP(geography[, dimension])

Description

Returns an ARRAY of simple GEOGRAPHYs where each element is a component of the input GEOGRAPHY. A simple GEOGRAPHY consists of a single point, linestring, or polygon. If the input GEOGRAPHY is simple, the result is a single element. When the input GEOGRAPHY is a collection, ST_DUMP returns an ARRAY with one simple GEOGRAPHY for each component in the collection.

If dimension is provided, the function only returns GEOGRAPHYs of the corresponding dimension. A dimension of -1 is equivalent to omitting dimension.

Return Type

ARRAY<GEOGRAPHY>

Examples

The following example shows how ST_DUMP returns the simple geographies within a complex geography.

WITH example AS (
  SELECT ST_GEOGFROMTEXT('POINT(0 0)') AS geography
  UNION ALL
  SELECT ST_GEOGFROMTEXT('MULTIPOINT(0 0, 1 1)') AS geography
  UNION ALL
  SELECT ST_GEOGFROMTEXT('GEOMETRYCOLLECTION(POINT(0 0), LINESTRING(1 2, 2 1))'))
SELECT
  geography AS original_geography,
  ST_DUMP(geography) AS dumped_geographies
FROM example

/*-------------------------------------+------------------------------------*
 |         original_geographies        |      dumped_geographies            |
 +-------------------------------------+------------------------------------+
 | POINT(0 0)                          | [POINT(0 0)]                       |
 | MULTIPOINT(0 0, 1 1)                | [POINT(0 0), POINT(1 1)]           |
 | GEOMETRYCOLLECTION(POINT(0 0),      | [POINT(0 0), LINESTRING(1 2, 2 1)] |
 |   LINESTRING(1 2, 2 1))             |                                    |
 *-------------------------------------+------------------------------------*/

The following example shows how ST_DUMP with the dimension argument only returns simple geographies of the given dimension.

WITH example AS (
  SELECT ST_GEOGFROMTEXT('GEOMETRYCOLLECTION(POINT(0 0), LINESTRING(1 2, 2 1))') AS geography)
SELECT
  geography AS original_geography,
  ST_DUMP(geography, 1) AS dumped_geographies
FROM example

/*-------------------------------------+------------------------------*
 |         original_geographies        |      dumped_geographies      |
 +-------------------------------------+------------------------------+
 | GEOMETRYCOLLECTION(POINT(0 0),      | [LINESTRING(1 2, 2 1)]       |
 |   LINESTRING(1 2, 2 1))             |                              |
 *-------------------------------------+------------------------------*/

ST_DWITHIN

ST_DWITHIN(geography_1, geography_2, distance[, use_spheroid])

Description

Returns TRUE if the distance between at least one point in geography_1 and one point in geography_2 is less than or equal to the distance given by the distance argument; otherwise, returns FALSE. If either input GEOGRAPHY is empty, ST_DWithin returns FALSE. The given distance is in meters on the surface of the Earth.

The optional use_spheroid parameter determines how this function measures distance. If use_spheroid is FALSE, the function measures distance on the surface of a perfect sphere.

The use_spheroid parameter currently only supports the value FALSE. The default value of use_spheroid is FALSE.

Return type

BOOL

ST_ENDPOINT

ST_ENDPOINT(linestring_geography)

Description

Returns the last point of a linestring geography as a point geography. Returns an error if the input is not a linestring or if the input is empty. Use the SAFE prefix to obtain NULL for invalid input instead of an error.

Return Type

Point GEOGRAPHY

Example

SELECT ST_ENDPOINT(ST_GEOGFROMTEXT('LINESTRING(1 1, 2 1, 3 2, 3 3)')) last

/*--------------*
 | last         |
 +--------------+
 | POINT(3 3)   |
 *--------------*/

ST_EQUALS

ST_EQUALS(geography_1, geography_2)

Description

Returns TRUE if geography_1 and geography_2 represent the same

GEOGRAPHY value. More precisely, this means that one of the following conditions holds: + ST_COVERS(geography_1, geography_2) = TRUE and ST_COVERS(geography_2, geography_1) = TRUE + Both geography_1 and geography_2 are empty.

Therefore, two GEOGRAPHYs may be equal even if the ordering of points or vertices differ, as long as they still represent the same geometric structure.

Constraints

ST_EQUALS is not guaranteed to be a transitive function.

Return type

BOOL

ST_EXTENT

ST_EXTENT(geography_expression)

Description

Returns a STRUCT that represents the bounding box for the set of input GEOGRAPHY values. The bounding box is the minimal rectangle that encloses the geography. The edges of the rectangle follow constant lines of longitude and latitude.

Caveats:

  • Returns NULL if all the inputs are NULL or empty geographies.
  • The bounding box might cross the antimeridian if this allows for a smaller rectangle. In this case, the bounding box has one of its longitudinal bounds outside of the [-180, 180] range, so that xmin is smaller than the eastmost value xmax.
  • If the longitude span of the bounding box is larger than or equal to 180 degrees, the function returns the bounding box with the longitude range of [-180, 180].

Return type

STRUCT<xmin FLOAT64, ymin FLOAT64, xmax FLOAT64, ymax FLOAT64>.

Bounding box parts:

  • xmin: The westmost constant longitude line that bounds the rectangle.
  • xmax: The eastmost constant longitude line that bounds the rectangle.
  • ymin: The minimum constant latitude line that bounds the rectangle.
  • ymax: The maximum constant latitude line that bounds the rectangle.

Example

WITH data AS (
  SELECT 1 id, ST_GEOGFROMTEXT('POLYGON((-125 48, -124 46, -117 46, -117 49, -125 48))') g
  UNION ALL
  SELECT 2 id, ST_GEOGFROMTEXT('POLYGON((172 53, -130 55, -141 70, 172 53))') g
  UNION ALL
  SELECT 3 id, ST_GEOGFROMTEXT('POINT EMPTY') g
)
SELECT ST_EXTENT(g) AS box
FROM data

/*----------------------------------------------*
 | box                                          |
 +----------------------------------------------+
 | {xmin:172, ymin:46, xmax:243, ymax:70}       |
 *----------------------------------------------*/

ST_BOUNDINGBOX for the non-aggregate version of ST_EXTENT.

ST_EXTERIORRING

ST_EXTERIORRING(polygon_geography)

Description

Returns a linestring geography that corresponds to the outermost ring of a polygon geography.

  • If the input geography is a polygon, gets the outermost ring of the polygon geography and returns the corresponding linestring.
  • If the input is the full GEOGRAPHY, returns an empty geography.
  • Returns an error if the input is not a single polygon.

Use the SAFE prefix to return NULL for invalid input instead of an error.

Return type

  • Linestring GEOGRAPHY
  • Empty GEOGRAPHY

Examples

WITH geo as
 (SELECT ST_GEOGFROMTEXT('POLYGON((0 0, 1 4, 2 2, 0 0))') AS g UNION ALL
  SELECT ST_GEOGFROMTEXT('''POLYGON((1 1, 1 10, 5 10, 5 1, 1 1),
                                  (2 2, 3 4, 2 4, 2 2))''') as g)
SELECT ST_EXTERIORRING(g) AS ring FROM geo;

/*---------------------------------------*
 | ring                                  |
 +---------------------------------------+
 | LINESTRING(2 2, 1 4, 0 0, 2 2)        |
 | LINESTRING(5 1, 5 10, 1 10, 1 1, 5 1) |
 *---------------------------------------*/

ST_GEOGFROM

ST_GEOGFROM(expression)

Description

Converts an expression for a STRING or BYTES value into a GEOGRAPHY value.

If expression represents a STRING</