Spanner Graph schema overview

This document describes the Spanner Graph schema and provides examples to illustrate key concepts. To learn more about Spanner Graph, see the Spanner Graph overview.

Property graph data model

A property graph lets you model connected data. It represents information as a network of nodes and edges. Nodes symbolize entities in your data landscape, like "customers", "products", or "locations", Edges show the connections between those nodes, capturing relationships like "purchased," "follows," or "located in".

Both nodes and edges can include the following information:

  • Labels, which classify nodes and edges into sets, for example, City.
  • Properties, which are key-value pairs, for example, population.

The example in Figure 1 shows how you might design a graph to model financial activities. This graph includes the following types of entities modeled as nodes:

  • Person: Represents an individual involved in financial transactions.
  • Account: Represents a bank account used for transactions.

These entities are connected by different types of relationships, which are represented by the following directed edges:

  • Owns: A person owns one or more accounts.
  • Transfers: Money moves from one account to another.

Each directed edge indicates a one-way relationship, flowing from a source node to a destination node. For example, a Transfers edge connects a source Account to a destination Account, indicating the flow of money.

Spanner Graph schema overview diagram.

Figure 1. Example graph with multiple nodes and directed edges.

Nodes and edges can contain additional information in the form of properties. Each property has a name and a value.

  • Person nodes have the following properties:
    • name (STRING)
    • id (INT64)
  • Transfers edges have the following property:
    • amount (FLOAT64)

Directed and undirected edges

The example graph uses directed edges, which indicate a specific direction in the relationship between entities. However, some relationships, like the "friend" relationship in a social network, are undirected, and they represent a reciprocal connection without a distinct origin or endpoint. In this case, you can model undirected edges as two directed edges, one edge in each direction.

Spanner Graph schema design

Spanner Graph lets you create a graph from tables using the CREATE PROPERTY GRAPH statement. The tables used to create graphs are called input tables. This approach is based on SQL/PGQ (Property Graph Queries), which is part of SQL:2023 Standards.

Defining a node in a property graph

To define a node, you add a node definition in the NODE TABLES clause. The simplest form of node definition contains only an input table name. Rows from the input table are mapped to graph nodes.

In the following example, you use the NODE TABLES clause to define the Account node in the FinGraph property graph. The node definition contains the input table Account.

-- First, create an Account table.
CREATE TABLE Account (
  id           INT64 NOT NULL,
  create_time  TIMESTAMP,
) PRIMARY KEY (id);

-- Next, use the Account table as input table of Account node definition.
CREATE PROPERTY GRAPH FinGraph
  NODE TABLES (
    Account
  );

Default label and properties

By default, all nodes use the input table name as their label, and all columns from the input table are exposed as node properties.

In the previous example,

  • Each account node has the Account label.
  • Each account node has properties [id, create_time] created from the Account table columns.

Element Key

A node definition also defines the element key, which uniquely identifies a graph node.

  • By default, the element key is the primary key of the input table.
  • Element keys can be explicitly defined by the KEY clause.
  • Columns with uniqueness constraints UNIQUE INDEX can be used as element keys.

The following example defines Account node and Person node.

  • Account node uses the Account table's primary key as its element key by default.
  • Person node on the other hand, explicitly specifies the id as the element key with the KEY clause.
CREATE TABLE Person (
  id           INT64 NOT NULL,
  name         STRING(MAX),
) PRIMARY KEY (id);

CREATE TABLE Account (
  id           INT64 NOT NULL,
  create_time  TIMESTAMP,
) PRIMARY KEY (id);

CREATE PROPERTY GRAPH FinGraph
  NODE TABLES (
    Person KEY (id),
    Account
  );

Mapping a row in the input table to a node in the graph

  • Each row with a non-null element key maps to a unique node in the graph, identified by the element key.
  • Rows with a null element key are ignored.

Defining an edge in a property graph

To define an edge, add an edge definition into the EDGE TABLES clause. The simplest form of edge definition contains only an input table name. Rows from the input table are mapped to graph edges.

Source and destination node reference

In the following example, you create a property graph FinGraph with the following:

  • Person and Account nodes
  • PersonOwnAccount edge
CREATE TABLE Person (
 id            INT64 NOT NULL,
 name          STRING(MAX),
) PRIMARY KEY (id);

CREATE TABLE Account (
 id            INT64 NOT NULL,
 create_time   TIMESTAMP,
) PRIMARY KEY (id);

CREATE TABLE PersonOwnAccount (
 id            INT64 NOT NULL,
 account_id    INT64 NOT NULL,
 create_time   TIMESTAMP,
 FOREIGN KEY (account_id) REFERENCES Account (id)
) PRIMARY KEY (id, account_id),
  INTERLEAVE IN PARENT Person;

CREATE PROPERTY GRAPH FinGraph
  NODE TABLES (
    Person,
    Account
  )
  EDGE TABLES (
    PersonOwnAccount
      SOURCE KEY (id) REFERENCES Person (id)
      DESTINATION KEY (account_id) REFERENCES Account (id)
  );

An edge definition must define the source and destination node reference, using the SOURCE KEY, DESTINATION KEY, and REFERENCES clauses. The following example uses the edge definition of PersonOwnAccount to illustrate the this concept:

EDGE TABLES (
  PersonOwnAccount
    SOURCE KEY (id) REFERENCES Person (id)
    DESTINATION KEY (account_id) REFERENCES Account (id)
)

Each PersonOwnAccount edge connects a Person (source) to an Account (destination) node.

  • The SOURCE node of an edge is a Person node whose id is the same as the edge id.
  • The DESTINATION node of an edge is an Account node whose id is the same as the edge account_id.

Additionally, the following is true for the PersonOwnAccount edge:

  • The element key is the primary key of the PersonOwnAccount table, namely (id, account_id).
  • Each edge has the same set of properties as the columns from the PersonOwnAccount table.
  • Each edge has the default PersonOwnAccount label.

Mapping a row in an edge input table to edges in the graph

  • Each row in the edge input table, whose element key is not null, usually maps to a unique edge in your graph.
  • A row might correspond to zero or more than one edge in the graph, for example, when the source node reference matches zero or more nodes in the source node table.
  • The same input table can be used in different node or edge definitions to create different sets of nodes or edges. For more information, see Merge node and edge input tables.

Customize labels and properties

You can use the LABEL and PROPERTIES clauses to customize labels and properties.

In the following example, there are two nodes defined, Person and Account.

  • The Person nodes expose the address property through the label Customer. The address property is defined by the expression CONCAT(city, ", ", country), which refers to the city and country column from the input table Person.
  • For Account, the Account node exposes the id and create_time properties through the label Account.
  • Person and Account have the Entity label with properties [id, name].
    • For Person, the id and name properties come from the input table columns.
    • For Account, the name property refers to the nick_name column of the input table.
CREATE TABLE Person (
 id               INT64 NOT NULL,
 name             STRING(MAX),
 birthday         TIMESTAMP,
 country          STRING(MAX),
 city             STRING(MAX),
) PRIMARY KEY (id);

CREATE TABLE Account (
 id               INT64 NOT NULL,
 create_time      TIMESTAMP,
 is_blocked       BOOL,
 nick_name        STRING(MAX),
) PRIMARY KEY (id);

CREATE PROPERTY GRAPH FinGraph
  NODE TABLES (
    Person KEY (id)
      LABEL Customer
        PROPERTIES (CONCAT(city, ", ", country) AS address)
      LABEL Entity PROPERTIES (id, name),
    Account KEY (id)
      LABEL Account PROPERTIES (id, create_time)
      LABEL Entity PROPERTIES (id, nick_name AS name)
  );

Label and Property Consistency

In a graph, labels and properties are uniquely identified by their names. Labels and properties with the same name can appear in multiple node or edge definitions. However, labels and properties with the same name must follow these rules:

  • Properties with the same name must have the same value type.
  • Labels with the same name must expose the same list of properties.

In the previous example, the Entity label is defined in both Person and Account nodes. In both definitions, they have the same set of properties names [id, name] with the identical value types.

Dependencies between graphs and other schema objects

The graph created by CREATE PROPERTY GRAPH has a dependency on other schema objects like the input tables of the node and edge definitions, and the table columns referenced by the properties. If a schema change will break one of these dependencies, the change isn't allowed.

The following statement creates a dependency from FinGraph to the Account table and the id, create_time columns.

CREATE OR REPLACE PROPERTY GRAPH FinGraph
  NODE TABLES (
    Account PROPERTIES (id, create_time)
  );

The following are examples of schema changes that you aren't allowed to make:

However, you can make the following schema changes:

  • Modify the Account table and id and create_time columns schema as long as they are allowed by other schema requirements. For more information, see Make schema updates.

What's next