Best practices for tuning Spanner Graph queries

This document describes best practices for tuning Spanner Graph query performance, which include the following optimizations:

  • Avoid a full scan of the input table for nodes and edges.
  • Reduce the amount of data the query needs to read from storage.
  • Reduce the size of intermediate data.

Start from lower cardinality nodes

Write the path traversal so that it starts with the lower cardinality nodes. This approach keeps the intermediate result set small, and speeds up query execution.

For example, the following queries have the same semantics:

  • Forward edge traversal:

    GRAPH FinGraph
    MATCH (p:Person)-[:Owns]->(a:Account)
    WHERE p.name = "Alex"
      AND a.is_blocked
    RETURN p.id AS person_id, a.id AS account_id;
    
  • Reverse edge traversal:

    GRAPH FinGraph
    MATCH (a:Account)<-[:Owns]-(p:Person)
    WHERE p.name = "Alex"
      AND a.is_blocked
    RETURN p.id AS person_id, a.id AS account_id;
    

Assuming that there are fewer people with the name Alex than there are blocked accounts, we recommend that you write this query in the forward edge traversal.

Starting from lower cardinality nodes is especially important for variable-length path traversal. The following example shows the recommended way to find accounts that are transitively transferred from a given account within three hops.

GRAPH FinGraph
MATCH (:Account {id: 7})-[:Transfers]->{1,3}(a:Account)
RETURN a.id;

Specify all labels by default

Spanner Graph infers the qualifying nodes and edge labels if labels are omitted. We recommend that you specify labels for all nodes and edges where possible, because this inference might not always be possible and it might cause more labels than necessary to be scanned.

Single MATCH statement

The following example finds accounts linked by at most 3 transfers from the given account:

GRAPH FinGraph
MATCH (src:Account {id: 7})-[:Transfers]->{1,3}(dst:Account)
RETURN dst.id;

Across MATCH statements

Specify labels on nodes and edges when they refer to the same element but are across MATCH statements.

The following example shows this recommended approach:

GRAPH FinGraph
MATCH (acct:Account {id: 7})-[:Transfers]->{1,3}(other_acct:Account)
RETURN acct, COUNT(DISTINCT other_acct) AS related_accts
GROUP BY acct

NEXT

MATCH (acct:Account)<-[:Owns]-(p:Person)
RETURN p.id AS person, acct.id AS acct, related_accts;

What's next