This page describes how to perform vector similarity search in Spanner Graph to find K-nearest neighbors (KNN) and approximate nearest neighbors (ANN). You can use vector distance functions to perform KNN and ANN vector search for use cases like similarity search or retrieval-augmented generation for generative AI applications.
Spanner Graph supports the following distance functions to perform KNN vector similarity search:
COSINE_DISTANCE()
: measures the shortest distance between two vectors.EUCLIDEAN_DISTANCE()
: measures the cosine of the angle between two vectors.DOT_PRODUCT()
: calculates the cosine of the angle multiplied by the product of corresponding vector magnitudes. If you know that all the vector embeddings in your dataset are normalized, then you can useDOT_PRODUCT()
as a distance function.
For more information, see Perform vector similarity search in Spanner by finding the K-nearest neighbors.
Spanner Graph also supports the following approximate distance functions to perform ANN vector similarity search:
APPROX_COSINE_DISTANCE
: measures the approximate shortest distance between two vectors.APPROX_EUCLIDEAN_DISTANCE
: measures the approximate cosine of the angle between two vectors.APPROX_DOT_PRODUCT
: calculates the approximate cosine of the angle multiplied by the product of corresponding vector magnitudes. If you know that all the vector embeddings in your dataset are normalized, then you can useDOT_PRODUCT()
as a distance function.
For more information, see Find approximate nearest neighbors, create vector index, and query vector embeddings.
Before you begin
To run the examples in this document, you must first follow the steps in Set up and query Spanner Graph to do the following:
After you insert the essential graph data, make the following updates to your database.
Insert additional vector data in graph database
To make the required updates to your graph database, do the following:
Add a new column,
nick_name_embeddings
, to theAccount
input table.ALTER TABLE Account ADD COLUMN nick_name_embeddings ARRAY<FLOAT32>(vector_length=>4);
Add data to the
nick_name
column.UPDATE Account SET nick_name = "Fund for a refreshing tropical vacation" WHERE id = 7; UPDATE Account SET nick_name = "Fund for a rainy day!" WHERE id = 16; UPDATE Account SET nick_name = "Saving up for travel" WHERE id = 20;
Create embeddings for the text in the
nick_name
column, and populate them into the newnick_name_embeddings
column.To generate Vertex AI embeddings for your operational data in Spanner Graph, see Get Vertex AI text embeddings.
For illustrative purposes, our examples use artificial, low-dimensional vector values.
UPDATE Account SET nick_name_embeddings = ARRAY<FLOAT32>[0.3, 0.5, 0.8, 0.7] WHERE id = 7; UPDATE Account SET nick_name_embeddings = ARRAY<FLOAT32>[0.4, 0.9, 0.7, 0.1] WHERE id = 16; UPDATE Account SET nick_name_embeddings = ARRAY<FLOAT32>[0.2, 0.5, 0.6, 0.6] WHERE id = 20;
Add two new columns to the
AccountTransferAccount
input table:notes
andnotes_embeddings
.ALTER TABLE AccountTransferAccount ADD COLUMN notes STRING(MAX); ALTER TABLE AccountTransferAccount ADD COLUMN notes_embeddings ARRAY<FLOAT32>(vector_length=>4);
Create embeddings for the text in the
notes
column, and populate them into thenotes_embeddings
column.To generate Vertex AI embeddings for your operational data in Spanner Graph, see Get Vertex AI text embeddings.
For illustrative purposes, our examples use artificial, low-dimensional vector values.
UPDATE AccountTransferAccount SET notes = "for shared cost of dinner", notes_embeddings = ARRAY<FLOAT32>[0.3, 0.5, 0.8, 0.7] WHERE id = 16 AND to_id = 20; UPDATE AccountTransferAccount SET notes = "fees for tuition", notes_embeddings = ARRAY<FLOAT32>[0.1, 0.9, 0.1, 0.7] WHERE id = 20 AND to_id = 7; UPDATE AccountTransferAccount SET notes = 'loved the lunch', notes_embeddings = ARRAY<FLOAT32>[0.4, 0.5, 0.7, 0.9] WHERE id = 20 AND to_id = 16;
After adding new columns to the
Account
andAccountTransferAccount
input tables, update the property graph definition using the following statements. For more information, see Update existing node or edge definitions.CREATE OR REPLACE PROPERTY GRAPH FinGraph NODE TABLES (Account, Person) EDGE TABLES ( PersonOwnAccount SOURCE KEY (id) REFERENCES Person (id) DESTINATION KEY (account_id) REFERENCES Account (id) LABEL Owns, AccountTransferAccount SOURCE KEY (id) REFERENCES Account (id) DESTINATION KEY (to_id) REFERENCES Account (id) LABEL Transfers );
Find K-nearest neighbors
In the following example, use the EUCLIDEAN_DISTANCE()
function to perform KNN
vector search on the nodes and edges of your graph database.
Perform KNN vector search on graph nodes
You can perform a KNN vector search on the nick_name_embeddings
property of
the Account
node. This KNN vector search returns the account owner's name
and the account's nick_name
. In the following example, the result shows the
top two K-nearest neighbors for accounts for leisure travel and vacation,
which is represented by the [0.2, 0.4, 0.9, 0.6]
vector embedding.
GRAPH FinGraph
MATCH (p:Person)-[:Owns]->(a:Account)
RETURN p.name, a.nick_name
ORDER BY EUCLIDEAN_DISTANCE(a.nick_name_embeddings,
-- An illustrative embedding for 'accounts for leisure travel and vacation'
ARRAY<FLOAT32>[0.2, 0.4, 0.9, 0.6])
LIMIT 2
Results
name | nick_name |
---|---|
Alex | Fund for a refreshing tropical vacation |
Dana | Saving up for travel |
Perform KNN vector search on graph edges
You can perform a KNN vector search on the notes_embeddings
property of the
Owns
edge. This KNN vector search returns the account owner's name
and the
transfer's notes
. In the following example, the result shows the top two
K-nearest neighbors for food expenses, which is represented by the
[0.2, 0.4, 0.9, 0.6]
vector embedding.
GRAPH FinGraph
MATCH (p:Person)-[:Owns]->(:Account)-[t:Transfers]->(:Account)
WHERE t.notes_embeddings IS NOT NULL
RETURN p.name, t.notes
ORDER BY EUCLIDEAN_DISTANCE(t.notes_embeddings,
-- An illustrative vector embedding for 'food expenses'
ARRAY<FLOAT32>[0.2, 0.4, 0.9, 0.6])
LIMIT 2
Results
name | notes |
---|---|
Lee | for shared cost of dinner |
Dana | loved the lunch |
Create a vector index and find approximate nearest neighbors
To perform an ANN search, you must create a specialized vector index
that Spanner Graph uses to accelerate the vector search. The vector
index must use a specific distance metric. You can choose the distance metric
most appropriate for your use case by setting the distance_type
parameter to
one of COSINE
, DOT_PRODUCT
or EUCLIDEAN
. For more information, see
VECTOR INDEX statements.
In the following example, you create a vector index using the euclidean distance
type on the nick_name_embedding
column of the Account
input table:
CREATE VECTOR INDEX NickNameEmbeddingIndex
ON Account(nick_name_embeddings)
WHERE nick_name_embeddings IS NOT NULL
OPTIONS (distance_type = 'EUCLIDEAN', tree_depth = 2, num_leaves = 1000);
Perform ANN vector search on graph nodes
After you create a vector index, you can perform a ANN vector search on the
nick_name
property of the Account
node. The ANN vector search returns the
account owner's name
and the account's nick_name
. In the following example,
the result shows the top two approximate nearest neighbors for accounts for
leisure travel and vacation, which is represented by the
[0.2, 0.4, 0.9, 0.6]
vector embedding.
The graph hint forces the query optimizer to use the specified, vector index in the query execution plan.
GRAPH FinGraph
MATCH (@{FORCE_INDEX=NickNameEmbeddingIndex} a:Account)
WHERE a.nick_name_embeddings IS NOT NULL
RETURN a, APPROX_EUCLIDEAN_DISTANCE(a.nick_name_embeddings,
-- An illustrative embedding for 'accounts for leisure travel and vacation'
ARRAY<FLOAT32>[0.2, 0.4, 0.9, 0.6],
options => JSON '{"num_leaves_to_search": 10}') AS distance
ORDER BY distance
LIMIT 2
NEXT
MATCH (p:Person)-[:Owns]->(a)
RETURN p.name, a.nick_name
Results
name | nick_name |
---|---|
Alex | Fund for a refreshing tropical vacation |
Dana | Saving up for travel |
What's next
- Perform vector similarity search in Spanner by finding the K-nearest neighbors.
- Find approximate nearest neighbors, create vector index, and query vector embeddings.
- Get Vertex AI text embeddings
- Learn more about Spanner Graph queries.
- Learn best practices for tuning queries.