AlloyDB ScaNN Index reference

This page provides reference material about the ScaNN Index for AlloyDB for PostgreSQL.

To create ScaNN indexes that can be tuned, you must install the `alloydb_scann` and `vector` extensions. For more information about creating indexes, see Create indexes.

Tuning parameters

The following index and query parameters are used to find the right balance of recall and queries per second (QPS).

Tuning parameter	Description	Option type
`max_num_levels`	The maximum number of centroid levels of the K-means clustering tree. Two-level tree index: Set to `1` by default for a two-level tree (1 centroid level + bottom leaf level). Three-level tree index: Set to `2` by default for a three-level tree (2 centroid levels + bottom leaf level). Set the value to `2` if the number of vector rows exceeds 100 million. Set the value to `1` if the number of vector rows is less than 10 million. If the number of vector rows lies between 10 million and 100 million, set the value to `2` to optimize for index build time or to `1` to optimize for search recall.	Index creation (optional)
`num_leaves`	The number of partitions to apply to this index. The number of partitions you apply when creating an index affects index performance. By increasing the number of partitions for a fixed number of vectors, you create a more fine-grained index, which improves recall and query performance. However, this comes at the cost of longer index creation times. Since three-level trees build faster than two-level trees, you can increase the `num_leaves` value when creating a three-level tree index to achieve better performance. Two-level index: Set this value to any value between `1` and `1048576`. For an index that balances fast index build and good search performance, use `sqrt(ROWS)` as a starting point, where `ROWS` is the number of vector rows. The number of vectors that each partition holds is calculated by `ROWS/sqrt(ROWS) = sqrt(ROWS)`. Since a two-level tree index can be created on a dataset with fewer than 10 million vector rows, each partition will hold fewer than `sqrt(10M)` vectors, which is about `3200` vectors. For optimal vector search quality, it's recommended to minimize the number of vectors in each partition. The recommended partition size is about 100 vectors per partition, so set `num_leaves` to `ROWS/100`. If you have 10 million vectors, you would set `num_leaves` to 100,000. Three-level index: Set this value to any value between `1` and `1048576`. If you are unsure about selecting the exact value, use `power(ROWS, 2/3)` as a starting point, where `ROWS` is the number of vector rows. The number of vectors that each partition holds is calculated by `ROWS/power(ROWS, 2/3) = power(ROWS, 1/3)`. Since a three-level tree index can be created on a dataset with more than 100 million vector rows, each partition will hold more than `power(100M, 1/3)` vectors, which is about `465` vectors. For optimal performance, it's recommended to minimize the number of vectors in each partition. The recommended partition size is about 100 vectors per partition, so set `num_leaves` to `ROWS/100`. If you have 100 million vectors, you would set `num_leaves` to 1 million. Note: When using `AH` as the `quantizer`, the upper bound for `num_leaves` is `total count of datapoints * dimensionality / 32 * 1000`. The `dimensionality` here refers to the dimensionality of the vectors after PCA, which is enabled by default and can reduce the dimensionality significantly. Refer to the Best practices for tuning ScaNN for more details.	Index creation (required)
`quantizer`	The type of quantizer to use for the K-means tree. Available options are: `SQ8`: Provides a good balance of query performance with minimal recall loss, typically less than 1-2%. This is the default value. `FLAT`: Prioritize this for applications requiring 99% or higher recall. `AH`: Consider this for potentially better query performance when the columnar engine is enabled and your index data can fit in memory. Refer to the Best practices for tuning ScaNN for more details on AH requirements and usage.	Index creation (optional)
`scann.enable_pca`	Enables Principal Component Analysis (PCA), which is a dimension reduction technique used to automatically reduce the size of the embedding when possible. This option is enabled by default. Set to `false` if you observe deterioration in recall.	Index creation (optional)
`scann.pct_leaves_to_search (Preview)`	This database flag lets you automatically manage the number of leaves or partitions to search. Set this value to the percentage of the current number of partitions that you want to search. For example, to search 1% of the current number of partitions, set this value to `1`. You can set this parameter to any value between `0` and `100`. The default value is `0`, which disables this parameter and uses `scann.num_leaves_to_search` to calculate the number of leaves to search.	Query runtime (optional)
`scann.num_leaves_to_search`	This database flag controls the absolute number of leaves or partitions to search which lets you trade off between recall and QPS. The default value is 1% of the value set in `num_leaves`. A higher value will result in better recall but lower QPS. Similarly, a lower value will result in lower recall but higher QPS.	Query runtime (optional)
`scann.pre_reordering_num_neighbors`	The database flag, when set, specifies the number of candidate neighbors to consider during the reordering stages after the initial search identifies a set of candidates. Set this parameter to a value higher than the number of neighbors you want the query to return. A higher value results in better recall, but a lower QPS. Set this value to `0` to disable reordering. The default is `0` if PCA is not enabled during index creation. Otherwise, the default is `50 x K`, where `K` is the LIMIT specified in the query.	Query runtime (optional)
`scann.num_search_threads`	The number of searcher threads for multi-thread search. This can help reduce single query latency by using more than one thread for ScaNN ANN search in latency-sensitive applications. This setting doesn't improve single query latency if the database is already CPU-bound. The default value is `2`.	Query runtime (optional)
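
The sizing guidance in the table is plain arithmetic, so it can help to compute candidate values before creating an index. The following is a minimal Python sketch of those rules, not an AlloyDB API: the function names and the `prefer_fast_build` switch are hypothetical and exist only for illustration. It also does not apply the `AH`-specific upper bound on `num_leaves`.

```python
import math

MAX_NUM_LEAVES = 1_048_576  # upper limit for num_leaves stated in the table


def clamp_num_leaves(value: float) -> int:
    """Round to an integer and keep it inside the documented 1..1048576 range."""
    return max(1, min(MAX_NUM_LEAVES, round(value)))


def suggest_max_num_levels(rows: int, prefer_fast_build: bool = False) -> int:
    """Pick max_num_levels from the row-count thresholds in the table.

    prefer_fast_build is a hypothetical switch for the 10M..100M range:
    2 favors index build time, 1 favors search recall.
    """
    if rows > 100_000_000:
        return 2
    if rows < 10_000_000:
        return 1
    return 2 if prefer_fast_build else 1


def starting_num_leaves(rows: int, max_num_levels: int) -> int:
    """Starting point: sqrt(ROWS) for two-level, power(ROWS, 2/3) for three-level."""
    if max_num_levels == 1:
        return clamp_num_leaves(math.sqrt(rows))
    return clamp_num_leaves(rows ** (2 / 3))


def recall_oriented_num_leaves(rows: int) -> int:
    """About 100 vectors per partition, that is ROWS/100, capped at the limit."""
    return clamp_num_leaves(rows / 100)


def default_pre_reordering_num_neighbors(k: int, pca_enabled: bool = True) -> int:
    """Documented default: 50 x K when PCA is enabled at index creation, else 0."""
    return 50 * k if pca_enabled else 0


if __name__ == "__main__":
    rows = 5_000_000  # example table size, chosen only for illustration
    levels = suggest_max_num_levels(rows)
    print("max_num_levels:", levels)
    print("num_leaves (balanced start):", starting_num_leaves(rows, levels))
    print("num_leaves (about 100 per partition):", recall_oriented_num_leaves(rows))
    print("pre_reordering default for LIMIT 10:", default_pre_reordering_num_neighbors(10))
```

Treat the results as starting points only. The table recommends tuning from there by measuring recall and QPS on your own data.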

What's next

Get started with vector embeddings using AlloyDB AI.
