Tuning parameters
Use the following index parameters and database flags together to find the right balance between recall and queries per second (QPS).
Tuning parameter | Description | Option type |
---|---|---|
max_num_levels | The maximum number of centroid levels of the K-means clustering tree. | Index creation (optional) |
num_leaves | The number of partitions to apply to this index. The number of partitions you apply when creating an index affects index performance. By increasing partitions for a set number of vectors, you create a more fine-grained index, which improves recall and query performance. However, this comes at the cost of longer index creation times. Since three-level trees build faster than two-level trees, you can increase the num_leaves value when creating a three-level tree index to achieve better performance. | Index creation (required) |
quantizer | The type of quantizer to use for the K-means tree. The default value is SQ8, which provides better query performance with minimal recall loss (typically less than 1-2%). Set it to FLAT if a recall of 99% or higher is required. | Index creation (optional) |
scann.enable_pca | Enables Principal Component Analysis (PCA), a dimension reduction technique used to automatically reduce the size of the embedding when possible. This option is enabled by default. Set it to false if you observe deterioration in recall. | Index creation (optional) |
scann.pct_leaves_to_search (Preview) | This database flag lets you [automatically manage the number of leaves or partitions to search](/alloydb/omni/containers/15.7.1/docs/ai/maintain-vector-indexes#manage-leave-to-search-split-partitions). Set this value to the percentage of the current number of partitions to search. For example, to search 1% of the current number of partitions, set this value to 1. You can set this parameter to any value from 0 to 100. The default value is 0, which disables this parameter and uses `scann.num_leaves_to_search` to calculate the number of leaves to search. | Query runtime (optional) |
scann.num_leaves_to_search | This database flag controls the absolute number of leaves or partitions to search, which lets you trade off recall against QPS. The default value is 1% of the value set in num_leaves. A higher value results in better recall but lower QPS. A lower value results in lower recall but higher QPS. | Query runtime (optional) |
scann.pre_reordering_num_neighbors | This database flag specifies the number of candidate neighbors to consider during the reordering stage, which runs after the initial search identifies a set of candidates. Set this parameter to a value higher than the number of neighbors you want the query to return. A higher value results in better recall but lower QPS. Set this value to 0 to disable reordering. The default is 0 if PCA is not enabled during index creation. Otherwise, the default is 50 x K, where K is the LIMIT specified in the query. | Query runtime (optional) |
scann.num_search_threads | The number of searcher threads for multi-threaded search. In latency-sensitive applications, using more than one thread for ScaNN ANN search can reduce single-query latency. This setting doesn't improve single-query latency if the database is already CPU-bound. The default value is 2. | Query runtime (optional) |
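
The default values described in the table follow simple arithmetic, which the following Python sketch works through. It is illustrative only and not part of any AlloyDB client library: the function names are hypothetical, and the rounding behavior (round up, never fewer than one leaf) is an assumption, since the table doesn't say how fractional results are handled. The generated statements use the standard PostgreSQL `SET` command with the flag names from the table.

```python
from typing import List, Optional


def default_num_leaves_to_search(num_leaves: int) -> int:
    """Default for scann.num_leaves_to_search: 1% of num_leaves.

    Assumption: fractional results round up and the result is at least 1.
    """
    return max(1, (num_leaves + 99) // 100)


def leaves_from_pct(current_partitions: int, pct: int) -> Optional[int]:
    """Leaves searched when scann.pct_leaves_to_search is set to pct.

    Returns None when pct is 0, because 0 disables the flag and
    scann.num_leaves_to_search is used instead.
    Assumption: fractional results round up and the result is at least 1.
    """
    if not 0 <= pct <= 100:
        raise ValueError("scann.pct_leaves_to_search must be from 0 to 100")
    if pct == 0:
        return None
    return max(1, (current_partitions * pct + 99) // 100)


def default_pre_reordering_num_neighbors(limit_k: int, pca_enabled: bool) -> int:
    """Default for scann.pre_reordering_num_neighbors.

    50 x K when PCA was enabled at index creation, otherwise 0 (no reordering).
    """
    return 50 * limit_k if pca_enabled else 0


def session_settings(num_leaves_to_search: int, pre_reordering: int) -> List[str]:
    """Render session-level SET statements for the two query-time flags."""
    return [
        f"SET scann.num_leaves_to_search = {num_leaves_to_search};",
        f"SET scann.pre_reordering_num_neighbors = {pre_reordering};",
    ]


if __name__ == "__main__":
    leaves = default_num_leaves_to_search(1000)
    reorder = default_pre_reordering_num_neighbors(limit_k=10, pca_enabled=True)
    for statement in session_settings(leaves, reorder):
        print(statement)
```

For an index created with num_leaves set to 1000 and a query with LIMIT 10, this sketch yields 10 leaves to search and 500 pre-reordering candidates. Treat those values as starting points, then raise or lower them while measuring recall and QPS on your own workload.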