To find the optimal values of num_leaves and scann.num_leaves_to_search for your dataset,
follow these steps:
1. Create the ScaNN index with num_leaves set according to your goal, where rows is the number of rows in the indexed table:
   - To balance index build time and quality, set num_leaves to sqrt(rows).
   - To optimize for quality, set num_leaves to rows/100.
2. Run your test queries, increasing the value of scann.num_leaves_to_search until you achieve your target recall range, for example, 95%. For more information about analyzing your queries, see Analyze your queries.
3. Take note of the ratio between scann.num_leaves_to_search and num_leaves; you use it in subsequent steps. This ratio approximates the fraction of leaves that must be searched on your dataset to achieve your target recall.
   If you are working with high-dimensional vectors (500 dimensions or higher) and want to improve recall, then try tuning the value of scann.pre_reordering_num_neighbors. The default value is 500 * K, where K is the limit that you set in your query.
4. If your QPS is too low after your queries achieve the target recall, then follow these steps:
   1. Recreate the index, increasing the values of num_leaves and scann.num_leaves_to_search according to the following guidance:
      - Set num_leaves to a larger multiple of the square root of your row count. For example, if the index has num_leaves set to the square root of your row count, try setting it to double the square root. If the value is already double, then try setting it to triple the square root.
      - Increase scann.num_leaves_to_search as needed to maintain its ratio with num_leaves, which you noted in Step 3.
      - Keep num_leaves less than or equal to the row count divided by 100.
   2. Run the test queries again. While you're running the test queries, experiment with reducing scann.num_leaves_to_search to find a value that increases QPS while keeping your recall high. You can try different values of scann.num_leaves_to_search without rebuilding the index.
5. Repeat Step 4 until both the QPS and the recall range have reached acceptable values.
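The arithmetic behind the two-level steps can be sketched in Python. This is only an illustration, not part of AlloyDB Omni: the helper names initial_num_leaves and scaled_settings are hypothetical, and the rounding (floor of the square root, ceiling for the leaves to search) is an assumption, since the guidance does not say how to round.

```python
import math


def initial_num_leaves(rows, goal="balanced"):
    """Starting num_leaves for a two-level index (hypothetical helper)."""
    if goal == "balanced":
        # sqrt(rows), rounded down (rounding direction is an assumption)
        return max(1, math.isqrt(rows))
    if goal == "quality":
        # rows/100
        return max(1, rows // 100)
    raise ValueError(f"unknown goal: {goal!r}")


def scaled_settings(rows, factor, noted_leaves, noted_search):
    """Settings for a rebuild when QPS is too low (hypothetical helper).

    factor       -- multiple of sqrt(rows) to try next (2, then 3, ...)
    noted_leaves -- num_leaves of the index that reached the target recall
    noted_search -- scann.num_leaves_to_search that reached the target recall
    """
    cap = max(1, rows // 100)  # keep num_leaves <= rows/100
    num_leaves = min(factor * math.isqrt(rows), cap)
    # Preserve the noted ratio; integer ceiling avoids float rounding issues.
    num_leaves_to_search = max(1, -(-noted_search * num_leaves // noted_leaves))
    return num_leaves, num_leaves_to_search


if __name__ == "__main__":
    rows = 1_000_000
    print(initial_num_leaves(rows))              # balanced starting point
    print(scaled_settings(rows, 2, 1000, 10))    # double sqrt(rows), same ratio
```

The values returned for a rebuild are only a starting point: after rebuilding, you would still lower scann.num_leaves_to_search experimentally, as described in the steps above.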
Three-level tree index
In addition to the recommendations for the two-level tree ScaNN index, use the following guidance.
To find the optimal values of the num_leaves and max_num_levels index parameters, follow these steps:
1. Create the ScaNN index with one of the following num_leaves and max_num_levels combinations, based on your performance goals:
   - To balance index build time and quality, set max_num_levels to 2 and num_leaves to power(rows, ⅔).
   - To optimize for quality, set max_num_levels to 2 and num_leaves to rows/100.
2. Run your test queries. As with the two-level index, increase scann.num_leaves_to_search until you reach your target recall. For more information about analyzing queries, see Analyze your queries.
3. Take note of the ratio between scann.num_leaves_to_search and num_leaves; you use it in subsequent steps. This ratio approximates the fraction of leaves that must be searched on your dataset to achieve your target recall.
   If you are working with high-dimensional vectors (500 dimensions or higher) and want to improve recall, then try tuning the value of scann.pre_reordering_num_neighbors. The default value is 500 * K, where K is the limit that you set in your query.
4. If your QPS is too low after your queries achieve the target recall, then follow these steps:
   1. Recreate the index, increasing the values of num_leaves and scann.num_leaves_to_search according to the following guidance:
      - Set num_leaves to a larger multiple of power(rows, ⅔). For example, if the index has num_leaves set to power(rows, ⅔), try setting it to double power(rows, ⅔). If the value is already double, then try setting it to triple power(rows, ⅔).
      - Increase scann.num_leaves_to_search as needed to maintain its ratio with num_leaves, which you noted in Step 3.
      - Keep num_leaves less than or equal to rows/100.
   2. Run the test queries again. While you're running the test queries, experiment with reducing scann.num_leaves_to_search to find a value that increases QPS while keeping your recall high. You can try different values of scann.num_leaves_to_search without rebuilding the index.
5. Repeat Step 4 until both the QPS and the recall range have reached acceptable values.
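For the three-level case, the starting values and the rows/100 cap can be sketched the same way. The helper name three_level_index_params is hypothetical, and rounding power(rows, ⅔) to the nearest integer is an assumption; the guidance itself only gives the formulas.

```python
def three_level_index_params(rows, goal="balanced", factor=1):
    """Index parameters for a three-level tree (hypothetical helper).

    goal   -- "balanced" uses factor * rows^(2/3); "quality" uses rows/100
    factor -- multiple of rows^(2/3) to try when QPS is too low (1, 2, 3, ...)
    """
    cap = max(1, rows // 100)  # keep num_leaves <= rows/100
    if goal == "quality":
        num_leaves = cap
    elif goal == "balanced":
        # Round to the nearest integer (assumption), then apply the cap.
        num_leaves = max(1, min(round(factor * rows ** (2 / 3)), cap))
    else:
        raise ValueError(f"unknown goal: {goal!r}")
    # max_num_levels is 2 for both goals, per the guidance above.
    return {"max_num_levels": 2, "num_leaves": num_leaves}


if __name__ == "__main__":
    rows = 8_000_000
    print(three_level_index_params(rows))             # balanced start
    print(three_level_index_params(rows, factor=2))   # next attempt if QPS is low
    print(three_level_index_params(rows, "quality"))  # rows/100
```

Note that with 8 million rows, doubling rows^(2/3) already reaches the rows/100 cap, so a further increase of the factor would not change num_leaves.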
Index maintenance
If your table receives frequent updates or insertions, then we recommend periodically reindexing the existing ScaNN index to improve recall.
You can monitor index metrics to view changes in vector distributions or vector mutations since the index was built, and then reindex accordingly. For more information about metrics, see Vector index metrics.
Last updated 2025-08-25 UTC.