Storage-optimized Vector Search

By default, Vector Search indexes are optimized for performance. For a more cost-effective solution, you can configure indexes to be optimized for storage instead. This consolidates index data onto fewer shards in exchange for slightly slower queries. This is ideal in scenarios where reducing operational costs is more critical than achieving the lowest possible latency.

When to use a storage-optimized index

Consider storage-optimized indexes if you have any of the following:

  • A very large dataset: You must index very large numbers of vectors, and the cost of hosting a large number of performance-optimized shards is prohibitive.

  • A low-QPS workload: In low-query-volume applications, the cost savings from using fewer shards can be significant.

  • Flexible latency requirements: Your application can tolerate a minor increase in query latency, which is the time it takes to get a search result.

Performance trade-offs

Compared to the default performance-optimized index, a storage-optimized index has the following characteristics:

  • Increased query latency: Queries have slightly higher latency at a given recall level.

  • Reduced hosting cost: Index data is consolidated onto fewer shards, which lowers the cost of serving the index.

How to configure a storage-optimized index

To create a storage-optimized index, set the shardSize parameter to SHARD_SIZE_SO_DYNAMIC in your index configuration.

Example: Creating a storage-optimized index

The following example shows the metadata for creating a new streaming index that is storage-optimized.

{
  "displayName": "my-storage-optimized-index",
  "description": "An index configured to prioritize storage over performance.",
  "metadata": {
    "contentsDeltaUri": "gs://your-bucket/source-data/",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "shardSize": "SHARD_SIZE_SO_DYNAMIC"
    }
  },
  "indexUpdateMethod": "STREAM_UPDATE"
}

In the example, shardSize is set to SHARD_SIZE_SO_DYNAMIC, which instructs Vector Search to build a denser index. This allows each shard to hold significantly more data points, reducing the total number of shards needed for your dataset. Configure the other fields, such as dimensions and distanceMeasureType, according to your dataset and use case.
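As a sketch, the metadata above can also be assembled programmatically before you submit it to the index-creation endpoint. The helper below is hypothetical (its name and parameters are not part of any official client library); it simply builds the request body shown in the example, with SHARD_SIZE_SO_DYNAMIC as the key setting:

```python
import json

# Hypothetical helper: builds the index-creation request body shown above.
# Field names mirror the example metadata; adjust values to your own dataset.
def build_storage_optimized_index(display_name, contents_uri, dimensions,
                                  approximate_neighbors_count=150,
                                  distance_measure="DOT_PRODUCT_DISTANCE"):
    return {
        "displayName": display_name,
        "metadata": {
            "contentsDeltaUri": contents_uri,
            "config": {
                "dimensions": dimensions,
                "approximateNeighborsCount": approximate_neighbors_count,
                "distanceMeasureType": distance_measure,
                # The key setting: request a storage-optimized (denser) index.
                "shardSize": "SHARD_SIZE_SO_DYNAMIC",
            },
        },
        "indexUpdateMethod": "STREAM_UPDATE",
    }

body = build_storage_optimized_index(
    "my-storage-optimized-index", "gs://your-bucket/source-data/", 100)
print(json.dumps(body, indent=2))
```

You can then pass the resulting JSON to whatever tool you use to create indexes, such as a REST call or a client library.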

What's next?