# Choose among vector distance functions to measure vector embeddings similarity

| **Note:** This feature is available with the Spanner Enterprise edition and Enterprise Plus edition. For more information, see the [Spanner editions overview](/spanner/docs/editions-overview).

*Last updated (UTC): 2025-08-26.*

This page describes how to choose among the vector distance functions that
Spanner provides to measure similarity between vector embeddings.

After you've [generated embeddings](/spanner/docs/ml-tutorial-embeddings) from
your Spanner data, you can perform a similarity search using vector distance
functions. Spanner provides three vector distance functions:
`COSINE_DISTANCE()`, `EUCLIDEAN_DISTANCE()`, and `DOT_PRODUCT()`.

Choose a similarity measure
---------------------------

Whether or not all of your vector embeddings are normalized determines which
similarity measure to use. A normalized vector embedding has a magnitude
(length) of exactly 1.0.

In addition, if you know which distance function your model was trained with,
use that same distance function to measure similarity between your vector
embeddings.

**Normalized data**

If all of the vector embeddings in your dataset are normalized, then all three
functions produce the same semantic search results. In essence, although each
function returns a different value, those values sort the same way. When
embeddings are normalized, `DOT_PRODUCT()` is usually the most computationally
efficient, but the difference is negligible in most cases.
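The "sort the same way" claim follows from a simple identity: for unit vectors, cosine distance equals `1 − dot`, and squared Euclidean distance equals `2 − 2·dot`, so all three measures induce the same ranking. A minimal Python sketch (illustrative only, not Spanner code) that checks this on a few example vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 - (a . b) / (|a| * |b|); for unit vectors this reduces to 1 - a . b.
    return 1 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

# Example query and candidate embeddings, all normalized to magnitude 1.0.
query = normalize([0.3, 0.9, 0.1])
candidates = [normalize(v) for v in
              ([0.2, 1.0, 0.0], [0.9, 0.1, 0.4], [0.3, 0.8, 0.2])]

# Rank candidates by each measure: ascending for the two distances,
# descending for dot product (a larger dot product means more similar).
by_cos = sorted(range(3), key=lambda i: cosine_distance(query, candidates[i]))
by_euc = sorted(range(3), key=lambda i: euclidean_distance(query, candidates[i]))
by_dot = sorted(range(3), key=lambda i: -dot(query, candidates[i]))

print(by_cos == by_euc == by_dot)  # → True: all three agree on the ordering
```

Because the orderings are identical on normalized data, choosing among the functions there is purely a matter of compute cost, not result quality.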
However, if your application is highly performance sensitive, `DOT_PRODUCT()`
might help with performance tuning.

**Non-normalized data**

If the vector embeddings in your dataset aren't normalized, then it's not
mathematically correct to use `DOT_PRODUCT()` as a distance function, because
the dot product doesn't measure distance. Depending on how the embeddings were
generated and what type of search is preferred, either `COSINE_DISTANCE()` or
`EUCLIDEAN_DISTANCE()` produces search results that are subjectively better
than the other. Experiment with both `COSINE_DISTANCE()` and
`EUCLIDEAN_DISTANCE()` to determine which is best for your use case.

**Unsure if data is normalized or non-normalized**

If you're unsure whether your data is normalized and you want to use
`DOT_PRODUCT()`, we recommend that you use `COSINE_DISTANCE()` instead.
`COSINE_DISTANCE()` is like `DOT_PRODUCT()` with normalization built in.
Similarity measured using `COSINE_DISTANCE()` ranges from `0` to `2`. A result
close to `0` indicates that the vectors are very similar.

What's next
-----------

- Learn how to [perform a vector search by finding the k-nearest neighbors](/spanner/docs/find-k-nearest-neighbors).
- Learn how to [export embeddings to Vertex AI Vector Search](/spanner/docs/vector-search-embeddings).
- Learn more about the GoogleSQL [`COSINE_DISTANCE()`, `EUCLIDEAN_DISTANCE()`, and `DOT_PRODUCT()`](/spanner/docs/reference/standard-sql/mathematical_functions) functions.
- Learn more about the PostgreSQL [`spanner.cosine_distance()`, `spanner.euclidean_distance()`, and `spanner.dot_product()`](/spanner/docs/reference/postgresql/functions-and-operators#mathematical) functions.
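As a footnote to the guidance above on unsure-if-normalized data, the following Python sketch (illustrative only, not Spanner's implementation) shows why `COSINE_DISTANCE()` behaves like `DOT_PRODUCT()` with normalization built in, and why its results fall in the `0` to `2` range:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # Equivalent to 1 - dot(a/|a|, b/|b|): each input is normalized first,
    # so vector magnitudes don't affect the result.
    return 1 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [1.0, 2.0, 3.0]
scaled_a = [10.0, 20.0, 30.0]   # same direction, 10x the magnitude
opposite = [-1.0, -2.0, -3.0]   # opposite direction

print(cosine_distance(a, a))         # ~0.0: identical direction
print(cosine_distance(a, scaled_a))  # ~0.0: magnitude is ignored
print(cosine_distance(a, opposite))  # ~2.0: opposite direction (the maximum)
```

Because the normalization happens inside the function, `COSINE_DISTANCE()` gives the same answer whether or not the stored embeddings were normalized, which is what makes it the safe default when you can't verify your data.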