About retrieval and ranking

This page describes how retrieval and ranking work together to deliver relevant search results in Vertex AI Search apps.

Overview

In short, retrieval finds relevant documents, and ranking orders the documents that were retrieved. Ranking every available document can be computationally expensive, so the two steps run sequentially: retrieval first narrows the candidate set, and ranking then orders only those candidates.

First, the search model understands the query and rewrites it. Then, depending on the data sources available and the number of indexed documents in your data store, the model retrieves documents on the order of thousands. Each retrieved document is assigned a relevance score.

The ranking model then orders the retrieved documents and serves the top 400 ranked results. The following image shows how these two processes fit into the search workflow.

Figure 1. Retrieval and ranking in search workflow
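
The sequential flow described above can be illustrated with a minimal Python sketch. This is not the Vertex AI Search implementation: the function retrieve_then_rank, the two toy scorers, and the parameters retrieve_k and serve_k are hypothetical names chosen for illustration. The sketch only shows why the expensive scorer runs on a retrieved subset rather than on the whole corpus.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def retrieve_then_rank(
    query: str,
    corpus: List[str],
    cheap_score: Scorer,
    expensive_score: Scorer,
    retrieve_k: int = 1000,   # hypothetical candidate-set size
    serve_k: int = 400,       # hypothetical number of served results
) -> List[Tuple[str, float]]:
    """Two-stage search: cheap retrieval first, costly ranking second."""
    # Stage 1 (retrieval): score every document with the cheap scorer
    # and keep only the best retrieve_k candidates.
    candidates = sorted(
        corpus, key=lambda doc: cheap_score(query, doc), reverse=True
    )[:retrieve_k]

    # Stage 2 (ranking): run the expensive scorer on the candidates only,
    # then order them by the new score.
    ranked = sorted(
        ((doc, expensive_score(query, doc)) for doc in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:serve_k]


def keyword_overlap(query: str, doc: str) -> float:
    """Toy retrieval signal: number of distinct query terms in the document."""
    query_terms = set(query.lower().split())
    return float(len(query_terms & set(doc.lower().split())))


def overlap_ratio(query: str, doc: str) -> float:
    """Toy ranking signal: fraction of document words that are query terms."""
    words = doc.lower().split()
    query_terms = set(query.lower().split())
    return sum(w in query_terms for w in words) / len(words) if words else 0.0


corpus = [
    "red apple pie",
    "apple",
    "green pear",
    "apple pie recipe with red apples and more",
]
results = retrieve_then_rank(
    "apple pie", corpus, keyword_overlap, overlap_ratio, retrieve_k=3, serve_k=2
)
print([doc for doc, _ in results])
```

In this toy run, "green pear" is dropped during retrieval and never reaches the ranking scorer, which is the cost saving that motivates the two-stage design.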

Retrieval methods

Retrieval is the process of selecting a subset of documents from your data store that are relevant to a user's query. The Vertex AI Search model manages retrieval for your search apps and assigns relevance scores based on signals such as the following:

Topicality: Includes keyword matching, knowledge graphs, and web signals.
Embeddings: Uses embeddings to find conceptually similar content.
Cross-attention: Allows a model to consider the relationship between a query and a document to assign a relevance score to the document.
Freshness: Considers the age of the documents in the data store.
User events: Includes conversion signals used for personalization.

Additionally, in a search request, you can supply relevance filters and metadata filters to narrow down the list of relevant documents. These filters apply to website data as well as structured and unstructured data.
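
As a rough illustration of how the two kinds of filters narrow a candidate list, the following Python sketch applies a relevance threshold and a metadata match to already-scored candidates. It does not call the Vertex AI Search API: the Candidate class, the apply_filters function, and the field names are hypothetical, and the real filter syntax is defined in the product's API reference.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Candidate:
    """Hypothetical retrieved document with a relevance score and metadata."""
    doc_id: str
    relevance: float
    metadata: Dict[str, str] = field(default_factory=dict)


def apply_filters(
    candidates: List[Candidate],
    min_relevance: float,
    required_metadata: Dict[str, str],
) -> List[Candidate]:
    """Keep candidates that pass both the relevance and the metadata filter."""
    kept = []
    for candidate in candidates:
        # Relevance filter: drop documents scored below the threshold.
        if candidate.relevance < min_relevance:
            continue
        # Metadata filter: every required key must match exactly.
        if all(
            candidate.metadata.get(key) == value
            for key, value in required_metadata.items()
        ):
            kept.append(candidate)
    return kept


candidates = [
    Candidate("doc-1", 0.92, {"language": "en", "category": "guide"}),
    Candidate("doc-2", 0.35, {"language": "en", "category": "guide"}),
    Candidate("doc-3", 0.88, {"language": "de", "category": "guide"}),
]
filtered = apply_filters(candidates, min_relevance=0.5,
                         required_metadata={"language": "en"})
print([c.doc_id for c in filtered])
```

Here doc-2 fails the relevance filter and doc-3 fails the metadata filter, so only doc-1 remains.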

Ranking methods

Ranking takes the documents that are selected during the retrieval phase, assigns them a new relevance score according to the following conditions, and reorders them:

Boost: Promotes or demotes certain results according to custom attributes or freshness. Boost affects the first 1,000 retrieved documents and ranks the top 400. For more information, see Boost search results.
Custom ranking: Controls, tunes, and overrides the default ranking logic with a formula-based ranking algorithm to suit your specific requirements. The relevance score that custom ranking assigns takes precedence when serving the results. For more information, see Customize search results ranking.
Search tuning: Impacts how the model perceives the semantic relevance of your documents and changes the embedding relevance scores. For more information, see Improve search results with search tuning.
Event-based reranking: Updates the results at serving time by using a personalization model based on user events.
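
The boost behavior described above (adjust scores within a window of retrieved documents, then serve the top results) can be sketched in Python as follows. This is an assumption-laden illustration, not the product's ranking logic: the apply_boost function, the additive way the boost is combined with the score, and the document fields are all hypothetical. Only the window of 1,000 documents and the 400 served results come from the text.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, str]
Scored = Tuple[Doc, float]


def apply_boost(
    retrieved: List[Scored],
    condition: Callable[[Doc], bool],
    boost: float,
    boost_window: int = 1000,  # boost affects the first 1,000 retrieved documents
    serve_k: int = 400,        # the top 400 ranked results are served
) -> List[Scored]:
    """Promote (boost > 0) or demote (boost < 0) matching documents.

    Assumes `retrieved` is already ordered by retrieval relevance and that
    the boost is simply added to the score (a simplification).
    """
    # Only documents inside the window are eligible for reordering.
    window = retrieved[:boost_window]
    adjusted = [
        (doc, (score + boost) if condition(doc) else score)
        for doc, score in window
    ]
    # Reorder by the adjusted score and keep the served slice.
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return adjusted[:serve_k]


retrieved = [
    ({"id": "a", "category": "news"}, 0.9),
    ({"id": "b", "category": "blog"}, 0.8),
    ({"id": "c", "category": "blog"}, 0.5),
    ({"id": "d", "category": "blog"}, 0.4),
]
boosted = apply_boost(
    retrieved,
    condition=lambda doc: doc["category"] == "blog",
    boost=0.3,
    boost_window=3,
    serve_k=2,
)
print([doc["id"] for doc, _ in boosted])
```

With the small window used in this example, document "d" matches the boost condition but sits outside the window, so it is never promoted. Document "b" overtakes "a" because of the boost.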