Perform vector search in BigQuery

The BigQuery AI and ML SDK for ABAP lets you perform vector search on your vectorized enterprise data in BigQuery by using natural-language search strings, and return the results to your ABAP-based agents or applications.

BigQuery Vector Search lets you use GoogleSQL to run semantic search, using vector indexes for approximate results. With Vector Search, you can find semantically similar data points within large datasets by using high-dimensional vectors, also called embeddings. You can store your vector embeddings in BigQuery, use BigQuery as your vector database, and run vector similarity search on the embeddings by using the BigQuery function VECTOR_SEARCH.

Before you begin

Before using the BigQuery AI and ML SDK for ABAP with the Gemini models, make sure that you or your administrators have completed the following prerequisites:

Pricing

The BigQuery AI and ML SDK for ABAP is offered at no cost. However, you're responsible for the charges for the CREATE VECTOR INDEX statement and the VECTOR_SEARCH function, which use BigQuery compute pricing.

To generate a cost estimate based on your projected usage, use the pricing calculator.

For more information about BigQuery pricing, see the BigQuery pricing page.

Perform vector search on BigQuery

This section explains how to perform semantic vector search on your enterprise data stored in BigQuery from your ABAP application by using the BigQuery AI and ML SDK for ABAP.

Instantiate the BigQuery vector search invoker class

To perform vector search using a search string, you instantiate the class /GOOG/CL_BQ_VECTOR_SEARCH.

TRY.
    DATA(lo_bq_vector_search) = NEW /goog/cl_bq_vector_search( iv_key = 'CLIENT_KEY' ).
CATCH /goog/cx_sdk INTO DATA(lo_cx_sdk).
    cl_demo_output=>display( lo_cx_sdk->get_text( ) ).
ENDTRY.

Replace CLIENT_KEY with the client key that you've configured for authentication to Google Cloud during the authentication setup.

Find similar items for a search string

To run queries that perform vector search with the BigQuery function VECTOR_SEARCH, use the method FIND_NEAREST_NEIGHBORS of the class /GOOG/CL_BQ_VECTOR_SEARCH.

Pass an object of the class /GOOG/CL_BQ_QUERY, with the query already set on it, as input to the method.

lo_bq_vector_search->find_nearest_neighbors( io_query = lo_bq_query ).

LO_BQ_QUERY is a reference to an instance of the class /GOOG/CL_BQ_QUERY on which the query has been set. You can pass the search string as part of the query text.
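For illustration, the following is a minimal sketch of how LO_BQ_QUERY might be prepared before the call, with the search string embedded in the query text. It reuses LO_BQ_VECTOR_SEARCH from the instantiation step. The constructor parameter IV_LOCATION_ID and the method SET_QUERY with the parameter IV_QUERY_TEXT are assumptions about the /GOOG/CL_BQ_QUERY class that this section does not confirm, so check them against your SDK version. The dataset, table, column, and model names and the search string are hypothetical placeholders.

```abap
TRY.
    " Assumption: the query class takes the client key and a query location.
    DATA(lo_bq_query) = NEW /goog/cl_bq_query( iv_key         = 'CLIENT_KEY'
                                               iv_location_id = 'QUERY_LOCATION_ID' ).

    " Hypothetical GoogleSQL: embed the search string with ML.GENERATE_EMBEDDING
    " and look up its nearest neighbors with VECTOR_SEARCH.
    DATA(lv_query_text) =
      `SELECT base.product_id, base.description, distance ` &&
      `FROM VECTOR_SEARCH( ` &&
      `TABLE my_dataset.product_embeddings, ` &&
      `'ml_generate_embedding_result', ` &&
      `( SELECT ml_generate_embedding_result ` &&
      `FROM ML.GENERATE_EMBEDDING( ` &&
      `MODEL my_dataset.embedding_model, ` &&
      `( SELECT 'waterproof hiking boots' AS content ) ) ), ` &&
      `top_k => 5, distance_type => 'COSINE' )`.

    " Assumption: SET_QUERY accepts the query text through IV_QUERY_TEXT.
    lo_bq_query->set_query( iv_query_text = lv_query_text ).

    " Documented call: run the vector search for the prepared query.
    lo_bq_vector_search->find_nearest_neighbors( io_query = lo_bq_query ).
  CATCH /goog/cx_sdk INTO DATA(lo_cx_query).
    cl_demo_output=>display( lo_cx_query->get_text( ) ).
ENDTRY.
```

The query text is assembled from backquote literals rather than a string template so that the single quotes required by GoogleSQL need no escaping.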

Override vector search parameters

You can define vector search parameters (the arguments of the VECTOR_SEARCH function) in the saved query in BigQuery or in the query text that you pass. To override these parameters for the same query from ABAP application logic, use the method SET_SEARCH_PARAMETERS of the class /GOOG/CL_BQ_VECTOR_SEARCH. The search parameters in the initial query are completely overridden by the parameters passed through this method.

lo_bq_vector_search->set_search_parameters( iv_top_k                    = TOP_K
                                            iv_distance_type            = 'DISTANCE_TYPE'
                                            iv_fraction_lists_to_search = 'FRACTION_LISTS_TO_SEARCH' ).

Replace the following:

  • TOP_K: An INT64 value that specifies the number of nearest neighbors to return.
  • DISTANCE_TYPE: A string value that specifies the metric used to compute the distance between two vectors. For possible values, see the distance_type argument in the definition of the VECTOR_SEARCH function.
  • FRACTION_LISTS_TO_SEARCH: A string value that specifies the percentage of lists to search. For possible values, see the fraction_lists_to_search option in the definition of the VECTOR_SEARCH function.
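As an illustration, the placeholders might be filled in as follows. This is a sketch with example values only, not recommended settings. It assumes that LO_BQ_VECTOR_SEARCH was instantiated as shown earlier and that the override is set before the query is run. COSINE is one of the distance types that VECTOR_SEARCH supports (alongside EUCLIDEAN and DOT_PRODUCT), and 0.01 asks BigQuery to search 1% of the index lists.

```abap
" Illustrative values only: return the 5 nearest neighbors, measure
" similarity with cosine distance, and search 1% of the index lists.
" Assumption: lo_bq_vector_search already exists (see the instantiation step).
lo_bq_vector_search->set_search_parameters( iv_top_k                    = 5
                                            iv_distance_type            = 'COSINE'
                                            iv_fraction_lists_to_search = '0.01' ).
```

A larger fraction generally improves recall at the cost of slower, more expensive searches, so the value is worth tuning per use case.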

Get the vector search response

To receive processed responses from BigQuery ML for vector search queries and present them in a meaningful way, use the class /GOOG/CL_BQ_SEARCH_RESPONSE.

The response captured by the /GOOG/CL_BQ_SEARCH_RESPONSE class is chained to the requests made through the methods of the /GOOG/CL_BQ_VECTOR_SEARCH class. You can therefore access the response in a single statement, without variables to store intermediate results.

Get nearest neighbors for the search string

To get the nearest neighbors for the search string, use the GET_NEAREST_NEIGHBORS method. The number of neighbors returned depends on the value set for the TOP_K parameter of the function VECTOR_SEARCH in the invoked query.

DATA(lt_search_response) = lo_bq_vector_search->find_nearest_neighbors( io_query = lo_bq_query
                                             )->get_nearest_neighbors( ).

LT_SEARCH_RESPONSE also contains the distance of the search response item from the search string to indicate the degree of similarity.
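The following sketch shows one way to check the result before processing it. It assumes that LO_BQ_VECTOR_SEARCH and LO_BQ_QUERY were prepared as described earlier, and that the method raises /GOOG/CX_SDK on failure like the constructor does. It inspects only whether rows came back and how many, because this section does not list the field names of the response structure. The variable names are hypothetical.

```abap
TRY.
    " Assumption: lo_bq_vector_search and lo_bq_query already exist, and
    " lo_bq_query holds a query that calls VECTOR_SEARCH.
    DATA(lt_neighbors) = lo_bq_vector_search->find_nearest_neighbors( io_query = lo_bq_query
                                           )->get_nearest_neighbors( ).

    IF lt_neighbors IS INITIAL.
      cl_demo_output=>display( 'No neighbors were returned for the search string.' ).
    ELSE.
      " lines( ) returns the row count, which should not exceed TOP_K.
      cl_demo_output=>display( |Neighbors returned: { lines( lt_neighbors ) }| ).
    ENDIF.
  CATCH /goog/cx_sdk INTO DATA(lo_cx_neighbors).
    cl_demo_output=>display( lo_cx_neighbors->get_text( ) ).
ENDTRY.
```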

Get nearest neighbor for the search string

To get the nearest neighbor for the search string, use the GET_NEAREST_NEIGHBOR method. This method returns only the closest neighbor for the search string, regardless of the value set for the TOP_K parameter of the function VECTOR_SEARCH in the invoked query.

DATA(ls_search_response) = lo_bq_vector_search->find_nearest_neighbors( io_query = lo_bq_query
                                             )->get_nearest_neighbor( ).

LS_SEARCH_RESPONSE also contains the distance of the search response from the search string to indicate the degree of similarity.