SemanticCacheLookup policy

This page applies to Apigee and Apigee hybrid.

View Apigee Edge documentation.

Overview

The SemanticCacheLookup policy is an advanced caching policy designed to optimize the performance of AI workloads, particularly those involving Large Language Models (LLMs).

The policy uses the Vertex AI Text embeddings API to generate embeddings for text and Vector Search to find similar prompts based on semantic similarity, rather than exact matches.

The SemanticCacheLookup policy can reduce response times for repeated queries and optimize costs by reducing call volume to LLMs.

This policy is used in conjunction with the SemanticCachePopulate policy.

This policy is an Extensible policy and use of this policy might have cost or utilization implications, depending on your Apigee license. For information on policy types and usage implications, see Policy types.

Before you begin

Before you use the SemanticCacheLookup policy, you must complete the following tasks:

Create a Vertex AI project.
Create a Vector Search index.
Create a Vertex AI endpoint for the index.
Create a SemanticCachePopulate policy.

For more information on completing these tasks, see Get started with Semantic Caching policies.

Required roles

To get the permissions that you need to apply and use the SemanticCacheLookup policy, ask your administrator to grant you the AI Platform User (roles/aiplatform.user) IAM role on the service account you use to deploy Apigee proxies. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Enable APIs

Enable the Compute Engine, Vertex AI, and Cloud Storage APIs.

Enable the APIs

`<SemanticCacheLookup>` element

Defines a SemanticCacheLookup policy.

Default Value	See Default Policy tab, below
Required?	Required
Type	Complex object
Parent Element	N/A
Child Elements	`<DisplayName>` `<IgnoreUnresolvedVariables>` `<UserPromptSource>` `<Embeddings>` `<SimilaritySearch>`

The <SemanticCacheLookup> element uses the following syntax:

Syntax

The <SemanticCacheLookup> element uses the following syntax:

<SemanticCacheLookup async="false" continueOnError="false" enabled="true" name="SCL-lookup">
  <DisplayName>SCL-lookup</DisplayName>
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <UserPromptSource>{jsonPath($.contents[-1].parts[-1].text,request.content,true)}</UserPromptSource>
  <Embeddings>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>
    </VertexAI>
  </Embeddings>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>
      <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
      <Threshold>0.95</Threshold>
    </VertexAI>
  </SimilaritySearch>
</SemanticCacheLookup>

Default Policy

The following example shows the default settings when you add a SemanticCacheLookup policy to your flow in the Apigee UI:

<SemanticCacheLookup async="false" continueOnError="false"enabled="true" name="SCL-lookup">
  <DisplayName>SCL-lookup</DisplayName>
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <UserPromptSource>{jsonPath($.contents[-1].parts[-1].text,request.content,true)}</UserPromptSource>
  <Embeddings>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict
      </URL>
    </VertexAI>
  </Embeddings>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>
      <Threshold>0.9</Threshold>
      <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
    </VertexAI>
  </SimilaritySearch>
</SemanticCacheLookup>

When you insert a new SemanticCacheLookup policy in the Apigee UI, the template contains stubs for all possible operations. See below for information on required elements.

This element has the following attributes that are common to all policies:

Attribute	Default	Required?	Description
`name`	N/A	Required	The internal name of the policy. The value of the `name` attribute can contain letters, numbers, spaces, hyphens, underscores, and periods. This value cannot exceed 255 characters. Optionally, use the `<DisplayName>` element to label the policy in the management UI proxy editor with a different, natural-language name.
`continueOnError`	false	Optional	Set to `false` to return an error when a policy fails. This is expected behavior for most policies. Set to `true` to have flow execution continue even after a policy fails. See also: Fault rules are triggered ONLY in an error state (about continueOnError) Handling faults within the current flow
`enabled`	true	Optional	Set to `true` to enforce the policy. Set to `false` to turn off the policy. The policy will not be enforced even if it remains attached to a flow.
`async`	false	Deprecated	This attribute is deprecated.

The following table provides a high level description of the child elements of <SemanticCacheLookup>:

Child Element	Required?	Description
`<DisplayName>`	Optional	The name of the policy.
`<IgnoreUnresolvedVariables>`	Optional	Determines whether processing stops when a variable is unresolved. Set to `true` to ignore unresolved variables and continue processing.
`<UserPromptSource>`	Optional	The location of the payload for the user prompt text to be extracted. Only string text values are supported. This field supports Apigee message template syntax, including the use of variables or JSON Path functions. For example: {jsonPath($.contents[-1].parts[-1].text,request.content,true)}
`<Embeddings>`	Required	Element containing the information required to generate embeddings.
`<SimilaritySearch>`	Required	Element containing the information required to perform similarity searches. For more information, see Query public index to get nearest neighbors.

Child element reference

This section describes the child elements of <SemanticCacheLookup>.

`<DisplayName>`

Use in addition to the name attribute to label the policy in the management UI proxy editor with a different, more natural-sounding name.

The <DisplayName> element is common to all policies.

Default Value	N/A
Required?	Optional. If you omit `<DisplayName>`, the value of the policy's `name` attribute is used.
Type	String
Parent Element	<`PolicyElement`>
Child Elements	None

The <DisplayName> element uses the following syntax:

Syntax

<PolicyElement>
  <DisplayName>POLICY_DISPLAY_NAME</DisplayName>
  ...
</PolicyElement>

Example

<PolicyElement>
  <DisplayName>My Validation Policy</DisplayName>
</PolicyElement>

The <DisplayName> element has no attributes or child elements.

<IgnoreUnresolvedVariables>

Determines whether processing stops when a variable is unresolved. Set to true to ignore unresolved variables and continue processing.

IgnoreUnresolvedVariables is not applicable when <DefaultValue> is provided.

Default Value	False
Required?	Optional
Type	Boolean
Parent Element	`<SemanticCacheLookup>`
Child Elements	None

`<UserPromptSource>`

The location of the payload for the user prompt text to be extracted. Only string text values are supported.

This field supports Apigee message template syntax, including the use of variables or JSON Path functions.

For example:

{jsonPath($.contents[-1].parts[-1].text,request.content,true)}

Default Value	{jsonPath($.contents[-1].parts[-1].text,request.content,true)}
Required?	Optional
Type	String
Parent Element	`<SemanticCacheLookup>`
Child Elements	None

`<Embeddings>`

This element contains the information required to generate text embeddings.

Default Value	N/A
Required?	Optional
Type	String
Parent Element	`<SemanticCacheLookup>`
Child Elements	`<VertexAI>`

The <Embeddings> element uses the following syntax:

<Embeddings>
  <VertexAI>
    <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>
  </VertexAI>
</Embeddings>

<VertexAI> (child of `<Embeddings>`)

Contains the <URL> element for Vertex AI-specific attributes.

Default Value	N/A
Required?	Required
Type	String
Parent Element	`<Embeddings>`
Child Elements	`<URL>`

The VertexAI element uses the following syntax:

<VertexAI>
  <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>
</VertexAI>

<URL> (child of `<VertexAI>`)

The URL used to generate text embeddings. See Supported models for a list of models that can provide text embeddings for the SemanticCacheLookup policy.

Default Value	N/A
Required?	Required
Type	String
Parent Element	`<VertexAI>`
Child Elements	None

The URL element uses the following syntax:

<URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>

`<SimilaritySearch>`

This element contains the information required to perform similarity searches.

For more information, see Query public index to get nearest neighbors.

Default Value	N/A
Required?	Required
Type	String
Parent Element	`<SemanticCacheLookup>`
Child Elements	`<VertexAI>`

The <SimilaritySearch> element uses the following syntax:

<SimilaritySearch>
  <VertexAI>
    <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors
    </URL>
    <Threshold>0.9</Threshold>
    <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
  </VertexAI>
</SimilaritySearch>

<VertexAI> (child of `<SimilaritySearch>`)

Contains the <URL> element for Vertex AI-specific attributes.

Default Value	N/A
Required?	Required
Type	String
Parent Element	`<SimilaritySearch>`
Child Elements	`<URL>`

The VertexAI element uses the following syntax:

<VertexAI>
  <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>
  <Threshold>0.9</Threshold>
  <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
</VertexAI>

The following table provides a high-level description of the child elements of <VertexAI>.

Child Element Required? Description

Child Element	Required?	Description
`<URL>`	Required	String The URL used to perform similarity searches. The highest matching data point, based on the similarity threshold, will be the only data point used.
`<Threshold>`	Optional	String Similarity score used to determine if two prompts are considered a match. A value between 0 and 1. The default value is 0.9.
`<DeployedIndexID>`	Required	String The ID of the index deployed on the index endpoint used for semantic caching.

<URL>

Required

String

The URL used to perform similarity searches. The highest matching data point, based on the similarity threshold, will be the only data point used.

<Threshold>

Optional

String

Similarity score used to determine if two prompts are considered a match. A value between 0 and 1.

The default value is 0.9.

<DeployedIndexID>

Required

String

The ID of the index deployed on the index endpoint used for semantic caching.

Flow variables

Flow variables can be used to configure dynamic runtime behavior for policies and flows, based on HTTP headers or message content, or the context available in the Flow. For more information about flow variables, see Flow variables reference.

The policy can set these read-only variables during execution.

Variable name	Description
`request.content`	Contains the full content of the incoming API request.
`request.url`	Contains the URL of the incoming API request.
`semanticcache.lookup.policy_name.user_prompt`	Contains specific components extracted from the request prompt, which is used for generating embeddings or performing similarity searches.
`semanticcache.lookup.policy_name.embeddings_request`	Contains the request payload sent to the Vertex AI Embeddings API to generate text embeddings for the input text.
`semanticcache.lookup.policy_name.embeddings_response`	Contains the response from the Vertex AI Embeddings API, which includes the generated text embeddings.
`semanticcache.lookup.policy_name.dense_embeddings`	Contains the actual numerical embedding values generated by the Vertex AI Embeddings API.
`semanticcache.lookup.policy_name.is_nearest_neighbor_hit`	Specifies whether a nearest neighbor was found in the vector database for the given request and datapoint meets similarity threshold.
`semanticcache.lookup.policy_name.cache_hit`	Specifies whether the response was found in the semantic cache.
`semanticcache.lookup.policy_name.cached_llm_response`	Contains the response retrieved from the semantic cache (if a cache hit occurred).

Error reference

This section describes the fault codes and error messages that are returned and fault variables that are set by Apigee specific to the <SemanticCacheLookup> policy. This information is important to know if you are developing fault rules to handle faults. To learn more, see What you need to know about policy errors and Handling faults.

Runtime errors

These errors can occur when the policy executes.

Fault code	HTTP status	Cause
`steps.semanticcache.lookup.MessageTemplateExtractionFailed`	`400`	Failed to extract data from the request using the JSON Path expression.
`steps.semanticcache.lookup.FailedToExtractUserPrompt`	`500`	Unable to extract the user prompt from the API request.
`steps.semanticcache.lookup.EmbeddingsServiceUnavailable`	`400`	The Vertex AI Embeddings service is currently unavailable.
`steps.semanticcache.lookup.EmbeddingsAPIFailed`	`400`	The Vertex AI Embeddings service failed.
`steps.semanticcache.lookup.VectorSearchServiceUnavailable`	`400`	The Vertex AI Vector Search service is currently unavailable.
`steps.semanticcache.lookup.VectorSearchAPIFailed`	`400`	The Vertex AI Vector Search service failed.
`steps.semanticcache.lookup.AuthenticationFailure`	`500`	The service account doesn't have required permissions.
`steps.semanticcache.lookup.InternalError`	`500`	An unexpected error occurred within the SemanticCacheLookup policy.
`steps.semanticcache.lookup.CalloutError`	`500`	The Vertex AI service call failed.

Deployment errors

These errors can occur when you deploy a proxy containing this policy.

Error name	Cause
`The Embeddings/VertexAI element is required.`	Occurs if the <VertexAI> element in <Embeddings> is empty.
`The SimilaritySearch/VertexAI element is required.`	Occurs if the <VertexAI> element in <SimilaritySearch> is empty.
`The Embeddings/URL element is required.`	Occurs if the <URL> element in <Embeddings> is empty.
`The SimilaritySearch/URL element is required.`	Occurs if the <URL> element in <SimilaritySearch> is empty.
`Embeddings URL {url} is invalid.`	Occurs if the <URL> element in <Embeddings> is empty or invalid.
`The SimilaritySearch URL {url} is invalid.`	Occurs if the <URL> element in <SimilaritySearch> is empty or invalid.
`The scheme {http-scheme} of Embeddings URL {url} must be one of http, https.`	Occurs if the Embeddings <URL> element's `http` scheme is invalid.
`The scheme {http-scheme} of SimilaritySearch URL {url} must be one of http, https.`	Occurs if the SimilaritySearch <URL> element's `http` scheme is invalid.
`SimilaritySearch/Threshold element must be >= 0 and <= 1.`	If the attribute is not between 0 and 1, then the deployment of the API proxy fails.
`SimilaritySearch/DeployedIndexID element is required.`	Occurs if the <DeployedIndexID> element in <SimilaritySearch> is empty.
`SimilaritySearch/DeployedIndexID element must not contain spaces.`	Occurs if the <DeployedIndexID> element in <SimilaritySearch> contains spaces.

Fault variables

These variables are set when this policy triggers an error at runtime. For more information, see What you need to know about policy errors.

Variables	Where	Example
`fault.name="FAULT_NAME"`	`FAULT_NAME` is the name of the fault, as listed in the Runtime errors table above. The fault name is the last part of the fault code.	`fault.name Matches "UnresolvedVariable"`
`semanticcachelookup.POLICY_NAME.failed`	`POLICY_NAME` is the user-specified name of the policy that threw the fault.	`semanticcachelookup.SC-lookup.failed = true`

Example error response

Note: For error handling, the best practice is to trap the errorcode part of the error response. Do not rely on the text in the faultstring, because it could change.

{
  "fault": {
    "faultstring": "SemanticCacheLookup[SC-lookup]: unable to resolve variable [variable_name]",
    "detail": {
      "errorcode": "steps.semanticcachelookup.UnresolvedVariable"
    }
  }
}

Example fault rule

<FaultRule name="SemanticCacheLookup Faults">
    <Step>
        <Name>SCL-CustomSetVariableErrorResponse</Name>
        <Condition>(fault.name = "SetVariableFailed")</Condition>
    </Step>
    <Condition>(semanticcachelookup.failed = true)</Condition>
</FaultRule>

Schemas

Each policy type is defined by an XML schema (.xsd). For reference, policy schemas are available on GitHub.

SemanticCacheLookup policy Stay organized with collections Save and categorize content based on your preferences.

Overview

Before you begin

Required roles

Enable APIs

<SemanticCacheLookup> element

Syntax

Default Policy

Child element reference

<DisplayName>

Syntax

Example

<IgnoreUnresolvedVariables>

<UserPromptSource>

<Embeddings>

<VertexAI> (child of <Embeddings>)

<URL> (child of <VertexAI>)

<SimilaritySearch>

<VertexAI> (child of <SimilaritySearch>)

Flow variables

Error reference

Runtime errors

Deployment errors

Fault variables

Example error response

Example fault rule

Schemas

SemanticCacheLookup policy

`<SemanticCacheLookup>` element

`<DisplayName>`

`<UserPromptSource>`

`<Embeddings>`

<VertexAI> (child of `<Embeddings>`)

<URL> (child of `<VertexAI>`)

`<SimilaritySearch>`

<VertexAI> (child of `<SimilaritySearch>`)