Overview
The SemanticCachePopulate policy is an advanced caching policy designed to optimize the performance of AI workloads, particularly those involving Large Language Models (LLMs).
The policy uses the Vertex AI Text embeddings API to generate embeddings for text and Vector Search to cache API responses based on semantic similarity, rather than exact matches.
The SemanticCachePopulate policy can reduce response times for repeated queries and optimize costs by reducing call volume to LLMs.
This policy is used in conjunction with the SemanticCacheLookup policy.
This policy is an Extensible policy and use of this policy might have cost or utilization implications, depending on your Apigee license. For information on policy types and usage implications, see Policy types.
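As a rough illustration of how the two policies pair up, the following sketch attaches a lookup step to the request side and this policy to the response side of a proxy endpoint PreFlow. The step placement and the policy name SCL-lookup are assumptions made for illustration and are not taken from this reference; SCP-populate matches the name used in the examples on this page.

```xml
<!-- Hypothetical ProxyEndpoint fragment: policy names and placement are assumptions. -->
<PreFlow name="PreFlow">
  <Request>
    <!-- Assumed SemanticCacheLookup policy that checks the cache before the LLM is called. -->
    <Step>
      <Name>SCL-lookup</Name>
    </Step>
  </Request>
  <Response>
    <!-- SemanticCachePopulate policy that stores the response for later lookups. -->
    <Step>
      <Name>SCP-populate</Name>
    </Step>
  </Response>
</PreFlow>
```

See Get started with Semantic Caching policies for the supported setup.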
Before you begin
Before you use the SemanticCachePopulate policy, you must complete the following tasks:
- Create a Vertex AI project.
- Create a Vector Search index.
- Create a Vertex AI endpoint for the index.
- Create a SemanticCachePopulate policy.
For more information on completing these tasks, see Get started with Semantic Caching policies.
Roles and permissions
To get the permissions that you need to apply and use the SemanticCachePopulate policy, ask your administrator to grant you the AI Platform User (roles/aiplatform.user) IAM role on the service account you use to deploy Apigee proxies.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Enable APIs
Enable the Compute Engine, Vertex AI, and Cloud Storage APIs.
<SemanticCachePopulate> element
Defines a SemanticCachePopulate policy.
Default Value | See Default Policy tab, below |
Required? | Required |
Type | Complex object |
Parent Element | N/A |
Child Elements | <DisplayName> <IgnoreUnresolvedVariables> <SimilaritySearch> <TTLInSeconds> |
Syntax
The <SemanticCachePopulate> element uses the following syntax:
<SemanticCachePopulate async="false" continueOnError="false" enabled="true" name="SCP-populate">
  <DisplayName>SCP-populate</DisplayName>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
    </VertexAI>
  </SimilaritySearch>
  <TTLInSeconds>{EXPIRATION_TIME_IN_SECONDS}</TTLInSeconds>
</SemanticCachePopulate>
Default Policy
The following example shows the default settings when you add a SemanticCachePopulate policy to your API proxy in the Apigee UI:
<SemanticCachePopulate async="false" continueOnError="false" enabled="true" name="SCP-populate">
  <DisplayName>SCP-populate</DisplayName>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
    </VertexAI>
  </SimilaritySearch>
  <TTLInSeconds>60</TTLInSeconds>
</SemanticCachePopulate>
When you insert a new SemanticCachePopulate policy in the Apigee UI, the template contains stubs for all possible operations. See below for information on required elements.
This element has the following attributes that are common to all policies:
Attribute | Default | Required? | Description |
---|---|---|---|
name | N/A | Required | The internal name of the policy. Optionally, use the <DisplayName> element to label the policy in the management UI proxy editor with a different, more natural-sounding name. |
continueOnError | false | Optional | Set to false to return an error when a policy fails. This is expected behavior for most policies. Set to true to have flow execution continue even after a policy fails. |
enabled | true | Optional | Set to true to enforce the policy. Set to false to turn off the policy. The policy will not be enforced even if it remains attached to a flow. |
async | false | Deprecated | This attribute is deprecated. |
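One possible use of the continueOnError attribute, shown here as a sketch rather than a recommended setting, is to let the API response reach the client even when the cache update fails. The configuration below reuses the syntax from this page; only continueOnError is changed, and the placeholder values in the URL still need to be replaced with your own.

```xml
<!-- Sketch: a failed cache update does not interrupt the response flow. -->
<SemanticCachePopulate continueOnError="true" enabled="true" name="SCP-populate">
  <DisplayName>SCP-populate</DisplayName>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
  <SimilaritySearch>
    <VertexAI>
      <!-- LOCATION, PROJECT_ID, and INDEX_ID are placeholders. -->
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
    </VertexAI>
  </SimilaritySearch>
  <TTLInSeconds>60</TTLInSeconds>
</SemanticCachePopulate>
```

With this setting, a failed upsert means the response is simply not cached, so later similar requests would likely still reach the LLM.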
The following table provides a high-level description of the child elements of <SemanticCachePopulate>:
Child Element | Required? | Description |
---|---|---|
<DisplayName> | Optional | The name of the policy. |
<IgnoreUnresolvedVariables> | Optional | Determines whether processing stops when a variable is unresolved. |
<SimilaritySearch> | Required | Element containing the information required to update the vector index. For more information, see Upsert datapoints. The expiration time for data points is <TTLInSeconds> from the time of entry. |
<TTLInSeconds> | Optional | The time to live (TTL) for the cached responses, in seconds. The default value is 60. |
Example
This section provides an example using <SemanticCachePopulate>.
<SemanticCachePopulate async="false" continueOnError="false" enabled="true" name="SCP-populate">
  <DisplayName>SCP-populate</DisplayName>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
    </VertexAI>
  </SimilaritySearch>
  <TTLInSeconds>60</TTLInSeconds>
</SemanticCachePopulate>
Child element reference
This section describes the child elements of <SemanticCachePopulate>.
<DisplayName>
Use in addition to the name attribute to label the policy in the management UI proxy editor with a different, more natural-sounding name.
The <DisplayName> element is common to all policies.
Default Value | N/A |
Required? | Optional. If you omit <DisplayName>, the value of the policy's name attribute is used. |
Type | String |
Parent Element | <PolicyElement> |
Child Elements | None |
The <DisplayName> element uses the following syntax:
Syntax
<PolicyElement>
  <DisplayName>POLICY_DISPLAY_NAME</DisplayName>
  ...
</PolicyElement>
Example
<PolicyElement>
  <DisplayName>My Validation Policy</DisplayName>
</PolicyElement>
The <DisplayName> element has no attributes or child elements.
<IgnoreUnresolvedVariables>
Determines whether processing stops when a variable is unresolved. Set to true to ignore unresolved variables and continue processing.
<IgnoreUnresolvedVariables> is not applicable when <DefaultValue> is provided.
Default Value | False |
Required? | Optional |
Type | Boolean |
Parent Element | <SemanticCachePopulate> |
Child Elements | None |
<SimilaritySearch>
Element containing the information required to update the vector index.
For more information, see Upsert datapoints.
The expiration time for data points is <TTLInSeconds> from the time of entry.
Default Value | N/A |
Required? | Required |
Type | String |
Parent Element | <SemanticCachePopulate> |
Child Elements | <VertexAI> |
The <SimilaritySearch> element uses the following syntax:
<SimilaritySearch>
  <VertexAI>
    <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
  </VertexAI>
</SimilaritySearch>
<VertexAI> (child of <SimilaritySearch>)
Contains the <URL> element for Vertex AI-specific attributes.
Default Value | N/A |
Required? | Required |
Type | String |
Parent Element | <SimilaritySearch> |
Child Elements | <URL> |
The <VertexAI> element uses the following syntax:
<VertexAI>
  <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
</VertexAI>
<URL> (child of <VertexAI>)
The URL used to upsert datapoints in the vector index.
Default Value | N/A |
Required? | Required |
Type | String |
Parent Element | <VertexAI> |
Child Elements | None |
The <URL> element uses the following syntax:
<URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}:upsertDatapoints</URL>
<TTLInSeconds>
Element specifying the time to live (TTL) for the cached responses, in seconds. The default value is 60.
For more information, see Update and rebuild an active index.
Default Value | 60 |
Required? | Optional |
Type | String |
Parent Element | <SemanticCachePopulate> |
Child Elements | None |
Flow variables
Flow variables can be used to configure dynamic runtime behavior for policies and flows, based on HTTP headers or message content, or the context available in the Flow. For more information about flow variables, see Flow variables reference.
The policy can set these read-only variables during execution.
Variable name | Description |
---|---|
response.content | Contains the full content of the API response. |
semanticcache.populate.policy_name.upsert_index_request | Contains the request payload sent to the Vertex AI Vector Search API to update the vector index with new embeddings and metadata. |
semanticcache.populate.policy_name.upsert_index_response | Contains the response from the Vertex AI Vector Search API, indicating the success or failure of the index update operation. |
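As a hedged sketch of how these variables might be inspected while debugging, the AssignMessage policy below copies the upsert response into a custom flow variable. It assumes the SemanticCachePopulate policy is named SCP-populate, so that policy_name resolves to SCP-populate; the policy name AM-capture-upsert-result and the variable debug.semantic_cache_upsert_response are hypothetical names chosen for this example.

```xml
<!-- Hypothetical AssignMessage policy, attached after SCP-populate in the response flow. -->
<AssignMessage continueOnError="false" enabled="true" name="AM-capture-upsert-result">
  <AssignVariable>
    <!-- Custom variable name chosen for this sketch. -->
    <Name>debug.semantic_cache_upsert_response</Name>
    <!-- Assumes the populate policy is named SCP-populate. -->
    <Ref>semanticcache.populate.SCP-populate.upsert_index_response</Ref>
  </AssignVariable>
  <!-- Do not fail the flow if the source variable was never set. -->
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>
```

The copied value can then be viewed in a debug session alongside the other flow variables.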
Error reference
This section describes the fault codes and error messages that are returned and fault variables that are set by Apigee specific to the <SemanticCachePopulate> policy.
This information is important to know if you are developing fault rules to handle faults. To learn more, see What you need to know about policy errors and Handling faults.
Runtime errors
These errors can occur when the policy executes.
Fault code | HTTP status | Cause |
---|---|---|
steps.semanticcachepopulate.VectorSearchUpsertServiceUnavailable | 400 | This error occurs if the Vector Search Upsert Datapoints API is unavailable. |
steps.semanticcache.populate.VectorSearchUpsertAPIFailed | 500 | This error occurs if the Vector Search Upsert Datapoints API service fails. |
steps.semanticcache.populate.AuthenticationFailure | 500 | This error occurs if the service account does not have the required permissions. |
steps.semanticcache.populate.CalloutError | 500 | This error occurs if the Vertex AI service call fails. |
steps.semanticcache.populate.InternalError | 500 | This error occurs in the event of an unexpected internal error. |
Deployment errors
These errors can occur when you deploy a proxy containing this policy.
Error name | Cause |
---|---|
The SimilaritySearch URL {url} is invalid. | Occurs if the <URL> element in <SimilaritySearch> is empty or invalid. |
The scheme {http-scheme} of SimilaritySearch URL {url} must be one of http, https. | Occurs if the scheme of the <SimilaritySearch> <URL> element is not http or https. |
The TTLInSeconds element must be >= 0. | If the value is set to zero or a negative number, deployment of the API proxy fails. |
Fault variables
These variables are set when this policy triggers an error at runtime. For more information, see What you need to know about policy errors.
Variables | Where | Example |
---|---|---|
fault.name="FAULT_NAME" | FAULT_NAME is the name of the fault, as listed in the Runtime errors table above. The fault name is the last part of the fault code. | fault.name Matches "UnresolvedVariable" |
semanticcachepopulate.POLICY_NAME.failed | POLICY_NAME is the user-specified name of the policy that threw the fault. | semanticcachepopulate.SC-populate.failed = true |
Example error response
{
  "fault": {
    "faultstring": "SemanticCachePopulate[SC-populate]: unable to resolve variable [variable_name]",
    "detail": {
      "errorcode": "steps.semanticcachepopulate.UnresolvedVariable"
    }
  }
}
Example fault rule
<FaultRule name="SemanticCachePopulate Faults">
  <Step>
    <Name>SCP-CustomUnresolvedVariableErrorResponse</Name>
    <Condition>(fault.name = "UnresolvedVariable")</Condition>
  </Step>
  <Condition>(semanticcachepopulate.failed = true)</Condition>
</FaultRule>
Schemas
Each policy type is defined by an XML schema (.xsd). For reference, policy schemas are available on GitHub.