You can update when a context cache expires. The default expiration time of a
context cache is 60 minutes after its creation time. An expired context cache is
deleted during a garbage collection process and can't be used or updated. To
update the time when an unexpired context cache expires, update one of its
following properties: The following is an example of a curl command that updates its expiration time
by 3,600 seconds.
To learn more, see the
SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
Learn how to install or update the Go.
To learn more, see the
SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
You can use REST to create a update the context cache by using the
Vertex AI API to send a PATCH request to the publisher model endpoint. The
following example shows how to update the expiration date using the
Before using any of the request data,
make the following replacements:
HTTP method and URL:
Request JSON body:
To send your request, choose one of these options:
Save the request body in a file named
Save the request body in a file named You should receive a JSON response similar to the following:
ttl
- The number of seconds and nanoseconds that the cache lives after it's
created or after the ttl
is updated before it expires. When you set the
ttl
, the expireTime
of the cache is updated.expire_time
- A Timestamp
that specifies the absolute date and time when
the context cache expires.Update the context cache using its
ttl
parameterPython
Install
pip install --upgrade google-genai
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
Go
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
REST
ttl
parameter.
float
that specifies the seconds component of the duration before the cache expires.float
that specifies the nanoseconds component of the duration before the cache expires.PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID
{
"seconds":"SECONDS",
"nanos":"NANOSECONDS"
}
curl
request.json
,
and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID"PowerShell
request.json
,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID" | Select-Object -Expand ContentExample curl command
PROJECT_ID="PROJECT_ID"
LOCATION="us-central1"
CACHE_ID="CACHE_ID"
curl \
-X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json"\
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents/${CACHE_ID}" -d \
'{
"ttl": {"seconds":"3600","nanos":"0"}
}'
Update the context cache using its expire_time
parameter
The following is an example of a curl command that uses the expire_time
parameter to update its expiration time to 9 AM on June 30, 2024.
REST
You can use REST to create a update the context cache by using the
Vertex AI API to send a PATCH request to the publisher model endpoint. The
following example shows how to update the expiration date using the
expire_time
parameter.
Before using any of the request data, make the following replacements:
- PROJECT_ID: .
- LOCATION: The region where the request to create the context cache was processed.
- CACHE_ID: The ID of the context cache. You can find the ID in the response when you create the context cache.
- EXPIRE_TIME: A
Timestamp
that specifies the time when the context cache expires.
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID
Request JSON body:
{ "expire_time":"EXPIRE_TIME" }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json
,
and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID"
PowerShell
Save the request body in a file named request.json
,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Example curl command
PROJECT_ID="PROJECT_ID"
LOCATION="us-central1"
CACHE_ID="CACHE_ID"
curl \
-X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json"\
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents/${CACHE_ID}" -d \
'{
"expire_time":"2024-06-30T09:00:00.000000Z"
}'
What's next
- Learn how to use a context cache.
- Learn how to get information about all context caches associated with a Google Cloud project.