REST Resource: projects.locations.cachedContents

Resource: CachedContent

A resource used in LLM queries for users to explicitly specify what to cache and how to cache.

Fields
name string

Immutable. Identifier. The server-generated resource name of the cached content Format: projects/{project}/locations/{location}/cachedContents/{cachedContent}

displayName string

Optional. Immutable. The user-generated meaningful display name of the cached content.

model string

Immutable. The name of the publisher model to use for cached content. Format: projects/{project}/locations/{location}/publishers/{publisher}/models/{model}

systemInstruction object (Content)

Optional. Input only. Immutable. Developer set system instruction. Currently, text only

contents[] object (Content)

Optional. Input only. Immutable. The content to cache

tools[] object (Tool)

Optional. Input only. Immutable. A list of Tools the model may use to generate the next response

toolConfig object (ToolConfig)

Optional. Input only. Immutable. Tool config. This config is shared for all tools

createTime string (Timestamp format)

Output only. Creatation time of the cache entry.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

updateTime string (Timestamp format)

Output only. When the cache entry was last updated in UTC time.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

usageMetadata object (UsageMetadata)

Output only. metadata on the usage of the cached content.

expiration Union type
Expiration time of the cached content. expiration can be only one of the following:
expireTime string (Timestamp format)

timestamp of when this resource is considered expired. This is always provided on output, regardless of what was sent on input.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

ttl string (Duration format)

Input only. The TTL for this resource. The expiration time is computed: now + TTL.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".

JSON representation
{
  "name": string,
  "displayName": string,
  "model": string,
  "systemInstruction": {
    object (Content)
  },
  "contents": [
    {
      object (Content)
    }
  ],
  "tools": [
    {
      object (Tool)
    }
  ],
  "toolConfig": {
    object (ToolConfig)
  },
  "createTime": string,
  "updateTime": string,
  "usageMetadata": {
    object (UsageMetadata)
  },

  // expiration
  "expireTime": string,
  "ttl": string
  // Union type
}

Tool

Tool details that the model may use to generate response.

A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model. A Tool object should contain exactly one type of Tool (e.g FunctionDeclaration, Retrieval or GoogleSearchRetrieval).

Fields
functionDeclarations[] object (FunctionDeclaration)

Optional. Function tool type. One or more function declarations to be passed to the model along with the current user query. Model may decide to call a subset of these functions by populating FunctionCall in the response. user should provide a FunctionResponse for each function call in the next turn. Based on the function responses, Model will generate the final response back to the user. Maximum 128 function declarations can be provided.

retrieval object (Retrieval)

Optional. Retrieval tool type. System will always execute the provided retrieval tool(s) to get external knowledge to answer the prompt. Retrieval results are presented to the model for generation.

googleSearchRetrieval object (GoogleSearchRetrieval)

Optional. GoogleSearchRetrieval tool type. Specialized retrieval tool that is powered by Google search.

codeExecution object (CodeExecution)

Optional. CodeExecution tool type. Enables the model to execute code as part of generation. This field is only used by the Gemini Developer API services.

JSON representation
{
  "functionDeclarations": [
    {
      object (FunctionDeclaration)
    }
  ],
  "retrieval": {
    object (Retrieval)
  },
  "googleSearch": {
    object (GoogleSearch)
  },
  "googleSearchRetrieval": {
    object (GoogleSearchRetrieval)
  },
  "codeExecution": {
    object (CodeExecution)
  }
}

Retrieval

Defines a retrieval tool that model can call to access external knowledge.

Fields
disableAttribution
(deprecated)
boolean

Optional. Deprecated. This option is no longer supported.

source Union type
The source of the retrieval. source can be only one of the following:
vertexRagStore object (VertexRagStore)

Set to use data source powered by Vertex RAG store. user data is uploaded via the VertexRagDataService.

JSON representation
{
  "disableAttribution": boolean,

  // source
  "vertexAiSearch": {
    object (VertexAISearch)
  },
  "vertexRagStore": {
    object (VertexRagStore)
  }
  // Union type
}

VertexAISearch

Retrieve from Vertex AI Search datastore for grounding. See https://cloud.google.com/products/agent-builder

Fields
datastore string

Required. Fully-qualified Vertex AI Search data store resource id. Format: projects/{project}/locations/{location}/collections/{collection}/dataStores/{dataStore}

JSON representation
{
  "datastore": string
}

VertexRagStore

Retrieve from Vertex RAG Store for grounding.

Fields
ragCorpora[]
(deprecated)
string

Optional. Deprecated. Please use ragResources instead.

ragResources[] object (RagResource)

Optional. The representation of the rag source. It can be used to specify corpus only or ragfiles. Currently only support one corpus or multiple files from one corpus. In the future we may open up multiple corpora support.

ragRetrievalConfig object (RagRetrievalConfig)

Optional. The retrieval config for the Rag query.

similarityTopK
(deprecated)
integer

Optional. Number of top k results to return from the selected corpora.

vectorDistanceThreshold
(deprecated)
number

Optional. Only return results with vector distance smaller than the threshold.

JSON representation
{
  "ragCorpora": [
    string
  ],
  "ragResources": [
    {
      object (RagResource)
    }
  ],
  "ragRetrievalConfig": {
    object (RagRetrievalConfig)
  },
  "similarityTopK": integer,
  "vectorDistanceThreshold": number
}

RagResource

The definition of the Rag resource.

Fields
ragCorpus string

Optional. RagCorpora resource name. Format: projects/{project}/locations/{location}/ragCorpora/{ragCorpus}

ragFileIds[] string

Optional. ragFileId. The files should be in the same ragCorpus set in ragCorpus field.

JSON representation
{
  "ragCorpus": string,
  "ragFileIds": [
    string
  ]
}

RagRetrievalConfig

Specifies the context retrieval config.

Fields
topK integer

Optional. The number of contexts to retrieve.

filter object (Filter)

Optional. Config for filters.

ranking object (Ranking)

Optional. Config for ranking and reranking.

JSON representation
{
  "topK": integer,
  "hybridSearch": {
    object (HybridSearch)
  },
  "filter": {
    object (Filter)
  },
  "ranking": {
    object (Ranking)
  }
}

HybridSearch

Config for Hybrid Search.

Fields
alpha number

Optional. Alpha value controls the weight between dense and sparse vector search results. The range is [0, 1], while 0 means sparse vector search only and 1 means dense vector search only. The default value is 0.5 which balances sparse and dense vector search equally.

JSON representation
{
  "alpha": number
}

Filter

Config for filters.

Fields
metadataFilter string

Optional. String for metadata filtering.

vector_db_threshold Union type
Filter contexts retrieved from the vector DB based on either vector distance or vector similarity. vector_db_threshold can be only one of the following:
vectorDistanceThreshold number

Optional. Only returns contexts with vector distance smaller than the threshold.

vectorSimilarityThreshold number

Optional. Only returns contexts with vector similarity larger than the threshold.

JSON representation
{
  "metadataFilter": string,

  // vector_db_threshold
  "vectorDistanceThreshold": number,
  "vectorSimilarityThreshold": number
  // Union type
}

Ranking

Config for ranking and reranking.

Fields
ranking_config Union type
Config options for ranking. Currently only Rank Service is supported. ranking_config can be only one of the following:
rankService object (RankService)

Optional. Config for Rank service.

llmRanker object (LlmRanker)

Optional. Config for LlmRanker.

JSON representation
{

  // ranking_config
  "rankService": {
    object (RankService)
  },
  "llmRanker": {
    object (LlmRanker)
  }
  // Union type
}

RankService

Config for Rank service.

Fields
modelName string

Optional. The model name of the rank service. Format: semantic-ranker-512@latest

JSON representation
{
  "modelName": string
}

LlmRanker

Config for LlmRanker.

Fields
modelName string

Optional. The model name used for ranking. Format: gemini-1.5-pro

JSON representation
{
  "modelName": string
}

GoogleSearch

This type has no fields.

GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google.

GoogleSearchRetrieval

Tool to retrieve public web data for grounding, powered by Google.

Fields
dynamicRetrievalConfig object (DynamicRetrievalConfig)

Specifies the dynamic retrieval configuration for the given source.

JSON representation
{
  "dynamicRetrievalConfig": {
    object (DynamicRetrievalConfig)
  }
}

DynamicRetrievalConfig

Describes the options to customize dynamic retrieval.

Fields
mode enum (Mode)

The mode of the predictor to be used in dynamic retrieval.

dynamicThreshold number

Optional. The threshold to be used in dynamic retrieval. If not set, a system default value is used.

JSON representation
{
  "mode": enum (Mode),
  "dynamicThreshold": number
}

Mode

The mode of the predictor to be used in dynamic retrieval.

Enums
MODE_UNSPECIFIED Always trigger retrieval.
MODE_DYNAMIC Run retrieval only when system decides it is necessary.

CodeExecution

This type has no fields.

Tool that executes code generated by the model, and automatically returns the result to the model.

See also [ExecutableCode]and [CodeExecutionResult] which are input and output to this tool.

ToolConfig

Tool config. This config is shared for all tools provided in the request.

Fields
functionCallingConfig object (FunctionCallingConfig)

Optional. Function calling config.

JSON representation
{
  "functionCallingConfig": {
    object (FunctionCallingConfig)
  }
}

FunctionCallingConfig

Function calling config.

Fields
mode enum (Mode)

Optional. Function calling mode.

allowedFunctionNames[] string

Optional. Function names to call. Only set when the Mode is ANY. Function names should match [FunctionDeclaration.name]. With mode set to ANY, model will predict a function call from the set of function names provided.

JSON representation
{
  "mode": enum (Mode),
  "allowedFunctionNames": [
    string
  ]
}

Mode

Function calling mode.

Enums
MODE_UNSPECIFIED Unspecified function calling mode. This value should not be used.
AUTO Default model behavior, model decides to predict either function calls or natural language response.
ANY Model is constrained to always predicting function calls only. If "allowedFunctionNames" are set, the predicted function calls will be limited to any one of "allowedFunctionNames", else the predicted function calls will be any one of the provided "functionDeclarations".
NONE Model will not predict any function calls. Model behavior is same as when not passing any function declarations.

UsageMetadata

metadata on the usage of the cached content.

Fields
totalTokenCount integer

Total number of tokens that the cached content consumes.

textCount integer

Number of text characters.

imageCount integer

Number of images.

videoDurationSeconds integer

Duration of video in seconds.

audioDurationSeconds integer

Duration of audio in seconds.

JSON representation
{
  "totalTokenCount": integer,
  "textCount": integer,
  "imageCount": integer,
  "videoDurationSeconds": integer,
  "audioDurationSeconds": integer
}

Methods

create

Creates cached content, this call will initialize the cached content in the data storage, and users need to pay for the cache data storage.

delete

Deletes cached content

get

Gets cached content configurations

list

Lists cached contents in a project

patch

Updates cached content configurations