Last updated (UTC): 2025-08-29

# Prompt caching

The Anthropic Claude models offer prompt caching to reduce latency and costs
when you reuse the same content across multiple requests. When you send a query,
you can cache all or specific parts of your input so that subsequent queries
use the cached results from the previous request, avoiding additional compute
and network costs. Caches are unique to your Google Cloud project and
cannot be used by other projects.

For details about how to structure your prompts, see the Anthropic [Prompt
caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) documentation.

Supported Anthropic Claude models
---------------------------------

Vertex AI supports prompt caching for the following Anthropic Claude
models:

- [Claude Opus 4.1](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4-1)
- [Claude Opus 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4)
- [Claude Sonnet 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-sonnet-4)
- [Claude 3.7 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-7-sonnet)
- [Claude 3.5 Sonnet v2](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet-v2)
- [Claude 3.5 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-haiku)
- [Claude 3.5 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet)
- [Claude 3 Opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)
- [Claude 3 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku)

Data processing
---------------

Anthropic explicit prompt caching is a feature of the Anthropic Claude models.
The Vertex AI offering of these models behaves as described in
the [Anthropic documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching).

Prompt caching is an optional feature. Claude computes hashes (fingerprints)
of requests to use as caching keys. These hashes are computed only for requests
that have caching enabled.

Although prompt caching is implemented by the Claude models, from a
data handling perspective, Google considers these hashes a type of "User
Metadata". They are treated as customer "Service Data" under the [Google Cloud
Privacy Notice](https://cloud.google.com/terms/cloud-privacy-notice), not as
"Customer Data" under the [Cloud Data Processing Addendum (Customers)](https://cloud.google.com/terms/data-processing-addendum).
In particular, the additional protections for "Customer Data" don't apply to
these hashes. Google does not use these hashes for any other purpose.

If you want to disable prompt caching entirely and make it
unavailable in particular Google Cloud projects, you can request this by
contacting [customer support](https://console.cloud.google.com/support/createcase/v2)
and providing the relevant project numbers.
After explicit caching is disabled
for a project, requests from that project that have prompt caching enabled are
rejected.

Use prompt caching
------------------

You can use the [Anthropic Claude SDK](https://pypi.org/project/anthropic/) or
the Vertex AI REST API to send requests to the Vertex AI endpoint.

For more information, see [How prompt caching works](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#how-prompt-caching-works).

For additional examples, see the [Prompt caching examples](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#prompt-caching-examples) in
the Anthropic documentation.

Caching occurs automatically when a subsequent request contains the same
text, images, and `cache_control` parameter as the first request. All requests
must also include the `cache_control` parameter in the same blocks.

The cache has a five-minute lifetime. The lifetime is refreshed each time the
cached content is accessed.

Pricing
-------

Prompt caching can affect billing costs. Note that:

- Cache write tokens cost 25% more than base input tokens
- Cache read tokens cost 90% less than base input tokens
- Regular input and output tokens are priced at standard rates

For more information, see the [Pricing page](/vertex-ai/generative-ai/pricing#claude-models).
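To make the `cache_control` mechanics concrete, the following sketch builds a
Messages API request body that marks a large system prompt as cacheable. The
field names follow Anthropic's prompt caching documentation linked above; the
helper function name and the placeholder text are illustrative, and you should
verify the `anthropic_version` value against the current Vertex AI docs before
relying on it.

```python
import json

def build_cached_request(system_text: str, user_text: str) -> dict:
    """Sketch of a Claude Messages request body with a prompt-caching breakpoint.

    The cache_control block marks where the cache ends: all content up to and
    including that block is hashed and cached, so repeat requests with the
    identical prefix read from the cache instead of reprocessing it.
    """
    return {
        # Version string used by Claude models on Vertex AI (verify in docs).
        "anthropic_version": "vertex-2023-10-16",
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_text,
                # The cache breakpoint: cached for ~5 minutes, refreshed on use.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }

body = build_cached_request(
    "LONG REFERENCE DOCUMENT ...",  # placeholder for a large, reused context
    "Summarize the document.",
)
print(json.dumps(body["system"][0]["cache_control"]))
```

A request body like this would be sent to the model's Vertex AI endpoint (or
expressed as the equivalent keyword arguments to the Anthropic SDK's Vertex
client). Subsequent requests must repeat the same `system` text and the same
`cache_control` placement to get cache hits.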