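ROLE: The role associated with a message. You can specify user or assistant. The first message must use the user role. Claude models operate with alternating user and assistant turns. If the final message uses the assistant role, the response continues immediately from the content of that message. You can use this to constrain part of the model's response.
CONTENT: The content, such as text, of the user or assistant message.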
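The role rules above can be checked locally before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named `validate_messages` that is not part of any SDK. It enforces only the two rules stated here (each role is `user` or `assistant`, and the first message uses `user`) and does not enforce alternation, which the text describes as how the models operate rather than as a validation rule. The sample list ends with an `assistant` message to show how a response can be constrained.

```python
from typing import Dict, List

ALLOWED_ROLES = {"user", "assistant"}


def validate_messages(messages: List[Dict[str, str]]) -> None:
    """Raise ValueError if the messages break the role rules described above.

    Hypothetical helper for illustration; not part of any SDK.
    """
    if not messages:
        raise ValueError("at least one message is required")
    for i, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: role must be 'user' or 'assistant'")
    if messages[0]["role"] != "user":
        raise ValueError("the first message must use the 'user' role")


# The final assistant message acts as a prefill: the response would
# continue from "[" rather than start from scratch.
prefilled = [
    {"role": "user", "content": "List three primary colors as a JSON array."},
    {"role": "assistant", "content": "["},
]
validate_messages(prefilled)  # passes silently
```

Failing early with a `ValueError` keeps a malformed conversation from spending a request against the endpoint's quota.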
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict
Request JSON body:
{
  "model": "MODEL",
  "messages": [
    {
      "role": "user",
      "content": "how many tokens are in this request?"
    }
  ]
}
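The URL and body above can also be assembled programmatically. This is a sketch under stated assumptions: `build_count_tokens_request` is a hypothetical helper, `my-project` is a placeholder project ID, and only the regional URL pattern shown above is covered. Serializing with `json.dumps` avoids hand-written JSON mistakes such as a trailing comma.

```python
import json
from typing import Dict, List, Tuple


def build_count_tokens_request(
    project_id: str, location: str, model: str, messages: List[Dict[str, str]]
) -> Tuple[str, str]:
    """Return (url, json_body) for a count-tokens rawPredict call.

    Hypothetical helper; the URL pattern is copied from the text above.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        "publishers/anthropic/models/count-tokens:rawPredict"
    )
    body = json.dumps({"model": model, "messages": messages})
    return url, body


url, body = build_count_tokens_request(
    project_id="my-project",  # placeholder value
    location="us-east5",
    model="MODEL",  # substitute a supported model ID
    messages=[{"role": "user", "content": "how many tokens are in this request?"}],
)
print("POST", url)
print(body)
```

Sending the request additionally needs an `Authorization: Bearer` header carrying an access token; that step is left out of this sketch.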
Last updated (UTC): 2025-08-29.

# Count tokens for Claude models

The `count-tokens` endpoint lets you determine the number of tokens in a
message before sending it to Claude, helping you make informed decisions about
your prompts and usage.

There is no cost for using the `count-tokens` endpoint.

Supported Claude models
-----------------------

The following models support count tokens:

- [Claude Opus 4.1](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4-1)
- [Claude Opus 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4)
- [Claude Sonnet 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-sonnet-4)
- [Claude 3.7 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-7-sonnet)
- [Claude 3.5 Sonnet v2](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet-v2)
- [Claude 3.5 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-haiku)
- [Claude 3.5 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet)
- [Claude 3 Opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)
- [Claude 3 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku)

Supported regions
-----------------

The following regions support count tokens:

- `us-east5`
- `europe-west1`
- `asia-east1`
- `asia-southeast1`
- `us-central1`
- `europe-west4`

Count tokens in basic messages
------------------------------

To count tokens, send a `rawPredict` request to the `count-tokens` endpoint. The
body of the request must contain the model ID of the model you want to count
tokens against.

### REST

Before using any of the request data,
make the following replacements:

- `LOCATION`: A [region](#regions) that supports Anthropic Claude models. To use the global endpoint, see [Specify the global endpoint](/vertex-ai/generative-ai/docs/partner-models/use-partner-models#global).
- `MODEL`: The [model](#model-list) to count tokens against.
- `ROLE`: The role associated with a message. You can specify `user` or `assistant`. The first message must use the `user` role. Claude models operate with alternating `user` and `assistant` turns. If the final message uses the `assistant` role, the response continues immediately from the content of that message. You can use this to constrain part of the model's response.
- `CONTENT`: The content, such as text, of the `user` or `assistant` message.

HTTP method and URL:

```
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict
```

Request JSON body:

```
{
  "model": "MODEL",
  "messages": [
    {
      "role": "user",
      "content": "how many tokens are in this request?"
    }
  ]
}
```

To send your request, choose one of these options:

#### curl

> **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login), or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI. You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Save the request body in a file named `request.json`,
and execute the following command:

```
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict"
```

#### PowerShell

> **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login). You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Save the request body in a file named `request.json`,
and execute the following command:

```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict" | Select-Object -Expand Content
```

You should receive a JSON response similar to the following.

#### Response

```
{ "input_tokens": 14 }
```

For information on how to count tokens in messages with tools, images, and PDFs,
see [Anthropic's documentation](https://docs.anthropic.com/en/docs/build-with-claude/token-counting).

Quotas
------

By default, the quota for the `count-tokens` endpoint is 2000 requests per
minute.
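A client that counts tokens in bulk may want to stay under the default quota of 2000 requests per minute on its own side. The following is a rough sketch, assuming a hypothetical `SlidingWindowLimiter` class and assuming the quota behaves like a rolling 60-second window. The service enforces the real quota, and its exact accounting may differ.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window` seconds (client-side only)."""

    def __init__(
        self,
        limit: int = 2000,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


limiter = SlidingWindowLimiter()
if limiter.try_acquire():
    pass  # send the count-tokens request here
```

The clock is injectable so the limiter can be exercised in tests without waiting for real time to pass.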