Count tokens for Claude models

The count-tokens endpoint lets you determine the number of tokens in a message before sending it to Claude, helping you make informed decisions about your prompts and usage.

There is no cost for using the count-tokens endpoint.
Supported Claude models
The following models support token counting:

Claude Opus 4.1
Claude Opus 4
Claude Sonnet 4
Claude 3.7 Sonnet
Claude 3.5 Sonnet v2
Claude 3.5 Haiku
Claude 3.5 Sonnet
Claude 3 Opus
Claude 3 Haiku
Supported regions

The following regions support token counting:
us-east5
europe-west1
asia-east1
asia-southeast1
us-central1
europe-west4
Count tokens in basic messages
To count tokens, send a rawPredict request to the count-tokens endpoint. The body of the request must contain the model ID of the model that you want to count tokens against.
REST
Before using any of the request data, make the following replacements:

LOCATION: A region that supports Anthropic Claude models. To use the global endpoint, see Specify the global endpoint.
MODEL: The model to count tokens against.
ROLE: The role associated with a message. You can specify user or assistant. The first message must use the user role, and Claude models operate with alternating user and assistant turns. If the final message uses the assistant role, the response content continues immediately from the content of that message; you can use this to constrain part of the model's response (see the sketch after this list).
CONTENT: The content, such as text, of the user or assistant message.
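To make the alternating-turn rule concrete, here is a minimal illustrative sketch (not part of the official reference, and the prompt wording is arbitrary): it builds a messages array in Python that starts with a user turn and ends with an assistant turn whose content prefills, and therefore constrains, the beginning of the model's reply.

import json

# Hypothetical example: a conversation that ends with an assistant prefill.
# The model's reply would continue directly from "Red, blue, and".
messages = [
    {"role": "user", "content": "List three primary colors."},
    {"role": "assistant", "content": "Red, blue, and"},
]

# This dictionary matches the request JSON body shown below;
# "MODEL" is a placeholder for a supported Claude model ID.
print(json.dumps({"model": "MODEL", "messages": messages}, indent=2))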
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict
Request JSON body:
{
  "model": "MODEL",
  "messages": [
    {
      "role": "user",
      "content": "how many tokens are in this request?"
    }
  ]
}
To send your request, choose one of these options:
curl
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],["Última atualização 2025-09-04 UTC."],[],[],null,["# Count tokens for Claude models\n\nThe `count-tokens` endpoint lets you determine the number of tokens in a\nmessage before sending it to Claude, helping you make informed decisions about\nyour prompts and usage.\n\nThere is no cost for using the `count-tokens` endpoint.\n\nSupported Claude models\n-----------------------\n\nThe following models support count tokens:\n\n- [Claude Opus 4.1](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4-1)\n- [Claude Opus 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4)\n- [Claude Sonnet 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-sonnet-4)\n- [Claude 3.7 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-7-sonnet)\n- [Claude 3.5 Sonnet v2](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet-v2)\n- [Claude 3.5 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-haiku)\n- [Claude 3.5 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet)\n- [Claude 3 Opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)\n- [Claude 3 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku)\n\n\u003cbr /\u003e\n\nSupported regions\n-----------------\n\nThe following regions support count tokens:\n\n- `us-east5`\n- `europe-west1`\n- `asia-east1`\n- `asia-southeast1`\n- `us-central1`\n- `europe-west4`\n\nCount tokens in basic messages\n------------------------------\n\nTo count tokens, send a `rawPredict` request to the `count-tokens` endpoint. The\nbody of the request must contain the model ID of the model you want to count\ntokens against. \n\n### REST\n\n\nBefore using any of the request data,\nmake the following replacements:\n\n- \u003cvar class=\"edit\" scope=\"LOCATION\" translate=\"no\"\u003eLOCATION\u003c/var\u003e: A [region](#regions) that supports Anthropic Claude models. To use the global endpoint, see [Specify\n the global endpoint](/vertex-ai/generative-ai/docs/partner-models/use-partner-models#global).\n- \u003cvar class=\"edit\" scope=\"MODEL\" translate=\"no\"\u003eMODEL\u003c/var\u003e: The [model](#model-list) to count tokens against.\n- \u003cvar translate=\"no\"\u003eROLE\u003c/var\u003e: The role associated with a message. You can specify a `user` or an `assistant`. The first message must use the `user` role. Claude models operate with alternating `user` and `assistant` turns. If the final message uses the `assistant` role, then the response content continues immediately from the content in that message. 
You can use this to constrain part of the model's response.\n- \u003cvar translate=\"no\"\u003eCONTENT\u003c/var\u003e: The content, such as text, of the `user` or `assistant` message.\n\n\nHTTP method and URL:\n\n```\nPOST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict\n```\n\n\nRequest JSON body:\n\n```\n{\n \"model\": \"MODEL\",\n \"messages\": [\n {\n \"role\": \"user\",\n \"content\":\"how many tokens are in this request?\"\n }\n ],\n}\n```\n\nTo send your request, choose one of these options: \n\n#### curl\n\n| **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login) , or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI . You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).\n\n\nSave the request body in a file named `request.json`,\nand execute the following command:\n\n```\ncurl -X POST \\\n -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n -H \"Content-Type: application/json; charset=utf-8\" \\\n -d @request.json \\\n \"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict\"\n```\n\n#### PowerShell\n\n| **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login) . You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).\n\n\nSave the request body in a file named `request.json`,\nand execute the following command:\n\n```\n$cred = gcloud auth print-access-token\n$headers = @{ \"Authorization\" = \"Bearer $cred\" }\n\nInvoke-WebRequest `\n -Method POST `\n -Headers $headers `\n -ContentType: \"application/json; charset=utf-8\" `\n -InFile request.json `\n -Uri \"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict\" | Select-Object -Expand Content\n```\n\nYou should receive a JSON response similar to the following.\n\n#### Response\n\n```\n{ \"input_tokens\": 14 }\n```\n\n\u003cbr /\u003e\n\nFor information on how to count tokens in messages with tools, images, and PDFs,\nsee [Anthropic's documentation](https://docs.anthropic.com/en/docs/build-with-claude/token-counting).\n\nQuotas\n------\n\nBy default, the quota for the `count-tokens` endpoint is 2000 requests per\nminute."]]
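If you prefer to call the endpoint from code rather than curl or PowerShell, the following is a minimal, illustrative Python sketch rather than an official client library. It assumes the google-auth and requests packages are installed and that Application Default Credentials are configured; the project, region, and model values are placeholders you must replace.

import google.auth
import google.auth.transport.requests
import requests

# Placeholders: replace with your own values. The model ID shown is only an
# example; use the exact ID of a supported Claude model from Model Garden.
PROJECT_ID = "your-project-id"
LOCATION = "us-east5"
MODEL = "claude-3-5-sonnet-v2@20241022"

# Obtain an access token using Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

url = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{LOCATION}/publishers/anthropic/models/count-tokens:rawPredict"
)

# Same request body as in the REST example above.
body = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "how many tokens are in this request?"}
    ],
}

response = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {credentials.token}",
        "Content-Type": "application/json; charset=utf-8",
    },
    json=body,
)
response.raise_for_status()
# Expected shape, based on the response example above: {"input_tokens": 14}
print(response.json())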