The count-tokens endpoint lets you determine the number of tokens in a message before sending it to Claude, helping you make informed decisions about your prompts and usage.
There is no cost for using the count-tokens endpoint.
Supported Claude models
The following models support count tokens:
- Claude 3.5 Sonnet v2: claude-3-5-sonnet-v2@20241022
- Claude 3.5 Haiku: claude-3-5-haiku@20241022
- Claude 3 Opus: claude-3-opus@20240229
- Claude 3.5 Sonnet: claude-3-5-sonnet@20240620
- Claude 3 Haiku: claude-3-haiku@20240307
Supported regions
The following regions support count tokens:
- us-east5
- europe-west1
- asia-southeast1
- us-central1
- europe-west4
Count tokens in basic messages
To count tokens, send a rawPredict request to the count-tokens endpoint. The body of the request must contain the model ID of the model you want to count tokens against.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: A supported region.
- MODEL: The model to count tokens against.
- ROLE: The role associated with a message. You can specify a user or an assistant. The first message must use the user role. Claude models operate with alternating user and assistant turns. If the final message uses the assistant role, then the response content continues immediately from the content in that message. You can use this to constrain part of the model's response.
- CONTENT: The content, such as text, of the user or assistant message.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict
Request JSON body:
{
  "model": "claude-3-haiku@20240307",
  "messages": [
    {
      "role": "user",
      "content": "how many tokens are in this request?"
    }
  ]
}
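The URL and request body described above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name build_count_tokens_request and the project ID my-project are hypothetical, and the region set is copied from the list in this document, so it may go out of date. It checks only the rules stated above (a supported region, a first message with the user role, and roles limited to user and assistant).

```python
import json

# Regions listed in this document as supporting count tokens (may go stale).
SUPPORTED_REGIONS = {
    "us-east5",
    "europe-west1",
    "asia-southeast1",
    "us-central1",
    "europe-west4",
}


def build_count_tokens_request(project_id, location, model, messages):
    """Return (url, json_body) for a count-tokens rawPredict request.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    if location not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {location}")
    if not messages or messages[0]["role"] != "user":
        raise ValueError("the first message must use the user role")
    for message in messages:
        if message["role"] not in ("user", "assistant"):
            raise ValueError(f"invalid role: {message['role']}")

    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/anthropic/models/count-tokens:rawPredict"
    )
    # json.dumps never emits a trailing comma, so the body is always valid JSON.
    body = json.dumps({"model": model, "messages": messages})
    return url, body


url, body = build_count_tokens_request(
    "my-project",  # hypothetical project ID
    "us-east5",
    "claude-3-haiku@20240307",
    [{"role": "user", "content": "how many tokens are in this request?"}],
)
print(url)
print(body)
```

Writing the printed body to request.json gives a file that can be used with the curl or PowerShell commands in this section.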
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict" | Select-Object -Expand Content
You should receive a JSON response that reports the number of input tokens in the request.
For information on how to count tokens in messages with tools, images, and PDFs, see Anthropic's documentation.
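For callers who prefer a script to curl or PowerShell, the same POST can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: the helper names get_access_token and count_tokens are hypothetical, the token is obtained by shelling out to the same gcloud command the curl example uses, and the function returns the parsed JSON as-is without assuming any field names. Anthropic's token-counting API reports the count in an input_tokens field; that this endpoint returns the same shape is an assumption to verify against a real response.

```python
import json
import subprocess
import urllib.request


def get_access_token():
    """Fetch an access token the way the curl example does (needs gcloud installed)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def count_tokens(url, payload, token, opener=urllib.request.urlopen):
    """POST the payload dict to the count-tokens URL and return the parsed JSON.

    `opener` is injectable so the function can be exercised without a network.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (commented out because it needs credentials and network access):
# url = ("https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID"
#        "/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict")
# payload = {
#     "model": "claude-3-haiku@20240307",
#     "messages": [{"role": "user", "content": "how many tokens are in this request?"}],
# }
# print(count_tokens(url, payload, get_access_token()))
```

Replace LOCATION and PROJECT_ID in the commented example as described in the replacement list before running it.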
Quotas
By default, the quota for the count-tokens endpoint is 2000 requests per minute.
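A client that issues many count-tokens requests may want to pace itself so that it stays under this quota. The following is a minimal Python sketch of a client-side sliding-window throttle. The class name RequestThrottle is hypothetical, and the sliding 60-second window is an assumption: this document does not say how the service measures a minute, so treat the sketch as a conservative local guard, not as a description of how the quota is enforced.

```python
import time
from collections import deque


class RequestThrottle:
    """Allow at most `limit` requests in any `window`-second span (client side only)."""

    def __init__(self, limit=2000, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the class can be tested without sleeping
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request (0.0 if none)."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # The oldest remaining request must leave the window first.
        return self.window - (now - self.sent[0])

    def record(self):
        """Record that a request was just sent."""
        self.sent.append(self.clock())
```

A typical loop would call time.sleep(throttle.wait_time()) before each request and throttle.record() immediately after sending it.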