This page shows you how to get the token count and the number of billable
characters for a prompt by using the countTokens
API.
Supported models
The following multimodal models support getting an estimate of the prompt token count:
gemini-2.0-flash-001
gemini-1.5-flash-002
gemini-1.5-pro-002
gemini-1.0-pro-002
gemini-1.0-pro-vision-001
To learn more about model versions, see Gemini model versions and lifecycle.
Get the token count for a prompt
You can get the token count estimate and the number of billable characters for a prompt by using the Vertex AI API.
Gen AI SDK for Python
Install
pip install --upgrade google-genai
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
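The following is a minimal sketch of counting tokens for a text prompt with the Gen AI SDK for Python. The model ID and prompt text are example values, and the exact response fields can vary by SDK version:

```python
from google import genai

# The client reads GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and
# GOOGLE_GENAI_USE_VERTEXAI from the environment variables set above.
client = genai.Client()

# Count tokens for a text-only prompt. The model ID and prompt text
# below are example values; substitute your own.
response = client.models.count_tokens(
    model="gemini-2.0-flash-001",
    contents="Provide a summary with about two sentences for the following article.",
)

# The response includes the total token count for the prompt.
print(response.total_tokens)
```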
REST
To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request. A partial list of available
  regions includes the following:
  - us-central1
  - us-west4
  - northamerica-northeast1
  - us-east4
  - us-west1
  - asia-northeast3
  - asia-southeast1
  - asia-northeast1
- PROJECT_ID: Your project ID.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- ROLE: The role in a conversation associated with the content. Specifying a
  role is required even in single-turn use cases. Acceptable values include
  the following:
  - USER: Specifies content that's sent by you.
- TEXT: The text instructions to include in the prompt.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens
Request JSON body:
{ "contents": [{ "role": "ROLE", "parts": [{ "text": "TEXT" }] }] }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
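As a sketch of the response shape, the token and billable-character values shown here are illustrative placeholders, not results from a real request:

```json
{
  "totalTokens": 31,
  "totalBillableCharacters": 96
}
```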
Console
To get the token count for a prompt by using Vertex AI Studio in the Google Cloud console, perform the following steps:
- In the Vertex AI section of the Google Cloud console, go to the Vertex AI Studio page.
- Click either Open Freeform or Open Chat.
- The number of tokens is calculated and displayed as you type in the Prompt pane. It includes the number of tokens in any input files.
- To see more details, click <count> tokens to open the Prompt tokenizer.
- To view the tokens in the text prompt that are highlighted with different colors marking the boundary of each token ID, click Token ID to text. Media tokens aren't supported.
- To view the token IDs, click Token ID.
To close the tokenizer tool pane, click X, or click outside of the pane.
Example for text with image or video:
Gen AI SDK for Python
Install
pip install --upgrade google-genai
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
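As a sketch, counting tokens for a prompt that combines video and text with the Gen AI SDK for Python might look like the following. The Cloud Storage URI matches the sample video used in the REST example below; the model ID is an example value, and helper names can vary by SDK version:

```python
from google import genai
from google.genai import types

# The client reads GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and
# GOOGLE_GENAI_USE_VERTEXAI from the environment variables set above.
client = genai.Client()

# Count tokens for a prompt that includes both a video file and text.
response = client.models.count_tokens(
    model="gemini-1.5-flash-002",  # example model ID
    contents=[
        types.Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
            mime_type="video/mp4",
        ),
        "Provide a summary with about two sentences for the following article.",
    ],
)

# The response includes the total token count for the prompt.
print(response.total_tokens)
```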
REST
To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
```sh
MODEL_ID="gemini-1.0-pro-vision"
PROJECT_ID="my-project"
TEXT="Provide a summary with about two sentences for the following article."
REGION="us-central1"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:countTokens" -d \
  $'{
    "contents": [{
      "role": "user",
      "parts": [
        {
          "file_data": {
            "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
            "mime_type": "video/mp4"
          }
        },
        {
          "text": "'"$TEXT"'"
        }]
    }]
  }'
```
Pricing and quota
There is no charge for using the CountTokens API. The maximum quota for the
CountTokens API is 3,000 requests per minute.
What's next
- Learn how to test chat prompts.
- Learn how to test text prompts.