Gemini 2.5 Flash-Lite is our most balanced Gemini model, optimized for low latency use cases. It comes with the same capabilities that make other Gemini 2.5 models helpful, such as the ability to turn thinking on at different budgets, connecting to tools like Grounding with Google Search and code execution, multimodal input, and a 1 million-token context length.
2.5 Flash-Lite
Try in Vertex AI (Preview) Deploy example app
Model ID | gemini-2.5-flash-lite |
|
---|---|---|
Supported inputs & outputs |
|
|
Token limits |
|
|
Capabilities |
|
|
Usage types |
|
|
Input size limit | 500 MB | |
Technical specifications | ||
Images |
|
|
Documents |
|
|
Video |
|
|
Audio |
|
|
Parameter defaults |
|
|
Supported regions | ||
Model availability (Includes dynamic shared quota & Provisioned Throughput) |
|
|
ML processing |
|
|
See Data residency for more information. | ||
Knowledge cutoff date | January 2025 | |
Versions |
|
|
Security controls | ||
See Security controls for more information. | ||
Supported languages | See Supported languages. | |
Pricing | See Pricing. |
2.5 Flash-Lite
Try in Vertex AI (Preview) Deploy example app
Model ID | gemini-2.5-flash-lite-preview-09-2025 |
|
---|---|---|
Supported inputs & outputs |
|
|
Token limits |
|
|
Capabilities |
|
|
Usage types |
|
|
Technical specifications | ||
Images |
|
|
Documents |
|
|
Video |
|
|
Audio |
|
|
Parameter defaults |
|
|
Supported regions | ||
Model availability (Includes dynamic shared quota & Provisioned Throughput) |
|
|
See Data residency for more information. | ||
Knowledge cutoff date | January 2025 | |
Versions |
|
|
Security controls | ||
See Security controls for more information. | ||
Supported languages | See Supported languages. | |
Pricing | See Pricing. |