Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.
This document provides an overview of Gemini 2.5 Flash-Lite, our most balanced Gemini model, optimized for low-latency use cases. Key features include:
Thinking at different budgets: Turn thinking on and control how much internal reasoning the model performs by setting a token budget.
Tool connectivity: Connect to tools like Grounding with Google Search and code execution.
Multimodal input: Process text, code, image, audio, and video inputs.
Large context window: Features a 1 million-token context length.
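As a concrete sketch of the thinking-budget feature above, the snippet below builds a `generateContent` request body with an explicit budget. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) follow the public Gemini API REST reference, but treat the exact shape as an assumption to verify against the current API documentation:

```python
# Sketch: assemble a generateContent request body for gemini-2.5-flash-lite.
# The thinkingConfig/thinkingBudget field names are taken from the public
# Gemini API REST reference; verify them against current docs before use.

def build_request(prompt: str, thinking_budget: int = 512) -> dict:
    """Build a request body with an explicit thinking budget.

    A budget of 0 turns thinking off; larger values let the model
    spend more tokens reasoning before it produces an answer.
    """
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "temperature": 1.0,   # model default (range 0.0-2.0)
            "topP": 0.95,         # model default
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

body = build_request("Summarize this document.", thinking_budget=1024)
```

This body would then be POSTed to the model's `generateContent` endpoint with your project's credentials.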
For even more detailed technical information on Gemini 2.5 Flash-Lite (such as performance benchmarks, information on our training datasets, efforts on sustainability, intended usage and limitations, and our approach to ethics and safety), see our [technical report](https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf) on our Gemini 2.5 models.
[Try in Vertex AI](https://console.cloud.google.com/vertex-ai/generative/multimodal/create/text?model=gemini-2.5-flash-lite) | [(Preview) Deploy example app](https://console.cloud.google.com/vertex-ai/studio/multimodal?suggestedPrompt=How%20does%20AI%20work&deploy=true&model=gemini-2.5-flash-lite)

Note: To use the "Deploy example app" feature, you need a Google Cloud project with billing and the Vertex AI API enabled.

| Property | Details |
|---|---|
| Model ID | `gemini-2.5-flash-lite` |
| Supported inputs | Text, code, images, audio, video |
| Supported outputs | Text |
| Maximum input tokens | 1,048,576 |
| Maximum output tokens | 65,536 (default) |
| Input size limit | 500 MB |

**Capabilities**

Supported:

- [Grounding with Google Search](/vertex-ai/generative-ai/docs/grounding/grounding-with-google-search)
- [Code execution](/vertex-ai/generative-ai/docs/multimodal/code-execution)
- [Tuning](/vertex-ai/generative-ai/docs/models/tune-models)
- [System instructions](/vertex-ai/generative-ai/docs/learn/prompts/system-instruction-introduction)
- [Batch prediction](/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini)
- [Function calling](/vertex-ai/generative-ai/docs/multimodal/function-calling)
- [Count Tokens](/vertex-ai/generative-ai/docs/multimodal/get-token-count)
- [Thinking](/vertex-ai/generative-ai/docs/thinking)
- [Context caching](/vertex-ai/generative-ai/docs/context-cache/context-cache-overview)
- [Vertex AI RAG Engine](/vertex-ai/generative-ai/docs/rag-engine/rag-overview)

Not supported:

- [Live API](/vertex-ai/generative-ai/docs/live-api) (Preview feature)
- [Chat completions](/vertex-ai/generative-ai/docs/migrate/openai/overview)

**Usage types**

Supported:

- [Provisioned Throughput](/vertex-ai/generative-ai/docs/provisioned-throughput)
- [Dynamic shared quota](/vertex-ai/generative-ai/docs/dsq)

Not supported:

- [Fixed quota](/vertex-ai/generative-ai/docs/quotas)

**Technical specifications**

**Images**

- Maximum images per prompt: 3,000
- Maximum image size: 7 MB
- Maximum number of output images per prompt: 10
- Supported MIME types: `image/png`, `image/jpeg`, `image/webp`

**Documents**

- Maximum number of files per prompt: 3,000
- Maximum number of pages per file: 1,000
- Maximum file size per file: 50 MB
- Supported MIME types: `application/pdf`, `text/plain`

**Video**

- Maximum video length (with audio): approximately 45 minutes
- Maximum video length (without audio): approximately 1 hour
- Maximum number of videos per prompt: 10
- Supported MIME types: `video/x-flv`, `video/quicktime`, `video/mpeg`, `video/mpegs`, `video/mpg`, `video/mp4`, `video/webm`, `video/wmv`, `video/3gpp`

**Audio**

- Maximum audio length per prompt: approximately 8.4 hours, or up to 1 million tokens
- Maximum number of audio files per prompt: 1
- Supported MIME types: `audio/x-aac`, `audio/flac`, `audio/mp3`, `audio/m4a`, `audio/mpeg`, `audio/mpga`, `audio/mp4`, `audio/opus`, `audio/pcm`, `audio/wav`, `audio/webm`

**Parameter defaults**

- Temperature: 0.0-2.0 (default 1.0)
- topP: 0.0-1.0 (default 0.95)
- topK: 64 (fixed)
- candidateCount: 1-8 (default 1)

**Model availability**

- Global: global
- United States: us-central1, us-east1, us-east4, us-east5, us-south1, us-west1, us-west4
- Europe: europe-central2, europe-north1, europe-southwest1, europe-west1, europe-west4, europe-west8, europe-west9

**ML processing**

- United States: multi-region
- Europe: multi-region

See [Data residency](/vertex-ai/generative-ai/docs/learn/data-residency) and [Security controls](/vertex-ai/generative-ai/docs/security-controls) for more information.

Last updated 2025-08-27 UTC.
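The parameter ranges documented above lend themselves to a quick client-side sanity check before a request is sent. The sketch below is illustrative only, not part of any Google SDK; the limit values are taken directly from the specifications for this model:

```python
# Illustrative local validation of generation parameters against the
# documented ranges for gemini-2.5-flash-lite. This is a client-side
# sanity check, not an official SDK feature.

MAX_OUTPUT_TOKENS = 65_536  # documented maximum output tokens
LIMITS = {
    "temperature": (0.0, 2.0),   # default 1.0
    "topP": (0.0, 1.0),          # default 0.95
    "candidateCount": (1, 8),    # default 1
}

def validate_generation_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, (lo, hi) in LIMITS.items():
        if key in config and not lo <= config[key] <= hi:
            problems.append(f"{key}={config[key]} is outside [{lo}, {hi}]")
    if config.get("maxOutputTokens", 0) > MAX_OUTPUT_TOKENS:
        problems.append(
            f"maxOutputTokens exceeds the model maximum of {MAX_OUTPUT_TOKENS}"
        )
    return problems
```

Catching out-of-range values locally avoids a round trip to the API for a request that would be rejected anyway.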