Migrate your application to Gemini 2 with the Gemini API in Vertex AI

This guide shows how to migrate generative AI applications from Gemini 1.x and PaLM models to Gemini 2 models.

Why migrate to Gemini 2?

Gemini 2 delivers significant performance improvements over Gemini 1.x and PaLM models, along with new capabilities. In addition, each model version has its own support and availability timeline.

Upgrading most generative AI applications to Gemini 2 shouldn't require significant reengineering of prompts or code. However, some applications do require prompt changes, and those changes are difficult to predict without first running the prompts through Gemini 2. For that reason, test your prompts with Gemini 2 before you migrate.

Significant code changes are only needed for certain breaking changes, or to use new Gemini 2 capabilities.
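Because prompt regressions are hard to predict, a small regression harness makes the pre-migration testing concrete. The sketch below is a minimal, hypothetical harness: `call_model` stands in for your own model client (for example, a wrapper around the Gen AI SDK), and the case names and fields are illustrative, not part of any SDK.

```python
# Minimal prompt-regression harness. `call_model` is a hypothetical
# stand-in for your model client (e.g., a wrapper around the Gen AI SDK).
from typing import Callable, Dict, List


def run_prompt_checks(
    call_model: Callable[[str], str],
    cases: List[Dict[str, object]],
) -> List[str]:
    """Run each prompt and return the names of failing cases.

    Each case is {"name": ..., "prompt": ..., "must_contain": [...]}:
    the response must contain every listed substring to pass.
    """
    failures = []
    for case in cases:
        response = call_model(case["prompt"])
        if not all(s in response for s in case["must_contain"]):
            failures.append(case["name"])
    return failures


# Demo with a stubbed model, to show the shape of a test suite:
cases = [
    {"name": "greeting", "prompt": "Say hello", "must_contain": ["hello"]},
    {"name": "json", "prompt": "Return JSON", "must_contain": ["{", "}"]},
]
stub = lambda prompt: "hello" if "hello" in prompt else "{}"
print(run_prompt_checks(stub, cases))  # -> []
```

Run the same case list against your current model and against the Gemini 2 candidate; any case that passes on one but fails on the other is a prompt you need to revise before migrating.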

Which Gemini 2 model should I migrate to?

When choosing a Gemini 2 model to migrate to, consider the features that your application requires and the cost of those features.

For an overview of Gemini 2 model features, see Gemini 2. For an overview of all Google models, see Google models.

For a comparison of available Gemini models, see the following table.

| Feature | Gemini 1.5 Pro | Gemini 1.5 Flash | Gemini 2.0 Flash | Gemini 2.0 Flash-Lite | Gemini 2.5 Pro | Gemini 2.5 Flash |
|---|---|---|---|---|---|---|
| Input modalities | text, documents, image, video, audio | text, documents, image, video, audio | text, documents, image, video, audio | text, documents, image, video, audio | text, documents, image, video, audio | text, documents, image, video, audio |
| Output modalities | text | text | text | text | text | text |
| Context window (total token limit) | 2,097,152 | 1,048,576 | 1,048,576 | 1,048,576 | 1,048,576 | 1,048,576 |
| Output context length | 8,192 | 8,192 | 8,192 | 8,192 | 64,192 | 64,192 |
| Grounding with Search | Yes | Yes | Yes | No | Yes | Yes |
| Function calling | Yes | Yes | Yes | Yes | Yes | Yes |
| Code execution | No | No | Yes | No | Yes | Yes |
| Context caching | Yes | Yes | Yes | No | Yes | Yes |
| Batch prediction | Yes | Yes | Yes | Yes | Yes | Yes |
| Live API | No | No | No | No | No | No |
| Latency | Most capable in 1.5 family | Fastest in 1.5 family | Fast + good cost efficiency | Fast + most cost efficient | Slower than Flash, but good cost efficiency | Fast + most cost efficient |
| Fine-tuning | Yes | Yes | Yes | Yes | Yes | Yes |
| Recommended SDK | Vertex AI SDK | Vertex AI SDK | Gen AI SDK | Gen AI SDK | Gen AI SDK | Gen AI SDK |
| Pricing units | Character | Character | Token | Token | Token | Token |
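The Recommended SDK row marks the main code change in practice: Gemini 1.x applications typically call the Vertex AI SDK, while Gemini 2 models are best served through the Google Gen AI SDK. The sketch below contrasts the two calling patterns; the project ID, location, and model names are placeholders for your own values, and imports are deferred inside each function so the file loads even where only one package is installed.

```python
# Side-by-side calling patterns for the two SDKs named in the table.
# Project ID, location, and model names are placeholders.


def call_gemini_15(prompt: str) -> str:
    """Gemini 1.x pattern: the Vertex AI SDK (vertexai package)."""
    import vertexai
    from vertexai.generative_models import GenerativeModel

    vertexai.init(project="your-project-id", location="us-central1")
    model = GenerativeModel("gemini-1.5-flash")
    return model.generate_content(prompt).text


def call_gemini_2(prompt: str) -> str:
    """Gemini 2 pattern: the Google Gen AI SDK (google-genai package)."""
    from google import genai

    client = genai.Client(
        vertexai=True, project="your-project-id", location="us-central1"
    )
    response = client.models.generate_content(
        model="gemini-2.0-flash", contents=prompt
    )
    return response.text
```

Keeping both call paths behind a function boundary like this also makes it easy to run the same evaluation suite against either model during migration.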

Migration process overview

This document outlines the following eight-step process for migrating your application to Gemini 2.

1. Before you begin
2. Document model evaluation and testing requirements
3. Code upgrades and testing
4. Offline evaluation
5. Assess evaluation results and tune the Gemini 2 prompts and hyperparameters
6. Load testing
7. Online evaluation
8. Production deployment
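For the offline evaluation step, one simple approach is to replay a fixed prompt set through both the current model and the Gemini 2 candidate and compare scores. The following is a minimal sketch under stated assumptions: you supply the two model callables and a per-response scoring function, and all names here (`offline_eval`, `baseline`, `candidate`, `score`) are hypothetical, not part of any SDK.

```python
# Replay a prompt set through two models and compare per-prompt scores.
from statistics import mean
from typing import Callable, List


def offline_eval(
    prompts: List[str],
    baseline: Callable[[str], str],      # current production model
    candidate: Callable[[str], str],     # Gemini 2 model under test
    score: Callable[[str, str], float],  # score(prompt, response) -> [0, 1]
) -> dict:
    """Score both models on the same prompts and report the deltas."""
    base_scores = [score(p, baseline(p)) for p in prompts]
    cand_scores = [score(p, candidate(p)) for p in prompts]
    regressions = [
        p for p, b, c in zip(prompts, base_scores, cand_scores) if c < b
    ]
    return {
        "baseline_mean": mean(base_scores),
        "candidate_mean": mean(cand_scores),
        "regressed_prompts": regressions,
    }


# Stubbed demo: a toy length-based scorer, just to show the shape.
result = offline_eval(
    prompts=["a", "bb"],
    baseline=lambda p: p,
    candidate=lambda p: p * 2,
    score=lambda p, r: min(len(r) / 4, 1.0),
)
print(result["regressed_prompts"])  # -> []
```

The list of regressed prompts is the key output: those are the prompts to tune in step 5 before moving on to load testing.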

Improving model performance

As you complete your migration, use the following tips to maximize Gemini 2 model performance:

  • Inspect your system instructions, prompts, and few-shot learning examples for any inconsistencies, contradictions, or irrelevant instructions and examples.
  • Test a more powerful model. For example, if you evaluated Gemini 2.0 Flash-Lite, try Gemini 2.0 Flash.
  • Examine any automated evaluation results to make sure they match human judgment, especially results that use a judge model. Make sure your judge model instructions don't contain inconsistencies or ambiguities.
  • One way to improve judge model instructions is to test the instructions with multiple humans in isolation and see if their judgments are consistent. If humans interpret the instructions differently and render different judgments, your judge model instructions are ambiguous.
  • Fine-tune the Gemini 2 model.
  • Examine evaluation outputs to look for patterns that show specific kinds of failures. Grouping failures into distinct failure modes or categories gives you more targeted evaluation data, which makes it easier to adjust prompts to address those errors.
  • Make sure you are independently evaluating different generative AI components.
  • Experiment with adjusting token sampling parameters.
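The judge-instruction check described above can be made concrete: give the same instructions and outputs to several human raters, then measure how often each pair of raters agrees. The sketch below computes simple pairwise agreement; the rater names, labels, and any threshold you apply to the result are illustrative assumptions, not a documented cutoff.

```python
# Measure how consistently human raters apply the same judge instructions.
from itertools import combinations
from typing import Dict, List


def pairwise_agreement(ratings: Dict[str, List[str]]) -> float:
    """Fraction of (rater pair, item) comparisons where labels match.

    `ratings` maps rater name -> list of labels, one per evaluated item.
    """
    raters = list(ratings.values())
    total = agree = 0
    for a, b in combinations(raters, 2):
        for label_a, label_b in zip(a, b):
            total += 1
            agree += label_a == label_b
    return agree / total if total else 0.0


# Three raters label four model outputs as pass/fail:
ratings = {
    "rater_1": ["pass", "pass", "fail", "pass"],
    "rater_2": ["pass", "fail", "fail", "pass"],
    "rater_3": ["pass", "pass", "fail", "fail"],
}
print(round(pairwise_agreement(ratings), 2))  # -> 0.67
# A low agreement score suggests the instructions are ambiguous and need
# revision before they are handed to a judge model.
```

If agreement is low, revise the instructions until human raters converge; only then is the judge model's output worth trusting.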

Getting help

If you need help, Google Cloud offers support packages to meet your needs, such as 24/7 coverage, phone support, and access to a technical support manager. For more information, see Google Cloud Support.

What's next