Overview of model tuning for Gemini

Model tuning is a crucial process in adapting Gemini to perform specific tasks with greater precision and accuracy. Model tuning works by providing a model with a training dataset that contains a set of examples of specific downstream tasks.

This page provides an overview of model tuning for Gemini, describes the tuning options available for Gemini, and helps you determine when each tuning option should be used.

Benefits of model tuning

Model tuning is an effective way to customize large models to your tasks. It's a key step in improving the model's quality and efficiency. Model tuning provides the following benefits:

  • Higher quality for your specific tasks.
  • Increased model robustness.
  • Lower inference latency and cost due to shorter prompts.

Tuning compared to prompt design

Tuning provides the following benefits over prompt design:

  • Allows deep customization of the model, which results in better performance on specific tasks.
  • Offers more consistent and reliable results.
  • Can handle more examples at once.

Tuning approaches

Parameter-efficient tuning and full fine-tuning are two approaches to customizing large models. Both methods have their advantages and implications in terms of model quality and resource efficiency.

Parameter-efficient tuning

Parameter-efficient tuning, also called adapter tuning, enables efficient adaptation of large models to your specific tasks or domain. Parameter-efficient tuning updates a relatively small subset of the model's parameters during the tuning process.
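To make "updates a relatively small subset of the model's parameters" concrete, here is a minimal sketch of one common form of adapter tuning: a low-rank update added to a frozen weight matrix. This is an illustration only. The source does not say which adapter technique Vertex AI uses, and the class name `LowRankAdapter`, the layer sizes, and the rank are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LowRankAdapter:
    """A frozen weight matrix W plus a trainable low-rank update B @ A.

    Hypothetical illustration; not the Vertex AI implementation.
    """

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        # Frozen base weights (d_out x d_in): never updated during tuning.
        self.w = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable adapter factors: A is (rank x d_in), B is (d_out x rank).
        self.a = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        # B starts at zero, so the adapted layer initially matches the base layer.
        self.b = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        """Apply (W + B @ A) to the input vector x."""
        delta = matmul(self.b, self.a)
        return [
            sum((w + d) * xi for w, d, xi in zip(w_row, d_row, x))
            for w_row, d_row in zip(self.w, delta)
        ]

    def trainable_params(self):
        return len(self.a) * len(self.a[0]) + len(self.b) * len(self.b[0])

    def frozen_params(self):
        return len(self.w) * len(self.w[0])


layer = LowRankAdapter(d_in=64, d_out=64, rank=4)
print(layer.trainable_params())  # 512 adapter parameters are tuned
print(layer.frozen_params())     # 4096 base parameters stay fixed
```

In this toy layer only 512 values would be tuned while 4096 stay frozen. The gap widens as the layer grows, because the adapter size scales with the rank rather than with the full matrix size.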

For details on how Vertex AI supports adapter tuning and serving, see the whitepaper Adaptation of Large Foundation Models.

Full fine-tuning

Full fine-tuning updates all parameters of the model, making it suitable for adapting the model to highly complex tasks, with the potential of achieving higher quality. However, full fine-tuning demands more computational resources for both tuning and serving, leading to higher overall costs.

Parameter-efficient tuning compared to full fine-tuning

Parameter-efficient tuning is more resource-efficient and cost-effective than full fine-tuning. It uses significantly fewer computational resources to train, and it can adapt the model faster with a smaller dataset. Because each task needs only its own small set of tuned parameters, parameter-efficient tuning also offers a solution for multi-task learning without the need for extensive retraining.
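The multi-task point can be illustrated with simple arithmetic. The sketch below compares how many parameters must be stored when every task gets a fully fine-tuned copy of the model versus one shared base model plus a small adapter per task. All numbers are hypothetical and chosen only for illustration; they are not Gemini's actual sizes.

```python
def full_fine_tuning_params(base_params, num_tasks):
    """Each task keeps its own complete copy of the model."""
    return base_params * num_tasks


def adapter_tuning_params(base_params, adapter_params, num_tasks):
    """One shared base model plus one small adapter per task."""
    return base_params + adapter_params * num_tasks


# Hypothetical sizes, for illustration only.
BASE_PARAMS = 1_000_000_000    # parameters in the base model
ADAPTER_PARAMS = 5_000_000     # parameters in one task adapter
NUM_TASKS = 10

full = full_fine_tuning_params(BASE_PARAMS, NUM_TASKS)
shared = adapter_tuning_params(BASE_PARAMS, ADAPTER_PARAMS, NUM_TASKS)

print(full)    # 10000000000
print(shared)  # 1050000000
```

Under these assumed sizes, ten fully fine-tuned copies need roughly 9.5 times the storage of one base model with ten adapters, and adding an eleventh task means training only another adapter.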

Tuning Gemini models

Gemini models (gemini-1.0-pro-002) support the following tuning method:

  • Supervised fine-tuning (parameter-efficient)

    Supervised fine-tuning for Gemini models improves the performance of the model by teaching it a new skill. Data that contains hundreds of labeled examples is used to teach the model to mimic a desired behavior or task. Each labeled example demonstrates what you want the model to output during inference.

    Supervised fine-tuning is ideal when you have a well-defined task with available labeled data. It adapts model behavior by adjusting the model's weights to minimize the difference between the model's predictions and the actual labels.
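The idea of adjusting weights to reduce the gap between predictions and labels can be sketched with a toy classifier. The example below assumes a cross-entropy loss and plain gradient descent on a tiny linear model; it is not Gemini's actual training procedure, and the data, learning rate, and function names are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """Class probabilities from a linear model (one weight row per class)."""
    return softmax([sum(w * x for w, x in zip(row, features)) for row in weights])


def mean_loss(weights, examples):
    """Average cross-entropy between predictions and the true labels."""
    return sum(-math.log(predict(weights, x)[y]) for x, y in examples) / len(examples)


def train_step(weights, examples, lr):
    """One gradient-descent step that lowers the mean cross-entropy."""
    grads = [[0.0] * len(weights[0]) for _ in weights]
    for x, y in examples:
        probs = predict(weights, x)
        for k, p in enumerate(probs):
            # Gradient of cross-entropy w.r.t. the logit: prediction minus label.
            err = p - (1.0 if k == y else 0.0)
            for j, xj in enumerate(x):
                grads[k][j] += err * xj / len(examples)
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grads)]


# Hypothetical labeled examples: (features, label).
examples = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.2], 0), ([0.1, 1.0], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]

loss_before = mean_loss(weights, examples)
for _ in range(200):
    weights = train_step(weights, examples, lr=0.5)
loss_after = mean_loss(weights, examples)

print(loss_before > loss_after)  # True: the loss drops as the weights adapt
```

Each labeled example pulls the weights toward the output you want, which is the same principle that supervised fine-tuning applies at a much larger scale.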


Quota is enforced on the number of concurrent tuning jobs. Every project comes with a default quota to run at least one tuning job. This is a global quota, shared across all available regions. If you want to run more jobs concurrently, you need to request additional quota for Global concurrent tuning jobs.


Supervised fine-tuning for gemini-1.0-pro-002 is in Preview.

  • While tuning is in Preview, there is no charge to tune a model.
  • After tuning a model, inference costs for the tuned model still apply. Inference pricing is the same for each stable version of Gemini 1.0 Pro.

For more information, see Vertex AI pricing and Available Gemini stable model versions.

What's next