Tune Translation LLM models by using supervised fine-tuning

This document describes how to tune a Translation LLM model by using supervised fine-tuning.

Before you begin

Before you begin, you must prepare a supervised fine-tuning dataset. Depending on your use case, there are different requirements.
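The training and validation datasets are JSON Lines (JSONL) files with one example per line. The following is a minimal sketch only; the "contents" layout is an assumption based on the Gemini-style sample datasets referenced later on this page, so check the dataset preparation guide for the authoritative schema.

import json

# Hypothetical example rows. The "contents" layout is an assumption based on the
# Gemini-style sample datasets referenced later on this page; check the dataset
# preparation guide for the authoritative schema.
examples = [
    {
        "contents": [
            {"role": "user", "parts": [{"text": "English: Hello. Spanish:"}]},
            {"role": "model", "parts": [{"text": "Hola."}]},
        ]
    },
]

# Write one JSON object per line (JSONL).
with open("sft_train_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")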
Supported Models
translation-llm-002
(In preview, only supports text tuning)

Create a tuning job

You can create a supervised fine-tuning job by using the REST API or the Vertex AI SDK for Python.
REST
To create a model tuning job, send a POST request by using the tuningJobs.create method. Some of the parameters are not supported by all of the models. Ensure that you include only the applicable parameters for the model that you're tuning.

Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
TUNING_JOB_REGION: The region where the tuning job runs, such as us-central1.
BASE_MODEL: The name of the model to tune. For Translation LLM tuning, use translation-llm-002.
TRAINING_DATASET_URI: The Cloud Storage URI of your training dataset in JSONL format.
VALIDATION_DATASET_URI: Optional: The Cloud Storage URI of your validation dataset in JSONL format.
TUNED_MODEL_DISPLAYNAME: A display name for the tuned model.

HTTP method and URL:

POST https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs
Request JSON body:

{
  "baseModel": "BASE_MODEL",
  "supervisedTuningSpec": {
    "trainingDatasetUri": "TRAINING_DATASET_URI",
    "validationDatasetUri": "VALIDATION_DATASET_URI"
  },
  "tunedModelDisplayName": "TUNED_MODEL_DISPLAYNAME"
}
To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs"PowerShell
request.json
,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs" | Select-Object -Expand ContentExample curl command
PROJECT_ID=myproject
LOCATION=us-central1
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/tuningJobs" \
-d \
$'{
  "baseModel": "translation-llm-002",
  "supervisedTuningSpec": {
    "training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_train_data.jsonl",
    "validation_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_validation_data.jsonl"
  },
  "tunedModelDisplayName": "tuned_translation_llm"
}'
Python
import time
import vertexai
from vertexai.tuning import sft
# TODO(developer): Update and un-comment below line.
# PROJECT_ID = os.environ["GOOGLE_CLOUD_PROJECT"]
vertexai.init(project=PROJECT_ID, location="us-central1")
sft_tuning_job = sft.train(
    source_model="translation-llm-002",
    train_dataset="gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_train_data.jsonl",
    # The following parameters are optional
    validation_dataset="gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_validation_data.jsonl",
    tuned_model_display_name="tuned_translation_llm_002",
)

# Polling for job completion
while not sft_tuning_job.has_ended:
    time.sleep(60)
    sft_tuning_job.refresh()
print(sft_tuning_job.tuned_model_name)
print(sft_tuning_job.tuned_model_endpoint_name)
print(sft_tuning_job.experiment)
# Example response:
# projects/123456789012/locations/us-central1/models/1234567890@1
# projects/123456789012/locations/us-central1/endpoints/123456789012345
# <google.cloud.aiplatform.metadata.experiment_resources.Experiment object at 0x7b5b4ae07af0>
View a list of tuning jobs

You can view a list of tuning jobs in your current project by using the Google Cloud console, the Vertex AI SDK for Python, or by sending a GET request by using the tuningJobs method.

REST

To view a list of model tuning jobs, send a GET request by using the tuningJobs.list method.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
TUNING_JOB_REGION: The region where the tuning job runs.

HTTP method and URL:
GET https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs
To send your request, choose one of these options:

curl

Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs"PowerShell
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs" | Select-Object -Expand Content Python
Console

To view your tuning jobs in the Google Cloud console, go to the Vertex AI Studio page. Your Translation LLM tuning jobs are listed in the table under the Translation LLM tuned models section.
Get details of a tuning job

You can get the details of a tuning job in your current project by using the Google Cloud console, the Vertex AI SDK for Python, or by sending a GET request by using the tuningJobs method.

REST

To get the details of a model tuning job, send a GET request by using the tuningJobs.get method and specify the TUNING_JOB_ID.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
TUNING_JOB_REGION: The region where the tuning job runs.
TUNING_JOB_ID: The ID of the tuning job.

HTTP method and URL:
GET https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID
To send your request, choose one of these options:

curl

Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID"PowerShell
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID" | Select-Object -Expand Content Python
Console

To view details of a tuned model in the Google Cloud console, go to the Vertex AI Studio page. In the Translation LLM tuned models table, find your model and click Details. The details of your model are shown.
Cancel a tuning job

You can cancel a tuning job in your current project by using the Google Cloud console, the Vertex AI SDK for Python, or by sending a POST request by using the tuningJobs method.

REST

To cancel a model tuning job, send a POST request by using the tuningJobs.cancel method and specify the TUNING_JOB_ID.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
TUNING_JOB_REGION: The region where the tuning job runs.
TUNING_JOB_ID: The ID of the tuning job.

HTTP method and URL:
POST https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID:cancel
To send your request, choose one of these options:

curl

Execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID:cancel"PowerShell
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID:cancel" | Select-Object -Expand Content Python
Console

To cancel a tuning job in the Google Cloud console, go to the Vertex AI Studio page. In the Translation LLM tuned models table, find your tuning job, and then click Cancel.
Test the tuned model with a prompt

You can test a tuning job in your current project by using the Vertex AI SDK for Python or by sending a POST request by using the generateContent method. The following example prompts the model to translate "Hello." from English into Spanish.

REST

To test a tuned model with a prompt, send a POST request and specify the ENDPOINT_ID of the tuned model's endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
TUNING_JOB_REGION: The region where the tuning job ran.
ENDPOINT_ID: The ID of the tuned model's endpoint.

HTTP method and URL:
POST https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/endpoints/ENDPOINT_ID:generateContent
Request JSON body:

{
  "contents": [
    {
      "role": "USER",
      "parts": {
        "text": "English: Hello. Spanish:"
      }
    }
  ]
}
To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/endpoints/ENDPOINT_ID:generateContent"PowerShell
request.json
,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/endpoints/ENDPOINT_ID:generateContent" | Select-Object -Expand Content Python
import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.tuning import sft

vertexai.init(project="<PROJECT_ID>", location="<TUNING_JOB_REGION>")
sft_tuning_job = sft.SupervisedTuningJob("projects/<PROJECT_ID>/locations/<TUNING_JOB_REGION>/tuningJobs/<TUNING_JOB_ID>")
tuned_model = GenerativeModel(sft_tuning_job.tuned_model_endpoint_name)
print(tuned_model.generate_content("English: Hello. Spanish:"))
Tuning and validation metrics

You can configure a model tuning job to collect and report model tuning and model evaluation metrics, which can then be visualized in Vertex AI Studio.

To view details of a tuned model in the Google Cloud console, go to the Vertex AI Studio page. In the Tune and Distill table, click the name of the tuned model that you want to view metrics for. The tuning metrics appear under the Monitor tab.
Model tuning metrics

The model tuning job automatically collects the following tuning metrics for translation-llm-002:

/train_total_loss: Loss for the tuning dataset at a training step.
/train_fraction_of_correct_next_step_preds: The token accuracy at a training step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the tuning dataset.
/train_num_predictions: Number of predicted tokens at a training step.

Model validation metrics
You can configure a model tuning job to collect the following validation metrics for translation-llm-002:

/eval_total_loss: Loss for the validation dataset at a validation step.
/eval_fraction_of_correct_next_step_preds: The token accuracy at a validation step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the validation dataset.
/eval_num_predictions: Number of predicted tokens at a validation step.

The metrics visualizations are available after the tuning job starts running. They are updated in real time as tuning progresses. If you don't specify a validation dataset when you create the tuning job, only the visualizations for the tuning metrics are available.
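To work with these metrics outside of the console, you can read them from the Vertex AI experiment that the tuning job is associated with (see sft_tuning_job.experiment in the earlier Python sample). The following is a minimal sketch that assumes the experiment's name attribute and the aiplatform.get_experiment_df() helper; verify both against the SDK reference before relying on them.

import vertexai
from google.cloud import aiplatform
from vertexai.tuning import sft

# TODO(developer): Update the project ID, region, and tuning job ID.
vertexai.init(project="PROJECT_ID", location="us-central1")

tuning_job = sft.SupervisedTuningJob(
    "projects/PROJECT_ID/locations/us-central1/tuningJobs/TUNING_JOB_ID"
)

# The tuning job logs its metrics to a Vertex AI experiment; export that
# experiment's data as a pandas DataFrame for your own analysis.
metrics_df = aiplatform.get_experiment_df(tuning_job.experiment.name)
print(metrics_df.head())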
What's next