Overview of persistent resources
A Vertex AI persistent resource is a long-running cluster that you can
create to run custom training jobs. After a training job completes, the
persistent resource remains available to run other training jobs until you
delete it. You can use a persistent resource to ensure compute resource
availability and to reduce the job startup time that's otherwise needed for
compute resource creation. Persistent resources support all VMs and GPUs that
are supported by custom training jobs. This page explains when to use a
persistent resource and gives you information about billing and quota.
When to use a persistent resource
We recommend using persistent resources in the following scenarios:
You want to ensure capacity availability for critical ML workloads or during
peak seasons. Unlike with custom jobs, where the training service releases
the resources after job completion, a persistent resource remains available until
it's deleted.
You're submitting the same job multiple times and can benefit from data and
image caching by running the jobs on the same persistent resource.
You run many short-lived training jobs where the actual training time is
shorter than the job startup time.
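The last scenario can be made concrete with a little arithmetic. The following is a minimal sketch, not a measurement: the job count and the startup and training durations are hypothetical, the jobs are assumed to run one after another, and any per-job scheduling overhead on an already running persistent resource is treated as zero.

```python
def total_minutes_per_job_provisioning(num_jobs, startup_min, train_min):
    """Every job provisions its own compute, so every job pays the startup time."""
    return num_jobs * (startup_min + train_min)


def total_minutes_persistent(num_jobs, startup_min, train_min):
    """Compute is provisioned once; later jobs reuse the running cluster.

    Simplifying assumption: no per-job overhead once the cluster is up.
    """
    return startup_min + num_jobs * train_min


# Hypothetical workload: 20 sequential jobs, 6 min startup, 2 min training each.
jobs, startup, train = 20, 6, 2
separate = total_minutes_per_job_provisioning(jobs, startup, train)
shared = total_minutes_persistent(jobs, startup, train)

print(f"per-job provisioning: {separate} min")  # 160 min
print(f"persistent resource:  {shared} min")    # 46 min
```

Under these assumptions, most of the elapsed time in the per-job case is startup rather than training, which is the situation where a persistent resource is likely to help most.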
For more context on when and why to use a persistent resource, see the blog post
Bringing capacity assurance and faster startup times to Vertex AI Training.
Billing details
You are billed for the entire time that a persistent resource is in a
running state, regardless of whether a job is running on it. Each instance
in the persistent resource pool is billed by core hour. Jobs that run on a
persistent resource aren't charged separately; you are billed only for the
persistent resource.
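To illustrate what billing by core hour means when the resource sits idle part of the time, here is a rough sketch. The instance count, the vCPU count per instance, and the hours are hypothetical, and no price is applied because rates depend on the machine type and region.

```python
def core_hours(instances, vcpus_per_instance, hours):
    """Core hours accumulated by a pool of identical instances."""
    return instances * vcpus_per_instance * hours


# Hypothetical pool: 4 instances with 8 vCPUs each, running for 24 hours.
# Jobs keep the pool busy for only 6 of those hours.
instances, vcpus = 4, 8
billed = core_hours(instances, vcpus, hours=24)  # billed for the whole running time
busy = core_hours(instances, vcpus, hours=6)     # time actually spent running jobs
idle_share = 1 - busy / billed

print(f"billed core hours: {billed}")       # 768
print(f"busy core hours:   {busy}")         # 192
print(f"idle share:        {idle_share:.0%}")  # 75%
```

The billed figure depends only on how long the resource is running, not on how many jobs ran during that time.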
If you set up autoscaling for your persistent resource, you pay only
for the provisioned instances. For example, if min-replica-count is set to 4,
4 instances are always provisioned, and this is the minimum that you're billed
for. When your workload increases, the resource pool might scale up to 6 instances
to accommodate the increased demand. You're then billed for the 6 provisioned
instances until your resource pool scales down again. To avoid paying for idle nodes,
use autoscaling for your persistent resource, or delete it when you no longer
need it. To learn more about pricing, see the Custom-trained models
section of the Vertex AI pricing page.
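The autoscaling example above can be worked through as instance hours. This is a sketch under assumed numbers: the 24-hour timeline and the length of the scale-up period are hypothetical, and only the replica counts 4 and 6 come from the example.

```python
# Each entry is (hours, provisioned_instances) for one stretch of the day.
# Hypothetical timeline: 10 h at the minimum, 3 h scaled up, 11 h back at the minimum.
timeline = [(10, 4), (3, 6), (11, 4)]

min_replica_count = 4
peak_replicas = 6
total_hours = sum(hours for hours, _ in timeline)

# Billing follows the instances that are actually provisioned in each stretch.
billed_instance_hours = sum(hours * count for hours, count in timeline)

# Lower bound: the minimum replicas are always provisioned.
floor_instance_hours = min_replica_count * total_hours

# For comparison: a fixed-size pool kept at the peak size all day.
fixed_peak_instance_hours = peak_replicas * total_hours

print(billed_instance_hours)      # 102
print(floor_instance_hours)       # 96
print(fixed_peak_instance_hours)  # 144
```

In this timeline, autoscaling bills 102 instance hours, slightly above the 96-hour floor set by the minimum replica count and well below the 144 hours of a pool fixed at the peak size.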
Quotas
Persistent resources use your training quota, so verify that you have sufficient
quota before you create a persistent resource. To learn more about quotas, see
Training quotas and limits.
What's next
Create and use a persistent resource.
Run training jobs on a persistent resource.
Get information about a persistent resource.
Reboot a persistent resource.
Delete a persistent resource.
Last updated 2025-08-29 UTC.