Troubleshoot deleting clusters
This page shows you how to resolve issues with deleting ephemeral
Dataproc clusters in Cloud Data Fusion.
When Cloud Data Fusion creates an ephemeral Dataproc cluster
during pipeline run provisioning, the cluster gets deleted after the pipeline
run is finished. In rare cases, the cluster deletion fails.
Strongly recommended: Upgrade to the most recent Cloud Data Fusion
version to ensure that ephemeral clusters are deleted reliably.
Set Max Idle Time
To resolve this issue, configure the Max Idle Time value. This lets
Dataproc delete clusters automatically, even if the explicit delete call
at the end of the pipeline run fails.
Max Idle Time is available in Cloud Data Fusion versions 6.4 and later.
In Cloud Data Fusion 6.6 and later, Max Idle Time is set to 4 hours by
default.
To override the default time in the default compute profile, follow these steps:
1. Open the instance in the Cloud Data Fusion web interface.
2. Click System Admin > Configuration > System Preferences.
3. Click Edit System Preferences and add the key
   system.profile.properties.idleTTL and the value, in IntegerUnit format,
   such as 30m.
Recommended: For versions before 6.6, set Max Idle Time manually to 30
minutes or greater.
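Under the hood, Max Idle Time corresponds to Dataproc's scheduled-deletion feature. As a point of comparison, not a Cloud Data Fusion step, the same behavior can be configured directly when creating a standalone Dataproc cluster; the cluster name, region, and project ID below are placeholders:

```shell
# Sketch: create a Dataproc cluster that Dataproc itself deletes after
# 30 minutes of inactivity, even if no external delete call ever arrives.
# CLUSTER_NAME, REGION, and PROJECT_ID are placeholders to replace.
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --project=PROJECT_ID \
    --max-idle=30m
```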
Delete clusters manually
If you cannot upgrade your version or configure the Max Idle Time option,
delete stale clusters manually instead:
1. Get each project ID where the clusters were created:
    1. In the pipeline's runtime arguments, check if the
       Dataproc project ID is customized for the run.
    2. If a Dataproc project ID is not specified explicitly,
       determine which provisioner is used, and then check for a project ID:
        1. In the pipeline runtime arguments, check the system.profile.name
           value.
        2. Open the provisioner settings and check if the
           Dataproc project ID is set. If the setting is not
           present or the field is empty, the pipeline uses the project that
           the Cloud Data Fusion instance runs in.
    Important: Multiple pipeline runs might use different projects. Be sure
    to get all of the project IDs.
2. For each project:
    1. Open the project in the Google Cloud console and go to the
       Dataproc Clusters page:
       https://console.cloud.google.com/dataproc/clusters
    2. Sort the clusters by the date that they were created, from oldest to
       newest.
    3. If the info panel is hidden, click Show info panel and go to the
       Labels tab.
    4. For every cluster that is not in use (for example, more than a day
       has elapsed), check whether it has a Cloud Data Fusion version label,
       which indicates that Cloud Data Fusion created the cluster.
    5. Select the checkbox by the cluster name and click Delete.
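The console steps above can also be sketched with the gcloud CLI. This is a hedged example: the label name goog-datafusion-version is an assumption, so verify the exact label on one of your clusters before relying on the filter, and review the list output before deleting anything.

```shell
# Sketch: list Dataproc clusters in one project and region that carry a
# Cloud Data Fusion version label, sorted oldest first.
# PROJECT_ID and REGION are placeholders; the label key
# goog-datafusion-version is an assumption to verify on your clusters.
gcloud dataproc clusters list \
    --project=PROJECT_ID \
    --region=REGION \
    --filter='labels.goog-datafusion-version:*' \
    --sort-by=status.stateStartTime

# After confirming that a cluster is stale, delete it by name.
gcloud dataproc clusters delete CLUSTER_NAME \
    --project=PROJECT_ID \
    --region=REGION \
    --quiet
```

Run the list command once per project ID collected in step 1, since each project can hold its own stale clusters.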
Skip cluster deletion
For debugging purposes, you can stop the automatic deletion of an ephemeral
cluster.
To stop the deletion, set the Skip Cluster Deletion property to True. You
must manually delete the cluster after you finish debugging.
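As one hedged way to set this for a single run rather than profile-wide, the property can be passed as a runtime argument when starting the pipeline through the CDAP REST API. The instance endpoint, namespace, pipeline name, and the exact property key system.profile.properties.skipDelete are assumptions to confirm against your provisioner settings before use:

```shell
# Sketch: start a pipeline run with cluster deletion skipped, assuming
# the Dataproc provisioner exposes a skipDelete property.
# CDAP_ENDPOINT and MY_PIPELINE are placeholders; the namespace is
# assumed to be "default".
curl -X POST \
  "${CDAP_ENDPOINT}/v3/namespaces/default/apps/MY_PIPELINE/workflows/DataPipelineWorkflow/start" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -d '{"system.profile.properties.skipDelete": "true"}'
```

Remember that every cluster left behind this way accrues cost until you delete it yourself.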
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-29 UTC."],[[["\u003cp\u003eThis guide addresses the issue of failed ephemeral Dataproc cluster deletions in Cloud Data Fusion, which can occur after a pipeline run.\u003c/p\u003e\n"],["\u003cp\u003eUpgrading to the latest Cloud Data Fusion version is strongly recommended to ensure automatic cluster cleanup.\u003c/p\u003e\n"],["\u003cp\u003eConfiguring the \u003ccode\u003eMax Idle Time\u003c/code\u003e setting (available in versions 6.4+) enables automatic cluster deletion by Dataproc even if the pipeline deletion fails, with a default of 4 hours in version 6.6+.\u003c/p\u003e\n"],["\u003cp\u003eIf upgrading or setting \u003ccode\u003eMax Idle Time\u003c/code\u003e isn't possible, you can manually delete stale clusters by identifying the relevant project IDs and deleting the clusters from the Dataproc Clusters page.\u003c/p\u003e\n"],["\u003cp\u003eFor debugging, the \u003ccode\u003eSkip Cluster Deletion\u003c/code\u003e property can be set to \u003ccode\u003eTrue\u003c/code\u003e to prevent cluster deletion after a pipeline run, but you must manually delete the cluster afterward.\u003c/p\u003e\n"]]],[],null,["# Troubleshoot deleting clusters\n\nThis page shows you how to resolve issues with deleting ephemeral\nDataproc clusters in Cloud Data Fusion.\n\nWhen Cloud Data Fusion creates an ephemeral Dataproc cluster\nduring pipeline run provisioning, the cluster gets deleted after the pipeline\nrun is finished. 
In rare cases, the cluster deletion fails.\n\n**Strongly recommended**: Upgrade to the most recent Cloud Data Fusion\nversion to ensure proper cluster maintenance.\n\nSet Max Idle Time\n-----------------\n\nTo resolve this issue, configure the **Max Idle Time** value. This lets\nDataproc delete clusters automatically, even if an explicit call\non the pipeline finish fails.\n\n`Max Idle Time` is available in Cloud Data Fusion versions 6.4 and later.\n\nIn Cloud Data Fusion 6.6 and later, **Max Idle Time** is set to 4 hours by\ndefault.\n\nTo override the default time in the default compute profile, follow these steps:\n\n1. Open the instance in the Cloud Data Fusion web interface.\n2. Click **System Admin** \\\u003e **Configuration** \\\u003e **System\n Preferences**.\n3. Click **Edit System Preferences** and add the key `system.profile.properties.idleTTL` and the value, in IntegerUnit format, such as `30m`.\n\n**Recommended** : For versions before 6.6, set `Max Idle Time` manually to 30\nminutes or greater.\n\nDelete clusters manually\n------------------------\n\nIf you cannot upgrade your version or configure the `Max Idle Time` option,\ninstead delete stale clusters manually:\n\n1. Get each project ID where the clusters were created:\n\n 1. In the pipeline's runtime arguments, check if the\n Dataproc project ID is customized for the run.\n\n 2. If a Dataproc project ID is not specified explicitly,\n determine which provisioner is used, and then check for a project ID:\n\n 1. In the pipeline runtime arguments, check the `system.profile.name`\n value.\n\n 2. Open the provisioner settings and check if the\n Dataproc project ID is set. If the setting is not\n present or the field is empty, the project that the\n Cloud Data Fusion instance is running in is used.\n\n | **Important:** Multiple pipeline runs might use different projects. Be sure to get all of the project IDs.\n2. For each project:\n\n 1. 
Open the project in the Google Cloud console and go to the\n Dataproc **Clusters** page.\n\n [Go to Clusters](https://console.cloud.google.com/dataproc/clusters)\n 2. Sort the clusters by the date that they were created, from oldest to\n newest.\n\n 3. If the info panel is hidden, click **Show info panel** and go to the\n **Labels** tab.\n\n 4. For every cluster that is not in use---for example, more than a day has\n elapsed---check if it has a Cloud Data Fusion version label. That\n is an indication that it was created by Cloud Data Fusion.\n\n 5. Select the checkbox by the cluster name and click **Delete**.\n\nSkip cluster deletion\n---------------------\n\nFor debugging purposes, you can stop the automatic deletion of an ephemeral\ncluster.\n\nTo stop the deletion, set the `Skip Cluster Deletion` property to `True`. You\nmust manually delete the cluster after you finish debugging."]]