Recreate and update a Dataproc on GKE virtual cluster
You can copy an existing Dataproc on GKE virtual cluster's configuration,
update the copied configuration, and then create a new Dataproc on GKE
cluster using the updated configuration.
Recreate and update a Dataproc on GKE cluster
### gcloud

1. Set environment variables:

   ```
   # Replace the placeholder values before running.
   CLUSTER="EXISTING_CLUSTER_NAME"  # name of the existing Dataproc on GKE cluster
   REGION="REGION"                  # cluster region (see /compute/docs/regions-zones#available)
   ```

2. Export the existing Dataproc on GKE cluster configuration to a YAML file.

   ```
   gcloud dataproc clusters export "$CLUSTER" \
     --region="$REGION" > "${CLUSTER}-config.yaml"
   ```

3. Update the configuration.

   1. Remove the
      [`kubernetesNamespace`](/dataproc/docs/reference/rest/v1/projects.regions.clusters#KubernetesClusterConfig.FIELDS.kubernetes_namespace)
      field. Removing this field is necessary to avoid a namespace conflict
      when you create the updated cluster.

      Sample `sed` command to remove the `kubernetesNamespace` field:

      ```
      # -i.bak edits the file in place and keeps the original as ${CLUSTER}-config.yaml.bak.
      sed -E -i.bak 's/kubernetesNamespace: .+$//g' "${CLUSTER}-config.yaml"
      ```

   2. Make any additional changes to the Dataproc on GKE virtual cluster
      configuration settings, such as changing the Spark
      [componentVersion](/dataproc/docs/reference/rest/v1/projects.regions.clusters#KubernetesSoftwareConfig.FIELDS.component_version).

4. If the new cluster will have the same name as the existing cluster (that is,
   you are replacing the original cluster),
   [delete the existing Dataproc on GKE virtual cluster](/dataproc/docs/guides/dpgke/dataproc-gke-delete-cluster).

5. If you deleted the original cluster, wait for the delete operation to
   finish. Then import the updated cluster configuration to create a new
   Dataproc on GKE virtual cluster with the updated settings.

   ```
   gcloud dataproc clusters import "$CLUSTER" \
     --region="$REGION" \
     --source="${CLUSTER}-config.yaml"
   ```

### API

1. Set environment variables:

   ```
   # Replace the placeholder values before running.
   PROJECT="PROJECT_ID"             # project ID used in the request URLs in the following steps
   CLUSTER="EXISTING_CLUSTER_NAME"  # name of the existing Dataproc on GKE cluster
   REGION="REGION"                  # cluster region (see /compute/docs/regions-zones#available)
   ```

2. Export the existing Dataproc on GKE cluster configuration to a JSON file.

   ```
   curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/${REGION}/clusters/${CLUSTER}?alt=json" \
     > "${CLUSTER}-config.json"
   ```

3. Update the configuration.

   1. Remove the
      [`kubernetesNamespace`](/dataproc/docs/reference/rest/v1/projects.regions.clusters#KubernetesClusterConfig.FIELDS.kubernetes_namespace)
      field. Removing this field is necessary to avoid a namespace conflict
      when you create the updated cluster.

      Sample `jq` command to remove the `kubernetesNamespace` field:

      ```
      # Write the filtered JSON to a temporary file, then replace the original.
      jq 'del(.virtualClusterConfig.kubernetesClusterConfig.kubernetesNamespace)' \
        "${CLUSTER}-config.json" > "${CLUSTER}-config.tmp.json" \
        && mv "${CLUSTER}-config.tmp.json" "${CLUSTER}-config.json"
      ```

   2. Make any additional changes to the Dataproc on GKE virtual cluster
      configuration settings, such as changing the Spark
      [componentVersion](/dataproc/docs/reference/rest/v1/projects.regions.clusters#KubernetesSoftwareConfig.FIELDS.component_version).

4. If the new cluster will have the same name as the existing cluster (that is,
   you are replacing the original cluster), delete the existing Dataproc on GKE
   virtual cluster.

   ```
   curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/${REGION}/clusters/${CLUSTER}"
   ```

5. If you deleted the original cluster, wait for the delete operation to
   finish. Then import the updated cluster configuration to create a new
   Dataproc on GKE virtual cluster with the updated settings.

   ```
   curl -i -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d "@${CLUSTER}-config.json" \
     "https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/${REGION}/clusters?alt=json"
   ```

### Console

The Google Cloud console does not support recreating a Dataproc on GKE
virtual cluster by importing an existing cluster's configuration.

Last updated 2025-08-28 UTC.
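The instruction to wait for the delete operation to finish before importing the updated configuration can be scripted. The following is a minimal sketch, not part of the official procedure: `wait_until_gone` is a hypothetical helper that polls an arbitrary command until it fails, on the assumption that `gcloud dataproc clusters describe` exits with a non-zero status once the cluster no longer exists. The attempt count and delay are illustrative values.

```shell
# wait_until_gone MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND repeatedly and returns 0 as soon as it
# exits non-zero (taken to mean "resource no longer exists"). Returns 1 if
# COMMAND still succeeds after MAX_ATTEMPTS tries.
# Caveat: any failure (for example, an auth or network error) is also
# treated as "gone", so verify the result before relying on it.
wait_until_gone() {
  max_attempts=$1
  delay_seconds=$2
  shift 2
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    if ! "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
  return 1
}

# Example (assumes CLUSTER and REGION are set as in step 1):
#   wait_until_gone 60 10 gcloud dataproc clusters describe "$CLUSTER" --region="$REGION" \
#     && gcloud dataproc clusters import "$CLUSTER" --region="$REGION" \
#          --source="${CLUSTER}-config.yaml"
```

Passing the polled command as arguments keeps the helper independent of `gcloud`, so the same loop could wrap the `curl -X GET` request from the API procedure if that request is run with `--fail`, which makes `curl` exit non-zero on an HTTP error response.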