Dataproc local SSDs
To supplement the boot disk, you can attach
local Solid State Drives (local SSDs)
to master, primary worker, and secondary worker nodes in your cluster.
When local SSDs are provided to the cluster, both HDFS and scratch data,
such as shuffle outputs, use the local SSDs instead of the boot
persistent disk.
Local SSDs can provide faster read and write times than persistent disk
(see Local SSD Performance).
Each local SSD has a fixed size of 375 GB, but you can attach multiple local SSDs to
increase SSD storage (see
About Local SSDs).
Each local SSD is mounted to /mnt/<id> in Dataproc cluster nodes.
Local SSDs use ext4 as the default file system.
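Because each local SSD has a fixed size, total local SSD storage scales only with the number of disks attached. The following is a minimal sketch of that arithmetic; the disk and worker counts are hypothetical values chosen for illustration, and the result is raw capacity before file system overhead or HDFS replication.

```shell
# Fixed size of each local SSD, as stated above.
SSD_SIZE_GB=375

# Hypothetical values for illustration only.
SSDS_PER_WORKER=2
NUM_WORKERS=4

# Raw local SSD capacity per worker node and across all workers.
per_node_gb=$((SSD_SIZE_GB * SSDS_PER_WORKER))
cluster_gb=$((per_node_gb * NUM_WORKERS))

echo "Local SSD capacity per worker: ${per_node_gb} GB"
echo "Local SSD capacity across ${NUM_WORKERS} workers: ${cluster_gb} GB"
```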
gcloud command
Use the
gcloud dataproc clusters create
command with the --num-master-local-ssds,
--num-worker-local-ssds, and
--num-secondary-worker-local-ssds flags to attach local
SSDs to the cluster's master, primary worker, and secondary worker
nodes.
Local SSDs can be attached to Dataproc VMs using a SCSI
(Small Computer System Interface) or NVMe (Non-Volatile Memory Express) interface (see
local SSD performance).
By default, Dataproc cluster VMs use the SCSI interface for local SSDs. Use the
gcloud dataproc clusters create
command with the --master-local-ssd-interface,
--worker-local-ssd-interface, and
--secondary-worker-local-ssd-interface flags
to specify the local SSD interface (SCSI or NVME) for master, primary worker, and secondary
worker nodes.
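Example: the following sketch combines the count and interface flags described above in one command. The cluster name and region are hypothetical placeholders, and any other arguments your cluster needs (machine types, number of secondary workers, and so on) are omitted. It assumes the selected machine types support the NVME interface. The script prints the command by default and runs it only when RUN=1 is set, so you can review it first.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
REGION="us-central1"

# Load the command into the positional parameters so it can be
# printed for review before anything is executed.
set -- gcloud dataproc clusters create "${CLUSTER_NAME}" \
  --region="${REGION}" \
  --num-master-local-ssds=1 \
  --num-worker-local-ssds=1 \
  --num-secondary-worker-local-ssds=1 \
  --master-local-ssd-interface=NVME \
  --worker-local-ssd-interface=NVME \
  --secondary-worker-local-ssd-interface=NVME

# Dry run by default: print the command. Set RUN=1 to create the cluster.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

The secondary worker flags are expected to matter only when the cluster is created with secondary workers, so add the corresponding arguments if you need them.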
REST API
Set the
numLocalSsds
field (in diskConfig) in the masterConfig, workerConfig, and
secondaryWorkerConfig InstanceGroupConfig
in a
cluster.create
API request to attach local SSDs to the cluster's master, primary worker, and
secondary worker nodes.
Local SSDs can be attached to Dataproc VMs using a SCSI or NVMe interface (see
local SSD performance); the default is SCSI. Set the
localSsdInterface
field (in diskConfig) in the masterConfig, workerConfig, and
secondaryWorkerConfig InstanceGroupConfig
in a
cluster.create
API request to "SCSI" or "NVME" to specify the interface used to attach local SSDs to the cluster's master,
primary worker, and secondary worker nodes.
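The two fields can be set together in one request body. The following is a minimal sketch, assuming the fields sit in the diskConfig of each InstanceGroupConfig and that the request is sent to the v1 clusters endpoint with an access token from gcloud auth print-access-token. The project ID, region, and cluster name are hypothetical placeholders, and all other cluster settings (such as instance counts and machine types) are omitted. The script prints the request by default and sends it only when RUN=1 is set.

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="example-project"
REGION="us-central1"
CLUSTER_NAME="example-cluster"

# Request body: one local SSD with the NVME interface per node type.
BODY=$(cat <<EOF
{
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "masterConfig": {
      "diskConfig": { "numLocalSsds": 1, "localSsdInterface": "NVME" }
    },
    "workerConfig": {
      "diskConfig": { "numLocalSsds": 1, "localSsdInterface": "NVME" }
    },
    "secondaryWorkerConfig": {
      "diskConfig": { "numLocalSsds": 1, "localSsdInterface": "NVME" }
    }
  }
}
EOF
)

URL="https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"

# Dry run by default: print the request. Set RUN=1 to send it.
echo "POST ${URL}"
echo "${BODY}"
if [ "${RUN:-0}" = "1" ]; then
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "${BODY}" \
    "${URL}"
fi
```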
Console
Create a cluster and attach local SSDs to the master,
primary worker, and secondary worker nodes from the Configure nodes panel of the
Dataproc
Create a cluster page
of the Google Cloud console.
Last updated 2025-08-25 UTC.