[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-18 UTC."],[],[],null,["# Use hierarchical namespace enabled buckets for Hadoop workloads\n\nThis page describes how to use [hierarchical namespace](/storage/docs/hns-overview) enabled buckets for Hadoop workloads.\n\nOverview\n--------\n\nWhen using a Cloud Storage bucket with hierarchical namespace, you can configure the [Cloud Storage connector](/dataproc/docs/concepts/connectors/cloud-storage) to use the [rename folder](/storage/docs/rename-hns-folders) operation for workloads such as Hadoop, Spark, and Hive.\n\nIn a bucket without hierarchical namespace, a rename operation in Hadoop, Spark,\nor Hive involves multiple object copy and delete jobs, which impacts\nperformance and consistency. Renaming a folder using the Cloud Storage\nconnector improves performance and ensures consistency when handling folders\nwith a large number of objects.\n\nBefore you begin\n----------------\n\nTo use features of hierarchical namespace buckets, use the following Cloud Storage\nconnector versions:\n\n- 2.2.23 or later (if you are using version 2.x.x)\n- 3.0.1 or later (if you are using version 3.x.x)\n\nConnector version 3.0.0 and versions older than 2.2.23 have limitations. For more information about the limitations, see [Compatibility with\nCloud Storage connector version 3.0.0 or versions older than\n2.2.23](/storage/docs/hns-hadoop-workloads#backwards-compatibility-connector).\n\nEnable the Cloud Storage connector on a cluster\n-----------------------------------------------\n\nThis section describes how to enable the Cloud Storage connector on a Dataproc cluster and on a self-managed Hadoop cluster. 
\n\n### Dataproc\n\nYou can use the Google Cloud CLI to create a Dataproc cluster and enable the Cloud Storage connector to perform folder operations.\n\n1. Create a Dataproc cluster using the following command:\n\n ```\n gcloud dataproc clusters create CLUSTER_NAME \\\n --properties=core:fs.gs.hierarchical.namespace.folders.enable=true,core:fs.gs.http.read-timeout=30000\n ```\n\n Where:\n - \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e is the name of the cluster. For example, `my-cluster`.\n - `fs.gs.hierarchical.namespace.folders.enable` enables the connector to use folder operations on buckets with hierarchical namespace enabled.\n - `fs.gs.http.read-timeout` is the maximum time allowed, in milliseconds, to read data from an established connection. This is an optional setting.\n\n | **Note:** If you are using the Cloud Storage connector version 3.0.0 or a version older than 2.2.23, the configuration setting `fs.gs.hierarchical.namespace.folders.enable` is not supported, and including it results in an error.\n\n### Self-managed Hadoop\n\nYou can enable the Cloud Storage connector on your self-managed Hadoop cluster to perform folder operations.\n\n1. Add the following properties to the `core-site.xml` configuration file:\n\n ```\n \u003cproperty\u003e\n   \u003cname\u003efs.gs.hierarchical.namespace.folders.enable\u003c/name\u003e\n   \u003cvalue\u003etrue\u003c/value\u003e\n \u003c/property\u003e\n \u003cproperty\u003e\n   \u003cname\u003efs.gs.http.read-timeout\u003c/name\u003e\n   \u003cvalue\u003e30000\u003c/value\u003e\n \u003c/property\u003e\n ```\n\n Where:\n - `fs.gs.hierarchical.namespace.folders.enable` enables the connector to use folder operations on buckets with hierarchical namespace enabled.\n - `fs.gs.http.read-timeout` is the maximum time allowed, in milliseconds, to read data from an established connection. 
This is an optional setting.\n\n | **Note:** If you are using the Cloud Storage connector version 3.0.0 or a version older than 2.2.23, the configuration setting `fs.gs.hierarchical.namespace.folders.enable` is not supported, and including it results in an error.\n\nCompatibility with Cloud Storage connector version 3.0.0 or versions older than 2.2.23\n--------------------------------------------------------------------------------------\n\nUsing the Cloud Storage connector version 3.0.0 or a version older than 2.2.23, or disabling folder operations for hierarchical namespace, can lead to the following limitations:\n\n- **Inefficient folder renames**: Folder rename operations in Hadoop use\n object-level copy and delete operations, which are slower and less\n efficient than the dedicated `rename folder` operation.\n\n- **Accumulation of empty folders**: Folders are not deleted\n automatically, so empty folders accumulate in your bucket.\n Accumulated empty folders can have the following impact:\n\n - Increased storage costs if the folders are not deleted explicitly.\n - Slower list operations and an increased risk of list operation\n timeouts.\n\n | **Note:** To reduce the risk of list operation timeouts, set `fs.gs.http.read-timeout` to `30000` milliseconds. To configure timeout settings, refer to the instructions for [Dataproc](/storage/docs/hns-hadoop-workloads#dataproc) or [Self-managed Hadoop](/storage/docs/hns-hadoop-workloads#self-managed-hadoop), depending on which one you are using.\n- **Compatibility issues**: Mixing older and newer connector\n versions, or enabling and disabling folder operations, can lead to\n compatibility issues when renaming folders. Consider the following scenario,\n which uses a combination of connector versions:\n\n 1. Use a Cloud Storage connector version older than 2.2.23 to\n perform the following tasks:\n\n 1. Write objects under the folder `foo/`.\n 2. 
Rename the folder `foo/` to `bar/`. The rename operation copies and deletes the objects under `foo/` but does not delete the empty `foo/` folder.\n 2. Use the Cloud Storage connector version 2.2.23 with the\n folder operations setting enabled to rename the folder `bar/` to `foo/`.\n\n The connector version 2.2.23, with folder operations enabled,\n detects the existing `foo/` folder, causing the rename operation to\n fail. The older connector version did not delete the `foo/` folder because\n folder operations were disabled.\n\nWhat's next\n-----------\n\n- [Create buckets with hierarchical namespace enabled](/storage/docs/create-hns-bucket).\n- [Create and manage folders](/storage/docs/create-folders)."]]