This document describes how to use Secret Manager as a credential store with Google Cloud Serverless for Apache Spark to safely store and access sensitive data processed by serverless workloads.
Overview
Secret Manager can safeguard your sensitive data, such as API keys, passwords, and certificates. You can use it to manage, access, and audit your secrets across Google Cloud.
When you run a Serverless for Apache Spark batch workload, you can configure it to use a Secret Manager secret by using the Dataproc Secret Manager Credential Provider.
Availability
This feature is available for Serverless for Apache Spark runtime versions 1.2.29+, 2.2.29+, or later major runtime versions.
Terminology
The following terms are used in this document.

Secret: A Secret Manager secret is a global project object that contains a collection of metadata and secret versions. You can store, manage, and access secrets as binary blobs or text strings.

Credential: In Hadoop and other Dataproc workloads, a credential consists of a credential name (ID) and a credential value (password). The credential ID and value map to a Secret Manager secret ID and secret value (secret version).
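As a minimal illustrative sketch, you might store a credential by creating a secret whose ID matches the credential name and adding the credential value as a secret version with the gcloud CLI. The secret name my-oss-credential and its value below are placeholders, not names any workload requires:

```
# Create a secret whose ID matches the credential name (placeholder name).
gcloud secrets create my-oss-credential --replication-policy=automatic

# Add the credential value (placeholder value) as a new secret version.
echo -n "my-credential-value" | gcloud secrets versions add my-oss-credential --data-file=-
```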
Usage
You can configure supported Hadoop and other OSS components to work with Secret Manager by setting the following properties when you submit a Serverless for Apache Spark workload:
Provider path (required): The provider path property, spark.hadoop.hadoop.security.credential.provider.path, is a comma-separated list of one or more credential provider URIs that is traversed to resolve a credential.
The scheme in the provider path indicates the credential provider type. Hadoop schemes include jceks://, user://, and localjceks://. Use the gsm:// scheme to search for credentials in Secret Manager.
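For example, the following property sets Secret Manager in your project as the credential provider:

```
--properties=spark.hadoop.hadoop.security.credential.provider.path=gsm://projects/PROJECT_ID
```

Replace PROJECT_ID with your project ID.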
Substitute dot operator: The Secret Manager service does not allow dots (.) in secret names. However, some open source software (OSS) components use dots in their credential keys. To work around this limitation, enable this property to replace dots (.) with hyphens (-) in credential names. This ensures that OSS credentials with dots in their names can be stored and retrieved correctly from Secret Manager.
For example, if an OSS credential key is a.b.c, modify it to a-b-c when you store it in Secret Manager.
This is an optional property. By default, the value is false. For credential keys that have no dot (.) in their credential name, this property can be safely ignored.
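To enable dot substitution, set the following property:

```
--properties=spark.hadoop.hadoop.security.credstore.google-secret-manager.secret-id.substitute-dot-operator=true
```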
Secret version: Secrets in Secret Manager can have multiple versions (values). Use this property to access a specific secret version for stable access in production environments.
This is an optional property. By default, Secret Manager accesses the LATEST version, which resolves to the latest value of the secret at runtime. If your use case is to always access the LATEST version of a secret, this property can be safely ignored.
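For example, to pin workloads to version 1 of each secret, set the following property:

```
--properties=spark.hadoop.hadoop.security.credstore.google-secret-manager.secret-version=1
```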
Run a batch workload with the Secret Manager Credential Provider
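To submit a batch workload that uses the Secret Manager Credential Provider, run the following command locally or in Cloud Shell:

```
gcloud dataproc batches submit spark \
    --region=REGION \
    --jars=JARS \
    --class=MAIN_CLASS \
    --properties="spark.hadoop.hadoop.security.credential.provider.path=gsm://projects/PROJECT_ID,spark.hadoop.hadoop.security.credstore.google-secret-manager.secret-id.substitute-dot-operator=true" \
    ...other flags as needed...
```

Replace the following:

REGION: a Compute Engine region where your workload runs
JARS: the workload jar path
MAIN_CLASS: the jar main class
PROJECT_ID: your project ID, listed in the Project info section of the Google Cloud console dashboard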
[[["Mudah dipahami","easyToUnderstand","thumb-up"],["Memecahkan masalah saya","solvedMyProblem","thumb-up"],["Lainnya","otherUp","thumb-up"]],[["Sulit dipahami","hardToUnderstand","thumb-down"],["Informasi atau kode contoh salah","incorrectInformationOrSampleCode","thumb-down"],["Informasi/contoh yang saya butuhkan tidak ada","missingTheInformationSamplesINeed","thumb-down"],["Masalah terjemahan","translationIssue","thumb-down"],["Lainnya","otherDown","thumb-down"]],["Terakhir diperbarui pada 2025-09-04 UTC."],[[["\u003cp\u003eDataproc Serverless can use Secret Manager as a credential store to securely manage sensitive data like API keys, passwords, and certificates.\u003c/p\u003e\n"],["\u003cp\u003eTo utilize Secret Manager, configure the \u003ccode\u003ehadoop.security.credential.provider.path\u003c/code\u003e property with the \u003ccode\u003egsm://\u003c/code\u003e scheme when submitting a Dataproc Serverless workload.\u003c/p\u003e\n"],["\u003cp\u003eThe \u003ccode\u003ehadoop.security.credstore.google-secret-manager.secret-id.substitute-dot-operator\u003c/code\u003e property allows for the substitution of dots with hyphens in credential names, enabling compatibility with OSS components that use dots.\u003c/p\u003e\n"],["\u003cp\u003eYou can specify a particular secret version with the property \u003ccode\u003ehadoop.security.credstore.google-secret-manager.secret-version\u003c/code\u003e for consistent access, or omit it to use the latest version of the secret.\u003c/p\u003e\n"],["\u003cp\u003eThis feature is available for Dataproc Serverless for Spark runtime versions 1.2.29+, 2.2.29+, or later.\u003c/p\u003e\n"]]],[],null,["# Secret Manager Credential Provider\n\nThis document describes how to use Secret Manager\nas a credential store with Google Cloud Serverless for Apache Spark to safely store and access sensitive\ndata processed by serverless workloads.\n\nOverview\n--------\n\nThe [Secret Manager](/secret-manager/docs/overview) can\nsafeguard your sensitive data, such as your API keys, passwords, and\ncertificates. You can use it to manage, access, and audit your secrets across\nGoogle Cloud.\n\nWhen you run a Serverless for Apache Spark batch workload, you can configure it\nto use a Secret Manager secret by using the\nDataproc Secret Manager Credential Provider.\n\nAvailability\n------------\n\nThis feature is available for Serverless for Apache Spark runtime versions\n1.2.29+, 2.2.29+, or later major\n[runtime versions](/dataproc-serverless/docs/concepts/versions/dataproc-serverless-versions#supported-dataproc-serverless-for-spark-runtime-versions).\n| **Note:** This feature is also available for use with Dataproc on clusters created with image versions 2.0.97+, 2.1.41+, 2.2.6+, or later major [image versions](/dataproc/docs/concepts/versioning/dataproc-version-clusters#supported-dataproc-image-versions). 
For Dataproc on Compute Engine information, see [Secret Manager Credential Provider for Dataproc](/dataproc/docs/guides/hadoop-google-secret-manager-credential-provider).\n\nTerminology\n-----------\n\nThe following table describes the terms used in this document.\n\nPermissions\n-----------\n\nDataproc checks if the following optional secrets exist:\n\n- fs-gs-encryption-key\n- fs-gs-encryption-key-hash\n- fs-gs-proxy-password\n- fs-gs-proxy-username\n\nTo make sure that the [Dataproc VM Service Account](/dataproc/docs/concepts/iam/dataproc-principals#vm_service_account_data_plane_identity)\nhas permission to check the `fs-gs` secrets, add the\n[Secret Manager Secret Accessor role](/secret-manager/docs/access-control#secretmanager.secretAccessor)\nwith the following condition to the service account, as follows: \n\n```\n{\n \"expression\": \"resource.name.startsWith(\\\"projects/PROJECT_NUMBER/secrets/fs-gs-\\\")\",\n \"title\": \"gsmkeycondition\",\n \"description\": \"Permission to access Dataproc secrets\"\n}\n```\n\nUsage\n-----\n\nYou can configure [supported Hadoop and other OSS components](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html#Usage_Overview)\nto work with the Secret Manager by setting the following\nproperties when you submit a Serverless for Apache Spark workload:\n\n- **Provider path** (required): The provider path property, `spark.hadoop.hadoop.security.credential.provider.path`,\n is a comma-separated list of one or more credential provider URIs that is traversed to resolve a credential.\n\n ```\n --properties=spark.hadoop.hadoop.security.credential.provider.path=gsm://projects/PROJECT_ID\n ```\n - The `scheme` in the provider path indicates the credential provider type. Hadoop schemes include `jceks://`, `user://`,`localjceks://`. Use the `gsm://` scheme to search for credentials in Secret Manager.\n- **Substitute dot operator** : The Secret Manager service does not allow dots (`.`) in secret names. However, some open source software (OSS) components use dots in their credential keys. To fix this limitation, enable this property to replace dots (`.`) with hyphens (`-`) in credential names. This ensures that OSS credentials with dots in their names can be stored and retrieved correctly from Secret Manager.\n\n For example,\n If an OSS credential key is `a.b.c`, you must modify it to `a-b-c` when storing it in Secret Manager. \n\n ```\n --properties=spark.hadoop.hadoop.security.credstore.google-secret-manager.secret-id.substitute-dot-operator=true\n ```\n\n This is an optional property. By default, the value is `false`. For credentials keys that have no dot (`.`) operator in their credential name, this property can be safely ignored.\n- **Secret version** : Secrets in Secret Manager can have multiple versions (values). Use this property to access a specific secret version for stable access in production environments.\n\n ```\n --properties=spark.hadoop.hadoop.security.credstore.google-secret-manager.secret-version=1\n ```\n\n This is an optional property. By default, Secret Manager\n accesses the `LATEST` version, which resolves to the latest value of the secret at runtime. 
If your use case is to always access the `LATEST` version of a secret, this property can be safely ignored.\n\n### Run a batch workload with Secret Manager Credential Provider\n\nTo [submit a batch workload](/dataproc-serverless/docs/quickstarts/spark-batch#submit_a_spark_batch_workload)\nthat uses Secret Manager Credential Provider, run the following command\nlocally or in [Cloud Shell](/shell). \n\n```\ngcloud dataproc batches submit spark \\\n --region=REGION \\\n --jars=JARS \\\n --class=MAIN_CLASS \\\n --properties=\"spark.hadoop.hadoop.security.credential.provider.path=gsm://projects/PROJECT_ID,spark.hadoop.hadoop.security.credstore.google-secret-manager.secret-id.substitute-dot-operator=true\" \\\n ...other flags as needed...\n```\n\nReplace the following:\n\n- \u003cvar translate=\"no\"\u003eREGION\u003c/var\u003e: a [Compute Engine region](/compute/docs/regions-zones#available) where your workload runs\n- \u003cvar translate=\"no\"\u003eJARS\u003c/var\u003e: workload jar path\n- \u003cvar translate=\"no\"\u003eMAIN_CLASS\u003c/var\u003e: the Jar main class\n- \u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e: your project ID, listed in the **Project info** section of the [Google Cloud console dashboard](https://console.cloud.google.com/home/dashboard)"]]