You can define a workflow template in a YAML file, then instantiate the template
to run the workflow. You can also import and export a workflow template YAML file to create and update a Dataproc workflow template resource.
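Also see Using inline Dataproc workflows for other ways to run a workflow without creating a workflow template resource.

Run a workflow using a YAML file

To run a workflow without first creating a workflow template resource, use the
`gcloud dataproc workflow-templates instantiate-from-file` command.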
Define your workflow template in a YAML file. The YAML file must include all required
WorkflowTemplate
fields except the `id` field, and it must also exclude
the `version` field and all output-only fields.
In the following workflow example, the `prerequisiteStepIds` list in the
`terasort` step ensures that the `terasort` step
will only begin after the `teragen` step completes
successfully.
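```
jobs:
- hadoopJob:
    args:
    - teragen
    - '1000'
    - hdfs:///gen/
    mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
  stepId: teragen
- hadoopJob:
    args:
    - terasort
    - hdfs:///gen/
    - hdfs:///sort/
    mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
  stepId: terasort
  prerequisiteStepIds:
  - teragen
placement:
  managedCluster:
    clusterName: my-managed-cluster
    config:
      gceClusterConfig:
        zoneUri: us-central1-a
```

Run the workflow:

```
gcloud dataproc workflow-templates instantiate-from-file \
    --file=TEMPLATE_YAML \
    --region=REGION
```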
Instantiate a workflow using a YAML file with Dataproc Auto Zone Placement
Define your workflow template in a YAML file. This YAML file is the same as the
previous YAML file, except the `zoneUri` field is set to the empty string ('')
to allow Dataproc
Auto Zone Placement
to select the zone for the cluster.
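```
jobs:
- hadoopJob:
    args:
    - teragen
    - '1000'
    - hdfs:///gen/
    mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
  stepId: teragen
- hadoopJob:
    args:
    - terasort
    - hdfs:///gen/
    - hdfs:///sort/
    mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
  stepId: terasort
  prerequisiteStepIds:
  - teragen
placement:
  managedCluster:
    clusterName: my-managed-cluster
    config:
      gceClusterConfig:
        zoneUri: ''
```

Run the workflow. When using Auto Zone Placement, you must pass a region to the `gcloud` command.

```
gcloud dataproc workflow-templates instantiate-from-file \
    --file=TEMPLATE_YAML \
    --region=REGION
```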
Import and export a workflow template YAML file
You can import and export workflow template YAML files. Typically, a workflow template is first exported as a YAML file, then the YAML is edited, and then the edited YAML file is imported to update the template.
Export the workflow template
to a YAML file. During the export operation,
the `id` and `version` fields, and all output-only fields,
are filtered from the output and do not appear in the
exported YAML file.
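```
gcloud dataproc workflow-templates export TEMPLATE_ID or TEMPLATE_NAME \
    --destination=TEMPLATE_YAML \
    --region=REGION
```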
You can pass either the
WorkflowTemplate `id` or the fully qualified template resource `name`
("projects/PROJECT_ID/regions/REGION/workflowTemplates/TEMPLATE_ID") to the command.
Edit the YAML file locally. Note that the `id`, `version`, and output-only fields, which were filtered from the YAML file when the template was exported, are disallowed in the imported YAML file.
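Import the updated workflow template YAML file:

```
gcloud dataproc workflow-templates import TEMPLATE_ID or TEMPLATE_NAME \
    --source=TEMPLATE_YAML \
    --region=REGION
```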
You can pass either the
WorkflowTemplate `id` or the fully qualified template resource `name`
("projects/PROJECT_ID/regions/REGION/workflowTemplates/TEMPLATE_ID") to the command. The template resource with the same template name will be overwritten (updated)
and its version number will be incremented. If a template with the same template name does not exist, it will be created.
[[["Mudah dipahami","easyToUnderstand","thumb-up"],["Memecahkan masalah saya","solvedMyProblem","thumb-up"],["Lainnya","otherUp","thumb-up"]],[["Sulit dipahami","hardToUnderstand","thumb-down"],["Informasi atau kode contoh salah","incorrectInformationOrSampleCode","thumb-down"],["Informasi/contoh yang saya butuhkan tidak ada","missingTheInformationSamplesINeed","thumb-down"],["Masalah terjemahan","translationIssue","thumb-down"],["Lainnya","otherDown","thumb-down"]],["Terakhir diperbarui pada 2025-09-04 UTC."],[[["\u003cp\u003eYou can define workflow templates in YAML files and then instantiate them to run workflows, allowing for efficient workflow management.\u003c/p\u003e\n"],["\u003cp\u003eWorkflows can be run directly from a YAML file without creating a workflow template resource by using the \u003ccode\u003egcloud dataproc workflow-templates instantiate-from-file\u003c/code\u003e command.\u003c/p\u003e\n"],["\u003cp\u003eWhen defining a workflow template, you can set \u003ccode\u003eprerequisiteStepIds\u003c/code\u003e to specify dependencies between steps, ensuring they run in the correct order.\u003c/p\u003e\n"],["\u003cp\u003eDataproc Auto Zone Placement can be used by setting the \u003ccode\u003ezoneUri\u003c/code\u003e field to an empty string in the workflow template YAML file, simplifying cluster zone selection.\u003c/p\u003e\n"],["\u003cp\u003eWorkflow templates can be exported to YAML files, edited locally, and then imported to update existing templates using \u003ccode\u003egcloud dataproc workflow-templates export\u003c/code\u003e and \u003ccode\u003egcloud dataproc workflow-templates import\u003c/code\u003e commands.\u003c/p\u003e\n"]]],[],null,["You can define a workflow template in a YAML file, then instantiate the template\nto run the workflow. You can also import and export a workflow template YAML\nfile to create and update a Dataproc workflow template resource.\n| Also see [Using inline Dataproc workflows](/dataproc/docs/concepts/workflows/inline-workflows) for other ways to run a workflow without creating a workflow template resource.\n\nRun a workflow using a YAML file\n\nTo run a workflow without first creating a workflow template resource,\nuse the\n[gcloud dataproc workflow-templates instantiate-from-file](/sdk/gcloud/reference/dataproc/workflow-templates/instantiate-from-file)\ncommand.\n\n1. Define your workflow template in a YAML file. The YAML file must include all required [WorkflowTemplate](/dataproc/docs/reference/rest/v1/projects.regions.workflowTemplates) fields except the `id` field, and it must also exclude the `version` field and all output-only fields. In the following workflow example, the `prerequisiteStepIds` list in the `terasort` step ensures the `terasort` step will only begin after the `teragen` step completes successfully. \n\n ```\n jobs:\n - hadoopJob:\n args:\n - teragen\n - '1000'\n - hdfs:///gen/\n mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar\n stepId: teragen\n - hadoopJob:\n args:\n - terasort\n - hdfs:///gen/\n - hdfs:///sort/\n mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar\n stepId: terasort\n prerequisiteStepIds:\n - teragen\n placement:\n managedCluster:\n clusterName: my-managed-cluster\n config:\n gceClusterConfig:\n zoneUri: us-central1-a\n ```\n2. 
Run the workflow: \n\n ```\n gcloud dataproc workflow-templates instantiate-from-file \\\n --file=TEMPLATE_YAML \\\n --region=REGION\n ```\n\nInstantiate a workflow using a YAML file with Dataproc Auto Zone Placement\n\n1. Define your workflow template in a YAML file. This YAML file is the same as the previous YAML file, except the `zoneUri` field is set to the empty string ('') to allow Dataproc [Auto Zone Placement](/dataproc/docs/concepts/configuring-clusters/auto-zone) to select the zone for the cluster. \n\n ```\n jobs:\n - hadoopJob:\n args:\n - teragen\n - '1000'\n - hdfs:///gen/\n mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar\n stepId: teragen\n - hadoopJob:\n args:\n - terasort\n - hdfs:///gen/\n - hdfs:///sort/\n mainJarFileUri: file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar\n stepId: terasort\n prerequisiteStepIds:\n - teragen\n placement:\n managedCluster:\n clusterName: my-managed-cluster\n config:\n gceClusterConfig:\n zoneUri: ''\n ```\n2. Run the workflow. When using Auto Placement, you must pass a [region](/dataproc/docs/concepts/regional-endpoints) to the `gcloud` command. \n\n ```\n gcloud dataproc workflow-templates instantiate-from-file \\\n --file=TEMPLATE_YAML \\\n --region=REGION\n ```\n\nImport and export a workflow template YAML file\n\nYou can import and export workflow template YAML files. Typically, a workflow\ntemplate is first exported as a YAML file, then the YAML is edited, and then\nthe edited YAML file is imported to update the template.\n\n1. [Export the workflow template](/sdk/gcloud/reference/dataproc/workflow-templates/export)\n to a YAML file. During the export operation,\n the `id` and `version` fields, and all output-only fields\n are filtered from the output and do not appear in the\n exported YAML file.\n\n ```\n gcloud dataproc workflow-templates export TEMPLATE_ID or TEMPLATE_NAME \\\n --destination=TEMPLATE_YAML \\\n --region=REGION\n ```\n You can pass either the [WorkflowTemplate](/dataproc/docs/reference/rest/v1/projects.regions.workflowTemplates#resource-workflowtemplate) `id` or the fully qualified template resource `name` (\"projects/\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e/regions/\u003cvar translate=\"no\"\u003eREGION\u003c/var\u003e/workflowTemplates/\u003cvar translate=\"no\"\u003eTEMPLATE_ID\u003c/var\u003e\") to the command. If you omit the `--destination` flag, the output is directed to `stdout`, so the following command will also export the template to a YAML file: \n |\n | ```\n | gcloud dataproc workflow-templates export TEMPLATE_ID or TEMPLATE_NAME \\\n | --region=REGION \u003e TEMPLATE_YAML\n |\n | ```\n\n \u003cbr /\u003e\n\n2. Edit the YAML file locally. Note that the `id`, `version`,\n and output-only fields, which were filtered\n from the YAML file when the template was exported, are disallowed in the\n imported YAML file.\n\n3. 
[Import the updated workflow template](/sdk/gcloud/reference/dataproc/workflow-templates/import)\n YAML file:\n\n ```\n gcloud dataproc workflow-templates import TEMPLATE_ID or TEMPLATE_NAME \\\n --source=TEMPLATE_YAML \\\n --region=REGION\n ```\n You can pass either the [WorkflowTemplate](/dataproc/docs/reference/rest/v1/projects.regions.workflowTemplates#resource-workflowtemplate) `id` or the fully qualified template resource `name` (\"projects/\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e/regions/\u003cvar translate=\"no\"\u003eregion\u003c/var\u003e/workflowTemplates/\u003cvar translate=\"no\"\u003eTEMPLATE_ID\u003c/var\u003e\") to the command. The template resource with the same template name will be overwritten (updated) and its version number will be incremented. If a template with the same template name does not exist, it will be created.\n\n \u003cbr /\u003e"]]