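Whether you use a user-managed service account or the default Compute Engine service account on the virtual machines in the cluster, you must grant the Service Account User role to Cloud Data Fusion. Otherwise, Cloud Data Fusion cannot provision the Dataproc cluster, and the following error appears when you run a data pipeline: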
PROVISION task failed in REQUESTING_CREATE state for program run [pipeline-name] due to Dataproc operation failure: INVALID_ARGUMENT: User not authorized to act as service account '[service-account-name]'
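The account named at the end of this error is the one that Cloud Data Fusion is not allowed to act as. As a minimal sketch, the following Python helper pulls that name out of a log line. The function name `extract_service_account` and the sample account in the usage note are hypothetical, and the pattern assumes the message keeps the exact wording shown above.

```python
import re
from typing import Optional

# Assumption: the error keeps the wording quoted above, with the account
# name as the single-quoted token after "act as service account".
_ERROR_PATTERN = re.compile(
    r"User not authorized to act as service account '([^']+)'"
)


def extract_service_account(error_message: str) -> Optional[str]:
    """Return the service account named in the error, or None if absent."""
    match = _ERROR_PATTERN.search(error_message)
    return match.group(1) if match else None
```

For example, passing a log line that ends in `service account '123456789-compute@developer.gserviceaccount.com'` (a placeholder account) returns that email address, while an unrelated message returns `None`.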
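Get the service account name
In the Google Cloud console, go to the Identity and Access Management (IAM) page.
From the project selector at the top of the page, select the project, folder, or organization that the Cloud Data Fusion instance belongs to.
Find and copy the Cloud Data Fusion service account name. It uses the following format: service-[project-number]@gcp-sa-datafusion.iam.gserviceaccount.com.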
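If you already know the project number, the name can also be derived from the format above instead of copied from the console. The sketch below does that, and then assembles (without running) a `gcloud` command that would grant the Service Account User role. The function names are hypothetical, and the `gcloud iam service-accounts add-iam-policy-binding` command and the role ID `roles/iam.serviceAccountUser` are assumptions not stated in this section, so check them against the current gcloud reference before use.

```python
from typing import List, Union


def data_fusion_service_agent(project_number: Union[int, str]) -> str:
    """Build the Cloud Data Fusion service account name from a project number."""
    number = str(project_number)
    if not (number.isascii() and number.isdigit()):
        raise ValueError("project number must contain only digits")
    return f"service-{number}@gcp-sa-datafusion.iam.gserviceaccount.com"


def build_grant_command(
    project_number: Union[int, str], dataproc_service_account: str
) -> List[str]:
    """Return gcloud arguments that grant the Service Account User role.

    Assumption: the role is bound on the service account used by the
    Dataproc cluster, with the Cloud Data Fusion account as the member.
    """
    member = "serviceAccount:" + data_fusion_service_agent(project_number)
    return [
        "gcloud", "iam", "service-accounts", "add-iam-policy-binding",
        dataproc_service_account,
        "--member=" + member,
        "--role=roles/iam.serviceAccountUser",
    ]
```

The list form is meant to be passed to `subprocess.run` or joined with spaces for review. Keeping the command as data lets you inspect it before it changes any IAM policy.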
Last updated (UTC): 2025-03-26.