Write a component to show a Google Cloud console link
It's common that when running a component, you want to see not only the link to the component job being launched, but also the link to the underlying cloud resources, such as Vertex batch prediction jobs or Dataflow jobs.
The gcp_resource proto is a special parameter that you can use in your component to enable the Google Cloud console to provide a customized view of the resource's logs and status in the Vertex AI Pipelines console.
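For orientation, the value your component writes to its gcp_resources output is a small JSON-serialized GcpResources proto that lists each underlying resource by type and URI. The following is a minimal sketch of what that serialized value might look like for a Dataflow job; the field casing follows protobuf's default JSON mapping, and the project and job IDs are placeholders:

{
  "resources": [
    {
      "resourceType": "DataflowJob",
      "resourceUri": "https://dataflow.googleapis.com/v1b3/projects/my-project/locations/us-central1/jobs/my-job-id"
    }
  ]
}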
Output the gcp_resource parameter
Using a container-based component
First, you'll need to define the gcp_resource parameter in your component, as shown in the following example component.py file:
# Copyright 2023 The Kubeflow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import List

from google_cloud_pipeline_components import _image
from google_cloud_pipeline_components import _placeholders
from kfp.dsl import container_component
from kfp.dsl import ContainerSpec
from kfp.dsl import OutputPath


@container_component
def dataflow_python(
    python_module_path: str,
    temp_location: str,
    gcp_resources: OutputPath(str),
    location: str = 'us-central1',
    requirements_file_path: str = '',
    args: List[str] = [],
    project: str = _placeholders.PROJECT_ID_PLACEHOLDER,
):
    # fmt: off
    """Launch a self-executing Beam Python file on Google Cloud using the
    Dataflow Runner.

    Args:
      location: Location of the Dataflow job. If not set, defaults to `'us-central1'`.
      python_module_path: The GCS path to the Python file to run.
      temp_location: A GCS path for Dataflow to stage temporary job files created during the execution of the pipeline.
      requirements_file_path: The GCS path to the pip requirements file.
      args: The list of args to pass to the Python file. Can include additional parameters for the Dataflow Runner.
      project: Project to create the Dataflow job. Defaults to the project in which the PipelineJob is run.

    Returns:
      gcp_resources: Serialized gcp_resources proto tracking the Dataflow job.
        For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
    """
    # fmt: on
    return ContainerSpec(
        image=_image.GCPC_IMAGE_TAG,
        command=[
            'python3',
            '-u',
            '-m',
            'google_cloud_pipeline_components.container.v1.dataflow.dataflow_launcher',
        ],
        args=[
            '--project',
            project,
            '--location',
            location,
            '--python_module_path',
            python_module_path,
            '--temp_location',
            temp_location,
            '--requirements_file_path',
            requirements_file_path,
            '--args',
            args,
            '--gcp_resources',
            gcp_resources,
        ],
    )
Next, inside the container, install the Google Cloud Pipeline Components package:

pip install --upgrade google-cloud-pipeline-components

Next, in the Python code, define the resource as a gcp_resource parameter:

from google_cloud_pipeline_components.proto.gcp_resources_pb2 import GcpResources
from google.protobuf.json_format import MessageToJson

dataflow_resources = GcpResources()
dr = dataflow_resources.resources.add()
dr.resource_type = 'DataflowJob'
dr.resource_uri = 'https://dataflow.googleapis.com/v1b3/projects/[your-project]/locations/us-east1/jobs/[dataflow-job-id]'

with open(gcp_resources, 'w') as f:
    f.write(MessageToJson(dataflow_resources))

Using a Python component

Alternatively, you can return the gcp_resources output parameter as you would any string output parameter:

from typing import NamedTuple

from kfp import dsl


@dsl.component(
    base_image='python:3.9',
    packages_to_install=['google-cloud-pipeline-components==2.19.0'],
)
def launch_dataflow_component(project: str, location: str) -> NamedTuple("Outputs", [("gcp_resources", str)]):
    # Imports used inside the component run in the component's container.
    from google_cloud_pipeline_components.proto.gcp_resources_pb2 import GcpResources
    from google.protobuf.json_format import MessageToJson

    # Launch the dataflow job
    dataflow_job_id = [dataflow-id]
    dataflow_resources = GcpResources()
    dr = dataflow_resources.resources.add()
    dr.resource_type = 'DataflowJob'
    dr.resource_uri = f'https://dataflow.googleapis.com/v1b3/projects/{project}/locations/{location}/jobs/{dataflow_job_id}'
    gcp_resources = MessageToJson(dataflow_resources)
    return gcp_resources
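A component that emits gcp_resources is used in a pipeline like any other KFP component; the output path is injected by KFP at runtime, so callers only supply the regular inputs. The following is a minimal sketch, assuming the container-based dataflow_python component above is saved in a local component.py and that the gs:// paths are placeholders you'd replace:

from kfp import compiler, dsl

from component import dataflow_python  # assumed file name for the component defined above


@dsl.pipeline(name='dataflow-python-launcher')
def launch_beam_pipeline(project: str, location: str = 'us-central1'):
    # gcp_resources is an OutputPath and is supplied by KFP at runtime.
    dataflow_python(
        project=project,
        location=location,
        python_module_path='gs://your-bucket/path/to/main.py',  # placeholder
        temp_location='gs://your-bucket/tmp',  # placeholder
    )


if __name__ == '__main__':
    compiler.Compiler().compile(launch_beam_pipeline, 'pipeline.yaml')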
Supported resource_type values

You can set the resource_type to be an arbitrary string, but only the following types have links in the Google Cloud console (see the sketch after this list):
BatchPredictionJob
BigQueryJob
CustomJob
DataflowJob
HyperparameterTuningJob
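As an illustration, a component that launches a Vertex AI custom job could record it with one of the linkable types above. This is a sketch only; the resource_uri pattern (the REST resource name of the custom job) and the project, location, and job IDs are assumed placeholders:

from google_cloud_pipeline_components.proto.gcp_resources_pb2 import GcpResources
from google.protobuf.json_format import MessageToJson

custom_job_resources = GcpResources()
res = custom_job_resources.resources.add()
res.resource_type = 'CustomJob'  # one of the console-linkable types listed above
# Assumed URI shape: the REST resource name of the launched custom job.
res.resource_uri = (
    'https://us-central1-aiplatform.googleapis.com/v1/'
    'projects/my-project/locations/us-central1/customJobs/1234567890'
)

# gcp_resources_path stands in for the component's gcp_resources OutputPath(str).
gcp_resources_path = 'gcp_resources.json'
with open(gcp_resources_path, 'w') as f:
    f.write(MessageToJson(custom_job_resources))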
Write a component to cancel the underlying resources
When a pipeline job is canceled, the default behavior is for the underlying Google Cloud resources to keep running; they are not canceled automatically. To change this behavior, attach a SIGTERM handler to the pipeline job. A good place to do this is just before a polling loop for a job that could run for a long time.
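The following is a minimal sketch of that pattern. The cancel and polling helpers are hypothetical stand-ins for whatever API calls your component makes; the GitHub links later in this section show how the released components implement it:

import signal
import sys
import time


def cancel_underlying_job():
    # Hypothetical stand-in: issue the cancel request for the resource this
    # component launched (for example, a Dataflow or Vertex AI job).
    print('Cancel requested for the underlying job.')


def job_is_done() -> bool:
    # Hypothetical stand-in: poll the launched job's state.
    return True


def handle_sigterm(signum, frame):
    # Runs when the pipeline job is canceled and the container receives SIGTERM.
    cancel_underlying_job()
    sys.exit(1)


# Attach the handler just before the long-running polling loop.
signal.signal(signal.SIGTERM, handle_sigterm)

while not job_is_done():
    time.sleep(30)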
Cancellation has been implemented on several Google Cloud Pipeline Components, including:
Batch prediction job
BigQuery ML job
Custom job
Dataproc Serverless batch job
Hyperparameter tuning job
For more information, including sample code that shows how to attach a SIGTERM handler, see the following GitHub links:

https://github.com/kubeflow/pipelines/blob/google-cloud-pipeline-components-2.19.0/components/google-cloud/google_cloud_pipeline_components/container/utils/execution_context.py
https://github.com/kubeflow/pipelines/blob/google-cloud-pipeline-components-2.19.0/components/google-cloud/google_cloud_pipeline_components/container/v1/gcp_launcher/job_remote_runner.py#L124
Consider the following when implementing your SIGTERM handler:
Cancellation propagation works only after the component has been running for a few minutes. This is typically due to background startup tasks that need to be processed before the Python signal handlers are called.
Some Google Cloud resources might not have cancellation implemented. For example, creating or deleting a Vertex AI endpoint or model could create a long-running operation that accepts a cancellation request through its REST API, but doesn't implement the cancellation operation itself.
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],["Última atualização 2025-08-19 UTC."],[],[],null,["| To learn more,\n| run the \"Custom training workflow with prebuilt Pipeline Components and custom components\" notebook in one of the following\n| environments:\n|\n| [Open in Colab](https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/pipelines/google_cloud_pipeline_components_model_train_upload_deploy.ipynb)\n|\n|\n| \\|\n|\n| [Open in Colab Enterprise](https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fpipelines%2Fgoogle_cloud_pipeline_components_model_train_upload_deploy.ipynb)\n|\n|\n| \\|\n|\n| [Open\n| in Vertex AI Workbench](https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https%3A%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fpipelines%2Fgoogle_cloud_pipeline_components_model_train_upload_deploy.ipynb)\n|\n|\n| \\|\n|\n| [View on GitHub](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/pipelines/google_cloud_pipeline_components_model_train_upload_deploy.ipynb)\n\nWrite a component to show a Google Cloud console link\n\nIt's common that when running a component, you want to not only see the link to the component job being launched, but also the link to the underlying cloud resources, such as the Vertex batch prediction jobs or dataflow jobs.\n\nThe [`gcp_resource` proto](https://github.com/kubeflow/pipelines/tree/master/components/google-cloud/google_cloud_pipeline_components/proto) is a special parameter that you can use in your component to enable the Google Cloud console to provide a customized view of the resource's logs and status in the Vertex AI Pipelines console.\n\nOutput the `gcp_resource` parameter\n\nUsing a container-based component\n\nFirst, you'll need to define the `gcp_resource` parameter in your component as shown in the following example `component.py` file: \n\nPython\n\nTo learn how to install or update the Vertex AI SDK for Python, see [Install the Vertex AI SDK for Python](/vertex-ai/docs/start/use-vertex-ai-python-sdk).\n\nFor more information, see the\n[Python API reference documentation](/python/docs/reference/aiplatform/latest).\n\n # Copyright 2023 The Kubeflow Authors. 
All Rights Reserved.\n #\n # Licensed under the Apache License, Version 2.0 (the \"License\");\n # you may not use this file except in compliance with the License.\n # You may obtain a copy of the License at\n #\n # http://www.apache.org/licenses/LICENSE-2.0\n #\n # Unless required by applicable law or agreed to in writing, software\n # distributed under the License is distributed on an \"AS IS\" BASIS,\n # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n # See the License for the specific language governing permissions and\n # limitations under the License.\n from typing import List\n\n from google_cloud_pipeline_components import _image\n from google_cloud_pipeline_components import _placeholders\n from kfp.dsl import container_component\n from kfp.dsl import ContainerSpec\n from kfp.dsl import OutputPath\n\n\n @container_component\n def dataflow_python(\n python_module_path: str,\n temp_location: str,\n gcp_resources: OutputPath(str),\n location: str = 'us-central1',\n requirements_file_path: str = '',\n args: List[str] = [],\n project: str = _placeholders.PROJECT_ID_PLACEHOLDER,\n ):\n # fmt: off\n \"\"\"Launch a self-executing Beam Python file on Google Cloud using the\n Dataflow Runner.\n\n Args:\n location: Location of the Dataflow job. If not set, defaults to `'us-central1'`.\n python_module_path: The GCS path to the Python file to run.\n temp_location: A GCS path for Dataflow to stage temporary job files created during the execution of the pipeline.\n requirements_file_path: The GCS path to the pip requirements file.\n args: The list of args to pass to the Python file. Can include additional parameters for the Dataflow Runner.\n project: Project to create the Dataflow job. Defaults to the project in which the PipelineJob is run.\n\n Returns:\n gcp_resources: Serialized gcp_resources proto tracking the Dataflow job. 
For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.\n \"\"\"\n # fmt: on\n return ContainerSpec(\n image=_image.GCPC_IMAGE_TAG,\n command=[\n 'python3',\n '-u',\n '-m',\n 'google_cloud_pipeline_components.container.v1.dataflow.dataflow_launcher',\n ],\n args=[\n '--project',\n project,\n '--location',\n location,\n '--python_module_path',\n python_module_path,\n '--temp_location',\n temp_location,\n '--requirements_file_path',\n requirements_file_path,\n '--args',\n args,\n '--gcp_resources',\n gcp_resources,\n ],\n )\n\n\u003cbr /\u003e\n\nNext, inside the container, install the Google Cloud Pipeline Components package: \n\n pip install --upgrade google-cloud-pipeline-components\n\nNext, in the Python code, define the resource as a `gcp_resource` parameter: \n\nPython\n\nTo learn how to install or update the Vertex AI SDK for Python, see [Install the Vertex AI SDK for Python](/vertex-ai/docs/start/use-vertex-ai-python-sdk).\n\nFor more information, see the\n[Python API reference documentation](/python/docs/reference/aiplatform/latest).\n\n from google_cloud_pipeline_components.proto.gcp_resources_pb2 import GcpResources\n from google.protobuf.json_format import MessageToJson\n\n dataflow_resources = GcpResources()\n dr = dataflow_resources.resources.add()\n dr.resource_type='DataflowJob'\n dr.resource_uri='https://dataflow.googleapis.com/v1b3/projects/[your-project]/locations/us-east1/jobs/[dataflow-job-id]'\n\n with open(gcp_resources, 'w') as f:\n f.write(MessageToJson(dataflow_resources))\n\n\u003cbr /\u003e\n\nUsing a Python component\n\nAlternatively, you can return the `gcp_resources` output parameter as you would any string output parameter: \n\n @dsl.component(\n base_image='python:3.9',\n packages_to_install=['google-cloud-pipeline-components==2.19.0'],\n )\n def launch_dataflow_component(project: str, location:str) -\u003e NamedTuple(\"Outputs\", [(\"gcp_resources\", str)]):\n # Launch the dataflow job\n dataflow_job_id = [dataflow-id]\n dataflow_resources = GcpResources()\n dr = dataflow_resources.resources.add()\n dr.resource_type='DataflowJob'\n dr.resource_uri=f'https://dataflow.googleapis.com/v1b3/projects/{project}/locations/{location}/jobs/{dataflow_job_id}'\n gcp_resources=MessageToJson(dataflow_resources)\n return gcp_resources\n\nSupported `resource_type` values\n\nYou can set the `resource_type` to be an arbitrary string, but only the following types have links in the Google Cloud console:\n\n- BatchPredictionJob\n- BigQueryJob\n- CustomJob\n- DataflowJob\n- HyperparameterTuningJob\n\nWrite a component to cancel the underlying resources\n\nWhen a pipeline job is canceled, the default behavior is for the underlying Google Cloud resources to keep running. They are not canceled automatically. To change this behavior, you should attach a [SIGTERM](https://docs.python.org/3/library/signal.html#signal.SIGTERM) handler to the pipeline job. 
A good place to do this is just before a polling loop for a job that could run for a long time.\n\nCancellation has been implemented on several Google Cloud Pipeline Components, including:\n\n- Batch prediction job\n- BigQuery ML job\n- Custom job\n- Dataproc Serverless batch job\n- Hyperparameter tuning job\n\nFor more information, including sample code that shows how to attach a SIGTERM handler, see the following GitHub links:\n\n- \u003chttps://github.com/kubeflow/pipelines/blob/google-cloud-pipeline-components-2.19.0/components/google-cloud/google_cloud_pipeline_components/container/utils/execution_context.py\u003e\n- \u003chttps://github.com/kubeflow/pipelines/blob/google-cloud-pipeline-components-2.19.0/components/google-cloud/google_cloud_pipeline_components/container/v1/gcp_launcher/job_remote_runner.py#L124\u003e\n\nConsider the following when implementing your SIGTERM handler:\n\n- Cancellation propagation works only after the component has been running for a few minutes. This is typically due to background startup tasks that need to be [processed](https://docs.python.org/3/library/signal.html#execution-of-python-signal-handlers) before the Python signal handlers are called.\n- Some Google Cloud resources might not have cancellation implemented. For example, creating or deleting a Vertex AI Endpoint or Model could create a long-running operation that accepts a cancellation request through its REST API, but doesn't implement the cancellation operation itself."]]