This quickstart guides you through the installation of the Google Cloud Pipeline Components (GCPC) SDK.
Install latest release
Use the following command to install the Google Cloud Pipeline Components SDK from the Python Package Index (PyPI):
pip install --upgrade google-cloud-pipeline-components
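To confirm that the install succeeded, you can query the package metadata from Python. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of the GCPC SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The distribution name matches the name used in the pip command above.
    version = installed_version("google-cloud-pipeline-components")
    if version is None:
        print("google-cloud-pipeline-components is not installed in this environment")
    else:
        print(f"google-cloud-pipeline-components {version} is installed")
```

If the script reports that the package is missing, check that pip and the Python interpreter you ran belong to the same environment.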
Use a prebuilt component via the GCPC SDK
After you install the Google Cloud Pipeline Components SDK, you can use it to import a prebuilt component.
For SDK reference information for supported components, see the google_cloud_pipeline_components SDK documentation.
For example, you can use the following code to import and use the Dataflow component in a pipeline.
from google_cloud_pipeline_components.v1.dataflow import DataflowPythonJobOp
from kfp import dsl

# PIPELINE_NAME, PROJECT_ID, LOCATION, PIPELINE_ROOT, and OUTPUT_FILE are
# placeholders; define them with your own values before this code runs.
@dsl.pipeline(
    name=PIPELINE_NAME,
    description='Dataflow launch python pipeline'
)
def pipeline(
    python_file_path: str = 'gs://ml-pipeline-playground/samples/dataflow/wc/wc.py',
    project_id: str = PROJECT_ID,
    location: str = LOCATION,
    staging_dir: str = PIPELINE_ROOT,
    requirements_file_path: str = 'gs://ml-pipeline-playground/samples/dataflow/wc/requirements.txt',
):
    # Launch the Python file as a Dataflow job, using staging_dir for temporary files.
    dataflow_python_op = DataflowPythonJobOp(
        project=project_id,
        location=location,
        python_module_path=python_file_path,
        temp_location=staging_dir,
        requirements_file_path=requirements_file_path,
        args=['--output', OUTPUT_FILE],
    )
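A pipeline function like the one above is typically compiled to a pipeline specification file before it is submitted for execution. The following is a hedged sketch of that step, assuming the KFP SDK (`kfp`) is available in your environment; the helper name `compile_to_yaml` and the default file name `dataflow_pipeline.yaml` are illustrative and not part of the GCPC SDK.

```python
from typing import Callable


def compile_to_yaml(pipeline_func: Callable, package_path: str = "dataflow_pipeline.yaml") -> str:
    """Compile a @dsl.pipeline-decorated function and return the output path."""
    # Imported inside the function so the helper can be defined even where
    # kfp is not installed; the import fails only when the helper is called.
    from kfp import compiler

    compiler.Compiler().compile(pipeline_func=pipeline_func, package_path=package_path)
    return package_path
```

For example, calling `compile_to_yaml(pipeline)` with the `pipeline` function defined above would write the compiled specification to `dataflow_pipeline.yaml` in the current directory.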
What's next
- Read the Introduction to Google Cloud Pipeline Components.
- See all tutorials that use the google_cloud_pipeline_components SDK.
- Get started with Dataflow components.