Dataflow ML

Dataflow ML lets you use Dataflow to deploy and manage complete machine learning (ML) pipelines. Use ML models to do local and remote inference with batch and streaming pipelines. Use data processing tools to prepare your data for model training and to process the results of the models.
Whether you want to classify images in real time, run remote inference calls, or build a custom model handler, you can find complete Dataflow ML examples.
Use the MLTransform class to preprocess data for machine learning (ML) workflows. By combining multiple data processing transforms in one class, MLTransform streamlines the process of applying Apache Beam ML data processing transforms to your workflow.
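import apache_beam as beam
from apache_beam.ml.inference.base import RunInference

with beam.Pipeline() as p:
  predictions = (
      p
      # ReadFromSource stands in for any Beam read transform.
      | beam.ReadFromSource('a_source')
      | RunInference(MODEL_HANDLER))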
Using RunInference is as straightforward as adding the transform code to your pipeline. In the preceding example, MODEL_HANDLER is the model configuration object.
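import apache_beam as beam
from apache_beam.ml.transforms.base import MLTransform

# data is the input to preprocess. The MLTransform arguments are elided.
with beam.Pipeline() as p:
  transformed_data = (
      p
      | beam.Create(data)
      | MLTransform(...)
      | beam.Map(print))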
To prepare your data for training ML models, use MLTransform in your pipeline, as in the preceding example. MLTransform wraps multiple data processing transforms in one class, letting you use one class for a variety of preprocessing tasks.

Prediction and inference with pre-trained models

Use a pre-trained model with PyTorch.
Use a pre-trained model with scikit-learn.
Use a pre-trained model with TensorFlow.
Apache Beam has built-in support for sending requests to a remotely deployed Vertex AI endpoint. This notebook shows how to use the Apache Beam RunInference transform for image classification with Vertex AI.
Use the RunInference transform with a keyed model handler to use multiple models in the same RunInference transform.

Data processing with MLTransform

Use Apache Beam's MLTransform class with the Vertex AI text-embeddings API to generate text embeddings. Text embeddings are a way to represent text as numerical vectors, which is necessary for many natural language processing (NLP) tasks.
Use Apache Beam's MLTransform class with Hugging Face Hub models to generate text embeddings. Hugging Face's SentenceTransformers framework uses Python to generate sentence, text, and image embeddings.
Compute a unique vocabulary from a dataset and then map each word or token to a distinct integer index. Use this transform to change textual data into numerical representations for machine learning tasks.
Scale your data so that you can use it to train your ML model. Apache Beam's MLTransform class includes multiple data scaling transforms.
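As a rough illustration of what the vocabulary and scaling steps compute, the following plain-Python sketch maps tokens to integer indices and rescales numbers to the range 0 to 1. It does not use Apache Beam or MLTransform. The function names, the frequency-based ordering, the -1 index for unknown tokens, and the choice of min-max scaling are all assumptions made for this example.

```python
from collections import Counter


def compute_vocabulary(documents):
    """Map each distinct token to an integer index, most frequent first."""
    counts = Counter(token for doc in documents for token in doc.split())
    # Break frequency ties alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda token: (-counts[token], token))
    return {token: index for index, token in enumerate(ordered)}


def apply_vocabulary(document, vocabulary, default_index=-1):
    """Replace each token with its index; unknown tokens get default_index."""
    return [vocabulary.get(token, default_index) for token in document.split()]


def scale_to_0_1(values):
    """Min-max scale numeric values into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


vocabulary = compute_vocabulary(["the cat sat", "the dog sat", "the end"])
print(vocabulary)  # {'the': 0, 'sat': 1, 'cat': 2, 'dog': 3, 'end': 4}
print(apply_vocabulary("the dog ran", vocabulary))  # [0, 3, -1]
print(scale_to_0_1([10, 20, 30]))  # [0.0, 0.5, 1.0]
```

In a pipeline, the vocabulary and the minimum and maximum values must be computed over the whole dataset before any element can be transformed. MLTransform manages that kind of full-dataset pass for you.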

Prediction and inference with hub models

You can use Gemma models in your inference pipelines to gauge the sentiment of a conversation, summarize that conversation's content, and draft a reply for a difficult conversation.
Use the RunInference transform with a trained model from Hugging Face.
Use the RunInference transform for TensorFlow with a trained model from TensorFlow Hub.
Use the RunInference transform for generative AI tasks. This notebook uses a language model from the Hugging Face Model Hub.

ML workflow orchestration

Vertex AI Pipelines helps you to automate, monitor, and govern your ML systems by orchestrating your ML workflows in a serverless manner. Use Vertex AI Pipelines to orchestrate workflow DAGs defined by either TFX or KFP and to automatically track your ML artifacts using Vertex ML Metadata.
TensorFlow Extended (TFX) lets you deploy complete ML pipelines by using an orchestration framework that has a built-in integration with Apache Beam and the Dataflow runner.
Kubeflow makes deployments of ML workflows on Kubernetes simple, portable, and scalable. Kubeflow Pipelines are reusable complete ML workflows built using the Kubeflow Pipelines SDK.

Additional features

Using GPUs in Dataflow jobs can accelerate image processing and machine learning processing tasks. See GPU types supported by Dataflow and recommendations for which type of GPU to use for different workloads.
Mix and match GPUs and CPUs for high performance and lower cost. The ability to target resources to specific pipeline steps provides additional pipeline flexibility and capability, and potential cost savings.
Apache Beam simplifies the data enrichment workflow by providing a turnkey enrichment transform that you can add to your pipeline.

Model maintenance and evaluation

RunInference lets you perform automatic model updates without stopping your Apache Beam pipeline. Use side inputs to update your model in real time, even while the pipeline is running.
Use TensorFlow Model Analysis (TFMA) to investigate and visualize model performance by creating and comparing two models. With Apache Beam, you can evaluate and compare multiple models in one step.

Resources

To use RunInference with a Java pipeline, create a cross-language Python transform. The pipeline calls the transform, which does the preprocessing, postprocessing, and inference.
To run the Dataflow ML examples, you might need to configure your Google Cloud permissions. Read a detailed guide about the required permissions for Dataflow pipelines.
The examples and the corresponding source code are available on GitHub. On GitHub, you can also find instructions for running the examples in Colab.