Dataflow use cases

Dataflow is designed to support streaming and batch pipelines at large scale. It is built on the open-source Apache Beam framework.

This page links to tutorials and example use cases to help you get started.

Data movement

This tutorial shows how to run a Dataflow template that reads from Managed Service for Apache Kafka and writes the records to a BigQuery table.
This tutorial shows how to run a Dataflow template that reads JSON-encoded messages from Pub/Sub and writes them to a BigQuery table.
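At their core, both templates apply the same per-record transformation: decode each incoming message and map it to a row in the destination BigQuery table. A minimal sketch of that step in plain Python (the payload and the id/name schema here are hypothetical examples, not part of either template):

```python
import json

def message_to_row(message_data: bytes) -> dict:
    """Decode a JSON-encoded message payload into a BigQuery row dict."""
    # Pub/Sub and Kafka both deliver the payload as bytes; BigQuery
    # expects a dict whose keys match the destination table's columns.
    return json.loads(message_data.decode("utf-8"))

# Hypothetical payload, matching an assumed (id, name) table schema.
payload = b'{"id": 1, "name": "widget"}'
row = message_to_row(payload)
print(row)  # {'id': 1, 'name': 'widget'}
```

In a real pipeline this logic runs inside a Beam transform on every element, and the resulting dicts are streamed into BigQuery by the template's write step.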

Dataflow ML

This notebook shows how to use ML models in Apache Beam pipelines by using the RunInference transform.
This notebook shows how to run machine learning inference by using vLLM and GPUs. vLLM is a library for LLM inference and serving.
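The RunInference transform follows a common pattern: load the model once per worker, then apply it to batches of elements rather than one element at a time. A stdlib-only sketch of that pattern (the real transform is `apache_beam.ml.inference.RunInference`; the "model" below is a stand-in, not a real ML framework call):

```python
def load_model():
    # Stand-in for loading a trained model (e.g., from a checkpoint).
    # RunInference does this once per worker via a model handler.
    return lambda batch: [x * 2 for x in batch]

def run_inference(elements, batch_size=2):
    """Load the model once, then run it over batches of elements."""
    model = load_model()
    for i in range(0, len(elements), batch_size):
        batch = elements[i:i + batch_size]
        yield from model(batch)

print(list(run_inference([1, 2, 3])))  # [2, 4, 6]
```

Batching is what makes GPU-backed inference (such as the vLLM example) efficient, because each model invocation amortizes its fixed cost over many elements.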

Other resources

Links to sample code and technical reference guides for common Dataflow use cases.
In this tutorial, you create a pipeline that transforms ecommerce data from Pub/Sub and outputs the data to BigQuery and Bigtable.
With Dataflow, you can run highly parallel workloads in a single pipeline, improving efficiency and making your workflow easier to manage.
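The fan-out idea behind highly parallel workloads can be illustrated locally with a thread pool: the same per-element function that a pool maps over items is what Dataflow would run inside a ParDo and distribute across many workers. A hedged sketch with a stand-in computation:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item: int) -> int:
    # Stand-in for an expensive per-element computation that Dataflow
    # would distribute across workers automatically.
    return item * item

items = list(range(8))
# Locally, a thread pool mimics the fan-out; in a Dataflow pipeline the
# same logic would scale out without manual pool management.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, items))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The efficiency gain comes from keeping the fan-out, the computation, and the downstream aggregation in one pipeline instead of stitching together separate jobs.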