Dataflow use cases

Dataflow is designed to support streaming and batch pipelines at large scale. It is built on the open-source Apache Beam framework.
This page links to tutorials and example use cases to help you get started.
Data movement
Process data from Kafka to BigQuery
This tutorial shows how to run a Dataflow template that reads from Managed Service for Apache Kafka and writes the records to a BigQuery table.
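To show the shape of that pipeline, here is a minimal Apache Beam (Python) sketch that reads from a Kafka topic and writes to BigQuery. This is not the template itself; the bootstrap server, topic, and table names are placeholder assumptions, and the BigQuery table is assumed to already exist.

```python
# Minimal sketch: read records from Kafka and append them to a BigQuery table.
import json

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromKafka" >> ReadFromKafka(
            consumer_config={"bootstrap.servers": "broker:9092"},  # placeholder
            topics=["orders"],  # placeholder topic
        )
        # ReadFromKafka yields (key, value) byte pairs; decode the value as JSON.
        | "DecodeValue" >> beam.Map(lambda kv: json.loads(kv[1].decode("utf-8")))
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:my_dataset.orders",  # placeholder; table assumed to exist
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```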
Process data from Pub/Sub to BigQuery
This tutorial shows how to run a Dataflow template that reads JSON-encoded messages from Pub/Sub and writes them to a BigQuery table.
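The same pattern in Beam Python looks like the sketch below: read raw bytes from a Pub/Sub subscription, parse the JSON, and stream the rows into BigQuery. The subscription and table names are placeholder assumptions.

```python
# Minimal sketch: read JSON-encoded Pub/Sub messages and write them to BigQuery.
import json

import apache_beam as beam
from apache_beam.io.gcp.pubsub import ReadFromPubSub
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> ReadFromPubSub(
            subscription="projects/my-project/subscriptions/my-sub")  # placeholder
        # Messages arrive as bytes; parse each one into a dict of column values.
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:my_dataset.events",  # placeholder; table assumed to exist
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```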
Dataflow ML
Use RunInference and Embeddings
This notebook shows how to run ML models in Apache Beam pipelines by using the RunInference transform.
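For orientation, a minimal RunInference sketch with a scikit-learn model handler is shown below. The model URI and example inputs are placeholder assumptions; RunInference wraps the model so the pipeline handles batching and model loading for you.

```python
# Minimal sketch: batch inference with RunInference and a pickled
# scikit-learn model. The model path and inputs are placeholders.
import numpy as np

import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

model_handler = SklearnModelHandlerNumpy(
    model_uri="gs://my-bucket/models/model.pkl")  # placeholder pickled model

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreateExamples" >> beam.Create(
            [np.array([1.0, 2.0]), np.array([3.0, 4.0])])
        | "RunInference" >> RunInference(model_handler)
        # Each output element is a PredictionResult pairing input and prediction.
        | "PrintResults" >> beam.Map(print)
    )
```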
Use GPUs in your pipeline
This notebook shows how to run machine learning inference by using vLLM and GPUs. vLLM is a library for LLM inference and serving.
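As a rough sketch of that setup, recent Apache Beam releases include a vLLM model handler that plugs into RunInference; the handler name and model below are assumptions based on that integration, and running it requires GPU workers with the vllm package installed.

```python
# Minimal sketch: LLM inference with RunInference and Beam's vLLM handler.
# Assumes a recent Beam release with vLLM support, GPU workers, and the
# vllm package installed; the model name is a placeholder.
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.vllm_inference import VLLMCompletionsModelHandler

model_handler = VLLMCompletionsModelHandler(
    model_name="facebook/opt-125m")  # placeholder model

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreatePrompts" >> beam.Create(["Tell me about Dataflow."])
        | "Generate" >> RunInference(model_handler)
        | "PrintCompletions" >> beam.Map(print)
    )
```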
Other resources
Reference patterns
Links to sample code and technical reference guides for common Dataflow use cases.
Ecommerce streaming pipeline
In this tutorial, you create a pipeline that transforms ecommerce data from Pub/Sub and outputs the data to BigQuery and Bigtable.
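The key idea in that tutorial is a branching pipeline: one parsed stream feeding two sinks. A minimal sketch of that shape follows; the subscription, table, Bigtable instance, and row-building logic are all placeholder assumptions.

```python
# Minimal sketch: one Pub/Sub source branching to BigQuery and Bigtable sinks.
import json

import apache_beam as beam
from apache_beam.io.gcp.bigtableio import WriteToBigTable
from apache_beam.io.gcp.pubsub import ReadFromPubSub
from apache_beam.options.pipeline_options import PipelineOptions
from google.cloud.bigtable.row import DirectRow


def to_bigtable_row(event):
    # Placeholder mapping: key rows by transaction ID, store the raw payload.
    row = DirectRow(row_key=event["transaction_id"].encode("utf-8"))
    row.set_cell("events", b"payload", json.dumps(event).encode("utf-8"))
    return row


options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    events = (
        pipeline
        | "ReadFromPubSub" >> ReadFromPubSub(
            subscription="projects/my-project/subscriptions/ecommerce")  # placeholder
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
    )

    # Branch 1: append each event to BigQuery for analytics.
    _ = events | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
        "my-project:my_dataset.transactions",  # placeholder; table assumed to exist
        create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
    )

    # Branch 2: write the same events to Bigtable for low-latency lookups.
    _ = (
        events
        | "ToBigtableRows" >> beam.Map(to_bigtable_row)
        | "WriteToBigtable" >> WriteToBigTable(
            project_id="my-project",    # placeholder
            instance_id="my-instance",  # placeholder
            table_id="transactions",    # placeholder
        )
    )
```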
HPC highly parallel workloads
With Dataflow, you can run highly parallel workloads in a single pipeline, improving efficiency and making your workflow easier to manage.
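A minimal sketch of the highly parallel pattern is shown below: create a collection of independent work items, break fusion so Dataflow can spread them across workers, and run each task in parallel. The simulate function and output path are placeholder assumptions standing in for a real HPC task.

```python
# Minimal sketch: fan independent tasks out across workers.
import apache_beam as beam


def simulate(task_id):
    # Placeholder for an expensive, independent computation (one HPC task).
    return task_id, sum(i * i for i in range(1_000_000))


with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreateTasks" >> beam.Create(list(range(1000)))
        # Reshuffle breaks fusion so the tasks are distributed across workers
        # instead of running serially on one.
        | "Reshuffle" >> beam.Reshuffle()
        | "RunTask" >> beam.Map(simulate)
        | "FormatResult" >> beam.Map(lambda r: f"task {r[0]}: {r[1]}")
        | "WriteResults" >> beam.io.WriteToText("results")  # placeholder prefix
    )
```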