Dataflow use cases

Dataflow is designed to support streaming and batch pipelines at large scale. It is built on the open-source Apache Beam framework.
This page links to tutorials and example use cases to help you get started.
Data movement
Process data from Kafka to BigQuery
This tutorial shows how to run a Dataflow template that reads from Managed Service for Apache Kafka and writes the records to a BigQuery table.
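To show the shape of that pipeline, here is a minimal Apache Beam (Python) sketch that reads from a Kafka topic and writes to BigQuery. This is not the template itself; the bootstrap server, topic, and table names are placeholder assumptions, and the BigQuery table is assumed to already exist.

```python
# Minimal sketch: read records from Kafka and append them to a BigQuery table.
import json

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromKafka" >> ReadFromKafka(
            consumer_config={"bootstrap.servers": "broker:9092"},  # placeholder
            topics=["orders"],  # placeholder topic
        )
        # ReadFromKafka yields (key, value) byte pairs; decode the value as JSON.
        | "DecodeValue" >> beam.Map(lambda kv: json.loads(kv[1].decode("utf-8")))
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:my_dataset.orders",  # placeholder; table assumed to exist
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```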
Process data from Pub/Sub to BigQuery
This tutorial shows how to run a Dataflow template that reads JSON-encoded messages from Pub/Sub and writes them to a BigQuery table.
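The same pattern in Beam Python looks like the sketch below: read raw bytes from a Pub/Sub subscription, parse the JSON, and stream the rows into BigQuery. The subscription and table names are placeholder assumptions.

```python
# Minimal sketch: read JSON-encoded Pub/Sub messages and write them to BigQuery.
import json

import apache_beam as beam
from apache_beam.io.gcp.pubsub import ReadFromPubSub
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> ReadFromPubSub(
            subscription="projects/my-project/subscriptions/my-sub")  # placeholder
        # Messages arrive as bytes; parse each one into a dict of column values.
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:my_dataset.events",  # placeholder; table assumed to exist
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```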
Dataflow ML
Use RunInference and Embeddings
This notebook shows how to run ML models in Apache Beam pipelines by using the RunInference transform.
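For orientation, a minimal RunInference sketch with a scikit-learn model handler is shown below. The model URI and example inputs are placeholder assumptions; RunInference wraps the model so the pipeline handles batching and model loading for you.

```python
# Minimal sketch: batch inference with RunInference and a pickled
# scikit-learn model. The model path and inputs are placeholders.
import numpy as np

import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

model_handler = SklearnModelHandlerNumpy(
    model_uri="gs://my-bucket/models/model.pkl")  # placeholder pickled model

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreateExamples" >> beam.Create(
            [np.array([1.0, 2.0]), np.array([3.0, 4.0])])
        | "RunInference" >> RunInference(model_handler)
        # Each output element is a PredictionResult pairing input and prediction.
        | "PrintResults" >> beam.Map(print)
    )
```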
Use GPUs in your pipeline
This notebook shows how to run machine learning inference by using vLLM and GPUs. vLLM is a library for LLM inference and serving.
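As a rough sketch of that setup, recent Apache Beam releases include a vLLM model handler that plugs into RunInference; the handler name and model below are assumptions based on that integration, and running it requires GPU workers with the vllm package installed.

```python
# Minimal sketch: LLM inference with RunInference and Beam's vLLM handler.
# Assumes a recent Beam release with vLLM support, GPU workers, and the
# vllm package installed; the model name is a placeholder.
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.vllm_inference import VLLMCompletionsModelHandler

model_handler = VLLMCompletionsModelHandler(
    model_name="facebook/opt-125m")  # placeholder model

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreatePrompts" >> beam.Create(["Tell me about Dataflow."])
        | "Generate" >> RunInference(model_handler)
        | "PrintCompletions" >> beam.Map(print)
    )
```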
Other resources
Reference patterns
Links to sample code and technical reference guides for common Dataflow use cases.
Ecommerce streaming pipeline
In this tutorial, you create a pipeline that transforms ecommerce data from Pub/Sub and outputs the data to BigQuery and Bigtable.
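The key idea in that tutorial is a branching pipeline: one parsed stream feeding two sinks. A minimal sketch of that shape follows; the subscription, table, Bigtable instance, and row-building logic are all placeholder assumptions.

```python
# Minimal sketch: one Pub/Sub source branching to BigQuery and Bigtable sinks.
import json

import apache_beam as beam
from apache_beam.io.gcp.bigtableio import WriteToBigTable
from apache_beam.io.gcp.pubsub import ReadFromPubSub
from apache_beam.options.pipeline_options import PipelineOptions
from google.cloud.bigtable.row import DirectRow


def to_bigtable_row(event):
    # Placeholder mapping: key rows by transaction ID, store the raw payload.
    row = DirectRow(row_key=event["transaction_id"].encode("utf-8"))
    row.set_cell("events", b"payload", json.dumps(event).encode("utf-8"))
    return row


options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    events = (
        pipeline
        | "ReadFromPubSub" >> ReadFromPubSub(
            subscription="projects/my-project/subscriptions/ecommerce")  # placeholder
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
    )

    # Branch 1: append each event to BigQuery for analytics.
    _ = events | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
        "my-project:my_dataset.transactions",  # placeholder; table assumed to exist
        create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
    )

    # Branch 2: write the same events to Bigtable for low-latency lookups.
    _ = (
        events
        | "ToBigtableRows" >> beam.Map(to_bigtable_row)
        | "WriteToBigtable" >> WriteToBigTable(
            project_id="my-project",    # placeholder
            instance_id="my-instance",  # placeholder
            table_id="transactions",    # placeholder
        )
    )
```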
HPC highly parallel workloads
With Dataflow, you can run highly parallel workloads in a single pipeline, improving efficiency and making your workflow easier to manage.
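A minimal sketch of the highly parallel pattern is shown below: create a collection of independent work items, break fusion so Dataflow can spread them across workers, and run each task in parallel. The simulate function and output path are placeholder assumptions standing in for a real HPC task.

```python
# Minimal sketch: fan independent tasks out across workers.
import apache_beam as beam


def simulate(task_id):
    # Placeholder for an expensive, independent computation (one HPC task).
    return task_id, sum(i * i for i in range(1_000_000))


with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreateTasks" >> beam.Create(list(range(1000)))
        # Reshuffle breaks fusion so the tasks are distributed across workers
        # instead of running serially on one.
        | "Reshuffle" >> beam.Reshuffle()
        | "RunTask" >> beam.Map(simulate)
        | "FormatResult" >> beam.Map(lambda r: f"task {r[0]}: {r[1]}")
        | "WriteResults" >> beam.io.WriteToText("results")  # placeholder prefix
    )
```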