Dataflow is a managed service for executing a wide variety of data processing patterns. The documentation on this site shows you how to deploy your batch and streaming data processing pipelines using Dataflow, including directions for using service features.
The Apache Beam SDK is an open source programming model that enables you to develop both batch and streaming pipelines. You create your pipelines with an Apache Beam program and then run them on the Dataflow service. The Apache Beam documentation provides in-depth conceptual information and reference material for the Apache Beam programming model, SDKs, and other runners.
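To make the "write a Beam program, then run it on Dataflow" flow concrete, here is a minimal sketch using the Apache Beam Python SDK. The word-count logic and the helper names (`split_words`, `format_count`, `run`) are illustrative assumptions and do not come from this page; the SDK is imported inside `run` only so that the helpers can be exercised without it installed.

```python
import re
from typing import Iterable, List, Optional


def split_words(line: str) -> List[str]:
    """Lowercase a line and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", line.lower())


def format_count(word: str, count: int) -> str:
    """Render one (word, count) pair as a printable line."""
    return f"{word}: {count}"


def run(lines: Iterable[str], argv: Optional[List[str]] = None) -> None:
    """Count the words in `lines` with a small batch pipeline (hypothetical example)."""
    # Deferred imports: the helpers above stay usable without the Beam SDK.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # With an empty flag list, Beam falls back to its local direct runner.
    options = PipelineOptions(argv or [])
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(list(lines))
            | "SplitWords" >> beam.FlatMap(split_words)
            | "CountWords" >> beam.combiners.Count.PerElement()
            | "Format" >> beam.MapTuple(format_count)
            | "Print" >> beam.Map(print)
        )
```

Calling `run(["to be or not to be"])` executes the pipeline locally. To submit the same program to the Dataflow service, you would pass flags such as `--runner=DataflowRunner`, `--project`, `--region`, and `--temp_location` in `argv`, with values for your own project; the values are not shown here because they are specific to each environment.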
To learn basic Apache Beam concepts, see the Tour of Beam and Beam Playground.
The Dataflow Cookbook repository also provides ready-to-launch and self-contained pipelines and the most common Dataflow use cases.
Apache, Apache Beam, Beam, the Beam logo, and the Beam firefly mascot are registered trademarks of The Apache Software Foundation in the United States and/or other countries.
Use cases
Run HPC highly parallel workloads
With Dataflow, you can run your highly parallel workloads in a single pipeline, improving efficiency and making your workflow easier to manage.
Streaming
Run inference with Dataflow ML
Dataflow ML lets you use Dataflow to deploy and manage complete machine learning (ML) pipelines. Use ML models to do local and remote inference with batch and streaming pipelines. Use data processing tools to prepare your data for model training and to process the results of the models.
ML
Streaming
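As a rough illustration of local inference inside a pipeline, the sketch below uses the `RunInference` transform from the Apache Beam Python SDK with a scikit-learn model handler. The record layout, the feature names, the pickled model referenced by `model_uri`, and the function names are all hypothetical; the example also assumes that NumPy and scikit-learn are installed alongside the SDK.

```python
from typing import Dict, Iterable, List, Optional, Sequence


def to_feature_vector(record: Dict[str, float], feature_names: Sequence[str]) -> List[float]:
    """Pick the named features from a record, in a fixed order, as floats."""
    return [float(record[name]) for name in feature_names]


def run_local_inference(
    records: Iterable[Dict[str, float]],
    model_uri: str,
    feature_names: Sequence[str],
    argv: Optional[List[str]] = None,
) -> None:
    """Score `records` with a pickled scikit-learn model (hypothetical example)."""
    # Deferred imports: the helper above stays usable without these packages.
    import numpy as np
    import apache_beam as beam
    from apache_beam.ml.inference.base import RunInference
    from apache_beam.ml.inference.sklearn_inference import (
        ModelFileType,
        SklearnModelHandlerNumpy,
    )
    from apache_beam.options.pipeline_options import PipelineOptions

    # model_uri is an assumed path to a pickled scikit-learn estimator.
    model_handler = SklearnModelHandlerNumpy(
        model_uri=model_uri, model_file_type=ModelFileType.PICKLE
    )
    names = list(feature_names)
    with beam.Pipeline(options=PipelineOptions(argv or [])) as pipeline:
        (
            pipeline
            | "CreateRecords" >> beam.Create(list(records))
            | "ToNumpy" >> beam.Map(lambda r: np.array(to_feature_vector(r, names)))
            | "Predict" >> RunInference(model_handler)
            # Each output is a PredictionResult with `example` and `inference` fields.
            | "Print" >> beam.Map(lambda result: print(result.example, result.inference))
        )
```

The model handler loads the model once per worker and batches inputs for it, which is the main reason to prefer `RunInference` over calling the model directly inside a `Map` step.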
Create an ecommerce streaming pipeline
Build an end-to-end ecommerce sample application that streams data from a webstore to BigQuery and Bigtable. The sample application illustrates common use cases and best practices for implementing streaming data analytics and real-time artificial intelligence (AI).
ecommerce
Streaming
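The general shape of such a streaming pipeline can be sketched with the Apache Beam Python SDK as follows. This is not the sample application's code: the event fields, the `BQ_SCHEMA` string, and the `topic` and `table` arguments are assumptions made for illustration, the sketch assumes events arrive through Pub/Sub, and it writes to BigQuery only, leaving out the Bigtable branch.

```python
import json
from typing import Dict, List, Optional

# Hypothetical BigQuery schema for the illustrative event layout below.
BQ_SCHEMA = "event_id:STRING,product_id:STRING,event_type:STRING,price:FLOAT"


def parse_event(message: bytes) -> Dict[str, object]:
    """Decode one JSON-encoded webstore event into a BigQuery row dict."""
    event = json.loads(message.decode("utf-8"))
    return {
        "event_id": str(event["event_id"]),
        "product_id": str(event["product_id"]),
        "event_type": str(event.get("event_type", "unknown")),
        "price": float(event.get("price", 0.0)),
    }


def run_streaming(topic: str, table: str, argv: Optional[List[str]] = None) -> None:
    """Stream events from a Pub/Sub topic into a BigQuery table (hypothetical example)."""
    # Deferred imports: parse_event stays usable without the Beam SDK.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

    options = PipelineOptions(argv or [])
    # Unbounded sources such as Pub/Sub require streaming mode.
    options.view_as(StandardOptions).streaming = True
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(topic=topic)
            | "ParseEvents" >> beam.Map(parse_event)
            | "WriteRows" >> beam.io.WriteToBigQuery(
                table,
                schema=BQ_SCHEMA,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

Keeping the parsing step as a plain function makes it easy to unit test separately from the pipeline, which matters in streaming jobs where a malformed message would otherwise surface only at run time.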
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-18 UTC.