For streaming pipelines, a straggler is defined as a work item with the
following characteristics:
- It prevents the watermark from advancing for a significant length of time
  (on the order of minutes).
- It processes for a long time relative to other work items in the same stage.
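Stragglers hold back the watermark and add latency to the job. If the lag is
acceptable for your use case, then you don't need to take any action. If you
want to reduce a job's latency, start by addressing any stragglers.
Note: For information about troubleshooting stragglers in batch jobs, see
Troubleshoot stragglers in batch jobs.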
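The two characteristics in the definition can be illustrated with a toy classifier. This is only a sketch for building intuition: it is not Dataflow's detection logic, and the `WorkItem` fields, the 5-minute hold threshold, and the "5x the stage median" rule are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class WorkItem:
    """Hypothetical record describing one work item in a stage."""
    name: str
    processing_seconds: float      # how long the item has been processing
    watermark_hold_seconds: float  # how long it has held back the watermark


def find_stragglers(
    stage_items: List[WorkItem],
    min_hold_seconds: float = 300.0,  # assumption: "on the order of minutes"
    slow_factor: float = 5.0,         # assumption: "long relative to other items"
) -> List[str]:
    """Return names of items that meet BOTH characteristics of a straggler."""
    typical = median(item.processing_seconds for item in stage_items)
    stragglers = []
    for item in stage_items:
        holds_watermark = item.watermark_hold_seconds >= min_hold_seconds
        unusually_slow = item.processing_seconds >= slow_factor * typical
        if holds_watermark and unusually_slow:
            stragglers.append(item.name)
    return stragglers


stage = [
    WorkItem("a", 2.0, 0.0),
    WorkItem("b", 3.0, 0.0),
    WorkItem("c", 2.5, 0.0),
    WorkItem("d", 600.0, 420.0),
]
straggler_names = find_stragglers(stage)
```

An item that is slow but does not hold the watermark, or that holds the watermark while running no longer than its peers, is not flagged by this sketch, which mirrors the "both characteristics" wording of the definition.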
View streaming stragglers in the Google Cloud console
After you start a Dataflow job, you can use the Google Cloud console
to view any detected stragglers.
You can view streaming stragglers in the stage progress view or the stage
workflow view.
View stragglers by stage progress
To view stragglers by stage progress:
1. In the Google Cloud console, go to the Dataflow Jobs page.
2. Click the name of the job.
3. In the Job details page, click the Execution details tab.
4. In the Graph view list, select Stage progress. The progress graph
   shows aggregated counts of all stragglers detected within each stage.
5. To see details for a stage, hold the pointer over the bar for the stage. The
   details pane includes a link to the worker logs. Clicking this link opens
   Cloud Logging scoped to the worker and the time range when the straggler
   was detected.
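If you prefer to scope the logs yourself, a similar query can be assembled by hand. The following sketch builds a Cloud Logging filter string for one worker and a time window around the detection time. The job ID, worker name, and 10-minute window are hypothetical. The worker label key `compute.googleapis.com/resource_name` is an assumption that you should verify against a real log entry from your job, and the filter that the console link generates might differ.

```python
from datetime import datetime, timedelta, timezone


def build_worker_log_filter(
    job_id: str,
    worker_name: str,
    detected_at: datetime,
    window: timedelta = timedelta(minutes=10),  # assumption: adjust as needed
) -> str:
    """Build a Cloud Logging filter for one worker around a detection time.

    `detected_at` must be timezone-aware so it converts to UTC unambiguously.
    """
    if detected_at.tzinfo is None:
        raise ValueError("detected_at must be timezone-aware")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = (detected_at - window).astimezone(timezone.utc).strftime(fmt)
    end = (detected_at + window).astimezone(timezone.utc).strftime(fmt)
    clauses = [
        'resource.type="dataflow_step"',
        f'resource.labels.job_id="{job_id}"',
        # Assumed label key for the worker VM name; confirm in your own logs.
        f'labels."compute.googleapis.com/resource_name"="{worker_name}"',
        f'timestamp>="{start}"',
        f'timestamp<="{end}"',
    ]
    return " AND ".join(clauses)


# Hypothetical job ID and worker name, for illustration only.
log_filter = build_worker_log_filter(
    job_id="2025-08-26_00_00_00-1234567890",
    worker_name="example-job-harness-abcd",
    detected_at=datetime(2025, 8, 26, 12, 0, tzinfo=timezone.utc),
)
```

The resulting string can be pasted into the Logs Explorer query box or passed as the filter argument to `gcloud logging read`.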
View stragglers by stage workflow
To view stragglers by stage workflow:
1. In the Google Cloud console, go to the Dataflow Jobs page.
2. Click the name of the job.
3. In the Job details page, click the Execution details tab.
4. In the Graph view list, select Stage workflow. The stage workflow
   shows the execution stages of the job, represented as a workflow graph.
Troubleshoot streaming stragglers
If a straggler is detected, it means that an operation in your pipeline has
been running for an unusually long time.
To troubleshoot the issue, first check whether Dataflow insights pinpoints any
issues.
If you still can't determine the cause, check the worker logs for the stage that
reported the straggler. To see the relevant worker logs, view the
straggler details in the stage progress.
Then click the link for the worker. This link opens Cloud Logging, scoped to
the worker and the time range when the straggler was detected. Look for problems
that might be slowing down the stage, such as:
- Bugs in DoFn code or stuck DoFns. Look for stack traces in the logs, near the
  timestamp when the straggler was detected.
- Calls to external services that take a long time to complete. To mitigate this
  issue, batch calls to external services and set timeouts on RPCs.
- Quota limits in sinks. If your pipeline outputs to a Google Cloud
  service, you might be able to raise the quota. For more information, see
  the Cloud Quotas documentation. Also, consult the documentation for the
  particular service for optimization strategies, as well as the documentation
  for the I/O Connector.
- DoFns that perform large read or write operations on persistent state.
  Consider refactoring your code to perform smaller reads or writes on
  persistent state.
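The mitigation for slow external calls (batch the calls and set timeouts on RPCs) can be sketched in plain Python. This is a minimal illustration of the pattern, not Beam or Dataflow code: `call_in_batches`, `fake_lookup`, the batch size of 50, and the 5-second timeout are hypothetical. In a real pipeline the same logic would typically live inside a DoFn or use the Beam SDK's own batching transforms, and the timeout would be passed to whatever client library you use.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def call_in_batches(
    items: Iterable[T],
    call_service: Callable[[List[T], float], List[R]],
    batch_size: int = 50,
    timeout_seconds: float = 10.0,
) -> List[R]:
    """Send items to a service one batch at a time, with a per-call timeout.

    `call_service` stands in for a real client call. It receives the batch and
    the timeout, and is expected to raise if the timeout is exceeded.
    """
    results: List[R] = []
    for batch in batched(items, batch_size):
        results.extend(call_service(batch, timeout_seconds))
    return results


# Hypothetical stand-in for an external service; records each call it receives.
calls: List[Tuple[int, float]] = []


def fake_lookup(batch: List[int], timeout_seconds: float) -> List[int]:
    calls.append((len(batch), timeout_seconds))
    return [value * 2 for value in batch]


results = call_in_batches(range(120), fake_lookup, batch_size=50, timeout_seconds=5.0)
```

Here 120 elements result in three service calls instead of 120, and every call carries an explicit timeout, so a single unresponsive request fails quickly rather than holding the work item open indefinitely.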
You can also use the Side info panel to find the slowest steps in the stage.
One of these steps might be causing the straggler. Click the step name to view
the worker logs for that step.
After you determine the cause, update your pipeline with new code and monitor
the result.
What's next
Learn to use the Dataflow monitoring interface.
Understand the Execution details tab in the monitoring interface.