This page provides an overview of stream concurrency controls, such as the maximum
number of concurrent change data capture (CDC) tasks and backfill tasks. You can
control stream performance by increasing or decreasing the values of these parameters.
Concurrency controls overview
By using the concurrency controls, you can either achieve faster backfill and CDC,
or balance the load on the source database. If you require higher throughput
and can afford a higher load on the database, then you can increase the concurrency
of CDC and backfill tasks. Conversely, if your database is experiencing a high
load, and you want to protect it from being overloaded, then you can reduce the
values of these parameters.
Maximum number of CDC tasks
The maxConcurrentCdcTasks parameter lets you control the number of CDC
tasks that a stream runs in parallel. To increase CDC throughput, increase the
value of this parameter and allow Datastream to process more CDC log files
at the same time.
The key characteristics of the parameter include:
The default value is 5. You can set this parameter to any value between
1 and 50, inclusive.
The parameter is applicable only to Oracle and MySQL sources.
The parameter has an effect only if there are more database log files available
to read than there are CDC tasks. Log file settings are controlled by
the source database configuration parameters: the maximum log file size and
the maximum log rotation time interval. For more information about these
parameters, refer to the Oracle and MySQL documentation.
If you decrease the number of concurrent CDC tasks, Datastream might
lag behind the database logs, which might eventually lead to log position loss
and stream failure.
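The relationship between the task limit and the number of available log files can be sketched as a simple model. This is an illustration only: it assumes each CDC task reads one log file at a time, and the function names are hypothetical, not part of Datastream.

```python
# Range for maxConcurrentCdcTasks as stated in the text above.
CDC_TASKS_MIN, CDC_TASKS_MAX = 1, 50


def effective_cdc_parallelism(max_concurrent_cdc_tasks: int, available_log_files: int) -> int:
    """Estimate how many log files are read at once (simplified model)."""
    if not CDC_TASKS_MIN <= max_concurrent_cdc_tasks <= CDC_TASKS_MAX:
        raise ValueError("maxConcurrentCdcTasks must be between 1 and 50, inclusive")
    if available_log_files < 0:
        raise ValueError("available_log_files cannot be negative")
    # Assumption: one task per log file, so extra tasks stay idle.
    return min(max_concurrent_cdc_tasks, available_log_files)


def raising_cdc_tasks_helps(max_concurrent_cdc_tasks: int, available_log_files: int) -> bool:
    """True only when more log files are waiting than there are CDC tasks."""
    return available_log_files > max_concurrent_cdc_tasks
```

Under this model, a stream with the default of 5 tasks and 3 readable log files gains nothing from a higher limit, while the same stream with 12 readable log files would.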
Maximum number of backfill tasks
The maxConcurrentBackfillTasks parameter lets you control the number of backfill
tasks that a stream can run in parallel. You can increase or decrease this value
to control the backfill throughput.
The key characteristics of the parameter include:
The default value is 15. You can set this parameter to any value between
1 and 50, inclusive.
Increasing the backfill concurrency carries a high risk,
because backfill tasks have a significant impact on database performance.
Each backfill task runs an unfiltered SELECT query on a table, and for
large tables, such queries return a large number of rows.
Decreasing the backfill concurrency has no negative impact on the
source database; the only cost is that the backfill takes longer to complete.
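The ranges, defaults, and source restrictions described for both parameters could be checked on the client side before a value is submitted. The following is a minimal sketch; `validate_setting` and the lowercase source type names are hypothetical helpers chosen for illustration, not Datastream identifiers.

```python
# name: (minimum, maximum, default), as stated in the text above.
LIMITS = {
    "maxConcurrentCdcTasks": (1, 50, 5),
    "maxConcurrentBackfillTasks": (1, 50, 15),
}

# Source types for which the CDC task limit applies, per the text above.
# The lowercase labels are an assumption made for this sketch.
CDC_TUNABLE_SOURCES = {"mysql", "oracle"}


def validate_setting(source_type: str, parameter: str, value: int) -> int:
    """Return value unchanged if it is acceptable, otherwise raise."""
    if parameter not in LIMITS:
        raise KeyError(f"unknown parameter: {parameter}")
    if parameter == "maxConcurrentCdcTasks" and source_type not in CDC_TUNABLE_SOURCES:
        raise ValueError("maxConcurrentCdcTasks applies only to Oracle and MySQL sources")
    low, high, _default = LIMITS[parameter]
    if not low <= value <= high:
        raise ValueError(f"{parameter} must be between {low} and {high}, inclusive")
    return value
```

A check like this catches out-of-range values early, but it says nothing about whether the source database can absorb the extra load; that still has to be observed on the running system.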
Change the values of concurrency controls
You can change the values of concurrency control parameters using the
Datastream API.
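As a rough sketch of what such an API call might look like, the helper below builds the URL and JSON body for a PATCH request without sending it. The endpoint, the `updateMask` query parameter, and the field paths under `sourceConfig` are assumptions based on the shape of the `Stream` resource, and the project, location, and stream names are placeholders; verify all of them against the Datastream API reference before use.

```python
import json
from urllib.parse import urlencode

# Assumed REST root for the Datastream API (v1).
API_ROOT = "https://datastream.googleapis.com/v1"

# Assumed names of the per-source configuration objects in the Stream resource.
SOURCE_CONFIG_KEYS = {"mysql": "mysqlSourceConfig", "oracle": "oracleSourceConfig"}


def build_patch_request(project, location, stream_id, source_type, parameter, value):
    """Return (url, json_body) for updating one concurrency parameter."""
    config_key = SOURCE_CONFIG_KEYS[source_type]
    # The update mask names only the field being changed.
    field_path = f"sourceConfig.{config_key}.{parameter}"
    url = (
        f"{API_ROOT}/projects/{project}/locations/{location}/streams/{stream_id}"
        f"?{urlencode({'updateMask': field_path})}"
    )
    body = {"sourceConfig": {config_key: {parameter: value}}}
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_patch_request(
        "my-project", "us-central1", "my-stream", "mysql", "maxConcurrentCdcTasks", 10
    )
    print("PATCH", url)
    print(body)
```

Sending the request also requires an authenticated HTTP client, which is left out here. Restricting the update mask to a single field keeps the rest of the stream configuration untouched.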
To learn how to increase or decrease the number of concurrent CDC tasks, see Change the number of maximum concurrent CDC tasks (/datastream/docs/manage-streams#change_the_number_of_maximum_concurrent_cdc_tasks).
To learn how to increase or decrease the number of concurrent backfill tasks, see Change the number of maximum concurrent backfill tasks (/datastream/docs/manage-streams#change_the_number_of_maximum_concurrent_backfill_tasks).
Note: The maximum number of concurrent CDC tasks and the maximum number of concurrent backfill tasks are independent of each other. Because of the potential impact that the parameters can have on the source database, we recommend that you modify their values incrementally to see how your system responds.
Note: The maximum number of CDC tasks setting is only available for MySQL and Oracle sources. CDC in PostgreSQL and SQL Server is single-threaded. For information about how to overcome this limitation for PostgreSQL, see Diagnose issues (/datastream/docs/diagnose-issues#psql-errors).
What's next
See managing streams (/datastream/docs/manage-streams) to learn more about how to use the Datastream API.
See the Datastream API reference documentation (/datastream/docs/reference/rest/v1/projects.locations.streams) to learn more about the Stream resource.
Last updated 2025-08-25 UTC.