# Configure retries for a pipeline task
You can specify whether a pipeline task is rerun when it fails by
configuring retries for that task. You can set the number of retry
attempts and the delay between successive retries.
Use the following code sample to configure the failure policy of a pipeline task
named `train_op` by using the
[`set_retry`](https://kubeflow-pipelines.readthedocs.io/page/source/dsl.html#kfp.dsl.PipelineTask.set_retry)
method in the Kubeflow Pipelines SDK:

```python
from kfp import dsl

# generate_op and train_op are pipeline components defined elsewhere.
@dsl.pipeline(name='custom-container-pipeline')
def pipeline():
    generate = generate_op()
    train = (
        train_op(
            training_data=generate.outputs['training_data'],
            test_data=generate.outputs['test_data'],
            config_file=generate.outputs['config_file'])
        .set_retry(
            num_retries=NUMBER_OF_RETRIES,
            backoff_duration='BACKOFF_DURATION',
            backoff_factor=BACKOFF_FACTOR,
            backoff_max_duration='BACKOFF_MAX_DURATION'
        )
    )
```

Replace the following:

- `NUMBER_OF_RETRIES`: The number of times to retry the task upon failure.

- `BACKOFF_DURATION`: Optional. The duration of time to wait after the task
  fails before retrying. If you don't set this parameter, the duration
  defaults to `0s`.

- `BACKOFF_FACTOR`: Optional. The factor by which the backoff duration
  is multiplied for each subsequent retry. If you don't set this parameter, the
  backoff factor defaults to `2.0`.

- `BACKOFF_MAX_DURATION`: Optional. The maximum backoff duration between subsequent retries.
  If you don't set this parameter, the maximum duration defaults to `3600s`.

**Caution:** You can't pass output parameters from other pipeline tasks or pipeline input parameters as parameter values for the `set_retry` method. These values must be available when you compile the pipeline.

Last updated 2025-08-25 UTC.
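To make the interaction between the backoff duration, backoff factor, and maximum backoff duration concrete, the sketch below computes a retry schedule in plain Python. It assumes the common exponential-backoff convention, where the wait before each retry is the backoff duration multiplied by the backoff factor once per earlier retry, capped at the maximum duration. The exact schedule applied by the pipeline backend is not stated on this page and may differ. The helper `retry_delays` and its parameter names are hypothetical and not part of the Kubeflow Pipelines SDK.

```python
from typing import List


def retry_delays(
    num_retries: int,
    backoff_duration_s: float = 0.0,
    backoff_factor: float = 2.0,
    backoff_max_duration_s: float = 3600.0,
) -> List[float]:
    """Return the assumed wait, in seconds, before each retry attempt.

    Hypothetical helper, not part of the KFP SDK. Assumes exponential
    backoff: wait = backoff_duration * backoff_factor ** attempt_index,
    capped at backoff_max_duration. Defaults mirror the documented ones
    (0s, 2.0, 3600s).
    """
    if num_retries < 0:
        raise ValueError("num_retries must be non-negative")
    delays = []
    for attempt in range(num_retries):
        # Grow the wait geometrically, then apply the cap.
        wait = backoff_duration_s * (backoff_factor ** attempt)
        delays.append(float(min(wait, backoff_max_duration_s)))
    return delays


if __name__ == "__main__":
    # Four retries starting at 60s, doubling each time, capped at 300s.
    print(retry_delays(4, backoff_duration_s=60, backoff_max_duration_s=300))
    # With the default backoff duration of 0s, every retry is immediate.
    print(retry_delays(3))
```

Under this assumption, leaving the backoff duration at its default of `0s` means the factor and the cap have no effect, because every computed wait is zero. Setting a nonzero backoff duration is what makes the other two settings matter.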