Cluster Scheduled Stop

To avoid Google Cloud charges for an inactive cluster without having to delete and recreate the cluster, use the Dataproc cluster scheduled stop feature, which stops all cluster VMs. You aren't charged for stopped VMs, but charges continue for associated resources, such as persistent disks.

Stopping a cluster stops all cluster VMs and causes any running jobs to fail. When a cluster is stopped, you can't update the cluster, submit jobs to the cluster, or access optional components on the cluster using the Dataproc Component Gateway. After stopping a cluster, you can restart the cluster and resume work.

Cluster scheduled stop is available for clusters created with image versions 2.2.42+, 2.1.76+, and 2.0.57+.

Features

  • You can stop clusters after a specified idle period, at a specified future time, or after a specified period from the cluster creation request.

  • Cluster scheduled stop supports clusters with secondary workers and zero-scale clusters.

  • You can update or cancel the cluster scheduled stop configuration.

Limitations and considerations

  • Cluster scheduled stop isn't supported for clusters with local SSDs.
  • You can't set cluster scheduled stop values using the Google Cloud console.
  • Although you can update a cluster scheduled stop configuration, an initiated stop operation will continue. To check if the stop operation has started, examine cluster logs in Cloud Logging.
  • Updating a stop schedule on a cluster whose scheduled stop time has already passed removes the scheduled stop configuration. To re-enable the scheduled stop, include a future time in your update request.

Actions that disable cluster scheduled stop

While a cluster is running, the following actions disable cluster scheduled stop until the disabling action is reversed:

Cluster idle time calculation

For a cluster to be considered idle, the following conditions must be met:

  • cluster creation is finished (time taken for cluster provisioning and startup is excluded from the idle time calculation)
  • no jobs are running on the cluster
  • the cluster isn't in a STOPPED state

Submitting a job to the cluster or stopping a cluster resets the idle time calculation.

The dataproc:dataproc.cluster-ttl.consider-yarn-activity cluster property affects the calculation of cluster idle time, as follows:

  • This property is enabled (set to true) by default.
  • When this property is enabled, both YARN and Dataproc Jobs API activity must be idle to start and continue incrementing the cluster idle time calculation.
    • YARN activity includes pending and running YARN applications.
    • Dataproc Jobs API activity includes pending and running jobs submitted to the Dataproc Jobs API.
  • When this property is set to false, the cluster idle time calculation starts and continues only when Dataproc Jobs API activity is idle.
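
The idle conditions and the effect of the dataproc:dataproc.cluster-ttl.consider-yarn-activity property can be summarized as a small decision function. The following Python sketch is an illustrative model only, not Dataproc's implementation; the function name, its parameters, and the "RUNNING" state string are hypothetical.

```python
def is_cluster_idle(
    creation_finished: bool,
    cluster_state: str,
    jobs_api_active: bool,
    yarn_active: bool,
    consider_yarn_activity: bool = True,
) -> bool:
    """Return True when the idle-time clock may run (illustrative model)."""
    # Provisioning and startup time is excluded from the idle calculation.
    if not creation_finished:
        return False
    # A stopped cluster is never counted as idle.
    if cluster_state == "STOPPED":
        return False
    # Pending or running Dataproc Jobs API jobs always count as activity.
    if jobs_api_active:
        return False
    # Pending or running YARN applications count only when the property is true.
    if consider_yarn_activity and yarn_active:
        return False
    return True
```

With the property left at its default, a cluster with a running YARN application and no Jobs API activity is not idle; with the property set to false, the same cluster is treated as idle.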

Use Cluster Scheduled Stop

gcloud CLI

You can set scheduled stop values when you create a cluster using the Google Cloud CLI or Dataproc API. After you create the cluster, you can update the cluster to change or delete the scheduled stop values previously set on the cluster.

| Flag | Description | Finest granularity | Min value | Max value |
| --- | --- | --- | --- | --- |
| --stop-max-idle (note 1) | Applies to the cluster create and cluster update commands. The duration from the moment the cluster enters the idle state (after creation or startup) to the moment the cluster begins to stop. Provide the duration in IntegerUnit format, where the unit is "s", "m", "h", or "d" (seconds, minutes, hours, or days). Examples: "30m" or "1d" (30 minutes or 1 day from when the cluster becomes idle). | 1 second | 5 minutes | 14 days |
| --no-stop-max-idle | Applies to the cluster update command only. Cancels a cluster scheduled stop previously set with the --stop-max-idle flag. | Not applicable | Not applicable | Not applicable |
| --stop-expiration-time (note 2) | Applies to the cluster create and cluster update commands. The time to begin stopping the cluster, in ISO 8601 datetime format. You can generate the datetime in the correct format using the Timestamp Generator. For example, "2017-08-22T13:31:48-08:00" specifies an expiration time of 13:31:48 in the UTC-8:00 time zone. | 1 second | 10 minutes from the current time | 14 days from the current time |
| --stop-max-age (note 2) | Applies to the cluster create and cluster update commands. The duration from the moment of submitting the cluster create request to the moment the cluster begins to stop. Provide the duration in IntegerUnit format, where the unit is "s", "m", "h", or "d" (seconds, minutes, hours, or days). Examples: "30m": 30 minutes from now; "1d": 1 day from now. | 1 second | 10 minutes | 14 days |
Notes:
  1. You can pass the stop-max-idle flag together with either the stop-expiration-time flag or the stop-max-age flag in your cluster create or update request. Whichever condition is met first stops the cluster.
  2. You can pass either the stop-expiration-time flag or the stop-max-age flag to the cluster create or update command, but not both.
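
Before you run a command, it can help to check duration values against the documented format and limits. The following Python sketch is a local pre-check under stated assumptions: it models only the IntegerUnit form and the minimum and maximum values listed in this section, the helper names are hypothetical, and the gcloud CLI performs its own validation.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

# Documented (min, max) bounds for each duration flag, in seconds.
LIMITS = {
    "stop-max-idle": (5 * 60, 14 * 86400),
    "stop-max-age": (10 * 60, 14 * 86400),
}


def parse_duration(value: str) -> int:
    """Convert an IntegerUnit string such as "30m" or "1d" to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"not in IntegerUnit format: {value!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def check_stop_flags(flags: dict) -> None:
    """Raise ValueError if the flags break the rules described in this section."""
    # Note 2: stop-expiration-time and stop-max-age are mutually exclusive.
    if "stop-expiration-time" in flags and "stop-max-age" in flags:
        raise ValueError("pass stop-expiration-time or stop-max-age, not both")
    for name, (low, high) in LIMITS.items():
        if name in flags:
            seconds = parse_duration(flags[name])
            if not low <= seconds <= high:
                raise ValueError(f"{name} must be between {low}s and {high}s")
```

For example, check_stop_flags({"stop-max-idle": "30m", "stop-max-age": "1d"}) returns without error, while a stop-max-idle value of "1m" raises ValueError because it is below the 5-minute minimum.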

Cluster creation example:

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --stop-max-idle=DURATION \
    --stop-expiration-time=TIME \
    ... other flags ...
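
The TIME value must be an ISO 8601 datetime between 10 minutes and 14 days from the current time. The following Python sketch shows one way to generate such a value; the function name and its parameters are hypothetical, and the range check simply mirrors the limits stated in this section.

```python
from datetime import datetime, timedelta, timezone

MIN_LEAD = timedelta(minutes=10)
MAX_LEAD = timedelta(days=14)


def stop_expiration_time(lead: timedelta, now: datetime = None) -> str:
    """Return now + lead as an ISO 8601 string that includes a UTC offset."""
    if now is None:
        now = datetime.now(timezone.utc)
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    if not MIN_LEAD <= lead <= MAX_LEAD:
        raise ValueError("lead must be between 10 minutes and 14 days")
    # timespec="seconds" drops microseconds; the finest granularity is 1 second.
    return (now + lead).isoformat(timespec="seconds")


print(stop_expiration_time(timedelta(hours=2)))
```

The printed string can be supplied as the value of the --stop-expiration-time flag.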

Cluster update example:

gcloud dataproc clusters update CLUSTER_NAME \
    --region=REGION \
    --stop-max-idle=DURATION \
    --no-stop-max-age \
    ... other flags

REST API

You can create or update scheduled stop values on a cluster by setting the Dataproc API ClusterLifecycleConfig fields and values listed in the following table as part of a Dataproc clusters.create or clusters.patch API request.

| Field | Description | Finest granularity | Min value | Max value |
| --- | --- | --- | --- | --- |
| idleStopTtl (note 1) | Applies to cluster create and cluster update requests. The duration from the moment the cluster enters the idle state after the cluster is created or updated to the moment the cluster begins to stop. Provide a duration in seconds with up to nine fractional digits, terminated by "s". Example: "3.5s". Submit a clusters.patch request with an empty duration to cancel a previously set idleStopTtl value. | 1 second | 5 minutes | 14 days |
| autoStopTime (note 2) | Applies to cluster create and cluster update requests. The time to begin stopping the cluster. Provide a timestamp in RFC 3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z". | 1 second | 10 minutes from the current time | 14 days from the current time |
| autoStopTtl (note 2) | The duration from the moment of submitting the cluster create or update request to the moment the cluster begins to stop. Provide a duration in seconds with up to nine fractional digits, terminated by "s". Example: "3.5s". Submit a clusters.patch request with an empty duration to cancel a previously set autoStopTtl value. | 1 second | 10 minutes | 14 days |
Notes:
  1. You can set the idleStopTtl field together with either the autoStopTime field or the autoStopTtl field in your cluster create or update request. Whichever condition is met first stops the cluster.
  2. You can set either the autoStopTime field or the autoStopTtl field in a cluster create or update request, but not both.
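
The following Python sketch shows how the scheduled stop fields might be assembled into a request body. It only builds and prints JSON; it doesn't call the API. The helper name and the cluster name are hypothetical, and the clusterName and config wrapper around lifecycleConfig is an assumption about the request body shape that you should confirm against the Dataproc API reference.

```python
import json
from datetime import datetime, timezone


def lifecycle_config(idle_stop_seconds=None, auto_stop_time=None, auto_stop_seconds=None):
    """Build a lifecycleConfig dict containing only the scheduled stop fields."""
    # autoStopTime and autoStopTtl can't be set together.
    if auto_stop_time is not None and auto_stop_seconds is not None:
        raise ValueError("set autoStopTime or autoStopTtl, not both")
    config = {}
    if idle_stop_seconds is not None:
        # Durations are expressed in seconds and terminated by "s".
        config["idleStopTtl"] = f"{idle_stop_seconds}s"
    if auto_stop_time is not None:
        if auto_stop_time.tzinfo is None:
            raise ValueError("auto_stop_time must be timezone-aware")
        # RFC 3339 UTC "Zulu" format, here with whole-second precision.
        utc = auto_stop_time.astimezone(timezone.utc)
        config["autoStopTime"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    if auto_stop_seconds is not None:
        config["autoStopTtl"] = f"{auto_stop_seconds}s"
    return config


body = {
    "clusterName": "example-cluster",  # hypothetical cluster name
    "config": {"lifecycleConfig": lifecycle_config(idle_stop_seconds=1800)},
}
print(json.dumps(body, indent=2))
```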

Using scheduled stop with scheduled deletion

If you use both cluster scheduled stop and cluster scheduled deletion when creating or updating a cluster, note the following constraints:

  • The stop-max-idle period must be shorter than or equal to the delete-max-idle period, or the period resulting from delete-max-age or delete-expiration-time.

  • The stop-max-age and stop-expiration-time values must be earlier than the delete-max-age and delete-expiration-time values, respectively, so that the cluster is scheduled to stop before it is scheduled for deletion.
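
The idle-period constraint can be checked before you submit a request. The following Python sketch covers only the first constraint (the stop-max-idle comparison); the function and parameter names are hypothetical, all values are in seconds, and seconds_until_delete stands for the period that results from delete-max-age or delete-expiration-time.

```python
from typing import Optional


def stop_idle_is_compatible(
    stop_max_idle: int,
    delete_max_idle: Optional[int] = None,
    seconds_until_delete: Optional[int] = None,
) -> bool:
    """Return True if stop-max-idle doesn't exceed any configured deletion period."""
    # Compare only against the deletion settings that are actually configured.
    limits = [v for v in (delete_max_idle, seconds_until_delete) if v is not None]
    return all(stop_max_idle <= limit for limit in limits)
```

For example, a 30-minute stop-max-idle with a 1-hour delete-max-idle passes the check, while a 2-hour stop-max-idle with the same delete-max-idle does not.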

View Scheduled Stop cluster settings

gcloud CLI

You can use the gcloud dataproc clusters list command to confirm that a cluster has scheduled stop enabled.

gcloud dataproc clusters list \
    --region=REGION

Sample output:

...
NAME         WORKER_COUNT ... SCHEDULED_STOP
CLUSTER_ID   NUMBER       ... enabled
...

You can use the gcloud dataproc clusters describe command to check cluster LifecycleConfig scheduled stop settings.

gcloud dataproc clusters describe CLUSTER_NAME \
    --region=REGION

Sample output:

...
lifecycleConfig:
  autoStopTime: '2018-11-28T19:33:48.146Z'
  idleStopTtl: 1800s
  idleStartTime: '2018-11-28T18:33:48.146Z'
...

The autoStopTime and idleStopTtl values are set by the user. Dataproc generates the idleStartTime value, which is the latest cluster idle start time.

Although Dataproc calculates idleStartTime based on the cessation of job activity, the scheduled stop mechanism considers both the idleStartTime and the cluster's last start time. Specifically, if a cluster is stopped, either by a user or by Dataproc, the idle calculation for the scheduled stop feature is reset, so the countdown to a scheduled stop restarts when the cluster next starts. However, the idleStartTime value itself isn't reset when a stopped cluster is restarted. It continues to reflect the last occurrence of job inactivity prior to the stop.

Therefore, two conditions must be met for Dataproc to stop a cluster based on the idleStopTtl:

  1. The cluster must have been idle for the duration specified by idleStopTtl since it was last started.
  2. The cluster must have been idle for the duration specified by idleStopTtl since the last idleStartTime reset.
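
The two conditions can be expressed as a pair of elapsed-time checks. The following Python sketch is an illustrative model, not Dataproc's implementation: the function and parameter names are hypothetical, and it assumes the cluster has stayed idle since both reference times.

```python
from datetime import datetime, timedelta, timezone


def idle_stop_due(
    now: datetime,
    idle_start_time: datetime,
    last_start_time: datetime,
    idle_stop_ttl: timedelta,
) -> bool:
    """Return True when both idleStopTtl conditions hold (illustrative model)."""
    # Condition 1: idle for the full TTL since the cluster was last started.
    since_last_start = now - last_start_time >= idle_stop_ttl
    # Condition 2: idle for the full TTL since idleStartTime.
    since_idle_start = now - idle_start_time >= idle_stop_ttl
    return since_last_start and since_idle_start


utc = timezone.utc
idle_start = datetime(2018, 11, 28, 18, 33, 48, 146000, tzinfo=utc)  # from the sample output
restarted = datetime(2018, 11, 28, 18, 50, 0, tzinfo=utc)  # hypothetical restart time
ttl = timedelta(seconds=1800)

# 19:05: more than 30 minutes after idleStartTime, but only 15 minutes after the restart.
print(idle_stop_due(datetime(2018, 11, 28, 19, 5, tzinfo=utc), idle_start, restarted, ttl))
# 19:20: 30 minutes have passed since both reference times.
print(idle_stop_due(datetime(2018, 11, 28, 19, 20, tzinfo=utc), idle_start, restarted, ttl))
```

In this model, the restart at 18:50 delays the stop until 19:20 even though idleStartTime still shows 18:33:48.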

REST API

You can make a clusters.list request to confirm that a cluster has scheduled stop enabled.
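
After you have the JSON response, you can filter it for clusters that have any scheduled stop field set. The following Python sketch works on an already-parsed response and makes no network call. It assumes the response contains a clusters array whose entries carry clusterName and config.lifecycleConfig; the sample data and helper name are hypothetical.

```python
STOP_FIELDS = ("idleStopTtl", "autoStopTime", "autoStopTtl")


def clusters_with_scheduled_stop(list_response: dict) -> list:
    """Return the names of clusters that have at least one scheduled stop field."""
    names = []
    for cluster in list_response.get("clusters", []):
        lifecycle = cluster.get("config", {}).get("lifecycleConfig", {})
        if any(field in lifecycle for field in STOP_FIELDS):
            names.append(cluster.get("clusterName"))
    return names


# Hypothetical parsed response, for illustration only.
sample_response = {
    "clusters": [
        {"clusterName": "cluster-a", "config": {"lifecycleConfig": {"idleStopTtl": "1800s"}}},
        {"clusterName": "cluster-b", "config": {}},
    ]
}
print(clusters_with_scheduled_stop(sample_response))
```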