To avoid incurring Google Cloud charges for an inactive cluster without having to delete and recreate the cluster, use the Dataproc cluster scheduled stop feature, which stops all cluster VMs. You aren't charged for stopped VMs, but charges continue for associated resources, such as persistent disks.
Stopping a cluster stops all cluster VMs and causes any running jobs to fail. When a cluster is stopped, you can't update the cluster, submit jobs to the cluster, or access optional components on the cluster using the Dataproc Component Gateway. After stopping a cluster, you can restart it and resume work.
Cluster scheduled stop is available for clusters created with image versions 2.2.42+, 2.1.76+, and 2.0.57+.
Features
You can stop clusters after a specified idle period, at a specified future time, or after a specified period from the cluster creation request.
Cluster scheduled stop supports clusters with secondary workers and zero scale clusters.
You can update or cancel the cluster scheduled stop configuration.
Limitations and considerations
- Cluster scheduled stop isn't supported for clusters with local SSDs.
- You can't set cluster scheduled stop values using the Google Cloud console.
- Although you can update a cluster scheduled stop configuration, an initiated stop operation will continue. To check if the stop operation has started, examine cluster logs in Cloud Logging.
- Updating the stop schedule on a cluster whose scheduled stop time has already passed removes the scheduled stop configuration. To re-enable the scheduled stop, include a future time in your update request.
Actions that disable cluster scheduled stop
While a cluster is running, the following actions disable cluster scheduled stop until the disabling action is reversed:
- Removing the Dataproc Service Agent IAM role from the Dataproc Service Agent service account
- Disabling the Dataproc API in the cluster project
- Enabling VPC Service Controls if the Dataproc Service Agent service account (control plane identity) isn't within the perimeter boundary
Cluster idle time calculation
For a cluster to be considered idle, the following conditions must be met:
- Cluster creation is finished (the time taken for cluster provisioning and startup is excluded from the idle time calculation).
- No jobs are running on the cluster.
- The cluster isn't in a STOPPED state.
Submitting a job to the cluster or stopping a cluster resets the idle time calculation.
The dataproc:dataproc.cluster-ttl.consider-yarn-activity cluster property affects the calculation of cluster idle time, as follows:
- This property is enabled (set to true) by default.
- When this property is enabled, both YARN and Dataproc Jobs API activity must be idle to start and continue incrementing the cluster idle time calculation.
  - YARN activity includes pending and running YARN applications.
  - Dataproc Jobs API activity includes pending and running jobs submitted to the Dataproc Jobs API.
- When this property is set to false, the cluster idle time calculation starts and continues only when Dataproc Jobs API activity is idle.
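The idle conditions and the effect of the consider-yarn-activity property can be sketched as a small decision function. This is only an illustrative model of the rules described above, not Dataproc's implementation; the ClusterActivity type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterActivity:
    """Hypothetical snapshot of cluster activity (not a Dataproc API type)."""
    creation_finished: bool
    state: str                  # for example "RUNNING" or "STOPPED"
    active_jobs_api_jobs: int   # pending + running Dataproc Jobs API jobs
    active_yarn_apps: int       # pending + running YARN applications


def is_idle(activity: ClusterActivity, consider_yarn_activity: bool = True) -> bool:
    """Return True if the snapshot counts as idle under the rules above."""
    # Provisioning time and stopped clusters never count as idle time.
    if not activity.creation_finished or activity.state == "STOPPED":
        return False
    # Dataproc Jobs API activity always blocks the idle calculation.
    if activity.active_jobs_api_jobs > 0:
        return False
    # YARN activity only matters when the property is enabled (the default).
    if consider_yarn_activity and activity.active_yarn_apps > 0:
        return False
    return True
```

For example, a running cluster with two YARN applications and no Jobs API jobs is not idle by default, but is treated as idle when the property is set to false.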
Use Cluster Scheduled Stop
gcloud CLI
You can set scheduled stop values when you create a cluster using the Google Cloud CLI or Dataproc API. After you create the cluster, you can update the cluster to change or delete cluster scheduled stop values previously set on the cluster.
Flag | Description | Finest Granularity | Min Value | Max Value |
---|---|---|---|---|
--stop-max-idle (1) | Applies to the cluster create and cluster update commands. The duration from the moment when the cluster enters the idle state (after creation or startup) to the moment when the cluster begins to stop. Provide the duration in IntegerUnit format, where the unit can be "s", "m", "h", or "d" (seconds, minutes, hours, or days, respectively). Examples: "30m" or "1d" (30 minutes or 1 day from when the cluster becomes idle). | 1 second | 5 minutes | 14 days |
--no-stop-max-idle | Applies to the cluster update command only. Cancels a cluster scheduled stop previously set with the --stop-max-idle flag. | Not applicable | Not applicable | Not applicable |
--stop-expiration-time (2) | Applies to the cluster create and cluster update commands. The time to begin stopping the cluster in ISO 8601 datetime format. You can generate the datetime in correct format using the Timestamp Generator. For example, "2017-08-22T13:31:48-08:00" specifies an expiration time of 13:31:48 in the UTC-8:00 time zone. | 1 second | 10 minutes from the current time | 14 days from the current time |
--stop-max-age (2) | Applies to the cluster create and cluster update commands. The duration from the moment of submitting the cluster create request to the moment when the cluster begins to stop. Provide the duration in IntegerUnit format, where the unit can be "s", "m", "h", or "d" (seconds, minutes, hours, or days, respectively). Examples: "30m" (30 minutes from now) or "1d" (1 day from now). | 1 second | 10 minutes | 14 days |
- (1) You can pass the stop-max-idle flag with either the stop-expiration-time or stop-max-age flag in your cluster create or update request. The first to become true takes effect to stop the cluster.
- (2) You can pass either the stop-expiration-time flag or the stop-max-age flag to the cluster create or update command, but not both.
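Before passing a duration to the CLI, it can help to check it against the documented bounds. The following is a minimal sketch, assuming the simple IntegerUnit form described in the table (a single integer followed by one unit); the helper names are hypothetical and are not part of the gcloud CLI.

```python
import re

# Seconds per unit for the IntegerUnit duration format ("s", "m", "h", "d").
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

# Bounds taken from the table above, in seconds: (minimum, maximum).
_BOUNDS = {
    "stop-max-idle": (5 * 60, 14 * 86400),
    "stop-max-age": (10 * 60, 14 * 86400),
}


def duration_to_seconds(value: str) -> int:
    """Convert an IntegerUnit string such as "30m" or "1d" to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"not an IntegerUnit duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def check_duration(flag: str, value: str) -> int:
    """Return the duration in seconds, or raise if it is outside the bounds."""
    seconds = duration_to_seconds(value)
    low, high = _BOUNDS[flag]
    if not low <= seconds <= high:
        raise ValueError(f"--{flag}={value} must be between {low}s and {high}s")
    return seconds
```

With this sketch, check_duration("stop-max-idle", "30m") returns 1800, while check_duration("stop-max-age", "5m") raises because the minimum for that flag is 10 minutes.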
Cluster creation example:
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --stop-max-idle=DURATION \
    --stop-expiration-time=TIME \
    ... other flags ...
Cluster update example:
gcloud dataproc clusters update CLUSTER_NAME \
    --region=REGION \
    --stop-max-idle=DURATION \
    --no-stop-max-age \
    ... other flags ...
REST API
You can create or update scheduled stop values on a cluster by setting the Dataproc API ClusterLifecycleConfig fields and values listed in the following table as part of a Dataproc clusters.create or clusters.patch API request.
Field | Description | Finest Granularity | Min Value | Max Value |
---|---|---|---|---|
idleStopTtl (1) | Applies to cluster create and cluster update requests. The duration from the moment when the cluster enters the idle state after the cluster is created or updated to the moment when the cluster begins to stop. Provide a duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s". Submit a clusters.patch request with an empty duration to cancel a previously set idleStopTtl value. | 1 second | 5 minutes | 14 days |
autoStopTime (2) | Applies to cluster create and cluster update requests. The time to begin stopping the cluster. Provide a timestamp in RFC 3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z". | 1 second | 10 minutes from the current time | 14 days from the current time |
autoStopTtl (2) | The duration from the moment of submitting the cluster create or update request to the moment when the cluster begins to stop. Provide a duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s". Submit a clusters.patch request with an empty duration to cancel a previously set autoStopTtl value. | 1 second | 10 minutes | 14 days |
- (1) You can set the idleStopTtl field with either the autoStopTime or autoStopTtl field in your cluster create or update request. The first to become true takes effect to stop the cluster.
- (2) You can set either the autoStopTime field or the autoStopTtl field in the cluster create or update request, but not both.
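The value formats in the table can be produced programmatically. The sketch below builds only the scheduled stop fields as a dictionary; where that fragment sits inside the full request body is not shown here. The function names are hypothetical, and the either/or check assumes the rule in the notes above applies to autoStopTime and autoStopTtl.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_to_duration(seconds: float) -> str:
    """Format seconds as a duration string such as "1800s" or "3.5s"."""
    # Up to nine fractional digits are allowed; trailing zeros are trimmed.
    text = f"{seconds:.9f}".rstrip("0").rstrip(".")
    return f"{text}s"


def to_zulu(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC "Zulu" timestamp."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def lifecycle_config(idle_stop_seconds: Optional[float] = None,
                     auto_stop_time: Optional[datetime] = None,
                     auto_stop_seconds: Optional[float] = None) -> dict:
    """Build the scheduled stop fields, rejecting the invalid combination."""
    if auto_stop_time is not None and auto_stop_seconds is not None:
        raise ValueError("set autoStopTime or autoStopTtl, not both")
    config = {}
    if idle_stop_seconds is not None:
        config["idleStopTtl"] = seconds_to_duration(idle_stop_seconds)
    if auto_stop_time is not None:
        config["autoStopTime"] = to_zulu(auto_stop_time)
    if auto_stop_seconds is not None:
        config["autoStopTtl"] = seconds_to_duration(auto_stop_seconds)
    return config
```

Python datetimes carry microsecond precision, so the timestamp has six fractional digits rather than the nine that the format permits.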
Using scheduled stop with scheduled deletion
If you use cluster scheduled stop together with cluster scheduled deletion when creating or updating a cluster, note the following constraints:
- The stop-max-idle period must be shorter than or equal to the delete-max-idle period, or the period resulting from delete-max-age or delete-expiration-time.
- The stop-max-age and stop-expiration-time values must be earlier than the delete-max-age and delete-expiration-time values, respectively.
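The constraint on stop-max-idle can be checked before submitting a request. This is a minimal sketch of one reading of that constraint, namely that the idle stop period must not exceed any deletion period that is configured; it does not cover the stop-max-age and stop-expiration-time constraint. The function and parameter names are hypothetical, and all values are in seconds.

```python
from typing import Optional


def stop_idle_within_delete_limits(
    stop_max_idle_s: int,
    delete_max_idle_s: Optional[int] = None,
    seconds_until_delete: Optional[int] = None,
) -> bool:
    """Check that the idle stop period does not exceed any configured deletion period.

    seconds_until_delete stands for the period resulting from delete-max-age
    or delete-expiration-time, already converted to seconds by the caller.
    """
    limits = [v for v in (delete_max_idle_s, seconds_until_delete) if v is not None]
    # With no deletion settings configured, there is nothing to violate.
    return all(stop_max_idle_s <= limit for limit in limits)
```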
View Scheduled Stop cluster settings
gcloud CLI
You can use the gcloud dataproc clusters list command to confirm that a cluster has scheduled stop enabled.
gcloud dataproc clusters list \
    --region=REGION
Sample output:
...
NAME         WORKER_COUNT ... SCHEDULED_STOP
CLUSTER_ID   NUMBER       ... enabled
...
You can use the gcloud dataproc clusters describe command to check cluster LifecycleConfig scheduled stop settings.
gcloud dataproc clusters describe CLUSTER_NAME \
    --region=REGION
Sample output:
...
lifecycleConfig:
  autoStopTime: '2018-11-28T19:33:48.146Z'
  idleStopTtl: 1800s
  idleStartTime: '2018-11-28T18:33:48.146Z'
...
The autoStopTime and idleStopTtl values are set by the user. Dataproc generates the idleStartTime value, which is the latest cluster idle start time.
While Dataproc calculates idleStartTime based on the cessation of job activity, the mechanism for scheduled cluster stopping considers both the idleStartTime and the cluster's last start time. Specifically, if a cluster is stopped either by a user or by Dataproc, the idle calculation for the scheduled stop feature is reset. This means the countdown to a scheduled stop restarts upon the cluster's next start. However, the idleStartTime itself isn't reset when a stopped cluster is restarted. It continues to reflect the last occurrence of job inactivity prior to the stop.
Therefore, two conditions must be met for Dataproc to stop a cluster based on the idleStopTtl:
- The cluster must have been idle for the duration specified by idleStopTtl since it was last started.
- The cluster must have been idle for the duration specified by idleStopTtl since the last idleStartTime reset.
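Taken together, the two conditions mean the idle countdown effectively starts at the later of idleStartTime and the cluster's last start time. The following sketch estimates the earliest idle-based stop under that reading, assuming the cluster stays idle throughout. The last_start_time input is a hypothetical value that you would obtain separately, and the parser handles the millisecond-precision timestamps shown in the sample output.

```python
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as '2018-11-28T18:33:48.146Z'."""
    # Replace the trailing "Z" so older Python versions can parse the offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_ttl(value: str) -> timedelta:
    """Parse a duration in seconds such as '1800s' or '3.5s'."""
    if not value.endswith("s"):
        raise ValueError(f"expected a duration ending in 's': {value!r}")
    return timedelta(seconds=float(value[:-1]))


def earliest_idle_stop(idle_start_time: str,
                       last_start_time: str,
                       idle_stop_ttl: str) -> datetime:
    """Return the earliest moment at which both conditions above hold."""
    idle_since = max(parse_timestamp(idle_start_time),
                     parse_timestamp(last_start_time))
    return idle_since + parse_ttl(idle_stop_ttl)
```

For instance, with idleStartTime at 18:33:48.146, a restart at 19:00:00, and an idleStopTtl of 1800s, the earliest idle-based stop in this model is 19:30:00, not 19:03:48.146.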
REST API
You can make a clusters.list request to confirm that a cluster has scheduled stop enabled.