This page lists common causes of Dataproc job scheduling delays,
with information that can help you avoid them.
Overview
The following are common reasons why a Dataproc job is delayed (throttled):

- Too many running jobs
- High system memory usage
- Not enough free memory
- Rate limit exceeded
Typically, the job delay message is issued in the following format:

Awaiting execution [SCHEDULER_MESSAGE]
The following sections provide possible causes and solutions for specific job delay scenarios.
Too many running jobs

Scheduler message:
Throttling job ### (and maybe others): Too many running jobs (current=xx max=xx)
Causes:

The maximum number of concurrent jobs, based on master VM memory, is exceeded (the job driver runs on the Dataproc cluster master VM). By default, Dataproc reserves 3.5GB of memory for applications and allows 1 job per GB.

Example: The n1-standard-4 machine type has 15GB of memory. With 3.5GB reserved for overhead,
11.5GB remains. Rounding down to an integer, 11GB is available for up to 11 concurrent jobs.
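The sizing arithmetic above can be sketched in a few lines. This is a minimal illustration, not a Dataproc API; the 3.5GB reservation and 1-job-per-GB figures are the defaults described in this section:

```python
import math

def default_max_concurrent_jobs(master_memory_gb, reserved_gb=3.5, gb_per_job=1.0):
    """Estimate the default concurrent-job limit from master VM memory.

    Dataproc reserves about 3.5GB for overhead and allows 1 job per
    remaining GB, rounded down to an integer.
    """
    available = master_memory_gb - reserved_gb
    return max(0, math.floor(available / gb_per_job))

# n1-standard-4: 15GB memory -> 11.5GB available -> 11 concurrent jobs
print(default_max_concurrent_jobs(15))
```

Lowering `gb_per_job` (the effect of reducing `dataproc:dataproc.scheduler.driver-size-mb`) raises the estimated job limit for the same master VM.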
Solutions:

1. Monitor log metrics, such as CPU and memory usage, to estimate job requirements.

2. When you create a job cluster:

   1. Use a machine type with more memory for the cluster master VM.

   2. If 1GB per job is more than you need, set the dataproc:dataproc.scheduler.driver-size-mb cluster property to less than 1024.

   3. Set the dataproc:dataproc.scheduler.max-concurrent-jobs cluster property to a value suited to your job requirements.
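As one way to apply the settings above, the properties can be passed at cluster creation time. This is a configuration sketch: the cluster name, region, machine type, and property values are illustrative placeholders, not recommendations.

```shell
# Sketch: create a cluster with a larger master VM and tuned scheduler limits.
# CLUSTER_NAME, the region, and the values shown are placeholders.
gcloud dataproc clusters create CLUSTER_NAME \
    --region=us-central1 \
    --master-machine-type=n1-standard-8 \
    --properties="dataproc:dataproc.scheduler.driver-size-mb=512,dataproc:dataproc.scheduler.max-concurrent-jobs=40"
```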
High system memory or not enough free memory

Scheduler message:
Throttling job xxx_____JOBID_____xxx (and maybe others): High system memory usage (current=xx%)
Throttling job xxx_____JOBID_____xxx (and maybe others): Not enough free memory (current=xx min=xx)
Causes:

By default, the Dataproc agent throttles job submission when memory use reaches 90% (0.9). When this limit is reached, new jobs cannot be scheduled.

The amount of free memory needed to schedule another job on the cluster is not sufficient.
Solutions:

When you create a cluster:

1. Increase the value of the dataproc:dataproc.scheduler.max-memory-used cluster property. For example, set it above the 0.90 default to 0.95. Setting the value to 1.0 disables master-memory-utilization job throttling.

2. Increase the value of the dataproc.scheduler.min-free-memory.mb cluster property. The default value is 256 MB.
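For example, both thresholds can be relaxed when the cluster is created. This is a configuration sketch; CLUSTER_NAME, the region, and the values are placeholders, and the `dataproc:` prefix on the min-free-memory property is assumed to follow the same convention as the other scheduler properties.

```shell
# Sketch: relax the memory-based job-throttling thresholds at creation time.
gcloud dataproc clusters create CLUSTER_NAME \
    --region=us-central1 \
    --properties="dataproc:dataproc.scheduler.max-memory-used=0.95,dataproc:dataproc.scheduler.min-free-memory.mb=512"
```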
Job rate limit exceeded

Scheduler message:

Throttling job xxx__JOBID___xxx (and maybe others): Rate limit

Causes:

The Dataproc agent reached the job submission rate limit.

Solutions:

By default, Dataproc agent job submission is limited to 1.0 QPS. You can set a different value when you create a cluster with the dataproc:dataproc.scheduler.job-submission-rate cluster property.
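The submission rate can likewise be raised at cluster creation. A configuration sketch, with placeholder name, region, and value:

```shell
# Sketch: raise the agent's job submission rate limit (default is 1.0 QPS).
gcloud dataproc clusters create CLUSTER_NAME \
    --region=us-central1 \
    --properties="dataproc:dataproc.scheduler.job-submission-rate=2.0"
```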
View job status

To view job status and details, see Job monitoring and debugging in the Dataproc documentation.

Last updated 2025-09-04 UTC.