Troubleshoot job delays
This page lists common causes of Dataproc job scheduling delays, with information that can help you avoid them.
Overview

The following are common reasons why a Dataproc job is delayed (throttled):

Too many running jobs
High system memory usage
Not enough free memory
Rate limit exceeded

Typically, the job delay message is issued in the following format:

Awaiting execution [SCHEDULER_MESSAGE]

The following sections provide possible causes and solutions for specific job delay scenarios.
Too many running jobs

Scheduler message:

Throttling job ### (and maybe others): Too many running jobs (current=xx max=xx)

Causes:

The maximum number of concurrent jobs based on master VM memory has been exceeded. The job driver runs on the Dataproc cluster master VM. By default, Dataproc reserves 3.5 GB of memory for applications and allows one job per GB.

Example: the n1-standard-4 machine type has 15 GB of memory. With 3.5 GB reserved for overhead, 11.5 GB remains. Rounding down to an integer, 11 GB is available for up to 11 concurrent jobs.
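The capacity arithmetic above can be sketched as a small calculation (the function name and defaults are illustrative, not a Dataproc API):

```python
import math

def max_concurrent_jobs(master_memory_gb, reserved_gb=3.5, gb_per_job=1.0):
    """Estimate the default concurrent-job limit from master VM memory.

    Dataproc reserves about 3.5 GB for applications and allows one job
    per remaining GB, rounded down to a whole number.
    """
    available_gb = master_memory_gb - reserved_gb
    return math.floor(available_gb / gb_per_job)

# n1-standard-4: 15 GB total -> 11.5 GB available -> 11 concurrent jobs
print(max_concurrent_jobs(15))  # 11
```

Lowering the per-job memory budget (for example, by setting a smaller driver size, as described below) raises the estimate accordingly.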
Solutions:

Monitor logging metrics, such as CPU and memory usage, to estimate job requirements.

When you create a job cluster:

Use a machine type with more memory for the cluster master VM.

If 1 GB per job is more than you need, set the dataproc:dataproc.scheduler.driver-size-mb cluster property to less than 1024.

Set the dataproc:dataproc.scheduler.max-concurrent-jobs cluster property to a value suited to your job requirements.
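For instance, both properties can be set at cluster creation time with gcloud. This is a sketch: the cluster name and region are placeholders, and the property values are examples, not recommendations:

```shell
# Budget 512 MB of master memory per job driver and
# allow up to 20 concurrent jobs.
gcloud dataproc clusters create example-cluster \
    --region=us-central1 \
    --properties=dataproc:dataproc.scheduler.driver-size-mb=512,dataproc:dataproc.scheduler.max-concurrent-jobs=20
```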
High system memory or not enough free memory

Scheduler message:

Throttling job xxx_____JOBID_____xxx (and maybe others): High system memory usage (current=xx%)

Throttling job xxx_____JOBID_____xxx (and maybe others): Not enough free memory (current=xx min=xx)

Causes:

By default, the Dataproc agent throttles job submission when memory usage reaches 90% (0.9). When this limit is reached, new jobs cannot be scheduled.

The amount of free memory needed to schedule another job on the cluster is not sufficient.
Solution:

When you create a cluster:

Increase the value of the dataproc:dataproc.scheduler.max-memory-used cluster property. For example, set it above the 0.90 default to 0.95. Setting the value to 1.0 disables master-memory-utilization job throttling.

Increase the value of the dataproc.scheduler.min-free-memory.mb cluster property. The default value is 256 MB.
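As a sketch, both thresholds can be raised at cluster creation with gcloud (cluster name and region are placeholders; the dataproc: prefix on min-free-memory.mb is assumed here by analogy with the other scheduler properties):

```shell
# Throttle job submission only above 95% memory use, and
# require 512 MB free before scheduling another job.
gcloud dataproc clusters create example-cluster \
    --region=us-central1 \
    --properties=dataproc:dataproc.scheduler.max-memory-used=0.95,dataproc:dataproc.scheduler.min-free-memory.mb=512
```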
Rate limit exceeded

Scheduler message:

Throttling job xxx__JOBID___xxx (and maybe others): Rate limit

Causes:

The Dataproc agent reached the job submission rate limit.

Solutions:

By default, Dataproc agent job submission is limited to 1.0 QPS. You can set a different value when you create a cluster with the dataproc:dataproc.scheduler.job-submission-rate cluster property.
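For example, the submission rate can be raised at cluster creation (a sketch; cluster name, region, and the 2.0 value are placeholders):

```shell
# Allow up to 2 job submissions per second instead of the 1.0 QPS default.
gcloud dataproc clusters create example-cluster \
    --region=us-central1 \
    --properties=dataproc:dataproc.scheduler.job-submission-rate=2.0
```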
View job status

To view job status and details, see Job monitoring and debugging.

Last updated 2025-08-22 UTC.