Dataproc metrics

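Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Google Cloud Observability collects and ingests metrics, events, and metadata from Dataproc clusters, including per-cluster HDFS, YARN, job, and operation metrics, and generates insights through dashboards and charts (see "Cloud Monitoring Dataproc metrics").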

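Dataproc resource metric collection

Cloud Monitoring collects metrics for the following Dataproc resources:

  • Cloud Dataproc Cluster
  • Cloud Dataproc Job
  • Cloud Dataproc Batch
  • Cloud Dataproc Session

Dataproc resource metrics are collected in the following format: dataproc.googleapis.com/RESOURCE/METRIC. They include a collection of multiple OSS metrics.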

View Dataproc resource metrics

To select and view Dataproc resource metrics in Metrics Explorer, enter "dataproc" in the "Filter by resource or metric name" box, and then select a "Cloud Dataproc" resource.

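Custom metric collection

When you create a Dataproc cluster, you can enable metric collection from one or more custom metric sources. A standard set of metrics is collected from each enabled metric source unless you specify the metrics to collect from that source (user-specified metrics are called metric "overrides").

Custom OSS metrics are collected in the following format: custom.googleapis.com/OSS_COMPONENT/METRIC

Custom OSS metric examples: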

custom.googleapis.com/spark/driver/DAGScheduler/job/allJobs
custom.googleapis.com/hiveserver2/memory/MaxNonHeapMemory
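
The naming format above can be taken apart mechanically. The following is a minimal sketch, assuming that the first path segment after the custom.googleapis.com/ prefix is the OSS component and the remainder is the metric path; the helper name split_custom_metric is hypothetical and not part of any Dataproc or Monitoring library.

```python
from __future__ import annotations

CUSTOM_PREFIX = "custom.googleapis.com/"


def split_custom_metric(metric_type: str) -> tuple[str, str]:
    """Split a custom OSS metric type into (OSS component, metric path)."""
    if not metric_type.startswith(CUSTOM_PREFIX):
        raise ValueError(f"not a custom metric type: {metric_type!r}")
    # Everything after the prefix is OSS_COMPONENT/METRIC; METRIC may contain "/".
    component, sep, metric = metric_type[len(CUSTOM_PREFIX):].partition("/")
    if not sep or not component or not metric:
        raise ValueError(f"expected OSS_COMPONENT/METRIC in {metric_type!r}")
    return component, metric


component, metric = split_custom_metric(
    "custom.googleapis.com/spark/driver/DAGScheduler/job/allJobs"
)
print(component, metric)
```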

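Enable custom metric collection

You can use the gcloud CLI or the Dataproc API to enable the collection of custom metrics from one or more metric sources.

gcloud CLI

Custom metric collection

Use the gcloud dataproc clusters create --metric-sources flag to collect custom metrics from one or more metric sources: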

gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    ... other flags

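Override metric collection

Optionally, add the --metric-overrides or --metric-overrides-file flag to collect one or more specific custom metrics from one or more metric sources.

  • Any custom metric and any Spark metric can be listed for collection as a metric override. Override metric values are case sensitive and must be provided in camel case format where applicable.

    Examples:

    • sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed
    • hiveserver2:JVM:Memory:NonHeapMemoryUsage.used
    • yarn:ResourceManager:JvmMetrics:MemHeapMaxM

  • Only the specified override metrics are collected from a given metric source. For example, if one or more spark:executor metrics are listed as metric overrides, no other SPARK metrics are collected. Collection of custom metrics from other metric sources is not affected. For example, if both the SPARK and YARN metric sources are enabled and overrides are provided for Spark metrics only, the standard set of enabled YARN metrics is still collected.
  • The source of a specified metric override must be enabled. For example, if one or more spark:driver metrics are provided as metric overrides, the spark metric source must be enabled (--metric-sources=spark).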
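
The override rules above can be modeled in a few lines. The following is a rough sketch, not Dataproc's implementation: the function effective_metrics and the default_metrics mapping are hypothetical, and the sketch assumes the override prefix is spelled exactly like the source name (true for spark and yarn; a prefix such as sparkHistoryServer would need an explicit mapping to its source).

```python
from __future__ import annotations


def effective_metrics(
    enabled_sources: list[str],
    default_metrics: dict[str, list[str]],
    overrides: list[str],
) -> dict[str, list[str]]:
    """Return the metrics collected per enabled source under the override rules."""
    # Source names are case insensitive, so normalize them to lowercase.
    enabled = {source.lower() for source in enabled_sources}

    overrides_by_source: dict[str, list[str]] = {}
    for override in overrides:
        # Assumption: the override prefix is identical to the source name.
        prefix = override.split(":", 1)[0]
        if prefix not in enabled:
            raise ValueError(f"source {prefix!r} must be enabled for {override!r}")
        overrides_by_source.setdefault(prefix, []).append(override)

    collected: dict[str, list[str]] = {}
    for source in sorted(enabled):
        if source in overrides_by_source:
            # Overrides replace the standard set for this source only.
            collected[source] = overrides_by_source[source]
        else:
            # Sources without overrides keep their standard set.
            collected[source] = list(default_metrics.get(source, []))
    return collected


defaults = {
    "spark": ["spark:driver:DAGScheduler:job.allJobs"],
    "yarn": ["yarn:ResourceManager:ClusterMetrics:NumActiveNMs"],
}
print(effective_metrics(["SPARK", "YARN"], defaults, ["spark:executor:executor:cpuTime"]))
```

In this example the Spark standard set is replaced by the single override, while the YARN standard set is untouched.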

Overrides list

gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides=LIST_OF_METRIC_OVERRIDES \
    ... other flags

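Notes:

  • --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are case insensitive; for example, either "yarn" or "YARN" is acceptable.
  • --metric-overrides: Provide a list of metrics in the following format:

    METRIC_SOURCE:INSTANCE:GROUP:METRIC

    Example: --metric-overrides=sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed

    This flag is an alternative to the --metric-overrides-file flag and cannot be used together with it.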
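
As an illustration of the four-part override format, the sketch below splits an override string into its fields. The MetricOverride type and parse_override function are hypothetical names. The parser is deliberately strict: the flink: entries in the metric tables later on this page have only three fields, so they would be rejected here and would need separate handling.

```python
from typing import NamedTuple


class MetricOverride(NamedTuple):
    source: str
    instance: str
    group: str
    metric: str


def parse_override(value: str) -> MetricOverride:
    """Parse METRIC_SOURCE:INSTANCE:GROUP:METRIC without changing letter case."""
    parts = value.split(":")
    # The metric field may contain dots (for example NonHeapMemoryUsage.committed),
    # but colons only separate the four fields.
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected METRIC_SOURCE:INSTANCE:GROUP:METRIC, got {value!r}")
    return MetricOverride(*parts)


override = parse_override("sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed")
print(override.source, override.metric)
```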

Overrides file

gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides-file=METRIC_OVERRIDES_FILENAME \
    ... other flags

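Notes:

  • --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are case insensitive; for example, either "yarn" or "YARN" is acceptable.
  • --metric-overrides-file: Specify a local or Cloud Storage file (gs://bucket/filename) that contains one or more metrics in the following format:

    METRIC_SOURCE:INSTANCE:GROUP:METRIC

    Use camel case format where applicable.

    Examples:

    • --metric-overrides-file=gs://my-bucket/my-filename.txt
    • --metric-overrides-file=./local-directory/local-filename.txt

      This flag is an alternative to the --metric-overrides flag and cannot be used together with it.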
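
The flag rules described in the notes can be checked before the command is run. The sketch below assembles the argument list for gcloud dataproc clusters create using only the three flags named on this page. The function build_create_command is a hypothetical helper, and joining multiple sources or overrides with commas is an assumption about how gcloud expects list values.

```python
from __future__ import annotations


def build_create_command(
    cluster_name: str,
    metric_sources: list[str],
    overrides: list[str] | None = None,
    overrides_file: str | None = None,
) -> list[str]:
    """Build the argv for cluster creation with custom metric collection."""
    if not metric_sources:
        raise ValueError("--metric-sources is required for custom metric collection")
    if overrides and overrides_file:
        # The two override flags are alternatives and cannot be combined.
        raise ValueError("use --metric-overrides or --metric-overrides-file, not both")

    command = [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        "--metric-sources=" + ",".join(metric_sources),  # assumed comma-separated
    ]
    if overrides:
        command.append("--metric-overrides=" + ",".join(overrides))
    if overrides_file:
        command.append("--metric-overrides-file=" + overrides_file)
    return command


print(" ".join(build_create_command(
    "cluster-name", ["spark"], overrides_file="gs://my-bucket/my-filename.txt"
)))
```

Returning a list rather than a single string lets the result be passed to subprocess.run without shell quoting concerns.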

REST API

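Use DataprocMetricConfig as part of a clusters.create request to enable the collection of custom metrics.

Note: monitoring-agent-defaults is not available on clusters that use 2.2 image versions unless the Ops Agent is installed.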
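
For orientation, a clusters.create request body that carries a DataprocMetricConfig might be assembled as in the sketch below. This page does not show the request schema, so the field names (config.dataprocMetricConfig.metrics, metricSource, metricOverrides) and the upper-case source values are assumptions based on the Dataproc REST reference and should be verified there. The helper cluster_body and the project and cluster names are hypothetical.

```python
from __future__ import annotations

import json


def cluster_body(
    project_id: str,
    cluster_name: str,
    sources: dict[str, list[str]],
) -> dict:
    """Build a request body; `sources` maps a metric source to its overrides."""
    metrics = []
    for source, overrides in sources.items():
        entry: dict = {"metricSource": source}  # assumed field name
        if overrides:
            entry["metricOverrides"] = list(overrides)  # assumed field name
        metrics.append(entry)
    return {
        "projectId": project_id,
        "clusterName": cluster_name,
        "config": {"dataprocMetricConfig": {"metrics": metrics}},
    }


body = cluster_body(
    "my-project",
    "cluster-name",
    {"SPARK": ["spark:driver:DAGScheduler:job.allJobs"], "YARN": []},
)
print(json.dumps(body, indent=2))
```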

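View custom metrics

To select and view Dataproc custom metrics in Metrics Explorer, select the VM Instance resource, and then select Custom metrics.

Custom metrics

You can enable Dataproc to collect the custom metrics listed in the following tables.

  • Metrics marked "y" in the last column of each table are collected by default when you enable the associated metric source.

  • If you override the collection of a metric source's standard set of enabled metrics (see "Enable custom metric collection"), you can enable collection of any metric listed for that source (and of any Spark metric).

  • Dataproc uses a monitoring agent to collect metrics. Agent metrics are collected when any metric source is enabled. Users are not charged for these metrics; Dataproc uses them to diagnose metric collection issues.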

Hadoop metrics

HDFS metrics

Metric Metrics Explorer name Enabled by default
hdfs:NameNode:FSNamesystem:CapacityTotalGB dfs/FSNamesystem/CapacityTotalGB y
hdfs:NameNode:FSNamesystem:CapacityUsedGB dfs/FSNamesystem/CapacityUsedGB y
hdfs:NameNode:FSNamesystem:CapacityRemainingGB dfs/FSNamesystem/CapacityRemainingGB y
hdfs:NameNode:FSNamesystem:FilesTotal dfs/FSNamesystem/FilesTotal y
hdfs:NameNode:FSNamesystem:MissingBlocks dfs/FSNamesystem/MissingBlocks n
hdfs:NameNode:FSNamesystem:ExpiredHeartbeats dfs/FSNamesystem/ExpiredHeartbeats n
hdfs:NameNode:FSNamesystem:TransactionsSinceLastCheckpoint dfs/FSNamesystem/TransactionsSinceLastCheckpoint n
hdfs:NameNode:FSNamesystem:TransactionsSinceLastLogRoll dfs/FSNamesystem/TransactionsSinceLastLogRoll n
hdfs:NameNode:FSNamesystem:LastWrittenTransactionId dfs/FSNamesystem/LastWrittenTransactionId n
hdfs:NameNode:FSNamesystem:CapacityTotal dfs/FSNamesystem/CapacityTotal n
hdfs:NameNode:FSNamesystem:CapacityUsed dfs/FSNamesystem/CapacityUsed n
hdfs:NameNode:FSNamesystem:CapacityRemaining dfs/FSNamesystem/CapacityRemaining n
hdfs:NameNode:FSNamesystem:CapacityUsedNonDFS dfs/FSNamesystem/CapacityUsedNonDFS n
hdfs:NameNode:FSNamesystem:TotalLoad dfs/FSNamesystem/TotalLoad n
hdfs:NameNode:FSNamesystem:SnapshottableDirectories dfs/FSNamesystem/SnapshottableDirectories n
hdfs:NameNode:FSNamesystem:Snapshots dfs/FSNamesystem/Snapshots n
hdfs:NameNode:FSNamesystem:BlocksTotal dfs/FSNamesystem/BlocksTotal n
hdfs:NameNode:FSNamesystem:PendingReplicationBlocks dfs/FSNamesystem/PendingReplicationBlocks n
hdfs:NameNode:FSNamesystem:UnderReplicatedBlocks dfs/FSNamesystem/UnderReplicatedBlocks n
hdfs:NameNode:FSNamesystem:CorruptBlocks dfs/FSNamesystem/CorruptBlocks n
hdfs:NameNode:FSNamesystem:ScheduledReplicationBlocks dfs/FSNamesystem/ScheduledReplicationBlocks n
hdfs:NameNode:FSNamesystem:PendingDeletionBlocks dfs/FSNamesystem/PendingDeletionBlocks n
hdfs:NameNode:FSNamesystem:ExcessBlocks dfs/FSNamesystem/ExcessBlocks n
hdfs:NameNode:FSNamesystem:PostponedMisreplicatedBlocks dfs/FSNamesystem/PostponedMisreplicatedBlocks n
hdfs:NameNode:FSNamesystem:PendingDataNodeMessageCourt dfs/FSNamesystem/PendingDataNodeMessageCourt n
hdfs:NameNode:FSNamesystem:MillisSinceLastLoadedEdits dfs/FSNamesystem/MillisSinceLastLoadedEdits n
hdfs:NameNode:FSNamesystem:BlockCapacity dfs/FSNamesystem/BlockCapacity n
hdfs:NameNode:FSNamesystem:StaleDataNodes dfs/FSNamesystem/StaleDataNodes n
hdfs:NameNode:FSNamesystem:TotalFiles dfs/FSNamesystem/TotalFiles n
hdfs:NameNode:JvmMetrics:MemHeapUsedM dfs/jvm/MemHeapUsedM n
hdfs:NameNode:JvmMetrics:MemHeapCommittedM dfs/jvm/MemHeapCommittedM n
hdfs:NameNode:JvmMetrics:MemHeapMaxM dfs/jvm/MemHeapMaxM n
hdfs:NameNode:JvmMetrics:MemMaxM dfs/jvm/MemMaxM n

YARN metrics

Metric Metrics Explorer name Enabled by default
yarn:ResourceManager:ClusterMetrics:NumActiveNMs yarn/ClusterMetrics/NumActiveNMs y
yarn:ResourceManager:ClusterMetrics:NumDecommissionedNMs yarn/ClusterMetrics/NumDecommissionedNMs n
yarn:ResourceManager:ClusterMetrics:NumLostNMs yarn/ClusterMetrics/NumLostNMs n
yarn:ResourceManager:ClusterMetrics:NumUnhealthyNMs yarn/ClusterMetrics/NumUnhealthyNMs n
yarn:ResourceManager:ClusterMetrics:NumRebootedNMs yarn/ClusterMetrics/NumRebootedNMs n
yarn:ResourceManager:QueueMetrics:running_0 yarn/QueueMetrics/running_0 y
yarn:ResourceManager:QueueMetrics:running_60 yarn/QueueMetrics/running_60 y
yarn:ResourceManager:QueueMetrics:running_300 yarn/QueueMetrics/running_300 y
yarn:ResourceManager:QueueMetrics:running_1440 yarn/QueueMetrics/running_1440 y
yarn:ResourceManager:QueueMetrics:AppsSubmitted yarn/QueueMetrics/AppsSubmitted y
yarn:ResourceManager:QueueMetrics:AvailableMB yarn/QueueMetrics/AvailableMB y
yarn:ResourceManager:QueueMetrics:PendingContainers yarn/QueueMetrics/PendingContainers y
yarn:ResourceManager:QueueMetrics:AppsRunning yarn/QueueMetrics/AppsRunning n
yarn:ResourceManager:QueueMetrics:AppsPending yarn/QueueMetrics/AppsPending n
yarn:ResourceManager:QueueMetrics:AppsCompleted yarn/QueueMetrics/AppsCompleted n
yarn:ResourceManager:QueueMetrics:AppsKilled yarn/QueueMetrics/AppsKilled n
yarn:ResourceManager:QueueMetrics:AppsFailed yarn/QueueMetrics/AppsFailed n
yarn:ResourceManager:QueueMetrics:AllocatedMB yarn/QueueMetrics/AllocatedMB n
yarn:ResourceManager:QueueMetrics:AllocatedVCores yarn/QueueMetrics/AllocatedVCores n
yarn:ResourceManager:QueueMetrics:AllocatedContainers yarn/QueueMetrics/AllocatedContainers n
yarn:ResourceManager:QueueMetrics:AggregateContainersAllocated yarn/QueueMetrics/AggregateContainersAllocated n
yarn:ResourceManager:QueueMetrics:AggregateContainersReleased yarn/QueueMetrics/AggregateContainersReleased n
yarn:ResourceManager:QueueMetrics:AvailableVCores yarn/QueueMetrics/AvailableVCores n
yarn:ResourceManager:QueueMetrics:PendingMB yarn/QueueMetrics/PendingMB n
yarn:ResourceManager:QueueMetrics:PendingVCores yarn/QueueMetrics/PendingVCores n
yarn:ResourceManager:QueueMetrics:ReservedMB yarn/QueueMetrics/ReservedMB n
yarn:ResourceManager:QueueMetrics:ReservedVCores yarn/QueueMetrics/ReservedVCores n
yarn:ResourceManager:QueueMetrics:ReservedContainers yarn/QueueMetrics/ReservedContainers n
yarn:ResourceManager:QueueMetrics:ActiveUsers yarn/QueueMetrics/ActiveUsers n
yarn:ResourceManager:QueueMetrics:ActiveApplications yarn/QueueMetrics/ActiveApplications n
yarn:ResourceManager:QueueMetrics:FairShareMB yarn/QueueMetrics/FairShareMB n
yarn:ResourceManager:QueueMetrics:FairShareVCores yarn/QueueMetrics/FairShareVCores n
yarn:ResourceManager:QueueMetrics:MinShareMB yarn/QueueMetrics/MinShareMB n
yarn:ResourceManager:QueueMetrics:MinShareVCores yarn/QueueMetrics/MinShareVCores n
yarn:ResourceManager:QueueMetrics:MaxShareMB yarn/QueueMetrics/MaxShareMB n
yarn:ResourceManager:QueueMetrics:MaxShareVCores yarn/QueueMetrics/MaxShareVCores n
yarn:ResourceManager:JvmMetrics:MemHeapUsedM yarn/jvm/MemHeapUsedM n
yarn:ResourceManager:JvmMetrics:MemHeapCommittedM yarn/jvm/MemHeapCommittedM n
yarn:ResourceManager:JvmMetrics:MemHeapMaxM yarn/jvm/MemHeapMaxM n
yarn:ResourceManager:JvmMetrics:MemMaxM yarn/jvm/MemMaxM n

Spark metrics

Spark driver metrics

Metric Metrics Explorer name Enabled by default
spark:driver:BlockManager:disk.diskSpaceUsed_MB spark/driver/BlockManager/disk/diskSpaceUsed_MB y
spark:driver:BlockManager:memory.maxMem_MB spark/driver/BlockManager/memory/maxMem_MB y
spark:driver:BlockManager:memory.memUsed_MB spark/driver/BlockManager/memory/memUsed_MB y
spark:driver:DAGScheduler:job.allJobs spark/driver/DAGScheduler/job/allJobs y
spark:driver:DAGScheduler:stage.failedStages spark/driver/DAGScheduler/stage/failedStages y
spark:driver:DAGScheduler:stage.waitingStages spark/driver/DAGScheduler/stage/waitingStages y

Spark executor metrics

Metric Metrics Explorer name Enabled by default
spark:executor:executor:bytesRead spark/executor/bytesRead y
spark:executor:executor:bytesWritten spark/executor/bytesWritten y
spark:executor:executor:cpuTime spark/executor/cpuTime y
spark:executor:executor:diskBytesSpilled spark/executor/diskBytesSpilled y
spark:executor:executor:recordsRead spark/executor/recordsRead y
spark:executor:executor:recordsWritten spark/executor/recordsWritten y
spark:executor:executor:runTime spark/executor/runTime y
spark:executor:executor:shuffleRecordsRead spark/executor/shuffleRecordsRead y
spark:executor:executor:shuffleRecordsWritten spark/executor/shuffleRecordsWritten y

Flink metrics

Metric Metrics Explorer name Enabled by default
flink:jobmanager:numRegisteredTaskManagers flink/jobmanager/numRegisteredTaskManagers n
flink:jobmanager:numRunningJobs flink/jobmanager/numRunningJobs n
flink:jobmanager:Status.JVM.ClassLoader.ClassesLoaded flink/jobmanager/Status.JVM.ClassLoader.ClassesLoaded n
flink:jobmanager:Status.JVM.ClassLoader.ClassesUnloaded flink/jobmanager/Status.JVM.ClassLoader.ClassesUnloaded n
flink:jobmanager:Status.JVM.CPU.Load flink/jobmanager/Status.JVM.CPU.Load n
flink:jobmanager:Status.JVM.CPU.Time flink/jobmanager/Status.JVM.CPU.Time y
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count n
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time n
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Count flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Count n
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Time flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Time n
flink:jobmanager:Status.JVM.Memory.Direct.Count flink/jobmanager/Status.JVM.Memory.Direct.Count y
flink:jobmanager:Status.JVM.Memory.Direct.MemoryUsed flink/jobmanager/Status.JVM.Memory.Direct.MemoryUsed y
flink:jobmanager:Status.JVM.Memory.Direct.TotalCapacity flink/jobmanager/Status.JVM.Memory.Direct.TotalCapacity y
flink:jobmanager:Status.JVM.Memory.Heap.Committed flink/jobmanager/Status.JVM.Memory.Heap.Committed y
flink:jobmanager:Status.JVM.Memory.Heap.Max flink/jobmanager/Status.JVM.Memory.Heap.Max y
flink:jobmanager:Status.JVM.Memory.Heap.Used flink/jobmanager/Status.JVM.Memory.Heap.Used y
flink:jobmanager:Status.JVM.Memory.Mapped.Count flink/jobmanager/Status.JVM.Memory.Mapped.Count y
flink:jobmanager:Status.JVM.Memory.Mapped.MemoryUsed flink/jobmanager/Status.JVM.Memory.Mapped.MemoryUsed y
flink:jobmanager:Status.JVM.Memory.Mapped.TotalCapacity flink/jobmanager/Status.JVM.Memory.Mapped.TotalCapacity y
flink:jobmanager:Status.JVM.Memory.Metaspace.Committed flink/jobmanager/Status.JVM.Memory.Metaspace.Committed n
flink:jobmanager:Status.JVM.Memory.Metaspace.Max flink/jobmanager/Status.JVM.Memory.Metaspace.Max n
flink:jobmanager:Status.JVM.Memory.Metaspace.Used flink/jobmanager/Status.JVM.Memory.Metaspace.Used n
flink:jobmanager:Status.JVM.Memory.NonHeap.Committed flink/jobmanager/Status.JVM.Memory.NonHeap.Committed n
flink:jobmanager:Status.JVM.Memory.NonHeap.Max flink/jobmanager/Status.JVM.Memory.NonHeap.Max n
flink:jobmanager:Status.JVM.Memory.NonHeap.Used flink/jobmanager/Status.JVM.Memory.NonHeap.Used n
flink:jobmanager:Status.JVM.Threads.Count flink/jobmanager/Status.JVM.Threads.Count n
flink:jobmanager:taskSlotsAvailable flink/jobmanager/taskSlotsAvailable y
flink:jobmanager:taskSlotsTotal flink/jobmanager/taskSlotsTotal y
flink:operator:numRecordsIn flink/operator/numRecordsIn n
flink:operator:numRecordsInPerSecond.count flink/operator/numRecordsInPerSecond.count n
flink:operator:numRecordsInPerSecond.rate flink/operator/numRecordsInPerSecond.rate n
flink:operator:numRecordsOut flink/operator/numRecordsOut n
flink:operator:numRecordsOutPerSecond.count flink/operator/numRecordsOutPerSecond.count n
flink:operator:numRecordsOutPerSecond.rate flink/operator/numRecordsOutPerSecond.rate n
flink:operator:numSplitsProcessed flink/operator/numSplitsProcessed n
flink:task:buffers.inPoolUsage flink/task/buffers.inPoolUsage n
flink:task:buffers.inputExclusiveBuffersUsage flink/task/buffers.inputExclusiveBuffersUsage n
flink:task:buffers.inputFloatingBuffersUsage flink/task/buffers.inputFloatingBuffersUsage n
flink:task:buffers.inputQueueLength flink/task/buffers.inputQueueLength n
flink:task:buffers.outPoolUsage flink/task/buffers.outPoolUsage n
flink:task:buffers.outputQueueLength flink/task/buffers.outputQueueLength n
flink:task:idleTimeMsPerSecond.count flink/task/idleTimeMsPerSecond.count n
flink:task:idleTimeMsPerSecond.rate flink/task/idleTimeMsPerSecond.rate n
flink:task:numBuffersInLocal flink/task/numBuffersInLocal n
flink:task:numBuffersInLocalPerSecond.count flink/task/numBuffersInLocalPerSecond.count n
flink:task:numBuffersInLocalPerSecond.rate flink/task/numBuffersInLocalPerSecond.rate n
flink:task:numBuffersInRemote flink/task/numBuffersInRemote n
flink:task:numBuffersInRemotePerSecond.count flink/task/numBuffersInRemotePerSecond.count n
flink:task:numBuffersInRemotePerSecond.rate flink/task/numBuffersInRemotePerSecond.rate n
flink:task:numBuffersOut flink/task/numBuffersOut n
flink:task:numBuffersOutPerSecond.count flink/task/numBuffersOutPerSecond.count n
flink:task:numBuffersOutPerSecond.rate flink/task/numBuffersOutPerSecond.rate n
flink:task:numBytesIn flink/task/numBytesIn n
flink:task:numBytesInLocal flink/task/numBytesInLocal n
flink:task:numBytesInLocalPerSecond.count flink/task/numBytesInLocalPerSecond.count n
flink:task:numBytesInLocalPerSecond.rate flink/task/numBytesInLocalPerSecond.rate n
flink:task:numBytesInPerSecond.count flink/task/numBytesInPerSecond.count n
flink:task:numBytesInPerSecond.rate flink/task/numBytesInPerSecond.rate n
flink:task:numBytesInRemote flink/task/numBytesInRemote n
flink:task:numBytesInRemotePerSecond.count flink/task/numBytesInRemotePerSecond.count n
flink:task:numBytesInRemotePerSecond.rate flink/task/numBytesInRemotePerSecond.rate n
flink:task:numBytesOut flink/task/numBytesOut n
flink:task:numBytesOutPerSecond.count flink/task/numBytesOutPerSecond.count n
flink:task:numBytesOutPerSecond.rate flink/task/numBytesOutPerSecond.rate n
flink:task:numRecordsIn flink/task/numRecordsIn n
flink:task:numRecordsInPerSecond.count flink/task/numRecordsInPerSecond.count n
flink:task:numRecordsInPerSecond.rate flink/task/numRecordsInPerSecond.rate n
flink:task:numRecordsOut flink/task/numRecordsOut n
flink:task:numRecordsOutPerSecond.count flink/task/numRecordsOutPerSecond.count n
flink:task:numRecordsOutPerSecond.rate flink/task/numRecordsOutPerSecond.rate n
flink:task:Shuffle.Netty.Input.Buffers.inPoolUsage flink/task/Shuffle.Netty.Input.Buffers.inPoolUsage n
flink:task:Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage flink/task/Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage n
flink:task:Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage flink/task/Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage n
flink:task:Shuffle.Netty.Input.Buffers.inputQueueLength flink/task/Shuffle.Netty.Input.Buffers.inputQueueLength n
flink:task:Shuffle.Netty.Input.numBuffersInLocal flink/task/Shuffle.Netty.Input.numBuffersInLocal n
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.count flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.count n
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate n
flink:task:Shuffle.Netty.Input.numBuffersInRemote flink/task/Shuffle.Netty.Input.numBuffersInRemote n
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.count flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.count n
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate n
flink:task:Shuffle.Netty.Input.numBytesInLocal flink/task/Shuffle.Netty.Input.numBytesInLocal n
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.count flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.count n
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.rate flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.rate n
flink:task:Shuffle.Netty.Input.numBytesInRemote flink/task/Shuffle.Netty.Input.numBytesInRemote n
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.count flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.count n
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.rate flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.rate n
flink:task:Shuffle.Netty.Output.Buffers.outPoolUsage flink/task/Shuffle.Netty.Output.Buffers.outPoolUsage n
flink:task:Shuffle.Netty.Output.Buffers.outputQueueLength flink/task/Shuffle.Netty.Output.Buffers.outputQueueLength n
flink:taskmanager:Status.flink.Memory.Managed.Total flink/taskmanager/Status.flink.Memory.Managed.Total n
flink:taskmanager:Status.flink.Memory.Managed.Used flink/taskmanager/Status.flink.Memory.Managed.Used n
flink:taskmanager:Status.JVM.ClassLoader.ClassesLoaded flink/taskmanager/Status.JVM.ClassLoader.ClassesLoaded n
flink:taskmanager:Status.JVM.ClassLoader.ClassesUnloaded flink/taskmanager/Status.JVM.ClassLoader.ClassesUnloaded n
flink:taskmanager:Status.JVM.CPU.Load flink/taskmanager/Status.JVM.CPU.Load n
flink:taskmanager:Status.JVM.CPU.Time flink/taskmanager/Status.JVM.CPU.Time y
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count n
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time n
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Count flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Count n
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Time flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Time n
flink:taskmanager:Status.JVM.Memory.Direct.Count flink/taskmanager/Status.JVM.Memory.Direct.Count y
flink:taskmanager:Status.JVM.Memory.Direct.MemoryUsed flink/taskmanager/Status.JVM.Memory.Direct.MemoryUsed y
flink:taskmanager:Status.JVM.Memory.Direct.TotalCapacity flink/taskmanager/Status.JVM.Memory.Direct.TotalCapacity y
flink:taskmanager:Status.JVM.Memory.Heap.Committed flink/taskmanager/Status.JVM.Memory.Heap.Committed y
flink:taskmanager:Status.JVM.Memory.Heap.Max flink/taskmanager/Status.JVM.Memory.Heap.Max y
flink:taskmanager:Status.JVM.Memory.Heap.Used flink/taskmanager/Status.JVM.Memory.Heap.Used y
flink:taskmanager:Status.JVM.Memory.Mapped.Count flink/taskmanager/Status.JVM.Memory.Mapped.Count y
flink:taskmanager:Status.JVM.Memory.Mapped.MemoryUsed flink/taskmanager/Status.JVM.Memory.Mapped.MemoryUsed y
flink:taskmanager:Status.JVM.Memory.Mapped.TotalCapacity flink/taskmanager/Status.JVM.Memory.Mapped.TotalCapacity y
flink:taskmanager:Status.JVM.Memory.Metaspace.Committed flink/taskmanager/Status.JVM.Memory.Metaspace.Committed n
flink:taskmanager:Status.JVM.Memory.Metaspace.Max flink/taskmanager/Status.JVM.Memory.Metaspace.Max n
flink:taskmanager:Status.JVM.Memory.Metaspace.Used flink/taskmanager/Status.JVM.Memory.Metaspace.Used n
flink:taskmanager:Status.JVM.Memory.NonHeap.Committed flink/taskmanager/Status.JVM.Memory.NonHeap.Committed n
flink:taskmanager:Status.JVM.Memory.NonHeap.Max flink/taskmanager/Status.JVM.Memory.NonHeap.Max n
flink:taskmanager:Status.JVM.Memory.NonHeap.Used flink/taskmanager/Status.JVM.Memory.NonHeap.Used n
flink:taskmanager:Status.JVM.Threads.Count flink/taskmanager/Status.JVM.Threads.Count n
flink:taskmanager:Status.Network.AvailableMemorySegments flink/taskmanager/Status.Network.AvailableMemorySegments n
flink:taskmanager:Status.Network.TotalMemorySegments flink/taskmanager/Status.Network.TotalMemorySegments n
flink:taskmanager:Status.Shuffle.Netty.AvailableMemory flink/taskmanager/Status.Shuffle.Netty.AvailableMemory n
flink:taskmanager:Status.Shuffle.Netty.AvailableMemorySegments flink/taskmanager/Status.Shuffle.Netty.AvailableMemorySegments n
flink:taskmanager:Status.Shuffle.Netty.TotalMemory flink/taskmanager/Status.Shuffle.Netty.TotalMemory n
flink:taskmanager:Status.Shuffle.Netty.TotalMemorySegments flink/taskmanager/Status.Shuffle.Netty.TotalMemorySegments n
flink:taskmanager:Status.Shuffle.Netty.UsedMemory flink/taskmanager/Status.Shuffle.Netty.UsedMemory n
flink:taskmanager:Status.Shuffle.Netty.UsedMemorySegments flink/taskmanager/Status.Shuffle.Netty.UsedMemorySegments n

Spark History Server metrics

Dataproc collects the following Spark History Server JVM memory metrics:

Metric Metrics Explorer name Enabled by default
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.committed sparkHistoryServer/memory/CommittedHeapMemory y
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.used sparkHistoryServer/memory/UsedHeapMemory y
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.max sparkHistoryServer/memory/MaxHeapMemory y
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed sparkHistoryServer/memory/CommittedNonHeapMemory y
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.used sparkHistoryServer/memory/UsedNonHeapMemory y
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.max sparkHistoryServer/memory/MaxNonHeapMemory y

HiveServer2 metrics

Metric Metrics Explorer name Enabled by default
hiveserver2:JVM:Memory:HeapMemoryUsage.committed hiveserver2/memory/CommittedHeapMemory y
hiveserver2:JVM:Memory:HeapMemoryUsage.used hiveserver2/memory/UsedHeapMemory y
hiveserver2:JVM:Memory:HeapMemoryUsage.max hiveserver2/memory/MaxHeapMemory y
hiveserver2:JVM:Memory:NonHeapMemoryUsage.committed hiveserver2/memory/CommittedNonHeapMemory y
hiveserver2:JVM:Memory:NonHeapMemoryUsage.used hiveserver2/memory/UsedNonHeapMemory y
hiveserver2:JVM:Memory:NonHeapMemoryUsage.max hiveserver2/memory/MaxNonHeapMemory y

Hive metastore metrics

Metric Metrics Explorer name Enabled by default
hivemetastore:API:GetDatabase:Mean hivemetastore/get_database/mean y
hivemetastore:API:CreateDatabase:Mean hivemetastore/create_database/mean y
hivemetastore:API:DropDatabase:Mean hivemetastore/drop_database/mean y
hivemetastore:API:AlterDatabase:Mean hivemetastore/alter_database/mean y
hivemetastore:API:GetAllDatabases:Mean hivemetastore/get_all_databases/mean y
hivemetastore:API:CreateTable:Mean hivemetastore/create_table/mean y
hivemetastore:API:DropTable:Mean hivemetastore/drop_table/mean y
hivemetastore:API:AlterTable:Mean hivemetastore/alter_table/mean y
hivemetastore:API:GetTable:Mean hivemetastore/get_table/mean y
hivemetastore:API:GetAllTables:Mean hivemetastore/get_all_tables/mean y
hivemetastore:API:AddPartitionsReq:Mean hivemetastore/add_partitions_req/mean y
hivemetastore:API:DropPartition:Mean hivemetastore/drop_partition/mean y
hivemetastore:API:AlterPartition:Mean hivemetastore/alter_partition/mean y
hivemetastore:API:GetPartition:Mean hivemetastore/get_partition/mean y
hivemetastore:API:GetPartitionNames:Mean hivemetastore/get_partition_names/mean y
hivemetastore:API:GetPartitionsPs:Mean hivemetastore/get_partitions_ps/mean y
hivemetastore:API:GetPartitionsPsWithAuth:Mean hivemetastore/get_partitions_ps_with_auth/mean y

Hive metastore metric measures

Metric measure Example metric Example metric name
Max hivemetastore:API:GetDatabase:Max hivemetastore/get_database/max
Min hivemetastore:API:GetDatabase:Min hivemetastore/get_database/min
Mean hivemetastore:API:GetDatabase:Mean hivemetastore/get_database/mean
Count hivemetastore:API:GetDatabase:Count hivemetastore/get_database/count
50thPercentile hivemetastore:API:GetDatabase:50thPercentile hivemetastore/get_database/median
75thPercentile hivemetastore:API:GetDatabase:75thPercentile hivemetastore/get_database/75th_percentile
95thPercentile hivemetastore:API:GetDatabase:95thPercentile hivemetastore/get_database/95th_percentile
98thPercentile hivemetastore:API:GetDatabase:98thPercentile hivemetastore/get_database/98th_percentile
99thPercentile hivemetastore:API:GetDatabase:99thPercentile hivemetastore/get_database/99th_percentile
999thPercentile hivemetastore:API:GetDatabase:999thPercentile hivemetastore/get_database/999th_percentile
StdDev hivemetastore:API:GetDatabase:StdDev hivemetastore/get_database/stddev
FifteenMinuteRate hivemetastore:API:GetDatabase:FifteenMinuteRate hivemetastore/get_database/15min_rate
FiveMinuteRate hivemetastore:API:GetDatabase:FiveMinuteRate hivemetastore/get_database/5min_rate
OneMinuteRate hivemetastore:API:GetDatabase:OneMinuteRate hivemetastore/get_database/1min_rate
MeanRate hivemetastore:API:GetDatabase:MeanRate hivemetastore/get_database/mean_rate
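
The two Hive metastore tables suggest a regular mapping from an override name to its Metrics Explorer name: the API name goes from camel case to snake case, and the measure is looked up in the table above. The sketch below encodes that observed pattern. The function explorer_name is hypothetical, and the mapping is inferred from the rows on this page rather than from a published rule.

```python
import re

# Measure suffixes copied from the table above.
MEASURES = {
    "Max": "max", "Min": "min", "Mean": "mean", "Count": "count",
    "50thPercentile": "median", "75thPercentile": "75th_percentile",
    "95thPercentile": "95th_percentile", "98thPercentile": "98th_percentile",
    "99thPercentile": "99th_percentile", "999thPercentile": "999th_percentile",
    "StdDev": "stddev", "FifteenMinuteRate": "15min_rate",
    "FiveMinuteRate": "5min_rate", "OneMinuteRate": "1min_rate",
    "MeanRate": "mean_rate",
}


def explorer_name(override: str) -> str:
    """Map hivemetastore:API:<ApiName>:<Measure> to its Metrics Explorer name."""
    source, group, api, measure = override.split(":")
    if source != "hivemetastore" or group != "API":
        raise ValueError(f"not a Hive metastore API metric: {override!r}")
    # Insert "_" before every capital letter except the first, then lowercase.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", api).lower()
    return f"hivemetastore/{snake}/{MEASURES[measure]}"


print(explorer_name("hivemetastore:API:GetDatabase:50thPercentile"))
```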

Dataproc monitoring agent metrics

Dataproc collects the following Dataproc monitoring agent metrics when --metric-sources=monitoring-agent-defaults is set. These metrics are published with the agent.googleapis.com prefix.

CPU
agent.googleapis.com/cpu/load_15m
agent.googleapis.com/cpu/load_1m
agent.googleapis.com/cpu/load_5m
agent.googleapis.com/cpu/usage_time*
agent.googleapis.com/cpu/utilization*

Disk
agent.googleapis.com/disk/bytes_used
agent.googleapis.com/disk/io_time
agent.googleapis.com/disk/merged_operations
agent.googleapis.com/disk/operation_count
agent.googleapis.com/disk/operation_time
agent.googleapis.com/disk/pending_operations
agent.googleapis.com/disk/percent_used
agent.googleapis.com/disk/read_bytes_count

Swap
agent.googleapis.com/swap/bytes_used
agent.googleapis.com/swap/io
agent.googleapis.com/swap/percent_used

Memory
agent.googleapis.com/memory/bytes_used
agent.googleapis.com/memory/percent_used

Processes (some attributes follow a distinct quota policy)
agent.googleapis.com/processes/count_by_state
agent.googleapis.com/processes/cpu_time
agent.googleapis.com/processes/disk/read_bytes_count
agent.googleapis.com/processes/disk/write_bytes_count
agent.googleapis.com/processes/fork_count
agent.googleapis.com/processes/rss_usage
agent.googleapis.com/processes/vm_usage

Interface
agent.googleapis.com/interface/errors
agent.googleapis.com/interface/packets
agent.googleapis.com/interface/traffic

Network
agent.googleapis.com/network/tcp_connections

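Build a Monitoring dashboard

You can build a Monitoring dashboard that displays charts of selected Dataproc metrics.

  1. On the Monitoring "Dashboards Overview" page, select "+ Create Dashboard". Name the dashboard, and then click "Add Chart" in the top-right menu to open the "Add Chart" window. Select "Cloud Dataproc Cluster" as the resource type. Select one or more metrics and the chart properties, and then save the chart.

  2. You can add more charts to the dashboard. After you save the dashboard, its title appears on the Monitoring "Dashboards Overview" page. You can view, update, and delete dashboard charts from the dashboard display page.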

What's next