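Use Google Cloud Managed Service for Prometheus to collect configured third-party and user-defined metrics and then send them to Cloud Monitoring. Google Cloud Managed Service for Prometheus lets you monitor and alert on your workloads using Prometheus, without manually managing and operating Prometheus at scale.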
Last updated (UTC): 2025-08-08.

# Observability for GKE

[Autopilot](/kubernetes-engine/docs/concepts/autopilot-overview) [Standard](/kubernetes-engine/docs/concepts/choose-cluster-mode)

***

This page describes how you can understand the health of your applications and
maintain application availability and reliability.

Default observability features
------------------------------

By default, GKE clusters are configured to do the following:

- Send [system logs, audit logs, and application logs](/kubernetes-engine/docs/concepts/about-logs#what_logs) to Cloud Logging.
- Send [system metrics](/kubernetes-engine/docs/how-to/configure-metrics#system-metrics) to Cloud Monitoring.
- Use [Google Cloud Managed Service for Prometheus](/stackdriver/docs/managed-prometheus) to collect configured third-party and user-defined metrics and then send them to Cloud Monitoring. Google Cloud Managed Service for Prometheus lets you monitor and alert on your workloads using Prometheus, without manually managing and operating Prometheus at scale.

Customize and enhance data collection
-------------------------------------

By default, GKE creates a Logging repository for
storing logs for each cluster. You can control which logs and which metrics, if
any, are sent from your GKE cluster to Cloud Logging and
Cloud Monitoring.

You can also control whether to enable
[Google Cloud Managed Service for Prometheus](/stackdriver/docs/managed-prometheus).

For GKE Autopilot clusters, you cannot disable
the Cloud Monitoring and Cloud Logging integration.

### Additional observability metrics

You can collect additional observability metrics by enabling one or more
[observability metrics packages](/kubernetes-engine/docs/how-to/configure-metrics).

- [Control plane metrics](/kubernetes-engine/docs/how-to/control-plane-metrics): Monitor the health of Kubernetes components by collecting metrics for the Kubernetes API server, Scheduler, and Controller Manager. These metrics are useful signals of service health for defining service level objectives (SLOs).
- [Kube state metrics](/kubernetes-engine/docs/how-to/kube-state-metrics): Monitor the health of Kubernetes objects such as Deployments, Nodes, and Pods.
- [cAdvisor/Kubelet metrics](/kubernetes-engine/docs/how-to/cadvisor-kubelet-metrics): Monitor the health of containers and the kubelet.

If you have enabled GKE Enterprise in your project, these
metrics are
[enabled by default](/kubernetes-engine/docs/how-to/configure-metrics#default-metrics-enterprise)
when you
[register to a fleet](/kubernetes-engine/fleet-management/docs/register/gke#register_your_cluster)
during cluster creation.

### Third-party and user-defined metrics

To monitor third-party applications running on your clusters such as Postgres,
MongoDB, and Redis, use
[Prometheus exporters](/stackdriver/docs/managed-prometheus/exporters/introduction)
with Google Cloud Managed Service for Prometheus.

You can also
[write custom exporters](https://prometheus.io/docs/instrumenting/writing_exporters/)
to monitor other signals of health and performance.

Use collected data
------------------

Use the data you collect to analyze application health, debug, troubleshoot,
and test as you develop, deploy, and maintain your applications.

GKE provides built-in observability features to get you started
quickly:

- View collected data for your clusters and workloads in GKE
  [observability dashboards](/kubernetes-engine/docs/how-to/view-observability-metrics).
  You can customize the provided dashboards for the following purposes:

  - View key cluster metrics, such as CPU utilization, memory utilization, and the number of open incidents.
  - View clusters by their infrastructure, workloads, or Services.
  - Inspect namespaces, Nodes, workloads, Services, Pods, and containers.
  - For Pods and containers, view metrics as a function of time and view log entries.

  You can also [create your own dashboards](/monitoring/charts/dashboards) or
  [import Grafana dashboards](/monitoring/dashboards/import-grafana-dashboards)
  to meet your needs.

  > **Note:** The provided GKE dashboards only display information for GKE clusters running on Google Cloud. They don't display information for GKE clusters running anywhere else, for example using on-premises or bare-metal servers.

- From the **Observability** tab, you can create recommended alert policies so
  that you are notified about issues. To learn more about alerting, see the
  [Alerting overview](/monitoring/alerts).

- [Create SLOs](/stackdriver/docs/solutions/slo-monitoring/ui/create-slo) to
  monitor your service performance goals using collected GKE
  metrics.

- Use GKE playbooks to troubleshoot common issues such as
  [unschedulable Pods](/kubernetes-engine/docs/troubleshooting/deployed-workloads#PodUnschedulable)
  and
  [containers that repeatedly crash after restart](/kubernetes-engine/docs/troubleshooting/deployed-workloads#CrashLoopBackOff).

- Explore and analyze your data with tools such as
  [Logs Explorer](/logging/docs/view/logs-explorer-interface),
  [Metrics Explorer](/monitoring/charts/metrics-explorer), and
  [Error Reporting](/error-reporting/docs/grouping-errors).

- Review GKE
  [audit logs](/kubernetes-engine/docs/how-to/audit-logging) that record
  administrative activities and accesses as part of
  [Cloud Audit Logs](/logging/docs/audit).
  The [audit log policy](/kubernetes-engine/docs/concepts/audit-policy) determines
  which events are recorded and whether a log entry belongs to an Admin Activity
  log or a Data Access log.

Other features
--------------

GKE integrates with other Google Cloud services to help you
monitor and manage your clusters and workloads.

- Use the
  [security posture dashboard](/kubernetes-engine/docs/concepts/about-security-posture-dashboard)
  to identify security concerns based on our standards and industry best
  practices.

- View
  [insights and recommendations](/kubernetes-engine/docs/how-to/optimize-with-recommenders)
  to optimize your clusters.

- Use
  [network policy logging](/kubernetes-engine/docs/how-to/network-policy-logging)
  to help you troubleshoot issues with Kubernetes network policies. If you use
  [GKE Dataplane V2](/kubernetes-engine/docs/concepts/dataplane-v2), then
  network policy logging is built in.

Pricing
-------

Pricing for integration with Cloud Logging (including
Cloud Audit Logs), Cloud Monitoring, and Google Cloud Managed Service for Prometheus is
based on the volume of logs and metrics collected. See the
[Pricing](/stackdriver/pricing) page for details.

Features provided by other Google Cloud services listed in
[Other features](#other-features) have separate pricing. See the Pricing section
of those documentation pages for more information.

What's next
-----------

- [Observe your clusters](/kubernetes-engine/docs/how-to/view-observability-metrics).
  Learn how to view dashboards, organize cluster information, and view alerting
  details.

- [Enable verbose, OS-level audit logging](/kubernetes-engine/docs/how-to/linux-auditd-logging)
  on GKE cluster nodes and learn how to export logs to
  Cloud Logging.

- For more information about how to use observability features to troubleshoot
  GKE, see
  [Introduction to GKE troubleshooting](/kubernetes-engine/docs/troubleshooting/introduction).
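
As an illustration of the custom exporters mentioned under "Third-party and user-defined metrics", the following is a minimal sketch of an HTTP endpoint that serves metrics in the Prometheus text exposition format, using only the Python standard library. The metric names, help strings, values, the port `8000`, and the `/metrics` path are all hypothetical choices made for this example. The scrape configuration that tells Google Cloud Managed Service for Prometheus to collect from this endpoint is not shown, and a production exporter would more likely use an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metrics: name -> (type, help text, value).
# A real exporter would read these values from the monitored application.
METRICS = {
    "myapp_queue_depth": ("gauge", "Number of jobs waiting in the queue.", 7),
    "myapp_requests_total": ("counter", "Total requests handled.", 1024),
}


def render_metrics(metrics):
    """Render metrics as Prometheus text exposition format."""
    lines = []
    for name in sorted(metrics):
        metric_type, help_text, value = metrics[name]
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the rendered metrics at /metrics and 404 everywhere else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the exporter and block; the port is an assumption for this sketch."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint, after which a Prometheus-compatible scraper can read `http://<pod-ip>:8000/metrics`. Keeping `render_metrics` separate from the HTTP handler makes the output format easy to test without opening a socket.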