When your Google Kubernetes Engine (GKE) clusters or applications experience issues, it's crucial to quickly determine if the cause is internal or related to a wider Google Cloud service disruption. Spending time on local debugging is inefficient if the root cause is a known platform incident.
Use this page to determine if an issue with your GKE cluster is caused by a wider Google Cloud service disruption. Learn where to find official status updates, personalized health events, and service incident insights from the following sources:
- Google Cloud Service Health: status information for Google Cloud services, by region.
- Personalized Service Health: service disruptions relevant to your projects.
- Service incident insights and recommendations: GKE clusters that are affected by an ongoing service incident.
This information is important for Platform admins and operators, and for Application developers, who are troubleshooting and need to understand whether the issues they observe are linked to a broader Google Cloud service health event. For more information about the common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.
Review Google Cloud service health
The Google Cloud Service Health page provides status information about the services that are part of Google Cloud.
To review incidents related to GKE, go to the Google Cloud Service Health page.
Go to all incidents reported for Google Kubernetes Engine
Review Personalized Service Health
Personalized Service Health lets you identify Google Cloud service disruptions that are relevant to your projects. These disruptions are called service health events, and information about them is available in the Google Cloud console and a variety of integration points.
To review incidents related to GKE that are relevant to your projects, view service health events in the Personalized Service Health dashboard in the Google Cloud console.
Go to Personalized Service Health
You can filter incidents by service, location, relevance, and status. The dashboard also provides incident details such as scope of impact, symptoms, workarounds, and resolution progress updates. To get started, see Quickstart: View service health events in the Google Cloud console.
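If you export service health events for scripted triage, the same four filters can be applied offline. The following is a minimal sketch, not an official client: it assumes that events are plain dictionaries whose field names (`state`, `relevance`, `eventImpacts`, `product.productName`, `location.locationName`) are modeled on the Service Health API event resource, and the sample events are hypothetical. Verify the field names against the API reference before relying on them.

```python
from typing import Any, Optional


def filter_events(
    events: list[dict[str, Any]],
    service: Optional[str] = None,
    location: Optional[str] = None,
    relevance: Optional[str] = None,
    state: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Return the events that match every filter that was provided."""

    def impact_matches(impact: dict[str, Any]) -> bool:
        # A single impact entry must satisfy both the service and the
        # location filter (when each is given).
        product = impact.get("product", {}).get("productName")
        place = impact.get("location", {}).get("locationName")
        return (service is None or product == service) and (
            location is None or place == location
        )

    matches = []
    for event in events:
        if relevance is not None and event.get("relevance") != relevance:
            continue
        if state is not None and event.get("state") != state:
            continue
        if service is not None or location is not None:
            if not any(impact_matches(i) for i in event.get("eventImpacts", [])):
                continue
        matches.append(event)
    return matches


GKE = "Google Kubernetes Engine"

# Hypothetical sample data for illustration only; not real incidents.
SAMPLE_EVENTS = [
    {
        "title": "Hypothetical GKE incident",
        "state": "ACTIVE",
        "relevance": "IMPACTED",
        "eventImpacts": [
            {"product": {"productName": GKE},
             "location": {"locationName": "us-central1"}},
        ],
    },
    {
        "title": "Hypothetical storage incident",
        "state": "CLOSED",
        "relevance": "NOT_IMPACTED",
        "eventImpacts": [
            {"product": {"productName": "Cloud Storage"},
             "location": {"locationName": "europe-west1"}},
        ],
    },
    {
        "title": "Hypothetical older GKE incident",
        "state": "CLOSED",
        "relevance": "RELATED",
        "eventImpacts": [
            {"product": {"productName": GKE},
             "location": {"locationName": "europe-west1"}},
        ],
    },
]

if __name__ == "__main__":
    for event in filter_events(SAMPLE_EVENTS, service=GKE, state="ACTIVE"):
        print(event["title"])
```

Matching the service and location on the same impact entry avoids a false positive where one product is affected in one region and a different product in another.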
Review service incident insights and recommendations
Service incident insights and recommendations let you identify GKE clusters that are impacted by an ongoing service incident.
To get service incident insights, view insights and recommendations for the GKE_RELIABILITY_INCIDENT subtype. You can get insights by using the Google Cloud console, the Google Cloud CLI, or the Recommender API. For more information, see View insights and recommendations.
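As an illustration of the Google Cloud CLI route, the following sketch builds a `gcloud recommender insights list` command and keeps only the insights whose subtype is GKE_RELIABILITY_INCIDENT. It rests on assumptions that you should confirm against View insights and recommendations: the insight type `google.container.DiagnosisInsight`, the `insightSubtype` field name in the JSON output, and the helper names, which are hypothetical.

```python
import json
import subprocess
from typing import Any

# Assumption: GKE diagnosis insights are published under this insight type.
INSIGHT_TYPE = "google.container.DiagnosisInsight"
INCIDENT_SUBTYPE = "GKE_RELIABILITY_INCIDENT"


def build_list_command(project_id: str, location: str) -> list[str]:
    """Build the gcloud command that lists insights as JSON."""
    return [
        "gcloud", "recommender", "insights", "list",
        f"--insight-type={INSIGHT_TYPE}",
        f"--location={location}",
        f"--project={project_id}",
        "--format=json",
    ]


def incident_insights(insights: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the insights that describe a service incident."""
    return [i for i in insights if i.get("insightSubtype") == INCIDENT_SUBTYPE]


def fetch_incident_insights(project_id: str, location: str) -> list[dict[str, Any]]:
    """Run gcloud (must be installed and authenticated) and filter the result."""
    result = subprocess.run(
        build_list_command(project_id, location),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if gcloud exits with an error
    )
    # An empty result may print nothing, so fall back to an empty JSON list.
    return incident_insights(json.loads(result.stdout or "[]"))
```

For example, `fetch_incident_insights("my-project", "us-central1")` would return the matching insights for that hypothetical project and location. Filtering in Python rather than with a `--filter` expression keeps the subtype check easy to test without calling gcloud.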
Insights and recommendations include the following information:
- Impacted cluster: a cluster that's impacted by the incident.
- Incident name: an incident identifier for reference when you communicate with Cloud Customer Care.
- Incident description: information about the incident from the incident response team.
- Last effective time: the last time that information about the incident was updated.
- Mitigation action: mitigation action that's recommended by the incident response team, if available.
The service incident insight remains visible until the Google Cloud incident response team mitigates the incident and determines that the insight is no longer relevant. Expect a delay between the time that the incident is mitigated and stops impacting your resources and the time that the insight is removed. If you implemented a workaround and no longer want to see the insight, you can dismiss it.
What's next
Read Assess cluster and workload health in the Google Cloud console (the next page in this series).
For advice about resolving specific problems, review GKE's troubleshooting guides.
If you can't find a solution to your problem in the documentation, see Get support for further help, including advice on the following topics:
- Opening a support case by contacting Cloud Customer Care.
- Getting support from the community by asking questions on StackOverflow and using the google-kubernetes-engine tag to search for similar issues. You can also join the #kubernetes-engine Slack channel for more community support.
- Opening bugs or feature requests by using the public issue tracker.