The act of troubleshooting is both an art and a science. The constant effort of Apigee technical support teams has been to demystify the art and expose the science behind problem identification and resolution.
What are playbooks?
Developed in collaboration with Apigee Technical Support teams, Apigee troubleshooting playbooks are designed to provide quick and effective solutions to errors or other issues that you may encounter when working with Apigee products.
Audience
Troubleshooting playbooks are intended for readers with a high-level understanding of Apigee and its architecture, as well as some understanding of basic concepts such as policies and analytics.
Some problems can be diagnosed and solved only by Apigee hybrid users and may require knowledge of internal components such as Cassandra and Postgres datastores, Message Processors, and Routers.
If you use Apigee, the playbooks clearly specify when you can perform the indicated troubleshooting steps yourself and when you need to contact Google Cloud Customer Care for assistance.
Playbooks
This section describes the current playbooks.
Category | Playbook/Problem description | Error message | Playbook applicable for |
---|---|---|---|
Cassandra | Troubleshooting Cassandra restore | During the Cassandra restoration in Apigee hybrid, you may encounter errors in the restore logs. | Apigee hybrid only |
Automated issue surfacing | No network connectivity between runtime plane and control plane | Apigee API management requests fail: | Apigee hybrid only |
Automated issue surfacing | Virtual host missing environment group | After running kubectl -n apigee get apigeeissues, the AIS_VIRTUALHOST_MISSING_ENVGROUP error is displayed. | Apigee hybrid only |
Automated issue surfacing | Virtual host missing selector | After running kubectl -n apigee get apigeeissues, the AIS_VIRTUALHOST_MISSING_SELECTOR error is displayed. | Apigee hybrid only |
Automated issue surfacing | Ingress cert mismatch | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_CERT_MISMATCH error is displayed. | Apigee hybrid only |
Automated issue surfacing | Ingress cert expiry | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_CERT_EXPIREY error is displayed. | Apigee hybrid only |
Automated issue surfacing | Ingress mTLS CA cert expiry | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_MTLS_CA_CERT_EXPIREY error is displayed. | Apigee hybrid only |
Automated issue surfacing | Ingress mTLS CA cert invalid | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_MTLS_CA_CERT_INVALID error is displayed. | Apigee hybrid only |
Cassandra | Cassandra data replication failure | When replicating data during a multi-region expansion, the CassandraDataReplication status may show an error state and data replication may fail. | Apigee hybrid only |
Cassandra | Cassandra Java heap space issues | Cassandra heap issues may cause slowness in the Apigee hybrid proxy execution or even Datastore errors. Sometimes logs are an early indicator, even before the onset of symptoms. | Apigee hybrid only |
Cassandra | Cassandra pods not starting in the secondary region | Cassandra pods fail to start in one of the regions in a multi-region Hybrid setup. You may see a node already exists error message in the Cassandra pod logs, or a FailedPreStopHook warning in the Cassandra pod status. | Apigee hybrid only |
Cassandra | Cassandra troubleshooting guide | When you use kubectl to view the pod states, you see that one or more Cassandra pods are stuck. This guide describes the diagnosis and resolution for problems with the Cassandra datastore. | Apigee hybrid only |
Deployment | API proxy deployments fail with no active runtime pods warning | The No active runtime pods warning is displayed in the Details dialog next to the error message Deployment issues on ENVIRONMENT: REVISION_NUMBER on the API proxy page. | Apigee hybrid only |
Ingressgateway | API calls fail with timeout errors | curl: (7) Failed to connect to example.apis.com port 443: Operation timed out | Apigee hybrid only |
Ingressgateway | API Calls failing with TLS errors | curl: (35) LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to example.apis.com:443 | Apigee hybrid only |
Logging | Troubleshooting Apigee logs missing from Cloud Logging | No error messages are known to be shown in this scenario. | Apigee and Apigee hybrid |
Management/UI | Inconsistent/no data observed for entities in hybrid UI or through Management APIs | No error messages are known to be shown in this scenario. | Apigee hybrid only |
Network configuration | Access routing issues with Apigee | External clients cannot access or connect to Apigee as expected. Symptoms include either network connectivity failures (TLS handshake fails) or 4xx/5xx responses from Apigee. | Apigee and Apigee hybrid |
Network configuration | Apigee connectivity issues with southbound PSC targets | A network connection issue or a TCP timeout between Apigee and the target service appears as a 503 error response. If you create a debug session, it shows an error similar to the following: {"fault":{"faultstring":"The Service is temporarily unavailable","detail":{"errorcode":"messaging.adaptors.http.flow.ServiceUnavailable","reason":"TARGET_CONNECT_TIMEOUT"}}} | Apigee and Apigee hybrid |
Other | Expanding Istio property replica counts when draining nodes | When draining Istio pods, some nodes may not drain because they have a replica count of 1, while 3 or more replicas are required. To avoid this, set the minimum replica count for each property to at least 3. | Apigee hybrid only |
Other | Message processor troubleshooting guide | One or more apigee-runtime pods are not in the Ready state. When you use kubectl to describe a failed apigee-runtime pod, you see the error: Readiness probe failed: HTTP probe failed with statuscode: 500 | Apigee hybrid only |
Other | Print build info | The buildinfo API returns information about the current build for a runtime component. This information may be useful if you need to contact support. | Apigee hybrid only |
Other | StreamingPull errors 100% | If your metrics dashboard shows that the method google.pubsub.v1.Subscriber.StreamingPull is failing with 100% errors, you can safely ignore the issue. This is expected behavior. | Apigee hybrid only |
Deployment | Instance is not reporting status for environment group | Deployments of API proxies fail with Instance INSTANCE_NAME is not reporting status for environment group ENV_GROUP_NAME error in the Apigee hybrid UI. | Apigee hybrid only |
Deployment | API proxy deployments fail with apigee-serving-cert is not found or expired | API proxy deployments fail with error messages in the apigee-watcher logs. | Apigee hybrid only |
Ingressgateway | Expand Istio property replica counts to avoid problems when draining Istio nodes | When draining Istio pods some nodes may not drain because they have a replica count of 1, while 3 or more replicas are required. In order to avoid this, you should set the minimum replica count for each property to at least 3. | Apigee hybrid only |
Network configuration | No free IP address space troubleshooting | During Apigee provisioning, if you select a network CIDR range that is not completely free, you may see an error message. | Apigee and Apigee hybrid |
Network configuration | VPC Peering 503 Service Unavailable error with TARGET_CONNECT_TIMEOUT | This document describes how to diagnose and correct "503 Service Unavailable" errors with TARGET_CONNECT_TIMEOUT when using VPC peering. | Apigee |
Network configuration | 504 Gateway timeout - Target read timeout | This document describes how to diagnose and correct "504 Gateway Timeout" errors with a TARGET_READ_TIMEOUT reason. | Apigee and Apigee hybrid |
Other | Troubleshooting Apigee hybrid stuck in creating or releasing state | This document describes how to reset Apigee hybrid components when they are stuck in a creating or releasing state. | Apigee hybrid only |
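
Several rows in the table are keyed by a specific identifier, such as an AIS_* issue code or a fault reason like TARGET_CONNECT_TIMEOUT. As a rough illustration of how those identifiers can point you to a playbook, the following Python sketch scans text for known identifiers and returns the matching playbook titles. The helper names `suggest_playbooks` and `fault_reason` and the `PLAYBOOKS` lookup are hypothetical and not part of any Apigee tooling; the identifiers and titles are copied from the table, and the fault body layout is assumed to match the example shown in the southbound PSC targets row.

```python
import json
import re

# Identifier -> playbook titles, copied from the table on this page.
# This lookup is illustrative only; it is not an Apigee API.
PLAYBOOKS = {
    "AIS_VIRTUALHOST_MISSING_ENVGROUP": ["Virtual host missing environment group"],
    "AIS_VIRTUALHOST_MISSING_SELECTOR": ["Virtual host missing selector"],
    "AIS_INGRESS_CERT_MISMATCH": ["Ingress cert mismatch"],
    "AIS_INGRESS_CERT_EXPIREY": ["Ingress cert expiry"],
    "AIS_INGRESS_MTLS_CA_CERT_EXPIREY": ["Ingress mTLS CA cert expiry"],
    "AIS_INGRESS_MTLS_CA_CERT_INVALID": ["Ingress mTLS CA cert invalid"],
    "TARGET_CONNECT_TIMEOUT": [
        "Apigee connectivity issues with southbound PSC targets",
        "VPC Peering 503 Service Unavailable error with TARGET_CONNECT_TIMEOUT",
    ],
    "TARGET_READ_TIMEOUT": ["504 Gateway timeout - Target read timeout"],
}

# Whole upper-case tokens such as AIS_INGRESS_CERT_MISMATCH.
_TOKEN = re.compile(r"[A-Z][A-Z0-9_]+")


def suggest_playbooks(text):
    """Return playbook titles for every known identifier found in text."""
    titles = []
    for token in _TOKEN.findall(text or ""):
        for title in PLAYBOOKS.get(token, []):
            if title not in titles:
                titles.append(title)
    return titles


def fault_reason(body):
    """Return fault.detail.reason from a fault JSON body, or None if absent."""
    try:
        return json.loads(body)["fault"]["detail"]["reason"]
    except (ValueError, KeyError, TypeError):
        return None


if __name__ == "__main__":
    sample = (
        '{"fault":{"faultstring":"The Service is temporarily unavailable",'
        '"detail":{"errorcode":"messaging.adaptors.http.flow.ServiceUnavailable",'
        '"reason":"TARGET_CONNECT_TIMEOUT"}}}'
    )
    for title in suggest_playbooks(fault_reason(sample)):
        print(title)
```

You could also pass the text printed by kubectl -n apigee get apigeeissues to `suggest_playbooks`; the sketch makes no assumption about that output's layout beyond the issue code appearing somewhere in it.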
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-11-06 UTC.