Troubleshoot CRDs with an invalid CA bundle


Custom Resource Definitions (CRDs) are powerful tools for extending Kubernetes capabilities. However, if a CRD contains an invalid or malformed Certificate Authority (CA) bundle in its conversion webhook configuration (the spec.conversion.webhook.clientConfig.caBundle field), it can disrupt cluster operations. The disruption can manifest as errors when you create, update, or delete resources. Google Kubernetes Engine (GKE) monitors your clusters and uses the Recommender service to deliver guidance for how you can optimize your usage of the platform.

To help you ensure that your cluster remains stable and performant, see recommendations from GKE for CRDs that operate but have an invalid CA bundle. Use this guidance to check your potentially misconfigured CRDs and update them, if necessary. To learn more about how to manage insights and recommendations from Recommenders, see Optimize your usage of GKE with insights and recommendations.

Identify impacted clusters

To get insights identifying clusters that are affected by CRDs with invalid CA bundles, follow the instructions to view insights and recommendations for subtype K8S_CRD_WITH_INVALID_CA_BUNDLE. You can get insights in the following ways:

  • Use the Google Cloud console.
  • Use the Google Cloud CLI or the Recommender API, filtering by the subtype K8S_CRD_WITH_INVALID_CA_BUNDLE.
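
As a rough illustration of the CLI route, the following sketch wraps a gcloud recommender insights list call in a small shell function. The function name and its two arguments are hypothetical, and the insight type google.container.DiagnosisInsight is an assumption that you should confirm against the linked instructions for viewing insights before relying on it.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: lists insights with the invalid CA bundle subtype.
# Arguments: $1 = project ID, $2 = cluster location (region or zone).
list_invalid_ca_bundle_insights() {
  local project="$1" location="$2"
  # Assumption: GKE diagnosis insights use this insight type.
  gcloud recommender insights list \
    --insight-type=google.container.DiagnosisInsight \
    --project="$project" \
    --location="$location" \
    --filter="insightSubtype:K8S_CRD_WITH_INVALID_CA_BUNDLE" \
    --format="table(name, insightSubtype, description)"
}
```

For example, list_invalid_ca_bundle_insights my-project us-central1 would query one location in one project; both values are placeholders.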

After you identify the CRDs using the insights, follow the instructions to troubleshoot the misconfigured CA bundle.

When GKE detects misconfigured CRDs

GKE generates an insight and recommendation with the K8S_CRD_WITH_INVALID_CA_BUNDLE subtype if the GKE cluster has one or more CRDs reporting a misconfigured caBundle for the webhook client configuration in spec.conversion.webhook.clientConfig.

Follow the instructions to check CRDs with a misconfigured CA bundle.

Troubleshoot the detected CRDs

The following sections have instructions for you to troubleshoot the CRDs that GKE detected as potentially misconfigured.

After you implement the instructions and the CRDs are correctly configured, the recommendation is resolved within 24 hours and no longer appears in the console. If it has been less than 24 hours since you've implemented the guidance of the recommendation, you can mark the recommendation as resolved. If you don't want to implement the recommendation, you can dismiss it.

Identify affected CRDs in a cluster

  1. View insights and recommendations for subtype K8S_CRD_WITH_INVALID_CA_BUNDLE, choosing one insight at a time to troubleshoot. GKE generates one insight for each cluster that has a broken CRD.

  2. Run the following command to list the CRDs in the cluster together with their conversion webhook CA bundles, so that you can find CRDs with potentially problematic CA bundles:

    kubectl get crd -o custom-columns=NAME:.metadata.name,CABUNDLE:.spec.conversion.webhook.clientConfig.caBundle
    

    The output includes the following columns:

    • NAME: The name of the CRD.
    • CABUNDLE: The CA bundle associated with the CRD's conversion webhook, if present.

    Examine the output. If the CABUNDLE column is empty (kubectl prints <none>) for a CRD that you know uses a conversion webhook, this signals a potential issue with the caBundle.
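
The check described above can be sketched as a pair of shell functions. This is a minimal sketch, not GKE's own detection logic: the function names are hypothetical, the test only confirms that a value decodes from base64 and contains PEM certificate markers, and it assumes that a CRD uses a conversion webhook when its spec.conversion.strategy field is Webhook.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the value decodes from base64 to
# text that contains PEM certificate markers.
looks_like_ca_bundle() {
  local encoded="$1" decoded
  # Empty values and kubectl's "<none>" placeholder mean no bundle is set.
  if [ -z "$encoded" ] || [ "$encoded" = "<none>" ]; then
    return 1
  fi
  # base64 -d exits non-zero when the input is not valid base64.
  decoded="$(printf '%s' "$encoded" | base64 -d 2>/dev/null)" || return 1
  case "$decoded" in
    *"-----BEGIN CERTIFICATE-----"*"-----END CERTIFICATE-----"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints "<crd-name> OK" or "<crd-name> SUSPECT" for
# every CRD whose conversion strategy is Webhook.
report_webhook_crds() {
  kubectl get crd --no-headers \
    -o custom-columns=NAME:.metadata.name,STRATEGY:.spec.conversion.strategy,CABUNDLE:.spec.conversion.webhook.clientConfig.caBundle |
  while read -r name strategy bundle; do
    # Skip CRDs that do not use a conversion webhook.
    [ "$strategy" = "Webhook" ] || continue
    if looks_like_ca_bundle "$bundle"; then
      echo "$name OK"
    else
      echo "$name SUSPECT"
    fi
  done
}
```

A value that passes this check can still hold an unusable certificate. To parse it fully, decode the value and pipe it into openssl x509 -noout -text, which reports an error if the certificate cannot be read.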

Recreate the CRD

To resolve this error, recreate the affected CRD with a valid CA bundle:

  1. Back up any existing custom resources associated with the problematic CRD. Run the following command to export the resources from all namespaces:

      kubectl get <crd-name> --all-namespaces -o yaml > backup.yaml
    
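  2. Delete the existing CRD. Deleting a CRD also deletes every custom resource of that type, so confirm that the backup from the previous step is complete first: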

      kubectl delete crd <crd-name>
    
  3. Ensure that the caBundle field of the CRD contains a well-formed, base64-encoded PEM certificate. You can do this either by editing the CRD directly or by contacting its authors.

  4. Modify the CRD YAML definition, updating the spec.conversion.webhook.clientConfig.caBundle field with the valid CA bundle data. The result should look something like the following:

        spec:
          conversion:
            webhook:
              clientConfig:
                caBundle: <base64-encoded-ca-bundle>
    
  5. Apply the corrected CRD:

        kubectl apply -f <corrected-crd-file.yaml>
    
  6. Restore your custom resources:

        kubectl apply -f backup.yaml
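
Two parts of the procedure above lend themselves to small helpers, sketched here under stated assumptions. The function names and output file names are hypothetical. The first helper also saves the CRD definition itself, which the steps above don't mention but which gives you a file to edit if you don't have the original manifest. The second produces the single-line base64 form of a PEM file for the caBundle field.

```shell
#!/usr/bin/env bash

# Hypothetical helper: saves the CRD definition and its custom resources
# before you delete the CRD.
# Arguments: $1 = CRD name, $2 = output directory (defaults to the current one).
backup_crd() {
  local crd="$1" dir="${2:-.}"
  # The definition itself, as a starting point for the corrected CRD.
  kubectl get crd "$crd" -o yaml > "$dir/$crd.crd.yaml" || return 1
  # The custom resources of that type, across all namespaces.
  kubectl get "$crd" --all-namespaces -o yaml > "$dir/$crd.resources.yaml" || return 1
}

# Hypothetical helper: prints a PEM file as single-line base64.
# Argument: $1 = path to the PEM-encoded CA certificate file.
encode_ca_bundle() {
  base64 < "$1" | tr -d '\n'
}
```

For example, backup_crd <crd-name> ./backup writes two YAML files into an existing ./backup directory, and encode_ca_bundle ca.pem prints the value to paste into the caBundle field, where ca.pem is a placeholder for your CA certificate file.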
    

What's next