Error Medic

Troubleshooting 'cert-manager certificate expired' and 'x509: certificate has expired' in Kubernetes

Fix 'cert-manager certificate expired' errors by verifying ClusterIssuer status, checking Order and Challenge resources, and forcing a manual certificate renewal.

Key Takeaways
  • Always check the 'Ready' status of the Certificate object using 'kubectl get cert'.
  • Trace the issuance pipeline: Certificate -> CertificateRequest -> Order -> Challenge.
  • DNS-01 challenge failures (e.g., IAM permissions, propagation delays) are the most common root cause of expired certificates.
  • Use 'cmctl renew' or safely delete the TLS Secret to force cert-manager to re-issue an expired certificate.
Fix Approaches Compared
Method | When to Use | Time | Risk
cmctl renew | Certificate is stuck in 'Issuing' or expired but configuration is correct. | < 1 min | Low
Delete TLS Secret | Certificate controller is completely stuck or secret is corrupted. | 1-2 mins | Medium (causes immediate downtime if the cert wasn't already expired)
Restart cert-manager pods | Webhook errors or controller cache is out of sync. | 2 mins | Low
Update Issuer/ClusterIssuer | Credentials expired or rate limits reached (switching to staging). | 5 mins | Low

Understanding the Error

When you encounter a cert-manager certificate expired or x509: certificate has expired error in your Kubernetes cluster, it means that cert-manager failed to renew the TLS certificate before its expiration date. By default, cert-manager attempts renewal once two-thirds of the certificate's lifetime has passed, which is 30 days before expiry for a 90-day Let's Encrypt certificate. If the certificate reaches its expiration date without being renewed, your applications will experience downtime as clients (like web browsers or API consumers) reject the TLS connection.

Typical error messages associated with this include:

  • Browser: NET::ERR_CERT_DATE_INVALID
  • Curl/CLI: curl: (60) SSL certificate problem: certificate has expired
  • Kubernetes Events: Issuing certificate as Secret does not exist or is missing data followed by stalled progress.
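Before digging into the cluster, it can help to confirm what certificate a client is actually being served. The following is a minimal sketch, assuming openssl is installed on your workstation; the function names and example.com are placeholders for illustration, not part of cert-manager.

```bash
#!/bin/bash
# Hypothetical helpers for checking the certificate an endpoint actually serves.

# served_cert_enddate HOST [PORT]
# Prints the notAfter line of the certificate presented by HOST (default port 443).
served_cert_enddate() {
  local host="$1" port="${2:-443}"
  openssl s_client -connect "${host}:${port}" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate
}

# days_between NOW_EPOCH EXPIRY_EPOCH
# Prints the whole days from NOW to EXPIRY (integer division, truncated toward zero).
days_between() {
  echo $(( ($2 - $1) / 86400 ))
}

# Example (not run automatically):
#   served_cert_enddate example.com
```

If the date printed is in the past, the endpoint is still serving the expired certificate and the checks in the steps below apply.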

Step 1: Diagnose the Issuance Pipeline

cert-manager uses a specific pipeline to issue certificates. When a certificate expires, it means one of these pipeline resources is stuck or has failed. You must trace the failure from the top down.

First, check the status of your certificates:

kubectl get certificates -A

Look for certificates where READY is False. The expiry time of each certificate is recorded in its status.notAfter field.
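To narrow a long list down to the problem certificates, something like the sketch below may help. It assumes the Ready condition and status.notAfter are populated on your Certificate objects; the function names are made up for this example.

```bash
#!/bin/bash
# Hypothetical helpers: show readiness and expiry for every Certificate,
# then keep only the ones that are not Ready.

# Prints one row per Certificate: NAMESPACE NAME READY NOT_AFTER
list_cert_status() {
  kubectl get certificates -A --no-headers \
    -o 'custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,NOT_AFTER:.status.notAfter'
}

# Prints namespace/name for every Certificate whose READY column is not "True"
not_ready_certs() {
  list_cert_status | awk '$3 != "True" { print $1 "/" $2 }'
}

# Example (not run automatically):
#   not_ready_certs
```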

Next, describe the specific broken certificate to look for events:

kubectl describe certificate <cert-name> -n <namespace>

If the certificate is stuck, check the CertificateRequest:

kubectl get certificaterequest -n <namespace>
kubectl describe certificaterequest <request-name> -n <namespace>

If using ACME (like Let's Encrypt), the CertificateRequest will spawn an Order. Check the Order:

kubectl get orders -n <namespace>
kubectl describe order <order-name> -n <namespace>

Finally, the Order spawns a Challenge (either HTTP-01 or DNS-01). In practice, this is where most failed renewals get stuck:

kubectl get challenges -n <namespace>
kubectl describe challenge <challenge-name> -n <namespace>

Look closely at the Events section of the Challenge description. You will often see errors like Waiting for DNS-01 challenge propagation: DNS record for "_acme-challenge.example.com" not found or Error presenting challenge: unauthorized.

Step 2: Common Root Causes and Fixes

1. DNS-01 Challenge Propagation Failures

If you are using DNS-01 challenges, cert-manager needs to create a TXT record in your DNS provider (e.g., Route53, Cloudflare). If the IAM role, API token, or service account used by the ClusterIssuer has expired or lacks permissions, the challenge will fail.

Fix: Verify the credentials referenced in your ClusterIssuer Secret. Update the API token if it has expired. Ensure your DNS provider's API is reachable from the cluster.
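To see whether the TXT record was ever created, you can query DNS directly. This is a rough sketch, assuming dig is available; the helper names and the resolver address in the example are illustrative choices.

```bash
#!/bin/bash
# Hypothetical helpers for checking the DNS-01 TXT record by hand.

# acme_record_name DOMAIN
# Prints the TXT record name used for DNS-01 validation of DOMAIN.
# A leading "*." (wildcard) is stripped, since wildcards validate the base domain.
acme_record_name() {
  local domain=${1#"*."}
  echo "_acme-challenge.${domain}"
}

# check_acme_txt DOMAIN [RESOLVER]
# Queries the TXT record, optionally against a specific resolver.
check_acme_txt() {
  local name
  name=$(acme_record_name "$1")
  if [ -n "$2" ]; then
    dig +short TXT "$name" "@$2"
  else
    dig +short TXT "$name"
  fi
}

# Example (not run automatically):
#   check_acme_txt example.com 8.8.8.8
```

An empty answer while the Challenge is still pending suggests the record was never written, which points back at credentials or permissions rather than propagation delay.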

2. Let's Encrypt Rate Limits

If you have been repeatedly deleting and recreating certificates, you may hit Let's Encrypt's strict rate limits (e.g., 5 duplicate certificates per week).

Fix: Look for 429 Too Many Requests in the Order or Challenge events, or in the cert-manager controller logs. If you hit this, you must wait until the rate limit resets. In the future, always use the Let's Encrypt Staging environment (https://acme-staging-v02.api.letsencrypt.org/directory) for testing.
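A staging issuer for testing might look like the sketch below. The issuer name, account key Secret name, email, and the nginx ingress class are all assumptions for illustration; adapt them to your cluster.

```bash
#!/bin/bash
# Hypothetical helper: print a ClusterIssuer manifest that points at the
# Let's Encrypt staging endpoint. All names below are placeholders.

# staging_issuer_manifest EMAIL
staging_issuer_manifest() {
  local email="$1"
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: ${email}
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
}

# Example (not run automatically):
#   staging_issuer_manifest you@example.com | kubectl apply -f -
```

Certificates issued by staging are not trusted by browsers, so switch the Certificate back to the production issuer once the pipeline completes cleanly.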

3. cert-manager Webhook Issues

The cert-manager webhook component intercepts and validates cert-manager resources. If the webhook's own internal TLS certificate is expired or misconfigured, it will block all certificate issuance.

Error symptom: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook... x509: certificate has expired or is not yet valid.

Fix: Restart the webhook pod first. If that does not help, delete the webhook's mutating/validating webhook configurations and restart the webhook. These configurations are normally installed by the cert-manager Helm chart or static manifests, so re-apply your installation if they are not recreated:

kubectl delete mutatingwebhookconfiguration cert-manager-webhook
kubectl delete validatingwebhookconfiguration cert-manager-webhook
kubectl rollout restart deployment/cert-manager-webhook -n cert-manager

4. HTTP-01 Challenge Routing Issues

For HTTP-01 challenges, cert-manager temporarily spins up an ingress resource and a pod to serve a specific token at /.well-known/acme-challenge/. If your Ingress Controller (like NGINX or ALB) isn't routing traffic properly, or if you have strict network policies blocking external access to this path, the challenge will fail.

Fix: Check your Ingress Controller logs. Ensure no WAF (Web Application Firewall) is blocking the Let's Encrypt validation servers. Verify that the temporary ingress resource created by cert-manager is acquiring an IP address.
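You can approximate what the validation server does by requesting the challenge path yourself from outside the cluster. The sketch below assumes curl is available and that you have read the token from the pending Challenge (it is stored in spec.token); the helper names are hypothetical.

```bash
#!/bin/bash
# Hypothetical helpers for probing an HTTP-01 challenge path by hand.

# acme_challenge_url DOMAIN TOKEN
# Prints the plain-HTTP URL that must serve the token during validation.
acme_challenge_url() {
  echo "http://$1/.well-known/acme-challenge/$2"
}

# probe_acme_challenge DOMAIN TOKEN
# Prints the HTTP status code returned for the challenge URL.
probe_acme_challenge() {
  curl -sS -o /dev/null -w '%{http_code}\n' "$(acme_challenge_url "$1" "$2")"
}

# Example (not run automatically):
#   TOKEN=$(kubectl get challenge <challenge-name> -n <namespace> -o jsonpath='{.spec.token}')
#   probe_acme_challenge example.com "$TOKEN"
```

A 200 response suggests routing is fine; a 404, 403, or a redirect loop usually means the ingress controller or a firewall is intercepting the request before it reaches the solver pod.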

Step 3: Forcing a Manual Renewal

Once you have identified and fixed the root cause (e.g., fixed IAM permissions for DNS-01), you need to tell cert-manager to try again.

The safest way is using cmctl (the cert-manager CLI tool):

cmctl renew <cert-name> -n <namespace>

If you don't have cmctl installed, you can trigger a renewal by deleting the Kubernetes Secret that stores the expired certificate data. cert-manager will detect that the Secret is missing and immediately kick off a new issuance process:

kubectl delete secret <cert-secret-name> -n <namespace>

Warning: Only do this if the certificate is already expired, as deleting the secret will temporarily break TLS for any pods currently mounting it until the new one is issued.
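After triggering the renewal, you may want to block until the certificate is usable again. This is a small sketch, assuming cmctl and kubectl are both on your PATH; the function name and the 300s default timeout are arbitrary choices for the example.

```bash
#!/bin/bash
# Hypothetical helper: trigger a renewal, then wait for the Ready condition.

# renew_and_wait CERT_NAME NAMESPACE [TIMEOUT]
renew_and_wait() {
  local cert="$1" ns="$2" timeout="${3:-300s}"
  # Ask cert-manager to re-issue the certificate
  cmctl renew "$cert" -n "$ns" || return 1
  # Block until the Certificate reports Ready=True or the timeout elapses
  kubectl wait --for=condition=Ready "certificate/$cert" -n "$ns" --timeout="$timeout"
}

# Example (not run automatically):
#   renew_and_wait <cert-name> <namespace>
```

Note that if the certificate was still Ready when you triggered the renewal, the wait can return immediately, so check the Certificate's events as well. If the wait times out, go back to Step 1 and inspect the new CertificateRequest, Order, and Challenge.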

Monitoring and Prevention

To prevent this from happening again, implement proactive monitoring. Do not rely solely on cert-manager's internal mechanisms.

  1. Use Prometheus and Alertmanager to alert on the certmanager_certificate_expiration_timestamp_seconds metric.
  2. Set up alerts for any Certificate objects where the Ready condition is False for more than 1 hour.
  3. Implement synthetic checks (like Blackbox Exporter) to externally probe your endpoints and alert if an SSL certificate has less than 14 days of validity remaining.
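Until proper alerting is in place, a scheduled shell check can act as a stopgap. The sketch below assumes GNU date (for the -d flag) and that status.notAfter is set on your Certificates; the function names and the 14-day default are illustrative.

```bash
#!/bin/bash
# Hypothetical stopgap check: warn about Certificates that expire soon.

# expires_within_days NOT_AFTER DAYS [NOW_EPOCH]
# Succeeds (exit 0) if NOT_AFTER is less than DAYS days after NOW,
# which includes certificates that have already expired.
expires_within_days() {
  local not_after="$1" days="$2" now="${3:-$(date -u +%s)}"
  local expiry
  expiry=$(date -u -d "$not_after" +%s) || return 2
  [ $(( expiry - now )) -lt $(( days * 86400 )) ]
}

# warn_expiring_certs [DAYS]
# Prints a warning line for each Certificate inside the threshold.
warn_expiring_certs() {
  local days="${1:-14}" name not_after
  kubectl get certificates -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.status.notAfter}{"\n"}{end}' \
  | while read -r name not_after; do
      [ -n "$not_after" ] || continue
      if expires_within_days "$not_after" "$days"; then
        echo "WARNING: $name expires at $not_after"
      fi
    done
}

# Example (not run automatically):
#   warn_expiring_certs 14
```

This only covers what the cluster reports; the external probe in item 3 remains the better check of what clients actually see.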

Diagnostic Script

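#!/bin/bash
# Diagnostic script to thoroughly inspect a stuck cert-manager certificate
# Set NAMESPACE and CERT_NAME before running.

NAMESPACE="your-namespace"
CERT_NAME="your-cert-name"

echo "=== 1. Certificate Status ==="
kubectl describe certificate "$CERT_NAME" -n "$NAMESPACE" | tail -n 20

# NOTE: the label selectors below assume cert-manager set these labels on the
# objects. If a lookup comes back empty, the script lists the namespace instead.
printf '\n=== 2. Associated CertificateRequest ===\n'
CR_NAME=$(kubectl get certificaterequest -n "$NAMESPACE" -l "cert-manager.io/certificate-name=$CERT_NAME" -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
if [ -n "$CR_NAME" ]; then
  kubectl describe certificaterequest "$CR_NAME" -n "$NAMESPACE" | tail -n 20
else
  kubectl get certificaterequest -n "$NAMESPACE"
fi

printf '\n=== 3. Associated Order ===\n'
ORDER_NAME=$(kubectl get order -n "$NAMESPACE" -l "cert-manager.io/certificate-name=$CERT_NAME" -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
if [ -n "$ORDER_NAME" ]; then
  kubectl describe order "$ORDER_NAME" -n "$NAMESPACE" | tail -n 20
else
  kubectl get order -n "$NAMESPACE"
fi

printf '\n=== 4. Associated Challenges ===\n'
CHALLENGE_NAMES=$(kubectl get challenge -n "$NAMESPACE" -l "acme.cert-manager.io/order-name=$ORDER_NAME" -o name 2>/dev/null)
if [ -z "$CHALLENGE_NAMES" ]; then
  # Fall back to every Challenge in the namespace
  CHALLENGE_NAMES=$(kubectl get challenge -n "$NAMESPACE" -o name)
fi
for CHALLENGE in $CHALLENGE_NAMES; do
  echo "Describing $CHALLENGE:"
  kubectl describe "$CHALLENGE" -n "$NAMESPACE" | tail -n 20
done

# Command to force renewal once the issue is fixed:
# cmctl renew "$CERT_NAME" -n "$NAMESPACE"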

Error Medic Editorial

Senior Site Reliability Engineers specializing in Kubernetes, cloud-native infrastructure, and automated TLS certificate management.
