Why didn't my certificate auto-renew before it expired?

By default, cert-manager starts renewal once two-thirds of a certificate's lifetime has passed, which works out to 30 days before expiry for a 90-day Let's Encrypt certificate. When renewal fails, the usual causes are broken DNS-01 provider credentials, HTTP-01 routing issues (like a newly added WAF blocking Let's Encrypt), or rate limits preventing successful validation.

How do I forcefully renew an expired cert-manager certificate?

The recommended approach is to use the cert-manager CLI tool: `cmctl renew <cert-name> -n <namespace>`. Alternatively, you can delete the Kubernetes Secret containing the TLS data, which will prompt cert-manager to issue a new one.

What does 'waiting for dns-01 challenge propagation' mean?

This means cert-manager has successfully instructed your DNS provider (via API) to create the required TXT record, but the Let's Encrypt servers cannot verify it yet. This can be due to high DNS TTLs, split-brain DNS, or the DNS provider taking too long to publish the record.

How do I fix the 'x509: certificate has expired' error on the cert-manager-webhook?

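The webhook serves its own self-signed certificate. If that certificate expires or breaks, you will see webhook errors when applying resources. Start by restarting the cert-manager-webhook pod so it can regenerate its serving certificate. If the error persists, delete the `ValidatingWebhookConfiguration` and `MutatingWebhookConfiguration` for cert-manager and restart the webhook again. These objects are normally created by the Helm chart or install manifests, so be ready to re-apply your installation if they do not come back.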

Can I bypass Let's Encrypt rate limits after hitting them?

No, Let's Encrypt rate limits (like 5 duplicate certs per week) are strictly enforced. You must wait for the limit to expire. Always use the Let's Encrypt Staging environment for testing and troubleshooting to avoid locking out your production domain.

Troubleshooting 'cert-manager certificate expired' and 'x509: certificate has expired' in Kubernetes

Fix 'cert-manager certificate expired' errors by verifying ClusterIssuer status, checking Order and Challenge resources, and forcing a manual certificate renewal.

Last updated: February 24, 2026

Last verified: February 24, 2026


Key Takeaways

Always check the 'Ready' status of the Certificate object using 'kubectl get cert'.
Trace the issuance pipeline: Certificate -> CertificateRequest -> Order -> Challenge.
DNS-01 challenge failures (e.g., IAM permissions, propagation delays) are the most common root cause of expired certificates.
Use 'cmctl renew' or safely delete the TLS Secret to force cert-manager to re-issue an expired certificate.

Fix Approaches Compared
Method	When to Use	Time	Risk
cmctl renew	Certificate is stuck in 'Issuing' or expired but configuration is correct.	< 1 min	Low
Delete TLS Secret	Certificate controller is completely stuck or secret is corrupted.	1-2 mins	Medium (Causes immediate downtime if cert wasn't already expired)
Restart cert-manager pods	Webhook errors or controller cache is out of sync.	2 mins	Low
Update Issuer/ClusterIssuer	Credentials expired or rate limits reached (switching to staging).	5 mins	Low

Understanding the Error

When you encounter a cert-manager certificate expired or x509: certificate has expired error in your Kubernetes cluster, it means that cert-manager failed to automatically renew the TLS certificate before its expiration date. By default, cert-manager starts renewing once two-thirds of the certificate's lifetime has elapsed (30 days before expiry for a 90-day Let's Encrypt certificate). If renewal has still not succeeded by the expiration date, your applications will experience downtime as clients (like web browsers or API consumers) reject the expired certificate during the TLS handshake.

Typical error messages associated with this include:

Browser: NET::ERR_CERT_DATE_INVALID
Curl/CLI: curl: (60) SSL certificate problem: certificate has expired
Kubernetes Events: Issuing certificate as Secret does not exist or is missing data followed by stalled progress.

Step 1: Diagnose the Issuance Pipeline

cert-manager uses a specific pipeline to issue certificates. When a certificate expires, it means one of these pipeline resources is stuck or has failed. You must trace the failure from the top down.

First, check the status of your certificates:

kubectl get certificates -A

Look for certificates where READY is False or the EXPIRATION date has passed.
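On a busy cluster it can help to filter the list down to the problem entries. The following is a minimal sketch rather than anything shipped with cert-manager: the function name list_unready_certs is hypothetical, and it assumes kubectl is already pointed at the affected cluster.

```shell
# Hypothetical helper: print "namespace/name" for every Certificate whose
# Ready condition is anything other than "True" (including not yet set).
list_unready_certs() {
  kubectl get certificates -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' |
    awk 'NF >= 2 && $3 != "True" { print $1 "/" $2 }'
}
```

Calling list_unready_certs with no arguments prints one line per Certificate that needs attention, which can then be fed into the describe command.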

Next, describe the specific broken certificate to look for events:

kubectl describe certificate <cert-name> -n <namespace>

If the certificate is stuck, check the CertificateRequest:

kubectl get certificaterequest -n <namespace>
kubectl describe certificaterequest <request-name> -n <namespace>

If using ACME (like Let's Encrypt), the CertificateRequest will spawn an Order. Check the Order:

kubectl get orders -n <namespace>
kubectl describe order <order-name> -n <namespace>

Finally, the Order spawns a Challenge (either HTTP-01 or DNS-01). This is where most failed renewals get stuck:

kubectl get challenges -n <namespace>
kubectl describe challenge <challenge-name> -n <namespace>

Look closely at the Events section of the Challenge description. You will often see errors like 'Waiting for DNS-01 challenge propagation: DNS record for "_acme-challenge.example.com" not found' or 'Error presenting challenge: unauthorized'.
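When several Challenges exist at once, a one-line summary of each is often quicker to scan than a full describe. This is a sketch under the assumption that the Challenge status exposes state and reason fields; challenge_summary is a hypothetical name.

```shell
# Hypothetical helper: one tab-separated line per Challenge with its
# name, current state, and the reason reported by cert-manager.
# Usage: challenge_summary <namespace>
challenge_summary() {
  kubectl get challenges -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.state}{"\t"}{.status.reason}{"\n"}{end}'
}
```

The reason column usually carries the same text that appears in the Events section, so it is a reasonable first place to look.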

Step 2: Common Root Causes and Fixes

1. DNS-01 Challenge Propagation Failures

If you are using DNS-01 challenges, cert-manager needs to create a TXT record in your DNS provider (e.g., Route53, Cloudflare). If the IAM role, API token, or service account used by the ClusterIssuer has expired or lacks permissions, the challenge will fail. Fix: Verify the credentials referenced in your ClusterIssuer Secret. Update the API token if it has expired. Ensure your DNS provider's API is reachable from the cluster.
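To tell a credentials problem apart from a propagation delay, you can check whether the TXT record is visible from a public resolver. The sketch below assumes dig is installed and uses 1.1.1.1 as an arbitrary default resolver; check_acme_txt is a hypothetical name.

```shell
# Hypothetical helper: query a public resolver for the ACME TXT record.
# Usage: check_acme_txt <domain> [resolver]   (resolver defaults to 1.1.1.1)
check_acme_txt() {
  acme_domain=$1
  acme_resolver=${2:-1.1.1.1}
  acme_record=$(dig +short TXT "_acme-challenge.${acme_domain}" "@${acme_resolver}")
  if [ -n "$acme_record" ]; then
    echo "found: $acme_record"
  else
    echo "missing: _acme-challenge.${acme_domain} has no TXT record at ${acme_resolver}"
    return 1
  fi
}
```

If the record is missing while a Challenge is pending, the provider credentials are the first thing to check. If the record is present but validation still fails, a propagation or split-horizon DNS problem is more likely.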

2. Let's Encrypt Rate Limits

If you have been repeatedly deleting and recreating certificates, you may hit Let's Encrypt's strict rate limits (e.g., 5 duplicate certificates per week). Fix: Look for 429 Too Many Requests in the Order or Challenge events, or in the cert-manager controller logs. If you hit this, you must wait until the rate limit resets. In the future, always use the Let's Encrypt Staging environment (https://acme-staging-v02.api.letsencrypt.org/directory) for testing.
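A quick scan of the controller logs can confirm whether a rate limit is involved. This sketch assumes a default install where the controller Deployment is named cert-manager in the cert-manager namespace; adjust both if your installation differs. The function name find_rate_limit_errors is hypothetical.

```shell
# Hypothetical helper: print recent controller log lines that mention an
# ACME rate limit. "rateLimited" is the ACME error type for this condition.
# Usage: find_rate_limit_errors [since]   (since defaults to 24h)
find_rate_limit_errors() {
  kubectl logs -n cert-manager deployment/cert-manager --since="${1:-24h}" |
    grep -E 'rateLimited|Too Many Requests'
}
```

The function returns a non-zero status when nothing matches, which in this case is the good outcome.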

3. cert-manager Webhook Issues

The cert-manager webhook component intercepts and validates cert-manager resources. If the webhook's own internal TLS certificate is expired or misconfigured, it will block all certificate issuance. Error symptom: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook... x509: certificate has expired or is not yet valid. Fix: Restarting the webhook Deployment (the last command below) is the lower-risk option, so try it on its own first. If the error persists, also delete the webhook's mutating/validating webhook configurations. These objects are normally created by the Helm chart or install manifests, so be ready to re-apply your cert-manager installation if they do not reappear:

kubectl delete mutatingwebhookconfiguration cert-manager-webhook
kubectl delete validatingwebhookconfiguration cert-manager-webhook
kubectl rollout restart deployment/cert-manager-webhook -n cert-manager

4. HTTP-01 Challenge Routing Issues

For HTTP-01 challenges, cert-manager temporarily spins up an ingress resource and a pod to serve a specific token at /.well-known/acme-challenge/. If your Ingress Controller (like NGINX or ALB) isn't routing traffic properly, or if you have strict network policies blocking external access to this path, the challenge will fail. Fix: Check your Ingress Controller logs. Ensure no WAF (Web Application Firewall) is blocking the Let's Encrypt validation servers. Verify that the temporary ingress resource created by cert-manager is acquiring an IP address.
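You can reproduce what the validation servers see by requesting the challenge path yourself over plain HTTP. The sketch below assumes curl is available and that you have copied the token from the pending Challenge resource; probe_acme_path is a hypothetical name.

```shell
# Hypothetical helper: fetch the ACME challenge path over port 80 and
# print only the HTTP status code.
# Usage: probe_acme_path <domain> <token>
probe_acme_path() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    "http://$1/.well-known/acme-challenge/$2"
}
```

A 200 suggests the solver is reachable from where you ran the command. A 403 often points at a WAF or access policy, and a 404 at Ingress routing. Run it from outside the cluster network, since that is where validation requests originate.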

Step 3: Forcing a Manual Renewal

Once you have identified and fixed the root cause (e.g., fixed IAM permissions for DNS-01), you need to tell cert-manager to try again.

The safest way is using cmctl (the cert-manager CLI tool):

cmctl renew <cert-name> -n <namespace>

If you don't have cmctl installed, you can trigger a renewal by deleting the Kubernetes Secret that stores the expired certificate data. cert-manager will detect that the Secret is missing and immediately kick off a new issuance process:

kubectl delete secret <cert-secret-name> -n <namespace>

Warning: Only do this if the certificate is already expired, as deleting the secret will temporarily break TLS for any pods currently mounting it until the new one is issued.
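Before deleting a Secret, it is worth confirming that the certificate it holds has really expired. This is a minimal sketch that assumes a standard TLS Secret with a tls.crt key, a base64 that accepts -d, and openssl on the PATH; secret_cert_enddate is a hypothetical name.

```shell
# Hypothetical helper: print the notAfter date of the certificate stored
# in a TLS Secret.
# Usage: secret_cert_enddate <secret-name> <namespace>
secret_cert_enddate() {
  kubectl get secret "$1" -n "$2" -o jsonpath='{.data.tls\.crt}' |
    base64 -d |
    openssl x509 -noout -enddate
}
```

If the printed date is still in the future, deleting the Secret would interrupt a working certificate, so cmctl renew is the safer choice.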

Monitoring and Prevention

To prevent this from happening again, implement proactive monitoring. Do not rely solely on cert-manager's internal mechanisms.

Use Prometheus and Alertmanager to alert on the certmanager_certificate_expiration_timestamp_seconds metric.
Set up alerts for any Certificate objects where the Ready condition is False for more than 1 hour.
Implement synthetic checks (like Blackbox Exporter) to externally probe your endpoints and alert if an SSL certificate has less than 14 days of validity remaining.
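The arithmetic behind an expiry alert is simple enough to sketch in a few lines. The example assumes the expiry is available as Unix epoch seconds, which is what the metric name above suggests; days_until_expiry is a hypothetical name, not part of any tool mentioned here.

```shell
# Hypothetical helper: whole days between "now" and an expiry timestamp,
# both given in Unix epoch seconds. Shell division truncates toward zero,
# so a certificate that expires in under 24 hours reports 0.
# Usage: days_until_expiry <expiry-epoch> [now-epoch]
days_until_expiry() {
  expiry_epoch=$1
  now_epoch=${2:-$(date +%s)}
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}
```

An alert threshold then becomes a plain comparison, for example warning when the result drops below 14. A negative result means the certificate has already expired.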

Frequently Asked Questions


#!/bin/bash
# Diagnostic script to thoroughly inspect a stuck cert-manager certificate

NAMESPACE="your-namespace"
CERT_NAME="your-cert-name"

echo "=== 1. Certificate Status ==="
kubectl describe certificate "$CERT_NAME" -n "$NAMESPACE" | tail -n 20

# printf is used for the remaining headers because bash's echo prints "\n" literally.
printf '\n=== 2. Associated CertificateRequest ===\n'
# "-o name" prints nothing on stdout when no resource matches, so an empty result can be handled cleanly.
CR_NAME=$(kubectl get certificaterequest -n "$NAMESPACE" -l "cert-manager.io/certificate-name=$CERT_NAME" -o name | head -n 1)
if [ -n "$CR_NAME" ]; then
  kubectl describe "$CR_NAME" -n "$NAMESPACE" | tail -n 20
else
  echo "No CertificateRequest matched the label; list them with: kubectl get certificaterequest -n $NAMESPACE"
fi

printf '\n=== 3. Associated Order ===\n'
ORDER_NAME=$(kubectl get order -n "$NAMESPACE" -l "cert-manager.io/certificate-name=$CERT_NAME" -o name | head -n 1)
if [ -n "$ORDER_NAME" ]; then
  kubectl describe "$ORDER_NAME" -n "$NAMESPACE" | tail -n 20
else
  echo "No Order matched the label; list them with: kubectl get orders -n $NAMESPACE"
fi

printf '\n=== 4. Associated Challenges ===\n'
# "${ORDER_NAME##*/}" strips the "<resource type>/" prefix that "-o name" adds.
CHALLENGE_NAMES=$(kubectl get challenge -n "$NAMESPACE" -l "acme.cert-manager.io/order-name=${ORDER_NAME##*/}" -o name)
for CHALLENGE in $CHALLENGE_NAMES; do
  echo "Describing $CHALLENGE:"
  kubectl describe "$CHALLENGE" -n "$NAMESPACE" | tail -n 20
done

# Command to force renewal once the issue is fixed:
# cmctl renew "$CERT_NAME" -n "$NAMESPACE"

Error Medic Editorial

Senior Site Reliability Engineers specializing in Kubernetes, cloud-native infrastructure, and automated TLS certificate management.
