How to Fix "cert-manager certificate expired" (x509: certificate has expired) in Kubernetes
Fix Kubernetes cert-manager certificate expired errors (x509). Learn to diagnose failed renewals, troubleshoot ACME challenges, and force manual certificate ren
- Certificates typically expire because automated renewal failed silently, most often because a DNS-01 or HTTP-01 validation was blocked.
- Stuck CertificateRequests, Let's Encrypt rate limits, and misconfigured Ingress classes are the most common root causes.
- Quick Fix: Use 'cmctl renew <cert-name>' to trigger a forced manual renewal, or delete the failing Challenge and CertificateRequest objects.
- cert-manager's webhook component failure can block all validations; always ensure the cert-manager pods are Running and Ready.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Force Renewal (cmctl) | Validation failed temporarily due to a transient network or API issue | 1 min | Low |
| Delete Secret & Request | State is corrupted, or CertificateRequest is stuck in a failed loop | 2 mins | Medium (Creates brief TLS termination downtime if active) |
| Reconfigure Issuer/Challenge | DNS/HTTP01 challenge is permanently failing due to infrastructure changes | 15 mins | Low |
| Restart cert-manager Webhook | Validations are failing with 'failed calling webhook' errors | 2 mins | Low |
Understanding the Error
When a cert-manager certificate expires in a Kubernetes cluster, users typically experience immediate outages on secure endpoints. Browsers throw NET::ERR_CERT_DATE_INVALID, APIs return curl: (60) SSL certificate problem: certificate has expired, and your Ingress controller logs will be flooded with x509: certificate has expired or is not yet valid errors.
cert-manager is designed to renew certificates automatically before they expire (by default once two-thirds of the certificate's lifetime has elapsed, which is 30 days before expiry for a 90-day Let's Encrypt certificate). If a certificate has actually expired, the automated renewal process has been failing silently for weeks. This failure is rarely a bug in cert-manager itself; it is almost always an infrastructure issue preventing the ACME server (like Let's Encrypt) from validating domain ownership via HTTP-01 or DNS-01 challenges.
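To make the renewal timing concrete, here is a minimal sketch of the arithmetic. It assumes spec.renewBefore is not set on the Certificate, so the default of one third of the lifetime applies. The function names renewal_epoch and days_until_renewal are illustrative helpers, not part of cert-manager or cmctl.

```shell
# Default renewal point: notAfter minus one third of the certificate lifetime.
# Assumes spec.renewBefore is unset. All arguments are Unix epoch seconds.
renewal_epoch() {
  local not_before=$1 not_after=$2
  local lifetime=$(( not_after - not_before ))
  echo $(( not_after - lifetime / 3 ))
}

# Whole days (truncated toward zero) between "now" and the renewal point.
# A negative value means the renewal is overdue.
days_until_renewal() {
  local not_before=$1 not_after=$2 now=$3
  local renew_at
  renew_at=$(renewal_epoch "$not_before" "$not_after")
  echo $(( (renew_at - now) / 86400 ))
}
```

For a 90-day certificate, days_until_renewal returns 60 on the day of issuance. A negative result on a certificate that has not been replaced suggests that renewal attempts have already started and are failing.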
The cert-manager Resource Chain
To troubleshoot effectively, you must understand the resource chain cert-manager uses to issue a certificate:
- Certificate: The top-level resource requesting a specific domain.
- CertificateRequest: Created by the Certificate when a new keypair/cert is needed.
- Order: Created for ACME issuers to represent the request to the CA.
- Challenge: Created by the Order to prove domain control.
When a certificate expires, the breakdown usually happens at the Challenge or Order level.
Step 1: Diagnose the Failure
Do not start deleting resources blindly. Find out exactly why the renewal failed.
1. Check the Certificate Status
Run the following command to check the READY status of your certificates:
kubectl get certificates -A
If the READY column says False, describe the certificate to read the events:
kubectl describe certificate <certificate-name> -n <namespace>
Look at the Conditions and Events sections near the bottom of the output. You might see an Issuing condition or a specific error message stating why issuance failed.
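On clusters with many certificates, it can help to filter the listing down to the ones that need attention. The following is a small sketch, assuming the default column order of kubectl get certificates -A (NAMESPACE, NAME, READY, SECRET, AGE). The helper name not_ready_certs is hypothetical.

```shell
# Reads `kubectl get certificates -A` output on stdin and prints
# namespace/name for every row whose READY column is not "True".
# Assumes the default column order: NAMESPACE NAME READY SECRET AGE.
not_ready_certs() {
  awk 'NR > 1 && $3 != "True" { print $1 "/" $2 }'
}
```

Usage: kubectl get certificates -A | not_ready_certs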
2. Trace the Request, Order, and Challenge
If the Certificate is stuck issuing, move down the chain:
kubectl get certificaterequest,order,challenge -n <namespace>
You will likely see a challenge object that has been pending for a long time. Describe it:
kubectl describe challenge <challenge-name> -n <namespace>
The State and Reason fields in the Challenge status tell you what went wrong during validation, either in cert-manager's own self-check or when Let's Encrypt tried to validate your domain. Common reasons include:
- Waiting for HTTP-01 challenge propagation: failed to perform self check GET request
- DNS record for _acme-challenge.example.com not found
- 403 Forbidden
- Rate limit exceeded
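As a rough triage aid, the reasons listed above can be mapped to the root-cause sections that follow. This is only a sketch: classify_challenge_reason and its category labels are made up for illustration, and the matching is based solely on the example messages in this article, so real messages may need additional patterns.

```shell
# Maps a Challenge "Reason" message to a coarse root-cause category.
# Patterns are taken from the example messages above; anything else
# is reported as "unknown".
classify_challenge_reason() {
  local reason
  # Lower-case the message so matching is case-insensitive.
  reason=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$reason" in
    *"rate limit"*)                      echo "rate-limit" ;;
    *"self check"*)                      echo "http01-unreachable" ;;
    *"_acme-challenge"*|*"dns record"*)  echo "dns01-record-missing" ;;
    *"403"*|*"forbidden"*)               echo "forbidden" ;;
    *)                                   echo "unknown" ;;
  esac
}
```

Usage: classify_challenge_reason "Rate limit exceeded" prints rate-limit.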
Step 2: Resolve Common Root Causes
Root Cause A: HTTP-01 Challenge Failing
HTTP-01 challenges work by spinning up a temporary pod and Ingress route to serve a specific token. If Let's Encrypt cannot reach this token via http://<your-domain>/.well-known/acme-challenge/<token>, validation fails.
Fixes:
- Ingress Class mismatch: Ensure your Issuer or ClusterIssuer specifies the correct ingress class name (e.g., nginx or traefik). If you recently upgraded your ingress controller, the class name might have changed.
- Firewalls/WAF: Check if a Web Application Firewall (like Cloudflare or AWS WAF) is blocking HTTP traffic to .well-known/acme-challenge/ paths.
- Global Redirects: If your Ingress redirects all HTTP traffic to HTTPS, the challenge request is redirected as well. Let's Encrypt follows such redirects and does not validate the certificate on the redirect target, but the challenge can still fail if the redirect lands on a host or path that does not serve the token. Temporarily disable the HTTPS redirect, or configure your ingress to bypass redirects for the .well-known path.
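To test reachability from outside the cluster, you can request the challenge path yourself. The sketch below only builds the URL. The helper name acme_challenge_url is hypothetical, and the token you pass in can be any placeholder when you just want to see how the path is routed.

```shell
# Builds the URL an ACME server requests during an HTTP-01 challenge.
# $1 = domain name, $2 = challenge token (or any placeholder string).
acme_challenge_url() {
  local domain=$1 token=$2
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$domain" "$token"
}
```

Usage: curl -sS -L -o /dev/null -w '%{http_code} %{url_effective}\n' "$(acme_challenge_url example.com test-token)"

The status code and final URL show whether the request was redirected or blocked before it reached the challenge solver.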
Root Cause B: DNS-01 Challenge Failing
DNS-01 challenges create a TXT record _acme-challenge.<your-domain>. cert-manager needs API access to your DNS provider (e.g., AWS Route53, Cloudflare, Google Cloud DNS) to create this record.
Fixes:
- IAM Permissions: If using AWS IRSA or GCP Workload Identity, verify the cert-manager service account still has the correct IAM role bound. Service account tokens may have rotated, or permissions may have been revoked.
- API Token Expiration: If you are using a Kubernetes Secret to store a Cloudflare or DigitalOcean API token, check if the token itself has expired or been revoked.
- Propagation Delay: Sometimes DNS takes longer to propagate than cert-manager expects. You can increase the DNS01 propagation check delay in your Issuer configuration.
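When checking whether the TXT record was actually created, note that a wildcard certificate is validated at the base domain: the record for *.example.com lives at _acme-challenge.example.com. The following sketch derives the record name to query. The helper name acme_txt_name is illustrative.

```shell
# Prints the TXT record name used for a DNS-01 challenge on the given domain.
acme_txt_name() {
  local domain
  domain=${1#\*.}     # strip a leading "*." from wildcard names
  domain=${domain%.}  # strip a trailing dot, if present
  printf '_acme-challenge.%s\n' "$domain"
}
```

Usage: dig +short TXT "$(acme_txt_name '*.example.com')"

An empty answer while a Challenge is pending suggests cert-manager could not create the record, which points to the permission or token problems listed above.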
Root Cause C: Webhook Unavailability
Sometimes the cert-manager-webhook pod crashes or the ValidatingWebhookConfiguration gets out of sync, preventing cert-manager from modifying any of its custom resources.
Fix: Check the cert-manager pods:
kubectl get pods -n cert-manager
If the webhook pod is crash-looping or has restarts, check its logs. You may need to delete the webhook pod to force a restart, or re-apply the cert-manager manifests if the TLS certificates for the webhook itself have expired.
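A quick way to spot an unhealthy component is to filter the pod listing for anything that is not fully ready. This is a sketch that assumes the default kubectl get pods columns (NAME, READY, STATUS, RESTARTS, AGE). The helper name unhealthy_pods is hypothetical.

```shell
# Reads `kubectl get pods` output on stdin and prints each pod that is not
# Running, or whose ready-container count differs from its total.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $3, $2
  }'
}
```

Usage: kubectl get pods -n cert-manager | unhealthy_pods

No output means every pod in the namespace is Running with all of its containers ready.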
Step 3: Force the Renewal Process
Once you have resolved the underlying infrastructure issue (e.g., fixed the Ingress route or updated the DNS API token), cert-manager might not retry immediately, because it applies exponential backoff to failed issuances. Force a retry rather than waiting for the next attempt.
Using cmctl (Recommended):
If you have the cmctl CLI tool installed:
cmctl renew <certificate-name> -n <namespace>
Using kubectl (Manual approach):
If you don't have cmctl, you can trigger a renewal by deleting the stuck CertificateRequest and, if the state is corrupted, the underlying Secret. (Note: deleting the Secret removes the expired certificate entirely until the new one is issued, which causes hard TLS failures rather than expired-certificate warnings.)
kubectl delete certificaterequest <certificaterequest-name> -n <namespace>
kubectl delete secret <secret-name> -n <namespace>
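Alternatively, add a dummy annotation to the Certificate resource (for example, kubectl annotate certificate <name> -n <namespace> force-renew=$(date +%s) --overwrite). This sometimes nudges the controller into reconciling, but it is not guaranteed to trigger a renewal.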
Step 4: Verification
Watch the logs of the cert-manager controller to ensure the issuance succeeds:
kubectl logs -n cert-manager -l app=cert-manager -f
You should see lines indicating the Order was created, the Challenge was presented, and finally, the Certificate was issued successfully. Verify the new expiration date:
echo | openssl s_client -showcerts -servername your-domain.com -connect your-domain.com:443 2>/dev/null | openssl x509 -inform pem -noout -dates
Frequently Asked Questions
# 1. Check the status of all certificates in the cluster
kubectl get certificates -A
# 2. Describe the failing certificate to find the root cause
kubectl describe certificate <certificate-name> -n <namespace>
# 3. Check for stuck CertificateRequests, Orders, or Challenges
kubectl get certificaterequest,order,challenge -n <namespace>
# 4. Describe the specific challenge to see why validation failed
kubectl describe challenge <challenge-name> -n <namespace>
# 5. Check cert-manager controller logs for detailed API errors
kubectl logs -n cert-manager deploy/cert-manager
# 6. FORCE RENEWAL (Once the infrastructure issue is fixed)
# Option A: Using cmctl (Recommended)
cmctl renew <certificate-name> -n <namespace>
# Option B: Using kubectl to clean up stuck requests
kubectl delete certificaterequest <certificaterequest-name> -n <namespace>
kubectl delete challenge <challenge-name> -n <namespace>
# 7. Verify the newly issued certificate date on your live endpoint
echo | openssl s_client -showcerts -servername example.com -connect example.com:443 2>/dev/null | openssl x509 -inform pem -noout -dates
Error Medic Editorial
Our team of Senior DevOps and Site Reliability Engineers specializes in Kubernetes infrastructure, observability, and cloud-native troubleshooting.