ArgoCD Connection Refused: Fix CrashLoopBackOff, ImagePullBackOff, Permission Denied & Timeout Errors
Fix ArgoCD connection refused errors: diagnose CrashLoopBackOff, ImagePullBackOff, permission denied, and timeout with step-by-step kubectl commands and config
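- ArgoCD 'connection refused' on port 443 is usually caused by the argocd-server pod being down, a Service selector mismatch that leaves the Service without endpoints, or a misconfigured Ingress or load balancer in front of it. A NetworkPolicy that silently drops packets more often surfaces as a timeout, although some CNI plugins reject the connection instead.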
- CrashLoopBackOff in ArgoCD components most commonly signals OOMKilled resource limits on argocd-repo-server, a corrupted or missing argocd-secret, an invalid Redis connection string, or a broken kubeconfig mount in the application controller.
- ImagePullBackOff occurs when the container image tag does not exist on the registry, the cluster lacks imagePullSecrets for a private registry, or Docker Hub rate limiting is encountered on shared-IP nodes.
- Permission denied errors arise when the argocd-manager ClusterRoleBinding in the destination cluster was deleted or its token expired, or when the argocd-rbac-cm ConfigMap contains an overly restrictive policy.csv.
- Timeout errors during sync or API calls are typically caused by argocd-repo-server exhausting memory while rendering large Helm charts, or by the default 60-second repo-server deadline being too short for large repositories or slow API servers.
- Quick fix summary: run 'kubectl get pods -n argocd' first, check Events with 'kubectl describe pod', validate argocd-secret keys, then address the specific root cause before restarting pods.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Restart argocd-server Deployment | Pod is in CrashLoopBackOff or Pending after a config change | < 2 min | Low — graceful rolling restart |
| Patch argocd-secret or argocd-cm | Wrong admin password hash, Dex config error, or missing TLS keys | 5–10 min | Medium — wrong values break login entirely |
| Re-apply argocd-manager ClusterRoleBinding | Permission denied syncing to remote cluster, expired token, or deleted binding | 5 min | Low — additive permission grant |
| Rebuild imagePullSecret | ImagePullBackOff on private registry or expired credential token | 5–10 min | Low — secret replacement is non-disruptive |
| Increase resource limits on repo-server | OOMKilled CrashLoopBackOff when rendering large Helm charts or Kustomize | 10 min + rollout | Low — monitored rolling update |
| Fix NetworkPolicy or Ingress annotations | Connection refused from outside cluster or between ArgoCD internal components | 15–30 min | Medium — may briefly disrupt in-flight syncs |
| Re-register destination cluster | ArgoCD cannot connect to managed cluster due to rotated certificate or expired token | 10 min | Low — cluster entry is cleanly replaced |
Understanding ArgoCD Connection Refused and Related Errors
ArgoCD is a declarative GitOps continuous delivery tool for Kubernetes. Its architecture comprises several pods in the argocd namespace: argocd-server (API and UI gateway, Service port 443 forwarding to container port 8080), argocd-repo-server (clones repositories and renders manifests on port 8081), argocd-application-controller (StatefulSet that reconciles desired vs live state), argocd-dex-server (OIDC connector on port 5556), and argocd-redis (in-memory cache on port 6379). When one component fails, errors can cascade: a Redis outage can take argocd-server down with it, which produces connection refused for every client.
What the Error Messages Look Like
When argocd-server is unreachable, the CLI shows:
FATAL[0001] dial tcp <IP>:443: connect: connection refused
In the browser you see ERR_CONNECTION_REFUSED. This is distinct from a timeout (packets dropped or no route to the host) and from a TLS error (the TCP connection succeeded but the handshake failed). A pod in CrashLoopBackOff produces connection refused because nothing is listening on the port while Kubernetes waits out the backoff delay before restarting the container.
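The distinction between the three failure modes can be sketched as a small classifier. The function name is hypothetical, and the matched substrings are assumptions about typical Go client error text, not an exhaustive list of what ArgoCD prints.

```shell
# Hypothetical helper (not part of ArgoCD): classify a client error message
# into one of the failure modes described above.
classify_conn_error() {
  case "$1" in
    *"connection refused"*)                 echo "refused" ;;  # port closed: pod down or restarting
    *"i/o timeout"*|*"deadline exceeded"*)  echo "timeout" ;;  # packets dropped or no route
    *"x509"*|*"tls:"*|*"handshake"*)        echo "tls" ;;      # TCP connected, TLS failed
    *)                                      echo "unknown" ;;
  esac
}

# Example: classify_conn_error "dial tcp <IP>:443: connect: connection refused"
```

A "refused" result points at the pod or Service, while "timeout" points at the network path and "tls" at certificates or the Ingress configuration.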
Step 1: Triage All ArgoCD Pod Statuses
Begin every investigation with a full namespace snapshot:
kubectl get pods -n argocd -o wide
kubectl get svc,endpoints -n argocd
kubectl get events -n argocd --sort-by='.lastTimestamp' | tail -40
Expected healthy output:
NAME READY STATUS RESTARTS AGE
argocd-application-controller-0 1/1 Running 0 2d
argocd-dex-server-7f8d9b4c6-xk2pq 1/1 Running 0 2d
argocd-redis-6b4d5f8c7-lmnop 1/1 Running 0 2d
argocd-repo-server-5c9d8f7b6-qrstu 1/1 Running 0 2d
argocd-server-6d7e8f9a5-vwxyz 1/1 Running 0 2d
If any pod shows CrashLoopBackOff, ImagePullBackOff, Pending, or a restart count above 5, that component is your starting point.
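The triage rule above can be applied mechanically. The following is a minimal sketch, assuming the default column order of `kubectl get pods --no-headers` (NAME READY STATUS RESTARTS AGE); the function name `flag_unhealthy_pods` is made up for illustration.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints every pod that is not Running or whose restart count exceeds
# the threshold (default 5).
flag_unhealthy_pods() {
  awk -v max="${1:-5}" '
    $3 != "Running" || ($4 + 0) > max { print $1, $3, "restarts=" ($4 + 0) }
  '
}

# Example: kubectl get pods -n argocd --no-headers | flag_unhealthy_pods 5
```

An empty result means no pod matched the rule; any line printed names the component to investigate first.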
Step 2: Diagnosing and Fixing CrashLoopBackOff
CrashLoopBackOff means the container started, exited with a non-zero code, and Kubernetes is applying an exponential backoff before the next restart attempt (starting at 10s, doubling up to 5 minutes).
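To make the backoff arithmetic concrete, this short sketch prints the delay before each restart, using the 10-second start and 5-minute cap stated above. The function name is illustrative only.

```shell
# Hypothetical helper: print the backoff delay before each of the first N restarts.
# The delay starts at 10s, doubles each time, and is capped at 300s (5 minutes).
backoff_schedule() (
  restarts="$1"
  delay=10
  i=1
  while [ "$i" -le "$restarts" ]; do
    echo "restart $i: wait ${delay}s"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
)

# Example: backoff_schedule 7
```

The cap is reached on the sixth restart, which is why a pod that has been crashing for a while appears to be retried only once every five minutes.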
2a. OOMKilled — Out of Memory
kubectl describe pod -n argocd <pod-name> | grep -A8 'Last State'
If you see Reason: OOMKilled, the container exceeded its memory limit. Increase the limit for the affected Deployment:
kubectl patch deployment argocd-repo-server -n argocd \
--type=json \
-p='[{"op":"replace","path":"/spec/template/spec/containers/0/resources/limits/memory","value":"768Mi"}]'
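argocd-repo-server is the most memory-intensive component when rendering Helm charts with many dependencies or Kustomize overlays in monorepos. The patch above sets 768Mi and requires that a memory limit is already defined on the container (a replace operation fails if the path does not exist); raise the value further if OOMKills persist.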
2b. Broken or Missing argocd-secret
ArgoCD requires a Secret named argocd-secret for the admin password hash, TLS certificates, and encryption key. If it is absent or missing keys, argocd-server crashes immediately:
panic: Failed to initialize server: could not read admin password: secret "argocd-secret" not found
Verify required keys are present:
kubectl get secret argocd-secret -n argocd -o jsonpath='{.data}' | \
python3 -c "import sys,json; [print(k) for k in json.load(sys.stdin)]"
Expected keys: admin.password, admin.passwordMtime, server.secretkey, tls.crt, tls.key.
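Comparing the key list against the expected set by eye is error-prone, so a small check like the one below may help. It is a sketch only: the function name is hypothetical, and the expected keys are the five listed above.

```shell
# Hypothetical helper: reads Secret key names (one per line) on stdin and
# prints each expected key that is absent.
missing_secret_keys() (
  present=$(cat)
  for key in admin.password admin.passwordMtime server.secretkey tls.crt tls.key; do
    # -x matches the whole line, -F treats the key as a literal string
    printf '%s\n' "$present" | grep -qxF "$key" || echo "$key"
  done
)

# Example:
#   kubectl get secret argocd-secret -n argocd -o json \
#     | jq -r '.data | keys[]' | missing_secret_keys
```

No output means every expected key is present.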
To reset the admin password (requires htpasswd from the apache2-utils package):
HASHED=$(htpasswd -nbBC 10 "" 'NewSecurePass123!' | tr -d ':\n' | sed 's/$2y/$2a/')
kubectl -n argocd patch secret argocd-secret \
-p "{\"stringData\":{\"admin.password\":\"$HASHED\",\"admin.passwordMtime\":\"$(date +%FT%T%Z)\"}}"
2c. Redis Connection Failure Causing Cascade Crash
If argocd-redis is down, both argocd-server and argocd-repo-server will crash with:
FATAL[0000] Failed to connect to Redis: dial tcp argocd-redis:6379: connect: connection refused
Verify Redis health and its Service endpoints:
kubectl get pod -n argocd -l app.kubernetes.io/name=argocd-redis
kubectl get endpoints argocd-redis -n argocd
If the pod is Running but Endpoints is empty, the Service label selector was broken — commonly by a partial Helm upgrade that changed label schemas between versions.
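A selector mismatch can be confirmed by comparing the Service selector with the pod labels. The sketch below assumes both are given as comma-separated key=value lists, which is the format of the SELECTOR column in `kubectl get svc -o wide` and the LABELS column in `kubectl get pod --show-labels`; `selector_matches` is a hypothetical name.

```shell
# Hypothetical helper: succeeds only when every key=value pair of the Service
# selector ($1) also appears in the pod labels ($2).
selector_matches() (
  selector="$1"
  labels=",$2,"
  set -f        # no filename globbing while splitting
  IFS=,
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;                          # this selector pair is satisfied
      *) echo "unmatched: $pair"; exit 1 ;;    # the Service will not select this pod
    esac
  done
)

# Example:
#   selector_matches "app.kubernetes.io/name=argocd-redis" \
#     "app.kubernetes.io/name=argocd-redis,pod-template-hash=6b4d5f8c7"
```

If a pair is reported as unmatched, either the Service selector or the pod template labels need to be brought back in line, typically by re-running the full install or Helm upgrade.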
Step 3: Diagnosing and Fixing ImagePullBackOff
The kubelet reports this error when it cannot pull the container image. kubectl describe pod will show one of these Event messages:
Failed to pull image "quay.io/argoproj/argocd:v2.99.0": manifest unknown: manifest unknown
Failed to pull image "quay.io/argoproj/argocd:v2.9.0": unauthorized: access to the requested resource is not authorized
Fix: Verify the Image Tag Exists
# Check the exact image reference the Deployment is using
kubectl get deployment argocd-server -n argocd \
-o jsonpath='{.spec.template.spec.containers[0].image}'
# Cross-check the tag exists on quay.io
curl -s "https://quay.io/api/v1/repository/argoproj/argocd/tag/?specificTag=v2.9.0" | \
jq '.tags[0].name'
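The two commands above can be chained by deriving the tag-lookup URL from the image reference. This is a minimal sketch that handles only tagged quay.io images; the function name is hypothetical and the URL format is the one used in the curl command above.

```shell
# Hypothetical helper: turn quay.io/<repo>:<tag> into the tag-lookup URL.
quay_tag_url() {
  image="$1"
  case "$image" in
    *@*)          echo "digest references have no tag: $image" >&2; return 1 ;;
    quay.io/*:*)  ;;
    *)            echo "expected quay.io/<repo>:<tag>, got: $image" >&2; return 1 ;;
  esac
  tag="${image##*:}"        # text after the last colon
  repo="${image#quay.io/}"  # strip the registry host
  repo="${repo%:*}"         # strip the tag
  echo "https://quay.io/api/v1/repository/${repo}/tag/?specificTag=${tag}"
}

# Example:
#   image=$(kubectl get deployment argocd-server -n argocd \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')
#   curl -s "$(quay_tag_url "$image")" | jq '.tags[0].name'
```

A null result from jq suggests the tag does not exist on the registry.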
Fix: Create an imagePullSecret for Private Registries
kubectl create secret docker-registry argocd-registry-creds \
--docker-server=<your-registry> \
--docker-username=<username> \
--docker-password=<token-or-password> \
-n argocd
# Attach the pull secret to the ServiceAccount of every component being restarted
for sa in argocd-server argocd-repo-server; do
  kubectl patch serviceaccount "$sa" -n argocd \
    -p '{"imagePullSecrets":[{"name":"argocd-registry-creds"}]}'
done
# Restart affected pods to pick up the new pull secret
kubectl rollout restart deployment argocd-server argocd-repo-server -n argocd
Step 4: Fixing Permission Denied Errors
4a. ArgoCD UI and CLI Access Denied
If users see permission denied in the UI or rpc error: code = PermissionDenied desc = permission denied from the CLI, the argocd-rbac-cm ConfigMap policy is too restrictive:
kubectl get configmap argocd-rbac-cm -n argocd -o yaml
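A minimal policy that lets a developer group view and sync applications (remove the sync rule for strictly read-only access); all other authenticated users fall back to the built-in role:readonly:
data:
  policy.csv: |
    p, role:viewer, applications, get, */*, allow
    p, role:viewer, applications, sync, */*, allow
    p, role:viewer, clusters, get, *, allow
    g, my-org:developers, role:viewer
  policy.default: role:readonly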
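Because a single rule changes what a role can do, it can help to list every grant that goes beyond `get`. The sketch below is a plain text filter over policy.csv, not a Casbin evaluation: it ignores wildcards, role inheritance, and deny rules, and the function name is hypothetical. For an authoritative answer, the `argocd admin settings rbac can` command evaluates the real policy.

```shell
# Hypothetical helper: reads policy.csv on stdin and prints "<resource>: <action>"
# for each allow rule of the given role whose action is not "get".
non_readonly_grants() {
  awk -F', *' -v role="$1" '
    { sub(/^[ \t]+/, ""); sub(/[ \t\r]+$/, "") }   # trim YAML indentation and trailing space
    $1 == "p" && $2 == role && $4 != "get" && $6 == "allow" { print $3 ": " $4 }
  '
}

# Example:
#   kubectl get configmap argocd-rbac-cm -n argocd \
#     -o jsonpath='{.data.policy\.csv}' | non_readonly_grants role:viewer
```

An empty result means the role is limited to `get` in the rules that name it directly.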
4b. Destination Cluster Permission Denied During Sync
When ArgoCD cannot create or update resources in the target cluster:
failed to sync: error creating ClusterRole: clusterroles.rbac.authorization.k8s.io is forbidden: User "system:serviceaccount:argocd:argocd-manager" cannot create resource "clusterroles"
The argocd-manager ClusterRoleBinding in the destination cluster was deleted or its ServiceAccount token expired. Re-register the cluster to restore it:
argocd cluster add <kubectl-context-name> --name <cluster-display-name>
This command re-creates the argocd-manager ServiceAccount, ClusterRole, and ClusterRoleBinding in the destination cluster and stores the new token in the argocd namespace.
Step 5: Diagnosing and Fixing Timeout Errors
Timeout errors appear in the ArgoCD UI as:
RPC failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
or in argocd app sync output:
FATAL[0030] timed out waiting for sync to complete
5a. Increase the Repo Server Timeout
Edit argocd-cmd-params-cm to extend the deadline:
kubectl edit configmap argocd-cmd-params-cm -n argocd
Add or update these keys:
data:
  server.repo.server.timeout.seconds: "120"
  controller.repo.server.timeout.seconds: "120"
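Apply the change by restarting the components that read these keys, which are argocd-server and the application controller StatefulSet:
kubectl rollout restart deployment/argocd-server statefulset/argocd-application-controller -n argocd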
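As a non-interactive alternative to `kubectl edit`, the same two keys can be set with a merge patch. This sketch only builds the patch body; `timeout_patch` is a hypothetical helper, and the key names are the ones shown above.

```shell
# Hypothetical helper: build a merge patch that sets both repo-server timeout
# keys to the same number of seconds.
timeout_patch() {
  case "$1" in
    ''|*[!0-9]*) echo "timeout must be a whole number of seconds" >&2; return 1 ;;
  esac
  printf '{"data":{"server.repo.server.timeout.seconds":"%s","controller.repo.server.timeout.seconds":"%s"}}\n' "$1" "$1"
}

# Example:
#   kubectl patch configmap argocd-cmd-params-cm -n argocd \
#     --type merge -p "$(timeout_patch 120)"
```

The values are quoted in the JSON because ConfigMap data must be strings.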
5b. NetworkPolicy Blocking Internal Component Communication
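ArgoCD components communicate on fixed ports. If strict NetworkPolicy rules are in place, verify connectivity from inside the argocd-server pod. These commands assume nc is available in the container image; if it is not, run the same checks from a debug container that has it: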
# Test argocd-server to argocd-repo-server (gRPC manifests API)
kubectl exec -n argocd deploy/argocd-server -- nc -zv argocd-repo-server 8081
# Test argocd-server to argocd-redis
kubectl exec -n argocd deploy/argocd-server -- nc -zv argocd-redis 6379
# Test argocd-server to argocd-dex-server (OIDC token endpoint)
kubectl exec -n argocd deploy/argocd-server -- nc -zv argocd-dex-server 5556
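When nc is not present in the image, bash's built-in /dev/tcp pseudo-device can perform the same TCP check. This is a sketch under two assumptions: that bash and the coreutils `timeout` command exist where it runs, and that a 3-second limit is acceptable. `tcp_check` is a hypothetical name.

```shell
# Hypothetical helper: succeeds when a TCP connection to host:port opens
# within 3 seconds. Uses bash's /dev/tcp redirection instead of nc.
tcp_check() {
  # Inside the bash -c script, $0 is the host and $1 is the port
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, run inside the argocd-server pod:
#   kubectl exec -n argocd deploy/argocd-server -- \
#     bash -c 'exec 3<>/dev/tcp/argocd-repo-server/8081 && echo open'
```

A refused connection fails immediately, while a dropped one fails only after the timeout, which is itself a hint about whether a NetworkPolicy is involved.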
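NetworkPolicies are additive, so the policy below adds allow rules for traffic between pods in the argocd namespace. Note that it also selects every pod for both Ingress and Egress isolation: if no other policy already allows them, traffic from your ingress controller and egress to DNS, the Kubernetes API server, and Git or Helm repositories will be blocked until you add rules for those as well:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-argocd-internal
  namespace: argocd
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: argocd
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: argocd
  policyTypes: [Ingress, Egress]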
Step 6: Fixing Connection Refused When the Pod Is Already Running
If argocd-server shows Running but you still cannot connect, the issue is at the Service or Ingress layer:
# Check the Service type and ports
kubectl get svc argocd-server -n argocd -o wide
# Verify the Service selector matches the pod labels
kubectl get pod -n argocd -l app.kubernetes.io/name=argocd-server --show-labels
# If type=ClusterIP, use port-forward for local access
kubectl port-forward svc/argocd-server -n argocd 8080:443
For LoadBalancer Services that remain <pending> on cloud providers, check your cloud LB quota. On bare-metal clusters using MetalLB, verify the IPAddressPool has available addresses:
kubectl get ipaddresspool -n metallb-system
kubectl get l2advertisement -n metallb-system
TLS Passthrough vs SSL Termination
ArgoCD serves HTTPS and gRPC on the same port 443. If your Ingress terminates TLS and proxies to the backend without gRPC support, the argocd CLI can fail with EOF or connection refused even though the browser works. For NGINX Ingress Controller, use TLS passthrough, which also requires the controller itself to run with the --enable-ssl-passthrough flag. Alternatively, pass --grpc-web to the argocd CLI when passthrough is not an option:
annotations:
  nginx.ingress.kubernetes.io/ssl-passthrough: "true"
  nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
Verification Checklist
After any fix, confirm full resolution:
# Confirm all pods are Running and the restart count is no longer increasing
kubectl get pods -n argocd
# Log in via CLI
argocd login <server-hostname> --username admin --password <password> --insecure
# List and inspect application health
argocd app list
argocd app get <app-name> --refresh
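# Re-sync if the app shows OutOfSync or Degraded. Review the diff first:
# --prune deletes resources that are no longer in Git, and --force uses a
# force apply that may delete and recreate resources
argocd app diff <app-name>
argocd app sync <app-name> --prune --force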
Full Diagnostic Script
#!/usr/bin/env bash
# ArgoCD Full Diagnostic Script
# Usage: bash argocd-diagnose.sh [namespace]
# Default namespace: argocd
NS="${1:-argocd}"
separator() { echo ""; echo "=============================="; echo "=== $1"; echo "=============================="; }
separator "Pod Status"
kubectl get pods -n "$NS" -o wide
separator "Services and Endpoints"
kubectl get svc,endpoints -n "$NS"
separator "Recent Events (newest 50)"
kubectl get events -n "$NS" --sort-by='.lastTimestamp' | tail -50
separator "argocd-server Logs (last 60 lines)"
kubectl logs -n "$NS" deployment/argocd-server --tail=60 2>&1
separator "argocd-repo-server Logs (last 60 lines)"
kubectl logs -n "$NS" deployment/argocd-repo-server --tail=60 2>&1
separator "argocd-application-controller Logs (last 60 lines)"
kubectl logs -n "$NS" statefulset/argocd-application-controller --tail=60 2>&1
separator "argocd-redis Logs (last 30 lines)"
kubectl logs -n "$NS" deployment/argocd-redis --tail=30 2>&1
separator "Describe Non-Running Pods"
for pod in $(kubectl get pods -n "$NS" --no-headers | awk '$3 != "Running" {print $1}'); do
  echo "--- Pod: $pod ---"
  kubectl describe pod "$pod" -n "$NS" | grep -A25 'State:\|Last State:\|Events:'
  echo ""
done
separator "argocd-secret Key Names (no values)"
kubectl get secret argocd-secret -n "$NS" -o json 2>/dev/null | \
python3 -c "import sys,json; d=json.load(sys.stdin); [print(k) for k in d.get('data',{}).keys()]" || \
echo "ERROR: argocd-secret not found — this will cause CrashLoopBackOff"
separator "argocd-cm ConfigMap"
kubectl get configmap argocd-cm -n "$NS" -o yaml 2>&1
separator "argocd-cmd-params-cm ConfigMap"
kubectl get configmap argocd-cmd-params-cm -n "$NS" -o yaml 2>&1
separator "argocd-rbac-cm ConfigMap"
kubectl get configmap argocd-rbac-cm -n "$NS" -o yaml 2>&1
separator "Resource Usage (top pods)"
kubectl top pods -n "$NS" 2>/dev/null || echo "metrics-server not available"
separator "Internal Network Connectivity Tests"
kubectl exec -n "$NS" deployment/argocd-server -- sh -c '
echo -n "repo-server:8081 -> "; nc -zv argocd-repo-server 8081 2>&1
echo -n "redis:6379 -> "; nc -zv argocd-redis 6379 2>&1
echo -n "dex-server:5556 -> "; nc -zv argocd-dex-server 5556 2>&1
' 2>/dev/null || echo "Could not exec into argocd-server pod"
separator "Registered Remote Clusters"
argocd cluster list 2>/dev/null || echo "argocd CLI not authenticated — run: argocd login <server>"
separator "Application Health Summary"
argocd app list 2>/dev/null || echo "argocd CLI not authenticated"
separator "Diagnostic Complete"
echo "Review any ERROR or CrashLoopBackOff entries above."
echo "Check argocd-secret keys, Redis Endpoints, and NetworkPolicy rules first."
Error Medic Editorial
The Error Medic Editorial team consists of senior DevOps engineers, SREs, and platform engineers with collective experience running Kubernetes workloads at scale across AWS EKS, Google GKE, Azure AKS, and on-premises clusters. We specialize in the Kubernetes GitOps ecosystem including ArgoCD, Flux CD, Tekton, and Crossplane. Our troubleshooting guides are validated against real production incidents, tested across multiple ArgoCD versions (2.6 through 2.12), and reviewed against official upstream documentation and GitHub issue history.