ArgoCD 'connection refused' Error: Complete Troubleshooting Guide (2024)
Fix ArgoCD 'connection refused', CrashLoopBackOff, ImagePullBackOff, and timeout errors with step-by-step diagnostic commands and proven solutions.
- ArgoCD 'connection refused' most commonly stems from the argocd-server pod not running, a misconfigured service port, or a network policy blocking traffic on port 443/8080.
- CrashLoopBackOff in ArgoCD pods is typically caused by invalid TLS certificates, missing secrets referenced in deployment manifests, or insufficient RBAC permissions for the service account.
- ImagePullBackOff errors indicate the container registry is unreachable, credentials are missing/expired, or the image tag does not exist—check imagePullSecrets and registry connectivity first.
- Permission denied errors usually point to broken RBAC bindings between ArgoCD's service account and the target cluster, or a missing/expired kubeconfig secret.
- Timeout errors often indicate cluster API server latency, an overloaded argocd-repo-server, or Git repository connectivity issues behind a corporate proxy.
- Quick fix: run 'kubectl rollout restart deployment argocd-server -n argocd' after verifying pod health; in many cases this alone clears transient connection issues.
| Method | When to Use | Time to Apply | Risk Level |
|---|---|---|---|
| kubectl rollout restart argocd-server | Transient crashes, pod stuck in unknown state | < 2 min | Low |
| Patch service port / type | Service misconfigured, LoadBalancer pending, NodePort wrong | 5-10 min | Low |
| Regenerate TLS certificate secret | CrashLoopBackOff with TLS handshake errors in logs | 10-15 min | Medium |
| Re-register cluster with argocd CLI | Permission denied or cluster kubeconfig secret expired | 10-20 min | Medium |
| Update imagePullSecret in argocd namespace | ImagePullBackOff, 401 from registry | 5 min | Low |
| Increase repo-server resources / tune concurrency | Timeout errors under load, repo-server OOMKilled | 15-30 min | Low |
| Reinstall ArgoCD with Helm/manifests | Severe configuration drift, persistent CrashLoopBackOff | 30-60 min | High |
Understanding the ArgoCD 'connection refused' Error
When you see dial tcp 127.0.0.1:443: connect: connection refused or failed to connect to server: connection refused in ArgoCD, it means either the argocd-server process is not listening, the Kubernetes Service is not routing correctly, or a firewall/network policy is dropping packets before they reach the pod. Unlike a timeout, a hard connection refused means the TCP handshake was actively rejected—the port is closed or the process is down.
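The difference is easy to reproduce locally. A minimal sketch of the probe (it assumes port 1 on 127.0.0.1 is closed, which is true on virtually every machine):

```shell
#!/usr/bin/env bash
# Probe a closed local port. The kernel answers with a TCP RST, so the
# connect fails instantly instead of hanging until the timeout fires.
if timeout 2 bash -c 'exec 3<>/dev/tcp/127.0.0.1/1' 2>/dev/null; then
  verdict="connected"
elif [ $? -eq 124 ]; then
  verdict="timeout"   # packets silently dropped (typical of firewalls)
else
  verdict="refused"   # port closed or process down
fi
echo "$verdict"
```

Pointing the same probe at your ArgoCD endpoint tells you which failure mode you are in: refused means look for a dead process or closed port, while a timeout means look for a packet filter in between.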
ArgoCD's server multiplexes its two primary interfaces, the gRPC API (used by the argocd CLI) and the HTTPS web UI, on a single port: 8080 inside the pod, exposed as 443 by the Service (or served as plain HTTP in --insecure mode). Confusing these ports, or running ArgoCD behind an ingress that terminates TLS incorrectly, is a frequent source of connection errors.
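For reference, the port mapping in a default install looks roughly like this (a sketch of the relevant fields only; confirm against your cluster with kubectl get svc argocd-server -n argocd -o yaml):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: argocd-server
  namespace: argocd
spec:
  ports:
  - name: http
    port: 80
    targetPort: 8080
  - name: https
    port: 443
    targetPort: 8080   # TLS and gRPC are multiplexed on one container port
```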
Step 1: Verify All ArgoCD Pods Are Healthy
Begin with the most fundamental check—are the pods actually running?
kubectl get pods -n argocd -o wide
kubectl get events -n argocd --sort-by='.lastTimestamp' | tail -30
You should see these workloads in Running state with all containers ready:
- argocd-server
- argocd-repo-server
- argocd-application-controller
- argocd-redis
- argocd-dex-server (if SSO is enabled)
If any pod shows CrashLoopBackOff, ImagePullBackOff, or Pending, that is your primary issue.
Diagnosing CrashLoopBackOff
CrashLoopBackOff means the container starts and immediately exits, and Kubernetes retries with an exponentially increasing back-off delay. In kubectl get pods it looks like this:
NAME READY STATUS RESTARTS AGE
argocd-server-xxxx 0/1 CrashLoopBackOff 8 18m
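The RESTARTS column climbs slowly because of that back-off: per the Kubernetes pod-lifecycle documentation, the delay starts at 10 seconds and doubles per crash, capped at 5 minutes. A quick sketch of the schedule:

```shell
# Kubernetes restart back-off: starts at 10s, doubles per crash, capped at
# 300s (and resets after a container runs cleanly for 10 minutes).
delay=10
schedule=""
for crash in 1 2 3 4 5 6 7 8; do
  schedule="$schedule ${delay}s"
  delay=$(( delay * 2 ))
  [ "$delay" -gt 300 ] && delay=300
done
echo "delays before each restart:$schedule"
```

Eight restarts therefore accumulate roughly 20 minutes of delay, which is consistent with the 8 restarts in 18m shown above.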
Fetch the crash reason:
kubectl logs -n argocd deployment/argocd-server --previous
kubectl describe pod -n argocd -l app.kubernetes.io/name=argocd-server
Common crash causes and their log signatures:
- TLS secret missing: open /app/config/server/tls/tls.crt: no such file or directory
- Redis unreachable: Failed to connect to Redis: dial tcp: lookup argocd-redis
- Port conflict: bind: address already in use
- OOM: container exits with code 137 (SIGKILL)
For TLS issues, regenerate the self-signed certificate:
kubectl delete secret argocd-server-tls -n argocd
kubectl rollout restart deployment argocd-server -n argocd
ArgoCD will auto-generate a new self-signed cert on startup.
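If you would rather supply your own certificate than rely on the auto-generated one, a self-signed pair can be created with openssl (the CN below is a placeholder; substitute your ArgoCD hostname):

```shell
# Generate a throwaway self-signed cert/key pair (CN is a placeholder).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=argocd.example.com" \
  -keyout tls.key -out tls.crt
# Sanity-check what was produced before loading it into the cluster:
openssl x509 -in tls.crt -noout -subject -enddate
# Then recreate the secret (cluster-side step, shown for completeness):
# kubectl create secret tls argocd-server-tls -n argocd --cert=tls.crt --key=tls.key
```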
Diagnosing ImagePullBackOff
This error means Kubernetes cannot pull the container image. The pod description shows:
Warning Failed 2m kubelet Failed to pull image "quay.io/argoproj/argocd:v2.x.y":
rpc error: code = Unknown desc = failed to pull and unpack image:
failed to resolve reference "quay.io/argoproj/argocd:v2.x.y":
unexpected status code 401 Unauthorized
Diagnostic steps:
# Check if the image tag exists
docker manifest inspect quay.io/argoproj/argocd:v2.x.y
# Verify pull secret exists
kubectl get secret -n argocd | grep pull
# Check service account references
kubectl get serviceaccount argocd-server -n argocd -o yaml
If using a private registry or air-gapped environment, create and attach the pull secret:
kubectl create secret docker-registry argocd-pull-secret \
--docker-server=your-registry.example.com \
--docker-username=YOUR_USER \
--docker-password=YOUR_PASS \
-n argocd
kubectl patch serviceaccount argocd-server -n argocd \
-p '{"imagePullSecrets": [{"name": "argocd-pull-secret"}]}'
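To understand what that secret must contain, and to spot typos or expired tokens in an existing one, it helps to assemble the .dockerconfigjson payload by hand. A sketch with placeholder credentials:

```shell
# Assemble the .dockerconfigjson payload that a docker-registry secret stores.
# Registry, user, and password here are placeholders.
REGISTRY="your-registry.example.com"
REG_USER="YOUR_USER"
REG_PASS="YOUR_PASS"
AUTH=$(printf '%s:%s' "$REG_USER" "$REG_PASS" | base64)
DOCKERCFG=$(printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}' \
  "$REGISTRY" "$REG_USER" "$REG_PASS" "$AUTH")
echo "$DOCKERCFG"
# Compare against what the cluster actually holds:
# kubectl get secret argocd-pull-secret -n argocd \
#   -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
```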
Step 2: Verify the ArgoCD Service and Networking
Even when pods are running, the service may not route correctly.
kubectl get svc -n argocd
kubectl describe svc argocd-server -n argocd
For a LoadBalancer service that stays in <pending> state (common on bare-metal clusters), switch to NodePort or use kubectl port-forward to bypass the service layer entirely:
kubectl port-forward svc/argocd-server -n argocd 8080:443
Then test connectivity:
curl -k https://localhost:8080/healthz
# Expected: {"status":"ok"}
If /healthz responds but your ingress still gives connection refused, the problem is in your ingress controller or network policy, not ArgoCD itself.
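For the common NGINX ingress case, ArgoCD's multiplexed TLS/gRPC endpoint usually needs TLS passthrough rather than re-termination at the ingress. A sketch (the hostname is hypothetical; ssl-passthrough also requires the --enable-ssl-passthrough flag on the ingress controller itself):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server
  namespace: argocd
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
  - host: argocd.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: argocd-server
            port:
              number: 443
```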
Check Network Policies
kubectl get networkpolicy -n argocd
A restrictive NetworkPolicy that does not allow ingress on port 443 or 8080 to the argocd-server pod will block client traffic. Depending on the CNI, this usually surfaces as a timeout (packets silently dropped), though some configurations reject the connection outright. Add an explicit allow rule:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-argocd-server
namespace: argocd
spec:
podSelector:
matchLabels:
app.kubernetes.io/name: argocd-server
ingress:
- ports:
- port: 8080
- port: 8083
Step 3: Fix Permission Denied Errors
ArgoCD uses a service account with RBAC rules to interact with registered clusters. A permission denied error in ArgoCD logs typically looks like:
Failed to list *v1.Namespace: namespaces is forbidden:
User "system:serviceaccount:argocd:argocd-application-controller"
cannot list resource "namespaces" in API group "" at the cluster scope
This usually means the ClusterRoleBinding for argocd-application-controller is missing or its ClusterRole was narrowed. Reapply the official RBAC manifest (note that this reapplies the full default install and can overwrite local customizations, so back up your config first):
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
For external cluster registration, the cluster secret in the argocd namespace may have an expired bearer token:
# List registered clusters
argocd cluster list
# Re-register the cluster
argocd cluster add my-cluster-context --name my-cluster
# Verify the secret was updated
kubectl get secret -n argocd -l argocd.argoproj.io/secret-type=cluster
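Under the hood, argocd cluster add writes a secret of the following shape; inspecting it directly can confirm whether the token or CA data has gone stale. All values below are placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-secret
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: my-cluster
  server: https://my-cluster-api.example.com:6443
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64-encoded-ca-cert>"
      }
    }
```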
Step 4: Fix Timeout Errors
Timeout errors surface in two main ways:
- CLI timeouts: rpc error: code = DeadlineExceeded desc = context deadline exceeded
- Sync timeouts: applications stuck in the Progressing state beyond the configured timeout
For CLI timeouts, increase the gRPC timeout:
export ARGOCD_OPTS='--grpc-web --request-timeout 120s'
argocd app sync my-app
For repo-server timeouts (slow Git clones, large repos):
kubectl edit configmap argocd-cmd-params-cm -n argocd
# Add under data: controller.repo.server.timeout.seconds: "180"
kubectl rollout restart deployment argocd-repo-server -n argocd
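The same ConfigMap also carries the repo-server concurrency knob mentioned in the comparison table above. A sketch of the relevant data entries (key names as documented upstream; verify them against your ArgoCD version):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  # Cap concurrent manifest generations so a burst of large repos
  # does not exhaust repo-server memory.
  reposerver.parallelism.limit: "10"
  # How long the controller waits on the repo-server.
  controller.repo.server.timeout.seconds: "180"
```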
If the repo-server is OOMKilled under load, increase its resource limits:
kubectl patch deployment argocd-repo-server -n argocd --type=json \
-p='[{"op":"replace","path":"/spec/template/spec/containers/0/resources",
"value":{"requests":{"cpu":"500m","memory":"512Mi"},
"limits":{"cpu":"2","memory":"2Gi"}}}]'
Step 5: Full Diagnostic Runbook
For systematic investigation, run this complete diagnostic sequence:
# 1. Overall pod health
kubectl get pods -n argocd
# 2. Recent events (shows OOM, failed mounts, pull errors)
kubectl get events -n argocd --sort-by='.lastTimestamp' | tail -50
# 3. argocd-server logs (last 200 lines, follow on crash)
kubectl logs -n argocd deployment/argocd-server --tail=200
# 4. repo-server logs
kubectl logs -n argocd deployment/argocd-repo-server --tail=100
# 5. application-controller logs
kubectl logs -n argocd statefulset/argocd-application-controller --tail=100
# 6. Service endpoints
kubectl get endpoints -n argocd argocd-server
# 7. Test internal connectivity from within cluster
kubectl run debug-pod --image=curlimages/curl --rm -it --restart=Never -n argocd \
-- curl -k https://argocd-server.argocd.svc.cluster.local/healthz
# 8. Check ArgoCD version and config
kubectl get cm argocd-cmd-params-cm -n argocd -o yaml
kubectl get cm argocd-rbac-cm -n argocd -o yaml
Automated Diagnostic Script
#!/usr/bin/env bash
# ArgoCD Diagnostic Script
# Run this to gather all relevant info before opening a support ticket
NAMESPACE="argocd"
OUTPUT_DIR="./argocd-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$OUTPUT_DIR"
echo "[1/10] Pod status..."
kubectl get pods -n $NAMESPACE -o wide > "$OUTPUT_DIR/pods.txt" 2>&1
echo "[2/10] Events (last 100)..."
kubectl get events -n $NAMESPACE --sort-by='.lastTimestamp' | tail -100 > "$OUTPUT_DIR/events.txt" 2>&1
echo "[3/10] argocd-server logs..."
kubectl logs -n $NAMESPACE deployment/argocd-server --tail=500 > "$OUTPUT_DIR/argocd-server.log" 2>&1
kubectl logs -n $NAMESPACE deployment/argocd-server --previous --tail=200 >> "$OUTPUT_DIR/argocd-server-prev.log" 2>&1
echo "[4/10] repo-server logs..."
kubectl logs -n $NAMESPACE deployment/argocd-repo-server --tail=300 > "$OUTPUT_DIR/repo-server.log" 2>&1
echo "[5/10] application-controller logs..."
kubectl logs -n $NAMESPACE statefulset/argocd-application-controller --tail=300 > "$OUTPUT_DIR/app-controller.log" 2>&1
echo "[6/10] Services and endpoints..."
kubectl get svc,endpoints -n $NAMESPACE > "$OUTPUT_DIR/services.txt" 2>&1
kubectl describe svc argocd-server -n $NAMESPACE >> "$OUTPUT_DIR/services.txt" 2>&1
echo "[7/10] ConfigMaps..."
kubectl get cm argocd-cm argocd-cmd-params-cm argocd-rbac-cm -n $NAMESPACE -o yaml > "$OUTPUT_DIR/configmaps.yaml" 2>&1
echo "[8/10] Network policies..."
kubectl get networkpolicy -n $NAMESPACE -o yaml > "$OUTPUT_DIR/netpolicies.yaml" 2>&1
echo "[9/10] RBAC..."
kubectl get clusterrolebinding | grep argocd > "$OUTPUT_DIR/rbac.txt" 2>&1
kubectl get clusterrole | grep argocd >> "$OUTPUT_DIR/rbac.txt" 2>&1
echo "[10/10] Health endpoint test via port-forward..."
# Start port-forward in background
kubectl port-forward svc/argocd-server -n $NAMESPACE 18080:443 &>/dev/null &
PF_PID=$!
sleep 3
curl -sk https://localhost:18080/healthz > "$OUTPUT_DIR/healthz.txt" 2>&1
curl -sk https://localhost:18080/metrics | head -50 > "$OUTPUT_DIR/metrics.txt" 2>&1
kill $PF_PID 2>/dev/null
echo ""
echo "Diagnostic bundle saved to: $OUTPUT_DIR"
echo "Files:"
ls -lh "$OUTPUT_DIR"
# Quick summary
echo ""
echo "=== QUICK SUMMARY ==="
echo "Pod status:"
grep -E '(CrashLoop|ImagePull|Pending|OOMKilled|Error)' "$OUTPUT_DIR/pods.txt" && echo " ISSUES FOUND" || echo " All pods appear healthy"
echo "Health check:"
cat "$OUTPUT_DIR/healthz.txt"
echo ""
echo "Recent errors in argocd-server:"
grep -iE '(error|fatal|panic|refused|denied|timeout)' "$OUTPUT_DIR/argocd-server.log" | tail -10
Error Medic Editorial
The Error Medic Editorial team consists of senior DevOps engineers and SREs with production experience across AWS EKS, GKE, and on-premise Kubernetes clusters. We specialize in GitOps tooling, Kubernetes troubleshooting, and platform engineering. Our guides are tested against real cluster failures before publication.
Sources
- https://argo-cd.readthedocs.io/en/stable/operator-manual/troubleshooting/
- https://argo-cd.readthedocs.io/en/stable/operator-manual/tls/
- https://github.com/argoproj/argo-cd/issues/4174
- https://github.com/argoproj/argo-cd/blob/master/docs/operator-manual/cluster-bootstrapping.md
- https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-restart-policy
- https://stackoverflow.com/questions/67452691/argocd-connection-refused-when-running-behind-nginx-ingress