I added my user to the docker group but still get 'permission denied' — why?

Group membership changes do not apply to your current shell session. You must either log out and log back in, run 'newgrp docker' in your current terminal, or use 'exec su -l $USER' to reload group membership. Run 'id' and confirm 'docker' appears in the output before retrying.

Docker says 'connection refused' even though the daemon appears to be running — what's wrong?

The most common cause is a corrupt or invalid /etc/docker/daemon.json. The daemon may appear running briefly then fail silently. Run 'sudo python3 -m json.tool /etc/docker/daemon.json' to validate JSON syntax, then check 'journalctl -u docker -n 50' for the real error. Also verify the DOCKER_HOST environment variable is not pointing to a stale TCP endpoint with 'echo $DOCKER_HOST'.

My container keeps restarting with exit code 137. Is this a permission error?

Exit code 137 is not a permission error. It means the container received SIGKILL, most commonly from the Linux OOM (out-of-memory) killer. Confirm with 'dmesg | grep -i oom' and 'docker inspect --format={{.State.OOMKilled}} <container_id>'. Fix by raising the container's memory limit (for example '-m 2g' in docker run) or by reducing the application's memory use, such as its heap size.
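The number 137 follows the usual shell convention that an exit status above 128 encodes 128 plus a signal number, so 137 is 128 + 9 (SIGKILL). A small sketch of that arithmetic follows; `describe_exit` is a hypothetical helper name, not a Docker command, and it only covers the classic signals 1 to 31.

```shell
# describe_exit CODE: explain a container exit code.
# Codes from 129 to 159 conventionally mean "terminated by signal (CODE - 128)".
describe_exit() {
  code=$1
  if [ "$code" -gt 128 ] && [ "$code" -le 159 ]; then
    sig=$((code - 128))
    # kill -l N prints the name of signal number N (for example KILL for 9)
    echo "exit $code = 128 + $sig (SIG$(kill -l "$sig"))"
  else
    echo "exit $code = application exit status"
  fi
}

describe_exit 137
describe_exit 1
```

The same helper shows that exit code 143 corresponds to SIGTERM, which usually indicates a normal stop request rather than an OOM kill.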

What is the difference between ImagePullBackOff and ErrImagePull in Kubernetes?

ErrImagePull means Kubernetes just attempted and failed to pull the image. ImagePullBackOff means Kubernetes has retried multiple times and is now backing off with exponential delay before trying again. Both ultimately have the same root causes: invalid image name or tag, missing imagePullSecret for private registries, expired credentials, TLS certificate errors, or Docker Hub rate limiting. Check 'kubectl describe pod <pod-name>' for the specific failure reason.
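To picture the back-off, here is a sketch of a doubling delay with a cap. The 10-second starting delay and the 300-second cap are assumptions based on commonly cited kubelet defaults, and `backoff_schedule` is a hypothetical helper; verify the real values for your Kubernetes version.

```shell
# backoff_schedule START CAP ATTEMPTS: print the wait before each retry,
# doubling every time and never exceeding CAP (all values in seconds).
backoff_schedule() {
  delay=$1
  cap=$2
  attempts=$3
  i=0
  out=""
  while [ "$i" -lt "$attempts" ]; do
    out="$out${out:+ }$delay"      # append the current delay to the list
    delay=$((delay * 2))           # exponential growth
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap                   # clamp to the maximum
    fi
    i=$((i + 1))
  done
  echo "$out"
}

backoff_schedule 10 300 7
```

Under those assumptions the schedule is 10 20 40 80 160 300 300, which is why a pod can sit in ImagePullBackOff for several minutes after the underlying problem has been fixed.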

How do I fix 'docker: Got permission denied' in a CI/CD pipeline (GitHub Actions, GitLab CI)?

In CI environments the Docker daemon is typically reached through the DOCKER_HOST variable or a mounted socket. For GitHub Actions, use the official 'docker/setup-buildx-action' and ensure the runner has Docker installed. For GitLab CI with Docker-in-Docker (dind), set 'privileged = true' under [runners.docker] in the runner's config.toml, or use the socket-binding approach by adding '/var/run/docker.sock:/var/run/docker.sock' to the 'volumes' list in that same section. Never hardcode credentials: use CI/CD secret variables and 'docker login' with those values.
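One way to apply the advice about credentials is to pipe the secret into 'docker login --password-stdin' so that it does not appear in the process list or the job log. This is a minimal sketch: `ci_registry_login` is a hypothetical helper, and `REGISTRY_USER`, `REGISTRY_PASSWORD`, and `REGISTRY_HOST` are assumed variable names that you would define as secrets in your CI settings.

```shell
# ci_registry_login: log in to a registry using CI secret variables.
# Aborts with a clear message if any of the assumed variables is unset.
ci_registry_login() {
  : "${REGISTRY_USER:?set REGISTRY_USER as a CI secret}"
  : "${REGISTRY_PASSWORD:?set REGISTRY_PASSWORD as a CI secret}"
  : "${REGISTRY_HOST:?set REGISTRY_HOST as a CI variable}"
  # The password travels over stdin instead of the command line
  printf '%s' "$REGISTRY_PASSWORD" |
    docker login --username "$REGISTRY_USER" --password-stdin "$REGISTRY_HOST"
}

# Usage inside a CI job step:
#   ci_registry_login && docker pull "$REGISTRY_HOST/myapp:latest"
```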

How to Fix Docker Permission Denied: Complete Troubleshooting Guide

Fix Docker 'permission denied' errors fast. Covers /var/run/docker.sock, user group fixes, rootless Docker, certificate issues, and CrashLoopBackOff. Step-by-st

Last updated: February 23, 2026

Last verified: February 23, 2026

Key Takeaways

The most common cause is the current Linux user not belonging to the 'docker' group, resulting in 'permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock'
Certificate expiration or misconfiguration causes 'x509: certificate has expired or is not yet valid' and manifests as ImagePullBackOff or connection refused errors in both Docker Engine and Kubernetes
OOM kills, CrashLoopBackOff, and container crashes often share root causes with permission and resource-limit misconfigurations. Diagnose with 'docker inspect', 'docker logs', and 'dmesg | grep -i oom' before applying fixes
Quick fix summary: add user to the docker group ('sudo usermod -aG docker $USER && newgrp docker'), rotate expired certs, and increase memory limits for OOM scenarios

Fix Approaches Compared
Method	When to Use	Time	Risk
Add user to docker group	User sees 'permission denied' on docker.sock for the first time	2 min	Low — standard configuration
sudo prefix (temporary)	Quick one-off command, testing access before group fix	30 sec	Low — no persistent change
Fix docker.sock permissions (chmod)	Group fix won't apply until re-login and you need it now	1 min	Medium — resets on daemon restart
Rootless Docker install	Multi-tenant system, hardened security posture required	15 min	Low — no root exposure
Rotate TLS certificates	ImagePullBackOff or 'certificate expired' errors in logs	10–30 min	Medium — brief registry downtime
Increase container memory limit	Container exits with code 137, dmesg shows OOM killer	2 min	Low — monitor for host impact
Reset Docker daemon config	Daemon fails to start, corrupted /etc/docker/daemon.json	5 min	Medium — loses custom config
Reinstall Docker Engine	Multiple failures persist after all other fixes	20 min	High — wipes images/volumes

Understanding Docker Permission Denied and Related Errors

Docker errors can be deceptive — a surface-level 'permission denied' message may actually mask a certificate problem, a misconfigured daemon, or a resource exhaustion event. This guide walks through every major failure mode systematically.

Step 1: Identify the Exact Error Message

Before applying any fix, capture the exact error. Run:

docker info 2>&1
docker ps 2>&1
journalctl -u docker --since '10 minutes ago' --no-pager

Match your output to one of these canonical error classes:

Class A — Socket Permission Denied

Got permission denied while trying to connect to the Docker daemon socket
at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/create:
dial unix /var/run/docker.sock: connect: permission denied

Class B — Certificate / TLS Errors

Error response from daemon: Get "https://registry-1.docker.io/v2/": 
x509: certificate has expired or is not yet valid
Error response from daemon: client sent an HTTP request to an HTTPS server

Class C — Connection Refused / Daemon Not Running

Cannot connect to the Docker daemon at unix:///var/run/docker.sock.
Is the docker daemon running?
error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.40/info": 
dial unix /var/run/docker.sock: connect: connection refused

Class D — Image Pull Failures (ImagePullBackOff / ErrImagePull)

Failed to pull image "myrepo/myapp:latest": rpc error: code = Unknown
msg = "failed to pull and unpack image: failed to resolve reference
\"myrepo/myapp:latest\": unexpected status code 401 Unauthorized"

Class E — CrashLoopBackOff / OOM Kill

Back-off restarting failed container
container exit code 137
Killed process 12345 (java) total-vm:2048000kB, anon-rss:1024000kB
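The matching step can be sketched as a small classifier. `classify_docker_error` is a hypothetical helper that looks for the key phrases from the sample messages above; the patterns are illustrative rather than exhaustive, and the first match wins.

```shell
# classify_docker_error TEXT: map an error message to class A-E.
# Patterns are checked top to bottom, so order matters.
classify_docker_error() {
  case $1 in
    *"permission denied"*) echo "A" ;;
    *x509*|*"HTTP request to an HTTPS server"*) echo "B" ;;
    *"connection refused"*|*"Cannot connect to the Docker daemon"*) echo "C" ;;
    *"Failed to pull image"*|*ImagePullBackOff*|*ErrImagePull*|*"401 Unauthorized"*) echo "D" ;;
    *"exit code 137"*|*"Back-off restarting"*|*"Killed process"*|*OOMKilled*) echo "E" ;;
    *) echo "unknown" ;;
  esac
}

classify_docker_error "dial unix /var/run/docker.sock: connect: permission denied"
classify_docker_error "x509: certificate has expired or is not yet valid"
```

In practice you might pass it live output, for example classify_docker_error "$(docker info 2>&1)", and then jump to the step for the class it prints.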

Step 2: Fix Class A — Docker Socket Permission Denied

This is the most common issue. The Docker daemon creates /var/run/docker.sock owned by root:docker, so any non-root user must be in the docker group to use it.

Check current group membership:

id $USER
groups
ls -la /var/run/docker.sock
# Expected: srw-rw---- 1 root docker ...

Permanent fix — add user to docker group:

sudo usermod -aG docker $USER
# Apply immediately without logout:
newgrp docker
# Verify:
docker run --rm hello-world

If you cannot log out, force the new group into your current shell with newgrp docker or exec su -l $USER.
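A frequent source of confusion is that the account can already be in the docker group while the running shell is not. The sketch below compares the two views: 'id -nG' without arguments reports the groups of the current process, while 'id -nG <user>' reads the account database. `has_word` and `group_state` are hypothetical helper names.

```shell
# has_word LIST WORD: succeed if WORD is one of the space-separated items in LIST.
has_word() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# group_state GROUP: report whether GROUP is active, pending, or missing.
group_state() {
  session=$(id -nG 2>/dev/null) || session=""                      # this shell
  account=$(id -nG "$(id -un 2>/dev/null)" 2>/dev/null) || account=""  # database
  if has_word "$session" "$1"; then
    echo "active: this shell already has group $1"
  elif has_word "$account" "$1"; then
    echo "pending: $1 is assigned but this session predates it; log in again or run newgrp $1"
  else
    echo "missing: user is not in group $1; run sudo usermod -aG $1 \$USER first"
  fi
}

group_state docker
```

A "pending" result means the usermod step worked and only the session needs refreshing.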

Security note: Users in the docker group have effective root access to the host filesystem via volume mounts. For production multi-tenant systems, prefer rootless Docker or use sudo with restricted sudoers rules.

Rootless Docker alternative (Docker Engine 20.10+):

dockerd-rootless-setuptool.sh install
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
# Add to ~/.bashrc for persistence
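For the persistence step, appending blindly to ~/.bashrc adds a duplicate line on every run. A small sketch of an idempotent append follows; `append_once` is a hypothetical helper, and the usage line assumes bash reads ~/.bashrc on your system.

```shell
# append_once FILE LINE: add LINE to FILE only if that exact line is absent.
append_once() {
  # -x matches whole lines, -F treats LINE as fixed text (so '$' is literal)
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (single quotes keep $XDG_RUNTIME_DIR unexpanded in the file):
#   append_once ~/.bashrc 'export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock'
```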

Step 3: Fix Class B — Certificate Expired or TLS Misconfiguration

Expired certificates break both image pulls and private registry authentication. They appear as ImagePullBackOff in Kubernetes and as x509 errors in plain Docker.

Check system clock first — an incorrect system clock causes valid certificates to appear expired:

date
timedatectl status
# If clock is wrong:
sudo timedatectl set-ntp true

Check Docker daemon certificate paths:

cat /etc/docker/daemon.json
# Look for: "tlscacert", "tlscert", "tlskey" entries
openssl x509 -in /etc/docker/certs.d/myregistry.example.com/ca.crt -noout -dates

Rotate Docker daemon TLS certificates:

# Back up old certs (adjust the path to the directory that holds the
# tlscacert/tlscert/tlskey files named in /etc/docker/daemon.json)
mv ~/.docker/certs.d /tmp/docker-certs-backup
# Regenerate (adjust CN/SANs to your environment)
openssl genrsa -out ca-key.pem 4096
openssl req -new -x509 -days 365 -key ca-key.pem -sha256 \
  -out ca.pem -subj "/CN=docker-ca"
openssl genrsa -out server-key.pem 4096
openssl req -subj "/CN=$(hostname)" -sha256 -new \
  -key server-key.pem -out server.csr
echo "subjectAltName = DNS:$(hostname),IP:$(hostname -I | awk '{print $1}'),IP:127.0.0.1" > extfile.cnf
openssl x509 -req -days 365 -sha256 -in server.csr \
  -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -out server-cert.pem -extfile extfile.cnf
# Copy ca.pem, server-cert.pem, and server-key.pem to the paths set in
# daemon.json, then restart:
sudo systemctl restart docker

For insecure (HTTP) private registries, add to /etc/docker/daemon.json:

{
  "insecure-registries": ["myregistry.example.com:5000"]
}

Then sudo systemctl restart docker.
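Because a syntax error in daemon.json can stop the daemon from starting, it may be worth validating the file before each restart. This sketch wraps the same 'python3 -m json.tool' check used elsewhere in this guide; `daemon_json_ok` is a hypothetical helper and it assumes python3 is installed.

```shell
# daemon_json_ok FILE: succeed only if FILE parses as JSON.
# This checks syntax only; it does not know which keys dockerd accepts.
daemon_json_ok() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Usage: restart only when the file is syntactically valid
#   daemon_json_ok /etc/docker/daemon.json && sudo systemctl restart docker
```

A trailing comma after the last array or object entry is a typical mistake that this check catches.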

Step 4: Fix Class C — Connection Refused / Daemon Not Running

# Check daemon status
sudo systemctl status docker
sudo systemctl status containerd

# If failed, check logs:
journalctl -u docker -n 50 --no-pager

# Common fix — corrupted daemon.json:
cat /etc/docker/daemon.json
# Validate JSON syntax:
python3 -m json.tool /etc/docker/daemon.json

# If invalid, back up and reset:
sudo mv /etc/docker/daemon.json /etc/docker/daemon.json.bak
sudo systemctl start docker

If the daemon starts after removing daemon.json, re-add options one at a time to identify the bad configuration key.

Check for port conflicts in TCP mode:

sudo ss -tlnp | grep 2376
# Kill conflicting process if found, then:
sudo systemctl restart docker

Step 5: Fix Class D — ImagePullBackOff / Access Denied

ImagePullBackOff means Kubernetes cannot pull the container image; plain Docker reports the same underlying failures directly as pull errors. Common causes: wrong credentials, private registry without imagePullSecret, or rate limiting.

Re-authenticate with registry:

# Docker Hub (the default when no server is given):
docker logout
docker login
# For private registries:
docker login myregistry.example.com

In Kubernetes — create and attach imagePullSecret:

kubectl create secret docker-registry regcred \
  --docker-server=myregistry.example.com \
  --docker-username=myuser \
  --docker-password=mypassword \
  --docker-email=me@example.com

# Reference in your pod spec:
# spec:
#   imagePullSecrets:
#   - name: regcred

Docker Hub rate limit (HTTP 429):

# Check remaining pulls:
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" \
  | jq -r .token)
curl -s --head -H "Authorization: Bearer $TOKEN" \
  https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest \
  | grep -i ratelimit

Upgrade to Docker Hub Pro or use a registry mirror to avoid limits.
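The rate-limit headers carry a count and a window in seconds, in the form "100;w=21600". The sketch below turns such a value into a readable summary; `parse_ratelimit` is a hypothetical helper, and it assumes the header value keeps that "count;w=seconds" shape.

```shell
# parse_ratelimit VALUE: turn "76;w=21600" into "76 pulls per 6h window".
parse_ratelimit() {
  # Header lines from curl end in a carriage return, so strip it (and spaces)
  value=$(printf '%s' "$1" | tr -d '\r ')
  count=${value%%;*}       # text before the first ';'
  window=${value##*w=}     # text after 'w=' (window length in seconds)
  echo "$count pulls per $((window / 3600))h window"
}

parse_ratelimit "100;w=21600"
```

Feed it the value part of the ratelimit-remaining header from the curl check above to see how many pulls are left in the current window.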

Step 6: Fix Class E — CrashLoopBackOff, OOM, and Container Crashes

Exit code 137 means the container was sent SIGKILL, almost always by the Linux OOM killer.

# Confirm OOM kill (dmesg may need sudo when kernel.dmesg_restrict is enabled):
sudo dmesg | grep -i 'oom\|killed process' | tail -20
grep -i oom /var/log/syslog | tail -20

# Inspect the container's last exit:
docker inspect --format='{{.State.OOMKilled}} {{.State.ExitCode}}' <container_id>
# Output: true 137

# Check current memory limit:
docker inspect --format='{{.HostConfig.Memory}}' <container_id>
# 0 = unlimited (relies on host limits)
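The .HostConfig.Memory value is reported in bytes, which is hard to read at a glance. A small sketch for converting it follows; `fmt_mem` is a hypothetical helper name.

```shell
# fmt_mem BYTES: render the value of .HostConfig.Memory for humans.
fmt_mem() {
  if [ "$1" -eq 0 ]; then
    echo "unlimited"                      # 0 means no container-level limit
  else
    echo "$(( $1 / 1024 / 1024 )) MiB"    # bytes -> MiB, rounded down
  fi
}

fmt_mem 2147483648   # the value docker stores for -m 2g
fmt_mem 0
```

For example, fmt_mem "$(docker inspect --format='{{.HostConfig.Memory}}' <container_id>)" shows the configured limit, so a container started with -m 2g reports 2048 MiB.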

Increase memory limit:

# docker run example:
docker run -m 2g --memory-swap 2g myapp:latest

# docker-compose.yml:
# services:
#   app:
#     mem_limit: 2g
#     memswap_limit: 2g

Diagnose CrashLoopBackOff in Kubernetes:

kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
# Look for: OOMKilled, Error, or application-level exit codes

If the application itself is crashing (not OOM), review the application logs from --previous. Common causes: missing environment variables, failed database connections at startup, or missing secrets.
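To separate OOM kills from application crashes without reading the full describe output, you can ask for only the last termination reason and exit code of each container. This is a sketch: `last_exit_reason` is a hypothetical helper, and the namespace defaults to "default" when omitted.

```shell
# last_exit_reason POD [NAMESPACE]: print name, reason, and exit code of the
# previous termination for every container in the pod, one per line.
last_exit_reason() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# Usage:
#   last_exit_reason <pod-name> <namespace>
```

A reason of OOMKilled with exit code 137 points to the memory limit; a reason of Error with another exit code points to the application itself.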

Step 7: Validate the Fix

After any repair, run this health check sequence:

# 1. Daemon health
docker info
# 2. Basic run
docker run --rm hello-world
# 3. Pull from registry
docker pull alpine:latest
# 4. Resource check
docker system df
docker stats --no-stream

All four steps passing confirms a healthy Docker environment.
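The sequence above can be wrapped so that it prints a pass/fail line per step and returns a single status, which is convenient in scripts and CI jobs. This is a sketch; `run_check` and `health_check` are hypothetical helper names.

```shell
# run_check LABEL CMD...: run CMD quietly and record PASS or FAIL.
failures=0
run_check() {
  label=$1
  shift
  if "$@" > /dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    failures=$((failures + 1))
  fi
}

# health_check: run the four validation steps; succeed only if all pass.
health_check() {
  failures=0
  run_check "daemon health"  docker info
  run_check "basic run"      docker run --rm hello-world
  run_check "registry pull"  docker pull alpine:latest
  run_check "resource check" docker system df
  [ "$failures" -eq 0 ]
}

# Usage:
#   health_check || echo "$failures check(s) failed"
```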

Frequently Asked Questions

#!/usr/bin/env bash
# docker-diagnose.sh — comprehensive Docker health check
# Run as the affected user (not root) to surface permission issues accurately

set -euo pipefail
DIVIDER='----------------------------------------'

echo "$DIVIDER"
echo "DOCKER DIAGNOSTIC REPORT — $(date)"
echo "User: $(id)"
echo "$DIVIDER"

# 1. Socket permissions
echo "[1] Docker socket:"
ls -la /var/run/docker.sock 2>/dev/null || echo "  ERROR: socket not found"

# 2. Group membership
echo "[2] Docker group members:"
getent group docker || echo "  ERROR: docker group does not exist"

# 3. Daemon connectivity
echo "[3] Docker daemon info:"
docker info --format 'Server Version: {{.ServerVersion}}{{"\n"}}Storage Driver: {{.Driver}}{{"\n"}}Logging Driver: {{.LoggingDriver}}' 2>&1 || echo "  ERROR: cannot connect to daemon"

# 4. Daemon logs (last 20 lines)
echo "[4] Recent daemon journal entries:"
journalctl -u docker --since '5 minutes ago' --no-pager -n 20 2>/dev/null || echo "  (journald not available)"

# 5. daemon.json validation
echo "[5] daemon.json syntax check:"
if [ -f /etc/docker/daemon.json ]; then
  python3 -m json.tool /etc/docker/daemon.json > /dev/null && echo "  OK" || echo "  ERROR: invalid JSON"
else
  echo "  /etc/docker/daemon.json not present (using defaults)"
fi

# 6. TLS certificate expiry check
echo "[6] TLS certificate expiry (if configured):"
for cert_dir in /etc/docker/certs.d/*/; do
  for cert in "$cert_dir"*.crt "$cert_dir"*.pem; do
    [ -f "$cert" ] || continue
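    # With set -e and pipefail, an unreadable file would abort the script here
    expiry=$(openssl x509 -in "$cert" -noout -enddate 2>/dev/null | cut -d= -f2) || expiry="(not a readable certificate)"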
    echo "  $cert => expires $expiry"
  done
done

# 7. System time (clock skew causes cert failures)
echo "[7] System clock:"
timedatectl status 2>/dev/null | grep -E 'Local time|NTP|synchronized' || date

# 8. Disk space (Docker needs headroom)
echo "[8] Docker disk usage:"
docker system df 2>/dev/null || echo "  (cannot connect to daemon)"

# 9. OOM events
echo "[9] Recent OOM events:"
dmesg --time-format=reltime 2>/dev/null | grep -i 'oom\|killed' | tail -10 || \
  grep -i oom /var/log/syslog 2>/dev/null | tail -10 || echo "  No OOM events found"

# 10. Running containers
echo "[10] Container status:"
docker ps -a --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}' 2>/dev/null || echo "  (cannot connect to daemon)"

echo "$DIVIDER"
echo "DIAGNOSTIC COMPLETE"

# --- COMMON FIXES ---
echo ""
echo "Quick fixes:"
echo "  Add user to docker group:  sudo usermod -aG docker \$USER && newgrp docker"
echo "  Restart daemon:            sudo systemctl restart docker"
echo "  Prune disk space:          docker system prune -af --volumes   (destructive: deletes unused images and volumes)"
echo "  Re-login to registry:      docker logout && docker login"
echo "  Check Kubernetes pods:     kubectl describe pod <name> && kubectl logs <name> --previous"

Error Medic Editorial

The Error Medic Editorial team consists of senior DevOps engineers, SREs, and platform engineers with experience running containerized workloads at scale on AWS, GCP, and bare-metal Kubernetes clusters. Our guides are tested against real failure scenarios in lab environments before publication.
