Docker Permission Denied: Complete Fix Guide for Crashes, OOM, Disk Full, 502/504 & More
Fix docker permission denied, OOM kills, no space left on device, 502/504 errors, and high CPU with step-by-step Linux commands and a diagnostic script.
- Permission denied on /var/run/docker.sock is caused by your Linux user not belonging to the docker group — fix with: sudo usermod -aG docker $USER && newgrp docker
- Exit code 137 means the container's main process received SIGKILL, most often from the kernel OOM killer: confirm with docker inspect, then raise the limit with docker run -m 2g or mem_limit: 2g in docker-compose.yml
- 'no space left on device' errors require pruning stopped containers, dangling images, and unused volumes with docker system prune -a --volumes
- 502 Bad Gateway and connection refused errors almost always mean the Docker daemon is down or the container crashed before binding its port — check systemctl status docker and docker logs
- Quick fix summary: (1) verify daemon is running, (2) fix user group membership, (3) prune disk space, (4) set memory/CPU resource limits, (5) read docker logs for root cause
| Method | When to Use | Time | Risk |
|---|---|---|---|
| sudo usermod -aG docker $USER | Permission denied on docker.sock | < 1 min + re-login | Low |
| docker system prune -a --volumes | No space left on device / disk full | 1–10 min | Medium — deletes unused data |
| docker run -m 2g | Container OOM killed (exit code 137) | < 1 min | Low |
| sudo systemctl restart docker | Daemon unresponsive / 502 Bad Gateway | < 1 min | Medium — stops all containers |
| Edit daemon.json data-root | Docker partition permanently full | 5–15 min + migration | High — requires data copy |
| docker update --cpus='1.5' | Container consuming excessive CPU | < 1 min | Low |
| DOCKER_BUILDKIT=1 docker build | Slow Docker image builds | Varies | None |
| sudo journalctl -xeu docker.service | Daemon fails to start / core dumps | Diagnostic only | None |
Understanding Docker Errors on Linux
Docker errors range from a simple Unix socket permission mismatch to kernel-level OOM kills and filesystem exhaustion. Every problem has a clear diagnostic path. This guide covers each failure class with the exact error strings you will see and the commands to resolve them.
Exact Error Messages You Will Encounter
Permission denied connecting to the daemon:
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: dial unix /var/run/docker.sock: connect: permission denied
Container OOM killed:
Error response from daemon: Cannot start container <id>: [8] System error: cannot allocate memory
Killed
Exit code will be 137 (128 + 9, where 9 is the signal number of SIGKILL).
Disk / filesystem full:
Error response from daemon: no space left on device
Write /var/lib/docker/tmp/GetImageBlob: no space left on device
Daemon not running:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
DNS or registry connection refused:
dial tcp: lookup registry-1.docker.io: connection refused
502 / 504 from reverse proxy (nginx, Traefik, Caddy):
502 Bad Gateway
504 Gateway Timeout
upstream connect error or disconnect/reset before headers. reset reason: connection failure
Step 1: Verify the Docker Daemon Is Running
Every Docker troubleshooting session should start with this check. If the daemon is down, every docker command fails.
sudo systemctl status docker
If output shows Active: failed or Active: inactive:
sudo systemctl start docker
If it refuses to start, inspect the systemd journal:
sudo journalctl -xeu docker.service --no-pager | tail -60
Common daemon startup failures include a corrupted /var/lib/docker, invalid JSON in /etc/docker/daemon.json, or a port/socket conflict. Validate the config file before restarting (the --validate flag is available in Docker Engine 23.0 and later):
sudo dockerd --validate --config-file /etc/docker/daemon.json
If the config is malformed, reset it to a safe default:
echo '{}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
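Overwriting daemon.json with {} discards every existing setting, so it can help to keep a copy first. The following is a minimal sketch, not a Docker command: reset_daemon_json is a hypothetical helper name, and the timestamped .bak suffix is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch only: reset_daemon_json is a hypothetical helper, not part of Docker.
# It keeps a timestamped copy of the config before overwriting it with '{}'.
reset_daemon_json() {
  local cfg="${1:-/etc/docker/daemon.json}"
  if [ -f "$cfg" ]; then
    # -p preserves mode and timestamps on the backup copy
    cp -p "$cfg" "$cfg.bak.$(date +%Y%m%d%H%M%S)" || return 1
  fi
  printf '{}\n' > "$cfg"
}

# Usage (in a root shell):
#   reset_daemon_json /etc/docker/daemon.json && systemctl restart docker
```

Once the daemon starts again, the backup can be compared against the new file to restore the valid settings one at a time.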
Step 2: Fix Docker Permission Denied Errors
The permission denied error on /var/run/docker.sock is the most common Docker issue on Linux. Docker's Unix socket is owned by the docker group; only root or group members can access it.
Check your current group membership:
groups $USER
Add your user to the docker group:
sudo usermod -aG docker $USER
Apply in the current terminal without logging out (newgrp starts a new shell; other open sessions keep their old group list):
newgrp docker
Or fully log out and back in, then verify:
groups | grep docker
docker ps
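A common point of confusion is that usermod changes the account's group list, while an already running session keeps the groups it started with. The sketch below compares the two. has_group is a hypothetical helper, and the messages are illustrative only.

```shell
#!/usr/bin/env bash
# has_group is a hypothetical helper: succeeds if the space-separated
# group list in $1 contains the exact group name in $2.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

me="$(id -un)"
session_groups="$(id -nG)"        # groups of the current shell session
account_groups="$(id -nG "$me")"  # groups recorded for the account

if has_group "$session_groups" docker; then
  echo "docker group is active in this session"
elif has_group "$account_groups" docker; then
  echo "account is in the docker group, but this session predates the change: log out and back in, or run newgrp docker"
else
  echo "not in the docker group: sudo usermod -aG docker $me"
fi
```

Matching the whole name avoids false positives from similarly named groups such as dockerroot.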
Security note: Members of the docker group have effective root access to the host system. For production environments, consider rootless Docker instead:
dockerd-rootless-setuptool.sh install
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
In CI/CD environments where the user inside the container is not in the docker group, mount the socket and add the socket's group ID to the container process:
docker run -v /var/run/docker.sock:/var/run/docker.sock \
--group-add $(stat -c '%g' /var/run/docker.sock) \
your-image
Step 3: Diagnose 502, 504, and Connection Refused Errors
These errors appear when a reverse proxy (nginx, Traefik, Caddy) cannot reach the upstream container. The proxy returns 502 Bad Gateway when the container is down and 504 Gateway Timeout when the container is alive but responding too slowly.
Check all container states:
docker ps -a
Containers with status Exited or Restarting are your culprits. Read their logs:
docker logs --tail=200 <container_name>
docker logs --since=30m <container_name>
Inspect the exit code and OOM status:
docker inspect <container_id> \
--format='ExitCode: {{.State.ExitCode}} | OOMKilled: {{.State.OOMKilled}} | Error: {{.State.Error}}'
Verify port bindings are correct:
docker port <container_id>
ss -tlnp | grep <expected_port>
Test the application from inside the container:
docker exec -it <container_id> curl -v http://localhost:<internal_port>/health
For 504 timeouts, check whether the app is deadlocked or CPU-starved:
docker exec <container_id> ps aux
docker exec <container_id> top -b -n1 | head -20
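To separate "nothing is listening" (typical for 502) from "listening but slow" (typical for 504), a quick TCP probe from the host can help. This is a sketch under assumptions: port_open is a hypothetical helper, it relies on bash's /dev/tcp redirection and the coreutils timeout command being available, and 8080 stands in for whatever host port your proxy targets.

```shell
#!/usr/bin/env bash
# port_open is a hypothetical helper. It relies on bash's /dev/tcp
# redirection and coreutils timeout; both are assumptions about the host.
port_open() {
  local host="$1" port="$2"
  # Succeeds only if a TCP connection is established within 3 seconds.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# 8080 is a placeholder for the published port of the upstream container.
if port_open 127.0.0.1 8080; then
  echo "port 8080 accepts connections: investigate app latency and proxy timeouts"
else
  echo "port 8080 refused or timed out: container down, or port not published"
fi
```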
Step 4: Fix OOM and Out of Memory Errors
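Exit code 137 means the process received SIGKILL. The usual cause is an OOM kill, but docker kill, or docker stop after its grace period expires, produces the same code. Confirm: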
docker inspect <container_id> --format='{{.State.OOMKilled}}'
# Returns: true
dmesg | grep -iE 'out of memory|oom|killed process'
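Exit codes above 128 encode a signal number, which is why 137 maps to SIGKILL. The sketch below decodes the codes most often seen from crashed containers. explain_exit_code is a hypothetical helper, and the signal table is deliberately limited to a few common Linux signals.

```shell
#!/usr/bin/env bash
# explain_exit_code is a hypothetical helper: maps an exit code to the
# signal it encodes (code = 128 + signal number) for a few common signals.
explain_exit_code() {
  local code="$1" sig name
  if [ "$code" -le 128 ]; then
    echo "exit $code: set by the process itself"
    return
  fi
  sig=$((code - 128))
  case "$sig" in
    2)  name=SIGINT ;;
    6)  name=SIGABRT ;;
    9)  name=SIGKILL ;;
    11) name=SIGSEGV ;;
    15) name=SIGTERM ;;
    *)  name="signal $sig" ;;
  esac
  echo "exit $code: killed by $name (128 + $sig)"
}

explain_exit_code 137
explain_exit_code 139
explain_exit_code 143
```

A 137 with OOMKilled reported as false usually points at an explicit kill rather than memory exhaustion.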
Set a container memory limit at run time:
docker run -m 2g --memory-swap 2g your-image
Setting --memory-swap equal to -m disables swap for that container. Set it larger to permit swap.
In docker-compose.yml:
services:
  app:
    image: your-image
    mem_limit: 2g
    memswap_limit: 2g
Update a running container without restarting:
docker update --memory 2g --memory-swap 2g <container_name>
Check host memory availability:
free -h
grep -E 'MemAvailable|SwapFree' /proc/meminfo
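Before choosing a limit, it is worth checking that it actually fits in the memory the host has available. The sketch below converts a Docker-style size to bytes and compares it with MemAvailable. to_bytes is a hypothetical helper that handles whole numbers with a single b, k, m, or g suffix (binary multiples), and the 2g value is only an example.

```shell
#!/usr/bin/env bash
# to_bytes is a hypothetical helper: converts sizes such as 512m or 2g
# to bytes using binary multiples. Whole numbers only.
to_bytes() {
  local v="$1" num unit
  num="${v%[bkmgBKMG]}"   # strip one trailing unit letter, if present
  unit="${v#"$num"}"      # whatever was stripped
  case "$unit" in
    k|K) echo $((num * 1024)) ;;
    m|M) echo $((num * 1024 * 1024)) ;;
    g|G) echo $((num * 1024 * 1024 * 1024)) ;;
    *)   echo "$num" ;;
  esac
}

limit="$(to_bytes 2g)"   # example limit
avail_kb="$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)"
avail=$(( ${avail_kb:-0} * 1024 ))

if [ "$limit" -gt "$avail" ]; then
  echo "limit ($limit bytes) exceeds MemAvailable ($avail bytes): the host may swap or OOM"
else
  echo "limit ($limit bytes) fits within MemAvailable ($avail bytes)"
fi
```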
If the host itself is under memory pressure, add a swap file:
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
Step 5: Fix Docker Disk Full and No Space Left on Device
Docker accumulates data aggressively: image layers, stopped container filesystems, anonymous volumes, and build cache. First, understand the breakdown:
docker system df
docker system df -v
df -h /var/lib/docker
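The df output can also be turned into a simple threshold check, for example as a guard in a monitoring script. This is a sketch: disk_pct is a hypothetical helper, and the 85 percent threshold is an arbitrary example value.

```shell
#!/usr/bin/env bash
# disk_pct is a hypothetical helper: prints the use% of the filesystem
# holding the given path, as a bare integer.
disk_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

target=/var/lib/docker
[ -d "$target" ] || target=/   # fall back when Docker's data dir is absent

pct="$(disk_pct "$target")"
if [ "$pct" -ge 85 ]; then     # 85 is an arbitrary example threshold
  echo "$target filesystem is ${pct}% full: review docker system df before pruning"
else
  echo "$target filesystem is ${pct}% full"
fi
```

Using df -P keeps each filesystem on a single line, so the awk field positions stay stable.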
Incremental cleanup (least destructive first):
docker container prune # remove stopped containers
docker image prune # remove dangling images only
docker image prune -a # remove ALL unused images
docker volume prune # remove unused anonymous volumes (Docker 23.0+; add -a to include named volumes)
docker builder prune # remove build cache
Full cleanup (removes all unused resources):
docker system prune -a --volumes
Move Docker data to a larger disk (permanent fix):
- Stop Docker:
sudo systemctl stop docker
- Copy data to the new location:
sudo rsync -aP /var/lib/docker/ /mnt/large-disk/docker/
- Update /etc/docker/daemon.json:
{
"data-root": "/mnt/large-disk/docker"
}
- Start Docker:
sudo systemctl start docker
- Verify:
docker info | grep 'Docker Root Dir'
Automate cleanup with cron to prevent recurrence:
# Add to root's crontab (sudo crontab -e):
0 3 * * 0 docker system prune -f --filter 'until=168h' >> /var/log/docker-cleanup.log 2>&1
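Enable log rotation in /etc/docker/daemon.json to prevent logs from filling the disk. Merge these keys into the file's existing top-level object, then restart Docker; the settings apply only to containers created afterwards: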
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
Step 6: Fix Docker High CPU Usage
Identify the offending container:
docker stats --no-stream
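On a host with many containers, filtering the stats output by a threshold makes the offender easier to spot. The sketch below assumes the two-column format shown in the comment. over_cpu is a hypothetical filter, and the sample lines are made-up input used only to demonstrate it.

```shell
#!/usr/bin/env bash
# over_cpu is a hypothetical filter: reads "name cpu%" lines on stdin and
# prints the containers at or above the threshold given in $1.
over_cpu() {
  awk -v max="$1" '{
    pct = $2
    sub(/%/, "", pct)
    if (pct + 0 >= max + 0) print $1, $2
  }'
}

# Real input would come from:
#   docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}' | over_cpu 100
# Made-up sample input for demonstration:
printf '%s\n' 'web 152.33%' 'db 3.10%' 'worker 100.00%' | over_cpu 100
```

Docker reports CPU relative to one core, so values above 100% are normal for multi-threaded workloads on multi-core hosts.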
Apply CPU limits:
# Limit to 1.5 cores at run time
docker run --cpus='1.5' your-image
# Update a running container without restart
docker update --cpus='1.5' <container_name>
In docker-compose.yml (deploy.resources syntax, introduced for Swarm; Docker Compose V2 also applies it to standalone containers):
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.50'
Profile the process inside the container:
docker exec -it <container_id> top -b -n3
docker exec -it <container_id> sh -c 'ps aux --sort=-%cpu | head -20'
Step 7: Analyze Container Crashes and Core Dumps
When a container crashes with a segmentation fault or generates a core dump:
# Check dmesg for segfaults or signal 11
dmesg | grep -E 'segfault|core dumped|signal 11'
# Get crash context from journald
sudo journalctl -u docker --since '1 hour ago' | grep -iE 'fatal|panic|segfault|core'
# Read the crash log from the container
docker logs --tail=200 <container_id>
# Enable core dumps and ptrace for deep debugging
docker run --ulimit core=-1 \
--cap-add SYS_PTRACE \
-v /tmp/cores:/cores \
your-image
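Where a core file ends up is decided by the host's kernel.core_pattern setting, which is shared by all containers, so the /cores mount above only helps if the pattern points there. The sketch below classifies the current setting. core_pattern_kind is a hypothetical helper, and the interpretation in the comments is a simplification.

```shell
#!/usr/bin/env bash
# core_pattern_kind is a hypothetical helper: classifies a
# kernel.core_pattern value by its first character.
core_pattern_kind() {
  case "$1" in
    "|"*) echo pipe ;;      # cores are piped to a helper program on the host
    /*)   echo absolute ;;  # cores are written using this absolute path template
    *)    echo relative ;;  # cores land in the crashing process's working directory
  esac
}

pattern="$(cat /proc/sys/kernel/core_pattern 2>/dev/null)"
echo "core_pattern: ${pattern:-unknown} ($(core_pattern_kind "$pattern"))"
```

If the result is pipe, a host-side handler such as systemd-coredump or apport is likely collecting the dumps, and they will not appear in the mounted directory.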
Identify the crash log location inside the container:
docker exec <container_id> ls /var/crash/ 2>/dev/null || echo 'No /var/crash directory'
docker exec <container_id> find /var/log -name '*.log' -newer /proc/1 2>/dev/null | head -10
Step 8: Fix Slow Docker Performance
Enable BuildKit for significantly faster image builds (it is already the default builder in Docker Engine 23.0 and later):
DOCKER_BUILDKIT=1 docker build -t myapp .
Or enable it permanently in /etc/docker/daemon.json:
{"features": {"buildkit": true}}
Reduce build context with .dockerignore:
node_modules
.git
*.log
dist
__pycache__
.pytest_cache
Fix DNS resolution slowness (often causes 2+ second delays on every network call):
# Test DNS inside a container
docker run --rm busybox nslookup google.com
# If slow, override DNS in /etc/docker/daemon.json:
# {"dns": ["8.8.8.8", "8.8.4.4"]}
# Then: sudo systemctl restart docker
Verify the storage driver is overlay2 (not the slow devicemapper loop mode):
docker info | grep 'Storage Driver'
# Should show: Storage Driver: overlay2
To switch to overlay2, add to /etc/docker/daemon.json:
{"storage-driver": "overlay2"}
Then restart Docker. Note: this does NOT migrate existing images.
Step 9: Emergency Docker Recovery
If Docker is completely non-functional:
# 1. Gracefully stop all containers
docker stop $(docker ps -q) 2>/dev/null || true
# 2. Stop the daemon
sudo systemctl stop docker
# 3. Validate config
sudo dockerd --validate --config-file /etc/docker/daemon.json
# 4. If config is corrupt, reset it
echo '{}' | sudo tee /etc/docker/daemon.json
# 5. Restart
sudo systemctl start docker
# LAST RESORT: Full reset — loses all containers, images, and volumes
sudo systemctl stop docker
# The glob must expand in a root shell; a non-root shell cannot list /var/lib/docker
sudo sh -c 'rm -rf /var/lib/docker/*'
sudo systemctl start docker
Docker Diagnostics Script
#!/usr/bin/env bash
# Docker Comprehensive Diagnostics Script
# Usage: bash docker-diag.sh 2>&1 | tee /tmp/docker-diag.log
set -uo pipefail
HR='================================================================='
echo "$HR"
echo "DOCKER DIAGNOSTICS — $(date)"
echo "$HR"
echo ""
echo "--- 1. Daemon Status ---"
systemctl is-active docker 2>/dev/null && echo "Daemon: RUNNING" || echo "Daemon: STOPPED"
systemctl is-enabled docker 2>/dev/null | xargs -I{} echo "Enabled: {}"
echo ""
echo "--- 2. Docker Version ---"
docker version 2>/dev/null | head -8 || echo "ERROR: Cannot connect to daemon"
echo ""
echo "--- 3. User Group Check ---"
id -nG | tr ' ' '\n' | grep -qx docker \
&& echo "OK: Current user is in the docker group" \
|| echo "WARNING: Not in docker group. Fix: sudo usermod -aG docker $(id -un) && newgrp docker"
echo ""
echo "--- 4. Docker Socket Permissions ---"
ls -la /var/run/docker.sock 2>/dev/null || echo "WARNING: docker.sock not found"
echo ""
echo "--- 5. Disk Usage ---"
df -h /var/lib/docker 2>/dev/null || df -h / 2>/dev/null
echo ""
docker system df 2>/dev/null || echo "(cannot reach daemon)"
echo ""
echo "--- 6. All Containers (running + stopped) ---"
docker ps -a 2>/dev/null || echo "(cannot reach daemon)"
echo ""
echo "--- 7. Container Resource Usage ---"
docker stats --no-stream 2>/dev/null || echo "(cannot reach daemon)"
echo ""
echo "--- 8. OOM Events (dmesg) ---"
dmesg 2>/dev/null | grep -iE 'out of memory|oom_kill|killed process' | tail -20 \
|| echo "No OOM events found (or dmesg requires root)"
echo ""
echo "--- 9. Daemon Logs (last hour) ---"
sudo journalctl -u docker --no-pager --since '1 hour ago' 2>/dev/null | tail -40 \
|| echo "Cannot read journald (try: sudo journalctl -u docker)"
echo ""
echo "--- 10. Daemon Config ---"
if [ -f /etc/docker/daemon.json ]; then
echo "Contents of /etc/docker/daemon.json:"
cat /etc/docker/daemon.json
else
echo "No /etc/docker/daemon.json found (using all defaults)"
fi
echo ""
echo "--- 11. Storage Driver and Root Dir ---"
docker info 2>/dev/null | grep -E 'Storage Driver|Docker Root Dir|Logging Driver|Cgroup Driver' \
|| echo "(cannot reach daemon)"
echo ""
echo "--- 12. Crash Indicators (segfault / core dump) ---"
dmesg 2>/dev/null | grep -iE 'segfault|signal 11|core dumped' | tail -10 \
|| echo "No segfaults found in dmesg"
echo ""
echo "$HR"
echo "Diagnostics complete. Review warnings above."
echo "Full output saved to: /tmp/docker-diag.log (if redirected)"
echo "$HR"
Error Medic Editorial
Error Medic Editorial is a team of senior DevOps engineers and SREs with combined decades of experience managing containerized workloads on Linux in production. We specialize in Docker, Kubernetes, and cloud-native infrastructure troubleshooting — translating real incident postmortems into actionable, command-first guides.