Ansible Failed: Fix Connection Refused, Permission Denied & Timeout Errors
Fix Ansible failures including connection refused, permission denied, and timeout errors. Step-by-step diagnosis with real commands and verified solutions.
- Connection refused usually means SSH is not running on port 22 of the target host, or a firewall is blocking access — verify with `nc -zv <host> 22` before touching Ansible config.
- Permission denied is almost always a missing or mismatched SSH key, a wrong `ansible_user`, or sudo not configured correctly on the remote host.
- Timeout errors stem from slow DNS resolution, high network latency, or `ansible_ssh_timeout` / `DEFAULT_TIMEOUT` set too low for your environment.
- Quick fix summary: run `ansible <host> -m ping -vvvv` to get full SSH-level output, isolate which layer is failing (network, auth, privilege escalation), then apply the targeted fix from the comparison table below.
| Method | When to Use | Time to Apply | Risk |
|---|---|---|---|
| Add/fix SSH public key in authorized_keys | Permission denied with key auth | 2 min | Low — non-destructive |
| Set ansible_password + sshpass for password auth | No key-based auth available | 5 min | Medium — credential exposure risk in logs |
| Configure become + sudo NOPASSWD in /etc/sudoers | Permission denied during privilege escalation | 5 min | Medium — grants passwordless sudo |
| Increase ansible_ssh_timeout and DEFAULT_TIMEOUT | Sporadic timeout on slow or remote hosts | 1 min | Low — benign config change |
| Switch ansible_connection to paramiko | OpenSSH incompatibility with old hosts | 2 min | Low — slight performance overhead |
| Use ProxyJump / ansible_ssh_common_args for bastion | Connection refused through a jump host | 10 min | Low — adds network hop complexity |
| Fix /etc/hosts or DNS for target hostnames | Timeout caused by failed name resolution | 5 min | Low — may affect other services |
| Disable host key checking (testing only) | Known_hosts mismatch blocking first run | 1 min | High — disables MITM protection |
Understanding Ansible Failures
Ansible executes tasks by opening an SSH connection to each managed host, copying a small Python module, running it, and returning structured output. When any step in that chain breaks, Ansible surfaces a FAILED or UNREACHABLE task with a short error message. The three most common failure families — connection refused, permission denied, and timeout — each point to a different layer of the stack.
Understanding which layer is failing is the single most important diagnostic step. Resist the urge to tweak ansible.cfg blindly; instead, isolate the problem with the verbosity flag and standard Unix tools first.
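The mapping from error text to failing layer can be sketched as a small shell helper. This is only an illustration: `classify_ansible_error` is a hypothetical name, and the patterns cover just the messages quoted in this guide, so real-world output may need more cases.

```bash
#!/usr/bin/env bash
# classify_ansible_error MESSAGE
# Hypothetical helper: prints the layer an Ansible error message most
# likely points at (network, auth, become, timeout, or unknown).
classify_ansible_error() {
  local msg=$1
  case "$msg" in
    *"Connection refused"*)    echo "network" ;;   # nothing listening, or actively rejected
    *"Permission denied"*)     echo "auth" ;;      # wrong user or SSH key
    *"Missing sudo password"*) echo "become" ;;    # privilege escalation
    *"timed out"*|*"Timeout ("*) echo "timeout" ;; # connect or prompt timeout
    *)                         echo "unknown" ;;
  esac
}

classify_ansible_error "ssh: connect to host 192.168.1.10 port 22: Connection refused"
```

Pasting the `msg` field of a failed task into the function gives a first hint about which of the steps below to start with.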
Step 1: Enable Verbose Output and Reproduce the Failure
Run your playbook or ad-hoc command with -vvvv to expose the raw SSH command Ansible is constructing:
ansible all -m ping -vvvv -i inventory.ini
This reveals the exact SSH arguments, the control socket path, and the remote Python interpreter path. Copy the raw SSH command from the output and run it manually in your terminal. If the manual SSH succeeds but Ansible fails, the problem is in your Ansible configuration. If the manual SSH also fails, the problem is at the OS/network level and Ansible is not the right place to fix it.
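Pulling the raw SSH command out of a saved debug log might look like the sketch below. It assumes the verbose output contains lines of the form `<host> SSH: EXEC ssh ...`; the exact format can differ between Ansible versions, and `extract_ssh_exec` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# extract_ssh_exec LOGFILE
# Prints the first ssh command found in a saved `-vvvv` log.
# Assumption: the log marks it with "SSH: EXEC ssh ...".
extract_ssh_exec() {
  local logfile=$1
  sed -n 's/^.*SSH: EXEC \(ssh .*\)$/\1/p' "$logfile" | head -n 1
}

# Typical use (commented out because it needs a real inventory):
#   ansible web01 -m ping -vvvv -i inventory.ini > debug.log 2>&1
#   extract_ssh_exec debug.log
```

The extracted line usually ends with the remote shell command Ansible wanted to run; replacing that tail with a simple `true` is often enough to test the connection by hand.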
Step 2: Diagnose Connection Refused
What you see:
fatal: [web01]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.1.10 port 22: Connection refused",
    "unreachable": true
}
Root causes and checks:
- SSH daemon not running: Connect to the host via console or out-of-band access and run `systemctl status sshd`. If it is stopped, start it with `systemctl start sshd`.
- Wrong port: Some hardened hosts move SSH to a non-standard port. Check `sshd_config` on the target with `grep ^Port /etc/ssh/sshd_config`, then set `ansible_port` in your inventory or group_vars to match.
- Firewall blocking port 22: From your Ansible control node, run `nc -zv <target_ip> 22`. A `Connection refused` from nc confirms the port is closed or actively rejected; a firewall that silently drops packets shows up as a timeout instead. On the target host, check `iptables -L -n` or `firewall-cmd --list-all`. Open the port with `firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload`.
- Wrong host or IP in inventory: Verify the host is resolvable with `getent hosts web01`. If DNS fails, add a static entry to `/etc/hosts` on the control node or use the IP directly in inventory.
- Jump host / bastion required: If the target is in a private subnet, you need a ProxyJump. In an INI inventory, set `ansible_ssh_common_args='-o ProxyJump=bastion.example.com'` on the host or group; in `ansible.cfg`, the equivalent setting is `ssh_common_args` under `[ssh_connection]`.
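The difference between a refused and a silently dropped connection can be checked from the control node without nc. The sketch below assumes bash (for its `/dev/tcp` pseudo-device) and the `timeout` command from GNU coreutils; `port_state` is a hypothetical name, and anything that is neither a success nor a timeout is lumped together as `refused-or-error`.

```bash
#!/usr/bin/env bash
# port_state HOST PORT [TIMEOUT_SECONDS]
# Prints one of: open | timeout | refused-or-error
port_state() {
  local host=$1 port=$2 wait=${3:-5} rc=0
  # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connection;
  # `timeout` bounds how long we wait for it to complete.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null || rc=$?
  case $rc in
    0)   echo "open" ;;              # something accepted the connection
    124) echo "timeout" ;;           # no answer: dropped packets or host down
    *)   echo "refused-or-error" ;;  # refused, unresolvable name, or other failure
  esac
}

# Example (commented out; the address is the one used in this guide):
#   port_state 192.168.1.10 22
```

An `open` result means the network path is fine and the problem lies further up the stack; `timeout` points at a firewall drop rule or routing rather than at sshd itself.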
Step 3: Diagnose Permission Denied
What you see:
fatal: [db01]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: deploy@db01: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).",
    "unreachable": true
}
or during privilege escalation:
fatal: [db01]: FAILED! => {
    "msg": "Missing sudo password"
}
Authentication-layer fixes:
- Verify the correct user: Check `ansible_user` in your inventory or `remote_user` in `ansible.cfg`. The user must exist on the remote host. Confirm with `ssh -i ~/.ssh/id_rsa deploy@db01`.
- Check the SSH key: Unless told otherwise, Ansible relies on OpenSSH's default identity files, such as `~/.ssh/id_rsa`. If your key is elsewhere, set `ansible_ssh_private_key_file` in inventory. Ensure the public key appears in `~/.ssh/authorized_keys` on the target, with permissions `600` on the file and `700` on `~/.ssh`.
- Repair authorized_keys: If you can access the host another way, add the key with `ssh-copy-id -i ~/.ssh/id_rsa.pub deploy@db01`, or manually append the public key and fix permissions with `chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys`.
- SELinux context on authorized_keys: On RHEL/CentOS systems with SELinux enforcing, a wrong file context blocks SSH key auth even though permissions look correct. Restore it with `restorecon -Rv ~/.ssh`.
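The permission checks above can be scripted for repeated use on the target host. The following is a minimal sketch, assuming GNU `stat` (the `-c '%a'` format is not available on BSD or macOS); `check_ssh_perms` is a hypothetical name, and it only compares against the modes recommended in this guide.

```bash
#!/usr/bin/env bash
# check_ssh_perms [DIR]
# DIR is a .ssh directory (default: ~/.ssh). Prints a warning for each
# mode that differs from 700 (directory) or 600 (authorized_keys).
check_ssh_perms() {
  local dir=${1:-$HOME/.ssh} status=0 mode
  mode=$(stat -c '%a' "$dir") || return 2
  if [ "$mode" != "700" ]; then
    echo "WARN: $dir is $mode, expected 700"
    status=1
  fi
  if [ -f "$dir/authorized_keys" ]; then
    mode=$(stat -c '%a' "$dir/authorized_keys")
    if [ "$mode" != "600" ]; then
      echo "WARN: $dir/authorized_keys is $mode, expected 600"
      status=1
    fi
  else
    echo "WARN: $dir/authorized_keys is missing"
    status=1
  fi
  [ "$status" -eq 0 ] && echo "OK: permissions look correct"
  return "$status"
}
```

Run it as the login user on the target, for example `check_ssh_perms ~/.ssh`; a non-zero exit status means at least one warning was printed.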
Privilege escalation (sudo) fixes:
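- Add become configuration: In `ansible.cfg` (or with the equivalent `become: true` keywords in your playbook):
      [privilege_escalation]
      become=True
      become_method=sudo
      become_user=root
- Configure passwordless sudo: Edit `/etc/sudoers` on the target (always use `visudo`):
      deploy ALL=(ALL) NOPASSWD: ALL
  For a narrower scope you can restrict the rule to specific commands, but Ansible runs most modules through a shell and temporary Python files, so command-restricted rules generally do not work with become.
- Provide the sudo password at runtime: If passwordless sudo is not acceptable, run `ansible-playbook site.yml --ask-become-pass`, or store the password encrypted in Ansible Vault.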
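Instead of editing `/etc/sudoers` directly, many distributions also read drop-in files from `/etc/sudoers.d`. The sketch below prepares such a file in a scratch location and syntax-checks it with `visudo -cf` when visudo is installed. The function name `write_sudoers_dropin` and the drop-in approach are assumptions for illustration, not something the steps above prescribe.

```bash
#!/usr/bin/env bash
# write_sudoers_dropin USER OUTFILE
# Writes a passwordless-sudo rule for USER to OUTFILE with mode 0440,
# then syntax-checks it if visudo is available.
write_sudoers_dropin() {
  local user=$1 out=$2
  printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user" > "$out"
  chmod 0440 "$out"
  # A broken sudoers file can lock you out, so check before installing.
  if command -v visudo >/dev/null 2>&1; then
    visudo -cf "$out" >/dev/null
  fi
}

# Example (commented out; installing the file requires root):
#   write_sudoers_dropin deploy /tmp/deploy-ansible
#   sudo install -m 0440 -o root -g root /tmp/deploy-ansible /etc/sudoers.d/deploy-ansible
```

Preparing the file elsewhere and installing it only after the check passes keeps a typo from ever reaching the live sudo configuration.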
Step 4: Diagnose Timeout Errors
What you see:
fatal: [app01]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host app01 port 22: Operation timed out",
    "unreachable": true
}
or mid-task:
fatal: [app01]: FAILED! => {
    "msg": "Timeout (12s) waiting for privilege escalation prompt"
}
Timeout-layer fixes:
- Increase the SSH connection timeout: In `ansible.cfg`:
      [defaults]
      timeout = 60
      [ssh_connection]
      ssh_args = -o ConnectTimeout=60
  Or per-host in inventory: `ansible_ssh_timeout=60`.
- Slow DNS resolution: If `getent hosts <target>` takes several seconds, DNS is the bottleneck. Add the host to `/etc/hosts` on the control node, or set `UseDNS no` in `sshd_config` on target hosts to skip the reverse-DNS lookup during the SSH handshake.
- SSH multiplexing: Enable ControlPersist to reuse connections across tasks, which reduces per-task overhead:
      [ssh_connection]
      ssh_args = -o ControlMaster=auto -o ControlPersist=60s
      pipelining = True
- GSSAPI delays on non-Kerberos hosts: SSH tries Kerberos auth before publickey on many distros. Disable it:
      [ssh_connection]
      ssh_args = -o GSSAPIAuthentication=no
  This alone can cut connection time by 5–30 seconds per host.
- Privilege escalation timeout: If sudo prompts are timing out, ensure `requiretty` is not set in `/etc/sudoers` (it breaks Ansible's non-interactive sudo):
      Defaults !requiretty

Note that `ssh_args` is a single setting that replaces Ansible's built-in default, so if you apply more than one of these snippets, put all the options on one `ssh_args` line.
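Since several of the fixes above touch `ssh_args`, a small helper can assemble the chosen options into one line for `ansible.cfg`. This is a minimal sketch; `build_ssh_args` is a hypothetical name, and the option values are simply the ones used in this section.

```bash
#!/usr/bin/env bash
# build_ssh_args OPTION...
# Each OPTION is an OpenSSH "Key=value" pair. Prints a single
# `ssh_args = ...` line suitable for the [ssh_connection] section.
build_ssh_args() {
  local line="ssh_args =" opt
  for opt in "$@"; do
    line+=" -o $opt"
  done
  printf '%s\n' "$line"
}

build_ssh_args ConnectTimeout=60 ControlMaster=auto ControlPersist=60s GSSAPIAuthentication=no
# prints: ssh_args = -o ConnectTimeout=60 -o ControlMaster=auto -o ControlPersist=60s -o GSSAPIAuthentication=no
```

Keeping the options in one place avoids the situation where a later `ssh_args` line quietly removes connection reuse or the timeout you set earlier.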
Step 5: Validate the Fix
After applying any change, validate with a targeted ping:
ansible <host_or_group> -m ping -i inventory.ini
Expected success output:
web01 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
Then run your playbook with --check (dry-run) before live execution:
ansible-playbook site.yml --check --diff -i inventory.ini
For CI/CD pipelines, set ANSIBLE_HOST_KEY_CHECKING=False only in ephemeral environments where hosts are freshly provisioned and the MITM risk is negligible. Never disable host key checking in production.
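When validating a whole group, counting results is easier than reading every block of output. The sketch below tallies the status markers of ad-hoc ping output read from standard input. It assumes the `host | STATUS => {` line format shown above; `summarize_ping` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# summarize_ping
# Reads `ansible <group> -m ping` output on stdin and prints how many
# hosts reported SUCCESS, FAILED!, or UNREACHABLE!.
summarize_ping() {
  awk -F' [|] ' '
    / [|] (SUCCESS|FAILED!|UNREACHABLE!) / {
      split($2, parts, " ")      # first word after the pipe is the status
      status = parts[1]
      sub(/!$/, "", status)      # drop the trailing "!" if present
      count[status]++
    }
    END {
      printf "success=%d failed=%d unreachable=%d\n",
             count["SUCCESS"], count["FAILED"], count["UNREACHABLE"]
    }'
}

# Example (commented out because it needs a real inventory):
#   ansible all -m ping -i inventory.ini 2>&1 | summarize_ping
```

A non-zero `unreachable` count sends you back to Steps 2 to 4, while `failed` usually indicates a problem on the remote side such as a missing Python interpreter.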
Ansible Failure Diagnostic Script
#!/usr/bin/env bash
# Ansible Failure Diagnostic Script
# Usage: bash ansible-diag.sh <target_host> [inventory_file]
HOST=${1:?"Usage: $0 <target_host> [inventory]"}
INVENTORY=${2:-"inventory.ini"}
echo "=== [1] Resolve hostname ==="
getent hosts "$HOST" || echo "WARN: hostname not resolvable -- check DNS or /etc/hosts"
echo ""
echo "=== [2] Test TCP port 22 (connection refused check) ==="
nc -zv -w 5 "$HOST" 22 2>&1 || echo "FAIL: port 22 unreachable -- check sshd and firewall"
echo ""
echo "=== [3] Ansible ping with full verbosity ==="
ansible "$HOST" -m ping -i "$INVENTORY" -vvvv 2>&1 | tee /tmp/ansible-ping-debug.log
echo "Full output saved to /tmp/ansible-ping-debug.log"
echo ""
echo "=== [4] Extract raw SSH command from debug log ==="
grep -o "ssh .*$HOST.*" /tmp/ansible-ping-debug.log | head -5
echo ""
echo "=== [5] Check current ansible.cfg in effect ==="
ansible --version | grep 'config file'
echo ""
echo "=== [6] Show effective variables for host ==="
ansible "$HOST" -m debug -a 'var=ansible_ssh_private_key_file' -i "$INVENTORY"
ansible "$HOST" -m debug -a 'var=ansible_user' -i "$INVENTORY"
ansible "$HOST" -m debug -a 'var=ansible_port' -i "$INVENTORY"
echo ""
echo "=== [7] Test privilege escalation ==="
ansible "$HOST" -m command -a 'whoami' -i "$INVENTORY" --become
echo ""
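echo "=== [8] Check GSSAPI auth delay ==="
echo "--- Baseline: default SSH options ---"
time ssh -o BatchMode=yes "$HOST" echo ok 2>&1 || true
echo "--- With GSSAPIAuthentication=no ---"
time ssh -o GSSAPIAuthentication=no -o BatchMode=yes "$HOST" echo ok 2>&1 || true
echo "If the second run is significantly faster, add GSSAPIAuthentication=no to ssh_args in ansible.cfg"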
echo ""
echo "=== Diagnostic complete ==="
Error Medic Editorial
Error Medic Editorial is a team of senior DevOps and SRE engineers with combined experience managing infrastructure at scale across cloud and on-premises environments. We write practical, command-first troubleshooting guides tested against real systems.