Why does Ansible say 'Permission denied (publickey)' even though I can SSH manually with the same user?

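Ansible may be offering a different key, or connecting as a different user, than your interactive SSH session. Ansible's default `ssh` connection plugin runs the OpenSSH client, so it uses `ssh-agent` only when `SSH_AUTH_SOCK` is present in the environment Ansible runs in; that variable is often missing under cron, CI runners, or `sudo`. A stale `ansible_ssh_private_key_file` or `private_key_file` setting can also pin the wrong key. (`-o ForwardAgent=yes` does not help here: it forwards the agent to the remote host and does not change which key authenticates the first hop.) Check what Ansible passes by running `ansible <host> -m ping -vvvv` and looking for the `IdentityFile` and `User` options in the SSH command. Set `ansible_ssh_private_key_file=~/.ssh/your_key` in inventory or group_vars to pin the correct key.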
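To confirm which key files Ansible is actually handing to ssh, the verbose log can be filtered rather than read by eye. The sketch below assumes the log contains `IdentityFile="..."` options in the form Ansible's ssh connection plugin normally prints; the function name `extract_identity_files` and the host `web01` are placeholders, not part of Ansible.

```shell
# Hypothetical helper: list the key files named in `ansible -vvvv` output.
# Reads the log on stdin and prints each distinct IdentityFile path once.
extract_identity_files() {
  grep -o 'IdentityFile="[^"]*"' | sed 's/^IdentityFile="//; s/"$//' | sort -u
}

# Example, run on the control node (web01 is a placeholder host):
#   ansible web01 -m ping -vvvv 2>&1 | extract_identity_files
```

If nothing is printed, Ansible is not pinning a key, and ssh falls back to agent keys and its default identity files.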

Ansible shows 'UNREACHABLE' with 'connection refused' but SSH works fine when I run it from the terminal. What gives?

The most common cause is that your terminal SSH is reaching a different address or port than Ansible is (because of a local alias, VPN, or /etc/hosts entry). Run `ansible <host> -m ping -vvvv` and compare the host and port in the raw SSH command with the ones you connect to manually. Also check that `ansible_host` and `ansible_port` in inventory are correct. Ansible's OpenSSH connection reads `~/.ssh/config` too, but a `Host` block only applies when its pattern matches the name Ansible passes to ssh, so a `Port`, `ProxyJump`, or `ProxyCommand` defined for an alias is skipped if the inventory uses the bare IP.

How do I fix 'Timeout waiting for privilege escalation prompt' when using become?

This error means Ansible did not see the privilege escalation prompt it expected within the timeout, most often because sudo is waiting for a password that Ansible was not given. Fix it by: (1) supplying the password with `--ask-become-pass` or storing `ansible_become_password` in Ansible Vault, (2) configuring NOPASSWD in /etc/sudoers for the target user, or (3) removing `Defaults requiretty` from /etc/sudoers if `pipelining=True` is set in ansible.cfg, because pipelining runs sudo without a TTY.

My playbook works for most hosts but a few always time out. How do I tune per-host timeouts?

Set `ansible_ssh_timeout` as a host variable in your inventory for the slow hosts: `slow-host ansible_ssh_timeout=120`. Additionally, keep SSH connection multiplexing enabled (`-o ControlMaster=auto -o ControlPersist=300s` in `ssh_args`) so subsequent tasks reuse the established connection instead of re-handshaking. For geographically distant hosts, also check whether `-o GSSAPIAuthentication=no` in `ssh_args` shaves off the DNS lookup time spent during the Kerberos probe.

I get 'Failed to connect to the host via ssh: Host key verification failed.' How do I resolve this without disabling host key checking?

The safest resolution is to accept the host key properly. Run `ssh-keyscan -H <host> >> ~/.ssh/known_hosts` from the control node to add the host key, then compare its fingerprint against a trusted source such as the host's console, since `ssh-keyscan` does not authenticate what it fetches. For managed fleets, automate this by provisioning known_hosts via your configuration management bootstrap process. Only fall back to `ANSIBLE_HOST_KEY_CHECKING=False` in fully automated ephemeral environments (CI pipelines building and destroying VMs) where hosts are freshly created and the network path is trusted.

Ansible Failed: Fix Connection Refused, Permission Denied & Timeout Errors

Fix Ansible failures including connection refused, permission denied, and timeout errors. Step-by-step diagnosis with real commands and verified solutions.

Last updated: February 23, 2026

Last verified: February 23, 2026


Key Takeaways

Connection refused usually means nothing is listening on the SSH port of the target host, or a firewall is actively rejecting the connection. Verify with `nc -zv <host> 22` before touching Ansible config.
Permission denied is almost always a missing or mismatched SSH key, a wrong `ansible_user`, or sudo not configured correctly on the remote host.
Timeout errors stem from slow DNS resolution, high network latency, or `ansible_ssh_timeout` / `DEFAULT_TIMEOUT` set too low for your environment.
Quick fix summary: run `ansible <host> -m ping -vvvv` to get full SSH-level output, isolate which layer is failing (network, auth, privilege escalation), then apply the targeted fix from the comparison table below.

Fix Approaches Compared
| Method | When to Use | Time to Apply | Risk |
| --- | --- | --- | --- |
| Add/fix SSH public key in `authorized_keys` | Permission denied with key auth | 2 min | Low: non-destructive |
| Set `ansible_password` + `sshpass` for password auth | No key-based auth available | 5 min | Medium: credential exposure risk in logs |
| Configure `become` + sudo NOPASSWD in /etc/sudoers | Permission denied during privilege escalation | 5 min | Medium: grants passwordless sudo |
| Increase `ansible_ssh_timeout` and `DEFAULT_TIMEOUT` | Sporadic timeout on slow or remote hosts | 1 min | Low: benign config change |
| Switch `ansible_connection` to `paramiko` | OpenSSH incompatibility with old hosts | 2 min | Low: slight performance overhead |
| Use ProxyJump / `ansible_ssh_common_args` for bastion | Connection refused through a jump host | 10 min | Low: adds network hop complexity |
| Fix /etc/hosts or DNS for target hostnames | Timeout caused by failed name resolution | 5 min | Low: may affect other services |
| Disable host key checking (testing only) | known_hosts mismatch blocking first run | 1 min | High: disables MITM protection |

Understanding Ansible Failures

Ansible executes tasks by opening an SSH connection to each managed host, copying a small Python module, running it, and returning structured output. When any step in that chain breaks, Ansible surfaces a FAILED or UNREACHABLE task with a short error message. The three most common failure families — connection refused, permission denied, and timeout — each point to a different layer of the stack.

Understanding which layer is failing is the single most important diagnostic step. Resist the urge to tweak ansible.cfg blindly; instead, isolate the problem with the verbosity flag and standard Unix tools first.
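One way to make that isolation step mechanical is to map the error text to the layer it implicates. This is a rough sketch, not an Ansible feature: the function name and the category labels are invented for illustration, and the patterns cover only the messages quoted in this guide.

```shell
# Hypothetical helper: read an Ansible error message on stdin and print
# the layer to inspect first. Order matters: the become timeout message
# also contains "Timeout", so privilege escalation is tested before it.
classify_ansible_error() {
  local msg
  msg=$(cat)
  case "$msg" in
    *"privilege escalation"*|*"Missing sudo password"*) echo "privilege-escalation" ;;
    *"Permission denied"*)                              echo "authentication" ;;
    *"Host key verification failed"*)                   echo "host-key" ;;
    *"Connection refused"*)                             echo "connection-refused" ;;
    *"timed out"*|*"Timeout"*)                          echo "timeout" ;;
    *)                                                  echo "unknown" ;;
  esac
}

# Example:
#   echo 'Timeout (12s) waiting for privilege escalation prompt' | classify_ansible_error
```

Anything reported as `unknown` still needs the full `-vvvv` output read by hand.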

Step 1: Enable Verbose Output and Reproduce the Failure

Run your playbook or ad-hoc command with -vvvv to expose the raw SSH command Ansible is constructing:

```
ansible all -m ping -vvvv -i inventory.ini
```

This reveals the exact SSH arguments, the control socket path, and the remote Python interpreter path. Copy the raw SSH command from the output and run it manually in your terminal. If the manual SSH succeeds but Ansible fails, the problem is in your Ansible configuration. If the manual SSH also fails, the problem is at the OS/network level and Ansible is not the right place to fix it.
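Pulling the raw command out of a long `-vvvv` log can be scripted. A minimal sketch, assuming the log marks the command with the `SSH: EXEC` prefix that Ansible's ssh plugin prints at high verbosity; `extract_ssh_exec` is a hypothetical name.

```shell
# Hypothetical helper: print the first raw ssh command found in
# `ansible -vvvv` output read from stdin, without the "<host> SSH: EXEC " prefix.
extract_ssh_exec() {
  sed -n 's/^<[^>]*> SSH: EXEC //p' | head -n 1
}

# Example (web01 is a placeholder host):
#   ansible web01 -m ping -vvvv -i inventory.ini 2>&1 | extract_ssh_exec
```

The extracted line usually ends with the quoted remote `/bin/sh -c ...` payload; replace that trailing argument with a harmless command such as `true` before rerunning it by hand.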

Step 2: Diagnose Connection Refused

What you see:

```
fatal: [web01]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.1.10 port 22: Connection refused",
    "unreachable": true
}
```

Root causes and checks:

SSH daemon not running: Connect to the host via console or out-of-band access and run `systemctl status sshd` (the unit is named `ssh` on Debian and Ubuntu). If it is stopped, start it with `systemctl start sshd`.
Wrong port: Some hardened hosts move SSH to a non-standard port. Check sshd_config on the target: `grep ^Port /etc/ssh/sshd_config`. Set `ansible_port` in your inventory or group_vars to match.
Firewall blocking port 22: From your Ansible control node, run `nc -zv <target_ip> 22`. A `Connection refused` from nc means the port is closed or actively rejected; a hang until timeout points to a firewall silently dropping packets. On the target host, check `iptables -L -n` or `firewall-cmd --list-all`. Open the port with:
```
firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload
```
Wrong host or IP in inventory: Verify the host is resolvable: `getent hosts web01`. If DNS fails, add a static entry to /etc/hosts on the control node or use the IP directly in inventory.
Jump host / bastion required: If the target is in a private subnet, you need a ProxyJump. Set it as a host or group variable in an INI inventory (ansible.cfg uses the `ssh_common_args` key under `[ssh_connection]` instead):
```
ansible_ssh_common_args='-o ProxyJump=bastion.example.com'
```
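When the error message is the only thing at hand, the host and port that were refused can be lifted from it and fed into the `nc` check. A small sketch; `refused_target` is a hypothetical helper, and it assumes the OpenSSH wording shown in the error output above.

```shell
# Hypothetical helper: print "host port" from an OpenSSH
# "connect to host <host> port <port>: ..." message read on stdin.
refused_target() {
  sed -n 's/.*connect to host \([^ ]*\) port \([0-9][0-9]*\): .*/\1 \2/p' | head -n 1
}

# Example: re-test exactly the endpoint Ansible tried.
#   set -- $(grep 'connect to host' /tmp/ansible-ping-debug.log | refused_target)
#   nc -zv -w 5 "$1" "$2"
```

Testing the endpoint from the message, rather than the one you assume, catches inventory entries that point at the wrong address or port.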

Step 3: Diagnose Permission Denied

What you see:

```
fatal: [db01]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: deploy@db01: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).",
    "unreachable": true
}
```

or during privilege escalation:

```
fatal: [db01]: FAILED! => {
    "msg": "Missing sudo password"
}
```

Authentication-layer fixes:

Verify the correct user: Check `ansible_user` in your inventory or `remote_user` in ansible.cfg. The user must exist on the remote host. Confirm with: `ssh -i ~/.ssh/id_rsa deploy@db01`.
Check the SSH key: Ansible does not choose a key on its own. Unless `ansible_ssh_private_key_file` (or `private_key_file` in ansible.cfg) is set, the OpenSSH client offers agent keys and its default identities such as ~/.ssh/id_rsa and ~/.ssh/id_ed25519. If your key is elsewhere, set `ansible_ssh_private_key_file` in inventory. Ensure the public key appears in ~/.ssh/authorized_keys on the target, with permissions 600 on the file and 700 on ~/.ssh.
Repair authorized_keys: If you can access the host another way, add the key:
```
ssh-copy-id -i ~/.ssh/id_rsa.pub deploy@db01
```
Or manually append the public key and fix permissions:
```
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
```
SELinux context on authorized_keys: On RHEL/CentOS systems with SELinux enforcing, a wrong file context can block SSH key auth even though permissions look correct:
```
restorecon -Rv ~/.ssh
```
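The permission requirements in the list above are easy to verify in one step on the target host. A minimal sketch, assuming GNU coreutils `stat -c` (BSD and macOS `stat` use different flags); `check_ssh_perms` is a hypothetical helper name.

```shell
# Hypothetical helper: check that an SSH directory is mode 700 and its
# authorized_keys file is mode 600. Prints "OK" or one "BAD: ..." line per
# problem, and returns non-zero if anything is off.
check_ssh_perms() {
  local dir status
  dir=${1:-$HOME/.ssh}
  status=0
  if [ "$(stat -c '%a' "$dir" 2>/dev/null)" != "700" ]; then
    echo "BAD: $dir should be mode 700"
    status=1
  fi
  if [ "$(stat -c '%a' "$dir/authorized_keys" 2>/dev/null)" != "600" ]; then
    echo "BAD: $dir/authorized_keys should be mode 600"
    status=1
  fi
  [ "$status" -eq 0 ] && echo "OK"
  return "$status"
}

# Example, run as the remote user on the target:
#   check_ssh_perms            # checks ~/.ssh
```

A missing directory or file is reported the same way as a wrong mode, since `stat` returns nothing in that case.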

Privilege escalation (sudo) fixes:

Add become configuration: In ansible.cfg (the playbook equivalent is `become: true` at the play or task level):

```
[privilege_escalation]
become=True
become_method=sudo
become_user=root
```

Configure passwordless sudo: Edit /etc/sudoers on the target (always use `visudo`):
```
deploy ALL=(ALL) NOPASSWD: ALL
```
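Ansible runs modules from temporary files through a shell, so sudo rules limited to specific commands generally do not work with become. To narrow the grant, limit which accounts and hosts receive this rule instead.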
Provide the sudo password at runtime: If passwordless sudo is not acceptable:
```
ansible-playbook site.yml --ask-become-pass
```
Or store it encrypted in Ansible Vault as `ansible_become_password`.
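Before rerunning the playbook, it can help to check on the target whether the connecting user really has a passwordless rule. A minimal sketch, assuming the usual `sudo -l` listing format in which such rules contain `NOPASSWD:`; `has_nopasswd` is a hypothetical name.

```shell
# Hypothetical helper: succeed if a `sudo -l` listing read on stdin
# contains at least one NOPASSWD rule.
has_nopasswd() {
  grep -q 'NOPASSWD:'
}

# Example, run as the Ansible user on the target
# (-n makes sudo fail instead of prompting for a password):
#   sudo -n -l | has_nopasswd && echo "passwordless sudo rule present"
```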

Step 4: Diagnose Timeout Errors

What you see:

```
fatal: [app01]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host app01 port 22: Operation timed out",
    "unreachable": true
}
```

or mid-task:

```
fatal: [app01]: FAILED! => {
    "msg": "Timeout (12s) waiting for privilege escalation prompt"
}
```

Timeout-layer fixes:

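Increase the SSH connection timeout: In ansible.cfg:
```
[defaults]
timeout = 60
```
Ansible passes this value to ssh as `ConnectTimeout`, so a separate `ssh_args` entry is not needed; overriding `ssh_args` would also replace Ansible's default `ControlMaster`/`ControlPersist` options. To change the timeout per host, set `ansible_ssh_timeout=60` in inventory.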
Slow DNS resolution: If `getent hosts <target>` takes several seconds, DNS is the bottleneck. Add the host to /etc/hosts on the control node, or set `UseDNS no` in sshd_config on target hosts to skip the reverse-DNS lookup during the SSH handshake.
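SSH multiplexing: Keep ControlPersist enabled to reuse connections across tasks, which removes the per-task SSH handshake. These two options are Ansible's defaults, so restate them whenever you override `ssh_args`. `pipelining` cuts round trips further but requires that `requiretty` is disabled in sudoers: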
```
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
pipelining = True
```
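GSSAPI delays on non-Kerberos hosts: SSH tries Kerberos auth before publickey on many distros. Disable it, keeping the multiplexing options on the same line because `ssh_args` is a single setting:
```
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o GSSAPIAuthentication=no
```
On affected hosts this alone can cut connection time by 5–30 seconds per host.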
Privilege escalation timeout: If sudo prompts are timing out, ensure `requiretty` is not set in /etc/sudoers (it conflicts with pipelining, which runs sudo without a TTY):
```
Defaults !requiretty
```
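Several of the fixes above add options to `ssh_args`, but ansible.cfg holds only one `ssh_args` value per section, so the options have to end up on a single line. A small sketch of assembling that line; `build_ssh_args` is a hypothetical helper, and the option list is only an example.

```shell
# Hypothetical helper: join ssh options (given as KEY=VALUE arguments)
# into one ssh_args line for the [ssh_connection] section of ansible.cfg.
build_ssh_args() {
  local line opt
  line="ssh_args ="
  for opt in "$@"; do
    line="$line -o $opt"
  done
  printf '%s\n' "$line"
}

# Example:
#   build_ssh_args ControlMaster=auto ControlPersist=60s GSSAPIAuthentication=no
```

Paste the printed line under `[ssh_connection]` and remove any other `ssh_args` entries from the file.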

Step 5: Validate the Fix

After applying any change, validate with a targeted ping:

```
ansible <host_or_group> -m ping -i inventory.ini
```

Expected success output:

```
web01 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
```
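When validating a whole group, scanning for the hosts that did not succeed is quicker than reading every result. A minimal sketch, assuming each host's result starts with a `host | STATUS` header line as in the output above; `failed_hosts` is a hypothetical name.

```shell
# Hypothetical helper: read ad-hoc `ansible` output on stdin and print the
# name of every host whose status is neither SUCCESS nor CHANGED.
# Lines without a " | " separator (the JSON body) are ignored.
failed_hosts() {
  awk -F' [|] ' 'NF > 1 && $2 !~ /^(SUCCESS|CHANGED)/ { print $1 }'
}

# Example (webservers is a placeholder group):
#   ansible webservers -m ping -i inventory.ini 2>&1 | failed_hosts
```

An empty result means every host in the pattern answered.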

Then run your playbook with --check (dry-run) before live execution:

```
ansible-playbook site.yml --check --diff -i inventory.ini
```

For CI/CD pipelines, set `ANSIBLE_HOST_KEY_CHECKING=False` only in ephemeral environments where hosts are freshly provisioned and the network path is trusted. Never disable host key checking in production.

Frequently Asked Questions

```bash
#!/usr/bin/env bash
# Ansible Failure Diagnostic Script
# Usage: bash ansible-diag.sh <target_host> [inventory_file]

HOST=${1:?"Usage: $0 <target_host> [inventory]"}
INVENTORY=${2:-"inventory.ini"}

echo "=== [1] Resolve hostname ==="
getent hosts "$HOST" || echo "WARN: hostname not resolvable -- check DNS or /etc/hosts"

echo ""
echo "=== [2] Test TCP port 22 (connection refused check) ==="
nc -zv -w 5 "$HOST" 22 2>&1 || echo "FAIL: port 22 unreachable -- check sshd and firewall"

echo ""
echo "=== [3] Ansible ping with full verbosity ==="
ansible "$HOST" -m ping -i "$INVENTORY" -vvvv 2>&1 | tee /tmp/ansible-ping-debug.log
echo "Full output saved to /tmp/ansible-ping-debug.log"

echo ""
echo "=== [4] Extract raw SSH command from debug log ==="
grep -o "ssh .*$HOST.*" /tmp/ansible-ping-debug.log | head -5

echo ""
echo "=== [5] Check current ansible.cfg in effect ==="
ansible --version | grep 'config file'

echo ""
echo "=== [6] Show effective variables for host ==="
ansible "$HOST" -m debug -a 'var=ansible_ssh_private_key_file' -i "$INVENTORY"
ansible "$HOST" -m debug -a 'var=ansible_user' -i "$INVENTORY"
ansible "$HOST" -m debug -a 'var=ansible_port' -i "$INVENTORY"

echo ""
echo "=== [7] Test privilege escalation ==="
ansible "$HOST" -m command -a 'whoami' -i "$INVENTORY" --become

echo ""
echo "=== [8] Check GSSAPI auth delay ==="
# Both runs use your own ssh config and user, not Ansible's.
# The first run is the baseline; the second disables GSSAPI for comparison.
echo "-- default settings"
time ssh -o BatchMode=yes "$HOST" echo ok 2>&1 || true
echo "-- GSSAPI disabled"
time ssh -o GSSAPIAuthentication=no -o BatchMode=yes "$HOST" echo ok 2>&1 || true
echo "If the second run is significantly faster, add -o GSSAPIAuthentication=no to ssh_args in ansible.cfg"

echo ""
echo "=== Diagnostic complete ==="
```

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps and SRE engineers with combined experience managing infrastructure at scale across cloud and on-premises environments. We write practical, command-first troubleshooting guides tested against real systems.
