PostgreSQL Deadlock Detected: Complete Troubleshooting Guide for ERROR: deadlock detected
Fix PostgreSQL 'ERROR: deadlock detected' fast. Diagnose with pg_locks, terminate blocking sessions, enforce lock ordering, and tune deadlock_timeout to prevent recurrence.
- Deadlocks occur when two or more transactions each hold a lock the other needs, creating a circular wait that PostgreSQL resolves by aborting one transaction with ERROR: deadlock detected; the accompanying DETAIL line names the processes and transaction IDs involved.
- The most common root causes are inconsistent row/table lock acquisition order across concurrent transactions, long-running idle-in-transaction sessions holding row locks, and connection pool exhaustion forcing transactions to queue and interleave unpredictably.
- Immediate fix: identify the blocking PID via pg_blocking_pids() or the pg_stat_activity + pg_locks join, then run SELECT pg_terminate_backend(<pid>); — long-term fix requires enforcing deterministic lock order in application code, setting idle_in_transaction_session_timeout, and deploying PgBouncer to prevent connection pool exhaustion.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| pg_terminate_backend() | Active deadlock blocking production queries right now | < 1 min | Low — cleanly aborts one transaction, client retries |
| Reorder lock acquisition in app code | Same tables appear repeatedly in deadlock logs | Hours | Low — requires code deploy, no schema change |
| Set idle_in_transaction_session_timeout | Many idle-in-transaction sessions visible in pg_stat_activity | < 5 min | Low — auto-kills abandoned transactions, add to postgresql.conf |
| Deploy PgBouncer (transaction mode) | max_connections exhausted, too many connections errors alongside deadlocks | 1-2 hours | Medium — infrastructure change, requires connection string update |
| SELECT FOR UPDATE SKIP LOCKED | Deadlocks on job queue or work-items table patterns | Hours | Low — targeted SQL change, no schema migration needed |
| SERIALIZABLE isolation level | Complex transactions with read-write conflicts not caught by row locks | Days | High — significant throughput reduction, requires full regression testing |
Understanding PostgreSQL Deadlock Errors
A PostgreSQL deadlock occurs when two or more concurrent transactions each hold a lock that another transaction needs to proceed — a circular wait that neither side can break on its own. PostgreSQL's lock manager runs deadlock detection every deadlock_timeout interval (default 1 second) and resolves detected cycles by choosing one transaction as the victim and aborting it.
The aborted transaction's client receives this error:
ERROR: deadlock detected
DETAIL: Process 23847 waits for ShareLock on transaction 9182; blocked by process 31204.
Process 31204 waits for ShareLock on transaction 9183; blocked by process 23847.
HINT: See server log for query details.
The surviving transaction continues normally. The victim transaction must be retried by the application. The server log — not the client error — contains the full query text for both sides, making log access essential for diagnosis.
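Since the aborted side surfaces to the client as SQLSTATE 40P01, the standard recovery is application-level retry with randomized backoff. Below is a minimal Python sketch: the DeadlockDetected class and run_with_deadlock_retry helper are illustrative stand-ins, not library APIs (real drivers such as psycopg expose the SQLSTATE on their exception objects).

```python
import random
import time

DEADLOCK_SQLSTATE = "40P01"  # SQLSTATE PostgreSQL reports for "deadlock detected"

class DeadlockDetected(Exception):
    """Stand-in for a driver error carrying SQLSTATE 40P01; defined here only
    so the sketch is self-contained."""
    sqlstate = DEADLOCK_SQLSTATE

def run_with_deadlock_retry(txn_fn, max_attempts=3, base_delay=0.05):
    """Re-run txn_fn when it aborts with SQLSTATE 40P01, backing off with jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except Exception as exc:
            if getattr(exc, "sqlstate", None) != DEADLOCK_SQLSTATE or attempt == max_attempts:
                raise  # not a deadlock, or out of retries: propagate
            # Randomized backoff so colliding transactions don't retry in lockstep
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.random())

# Demo: a "transaction" that deadlocks twice, then succeeds on the third try
attempts = []
def flaky_transfer():
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockDetected("deadlock detected")
    return "committed"

print(run_with_deadlock_retry(flaky_transfer))  # committed
```

Jittered backoff matters: if both retrying clients sleep the same fixed interval, they can re-collide on the same rows indefinitely.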
Related Errors That Accompany or Mimic Deadlocks
PostgreSQL deadlocks rarely occur in isolation. These errors frequently co-occur or are mistaken for deadlocks:
- FATAL: remaining connection slots are reserved for non-replication superuser connections — too many connections; forces apps to queue for connections, which serializes transactions in unpredictable order and increases deadlock probability.
- ERROR: canceling statement due to lock timeout — a lock_timeout abort; fires before the deadlock detection cycle completes when a single transaction waits too long for one lock.
- ERROR: could not connect to server: Connection refused — the PostgreSQL process is down, often after crash recovery from OOM or disk-full events.
- FATAL: password authentication failed for user "app" — common when connection pools reconnect after a server restart with stale credentials or a changed pg_hba.conf.
- ERROR: out of memory — low work_mem leads to slow plan choices that hold locks longer, widening the deadlock window.
- PANIC: could not write to file "pg_wal/000000010000000000000042": No space left on device — disk full; cascades into replication lag, WAL sender death, and potential corruption if the crash is unclean.
Step 1: Diagnose the Active Deadlock
Find Blocking and Blocked Sessions
Connect as a superuser and run this query to see the full blocking graph with query text:
SELECT
blocked.pid AS blocked_pid,
blocked.usename AS blocked_user,
blocked.application_name,
blocked.query AS blocked_query,
blocking.pid AS blocking_pid,
blocking.usename AS blocking_user,
blocking.query AS blocking_query,
now() - blocked.query_start AS blocked_for
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0
ORDER BY blocked_for DESC;
The pg_blocking_pids() function (PostgreSQL 9.6+) returns all PIDs blocking a given session. If this query returns no rows during a deadlock, it means PostgreSQL already resolved it — check the server log instead.
Enable Lock Wait Logging Immediately
Add these settings to postgresql.conf, then reload the configuration — no restart is required:
log_lock_waits = on
deadlock_timeout = 1s
log_min_duration_statement = 1000
-- Then, from a psql session, apply the changes:
SELECT pg_reload_conf();
With log_lock_waits = on, PostgreSQL logs any lock wait exceeding deadlock_timeout — giving you visibility before a full cycle forms. The log entry shows the waiting query, the holder's query, the relation, and the lock mode.
Check the Server Log
# systemd systems
journalctl -u postgresql --since '1 hour ago' --no-pager | grep -i 'deadlock\|lock wait\|ERROR'
# File-based logging
tail -200 /var/log/postgresql/postgresql-$(date +%Y-%m-%d)_*.log | grep -i 'deadlock\|lock wait'
# Find the log directory if unsure
psql -U postgres -c "SHOW log_directory;"
Step 2: Terminate the Blocking Session
Once you have the blocking PID:
-- Cancel the active query only (SIGINT; the current transaction aborts, session stays connected)
SELECT pg_cancel_backend(31204);
-- Terminate the session entirely (SIGTERM; releases all its locks and disconnects the client)
SELECT pg_terminate_backend(31204);
Use pg_cancel_backend() first — it sends SIGINT and is the gentler option. If the session is idle in transaction (no active query to cancel), you must use pg_terminate_backend(), which sends SIGTERM and is the only way to release its locks.
To kill all idle-in-transaction sessions older than 5 minutes in one shot:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
AND now() - state_change > interval '5 minutes';
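The cancel-versus-terminate decision above can be captured in a small helper keyed on the state strings pg_stat_activity reports. A sketch; the function name is ours:

```python
def signal_for_session(state, blocking=True):
    """Pick the right backend signal given pg_stat_activity.state.

    pg_cancel_backend() only interrupts a running query; an idle-in-transaction
    session has nothing to cancel, so its locks are only released by
    pg_terminate_backend().
    """
    if not blocking:
        return None  # leave sessions that aren't blocking anyone alone
    if state == "active":
        return "pg_cancel_backend"      # try the gentle SIGINT path first
    if state in ("idle in transaction", "idle in transaction (aborted)"):
        return "pg_terminate_backend"   # SIGTERM: drops the session and its locks
    return None

print(signal_for_session("active"))               # pg_cancel_backend
print(signal_for_session("idle in transaction"))  # pg_terminate_backend
```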
Step 3: Fix the Root Cause
Fix 1 — Enforce Consistent Lock Ordering
The canonical deadlock pattern is Transaction A locking rows in order (id=1, id=2) while Transaction B locks them in order (id=2, id=1). Fix by always sorting the set of rows to lock before operating on them:
# BROKEN: order depends on caller, deadlock possible
def transfer(conn, from_id, to_id, amount):
    with conn.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s", [amount, from_id])
        cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s", [amount, to_id])

# FIXED: always acquire locks in ascending ID order
def transfer(conn, from_id, to_id, amount):
    low_id, high_id = sorted([from_id, to_id])
    with conn.cursor() as cur:
        # Lock both rows in consistent order before modifying either
        cur.execute("SELECT id FROM accounts WHERE id = ANY(%s) ORDER BY id FOR UPDATE", [[low_id, high_id]])
        cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s", [amount, from_id])
        cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s", [amount, to_id])
Fix 2 — Use SKIP LOCKED for Queue Patterns
If deadlocks occur on a jobs or tasks table, replace SELECT FOR UPDATE with:
-- Each worker grabs a row nobody else has locked, zero deadlock risk
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
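SKIP LOCKED eliminates deadlocks because a worker never waits: it simply skips rows another transaction has locked and takes the next claimable one. Here is a toy in-memory model of those semantics (not driver code; all names are illustrative):

```python
from datetime import datetime, timedelta

def claim_next_job(jobs, locked_ids):
    """Mimic SELECT ... FOR UPDATE SKIP LOCKED: take the oldest pending job
    that no other worker holds a lock on; never wait, never deadlock."""
    pending = sorted(
        (jid for jid, (created, status) in jobs.items()
         if status == "pending" and jid not in locked_ids),
        key=lambda jid: jobs[jid][0],  # ORDER BY created_at
    )
    if not pending:
        return None  # nothing claimable right now; the worker just polls again
    jid = pending[0]
    locked_ids.add(jid)  # the row lock that other workers will now skip
    return jid

t0 = datetime(2024, 1, 1)
jobs = {
    1: (t0, "pending"),
    2: (t0 + timedelta(seconds=1), "pending"),
    3: (t0 + timedelta(seconds=2), "done"),
}
locked = set()
print(claim_next_job(jobs, locked))  # 1 (oldest pending)
print(claim_next_job(jobs, locked))  # 2 (1 is locked, skipped)
print(claim_next_job(jobs, locked))  # None (3 is already done)
```

Two concurrent workers running this claim logic always receive distinct jobs; neither ever blocks on the other, which is exactly the property the SQL version provides.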
Fix 3 — Shorten Transaction Scope
Long transactions hold row locks for their entire duration. Move non-database work outside transaction boundaries:
# BROKEN: API call inside transaction holds locks for seconds
with conn.transaction():
    row = cur.execute("SELECT ... FROM jobs WHERE id = %s FOR UPDATE", [job_id]).fetchone()
    result = requests.post("https://api.example.com/process", json=row)  # SLOW
    cur.execute("UPDATE jobs SET status='done', result=%s WHERE id=%s", [result, job_id])

# FIXED: minimize lock hold time
with conn.transaction():
    row = cur.execute("SELECT ... FROM jobs WHERE id = %s FOR UPDATE", [job_id]).fetchone()
    cur.execute("UPDATE jobs SET status='processing' WHERE id=%s", [job_id])
# Transaction commits here, lock released
result = requests.post("https://api.example.com/process", json=row)
with conn.transaction():
    cur.execute("UPDATE jobs SET status='done', result=%s WHERE id=%s", [result, job_id])
Fix 4 — Resolve Connection Pool Exhaustion
When postgresql connection pool exhausted occurs, deploy PgBouncer in transaction-pooling mode:
# /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction
max_client_conn = 2000
default_pool_size = 20
server_idle_timeout = 600
log_connections = 0
Update your application's connection string to point at PgBouncer (port 6432) instead of PostgreSQL directly (port 5432). Reduce PostgreSQL's max_connections to match the pool size plus headroom for admin connections.
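The sizing rule above (max_connections should cover the server-side pools plus admin headroom) is simple arithmetic, because PgBouncer opens up to default_pool_size server connections per database/user pair. A sketch using the illustrative numbers from the config above; the function name is ours:

```python
def required_max_connections(default_pool_size, databases=1, users_per_db=1,
                             admin_headroom=10):
    """PgBouncer maintains one pool per (database, user) pair, each holding up
    to default_pool_size server connections; PostgreSQL's max_connections must
    cover all pools plus headroom for superuser/admin sessions."""
    return default_pool_size * databases * users_per_db + admin_headroom

# With the pgbouncer.ini above (one database, default_pool_size = 20):
print(required_max_connections(20))  # 30
```

Note that the 2000 in max_client_conn is irrelevant to this calculation: transaction pooling multiplexes those client connections onto the small server-side pool.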
Step 4: Prevent Recurrence with Configuration
# postgresql.conf — deadlock and lock tuning
deadlock_timeout = 1s # Detection interval; lower to 500ms for high-OLTP
lock_timeout = 5s # Abort rather than wait indefinitely for any lock
idle_in_transaction_session_timeout = 30s # Kill abandoned open transactions automatically
log_lock_waits = on # Log all waits exceeding deadlock_timeout
# Connection management
max_connections = 100 # Keep low; use PgBouncer for scale
# Memory tuning (faster queries = shorter lock hold times)
shared_buffers = 4GB # 25% of total RAM
work_mem = 64MB # Per operation; prevents disk spills on sorts/hashes
# Monitoring
track_activities = on
track_counts = on
log_min_duration_statement = 2000 # Log slow queries (postgresql slow query detection)
Apply without restart where possible: SELECT pg_reload_conf(); picks up most of these. max_connections and shared_buffers require a full restart.
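Which parameters pg_reload_conf() can pick up is recorded in pg_settings.context: anything marked postmaster requires a restart. Below is a sketch with that mapping hard-coded for the parameters above; the contexts shown are as commonly reported by recent PostgreSQL versions, so verify on your server with SELECT name, context FROM pg_settings.

```python
# pg_settings.context per parameter; "postmaster" means restart required,
# while the others here are picked up by pg_reload_conf() (or per-session SET).
SETTING_CONTEXT = {
    "deadlock_timeout": "superuser",
    "lock_timeout": "user",
    "idle_in_transaction_session_timeout": "user",
    "log_lock_waits": "superuser",
    "max_connections": "postmaster",
    "shared_buffers": "postmaster",
    "work_mem": "user",
    "log_min_duration_statement": "superuser",
    "track_activities": "superuser",
    "track_counts": "superuser",
}

def needs_restart(names):
    """Return the subset of changed settings that require a full restart."""
    return sorted(n for n in names if SETTING_CONTEXT.get(n) == "postmaster")

print(needs_restart(SETTING_CONTEXT))  # ['max_connections', 'shared_buffers']
```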
Monitor with These Queries
-- Deadlock count per database (cumulative since last stats reset; alert when it increments)
SELECT datname, deadlocks FROM pg_stat_database ORDER BY deadlocks DESC;
-- postgresql replication lag in bytes
SELECT client_addr, state, pg_size_pretty(sent_lsn - replay_lsn) AS lag
FROM pg_stat_replication;
-- Check for postgresql slow query candidates causing lock buildup
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
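Because pg_stat_database.deadlocks is cumulative, alerting should fire on the delta between polls rather than the absolute value, and it must tolerate counter resets. A sketch of that delta logic; function and variable names are ours:

```python
def deadlock_delta(previous, current):
    """Per-database increase in pg_stat_database.deadlocks between two polls.

    A stats reset (pg_stat_reset) makes the counter drop; treat that as a
    fresh baseline rather than reporting a negative delta.
    """
    deltas = {}
    for db, count in current.items():
        prev = previous.get(db, 0)
        deltas[db] = count - prev if count >= prev else count
    return deltas

poll_1 = {"myapp": 12, "analytics": 0}
poll_2 = {"myapp": 15, "analytics": 0}
print(deadlock_delta(poll_1, poll_2))  # {'myapp': 3, 'analytics': 0}
```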
Complete Diagnostic Script
#!/usr/bin/env bash
# PostgreSQL Deadlock Diagnostic Script
# Usage: PGPASSWORD=secret bash pg-deadlock-diag.sh -h localhost -U postgres -d mydb
PSQLOPTS="-X -A -t --no-password"
PSQL="psql $PSQLOPTS $*"
echo "======================================="
echo " PostgreSQL Deadlock Diagnostic Report"
echo " $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
echo "======================================="
echo ""
echo "--- [1] Active Blocking Relationships ---"
$PSQL << 'EOF'
SELECT
blocked.pid AS blocked_pid,
blocked.usename AS blocked_user,
left(blocked.query, 80) AS blocked_query,
blocking.pid AS blocking_pid,
blocking.usename AS blocking_user,
left(blocking.query, 80) AS blocking_query,
now() - blocked.query_start AS blocked_for
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0
ORDER BY blocked_for DESC;
EOF
echo ""
echo "--- [2] Deadlock Statistics Per Database ---"
$PSQL -c "
SELECT datname, deadlocks, conflicts, blks_hit, blks_read
FROM pg_stat_database
WHERE deadlocks > 0
ORDER BY deadlocks DESC;"
echo ""
echo "--- [3] Long-Running Transactions (> 30s) ---"
$PSQL << 'EOF'
SELECT
pid, usename, state, wait_event_type, wait_event,
now() - xact_start AS txn_duration,
left(query, 100) AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
AND now() - xact_start > interval '30 seconds'
ORDER BY txn_duration DESC;
EOF
echo ""
echo "--- [4] Connection Count by State ---"
$PSQL -c "
SELECT state, count(*) AS connections
FROM pg_stat_activity
GROUP BY state
ORDER BY connections DESC;"
echo ""
echo "--- [5] Lock Types Currently Held ---"
$PSQL << 'EOF'
SELECT
locktype, relation::regclass AS relation,
mode, granted, count(*) AS count
FROM pg_locks
GROUP BY locktype, relation, mode, granted
ORDER BY count DESC
LIMIT 20;
EOF
echo ""
echo "--- [6] Replication Lag ---"
$PSQL -c "
SELECT client_addr, state,
pg_size_pretty(sent_lsn - replay_lsn) AS lag_size,
write_lag,
replay_lag
FROM pg_stat_replication;"
echo ""
echo "--- [7] Disk Usage ---"
df -h /var/lib/postgresql 2>/dev/null || df -h / | grep -E 'Filesystem|/'
echo ""
echo "--- [8] Top Slow Queries (if pg_stat_statements enabled) ---"
$PSQL << 'EOF'
SELECT
round(mean_exec_time::numeric, 2) AS mean_ms,
calls,
round(total_exec_time::numeric, 2) AS total_ms,
left(query, 100) AS query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
EOF
echo ""
echo "--- [9] Recent Errors in Server Log ---"
journalctl -u postgresql --since '30 minutes ago' --no-pager 2>/dev/null \
| grep -iE 'deadlock|lock wait|ERROR|FATAL|PANIC' | tail -30 \
|| find /var/log/postgresql -name '*.log' -mmin -30 -exec tail -100 {} \; \
| grep -iE 'deadlock|lock wait|ERROR|FATAL|PANIC' | tail -30
echo ""
echo "To terminate a blocking session:"
echo " psql -U postgres -c 'SELECT pg_terminate_backend(<blocking_pid>);'"
echo ""
echo "To kill all idle-in-transaction sessions > 5 min:"
echo " psql -U postgres -c \"SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction' AND now() - state_change > interval '5 minutes';\""
Error Medic Editorial
The Error Medic Editorial team consists of senior DevOps engineers, DBAs, and SREs with decades of combined experience operating PostgreSQL at scale across environments ranging from single-server startups to multi-terabyte distributed OLTP deployments. All diagnostic steps and commands are validated against PostgreSQL 14, 15, and 16 on production systems before publication.
Sources
- https://www.postgresql.org/docs/current/explicit-locking.html
- https://www.postgresql.org/docs/current/runtime-config-locks.html
- https://www.postgresql.org/docs/current/view-pg-locks.html
- https://wiki.postgresql.org/wiki/Lock_Monitoring
- https://www.postgresql.org/docs/current/pgstatstatements.html
- https://www.postgresql.org/docs/current/amcheck.html