Error Medic

Troubleshooting PostgreSQL Connection Refused & Service Failures

A complete DevOps guide to fixing PostgreSQL connection refused errors, service crashes, OOM, disk full, high CPU, and too many connections.

Key Takeaways
  • Connection refused usually means the PostgreSQL service is stopped, restarting, or listening on the wrong interface.
  • Service crashes are frequently caused by Out of Memory (OOM) killer, Disk Full errors on the WAL directory, or permission issues.
  • Connection limits ('too many connections') will block new clients; use a connection pooler like PgBouncer instead of blindly increasing max_connections.
  • Quick Fix: Check service status ('systemctl status postgresql') and inspect the latest logs ('journalctl -u postgresql' or '/var/log/postgresql/') to identify the exact root cause.
Fix Approaches Compared
| Method | When to Use | Time | Risk |
| --- | --- | --- | --- |
| Restart Service (`systemctl restart`) | Service is stopped or hung without data corruption | Low | Low |
| Modify `listen_addresses` / `pg_hba.conf` | Network connections are rejected or routing fails | Medium | Low |
| Increase `max_connections` | Seeing `FATAL: sorry, too many clients already` in logs | Low | Medium (can cause OOM) |
| Install & Configure PgBouncer | High connection churn or persistent "too many connections" | High | Low |
| Tune `shared_buffers` & `work_mem` | Experiencing OOM kills or high memory utilization | Medium | Medium (requires restart) |
| Clear Disk Space / Expand Volume | Logs show `No space left on device` (disk full) | Medium | High (if deleting wrong files) |

Understanding the Error: PostgreSQL Connection Refused

When you encounter a psql: error: connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused error, it signifies that the client application successfully reached the host, but no PostgreSQL server process was listening on the specified port. This is arguably the most common PostgreSQL error, but it's often a symptom of a deeper issue, ranging from simple configuration oversights to severe resource exhaustion leading to a crash.

In this comprehensive guide, we will troubleshoot not only the standard "connection refused" scenario but also the cascading failures that cause it: service not starting, out of memory (OOM) kills, disk full errors, high CPU utilization, and connection limit exhaustion.

Step 1: Diagnose the Initial State

The first step is always to check the status of the PostgreSQL service and examine the most recent logs. A stopped service is the primary reason for a connection refusal.

1.1 Check Service Status

Use systemctl to verify if the service is active, inactive, or failed. If the service is inactive, it explains the connection refusal. The critical question is why it stopped.
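A minimal sketch of this triage step, with the service state hardcoded for illustration (on a real host you would capture it from `systemctl is-active postgresql`):

```shell
# Sketch: interpret the service state. In production, replace the hardcoded
# value with: state=$(systemctl is-active postgresql)
state="failed"   # example value for illustration
case "$state" in
  active)   diagnosis="Service is up; the refusal is likely listen_addresses, port, or firewall." ;;
  inactive) diagnosis="Service is stopped; start it and check logs for why it exited." ;;
  failed)   diagnosis="Service crashed on startup; inspect journalctl -u postgresql for FATAL/PANIC." ;;
  *)        diagnosis="Unknown state: $state" ;;
esac
echo "$diagnosis"
```

The `failed` branch is the one that warrants the deepest log dive, since it means the service tried to start and could not.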

1.2 Inspect PostgreSQL Logs

The logs are the ultimate source of truth. Depending on your OS and installation, they are typically found in /var/log/postgresql/, /var/lib/pgsql/data/log/, or accessible via journalctl -u postgresql. Look for FATAL, PANIC, or ERROR messages immediately preceding the time the connections started failing.
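The filtering step can be sketched as follows; the log excerpt below is a fabricated sample for illustration, and on a real server you would point `grep` at your actual log file (e.g. one under /var/log/postgresql/):

```shell
# Sketch: filter a log excerpt for FATAL/PANIC lines. The log text is a
# fabricated sample; on a real host run something like:
#   sudo grep -E 'FATAL|PANIC' /var/log/postgresql/*.log
log='2024-05-01 03:12:01 UTC LOG:  checkpoint starting: time
2024-05-01 03:12:44 UTC FATAL:  sorry, too many clients already
2024-05-01 03:13:02 UTC PANIC:  could not write to file "pg_wal/000000010000000000000042": No space left on device'

critical=$(printf '%s\n' "$log" | grep -E 'FATAL|PANIC')
echo "$critical"
```

In this sample the FATAL line points at connection exhaustion (Scenario B below) and the PANIC line at a full disk (Scenario C).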

Step 2: Root Cause Analysis and Fixes

Based on your log analysis, you will typically fall into one of the following root cause categories.

Scenario A: The Service is Not Starting (Configuration or Permissions)

If systemctl status postgresql shows the service as failed immediately after trying to start, the issue is likely configuration or filesystem permissions.

  • Listen Addresses and Port: Ensure postgresql.conf has listen_addresses = '*' (or a specific IP) and port = 5432. If it's set to localhost but you attempt to connect externally, the connection will be refused by the OS before it even reaches Postgres.
  • pg_hba.conf: While a pg_hba.conf rejection usually results in a "FATAL: no pg_hba.conf entry for host" or "permission denied" error rather than a pure "connection refused", severe misconfigurations here can prevent startup or block specific subnets.
  • Permission Denied: If the postgres system user cannot read the data directory, configuration files, or SSL certificates, the service will crash on startup. Fix ownership using chown -R postgres:postgres /var/lib/postgresql/data (adjust path as necessary) and ensure permissions are strictly 0700 for the data directory.
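A quick way to check the effective `listen_addresses` is sketched below; the config text is inlined as a sample here, and on a real server you would read the actual file (the path varies by distribution):

```shell
# Sketch: extract listen_addresses from postgresql.conf. Sample text is
# inlined for illustration; on a real host run something like:
#   awk -F"'" '/^listen_addresses/ {print $2}' /etc/postgresql/15/main/postgresql.conf
conf="listen_addresses = 'localhost'
port = 5432"

listen=$(printf '%s\n' "$conf" | awk -F"'" '/^listen_addresses/ {print $2}')
echo "listen_addresses is: $listen"   # 'localhost' means remote clients get connection refused
```

If the value is `localhost` and remote clients must connect, change it to `'*'` (or a specific interface IP) and restart the service.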
Scenario B: "Too Many Connections"

If the service is running but refusing new connections, you might see FATAL: sorry, too many clients already in the logs. This often presents as a "connection refused" or a timeout to the end-user application layer.

  • The Band-Aid Fix: You can increase max_connections in postgresql.conf. However, this is dangerous because every PostgreSQL connection forks a new OS process, consuming a baseline amount of memory. Setting this too high is a primary trigger for OOM kills.
  • The Engineering Fix: Implement a connection pooler like PgBouncer or Odyssey. A pooler sits between your application and PostgreSQL, multiplexing thousands of application connections onto a small, fixed number of actual database connections, drastically reducing memory overhead and context switching.
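A minimal PgBouncer configuration for the engineering fix might look like the sketch below; the database name, pool sizes, and auth settings are illustrative placeholders to adapt to your environment:

```ini
; pgbouncer.ini (illustrative values only)
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 2000   ; app-side connections PgBouncer will accept
default_pool_size = 20   ; actual server connections per database/user pair
```

Note that `pool_mode = transaction` gives the best multiplexing but is incompatible with session-level features such as session-scoped prepared statements and advisory locks; use `session` mode if your application relies on those.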
Scenario C: Resource Exhaustion (OOM, High CPU, Disk Full)

When a database server runs out of critical resources, the OS or the PostgreSQL engine itself will take drastic measures, leading to a crash or a forced restart. During this recovery phase, connections will be refused.

  • Out of Memory (OOM) and Segfaults: If you see Out of memory: Killed process in your system logs (dmesg or /var/log/messages), the Linux OOM killer terminated a PostgreSQL process to keep the OS alive; when it hits the postmaster or a backend, the whole cluster goes through crash recovery. This happens when memory settings like shared_buffers, work_mem, or maintenance_work_mem are too high for the available system RAM. A segmentation fault (segfault) or core dump can also occur due to memory corruption, extensions with memory leaks, or hardware issues.
    • Fix: Tune postgresql.conf. A general rule of thumb is to set shared_buffers to 25% of total RAM. Strictly limit work_mem because this allocation can be multiplied by the number of active connections and the number of sort/hash operations in a single complex query. Lower max_connections if no pooler is used.
  • Disk Full: A PANIC: could not write to file "pg_wal/xlog..." : No space left on device error will immediately crash the database to prevent silent data corruption.
    • Fix: Free up disk space immediately. You may need to clear old log files (not WAL files!), expand the logical volume, or set up WAL archiving to move Write-Ahead Logs to a cheaper storage tier like S3 to prevent the pg_wal directory from filling up.
  • High CPU and Slow Queries: A system pinned at 100% CPU might not actively refuse network connections at the socket level, but connection timeouts from the application will masquerade as connection failures. High CPU is often caused by missing indexes forcing sequential scans on massive tables, unoptimized complex joins, or severe bloat.
    • Fix: Enable and use the pg_stat_statements extension to identify the most time-consuming queries. Use EXPLAIN ANALYZE on these queries to understand the query plan, add appropriate indexes, and ensure autovacuum is keeping statistics up to date.
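The 25%-of-RAM rule of thumb for `shared_buffers` can be sketched as simple shell arithmetic; total RAM is hardcoded here (16 GB in kB) for illustration, though on Linux you could read it from /proc/meminfo:

```shell
# Sketch of the 25%-of-RAM sizing rule for shared_buffers. RAM is hardcoded
# for illustration; on Linux you could instead use:
#   mem_total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
mem_total_kb=16777216                              # 16 GB expressed in kB
shared_buffers_mb=$(( mem_total_kb / 4 / 1024 ))   # 25% of RAM, converted to MB
echo "suggested: shared_buffers = ${shared_buffers_mb}MB"
```

Treat the result as a starting point, not a law; very large or very small hosts, and workloads with big `work_mem` demands, may warrant a different split.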
Scenario D: Crashes and Core Dumps

If the PostgreSQL logs show server process (PID ...) was terminated by signal 11 (a segfault) or signal 6 (an abort), or mention generating a core dump without an obvious OOM message, PostgreSQL encountered a critical internal failure. (Termination by signal 9 with no core dump usually points back to the OOM killer or a manual kill -9, since SIGKILL never produces a core dump.)

  • Hardware Issues: Faulty RAM or failing local NVMe/SSD storage can cause unexpected panics that look like software bugs.
  • Software Bugs: While rare in core PostgreSQL, bugs do exist. Ensure you are running the latest minor release for your major version (e.g., if you are on 15.2, upgrade to the newest 15.x). Also scrutinize any third-party extensions (such as PostGIS or TimescaleDB): they run in the same memory space as the server, so a bug in an extension will crash the entire database.

Step 3: Preventative Maintenance and SRE Best Practices

To avoid waking up to a critical "connection refused" alert, implement proactive monitoring and SRE practices:

  1. Monitor Connection Utilization: Set up alerts when active connections reach 80% of max_connections.
  2. Monitor Disk Space Strictly: Alert at 75%, 85%, and 90% utilization of the data and WAL directories. A full disk is an avoidable outage.
  3. Mandatory Connection Pooling: Make PgBouncer a standard component of your database architecture, even for small workloads, to protect the database engine.
  4. Autovacuum Tuning: Ensure autovacuum is aggressively configured. Default settings are often too passive for high-write databases, leading to table bloat, slow queries (high CPU), and eventually transaction ID wraparound issues.
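The 80% connection-utilization alert from point 1 can be sketched as shell arithmetic; the counts are hardcoded for illustration, and in production you would fetch them from the database:

```shell
# Sketch of an 80% connection-utilization alert. Counts are hardcoded for
# illustration; in production, fetch them with something like:
#   psql -Atc "SELECT count(*) FROM pg_stat_activity;"
#   psql -Atc "SHOW max_connections;"
active=170
max_connections=200
pct=$(( active * 100 / max_connections ))
if [ "$pct" -ge 80 ]; then
  echo "ALERT: connection utilization at ${pct}% of max_connections"
fi
```

Wiring this check into your monitoring system (cron, Prometheus exporter, etc.) turns a 3 a.m. outage into a daytime capacity-planning ticket.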

Quick Diagnostic Commands

```bash
# 1. Check if the PostgreSQL service is running and view its status
sudo systemctl status postgresql

# 2. View the last 50 lines of PostgreSQL logs to find PANIC, FATAL, or ERROR messages
sudo journalctl -u postgresql -n 50 --no-pager

# 3. Check for Out of Memory (OOM) kills in system logs
sudo dmesg -T | grep -i 'out of memory\|postgres\|killed'

# 4. Check disk space, specifically the partition holding /var/lib/postgresql
df -h

# 5. Fix ownership if permission denied errors are found on startup (adjust path for your OS)
sudo chown -R postgres:postgres /var/lib/postgresql/15/main
sudo chmod 0700 /var/lib/postgresql/15/main

# 6. Check active connections and limits (run via psql if accessible)
# psql -U postgres -c "SHOW max_connections;"
# psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;"
```

Error Medic Editorial

Our SRE and DevOps editorial team specializes in deep-dive troubleshooting for mission-critical Linux infrastructure and open-source databases. We turn late-night pagers into robust, scalable architectures.
