Error Medic

How to Fix Nginx 502 Bad Gateway, 504 Timeouts, and Connection Refused Errors

Step-by-step guide to troubleshooting Nginx 502 Bad Gateway, 504 Timeouts, connection refused, and OOM crashes. Learn root causes and permanent fixes.

Key Takeaways
  • 502 Bad Gateway errors occur when Nginx cannot communicate with the upstream backend service (e.g., Node.js, PHP-FPM, Gunicorn) due to it being down or misconfigured.
  • 504 Gateway Timeout errors mean the backend service is running but taking too long to process the request, exceeding Nginx's proxy timeout settings.
  • Connection refused or permission denied errors (13: Permission denied) are typically caused by incorrect UNIX socket ownership or restrictive SELinux/AppArmor profiles.
  • Nginx high CPU usage or Out of Memory (OOM) crashes are often the result of insufficient worker_connections, misconfigured buffer sizes, or unmitigated traffic spikes.
Fix Approaches Compared
  • Restart upstream service — when to use: initial 502/504 occurrence, suspected backend crash. Time: 1 min. Risk: low.
  • Increase proxy timeouts — when to use: backend legitimately requires more time for heavy processing (504). Time: 5 mins. Risk: medium (long timeouts can tie up worker connections).
  • Fix socket permissions — when to use: seeing '13: Permission denied' while connecting to upstream. Time: 5 mins. Risk: low.
  • Adjust worker connections — when to use: seeing 'worker_connections are not enough' or high CPU load. Time: 10 mins. Risk: medium.

Understanding Nginx 50x Errors and Crashes

Nginx is an incredibly robust web server and reverse proxy, but when it sits in front of dynamic application servers, errors are inevitable. The most common issues—Nginx 502 Bad Gateway and Nginx 504 Gateway Timeout—are rarely bugs within Nginx itself. Instead, they are symptoms of a breakdown in communication between Nginx and your upstream backend services.

When Nginx acts as a reverse proxy, it forwards client requests to an application server (like PHP-FPM, a Node.js Express app, Python's Gunicorn, or a Java Tomcat server). If that upstream server is down, crashes, hangs, or rejects the connection, Nginx must return an HTTP 50x error to the client.

The Anatomy of a 502 Bad Gateway

A 502 Bad Gateway means Nginx reached out to the upstream server, but received an invalid response, or couldn't connect at all. You will typically see an error like this in your /var/log/nginx/error.log:

2023/10/24 10:15:30 [error] 12345#0: *6789 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.1.100, server: example.com, request: "GET /api/data HTTP/1.1", upstream: "http://127.0.0.1:3000/api/data"

Or, if using UNIX sockets:

2023/10/24 10:16:45 [error] 12345#0: *6790 connect() to unix:/var/run/php/php8.1-fpm.sock failed (2: No such file or directory) while connecting to upstream

The Anatomy of a 504 Gateway Timeout

A 504 Gateway Timeout means Nginx successfully connected to the upstream server, but the upstream server didn't respond within the configured time limit. The error log will look like this:

2023/10/24 10:20:10 [error] 12345#0: *6791 upstream timed out (110: Connection timed out) while reading response header from upstream


Step 1: Diagnose the Root Cause

Before making configuration changes, you must pinpoint the exact failure. Blindly increasing timeouts or worker limits can mask underlying application performance issues or make a Denial of Service (DoS) situation worse.

1. Inspect the Nginx Error Log

The first step is always to check the Nginx error log. By default, this is located at /var/log/nginx/error.log.

tail -n 50 /var/log/nginx/error.log

Look for specific error codes in parentheses:

  • (111: Connection refused): The backend service is not listening on the specified port, or is down.
  • (13: Permission denied): Nginx does not have read/write access to the UNIX socket file, or SELinux is blocking the connection.
  • (110: Connection timed out): The backend took too long to process the request.
  • (104: Connection reset by peer): The backend abruptly closed the connection before sending a complete response.

2. Verify Upstream Service Status

If Nginx reports Connection refused, the upstream service is likely stopped or crashed. Check the status of your application server using systemctl:

systemctl status php8.1-fpm
# or
systemctl status my-node-app

If the service is running, ensure it is listening on the expected port or socket:

sudo ss -tulpn | grep LISTEN

3. Check System Resources (CPU, Memory, File Descriptors)

If Nginx is crashing outright (core dumps, out-of-memory kills) or logging 'worker_connections are not enough', the server might be starved for resources.

Run dmesg -T | grep -i oom to check if the Linux Out Of Memory killer terminated the Nginx worker processes or the backend application.

Run top or htop to check for high CPU usage. If Nginx is pinned at 100% CPU, it might be stuck in a loop caused by misconfigured rewrite rules, or it might be overwhelmed by SSL/TLS handshakes during a traffic spike.
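A quick resource triage pass might look like this (the dmesg check may require root on systems with kernel.dmesg_restrict enabled):

```shell
# Overall memory headroom and swap usage
free -h

# Did the kernel's OOM killer terminate nginx or the backend?
sudo dmesg -T | grep -iE 'out of memory|killed process'

# Top CPU consumers right now (look for nginx workers or the backend)
ps -eo pid,comm,%cpu,%mem --sort=-%cpu | head -10
```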


Step 2: Fix and Optimize

Depending on your diagnostic findings, apply the appropriate fixes below.

Fix 1: Resolving 'Connection Refused' (502)

If the backend service is down, start it:

sudo systemctl restart your-backend-service

If the service is running but Nginx still cannot connect, ensure the upstream or proxy_pass directive in your Nginx configuration matches the exact IP and port the backend is listening on. If your backend is bound to 127.0.0.1:8080, your Nginx config must look like this:

location / {
    proxy_pass http://127.0.0.1:8080;
}
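With the address confirmed in the config, verify the backend really answers there, bypassing Nginx entirely (127.0.0.1:8080 is the example address from above; substitute your own):

```shell
# Is anything listening on the port Nginx proxies to?
sudo ss -tlnp | grep ':8080'

# Does the backend respond directly, without Nginx in the middle?
curl -i http://127.0.0.1:8080/
```

If curl succeeds here but Nginx still returns 502, the problem is between Nginx and the backend (address mismatch, firewall, or SELinux), not the backend itself.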

Fix 2: Resolving 'Permission Denied' on UNIX Sockets (502)

If you are using a UNIX socket (e.g., /run/gunicorn.sock) and see a 13: Permission denied error, the Nginx worker process (usually running as the www-data or nginx user) does not have read/write permissions for the socket file.

Solution A: Fix Socket Ownership Configure your backend service (e.g., Gunicorn, PHP-FPM) to create the socket with the correct ownership.

For PHP-FPM (/etc/php/8.1/fpm/pool.d/www.conf):

listen.owner = www-data
listen.group = www-data
listen.mode = 0660

Then restart PHP-FPM.
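After restarting, confirm the socket exists with the expected ownership and mode (the path assumes PHP 8.1 on Debian/Ubuntu; adjust for your version and distro):

```shell
sudo systemctl restart php8.1-fpm

# Owner, group, and permission bits should match the listen.* settings
# above (www-data:www-data, mode 660)
stat -c '%U:%G %a' /run/php/php8.1-fpm.sock
```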

Solution B: SELinux If you are on CentOS, RHEL, or Fedora, SELinux might be blocking Nginx from connecting to network ports or sockets. To allow Nginx to connect to network proxies:

sudo setsebool -P httpd_can_network_connect 1

Fix 3: Resolving Gateway Timeouts (504)

If your backend application is executing long-running database queries or external API calls, it might legitimately take 30-60 seconds to respond. Nginx's default proxy timeout is usually 60 seconds. If the backend takes longer, Nginx drops the connection and returns a 504.

To fix this, increase the proxy timeout directives in your Nginx location block:

location /api/reports/ {
    proxy_pass http://backend;
    proxy_read_timeout 300s;
    proxy_connect_timeout 75s;
    proxy_send_timeout 300s;
}
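Before raising timeouts, measure how long the backend actually takes so the limit can sit just above the realistic worst case (the port and path here are assumptions; substitute your slow endpoint):

```shell
# Time the backend directly, bypassing Nginx
curl -o /dev/null -s -w 'connect: %{time_connect}s  total: %{time_total}s\n' \
  http://127.0.0.1:8080/api/reports/
```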

If using FastCGI (like PHP-FPM), increase the FastCGI timeouts:

location ~ \.php$ {
    fastcgi_pass unix:/var/run/php/php8.1-fpm.sock;
    fastcgi_read_timeout 300s;
    # ... other fastcgi params
}
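A raised FastCGI timeout only helps if PHP itself does not kill the script first. In the same pool file edited earlier, PHP's own execution limit can be raised to match (a sketch; the 300-second value simply mirrors the Nginx setting above):

```ini
; /etc/php/8.1/fpm/pool.d/www.conf
; PHP enforces max_execution_time independently of Nginx; if it is
; lower than fastcgi_read_timeout, PHP aborts the request first.
php_admin_value[max_execution_time] = 300
```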

Warning: Do not set timeouts to excessively high values globally. If your application is hanging indefinitely, high timeouts will cause Nginx worker connections to pile up until the server exhausts all memory and crashes.

Fix 4: Fixing 'Too Many Connections' and 'Out of Memory'

If your logs show worker_connections are not enough, you need to tune Nginx for higher concurrency.

Open /etc/nginx/nginx.conf and locate the events block. Increase the worker_connections:

events {
    worker_connections 4096; # Default is often 512 or 1024
    multi_accept on;
}

Ensure worker_processes is set to auto so Nginx utilizes all available CPU cores.
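The two settings multiply: total capacity is roughly worker_processes × worker_connections, and each proxied request consumes two connections (one to the client, one to the upstream). As a back-of-the-envelope check for a 4-core machine:

```shell
# 4 workers x 4096 connections, 2 connections per proxied request
echo $(( 4 * 4096 / 2 ))   # prints 8192 concurrent proxied clients
```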

To keep memory use bounded and predictable when handling large uploads or large upstream responses, set explicit buffer sizes in the http block:

client_max_body_size 50M;
client_body_buffer_size 128k;
proxy_buffer_size 128k;
proxy_buffers 4 256k;
proxy_busy_buffers_size 256k;
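Proxy buffers are allocated per active connection, so estimate the worst case before raising them. With the values above and 4096 worker connections all busy proxying at once (an assumption for illustration):

```shell
# Per-request proxy memory: 4 buffers x 256 KB + 128 KB header buffer
# Worst case across 4096 simultaneously busy connections, in MB:
echo $(( 4096 * (4 * 256 + 128) / 1024 ))   # prints 4608 (about 4.5 GB)
```

If that number exceeds your available RAM, lower the buffer sizes or the connection cap rather than relying on never hitting peak load.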

Fix 5: Investigating Core Dumps

If Nginx worker processes are crashing outright (segfaulting and leaving core dumps, or failing to stay up at all), the cause is often a bug in a third-party Nginx module (like PageSpeed or ModSecurity), or a system-level issue like running out of file descriptors.

First, increase the file descriptor limit for the Nginx process. Add this to the top level of /etc/nginx/nginx.conf:

worker_rlimit_nofile 65535;

If crashes persist, enable core dumps to analyze the crash using gdb. Add this to your nginx.conf:

working_directory /tmp/cores;
worker_rlimit_core 500M;

Restart Nginx, wait for the crash, and analyze the core dump file with a C debugger. In most production scenarios, simply updating Nginx to the latest mainline release or disabling custom compiled modules resolves crash loop issues.
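A minimal core-dump workflow, assuming the binary lives at /usr/sbin/nginx and the working_directory shown above (the kernel must also be told where to write dumps):

```shell
# Create the dump directory and make it writable by the worker processes
sudo mkdir -p /tmp/cores
sudo chmod 1777 /tmp/cores

# Point the kernel's core pattern at it
sudo sysctl -w kernel.core_pattern=/tmp/cores/core.%e.%p

# After a crash, print the backtrace from the newest dump
sudo gdb /usr/sbin/nginx "$(ls -t /tmp/cores/core.* | head -1)" \
    -ex bt -ex quit
```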

Conclusion

Troubleshooting Nginx 50x errors requires a systematic approach. Always start with /var/log/nginx/error.log to determine if the issue is a connection refusal, a timeout, or a permissions problem. By aligning Nginx's timeouts and buffer configurations with your backend application's realistic performance profile, you can eliminate most proxy-related downtime.

Quick Diagnostic Reference

# Diagnostic commands to identify Nginx upstream errors

# 1. Check the Nginx error log for the exact error code
sudo tail -f /var/log/nginx/error.log | grep upstream

# 2. Check if the backend service is listening on the expected port
sudo ss -tulpn | grep 8080

# 3. Test backend response locally (bypassing Nginx)
curl -I http://127.0.0.1:8080/

# 4. Check for SELinux blocking network connections (RHEL/CentOS)
sudo getsebool httpd_can_network_connect
# Enable if it is off
sudo setsebool -P httpd_can_network_connect 1

# 5. Check if OOM Killer terminated Nginx or the backend
dmesg -T | grep -i 'killed process'

# 6. Test Nginx configuration syntax before reloading
sudo nginx -t && sudo systemctl reload nginx

Error Medic Editorial

Our SRE team specializes in high-availability Linux infrastructure, Nginx tuning, and distributed systems troubleshooting.
