Comprehensive Guide to Fixing Nginx 502 Bad Gateway, 504 Timeouts, and Core Crashes
Diagnose and resolve Nginx 502 Bad Gateway, 504 Timeouts, connection refused errors, out-of-memory crashes, and permission denied issues with this SRE guide.
- Nginx 502 Bad Gateway errors indicate a broken connection to the upstream service (e.g., PHP-FPM, Node.js), often caused by the service being down, misconfigured ports, or socket permission issues.
- Nginx 504 Gateway Timeouts happen when the upstream application takes too long to process a request; fixing this requires application profiling and tuning proxy/fastcgi timeout directives.
- Resource exhaustion, such as 'too many connections' or 'out of memory' crashes, demands kernel-level tuning (ulimit, file descriptors) and careful configuration of worker processes and buffers.
| Symptom / Error | Common Root Cause | Primary Diagnostic Tool | Typical Resolution Time |
|---|---|---|---|
| Nginx 502 / Connection Refused | Upstream service down or wrong port | systemctl status, netstat | 5-10 mins |
| Nginx 504 / Nginx Slow | Heavy backend processing | Application APM, slow logs | 30+ mins |
| Permission Denied (Socket) | Incorrect socket owner or SELinux | ls -l, getenforce, audit2allow | 10 mins |
| Too Many Connections | Traffic spike exceeding worker limits | nginx error.log, ulimit -n | 15 mins |
| Nginx Out of Memory / Crash | OOM Killer, memory leak in module | dmesg, gdb (core dump) | Hours/Days |
Understanding Nginx Proxy Architecture & The 5xx Error Family
Nginx acts as the highly efficient, event-driven gateway for modern web infrastructure. It rarely serves dynamic content itself; instead, it proxies requests to upstream backend application servers such as PHP-FPM, Node.js, Python Gunicorn, or Java Tomcat. When you encounter errors like nginx 502, nginx 504, or experience an nginx crash, the root cause almost always lies in the communication layer between Nginx and the upstream service, or in resource exhaustion at the operating system level.
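In configuration terms, that proxy relationship is just an upstream block plus a proxy_pass directive; a minimal sketch (the backend name and port here are illustrative, not from any specific deployment):

```nginx
# Minimal reverse-proxy skeleton: Nginx terminates the client connection
# and forwards requests to a backend application server.
upstream backend {
    server 127.0.0.1:3000;   # e.g. a Node.js or Gunicorn app
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
```

Every 502 and 504 discussed below is a failure somewhere along that proxy_pass hop.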
As a Site Reliability Engineer (SRE), debugging these issues requires a systematic approach: confirming the Nginx process health, verifying system resources, analyzing the error logs, and validating upstream connectivity.
Diagnosing "502 Bad Gateway" and "Connection Refused"
A 502 Bad Gateway error means Nginx successfully accepted the client's request but received an invalid response—or no response at all—from the upstream server.
When you check /var/log/nginx/error.log, you will typically see:
[error] 1234#0: *5678 connect() failed (111: Connection refused) while connecting to upstream
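When the error log is large, a quick tally of the distinct failure modes tells you which problem dominates before you start chasing any single entry. A minimal sketch, assuming the standard log path (override LOG if your distribution logs elsewhere):

```shell
# Count occurrences of the three classic upstream failure messages.
# LOG defaults to the common Debian/RHEL path; the file may not exist on every host.
LOG="${LOG:-/var/log/nginx/error.log}"
if [ -f "$LOG" ]; then
  grep -oE 'Connection refused|Connection timed out|Permission denied' "$LOG" \
    | sort | uniq -c | sort -rn
else
  echo "No error log found at $LOG"
fi
```

A wall of "Connection refused" points at a dead upstream; a wall of "Connection timed out" points at the 504 scenario covered later.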
Step 1: Verify Upstream Health
The nginx connection refused error literally means the operating system rejected the TCP or Unix domain socket connection. Your first step is to verify that the backend is actually running.
systemctl status php8.1-fpm
# or
systemctl status my-node-app
If the service is running, ensure it is listening on the expected port or socket. Use netstat -tulpn or ss -tulpn to verify the bindings.
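That check can be scripted; here is a small sketch where check_port is a helper name invented for this example, and 9000 is assumed as the conventional PHP-FPM TCP port:

```shell
# check_port: reads `ss -tln`-style output on stdin and reports whether
# anything is bound to the given port.
check_port() {
  if grep -qE "[:.]$1[[:space:]]"; then
    echo "upstream listening on port $1"
  else
    echo "nothing listening on port $1 -- expect 502s"
  fi
}

# Pipe the live listener table through the helper; ss may be absent on
# minimal images, hence the fallback to empty input.
(ss -tln 2>/dev/null || true) | check_port "${PORT:-9000}"
```

The same pattern works for any TCP backend: set PORT to whatever your proxy_pass or fastcgi_pass targets.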
Step 2: Addressing "Nginx Permission Denied"
If your upstream relies on Unix sockets (common for PHP-FPM or Gunicorn) instead of TCP ports, you might see:
[error] 1234#0: *5678 connect() to unix:/var/run/php-fpm.sock failed (13: Permission denied) while connecting to upstream
This is a strict file permission issue. Nginx runs under a specific user (usually nginx or www-data). If this user does not have read and write permissions to the socket file, it cannot proxy traffic.
Fix: Check the user running the upstream service. You may need to change the socket owner configuration in your PHP-FPM pool (listen.owner = www-data, listen.group = www-data). On RHEL/CentOS systems, SELinux is often the hidden culprit blocking proxy connections: check its state with getenforce, and if it reports Enforcing, run setsebool -P httpd_can_network_connect 1 to allow Nginx to connect to TCP upstreams.
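In practice the socket fix lives in the PHP-FPM pool file (the path varies by distribution, e.g. /etc/php/8.1/fpm/pool.d/www.conf on Debian-family systems); a sketch assuming Nginx runs as www-data:

```ini
; Align the socket's owner and mode with the user Nginx runs as.
listen = /var/run/php-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0660
```

After editing, restart the pool (systemctl restart php8.1-fpm) and confirm the new ownership with ls -l /var/run/php-fpm.sock.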
Solving "504 Gateway Timeout" and "Nginx Slow" Issues
Unlike a 502, an nginx 504 Gateway Timeout implies that Nginx established the connection to the upstream, sent the request, but the upstream failed to return a response before the proxy timeout limit was reached.
Log example:
[error] 1234#0: *5678 upstream timed out (110: Connection timed out) while reading response header from upstream
If users complain that your site is nginx slow and it eventually throws a 504, the problem lies in the backend: slow application code, a long-running database query, or an external API call that hangs.
Mitigation and Tuning:
While fixing the application is the true solution, you can temporarily increase Nginx's patience by tuning the timeout directives in nginx.conf or your server block:
location / {
proxy_pass http://backend;
proxy_read_timeout 300s;
proxy_connect_timeout 75s;
proxy_send_timeout 300s;
}
If you are using FastCGI (PHP), adjust the fastcgi_read_timeout directive instead. Keep in mind that infinitely increasing timeouts will eventually tie up all your Nginx worker connections, leading to complete service degradation.
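For PHP backends, the equivalent tuning sits in the FastCGI location block; a sketch reusing the socket path from the log example above (adjust to your pool):

```nginx
location ~ \.php$ {
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php-fpm.sock;
    fastcgi_read_timeout 300s;   # same caveat: raise sparingly
}
```

Run nginx -t and reload after changing either set of timeouts.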
Tackling "Nginx Too Many Connections"
During traffic spikes or DDoS attacks, your server might run out of available connection slots. The error log will clearly state:
[alert] 1234#0: *5678 1024 worker_connections are not enough
How to Fix:
- Open /etc/nginx/nginx.conf.
- In the events block, increase the limit: worker_connections 4096; or higher.
- Ensure worker_processes auto; is set so Nginx spawns one worker per CPU core.
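Put together, the relevant part of nginx.conf looks like this (4096 is an example value; size it to your traffic and file-descriptor limits):

```nginx
# /etc/nginx/nginx.conf (main context)
worker_processes auto;   # one worker per CPU core

events {
    worker_connections 4096;   # per-worker connection limit
}
```

Validate with nginx -t and reload for the change to take effect.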
Kernel Limits:
Nginx cannot open more connections than the kernel grants it file descriptors. If you increase worker_connections to 10000 but your OS limit is 1024, Nginx will still fail. Check the limit for the Nginx user by running su -s /bin/sh nginx -c 'ulimit -n' (the -s flag matters because the nginx user usually has a nologin shell). To increase this permanently, edit /etc/security/limits.conf:
nginx soft nofile 65535
nginx hard nofile 65535
Restart Nginx after making these kernel-level changes.
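Note that on systemd-managed distributions, limits.conf applies to login sessions and may not affect services started at boot; a drop-in override for the unit is the more reliable route (the file path and value shown are illustrative):

```ini
# /etc/systemd/system/nginx.service.d/limits.conf
[Service]
LimitNOFILE=65535
```

Apply it with systemctl daemon-reload followed by systemctl restart nginx. Alternatively, worker_rlimit_nofile 65535; in the main context of nginx.conf raises the limit from within Nginx itself.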
Investigating Nginx Out of Memory, High CPU, and Core Dumps
When a server suffers from nginx high cpu or an nginx out of memory event, the symptoms are severe. The service may abruptly terminate, leaving users with generic browser connection errors.
OOM Killer:
If Nginx consumes all available system RAM—perhaps due to a massive influx of traffic with large payloads, unoptimized proxy_buffers, or a memory leak in a third-party dynamic module—the Linux kernel will terminate it to protect the OS.
Check the kernel logs for OOM termination:
dmesg -T | grep -i oom-killer
If you see nginx listed here, you need to either add more physical RAM/swap, or restrict Nginx's memory footprint by tuning client_max_body_size and optimizing buffer sizes.
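A starting point for that tuning (the values are illustrative; size them to your typical request and response sizes):

```nginx
# http or server context
client_max_body_size 25m;   # reject oversized uploads early
proxy_buffers 8 16k;        # per-connection response buffering
proxy_buffer_size 16k;      # buffer for the response headers
```

Smaller buffers trade memory for more disk buffering under load, so measure before and after.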
Nginx Crash and Core Dumps:
If you encounter a segmentation fault, where Nginx abruptly exits with an nginx failed status and signal 11 (SIGSEGV), you have a deep bug, often related to OpenSSL or compiled third-party modules. (An exit on signal 9, SIGKILL, usually points back to the OOM killer above rather than a crash inside Nginx.) To trace an nginx crash log, you must enable core dumps.
Add this to the top of your nginx.conf (main context):
worker_rlimit_core 500M;
working_directory /tmp/nginx-cores;
Ensure the /tmp/nginx-cores directory exists and is writable by the nginx user. When the nginx crash happens next, a core file (e.g., core.1234) will be written. You can then use the GNU Debugger to analyze the nginx core dump:
gdb /usr/sbin/nginx /tmp/nginx-cores/core.1234
Typing bt (backtrace) in GDB will reveal the exact C function where Nginx crashed, which is invaluable for submitting bug reports or removing the offending module.
Resolving "Nginx Service Not Starting"
Often during deployments, you may find Nginx completely dead, with systemd reporting a failed unit: the classic nginx service not starting or nginx not working scenario. The usual suspects:
- Configuration Syntax: Never restart Nginx without testing the config. Run nginx -t. A simple missing semicolon can prevent the entire master process from booting.
- Port Binding Conflicts: If the error log shows bind() to 0.0.0.0:80 failed (98: Address already in use), another process is hoarding the port. This could be Apache, an orphaned Nginx master process, or another reverse proxy. Find the culprit using netstat -tulpn | grep :80 or lsof -i :80, then stop the offending service gracefully, reserving kill -9 <PID> for processes that ignore a normal termination signal.
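Finding the port's owner can be scripted; a small sketch where holder_of is a helper name invented for this example, parsing the PID/Program column of netstat output:

```shell
# holder_of: reads `netstat -tulpn`-style output on stdin and prints the
# PID/program occupying the given port (empty output means the port is free).
holder_of() {
  grep -E "[:.]$1[[:space:]]" | awk '{print $NF}' | head -n1
}

# netstat may be missing on minimal images, hence the fallback to empty input.
(netstat -tulpn 2>/dev/null || true) | holder_of 80
```

Feed it 443 or any other contested port the same way.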
By systematically verifying upstream health, tuning timeout and connection limits, and deeply analyzing system logs and core dumps, you can ensure your Nginx infrastructure remains resilient under extreme loads.
Bonus: All-in-One Nginx Diagnostic Script
#!/bin/bash
# Nginx Diagnostic Script: Checks syntax, ports, upstream health, and logs
echo "--- 1. Testing Nginx Configuration Syntax ---"
nginx -t
echo -e "\n--- 2. Checking Nginx Process Health ---"
systemctl status nginx --no-pager | grep -i active
echo -e "\n--- 3. Identifying Processes Listening on Port 80/443 ---"
netstat -tulpn | grep -E ':80|:443'
echo -e "\n--- 4. Extracting Recent 502 and 504 Errors from Nginx Logs ---"
if [ -f /var/log/nginx/error.log ]; then
tail -n 500 /var/log/nginx/error.log | grep -E 'Connection refused|timed out|Permission denied|worker_connections'
else
echo "Log file /var/log/nginx/error.log not found."
fi
echo -e "\n--- 5. Checking for Kernel OOM (Out of Memory) Kills ---"
dmesg -T | grep -i 'oom-killer' | tail -n 5
echo -e "\n--- 6. Checking Current File Descriptor Limits (ulimit) ---"
su - nginx -s /bin/bash -c 'ulimit -n'
Error Medic Editorial
Error Medic Editorial is composed of senior Site Reliability Engineers and DevOps architects dedicated to publishing actionable, deeply technical troubleshooting guides for enterprise infrastructure.
Sources
- https://nginx.org/en/docs/http/ngx_http_proxy_module.html
- https://docs.nginx.com/nginx/admin-guide/monitoring/debugging/
- https://serverfault.com/questions/338123/nginx-error-111-connection-refused-while-connecting-to-upstream
- https://www.digitalocean.com/community/tutorials/how-to-optimize-nginx-configuration