Error Medic

Troubleshooting NGINX Ingress: "504 Gateway Timeout", "Connection Refused", and CrashLoopBackOff

Fix NGINX Ingress timeouts, Connection Refused, and CrashLoopBackOff errors. Learn how to debug proxy settings, adjust timeouts, and check pod health.

Key Takeaways
  • 504 Gateway Timeouts usually stem from upstream application pods taking too long to respond; fix by adjusting proxy-read-timeout annotations.
  • Connection Refused often indicates the NGINX controller service lacks endpoints, or an external cloud LoadBalancer/Security Group is misconfigured.
  • CrashLoopBackOff typically points to OOMKilled events from high memory usage or bad NGINX configuration syntax caused by a malformed Ingress resource.
  • Quick fix: Always verify controller logs using `kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx` to pinpoint whether the issue is routing, upstream health, or internal controller crashes.
NGINX Ingress Failure Modes Compared
| Symptom / Error | Primary Root Cause | Diagnostic Command | Resolution Strategy |
|---|---|---|---|
| 504 Gateway Timeout | Upstream pod processing delay > 60s | `kubectl logs <app-pod>` | Add `nginx.ingress.kubernetes.io/proxy-read-timeout` annotation |
| Connection Refused | Service selector mismatch / firewall | `kubectl get endpoints -n ingress-nginx` | Fix service label selectors or cloud security groups |
| CrashLoopBackOff | OOMKilled or invalid nginx.conf syntax | `kubectl logs <controller-pod> --previous` | Increase memory limits or fix malformed Ingress snippet |
| 404 Default Backend | Host or path routing rule mismatch | `kubectl describe ingress <name>` | Correct the host header or pathType in the Ingress YAML |

Understanding NGINX Ingress Failures

When managing Kubernetes clusters, the NGINX Ingress Controller is often the critical gateway for all incoming traffic. When it fails, your entire application stack can appear to be offline. The most common symptoms reported by developers and operators include 504 Gateway Timeout, Connection Refused, the dreaded CrashLoopBackOff state, or a general "NGINX Ingress not working" complaint.

This guide breaks down each of these failure modes, providing actionable diagnostic steps and concrete solutions to get your traffic routing restored.

1. NGINX Ingress Timeout (504 Gateway Timeout)

A 504 Gateway Timeout occurs when NGINX is acting as a proxy and does not receive a timely response from the upstream server (your application pod). By default, NGINX waits 60 seconds for a response. If your backend takes longer than this to generate a response, NGINX severs the connection and returns a 504 to the client.

Diagnostic Steps
  1. Check Upstream Pods: Are your application pods actually processing requests, or are they deadlocked? Use kubectl logs <your-app-pod>. Look for slow database queries, thread exhaustion, or application-level timeouts.
  2. Check Ingress Logs: Run kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx. You will see entries like upstream timed out (110: Connection timed out) while reading response header from upstream.
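To confirm whether the delay is in your application rather than in NGINX, it can help to time a request against the backend directly, bypassing the Ingress entirely. A minimal sketch, assuming a Service named `my-app` on port 80 in the `default` namespace (adjust the names to your environment):

```shell
# Port-forward straight to the application Service, bypassing the Ingress.
# ("my-app" and the ports are placeholders for your own Service.)
kubectl port-forward svc/my-app 8080:80 &
PF_PID=$!
sleep 2

# Time a request against the backend directly. If this also takes longer
# than 60 seconds, the problem is the application, not the Ingress timeout.
curl -s -o /dev/null -w "total: %{time_total}s\n" http://127.0.0.1:8080/

kill $PF_PID
```

If the direct request is fast but requests through the Ingress still time out, look at the proxy timeout settings described below.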
The Fix

If your application legitimately needs more than 60 seconds to process a request (e.g., file uploads, complex report generation, or long-polling web sockets), you need to increase the timeout annotations on your specific Ingress resource:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-long-running-app
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
```
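After applying the annotations, you can verify that the values actually reached the rendered NGINX configuration. A sketch, assuming a standard ingress-nginx install (default namespace and labels):

```shell
# Grab one controller pod (assumes the standard ingress-nginx namespace/labels)
POD=$(kubectl get pods -n ingress-nginx \
  -l app.kubernetes.io/name=ingress-nginx \
  -o jsonpath='{.items[0].metadata.name}')

# The annotations are rendered into nginx.conf as proxy_*_timeout directives;
# grep for them to confirm the 120s values are live for your server block.
kubectl exec -n ingress-nginx "$POD" -- \
  grep -E "proxy_(read|send|connect)_timeout" /etc/nginx/nginx.conf | sort -u
```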

2. NGINX Ingress Connection Refused

Getting a Connection Refused error usually means the client reached the server IP, but no process was listening on the target port (typically 80 or 443). This happens at the TCP layer before HTTP negotiation even begins.

Diagnostic Steps
  1. Verify the Service: Ensure the Ingress controller service is actually exposed and running. kubectl get svc -n ingress-nginx.
  2. Check Endpoints: Does the service have endpoints? kubectl get endpoints -n ingress-nginx ingress-nginx-controller. If endpoints show as <none>, the service selector isn't matching the running controller pods.
  3. External Load Balancer: If using AWS ELB/NLB, Azure ALB, or GCP Load Balancers, verify the target groups. A misconfigured security group might block traffic between the load balancer and the Kubernetes worker nodes.
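A quick way to spot a selector mismatch is to print the Service's selector next to the labels on the running pods. A sketch, assuming the default controller Service name `ingress-nginx-controller`:

```shell
# Print the controller Service's selector...
kubectl get svc -n ingress-nginx ingress-nginx-controller \
  -o jsonpath='{.spec.selector}'; echo

# ...and the labels on the pods it is supposed to match.
# Every key/value in the selector must appear in a pod's label set,
# or the endpoints list stays empty and connections are refused.
kubectl get pods -n ingress-nginx --show-labels
```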
The Fix

If endpoints are missing, verify that the pod labels match the service selector. If you are testing locally (Minikube/Kind/Docker Desktop), make sure you are using minikube tunnel or port-forwarding correctly: kubectl port-forward --namespace=ingress-nginx service/ingress-nginx-controller 8080:80. In a cloud environment, ensure the NodePort assigned to the NGINX service is allowed through your nodes' firewall/security groups.

3. NGINX Ingress CrashLoopBackOff

A CrashLoopBackOff state means the Ingress controller pod starts, crashes almost immediately, and Kubernetes keeps backing off and trying to restart it. The controller cannot route traffic while in this state.

Diagnostic Steps
  1. Describe the Pod: kubectl describe pod -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx. Look at the Events section at the bottom and the State reason for the container.
  2. Check for OOMKilled: If the reason is OOMKilled, the controller ran out of memory. This is highly common in large clusters with thousands of ingress rules or frequent reload events.
  3. Inspect Logs for Syntax Errors: NGINX will crash if it generates an invalid nginx.conf. Run kubectl logs <ingress-pod> --previous to see the log right before the crash.
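The steps above can be condensed into two commands that surface the crash reason and restart count directly (labels assume a standard ingress-nginx install):

```shell
# Show why the controller container last terminated (e.g. "OOMKilled" or "Error")
kubectl get pods -n ingress-nginx \
  -l app.kubernetes.io/name=ingress-nginx \
  -o jsonpath='{.items[*].status.containerStatuses[*].lastState.terminated.reason}'; echo

# Restart counts help confirm a crash loop rather than a one-off failure
kubectl get pods -n ingress-nginx \
  -l app.kubernetes.io/name=ingress-nginx \
  -o custom-columns=NAME:.metadata.name,RESTARTS:.status.containerStatuses[0].restartCount
```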
The Fix
  • For OOMKilled: Increase the memory limits in the controller's Deployment spec. A typical production deployment might need 512Mi or even 1Gi of memory depending on cluster size.
  • For Configuration Errors: A bad configuration snippet (via nginx.ingress.kubernetes.io/configuration-snippet) in any single Ingress resource can break the global nginx.conf template, taking down the whole controller. Look for the specific Ingress resource causing the syntax error in the logs, and fix or delete it.
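For the OOMKilled case, one way to raise the limit without editing YAML files is a JSON patch against the Deployment. A sketch, assuming the default deployment name `ingress-nginx-controller` and that a memory limit is already set (JSON patch `replace` fails if the field is absent; edit the Deployment spec directly in that case):

```shell
# Bump the controller's memory limit to 1Gi in place.
kubectl patch deployment ingress-nginx-controller -n ingress-nginx \
  --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/memory", "value": "1Gi"}]'
```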

4. NGINX Ingress Not Working (Generic 404 or 503)

If you receive a `default backend - 404` error, the request successfully reached the NGINX Ingress controller, but NGINX couldn't find a matching routing rule for the Host header or URL path.

Diagnostic Steps
  1. Check Host Headers: Ensure the Host header in your HTTP request (e.g., in your browser or curl) exactly matches the host field defined in your Ingress rule.
  2. Check Path Matching: NGINX path matching can be strict. Check your pathType (Prefix vs Exact).
  3. Verify Upstream Health (503 Error): A 503 Service Temporarily Unavailable usually means the endpoints list is populated, but the upstream application pods are failing their readiness probes, so NGINX has marked them as down.
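You can test host-header matching from the command line without touching DNS or /etc/hosts. A sketch, where `myapp.example.com` and `203.0.113.10` are placeholders for your Ingress host rule and load balancer address:

```shell
# Send a request with an explicit Host header to exercise the routing rule
curl -v -H "Host: myapp.example.com" http://203.0.113.10/

# Or let curl resolve the hostname to the LB address itself, which also
# works for HTTPS (SNI) where a plain Host header is not enough
curl -v --resolve myapp.example.com:80:203.0.113.10 http://myapp.example.com/
```

If the explicit-Host request routes correctly but your browser gets the default backend, the mismatch is in DNS or in the Host value the client is sending.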
The Fix

Double-check your Ingress YAML definition. If you are using a regex in the path, ensure you include the `nginx.ingress.kubernetes.io/rewrite-target` annotation if necessary, and use `pathType: ImplementationSpecific` or `Prefix` as appropriate for your ingress-nginx controller version.

All-in-One Diagnostic Script

```bash
#!/bin/bash
# NGINX Ingress Diagnostic Script

NAMESPACE="ingress-nginx"
SELECTOR="app.kubernetes.io/name=ingress-nginx"

echo "=== 1. Checking NGINX Ingress Pod Status ==="
kubectl get pods -n "$NAMESPACE" -l "$SELECTOR" -o wide

echo -e "\n=== 2. Checking NGINX Ingress Service and Endpoints ==="
kubectl get svc,endpoints -n "$NAMESPACE" -l "$SELECTOR"

echo -e "\n=== 3. Fetching Recent Error Logs (Looking for Syntax Errors or Timeouts) ==="
POD_NAME=$(kubectl get pods -n "$NAMESPACE" -l "$SELECTOR" -o jsonpath='{.items[0].metadata.name}')
kubectl logs -n "$NAMESPACE" "$POD_NAME" | grep -iE "error|fatal|timeout|invalid"

echo -e "\n=== 4. Checking Previous Crashed Container Logs (For CrashLoopBackOff) ==="
kubectl logs -n "$NAMESPACE" "$POD_NAME" --previous | tail -n 20 || echo "No previous crashed container found."

echo -e "\n=== 5. Testing Local Port Forwarding ==="
echo "Run this command manually to test bypassing the external Load Balancer:"
echo "kubectl port-forward -n $NAMESPACE svc/ingress-nginx-controller 8080:80"
```

Error Medic Editorial

Our SRE and DevOps editorial team specializes in Kubernetes troubleshooting, cloud-native architecture, and site reliability engineering at scale.

