Resolving NGINX Ingress '504 Gateway Timeout', 'Connection Refused', and CrashLoopBackOff Errors in Kubernetes
Fix NGINX Ingress 504 Gateway Timeouts and Connection Refused errors. Learn to adjust proxy-read-timeout, resolve CrashLoopBackOff, and debug K8s routing.
- 504 Gateway Timeouts are usually caused by backend applications taking longer to respond than the configured NGINX proxy-read-timeout (default 60s).
- Connection Refused (502 Bad Gateway) typically indicates a disconnect between the Kubernetes Service and the backend Pods (e.g., app bound to localhost, missing endpoints, or wrong targetPort).
- CrashLoopBackOff in the NGINX controller is often due to OOMKilled events (insufficient memory limits) or invalid NGINX configuration syntax injected via ConfigMaps/Ingress rules.
- Always verify the data path: Ingress Controller Logs -> Kubernetes Endpoints -> Backend Pod Logs.
| Symptom / Error | Primary Fix Method | Time to Resolve | Risk Level |
|---|---|---|---|
| 504 Gateway Time-out | Add nginx.ingress.kubernetes.io/proxy-read-timeout annotation | 5 mins | Low |
| 111: Connection refused | Fix Service selectors and container binding (0.0.0.0) | 15 mins | Low |
| CrashLoopBackOff (OOM) | Increase memory limits in Controller Deployment | 10 mins | Medium |
| CrashLoopBackOff (Config) | Identify and remove invalid Ingress resource using admission webhooks | 20 mins | High |
Understanding NGINX Ingress Errors in Kubernetes
When managing Kubernetes clusters, the NGINX Ingress Controller is often the critical entry point for external traffic reaching your microservices. Because it sits at the edge of your cluster, any misconfiguration, resource exhaustion, or network policy issue will manifest as an ingress error. The four most common complaints developers raise are 'nginx ingress timeout' (504), 'nginx ingress connection refused' (502), 'nginx ingress crashloopbackoff', and a general 'nginx ingress not working'.
This guide provides a senior-level, systematic approach to diagnosing and resolving these specific errors, tracing the request lifecycle from the external load balancer down to the application container.
Scenario 1: The 'nginx ingress timeout' (504 Gateway Time-out)
The Symptom
Users or API clients report receiving a 504 Gateway Time-out response. In your NGINX ingress controller logs, you will see entries similar to this:
```
[error] 1234#1234: *5678 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 192.168.1.5, server: api.example.com, request: "POST /v1/reports/generate HTTP/1.1", upstream: "http://10.244.1.15:8080/v1/reports/generate", host: "api.example.com"
```
Root Cause Analysis
A 504 Gateway Timeout means that NGINX successfully routed the request to the upstream backend pod, but the backend failed to return an HTTP response within the configured timeout window. By default, the NGINX Ingress Controller sets `proxy-read-timeout` and `proxy-send-timeout` to 60 seconds (`proxy-connect-timeout` defaults to just 5 seconds).
If you have an endpoint that generates large reports, handles massive file uploads, or runs complex AI model inferences, it will likely take longer than 60 seconds. When the clock hits 60s, NGINX ruthlessly cuts the connection and returns a 504 to the client, even if the backend pod is still happily processing the job.
Step 1: Diagnose
- Check the ingress controller logs to confirm the `upstream timed out` error.
- Cross-reference the timestamp with the backend pod logs. You will often see the backend pod successfully complete the task after the 504 was issued, completely unaware that NGINX dropped the client.
Step 2: The Fix
You need to instruct NGINX to wait longer for this specific route. This is done using Ingress annotations. Do not change this globally unless absolutely necessary, as it ties up NGINX worker connections.
Apply the following annotations to your specific Ingress resource:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: reporting-api-ingress
  namespace: production
  annotations:
    # Increase timeouts to 300 seconds (5 minutes)
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    # NGINX caps proxy_connect_timeout at 75s, so values above that are moot
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
```
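If many routes legitimately need the same limit, the identical keys can instead be set cluster-wide in the controller's ConfigMap. A sketch, assuming the default name and namespace of a Helm install (`ingress-nginx-controller` in `ingress-nginx`); per-Ingress annotations still override these values:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # Helm default; adjust to your install
  namespace: ingress-nginx
data:
  # Same keys as the annotations, but applied to every Ingress
  proxy-read-timeout: "300"
  proxy-send-timeout: "300"
```

Prefer the per-route annotation where you can: a global 300-second timeout lets one slow endpoint hold worker connections open on every route.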
Note: If your application sits behind an AWS Classic Load Balancer (ELB) or Application Load Balancer (ALB), ensure the load balancer's idle timeout is also increased to match or exceed your NGINX timeout. Otherwise, the cloud LB will drop the connection before NGINX does, resulting in a 504 from the cloud provider, not NGINX.
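For the Classic ELB case specifically, the idle timeout can be raised with a Service annotation on the ingress-nginx controller Service. A sketch assuming the in-tree AWS cloud-provider annotation; ALB idle timeouts are configured on the ALB's own attributes instead:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    # Raise the ELB idle timeout to match the 300s NGINX timeout
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "300"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
```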
Scenario 2: The 'nginx ingress connection refused' (502 Bad Gateway)
The Symptom
Clients receive a 502 Bad Gateway error. The NGINX logs reveal the following critical error:
```
[error] 456#456: *890 connect() failed (111: Connection refused) while connecting to upstream, client: 10.0.0.5, server: app.example.com, request: "GET / HTTP/1.1", upstream: "http://10.244.2.33:3000/", host: "app.example.com"
```
Root Cause Analysis
Unlike a timeout, Connection refused (errno 111, `ECONNREFUSED`) means NGINX reached the Pod IP (10.244.2.33 on port 3000), but nothing was accepting connections there: the TCP SYN was answered with a RST because no process had that port open in the pod's network namespace.
This happens for three main reasons:
- Application Binding to Localhost: The application inside the container is listening on `127.0.0.1` (localhost) instead of `0.0.0.0` (all interfaces). NGINX connects via the Pod's eth0 IP, which is rejected.
- Port Mismatch: The Kubernetes Service `targetPort` does not match the port the application is actually listening on.
- Application Crashed: The pod is running, but the specific application process inside it has crashed, and the container hasn't restarted yet.
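The third case, a dead process inside a Running pod, is best caught by a `readinessProbe`: a failing probe removes the pod from the Endpoints list, and therefore from NGINX's upstream pool, as soon as the port stops answering. A sketch, assuming the app from this scenario listens on 3000 and exposes a `/healthz` route (both are assumptions about your app):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest   # placeholder image
          ports:
            - containerPort: 3000
          readinessProbe:
            httpGet:
              path: /healthz     # assumed health endpoint
              port: 3000
            periodSeconds: 5
```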
Step 1: Diagnose
First, verify that the Kubernetes Endpoints object is correctly populated. NGINX bypasses kube-proxy and routes directly to the Endpoints.
Run:

```shell
kubectl get endpoints <service-name> -n <namespace>
```

If the ENDPOINTS column is empty, the Service selector matches no Ready pods; fix the selector or the pod labels before anything else. If endpoints exist, port-forward directly to the pod to test local connectivity:

```shell
kubectl port-forward pod/<pod-name> 3000:3000
```

If port-forwarding works but NGINX fails, check the application's bind address.
Step 2: The Fix
Fix A: Change the Application Bind Address
Ensure your Node.js, Python, or Go application binds to 0.0.0.0.
- Node/Express: `app.listen(3000, '0.0.0.0')`
- Python/Flask: `app.run(host='0.0.0.0', port=3000)`
- Go: `http.ListenAndServe(":3000", nil)`
Fix B: Correct the Service targetPort
Ensure your Service configuration aligns with the container port:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80         # Port exposed internally in the cluster
      targetPort: 3000 # EXACT port the app listens on inside the container
```
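One way to keep `port` and `targetPort` from drifting apart is to use a named port: the Service references the name, and the actual number lives in exactly one place, the pod spec. A sketch using the same hypothetical app:

```yaml
# In the Deployment's pod template:
#   ports:
#     - name: http
#       containerPort: 3000
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: http  # resolves to whichever containerPort is named "http"
```

If the container port later changes, only the Deployment needs editing; the Service keeps working.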
Scenario 3: NGINX Ingress 'CrashLoopBackOff'
The Symptom
When you run `kubectl get pods -n ingress-nginx`, you see the controller pod constantly restarting:

```
ingress-nginx-controller-5c8d66c76d-xyz12   0/1   CrashLoopBackOff   15 (3m ago)   45m
```
Root Cause Analysis
A CrashLoopBackOff on the ingress controller is a severe cluster-level issue. It means the NGINX process is terminating abruptly. The two primary culprits are:
- OOMKilled (Out of Memory): Under high traffic, NGINX consumes memory for active connections, SSL buffers, and caching. If the limits are set too low, the Linux kernel's OOM killer will terminate the NGINX process.
- Invalid Configuration (Poison Pill): The NGINX controller dynamically generates `nginx.conf` based on your Ingress resources. If a developer deploys an Ingress resource with a malformed snippet annotation (e.g., `nginx.ingress.kubernetes.io/configuration-snippet`), it can result in invalid NGINX syntax. NGINX will fail to reload or start, causing the pod to crash.
Step 1: Diagnose
First, determine if it's an OOM issue by describing the pod:

```shell
kubectl describe pod -l app.kubernetes.io/name=ingress-nginx -n ingress-nginx
```

Look for `Reason: OOMKilled` under the `Last State` section.
If it is NOT OOMKilled, check the logs of the previous crashed container:

```shell
kubectl logs -l app.kubernetes.io/name=ingress-nginx -n ingress-nginx --previous
```

You are looking for a fatal NGINX syntax error, such as:

```
nginx: [emerg] "server" directive is not allowed here in /etc/nginx/nginx.conf:145
```
Step 2: The Fix
Fix A: Resolving OOMKilled
Update your Helm chart values or deployment manifest to increase memory limits.

```yaml
controller:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 1000m
      memory: 1Gi # Increase this significantly
```
Fix B: Resolving Poison Pill Configurations
If the controller is crashing due to bad syntax, you must find and delete the offending Ingress object. Because the controller is in CrashLoopBackOff, it cannot process deletions gracefully. You may need to manually inspect all recently modified ingress resources:

```shell
kubectl get ingress --all-namespaces -o yaml | grep -C 5 snippet
```
Once identified, delete the malformed ingress. To prevent this permanently, enable the NGINX Ingress Validating Webhook. The webhook intercepts kubectl apply commands and tests the resulting nginx.conf before accepting the Ingress object into the cluster.
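With the official ingress-nginx Helm chart, the validating webhook is controlled by a values flag (enabled by default in recent chart versions; shown explicitly here as a sketch):

```yaml
# values.yaml for the ingress-nginx Helm chart
controller:
  admissionWebhooks:
    enabled: true   # reject Ingress objects whose generated config fails validation
```

With this enabled, a poison-pill Ingress is rejected at `kubectl apply` time with the NGINX syntax error in the API response, instead of crashing the controller later.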
Scenario 4: NGINX Ingress Not Working (Default Backend 404)
The Symptom
You deploy your app, create an Ingress, but when you curl the endpoint, you get:
```
default backend - 404
```
Root Cause Analysis
This generic "not working" state means the request reached the NGINX controller, but NGINX has no matching rule in its routing table for the provided Host header and Path.
- Missing IngressClass: In newer Kubernetes versions, Ingress resources require an `ingressClassName`. Without it, the controller ignores the object.
- Host Header Mismatch: The client is requesting an IP or a domain that does not perfectly match the `host` field in the Ingress rule.
- Path Mismatch: The `pathType` (Exact vs Prefix) or the path regex is incorrect.
Step 1: Diagnose & Fix
Ensure your Ingress resource correctly specifies the ingressClassName and matches the requested Host.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  namespace: default
spec:
  ingressClassName: nginx # CRITICAL: Must match your controller's class
  rules:
    - host: myapp.example.com # Must match the HTTP Host header exactly
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-service
                port:
                  number: 80
```
Test via curl by explicitly passing the Host header:
```shell
curl -H "Host: myapp.example.com" http://<ingress-controller-ip>/
```
By systematically verifying timeouts, backend connectivity, resource limits, and routing rules, you can resolve the vast majority of NGINX Ingress Controller issues in production Kubernetes environments.
Comprehensive Diagnostic Script
The following script consolidates the checks from all four scenarios into a single diagnostic pass:
```shell
#!/bin/bash
# Comprehensive NGINX Ingress Diagnostic Script
NAMESPACE="production"
INGRESS_NAME="my-app-ingress"
SERVICE_NAME="my-app-service"
INGRESS_NS="ingress-nginx"

echo "--- 1. Checking Ingress Controller Logs for Errors (Timeouts/Refused) ---"
kubectl logs -n $INGRESS_NS -l app.kubernetes.io/name=ingress-nginx --tail=50 | grep -E 'error|warn|504|111'

echo -e "\n--- 2. Checking Ingress Controller Pod Status (CrashLoopBackOff check) ---"
kubectl get pods -n $INGRESS_NS -l app.kubernetes.io/name=ingress-nginx

echo -e "\n--- 3. Verifying Ingress Resource Annotations and Class ---"
kubectl get ingress $INGRESS_NAME -n $NAMESPACE -o yaml | grep -E 'proxy-|ingressClassName|host'

echo -e "\n--- 4. Checking Service and Endpoints Mapping ---"
kubectl get svc $SERVICE_NAME -n $NAMESPACE
kubectl get endpoints $SERVICE_NAME -n $NAMESPACE

echo -e "\n--- 5. Checking Backend Pod Status ---"
kubectl get pods -n $NAMESPACE -l app=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.selector.app}')
```

Error Medic Editorial
Error Medic Editorial comprises senior Site Reliability Engineers and DevOps architects dedicated to breaking down complex distributed systems failures into actionable, production-ready solutions.
Sources
- https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#custom-timeouts
- https://kubernetes.github.io/ingress-nginx/troubleshooting/
- https://stackoverflow.com/questions/52175510/nginx-ingress-connect-failed-111-connection-refused-while-connecting-to-u
- https://github.com/kubernetes/ingress-nginx/issues/6451