Envoy 503 Service Unavailable: Root Causes and Troubleshooting Guide
Fix Envoy 503 Service Unavailable errors. Learn how to diagnose upstream connection failures, connection pool exhaustion, and TLS issues with actionable steps.
- Upstream Connection Failure (UF): The upstream service is down, unreachable, or rejecting connections on the specified port.
- Connection Pool Exhaustion: Envoy cannot open new connections to the upstream service because the connection pool limits have been reached.
- Upstream Request Timeout (UT): The upstream service took too long to respond to the request (strictly surfaced as a 504 rather than a 503).
- No Healthy Upstream (UH): Health checks have failed for all endpoints in the upstream cluster.
- Quick Fix: Check Envoy access logs for the specific response flag (e.g., UF, UH, URX) to pinpoint the exact reason for the 503.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Access Log Analysis | Initial triage to identify response flags (UF, UH, etc.) | Fast | Low |
| Admin Stats Interface | Checking circuit breaker states and connection pools | Medium | Low |
| Envoy Trace Logging | Deep diving into TLS handshakes or routing logic | Slow | High (Performance impact) |
| Upstream Application Logs | When Envoy indicates the upstream closed the connection (UC) | Medium | Low |
Understanding the Error
When Envoy Proxy returns a 503 Service Unavailable error, it is acting as a faithful messenger. It means Envoy successfully received the downstream request but was unable to forward it to or receive a valid response from the upstream service.
Unlike a generic web server error, Envoy provides highly granular telemetry indicating exactly why the 503 occurred via Response Flags in its access logs. A 503 is rarely an issue with Envoy itself; it's almost always a problem with the upstream service's health, network connectivity, or Envoy's configuration regarding how to communicate with that upstream.
Step 1: Diagnose with Response Flags
The most critical step in troubleshooting an Envoy 503 is checking the access logs. Envoy appends a short code (the response flag) to the log entry. Here are the most common flags associated with a 503:
- UF (Upstream Connection Failure): Envoy could not establish a TCP connection to the upstream. This usually means the upstream process is not running, is listening on the wrong port, or a firewall/security group is blocking the connection.
- UC (Upstream Connection Termination): The upstream accepted the connection but closed it before a complete response was received, often due to a protocol mismatch or a crash mid-request.
- UH (No Healthy Upstream): Envoy's active health checking has determined that no hosts in the upstream cluster are healthy. The request is rejected before a connection is even attempted.
- URX (Upstream Retry Limit Exceeded): The request failed, and Envoy exhausted its configured retry attempts.
- UT (Upstream Request Timeout): The upstream connection was established, but the upstream failed to respond within the configured timeout period. (Note that Envoy pairs UT with a 504 Gateway Timeout rather than a 503.)
- UO (Upstream Overflow): The circuit breakers for the upstream cluster tripped. This often happens if the connection pool is exhausted or pending request limits are hit.
To view these logs, you typically need to check stdout for the Envoy container or your centralized logging system. Look for entries like:
[2023-10-27T10:00:00.000Z] "GET /api/v1/data HTTP/1.1" 503 UF 0 0 5 - "-" "curl/7.68.0" ...
Notice the UF right after the 503 status code.
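For quick triage across many log lines, the flag can be tallied mechanically. A minimal sketch, assuming the default access log format, where the response flag is the field immediately after the status code; the function name and log path are illustrative:

```shell
# Tally 503 responses by response flag from an Envoy access log file.
# Assumes the default log format: the status code is whitespace field 5
# and the response flag is field 6.
tally_503_flags() {
  awk '$5 == 503 { counts[$6]++ } END { for (f in counts) print f, counts[f] }' "$1"
}
```

Running something like tally_503_flags /var/log/envoy/access.log then shows at a glance whether you are chasing one failure mode or several.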
Step 2: Fix Common Scenarios
Scenario A: Upstream Connection Failure (UF)
If you see UF, verify the upstream is reachable from the Envoy pod/node.
- Check Upstream Status: Ensure the target pod/VM is actually running.
- Verify Ports: Double-check that Envoy's cluster configuration matches the port the upstream application is listening on.
- Network Policies: In Kubernetes, ensure no NetworkPolicies are blocking traffic from Envoy's namespace to the target namespace.
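The first two checks can be combined into a small probe. A hedged sketch, assuming bash (with /dev/tcp support) is available inside the container; the function name is illustrative:

```shell
# Report whether a TCP connection to host:port can be opened.
# Uses bash's built-in /dev/tcp pseudo-device, so no extra tooling
# (nc, curl) is required in the container image.
check_upstream() {
  local host=$1 port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}
```

Running the probe from inside the Envoy pod (via kubectl exec) rather than from your workstation matters: Kubernetes NetworkPolicies are enforced per pod, so the result reflects what Envoy itself can reach.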
Scenario B: No Healthy Upstream (UH)
If you see UH, Envoy believes the upstream is dead.
- Check Health Check Config: Review Envoy's active health check configuration. Is the path correct? Is the expected status code 200?
- Inspect Upstream Logs: Look at the upstream service's logs. Is it failing its health check endpoints? Is it out of memory or deadlocking?
- Bypass Health Checks (Temporarily): For testing, you can temporarily disable active health checks to see if traffic flows, which confirms the health check configuration is the culprit.
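When auditing the health check configuration, it helps to see the relevant fields in one place. A hedged sketch of an active HTTP health check in the Envoy v3 cluster API; the cluster name and path here are illustrative, not taken from this article:

```yaml
clusters:
- name: backend_service        # illustrative cluster name
  health_checks:
  - timeout: 2s
    interval: 5s
    unhealthy_threshold: 3     # consecutive failures before a host is marked unhealthy
    healthy_threshold: 2       # consecutive passes before a host is marked healthy again
    http_health_check:
      path: /healthz           # must match the path the application actually serves
```

By default Envoy treats only a 200 response as healthy; if the endpoint legitimately returns another 2xx, the expected_statuses field can widen the accepted range.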
Scenario C: Upstream Overflow (UO) / Circuit Breaking
If you see UO, Envoy is protecting the upstream by shedding load.
- Check Admin Stats: Port-forward to Envoy's admin port (usually 15000 or 9901) and check /stats. Look for cluster.<cluster_name>.upstream_cx_overflow or cluster.<cluster_name>.upstream_rq_pending_overflow.
- Tune Circuit Breakers: If the upstream can handle more load, increase the max_connections, max_pending_requests, or max_requests values in the cluster's circuit breaker configuration.
- Scale Upstream: If the upstream genuinely cannot handle the load, scale out the upstream deployment.
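Those thresholds live under the cluster's circuit_breakers block. A hedged sketch using the v3 API; the numbers are illustrative starting points, not recommendations:

```yaml
circuit_breakers:
  thresholds:
  - priority: DEFAULT
    max_connections: 1024        # exceeding this increments upstream_cx_overflow
    max_pending_requests: 1024   # exceeding this increments upstream_rq_pending_overflow
    max_requests: 1024           # concurrent request cap (HTTP/2 and HTTP/3)
    max_retries: 3               # concurrent retries allowed across the cluster
```

Raising these values only moves the pressure downstream, so pair any increase with evidence that the upstream actually has headroom.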
Step 3: TLS and Protocol Issues
Sometimes a 503 occurs due to a protocol mismatch. If Envoy expects an HTTP/2 upstream but the upstream only speaks HTTP/1.1, the connection will drop. Similarly, if Envoy is configured to use TLS to talk to the upstream (transport_socket configuration) but the upstream certificate is invalid, expired, or doesn't match the expected Subject Alternative Name (SAN), Envoy will close the connection and return a 503. Check for the UC (Upstream Connection Termination) flag in these cases.
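Certificate problems on the upstream side can be checked without Envoy in the loop. A minimal sketch, assuming openssl is available; the helper name and file paths are illustrative:

```shell
# Print "valid" if a PEM certificate has not yet expired, "expired" otherwise.
# openssl's -checkend 0 exits non-zero once the notAfter date has passed.
cert_status() {
  if openssl x509 -checkend 0 -noout -in "$1" >/dev/null 2>&1; then
    echo "valid"
  else
    echo "expired"
  fi
}
```

Against a live upstream, a command like openssl s_client -connect <host>:<port> -alpn h2 </dev/null shows the certificate chain and SANs and whether HTTP/2 was negotiated via ALPN, which covers both the TLS and protocol-mismatch failure modes described above.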
Quick Diagnostic Commands
# 1. Check Envoy Access Logs for the Response Flag (e.g., UF, UH, UO)
kubectl logs -l app=envoy-proxy -n gateway-system | grep " 503 "
# 2. Port-forward the Envoy Admin Interface to check stats
kubectl port-forward deployment/envoy-proxy 15000:15000 -n gateway-system
# 3. Check for Circuit Breaker overflows (UO flag)
curl -s http://localhost:15000/stats | grep overflow
# 4. Check cluster health status (UH flag)
curl -s http://localhost:15000/clusters | grep health_flags
# 5. Test upstream connectivity directly from the Envoy pod (UF flag)
kubectl exec -it deploy/envoy-proxy -n gateway-system -- curl -v http://<upstream-ip>:<port>

Error Medic Editorial
Error Medic Editorial is a team of veteran Site Reliability Engineers and DevOps practitioners dedicated to demystifying complex distributed systems and providing actionable, real-world solutions for production incidents.