Why does my Azure API return 504 even though I set HttpClient.Timeout to 300 seconds?

The client timeout and the APIM gateway timeout are independent. If the APIM `forward-request` policy timeout is lower than your client timeout (common on older services where it defaults to 60 s), APIM aborts the backend call and returns 504 before your client deadline fires. Fix: open the APIM policy for the affected API/operation and raise the `timeout` attribute on `forward-request` (max practical value is 230 s due to the Azure Load Balancer hard limit).

What is the absolute maximum timeout for a synchronous Azure API call?

The Azure Load Balancer enforces a hard 4-minute (240 s) idle TCP timeout that cannot be configured away. In practice, you should treat 230 seconds as the ceiling for any synchronous HTTP response through the Azure fabric. Any operation that may exceed this must be redesigned to use the 202 Accepted async polling pattern.

My Azure Function times out with 'Timeout value of 00:05:00 exceeded'. How do I extend it?

Add or update the `functionTimeout` key in your `host.json`: `{ "version": "2.0", "functionTimeout": "00:10:00" }`. The Consumption plan defaults to 5 minutes and allows at most 10 minutes. The Premium and Dedicated (App Service) plans default to 30 minutes and can be set to unbounded with `"functionTimeout": "-1"`. An HTTP-triggered function still has to respond within the 230-second limit described above, regardless of this setting. After changing `host.json`, redeploy the function app for the setting to take effect.
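Before deploying, a quick check along these lines can flag a `functionTimeout` above the Consumption plan's 10-minute maximum. This is only a sketch: it assumes the value uses the plain `hh:mm:ss` form, and the function name `exceeds_consumption_limit` is hypothetical.

```python
import json
from datetime import timedelta

CONSUMPTION_MAX = timedelta(minutes=10)  # Consumption plan maximum stated above


def exceeds_consumption_limit(host_json_text: str) -> bool:
    """True if host.json sets functionTimeout above the Consumption plan maximum.

    Assumes the plain "hh:mm:ss" form; a missing key counts as within limits.
    """
    value = json.loads(host_json_text).get("functionTimeout")
    if value is None:
        return False
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds) > CONSUMPTION_MAX


print(exceeds_consumption_limit('{"version": "2.0", "functionTimeout": "00:10:00"}'))  # False
print(exceeds_consumption_limit('{"version": "2.0", "functionTimeout": "00:15:00"}'))  # True
```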

I see 'The client '...' with object id '...' does not have authorization' followed by a timeout. Are these related?

Not directly. The authorization error means the managed identity or service principal lacks the required RBAC role on the target resource, causing the SDK to retry the failed call until its own deadline fires, which then manifests as a timeout. Fix the RBAC assignment first (e.g., grant 'Contributor' or a purpose-built role via `az role assignment create`), then re-test. The timeout symptom will disappear once the underlying 403 is resolved.

How do I add retry logic for Azure SDK calls (e.g., Blob Storage, Service Bus) without writing Polly policies manually?

Every Azure SDK client in the `Azure.*` NuGet package family exposes a built-in `Retry` property (of type `RetryOptions`) on its client options. For example: `new BlobServiceClient(uri, credential, new BlobClientOptions { Retry = { MaxRetries = 5, Delay = TimeSpan.FromSeconds(2), MaxDelay = TimeSpan.FromSeconds(32), Mode = RetryMode.Exponential, NetworkTimeout = TimeSpan.FromSeconds(90) } })`. This configures exponential back-off with jitter natively, without an additional dependency.
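To get a feel for what those settings imply for wait times, here is a small Python sketch of a capped exponential schedule. It is an approximation for illustration, not the SDK's actual implementation: jitter is left out, and the function name `nominal_delays` is hypothetical.

```python
def nominal_delays(max_retries: int, delay_s: float, max_delay_s: float) -> list[float]:
    """Nominal wait before each retry: the delay doubles per attempt, capped at max_delay_s.

    Illustrative only: the real SDK also applies random jitter, which is omitted here.
    """
    return [min(delay_s * (2 ** attempt), max_delay_s) for attempt in range(max_retries)]


# Settings from the example above: MaxRetries=5, Delay=2 s, MaxDelay=32 s
schedule = nominal_delays(5, 2.0, 32.0)
print(schedule, "total added wait:", sum(schedule), "s")
```

With these numbers the back-off waits alone add roughly a minute in the worst case, before counting the time spent in each attempt, so the caller's own deadline needs to leave room for that.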

Azure API Timeout: How to Diagnose and Fix 408/504 Timeout Errors

Fix Azure API timeout errors (408, 504, OperationTimedOut) by adjusting timeout settings, enabling retries, and optimizing long-running calls. Step-by-step guid

Last updated: February 23, 2026

Last verified: February 23, 2026


Key Takeaways

Azure API timeouts surface as HTTP 408, 504, or the exception message 'The operation timed out' / 'OperationTimedOut' and stem from four root causes: client-side timeout too short, Azure API Management (APIM) gateway timeout, backend service cold start, or a long-running operation exceeding the 230-second Azure Load Balancer hard limit.
Azure Application Gateway and the public Azure Load Balancer enforce a 4-minute (240 s) idle TCP timeout that cannot be extended; any HTTP request that takes longer than 230 s end-to-end will be silently dropped by the fabric before your backend responds.
Quick fix summary: (1) set HttpClient.Timeout / Axios timeout to at least 100 s for synchronous calls; (2) raise the APIM policy timeout to match; (3) convert calls longer than 90 s to the async polling pattern (202 Accepted + Location header); (4) add an exponential-backoff retry policy with jitter for transient 429/503/504 responses.

Fix Approaches Compared
Method	When to Use	Implementation Time	Risk
Raise client HttpClient timeout	Client times out before server responds; 408 on client side	< 15 min	Low – isolated to your client code
Raise APIM forward-request timeout	APIM policy returns 504 before backend finishes	15–30 min	Low – scoped to one API/operation policy
Switch to async polling (202 + Location)	Operations regularly exceed 90 s (reports, exports, ML inference)	2–8 h	Medium – requires API contract change
Add Polly retry with exponential backoff	Transient 429 / 503 / 504 bursts	30–60 min	Low – safe only for idempotent requests; do not blindly retry non-idempotent POSTs
Enable APIM caching for repeated reads	Repeated identical GET calls timing out under load	30–60 min	Low – stale-data risk on mutable resources
Scale out / warm up backend	Cold-start latency on the Azure Functions Consumption plan	1–4 h	Low-Medium – cost increase, needs load testing
Move to Azure Durable Functions	Workflows that fan-out, aggregate, or run > 5 min	1–3 days	Medium – architectural refactor

Understanding Azure API Timeout Errors

When an Azure API call exceeds a time boundary, the failure can originate at several distinct layers, each producing a different error signature:

Client SDK / HttpClient – throws TaskCanceledException (C#) or ECONNABORTED (Node.js) with message: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
Azure API Management gateway – returns HTTP 504 Gateway Timeout with body { "statusCode": 504, "message": "Origin server did not respond in time." }
Azure Load Balancer idle timeout – silently resets the TCP connection after 4 minutes of inactivity; the client sees a connection reset or SocketException.
Azure Resource Manager (ARM) polling – returns HTTP 202 Accepted immediately but the polling loop eventually times out with CloudException: OperationTimedOut.
Azure SQL / Cosmos DB – surfaces as SqlException: Timeout expired or RequestRateTooLargeException (429) which, if unretried, manifests as a logical timeout.

Understanding which layer fired is the mandatory first step before applying any fix.

Step 1: Identify the Timeout Layer

1a. Read the full exception chain. In .NET, always call exception.ToString() rather than .Message – the inner TaskCanceledException or SocketException reveals whether the cancellation token came from your code or the HTTP stack.

1b. Check the HTTP status code. 408 = client or server explicitly signaled timeout. 504 = intermediate proxy (APIM, Application Gateway, or Azure Front Door) gave up. A connection-reset with no status code = TCP-layer idle timeout from the Load Balancer.
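The mapping in 1b can be captured in a small triage helper. The sketch below is a heuristic, not a definitive classifier; the function name `likely_timeout_layer` and its parameters are hypothetical.

```python
from typing import Optional


def likely_timeout_layer(status_code: Optional[int], connection_reset: bool = False) -> str:
    """Map an observed failure to the layer that most likely produced it (heuristic)."""
    if status_code is None:
        # No HTTP response at all: either the TCP connection was reset or the client gave up
        return "load-balancer idle timeout" if connection_reset else "client-side timeout"
    if status_code == 408:
        return "client or server request timeout"
    if status_code == 504:
        return "intermediate proxy (APIM, Application Gateway, Front Door)"
    return "not a timeout signature"


print(likely_timeout_layer(504))
print(likely_timeout_layer(None, connection_reset=True))
```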

1c. Pull APIM diagnostic logs. In the Azure portal go to API Management → APIs → [your API] → Test and inspect the trace. For ongoing visibility, enable Application Insights on APIM; you can confirm that the diagnostic is configured for the API with:

GET https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.ApiManagement/service/{apim}/apis/{api}/diagnostics/applicationinsights?api-version=2022-08-01

Look for backend-duration in the trace. If it is close to your forward-request timeout value, the backend is the bottleneck, not your client.

1d. Check Azure Monitor / Application Insights. Run this KQL query in Log Analytics to find all requests that exceeded 30 seconds:

requests
| where timestamp > ago(1h)
| where duration > 30000
| project timestamp, name, resultCode, duration, cloud_RoleName
| order by duration desc

Step 2: Fix Client-Side Timeouts (C# / .NET)

The default HttpClient timeout is 100 seconds. For APIs that legitimately take longer, create a named client via IHttpClientFactory:

// Program.cs / Startup.cs
// Requires the Microsoft.Extensions.Http.Polly NuGet package.
using System.Net;
using Polly;
using Polly.Extensions.Http;

builder.Services.AddHttpClient("AzureBackend", client =>
{
    client.BaseAddress = new Uri("https://myapi.azure-api.net");
    // Overall budget for one logical call, including all retries and back-off waits
    client.Timeout = TimeSpan.FromSeconds(180);
})
.AddPolicyHandler(GetRetryPolicy());

static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy() =>
    HttpPolicyExtensions
        .HandleTransientHttpError()          // 5xx, 408, and HttpRequestException
        .OrResult(r => r.StatusCode == HttpStatusCode.TooManyRequests) // 429
        .WaitAndRetryAsync(
            retryCount: 4,
            sleepDurationProvider: attempt =>
                TimeSpan.FromSeconds(Math.Pow(2, attempt))   // 2, 4, 8, 16 s
                + TimeSpan.FromMilliseconds(Random.Shared.Next(0, 500))); // jitter

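Important: When you need per-request control, pass a CancellationToken to the request itself (for example, from a CancellationTokenSource configured with CancelAfter) rather than changing HttpClient.Timeout at runtime; the property cannot be modified once the client has sent its first request.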
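For readers calling the API from Python, the same idea (bounded retries with exponential back-off and jitter, all inside one overall deadline) might look like the sketch below. `call_with_retry`, `TransientError`, and the parameter names are hypothetical, and the sleep and clock are injectable only to keep the example easy to test.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. 429/503/504)."""


def call_with_retry(
    operation: Callable[[], T],
    retries: int = 4,
    base_delay_s: float = 2.0,
    overall_timeout_s: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Run operation, retrying TransientError with exponential back-off plus jitter.

    Gives up when retries are exhausted or the next wait would pass the overall deadline.
    """
    deadline = clock() + overall_timeout_s
    for attempt in range(retries + 1):
        try:
            return operation()
        except TransientError:
            if attempt == retries:
                raise  # out of retries
            # 2, 4, 8, 16 s plus up to 500 ms of jitter
            wait = base_delay_s * (2 ** attempt) + random.uniform(0, 0.5)
            if clock() + wait > deadline:
                raise  # waiting again would exceed the overall budget
            sleep(wait)
    raise AssertionError("unreachable")
```

The caller wraps whatever performs the HTTP request in a function that raises `TransientError` on retryable responses and passes it as `operation`.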

Step 3: Fix APIM Gateway Timeouts

In Azure API Management, the default forward-request timeout is 300 seconds (since API version 2021+), but older services default to 60 seconds. Set it explicitly in the backend section of the policy:

<!-- APIM Policy (API or Operation scope) -->
<policies>
  <inbound>
    <base />
  </inbound>
  <backend>
    <forward-request timeout="180" follow-redirects="true" />
  </backend>
  <outbound>
    <base />
  </outbound>
  <on-error>
    <base />
  </on-error>
</policies>

Note that the timeout attribute is in seconds. Keep it at or below 230 seconds: because of the underlying Azure Load Balancer constraint, longer values are not honored in practice. If your operation needs more than 230 seconds, you must use the async pattern described in Step 4.
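One way to sanity-check a configuration is to compare the client timeout, the APIM timeout, and the expected backend duration in one place. The helper below is a rough sketch; `check_timeout_chain` and its parameter names are hypothetical, and the 120-second backend duration in the usage line is an assumed figure for illustration.

```python
SYNC_CEILING_S = 230  # practical ceiling for a synchronous response, per this guide


def check_timeout_chain(client_s: float, apim_s: float, expected_backend_s: float) -> list[str]:
    """Return human-readable warnings for a client -> APIM -> backend timeout setup."""
    warnings = []
    if expected_backend_s > SYNC_CEILING_S:
        warnings.append("backend work exceeds the ceiling; use the async 202 pattern")
    if apim_s > SYNC_CEILING_S:
        warnings.append("APIM timeout is above the ceiling and may not be honored")
    if apim_s < expected_backend_s:
        warnings.append("APIM will return 504 before the backend finishes")
    if client_s < expected_backend_s:
        warnings.append("client will time out before the backend finishes")
    return warnings


# FAQ scenario: 300 s client timeout, APIM left at 60 s, backend assumed to need ~120 s
for warning in check_timeout_chain(client_s=300, apim_s=60, expected_backend_s=120):
    print(warning)
```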

Step 4: Convert Long-Running Operations to Async Polling

The Azure-standard pattern for operations > 90 seconds is the REST Long-Running Operations (LRO) specification:

1. Client POSTs the request.
2. Backend immediately returns 202 Accepted with a Location or Operation-Location header pointing to a status endpoint.
3. Client polls the status endpoint (with exponential back-off) until it receives 200/201 with the final result or a terminal error.

import time
import requests

def start_operation(endpoint, payload, token):
    """POST the request; return the status URL if the backend answered 202, else None."""
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    r = requests.post(endpoint, json=payload, headers=headers, timeout=30)
    r.raise_for_status()
    if r.status_code == 202:
        # Services use either header name for the status endpoint
        return r.headers.get("Operation-Location") or r.headers.get("Location")
    return None  # synchronous completion

def poll_until_done(operation_url, token, max_wait=600):
    """Poll the status URL with back-off until a terminal state or max_wait seconds."""
    headers = {"Authorization": f"Bearer {token}"}
    deadline = time.monotonic() + max_wait  # counts request time as well as sleeps
    interval = 5
    while time.monotonic() < deadline:
        r = requests.get(operation_url, headers=headers, timeout=30)
        r.raise_for_status()
        body = r.json()
        status = body.get("status", "").lower()
        if status in ("succeeded", "failed", "canceled"):
            return body  # failed/canceled are returned too; the caller must check status
        time.sleep(interval)
        interval = min(interval * 1.5, 30)  # back-off up to 30 s
    raise TimeoutError(f"Operation did not complete within {max_wait}s")
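The client code above assumes a backend that accepts the work, answers 202 right away, and exposes a status endpoint. A minimal in-memory sketch of that server-side bookkeeping is shown here; `OperationStore`, the `/operations/{id}` route, and the `Retry-After` value are all hypothetical, and a real service would persist state in a durable store rather than in process memory.

```python
import threading
import uuid
from typing import Any, Callable


class OperationStore:
    """In-memory registry of long-running operations (illustrative; not durable)."""

    def __init__(self) -> None:
        self._ops: dict[str, dict[str, Any]] = {}
        self._lock = threading.Lock()

    def start(self, work: Callable[[], Any]) -> tuple[int, dict[str, str]]:
        """Kick off work in the background and answer immediately with 202."""
        op_id = uuid.uuid4().hex
        with self._lock:
            self._ops[op_id] = {"status": "Running"}

        def run() -> None:
            try:
                result = {"status": "Succeeded", "result": work()}
            except Exception as exc:  # report failure as a terminal state
                result = {"status": "Failed", "error": str(exc)}
            with self._lock:
                self._ops[op_id] = result

        threading.Thread(target=run, daemon=True).start()
        # Hypothetical status route that the client polls
        return 202, {"Operation-Location": f"/operations/{op_id}", "Retry-After": "5"}

    def status(self, op_id: str) -> tuple[int, dict[str, Any]]:
        """Status endpoint: 200 with the current state, or 404 if the id is unknown."""
        with self._lock:
            op = self._ops.get(op_id)
        return (200, dict(op)) if op is not None else (404, {"error": "unknown operation"})
```

A web framework handler would call `start` from the POST route and `status` from the GET route, copying the returned status code, headers, and body into the HTTP response.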

Step 5: Fix Azure Function Cold-Start Timeouts

Azure Functions on the Consumption plan can take 5–15 seconds to cold-start. If your API call hits a cold instance, the cumulative latency often triggers client timeouts.

Options:

Set "functionTimeout": "00:10:00" in host.json (max 10 min on Consumption, unlimited on Premium/Dedicated).
Enable Always On (App Service Plan) or Pre-warmed instances (Premium plan) to eliminate cold starts.
Use Azure Front Door health probes to keep instances warm.

Step 6: Verify the Fix in Staging

After applying changes, validate with a load test using Azure Load Testing or k6 before promoting to production:

# k6 smoke test – export API_BASE_URL and AZURE_TOKEN in the environment first
k6 run --vus 10 --duration 60s - <<'EOF'
import http from 'k6/http';
import { check, sleep } from 'k6';

const TOKEN = __ENV.AZURE_TOKEN;
const BASE  = __ENV.API_BASE_URL;

export default function () {
  const res = http.post(`${BASE}/api/long-running`, JSON.stringify({input: 'test'}), {
    headers: { 'Authorization': `Bearer ${TOKEN}`, 'Content-Type': 'application/json' },
    timeout: '190s',
  });
  check(res, {
    'status is 200 or 202': (r) => r.status === 200 || r.status === 202,
    'no timeout':           (r) => r.status !== 408 && r.status !== 504,
  });
  sleep(1);
}
EOF

Frequently Asked Questions


#!/usr/bin/env bash
# Azure API Timeout Diagnostic Script
# Prerequisites: az CLI logged in, jq, curl
# Usage: APIM_NAME=mygw RG=mygroup API_ID=myapi bash diagnose-api-timeout.sh

set -euo pipefail

APIM_NAME="${APIM_NAME:?Set APIM_NAME}"
RG="${RG:?Set RG}"
API_ID="${API_ID:?Set API_ID}"
SUB=$(az account show --query id -o tsv)

echo "=== 1. Check APIM SKU and forward-request timeout ==="
az apim show -n "$APIM_NAME" -g "$RG" \
  --query '{sku:sku.name, capacity:sku.capacity, provisioningState:provisioningState}' -o table

echo ""
echo "=== 2. Fetch backend policy for API $API_ID ==="
POLICY_TIMEOUT=$(az rest --method GET \
  --url "https://management.azure.com/subscriptions/$SUB/resourceGroups/$RG/providers/Microsoft.ApiManagement/service/$APIM_NAME/apis/$API_ID/policies/policy?api-version=2022-08-01" \
  --query 'properties.value' -o tsv 2>/dev/null | grep -oP '(?<=forward-request timeout=")\d+' || true)  # grep -P needs GNU grep
if [[ -n "$POLICY_TIMEOUT" ]]; then
  echo "forward-request timeout: $POLICY_TIMEOUT seconds"
else
  echo "forward-request timeout not explicitly set (check inherited policy)"
fi

echo ""
echo "=== 3. Recent 504/408 errors from APIM in Azure Monitor (last 1h) ==="
az monitor log-analytics query \
  --workspace "$(az monitor log-analytics workspace list -g "$RG" --query '[0].customerId' -o tsv)" \
  --analytics-query "
    ApiManagementGatewayLogs
    | where TimeGenerated > ago(1h)
    | where ResponseCode in (408, 504)
    | project TimeGenerated, OperationId, BackendId, BackendResponseCode, DurationMs
    | order by DurationMs desc
    | limit 20" \
  --output table 2>/dev/null || echo "Log Analytics workspace not found or insufficient permissions"

echo ""
echo "=== 4. Check Function App timeout setting ==="
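FUNC_APPS=$(az functionapp list -g "$RG" --query '[].name' -o tsv)
for FUNC in $FUNC_APPS; do
  # App setting override (empty output means the setting is absent)
  TIMEOUT=$(az functionapp config appsettings list -n "$FUNC" -g "$RG" \
    --query "[?name=='AzureFunctionsJobHost__functionTimeout'].value" -o tsv 2>/dev/null || true)
  # host.json is read from the content file share named by WEBSITE_CONTENTSHARE;
  # apps deployed another way (e.g. run-from-package) fall through to "could not read".
  CONN=$(az functionapp config appsettings list -n "$FUNC" -g "$RG" \
    --query "[?name=='WEBSITE_CONTENTAZUREFILECONNECTIONSTRING'].value" -o tsv 2>/dev/null || true)
  SHARE=$(az functionapp config appsettings list -n "$FUNC" -g "$RG" \
    --query "[?name=='WEBSITE_CONTENTSHARE'].value" -o tsv 2>/dev/null || true)
  TMP=$(mktemp)
  HOST_JSON=$(az storage file download --connection-string "$CONN" --share-name "$SHARE" \
    --path site/wwwroot/host.json --dest "$TMP" --only-show-errors -o none 2>/dev/null \
    && python3 -c "import sys,json; print(json.load(open(sys.argv[1])).get('functionTimeout','not set'))" "$TMP" 2>/dev/null \
    || echo "could not read")
  rm -f "$TMP"
  echo "  Function App: $FUNC | AppSetting timeout: ${TIMEOUT:-default} | host.json: $HOST_JSON"
done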

echo ""
echo "=== 5. Measure raw backend latency bypassing APIM ==="
BACKEND_URL="${BACKEND_URL:-}"
if [[ -n "$BACKEND_URL" ]]; then
  curl -o /dev/null -s -w \
    "DNS: %{time_namelookup}s | Connect: %{time_connect}s | TTFB: %{time_starttransfer}s | Total: %{time_total}s\n" \
    "$BACKEND_URL"
else
  echo "  Set BACKEND_URL env var to measure raw backend latency"
fi

echo ""
echo "=== Diagnostics complete ==="

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps and SRE engineers with hands-on experience designing and operating production systems on Azure, AWS, and GCP. Our troubleshooting guides are built from real incident postmortems, not documentation summaries.
