Error Medic

AWS API Rate Limit Exceeded (ThrottlingException): Complete Troubleshooting Guide

Fix AWS ThrottlingException and API timeouts with exponential backoff, Service Quotas increases, and optimized API polling strategies for your workloads.

Key Takeaways
  • Root Cause 1: Burst API call spikes exceeding account or service-level default rate limits (e.g., EC2 DescribeInstances).
  • Root Cause 2: Lack of exponential backoff and jitter in retry logic, leading to retry storms and AWS API timeouts.
  • Quick Fix Summary: Implement exponential backoff in your SDK clients, cache API responses where possible, and request a Service Quota increase for the specific throttled API.
Fix Approaches Compared
Method | When to Use | Time | Risk
Implement Exponential Backoff | Immediate code-level fix for intermittent throttling | Medium | Low
Request Service Quota Increase | Sustained workload growth legitimately requires higher limits | 24-48 hours | Low
API Response Caching | High read volume of static or slow-changing AWS resources | High | Medium (stale data)
Jitter Addition to Retries | Preventing thundering herd problems during mass retries | Low | Low

Understanding the Error

When interacting with Amazon Web Services, you may encounter ThrottlingException, RateExceeded, or RequestLimitExceeded errors. These occur when your application makes too many API requests within a given timeframe, exceeding the rate limits that AWS endpoints enforce with a token bucket algorithm. Similarly, sustained throttling or network congestion can result in an AWS API timeout, where the SDK client drops the connection before AWS can process the queued request.

The Anatomy of an AWS Throttling Error

You will typically see an error response similar to this in your CloudTrail logs or application stack traces:

ClientError: An error occurred (ThrottlingException) when calling the DescribeInstances operation: Rate exceeded

AWS uses a token bucket algorithm for API throttling: you have a maximum burst capacity (the bucket size) and a steady refill rate. If you consume tokens faster than they refill, subsequent requests are rejected with an HTTP 400 (ThrottlingException) or HTTP 503 (Service Unavailable).
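To make the model concrete, here is an illustrative sketch of a token bucket in Python. The capacity and refill rate below are made-up example numbers, not actual AWS limits:

```python
import time

class TokenBucket:
    """Minimal token bucket: `capacity` is the burst size, `refill_rate` is tokens added per second."""
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket size
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # Request would be rejected (ThrottlingException)

bucket = TokenBucket(capacity=5, refill_rate=1)
results = [bucket.allow() for _ in range(7)]
print(results)  # First 5 burst calls pass; calls 6 and 7 are throttled
```

The burst of five calls drains the bucket instantly; because the refill rate is one token per second, the remaining calls fail until time passes. This is exactly why a short pause (backoff) lets throttled calls succeed on retry.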

Step 1: Diagnose the Exact Throttled API

Before changing code or requesting quota increases, you must identify precisely which API call is being throttled and by which IAM principal. AWS CloudTrail is the best tool for this.

  1. Open the AWS Management Console and navigate to CloudTrail.
  2. Go to Event history.
  3. Filter by Event name or Error code (type ThrottlingException or RateExceeded).
  4. Analyze the event record to identify the eventSource (e.g., ec2.amazonaws.com) and the eventName (e.g., DescribeInstances).

Alternatively, you can query CloudTrail via AWS CLI or Amazon Athena if you have a CloudTrail lake configured. Athena is highly recommended for identifying the top throttled APIs across an organization.
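The diagnosis above can also be scripted with Boto3's CloudTrail `lookup_events` API. A sketch, where `is_throttled` is a hypothetical helper that inspects the `errorCode` field of each raw event record (the live API call requires CloudTrail read permissions, so it is shown commented out):

```python
import json

def is_throttled(event_json):
    """Return True if a raw CloudTrail event record carries a throttling error code."""
    record = json.loads(event_json)
    return record.get("errorCode") in (
        "ThrottlingException", "RateExceeded", "RequestLimitExceeded"
    )

# Example usage against the live API (requires AWS credentials):
# import boto3
# ct = boto3.client("cloudtrail")
# events = ct.lookup_events(MaxResults=50)["Events"]
# throttled = [e["EventName"] for e in events if is_throttled(e["CloudTrailEvent"])]
# print(throttled)

# Self-contained demo with a minimal sample record:
sample = '{"eventName": "DescribeInstances", "errorCode": "ThrottlingException"}'
print(is_throttled(sample))  # True
```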

Step 2: Implement Exponential Backoff and Jitter

The most immediate and robust fix is to ensure your application handles rate limits gracefully. Most AWS SDKs implement basic retries by default, but heavily concurrent applications require tuning.

Exponential backoff increases the wait time between retries exponentially (e.g., 1s, 2s, 4s, 8s). Jitter adds a randomized delay to prevent multiple threads from retrying at the exact same millisecond (the "thundering herd" problem).
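A minimal sketch of exponential backoff with "full jitter" looks like this; the base delay and cap are arbitrary example values:

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter backoff: pick a random delay between 0 and an exponentially growing ceiling."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)

# Ceilings grow 1s, 2s, 4s, 8s, ... capped at 30s; the actual sleep is randomized
for attempt in range(5):
    print(f"attempt {attempt}: sleep up to {min(30.0, 2.0 ** attempt)}s "
          f"-> {backoff_delay(attempt):.2f}s")
```

Because every client picks a random point under the ceiling rather than the ceiling itself, retries from many threads spread out over time instead of landing in the same millisecond.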

If you are using Boto3 (Python), you can configure the retry mode to adaptive, which automatically handles rate limiting using a token bucket implementation client-side.

Step 3: Requesting a Quota Increase

If your architecture legitimately requires high API throughput (e.g., a large-scale infrastructure discovery tool) and client-side pacing is insufficient, you must request a quota increase.

  1. Navigate to the Service Quotas console.
  2. Select the AWS service (e.g., Amazon EC2).
  3. Search for the specific API quota (e.g., Describe API operations rate).
  4. Select the quota and click Request quota increase.
  5. Provide a valid business justification. AWS Support will review and typically approve this within 24-48 hours.
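The same workflow can be scripted with the Service Quotas API. The sketch below separates the part you can run anywhere (a small helper that filters quota records by name) from the live Boto3 calls, which require credentials and IAM permissions and are shown commented out; the quota code you pass to a real increase request must come from `list_service_quotas`, not be guessed:

```python
def find_quota_codes(quotas, name_fragment):
    """Filter a list of Service Quotas records by a substring of the quota name."""
    return [q["QuotaCode"] for q in quotas
            if name_fragment.lower() in q["QuotaName"].lower()]

# Live usage (requires AWS credentials and servicequotas permissions):
# import boto3
# sq = boto3.client("service-quotas")
# quotas = sq.list_service_quotas(ServiceCode="ec2")["Quotas"]
# print(find_quota_codes(quotas, "Describe"))
# sq.request_service_quota_increase(
#     ServiceCode="ec2", QuotaCode="L-XXXXXXXX", DesiredValue=200.0  # placeholder code
# )

# Self-contained demo with made-up records:
sample = [
    {"QuotaCode": "L-AAAA1111", "QuotaName": "Describe API operations rate"},
    {"QuotaCode": "L-BBBB2222", "QuotaName": "Running On-Demand Standard instances"},
]
print(find_quota_codes(sample, "Describe"))  # ['L-AAAA1111']
```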

Step 4: Architectural Mitigations (Caching & Event-Driven)

If you are constantly polling AWS APIs to detect state changes (e.g., waiting for an EC2 instance to reach the running state), you are wasting API calls and risking rate limits.

Anti-Pattern: Polling DescribeInstances every 5 seconds. Best Practice: Use Amazon EventBridge. AWS emits events when resources change state. You can route an EventBridge rule to an SQS queue or Lambda function, eliminating the need for polling entirely.
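As an example of the event-driven pattern, a minimal Lambda handler for EC2 state-change events delivered by EventBridge might look like the following sketch (the downstream action is a placeholder for your own workflow):

```python
def lambda_handler(event, context):
    """Handle an EC2 Instance State-change Notification delivered by EventBridge."""
    detail = event.get("detail", {})
    instance_id = detail.get("instance-id")
    state = detail.get("state")
    if state == "running":
        # Placeholder: trigger your downstream workflow here
        # instead of polling DescribeInstances every few seconds.
        print(f"{instance_id} is now running")
    return {"instance-id": instance_id, "state": state}

# Abbreviated shape of the EventBridge payload for EC2 state changes:
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}
print(lambda_handler(sample_event, None))
```

The handler consumes zero API quota: AWS pushes the state change to you, so DescribeInstances is never called.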

For static data (e.g., fetching VPC subnets for deployments), implement an in-memory cache (like Redis or Memcached) or cache the results in your application state for 5-10 minutes. Only invalidate the cache when a deployment operation occurs.

Complete Code Example: Boto3 Adaptive Retries and Timeouts

The full configuration from Step 2, putting adaptive retry mode and tuned timeouts together:
import boto3
from botocore.config import Config

# Configure Boto3 to use the 'adaptive' retry mode
# This automatically handles exponential backoff and client-side pacing
# to prevent ThrottlingException and API timeouts.

custom_config = Config(
    retries = {
        'max_attempts': 10,
        'mode': 'adaptive'
    },
    # Increase connect and read timeouts to prevent premature client timeouts
    connect_timeout=10,
    read_timeout=60
)

# Initialize the client with the custom configuration
ec2_client = boto3.client('ec2', config=custom_config, region_name='us-east-1')

try:
    # This call will now automatically retry with backoff if throttled
    response = ec2_client.describe_instances()
    print(f"Successfully fetched {len(response.get('Reservations', []))} reservations.")
except Exception as e:
    print(f"API call failed after exhausting retries: {e}")

# Diagnostic Bash Command using AWS CLI to check for throttling in CloudTrail:
# aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeInstances --max-results 50 | grep -i throttle

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps, SRE, and Cloud Architects dedicated to solving complex infrastructure and deployment errors. With decades of combined experience managing AWS, GCP, and Kubernetes environments at scale, we provide actionable, production-ready solutions for developers.
