Resolving AWS API Rate Limit (ThrottlingException) and Timeout Errors
Fix AWS API rate limit (ThrottlingException) and timeout errors. Learn how to implement exponential backoff, request service quotas, and optimize API calls.
- AWS API requests are throttled (ThrottlingException) when exceeding account-level or service-level token bucket quotas.
- API timeouts frequently stem from high network latency, large payloads, or missing client-side socket timeout configurations.
- Quick fix: Enable standard or adaptive retry modes in your AWS SDK and implement exponential backoff with jitter.
- Long-term fix: Decouple high-volume API calls using Amazon SQS, cache responses, and request targeted Service Quota increases.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| SDK Retry/Backoff Config | Immediate mitigation for intermittent ThrottlingException and brief network timeouts | 5 mins | Low |
| Service Quota Increase | Consistent hard limits hit during expected scaling or normal operational load | 1-2 Days | Low |
| API Response Caching | High read volume of static or slowly changing AWS metadata (e.g., DescribeInstances) | 1-2 Days | Medium |
| SQS/Event-Driven Queue | Massive bursts of write/mutation API calls causing severe throttling and system failure | 1-2 Weeks | High |
Understanding the Error
When working with Amazon Web Services (AWS) at scale, encountering an AWS API rate limit or timeout error is practically a rite of passage. These errors typically manifest in your application logs as ThrottlingException: Rate exceeded, ProvisionedThroughputExceededException, or simply as an HTTP 400/429 for throttling and an HTTP 500/503/504 for timeouts.
AWS protects its infrastructure using the Token Bucket algorithm. Every API endpoint has a predefined bucket of tokens. Each API request consumes a token. Tokens are replenished at a steady rate. If your application sends requests faster than the bucket replenishes, the bucket empties, and subsequent requests are flat-out rejected with a ThrottlingException.
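To build intuition, here is a minimal, self-contained sketch of the token bucket mechanics. The capacity and refill rate are made-up numbers for illustration, not any real AWS quota:

```python
import time

class TokenBucket:
    """Minimal token bucket: each request consumes a token; tokens refill at a fixed rate."""
    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = time.monotonic()

    def try_consume(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at bucket capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # bucket empty -> the service would return ThrottlingException

bucket = TokenBucket(capacity=5, refill_per_sec=2)
results = [bucket.try_consume() for _ in range(8)]  # burst of 8 near-instant requests
# The first 5 drain the full bucket; the rest fail until tokens refill
```

A burst that fits in the bucket succeeds; anything beyond it is rejected until the steady refill catches up, which is exactly why slow, paced clients rarely see throttling.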
Timeouts are a different failure mode: an AWS API timeout (often a ConnectTimeoutError or ReadTimeoutError) occurs when a request is dispatched but the client does not receive a response within the configured timeout window. This can be caused by transient network issues, large response payloads (such as deeply nested DynamoDB Scan results), or AWS control plane degradation.
Step 1: Diagnose
Before implementing a fix, you must distinguish between a hard rate limit and a network timeout. They require fundamentally different solutions.
1. Identifying Throttling via CloudTrail and CloudWatch:
Most AWS services log API calls to AWS CloudTrail. You can query CloudTrail to find exactly which IAM role or user is getting throttled and which API action is failing. In Amazon CloudWatch, look for the ClientErrors metric or specific ThrottledRequests metrics provided by services like API Gateway or DynamoDB.
If you see a sudden spike in ThrottlingException on DescribeInstances or AssumeRole, a specific script or automated process in your CI/CD pipeline is likely misbehaving, polling too aggressively without backoff.
2. Identifying Timeouts:
Timeouts are typically client-side phenomena. Check your application logs for stack traces mentioning urllib3.exceptions.ReadTimeoutError (Python/Boto3) or TimeoutError: Connection timed out (Node.js). If the timeout happens exactly at 60 seconds or 120 seconds, it's highly likely you are hitting a default client socket timeout configuration rather than an actual AWS service outage.
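A small triage helper can make this distinction explicit in your error handling. This is an illustrative sketch: the function name and the exact set of codes are assumptions, not an official AWS API.

```python
# Error codes AWS services commonly return when throttling (extend as needed)
THROTTLE_CODES = {
    "ThrottlingException",
    "TooManyRequestsException",
    "ProvisionedThroughputExceededException",
    "RequestLimitExceeded",
}

# botocore exception class names that indicate a client-side timeout
TIMEOUT_EXCEPTIONS = {"ConnectTimeoutError", "ReadTimeoutError"}

def classify_failure(error_code=None, exception_name=None):
    """Rough triage: throttling wants backoff + retry; timeouts want
    larger socket timeouts or a network investigation."""
    if error_code in THROTTLE_CODES:
        return "throttled"
    if exception_name in TIMEOUT_EXCEPTIONS:
        return "timeout"
    return "other"
```

In Boto3 you would feed it `err.response['Error']['Code']` from a caught ClientError, or the exception class name for botocore's timeout errors.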
Step 2: Fix Throttling with Exponential Backoff and Jitter
The most robust way to handle AWS API rate limit errors is to implement Exponential Backoff with Jitter.
When an API call fails due to throttling, immediately retrying it will likely fail again and contribute to further throttling (the "thundering herd" problem). Exponential backoff dictates that you wait progressively longer between retries (e.g., 1s, 2s, 4s, 8s).
Jitter adds randomness to this wait time. If 50 parallel Lambda functions are throttled simultaneously, and they all back off for exactly 2 seconds before retrying, they will synchronize and cause another massive spike. Adding jitter ensures they retry at slightly different intervals (e.g., 1.5s, 2.1s, 2.8s), smoothing out the load on the AWS API.
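For calls made outside the SDKs (for example, raw HTTP requests to an AWS endpoint), the pattern can be sketched as a "full jitter" retry loop. The function names and parameters here are illustrative:

```python
import random
import time

def backoff_with_jitter(attempt, base=1.0, cap=30.0):
    """Full jitter: sleep a random amount between 0 and the exponential ceiling."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retries(fn, max_attempts=5, base=1.0):
    """Retry fn() on throttling errors, backing off with jitter between attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as e:
            # Re-raise anything that isn't throttling, or if we're out of attempts
            if "Throttling" not in str(e) or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_with_jitter(attempt, base=base))
```

Because each client draws its own random delay, a fleet of simultaneously throttled workers spreads its retries across the window instead of retrying in lockstep.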
Fortunately, modern AWS SDKs handle this natively. You must explicitly configure the retry mode:
Python (Boto3):
Update your boto3 client initialization to use the adaptive retry mode, which dynamically throttles the client-side request rate based on the AWS server responses.
from botocore.config import Config
import boto3
retry_config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'
    }
)
client = boto3.client('ec2', config=retry_config)
Step 3: Fix Timeouts by Adjusting Client Configurations
If you are encountering an AWS API timeout, relying on retries alone may exacerbate the problem, especially if the API call is resource-intensive.
Instead, you need to adjust the TCP socket timeouts in your SDK client. For example, if you invoke an AWS Lambda function synchronously and it takes 10 minutes to process, the SDK's default read timeout (often 60 seconds) will drop the connection long before the Lambda finishes. (If the call goes through API Gateway, note that its integration timeout is capped at 29 seconds by default.)
Adjust the connect_timeout and read_timeout settings in your SDK.
Node.js (AWS SDK v3):
import { S3Client } from "@aws-sdk/client-s3";
import { NodeHttpHandler } from "@aws-sdk/node-http-handler";
const client = new S3Client({
  requestHandler: new NodeHttpHandler({
    connectionTimeout: 5000, // 5 seconds connect timeout
    socketTimeout: 300000,   // 5 minutes read timeout
  }),
});
Step 4: Requesting a Service Quota Increase
If you have implemented backoff and are still consistently hitting the AWS API rate limit, your application's legitimate baseline traffic has simply outgrown the default AWS account limits.
- Navigate to the Service Quotas console in AWS.
- Search for the specific service (e.g., Amazon EC2, AWS STS).
- Find the specific quota. Note that API rate limits are often categorized under "API Request Rate" or "Mutating API requests".
- Select the quota and click Request quota increase.
- Provide a solid business justification. AWS Support rarely approves massive leaps (e.g., jumping from 10 req/sec to 1,000 req/sec) without clear evidence of need. Explain your caching strategy and retry logic in the support ticket to expedite approval.
Step 5: Architectural Mitigation (Caching and Queues)
For enterprise-grade resilience against API limits and timeouts, code-level fixes aren't enough. You must alter your architecture.
Read-Heavy Workloads: If your application repeatedly polls AWS APIs for state (e.g., checking if an EC2 instance is running, or polling Parameter Store), implement an in-memory cache (like Redis or Memcached) or use AWS Systems Manager Parameter Store's advanced tier.
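As a sketch of the read-side pattern, a tiny in-process TTL cache can collapse repeated polls into a single upstream call. This is illustrative; use Redis or Memcached when multiple processes need to share the cache:

```python
import time

class TTLCache:
    """Tiny in-memory TTL cache for slow-changing API responses."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def get_or_fetch(self, key, fetch_fn):
        entry = self._store.get(key)
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]                      # fresh hit: no API call made
        value = fetch_fn()                       # miss or stale: one real API call
        self._store[key] = (time.monotonic(), value)
        return value

# Stand-in for a real DescribeInstances call (hypothetical data)
calls = {"n": 0}
def fake_describe_instances():
    calls["n"] += 1
    return ["i-0abc", "i-0def"]

cache = TTLCache(ttl_seconds=60)
for _ in range(100):                             # 100 polls from the application...
    instances = cache.get_or_fetch("instances", fake_describe_instances)
# ...result in a single upstream API call within the TTL window
```

A 60-second TTL turns a 100-requests-per-minute polling loop into one API call per minute, trading a bounded staleness window for two orders of magnitude less API traffic.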
Write-Heavy Workloads: If you are slamming an AWS service with creation requests (e.g., spawning hundreds of ECS tasks at once), decouple the creation process. Push the requests into an Amazon SQS queue, then have a background worker or Lambda function consume the queue at a controlled concurrency rate that stays safely below your AWS API rate limits. This guarantees delivery and largely eliminates throttling, at the cost of asynchronous processing.
Bonus: Finding the Most Throttled API Calls via CLI
# Find the most throttled AWS API calls in the last 24 hours using CloudTrail
# Requirements: aws cli, jq
START_TIME=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)
END_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ)
echo "Searching CloudTrail for ThrottlingExceptions between $START_TIME and $END_TIME..."
# CloudTrail's lookup attributes cannot filter on error code, so fetch the raw
# events and filter client-side on the errorCode field with jq.
aws cloudtrail lookup-events \
  --start-time "$START_TIME" \
  --end-time "$END_TIME" \
  --query 'Events[*].CloudTrailEvent' \
  --output json | \
jq -r '.[] | fromjson | select(.errorCode? == "ThrottlingException") | [.eventName, .userAgent, .sourceIPAddress] | @tsv' | \
sort | uniq -c | sort -nr
# Note: EC2 reports throttling as errorCode "Client.RequestLimitExceeded"; adjust the jq filter accordingly.
Error Medic Editorial
Our SRE and DevOps editorial team consists of certified AWS Solutions Architects and Site Reliability Engineers dedicated to demystifying complex cloud infrastructure failures.