Error Medic

How to Fix AWS API Rate Limit (ThrottlingException: Rate exceeded) and Timeout Errors

Resolve AWS API rate limit (ThrottlingException) and timeout errors by implementing exponential backoff with jitter, tuning SDK timeouts, requesting quota increases, and optimizing API call patterns.

Key Takeaways
  • Root cause: Exceeding the maximum allowed API request rate for an AWS service, resulting in a ThrottlingException or HTTP 429 Too Many Requests.
  • Root cause: Network congestion, slow endpoints, or aggressive client-side SDK configurations causing TimeoutError or HTTP 500/503/504.
  • Quick fix: Implement exponential backoff with jitter in your retry logic, tune your AWS SDK timeouts, and request an AWS Service Quota increase if hitting a hard cap.
Fix Approaches Compared
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Implement Exponential Backoff | Immediate fix for bursty traffic causing intermittent throttling. | Medium | Low |
| Request Service Quota Increase | When consistently hitting baseline limits despite optimized code. | High (AWS Support SLA) | Low |
| Optimize API Calls (Batching/Caching) | To permanently reduce the overall volume of API requests. | Medium to High | Medium |
| Tune SDK Timeout Settings | When facing client-side or transient network timeouts on long-running tasks. | Low | Medium |

Understanding the Error

When building scalable cloud applications, interacting with the AWS API is a fundamental requirement. Whether you are provisioning resources, querying databases, or invoking serverless functions, your application relies on the AWS Control Plane and Data Plane APIs. However, as your application's throughput increases, you will inevitably encounter API rate limits (throttling) or API timeouts.

These errors manifest in various forms depending on the AWS SDK or CLI tool you are using. The most common error messages include:

  • ThrottlingException: An error occurred (ThrottlingException) when calling the [Operation] operation: Rate exceeded
  • TooManyRequestsException: HTTP 429 Too Many Requests.
  • ProvisionedThroughputExceededException: Specific to services like Amazon DynamoDB.
  • TimeoutError: Connection timed out after 120000ms or HTTP 504 Gateway Timeout.

Why Does AWS Throttle API Requests?

AWS implements rate limiting to protect the underlying infrastructure from being overwhelmed by too many requests (either intentionally via DDoS attacks or unintentionally via runaway code). This ensures fair usage and high availability for all tenants in the shared cloud environment.

There are two primary types of API limits in AWS:

  1. Hard Limits (Service Quotas): These are absolute maximums on the number of resources you can create or the sustained rate of API calls you can make. Some of these can be increased by contacting AWS Support.
  2. Token Bucket (Burst) Limits: AWS uses a token bucket algorithm for many APIs. You accumulate tokens at a steady rate. Each API call consumes a token. If you burst and empty the bucket, subsequent calls are throttled until new tokens accumulate.
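The token bucket behavior can be sketched in a few lines of Python. This is a simplified model of the algorithm, not AWS's actual implementation; real refill rates and capacities vary by service and API:

```python
import time

class TokenBucket:
    """Simplified token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start with a full bucket (burst allowance)
        self.last = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                    # throttled until new tokens accumulate
```

A bucket with `capacity=10` absorbs a burst of ten calls at once; after the bucket empties, callers are throttled down to the steady refill `rate`.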

Step 1: Diagnose the Bottleneck

Before applying a fix, you must determine which API is throttling you and why. Blindly increasing retries can exacerbate the problem.

Analyzing CloudTrail Logs

AWS CloudTrail records API calls made within your account. You can query CloudTrail to identify throttling events. This is especially useful for Control Plane APIs (like ec2:DescribeInstances).

You can use Amazon Athena to query CloudTrail logs efficiently to find the worst offenders.
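A query along these lines surfaces the noisiest throttled calls. The table name `cloudtrail_logs` is a placeholder for your own Athena table over the CloudTrail S3 bucket:

```python
# Placeholder table name: substitute your Athena table backed by CloudTrail logs.
CLOUDTRAIL_TABLE = "cloudtrail_logs"

# Rank throttled API calls by volume, worst offenders first.
query = f"""
SELECT eventsource, eventname, useridentity.arn AS caller, count(*) AS throttle_count
FROM {CLOUDTRAIL_TABLE}
WHERE errorcode IN ('ThrottlingException',
                    'TooManyRequestsException',
                    'ProvisionedThroughputExceededException')
GROUP BY eventsource, eventname, useridentity.arn
ORDER BY throttle_count DESC
LIMIT 20
"""
```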

Monitoring AWS SDK Metrics

If you are encountering timeouts (TimeoutError), the issue might be client-side. The default HTTP timeout in many AWS SDKs is aggressive. If the AWS service takes longer to respond than the SDK's configured timeout, the SDK drops the connection and throws an error, even if the AWS service eventually completes the request.

Check CloudWatch metrics for the specific service (e.g., DynamoDB ThrottledRequests, API Gateway 4XXError and 5XXError, Lambda Throttles).
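As a sketch, the parameters for pulling DynamoDB's `ThrottledRequests` metric through Boto3's CloudWatch client might look like this; the helper function and the `orders` table name are illustrative, not part of any SDK:

```python
from datetime import datetime, timedelta, timezone

def throttle_metric_params(table_name: str, hours: int = 3) -> dict:
    """Build kwargs for cloudwatch.get_metric_statistics(**params) covering
    DynamoDB's ThrottledRequests metric over the last `hours` hours."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "ThrottledRequests",
        "Dimensions": [{"Name": "TableName", "Value": table_name}],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": 300,            # 5-minute buckets
        "Statistics": ["Sum"],
    }

params = throttle_metric_params("orders")   # "orders" is a placeholder table name
```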

Step 2: Implement the Fix

Fixing AWS API rate limits and timeouts requires a multi-layered approach.

1. Implement Exponential Backoff with Jitter

The most critical defense against throttling is implementing robust retry logic. Standard retries (e.g., waiting exactly 1 second between each attempt) can cause the "thundering herd" problem, where multiple failing clients retry simultaneously, further overwhelming the API.

Exponential backoff increases the wait time between retries exponentially (e.g., 1s, 2s, 4s, 8s). Adding "jitter" introduces randomness to the wait time, spreading out the retries. Most modern AWS SDKs implement this automatically, but you may need to tune the maximum number of retries depending on your workload's tolerance for latency.
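If you do need to roll your own retry loop (for a client the SDK does not cover, say), a full-jitter version looks roughly like this; `ThrottlingError` is a stand-in for whatever exception your client actually raises:

```python
import random
import time

class ThrottlingError(Exception):
    """Stand-in for the SDK's throttling exception."""

def call_with_backoff(fn, max_attempts=5, base=1.0, cap=30.0):
    """Retry fn() on throttling with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise               # out of attempts, surface the error
            # Full jitter: sleep a random amount in [0, min(cap, base * 2^attempt)].
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The `cap` keeps the worst-case delay bounded, and the randomization spreads retries from many clients across the window instead of synchronizing them.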

2. Tune Client-Side Timeouts

If you are seeing AWS API timeout errors (e.g., TimeoutError or SocketTimeoutException), you may need to increase the HTTP socket timeout in your AWS SDK client configuration. This is particularly relevant for long-running operations like large S3 uploads, Athena queries, or invoking slow Lambda functions.
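Placeholder-free timeout tuning is shown in the Boto3 example at the end of this guide; as a quick illustration of the trade-off, the values below are assumptions to adjust for your workload, not recommendations:

```python
# Hypothetical timeout budget for a long-running call (values are examples only).
CONNECT_TIMEOUT = 10    # fail fast if the endpoint is unreachable
READ_TIMEOUT = 120      # allow slow responses (large S3 objects, Athena queries)

def total_worst_case(attempts: int) -> int:
    """Worst-case seconds spent before the client gives up, ignoring backoff sleeps."""
    return attempts * (CONNECT_TIMEOUT + READ_TIMEOUT)
```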

3. Optimize and Batch API Calls

The best way to avoid API limits is to make fewer API calls.

  • Batching: Instead of sending 100 individual PutItem requests to DynamoDB, use BatchWriteItem to send them in a single network request.
  • Caching: If you are repeatedly polling an API that returns relatively static data (like sts:GetCallerIdentity or ssm:GetParameter), cache the response in memory for a few minutes.
  • Pagination Awareness: When listing resources (e.g., s3:ListObjectsV2), ensure you are properly handling pagination tokens rather than repeatedly requesting the first page.
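The batching idea can be sketched for DynamoDB, which caps BatchWriteItem at 25 items per request; `build_batch_request` below is an illustrative helper, not an SDK function:

```python
def chunk(items: list, size: int = 25) -> list:
    """Split items into chunks; BatchWriteItem accepts at most 25 items per call."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def build_batch_request(table_name: str, items: list) -> dict:
    """Shape one chunk into the RequestItems payload for dynamodb.batch_write_item."""
    return {
        "RequestItems": {
            table_name: [{"PutRequest": {"Item": item}} for item in items]
        }
    }
```

With this shape, 100 items become four network requests instead of 100 individual PutItem calls.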
4. Decouple Architecture with Amazon SQS

If your architecture is synchronous (e.g., API Gateway -> Lambda -> DynamoDB) and a downstream service throttles, the error bubbles all the way back to the user.

By introducing Amazon Simple Queue Service (SQS) (e.g., API Gateway -> SQS -> Lambda -> DynamoDB), you can decouple the components. The SQS queue acts as a shock absorber. If DynamoDB throttles the Lambda function, the message remains in the queue and Lambda will retry it automatically based on the visibility timeout, smoothing out traffic spikes without losing data or returning immediate 500 errors to the client.
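On the producer side, writes can additionally be grouped into SQS batches of up to ten messages; the helper below is illustrative, and its output is what you would pass as `Entries` to `sqs.send_message_batch`:

```python
import json

def sqs_batch_entries(messages: list) -> list:
    """Build Entries for sqs.send_message_batch; SQS caps a batch at 10 messages."""
    if len(messages) > 10:
        raise ValueError("SQS send_message_batch accepts at most 10 messages")
    return [
        {"Id": str(i), "MessageBody": json.dumps(body)}
        for i, body in enumerate(messages)
    ]
```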

5. Request a Service Quota Increase

If you have optimized your code, implemented backoff, and are still consistently hitting the ceiling, you are likely hitting a hard Service Quota limit.

  1. Navigate to the Service Quotas console in AWS.
  2. Search for the specific service and API limit.
  3. Select the quota and click Request quota increase.
  4. Provide a strong business justification and architectural details to AWS Support to ensure prompt approval.
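The same request can be scripted through the Service Quotas API. The quota code below is illustrative only (list real codes with `aws service-quotas list-service-quotas --service-code ec2`), and the resulting dict is what you would pass to Boto3's `request_service_quota_increase`:

```python
def quota_increase_params(service_code: str, quota_code: str, desired_value: float) -> dict:
    """Kwargs for service_quotas_client.request_service_quota_increase(**params)."""
    return {
        "ServiceCode": service_code,
        "QuotaCode": quota_code,
        "DesiredValue": float(desired_value),
    }

# Illustrative values only; look up the actual quota code for your service first.
params = quota_increase_params("ec2", "L-1216C47A", 256)
```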

Diagnostic Commands and SDK Configuration

```bash
# Diagnostic: find the top throttled AWS API calls with jq
# (requires CloudTrail logs downloaded locally as JSON)
jq -r '.Records[]
  | select((.errorCode // "" | test("Throttling|TooManyRequests"))
           or (.errorMessage // "" | contains("Rate exceeded")))
  | .eventName' cloudtrail_logs.json | sort | uniq -c | sort -nr
```

```python
# Example Boto3 configuration with custom retries and timeouts
import boto3
from botocore.config import Config

custom_config = Config(
    region_name='us-east-1',
    signature_version='v4',
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'  # 'adaptive' adds client-side rate limiting on top of backoff
    },
    connect_timeout=10,   # seconds to establish the connection
    read_timeout=120      # seconds to wait for a response on an open connection
)

client = boto3.client('s3', config=custom_config)
```

Error Medic Editorial

Error Medic Editorial is a team of certified cloud architects and SREs dedicated to providing actionable, code-first solutions for complex infrastructure and deployment challenges.
