Error Medic

AWS API Rate Limit Exceeded (ThrottlingException): Complete Troubleshooting Guide

Fix AWS ThrottlingException and API timeouts with exponential backoff, Service Quotas increases, and optimized API polling strategies for your workloads.

Key Takeaways
  • Root Cause 1: Burst API call spikes exceeding account or service-level default rate limits (e.g., EC2 DescribeInstances).
  • Root Cause 2: Lack of exponential backoff and jitter in retry logic, leading to retry storms and AWS API timeouts.
  • Quick Fix Summary: Implement exponential backoff in your SDK clients, cache API responses where possible, and request a Service Quota increase for the specific throttled API.
Fix Approaches Compared
Method | When to Use | Time | Risk
Implement Exponential Backoff | Immediate code-level fix for intermittent throttling | Medium | Low
Request Service Quota Increase | Sustained workload growth legitimately requires higher limits | 24-48 hours | Low
API Response Caching | High read volume of static or slow-changing AWS resources | High | Medium (stale data)
Jitter Addition to Retries | Preventing thundering herd problems during mass retries | Low | Low

Understanding the Error

When interacting with Amazon Web Services, you may encounter ThrottlingException, RateExceeded, or RequestLimitExceeded errors. These occur when your application makes too many API requests within a given timeframe, exceeding the rate limits that AWS endpoints enforce with a token bucket algorithm. Similarly, sustained throttling or network congestion can result in an AWS API timeout, where the SDK client drops the connection before AWS can process the queued request.

The Anatomy of an AWS Throttling Error

You will typically see an error response similar to this in your CloudTrail logs or application stack traces:

ClientError: An error occurred (ThrottlingException) when calling the DescribeInstances operation: Rate exceeded

AWS uses a token bucket algorithm for API throttling: you have a maximum burst capacity (the bucket size) and a steady refill rate. If you consume tokens faster than they refill, subsequent requests are rejected with an HTTP 400 (ThrottlingException) or HTTP 503 (Service Unavailable).
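To make the model concrete, here is an illustrative sketch of a token bucket in Python. The capacity and refill rate below are made-up example numbers, not actual AWS limits:

```python
import time

class TokenBucket:
    """Minimal token bucket: `capacity` is the burst size, `refill_rate` is tokens added per second."""
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket size
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # Request would be rejected (ThrottlingException)

bucket = TokenBucket(capacity=5, refill_rate=1)
results = [bucket.allow() for _ in range(7)]
print(results)  # First 5 burst calls pass; calls 6 and 7 are throttled
```

The burst of five calls drains the bucket instantly; because the refill rate is one token per second, the remaining calls fail until time passes. This is exactly why a short pause (backoff) lets throttled calls succeed on retry.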

Step 1: Diagnose the Exact Throttled API

Before changing code or requesting quota increases, you must identify precisely which API call is being throttled and by which IAM principal. AWS CloudTrail is the best tool for this.

  1. Open the AWS Management Console and navigate to CloudTrail.
  2. Go to Event history.
  3. Filter by Event name or Error code (type ThrottlingException or RateExceeded).
  4. Analyze the event record to identify the eventSource (e.g., ec2.amazonaws.com) and the eventName (e.g., DescribeInstances).

Alternatively, you can query CloudTrail via AWS CLI or Amazon Athena if you have a CloudTrail lake configured. Athena is highly recommended for identifying the top throttled APIs across an organization.
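The diagnosis above can also be scripted with Boto3's CloudTrail `lookup_events` API. A sketch, where `is_throttled` is a hypothetical helper that inspects the `errorCode` field of each raw event record (the live API call requires CloudTrail read permissions, so it is shown commented out):

```python
import json

def is_throttled(event_json):
    """Return True if a raw CloudTrail event record carries a throttling error code."""
    record = json.loads(event_json)
    return record.get("errorCode") in (
        "ThrottlingException", "RateExceeded", "RequestLimitExceeded"
    )

# Example usage against the live API (requires AWS credentials):
# import boto3
# ct = boto3.client("cloudtrail")
# events = ct.lookup_events(MaxResults=50)["Events"]
# throttled = [e["EventName"] for e in events if is_throttled(e["CloudTrailEvent"])]
# print(throttled)

# Self-contained demo with a minimal sample record:
sample = '{"eventName": "DescribeInstances", "errorCode": "ThrottlingException"}'
print(is_throttled(sample))  # True
```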

Step 2: Implement Exponential Backoff and Jitter

The most immediate and robust fix is to ensure your application handles rate limits gracefully. Most AWS SDKs implement basic retries by default, but heavily concurrent applications require tuning.

Exponential backoff increases the wait time between retries exponentially (e.g., 1s, 2s, 4s, 8s). Jitter adds a randomized delay to prevent multiple threads from retrying at the exact same millisecond (the "thundering herd" problem).
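A minimal sketch of exponential backoff with "full jitter" looks like this; the base delay and cap are arbitrary example values:

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter backoff: pick a random delay between 0 and an exponentially growing ceiling."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)

# Ceilings grow 1s, 2s, 4s, 8s, ... capped at 30s; the actual sleep is randomized
for attempt in range(5):
    print(f"attempt {attempt}: sleep up to {min(30.0, 2.0 ** attempt)}s "
          f"-> {backoff_delay(attempt):.2f}s")
```

Because every client picks a random point under the ceiling rather than the ceiling itself, retries from many threads spread out over time instead of landing in the same millisecond.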

If you are using Boto3 (Python), you can configure the retry mode to adaptive, which automatically handles rate limiting using a token bucket implementation client-side.

Step 3: Requesting a Quota Increase

If your architecture legitimately requires high API throughput (e.g., a large-scale infrastructure discovery tool) and client-side pacing is insufficient, you must request a quota increase.

  1. Navigate to the Service Quotas console.
  2. Select the AWS service (e.g., Amazon EC2).
  3. Search for the specific API quota (e.g., Describe API operations rate).
  4. Select the quota and click Request quota increase.
  5. Provide a valid business justification. AWS Support will review and typically approve this within 24-48 hours.
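The same workflow can be scripted with the Service Quotas API. The sketch below separates the part you can run anywhere (a small helper that filters quota records by name) from the live Boto3 calls, which require credentials and IAM permissions and are shown commented out; the quota code you pass to a real increase request must come from `list_service_quotas`, not be guessed:

```python
def find_quota_codes(quotas, name_fragment):
    """Filter a list of Service Quotas records by a substring of the quota name."""
    return [q["QuotaCode"] for q in quotas
            if name_fragment.lower() in q["QuotaName"].lower()]

# Live usage (requires AWS credentials and servicequotas permissions):
# import boto3
# sq = boto3.client("service-quotas")
# quotas = sq.list_service_quotas(ServiceCode="ec2")["Quotas"]
# print(find_quota_codes(quotas, "Describe"))
# sq.request_service_quota_increase(
#     ServiceCode="ec2", QuotaCode="L-XXXXXXXX", DesiredValue=200.0  # placeholder code
# )

# Self-contained demo with made-up records:
sample = [
    {"QuotaCode": "L-AAAA1111", "QuotaName": "Describe API operations rate"},
    {"QuotaCode": "L-BBBB2222", "QuotaName": "Running On-Demand Standard instances"},
]
print(find_quota_codes(sample, "Describe"))  # ['L-AAAA1111']
```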

Step 4: Architectural Mitigations (Caching & Event-Driven)

If you are constantly polling AWS APIs to detect state changes (e.g., waiting for an EC2 instance to reach the running state), you are wasting API calls and risking rate limits.

Anti-Pattern: Polling DescribeInstances every 5 seconds. Best Practice: Use Amazon EventBridge. AWS emits events when resources change state. You can route an EventBridge rule to an SQS queue or Lambda function, eliminating the need for polling entirely.
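As an example of the event-driven pattern, a minimal Lambda handler for EC2 state-change events delivered by EventBridge might look like the following sketch (the downstream action is a placeholder for your own workflow):

```python
def lambda_handler(event, context):
    """Handle an EC2 Instance State-change Notification delivered by EventBridge."""
    detail = event.get("detail", {})
    instance_id = detail.get("instance-id")
    state = detail.get("state")
    if state == "running":
        # Placeholder: trigger your downstream workflow here
        # instead of polling DescribeInstances every few seconds.
        print(f"{instance_id} is now running")
    return {"instance-id": instance_id, "state": state}

# Abbreviated shape of the EventBridge payload for EC2 state changes:
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}
print(lambda_handler(sample_event, None))
```

The handler consumes zero API quota: AWS pushes the state change to you, so DescribeInstances is never called.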

For static data (e.g., fetching VPC subnets for deployments), implement an in-memory cache (like Redis or Memcached) or cache the results in your application state for 5-10 minutes. Only invalidate the cache when a deployment operation occurs.

Complete Code Example: Boto3 Adaptive Retries and Timeouts

The full configuration from Step 2, putting adaptive retry mode and tuned timeouts together:
import boto3
from botocore.config import Config

# Configure Boto3 to use the 'adaptive' retry mode
# This automatically handles exponential backoff and client-side pacing
# to prevent ThrottlingException and API timeouts.

custom_config = Config(
    retries = {
        'max_attempts': 10,
        'mode': 'adaptive'
    },
    # Increase connect and read timeouts to prevent premature client timeouts
    connect_timeout=10,
    read_timeout=60
)

# Initialize the client with the custom configuration
ec2_client = boto3.client('ec2', config=custom_config, region_name='us-east-1')

try:
    # This call will now automatically retry with backoff if throttled
    response = ec2_client.describe_instances()
    print(f"Successfully fetched {len(response.get('Reservations', []))} reservations.")
except Exception as e:
    print(f"API call failed after exhausting retries: {e}")

# Diagnostic Bash Command using AWS CLI to check for throttling in CloudTrail:
# aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeInstances --max-results 50 | grep -i throttle

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps, SRE, and Cloud Architects dedicated to solving complex infrastructure and deployment errors. With decades of combined experience managing AWS, GCP, and Kubernetes environments at scale, we provide actionable, production-ready solutions for developers.
