Error Medic

Plaid Rate Limit Error: How to Fix RATE_LIMIT_EXCEEDED and 429 Responses

Fix Plaid rate limit errors (HTTP 429, RATE_LIMIT_EXCEEDED) with exponential backoff, request batching, and token caching strategies. Step-by-step guide.

Key Takeaways
  • Plaid enforces per-Item and per-endpoint rate limits; exceeding them returns HTTP 429 with error_type RATE_LIMIT_EXCEEDED
  • The most common cause is polling /transactions/get or /investments/holdings/get in a tight loop without respecting retry-after headers
  • Immediate fixes: implement exponential backoff with jitter, cache access tokens, and switch from polling to Plaid webhooks for data freshness
  • In Sandbox, rate limits are intentionally stricter than Production to help you test resilience early
  • Long-term solution: use /transactions/sync instead of /transactions/get and batch item refreshes with queue workers
Fix Approaches Compared
Method | When to Use | Time to Implement | Risk
Exponential backoff + jitter | Any polling or retry loop hitting 429 | 1-2 hours | Low — no architecture change
Webhook-driven refresh | Replacing polling for transaction/balance updates | 1-2 days | Low — Plaid-native pattern
Request queue with rate limiter | High-volume multi-Item apps (>100 Items) | 2-4 hours | Low — isolates Plaid calls
Access token caching | Re-using tokens instead of calling /item/public_token/exchange repeatedly | 30 min | Low — tokens don't expire
Switch to /transactions/sync | Apps still using legacy /transactions/get | 4-8 hours | Medium — requires pagination refactor
Staggered cron refresh | Batch-refreshing many Items on a schedule | 2-3 hours | Low — spreads load across time window

Understanding Plaid Rate Limit Errors

When your application exceeds Plaid's API rate limits, every affected request returns an HTTP 429 Too Many Requests response. The JSON body looks like this:

{
  "error_type": "RATE_LIMIT_EXCEEDED",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "error_message": "You have exceeded your rate limit. Please retry after some time.",
  "display_message": null,
  "request_id": "HNbe2",
  "causes": [],
  "status": 429,
  "documentation_url": "https://plaid.com/docs/errors/rate-limit-exceeded/",
  "suggested_action": null
}

Plaid imposes two categories of limits you need to understand:

  1. Per-Item limits — How often you can call data-fetch endpoints (like /transactions/get, /accounts/balance/get, /investments/holdings/get) for a single linked bank account (Item). These are the limits developers hit most often.
  2. Per-client limits — Aggregate limits across your entire client ID. High-volume production apps with thousands of Items can hit these if refresh logic is not throttled.

Plaid does not publish exact numeric limits in their public docs, but they do return a Retry-After header (in seconds) and the request_id you should log for support escalations.
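If you want to centralize that check, a small helper can pull the relevant fields out of any 429 response before you decide whether to retry. A minimal sketch (the helper name and signature are our own, not Plaid's):

```python
def parse_rate_limit(status_code, headers, body_json):
    """Extract (is_rate_limited, retry_after_secs, request_id) from a Plaid response."""
    if status_code != 429:
        return False, None, None
    # Header names are case-insensitive; normalize before the lookup
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = int(lowered["retry-after"]) if "retry-after" in lowered else None
    return True, retry_after, body_json.get("request_id")
```

Call it with the status code, header dict, and parsed JSON body from whatever HTTP layer you use, then log the request_id whenever the first element is True.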


Step 1: Diagnose — Confirm You Are Actually Rate Limited

Before changing code, verify the 429 is a genuine rate limit and not a misconfigured endpoint or invalid token.

Check the response headers:

HTTP/2 429
content-type: application/json
retry-after: 60
x-request-id: HNbe2

If Retry-After is present, you are rate limited. Log error_code and request_id on every 429 — you will need request_id if you open a Plaid support ticket.

Find the hot endpoint. Add structured logging around every Plaid call:

import json
import logging
import time

from plaid.exceptions import ApiException

logger = logging.getLogger("plaid")

def call_plaid(fn, *args, **kwargs):
    start = time.monotonic()
    try:
        response = fn(*args, **kwargs)
        # Emit JSON (not a dict repr) so the logs are grep-able later
        logger.info(json.dumps({"endpoint": fn.__name__, "duration_ms": int((time.monotonic() - start) * 1000), "status": "ok"}))
        return response
    except ApiException as e:
        # In plaid-python, ApiException.body is the raw JSON string; parse it first
        body = json.loads(e.body) if isinstance(e.body, str) else (e.body or {})
        logger.error(json.dumps({
            "endpoint": fn.__name__,
            "status_code": e.status,
            "error_code": body.get("error_code"),
            "request_id": body.get("request_id"),
        }))
        raise

Aggregate these logs by endpoint to find which call is firing too frequently.
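The aggregation itself is only a few lines of Python over those structured log lines. A sketch assuming one JSON object per line, as the logger above emits:

```python
import json
from collections import Counter

def top_rate_limited_endpoints(log_lines, n=5):
    """Rank endpoints by how many 429s they produced in structured JSON logs."""
    counts = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines (startup banners, tracebacks)
        if record.get("status_code") == 429:
            counts[record.get("endpoint", "?")] += 1
    return counts.most_common(n)
```

Feed it `open("plaid.log")` (or the last N lines) and the top entry is your hot endpoint.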

Common culprits in order of frequency:

  • /transactions/get or /transactions/sync called on every page load
  • /accounts/balance/get polled every few seconds for "live" balance display
  • /item/public_token/exchange called on every API request instead of caching the resulting access_token
  • A misconfigured cron job refreshing all Items simultaneously

Step 2: Implement Exponential Backoff With Jitter

This is the fastest fix — add it to your Plaid API wrapper immediately, regardless of other architectural changes.

import time, random
from plaid.exceptions import ApiException

def plaid_with_backoff(fn, *args, max_retries=5, base_delay=1.0, **kwargs):
    """
    Retry a Plaid API call with exponential backoff + full jitter.
    Honors the Retry-After header when present.
    """
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except ApiException as exc:
            if exc.status != 429:
                raise  # Don't retry non-rate-limit errors
            if attempt == max_retries - 1:
                raise  # Exhausted retries

            retry_after = None
            if exc.headers and "Retry-After" in exc.headers:
                retry_after = int(exc.headers["Retry-After"])

            if retry_after is not None:
                # The server said how long to wait: honor it, plus a little jitter
                sleep_secs = retry_after + random.uniform(0, 1)
            else:
                # Full jitter: sleep anywhere between 0 and base_delay * 2^attempt
                sleep_secs = random.uniform(0, base_delay * (2 ** attempt))
            time.sleep(sleep_secs)
    raise RuntimeError("unreachable")

Why jitter? Without randomness, all your workers wake up simultaneously after the same backoff delay, creating a thundering-herd that immediately re-triggers the rate limit.
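You can sanity-check the retry behavior without hitting Plaid by driving the same pattern with a stub that fails twice before succeeding. A simplified stand-in (FakeRateLimit is not the SDK's real exception, and the delays are shrunk for testing):

```python
import random
import time

class FakeRateLimit(Exception):
    """Stand-in for a 429 from the SDK, so this runs without Plaid."""
    status = 429

def with_backoff(fn, max_retries=5, base_delay=0.01):
    # Same full-jitter pattern as the wrapper above, trimmed for local testing
    for attempt in range(max_retries):
        try:
            return fn()
        except FakeRateLimit:
            if attempt == max_retries - 1:
                raise
            time.sleep(random.uniform(0, base_delay * (2 ** attempt)))

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeRateLimit()
    return "ok"

result = with_backoff(flaky)  # succeeds on the third attempt
```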


Step 3: Cache Access Tokens (Avoid Exchange Calls)

Some developers mistakenly call /item/public_token/exchange on every request. Access tokens are permanent (until revoked) — exchange once and persist them.

# WRONG: calling exchange on every request
def get_transactions(public_token):
    exchange_response = plaid_client.item_public_token_exchange({"public_token": public_token})
    access_token = exchange_response["access_token"]  # This fires a Plaid API call!
    ...

# RIGHT: exchange once, store in your DB, reuse
def get_transactions(user_id):
    access_token = db.get_plaid_token(user_id)  # Load from database
    ...

Step 4: Replace Polling With Webhooks

Polling /transactions/get or /accounts/balance/get on a timer is the root cause of most rate limit issues. Plaid sends webhooks when new data is available — use them.

Set webhook URL on item creation:

link_token_response = plaid_client.link_token_create({
    "user": {"client_user_id": user_id},
    "client_name": "MyApp",
    "products": ["transactions"],
    "country_codes": ["US"],
    "language": "en",
    "webhook": "https://your-api.example.com/plaid/webhook"  # <-- add this
})

Handle webhook events — only fetch data when Plaid tells you to:

@app.route("/plaid/webhook", methods=["POST"])
def plaid_webhook():
    body = request.json
    webhook_type = body.get("webhook_type")
    webhook_code = body.get("webhook_code")
    item_id = body.get("item_id")
    
    if webhook_type == "TRANSACTIONS" and webhook_code in ("INITIAL_UPDATE", "HISTORICAL_UPDATE", "DEFAULT_UPDATE"):
        # Enqueue a background job — never call Plaid synchronously in a webhook handler
        task_queue.enqueue(sync_transactions_for_item, item_id)
    
    return {"status": "ok"}, 200

Step 5: Throttle Bulk Item Refreshes With a Queue

If you have a cron job that refreshes all Items (e.g., nightly), spreading them out over time prevents aggregate rate limit hits.

import redis
from datetime import timedelta
from rq import Queue

r = redis.Redis()
q = Queue(connection=r)

def schedule_all_item_refreshes():
    items = db.get_all_active_items()
    # Stagger: enqueue one refresh job every 200 ms to stay well under limits
    for i, item in enumerate(items):
        q.enqueue_in(
            timedelta(milliseconds=i * 200),
            refresh_item_transactions,
            item.access_token,
        )

Step 6: Migrate From /transactions/get to /transactions/sync

If you are still using the legacy /transactions/get endpoint, migrate to /transactions/sync. The newer endpoint is cursor-based and designed for incremental updates, reducing the number of API calls needed to stay current.

The key difference: /transactions/get requires you to pass a date range and fetches all matching transactions each time. /transactions/sync returns only what has changed since your last cursor, dramatically reducing call volume for established Items.

Store the cursor per Item in your database and pass it on every call:

def sync_transactions(access_token, cursor=""):
    # Pass an empty cursor on the first sync; Plaid returns all history
    all_added, all_modified, all_removed = [], [], []
    has_more = True

    while has_more:
        response = plaid_client.transactions_sync({"access_token": access_token, "cursor": cursor})
        all_added.extend(response["added"])
        all_modified.extend(response["modified"])
        all_removed.extend(response["removed"])
        has_more = response["has_more"]
        cursor = response["next_cursor"]

    db.save_cursor(access_token, cursor)
    return all_added, all_modified, all_removed
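Once a sync pass returns, applying the three change sets to local state is mechanical. A sketch keyed by transaction_id (apply_changes is an illustrative helper, shown here against a plain dict rather than a real table):

```python
def apply_changes(store, added, modified, removed):
    """Merge one /transactions/sync result into a dict keyed by transaction_id."""
    for txn in added + modified:
        store[txn["transaction_id"]] = txn      # upsert new and changed rows
    for txn in removed:
        store.pop(txn["transaction_id"], None)  # drop deletions if present
    return store
```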

Rate Limit Diagnostics Script

The script below tests credential validity, inspects rate limit headers, scans recent application logs for 429 patterns, and verifies your webhook endpoint is reachable:
#!/usr/bin/env bash
# Plaid Rate Limit Diagnostics Script
# Usage: PLAID_SECRET=<secret> PLAID_CLIENT_ID=<id> bash plaid_ratelimit_diag.sh

PLAID_ENV="https://production.plaid.com"
# Change to https://sandbox.plaid.com for Sandbox

echo "=== 1. Test basic connectivity and credential validity ==="
curl -s -o /dev/null -w "HTTP %{http_code}\n" \
  -X POST "${PLAID_ENV}/institutions/get" \
  -H "Content-Type: application/json" \
  -d "{\"client_id\":\"${PLAID_CLIENT_ID}\",\"secret\":\"${PLAID_SECRET}\",\"count\":1,\"offset\":0,\"country_codes\":[\"US\"]}"

echo ""
echo "=== 2. Check rate limit headers on a lightweight endpoint ==="
curl -s -D - -o /dev/null \
  -X POST "${PLAID_ENV}/institutions/get" \
  -H "Content-Type: application/json" \
  -d "{\"client_id\":\"${PLAID_CLIENT_ID}\",\"secret\":\"${PLAID_SECRET}\",\"count\":1,\"offset\":0,\"country_codes\":[\"US\"]}" \
  | grep -i -E "(retry-after|x-request-id|x-ratelimit|HTTP/)"

echo ""
echo "=== 3. Parse last 500 lines of app logs for 429 patterns ==="
# Adjust log path to your application log file
LOG_FILE="/var/log/app/plaid.log"
if [ -f "$LOG_FILE" ]; then
  COUNT=$(tail -500 "$LOG_FILE" | grep -c '"status_code": 429')
  echo "${COUNT} occurrences of 429 in last 500 log lines"
  echo "--- Most affected endpoints ---"
  tail -500 "$LOG_FILE" | grep '"status_code": 429' | \
    python3 -c "import sys,json; [print(json.loads(l).get('endpoint','?')) for l in sys.stdin]" | \
    sort | uniq -c | sort -rn | head -10
else
  echo "Log file not found at $LOG_FILE — adjust LOG_FILE variable"
fi

echo ""
echo "=== 4. Verify webhook endpoint is reachable ==="
WEBHOOK_URL="https://your-api.example.com/plaid/webhook"
curl -s -o /dev/null -w "Webhook HTTP %{http_code}\n" \
  -X POST "$WEBHOOK_URL" \
  -H "Content-Type: application/json" \
  -d '{"webhook_type":"SANDBOX","webhook_code":"FIRE_WEBHOOK"}'

echo ""
echo "=== 5. Count active Plaid items in your database ==="
# Example for PostgreSQL — adjust to your DB
if command -v psql &>/dev/null && [ -n "$DATABASE_URL" ]; then
  psql "$DATABASE_URL" -c "SELECT COUNT(*) AS active_items FROM plaid_items WHERE revoked_at IS NULL;"
else
  echo "Set DATABASE_URL env var and ensure psql is installed to count items"
fi

echo ""
echo "=== Diagnostics complete ==="
echo "If you see repeated 429s, implement exponential backoff and migrate to webhooks."
echo "Plaid error docs: https://plaid.com/docs/errors/rate-limit-exceeded/"

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps and SRE engineers with experience scaling fintech and data-intensive platforms. We specialize in API reliability patterns, distributed systems debugging, and third-party integration troubleshooting across production environments handling millions of daily requests.
