Error Medic

Plaid Rate Limit Error: How to Fix RATE_LIMIT_EXCEEDED and 429 Responses

Fix Plaid rate limit errors (HTTP 429, RATE_LIMIT_EXCEEDED) with exponential backoff, request batching, and token caching strategies. Step-by-step guide.

Key Takeaways
  • Plaid enforces per-Item and per-endpoint rate limits; exceeding them returns HTTP 429 with error_type RATE_LIMIT_EXCEEDED
  • The most common cause is polling /transactions/get or /investments/holdings/get in a tight loop without respecting retry-after headers
  • Immediate fixes: implement exponential backoff with jitter, cache access tokens, and switch from polling to Plaid webhooks for data freshness
  • In Sandbox, rate limits are intentionally stricter than Production to help you test resilience early
  • Long-term solution: use /transactions/sync instead of /transactions/get and batch item refreshes with queue workers
Fix Approaches Compared
Method | When to Use | Time to Implement | Risk
Exponential backoff + jitter | Any polling or retry loop hitting 429 | 1-2 hours | Low — no architecture change
Webhook-driven refresh | Replacing polling for transaction/balance updates | 1-2 days | Low — Plaid-native pattern
Request queue with rate limiter | High-volume multi-Item apps (>100 Items) | 2-4 hours | Low — isolates Plaid calls
Access token caching | Re-using tokens instead of calling /item/public_token/exchange repeatedly | 30 min | Low — tokens don't expire
Switch to /transactions/sync | Apps still using legacy /transactions/get | 4-8 hours | Medium — requires pagination refactor
Staggered cron refresh | Batch-refreshing many Items on a schedule | 2-3 hours | Low — spreads load across time window

Understanding Plaid Rate Limit Errors

When your application exceeds Plaid's API rate limits, every affected request returns an HTTP 429 Too Many Requests response. The JSON body looks like this:

{
  "error_type": "RATE_LIMIT_EXCEEDED",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "error_message": "You have exceeded your rate limit. Please retry after some time.",
  "display_message": null,
  "request_id": "HNbe2",
  "causes": [],
  "status": 429,
  "documentation_url": "https://plaid.com/docs/errors/rate-limit-exceeded/",
  "suggested_action": null
}

Plaid imposes two categories of limits you need to understand:

  1. Per-Item limits — How often you can call data-fetch endpoints (like /transactions/get, /accounts/balance/get, /investments/holdings/get) for a single linked bank account (Item). These are the limits developers hit most often.
  2. Per-client limits — Aggregate limits across your entire client ID. High-volume production apps with thousands of Items can hit these if refresh logic is not throttled.

Plaid does not publish exact numeric limits in their public docs, but they do return a Retry-After header (in seconds) and the request_id you should log for support escalations.
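If you want to centralize that check, a small helper can pull the relevant fields out of any 429 response before you decide whether to retry. A minimal sketch (the helper name and signature are our own, not Plaid's):

```python
def parse_rate_limit(status_code, headers, body_json):
    """Extract (is_rate_limited, retry_after_secs, request_id) from a Plaid response."""
    if status_code != 429:
        return False, None, None
    # Header names are case-insensitive; normalize before the lookup
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = int(lowered["retry-after"]) if "retry-after" in lowered else None
    return True, retry_after, body_json.get("request_id")
```

Call it with the status code, header dict, and parsed JSON body from whatever HTTP layer you use, then log the request_id whenever the first element is True.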


Step 1: Diagnose — Confirm You Are Actually Rate Limited

Before changing code, verify the 429 is a genuine rate limit and not a misconfigured endpoint or invalid token.

Check the response headers:

HTTP/2 429
content-type: application/json
retry-after: 60
x-request-id: HNbe2

If Retry-After is present, you are rate limited. Log error_code and request_id on every 429 — you will need request_id if you open a Plaid support ticket.

Find the hot endpoint. Add structured logging around every Plaid call:

import json
import logging
import time

from plaid.exceptions import ApiException

logger = logging.getLogger("plaid")

def call_plaid(fn, *args, **kwargs):
    start = time.monotonic()
    try:
        response = fn(*args, **kwargs)
        # Emit JSON (not a dict repr) so the logs are grep-able later
        logger.info(json.dumps({"endpoint": fn.__name__, "duration_ms": int((time.monotonic() - start) * 1000), "status": "ok"}))
        return response
    except ApiException as e:
        # In plaid-python, ApiException.body is the raw JSON string; parse it first
        body = json.loads(e.body) if isinstance(e.body, str) else (e.body or {})
        logger.error(json.dumps({
            "endpoint": fn.__name__,
            "status_code": e.status,
            "error_code": body.get("error_code"),
            "request_id": body.get("request_id"),
        }))
        raise

Aggregate these logs by endpoint to find which call is firing too frequently.
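The aggregation itself is only a few lines of Python over those structured log lines. A sketch assuming one JSON object per line, as the logger above emits:

```python
import json
from collections import Counter

def top_rate_limited_endpoints(log_lines, n=5):
    """Rank endpoints by how many 429s they produced in structured JSON logs."""
    counts = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines (startup banners, tracebacks)
        if record.get("status_code") == 429:
            counts[record.get("endpoint", "?")] += 1
    return counts.most_common(n)
```

Feed it `open("plaid.log")` (or the last N lines) and the top entry is your hot endpoint.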

Common culprits in order of frequency:

  • /transactions/get or /transactions/sync called on every page load
  • /accounts/balance/get polled every few seconds for "live" balance display
  • /item/public_token/exchange called on every API request instead of caching the resulting access_token
  • A misconfigured cron job refreshing all Items simultaneously

Step 2: Implement Exponential Backoff With Jitter

This is the fastest fix — add it to your Plaid API wrapper immediately, regardless of other architectural changes.

import time, random
from plaid.exceptions import ApiException

def plaid_with_backoff(fn, *args, max_retries=5, base_delay=1.0, **kwargs):
    """
    Retry a Plaid API call with exponential backoff + full jitter.
    Honors the Retry-After header when present.
    """
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except ApiException as exc:
            if exc.status != 429:
                raise  # Don't retry non-rate-limit errors
            if attempt == max_retries - 1:
                raise  # Exhausted retries

            retry_after = None
            if exc.headers and "Retry-After" in exc.headers:
                retry_after = int(exc.headers["Retry-After"])

            if retry_after is not None:
                # The server said how long to wait: honor it, plus a little jitter
                sleep_secs = retry_after + random.uniform(0, 1)
            else:
                # Full jitter: sleep anywhere between 0 and base_delay * 2^attempt
                sleep_secs = random.uniform(0, base_delay * (2 ** attempt))
            time.sleep(sleep_secs)
    raise RuntimeError("unreachable")

Why jitter? Without randomness, all your workers wake up simultaneously after the same backoff delay, creating a thundering-herd that immediately re-triggers the rate limit.
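You can sanity-check the retry behavior without hitting Plaid by driving the same pattern with a stub that fails twice before succeeding. A simplified stand-in (FakeRateLimit is not the SDK's real exception, and the delays are shrunk for testing):

```python
import random
import time

class FakeRateLimit(Exception):
    """Stand-in for a 429 from the SDK, so this runs without Plaid."""
    status = 429

def with_backoff(fn, max_retries=5, base_delay=0.01):
    # Same full-jitter pattern as the wrapper above, trimmed for local testing
    for attempt in range(max_retries):
        try:
            return fn()
        except FakeRateLimit:
            if attempt == max_retries - 1:
                raise
            time.sleep(random.uniform(0, base_delay * (2 ** attempt)))

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeRateLimit()
    return "ok"

result = with_backoff(flaky)  # succeeds on the third attempt
```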


Step 3: Cache Access Tokens (Avoid Exchange Calls)

Some developers mistakenly call /item/public_token/exchange on every request. Access tokens are permanent (until revoked) — exchange once and persist them.

# WRONG: calling exchange on every request
def get_transactions(public_token):
    exchange_response = plaid_client.item_public_token_exchange({"public_token": public_token})
    access_token = exchange_response["access_token"]  # This fires a Plaid API call!
    ...

# RIGHT: exchange once, store in your DB, reuse
def get_transactions(user_id):
    access_token = db.get_plaid_token(user_id)  # Load from database
    ...

Step 4: Replace Polling With Webhooks

Polling /transactions/get or /accounts/balance/get on a timer is the root cause of most rate limit issues. Plaid sends webhooks when new data is available — use them.

Set webhook URL on item creation:

link_token_response = plaid_client.link_token_create({
    "user": {"client_user_id": user_id},
    "client_name": "MyApp",
    "products": ["transactions"],
    "country_codes": ["US"],
    "language": "en",
    "webhook": "https://your-api.example.com/plaid/webhook"  # <-- add this
})

Handle webhook events — only fetch data when Plaid tells you to:

@app.route("/plaid/webhook", methods=["POST"])
def plaid_webhook():
    body = request.json
    webhook_type = body.get("webhook_type")
    webhook_code = body.get("webhook_code")
    item_id = body.get("item_id")
    
    if webhook_type == "TRANSACTIONS" and webhook_code in ("INITIAL_UPDATE", "HISTORICAL_UPDATE", "DEFAULT_UPDATE"):
        # Enqueue a background job — never call Plaid synchronously in a webhook handler
        task_queue.enqueue(sync_transactions_for_item, item_id)
    
    return {"status": "ok"}, 200

Step 5: Throttle Bulk Item Refreshes With a Queue

If you have a cron job that refreshes all Items (e.g., nightly), spreading them out over time prevents aggregate rate limit hits.

import redis
from datetime import timedelta
from rq import Queue

r = redis.Redis()
q = Queue(connection=r)

def schedule_all_item_refreshes():
    items = db.get_all_active_items()
    # Stagger: enqueue one refresh job every 200 ms to stay well under limits
    for i, item in enumerate(items):
        q.enqueue_in(
            timedelta(milliseconds=i * 200),
            refresh_item_transactions,
            item.access_token,
        )

Step 6: Migrate From /transactions/get to /transactions/sync

If you are still using the legacy /transactions/get endpoint, migrate to /transactions/sync. The newer endpoint is cursor-based and designed for incremental updates, reducing the number of API calls needed to stay current.

The key difference: /transactions/get requires you to pass a date range and fetches all matching transactions each time. /transactions/sync returns only what has changed since your last cursor, dramatically reducing call volume for established Items.

Store the cursor per Item in your database and pass it on every call:

def sync_transactions(access_token, cursor=""):
    # Pass an empty cursor on the first sync; Plaid returns all history
    all_added, all_modified, all_removed = [], [], []
    has_more = True

    while has_more:
        response = plaid_client.transactions_sync({"access_token": access_token, "cursor": cursor})
        all_added.extend(response["added"])
        all_modified.extend(response["modified"])
        all_removed.extend(response["removed"])
        has_more = response["has_more"]
        cursor = response["next_cursor"]

    db.save_cursor(access_token, cursor)
    return all_added, all_modified, all_removed
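Once a sync pass returns, applying the three change sets to local state is mechanical. A sketch keyed by transaction_id (apply_changes is an illustrative helper, shown here against a plain dict rather than a real table):

```python
def apply_changes(store, added, modified, removed):
    """Merge one /transactions/sync result into a dict keyed by transaction_id."""
    for txn in added + modified:
        store[txn["transaction_id"]] = txn      # upsert new and changed rows
    for txn in removed:
        store.pop(txn["transaction_id"], None)  # drop deletions if present
    return store
```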

Rate Limit Diagnostics Script

The script below tests credential validity, inspects rate limit headers, scans recent application logs for 429 patterns, and verifies your webhook endpoint is reachable:
#!/usr/bin/env bash
# Plaid Rate Limit Diagnostics Script
# Usage: PLAID_SECRET=<secret> PLAID_CLIENT_ID=<id> bash plaid_ratelimit_diag.sh

PLAID_ENV="https://production.plaid.com"
# Change to https://sandbox.plaid.com for Sandbox

echo "=== 1. Test basic connectivity and credential validity ==="
curl -s -o /dev/null -w "HTTP %{http_code}\n" \
  -X POST "${PLAID_ENV}/institutions/get" \
  -H "Content-Type: application/json" \
  -d "{\"client_id\":\"${PLAID_CLIENT_ID}\",\"secret\":\"${PLAID_SECRET}\",\"count\":1,\"offset\":0,\"country_codes\":[\"US\"]}"

echo ""
echo "=== 2. Check rate limit headers on a lightweight endpoint ==="
curl -s -D - -o /dev/null \
  -X POST "${PLAID_ENV}/institutions/get" \
  -H "Content-Type: application/json" \
  -d "{\"client_id\":\"${PLAID_CLIENT_ID}\",\"secret\":\"${PLAID_SECRET}\",\"count\":1,\"offset\":0,\"country_codes\":[\"US\"]}" \
  | grep -i -E "(retry-after|x-request-id|x-ratelimit|HTTP/)"

echo ""
echo "=== 3. Parse last 500 lines of app logs for 429 patterns ==="
# Adjust log path to your application log file
LOG_FILE="/var/log/app/plaid.log"
if [ -f "$LOG_FILE" ]; then
  COUNT=$(tail -500 "$LOG_FILE" | grep -c '"status_code": 429')
  echo "${COUNT} occurrences of 429 in last 500 log lines"
  echo "--- Most affected endpoints ---"
  tail -500 "$LOG_FILE" | grep '"status_code": 429' | \
    python3 -c "import sys,json; [print(json.loads(l).get('endpoint','?')) for l in sys.stdin]" | \
    sort | uniq -c | sort -rn | head -10
else
  echo "Log file not found at $LOG_FILE — adjust LOG_FILE variable"
fi

echo ""
echo "=== 4. Verify webhook endpoint is reachable ==="
WEBHOOK_URL="https://your-api.example.com/plaid/webhook"
curl -s -o /dev/null -w "Webhook HTTP %{http_code}\n" \
  -X POST "$WEBHOOK_URL" \
  -H "Content-Type: application/json" \
  -d '{"webhook_type":"SANDBOX","webhook_code":"FIRE_WEBHOOK"}'

echo ""
echo "=== 5. Count active Plaid items in your database ==="
# Example for PostgreSQL — adjust to your DB
if command -v psql &>/dev/null && [ -n "$DATABASE_URL" ]; then
  psql "$DATABASE_URL" -c "SELECT COUNT(*) AS active_items FROM plaid_items WHERE revoked_at IS NULL;"
else
  echo "Set DATABASE_URL env var and ensure psql is installed to count items"
fi

echo ""
echo "=== Diagnostics complete ==="
echo "If you see repeated 429s, implement exponential backoff and migrate to webhooks."
echo "Plaid error docs: https://plaid.com/docs/errors/rate-limit-exceeded/"

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps and SRE engineers with experience scaling fintech and data-intensive platforms. We specialize in API reliability patterns, distributed systems debugging, and third-party integration troubleshooting across production environments handling millions of daily requests.
