Why am I getting RATE_LIMIT_EXCEEDED in Plaid Sandbox but not Production?

Plaid Sandbox enforces significantly stricter rate limits than Production (often 10x lower) to prevent abuse of the free testing environment. These limits are intentionally low and are not representative of Production, so code that trips a limit only in Sandbox may run without errors in Production. You should still implement backoff and webhook-driven patterns in Sandbox: they keep test runs stable and carry over unchanged to Production.

What is the exact Plaid rate limit for /accounts/balance/get?

Plaid documents default per-Item and per-client limits by endpoint in its RATE_LIMIT_EXCEEDED error reference. Check there for the current figure rather than relying on a number quoted elsewhere, because limits differ between environments and can change. The /accounts/balance/get endpoint has stricter limits than most other endpoints because each call triggers a real-time request to the financial institution. As a practical guideline, don't call it more than once per 30–60 seconds per Item. Cache responses aggressively and fetch fresh balances only when the user explicitly requests them.

How long does a Plaid rate limit last? When does it reset?

Plaid uses rolling time windows, not fixed-minute resets. Check the Retry-After header in the 429 response — it contains the number of seconds to wait before retrying. If no Retry-After header is present, implement exponential backoff starting at 1 second, doubling up to a cap of 60 seconds. Per-item limits typically reset within seconds to minutes; application-level limits may take longer to recover from if heavily exceeded.

I'm only making one request — why am I still getting rate limited?

Several scenarios can cause this even with infrequent calls: (1) Multiple instances of your application running simultaneously (e.g., multiple Kubernetes pods or Lambda invocations) all calling Plaid for the same item at once. (2) A previous burst from your application put the item or your app-level limit into a backoff window. (3) The specific endpoint you're calling (/accounts/balance/get, /investments/holdings/get) has very strict limits. Use the Plaid Dashboard API Logs to verify the actual request volume from your client_id.

Does Plaid rate limit by IP address or by API key?

Plaid rate limits by client_id (the identifier sent with your API secret) and, within that, by Item. It does not rate limit by IP address. All requests from all instances of your application using the same client_id therefore share the same rate limit budget. There is no benefit to rotating IP addresses or proxy servers: doing so won't help and may raise fraud flags on your account. The correct approach is to reduce total request volume at the application level.

Plaid Rate Limit Error: How to Fix 'RATE_LIMIT_EXCEEDED' and Stop Being Rate Limited

Fix Plaid rate limit errors fast. Learn why RATE_LIMIT_EXCEEDED happens, how to implement exponential backoff, and optimize API calls to stay within Plaid limit

Last updated: February 23, 2026

Last verified: February 23, 2026

Key Takeaways

Plaid enforces per-item, per-endpoint, and per-application rate limits that reset on rolling windows — exceeding any tier triggers a RATE_LIMIT_EXCEEDED error with HTTP 429
The most common root cause is polling /transactions/get or /investments/transactions/get too aggressively instead of using Plaid's webhook-driven architecture
Immediate fix: implement exponential backoff with jitter on all Plaid API calls, switch to webhook + /transactions/sync pattern, and audit your request volume in the Plaid Dashboard

Fix Approaches Compared
Method	When to Use	Time to Implement	Risk
Exponential backoff with jitter	Any synchronous Plaid call that may burst	30 min	Low — pure retry logic, no architecture change
Switch to /transactions/sync + webhooks	Apps still using polling /transactions/get	2–4 hours	Medium — requires webhook endpoint and migration
Request deduplication / caching	Multiple services calling same item/endpoint	1–2 hours	Low — add cache layer in front of Plaid client
Upgrade Plaid plan / request limit increase	Legitimate volume exceeds current plan limits	1–3 days (Plaid approval)	None — administrative only
Queue-based Plaid request serializer	High-concurrency workers hitting same item	4–8 hours	Medium — new infrastructure component
Per-item rate limit tracking	Multi-tenant apps with many active items	2–4 hours	Low — instrumentation only

Understanding the Plaid Rate Limit Error

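When your application exceeds Plaid's API rate limits, you receive an HTTP 429 Too Many Requests response containing a Plaid error object. The error_type is RATE_LIMIT_EXCEEDED for every rate limit error, while the error_code names the specific limit that was hit and varies by endpoint (RATE_LIMIT and TRANSACTIONS_LIMIT are two examples). A representative error body looks like this:

{
  "error_type": "RATE_LIMIT_EXCEEDED",
  "error_code": "TRANSACTIONS_LIMIT",
  "error_message": "rate limit exceeded",
  "display_message": null,
  "request_id": "abc123XYZ",
  "causes": [],
  "status": 429
}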

Plaid enforces rate limits at multiple granularities simultaneously:

Per-item limits: Each linked bank account (Item) has its own request budget per rolling window. Hammering one user's Item with repeated /transactions/get calls is the single most common trigger.
Per-endpoint limits: Some endpoints like /accounts/balance/get have stricter limits because they trigger live requests to financial institutions.
Per-application limits: Your overall application has a request-per-minute ceiling that scales with your Plaid plan (Development vs. Production).
Sandbox limits: The Sandbox environment has much stricter limits than Production — typically 10–15 requests per minute per item — which trips up developers who assume Sandbox mirrors Production behavior.

Step 1: Diagnose the Rate Limit Source

Check the Plaid Dashboard first. Navigate to dashboard.plaid.com → API → Logs. Filter for RATE_LIMIT_EXCEEDED errors and look at:

Which endpoint is triggering the most 429s
Whether errors cluster around specific item_id values (per-item limit) or are spread across all items (application limit)
The time distribution — a spike pattern suggests batch jobs; steady-state suggests polling
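
If you copy the failing log rows out of the Dashboard, the triage above can be sketched in a few lines. The row shape (dicts with "endpoint" and "item_id" keys) and the 50% threshold are assumptions made for illustration; they are not a Plaid export format or a Plaid-defined rule.

```python
from collections import Counter


def summarize_429s(rows, item_share_threshold=0.5):
    """Summarize rate-limit errors by endpoint and by item.

    rows: iterable of dicts with hypothetical keys "endpoint" and "item_id".
    If one item accounts for at least `item_share_threshold` of all errors,
    the result points at a per-item limit; otherwise at an application limit.
    """
    rows = list(rows)
    total = len(rows)
    if total == 0:
        return {"total": 0, "top_endpoint": None, "top_item": None, "likely_scope": None}

    by_endpoint = Counter(row["endpoint"] for row in rows)
    by_item = Counter(row["item_id"] for row in rows)
    top_item, top_item_count = by_item.most_common(1)[0]

    scope = "per-item" if top_item_count / total >= item_share_threshold else "application"
    return {
        "total": total,
        "top_endpoint": by_endpoint.most_common(1)[0][0],
        "top_item": top_item,
        "likely_scope": scope,
    }
```

A result of "per-item" suggests fixing the caller that hammers one Item; "application" suggests looking at overall volume, such as a batch job.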

Instrument your Plaid client to log request metadata:

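import json
import logging
import time

from plaid.exceptions import ApiException

logger = logging.getLogger(__name__)


def parse_plaid_error(e):
    """Return the Plaid error object as a dict (ApiException.body is a JSON string)."""
    try:
        return json.loads(e.body) if e.body else {}
    except (TypeError, ValueError):
        return {}


def plaid_request_with_logging(client, method, *args, **kwargs):
    """Call a Plaid client method and log its latency and outcome."""
    start = time.monotonic()
    try:
        response = method(*args, **kwargs)
        elapsed = time.monotonic() - start
        logger.info("plaid_request", extra={
            "method": method.__name__,
            "elapsed_ms": round(elapsed * 1000),
            "status": "success"
        })
        return response
    except ApiException as e:
        elapsed = time.monotonic() - start
        error = parse_plaid_error(e)
        logger.error("plaid_request_failed", extra={
            "method": method.__name__,
            "elapsed_ms": round(elapsed * 1000),
            "error_type": error.get("error_type"),
            "error_code": error.get("error_code"),
            "status": e.status,
            # headers can be None if the request never reached Plaid
            "retry_after": e.headers.get("Retry-After") if e.headers else None
        })
        raise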

Check the Retry-After header. When a 429 response carries this header, it tells you how long to wait before retrying. Some HTTP wrappers discard response headers, so make sure your error handler reads them; in plaid-python they are available on the exception as e.headers.
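
HTTP allows Retry-After to be either a number of seconds or an HTTP date, and a missing header should not crash the handler. The helper below is a minimal sketch that covers both forms. The function name parse_retry_after is hypothetical, and which form Plaid actually sends is not something this sketch assumes.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the number of seconds to wait, or None if the header is absent or unparseable."""
    if value is None:
        return None
    value = value.strip()

    # Form 1: delay in seconds, e.g. "30"
    try:
        return max(0.0, float(value))
    except ValueError:
        pass

    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

Returning None rather than 0 lets the caller distinguish "no guidance from the server" from "retry immediately" and fall back to exponential backoff in the first case.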

Step 2: Implement Exponential Backoff with Jitter

Never retry Plaid requests immediately on a 429. Use exponential backoff with full jitter to spread retry load:

import json
import logging
import random
import time

from plaid.exceptions import ApiException

logger = logging.getLogger(__name__)


def plaid_call_with_backoff(fn, *args, max_retries=5, base_delay=1.0, max_delay=60.0, **kwargs):
    """
    Call a Plaid API function with exponential backoff + full jitter on rate-limit errors.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn(*args, **kwargs)
        except ApiException as e:
            # ApiException.body is the raw JSON response, so decode it first
            try:
                error = json.loads(e.body) if e.body else {}
            except (TypeError, ValueError):
                error = {}

            # Only retry on rate limits; surface all other errors immediately.
            # error_type is RATE_LIMIT_EXCEEDED for every rate-limit error,
            # while error_code varies by endpoint.
            if e.status != 429 or error.get("error_type") != "RATE_LIMIT_EXCEEDED":
                raise

            if attempt == max_retries:
                raise  # Exhausted retries

            # Respect Retry-After if present (assumes a value in seconds)
            try:
                retry_after = float(e.headers.get("Retry-After", 0)) if e.headers else 0.0
            except (TypeError, ValueError):
                retry_after = 0.0

            # Exponential backoff with full jitter
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep_time = max(retry_after, random.uniform(0, cap))

            logger.warning(
                "Plaid rate limited (attempt %d/%d). Sleeping %.1fs before retry.",
                attempt + 1, max_retries, sleep_time,
            )
            time.sleep(sleep_time)
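
It helps to know how long a caller can block before the retries run out. The sketch below lists the upper bound on each sleep for the same parameters as the function above, ignoring any Retry-After value. The helper name backoff_caps is hypothetical.

```python
def backoff_caps(max_retries=5, base_delay=1.0, max_delay=60.0):
    """Upper bound, in seconds, of the jittered sleep before each retry."""
    return [min(max_delay, base_delay * (2 ** attempt)) for attempt in range(max_retries)]


default_caps = backoff_caps()
print(default_caps)       # [1.0, 2.0, 4.0, 8.0, 16.0]
print(sum(default_caps))  # 31.0 seconds of sleep in the worst case

# With more retries the 60-second cap starts to apply:
print(backoff_caps(max_retries=8))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

With the defaults the 60-second cap is never reached, so a request path that cannot tolerate roughly half a minute of added latency should lower max_retries or move the call to a background job.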

Step 3: Migrate from Polling to Webhooks + /transactions/sync

If you're calling /transactions/get on a schedule to check for new transactions, you are very likely hitting per-item rate limits. Plaid's intended pattern is:

Register a webhook URL when creating your Item (or update it via /item/webhook/update)
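Listen for the TRANSACTIONS webhook that matches your endpoint: SYNC_UPDATES_AVAILABLE if you use /transactions/sync, or INITIAL_UPDATE, HISTORICAL_UPDATE, DEFAULT_UPDATE, and TRANSACTIONS_REMOVED if you are still on /transactions/get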
Only call /transactions/sync (the modern replacement for /transactions/get) when you receive a webhook

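# webhook_handler.py: Flask example
# Assumes plaid_client, plaid_call_with_backoff, and the storage helpers
# (get_cursor_for_item, get_access_token, save_cursor_for_item,
# process_transactions) are defined elsewhere in your application.
from flask import Flask, request, jsonify
from plaid.model.transactions_sync_request import TransactionsSyncRequest

app = Flask(__name__)

# SYNC_UPDATES_AVAILABLE is the webhook for /transactions/sync; the other codes
# belong to the /transactions/get flow and are kept for Items still on it.
SYNC_TRIGGER_CODES = (
    "SYNC_UPDATES_AVAILABLE", "INITIAL_UPDATE", "HISTORICAL_UPDATE", "DEFAULT_UPDATE"
)


@app.route("/plaid/webhook", methods=["POST"])
def plaid_webhook():
    payload = request.get_json(silent=True) or {}
    webhook_type = payload.get("webhook_type")
    webhook_code = payload.get("webhook_code")
    item_id = payload.get("item_id")

    if webhook_type == "TRANSACTIONS" and webhook_code in SYNC_TRIGGER_CODES:
        # Queue the sync job; don't call Plaid synchronously in the webhook handler
        sync_transactions_for_item.delay(item_id)  # Celery task example

    return jsonify({"status": "received"}), 200


# Register this function as a Celery task (for example with your app's
# @task decorator) so that .delay() above is available.
def sync_transactions_for_item(item_id: str):
    # An empty cursor asks Plaid for the Item's full available history
    cursor = get_cursor_for_item(item_id) or ""  # Load from your DB

    while True:
        req = TransactionsSyncRequest(
            access_token=get_access_token(item_id),
            cursor=cursor,
            count=500
        )
        response = plaid_call_with_backoff(plaid_client.transactions_sync, req)

        # Process added/modified/removed transactions
        process_transactions(response.added, response.modified, response.removed)

        cursor = response.next_cursor
        save_cursor_for_item(item_id, cursor)

        if not response.has_more:
            break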

Step 4: Add Request Deduplication and Caching

In multi-service architectures, multiple workers may independently call Plaid for the same item. Implement a short-lived cache (Redis recommended) keyed on (item_id, endpoint):

import json

import redis
from plaid.model.accounts_balance_get_request import AccountsBalanceGetRequest

r = redis.Redis()

def cached_plaid_balance(access_token: str, item_id: str, ttl_seconds: int = 30):
    """
    Cache /accounts/balance/get responses for 30s to avoid redundant calls.
    Balance endpoint has stricter limits because it pings banks in real time.
    Assumes plaid_client and plaid_call_with_backoff are defined elsewhere.
    """
    cache_key = f"plaid:balance:{item_id}"
    cached = r.get(cache_key)

    if cached:
        return json.loads(cached)

    req = AccountsBalanceGetRequest(access_token=access_token)
    response = plaid_call_with_backoff(plaid_client.accounts_balance_get, req)

    # default=str serializes date/datetime fields that json cannot encode natively
    payload = json.dumps(response.to_dict(), default=str)
    r.setex(cache_key, ttl_seconds, payload)

    # Return the decoded payload so cache hits and misses have the same shape
    return json.loads(payload)
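
A TTL cache alone does not stop several threads from missing the cache at the same moment and each calling Plaid. The sketch below adds per-key locking so that only one caller fetches while the others wait for its result. It is an in-process illustration only: the class name SingleFlightCache is hypothetical, and coordinating separate processes or pods would need a shared lock (for example in Redis), which is not shown here.

```python
import threading
import time


class SingleFlightCache:
    """In-process TTL cache where concurrent misses on a key trigger one fetch."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._values = {}  # key -> (expires_at, value)
        self._locks = {}   # key -> lock guarding the fetch for that key
        self._guard = threading.Lock()

    def get(self, key, fetch):
        # Look up (or create) the per-key lock under a short global lock
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())

        # Only one thread per key gets past this point at a time
        with lock:
            entry = self._values.get(key)
            if entry is not None and entry[0] > self._clock():
                return entry[1]
            value = fetch()
            self._values[key] = (self._clock() + self._ttl, value)
            return value
```

A caller would pass the Item identifier as the key and a zero-argument function that performs the real Plaid request as fetch.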

Step 5: Rate-Limit Yourself Before Plaid Does

For batch operations (e.g., syncing all items nightly), implement a token bucket or sliding window rate limiter at the application level to stay well under Plaid's limits:

import threading
import time

class PlaidRateLimiter:
    """Token bucket: max `rate` calls per second, with burst capacity `burst`."""

    def __init__(self, rate: float = 5.0, burst: int = 10):
        self.rate = rate  # tokens per second
        self.burst = burst
        self.tokens = float(burst)
        self.last_refill = time.monotonic()
        self._lock = threading.Lock()

    def acquire(self):
        # The lock is held while sleeping on purpose: waiting callers queue up
        with self._lock:
            now = time.monotonic()
            elapsed = now - self.last_refill
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
            self.last_refill = now

            if self.tokens < 1:
                sleep_time = (1 - self.tokens) / self.rate
                time.sleep(sleep_time)
                # The token earned during the sleep is spent by this call, so
                # restart the refill clock to avoid counting that time twice.
                self.tokens = 0.0
                self.last_refill = time.monotonic()
            else:
                self.tokens -= 1
plaid_limiter = PlaidRateLimiter(rate=5.0, burst=10)

# Wrap all Plaid calls:
plaid_limiter.acquire()
response = plaid_call_with_backoff(plaid_client.transactions_sync, req)
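
The token bucket above throttles the application as a whole. Per-item limits can be tracked with a sliding window, which the sketch below illustrates. The class name PerItemSlidingWindow and the max_calls value are hypothetical; choose a budget below the limit Plaid documents for the endpoint you call.

```python
import threading
import time
from collections import defaultdict, deque


class PerItemSlidingWindow:
    """Allow at most `max_calls` per item within any `window_seconds` span."""

    def __init__(self, max_calls, window_seconds=60.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock
        self._calls = defaultdict(deque)  # item_id -> timestamps of recent calls
        self._lock = threading.Lock()

    def try_acquire(self, item_id):
        """Record a call and return True, or return False if the item is over budget."""
        with self._lock:
            now = self._clock()
            calls = self._calls[item_id]
            # Drop timestamps that have aged out of the window
            while calls and now - calls[0] >= self._window:
                calls.popleft()
            if len(calls) >= self._max_calls:
                return False
            calls.append(now)
            return True
```

When try_acquire returns False, a batch job can skip that Item and come back to it later instead of blocking the whole run.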

Step 6: Request a Limit Increase

If your legitimate usage exceeds your plan's rate limits after optimization, contact Plaid support at dashboard.plaid.com → Support → Submit a request. Include:

Your client_id
The specific endpoint and expected request volume
Your use case justification
Current and target requests-per-minute figures
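
For the current requests-per-minute figure, a peak over a rolling 60-second window is usually more informative than an hourly average. The sketch below computes it from request timestamps taken from your own logs. The function name is hypothetical, and timestamps are assumed to be in seconds.

```python
from collections import deque


def peak_requests_per_minute(timestamps, window_seconds=60.0):
    """Return the largest number of requests seen in any rolling window."""
    window = deque()
    peak = 0
    for ts in sorted(timestamps):
        window.append(ts)
        # Evict requests that fall outside the window ending at ts
        while ts - window[0] >= window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak
```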

Frequently Asked Questions

bash

#!/usr/bin/env bash
# Plaid Rate Limit Diagnostics Script
# Prerequisites: jq, curl, PLAID_CLIENT_ID, PLAID_SECRET, PLAID_ENV env vars set

PLAID_ENV="${PLAID_ENV:-sandbox}"  # sandbox | development | production
BASE_URL="https://${PLAID_ENV}.plaid.com"

# ─── 1. Test connectivity and credentials ────────────────────────────────────
# /institutions/get requires client_id and secret, so a successful response
# shows the keys work (/categories/get accepts unauthenticated requests).
echo "[1/4] Testing Plaid credentials for environment: ${PLAID_ENV}"
INSTITUTIONS_RESP=$(curl -s -X POST "${BASE_URL}/institutions/get" \
  -H "Content-Type: application/json" \
  -d '{"client_id":"'"${PLAID_CLIENT_ID}"'","secret":"'"${PLAID_SECRET}"'","count":1,"offset":0,"country_codes":["US"]}')

if echo "${INSTITUTIONS_RESP}" | jq -e '.institutions' > /dev/null 2>&1; then
  echo "  ✓ Credentials valid"
else
  echo "  ✗ Credential error: $(echo "${INSTITUTIONS_RESP}" | jq -r '.error_code // .error_message')"
  exit 1
fi

# ─── 2. Check for recent rate limit errors in Plaid Dashboard API ─────────────
# Note: Plaid doesn't expose a programmatic logs API; use the Dashboard UI
echo "[2/4] To see rate limit errors:"
echo "  → https://dashboard.plaid.com/team/api"
echo "  → Filter logs for RATE_LIMIT_EXCEEDED errors"
echo "  → Note which endpoints and item_ids are affected"

# ─── 3. Check current request rate for a specific access token ───────────────
echo "[3/4] Probing per-item rate limit status for ACCESS_TOKEN..."
if [ -z "${PLAID_ACCESS_TOKEN}" ]; then
  echo "  ⚠ Set PLAID_ACCESS_TOKEN to test per-item limits"
else
  # Make a lightweight call and inspect headers
  RESP=$(curl -s -i -X POST "${BASE_URL}/item/get" \
    -H "Content-Type: application/json" \
    -d '{"client_id":"'"${PLAID_CLIENT_ID}"'","secret":"'"${PLAID_SECRET}"'","access_token":"'"${PLAID_ACCESS_TOKEN}"'"}' \
    2>&1)
  
  HTTP_STATUS=$(echo "${RESP}" | grep -i '^HTTP/' | awk '{print $2}' | tail -1)
  RETRY_AFTER=$(echo "${RESP}" | grep -i '^retry-after:' | awk '{print $2}' | tr -d '\r')
  ERROR_CODE=$(echo "${RESP}" | tail -1 | jq -r '.error_code // empty' 2>/dev/null)
  
  echo "  HTTP Status: ${HTTP_STATUS}"
  if [ "${HTTP_STATUS}" = "429" ]; then
    echo "  ✗ RATE_LIMIT_EXCEEDED — Retry-After: ${RETRY_AFTER:-unknown} seconds"
    echo "  Error Code: ${ERROR_CODE}"
  else
    echo "  ✓ No rate limit hit"
  fi
fi

# ─── 4. Validate webhook configuration for an item ───────────────────────────
echo "[4/4] Verifying webhook is configured (avoids need for polling)..."
if [ -n "${PLAID_ACCESS_TOKEN}" ]; then
  ITEM_RESP=$(curl -s -X POST "${BASE_URL}/item/get" \
    -H "Content-Type: application/json" \
    -d '{"client_id":"'"${PLAID_CLIENT_ID}"'","secret":"'"${PLAID_SECRET}"'","access_token":"'"${PLAID_ACCESS_TOKEN}"'"}')
  
  WEBHOOK=$(echo "${ITEM_RESP}" | jq -r '.item.webhook // empty')
  if [ -z "${WEBHOOK}" ]; then
    echo "  ⚠ No webhook configured — you may be polling unnecessarily"
    echo "  Fix: call /item/webhook/update to register a webhook URL"
  else
    echo "  ✓ Webhook configured: ${WEBHOOK}"
  fi
fi

echo ""
echo "Done. Review Plaid Dashboard logs for historical rate limit patterns."

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps, SRE, and backend engineers who specialize in API integration troubleshooting, distributed systems reliability, and financial data infrastructure. Contributors have production experience with Plaid, Stripe, and other fintech APIs at scale.
