How to Fix Elasticsearch API Timeout Errors (Request Timeout after 30000ms)
Resolve Elasticsearch API timeouts. Diagnose slow queries, GC pauses, and thread pool exhaustion. Learn to optimize queries and adjust client timeout settings.
- Client-side timeouts (e.g., RequestTimeoutError) occur when the client gives up before Elasticsearch finishes processing.
- Server-side timeouts often result from heavy queries, deep pagination, unoptimized mappings, or JVM Garbage Collection (GC) pauses.
- Increasing timeouts is a temporary band-aid; long-term fixes require query optimization, using Async Search, or scaling the cluster.
- Thread pool rejections and high CPU/Heap usage are primary indicators of an under-resourced cluster causing API timeouts.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Increase Client Timeout | Immediate mitigation for occasional spikes | 1 min | High (Can mask underlying issues and exhaust cluster resources) |
| Use Async Search API | For known long-running aggregations or reports | Hours | Low (Designed specifically for heavy workloads) |
| Optimize Queries | Permanent fix for inefficient searches (e.g., removing leading wildcards) | Days | Low (Improves overall cluster stability) |
| Scale Up/Out Cluster | When CPU, RAM, or JVM heap are consistently maxed out | Days-Weeks | Medium (Requires downtime or careful rolling restarts) |
Understanding the Error
When working with Elasticsearch at scale, one of the most common and frustrating roadblocks developers and operations teams encounter is the Elasticsearch API timeout error. Depending on where the timeout occurs—on the client, at an intermediary proxy, or within the Elasticsearch cluster itself—the exact error message you see can vary wildly.
Common error manifestations include:
- Node.js/JavaScript Client: RequestTimeoutError: Request Timeout after 30000ms
- Python Client: elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))
- Direct cURL or Kibana Dev Tools: {"error":{"root_cause":[{"type":"timeout_exception","reason":"java.util.concurrent.TimeoutException"}],"type":"timeout_exception","reason":"java.util.concurrent.TimeoutException"},"status":500}
- Reverse Proxy (Nginx/HAProxy/AWS API Gateway): 504 Gateway Timeout
These errors almost invariably mean a mismatch between the time your client is willing to wait and the time Elasticsearch requires to gather, compute, and return the data. In distributed systems, this isn't just an annoyance; it's a symptom of resource contention, unoptimized data models, or poorly constructed queries.
Step 1: Diagnose the Root Cause
Before you immediately reach for the timeout dial to increase it to 5 minutes, you must diagnose why the timeout is happening. Increasing the timeout without understanding the underlying cause often leads to cascading failures, where long-running queries pile up, consume all available search threads, trigger massive Garbage Collection (GC) pauses, and eventually crash nodes.
1. Check Cluster Health and Pending Tasks
If your cluster is constantly red or yellow, or if it has a massive backlog of pending tasks, API requests will queue up and eventually time out.
Run GET /_cluster/health?pretty. Look at the status, number_of_pending_tasks, and active_shards_percent.
Next, run GET /_cluster/pending_tasks?pretty. If you see a massive list of cluster state updates, mapping updates, or shard allocations, your cluster is busy doing internal bookkeeping, starving your search requests.
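The two responses above can also be triaged programmatically. Here is a minimal Python sketch (the field names come from the cluster health and pending tasks APIs; the backlog threshold of 50 is an arbitrary assumption, not an Elasticsearch default):

```python
def assess_cluster(health, pending_tasks):
    """Rough triage of GET /_cluster/health and GET /_cluster/pending_tasks
    responses, both passed in as parsed JSON dicts."""
    issues = []
    if health.get("status") != "green":
        issues.append("cluster status is %s" % health.get("status"))
    if health.get("number_of_pending_tasks", 0) > 50:  # arbitrary threshold
        issues.append("large pending-task backlog")
    if health.get("active_shards_percent_as_number", 100.0) < 100.0:
        issues.append("shards still initializing or unassigned")
    if pending_tasks.get("tasks"):
        issues.append("%d pending cluster state updates" % len(pending_tasks["tasks"]))
    return issues
```

Any non-empty result here means search requests are competing with internal bookkeeping and timeouts are likely a symptom, not the disease.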
2. Analyze Node Hot Threads and JVM Pressure
When a query times out, the Elasticsearch node might be actively grinding through CPU cycles, or it might be paused entirely due to JVM garbage collection.
Run GET /_nodes/hot_threads during a timeout event. This endpoint is pure gold for SREs. It returns a stack trace of the threads consuming the most CPU. If you see deep stack traces involving org.apache.lucene.search, you have an expensive query. If you see lots of GC threads, your JVM heap is likely maxed out.
3. Identify Expensive Queries via Slow Logs
Elasticsearch has built-in slow logs that you can enable dynamically. If you suspect a specific index is causing the timeouts, enable the search slow log:
PUT /my-index/_settings
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.fetch.debug": "500ms"
}
Monitor your Elasticsearch log files. You will likely find queries using wildcards at the beginning of terms (e.g., *keyword), heavy regex queries, massive terms aggregations, or deep pagination (from: 100000, size: 100).
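If you manage index settings from code rather than Dev Tools, the same thresholds can be applied through the Python client. A hedged sketch (the helper name is ours; es.indices.put_settings is the client call, taking settings= in elasticsearch-py 8.x and body= in 7.x):

```python
def slowlog_settings(query_warn="10s", fetch_debug="500ms"):
    """Build the dynamic index settings body that enables search slow logs.
    Apply with, e.g.:
        es.indices.put_settings(index="my-index", settings=slowlog_settings())
    (use body= instead of settings= on elasticsearch-py 7.x)."""
    return {
        "index.search.slowlog.threshold.query.warn": query_warn,
        "index.search.slowlog.threshold.fetch.debug": fetch_debug,
    }
```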
Step 2: Implement Fixes
Fixing Elasticsearch timeouts requires a layered approach: short-term mitigations to restore service, and long-term architectural changes to ensure stability.
Short-Term Fix: Adjusting Timeouts Wisely
If your cluster is healthy but a specific API endpoint requires more time, you can increase the timeout. However, you must differentiate between the client-side timeout and the server-side timeout.
Client-Side: When using the official clients, you must explicitly tell the client to wait longer. For example, in Python (note that the timeout parameter was renamed request_timeout in elasticsearch-py 8.x):
from elasticsearch import Elasticsearch
es = Elasticsearch(["http://localhost:9200"], timeout=60, max_retries=3, retry_on_timeout=True)
In Node.js:
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200', requestTimeout: 60000 })
Server-Side: You can also pass a timeout parameter to your search requests to tell Elasticsearch to return whatever partial results it has gathered before the time expires:
GET /my-index/_search?timeout=10s
This is highly recommended for user-facing applications where partial data is better than an error page.
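When you rely on a server-side timeout, always check the response's timed_out flag so your application knows it is rendering partial data. A minimal sketch (the function name and return shape are ours; the timed_out and _shards fields are standard parts of the search response):

```python
def classify_search_response(resp):
    """Decide whether a search response (parsed JSON dict) is complete or
    partial. With ?timeout=10s Elasticsearch sets "timed_out": true and
    returns whatever hits each shard managed to gather in time."""
    shards = resp.get("_shards", {})
    partial = resp.get("timed_out", False) or shards.get("failed", 0) > 0
    return {
        "partial": partial,
        "hits": len(resp.get("hits", {}).get("hits", [])),
    }
```

A user-facing app can then show a "results may be incomplete" banner instead of silently presenting truncated data as complete.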
Long-Term Fix 1: Adopt the Async Search API
If you are running heavy aggregations, reporting queries, or extracting massive amounts of data, you should not be using synchronous HTTP requests. A standard HTTP connection will likely be dropped by intermediate load balancers (like AWS ALB or Nginx) after 60 seconds anyway.
Instead, use the Elasticsearch _async_search API. This API allows you to submit a query and get an ID back immediately. You can then poll this ID to check the status and retrieve the results when they are ready, completely bypassing HTTP timeout limitations.
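In Python, the submit-then-poll flow looks roughly like this. A sketch assuming elasticsearch-py 8.x, where the async search endpoints are exposed as client.async_search.submit / .get; the id, is_running, and response fields follow the async search response format:

```python
import time

def run_async_search(client, index, body, poll_interval=2.0, max_wait=600.0):
    """Submit a heavy query via _async_search and poll until it finishes."""
    submitted = client.async_search.submit(
        index=index, body=body, wait_for_completion_timeout="1s"
    )
    if not submitted.get("is_running", False):
        return submitted["response"]  # finished within the initial 1s wait
    search_id = submitted["id"]
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        status = client.async_search.get(id=search_id)
        if not status.get("is_running", False):
            return status["response"]
        time.sleep(poll_interval)
    raise TimeoutError("async search %s still running after %ss" % (search_id, max_wait))
```

Because the ID can be fetched later from any process, this pattern also survives load-balancer idle timeouts that would kill a long-lived synchronous connection.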
Long-Term Fix 2: Optimize Queries and Mappings
- Stop Deep Pagination: If you are using from and size to page through thousands of results, stop. Elasticsearch must sort and rank all results up to from + size for every page request. Use search_after or the Point in Time (PIT) API for deep scrolling.
- Avoid Leading Wildcards: Queries like *error* force Lucene to scan the entire inverted index. Use match_phrase, n-grams, or standard text analysis instead.
- Use keyword types for exact matches: Do not run terms aggregations on text fields. Ensure your mappings specify keyword for fields you intend to aggregate or filter on exactly.
- Pre-calculate data: If you run the same heavy aggregation constantly, use Transforms to pivot the data into a summarized index in the background.
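To make the deep-pagination fix concrete, here is a hedged Python sketch of search_after paging. It assumes documents carry a sortable timestamp field and uses _id as a tiebreaker purely for illustration; for production-depth scrolling, combine search_after with a Point in Time and the _shard_doc tiebreaker:

```python
def iterate_hits(client, index, query, page_size=500):
    """Generator that pages through all matches with search_after,
    keeping per-request cost flat instead of growing with page depth."""
    search_after = None
    while True:
        body = {
            "query": query,
            "size": page_size,
            # Deterministic sort is required; the _id tiebreak is illustrative.
            "sort": [{"timestamp": "asc"}, {"_id": "asc"}],
        }
        if search_after is not None:
            body["search_after"] = search_after
        resp = client.search(index=index, body=body)
        hits = resp["hits"]["hits"]
        if not hits:
            return
        yield from hits
        search_after = hits[-1]["sort"]  # cursor for the next page
```

Unlike from/size, each request here only sorts one page plus the cursor, so page 1,000 costs the same as page 1.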
Long-Term Fix 3: Cluster Sizing and Circuit Breakers
If queries are optimized but timeouts persist, your cluster simply lacks horsepower.
- Set your JVM heap to no more than 50% of available RAM, and keep it below ~32GB so the JVM can still use compressed ordinary object pointers (oops).
- Check your thread pools (GET /_cat/thread_pool?v). If you see a high number of rejected tasks in the search or write queues, your nodes are saturated. You need to add more data nodes to the cluster to distribute the load.
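Rejections are easiest to spot from GET /_cat/thread_pool?format=json, which returns one JSON row per node and pool. A small sketch for filtering them (the column names follow the cat API's JSON output; note that the values arrive as strings):

```python
def saturated_pools(rows, pools=("search", "write")):
    """Return (node, pool, rejected) tuples for pools with any rejections,
    given the parsed JSON from GET /_cat/thread_pool?format=json."""
    return [
        (row.get("node_name"), row["name"], int(row.get("rejected", "0")))
        for row in rows
        if row.get("name") in pools and int(row.get("rejected", "0")) > 0
    ]
```

Any nonzero rejected count means requests were turned away because the queue was full, which surfaces to clients as timeouts or 429s.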
Quick Reference: Diagnostic and Remediation Commands
# --- Diagnostic Commands ---
# 1. Check overall cluster health and unassigned shards
curl -X GET "localhost:9200/_cluster/health?pretty"
# 2. Check for operations blocking the cluster
curl -X GET "localhost:9200/_cluster/pending_tasks?pretty"
# 3. Identify what the CPU is currently doing (run during a timeout event)
curl -X GET "localhost:9200/_nodes/hot_threads?pretty"
# 4. Check thread pool rejections (look for the 'search' thread pool)
curl -X GET "localhost:9200/_cat/thread_pool?v&s=name"
# --- Remediation & Workarounds ---
# Execute a search with a strict server-side timeout to return partial data
curl -X GET "localhost:9200/my-index/_search?timeout=5s" -H 'Content-Type: application/json' -d'
{
"query": {
"match_all": {}
}
}'
# Example of using the Async Search API for long-running queries
# This returns an ID immediately instead of timing out
curl -X POST "localhost:9200/my-index/_async_search?size=0" -H 'Content-Type: application/json' -d'
{
"aggs": {
"daily_sales": {
"date_histogram": {
"field": "timestamp",
"calendar_interval": "1d"
}
}
}
}'
# To retrieve the results later using the returned ID:
# curl -X GET "localhost:9200/_async_search/<YOUR_ASYNC_SEARCH_ID>"
Error Medic Editorial
Error Medic Editorial comprises senior DevOps engineers, SREs, and database administrators dedicated to solving complex infrastructure bottlenecks and distributed system failures.