Error Medic

Elasticsearch API Timeout: Diagnosing and Fixing Request Timeout Errors

Fix Elasticsearch API timeout errors fast. Covers socket timeouts, request_timeout settings, slow queries, and cluster health fixes with real commands.

Key Takeaways
  • Most Elasticsearch API timeouts are caused by one of three root causes: undersized thread pools, slow or unoptimized queries hitting too many shards, or a cluster with unassigned shards causing request queuing
  • Client request timeouts commonly default to 10–30 seconds depending on the library (the Java REST client defaults to 30 s; older elasticsearch-py versions defaulted to 10 s); if your query or bulk operation exceeds the limit, you will see a `ConnectionTimeout` or `RequestError` with a `timeout` reason in the response
  • Quick fix: start by running `GET _cluster/health` and `GET _cat/thread_pool?v` to triage whether the issue is cluster-wide, query-specific, or client-configuration-related before changing any settings
Fix Approaches Compared
| Method | When to Use | Time to Apply | Risk |
| --- | --- | --- | --- |
| Increase client request_timeout | Slow but valid long-running queries (reindex, bulk) | < 5 min | Low — client-side only |
| Tune search.default_search_timeout | Queries consistently slow across the cluster | 5–15 min | Medium — affects all queries |
| Add more shards / reduce shard count | Too many or too few shards causing hot spots | 30–120 min | Medium — requires reindex |
| Scale data nodes horizontally | Thread pool queue saturation under sustained load | 1–4 hrs | Low — additive change |
| Add request circuit breaker | Protect cluster from memory-exhausting queries | 10 min | Low — adds guardrails |
| Optimize query (filters, avoid wildcards) | Specific slow query identified via slow log | 15–60 min | Low — query-level change |
| Force-assign unassigned shards | Red/yellow cluster blocking primary operations | 5–30 min | Medium — review reason first |

Understanding Elasticsearch API Timeout Errors

When a request to Elasticsearch exceeds the configured time limit, the client or server terminates the connection and surfaces one of several timeout-related errors. Understanding which layer is timing out is the critical first diagnostic step.

Common Error Messages

Depending on your client library and configuration, you will see one of these signatures:

Python elasticsearch-py client:

elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=30))

Java High-Level REST Client:

java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-0 [ACTIVE]

Kibana / curl:

{"error":{"root_cause":[{"type":"search_phase_execution_exception","reason":"all shards failed"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"my-index","node":"abc123","reason":{"type":"query_shard_exception","reason":"Request timed out"}}]}}

Server-side timeout (search timeout):

{"timed_out":true,"_shards":{"total":5,"successful":3,"skipped":0,"failed":2}}

Note the difference: a client timeout throws an exception before any response arrives. A server-side timeout returns a 200 response with "timed_out": true and partial results.
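This distinction can be encoded in a small helper. The function below is a hypothetical sketch (not part of any client library) that classifies which layer timed out, given either a raised exception or a parsed response body:

```python
# Hypothetical helper: classify the timeout layer from either a raised
# client exception or a parsed Elasticsearch response body.
def classify_timeout(response=None, exception=None):
    """Return 'client', 'server-partial', or 'ok'."""
    if exception is not None:
        # Client-side: the exception fired before any response arrived.
        return "client"
    if response is not None and response.get("timed_out"):
        # Server-side: HTTP 200 with partial results from some shards.
        return "server-partial"
    return "ok"
```

A "server-partial" result means you already have data from the shards that finished; a "client" result means you have nothing and must retry or raise the client timeout.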


Step 1: Diagnose — Identify the Timeout Layer

Run these commands in order to narrow down the root cause.

1a. Check cluster health first

curl -s 'http://localhost:9200/_cluster/health?pretty'

A red or yellow status means shards are unassigned. Primary shard unavailability causes all writes and some reads to block until timeout. If status is red, fix this before anything else.
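The decision rule above can be sketched as a small triage function. The mapping of status to next step is this guide's recommendation, not an official API:

```python
# Illustrative triage helper: map the _cluster/health response to the
# next diagnostic step recommended in this guide.
def next_step(health):
    status = health.get("status")
    if status == "red":
        return "fix unassigned primary shards first (Fix C)"
    if status == "yellow":
        return "replicas unassigned; check _cluster/allocation/explain"
    return "cluster green; check thread pools and slow log next"
```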

1b. Inspect thread pool saturation

curl -s 'http://localhost:9200/_cat/thread_pool/search,write,bulk?v&h=node_name,name,active,queue,rejected,completed'

If queue is consistently > 0 or rejected is climbing, your nodes are overwhelmed. Requests are queuing past the client timeout window.
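If you are scripting this check, the `_cat` output with the column list requested above parses easily. A minimal sketch, assuming the `h=node_name,name,active,queue,rejected,completed` columns and the `?v` header row:

```python
# Parse _cat/thread_pool tabular output (with ?v header row) and flag
# pools whose queue depth exceeds a threshold.
def saturated_pools(cat_output, queue_threshold=0):
    """Return (node, pool) pairs whose queue exceeds the threshold."""
    lines = cat_output.strip().splitlines()
    header = lines[0].split()
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    return [
        (r["node_name"], r["name"])
        for r in rows
        if int(r["queue"]) > queue_threshold
    ]
```

Run it on the command's output on a schedule and alert when the same pool stays saturated across samples, since a single nonzero reading under burst load is normal.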

1c. Find slow queries via the slow log

Enable the slow log temporarily:

curl -X PUT 'http://localhost:9200/my-index/_settings' -H 'Content-Type: application/json' -d '{
  "index.search.slowlog.threshold.query.warn": "2s",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.level": "warn"
}'

Then tail the Elasticsearch log:

tail -f /var/log/elasticsearch/my-cluster_index_search_slowlog.log

1d. Check pending tasks

curl -s 'http://localhost:9200/_cluster/pending_tasks?pretty'

A large backlog of pending cluster tasks (e.g., shard assignment, mapping updates) can delay query execution.

1e. Check JVM heap pressure

curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty' | python3 -c "
import json, sys
nodes = json.load(sys.stdin)['nodes']
for nid, n in nodes.items():
    used_pct = n['jvm']['mem']['heap_used_percent']
    print(f\"{n['name']}: heap {used_pct}%\")
"

Heap usage consistently above 75% triggers frequent GC pauses, which directly cause timeouts.


Step 2: Fix — Apply the Right Remedy

Fix A: Adjust the Client-Side Timeout

This is the correct fix when your operation (bulk indexing, reindex, aggregation over large datasets) is legitimately long-running and valid.

Python:

from elasticsearch import Elasticsearch

es = Elasticsearch(
    ['http://localhost:9200'],
    request_timeout=120  # seconds
)

# Per-request override (elasticsearch-py 7.x style; in 8.x use
# es.options(request_timeout=60).search(...))
result = es.search(index='my-index', body=query, request_timeout=60)

Node.js (@elastic/elasticsearch):

const { Client } = require('@elastic/elasticsearch')
const client = new Client({
  node: 'http://localhost:9200',
  requestTimeout: 120000  // milliseconds
})

curl:

curl --max-time 120 -X GET 'http://localhost:9200/my-index/_search' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}'
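Raising the timeout pairs well with retries for transient timeouts (GC pauses, brief queue spikes). Below is an illustrative generic retry wrapper with exponential backoff; it is not part of any Elasticsearch client, and the injectable `sleep` parameter exists only to make it easy to test:

```python
import time

# Illustrative retry wrapper for transient timeouts: exponential
# backoff (1s, 2s, 4s, ...) capped at 30s between attempts.
def with_retries(fn, retries=3, base_delay=1.0, exceptions=(TimeoutError,),
                 sleep=time.sleep):
    for attempt in range(retries + 1):
        try:
            return fn()
        except exceptions:
            if attempt == retries:
                raise  # out of retries; surface the timeout
            sleep(min(base_delay * (2 ** attempt), 30))
```

Only retry idempotent operations this way; blindly retrying a bulk write that may have partially succeeded can create duplicates unless you use deterministic document IDs.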
Fix B: Set a Cluster-Wide Search Timeout

This prevents runaway queries from tying up resources. It returns partial results rather than blocking forever.

curl -X PUT 'http://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "persistent": {
    "search.default_search_timeout": "30s"
  }
}'

Or pass timeout per request at query time:

curl -X GET 'http://localhost:9200/my-index/_search?timeout=10s' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}'
Fix C: Fix Unassigned Shards (Red Cluster)
# Find unassigned shards and their reason
curl -s 'http://localhost:9200/_cluster/allocation/explain?pretty'

# Retry failed shard allocation
curl -X POST 'http://localhost:9200/_cluster/reroute?retry_failed=true'

# If a node was permanently removed and you accept data loss on that shard:
curl -X POST 'http://localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
  "commands": [{
    "allocate_stale_primary": {
      "index": "my-index",
      "shard": 0,
      "node": "node-1",
      "accept_data_loss": true
    }
  }]
}'
Fix D: Relieve Thread Pool Pressure

For sustained search thread pool saturation, tune the thread pool size (requires node restart):

# elasticsearch.yml
thread_pool:
  search:
    size: 13          # default: (vCPUs * 3) / 2 + 1
    queue_size: 1000  # default: 1000

Alternatively, reduce concurrent client connections or implement request rate limiting upstream (NGINX, API gateway).
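Before overriding the pool size, it helps to know what the default would be on your hardware. A one-line helper implementing the formula from the comment above (int((vCPUs * 3) / 2) + 1):

```python
# Default search thread pool size: int((allocated processors * 3) / 2) + 1
def default_search_pool_size(vcpus):
    return (vcpus * 3) // 2 + 1
```

If your override is below the computed default, you are likely making saturation worse, not better.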

Fix E: Optimize the Query

Replace slow wildcard / fuzzy / leading-wildcard patterns with match, prefix on keyword fields, or use ngram tokenizers:

# Bad — leading wildcard scans all terms:
{"query": {"wildcard": {"title": "*cloud*"}}}

# Good — use match on analyzed field or prefix on keyword:
{"query": {"match": {"title": "cloud"}}}

Use filter context for non-scoring clauses (enables caching):

{
  "query": {
    "bool": {
      "filter": [
        {"term": {"status": "active"}},
        {"range": {"created_at": {"gte": "now-7d"}}}
      ],
      "must": [
        {"match": {"description": "kubernetes"}}
      ]
    }
  }
}
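When building this query from application code, keeping the filter/must split in one place avoids accidentally moving cacheable clauses into scoring context. A hypothetical builder for the query above; the field names (status, created_at, description) are just the examples used in this guide:

```python
# Build the filter-context query shown above. Filter clauses are
# non-scoring and cacheable; only the full-text match is scored.
def build_query(status, days, text):
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"created_at": {"gte": f"now-{days}d"}}},
                ],
                "must": [{"match": {"description": text}}],
            }
        }
    }
```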
Fix F: Reduce Shard Count for Over-Sharded Indices

Over-sharding (too many small shards) causes excessive coordination overhead. The recommended shard size is 10–50 GB.

# Check current shard sizes
curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,docs,store,node' | sort -k6 -h

# Shrink an over-sharded index (must be read-only, all shards on one node)
curl -X PUT 'http://localhost:9200/my-index/_settings' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.routing.allocation.require._name": "node-1",
    "index.blocks.write": true
  }
}'

curl -X POST 'http://localhost:9200/my-index/_shrink/my-index-shrunk' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1,
    "index.routing.allocation.require._name": null
  }
}'
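As a rough sizing aid based on the 10–50 GB guidance above, you can compute a target shard count from the index's total size. This is a heuristic sketch, not an official formula; note also that the _shrink target count must be a factor of the source index's shard count:

```python
import math

# Heuristic: target shard count for an index, aiming for a given
# per-shard size within the 10-50 GB guidance (default 30 GB).
def recommended_shard_count(index_size_gb, target_shard_gb=30):
    return max(1, math.ceil(index_size_gb / target_shard_gb))
```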

Step 3: Verify the Fix

After applying your fix, confirm the improvement:

# Confirm cluster is green
curl -s 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=30s&pretty'

# Check response time of your query
time curl -s -X GET 'http://localhost:9200/my-index/_search' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}},"size":1}' | python3 -m json.tool | grep took

# Confirm timed_out is false
curl -s 'http://localhost:9200/my-index/_search?timeout=10s' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}' | python3 -c "import json,sys; r=json.load(sys.stdin); print('timed_out:', r.get('timed_out'))"

Complete Triage Script

The individual diagnostics above are collected into a single script you can run against your cluster:
#!/usr/bin/env bash
# elasticsearch-timeout-triage.sh
# Run against your cluster to collect timeout diagnostic data

ES_HOST="${ES_HOST:-http://localhost:9200}"
INDEX="${1:-*}"  # pass index name as arg or defaults to all

echo "=== [1] Cluster Health ==="
curl -s "${ES_HOST}/_cluster/health?pretty"

echo -e "\n=== [2] Thread Pool Saturation ==="
curl -s "${ES_HOST}/_cat/thread_pool/search,write,bulk?v&h=node_name,name,active,queue,rejected,completed"

echo -e "\n=== [3] Pending Cluster Tasks ==="
curl -s "${ES_HOST}/_cluster/pending_tasks?pretty" | python3 -c "
import json,sys
d=json.load(sys.stdin)
print(f'Pending tasks: {len(d.get(\"tasks\",[]))}')
for t in d.get('tasks',[])[:5]:
    print(' -', t.get('source','?'), '|', t.get('time_in_queue','?'))
"

echo -e "\n=== [4] JVM Heap by Node ==="
curl -s "${ES_HOST}/_nodes/stats/jvm?pretty" | python3 -c "
import json,sys
nodes=json.load(sys.stdin)['nodes']
for nid,n in nodes.items():
    heap=n['jvm']['mem']
    used=heap['heap_used_in_bytes']
    total=heap['heap_max_in_bytes']
    pct=heap['heap_used_percent']
    gc_count=sum(v['collection_count'] for v in n['jvm']['gc']['collectors'].values())
    print(f\"{n['name']}: heap={pct}% ({used//1024//1024}MB/{total//1024//1024}MB) gc_count={gc_count}\")
"

echo -e "\n=== [5] Unassigned Shards ==="
UNASSIGNED=$(curl -s "${ES_HOST}/_cat/shards?h=index,shard,prirep,state" | grep -c UNASSIGNED || true)
echo "Unassigned shard count: ${UNASSIGNED}"
if [ "${UNASSIGNED}" -gt 0 ]; then
    echo "  --> Run: curl -s '${ES_HOST}/_cluster/allocation/explain?pretty' for details"
fi

echo -e "\n=== [6] Large / Hot Shards ==="
curl -s "${ES_HOST}/_cat/shards/${INDEX}?v&h=index,shard,prirep,state,docs,store,node&s=store:desc" | head -20

echo -e "\n=== [7] Sample Query Latency ==="
SAMPLE_MS=$(curl -s -X GET "${ES_HOST}/${INDEX}/_search" \
  -H 'Content-Type: application/json' \
  -d '{"query":{"match_all":{}},"size":1,"timeout":"5s"}' \
  | python3 -c "import json,sys; r=json.load(sys.stdin); print(r.get('took','?'),'ms | timed_out:',r.get('timed_out'))")
echo "  Took: ${SAMPLE_MS}"

echo -e "\n=== [8] Circuit Breaker Status ==="
curl -s "${ES_HOST}/_nodes/stats/breaker?pretty" | python3 -c "
import json,sys
nodes=json.load(sys.stdin)['nodes']
for nid,n in nodes.items():
    print(n['name'])
    for name,cb in n['breakers'].items():
        print(f\"  {name}: {cb['overhead']}x overhead, limit={cb['limit_size_in_bytes']//1024//1024}MB, tripped={cb['tripped']}\")
"

echo -e "\n=== Triage Complete ==="

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps engineers and SREs with hands-on experience managing large-scale Elasticsearch deployments across cloud and on-premises environments. Our guides are based on real production incidents, official documentation, and community-verified solutions.
