Error Medic

Elasticsearch API Timeout: Diagnosing and Fixing Request Timeout Errors

Fix Elasticsearch API timeout errors fast. Covers socket timeouts, request_timeout settings, slow queries, and cluster health fixes with real commands.

Key Takeaways
  • Most Elasticsearch API timeouts are caused by one of three root causes: undersized thread pools, slow or unoptimized queries hitting too many shards, or a cluster with unassigned shards causing request queuing
  • Client request timeouts commonly default to 10–30 seconds depending on the library (the Java REST client defaults to 30 s; older elasticsearch-py versions defaulted to 10 s); if your query or bulk operation exceeds the limit, you will see a `ConnectionTimeout` or `RequestError` with a `timeout` reason in the response
  • Quick fix: start by running `GET _cluster/health` and `GET _cat/thread_pool?v` to triage whether the issue is cluster-wide, query-specific, or client-configuration-related before changing any settings
Fix Approaches Compared
| Method | When to Use | Time to Apply | Risk |
| --- | --- | --- | --- |
| Increase client request_timeout | Slow but valid long-running queries (reindex, bulk) | < 5 min | Low — client-side only |
| Tune search.default_search_timeout | Queries consistently slow across the cluster | 5–15 min | Medium — affects all queries |
| Add more shards / reduce shard count | Too many or too few shards causing hot spots | 30–120 min | Medium — requires reindex |
| Scale data nodes horizontally | Thread pool queue saturation under sustained load | 1–4 hrs | Low — additive change |
| Add request circuit breaker | Protect cluster from memory-exhausting queries | 10 min | Low — adds guardrails |
| Optimize query (filters, avoid wildcards) | Specific slow query identified via slow log | 15–60 min | Low — query-level change |
| Force-assign unassigned shards | Red/yellow cluster blocking primary operations | 5–30 min | Medium — review reason first |

Understanding Elasticsearch API Timeout Errors

When a request to Elasticsearch exceeds the configured time limit, the client or server terminates the connection and surfaces one of several timeout-related errors. Understanding which layer is timing out is the critical first diagnostic step.

Common Error Messages

Depending on your client library and configuration, you will see one of these signatures:

Python elasticsearch-py client:

elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=30))

Java High-Level REST Client:

java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-0 [ACTIVE]

Kibana / curl:

{"error":{"root_cause":[{"type":"search_phase_execution_exception","reason":"all shards failed"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"my-index","node":"abc123","reason":{"type":"query_shard_exception","reason":"Request timed out"}}]}}

Server-side timeout (search timeout):

{"timed_out":true,"_shards":{"total":5,"successful":3,"skipped":0,"failed":2}}

Note the difference: a client timeout throws an exception before any response arrives. A server-side timeout returns a 200 response with "timed_out": true and partial results.
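This distinction can be encoded in a small helper. The function below is a hypothetical sketch (not part of any client library) that classifies which layer timed out, given either a raised exception or a parsed response body:

```python
# Hypothetical helper: classify the timeout layer from either a raised
# client exception or a parsed Elasticsearch response body.
def classify_timeout(response=None, exception=None):
    """Return 'client', 'server-partial', or 'ok'."""
    if exception is not None:
        # Client-side: the exception fired before any response arrived.
        return "client"
    if response is not None and response.get("timed_out"):
        # Server-side: HTTP 200 with partial results from some shards.
        return "server-partial"
    return "ok"
```

A "server-partial" result means you already have data from the shards that finished; a "client" result means you have nothing and must retry or raise the client timeout.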


Step 1: Diagnose — Identify the Timeout Layer

Run these commands in order to narrow down the root cause.

1a. Check cluster health first

curl -s 'http://localhost:9200/_cluster/health?pretty'

A red or yellow status means shards are unassigned. Primary shard unavailability causes all writes and some reads to block until timeout. If status is red, fix this before anything else.
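The decision rule above can be sketched as a small triage function. The mapping of status to next step is this guide's recommendation, not an official API:

```python
# Illustrative triage helper: map the _cluster/health response to the
# next diagnostic step recommended in this guide.
def next_step(health):
    status = health.get("status")
    if status == "red":
        return "fix unassigned primary shards first (Fix C)"
    if status == "yellow":
        return "replicas unassigned; check _cluster/allocation/explain"
    return "cluster green; check thread pools and slow log next"
```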

1b. Inspect thread pool saturation

curl -s 'http://localhost:9200/_cat/thread_pool/search,write,bulk?v&h=node_name,name,active,queue,rejected,completed'

If queue is consistently > 0 or rejected is climbing, your nodes are overwhelmed. Requests are queuing past the client timeout window.
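If you are scripting this check, the `_cat` output with the column list requested above parses easily. A minimal sketch, assuming the `h=node_name,name,active,queue,rejected,completed` columns and the `?v` header row:

```python
# Parse _cat/thread_pool tabular output (with ?v header row) and flag
# pools whose queue depth exceeds a threshold.
def saturated_pools(cat_output, queue_threshold=0):
    """Return (node, pool) pairs whose queue exceeds the threshold."""
    lines = cat_output.strip().splitlines()
    header = lines[0].split()
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    return [
        (r["node_name"], r["name"])
        for r in rows
        if int(r["queue"]) > queue_threshold
    ]
```

Run it on the command's output on a schedule and alert when the same pool stays saturated across samples, since a single nonzero reading under burst load is normal.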

1c. Find slow queries via the slow log

Enable the slow log temporarily:

curl -X PUT 'http://localhost:9200/my-index/_settings' -H 'Content-Type: application/json' -d '{
  "index.search.slowlog.threshold.query.warn": "2s",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.level": "warn"
}'

Then tail the Elasticsearch log:

tail -f /var/log/elasticsearch/my-cluster_index_search_slowlog.log

1d. Check pending tasks

curl -s 'http://localhost:9200/_cluster/pending_tasks?pretty'

A large backlog of pending cluster tasks (e.g., shard assignment, mapping updates) can delay query execution.

1e. Check JVM heap pressure

curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty' | python3 -c "
import json, sys
nodes = json.load(sys.stdin)['nodes']
for nid, n in nodes.items():
    used_pct = n['jvm']['mem']['heap_used_percent']
    print(f\"{n['name']}: heap {used_pct}%\")
"

Heap usage consistently above 75% triggers frequent GC pauses, which directly cause timeouts.


Step 2: Fix — Apply the Right Remedy

Fix A: Adjust the Client-Side Timeout

This is the correct fix when your operation (bulk indexing, reindex, aggregation over large datasets) is legitimately long-running and valid.

Python:

from elasticsearch import Elasticsearch

es = Elasticsearch(
    ['http://localhost:9200'],
    request_timeout=120  # seconds
)

# Per-request override (elasticsearch-py 7.x style; in 8.x use
# es.options(request_timeout=60).search(...))
result = es.search(index='my-index', body=query, request_timeout=60)

Node.js (@elastic/elasticsearch):

const { Client } = require('@elastic/elasticsearch')
const client = new Client({
  node: 'http://localhost:9200',
  requestTimeout: 120000  // milliseconds
})

curl:

curl --max-time 120 -X GET 'http://localhost:9200/my-index/_search' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}'
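Raising the timeout pairs well with retries for transient timeouts (GC pauses, brief queue spikes). Below is an illustrative generic retry wrapper with exponential backoff; it is not part of any Elasticsearch client, and the injectable `sleep` parameter exists only to make it easy to test:

```python
import time

# Illustrative retry wrapper for transient timeouts: exponential
# backoff (1s, 2s, 4s, ...) capped at 30s between attempts.
def with_retries(fn, retries=3, base_delay=1.0, exceptions=(TimeoutError,),
                 sleep=time.sleep):
    for attempt in range(retries + 1):
        try:
            return fn()
        except exceptions:
            if attempt == retries:
                raise  # out of retries; surface the timeout
            sleep(min(base_delay * (2 ** attempt), 30))
```

Only retry idempotent operations this way; blindly retrying a bulk write that may have partially succeeded can create duplicates unless you use deterministic document IDs.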
Fix B: Set a Cluster-Wide Search Timeout

This prevents runaway queries from tying up resources. It returns partial results rather than blocking forever.

curl -X PUT 'http://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "persistent": {
    "search.default_search_timeout": "30s"
  }
}'

Or pass timeout per request at query time:

curl -X GET 'http://localhost:9200/my-index/_search?timeout=10s' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}'
Fix C: Fix Unassigned Shards (Red Cluster)
# Find unassigned shards and their reason
curl -s 'http://localhost:9200/_cluster/allocation/explain?pretty'

# Retry failed shard allocation
curl -X POST 'http://localhost:9200/_cluster/reroute?retry_failed=true'

# If a node was permanently removed and you accept data loss on that shard:
curl -X POST 'http://localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
  "commands": [{
    "allocate_stale_primary": {
      "index": "my-index",
      "shard": 0,
      "node": "node-1",
      "accept_data_loss": true
    }
  }]
}'
Fix D: Relieve Thread Pool Pressure

For sustained search thread pool saturation, tune the thread pool size (requires node restart):

# elasticsearch.yml
thread_pool:
  search:
    size: 13          # default: (vCPUs * 3) / 2 + 1
    queue_size: 1000  # default: 1000

Alternatively, reduce concurrent client connections or implement request rate limiting upstream (NGINX, API gateway).
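Before overriding the pool size, it helps to know what the default would be on your hardware. A one-line helper implementing the formula from the comment above (int((vCPUs * 3) / 2) + 1):

```python
# Default search thread pool size: int((allocated processors * 3) / 2) + 1
def default_search_pool_size(vcpus):
    return (vcpus * 3) // 2 + 1
```

If your override is below the computed default, you are likely making saturation worse, not better.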

Fix E: Optimize the Query

Replace slow wildcard / fuzzy / leading-wildcard patterns with match, prefix on keyword fields, or use ngram tokenizers:

# Bad — leading wildcard scans all terms:
{"query": {"wildcard": {"title": "*cloud*"}}}

# Good — use match on analyzed field or prefix on keyword:
{"query": {"match": {"title": "cloud"}}}

Use filter context for non-scoring clauses (enables caching):

{
  "query": {
    "bool": {
      "filter": [
        {"term": {"status": "active"}},
        {"range": {"created_at": {"gte": "now-7d"}}}
      ],
      "must": [
        {"match": {"description": "kubernetes"}}
      ]
    }
  }
}
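When building this query from application code, keeping the filter/must split in one place avoids accidentally moving cacheable clauses into scoring context. A hypothetical builder for the query above; the field names (status, created_at, description) are just the examples used in this guide:

```python
# Build the filter-context query shown above. Filter clauses are
# non-scoring and cacheable; only the full-text match is scored.
def build_query(status, days, text):
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"created_at": {"gte": f"now-{days}d"}}},
                ],
                "must": [{"match": {"description": text}}],
            }
        }
    }
```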
Fix F: Reduce Shard Count for Over-Sharded Indices

Over-sharding (too many small shards) causes excessive coordination overhead. The recommended shard size is 10–50 GB.

# Check current shard sizes
curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,docs,store,node' | sort -k6 -h

# Shrink an over-sharded index (must be read-only, all shards on one node)
curl -X PUT 'http://localhost:9200/my-index/_settings' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.routing.allocation.require._name": "node-1",
    "index.blocks.write": true
  }
}'

curl -X POST 'http://localhost:9200/my-index/_shrink/my-index-shrunk' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1,
    "index.routing.allocation.require._name": null
  }
}'
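As a rough sizing aid based on the 10–50 GB guidance above, you can compute a target shard count from the index's total size. This is a heuristic sketch, not an official formula; note also that the _shrink target count must be a factor of the source index's shard count:

```python
import math

# Heuristic: target shard count for an index, aiming for a given
# per-shard size within the 10-50 GB guidance (default 30 GB).
def recommended_shard_count(index_size_gb, target_shard_gb=30):
    return max(1, math.ceil(index_size_gb / target_shard_gb))
```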

Step 3: Verify the Fix

After applying your fix, confirm the improvement:

# Confirm cluster is green
curl -s 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=30s&pretty'

# Check response time of your query
time curl -s -X GET 'http://localhost:9200/my-index/_search' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}},"size":1}' | python3 -m json.tool | grep took

# Confirm timed_out is false
curl -s 'http://localhost:9200/my-index/_search?timeout=10s' -H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}' | python3 -c "import json,sys; r=json.load(sys.stdin); print('timed_out:', r.get('timed_out'))"

Complete Triage Script

The individual diagnostics above are collected into a single script you can run against your cluster:
#!/usr/bin/env bash
# elasticsearch-timeout-triage.sh
# Run against your cluster to collect timeout diagnostic data

ES_HOST="${ES_HOST:-http://localhost:9200}"
INDEX="${1:-*}"  # pass index name as arg or defaults to all

echo "=== [1] Cluster Health ==="
curl -s "${ES_HOST}/_cluster/health?pretty"

echo -e "\n=== [2] Thread Pool Saturation ==="
curl -s "${ES_HOST}/_cat/thread_pool/search,write,bulk?v&h=node_name,name,active,queue,rejected,completed"

echo -e "\n=== [3] Pending Cluster Tasks ==="
curl -s "${ES_HOST}/_cluster/pending_tasks?pretty" | python3 -c "
import json,sys
d=json.load(sys.stdin)
print(f'Pending tasks: {len(d.get(\"tasks\",[]))}')
for t in d.get('tasks',[])[:5]:
    print(' -', t.get('source','?'), '|', t.get('time_in_queue','?'))
"

echo -e "\n=== [4] JVM Heap by Node ==="
curl -s "${ES_HOST}/_nodes/stats/jvm?pretty" | python3 -c "
import json,sys
nodes=json.load(sys.stdin)['nodes']
for nid,n in nodes.items():
    heap=n['jvm']['mem']
    used=heap['heap_used_in_bytes']
    total=heap['heap_max_in_bytes']
    pct=heap['heap_used_percent']
    gc_count=sum(v['collection_count'] for v in n['jvm']['gc']['collectors'].values())
    print(f\"{n['name']}: heap={pct}% ({used//1024//1024}MB/{total//1024//1024}MB) gc_count={gc_count}\")
"

echo -e "\n=== [5] Unassigned Shards ==="
UNASSIGNED=$(curl -s "${ES_HOST}/_cat/shards?h=index,shard,prirep,state" | grep -c UNASSIGNED || true)
echo "Unassigned shard count: ${UNASSIGNED}"
if [ "${UNASSIGNED}" -gt 0 ]; then
    echo "  --> Run: curl -s '${ES_HOST}/_cluster/allocation/explain?pretty' for details"
fi

echo -e "\n=== [6] Large / Hot Shards ==="
curl -s "${ES_HOST}/_cat/shards/${INDEX}?v&h=index,shard,prirep,state,docs,store,node&s=store:desc" | head -20

echo -e "\n=== [7] Sample Query Latency ==="
SAMPLE_MS=$(curl -s -X GET "${ES_HOST}/${INDEX}/_search" \
  -H 'Content-Type: application/json' \
  -d '{"query":{"match_all":{}},"size":1,"timeout":"5s"}' \
  | python3 -c "import json,sys; r=json.load(sys.stdin); print(r.get('took','?'),'ms | timed_out:',r.get('timed_out'))")
echo "  Took: ${SAMPLE_MS}"

echo -e "\n=== [8] Circuit Breaker Status ==="
curl -s "${ES_HOST}/_nodes/stats/breaker?pretty" | python3 -c "
import json,sys
nodes=json.load(sys.stdin)['nodes']
for nid,n in nodes.items():
    print(n['name'])
    for name,cb in n['breakers'].items():
        print(f\"  {name}: {cb['overhead']}x overhead, limit={cb['limit_size_in_bytes']//1024//1024}MB, tripped={cb['tripped']}\")
"

echo -e "\n=== Triage Complete ==="

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps engineers and SREs with hands-on experience managing large-scale Elasticsearch deployments across cloud and on-premises environments. Our guides are based on real production incidents, official documentation, and community-verified solutions.
