12 Performance Tuning Tips for Elasticsearch

Optimize Elasticsearch with Hyperflex’s proven tips for faster queries, better indexing, and scalability in any industry.

Introduction

Whether you're handling logs, search, metrics, or security data, Elasticsearch is powerful—but only when tuned right. At Hyperflex, we work across industries helping engineers build fast, stable, and scalable Elastic clusters. This guide distills our most effective, field-tested performance tips. Each one is backed by real-world consulting experience across banking, healthcare, insurance, and retail.

Use these tips to:

  • Speed up queries
  • Reduce heap pressure
  • Improve indexing throughput
  • Avoid scaling bottlenecks

1. Size Shards Based on Data

Don't just accept the default of one primary shard per index. Right-sizing your shards can drastically improve performance.

Hyperflex Insight: We helped a healthcare provider reduce query latency by 65% just by moving from 6 shards to 2 for 30GB indices. The result: faster access to patient records during emergency care.

Guideline:

  • Keep each shard between 10 and 50 GB
  • Use the _shrink or _reindex API to consolidate indices that have too many small shards
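As a rough illustration of the sizing guideline, here is a minimal Python sketch that turns an index size into a primary shard count. The function name and the 30 GB default target are our assumptions for the example (any value inside the 10–50 GB range works); treat the result as a starting point to benchmark, not a rule.

```python
import math


def recommended_primary_shards(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return how many primary shards keep each shard at or below target_shard_gb."""
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("index_size_gb must be >= 0 and target_shard_gb must be > 0")
    # Always keep at least one primary shard, even for tiny indices.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


# A 30 GB index with a 15 GB per-shard target.
print(recommended_primary_shards(30, target_shard_gb=15))
# A 400 GB index with the default (assumed) 30 GB target.
print(recommended_primary_shards(400))
```

The first call mirrors the healthcare case above: a 30 GB index does not need six shards.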

2. Reduce Segment Count with Force Merge

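Too many segments? Every search has to visit each segment in a shard, so a high segment count adds overhead to every query, and background merging competes with searches for CPU and I/O.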

Retail Example: A clothing e-commerce site saw a 40% drop in CPU load on its search nodes by running _forcemerge after major inventory uploads.

Use:

POST /my-index/_forcemerge?max_num_segments=1

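Run it after bulk ingestion or reindexing, once the index is no longer receiving writes. Merging down to a single segment is best reserved for indices that have become read-only.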

3. Increase Refresh Interval for Heavy Write Loads

The default refresh_interval is 1s. Each refresh writes a new segment, so that’s too frequent for bulk indexing.

Insurance Example: One insurance company increased ingestion speed by 3x during policy batch imports by temporarily setting refresh_interval to 60s.

PUT /my-index/_settings
{
  "refresh_interval": "60s"
}

Reset it after the job completes. Setting refresh_interval to null restores the default.
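Forgetting the reset is the usual failure mode here, so it can help to wrap the job. The sketch below is one way to do that in Python; put_settings is a hypothetical callable standing in for whatever client or HTTP call you use, and the example only prints the requests instead of sending them.

```python
import json
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def relaxed_refresh(put_settings: Callable[[str, dict], None], index: str,
                    interval: str = "60s") -> Iterator[None]:
    """Raise refresh_interval for the duration of a bulk job, then restore the default."""
    put_settings(index, {"index": {"refresh_interval": interval}})
    try:
        yield
    finally:
        # null (None in Python) resets the setting to its default, even if the job failed.
        put_settings(index, {"index": {"refresh_interval": None}})


def print_settings_request(index: str, body: dict) -> None:
    """Stand-in transport (an assumption for this sketch): print instead of sending."""
    print(f"PUT /{index}/_settings")
    print(json.dumps(body))


with relaxed_refresh(print_settings_request, "my-index"):
    print("... run the bulk import here ...")
```

Because the reset sits in a finally block, the index goes back to its default refresh behavior even when the import raises an error.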

4. Use Doc Values and Disable Norms When Possible

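  • Doc values = faster sorting and aggregations; they are enabled by default on most field types other than text
  • Norms = needed only for relevance scoring; disable them on text fields you filter on but never score (keyword fields already have norms off by default)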

Banking Example: A major bank disabled norms on all transaction fields and improved dashboard response time by 30%.

"title": {
  "type": "text",
  "norms": false
}

5. Filter First, Score Later

Use filter context for clauses that don’t affect scoring (e.g., date, status).

Retail Use Case: In a large product catalog, filtering by category and availability before matching improved search responsiveness during flash sales.
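To make the retail case concrete, here is a minimal sketch of such a query built in Python. The field names (title, category, in_stock) are hypothetical and not taken from any real catalog; only the full-text clause sits in must, while the yes/no conditions go in filter.

```python
import json


def build_product_query(text: str, category: str, in_stock: bool) -> dict:
    """Score only the full-text clause; category and availability are plain filters."""
    return {
        "query": {
            "bool": {
                # Scored: contributes to _score and ranking.
                "must": [{"match": {"title": text}}],
                # Not scored: yes/no conditions that only narrow the result set.
                "filter": [
                    {"term": {"category": category}},
                    {"term": {"in_stock": in_stock}},
                ],
            }
        }
    }


print(json.dumps(build_product_query("linen shirt", "menswear", True), indent=2))
```

The printed JSON can be sent as the body of a _search request against the catalog index.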

6. Cache Smartly with Filter Context

Elasticsearch can cache the results of frequently used filter clauses. Put repeated, non-scoring conditions in filter context so they can be served from that cache.

Healthcare Example: A diagnostics platform reduced recurring query times for lab results by caching test type and patient group filters.

Don’t:

"must": [
  { "term": { "status": "error" } }
]

Do:

"filter": [
  { "term": { "status": "error" } }
]

7. Use Bulk API for High Write Throughput

Indexing one document at a time is inefficient.

Insurance Example: Processing claims required indexing 5 million documents daily. Switching to bulk reduced ingestion time from 6 hours to 45 minutes.

POST /_bulk
{ "index": { "_index": "logs" } }
{ "message": "hello" }
{ "index": { "_index": "logs" } }
{ "message": "world" }

Batch 500 to 1,000 docs per request as a starting point, then benchmark. The best batch size depends on document size.
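The batching itself is simple to get wrong (a bulk body must be newline-delimited JSON and end with a newline), so here is a minimal Python sketch that splits a stream of documents into request bodies. The function name and the default batch size are our assumptions; sending each body to POST /_bulk is left to your client of choice.

```python
import json
from typing import Iterable, Iterator


def bulk_payloads(index: str, docs: Iterable[dict], batch_size: int = 500) -> Iterator[str]:
    """Yield NDJSON bodies for POST /_bulk, each holding at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    lines = []
    count = 0
    for doc in docs:
        # Each document is two lines: the action metadata, then the source.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        # Flush the final, possibly smaller, batch.
        yield "\n".join(lines) + "\n"


docs = ({"message": f"event {i}"} for i in range(1200))
for body in bulk_payloads("logs", docs, batch_size=500):
    print(len(body.splitlines()) // 2, "documents in this request")
```

Because the function is a generator, it never holds more than one batch in memory, which matters when the source is millions of documents.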

8. Monitor Heap and GC Overhead

Use the _nodes/stats/jvm API to track memory pressure.

Guideline:

  • Investigate when heap usage stays above roughly 75%
  • Keep the heap size below the compressed object pointers threshold (typically around 26–30 GB, depending on the JVM)
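A small script can turn the stats response into an alert list. The sketch below assumes the response has already been fetched and parsed into a dict; it reads the heap_used_percent value reported under each node's jvm.mem section. The sample data is made up, and the 75% default is an assumption, so pass whatever ceiling you follow.

```python
def nodes_over_heap_threshold(stats: dict, threshold_percent: int = 75) -> list:
    """Return (node_name, heap_used_percent) pairs above the threshold, worst first."""
    flagged = []
    for node_id, node in stats.get("nodes", {}).items():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used > threshold_percent:
            # Fall back to the node ID if the name is missing.
            flagged.append((node.get("name", node_id), used))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Trimmed, made-up sample in the shape returned by GET _nodes/stats/jvm.
sample_stats = {
    "nodes": {
        "a1": {"name": "data-1", "jvm": {"mem": {"heap_used_percent": 82}}},
        "b2": {"name": "data-2", "jvm": {"mem": {"heap_used_percent": 41}}},
        "c3": {"name": "ingest-1", "jvm": {"mem": {"heap_used_percent": 91}}},
    }
}
print(nodes_over_heap_threshold(sample_stats))
```

A single high reading is normal just before a garbage collection; it is the nodes that stay on this list across several polls that deserve attention.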

Retail Example: A marketplace scaled out ingest nodes after noticing excessive GC during peak sale days.

9. Avoid Wildcards and Leading Regex

These kill performance:

{ "wildcard": { "name": "*john*" } }

Use edge_ngram, prefix, or search_as_you_type fields instead. They avoid the scan across the term dictionary that a leading wildcard forces.

Banking Example: Replacing wildcard queries with search_as_you_type reduced autocomplete lag on customer portals.
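For reference, a search_as_you_type setup looks roughly like the sketch below, written in Python so the bodies can be generated and checked. The customers index name is hypothetical; the name field follows the wildcard example above. Note that this matches words that start with the typed text, which covers autocomplete but is not a drop-in replacement for a true mid-word match like *john*.

```python
import json

# Mapping for a hypothetical "customers" index.
mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "search_as_you_type"}
        }
    }
}


def autocomplete_query(prefix_text: str, field: str = "name") -> dict:
    """Build a bool_prefix multi_match over the field and its shingle subfields."""
    return {
        "query": {
            "multi_match": {
                "query": prefix_text,
                "type": "bool_prefix",
                # The ._2gram and ._3gram subfields are created by the field type itself.
                "fields": [field, f"{field}._2gram", f"{field}._3gram"],
            }
        }
    }


print("PUT /customers")
print(json.dumps(mapping, indent=2))
print("GET /customers/_search")
print(json.dumps(autocomplete_query("joh"), indent=2))
```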

10. Profile and Benchmark Everything

Use:

GET /my-index/_search
{ "profile": true, "query": { "match_all": {} } }

Identify slow steps in your query pipeline.
Also use Rally (https://esrally.readthedocs.io) to simulate performance scenarios.
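Profile output is verbose, so it helps to rank the query components by time. The sketch below assumes a parsed search response that was requested with profiling enabled and walks the query tree under profile.shards[].searches[].query[]. The sample response is made up and heavily trimmed; real output also carries per-step breakdowns and collector timings.

```python
def slowest_query_nodes(response: dict, top: int = 5) -> list:
    """Flatten the profile tree and return (time_in_nanos, type, description), slowest first."""
    found = []

    def walk(node: dict) -> None:
        found.append((node["time_in_nanos"], node["type"], node["description"]))
        for child in node.get("children", []):
            walk(child)

    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for root in search.get("query", []):
                walk(root)
    return sorted(found, key=lambda item: item[0], reverse=True)[:top]


# Made-up, trimmed sample of a profiled search response.
sample_response = {
    "profile": {
        "shards": [{
            "id": "[node-1][my-index][0]",
            "searches": [{
                "query": [{
                    "type": "BooleanQuery",
                    "description": "+message:hello #status:error",
                    "time_in_nanos": 900000,
                    "children": [
                        {"type": "TermQuery", "description": "message:hello",
                         "time_in_nanos": 600000},
                        {"type": "TermQuery", "description": "status:error",
                         "time_in_nanos": 200000},
                    ],
                }]
            }],
        }]
    }
}
for nanos, kind, description in slowest_query_nodes(sample_response):
    print(f"{nanos / 1_000_000:.2f} ms  {kind}  {description}")
```

A parent entry's time includes its children, so read the list top-down and look for the child that accounts for most of its parent.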

Hyperflex Tip: We built custom Rally tracks for a retail analytics client to simulate 10K QPS and test reindexing strategies before launching in production.

11. Tune JVM Heap Memory

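Elasticsearch performance depends heavily on JVM heap sizing. Set the heap to no more than about 50% of system RAM, so the rest stays available for the filesystem cache, and keep it below the compressed object pointers threshold (typically around 26–30 GB, depending on the JVM). Set the minimum and maximum heap to the same value: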

ES_JAVA_OPTS="-Xms16g -Xmx16g"
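The sizing rule is easy to express as arithmetic. Here is a minimal Python sketch; the function names are ours, and the 30 GB cap is an assumption standing in for the compressed object pointers limit, so check the actual threshold for your JVM before relying on it.

```python
def recommended_heap_gb(system_ram_gb: int, cap_gb: int = 30) -> int:
    """Half of system RAM, capped to stay under the (assumed) compressed-oops limit."""
    if system_ram_gb < 2:
        raise ValueError("system_ram_gb must be at least 2")
    return min(system_ram_gb // 2, cap_gb)


def es_java_opts(heap_gb: int) -> str:
    """Render ES_JAVA_OPTS with minimum and maximum heap set to the same value."""
    return f'ES_JAVA_OPTS="-Xms{heap_gb}g -Xmx{heap_gb}g"'


for ram in (32, 64, 128):
    print(ram, "GB RAM ->", es_java_opts(recommended_heap_gb(ram)))
```

Past 64 GB of RAM the cap takes over, which is why adding memory to a node beyond that point benefits the filesystem cache rather than the heap.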

Tip: Avoid swapping. Set bootstrap.memory_lock: true (the replacement for the older bootstrap.mlockall setting) to lock the process memory in RAM. Another option on Linux is to set the sysctl value vm.swappiness to 1. This reduces the kernel’s tendency to swap and should not lead to swapping under normal circumstances, while still allowing the whole system to swap in emergency conditions.

Monitor with:

GET _nodes/stats/jvm

Hyperflex Example: An insurance company reduced GC pauses by optimizing heap settings on ingestion-heavy nodes, improving SLA compliance.

12. Use Index Templates and ILM

Ensure consistent index management with templates and ILM policies.

Template Setup:

PUT _index_template/logs_template

ILM Example:

PUT _ilm/policy/hot-warm-delete-policy
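The two requests above need bodies. As a hedged illustration, the Python sketch below builds one plausible pair and prints them as JSON. The policy and template names come from the requests above; the logs-* pattern, the logs rollover alias, and every threshold (50gb, 7d, 30d) are assumptions to adapt, not recommendations. Rollover by alias also expects an initial index that carries the alias as its write index, which is not shown here.

```python
import json

POLICY_NAME = "hot-warm-delete-policy"  # name taken from the ILM request above

ilm_policy = {
    "policy": {
        "phases": {
            # Hot: roll over to a new index once it is big or old enough.
            "hot": {
                "actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}
            },
            # Warm: the index is no longer written to, so merging is safe.
            "warm": {
                "min_age": "7d",
                "actions": {"forcemerge": {"max_num_segments": 1}},
            },
            # Delete: drop the index after the retention window.
            "delete": {"min_age": "30d", "actions": {"delete": {}}},
        }
    }
}

index_template = {
    "index_patterns": ["logs-*"],
    "template": {
        "settings": {
            "index.lifecycle.name": POLICY_NAME,
            "index.lifecycle.rollover_alias": "logs",
        }
    },
}

print(f"PUT _ilm/policy/{POLICY_NAME}")
print(json.dumps(ilm_policy, indent=2))
print("PUT _index_template/logs_template")
print(json.dumps(index_template, indent=2))
```

Create the policy before the template so that new indices matching the pattern pick it up from their first write.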

Retail Use Case: ILM automation helped a fashion retailer manage log retention during peak sales, reducing manual ops overhead.

Final Takeaway

Tuning Elasticsearch isn't about flipping magic switches. It’s about understanding trade-offs, testing changes, and optimizing for your use case.

Hyperflex helps engineers across banking, healthcare, insurance, and retail optimize and scale Elasticsearch clusters with confidence.

Need help? We offer Elasticsearch Consulting Services tailored to your performance goals. Contact us at marketing@hyperflex.co.