ElasticDoctor - Elasticsearch Health Diagnostics

Performance Intelligence at Index Level

The Index Stats check is your performance microscope, analyzing 13 critical sub-metrics across query performance, indexing efficiency, cache utilization, and I/O operations to identify optimization opportunities.

While cluster health gives you the big picture, index statistics show where time is actually being spent. This check examines cache hit ratios, query latency, document processing speed, and I/O efficiency for every index and points to the places where tuning is most likely to pay off.

What You'll Learn

Performance Metrics

• Query and search performance analysis
• Cache efficiency and optimization
• Indexing throughput and bottlenecks
• I/O operations and disk efficiency

Optimization Insights

• Identifying slow-performing indices
• Cache tuning opportunities
• Query optimization recommendations
• Resource allocation improvements

Index Stats API Deep Dive

GET Request (All ES Versions, 5.x - 9.x)

GET /_stats

Simple English Explanation

Think of this API as getting a detailed performance report for each index in your cluster. It's like asking: "How fast are searches? How efficient is caching? Are there any bottlenecks?"

This data helps you understand which indices are performing well and which need optimization.

📊 Metric Categories

• Search Stats: Query performance and latency
• Indexing Stats: Document processing speed
• Cache Stats: Hit ratios and efficiency
• Store Stats: Storage and I/O metrics

⏱️ Key Performance Indicators

• Query latency: Average response times
• Cache hit ratios: Memory efficiency
• Indexing throughput: Documents per second
• I/O efficiency: Disk operations

13 Critical Performance Metrics

Search Performance Metrics

🔍 Query Performance

• query_total: Total queries executed
• query_time_in_millis: Total query time
• query_current: Currently executing queries
• avg_query_time: Average latency (derived: query_time_in_millis / query_total)

📄 Fetch Operations

• fetch_total: Document fetches
• fetch_time_in_millis: Fetch time
• fetch_current: Active fetches
• avg_fetch_time: Average fetch latency (derived: fetch_time_in_millis / fetch_total)

📊 Scroll Operations

• scroll_total: Scroll queries
• scroll_time_in_millis: Scroll time
• scroll_current: Active scrolls
• Monitor for long-running scrolls
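
The derived averages above are simple divisions of a cumulative time counter by its matching operation counter. A minimal Python sketch, assuming the `search` object has already been parsed from a `GET /_stats` response; the function names `average_ms` and `search_averages` are illustrative and not part of ElasticDoctor or Elasticsearch:

```python
def average_ms(total_time_ms: int, total_ops: int) -> float:
    """Mean time per operation in milliseconds; 0.0 when no operations ran."""
    if total_ops <= 0:
        return 0.0
    return total_time_ms / total_ops


def search_averages(search: dict) -> dict:
    """Derive average query, fetch, and scroll times from a `search` stats object."""
    return {
        "avg_query_time_ms": average_ms(
            search.get("query_time_in_millis", 0), search.get("query_total", 0)
        ),
        "avg_fetch_time_ms": average_ms(
            search.get("fetch_time_in_millis", 0), search.get("fetch_total", 0)
        ),
        "avg_scroll_time_ms": average_ms(
            search.get("scroll_time_in_millis", 0), search.get("scroll_total", 0)
        ),
    }


# Illustrative counters only; real values come from the stats response.
sample_search = {
    "query_total": 150234,
    "query_time_in_millis": 45671,
    "fetch_total": 89456,
    "fetch_time_in_millis": 12345,
}
print(search_averages(sample_search))
```

Because the counters are cumulative, these are lifetime averages. To see recent behaviour, apply the same division to the difference between two snapshots.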

Indexing Performance Metrics

📝 Document Operations

• index_total: Documents indexed
• index_time_in_millis: Indexing time
• index_current: Active indexing
• delete_total: Documents deleted

⚡ Throughput Analysis

• Calculate docs/second indexing rate
• Monitor indexing latency trends
• Identify indexing bottlenecks
• Track failed indexing operations
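
Since `index_total` and `index_time_in_millis` only ever grow, a docs/second rate has to come from two snapshots taken some interval apart. A rough sketch under that assumption; the helper names and the choice to return `None` on a counter reset are illustrative:

```python
from typing import Optional


def indexing_rate(prev_total: int, curr_total: int, interval_seconds: float) -> Optional[float]:
    """Documents indexed per second between two snapshots of index_total."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        # Counter went backwards, e.g. after a node restart; skip this sample.
        return None
    return delta / interval_seconds


def indexing_latency_ms(prev: dict, curr: dict) -> Optional[float]:
    """Average milliseconds per indexed document between two `indexing` snapshots."""
    docs = curr["index_total"] - prev["index_total"]
    spent = curr["index_time_in_millis"] - prev["index_time_in_millis"]
    if docs <= 0 or spent < 0:
        return None
    return spent / docs


# Two hypothetical snapshots taken 60 seconds apart.
before = {"index_total": 2_345_678, "index_time_in_millis": 234_567}
after = {"index_total": 2_351_678, "index_time_in_millis": 235_167}
print(indexing_rate(before["index_total"], after["index_total"], 60))
print(indexing_latency_ms(before, after))
```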

Cache Efficiency Metrics

🧠 Query Cache

• query_cache.hit_count: Cache hits
• query_cache.miss_count: Cache misses
• query_cache.memory_size_in_bytes: Memory usage
• hit_ratio: Cache efficiency (%), derived as hit_count / (hit_count + miss_count)

💾 Request Cache

• request_cache.hit_count: Request hits
• request_cache.miss_count: Request misses
• request_cache.evictions: Cache evictions
• Monitor cache pressure indicators
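
The hit ratio is not returned by the API; it is computed from the hit and miss counters. A small sketch of that arithmetic, with the function name `hit_ratio_percent` chosen here for illustration:

```python
from typing import Optional


def hit_ratio_percent(hit_count: int, miss_count: int) -> Optional[float]:
    """Cache hit ratio as a percentage, or None when the cache has seen no lookups."""
    total = hit_count + miss_count
    if total == 0:
        # No lookups yet: a ratio would be meaningless, so report nothing.
        return None
    return 100.0 * hit_count / total


# Works the same for query cache and request cache counters.
print(hit_ratio_percent(80, 20))
print(hit_ratio_percent(0, 0))
```

Returning `None` for an unused cache keeps an idle index from being reported as having a 0% hit ratio.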

Performance Analysis & Optimization

Query Performance Optimization

🚨 Performance Warning Signs

• Average query time > 1000ms
• High number of concurrent queries
• Low cache hit ratios (<80%)
• Frequent cache evictions

✅ Optimization Actions

• Optimize slow queries and filters
• Increase cache sizes if memory allows
• Implement query result caching
• Review index mapping and settings

ElasticDoctor Analysis

🔍 How ElasticDoctor Analyzes Index Performance

Performance Benchmarking

ElasticDoctor compares your index performance metrics against industry benchmarks and best practices to identify optimization opportunities.

Cache Efficiency Analysis

Analyzes cache hit ratios and memory utilization patterns to recommend optimal cache sizing and configuration adjustments.

Bottleneck Detection

Identifies performance bottlenecks in query execution, indexing operations, and I/O patterns with specific recommendations for improvement.

Trend Analysis

Tracks performance trends over time to predict capacity needs and identify gradual performance degradation before it impacts users.

Performance Monitoring Best Practices

✅ Performance Excellence

• Monitor query latency continuously
• Maintain cache hit ratios >80%
• Track indexing throughput trends
• Set alerts for performance degradation
• Regular performance baseline reviews

💡 Optimization Tips

• Use filter context instead of query context when relevance scoring is not needed
• Implement proper index warming strategies
• Optimize mapping for your use case
• Monitor and tune garbage collection
• Use appropriate refresh intervals

❌ Performance Killers

• Ignoring slow query patterns
• Undersized cache configurations
• Not monitoring cache evictions
• Allowing unlimited scroll operations
• Missing performance baseline data

⚠️ Warning Thresholds

• Query latency: >100ms warning, >1s critical
• Cache hit ratio: <80% warning, <60% critical
• Indexing rate: Monitor for sudden drops
• Current operations: Alert on high counts
• Memory usage: Track cache memory growth
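
The two numeric thresholds above translate directly into a small classifier. This is a sketch that assumes the listed cut-offs are used as-is; the function names and the "ok" / "warning" / "critical" labels are illustrative and do not describe how ElasticDoctor implements its checks:

```python
def classify_latency(avg_query_ms: float) -> str:
    """Map average query latency to a severity: >1s critical, >100ms warning."""
    if avg_query_ms > 1000:
        return "critical"
    if avg_query_ms > 100:
        return "warning"
    return "ok"


def classify_hit_ratio(hit_ratio_pct: float) -> str:
    """Map a cache hit ratio (percent) to a severity: <60 critical, <80 warning."""
    if hit_ratio_pct < 60:
        return "critical"
    if hit_ratio_pct < 80:
        return "warning"
    return "ok"


for latency in (50, 150, 1500):
    print(latency, "ms ->", classify_latency(latency))
for ratio in (85, 70, 55):
    print(ratio, "% ->", classify_hit_ratio(ratio))
```

Treat the cut-offs as starting points: a reporting index with heavy aggregations may sit above 100ms without anything being wrong.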

Index Stats API Examples

Basic Index Stats Retrieval

# Get stats for all indices
GET /_stats

# Get stats for specific indices
GET /logs-*/_stats

# Get specific metric groups
GET /_stats/search,indexing,query_cache,request_cache

# Get stats with human-readable sizes
GET /_stats?human=true
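
When these requests are generated from a script, the path can be assembled from an optional index list and an optional metric list. A minimal sketch; `build_stats_path` and `KNOWN_METRICS` are hypothetical names, and the metric set is a subset of what the index stats API accepts, not the full list:

```python
from typing import Iterable, Optional

# Subset of metric groups accepted by the index stats API (assumed, not exhaustive).
KNOWN_METRICS = frozenset({
    "_all", "docs", "store", "indexing", "get", "search", "merge",
    "refresh", "flush", "warmer", "query_cache", "fielddata",
    "completion", "segments", "translog", "request_cache",
})


def build_stats_path(
    indices: Optional[Iterable[str]] = None,
    metrics: Optional[Iterable[str]] = None,
) -> str:
    """Build the request path for an index stats call."""
    path = ""
    index_list = list(indices or [])
    metric_list = list(metrics or [])
    if index_list:
        path += "/" + ",".join(index_list)
    path += "/_stats"
    if metric_list:
        unknown = sorted(set(metric_list) - KNOWN_METRICS)
        if unknown:
            raise ValueError(f"unrecognized metric groups: {unknown}")
        path += "/" + ",".join(metric_list)
    return path


print(build_stats_path())
print(build_stats_path(["logs-*"], ["search", "indexing"]))
```

Rejecting unrecognized metric names locally gives a clearer error than waiting for the cluster to refuse the request.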

Performance Metrics Analysis

# Example response structure (trimmed; values are illustrative)
{
  "indices": {
    "logs-2024.12.15": {
      "primaries": {
        "search": {
          "query_total": 150234,
          "query_time_in_millis": 45671,
          "query_current": 3,
          "fetch_total": 89456,
          "fetch_time_in_millis": 12345
        },
        "indexing": {
          "index_total": 2345678,
          "index_time_in_millis": 234567,
          "index_current": 12,
          "delete_total": 1234
        },
        "query_cache": {
          "memory_size_in_bytes": 234567890,
          "hit_count": 45678,
          "miss_count": 1234,
          "cache_size": 234,
          "cache_count": 5678
        }
      }
    }
  }
}
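
A response shaped like the one above can be reduced to a few derived numbers per index. The following Python sketch assumes the JSON has already been fetched and uses a trimmed copy of the sample values; `summarize` and the output key names are illustrative:

```python
import json

# Trimmed copy of the example response; real data would come from GET /_stats.
SAMPLE = json.loads("""
{
  "indices": {
    "logs-2024.12.15": {
      "primaries": {
        "search": {"query_total": 150234, "query_time_in_millis": 45671},
        "indexing": {"index_total": 2345678, "index_time_in_millis": 234567},
        "query_cache": {"hit_count": 45678, "miss_count": 1234}
      }
    }
  }
}
""")


def _ratio(numerator, denominator):
    """Divide safely, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(stats: dict) -> dict:
    """Return derived per-index metrics from the `primaries` section of a stats response."""
    summary = {}
    for name, body in stats.get("indices", {}).items():
        primaries = body.get("primaries", {})
        search = primaries.get("search", {})
        indexing = primaries.get("indexing", {})
        cache = primaries.get("query_cache", {})
        hits = cache.get("hit_count", 0)
        misses = cache.get("miss_count", 0)
        hit_fraction = _ratio(hits, hits + misses)
        summary[name] = {
            "avg_query_ms": _ratio(
                search.get("query_time_in_millis", 0), search.get("query_total", 0)
            ),
            "avg_index_ms": _ratio(
                indexing.get("index_time_in_millis", 0), indexing.get("index_total", 0)
            ),
            "query_cache_hit_pct": None if hit_fraction is None else 100.0 * hit_fraction,
        }
    return summary


for index_name, metrics in summarize(SAMPLE).items():
    print(index_name, metrics)
```

Reading from `primaries` avoids counting replica activity twice; switch to `total` if replica search load matters for the analysis.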

Cache Optimization Commands

# Clear query cache for performance testing
POST /logs-*/_cache/clear?query=true

# Clear request cache
POST /logs-*/_cache/clear?request=true

# Clear all caches
POST /logs-*/_cache/clear

# Warm up cache with common queries
GET /logs-*/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0
}

Performance Troubleshooting Guide

🚨 High Query Latency

Average query time consistently exceeds 1000ms, indicating performance bottlenecks.

Diagnostic Steps:

1. Analyze the search slow logs for problematic query patterns
2. Check index mapping efficiency and field types
3. Review query structure and filter usage
4. Monitor shard size and distribution
5. Evaluate hardware resources (CPU, memory, disk)

⚠️ Low Cache Hit Ratio

Query cache hit ratio below 80% suggests inefficient cache usage or configuration.

Optimization Actions:

• Increase query cache size if memory permits
• Analyze query patterns for cacheable operations
• Implement proper filter contexts for caching
• Review cache eviction patterns and frequency
• Consider index warming strategies for common queries

ℹ️ Indexing Performance Issues

Slow indexing throughput affecting real-time data ingestion performance.

Performance Tuning:

• Optimize bulk request sizes (5-15MB per request)
• Adjust refresh interval for better throughput
• Review mapping complexity and analyzer usage
• Monitor merge operations and I/O patterns
• Consider using multiple indexing threads
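
The bulk sizing advice above can be applied by grouping documents into newline-delimited payloads that stay under a byte budget. A minimal sketch, assuming plain `index` actions without explicit IDs; the function name `bulk_payloads` and the 10MB default are illustrative choices within the 5-15MB range mentioned above:

```python
import json
from typing import Iterable, Iterator


def bulk_payloads(index: str, docs: Iterable[dict], max_bytes: int = 10 * 1024 * 1024) -> Iterator[str]:
    """Yield NDJSON bulk bodies, each kept at or under max_bytes where possible."""
    action = json.dumps({"index": {"_index": index}})
    buffer, size = [], 0
    for doc in docs:
        # Each entry is an action line plus a source line, both newline-terminated.
        entry = action + "\n" + json.dumps(doc) + "\n"
        entry_bytes = len(entry.encode("utf-8"))
        if buffer and size + entry_bytes > max_bytes:
            yield "".join(buffer)
            buffer, size = [], 0
        # A single oversized document still gets sent, alone in its own payload.
        buffer.append(entry)
        size += entry_bytes
    if buffer:
        yield "".join(buffer)


# Tiny budget so the batching is visible with small documents.
documents = [{"n": i} for i in range(10)]
for body in bulk_payloads("logs", documents, max_bytes=100):
    print(len(body.encode("utf-8")), "bytes,", body.count("\n") // 2, "docs")
```

Each yielded string can be sent as the body of a `POST /_bulk` request. Measure throughput at a few budget sizes, since the best value depends on document shape and cluster hardware.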

Index Performance Mastery

Critical Insights

• Query Performance: Monitor latency and optimization opportunities
• Cache Efficiency: Maximize memory utilization for speed
• Indexing Throughput: Ensure optimal document processing
• Trend Analysis: Predict and prevent performance issues

Action Plan

• Implement continuous performance monitoring
• Set up automated alerting for key metrics
• Create performance optimization procedures
• Establish regular performance review cycles


Index Stats Check: 13 Critical Performance Metrics for Elasticsearch Optimization