API latency has a direct impact on user experience and operational efficiency. To tackle latency issues in our LeadX API, we recently implemented Hybrid Caching, a combination of in-memory and Redis caching, and improved response times by up to 17% (at the 95th percentile).
Why Hybrid Caching?
We chose a hybrid approach because it combines the speed of in-memory caching with the resilience of Redis caching. This strategy enables us to serve configuration data efficiently, even after API redeployments, by leveraging two levels of cache:
- In-memory cache (L1): Delivers the fastest response times, storing data for quick access in the same instance.
- Redis cache (L2): Provides persistence across deployments, retaining data even after restarts and repopulating the L1 cache for future requests.
By using both layers, we balance speed and persistence, significantly reducing the need to hit the database for frequently requested configurations, thereby lowering latency.
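To make this concrete, here is a minimal sketch of such a two-level cache, written in Python against the redis-py client. It is illustrative only: the class name, key handling, and TTL are assumptions rather than our production code.

```python
import json

import redis  # redis-py client


class HybridCache:
    """Two-level cache: an in-process dict (L1) backed by Redis (L2)."""

    def __init__(self, redis_client: redis.Redis, ttl_seconds: int = 300):
        self._l1 = {}              # L1: in-memory, fastest, lost on redeploy
        self._l2 = redis_client    # L2: Redis, survives restarts/redeployments
        self._ttl = ttl_seconds    # illustrative TTL for L2 entries

    def get(self, key: str):
        # 1. Check the in-memory cache first (fastest path).
        if key in self._l1:
            return self._l1[key]
        # 2. Fall back to Redis; on a hit, repopulate L1 for future requests.
        raw = self._l2.get(key)
        if raw is not None:
            value = json.loads(raw)
            self._l1[key] = value
            return value
        # 3. Miss in both layers: the caller falls back to the database.
        return None

    def set(self, key: str, value) -> None:
        # Write through to both layers.
        self._l1[key] = value
        self._l2.set(key, json.dumps(value), ex=self._ttl)
```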
Implementation of Hybrid Caching
Previously, our API queried the database each time it needed to retrieve organizational configurations. These configurations include feature flags (e.g., Conversation Enabled, Bot Enabled) and settings such as time zones and user assignment criteria. Each retrieval added overhead, slowing down response times.
With Hybrid Caching in place, the API now checks for these configurations in the in-memory cache first, then in Redis if needed. This setup significantly reduces database calls, resulting in faster and more efficient data retrieval.
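Building on the sketch above, the read path for an organization's configuration looks roughly like this; get_org_config and fetch_org_config_from_db are hypothetical names used only to illustrate the flow:

```python
def get_org_config(org_id: str, cache: HybridCache) -> dict:
    key = f"org-config:{org_id}"

    config = cache.get(key)        # checks L1, then L2
    if config is not None:
        return config              # cache hit: no database round trip

    # Cache miss in both layers: query the database once, then populate
    # both cache layers so subsequent requests stay fast.
    config = fetch_org_config_from_db(org_id)  # hypothetical data-access call
    cache.set(key, config)
    return config
```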
Key Benefits of Hybrid Caching:
- Reduced Latency: Enhanced API response times by up to 17%.
- Lower Database Load: Fewer direct database hits, even during high traffic.
- Resilience and Scalability: Redis cache ensures data persistence across deployments.
Performance Benchmark Insights
To measure the impact, we benchmarked both non-cached and cached API performance:
Non-Cache Performance (50 Requests, Concurrency = 1)
- Average time per request: 1585.713 ms
- Response Time Percentiles:
  - 50%: 1548 ms
  - 90%: 1680 ms
  - 95%: 1751 ms
  - 100% (longest): 2894 ms
Hybrid Cache Performance (50 Requests, Concurrency = 1)
- Average time per request: 1386.210 ms
- Response Time Percentiles:
  - 50%: 1379 ms
  - 90%: 1440 ms
  - 95%: 1445 ms
  - 100% (longest): 1574 ms
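For context, percentile figures like these can be gathered with a simple sequential benchmark (50 requests, concurrency 1). The sketch below is a rough approximation of that setup; the endpoint URL is a placeholder, not our real API:

```python
import statistics
import time

import requests

URL = "https://example.com/api/org-config"  # placeholder endpoint
N = 50  # number of sequential requests (concurrency = 1)

latencies_ms = []
for _ in range(N):
    start = time.perf_counter()
    requests.get(URL, timeout=10)
    latencies_ms.append((time.perf_counter() - start) * 1000)

latencies_ms.sort()
print(f"Average time per request: {statistics.mean(latencies_ms):.3f} ms")
for pct in (50, 90, 95, 100):
    idx = min(N - 1, round(pct / 100 * N) - 1)
    print(f"{pct}%: {latencies_ms[idx]:.0f} ms")
```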
Improvement Analysis
- 50th Percentile: Improved from 1548 ms to 1379 ms (10.91% gain)
- 90th Percentile: Improved from 1680 ms to 1440 ms (14.29% gain)
- 95th Percentile: Improved from 1751 ms to 1445 ms (17.43% gain)
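Each gain is computed as (non-cached − cached) / non-cached × 100; for example, at the 50th percentile, (1548 − 1379) / 1548 ≈ 10.9%.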
The shift to Hybrid Caching demonstrated clear and consistent performance gains, reducing latency across all percentiles.
Note: The current latency reduction applies only to application and organization configurations. Other frequently used entities do not yet use Hybrid Caching; they will be included in future optimizations.
Key Takeaways
- Hybrid Caching: This combination of in-memory and Redis caching enabled us to decrease latency by minimizing database access.
- Up to 17% Performance Boost: Response times improved measurably across all measured percentiles, with the largest gain at the 95th percentile.
- Increased System Resilience: By reducing database load, we've made our API more resilient to traffic spikes, ensuring better performance during high-demand periods.
Closing Thoughts
By implementing Hybrid Caching, we reduced API latency by up to 17%. This optimization not only improves the LeadX user experience but also enhances the system's resilience and scalability.