Cut Cloud Costs with Kubernetes: Flipkart’s Hybrid Approach
As engineering-driven SaaS and AI companies scale, managing cloud costs without compromising performance becomes a critical challenge. Flipkart, one of India’s largest e-commerce platforms, provides a compelling case study in solving this issue. Flipkart transitioned to a hybrid cloud strategy powered by Kubernetes to optimize costs, improve flexibility, and ensure reliability at scale.
This article dissects Flipkart’s hybrid cloud journey, offering actionable insights for scaling companies seeking to control their cloud infrastructure expenses while maintaining high availability and performance.
The Challenge: Managing Infrastructure Costs at Scale
Flipkart’s infrastructure requirements fluctuate significantly depending on the time of year. During its flagship sale event, Big Billion Days (BBD), demand for compute resources surges far beyond business-as-usual (BAU) levels. The traditional approach of buying dedicated on-prem hardware sized for peak load led to significant waste, because much of that hardware sat underutilized during BAU periods.
Additionally, relying exclusively on public cloud services presented challenges in cost predictability and potential vendor lock-in. Flipkart needed an approach that combined cost efficiency, reliability, scalability, and flexibility – and they found their solution in a hybrid cloud infrastructure.
Flipkart’s Hybrid Cloud Approach

What Is a Hybrid Cloud?
A hybrid cloud combines on-premises data centers with public cloud services, allowing workloads to shift between the two environments based on demand. Flipkart uses their on-prem infrastructure for BAU operations and bursts into the public cloud during peak events like BBD.
This strategy minimizes upfront hardware investments while taking advantage of the elasticity of the public cloud for short-term surges in demand.
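The cost argument above can be made concrete with a little arithmetic. The sketch below compares owning enough hardware for the peak all year with owning only the BAU footprint and renting the difference during the burst window. All node counts and prices are hypothetical values chosen to keep the numbers easy to follow; they are not Flipkart's figures.

```python
def annual_cost_on_prem_only(peak_nodes: int, node_cost_per_year: float) -> float:
    """Own enough hardware to cover the peak all year round."""
    return peak_nodes * node_cost_per_year


def annual_cost_hybrid(
    bau_nodes: int,
    peak_nodes: int,
    node_cost_per_year: float,
    cloud_node_cost_per_hour: float,
    burst_hours: float,
) -> float:
    """Own hardware for BAU and rent the difference only during the burst."""
    burst_nodes = max(peak_nodes - bau_nodes, 0)
    on_prem = bau_nodes * node_cost_per_year
    cloud = burst_nodes * cloud_node_cost_per_hour * burst_hours
    return on_prem + cloud


# Hypothetical inputs (assumptions, not figures from the source).
BAU_NODES = 1_000
PEAK_NODES = 3_000
NODE_COST_PER_YEAR = 2_000.0     # amortized hardware and operations per node
CLOUD_NODE_COST_PER_HOUR = 0.5   # on-demand price of a comparable cloud node
BURST_HOURS = 14 * 24            # a two-week peak window

on_prem_only = annual_cost_on_prem_only(PEAK_NODES, NODE_COST_PER_YEAR)
hybrid = annual_cost_hybrid(
    BAU_NODES, PEAK_NODES, NODE_COST_PER_YEAR, CLOUD_NODE_COST_PER_HOUR, BURST_HOURS
)
print(f"on-prem only: {on_prem_only:,.0f}")
print(f"hybrid:       {hybrid:,.0f}")
```

The comparison is sensitive to the length of the burst window: the longer the peak lasts, the less attractive renting becomes relative to owning.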
Key Benefits of Flipkart’s Hybrid Cloud
- Cost Efficiency: Flipkart no longer needs to overprovision on-prem hardware for peak loads. By leveraging the public cloud during BBD, they avoid the year-round expense of maintaining underutilized on-prem resources.
- Reliability: A hybrid setup improves disaster recovery and business continuity. If one cloud (public or private) fails, the other can seamlessly take over, mitigating downtime risks.
- Flexibility: Flipkart’s hybrid setup enables workload placement based on need. They can optimize costs by assigning less critical workloads to on-prem infrastructure and high-demand, unpredictable workloads to the public cloud.
- Scalability: With the ability to burst into the public cloud, Flipkart’s infrastructure is prepared to handle unexpected demand spikes without requiring manual intervention or excessive resources.
Kubernetes: The Backbone of Flipkart’s Hybrid Cloud

Why Kubernetes?
At the heart of Flipkart’s hybrid cloud solution lies Kubernetes, an open-source container orchestration platform. Kubernetes enables Flipkart to deploy, manage, and scale applications across both on-prem and public cloud environments seamlessly.
Key features of Kubernetes for Flipkart’s use case include:
- Abstraction of Cloud APIs: Kubernetes provides a unified layer for managing infrastructure, reducing the complexity of managing different cloud provider APIs.
- Scalability and Flexibility: Features like auto-scaling and load balancing allow Flipkart to manage workloads dynamically.
- Avoidance of Vendor Lock-In: As an open-source platform, Kubernetes enables portability across any cloud or on-prem environment.
Overcoming Kubernetes Challenges
While Kubernetes offers powerful capabilities, managing complex workloads like databases on Kubernetes is not straightforward. Flipkart addressed these challenges using operators – purpose-built tools that automate the management of stateful applications, such as databases, within Kubernetes.
Operators: Automating Complex Workloads
An operator acts as an automated operations engineer for Kubernetes, specifically designed to handle tasks like scaling, failovers, and backups. Operators monitor and manage the desired state of an application, ensuring it runs optimally, even when failures occur.
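The core of that idea is a reconcile step: compare the desired state with what is actually observed and work out the actions that close the gap. The following is a minimal, library-free sketch of one such step. The function name, the replica names, and the action strings are hypothetical; a real operator would watch the Kubernetes API and act on custom resources rather than plain dictionaries.

```python
def reconcile(desired_replicas: int, observed: dict[str, bool]) -> list[str]:
    """Return the actions needed to move the observed state toward the desired one.

    `observed` maps a replica name to True if that replica is healthy.
    """
    actions = []

    # Unhealthy replicas are replaced first so the cluster regains redundancy.
    for name, healthy in observed.items():
        if not healthy:
            actions.append(f"replace {name}")

    # Then the replica count is corrected in either direction.
    current = len(observed)
    if current < desired_replicas:
        actions.append(f"scale up by {desired_replicas - current}")
    elif current > desired_replicas:
        actions.append(f"scale down by {current - desired_replicas}")

    return actions


# One failed replica and one missing replica against a desired count of three.
print(reconcile(3, {"db-0": True, "db-1": False}))
```

An operator runs a step like this repeatedly, so a failure that appears between two runs is picked up on the next pass without human intervention.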
Flipkart used a combination of official and custom-built operators to manage their databases. For example, they utilized:
- TiDB Operator: A Kubernetes-native operator for managing TiDB, a distributed SQL database.
- Aerospike Kubernetes Operator: To manage Aerospike workloads.
- Custom Operators: Flipkart also built in-house operators for MySQL and other specialized use cases, ensuring compatibility with their unique infrastructure requirements.
Practical Applications of Kubernetes in Flipkart’s Hybrid Cloud
1. Scaling Workloads
- Stateless Components: Kubernetes makes it quick and easy to scale stateless applications, such as web services, either vertically (adding resources to existing pods) or horizontally (adding more pods).
- Stateful Components: Scaling databases requires more careful planning, as it involves rebalancing data and ensuring replication consistency. Flipkart’s operators handle this complexity, enabling smooth vertical or horizontal scaling depending on workload needs.
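For the stateless case, horizontal scaling is commonly driven by the rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up, with no change while the ratio stays within a tolerance (0.1 by default). The sketch below reproduces that rule in plain Python. The source does not say which autoscaling settings Flipkart uses, so the bounds and metric values here are illustrative assumptions.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Replica count suggested by the HPA-style ratio rule."""
    ratio = current_metric / target_metric

    # Small deviations from the target are ignored to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(desired, max_replicas))


# 4 pods at 90% average CPU against a 60% target.
print(desired_replicas(4, current_metric=90, target_metric=60))
```

The same rule does not transfer directly to databases, where adding a replica also means moving data, which is why the stateful case is left to operators.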
2. Failure Management
Kubernetes and its operators enable robust failure recovery mechanisms:
- Network-Attached Storage (NAS): When only compute nodes fail, Kubernetes can respawn pods and reattach them to the same persistent storage volumes, ensuring data integrity.
- Local Storage: If both compute and storage fail, the data must be restored from backups or rebuilt from surviving replicas, which can take longer. Flipkart’s hybrid setup ensures redundancy to mitigate such risks.
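The difference between the two recovery paths can be sketched as a short decision function plus a rough restore-time estimate. The step descriptions, data size, and throughput below are hypothetical and only illustrate why losing local storage is the slower case.

```python
from enum import Enum


class Storage(Enum):
    NETWORK = "network-attached"
    LOCAL = "local"


def recovery_steps(storage: Storage) -> list[str]:
    """Steps after the node hosting a database pod is lost."""
    steps = ["reschedule pod on a healthy node"]
    if storage is Storage.NETWORK:
        # The volume outlives the node, so the data is still there.
        steps.append("reattach the existing persistent volume")
    else:
        # The disk went down with the node, so the data must be rebuilt.
        steps.append("provision a new empty volume")
        steps.append("restore data from a backup or a surviving replica")
    return steps


def estimated_restore_seconds(data_gb: float, throughput_mb_per_s: float) -> float:
    """Rough time to copy the data back, ignoring replay and verification."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 1000 / throughput_mb_per_s


print(recovery_steps(Storage.LOCAL))
# Hypothetical: 500 GB restored at 200 MB/s.
print(estimated_restore_seconds(500, 200))
```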
3. Planned Maintenance
Flipkart built a custom Scheduled Maintenance Operator to handle hardware issues proactively. This operator automatically migrates workloads to healthy nodes before planned downtime, ensuring uninterrupted operations.
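The source does not describe the internals of that operator, but the general idea resembles the familiar cordon-and-drain pattern: stop placing work on the affected node, then move its pods elsewhere. The following is a simplified simulation under that assumption. The node names, pod names, and the per-node pod capacity are all hypothetical.

```python
def plan_migration(
    pods_by_node: dict[str, list[str]],
    maintenance_node: str,
    capacity: int,
) -> dict[str, list[str]]:
    """Move every pod off `maintenance_node` onto the least-loaded remaining nodes."""
    if maintenance_node not in pods_by_node:
        raise KeyError(f"unknown node: {maintenance_node}")

    # Copy the placement of all other nodes; the maintenance node is excluded.
    plan = {
        node: list(pods)
        for node, pods in pods_by_node.items()
        if node != maintenance_node
    }
    if not plan:
        raise RuntimeError("no other node available to receive workloads")

    for pod in pods_by_node[maintenance_node]:
        target = min(plan, key=lambda node: len(plan[node]))
        if len(plan[target]) >= capacity:
            raise RuntimeError("not enough spare capacity to drain the node")
        plan[target].append(pod)

    return plan


cluster = {"node-a": ["web-1", "web-2"], "node-b": ["db-1"], "node-c": []}
print(plan_migration(cluster, "node-a", capacity=3))
```

Failing early when spare capacity is insufficient matters here: a drain that starts and cannot finish leaves the workload worse off than before.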
4. Backup and Disaster Recovery
Using Kubernetes-native tools, Flipkart automated backups and integrated them with public cloud storage solutions like Amazon S3 or Google Cloud Storage. This streamlined disaster recovery while controlling backup storage costs.
Lessons Learned: Balancing Performance, Cost, and Availability
Storage Choices Matter
Flipkart evaluated different storage types for Kubernetes:
- Locally Attached Storage: Low latency but lacks failover capabilities.
- Network-Attached Storage (NAS): Better redundancy but higher costs and latency.
- Amazon Elastic Block Store (EBS): Offers scalability and redundancy but comes with higher costs in the public cloud.
Each storage type has trade-offs, and businesses must make decisions based on workload requirements.
Avoid Resource Waste
Kubernetes offers powerful auto-scaling capabilities, but without proper monitoring, resources can be overprovisioned. Flipkart emphasized the importance of setting resource limits to prevent "noisy neighbors" from consuming shared infrastructure.
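A simple audit illustrates both concerns: which pods run without a CPU limit, and how much of a node's capacity is already claimed by requests. This is a sketch over plain data structures, not a call into the Kubernetes API. The `PodSpec` class, the pod names, and the millicore values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodSpec:
    name: str
    cpu_request_m: int                 # requested CPU in millicores
    cpu_limit_m: Optional[int] = None  # None means no limit is set


def audit_node(pods: list[PodSpec], allocatable_cpu_m: int) -> dict:
    """Report pods without limits and the fraction of CPU already requested."""
    unbounded = [pod.name for pod in pods if pod.cpu_limit_m is None]
    requested = sum(pod.cpu_request_m for pod in pods)
    return {
        "unbounded": unbounded,
        "requested_fraction": requested / allocatable_cpu_m,
    }


pods = [
    PodSpec("api", cpu_request_m=500, cpu_limit_m=1000),
    PodSpec("batch", cpu_request_m=1500),  # a potential noisy neighbor
    PodSpec("cache", cpu_request_m=1000, cpu_limit_m=1000),
]
print(audit_node(pods, allocatable_cpu_m=4000))
```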
Benchmarking Is Essential
A workload may perform differently on Kubernetes than on virtual machines (VMs) because of orchestration overhead. Flipkart highlighted the importance of benchmarking workloads on both to ensure proper resource allocation and cost-effectiveness.
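A benchmark comparison can be as simple as running the same workload in both environments and comparing a summary statistic. The sketch below computes the relative difference in median latency. The sample values are made up for illustration and are not measurements from Flipkart or from any real system.

```python
import statistics


def relative_overhead(baseline_ms: list[float], candidate_ms: list[float]) -> float:
    """Relative change in median latency of the candidate versus the baseline."""
    baseline = statistics.median(baseline_ms)
    candidate = statistics.median(candidate_ms)
    return (candidate - baseline) / baseline


# Hypothetical latency samples in milliseconds for the same query mix.
vm_latency_ms = [10, 11, 12, 11, 10]
k8s_latency_ms = [11, 12, 13, 12, 11]

overhead = relative_overhead(vm_latency_ms, k8s_latency_ms)
print(f"median latency change on Kubernetes: {overhead:+.1%}")
```

In practice tail latencies (p95, p99) usually matter more than the median for databases, so a real benchmark would report those as well.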
Key Takeaways
- Hybrid Cloud Provides Cost Optimization: Leveraging a mix of on-prem and public cloud infrastructure reduces costs by avoiding overprovisioning for peak loads.
- Kubernetes Is an Enabler: Kubernetes abstracts the complexity of managing hybrid clouds while offering scalability, flexibility, and resilience.
- Operators Simplify Stateful Workloads: Use Kubernetes operators to automate the management of complex workloads like databases.
- Replication Is Crucial for Bursting: Always maintain active replication between on-prem and cloud environments to handle demand spikes effectively.
- Storage Decisions Impact Performance and Cost: Evaluate storage options (local, NAS, EBS) based on workload needs, performance requirements, and budget constraints.
- Monitor Resource Usage: Implement resource limits and regularly clean up orphaned resources to avoid unexpected cloud costs.
- Plan for Failures: Design for redundancy and automate failure recovery processes to minimize downtime during outages.
- Kubernetes Is Not a Magic Bullet: While powerful, Kubernetes introduces its own complexities. Evaluate its benefits relative to your specific business needs.
Conclusion
Flipkart’s hybrid cloud, powered by Kubernetes, offers a blueprint that growing SaaS and AI companies can use to control cloud costs while maintaining high availability and performance. By pairing Kubernetes with operators, Flipkart automated the management of both stateless and stateful workloads, enabling smoother scaling and robust failure recovery.
For growing companies struggling with ballooning cloud expenses, Flipkart’s approach underscores the importance of a well-planned hybrid cloud strategy and the power of Kubernetes to balance performance, cost, and reliability.
Source: "👉Flipkart’s Bold Move: Production Databases on Kubernetes Explained" – Perfology, YouTube, Feb 8, 2026 – https://www.youtube.com/watch?v=RMeSlgI_FmQ