Most Kubernetes clusters we audit use only 15–25% of the CPU and memory their workloads request. The waste is invisible because the cluster looks healthy: pods are running, nothing is alerting, the dashboards are green. The bill just keeps growing. This blueprint documents the methodology we used to reduce one client's Kubernetes spend from $700k to $280k annually, a 60% reduction. The numbers are real. The methodology is repeatable. We've run the same process across four different clusters since then with similar results.
What's inside
The document has seven sections. Each is self-contained: you can use individual sections as standalone references or work through the document in sequence.
How to pull two weeks of actual CPU and memory utilisation data using Prometheus and kube-state-metrics, and how to read the output so that you can tell rightsizing opportunities apart from genuine load. Includes the specific PromQL queries that surface the worst offenders.
The Goldilocks + VPA setup we use for recommendations, the decision rules for applying recommendations safely vs staging them in canary deployments first, and the categories of workload that need manual review regardless of what VPA suggests.
How to identify which instance families fit your workload profiles (compute-optimised for API servers, memory-optimised for caches), how to configure node affinity to route workloads correctly, and the node group sizing changes that reduce bin-packing waste.
The workload categorisation framework (Spot-safe, mixed, On-Demand only) and the Karpenter NodePool configuration for mixed provisioning. Includes the pod topology spread constraints that prevent single-AZ Spot concentration.
The HPA configuration changes that pair with rightsizing, the KEDA ScaledObject setup using Prometheus RPS metrics for services where CPU is a lagging indicator, and the off-peak scale-down configuration that reduces overnight costs without SLA risk.
The audit process for identifying unused PersistentVolumes, idle LoadBalancers, stale snapshots, and forgotten node groups. This category is often overlooked, but it consistently delivers $15–30k of quick wins before any structural change.
The Kubernetes label taxonomy and AWS tag propagation setup that makes cost attribution by team, environment, and workload actually work. Includes the policy template we use on every EKS engagement, with the Terraform module for enforcement.
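As a rough illustration of the utilisation audit described above, the sketch below compares observed CPU usage against requested CPU and ranks the workloads that sit below a cut-off. It is not taken from the blueprint: the `WorkloadSample` type, the `worst_offenders` helper, and the 25% threshold are assumptions chosen for illustration, and the input is assumed to be figures you have already exported from Prometheus for the audit window.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WorkloadSample:
    """One workload's figures over the audit window (hypothetical shape)."""
    name: str
    cpu_requested_cores: float
    cpu_used_cores: float  # e.g. an average or high percentile of observed usage


def utilisation(sample: WorkloadSample) -> Optional[float]:
    """Return used / requested, or None when no CPU request is set."""
    if sample.cpu_requested_cores <= 0:
        return None
    return sample.cpu_used_cores / sample.cpu_requested_cores


def worst_offenders(
    samples: List[WorkloadSample], threshold: float = 0.25
) -> List[Tuple[str, float, float]]:
    """List (name, utilisation, idle cores) for workloads under the threshold.

    The threshold is an assumed cut-off, not a rule from the blueprint.
    Results are sorted by idle cores, largest first.
    """
    flagged = []
    for sample in samples:
        ratio = utilisation(sample)
        if ratio is not None and ratio < threshold:
            idle = sample.cpu_requested_cores - sample.cpu_used_cores
            flagged.append((sample.name, ratio, idle))
    flagged.sort(key=lambda row: row[2], reverse=True)
    return flagged


if __name__ == "__main__":
    # Made-up figures, purely to show the call shape.
    demo = [
        WorkloadSample("api", cpu_requested_cores=4.0, cpu_used_cores=0.5),
        WorkloadSample("cache", cpu_requested_cores=2.0, cpu_used_cores=1.5),
    ]
    for name, ratio, idle in worst_offenders(demo):
        print(f"{name}: {ratio:.0%} utilised, {idle:.1f} cores idle")
```

Sorting by idle cores rather than by ratio puts the largest absolute waste first. A low ratio on its own does not prove a workload is oversized, since bursty services can look idle on average, which is why the blueprint separates rightsizing opportunities from genuine load.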
What this doesn't cover
This blueprint is written for AWS EKS. The rightsizing methodology and Spot migration patterns apply across cloud providers, but the specific tooling references (Karpenter, AWS Cost Explorer queries, RDS connection pooling) are AWS-specific. GCP GKE and Azure AKS equivalents are noted where they differ significantly.
Who this is for
Platform engineers or DevOps leads with an EKS cluster and a growing AWS bill
Engineering managers preparing a FinOps initiative for their organisation
CTOs evaluating whether Kubernetes cost optimisation is worth the engineering time (it almost always is)
Teams that have done basic rightsizing but aren't sure what else to look at
How it was built
Built from the methodology used in a real Q4 2024 EKS engagement. The Terraform modules are from the same codebase used in production. Updated December 2024 to include Karpenter v1 API changes.
Every resource Sequere publishes is written by the engineers who ran the actual engagement, not by a content team working from secondhand notes. The trade-off is that we publish less frequently. The benefit is that the specifics are real.
Download
This resource is free to download with no account or signup required. The PDF downloads immediately.
If you use this resource on a real project and have feedback (things that were missing, out of date, or wrong), we want to hear it. Every update to this document has come from people who used it in production.