Entirely possible for an enterprise-y or B2B use case: some clients might want rigid data/network isolation in a separate account/VPC, plus it reduces the blast radius compared to running everything in one big cluster. There are ways of achieving this in a single cluster with a lot of added complexity, and spinning up a new VPC + K8s cluster might be easier if you have the Terraform modules ready to go.
While I'm pretty sure the article is clickbait (can't tell, paywalled too soon), having many clusters ain't dumb nowadays.
The automation of Kubernetes maintenance is great nowadays, whether it's bare metal, on-prem, or a managed public cloud offering. Leveraging that makes it easier to manage multiple clusters and give each team/project its own cluster than to implement proper mechanisms on a single shared one, like proper RBAC between projects, network boundaries, etc.
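For context, here's a minimal sketch of the kind of per-project plumbing the single-shared-cluster route needs, using the official kubernetes Python client; the "team-a" namespace, role name, and rule set are made up for illustration (and the namespace is assumed to already exist):

    # Per-project RBAC on a shared cluster (illustrative names only).
    from kubernetes import client, config

    config.load_kube_config()                     # local kubeconfig
    rbac = client.RbacAuthorizationV1Api()
    namespace = "team-a"                          # assumed to exist already

    # Role: what team-a may do, scoped to its own namespace only.
    rbac.create_namespaced_role(namespace, {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "team-a-developer"},
        "rules": [{
            "apiGroups": ["", "apps"],
            "resources": ["pods", "services", "deployments"],
            "verbs": ["get", "list", "watch", "create", "update", "delete"],
        }],
    })

    # RoleBinding: attach that role to the team's group.
    rbac.create_namespaced_role_binding(namespace, {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "team-a-developer"},
        "subjects": [{"kind": "Group", "name": "team-a",
                      "apiGroup": "rbac.authorization.k8s.io"}],
        "roleRef": {"kind": "Role", "name": "team-a-developer",
                    "apiGroup": "rbac.authorization.k8s.io"},
    })

Multiply that by every project, then add NetworkPolicies, resource quotas, admission rules, etc., and a cluster per team starts to look like the simpler option.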
Nowadays you can easily move that complexity one layer up and treat a whole cluster as a disposable component that is defined in code and isn't a snowflake.
That way you can let each team/project manage its own core pieces and do them differently, like implementing their conflicting opinions on network meshes, or installing CRDs that would otherwise clash, etc.
The overhead is not that huge, or at least it doesn't have to be. My test clusters with multiple environments of my apps consume around 4 GB of memory in total (aside from the apps themselves), and that includes all the k8s components, log aggregation, metrics aggregation, and so on. You don't even have to manage your own control plane: a cloud can give you a shared one (like Azure does in its two lower tiers), or you can use a service that provides just the control plane while the nodes run on whatever hardware you like (like Scaleway Kosmos).
So yeah, it's not for everyone, but it can certainly grow to that number of clusters, especially once you multiply by dev/staging/qa/prod for each team and add a few clusters for actually testing the infra/IaC itself; 10 teams with 4 environments each is already 40 clusters before any of that.
Although why they had such an overhead is a mystery to me; it would be cool to see that part described.
I Stopped Wearing Shoes. My Wife Is Happier Than Ever
I used to own 10,000 pairs of shoes, so I had to buy an extra house just to have room to store all of them. My fridge was completely filled with shoes, and my wife dreaded even walking around in our home.
After going barefoot, I was able to sell the other house and now have room for groceries in my fridge, and my kids can now eat.
Strange article. Saying "99.99% uptime maintained" and that they had 4 major outages in a week is kind of strange, since 99.99% uptime only allows for about 4 minutes of downtime a month...
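The SLA arithmetic, for anyone who wants to check it (plain math, not numbers from the article):

    # How much downtime a 99.99% SLA actually allows.
    SLA = 0.9999
    minutes_per_month = 30 * 24 * 60   # 43,200 minutes in a 30-day month
    minutes_per_week = 7 * 24 * 60     # 10,080 minutes

    print(f"per month: {minutes_per_month * (1 - SLA):.1f} min")  # ~4.3 minutes
    print(f"per week:  {minutes_per_week * (1 - SLA):.1f} min")   # ~1.0 minute

Four major outages in a week would have to fit inside roughly one minute of error budget.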
Simplicity always wins over complexity. I don't think the problem here is Kubernetes, but rather the way they used it. Any system can be made utterly complex if you don't take the time to make it simple.
Um. Interesting. I don't think anyone should be operating 47 different Kubernetes clusters for an application. You should probably max out at three: production, staging, and dev (if you even need a dev cluster; ideally you can just run your dev server locally). You can probably also get away with colocating staging and production in the same cluster, in different namespaces or using different sets of services/labels, and ultimately just run one Kubernetes cluster.
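As a rough sketch of that staging/production colocation, again with the official Python client and made-up names:

    # Staging and production in one cluster, separated by namespaces
    # plus an "environment" label (names are illustrative).
    from kubernetes import client, config

    config.load_kube_config()
    core = client.CoreV1Api()

    for env in ("staging", "production"):
        core.create_namespace(client.V1Namespace(
            metadata=client.V1ObjectMeta(
                name=env,
                labels={"environment": env},  # quotas/NetworkPolicies can key off this
            )
        ))

Your Deployments and Services then just target -n staging or -n production instead of a whole separate cluster.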
They mentioned they run on three different cloud providers at the same time (...why...?), but even then, I'm not clear how that results in forty-seven different K8s clusters. 47 isn't even divisible by three!
Sadly the rest of the article post-paywall doesn't explain anything about how they ended up in that mess. Apparently they have "8 senior DevOps engineers," and you... really shouldn't be operating nearly 6x more clusters than you have senior DevOps engineers, in my opinion.
Not a Kubernetes guy, so perhaps an ignorant question: why would you run 47 clusters?
I thought the point of a Kubernetes cluster is that you just throw your workload at it and be happy?
I get that you want a few for testing and development etc., and perhaps failover to another provider or similar. But 47?