What do you mean by large? For up to a few hundred servers, the typical orchestration tools like Puppet, Ansible etc. are likely enough. Plus you need monitoring. The old-school system was Nagios. IDK what the cool kids use now.
For 1000+ servers you probably have to know what you’re doing, and you’ll have gotten the knowledge from running smaller clusters. I get the impression that this is the level where Kubernetes starts to be worth the complexity, but I haven’t dealt with it myself.