At $JOB we often find ourselves at customer sites where we see the
same set of basic problems that we have previously encountered
elsewhere (“your clocks aren’t in sync” or “your filesystem is full”
or “you haven’t installed a critical update”, etc.). We would like a
simple tool that could be run either by the customer or by our own
engineers to test for and report on these common issues.
Fundamentally, we want something that acts like a typical code test
suite, but for infrastructure.
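
To make the "test suite for infrastructure" idea concrete, here is a minimal sketch of what a "your filesystem is full" check might look like as an ordinary unit test. It is only an illustration, not the tool we ended up with: the names `percent_used`, `filesystem_is_full`, and `MAX_PERCENT_USED`, and the 90% threshold, are all assumptions made up for this example.

```python
import shutil
import unittest
from collections import namedtuple

# Hypothetical threshold: the text above does not say how full is "full".
MAX_PERCENT_USED = 90.0

# Same field names as the tuple returned by shutil.disk_usage(), so the
# helpers below can be exercised with made-up numbers as well as real ones.
Usage = namedtuple("Usage", ["total", "used", "free"])


def percent_used(usage):
    """Return the percentage of the filesystem that is in use."""
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def filesystem_is_full(usage, limit=MAX_PERCENT_USED):
    """Return True if usage is at or above the given percentage limit."""
    return percent_used(usage) >= limit


class FilesystemTest(unittest.TestCase):
    """An infrastructure check written like a regular code test."""

    def test_root_filesystem_not_full(self):
        # Query the real root filesystem of the host being tested.
        usage = shutil.disk_usage("/")
        self.assertFalse(
            filesystem_is_full(usage),
            "root filesystem is %.1f%% full" % percent_used(usage),
        )
```

Saved to a file, this could be run with `python -m unittest <file>` by either a customer or an engineer; a failing test then reads as a report of the problem, which is the behavior we are after.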