You’ve all probably seen this scene from the Tom Hanks movie Apollo 13, right? But you’re probably wondering what it has to do with OSS.
Well, this scene came to mind when I was preparing a list of user stories required to facilitate Autonomous Networking.
More specifically, it relates to the use-case where we want the Autonomous Network to quickly recover (as best it can) from unplanned catastrophic network failures.
Of course we don’t want catastrophic network failures in production environments, but if one does occur, we’d prefer that our learning machines already have some idea of how to respond, even to an unlikely situation. We don’t want them to be learning response mechanisms only after a production event.
But similarly, we don’t want to trigger massive outages on production just to build up a knowledge base of possible cause-effect groupings. That would be ridiculous.
That’s where the Apollo 13 analogy comes into play:
- The engineers on the ground (ie the non-prod environment) were tasked with finding a solution to the problem (as they said, “fitting a square peg in a round hole”)
- The parts the engineers were given mirrored those available on the spacecraft (ie non-prod wasn’t an exact match for prod, but it was enough of a replica to be useful)
- The engineers were able to trial many combinations using the available parts until they found a workable resolution to the problem (even if it relied heavily on duct tape!)
- Once the workable solution was found, it was codified (as a procedure manual) and transferred to the spacecraft (ie migrating seed data from non-prod to prod)
If I were responsible for building an Autonomous Network, I’d want to dream up as many failure scenarios as I could, initiate them in non-prod and then duct-tape* solutions together for them all… and then attempt to pre-seed those learnings into production.
* By “duct-tape” I mean letting the learning machine attempt to find optimal solutions by trialing different combinations of automated / programmatic and manual interventions.
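To make the idea a little more concrete, here is a minimal sketch of that loop in Python: trial combinations of interventions against each failure scenario in non-prod, keep the best-scoring combination, then pre-seed the results into production. Everything in it is a hypothetical placeholder rather than a real OSS API. The scenario names, the intervention names, the scoring rule in `trial_in_non_prod` and the dictionary standing in for a production knowledge base are all assumptions, and a real learning machine would do something far smarter than brute-force enumeration.

```python
from itertools import combinations

# Hypothetical interventions (a mix of automated and manual actions).
INTERVENTIONS = [
    "reroute_traffic",
    "restart_service",
    "failover_to_standby",
    "manual_patch",
]

# Hypothetical failure scenarios, mapped to the interventions that
# actually help in this toy non-prod model.
SCENARIOS = {
    "core_router_down": {"reroute_traffic", "failover_to_standby"},
    "config_corruption": {"restart_service", "manual_patch"},
}


def trial_in_non_prod(scenario, actions):
    """Toy stand-in for injecting a failure in non-prod and measuring recovery.

    Returns the fraction of the failure that was recovered, minus a small
    cost for every action taken (assumed scoring rule).
    """
    helpful = SCENARIOS[scenario]
    recovered = len(helpful & set(actions)) / len(helpful)
    cost = 0.05 * len(actions)
    return recovered - cost


def build_playbook(scenarios, interventions, max_actions=3):
    """Trial every combination of up to max_actions interventions per scenario."""
    playbook = {}
    for scenario in scenarios:
        candidates = [
            combo
            for size in range(1, max_actions + 1)
            for combo in combinations(interventions, size)
        ]
        best = max(candidates, key=lambda combo: trial_in_non_prod(scenario, combo))
        playbook[scenario] = list(best)
    return playbook


def seed_production(prod_kb, playbook):
    """Copy non-prod learnings into prod without overwriting existing entries."""
    for scenario, actions in playbook.items():
        prod_kb.setdefault(scenario, actions)
    return prod_kb


playbook = build_playbook(SCENARIOS, INTERVENTIONS)
prod_knowledge_base = seed_production({}, playbook)
```

The `playbook` plays the role of the procedure manual radioed up to the spacecraft: it is worked out entirely on the ground, and only the finished result is handed to production.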