OSS that repair virtualised networks – the dual loop approach

In a recent article, we talked about Network Service Assurance (NSA) in an environment where network virtualisation exists.

One of the benefits of virtualisation or NaaS (Network as a Service) is that it provides a layer of programmability to your network. That is, you can instantiate network services in software through a network API. Virtualisation also tends to assume (or at least imply) that there is a huge amount of available capacity (the resource pool) to shift workloads between. If one virtual service instance dies or deteriorates, just spin up another automatically. If one route goes down, customer services are automatically redirected via alternate routes and the service is maintained. No problem…
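To make the re-routing idea concrete, here's a minimal sketch in Python. The route and link names, and the select_route helper, are made up for illustration; a real controller would do far more than this.

```python
def select_route(routes, failed_links):
    """Return the first candidate route that avoids every failed link, or None."""
    for route in routes:
        if not any(link in failed_links for link in route):
            return route
    # No healthy path left in the pool: software alone cannot restore the service.
    return None


# Hypothetical candidate routes between two sites, in order of preference.
routes = [["A-B", "B-D"], ["A-C", "C-D"]]

print(select_route(routes, {"B-D"}))         # primary is down, so the alternate is used
print(select_route(routes, {"B-D", "A-C"}))  # both paths are down
```

When select_route returns None, the resource pool has nothing left to offer.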

But there are some problems that can’t be solved in software. You can’t just use software to fix a cable that’s been cut by an excavator. You can’t just use software to fix failed electronics. Modern virtualised networks can do a great job of self-healing, routing around the problem areas. But there are still physical failures that need to be repaired / replaced / maintained by a field workforce. NSA doesn’t tend to cover that.

Looking at the diagram below, NSA does a great job of the closed-loop assurance within the red circle. But it then needs to kick out to the green closed-loop assurance processes that are already driven by our OSS/BSS.

As described in the link above, “Perhaps if the NSA was just assuring the yellow cloud/s, any time it identifies any physical degradation / failure in the resource pool, it kicks a notification up to the Customer Service Assurance (CSA) tools in the OSS/BSS layers? The OSS/BSS would then coordinate 1) any required customer notifications and 2) any truck rolls or fixes that can’t be achieved programmatically; just like it already does today. The additional benefit of this two-tiered assurance approach is that NSA can handle the NFV / VNF world, whilst not trying to replicate the enormous effort that’s already been invested into the CSA (ie the existing OSS/BSS assurance stack that looks after PNFs, other physical resources and the field workforce processes that look after it all).”

Therefore, a key part of the NSA process is how it kicks up from closed-loop 1 to closed-loop 2. Then, after closed-loop 2 has repaired the physical problem, NSA needs to be aware that the repaired resource is back in the pool of available resources. Does your NSA notice this automatically, or must it receive a notification from closed-loop 2?
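One way to picture the hand-off between the two loops is the sketch below. It assumes the "notification from closed-loop 2" option, and every name in it (ResourcePool, report_physical_fault, on_repair_complete) is hypothetical rather than taken from any particular NSA product.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    AVAILABLE = "available"
    AWAITING_REPAIR = "awaiting_repair"


@dataclass
class ResourcePool:
    """Hypothetical view of the resource pool that closed-loop 1 (NSA) draws on."""
    states: dict = field(default_factory=dict)
    escalations: list = field(default_factory=list)

    def add(self, resource_id):
        self.states[resource_id] = State.AVAILABLE

    def available(self):
        return sorted(r for r, s in self.states.items() if s is State.AVAILABLE)

    def report_physical_fault(self, resource_id, root_cause):
        # Closed-loop 1: stop using the resource and kick the fault up to closed-loop 2.
        self.states[resource_id] = State.AWAITING_REPAIR
        self.escalations.append((resource_id, root_cause))

    def on_repair_complete(self, resource_id):
        # Notification from closed-loop 2: the repaired resource rejoins the pool.
        if self.states.get(resource_id) is State.AWAITING_REPAIR:
            self.states[resource_id] = State.AVAILABLE
            return True
        return False


pool = ResourcePool()
pool.add("link-A")
pool.add("link-B")

pool.report_physical_fault("link-A", "fibre cut")
during_repair = pool.available()   # only link-B can carry workloads

pool.on_repair_complete("link-A")
after_repair = pool.available()    # link-A is usable again
```

The alternative is for NSA to poll or re-discover the resource itself, which avoids the integration but means the pool stays smaller until the next discovery cycle.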

It could be as simple as NSA sending alarms into the alarm list with a clearly articulated root cause. Each alarm has one or more tickets raised against it. The ticket triggers the field workforce to rectify the fault and triggers the customer assurance teams/tools to send notifications to impacted customers (if indeed they send notifications to customers who may not actually be affected yet, thanks to the resilience measures that have kicked in). Standard OSS/BSS practice!
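As a rough sketch of that alarm-to-ticket flow (all class names, field names and the notify_protected_customers flag are assumptions for illustration, not any specific OSS/BSS API):

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    resource_id: str
    root_cause: str
    impacted_customers: list = field(default_factory=list)


@dataclass
class Ticket:
    alarm: Alarm
    actions: list = field(default_factory=list)


def raise_ticket(alarm, notify_protected_customers=False):
    """Raise a ticket against an alarm and record the follow-up actions it triggers."""
    ticket = Ticket(alarm)
    # Physical faults always need the field workforce.
    ticket.actions.append(("dispatch_field_workforce", alarm.resource_id))
    # Policy choice: customers shielded by resilience measures may or may not be told.
    if notify_protected_customers:
        for customer in alarm.impacted_customers:
            ticket.actions.append(("notify_customer", customer))
    return ticket


alarm = Alarm("link-A", "fibre cut by excavator", ["cust-1", "cust-2"])
quiet_ticket = raise_ticket(alarm)
loud_ticket = raise_ticket(alarm, notify_protected_customers=True)
```

Here quiet_ticket carries only the truck roll, while loud_ticket also carries one notification per impacted customer.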

June 24, 2019
Ryan

