Using anomalytics to manage virtualised networks

“My Ph.D. was based on research and development of new breakthrough technologies. As a scientist, Dr. Andersen instructed us to use our minds, training and skills to solve complex and hard scientific problems. If one solves easy problems, then hundreds of people can do the same. You will differentiate yourself by solving complex, yet big, problems…”
Hossein Eslambolchi, in this interesting article on anomalytics.

In a recent post, I discussed the concept of abstracting complexity from the virtualised infrastructure of the near future. The OSS of today tend to fail to deliver on one or more of the triple constraints of time, budget and functionality. As indicated on the triple constraint link, I’ve always had the feeling that complexity was the major factor in that (and that people tend to make things more complex than they already are).

Unfortunately, virtualised networks add more complexity than we’ve ever had to deal with.

With traditional OSS, we’ve always tried to model inventory (ie physical inventory, logical inventory, service inventory, etc) and build up hierarchies of relationships (ie a fibre cable can carry multiple DWDM wavelengths, which can carry an SDH link, which can carry EoSDH, etc) as well as all their related attributes. The problem with this approach is that it requires quite a bit of effort to create and maintain these relationships. This approach also tends to rely on structured data sets that conform to the hierarchy of objects.
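To make the maintenance burden concrete, here is a minimal sketch of that kind of hierarchy and the walk an OSS would do to find everything impacted by a fault. The object names and the `CARRIES` structure are invented for illustration and are not taken from any particular OSS product.

```python
from collections import deque

# Hypothetical inventory: each object maps to the objects it carries.
# Every one of these relationships has to be created and kept current.
CARRIES = {
    "fibre-1": ["dwdm-lambda-1", "dwdm-lambda-2"],
    "dwdm-lambda-1": ["sdh-link-1"],
    "dwdm-lambda-2": [],
    "sdh-link-1": ["eosdh-1"],
    "eosdh-1": [],
}


def impacted_by(failed, carries):
    """Return every object carried, directly or indirectly, by `failed`."""
    impacted = []
    queue = deque(carries.get(failed, []))
    while queue:
        obj = queue.popleft()
        if obj in impacted:
            continue  # already visited via another path
        impacted.append(obj)
        queue.extend(carries.get(obj, []))
    return impacted


print(impacted_by("fibre-1", CARRIES))
```

The walk itself is trivial. The cost sits in keeping `CARRIES` accurate, and the walk is only as good as the data behind it.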

The old approach, however, doesn’t tend to suit the next generation of networks, which will require real-time inventory tracking due to transient services, the increasingly complex hierarchy of objects (due to the complexity of managing virtualisation as per the link above) and a much larger number of touch-points.

One of the great features of virtualisation is its load-balancing capability: new virtual machines (VMs) can be spawned automatically if one VM in the pool becomes inoperable or the load exceeds a certain threshold.
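As a rough sketch of that behaviour, the loop below drops inoperable VMs, tops the pool back up to a minimum size and scales out by one when average load crosses a threshold. The `VM` class, `spawn()`, and the default values for `min_size` and `load_threshold` are all hypothetical; a real orchestrator would expose its own equivalents.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class VM:
    name: str
    healthy: bool = True
    load: float = 0.0  # fraction of capacity, 0.0 to 1.0


_ids = count(1)


def spawn():
    """Stand-in for asking the orchestrator for a fresh VM."""
    return VM(name=f"vm-{next(_ids)}")


def reconcile(pool, min_size=2, load_threshold=0.8):
    """Return a new pool with failed VMs replaced and load relieved."""
    # Drop any VM that has become inoperable.
    survivors = [vm for vm in pool if vm.healthy]
    # Top the pool back up to its minimum size.
    while len(survivors) < min_size:
        survivors.append(spawn())
    # Scale out by one VM if the average load is over the threshold.
    if survivors:
        average = sum(vm.load for vm in survivors) / len(survivors)
        if average > load_threshold:
            survivors.append(spawn())
    return survivors
```

Note that nothing here needs to know why a VM failed or what it was connected to, which is the property the rest of this post leans on.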

This makes me wonder whether we can remove some of the complexity – do we actually need to build and manage the hierarchy that links objects for the purpose of root-cause analysis?

Could we simply monitor for anomalies (or be alerted to any by customers) and then just kill off the affected VM/s whilst triggering new ones to take their place? Can we simply cram all of the data (eg flows, IDS/IPS hits, alarms, events, performance counters, service metrics, CIs, etc) into unstructured big data models and run machine-learning algorithms on the data to constantly refine/improve on the anomalytics rather than managing via human intervention?
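A very small sketch of the "detect, kill, replace" idea follows. It uses a simple statistical outlier test (the modified z-score based on the median absolute deviation) as a stand-in for the machine-learning models imagined above, and it looks at a single metric only. The metric values, the `spawn` callable and the 3.5 cut-off are assumptions for illustration.

```python
from statistics import median


def anomalous_vms(metrics, threshold=3.5):
    """Flag VMs whose metric is an outlier relative to the rest of the pool.

    metrics: dict of VM name -> latest value of one metric (eg error rate).
    """
    values = list(metrics.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Pool is otherwise flat: flag anything that differs from the median.
        return [name for name, v in metrics.items() if v != med]
    return [
        name
        for name, v in metrics.items()
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


def replace_anomalous(metrics, spawn, threshold=3.5):
    """Return the pool's VM names with each anomalous VM swapped for a new one."""
    bad = set(anomalous_vms(metrics, threshold))
    kept = [name for name in metrics if name not in bad]
    return kept + [spawn() for _ in bad]


# Hypothetical per-VM error rates; vm-4 is clearly misbehaving.
error_rates = {"vm-1": 0.010, "vm-2": 0.012, "vm-3": 0.011,
               "vm-4": 0.250, "vm-5": 0.009}
print(anomalous_vms(error_rates))
```

No hierarchy of objects is consulted at any point. The open question is whether that trade-off holds up once the fault lies in something the VMs share, where replacing VMs alone would not clear the anomaly.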

Do the principles of Intent Networking equally apply to Intent OSS? We’ll take a closer look at that next week.

Read the Passionate About OSS blog for more.
