Zero Touch Assurance – ZTA (part 3)

This is the third in a series on ZTA, following on from yesterday’s post that suggested intentionally triggering events to allow the accumulation of a much larger library of historical network data.

Today we’ll look at the impact of data collection on our ability to achieve ZTA, and refer back to the three stages described in part 1 of the series:

Stage 1 – Monitoring: watching the events that happen in the network and responding manually
Stage 2 – Post-cognition: monitoring events that happen in the network, comparing them to past events and actions (using analytics to identify repeating patterns), and using the past to recommend (or automate) a response
Stage 3 – Pre-cognition: identifying events that have never happened in the network before, yet still being able to provide a recommended / automated response

In my early days of OSS projects, network performance data was commonly collected at 15 minute intervals at best, and sometimes even less frequently if polling put too much load on the processor of a given network device. That was useful for long and medium term trend analysis, but averaging across the 15 minute period meant that significant performance events could be missed. Back in those days it was mostly Stage 1 – Monitoring. Stage 2, Post-cognition, was unsophisticated at a system level (eg manually adjusting threshold-crossing event levels), so it relied on talented operators who could remember similar events in the past.
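To make the averaging problem concrete, here is a minimal sketch in Python. The utilisation figures and the 80% alarm threshold are invented purely for illustration; they are not taken from any real network or tool. It compares what a single 15 minute averaged poll reports against what 1 minute samples of the same window would show.

```python
from statistics import mean

# Hypothetical per-minute link utilisation (%) over one 15 minute poll window.
# The sample at index 7 is a short saturation event.
per_minute_utilisation = [20, 22, 21, 19, 20, 23, 21, 98, 22, 20, 21, 19, 20, 22, 21]

THRESHOLD = 80  # hypothetical threshold-crossing alarm level (%)


def threshold_crossings(samples, threshold):
    """Return the indices of samples that exceed the threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]


# What a 15 minute poll reports: one averaged value for the whole window.
fifteen_minute_average = mean(per_minute_utilisation)

coarse_alarms = threshold_crossings([fifteen_minute_average], THRESHOLD)
fine_alarms = threshold_crossings(per_minute_utilisation, THRESHOLD)

print(f"15 minute average: {fifteen_minute_average:.1f}%")
print(f"threshold crossings seen at 15 minute granularity: {coarse_alarms}")
print(f"threshold crossings seen at 1 minute granularity: {fine_alarms}")
```

The averaged value sits far below the threshold, so no alarm is raised at 15 minute granularity, while the 1 minute samples expose the spike.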

If we want to reach the goal of ZTA, we have to drastically reduce measurement / polling / notification intervals. Ideally, we want every one of the following steps to happen in near-real-time:

To extract (from the device/EMS)
To transform / normalise the data (different devices may use different counter models for example)
To load
To identify patterns (15 minute poll cycles disguise too many events)
To compare with past patterns
To compare with past responses / results
To recommend or automate a response
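
The steps above can be sketched as a toy, in-memory pipeline. This is only an illustration under stated assumptions: the device names, counter names, the threshold-based pattern detection and the history of past responses are all hypothetical, and a real system would use streaming infrastructure and far richer analytics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    device: str
    metric: str  # normalised metric name
    value: float


# Hypothetical vendor counter names mapped onto one normalised counter model.
COUNTER_MAP = {"ifInErrs": "rx_errors", "rxErrorCount": "rx_errors"}

# Hypothetical history: pattern -> list of (response, did it work?) results.
PAST_RESPONSES = {
    "rx_errors_high": [
        ("reseat_optic", True),
        ("reboot_card", False),
        ("reseat_optic", True),
    ],
}


def extract():
    """Stand-in for pulling raw counters from a device or EMS."""
    return [
        {"device": "router-a", "counter": "ifInErrs", "value": 1200},
        {"device": "router-b", "counter": "rxErrorCount", "value": 3},
    ]


def transform(raw):
    """Normalise vendor-specific counter names into the common model."""
    return [Sample(r["device"], COUNTER_MAP[r["counter"]], float(r["value"])) for r in raw]


def load(store, samples):
    """Append normalised samples to the (in-memory) data store."""
    store.extend(samples)


def identify_patterns(samples, error_threshold=1000.0):
    """Very crude pattern detection: a threshold on a normalised metric."""
    return [
        (s.device, "rx_errors_high")
        for s in samples
        if s.metric == "rx_errors" and s.value > error_threshold
    ]


def recommend(pattern):
    """Compare with past patterns and results; pick the response that worked most often."""
    history = PAST_RESPONSES.get(pattern)
    if not history:
        return None  # never seen before: this is where pre-cognition would be needed
    successes = {}
    for response, worked in history:
        successes[response] = successes.get(response, 0) + (1 if worked else 0)
    return max(successes, key=successes.get)


store = []
load(store, transform(extract()))
recommendations = {device: recommend(pattern) for device, pattern in identify_patterns(store)}
print(recommendations)
```

Every one of these functions sits on the critical path between an event occurring and a response being recommended, which is why the end-to-end latency matters and not just the poll interval.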

I’m sure you can see the challenge here. The faster the poll cycle, the more data there is to process. It can be a significant feat of engineering to process large data volumes at near-real-time speeds (streaming analytics) on large networks.
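
A rough back-of-the-envelope calculation shows how quickly the volumes grow. The network size used here (10,000 devices with 50 counters each) is an assumption chosen only to illustrate the scaling, not a figure from any real deployment.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(devices, counters_per_device, poll_interval_seconds):
    """Number of counter samples collected per day at a given poll interval."""
    polls_per_day = SECONDS_PER_DAY // poll_interval_seconds
    return devices * counters_per_device * polls_per_day


# Hypothetical network size, for illustration only.
DEVICES = 10_000
COUNTERS_PER_DEVICE = 50

for label, interval in [("15 minutes", 900), ("1 minute", 60), ("1 second", 1)]:
    total = samples_per_day(DEVICES, COUNTERS_PER_DEVICE, interval)
    print(f"poll every {label}: {total:,} samples per day")
```

Under these assumptions the same network goes from 48 million samples a day at 15 minute polling to 720 million at 1 minute polling and 43.2 billion at 1 second polling, before any transformation, pattern matching or comparison has been done.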

January 30, 2019
Ryan

