Is your service assurance really service assurance?? (Part 5)

In yesterday’s fourth part of this series about modern network service assurance, we wrote this:

I also just stumbled upon OpenTelemetry, an open source project designed to capture traces / metrics / logs from apps / microservices. It intrigued me because just as you have the concept of traces / metrics / logs for apps, you similarly have traces / metrics / logs for networks.

In the network world, we’re good at getting metrics / logs / events, but not very good at getting trace data (ie end-to-end service chains) as described earlier in this blog series. And if we can’t monitor traces, we can’t easily interpret a customer’s experience whilst they’re using their network service. We currently do “service assurance” by reverse-engineering logs / events, which seems a bit backward to me.

Take a closer look at the OpenTelemetry link above, which provides an overview of how their team is going to gather application telemetry. With increasing software-ification of our networks (eg SDN / NFV) and the use of microservices / NaaS / APIs in our management stacks, could this actually be our path to the holy grail of service assurance (ie capturing trace data – network service telemetry)?? Is it data plane? Is it control / management plane? Is it something in between?

Note: The “active measurements” approach described in part 3 is slightly compromised in current form, which is why I’m so intrigued by the potential of extending the concepts of OpenTelemetry into our software / virtual networks.

I’d really love your take on this one because I’m sure there are many elements to this that I haven’t thought through yet. Please leave your thoughts on the viability of the approach.

Read the Passionate About OSS Blog for more or Subscribe to the Passionate About OSS Blog by Email

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.