Five nines is a catch-cry in the telco industry. There’s a view that we should measure the up-time of all the devices in our network and that they should all reach the gold standard of 99.999% (ie roughly five and a quarter minutes of downtime per year).
But how many times have you seen an ops team quoting their five nines KPIs while users complain about the unreliability of the service? Put simply, they use five nines to measure the up-time of individual boxes, not the customer journey that traverses many boxes, networks, OSS, BSS and so on.
The way to measure the real customer experience is to establish synthetic transactions that mimic the user’s journey through various use-cases (eg a service activation, a change of service configuration, a service / configuration / billing enquiry, etc). These synthetic transactions must have a way of traversing all of the elements that make up the service.
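To make that concrete, here’s a minimal sketch of a synthetic-transaction probe in Python. The endpoints, payloads and the two-second benchmark are hypothetical placeholders, not any particular vendor’s API; the point is simply that the probe walks the same sequence of steps a customer’s journey would, and records pass/fail and latency for each step.

```python
"""Minimal synthetic-transaction probe (illustrative sketch only).

The URLs, payloads and benchmark below are hypothetical; substitute the
real northbound APIs of your OSS/BSS and the benchmarks agreed with the
business.
"""
import time
import requests

# Hypothetical steps of a "service activation enquiry" journey.
JOURNEY = [
    ("create order",  "POST", "https://oss.example.com/api/orders", {"serviceId": "SYNTH-001"}),
    ("check status",  "GET",  "https://oss.example.com/api/orders/SYNTH-001", None),
    ("billing check", "GET",  "https://bss.example.com/api/bills/SYNTH-001", None),
]

LATENCY_BENCHMARK_S = 2.0  # assumed per-step benchmark


def run_journey():
    """Execute each step in order, recording success and latency."""
    results = []
    for name, method, url, payload in JOURNEY:
        start = time.monotonic()
        try:
            resp = requests.request(method, url, json=payload, timeout=10)
            ok = resp.ok
        except requests.RequestException:
            ok = False
        elapsed = time.monotonic() - start
        results.append((name, ok, elapsed, elapsed <= LATENCY_BENCHMARK_S))
    return results


if __name__ == "__main__":
    for name, ok, elapsed, within in run_journey():
        print(f"{name:15s} ok={ok} latency={elapsed:.2f}s within_benchmark={within}")
```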
In some cases, such as when flow-through provisioning is established, it’s as simple as firing a transaction at your test platform (eg Postman) and observing the results against established benchmarks. In other cases, these transactions aren’t easy to set up, particularly if an operator’s OSS and BSS have manual, swivel-chairing activities in any of the user journeys. In those cases the test harness becomes more complex, as does the chaining of transactions to mimic a customer service.
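One way to chain across a swivel-chair step is for the harness to fire the upstream transaction and then poll the downstream system until the manually re-keyed record appears, counting the wait as part of the end-to-end journey time. The sketch below assumes a hypothetical inventory API and timeout values; it isn’t a prescribed implementation.

```python
"""Sketch of chaining across a manual (swivel-chair) step (hypothetical API).

The wait for the human re-keying activity is measured and reported as part
of the customer journey, rather than hidden from the KPI.
"""
import time
import requests

DOWNSTREAM_URL = "https://inventory.example.com/api/services/{order_id}"  # assumed
POLL_INTERVAL_S = 60
MAX_WAIT_S = 4 * 60 * 60  # tolerate up to four hours of manual handling (assumed)


def wait_for_manual_step(order_id: str) -> float:
    """Poll the downstream system until the manually entered record exists.

    Returns the elapsed time so it can be added to the journey duration.
    """
    start = time.monotonic()
    while time.monotonic() - start < MAX_WAIT_S:
        resp = requests.get(DOWNSTREAM_URL.format(order_id=order_id), timeout=10)
        if resp.status_code == 200:
            return time.monotonic() - start
        time.sleep(POLL_INTERVAL_S)
    raise TimeoutError(f"Manual step for {order_id} not completed within {MAX_WAIT_S}s")
```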
You might also want to consider how these synthetic transactions perform in an environment where chaos monkeys are running around causing havoc in your infrastructure.
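If the synthetic journeys keep running while faults are being injected, the results roll up into a journey-level availability figure that can be compared honestly against the five-nines target. A small sketch, assuming each journey run is logged as a simple pass/fail:

```python
"""Sketch: journey-level availability from synthetic-transaction results.

Assumes each run of the synthetic journey is recorded as a boolean success,
including runs performed while chaos experiments are active.
"""
def journey_availability(run_results: list[bool]) -> float:
    """Percentage of synthetic journeys that completed successfully."""
    if not run_results:
        return 0.0
    return 100.0 * sum(run_results) / len(run_results)


# e.g. 9,995 successful runs out of 10,000 -> 99.95%, well short of five nines
print(f"{journey_availability([True] * 9995 + [False] * 5):.3f}%")
```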