Networks lead. OSS are an afterthought. Time for a change?

In a recent post, we described how changing conditions in networks (eg topologies, technologies, etc) cause us to reconsider our OSS.

Networks always lead and OSS (or any form of network management including EMS/NMS) is always an afterthought. Often a distant afterthought.

But what if we spun this around? What if OSS initiated change in our networks / services? After all, OSS is the platform that operationalises the network. So instead of attempting to cope with a huge variety of network options (which introduces a massive number of variants and in turn, massive complexity, which we’re always struggling with in our OSS), what if we were to define the ways that networks are operationalised?

Let’s assume we want to lead. What has to happen first?

Network vendors tend to lead currently because they’re offering innovation in their networks, but more importantly on the associated services supported over the network. They’re prepared to take the innovation risk knowing that operators are looking to invest in solutions they can offer to their customers (as products / services) for a profit. The modus operandi is for operators to look to network vendors, not OSS vendors / integrators, to help to generate new revenues. It would take a significant perception shift for operators to break this nexus and seek out OSS vendors before network vendors. For a start, OSS vendors have to create a revenue generation story rather than the current tendency towards a cost-out business case.

ONAP provides an interesting new line of thinking though. As you know, it’s an open-source project that represents multiple large network operators banding together to build an innovative new approach to OSS (even if it is being driven by network change – the network virtualisation paradigm shift in particular). With a white-label, software-defined network as a target, we have a small opening. But to turn this into an opportunity, our OSS need to provide innovation in the services being pushed onto the SDN. That innovation has to be in the form of services/products that are readily monetisable by the operators.

Who’s up for this challenge?

As an aside:
If we did take the lead, would our OSS look vastly different to what’s available today? Would they unequivocally need to use the abstract model to cope with the multitude of scenarios?

A purple cow in our OSS paddock

A few years ago, I read a book that had a big impact on the way I thought about OSS and OSS product development. Funnily enough, the book had nothing to do with OSS or product development. It was a book about marketing – a subject that I wasn’t very familiar with at the time, but am now fascinated with.

And the book? Purple Cow by Seth Godin.

The premise behind the book is that when we go on a trip into the countryside, we notice the first brown or black cows, but after a while we don’t pay attention to them anymore. The novelty has worn off and we filter them out. But if there was a purple cow, that would be remarkable. It would definitely stand out from all the other cows and be talked about. Seth promoted the concept of building something into your products that makes them remarkable, worth talking about.

I recently heard an interview with Seth. Despite the book being launched in 2003, apparently he’s still asked on a regular basis whether idea X is a purple cow. His answer is always the same – “I don’t decide whether your idea is a purple cow. The market does.”

That one comment brought a whole new perspective to me. As hard as we might try to build something into our OSS products that creates a word-of-mouth buzz, ultimately we don’t decide if it’s a purple cow concept. The market does.

So let me ask you a question. You’ve probably seen plenty of different OSS products over the years (I know I have). How many of them are so remarkable that you want to talk about them with your OSS colleagues, or even have a single feature that’s remarkable enough to discuss?

There are a lot of quite brilliant OSS products out there, but I would still classify almost all of them as brown cows. Brilliant in their own right, but unremarkable for their relative sameness to lots of others.

The two stand-out purple cows for me in recent times have been CROSS’ built-in data quality ranking and Moogsoft’s Incident Room model. But it’s not for me to decide. The market will ultimately decide whether these features are actual purple cows.

I’d love to hear about your most memorable OSS purple cows.

You may also be wondering how to go about developing your own purple OSS cow. Well I start by asking, “What are people complaining about?” or “What are our biggest issues?” That’s where the opportunities lie. Once you’ve discovered those issues, the challenge is solving the problem/s in an entirely different, but better, way. I figure that if people care enough to complain about those issues, then they’re sure to talk about any product that solves the problem for them.

Designing OSS to cope with greater transience (part 2)

This is the second episode discussing the significant change to OSS thinking caused by modern network models. Yesterday’s post discussed how there has been a paradigm shift from static networks (think PDH) to dynamic / transient networks (think SDN/NFV) and that OSS are faced with a similar paradigm shift in how they manage modern network models.

We can either come up with adaptive / algorithmic mechanisms to deal with that transience, or mimic the “nailed-up” concepts of the past.

Let’s take Carrier Ethernet as a basis for explanation, with its E-Line service model [We could similarly analyse E-LAN and E-Tree service models, but maybe another day].

An E-Line is a point-to-point service between an A-end UNI (User-Network Interface) and a Z-end UNI, connected by an EVC (Ethernet Virtual Connection). The EVC is a conceptual pipe that is carried across a service provider’s network – a pipe that can actually span multiple network assets / links.

In our OSS, we can apply either:

  1. Abstract Model – Just mimic the EVC as a point-to-point connection between the two UNIs
  2. Specific Model – Attempt to tie network assets / links associated with the conceptual pipe to the EVC construct

The abstract OSS can be set up just once and delegate the responsibility of real-time switching / transience within the EVC to network controllers / EMS. This is the simpler model, but doesn’t add as much value to assurance use-cases in particular.

The specific OSS must either have the algorithms / policies to dynamically manage the EVC or to dynamically associate assets to the EVC. This is obviously much more sophisticated, but provides operators with a more real-time view of network utilisation and health.
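The difference between the two models can be sketched in a few lines of Python. This is only a minimal illustration under assumptions: the class names, the `reroute` hook and the link identifiers are all hypothetical and don't come from any particular OSS product.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractEvc:
    """Abstract model: the EVC is just a pipe between two UNIs."""
    a_end_uni: str
    z_end_uni: str


@dataclass
class SpecificEvc(AbstractEvc):
    """Specific model: the EVC also tracks the links currently carrying it."""
    links: list[str] = field(default_factory=list)

    def reroute(self, new_links: list[str]) -> None:
        # Hypothetical hook, called when a controller / EMS reports a path change.
        self.links = list(new_links)

    def is_affected_by(self, failed_link: str) -> bool:
        # Only the specific model holds enough data to answer this question.
        return failed_link in self.links


abstract = AbstractEvc("UNI-A", "UNI-Z")            # set up once, never touched
specific = SpecificEvc("UNI-A", "UNI-Z", links=["link-1", "link-2"])
specific.reroute(["link-1", "link-3"])              # must be kept in step with the network
```

The abstract object never needs updating, but it can't tell you whether a failed link affects the service. The specific object can, at the cost of having to be updated every time the path changes.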

Designing OSS to cope with greater transience

“There are three broad models of networking in use today. The first is the adaptive model where devices exchange peer information to discover routes and destinations. This is how IP networks, including the Internet, work. The second is the static model where destinations and pathways (routes) are explicitly defined in a tabular way, and the final is the central model where destinations and routes are centrally controlled but dynamically set based on policies and conditions.”
Tom Nolle here.

OSS of decades past worked best with static networks. Services / circuits that were predominantly “nailed up” and (relatively) rarely changed after activation. This took the real-time aspect out of play and justified the significant manual effort required to establish a new service / circuit.

However, adaptive and centrally managed networks have come to dominate the landscape now. In fact, I’m currently working on an assignment where DWDM, a technology that was once largely static, is now being augmented to introduce an SDN controller and dynamic routing (at optical level no less!).

This paradigm shift changes the fundamentals of how OSS operate. Apart from the physical layer, network connectivity is now far more transient, so our tools must be able to cope with that. Not only that, but the changes are too frequent to justify the manual effort of the past.

To tie in with yesterday’s post, we are again faced with the option of abstract / generic modelling or specific modelling.

Put another way, we have to either come up with adaptive / algorithmic mechanisms to deal with that transience (the specific model), or need to mimic “nailed-up” concepts (the abstract model).

More on the implications of this tomorrow.

Which OSS tool model do you prefer – Abstract or Specific?

There’s something I’ve noticed about OSS products – they are either designed to be abstract / flexible or they are designed to cater for specific technologies / topologies.

When designed from the abstract perspective, the tools are built around generic core data models. For example, whether a virtual / logical device, a physical device, a customer premises device, a core network device, an access network device, a transmission device, a security device, etc, they would all be lumped into a single object called a device (but with flexibility to cater for the many differences).

When designed from the specific perspective, the tools are built with exact network and/or service models in mind. That could be 5G, DWDM, physical plant, etc and when any new technology / topology comes along, a new plug-in has to be built to augment the tools.

The big advantage of the abstract approach is obvious (if they truly do have a flexible core design) – you don’t have to do any product upgrades to support a new technology / topology. You just configure the existing system to mimic the new technology / topology.
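As a rough sketch of what "configure rather than upgrade" might look like, here is a single generic device object in Python. Everything in it (the `Device` class, the `device_types` set, the attribute names) is a made-up illustration, not the data model of any real product.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Device:
    """One generic object for every kind of device."""
    name: str
    device_type: str                                  # configured, not hard-coded
    attributes: dict[str, Any] = field(default_factory=dict)


# Hypothetical configuration data: the device types this system knows about.
device_types = {"router", "dwdm-node"}

# A new technology arrives: it is added as configuration, with no code change.
device_types.add("virtual-firewall")


def create_device(name: str, device_type: str, **attributes: Any) -> Device:
    if device_type not in device_types:
        raise ValueError(f"unknown device type: {device_type}")
    return Device(name, device_type, attributes)


fw = create_device("fw-01", "virtual-firewall", vcpus=4, hosted_on="server-17")
```

The free-form `attributes` dictionary is where the "flexibility to cater for the many differences" lives. The trade-off is that nothing in the code enforces what a virtual firewall should look like.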

The first OSS product I worked with had a brilliant core design and was able to cope with any technology / topology that we were faced with. It often just required some lateral thinking to make the new stuff fit into the abstract data objects.

What I also noticed was that operators always wanted to customise the solution so that it became more specific. They were effectively trying to steer the product from an open / abstract / flexible tool-set to the other end of the scale. They generally paid significant amounts to achieve specificity – to mould the product to exactly what their needs were at that precise moment in time.

However, even as an OSS newbie, I found that to be really short-term thinking. A network (and the services it carries) is constantly evolving. Equipment goes end-of-life / end-of-support every few years. Useful life models are usually approx 5-7 years and capital refresh projects tend to be ongoing. Then, of course, vendors are constantly coming up with new products, features, topologies, practices, etc.

Given this constant change, I’d much rather have a set of tools that are abstract and flexible rather than specific but less malleable. More importantly, I’ll always try to ensure that any customisations should still retain the integrity of the abstract and flexible core rather than steering the product towards specificity.

How about you? What are your thoughts?

An OSS conundrum with many perspectives

“Even aside from the OSS impact, it illustrates the contrast between “bottom-up” planning of networks (new card X is cheaper/has more ports) and “top down” (what do we need to change to reduce our costs/increase capacity).”
Robert Curran.

Robert’s quote above is in response to a post called “Trickle-down impact planning.”

Robert makes a really interesting point. Adding a new card type is a relatively common event for a big network operator. It’s a relatively minor challenge for the networks team – a BAU (Business as Usual) activity in fact. But if you follow the breadcrumbs, the impact to other parts of the business can be quite significant.

Your position in your organisation possibly dictates your perspective on the alternative approaches Robert discusses above. Networks, IT, planning, operations, sales, marketing, projects/delivery, executive – all will have different impacts and a different field of view on the conundrum. This makes it an interesting problem to solve – which viewpoint is the “right” one to tackle the challenge from?

My “solutioning” background tends to align with the top down viewpoint, but today we’ll take a look at this from the perspective of how OSS can assist from either direction.

Bottom Up: In an ideal world, our OSS and associated processes would be able to identify a new card (or similar) and just ripple changes out without interface changes. The first OSS I worked on did this really well. However, it was a “single-vendor” solution so the ripples were self-contained (mostly). This is harder to control in the more typical “best-of-breed” OSS stacks of today. There are architectural mechanisms for controlling the ripples out but it’s still a significant challenge to solve. I’d love to hear from you if you’re aware of any vendors or techniques that do this really well.

Top Down: This is where things get interesting. Should top-down impact analysis even be the task of an OSS/BSS? Since it’s a common / BAU operational task, then you could argue it is. If so, how do we create OSS tools* that help with organisational impact / change / options analysis and not just network impact analysis? How do we build the tools* that can:

  1. Predict the rippling impacts
  2. Allow us to estimate the impact of each
  3. Present options (if relevant) and
  4. Provide a cost-benefit comparison to determine whether any of the options are viable for development

* When I say “tools,” this might be a product, but it could just mean a process, data extract, etc.
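To make the first item on that list a little more concrete, here is a minimal sketch of predicting ripples with a breadth-first walk over a dependency map. The map itself is entirely hypothetical (every name in `DEPENDENTS` is illustrative), and the sketch only covers prediction, not the estimation, options or cost-benefit steps.

```python
from collections import deque

# Hypothetical "what is affected by what" map; every name here is illustrative.
DEPENDENTS = {
    "new card type": ["inventory model", "spares holding"],
    "inventory model": ["design templates", "activation adapters"],
    "design templates": ["field work instructions"],
    "activation adapters": [],
    "spares holding": ["procurement budget"],
}


def ripple(change: str) -> dict[str, int]:
    """Return each impacted item with its distance (in hops) from the change."""
    distance = {change: 0}
    queue = deque([change])
    while queue:
        current = queue.popleft()
        for dependent in DEPENDENTS.get(current, []):
            if dependent not in distance:       # visit each item only once
                distance[dependent] = distance[current] + 1
                queue.append(dependent)
    del distance[change]                        # the change itself isn't an impact
    return distance


impacts = ripple("new card type")
```

The hard part in practice is not the traversal but building and maintaining the dependency map, which is arguably a process and data problem more than a product feature.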

I have the sense that this type of functionality falls into the category of, “just because you can, doesn’t mean you should… build it into your OSS.” Have you seen an OSS/BSS with this type of impact analysis functionality built-in?

Designing an OSS from NFRs backwards

When we’re preparing a design (or capturing requirements) for a new or updated OSS, I suspect most of us design with functional requirements (FRs) in mind. That is, our first line of thinking is on the shiny new features or system behaviours we have to implement.

But what if we were to flip this completely? What if we were to design against Non-Functional Requirements (NFRs) instead? [In case you’re not familiar with NFRs, they’re the requirements that measure the function or performance of a solution rather than features / behaviours]

What if we already have all the really important functionality in our OSS (the 80/20 rule suggests you will), but those functions are just really inefficient to use? What if we can meet the FR of searching a database for a piece of inventory… but our loaded system takes 5 mins to return the results of the query? It doesn’t sound much, but if it’s an important task that you’re doing dozens of times a day, then you’re wasting hours each day. Worse still, if it’s a system task that needs to run hundreds of times a day…
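The arithmetic is worth spelling out. The sketch below assumes "dozens" means two dozen queries a day and "hundreds" means 300 runs a day; both figures are assumptions chosen for illustration.

```python
def hours_lost_per_day(queries_per_day: int, minutes_per_query: float) -> float:
    """Time spent waiting on query results each day, in hours."""
    return queries_per_day * minutes_per_query / 60


per_operator = hours_lost_per_day(24, 5)     # two dozen queries at 5 mins each
system_task = hours_lost_per_day(300, 5)     # a system task run 300 times a day
```

Under those assumptions an operator loses 2 hours a day, and the system task needs 25 hours of query time, which no longer fits into a day if the runs are serial.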

I personally find NFRs to be really hard to design for because we usually won’t know response times until we’ve actually built the functionality and tried different load / fail-over / pattern (eg different query types) models on the available infrastructure. Yes, we can benchmark, but that tends to be a bit speculative.

Unfortunately, if we’ve built a solution that works, but end up with queries that take minutes… when our SLAs might be 5-15 mins, then we’ve possibly failed in our design role.

We can claim that it’s not our fault. We only have finite infrastructure (eg compute, storage, network), each with inherent performance constraints. It is what it is right?…. maybe.

What if we took the perspective of determining our most important features (the 80/20 rule again), setting NFR benchmarks for each and then designing the solution back from there? That is, putting effort into making our most important features super-efficient rather than adding new nice-to-have features (features that will increase load, thus making NFRs harder to hit mind you!)?

In this new world of open-source, we have more “product control” than we’ve probably had before. This gives us more of a chance to start with the non-functionals and work back towards a product. An example might be redesigning our inventory to work with Graph database technology rather than the existing relational databases.

How feasible is this NFR concept? Do you know anyone in OSS who does it this way? Do you have any clever tricks for ensuring your developed features stay within NFR targets?

I will never understand…

“I will never understand why Advertising is an investment and customer service is a cost.
Let’s spend millions trying to reach people, but if they try to reach us, make our contact details impossible to find, incentivise call center workers to hang up as fast as possible or ideally outsource it to a bot. It’s absolute lunacy and it absolutely matters.”
Tom Goodwin here.

Couldn’t agree more Tom. In fact, we’ve spoken about this exact irony here on PAOSS a few times before (eg here, here and here).

Telcos call it CVR – Call Volume Reduction (ie reduction in the number of customers’ calls that are responded to by a real person who represents the telco). But what CVR really translates to is, “we’re happy for you to reach us on our terms (ie if you want to buy something from us), but not on your terms (ie you have a problem that needs to be resolved).” Unfortunately, customer service is the exact opposite – it has to be on the customer’s terms, not yours.

Even more unfortunately, many of the problems that need to be resolved are being caused in our OSS / BSS (not always “by” our OSS / BSS, but that’s another story). Worse still, the contact centre has no chance of determining where to start understanding the problem due to the complexity of fall-out management and the complicated data flows through our OSS / BSS.

Bill Gates said, “Your most unhappy customers are your greatest source of learning.”

Let me ask you a question – Do you have a direct line of learning from your unhappy customers to your backlog of OSS / BSS enhancements? Or even an indirect line of learning? Nope?? If so, you’re not alone.

Let me ask you another question – You’re an OSS expert. Do you have any idea what problems your customers are raising with your contact centre staff? Or perhaps that should be problems they’re not getting to raise with contact centre staff due to the “success” of CVR measures? Nope?? If so, you’re not alone here either.

Can you think of a really simple and obvious way to start fixing this?

Re-writing the Sales vs Networks cultural divide

“Brand, marketing, pricing and sales were seen as sexy. Networks and IT were the geeks no one seemed to speak to or care about. … This isolation and excommunication of our technical team had created an environment of disillusion. If you wanted something done the answer was mostly ‘No – we have no budget and no time for that’. Our marketing team knew more about loyalty points … than about our own key product, the telecommunications network.”
Olaf Swantee, from his book, “4G Mobile Revolution”

Great note here (picked up by James Crawshaw at Heavy Reading). It talks about the great divide that always seems to exist between Sales / Marketing and Network / Ops business units.

I’m really excited about the potential for next generation OSS / orchestration / NaaS (Network as a Service) architectures to narrow this divide though.

In this case:

  1. The Network is offered as a microservice (let’s abstractly call them Resource Facing Services [RFS]);
  2. Sales / Marketing construct customer offerings (let’s call them Customer Facing Services [CFS]) from those RFS; and
  3. There’s a catalog / orchestration layer that marries the CFS with the cohesive set of RFS

The third layer becomes a meet-in-the-middle solution where Sales / Marketing comes together with Network / Ops – and where they can discuss what customers want and what the network can provide.

The RFS are suitably abstracted that Sales / Marketing doesn’t need to understand the network and complexity that sits behind the veil. Perhaps it’s time for Networks / Ops to shine, where the RFS can be almost as sexy as CFS (am I falling too far into the networks / geeky side of the divide?  🙂  )

The CFS are infinitely composable from RFS (within the constraints of the RFS that are available), allowing Sales / Marketing teams to build whatever they want and the Network / Ops teams don’t have to be constantly reacting to new customer offerings.
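A toy version of that catalog layer might look like the following. It is a sketch under assumptions: the RFS names, the `define_cfs` function and the product name are all invented for illustration and don't reflect any specific catalog or orchestration product.

```python
# Hypothetical RFS catalog published by Network / Ops.
RFS_CATALOG = {"access-100m", "internet-breakout", "managed-firewall", "sd-wan-overlay"}


def define_cfs(name: str, rfs_needed: set[str]) -> dict:
    """Sales / Marketing composes a CFS; the catalog layer validates it."""
    missing = rfs_needed - RFS_CATALOG
    if missing:
        # The "meet-in-the-middle" conversation starts here.
        raise ValueError(f"{name} needs RFS the network doesn't offer: {sorted(missing)}")
    return {"cfs": name, "rfs": sorted(rfs_needed)}


secure_internet = define_cfs(
    "Secure Business Internet",
    {"access-100m", "internet-breakout", "managed-firewall"},
)
```

Sales / Marketing can combine the published RFS however they like without involving Network / Ops. Only a request for an RFS that isn't in the catalog forces the two teams into a conversation.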

I wonder if this revolution will give Olaf cause to re-write this section of his book in a few years, or whether we’ll still have the same cultural divide despite the exciting new tools.

Does the death of ATM bear comparison with telco-grade open-source OSS?

Hands up if you’re old enough to remember ATM here? And I don’t mean the type of ATM that sits on the side of a building dispensing cash – no I mean Asynchronous Transfer Mode.

For those who aren’t familiar with ATM, a little background. ATM was THE telco-grade packet-switching technology of choice for most carriers globally around the turn of the century. Who knows, there might still be some ATM switches/routers out there in the wild today.

ATM was a powerful beast, with enormous configurability and custom-designed with immense scale in mind. It was created by telco-grade standards bodies with the intent of carrying voice, video, data, whatever, over big data pipes.

With such pedigree, you may be wondering then, how it was beaten out by a technology that was designed to cheaply connect small groups of computers clustered within 100 metres of each other (and a theoretical maximum bandwidth of 10Mbps).

Why does the technology that scaled up to become carrier Ethernet exist in modern telco networks, whereas ATM is largely obsoleted? Others may beg to differ, and there are probably a multitude of factors, but I feel it boils down to operational simplicity. Customers wanted operational simplicity and operators didn’t want to have a degree in ATM just to be able to drive it. By being designed to be all things to all people (carriers), did that make ATM compromised from the start?

Now I’ll state up front that I love the initiative and collaboration being shown by many of the telcos in committing to open-source programs like ONAP. It’s a really exciting time for the industry. It’s a sign that the telcos are wresting control back from the vendors in terms of driving where the collective innovation goes.

Buuuuuuut…..

Just like with ATM, are the big open source programs just too big and too complicated? Do you need a 100% focus on ONAP to be able to make it work, or even to follow all the moving parts? Are these initiatives trying to be all things to all carriers instead of narrowing needs down to simpler use cases?

“Sometimes the ‘right’ way to do it just doesn’t exist yet, but often it does exist but is very expensive. So, the question is whether the ‘cheap, bad’ solution gets better faster than the ‘expensive, good’ solution gets cheap. In the broader tech industry (as described in the ‘disruption’ concept), generally the cheap product gets good. The way that the PC grew and killed specialized professional hardware vendors like Sun and SGi is a good example. However, in mobile it has tended to be the other way around – the expensive good product gets cheaper faster than the cheap bad product can get good.”
Ben Evans here.

Is there an Ethernet equivalent in the OSS world, something that’s “cheap, bad” but getting better (and getting customer buy-in) rapidly?

Blown away by one innovation. Now to extend on it

Our most recent two posts, from yesterday and Friday, have talked about one stunningly simple idea that helps to overcome one of OSS’ biggest challenges – data quality. Those posts have stimulated quite a bit of dialogue and it seems there is some consensus about the cleverness of the idea.

I don’t know if the idea will change the OSS landscape (hopefully), or just continue to be a strong selling point for CROSS Network Intelligence, but it has prompted me to think a little longer about innovating around OSS’ biggest challenges.

Our standard approach of just adding more coats of process around our problems, or building up layers of incremental improvements isn’t going to solve them any time soon (as indicated in our OSS Call for Innovation). So how?

Firstly, we have to be able to articulate the problems! If we know what they are, perhaps we can then take inspiration from the CROSS innovation to spur us into new ways of thinking?

Our biggest problem is complexity. That has infiltrated almost every aspect of our OSS. There are so many posts about identifying and resolving complexity here on PAOSS that we might skip over that one in this post.

I decided to go back to a very old post that used the Toyota 5-whys approach to identify the real cause of the problems we face in OSS [I probably should update that analysis because I have a whole bunch of additional ideas now, as I’m sure you do too… suggested improvements welcomed BTW].

What do you notice about the root-causes in that 5-whys analysis? Most of the biggest causes aren’t related to system design at all (although there are plenty of problems to fix in that space too!). CROSS has tackled the data quality root-cause, but almost all of the others are human-centric factors – change controls, availability of skilled resources, requirement / objective mis-matches, stakeholder management, etc. Yet, we always seem to see OSS as a technical problem.

How do you fix those people challenges? Ken Segall puts it this way, “When process is king, ideas will never be. It takes only common sense to recognize that the more layers you add to a process, the more watered down the final work will become.” Easier said than done, but a worthy objective!

Blown away by one innovation – a follow-up concept

Last Friday’s blog discussed how I’ve just been blown away by the most elegant OSS innovation I’ve seen in decades.

You can read more detail via the link, but the three major factors in this simple, elegant solution to data quality problems (probably OSS’ biggest kryptonite) are:

  1. Being able to make connections that break standard object hierarchy rules; but
  2. Having the ability to mark that standard rules haven’t been followed; and
  3. Being able to use the markers to prioritise the fixing of data at a more convenient time

It’s effectively point 2 that has me most excited. So novel, yet so obvious in hindsight. When doing data migrations in the past, I’ve used confidence flags to indicate what I can rely on and what needs further audit / remediation / cleansing. But the recent demo I saw of the CROSS product is the first time I’ve seen it built into the user interface of an OSS.

This one factor, if it spreads, has the ability to change OSS data quality in the same way that Likes (or equivalent) have changed social media by acting as markers of confidence / quality.

Think about this for a moment – what if everyone who interacts with an OSS GUI had the ability to rank their confidence in any element of data they’re touching, with a mechanism as simple as clicking a like/dislike button (or similar)?

Bad example here but let’s say field techs are given a design pack, and upon arriving at site, find that the design doesn’t match in-situ conditions (eg the fibre pairs they’re expecting to splice a customer lead-in cable to are already carrying live traffic, which they diagnose is due to data problems in an upstream distribution joint). Rather than jeopardising the customer activation window by having to spend hours/days fixing all the trickle-down effects of the distribution joint data, they just mark confidence levels in the vicinity and get the customer connected.

The aggregate of that confidence information is then used to show data quality heat maps and help remediation teams prioritise the areas that they need to work on next. It helps to identify data and process improvements using big circle and/or little circle remediation techniques.
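The aggregation step could be as simple as the sketch below. The vote format, the area names and the scoring (share of positive votes per area) are all assumptions made for illustration; they are not how CROSS or any other product actually scores confidence.

```python
from collections import defaultdict

# Hypothetical votes: (area, data element, +1 for "like" / -1 for "dislike").
votes = [
    ("joint-DJ12", "fibre-pair-3", -1),
    ("joint-DJ12", "fibre-pair-4", -1),
    ("joint-DJ12", "fibre-pair-5", +1),
    ("pit-P88", "lead-in-cable", +1),
]


def confidence_by_area(votes):
    """Share of positive votes per area, from 0.0 (distrusted) to 1.0 (trusted)."""
    totals = defaultdict(lambda: [0, 0])            # area -> [positive, all]
    for area, _element, vote in votes:
        totals[area][0] += vote > 0                 # True counts as 1
        totals[area][1] += 1
    return {area: positive / count for area, (positive, count) in totals.items()}


heat_map = confidence_by_area(votes)
remediation_order = sorted(heat_map, key=heat_map.get)   # least trusted first
```

With these sample votes the distribution joint scores one in three and lands at the top of the remediation list, while the pit scores 1.0 and can wait.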

Possibly the most important implication of the in-built ranking system is that everyone in the end-to-end flow, from order takers to designers through to coal-face operators, can better predict whether they need to cater for potential data problems.

Your thoughts?? In what scenarios do you think it could work best, or alternatively, not work?

I’ve just been blown away by the most elegant OSS innovation I’ve seen in decades

Looking back, I now consider myself extremely lucky to have worked with an amazing product on the first OSS project I worked on (all the way back in 2000). And I say amazing because the underlying data models and core product architecture are still better than any other I’ve worked with in the two decades since. The core is the most elegant, simple and powerful I’ve seen to date. Most importantly, the models were designed to cope with any technology, product or service variant that could be modelled as a hierarchy, whether physical or virtual / logical. I never found a technology that couldn’t be modelled into the core product and it required no special overlays to implement a new network model. Sadly, the company no longer exists and the product is languishing on the books of the company that bought out the assets but isn’t leveraging them.

Having been so spoilt on the first assignment, I’ve been slightly underwhelmed by the level of elegant innovation I’ve observed in OSS since. That’s possibly part of the reason for the OSS Call for Innovation published late last year. There have been many exciting innovations introduced since, but many modern tools are still more complex and complicated than they should be, for implementers and operators alike.

But during a product demo last week, I was blown away by an innovation that was so simple in concept, yet so powerful that it is probably the single most impressive innovation I’ve seen since that first OSS. Like any new elegant solution, it left me wondering why it hasn’t been thought of previously. You’re probably wondering what it is. Well first let me start by explaining the problem that it seeks to overcome.

Many inventory-based OSS rely on highly structured and hierarchical data. This is a double-edged sword. Significant inter-relationship of data increases the insight generation opportunities, but the downside is that it can be immensely challenging to get the data right (and to maintain a high-quality data state). Limited data inter-relationships make the project easier to implement, but tend to allow less rich data analyses. In particular, connectivity data (eg circuits, cables, bearers, VPNs, etc) can be a massive challenge because it requires the linking of separate silos of data, often with no linking key. In fact, the data quality problem was probably one of the most significant root-causes of the demise of my first OSS client.

Now getting back to the present. The product feature that blew me away was the first I’ve seen that allows significant inter-relationship of data (yet in a simple data model), but still copes with poor data quality. Let’s say your OSS has a hierarchical data model that comprises Location, Rack, Equipment, Card, Port (or similar) and you have to make a connection from one device’s port to another’s. In most cases, you have to build up the whole pyramid of data perfectly for each device before you can create a customer connection between them. Let’s also say that for one device you have a full pyramid of perfect data, but for the other end, you only know the location.

The simple feature is to connect a port to a location now, or any other point to point on the hierarchy (and clean up the far-end data later on if you wish). It also allows the intermediate hops on the route to be connected at any point in the hierarchy. That’s too simple right, yet most inventory tools don’t allow connections to be made between different levels of their hierarchies. For implementers, data migration / creation / cleansing gets a whole lot simpler with this approach. But what’s even more impressive is that the solution then assigns a data quality ranking to the data that’s just been created. The quality ranking is subsequently considered by tools such as circuit design / routing, impact analysis, etc. However, you’ll have noted that the data quality issue still hasn’t been fixed. That’s correct, so this product then provides the tools that show where quality rankings are lower, thus allowing remediation activities to be prioritised.
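To illustrate the concept (and only the concept), here is a minimal Python sketch of connecting two points at any level of the hierarchy and ranking the result. The `connect` function, the 0.0 to 1.0 quality scale and the naming scheme are assumptions invented for this example; they are not taken from the CROSS product.

```python
HIERARCHY = ["Location", "Rack", "Equipment", "Card", "Port"]


def connect(a_end: tuple[str, str], z_end: tuple[str, str]) -> dict:
    """Connect two points at any hierarchy level and rank the data quality.

    Each end is (level, name). The ranking is a made-up scale: 1.0 when both
    ends are known down to Port level, lower the coarser either end is.
    """
    depths = [HIERARCHY.index(level) for level, _name in (a_end, z_end)]
    quality = min(depths) / (len(HIERARCHY) - 1)
    return {"a_end": a_end, "z_end": z_end, "quality": quality}


connections = [
    # Full pyramid of data at both ends.
    connect(("Port", "SiteA/R1/SW1/C2/P7"), ("Port", "SiteB/R4/SW9/C1/P3")),
    # Far end known only by location: still connectable, but ranked low.
    connect(("Port", "SiteA/R1/SW1/C2/P8"), ("Location", "SiteC")),
]
to_remediate = sorted(connections, key=lambda c: c["quality"])  # worst data first
```

The port-to-location connection is accepted rather than rejected, and its low ranking is what downstream tools (and remediation teams) would use to decide how much to trust it.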

If you have an inventory data quality challenge and / or are wondering about the name of this product, it’s CROSS, from the team at CROSS Network Intelligence (www.cross-ni.com).

Are your existing data sets actually suited to seeding an AI engine?

“In the virtualization domain, the old root cause technology is becoming obsolete because resources and workloads move around dynamically – we no longer have fixed network and compute resources. Existing service assurance systems in the telecommunication network were designed to manage a fixed set of resources and these assurance systems fall short in monitoring dynamic virtualized networks. Code was written using a rule based approach on known problems. Some advances have been made to develop signature patterns to determine the root cause of a problem, but this approach will also fall short in a dynamic virtualized network where autonomous changes will occur continuously.”
Patrick Kelly
here.

This quote is taken from a really interesting article by Patrick Kelly (see link above).

The old models of determining service impact and root-cause certainly struggle to hold up in the transient world of virtualised networks. Artificial Intelligence or Machine Learning or machine-led pattern identification, or whatever the technologies will be called by their developers, have a really important part to play in networks that are not just dynamic, but undergoing a touchpoint explosion.

The fascinating part of this story is that these clever new models will rely on data. Lots of data. We already have lots of data to feed into the new models. Buuuuut…. I’ve long held the reservation that there might be one slight problem… does all of our existing data actually suit the “AI” models available today?

Firstly, our existing data doesn’t include much of a history on dynamically transient networks. But the more important factor is that our networks have been managed by humans – operators who have a tendency to record the quickest, dirtiest (and not necessarily correct or complete) set of data that allows them to restore service quickly.

Following a recent discussion with someone who’s running an AI assurance PoC for a big telco, it seems this reservation is turning out to be true. Their existing data sets just aren’t suited to the AI models. They’re having to reconsider their whole approach to their data model and how to collect / store it. They’re now starting to get positive results from the custom-built data sets.

It’s coming back to the same story as a post from last week – having connectors that can translate the different languages of ops, data, AI, etc and building a people / process / technology solution that the AI models can cope with.

You might not be ready to start an AI experiment yet, but you may like to start the journey by understanding whether your existing data is suited to AI modelling. If not, you get the chance to change it and have a great repository of data to seed an AI engine when you are ready in future. The first step on an exponential OSS journey.

The answer is soooo obvious…. or is it?

There’s a crowded room of OSS experts, a room filled with serious intellectual horsepower. You might be a virtu-OSS-o, but you surely know that there’s still so much to be learnt from those around you. You have the chance to unlock the experiences and insights of your esteemed colleagues. But how? The answer might seem to be obvious. You do so by asking questions. Lots of questions.

But that obvious answer might have just one little unexpected twist.

Do you ask:

  1. Ego questions – questions that demonstrate how clever you are (and thus prove to the other experts that you too are an expert); OR
  2. Embarrassing questions – questions that could potentially embarrass you (and demonstrate major deficiencies in your knowledge, perhaps suggesting that you’re not as much of an expert as everyone else)

I’ve been in those rooms and heard the questions, as you have too no doubt. What do you think the ratio of ego to embarrassing would typically be? 10 to 1? 20 to 1?

The problem with the ego questions is that they can be so specific to the context of a few that they end up steering the conversations to the depths of technology hell (of course they can also end up inspiring / enlightening too, so I’m generalising here).

But have you observed that the very best in our industry happen to ask a lot of embarrassing questions?

A quote by Ramit Sethi splices in brilliantly here: “The very best ask lots of questions. 3 questions I almost never hear: (1) ‘Just a second. If you don’t mind me asking, how did you get to that?’ (2) ‘I’m not sure I understand the conclusion – can you walk me through that?’ (3) ‘How did you see that answer?’ Ask these questions and stop worrying about being embarrassed. How else are you going to learn?”

Just for laughs, next time you’re at one of these events (and I notice that TM Forum Live is coming up in May), try to guess what the ego to embarrassing ratio might be there and which set of questions is spawning the more interesting / insightful / helpful conversations.

An OSS niche market opportunity?

The survey found that 82 percent of service providers conduct less than half of customer transactions digitally, despite the fact that nearly 80 percent of respondents said they are moving forward with business-wide digital transformation programs of varying size and scale. This underscores a large perception gap in understanding, completing and benefiting from digitalization programs.

The study revealed that more than one-third of service providers have completed some aspect of digital transformation, but challenges persist; nearly three-quarters of service providers identify legacy systems and processes, challenges relating to staff and skillsets and business risk as the greatest obstacles to transforming digital services delivery.

Driving a successful digital transformation requires companies to transform myriad business and operational domains, including customer journeys, digital product catalogs, partner management platforms and networks via software-defined networking (SDN) and network functions virtualization (NFV).
Survey from Netcracker and ICT Intuition.

Interesting study from Netcracker and ICT Intuition. To re-iterate with some key numbers and take-aways:

  1. 82% of responding service providers conduct less than half of their customer transactions digitally, so they could (in theory) lift the digital share by at least 50 percentage points. Digital transactions tend to be significantly cheaper for service providers than manual transactions. However, some customers will work the omni-channel experience to find the channel that they’re most comfortable dealing with. In many cases, this means attempting to avoid digital experiences. As a side note, any attempts to become 100% digital are likely to require social / behavioural engineering of customers and/or an associated churn rate.
  2. Nearly 75% of responding service providers identify legacy systems / processes, skillsets and business risk as the biggest challenges. I read this as the difficulty of putting a digital interface onto back-end systems like BSS / OSS tools. This is less of a challenge for newer operators that have been designed with digitalised customer interactions in mind. The other challenge for operators is that the digital front-ends are rarely designed to bolt onto the operators’ existing legacy back-end systems and need significant integration.
  3. If an operator wants to build a digital transaction regime, they should expect an OSS / BSS transformation too.

To overcome these challenges, I’ve noticed that some operators have been building up separate (often low-cost) brands with digital-native front ends, back ends, processes and skills bases. These brands tend to target the ever-expanding digitally native generations and be seen as the stepping stone to obsoleting legacy solutions (and perhaps even legacy business models?).

I wonder whether this is a market niche for smaller OSS players to target and grow into whilst the big OSS brands chase the bigger-brother operator brands?

The challenges in transforming network assurance to network healing

A couple of interesting concepts have the ability to fundamentally change the way networks and services are maintained. If they can be harnessed, we could replace the term “network assurance” with “network healing.”

The first concept is SON, which has been formulated specifically with mobile radio networks in mind, but has the potential to extend into all network types.

“A Self-Organizing Network (SON) is an automation technology designed to make the planning, configuration, management, optimization and healing of mobile radio access networks simpler and faster.”
Wikipedia

One of the challenges of creating self-organising, self-optimising, self-healing networks is that every network has physical points of failure – cable cuts, equipment failure, etc. These can’t be fixed with software alone. That’s where the second concept comes in.

The second concept is smart-contract technology (possibly facilitated by Blockchain), which provides the potential for a more automated way of engaging a mini procurement / delivery / test / payment process to fix physical problems (or logical ones for that matter). Whilst the work might be done in the physical world, it could be done by third-parties, initiated by the OSS via a microservice. Network Fix as a Service (NFaaS), with implementation, test, acceptance and payment all done in software as far as the OSS sees it.
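As a rough illustration of what “all done in software as far as the OSS sees it” could look like, here is a toy state machine for a single fix job. It is plain Python rather than a real smart-contract platform. The state names, the `FixContract` class, the fault identifier and the rule that a failed acceptance test sends the job back to the fix team are all assumptions for the sake of the sketch.

```python
from enum import Enum, auto


class State(Enum):
    RAISED = auto()     # OSS has raised the job
    ACCEPTED = auto()   # a third party has taken it on
    FIXED = auto()      # the fix team reports the work as done
    TESTED = auto()     # the acceptance test has passed
    PAID = auto()       # payment has been released


# The only allowed next step from each state; anything else is rejected.
NEXT = {
    State.RAISED: State.ACCEPTED,
    State.ACCEPTED: State.FIXED,
    State.FIXED: State.TESTED,
    State.TESTED: State.PAID,
}


class FixContract:
    """Toy stand-in for a smart contract covering one physical fault fix."""

    def __init__(self, fault_id: str, fee: float):
        self.fault_id = fault_id
        self.fee = fee
        self.state = State.RAISED
        self.history = [State.RAISED]

    def advance(self, to: State, test_passed: bool = True) -> None:
        if NEXT.get(self.state) is not to:
            raise ValueError(f"cannot move from {self.state.name} to {to.name}")
        if to is State.TESTED and not test_passed:
            # A failed acceptance test sends the job back to the fix team.
            to = State.ACCEPTED
        self.state = to
        self.history.append(to)

    @property
    def payable(self) -> float:
        """Payment is only released once the contract reaches PAID."""
        return self.fee if self.state is State.PAID else 0.0


contract = FixContract("cable-cut-0042", fee=1500.0)
for step in (State.ACCEPTED, State.FIXED, State.TESTED, State.PAID):
    contract.advance(step)
print(contract.state.name, contract.payable)
```

The useful property is that payment cannot be reached without passing through acceptance and test, which is the part that is largely manual today.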

To an extent this already happens via the issuance of ToW (Tickets of Work) to third party fault-fix teams, but it’s currently a largely manual process.

However, the bigger challenge of transforming network assurance to network healing is to find a way to self-heal services that span multiple network domains. This could be physical network functions (PNF), virtual network functions (VNF) and the myriad topologies, technologies and protocols that interconnect them.

I can’t help but think that to simplify (self-healing) we first have to simplify (network variant minimisation).

If we can drastically reduce the number of variants, we have a better chance of building self-heal automations… and don’t just tell me that AI engines will solve all these problems! Maybe one day, but perhaps we can start with baby steps first.

Bringing Eminem’s blank canvas to OSS

“When you start out in your career, you have a blank canvas, so you can paint anywhere that you want because the shit ain’t been painted on yet. And then your second album comes out, and you paint a little more and you paint a little more. By the time you get to your seventh and eighth album you’ve already painted all over it. There’s nowhere else to paint.”
Eminem. (on Rick Rubin and Malcolm Gladwell’s Broken Record podcast)

To each their own. Personally, Eminem’s music has never done it for me, whether his first or eighth album, but the quote above did strike a chord (awful pun).

It takes many, many hours to paint in the detail of an OSS painting. By the time a product has been going for a few years, there’s not much room left on the canvas and the detail of the existing parts of the work is so nuanced that it’s hard to contemplate painting over.

But this doesn’t consider that over the years, OSS have been painted on many different canvases. First there were mainframes, then client-server, relational databases, XaaS, virtualisation (of servers and networks), and a whole continuum in between… not to mention the future possibilities of blockchain, AI, IoT, etc. And that’s not even considering the changes in programming languages along the way. In fact, new canvases are now presenting themselves at a rate that’s hard to keep up with.

The good thing about this is that we have the chance to start over with a blank canvas each time, to create something uniquely suited to that canvas. However, we invariably attempt to bring as much of the old thinking across as possible, immediately leaving little space to paint something new. Constraints that existed on the old canvas don’t always apply to each new canvas, but we still have a habit of bringing them across anyway.

We don’t always ask enough questions like:

  • Does this existing process still suit the new canvas?
  • Can we skip steps?
  • Can we obsolete any of the old / unused functionality?
  • Are old and new architectures (at all levels) easily transmutable?
  • Does the user interface need to be ported or replaced?
  • Do we even need a user interface (assuming the rise of machine-to-machine with IoT, etc)?
  • Does the old data model have any relevance to the new canvas?
  • Do the assurance rules of fixed-network services still apply to virtualised networks?
  • Do the fulfillment rules of fixed-network services still apply to virtualised networks?
  • Are there too many devices to individually manage or can they be managed as a cohort?
  • Does the new model give us access to new data and/or techniques that will allow us to make decisions (or derive insights) differently?
  • Does the old billing or revenue model still apply to the new platform?
  • Can we increase modularity and abstraction between modules?

“The real reason “blockchain” or “AI” may actually change businesses now or in the future, isn’t that the technology can do remarkable things that can’t be done today, it’s that it provides a reason for companies to look at new ways of working, new systems and finally get excited about what can be done when you build around technology.”
Tom Goodwin.

50 exercises to ignite your OSS innovation sessions

Every project starts with an idea… an idea that someone is excited enough to sponsor.

  1. But where are your ideas being generated from?
  2. How do they get cultivated and given time to grow?
  3. How do they get pitched, and how do they get heard?
  4. How are sponsors persuaded?
  5. How do they then get implemented?
  6. How do we amplify this cycle of innovation and implementation?

I’m fascinated by these questions in OSS for the reasons outlined in The OSS Call for Innovation.

If we look at the levels of innovation (to be honest, it’s probably more a continuum than bands / levels):

  1. Process Improvement
  2. Incremental Improvement (new integrations, feature enhancement, etc)
  3. Derivative Ideas (iPhone = internet + phone + music player)
  4. Quantum Innovation (Tablet computing, network virtualisation, cloud delivery models)
  5. Radical Innovations (transistors, cellular wireless networks, Claude Shannon’s Information Theory)

We have so many immensely clever people working in our industry and we’re collectively really good at the first two levels. Our typical mode of working – which could generally be considered fire-fighting (or dare I say it, Agile) – doesn’t provide the time and headspace to work on anything in the longer life-cycles of levels 3-5. These are the levels that can be more impactful, but it’s these levels where we need to carve out time specifically for innovation planning.

If you’re ever planning to conduct innovation fire-starter sessions, I really recommend reading Richard Brynteson’s “50 Activities for Building Innovation.” As the title implies, it provides 50 (simple but powerful) exercises to help groups to generate ideas.

Please contact us if you’d like PAOSS to help facilitate your OSS idea firestarter or road-mapping sessions.

Posing a Network Data Synchronisation Protocol (NDSP) concept

Data quality is one of the biggest challenges we face in OSS. A product could be technically perfect, but if the data being pumped into it is poor, then the user experience of the product will be awful – the OSS becomes unusable, and that in itself generates a data quality death spiral.

This becomes even more important for the autonomous, self-healing, programmable, cooperative networks being developed (think IoT, virtualised networks, Self-Organizing Networks). If we look at IoT networks for example, they’ll be expected to operate unattended for long periods, but with code and data auto-propagating between nodes to ensure a level of self-optimisation.

So today I’d like to pose a question. What if we could develop the equivalent of Network Time Protocol (NTP) for data? Just as NTP synchronises clocking across networks, Network Data Synchronisation Protocol (NDSP) would synchronise data across our networks through a feedback-loop / synchronisation algorithm.

Of course there are differences from NTP. NTP only tries to coordinate one data field (time) along a common scale (time as measured along a 64+64 bit continuum). The only parallel for network data is in life-cycle state changes (eg in-service, port up/down, etc).

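For NTP, the stratum of the clock is defined (see image below from Wikipedia): stratum 0 is the reference clock, stratum 1 synchronises directly to stratum 0, stratum 2 synchronises to stratum 1, and so on.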

This has analogies with data, where some data sources can be seen to be more reliable than others (ie primary sources rather than secondary or tertiary sources). However, there are scenarios where stratum 2 sources (eg OSS) might push state changes down through stratum 1 (eg NMS) and into stratum 0 (the network devices). An example might be renaming of a hostname or pushing a new service into the network.
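One way to picture this is a reconciliation rule that normally trusts the lowest stratum, but lets a deliberate push from a higher stratum win when it is newer. NDSP is only a concept, so everything in this sketch is an assumption: the `Record` fields, the `intent` flag, the timestamps and the hostnames are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str           # e.g. "device", "NMS", "OSS"
    stratum: int          # 0 = network device, 1 = NMS, 2 = OSS
    value: str
    updated_at: int       # seconds on some shared clock
    intent: bool = False  # True when a higher stratum deliberately pushes a change


def reconcile(records):
    """Pick the record every source should converge on.

    Rule 1: a deliberate push (intent=True) wins if it is newer than
            whatever the most trusted source currently reports.
    Rule 2: otherwise trust the lowest stratum, newest record first.
    """
    trusted = min(records, key=lambda r: (r.stratum, -r.updated_at))
    pushes = [r for r in records
              if r.intent and r.updated_at > trusted.updated_at]
    if pushes:
        return max(pushes, key=lambda r: r.updated_at)
    return trusted


# Steady state: the device is the primary source for its own hostname.
steady = [
    Record("device", 0, "syd-sw1", updated_at=100),
    Record("NMS", 1, "syd-sw1", updated_at=90),
    Record("OSS", 2, "syd-sw-old", updated_at=50),
]
print(reconcile(steady).value)

# Rename: the OSS pushes a new hostname down through the strata.
rename = steady + [Record("OSS", 2, "syd-core-sw1", updated_at=200, intent=True)]
print(reconcile(rename).value)
```

A real protocol would also need conflict handling, authentication and rate limiting, none of which is attempted here.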

One challenge would be the vastly different data sets and how to disseminate / reconcile across the network without overloading it with management / communications packets. The other would be format consistency. I once had a device type that had four different port naming conventions, and that was just within its own NMS! Imagine how many port name variations (and translations) might have existed across the multiple inventories that exist in our networks. The good thing about the NDSP concept is that it might force greater consistency across different vendor platforms.
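To show the kind of translation the format-consistency problem implies, here is a small sketch that reduces several spellings of the same port to one canonical form. The spellings are hypothetical and not taken from any real NMS, and the assumption that every name boils down to three numbers (shelf / slot / port) is a simplification.

```python
import re

# Skip any leading non-digits, then capture three numbers separated by
# any non-digit characters ("/", "-", ".", spaces and so on).
_PORT = re.compile(r"^\D*?(\d+)\D+(\d+)\D+(\d+)\s*$")


def canonical_port(name: str) -> str:
    """Reduce a shelf/slot/port style name to 'shelf/slot/port'."""
    match = _PORT.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised port name: {name!r}")
    shelf, slot, port = (int(part) for part in match.groups())
    return f"{shelf}/{slot}/{port}"


# Four invented conventions for what is meant to be the same port.
variants = ["Gi1/0/7", "GigabitEthernet 1/0/7", "ge-1-0-7", "Port 01.00.07"]
print({canonical_port(v) for v in variants})
```

Anything that does not fit the three-number pattern raises an error rather than being guessed at, which seems the safer default when the result feeds a synchronisation loop.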

Another would be that NDSP would become a huge security target as it would have the power to change configurations and because of its reach through the network.

So what do you think? Has the NDSP concept already been developed? Have you implemented something similar in your OSS? What are the scenarios in which it could succeed? Or fail?