Is OSS the future of OSS?

Don’t worry. The title of this post isn’t a typo, but I’ll get to that shortly.

I’ve just had an interesting day 2 at TM Forum’s Digital Transformation Asia (https://dta.tmforum.org and #tmfdigitalasia). The quality of presentations was again quite high with further thought-provoking ideas!!

My favorite session for the day was a panel discussion entitled, “Is open-source the future of OSS/BSS?” Hence the title of today’s blog. Is OSS (open source software) the future of OSS?

Trevor Cheung of OpenROADS Community spoke about their framework for delivering transformation. One point he emphasised was that we’re so wrapped up in Customer Experience (CX), we often forget about Employee Experience. Put simply, if we don’t win the hearts and minds of the implementers, there’s never going to be a transformed experience for the customers to have.

Jurgen Hase of unlimit gave a number of really interesting perspectives, but the best is paraphrased as follows, “The S in IoT stands for security… Wait, what? There is no S in IoT??”

Next was Angelia Ooi of TIME. Angelia provided 8 really useful tips on digital transformation via a presentation pack that is easily the most succinct and polished of all those I’ve seen at DTA so far.

Joddy Hernady of Telkom Indonesia provided some of the economics of becoming a digital telco, which provided an interesting perspective on the benefits of achieving digital transformation.

Finally, the last presentation of the day was the most thought-provoking. Is open source the future of OSS/BSS?
Unfortunately I missed almost all of Catherine Michel’s opening gambit, but I believe the CTO of Sigma Systems made the key point that open source projects such as MongoDB should really only be considered once they’ve reached a level of maturity, ongoing development and support that approaches that of the large ISVs (Independent Software Vendors) such as Sigma Systems. She also highlighted the multi-layered challenges around licensing / rights.
Gnanapriya Chidambaranathan of Infosys contended that there is a wealth of open source projects that can be leveraged, curated and supported by integrators such as Infosys. She posed that open source adoption is a key to innovation.
Venura Mendis of Apigate provided the perspective of an open source software provider. He highlighted the challenge he faces in dealing with traditional carrier procurement teams, particularly in their ambition of reaching comparative TCO (Total Cost of Ownership) models.
Guy Lupo of Telstra provided a number of different and interesting perspectives, as he regularly does, this time on a carrier deciding between ISV, open source products and going down the path (rabbit hole?) of open sourcing their own developments. Guy’s perspectives were really pertinent as he’s currently utilising all of these options in his NaaS (Network as a Service) program at Telstra.

Finally, a few thoughts from me on the topic of OSS as the future of OSS.

1. One of the biggest challenges facing the future of OSS is fragmentation. The PAOSS vendor list has over 200 records (and I’ll be doing a major update again shortly that will add hundreds of additional vendors). This means the available skills pool is diluted with a lot of functionality duplication. It also means it becomes really challenging for customers to choose the right product for their needs (although we could claim that this is a good thing for PAOSS as we often assist customers with this challenge). The proliferation of open source projects that deliver OSS/BSS functionality further fragments and dilutes the skills pool.

2. We’re seeing a trend away from the behemoth software stacks of the past for a variety of reasons, which could be summed up as the laws of physics preventing us from making large-scale OSS pivots. The more modular OSS appear to be more nimble. This plays into the hands of niche open source offerings. It appears contrary to the massive-scale open source efforts of ONAP. Interestingly, the above-mentioned panelists also held doubts over ONAP’s ability to succeed. I should note that they, like me, were also enthusiastic about facets of ONAP such as the collaboration, initiative taken, etc.

3. I still believe there is the potential to build an open-source OSS core that then allows collaboration and plug-ins to be developed, thus better leveraging the long tail of innovation from the available skills pool. Today’s panelists did throw something of a spanner in these works though by pointing out the layered licensing challenge with open source. It’s quite common for open source projects to leverage open source projects, which in turn leverage open source projects. Guy in particular highlighted just how big a problem it has been for Telstra’s procurement team to trace out all the open source threads.

The biggest OSS loser

“You are so much more likely to put effort into something when you know whether it will pay off and what the gains will be. Not knowing how things will turn out undermines your motivation and makes you delay taking action.”
Dr Theo Tsaousides in his book, Brainblocks.

Have you seen the reality TV show, “The Biggest Loser?” I rarely watch TV, but have noticed that it’s been a runaway hit in the ratings here in Australia (and overseas apparently). Why has it been so successful and what does it have to do with OSS?

Well, according to Dr Tsaousides, the success of the show comes down to the obvious body-shape / fitness transformations each of the contestants makes over each season of the show. But more specifically, “You need to watch only one season from beginning to end and you will start craving to be a contestant on the show, regardless of your current weight… Seeing the people’s amazing transformation over a few months is a much more convincing way to start working out and eating well than being told by your doctor that you need to lose weight and about the cardiovascular advantages of exercise. Forecasting a positive outcome, especially when dealing with something new and unfamiliar, leads to action.”

Can you see how this might be a useful technique when planning an OSS transformation?

Change management is always a challenging task on any large OSS transformation. It’s always best to have the entire OSS user population involved in the change, but that’s not always feasible for large groups of users.

It’s one of the reasons I’m always a big advocate for getting a baseline, sandpit version of off-the-shelf OSS stood up and available for the user population to start interacting with. This is particularly helpful if the sandpit is perceptibly better than the current OSS.

To paraphrase, “Forecasting a positive outcome (via the OSS sandpit), especially when dealing with something new and unfamiliar (the future state after OSS transformation), leads to action (more excitement, engagement and less pushback from the user population during the course of the transformation).”

Do you think the biggest loser technique could work on your next OSS transformation?

Facebook’s algorithmic feed for OSS

“This is the logic that led Facebook inexorably to the ‘algorithmic feed’, which is really just tech jargon for saying that instead of this random (i.e. ‘time-based’) sample of what’s been posted, the platform tries to work out which people you would most like to see things from, and what kinds of things you would most like to see. It ought to be able to work out who your close friends are, and what kinds of things you normally click on, surely? The logic seems (or at any rate seemed) unavoidable. So, instead of a purely random sample, you get a sample based on what you might actually want to see. Unavoidable as it seems, though, this approach has two problems. First, getting that sample ‘right’ is very hard, and beset by all sorts of conceptual challenges. But second, even if it’s a successful sample, it’s still a sample… Facebook has to make subjective judgements about what it seems that people want, and about what metrics seem to capture that, and none of this is static or even in principle perfectible. Facebook surfs user behaviour.”
Ben Evans here.

Most of the OSS I’ve seen tend to be akin to Facebook’s old ‘chronological feed’ (where users need to sift through thousands of posts to find what’s most interesting to them).

The typical OSS GUI has thousands of functions (usually displayed on a screen all at once – via charts, menus, buttons, pull-downs, etc). But of all of those available functions, any given user probably only interacts with a handful.
Current-style OSS interface

Most OSS give their users the opportunity to customise their menus, colour schemes, even filters. For some roles such as network ops, designers, order entry operators, there are activity lists, often with sophisticated prioritisation and skills-based routing, which starts to become a little more like the ‘algorithmic feed.’

However, unlike the random nature of information hitting the Facebook feed, there is a more explicit set of things that an OSS user is tasked to achieve. It is a little more directed, like a Google search.

That’s why I feel the future OSS GUI will be more like a simple search bar (like Google) that will provide a direction of intent as well as some recent / regular activity icons. Far less clutter than the typical OSS. The graphs and activity lists that we know and love would still be available to users, but the way of interacting with the OSS to find the most important stuff quickly needs to get more intuitive. In future it may even get predictive in knowing what information will be of interest to you.
OSS interface of the future

Are we better off waiting for OSS technology to catch up?

Yesterday’s post discussed Dave Duggal’s concept of 20th century OSS being all about centralizing command and control to gain efficiency through vertical integration and mass standardization, whilst 21st century OSS are about decentralization – gaining efficiency through horizontal integration of partner ecosystems and mass customization.

We talked about transitioning from a telco market driven by economies of scale (the 20th century benchmark) to a “market of one” (21st century target state), where fully personalised experience exists and is seamless across all channels.

Dave wrote the original article back in 2016. Two years on and some of the technology in our OSS is just starting to catch up to Dave’s concepts. To be completely honest, we still haven’t architected or built the decentralised OSS that truly offer wide-scale partner ecosystems or customer personalisation, particularly at a scale that is cost-viable.

So I’m going to ask a really pointed question. If our OSS are still better suited to 20th century markets and can’t handle the incalculable number of variants that come with a fully personalised customer experience, are we better off waiting for the technology to catch up before trying to build business models that cater to the “market of one?”

Why? Well, as Gadi Solotorevsky, Chief Technology Officer of cVidya, says in this post on TM Forum’s Inform, “…digital customers aren’t known for their patience and or tolerance for errors (I should know – I’m one of them). And any serious glitch, e.g. an error in charging, will not only push them towards a competitor – did I mention how easy is to change digital service providers? It will probably find also its way to social media, causing a ripple effect. The same goes for the partners who are enabling operators to offer cool digital services in the first place.”

Better to have a business model that is simpler and repeatable / reliable at massive scale than attempt a 21st century model where it’s the fall-outs that are scaling.

I’d love to hear your thoughts.

BTW. Kudos to those organisations investing in the bleeding edge tech that are attempting to solve what Dave refers to as “the challenge of our times.” I’m certainly not going to criticise their bold efforts. Just highlighting the point that many operators have 21st century ambitions of their OSS whilst only having 20th century capabilities currently.

OSS feature parity. A functionality arms race

OSS Vendor 1. “I have 1 million features.” (Dr Evil puts finger in mouth)
OSS Vendor 2. “Yeah, well I have 1,000,001 features in my OSS.”

This is the arms-race that we see in OSS, just like almost any other tech product. I imagine that vendors get into this arms-race because they wish to differentiate. Better to differentiate on functionality than price. If there’s feature parity, then the only differentiator is price. We all know that doesn’t end well!

But I often ask myself a few related questions:

  • Of those million features, how many are actually used regularly?
  • As a vendor, do you have logging that actually allows you to know what features are being used?
  • Taking the Whale Curve perspective, even if being used, how many of those features are actually contributing to the objectives of the vendor?
    • Do they clearly contribute towards making sales?
    • Do customers delight in using them?
    • Would customers be irate if you removed them?
    • etc
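
The first two questions are the easiest to answer with data. As a minimal sketch, assuming the OSS writes one usage-log record per feature invocation (the feature names and log shape below are hypothetical, not taken from any real product), a vendor could tally usage and list the features nobody touches:

```python
from collections import Counter

# Hypothetical catalogue of features shipped in the product.
ALL_FEATURES = {"alarm_list", "ticket_create", "topology_view", "bulk_export", "report_builder"}

# Hypothetical usage log: one (user, feature) record per invocation.
usage_log = [
    ("alice", "alarm_list"),
    ("bob", "alarm_list"),
    ("alice", "ticket_create"),
    ("carol", "alarm_list"),
    ("bob", "topology_view"),
]


def summarise_usage(all_features, log):
    """Return (invocation counts per feature, sorted list of never-used features)."""
    counts = Counter(feature for _user, feature in log)
    unused = sorted(all_features - set(counts))
    return counts, unused


counts, unused = summarise_usage(ALL_FEATURES, usage_log)
print("Most used:", counts.most_common(3))
print("Never used:", unused)
```

Counting invocations only answers "is it used?". Whether a feature contributes to sales or delights customers still needs a human judgement on top of the numbers.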

Earlier this week, I spoke about a friend who created an alarm management tool by himself over a weekend. It didn’t have a million features, but it did have all of what I’d consider to be the most important ones. It did look like a lot of other alarm managers that are now on the market. The GUI based on alarm lists still pervades.

If they all look alike, and all have feature parity, how do you differentiate? If you try to add more features, is it safe to assume that those features will deliver diminishing returns?

But is an alarm list and the flicking of tickets the best way to manage network health?

What if, instead of seeking incremental improvement, someone went back to the most important requirements and considered whether the current approach is meeting those customer needs? I have a strong suspicion that customer feedback will indicate that there are definitely flaws to overcome, especially on high event volume networks.

Clever use of large data volumes provides a level of pre-cognition and automation that wasn’t available when simple alarm lists were first invented. This in turn potentially changes the way that operators can engage with network monitoring and management.

What if someone could identify a whole new user interface / approach that overcame the current flaws and exceeded the key requirements? Would that be more of a differentiator than adding a 1,000,002nd feature?

If you’re looking for a comparison, there were plenty of MP3 players on the market with a heap of features, many more than the iPod. We all know how that one played out!

What if the OSS solution lies in its connections?

Imagine for a moment that you’re sitting in front of a pristine chess board, awaiting the opportunity to make your first move. All of the pieces have been exquisitely carved from stone, polished to a sheen. The rules of the game have been established for centuries, so you know exactly which piece is able to move in which sequences. Time to make the opening move.

You’ve studied the games of the masters who have preceded you and have planned your opening gambit, the procession of moves that will hopefully take you into a match-winning position. Due to your skills with modern automations, you’ve connected some of the chess pieces with delicate strings to implement your opening gambit with precision.

Unfortunately, after the first few moves, your strings are starting to pull the pieces out of position. Your opponent has countered well and you’re having to modify your initial plans. You introduce some additional pulleys and springs to help retain the rightful position of your pieces on the board and cope with unexpected changes in strategy. The automations are becoming ever more complex, taking more time to plan and implement than the actual next move.

The board is starting to devolve into unmanageable chaos.

Does this sound like the analogy of a modern OSS? It’s what I refer to as the chessboard analogy.

We’ve been at this OSS game for long enough to already have an understanding of all of the main pieces. TM Forum’s TAM provides this definition as a useful guide. The pieces are modular, elegant and quite well understood by its many players. The rules of the game haven’t really changed much. The main use cases of an OSS from decades ago (ie assure, fulfil, plan, build, etc) probably don’t differ significantly from those of today. This “should” set the foundations for interchangeability of applications.

We see programs of work like ONAP, where millions of lines of code are being developed to re-write the rules of the game. I’m a big advocate of many of the principles of ONAP, but I’m still not sure that such a massive re-write is what’s needed.

It’s not so much in the components of our OSS as in the connections between them where things tend to go awry.

“The foundation of all brilliance is seeing connections when no one else does.”
Richard Parkinson.

This article distills ONAP from its answers back to the core questions. What if instead of seeking an entirely-new architectural stack, we focused on solving the core questions and the chessboard problem – the problem of connections?

Perhaps the answer to the connection problem lies in the interchangeable small grid OSS model discussed in yesterday’s article on planned OSS obsolescence.
But it probably also incorporates what ONAP calls, “real-time, policy-driven orchestration and automation,” to replace pre-defined processes. I wonder instead whether state-based transitions, being guided by intent/policy rules and feedback loops (ie learning systems) might hold the key. An evolving and learning solution that shares similarities with the electrical pathways in our brain, which strengthen the more they’re used and diminish if no longer used.
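
To make that idea a little more concrete, here is a minimal sketch. It is not ONAP’s approach and not a proven design, just a toy: policy rules say which actions are permitted from a given state, and a feedback loop strengthens the pathways that worked while all pathways slowly fade. The state names, action names, reinforcement value and decay rate are all hypothetical.

```python
from collections import defaultdict


class LearnedTransitions:
    """Toy state-transition chooser whose pathways strengthen with successful use."""

    def __init__(self, reinforce=1.0, decay=0.9):
        self.reinforce = reinforce  # added to a pathway after a good outcome
        self.decay = decay          # multiplier applied to every pathway per feedback cycle
        self.weights = defaultdict(float)  # (state, action) -> pathway strength

    def choose(self, state, policy):
        """Pick the policy-permitted action with the strongest pathway (first wins ties)."""
        actions = policy.get(state, [])
        if not actions:
            return None
        return max(actions, key=lambda action: self.weights[(state, action)])

    def feedback(self, state, action, success):
        """Feedback loop: fade every pathway, then strengthen the one that worked."""
        for key in list(self.weights):
            self.weights[key] *= self.decay
        if success:
            self.weights[(state, action)] += self.reinforce


# Hypothetical intent/policy rules: which actions are allowed from which state.
policy = {"link_degraded": ["reroute_traffic", "restart_port", "raise_ticket"]}

engine = LearnedTransitions()
engine.feedback("link_degraded", "restart_port", success=True)
engine.feedback("link_degraded", "restart_port", success=True)
engine.feedback("link_degraded", "reroute_traffic", success=True)
print(engine.choose("link_degraded", policy))
```

In this toy, nothing is pre-defined beyond the policy of permitted actions. The preferred path emerges from use, and a path that stops being used gradually loses its advantage.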

Would an EoL be beneficial for OSS?

In the world of networking, it’s common for devices to go EOL (end-of-life). Capital spend and depreciation models are based around refresh cycles of around 5-7 years. Vendors reinforce this refresh cycle by designing obsolescence into maintenance, support and part supplies. Customers tend to simply submit to the risk of having no vendor support by buying the next generation replacements.

But how often do you hear of an OSS going EOL? Not often right? They tend to get written off only when the cost of upkeep outweighs new revenues.

I know, I can hear you saying that software is different from hardware and of course I agree with you. I’d partially counter by claiming that software architectures and development platforms also have a discernibly useful life just like physical network devices. If you doubt that, I’m sure you’ve seen OSS tools with origins in the 1990s that are still being developed upon. I tend to believe that product usefulness becomes asymptotic for its vendors. With the speed of change and proliferation of new platforms, useful lives are getting ever-shorter.

Would a pre-ordained product replacement life-cycle be beneficial for the OSS industry? It has some merits.

For a start, planned obsolescence enforces designs with interchangeability, in line with the small-grid OSS described yesterday. It promotes short-term enhancements to long-term visions. It becomes easier for customers to write off their investment and inject new capital into the vendor market. It penalises the amount of Frankenstein integrations that tend to become increasingly burdensome (to vendor and customer) into the future. It enforces those mythical beasts of telco software – subtraction projects. It promotes innovation to avoid the asymptotic benefit deterioration curve shown below:
Asymptotic OSS feature development

As the asymptote is being reached, a new jumping-off point commences with the new product.

But it’s a difficult status-quo to break. Vendors have invested millions of developer hours into their products. Taking a product EoL is effectively throwing that invested effort away. For carriers, it means the risk and cost of breaking integrations / processes and replacing them with new ones.

I’d love to hear your thoughts on whether an EOL model might be relevant / useful for your OSS.

The future of work and its impact on OSS

Many years ago, I worked on a seriously big OSS transformation for one of the region’s biggest telcos. Everything was big on the project: the investment, the resources, the documentation. Everything except the outcomes. There was so much inefficiency that I often spoke about making one day of progress for every ten on site. Meetings, bureaucracy, impossible approval cycles, customer re-organisations, over-analysis, etc all added up to stagnation.

This contrasted so much with some of the amazing small teams I’ve worked alongside. Teams that worked cohesively, cleverly and just got stuff done with almost no resources. It’s one of the reasons I feel that the future of work, even for the very large organisations, will be via small teams. Outsourced to small, efficient teams / organisations. The gig economy, and the proliferation of tools that support it, make it an obvious approach to take, especially for very large organisations to leverage. Proof of work technologies, such as those building upon the discovery of blockchain, will provide further impetus to use smaller teams of experts.

Experts like a friend and colleague of mine who once built an alarm management tool in a weekend, by himself. It also happened to be more sophisticated than his employer’s existing tool that had taken years of combined developer effort by a larger team.

Maybe I’ll be proven wrong, but I see the transition to this model of work as being inevitable. The question I have is how to make our OSS more accommodating of this work model. Behemoth OSS stacks won’t. Highly modular OSS made up of many smaller components probably will, as long as they don’t succumb to the OSS chessboard analogy. The pulleys and strings will make it impossible for small, interchangeable teams to decipher and manage.

A small-grid OSS model is the one I’d be backing in.

OSS – like a duck on a pond

Let’s start with a basic question. “What does an OSS need to do?”

The basic answer is, “make operations easier.”

The real answer(s) is so much more nuanced than that of course. The term easier can also encapsulate other words such as faster, more accurate, more repeatable, cheaper, etc.

Designing, building, operating and maintaining a sizable network is extremely challenging, despite network operators around the world, and the vendors that supply to them, employing some of the best and brightest. So we design OSS and related tools / processes to make operations easier.

Yet I sometimes wonder whether we achieve that aim – to make operations easier. Seems to me that we tend to focus more on just replicating functions at a higher layer in the management stack. That is, moving the function to the OSS rather than EMS/NMS, without really making it much easier operationally.

Let’s start at the user interface (UI). How often are they intuitive enough for an experienced network operator to start doing tasks with negligible OSS expert guidance?
Let’s look at deployments. How often are the projects low on effort, risk, cost and complexity?
Let’s look at flexibility (ie in-flight modifications or transformations). How often do we actually deliver flexibility to our customers through our OSS? To ask the same as above, how often are our changes low on effort, risk, cost and complexity?

As a small step towards providing an answer, I wonder whether it’s a case of making the hard things look easy and the easy things look hard.

We want to make the really hard operational things much easier to do within an OSS because that’s the primary purpose of an OSS. That’s the example of a duck on a pond. The OSS is gliding along effortlessly across the top of the water, but under the water it is paddling furiously.

Conversely, we want to make the really easy* operational things look hard to do within an OSS so that we’re not constantly being asked to build functionality / complexity into our OSS that doesn’t warrant being there. It diffuses the intent of the OSS. Just because we can, doesn’t mean we should.

OSS Road-itecture. Part-roadmap, part-architecture

A post from earlier this week discussed a less risky, dependency-reduced, stepping-stone transformation approach. It contrasted with the big-bang delivery model that’s often proposed on OSS projects.

Taking the same train of thought, have you noticed how often architects (including myself) come up with an end-state view of what an OSS, or IT, or networks will be? Have you also noticed that they often seek to demonstrate the cleverness of their architecture in the end-state?

To be honest, I’m more impressed with architectures that cleverly guide a reader through the minefield of complexity via multiple lesser steps and steer towards an intended end-state. To be equally honest, this type of architecture is probably part-roadmap, part-architecture. The journey often demonstrates the impracticality of an ideal end-state.

This may lead to an OSS with compromises but at least it’s not compromised.

The big-bang end-state might look really impressive on paper, but not be viable for the delivery team.

For fear of OSS investment

Friday’s post discussed three analogies about the challenges of performing an OSS pivot.

The biggest challenge in initiating the transformation / replacement of any significant OSS is fear. There are many OSS out there whose “owners” want to change and need to change… but fear changing because a significant pivot would mean a “sell the farm” decision.

The fear is completely understandable. These are highly complex projects, with many potential pitfalls, that consume massive amounts of resources (time, money, people). The risks can be huge for sponsors / stakeholders / investors. Failure of these projects can be career changing. The upside potential rarely balances the downside risk.

So, the only choice we have is to present pivots that aren’t “bet the farm” decisions.

The delivery approach of a bet the farm pivot tends to look like this:
The Bet-the-farm OSS Transformation Approach

The less risky, dependency-reduced, stepping-stone transformation tends to look a bit like this, but probably with a lot more verticals, as described here:
The Stepping-Stone OSS Transformation Approach

Do the laws of physics prevent you from making an OSS pivot?

Aircraft carrier
Image linked from GCaptain.com.

As you already know, the word pivot has become common in the world of business, particularly the world of start-ups. It’s a euphemism for a significant change in strategic direction. In the context of today’s post, I love the word pivot because it implies a rapid change in direction, something that’s seemingly impossible for most of our OSS and the customers who use them.

I like to use analogies. It’s no coincidence that some of the analogies posted here on PAOSS relate to the challenge in making strategic change in our OSS. Here are just three of those analogies:

The OSS inertia principle relates classical physics with our OSS, where Force equals Mass x Acceleration (F = ma). In other words, the greater the mass (of your OSS), the more force must be applied to reach a given acceleration (ie to effect a change).

The OSS chess-board analogy talks about the rubber bands and pulleys (ie integrations) that enmesh the pieces on our OSS chessboard. This means that other pieces get dragged out of position whenever we try to move any individual piece and chaos ensues.

The aircraft carrier analogy compares OSS (and the CSPs they service) with navies of old. In days gone by, CSPs enjoyed command of the sea. Their boats were big, powerful and mobile enough to move around the world. However, their size requires significant planning to change course. The newer application and content communications models are analogous to the advent of aviation. The over the top (OTT) business model has the speed, flexibility, lower cost base and diversity of aircraft. Air supremacy has changed the competitive dynamic. CSPs and our OSS can’t quickly change from being a navy to being an airforce, so the aircraft carrier approach looks to the future whilst working within the constraints of the past.

When making day to day changes within, and to, your OSS does the ability to pivot ever come to mind?

Do you intentionally ensure it stays small, modular and limit its integrations to simplify your game of OSS chess?
If constrained by existing mass that you simply can’t eliminate, do you seek to transform via OSS’s aviation equivalents?
Or like many of the OSS around the world, are you just making them larger, enmeshed behemoths that will never be able to change the laws of physics and achieve a pivot?

Do any of our global target architectures represent such behemoths?

Build an OSS and they will come… or sometimes not

Build it and they will come.

This is not always true for OSS. Let me recount a few examples.

The project team is disconnected from the users – The team that’s building the OSS in parallel to existing operations doesn’t (or isn’t able to) engage with the end users of the OSS. Once it comes time for cut-over, the end users want to stick with what they know and don’t use the shiny new OSS. From painful experience I can attest that stakeholder management is under-utilised on large OSS projects.

Turf wars – Different groups within a customer are unable to gain consensus on the solution. For example, the operational design team gains the budget to build an OSS but the network assurance team doesn’t endorse this decision. The assurance team then decides not to endorse or support the OSS that is designed and built by the design team. I’ve seen an OSS worth tens of millions of dollars turned off less than 2 years after handover because of turf wars. Stakeholder management again, although this could be easier said than done in this situation.

It sounded like a good idea at the time – The very clever OSS solution team keeps coming up with great enhancements that don’t get used, for whatever reason (eg non fit-for-purpose, lack of awareness of its existence by users, lack of training, etc). I’ve seen a customer that introduced over 500 customisations to an off-the-shelf solution, yet hundreds of those customisations hadn’t been touched by users within a full year prior to doing a utilisation analysis. That’s right, not even used once in the preceding 12 months. Some made sense because they were once-off tools (eg custom migration activities), but many didn’t.

The new OSS is a scary beast – The new solution might be perfect for what the customer has requested in terms of functionality. But if the solution differs greatly from what the operators are used to, it can be too intimidating to be used. A two-week classroom-based training course at the end of an OSS build doesn’t provide sufficient learning to pick up all the nuances of the new system to the level that operators have developed with the old solution. Each significant new OSS needs an apprenticeship, not just a short-course.

It’s obsolete before it’s finished – OSS work in an environment of rapid change – networks, IT infrastructure, organisation models, processes, product offerings, regulatory shifts, disruptive innovation, etc, etc. The longer an OSS takes to implement, the greater the likelihood of obsolescence. All the more reason for designing for incremental delivery of business value rather than big-bang delivery.

What other examples have you experienced where an OSS has been built, but the users haven’t come?

Falsely rewarding based on OSS existence rather than excellence

There’s a common belief that most jobs see people rewarded for presence rather than performance. That is, they’re encouraged to be on site from 9am to 5pm rather than being given free rein over their work schedules as long as key outcomes are met / exceeded.

In OSS vendor / product selection there’s a similar concept. Contracts are often awarded based on existence rather than excellence. When evaluating a product, if it’s able to do a majority of the functions in the long list of requirements then the box is ticked.

However, this doesn’t take into account that there are usually only a very small number of functions that any given customer’s OSS needs to perform at a very high level of efficiency. All the others are effectively just nice to have. That’s the 80/20 rule at work.

When guiding a customer through their vendor selections, I always take them through an exercise to identify the use-cases / functions that really matter. Then we ensure that the demos or proofs of concept focus closely on how excellent the OSS is at those most important factors.
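To make the difference between existence and excellence concrete, here’s a minimal sketch of the two scoring styles. It’s only an illustration. The vendors, function names and 0 to 5 scores are hypothetical, and a real evaluation would have many more rows and probably weightings too.

```python
def checkbox_score(capabilities):
    """Existence-based scoring: fraction of requirements the product can do at all."""
    return sum(1 for score in capabilities.values() if score > 0) / len(capabilities)


def excellence_score(capabilities, critical):
    """Excellence-based scoring: average quality (0-5) on the critical use-cases only."""
    return sum(capabilities[name] for name in critical) / len(critical)


# Hypothetical demo scores out of 5 (0 = function not offered)
vendor_a = {"alarm_correlation": 2, "service_activation": 2,
            "reporting": 4, "gis_view": 3, "chatbot": 3}
vendor_b = {"alarm_correlation": 5, "service_activation": 4,
            "reporting": 2, "gis_view": 0, "chatbot": 0}

# The small number of functions this (hypothetical) customer really needs
critical = ["alarm_correlation", "service_activation"]

for name, vendor in (("A", vendor_a), ("B", vendor_b)):
    print(name, checkbox_score(vendor), excellence_score(vendor, critical))
```

In this made-up example vendor A ticks every box but is mediocre where it matters, whereas vendor B ticks fewer boxes but excels at the critical use-cases. A tick-the-box evaluation would pick A.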

OSS automations – just because we can, doesn’t mean we should

Automation is about using machines / algorithms to respond faster than humans can, or more efficiently than humans can, or more accurately than humans can… but only if the outcomes justify the costs. When it comes to automations, it’s a case of, “just because we can, doesn’t mean we should.”

The more complex the decision tree you’re trying to automate, the higher the costs and therefore the harder it becomes to cost-justify. So the first step in any automation is taking a lateral thinking approach to simplifying the decision tree.

This recent post highlighted a graph from Nokia’s Bell Labs and the financial dependency that network slicing has on operational automation:
Nokia Network Slicing

Let’s use the Toyota Five Whys technique to work our way through the implications of this:

Statement 0: As CSPs, we need to drastically reduce complexity in the processes / decision-trees across our whole organisation.

Why 1? So that we can apply significant levels of automation

Why 2? So that we can apply technologies / techniques such as network slicing or virtualisation that are cost-justifiable

Why 3? So that we can offer differentiated, premium services

Why 4? So that our offerings don’t become commodities

Why 5? So that we retain corporate profitability to return to shareholders and/or invest in further interesting projects

I love that we’re looking to all manner of automation technologies / techniques to apply to our OSS. However, we’re bypassing the all-important statement 0. We’re starting at Why 1 and partially missing the cost-justifiable part of Why 2. If our automation projects don’t prove cost-justifiable, then we never get the chance to reach whys 3, 4 and 5.

OSS implementation, but without the dependencies

One of the challenges with getting a new OSS or OSS transformation project completed can be the large number of dependencies that can cause momentum gridlock. If you’re looking to deliver business value in one big-bang, which is a really common approach to delivering OSS projects, then you end up juggling many different activities and hoping they all align at the right times.

I’ve noticed that the vendors tend to design their delivery schedules around big-bang / waterfall approaches like below.
Big-bang OSS delivery

Many vendors will even assure you that this is their standard practice and are hesitant to consider changes to their “best practice” delivery scheduling. Having been involved in many of these types of deliveries in the past, on both vendor and customer side, I can assure you that they rarely work well.

Generally speaking, the gridlocks occur on the customer-side, but the result is detrimental to customer and vendor alike. Hold-ups mean inefficient allocation of resources as well as the resultant cost / time over-runs.

The alternative is to apply a bit more lateral thinking to how you break down the work into smaller chunks. The lateral thinking work breakdown aims are two-fold:

  1. To break up the work so that it best avoids dependencies; whilst also
  2. To deliver some sort of value to the customer

There are many dependencies on a typical OSS project – hardware, procurement, IT infrastructure, network connectivity, security, approvals, integrations, licensing, resource availability, data quality and many more. However, each different customer, their org chart and project has its own unique mix of dependencies, so I don’t subscribe to the “best practice” argument to project delivery.

The diagram below shows an example of an alternate breakdown. The business value chunks that are delivered might be tiny in some cases, but at least momentum can be demonstrated. Rather than having a mass of entwined dependencies, you can isolate and minimise dependencies for that sliver of business value. When the dependency/ies has cleared, you can jump straight onto the next activity from an existing build-state rather than having to align all the activities to land in perfect precision.
Incremental OSS work breakdown
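One way to think about this kind of breakdown is as a simple lookup: which slivers of business value can be started right now, given the dependencies that have already cleared? The sketch below is only an illustration of that idea. The chunk names and dependency labels are hypothetical, and a real project would track far more than this.

```python
def ready_chunks(chunks, cleared):
    """Return names of work chunks whose dependencies have all cleared.

    chunks maps a chunk name to the set of dependencies it waits on.
    Chunks with the fewest dependencies come first.
    """
    ready = [name for name, deps in chunks.items() if deps <= cleared]
    return sorted(ready, key=lambda name: (len(chunks[name]), name))


# Hypothetical work breakdown, for illustration only
chunks = {
    "read_only_inventory_view": {"network_connectivity"},
    "alarm_list_single_domain": {"network_connectivity", "security_approval"},
    "full_service_activation": {"hardware", "integrations",
                                "data_quality", "security_approval"},
}
cleared = {"network_connectivity", "security_approval"}

print(ready_chunks(chunks, cleared))
```

The big-bang item stays parked until its dependencies clear, but the two smaller chunks can be delivered in the meantime, which is where the momentum comes from.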

OSS project stalled? Cancel it

“When a project appears to be in limbo, in a permanent holding pattern, where sunk costs meet opportunity costs, where no one can figure out what to do…

Cancel it.
Cancel it with a week’s notice.

One of two things will happen:
A. A surge of support and innovation will arrive, and it won’t be stuck any more.
B. You’ll follow through and cancel it, and you won’t be stuck any more.

It costs focus and momentum to carry around the stalled. Let it go.”
Seth Godin on his blog here.

OSS projects have a tendency to get so big and complex, and with so many dependencies, that they can stagnate. When projects stagnate, we have a tendency to treat them with contempt or cynicism, don’t we? We treat them this way even when we’re involved, so you know that outsiders are treating them with even more contempt and cynicism.

So Seth’s concept is an interesting one. I haven’t tried his technique before.

Have you? Did it achieve your desired outcomes?
Did it rally the troops? Did it clear the way for assignment of resources onto better projects, Darwinian-style? Or did it just throw away the last vestiges of momentum and all sunk costs?

OSS holds the key to network slicing

“Network slicing opens new business opportunities for operators by enabling them to provide specialized services that deliver specific performance parameters. Guaranteeing stringent KPIs enables operators to charge premium rates to customers that value such performance. The flip side is that such agreements will inevitably come with tough contractual obligations and penalties when the agreed KPIs are not met…even high numbers of slices could be managed without needing to increase the number of operational staff. The more automation applied, the lower the operating costs. At 100 percent automation, there is virtually no cost increase with the number of slices. Granted this is a long-term goal and impractical in the short to medium term, yet even 50 percent automation will bring very significant benefits.”
From a paper by Nokia – “Unleashing the economic potential of network slicing.”

With typical communications services tending towards commoditisation, operators will naturally seek out premium customers. Customers with premium requirements such as latency, throughput, reliability, mobility, geography, security, analytics, etc.

These custom requirements often come with unique network configuration requirements. This is why network slicing has become an attractive proposition. The white paper quoted above makes an attempt at estimating profitability of network slicing including some sensitivity analyses. It makes for an interesting read.

The diagram below is one of many contained in the White Paper:
Nokia Network Slicing

It indicates that a significant level of automation is going to be required to achieve an equivalent level of operational cost to a single network. To re-state the quote, “The more automation applied, the lower the operating costs. At 100 percent automation, there is virtually no cost increase with the number of slices. Granted this is a long-term goal and impractical in the short to medium term, yet even 50 percent automation will bring very significant benefits.”
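To get a feel for the shape of that relationship, here’s a deliberately simplistic toy model. It is not the model from the Nokia paper. It just assumes that each additional slice adds a fixed fraction of manual operating effort (the hypothetical per_slice_cost) and that automation removes a matching share of that added effort.

```python
def relative_opex(slices, automation, per_slice_cost=0.2):
    """Operating cost relative to a single unsliced network (= 1.0).

    Assumes each extra slice adds per_slice_cost of manual effort and
    that automation (0.0 to 1.0) removes that share of the added effort.
    These are illustrative assumptions, not figures from the paper.
    """
    if slices < 1:
        raise ValueError("slices must be at least 1")
    if not 0.0 <= automation <= 1.0:
        raise ValueError("automation must be between 0 and 1")
    return 1.0 + (slices - 1) * per_slice_cost * (1.0 - automation)


for automation in (0.0, 0.5, 1.0):
    costs = [round(relative_opex(n, automation), 2) for n in (1, 11, 101)]
    print(f"{automation:.0%} automation:", costs)
```

Under these assumptions the cost at 100 percent automation stays flat no matter how many slices are added, while 50 percent automation halves the growth in cost rather than eliminating it.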

Even 50% operational automation is a significant ambition. OSS hold the key to delivering on this ambition. Such ambitious automation goals mean we have to look at massive simplification of operational variant trees. Simplifications that include, but go far beyond, OSS, BSS and networks. This implies whole-stack simplification.

If ONAP is the answer, what are the questions?

“ONAP provides a comprehensive platform for real-time, policy-driven orchestration and automation of physical and virtual network functions that will enable software, network, IT and cloud providers and developers to rapidly automate new services and support complete lifecycle management.
By unifying member resources, ONAP is accelerating the development of a vibrant ecosystem around a globally shared architecture and implementation for network automation–with an open standards focus–faster than any one product could on its own.”
Part of the ONAP charter from onap.org.

The ONAP project is gaining attention in service provider circles. The Steering Committee of the ONAP project hints at the types of organisations investing in the project. The statement above summarises the mission of this important project. You can bet that the mission has been carefully crafted. As such, one can assume that it represents what these important stakeholders jointly agree to be the future needs of their OSS.

I find it interesting that there are quite a few technical terms (eg policy-driven orchestration) in the mission statement, terms that tend to pre-empt the solution. However, I don’t feel that pre-emptive technical solutions are the real mission, so I’m going to try to reverse-engineer the statement into business needs. Hopefully the business needs (the “why? why? why?” column below) articulate a set of questions / needs that all OSS can work to, as opposed to replicating the technical approach that underpins ONAP.

| Phrase | Interpretation | Why? Why? Why? |
| real-time | The ability to make instantaneous decisions | Why 1: To adapt to changing conditions |
| | | Why 2: To take advantage of fleeting opportunities or resolve threats |
| | | Why 3: To optimise key business metrics such as financials |
| | | Why 4: As CSPs are under increasing pressure from shareholders to deliver on key metrics |
| policy-driven orchestration | To use policies to increase the repeatability of key operational processes | Why 1: Repeatability provides the opportunity to improve efficiency, quality and performance |
| | | Why 2: Allows an operator to service more customers at less expense |
| | | Why 3: Improves corporate profitability and customer perceptions |
| | | Why 4: As CSPs are under increasing pressure from shareholders to deliver on key metrics |
| policy-driven automation | To use policies to increase the amount of automation that can be applied to key operational processes | Why 1: Automated processes provide the opportunity to improve efficiency, quality and performance |
| | | Why 2: Allows an operator to service more customers at less expense |
| | | Why 3: Improves corporate profitability and customer perceptions |
| physical and virtual network functions | Our networks will continue to consist of physical devices, but we will increasingly introduce virtualised functionality | Why 1: Physical devices will continue to exist into the foreseeable future but virtualisation represents an exciting approach into the future |
| | | Why 2: Virtual entities are easier to activate and manage (assuming sufficient capacity exists) |
| | | Why 3: Physical equipment supply, build, deploy and test cycles are much longer and labour intensive |
| | | Why 4: Virtual assets are more flexible, faster and cheaper to commission |
| | | Why 5: Customer services can be turned up faster and cheaper |
| software, network, IT and cloud providers and developers | With this increase in virtualisation, we find an increasingly large and diverse array of suppliers contributing to our value-chain. These suppliers contribute via software, network equipment, IT functions and cloud resources | Why 1: CSPs can access innovation and efficiency occurring outside their own organisation |
| | | Why 2: CSPs can leverage the opportunities those innovations provide |
| | | Why 3: CSPs can deliver more attractive offers to customers |
| | | Why 4: Key metrics such as profitability and customer satisfaction are enhanced |
| rapidly automate new services | We want the flexibility to introduce new products and services far faster than we do today | Why 1: CSPs can deliver more attractive offers to customers faster than competitors |
| | | Why 2: Key metrics such as market share, profitability and customer satisfaction are enhanced as well as improved cashflow |
| support complete lifecycle management | The components that make up our value-chain are changing and evolving so quickly that we need to cope with these changes without impacting customers across any of their interactions with their service | Why 1: Customer satisfaction is a key metric and a customer’s experience spans the entire lifecycle of their service |
| | | Why 2: CSPs don’t want customers to churn to competitors |
| | | Why 3: Key metrics such as market share, profitability and customer satisfaction are enhanced |
| unifying member resources | To reduce the amount of duplicated and under-synchronised development currently being done by the member bodies of ONAP | Why 1: Collaboration and sharing reduces the effort each member body must dedicate to their OSS |
| | | Why 2: A reduced resource pool is required |
| | | Why 3: Costs can be reduced whilst still achieving a required level of outcome from OSS |
| vibrant ecosystem | To increase the level of supplier interchangeability | Why 1: To reduce dependence on any supplier/s |
| | | Why 2: To improve competition between suppliers |
| | | Why 3: Lower prices, greater choice and greater innovation tend to flourish in competitive environments |
| | | Why 4: CSPs, as customers of the suppliers, benefit |
| globally shared architecture | To make networks, services and support systems easier to interconnect across the global communications network | Why 1: Collaboration on common standards reduces the integration effort between each member at points of interconnect |
| | | Why 2: A reduced resource pool is required |
| | | Why 3: Costs can be reduced whilst still achieving interconnection benefits |

As indicated in earlier posts, ONAP is an exciting initiative for the CSP industry for a number of reasons. My fear for ONAP is that it becomes such a behemoth of technical complexity that it becomes too unwieldy for use by any of the member bodies. I use the analogy of ATM versus Ethernet here, where ONAP is equivalent to ATM in power and complexity. The question is whether there’s an Ethernet answer to the whys that ONAP is trying to solve.

I’d love to hear your thoughts.

(BTW. I’m not saying that the technologies the ONAP team is investigating are the wrong ones. Far from it. I just find it interesting that the mission is starting with a technical direction in mind. I see parallels with the OSS radar analogy.)

Where are the reliability hotspots in your OSS?

As you already know, there are two categories of downtime – unplanned (eg failures) and planned (eg upgrades / maintenance).

Planned downtime sounds a lot nicer (for operators) but the reality is that you could call both types “incidents” – they both impact (or potentially impact) the customer. We sometimes underestimate that fact.

Today’s question is whether you’re able to identify where the hotspots are in your OSS suite when you combine both types of downtime. Can you tell which outages are service-impacting?

In a round-about way, I’m asking whether you already have a dashboard that monitors uptime of all the components (eg applications, probes, middleware, infra, etc) that make up your complete OSS / BSS estate? If you do, does it tell you what you anecdotally know already, or are there sometimes surprises?

Does the data give you the evidence you need to negotiate with the implementers of problematic components (eg patch cadence, the need for reliability fixes, streamlining the patch process, reduction in customisations, etc)? Does it give you reason to make architectural changes (eg webscaling)?
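If you don’t have that dashboard yet, the underlying calculation doesn’t need to be complicated. The sketch below is only an illustration, assuming you can pull a log of downtime events per component. The component names, the tuple layout and the 720-hour month are all hypothetical.

```python
from collections import defaultdict


def downtime_hotspots(incidents, period_hours):
    """Rank components by total downtime, combining planned and unplanned.

    incidents is an iterable of (component, kind, hours) tuples where kind
    is "planned" or "unplanned". Returns a list of
    (component, total_hours, availability) tuples, worst first.
    """
    totals = defaultdict(float)
    for component, kind, hours in incidents:
        if kind not in ("planned", "unplanned"):
            raise ValueError(f"unknown downtime kind: {kind}")
        totals[component] += hours
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(name, hours, 1.0 - hours / period_hours) for name, hours in ranked]


# Hypothetical incident log for a 30-day month (720 hours)
log = [
    ("inventory", "planned", 6.0),
    ("inventory", "unplanned", 1.5),
    ("fault_mgmt", "unplanned", 2.0),
    ("mediation", "planned", 4.0),
    ("mediation", "planned", 5.0),
]
hotspots = downtime_hotspots(log, period_hours=720)
for name, hours, availability in hotspots:
    print(f"{name}: {hours}h down, {availability:.2%} available")
```

In this made-up log the worst offender is a component that never actually failed. All of its downtime was planned, which is exactly the sort of surprise that only shows up when both types of downtime are combined.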