OSS – just in time rather than just in case

We all know that once installed, OSS tend to stay in place for many years. Too much effort to air-lift in. Too much effort to air-lift back out, especially if tightly integrated over time.

The monolithic COTS (off-the-shelf) tools of the past would generally be commissioned and customised during the initial implementation project, with occasional integrations thereafter. That meant we needed to plan out what functionality might be required in future years and ask for it to be implemented, just in case. Along with all the baked-in functionality that is never needed, and the just in case but possibly never used, we ended up with a lot of bloat in our OSS.

With the current approach of implementing core OSS building blocks, then utilising rapid release and microservice techniques, we have an ongoing enhancement train. This provides us with an opportunity to build just in time, to build only functionality that we know to be essential.

This has pluses and minuses. On the plus side, we have more opportunity to restrict delivery to only what’s needed. On the minus side, a just in time mindset can build a stop-gap culture rather than strategic, long-term thinking. It’s always good to have long-term thinkers / planners on the team to steer the rapid release implementations (and reductions / refactoring) and avoid a new cause of bloat.

An OSS doomsday scenario

If I start talking about doomsday scenarios where the global OSS job industry is decimated, most people will immediately jump to the conclusion that I’m predicting an artificial intelligence (AI) takeover. AI could have a role to play, but is not a key facet of the scenario I’m most worried about.
OSS doomsday scenario

You’d think that OSS would be quite a niche industry, but there must be thousands of OSS practitioners in my home town of Melbourne alone. That’s partly due to large projects currently being run in Australia by major telcos such as nbn, Telstra, SingTel-Optus and Vodafone, not to mention all the smaller operators. Some of these projects are likely to scale back in coming months / years, meaning less seats in a game of OSS musical chairs. But this isn’t the doomsday scenario I’m hinting at in the title either. There will still be many roles at the telcos and the vendors / integrators that support them.

There are hundreds of OSS vendors in the market now, with no single dominant player. It’s a really fragmented market that would appear to be ripe for M&A (mergers and acquisitions). Ripe for consolidation, but massive consolidation is still not the doomsday scenario because there would still be many OSS roles in that situation.

The doomsday scenario I’m talking about is one where only one OSS gains domination globally. But how?

Most traditional telcos have a local geographic footprint with partners/subsidiaries in other parts of the world, but are constrained by the costs and regulations of a wired or cellular footprint to be able to reach all corners of the globe. All that uniqueness currently leads to the diversity of OSS offerings we see today. The doomsday scenario arises if one single network operator usurps all the traditional telcos and their legacy network / OSS / BSS stacks in one technological fell swoop.

How could a disruption of that magnitude happen? I’m not going to predict, but a satellite constellation such as the one proposed by Starlink has some of the hallmarks of such a scenario. By using low-earth orbit (LEO) satellites (ie lower latency than geostationary satellite solutions), point-to-point laser interconnects between them and peering / caching of data in the sky, it could fundamentally change the world of communications and OSS.

It has global reach, no need for carrier interconnect (hence no complex contract negotiations or OSS/BSS integration for that matter), no complicated lead-in negotiations or reinstatements, no long-haul terrestrial or submarine cable systems. None of the traditional factors that cost so much time and money to get customers connected and keep them connected (only the complication of getting and keeping the constellation of birds in the sky – but we’ll put that to the side for now). It would be hard for traditional telcos to compete.

I’m not suggesting that Starlink can or will be THE ubiquitous global communications network. What if Google, AWS or Microsoft added this sort of capability to their strengths in hosting / data? Such a model introduces a new, consistent network stack without the telcos’ tech debt burdens discussed here. The streamlined network model means the variant tree is millions of times simpler. And if the variant tree is that much simpler, so is the operations model and so is the OSS… with one distinct contradiction. It would need to scale for billions of customers rather than millions and trillions of events.

You might be wondering about all the enterprise OSS. Won’t they survive? Probably not. Comms networks are generally just an important means-to-an-end for enterprises. If the one global network provider were to service every organisation with local or global WANs, as well as all the hosting they would need, and hosted zero-touch network operations like Google is already pre-empting, would organisation have a need to build or own an on-premises OSS?

One ubiquitous global network, with a single pared back but hyperscaled OSS, most likely purpose-built with self-healing and/or AI as core constructs (not afterthoughts / retrofits like for existing OSS). How many OSS roles would survive that doomsday scenario?

Do you have an alternative OSS doomsday scenario that you’d like to share?

Hat tip again to Jay Fenton for pointing out what Starlink has been up to.

The OSS MoSCoW requirement prioritisation technique

Since the soccer World Cup is currently taking place in Russia, I thought I’d include reference to the MoSCoW technique in today’s blog. It could be used as part of your vendor selection processes for the purpose of OSS requirement prioritisation.

The term MoSCoW itself is an acronym derived from the first letter of each of four prioritization categories (Must have, Should have, Could have, and Won’t have), with the interstitial Os added to make the word pronounceable.”
Wikipedia.

It can be used to rank the importance an OSS operator gives to each of their requirements, such as the sample below:

Index Description MoSCoW
1 Requirement #1 Must have (mandatory)
2 Requirement #2 Should have (desired)
3 Requirement #3 Could have (optional)
4 Requirement #5 Won’t have (not required)

But that’s only part of the story in a vendor selection process – the operator’s wish-list. This needs to be cross-referenced with a vendor or solution integrator’s ability to deliver to those wishes. This is where the following compliance codes come in:

  • FC – Fully Compliant – This functionality is available in the baseline installation of the proposed OSS
  • NC – Non-compliant – This functionality is not able to be supported by the proposed OSS
  • PC – Partially Compliant – This functionality is partially supported by the proposed OSS (a vendor description of what is / isn’t supported is required)
  • WC – Will Comply – This functionality is not available in the baseline installation of the proposed software, but will be provided via customisation of the proposed OSS
  • CC – Can Comply – This functionality is possible, but not part of the offer and can be provided at additional cost

So a quick example might look like the following:

Index Description MoSCoW Compliance Comment
1 Requirement #1 Must have (mandatory) FC
2 Requirement #2 Should have (desired) PC Can only delivery the functionality of 2a, not 2b in this solution

Yellow columns are created by the operator / customer, blue cells are populated by the vendor / SI. I usually also add blue columns to indicate which product module delivers the compliance and room to pose questions / assumptions back to the customer.

BTW. I’ve only heard of the MoSCoW acronym recently but have been using a similar technique for years. My prioritisation approach is a little simpler. I just use Mandatory, Preferred, Optional.

The OSS dart-board analogy

The dartboard, by contrast, is not remotely logical, but is somehow brilliant. The 20 sector sits between the dismal scores of five and one. Most players aim for the triple-20, because that’s what professionals do. However, for all but the best darts players, this is a mistake. If you are not very good at darts, your best opening approach is not to aim at triple-20 at all. Instead, aim at the south-west quadrant of the board, towards 19 and 16. You won’t get 180 that way, but nor will you score three. It’s a common mistake in darts to assume you should simply aim for the highest possible score. You should also consider the consequences if you miss.”
Rory Sutherland
on Wired.

When aggressive corporate goals and metrics are combined with brilliant solution architects, we tend to aim for triple-20 with our OSS solutions don’t we? The problem is, when it comes to delivery, we don’t tend to have the laser-sharp precision of a professional darts player do we? No matter how experienced we are, there tends to be hidden surprises – some technical, some personal (or should I say inter-personal?), some contractual, etc – that deflect our aim.

The OSS dart-board analogy asks the question about whether we should set the lofty goals of a triple-20 [yellow circle below], with high risk of dismal results if we miss (think too about the OSS stretch-goal rule); or whether we’re better to target the 19/16 corner of the board [blue circle below] that has scaled back objectives, but a corresponding reduction in risk.

OSS Dart-board Analogy

Roland Leners posed the following brilliant question, “What if we built OSS and IT systems around people’s willingness to change instead of against corporate goals and metrics? Would the corporation be worse off at the end?” in response to a recent post called, “Did we forget the OSS operating model?

There are too many facets to count on Roland’s question but I suspect that in many cases the corporate goals / metrics are akin to the triple-20 focus, whilst the team’s willingness to change aligns to the 19/16 corner. And that is bound to reduce delivery risk.

I’d love to hear your thoughts!!

The OSS farm equipment analogy

OSS End of Financial Year
It’s an interesting season as we come up to the EOFY (end of financial year – on 30 June). Budget cycles are coming to an end. At organisations that don’t carry un-spent budgets into the next financial year, the looming EOFY triggers a use-it-or-lose-it mindset.

In some cases, organisations are almost forced to allocate funds on OSS investments even if they haven’t always had the time to identify requirements and / or model detailed return projections. That’s normally anathema to me because an OSS‘ reputation is determined by the demonstrable value it creates for years to come. However, I can completely understand a client’s short-term objectives. The challenge we face is to minimise any risk of short-term spend conflicting with long-term objectives.

I take the perspective of allocating funds to build the most generally useful asset (BTW, I like Robert Kiyosaki’s simple definition of an asset as, “in reality, an asset is only something that puts money in your pocket,”) In the case of OSS, putting money in one’s pocket needs to consider earnings [or cost reductions] that exceed outgoings such as maintenance, licensing, operations, etc as well as cost of capital. Not a trivial task!

So this is where the farm equipment analogy comes in.

If we haven’t had the chance to conduct demand estimation (eg does the telco’s market want the equivalent of wheat, rice, stone fruit, etc) or product mix modelling (ie which mix of those products will bear optimal returns) then it becomes hard to predict what type of machinery is best fit for our future crops. If we haven’t confirmed that we’ll focus efforts on wheat, then it could be a gamble to invest big in a combine harvester (yet). We probably also don’t want to invest capital and ongoing maintenance on a fruit tree shaker if our trees won’t begin bearing fruit for another few years.

Therefore, a safer investment recommendation would be on a general-purpose machine that is most likely to be useful for any type of crop (eg a tractor).

In OSS terminology, if you’re not sure if your product mix will provision 100 customers a day or 100,000 then it could be a little risky to invest in an off-the-shelf orchestration / provisioning engine. Still potentially risky, but less so, would be to invest in a resource and service inventory solution (if you have a lot of network assets), alarm management tools (if you process a lot of alarms), service order entry, workforce management, etc.

Having said that, a lot of operators already have a strong gut-feel for where they intend to get returns on their investment. They may not have done the numbers extensively, but they know their market roadmap. If wheat is your specialty, go ahead and get the combine harvester.

I’d love to get your take on this analogy. How do you invest capital in your OSS without being sure of the projections (given that we’re never sure on projections becoming reality)?

How economies of unscale change the OSS landscape

For more than a century, economies of scale made the corporation an ideal engine of business. But now, a flurry of important new technologies, accelerated by artificial intelligence (AI), is turning economies of scale inside out. Business in the century ahead will be driven by economies of unscale, in which the traditional competitive advantages of size are turned on their head.
Economies of unscale are enabled by two complementary market forces: the emergence of platforms and technologies that can be rented as needed. These developments have eroded the powerful inverse relationship between fixed costs and output that defined economies of scale. Now, small, unscaled companies can pursue niche markets and successfully challenge large companies that are weighed down by decades of investment in scale — in mass production, distribution, and marketing
.”
Hemant Taneja with Kevin Maney
in their Sloan Review article, “The End of Scale.”

There are two pathways I can envisage OSS playing a part in the economies of unscale indicated in the Sloan Review quote above.

The first is the changing way of working towards smaller, more nimble organisations, which includes increasing freelancing. There are already many modularised activities managed within an OSS, such as field work, designs, third-party service bundling, where unscale is potentially an advantage. OSS natively manages all these modules with existing tools, whether that’s ticketing, orchestration, provisioning, design, billing, contract management, etc.

Add smart contract management and John Reilly’s value fabric will undoubtedly increase in prevalence. John states that a value fabric is a mesh of interwoven, cooperating organizations and individuals, called parties, who directly or indirectly deliver value to customers. It gives the large, traditional network operators the chance to be more creative in their use of third parties when they look beyond their “Not Invented Here” syndrome of the past. It also provides the opportunity to develop innovative supply and procurement chains (meshes) that can generate strategic competitive advantage.

The second comes with an increasing openness to using third-party platforms and open-source OSS tools within operator environments. The OSS market is already highly fragmented, from multi-billion dollar companies (by market capitalisation) through to niche, even hobby, projects. However, there tended to be barriers to entry for the small or hobbyist OSS provider – they either couldn’t scale their infrastructure or they didn’t hold the credibility mandated by risk averse network operators.

As-a-Service platforms have changed the scale dynamic because they now allow OSS developers to rent infrastructure on a pay-as-you-eat model. In other words, the more their customers consume, the more infrastructure an OSS supplier can afford to rent from platforms such as AWS. More importantly, this become a possibility because operators are now increasingly open to renting third-party services on shared (but compartmentalised / virtualised) infrastructure. BTW. When I say “infrastructure” here, I’m not just talking about compute / network / storage but also virtualisation, containerisation, databases, AI, etc, etc.

Similarly, the credibility barrier-to-entry is being pulled down like the Berlin Wall as operators are increasingly investing in open-source projects. There are large open-source OSS projects / platforms being driven by the carriers themselves (eg ONAP, OpenStack, OPNFV, etc) that are accommodative of smaller plug-in modules. Unlike the proprietary, monolithic OSS/BSS stacks of the past, these platforms are designed with collaboration and integration being front-of-mind.

However, there’s an element of “potential” in these economies of unscale. Andreas Hegers likens open-source to the wild west, as many settlers seek to claim their patch of real-estate in an uncharted map. Andreas states further, “In theory, vendor interoperability from open source should be convenient — even harmonious — with innovations being shared like recipes. Unfortunately for many, the system has not lived up to this reality.”

Where do you sit on the potential of economies of unscale and open-source OSS?

OSS / BSS security getting a little cloudy

Many systems are moving beyond simple virtualization and are being run on dynamic private or even public clouds. CSPs will migrate many to hybrid clouds because of concerns about data security and regulations on where data are stored and processed.
We believe that over the next 15 years, nearly all software systems will migrate to clouds provided by third parties and be whatever cloud native becomes when it matures. They will incorporate many open-source tools and middleware packages, and may include some major open-source platforms or sub-systems (for example, the size of OpenStack or ONAP today)
.”
Dr Mark H Mortensen
in an article entitled, “BSS and OSS are moving to the cloud: Analysys Mason” on Telecom Asia.

Dr Mortensen raises a number of other points relating to cloud models for OSS and BSS in the article linked above. Included is definition of various cloud / virtualisation related terms.

He also rightly points out that many OSS / BSS vendors are seeking to move to virtualised / cloud / as-a-Service delivery models (for reasons including maintainability, scalability, repeatability and other “ilities”).

The part that I find interesting with cloud models (I’ll use the term generically) is positioning of the security control point(s). Let’s start by assuming a scenario where:

  1. The Active Network (AN) is “on-net” – the network that carries live customer traffic (the routers, switches, muxes, etc) is managed by the CSP / operator [Noting though, that these too are possibly managed as virtual entities rather than owned].
  2. The “cloud” OSS/BSS is “off-net” – some vendors will insist on their multi-tenanted OSS/BSS existing within the public cloud

The diagram below shows three separate realms:

  1. The OSS/BSS “in the cloud”
  2. The operator’s enterprise / DC realm
  3. The operator’s active network realm

as well as the Security Control Points (SCPs) between them.

OSS BSS Cloud Security Control Points

The most important consideration of this architecture is that the Active Network remains operational (ie carry customer traffic) even if the link to the DC and/or the link to the cloud is lost.

With that in mind, our second consideration is what aspects of network management need to reside within the AN realm. It’s not just the Active Network devices, but anything else that allows the AN to operate in an isolated state. This means that shared services like NTP / synch needs a presence in the AN realm (even if not of the highest stratum within the operator’s time-synch solution).

What about Element Managers (EMS) that look after the AN devices? How about collectors / probes? How about telemetry data stores? How about network health management tools like alarm and performance management? How about user access management (LDAP, AD, IAM, etc)? Do they exist in the AN or DC realm?

Then if we step up the stack a little to what I refer to as East-West OSS / BSS tools like ticket management, workforce management, even inventory management – do we collect, process, store and manage these within the DC or are we prepared to shift any of this functionality / data out to the cloud? Or do we prefer it to remain in the AN realm and ensure only AN privileged users have access?

Which OSS / BSS tools remain on-net (perhaps as private cloud) and which can (or must) be managed off-net (public cloud)?

Climb further up the stack and we get into the interesting part of cloud offerings. Not only do we potentially have the OSS/BSS (including East-West tools), but more excitingly, we could bring in services or solutions like content from external providers and bundle them with our own offerings.

We often hear about the tight security that’s expected (and offered) as part of the vendor OSS/BSS cloud solutions, but as you see, the tougher consideration for network management architects is actually the end-to-end security and where to position the security control points relative to all the pieces of the OSS/BSS stack.

Dematerialisation of OSS

In 1972, the Club of Rome in its report The Limits to Growth predicted a steadily increasing demand for material as both economies and populations grew. The report predicted that continually increasing resource demand would eventually lead to an abrupt economic collapse. Studies on material use and economic growth show instead that society is gaining the same economic growth with much less physical material required. Between 1977 and 2001, the amount of material required to meet all needs of Americans fell from 1.18 trillion pounds to 1.08 trillion pounds, even though the country’s population increased by 55 million people. Al Gore similarly noted in 1999 that since 1949, while the economy tripled, the weight of goods produced did not change.
Wikipedia on the topic of Dematerialisation.

The weight of OSS transaction volumes appears to be increasing year on year as we add more stuff to our OSS. The touchpoint explosion is amplifying this further. Luckily, our platforms / middleware, compute, networks and storage have all been scaling as well so the increased weight has not been as noticeable as it might have been (even though we’ve all worked on OSS that have been buckling under the weight of transaction volumes right?).

Does it also make sense that when there is an incremental cost per transaction (eg via the increasingly prevalent cloud or “as a service” offerings) that we pay closer attention to transaction volumes because there is a great perception of cost to us? But not for “internal” transactions where there is little perceived incremental cost?

But it’s not so much the transaction processing volumes that are the problem directly. It’s more by implication. For each additional transaction there’s the risk of a hand-off being missed or mis-mapped or slowing down overall activity processing times. For each additional transaction type, there’s additional mapping, testing and regression testing effort as well as an increased risk of things going wrong.

Do you measure transaction flow volumes across your entire OSS suite? Does it provide an indication of where efficiency optimisation (ie dematerialisation) could occur and guide your re-factoring investments? Does it guide you on process optimisation efforts?

OSS, the great multipliers

Skills multiply labors by two, five, 10, 50, 100 times. You can chop a tree down with a hammer, but it takes about 30 days. That’s called labor. But if you trade the hammer in for an ax, you can chop the tree down in about 30 minutes. What’s the difference in 30 days and 30 minutes? Skills—skills make the difference.”
Jim Rohn
, here.

OSS can be great labour multipliers. They can deliver baked-in “skills” that multiply labors by two, five, 10, 50, 100 times. They can be not just a hammer, not just an axe, but a turbo-charged chainsaw.

The more pertinent question to understand with our OSS though is, “Why are we chopping this tree down?” Is it the right tree to achieve a useful outcome? Do the benefits outweigh the cost of the hammer / axe / chainsaw?

What if our really clever OSS engineers come up with a brilliant design that can reduce a task from 30 days to 30 minutes…. but it takes us 35 days to design/build/test/deploy the customisation for a once-off use? We’re clearly cutting down the wrong tree.

What if instead we could reduce this same task from 30 days to 1 day with just a quick analysis and process change? It’s nowhere near as sexy or challenging for us OSS engineers though. The very clever 30 minute solution is another case of “just because we can, doesn’t mean we should.”

The strangler fig transformation analogy

You’re probably familiar with strangler figs, which grow on a host tree, often resulting in the death of the host. You’re probably less familiar with the strangler fig analogy as an OSS transformation or cutover model.

The concept is that there is a “host tree” (ie legacy system) that needs to be obsoleted and replaced, but it’s so dominant and integral (eg because of complex and/or meshed integrations) that a big-bang replacement is unviable (eg due to risk, costs, etc).

The strangler fig (ie new solution) is developed in parallel to the host tree and is progressively grown over time. Generally, it grows through step-wise enhancement / replacement. This approach is best suited to scenarios where there are lots of transaction types, fault types, process types, use-cases, etc that can be systematically switched from host to strangler.

This approach can also be used for product consolidation (ie multiple products consolidated into one).

Clever use of automated regression testing can help with this evolving cutover approach.

An OSS automation mind-flip

I recently had something of a perspective-flip moment in relation to automation within the realm of OSS.

In the past, I’ve tended to tackle the automation challenge from the perspective of applying automated / scripted responses to tasks that are done manually via the OSS. But it’s dawned on me that I have it around the wrong way! It is an incremental perspective on the main objective of automations – global zero-touch networks.

If we take all of the tasks performed by all of the OSS around the globe, the number of variants is incalculable… which probably means the zero-touch problem is unsolvable (we might be able to solve for many situations, but not all).

The more solvable approach would be to develop a more homogeneous approach to network self-care / self-optimisation. In other words, the majority of the zero-touch challenge is actually handled at the equivalent of EMS level and below (I’m perhaps using out-dated self-healing terminology, but hopefully terminology that’s familiar to readers) and only cross-domain issues bubble up to OSS level.

As the diagram below describes, each layer up abstracts but connects (as described in more detail in “What an OSS shouldn’t do“). That is, each higher layer in the stack reduces the amount if information/control within a domain that it’s responsible for, but it assumes more a broader responsibility for connecting domains together.
OSS abstract and connect

The abstraction process reduces the number of self-healing variants the OSS needs to handle. But to cope with the complexity of self-caring for connected domains, we need a more homogeneous set of health information being presented up from the network.

Whereas the intent model is designed to push actions down into the lower layers with a standardised, simplified language, this would be the reverse – pushing network health knowledge up to higher layers to deal with… in a standard, consistent approach.

And BTW, unlike the pervading approach of today, I’m clearly saying that when unforeseen (or not previously experienced) scenarios appear within a domain, they’re not just kicked up to OSS, but the domains are stoic enough to deal with the situation inside the domain.

How to run an OSS PoC

This is the third in a series describing the process of finding the right OSS solution for your specific needs and getting estimated pricing to help you build a business case.

The first post described the overall OSS selection process we use. The second described the way we poll the market and prepare a short-list of OSS products / vendors based on current capabilities.

Once you’ve prepared the short-list it’s time to get into specifics. We generally do this via a PoC (Proof of Concept) phase with the short-listed suppliers. We have a few very specific principles when designing the PoC:

  • We want it to reflect the operator’s context so that they can grasp what’s being presented (which can be a challenge when a vendor runs their own generic demos). This “context” is usually in the form of using the operator’s device types, naming conventions, service types, etc. It also means setting up a network scenario that is representative of the operator’s, which could be a hypothetical model, a small segment of a real network, lab model or similar
  • PoC collateral must clearly describe the PoC and related context. It should clearly identify the important scenarios and selection criteria. Ideally it should logically complement the collateral provided in the previous step (ie the requirement gathering)
  • We want it to focus on the most important conditions. If we take the 80/20 rule as a guide, will quickly identify the most common service types, devices, configurations, functions, reports, etc that we want to model
  • Identify efficacy across those most important conditions. Don’t just look for the functionality that implements those conditions, but also the speed at which they can be done at a scale required by the operator. This could include bulk load or processing capabilities and may require simulators (or real integrations – see below) to generate volume
  • We want it to be a simple as is feasible so that it minimises the effort required both of suppliers and operators
  • Consider a light-weight integration if possible. One of the biggest challenges with an OSS is getting data in and out. If you can get a rapid integration with a real network (eg a microservice, SNMP traps, syslog events or similar) then it will give an indication of integration challenges ahead. However, note the previous point as it might be quite time-consuming for both operator and supplier to set up a real-time integration
  • Take note of the level of resourcing required by each supplier to run the PoC (eg how many supplier staff, server scaling, etc.). This will give an indication of the level of resourcing the operator will need to allocate for the actual implementation, including organisational change management factors
  • Attempt to offer PoC platform consistency so that all operators are on a level playing field, which might be through designing the PoC on common devices or topologies with common interfaces. You may even look to go the opposite way if you think the rarity of your conditions could be a deal-breaker

Note that we tend to scale the size/complexity/reality of the PoC to the scale of project budget out of consideration of vendor and operator alike. If it’s a small project / budget, then we do a light PoC. If it’s a massive transformation, then the PoC definitely has to go deeper (ie more integrations, more scenarios, more data migration and integrity challenges, etc)…. although ultimately our customers decide how deep they’re comfortable in going.

Best of luck and feel free to contact us if we can assist with the running of your OSS PoC.

How to identify a short-list of best-fit OSS suppliers for you

In yesterday’s post, we talked about how to estimate OSS pricing. One of the key pillars of the approach was to first identify a short-list of vendors / integrators best-suited to implementing your specific OSS, then working closely with them to construct a pricing model.

Finding the right vendor / integrator can be a complex challenge. There are dozens, if not hundreds of OSS / BSS solutions to choose from and there are rarely like-for-like comparators. There are some generic comparison tools such as Gartner’s Magic Quadrant, but there’s no way that they can cater for the nuanced requirements of each OSS operator.

Okay, so you don’t want to hear about problems. You want solutions. Well today’s post provides a description of the approach we’ve used and refined across the many product / vendor selection processes we’ve conducted with OSS operators.

We start with a short-listing exercise. You won’t want to deal with dozens of possible suppliers. You’ll want to quickly and efficiently identify a small number of candidates that have capabilities that best match your needs. Then you can invest a majority of your precious vendor selection time in the short-list. But how do you know the up-to-date capabilities of each supplier? We’ll get to that shortly.

For the short-listing process, I use a requirement gathering and evaluation template. You can find a PDF version of the template here. Note that the content within it is out-dated and I now tend to use a more benefit-centric classification rather than feature-centric classification, but the template itself is still applicable.

STEP ONE – Requirement Gathering
The first step is to prepare a list of requirements (as per page 3 of the PDF):
Requirement Capture.
The left-most three columns in the diagram above (in white) are filled out by the operator, which classifies a list of requirements and how important they are (ie mandatory, etc). The depth of requirements (column 2) is up to you and can range from specific technical details to high-level objectives. They could even take the form of user-stories or intended benefits.

STEP TWO – Issue your requirement template to a list of possible vendors
Once you’ve identified the list of requirements, you want to identify a list of possible vendors/integrators that might be able to deliver on those requirements. The PAOSS vendor/product list might help you to identify possible candidates. We then send the requirement matrix to the vendors. Note that we also send an introduction pack that provides the context of the solution the OSS operator needs.

STEP THREE – Vendor Self-analysis
The right-most three columns in the diagram above (in aqua) are designed to be filled out by the vendor/integrator. The suppliers are best suited to fill out these columns because they best understand their own current offerings and capabilities.
Note that the status column is a pick-list of compliance level, where FC = Fully Compliant. See page 2 of the template for other definitions. Given that it is a self-assessment, you may choose to change the Status (vendor self-rankings) if you know better and/or ask more questions to validate the assessments.
The “Module” column identifies which of the vendor’s many products would be required to deliver on the requirement. This column becomes important later on as it will indicate which product modules are most important for the overall solution you want. It may allow you to de-prioritise some modules (and requirements) if price becomes an issue.

STEP FOUR – Compare Responses
Once all the suppliers have returned their matrix of responses, you can compare them at a high-level based on the summary matrix (on page 1 of the template)
OSS Requirement Summary
For each of the main categories, you’ll be able to quickly see which vendors are the most FC (Fully Compliant) or NC (Non-Compliant) on the mandatory requirements.

Of course you’ll need to analyse more deeply than just the Summary Matrix, but across all the vendor selection processes we’ve been involved with, there has always been a clear identification of the suppliers of best fit.

Hopefully the process above is fairly clear. If not, contact us and we’d be happy to guide you through the process.

Using OSS/BSS to steer the ship

For network operators, our OSS and BSS touch most parts of the business. The network, and the services they carry, are core business so a majority of business units will be contributing to that core business. As such, our OSS and BSS provide many of the metrics used by those business units.

This is a privileged position to be in. We get to see what indicators are most important to the business, as well as the levers used to control those indicators. From this privileged position, we also get to see the aggregated impact of all these KPIs.

In your years of working on OSS / BSS, how many times have you seen key business indicators that are conflicting between business units? They generally become more apparent on cross-team projects where the objectives of one internal team directly conflict with the objectives of another internal team/s.

In theory, a KPI tree can be used to improve consistency and ensure all business units are pulling towards a common objective… [but what if, like most organisations, there are many objectives? Does that mean you have a KPI forest and the trees end up fighting for light?]

But here’s a thought… Have you ever seen an OSS/BSS suite with the ability to easily build KPI trees? I haven’t. I’ve seen thousands of standalone reports containing myriad indicators, but never a consolidated roll-up of metrics. I have seen a few products that show operational metrics rolled-up into a single dashboard, but not business metrics. They appear to have been designed to show an information hierarchy, but not necessarily with KPI trees in mind specifically.

What do you think? Does it make sense for us to offer KPI trees as base product functionality from our reporting modules? Would this functionality help our OSS/BSS add more value back into the businesses we support?

The Goldilocks OSS story

We all know the story of Goldilocks and the Three Bears where Goldilocks chooses the option that’s not too heavy, not too light, but just right.

The same model applies to OSS – finding / building a solution that’s not too heavy, not too light, but just right. To be honest, we probably tend to veer towards the too heavy, especially over time. We put more complexity into our architectures, integrations and customisations… because we can… which end up burdening us and our solutions.

A perfect example is AT&T offering its ECOMP project (now part of the even bigger Linux Foundation Network Fund) up for open source in the hope that others would contribute and help mature it. As a fairytale analogy, it’s an admission that it’s too heavy even for one of the global heavyweights to handle by itself.

The ONAP Charter has some great plans including, “…real-time, policy-driven orchestration and automation of physical and virtual network functions that will enable software, network, IT and cloud providers and developers to rapidly automate new services and support complete lifecycle management.”

These are fantastic ambitions to strive for, especially at the Pappa Bear end of the market. I have huge admiration for those who are creating and chasing bold OSS plans. But what about for the large majority of customers that fall into the Goldilocks category? Is our field of vision so heavy (ie so grand and so far into the future) that we’re missing the opportunity to solve the business problems of our customers and make a difference for them with lighter solutions today?

TM Forum’s Digital Transformation World is due to start in just over two weeks. It will be fascinating to see how many of the presentations and booths consider the Goldilocks requirements. There probably won’t be many because it’s just not as sexy a story as one that mentions heavy solutions like policy-driven orchestration, zero-touch automation, AI / ML / analytics, self-scaling / self-healing networks, etc.

[I should also note that I fall into the category of loving to listen to the heavy solutions too!! ]

Are today’s platform benefits tomorrow’s constraints?

One of the challenges facing OSS / BSS product designers is which platform/s to tie the roadmap to.

Let’s use a couple of examples.
In the past, most outside plant (OSP) designs were done in AutoCAD, so it made sense to build OSP design tools around AutoCAD. However in making that choice, the OSP developer becomes locked into whatever product directions AutoCAD chooses in future. And if AutoCAD falls out of favour as an OSP design format, the earlier decision to align with it becomes an even bigger challenge to work around.

A newer example is for supply chain related tools to leverage Salesforce. The tools and marketplace benefits make great sense as a platform to leverage today. But it also represents a roadmap lock-in that developers hope will remain beneficial in the years / decades to come.

I haven’t seen an OSS / BSS product built around Facebook, but there are plenty of other tools that have been. Given Facebook’s recent travails, I wonder how many product developers are feeling at risk due to their reliance on the Facebook platform?

The same concept extends to the other platforms we choose – operating systems, programming languages, messaging models, data storage models, continuous development tools, etc, etc.

Anecdotally, it seems that new platforms are coming online at an ever greater rate, so the chance / risk of platform churn is surely much higher than when AutoCAD was chosen back in the 1980s – 1990s.

The question becomes how to leverage today’s platform benefits, whilst minimising dependency and allowing easy swap-out. There’s too much nuance to answer this definitively, but the one key strategy is to try to build key logic models outside the dependent platform (ie ensuring logic isn’t built into AutoCAD or similar).

Automated Network Operations as a Service (ANOaaS)

Google has started applying its artificial intelligence (AI) expertise to network operations and expects to make its tools available to companies building virtual networks on its global cloud platform.
That could be a troubling sign for network technology vendors such as Ericsson AB (Nasdaq: ERIC), Huawei Technologies Co. Ltd. and Nokia Corp. (NYSE: NOK), which now see AI in the network environment as a potential differentiator and growth opportunity…
Google already uses software-defined network (SDN) technology as the bedrock of this infrastructure and last week revealed details of an in-development “Google Assistant for Networking” tool, designed to further minimize human intervention in network processes.
That tool would feature various data models to handle tasks related to network topology, configuration, telemetry and policy.
.”
Iain Morris
here on Light Reading.

This is an interesting, but predictable, turn of events isn’t it? If (when?) automated network operations as a service (ANOaaS) is perfected, it has the ability to significantly change the OSS space doesn’t it?

Let’s have a look at this from a few scenarios (and I’m considering ANOaaS from the perspective of any of the massive cloud providers who are also already developing significant AI/ML resource pools, not just Google).

Large Enterprise, Utilities, etc with small networks (by comparison to telco networks), where the network and network operations are simply a cost of doing business rather than core business. Virtual networks and ANOaaS seem like an attractive model for these types of customer (ignoring data sovereignty concerns and the myriad other local contexts for now). Outsourcing this responsibility significantly reduces CAPEX and head-count to run what’s effectively non-core business. This appears to represent a big disruptive risk for the many OSS vendors who service the Enterprise / Utilities market (eg Solarwinds, CA, etc, etc).

T2/3 Telcos with relatively small networks that tend to run lean operations. In this scenario, the network is core business but having a team of ML/AI expects is hard to justify. Automations are much easier to build for homogeneous (consistent) infrastructure platforms (like those of the cloud providers) than for those carrying different technologies (like T2/T3 telcos perhaps?). Combine complexity, lack of scale and lack of large ML/AI resource pools and it becomes hard for T2/T3 telcos to deliver cost-effective ANOaaS either internally or externally to their customer base. Perhaps outsourcing the network (ie VNO) and ANOaaS allows these operators to focus more on sales?

T1 Telcos have large networks, heterogenous platforms and large workforces where the network is core business. The question becomes whether they can build network cloud at the scale and price-point of Amazon, Microsoft, Google, etc. This is partly dependent upon internal processes, but also on what vendors like Ericsson, Huawei and Nokia can deliver, as quoted as a risk above.

As you probably noticed, I just made up ANOaaS. Does a term already exist for this? How do you think it’s going to change the OSS and telco markets?

Is service personalisation the answer?

The actions taken by the telecom industry have mostly been around cost cutting, both in terms of opex and capex, and that has not resulted in breaking the curve. Too few activities has been centered around revenue growth, such as focused activities in personalization, customer experience, segmentation, targeted offerings that become part of or drive ecosystems. These activities are essential if you want to break the curve; thus, it is time to gear up for growth… I am very surprised that very few, if any, service providers today collect and analyze data, create dynamic targeted offerings based on real-time insights, and do that per segment or individual.”
Lars Sandstrom
, here.

I have two completely opposite and conflicting perspectives on the pervading wisdom of personalised services (including segmentation of one and targeted offerings) in the telecoms industry.

Telcos tend to be large organisations. If I invest in a large organisation it’s because the business model is simple, repeatable and has a moat (as Warren Buffett likes to say). Personalisation is contra to two of those three mantras – personalisation makes our OSS/BSS far more complicated and hence less repeatable (unless we build in lots of automations, which BTW, are inherently more complex).

I’m more interested in reliable and enduring profitability than just revenue growth (not that I’m discounting the search for revenue growth of course). The complexity of personalisation leads to significant increases in systems costs. As such, you’d want to be really sure that personalisation is going to give an even larger up-tick in revenues (ie ROI). Seems like a big gamble to me.

For my traditional telco services, I don’t want personalised, dynamic offers that I have to evaluate and make decisions on regularly. I want set and forget (mostly). It’s a bit like my electricity – I don’t want to constantly choose between green electricity, blue electricity, red electricity – I just want my appliances to work when I turn on the switch and not have bill shock at the end of the month / quarter. In telco, it’s not just green / blue / red. We seem to want to create the millions of shades of the rainbow, which is a nightmare for OSS/BSS implementers.

I can see the argument however for personalisation in what I’ll call the over-the-top services (let’s include content and apps as well). Telcos tend to be much better suited to building the platforms that support the whole long tail than selecting individual winners (except perhaps supply of popular content like sport broadcasts, etc).

So, if I’m looking for a cool, challenging project or to sell some products or services (you’ll notice that the quote above is on a supplier’s blog BTW), then I’ll definitely recommend personalisation. But if I want my telco customers to be reliably profitable…

Am I taking a short-term view on this? Is personalisation going to be expected by all end-users in future, leaving providers with no choice but to go down this path??

More places for the bad guys to hide?

Just wondering – do you think there would be any correlation between the number of fall-outs through an OSS/BSS stack and the level of security on a network?

In theory, they are quite separate. NetOps and SecOps are usually completely separate units, use different thinking models, don’t rely on common tools (apart from the network itself), etc.

The reason I ask the question is because fall-out rates amplify with complexity. And within the chaos of complexity, are there more places for the bad guys to hide?

Thoughts?

Networks lead. OSS are an afterthought. Time for a change?

In a recent post, we described how changing conditions in networks (eg topologies, technologies, etc) cause us to reconsider our OSS.

Networks always lead and OSS (or any form of network management including EMS/NMS) is always an afterthought. Often a distant afterthought.

But what if we spun this around? What if OSS initiated change in our networks / services? After all, OSS is the platform that operationalises the network. So instead of attempting to cope with a huge variety of network options (which introduces a massive number of variants and in turn, massive complexity, which we’re always struggling with in our OSS), what if we were to define the ways that networks are operationalised?

Let’s assume we want to lead. What has to happen first?

Network vendors tend to lead currently because they’re offering innovation in their networks, but more importantly on the associated services supported over the network. They’re prepared to take the innovation risk knowing that operators are looking to invest in solutions they can offer to their customers (as products / services) for a profit. The modus operandi is for operators to look to network vendors, not OSS vendors / integrators, to help to generate new revenues. It would take a significant perception shift for operators to break this nexus and seek out OSS vendors before network vendors. For a start, OSS vendors have to create a revenue generation story rather than the current tendency towards a cost-out business case.

ONAP provides an interesting new line of thinking though. As you know, it’s an open-source project that represents multiple large network operators banding together to build an innovative new approach to OSS (even if it is being driven by network change – the network virtualisation paradigm shift in particular). With a white-label, software-defined network as a target, we have a small opening. But to turn this into an opportunity, our OSS need to provide innovation in the services being pushed onto the SDN. That innovation has to be in the form of services/products that are readily monetisable by the operators.

Who’s up for this challenge?

As an aside:
If we did take the lead, would our OSS look vastly different to what’s available today? Would they unequivocally need to use the abstract model to cope with the multitude of scenarios?