Are telco services and SLAs no longer relevant?

I wonder if we’re reaching the point where “telecommunication services” is no longer a relevant term? By association, SLAs are also a bust. But what are they replaced by?

A telecommunication service used to be, in effect, the allocation of a carrier’s resources for use by a specific customer. Now? Well, less so:

  1. Service consumption channels are multiplying: from TV and radio to PC, mobile, tablet, YouTube, Insta, Facebook and a million others. Consumption sources are even more prolific.
  2. Customer contact channels are also multiplying: from contact centres to IVR, online portals, mobile apps, Twitter, etc.
  3. A service bundle often utilises third-party components, some of which are “off-net”.
  4. Virtualisation is increasingly abstracting services from specific resources. Services are now loosely coupled with resource pools and rely on high availability / elasticity to ensure customer service continuity. Not only that, but those resource pools might extend beyond the carrier’s direct control and out to cloud provider infrastructure.

The growing variant-tree takes the concept beyond the reach of “customer services” and evolves it into “customer experiences.”

The elements that made up a customer service in the past tended to fall within the locus of control of a telco and its OSS. The modern customer experience extends far beyond the control of any one company or its OSS. An SLA – Service Level Agreement – only pertains to the sub-set of an experience that can be measured by the OSS. We can only aspire to offer an ELA – Experience Level Agreement – because we don’t yet have the mechanisms to measure or manage the entire experience.

The metrics that matter most for telcos today tend to revolve around customer experience (eg NPS). But aside from customer surveys, ratings and derived / contrived metrics, we don’t have electronic customer experience measurements.

Customer services are dead; long live the customer experience… if only we can invent a way to measure the whole scope of what makes up a customer experience.

Intent to simplify our OSS

The left-hand panel of the triptych below shows the current state of interactions with most OSS. There are hundreds of variants inbound via external sources (ie multi-channel) and even internal sources (eg different service types). Similarly, there are dozens of networks (and downstream systems), each with different interface models. Each needs different handling, so integration costs escalate.
Intent model OSS

The intent model of network provisioning standardises the network interface, drastically simplifying the task of the OSS and the number of variants it must handle. This becomes particularly relevant in a world of NFVs: it no longer matters which vendor supplied a device (a router, say), because it can be handled via a single intent command rather than a separate interface to each vendor’s device / EMS northbound interface. The unique aspects of each vendor’s implementation are abstracted from the OSS.
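As a rough illustration of the pattern, here’s a minimal Python sketch of an intent layer. The vendor drivers and command syntaxes are invented for illustration – they don’t represent any real device or EMS API:

```python
from dataclasses import dataclass


@dataclass
class PortIntent:
    """A declarative, vendor-neutral statement of desired state."""
    device: str
    port: str
    admin_state: str  # "up" or "down"


class VendorADriver:
    """Hypothetical driver that renders the intent as vendor-A CLI."""
    def apply(self, intent: PortIntent) -> str:
        state = "no shutdown" if intent.admin_state == "up" else "shutdown"
        return f"interface {intent.port}\n {state}"


class VendorBDriver:
    """Hypothetical driver that renders the same intent as vendor-B config."""
    def apply(self, intent: PortIntent) -> str:
        if intent.admin_state == "down":
            return f"set interfaces {intent.port} disable"
        return f"delete interfaces {intent.port} disable"


DRIVERS = {"vendor_a": VendorADriver(), "vendor_b": VendorBDriver()}


def apply_intent(vendor: str, intent: PortIntent) -> str:
    # The OSS only ever expresses the intent; vendor quirks live in the drivers.
    return DRIVERS[vendor].apply(intent)


print(apply_intent("vendor_a", PortIntent("pe1", "GigabitEthernet0/0/1", "down")))
print(apply_intent("vendor_b", PortIntent("pe2", "ge-0/0/1", "down")))
```

The OSS issues one intent; only the drivers know each vendor’s dialect.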

The next step would be in standardising the interface / data model upstream of the OSS. That’s a more challenging task!!

If ONAP is the answer, what are the questions?

“ONAP provides a comprehensive platform for real-time, policy-driven orchestration and automation of physical and virtual network functions that will enable software, network, IT and cloud providers and developers to rapidly automate new services and support complete lifecycle management.
By unifying member resources, ONAP is accelerating the development of a vibrant ecosystem around a globally shared architecture and implementation for network automation – with an open standards focus – faster than any one product could on its own.”
Part of the ONAP charter from onap.org.

The ONAP project is gaining attention in service provider circles. The Steering Committee of the ONAP project hints at the types of organisations investing in the project. The statement above summarises the mission of this important project. You can bet that the mission has been carefully crafted. As such, one can assume that it represents what these important stakeholders jointly agree to be the future needs of their OSS.

I find it interesting that there are quite a few technical terms (eg policy-driven orchestration) in the mission statement, terms that tend to pre-empt the solution. However, I don’t feel that pre-emptive technical solutions are the real mission, so I’m going to try to reverse-engineer the statement into business needs. Hopefully the business needs (the “why? why? why?” entries below) articulate a set of questions / needs that all OSS can work to, as opposed to replicating the technical approach that underpins ONAP.

Phrase: real-time
Interpretation: The ability to make instantaneous decisions.
Why 1: To adapt to changing conditions.
Why 2: To take advantage of fleeting opportunities or resolve threats.
Why 3: To optimise key business metrics such as financials.
Why 4: As CSPs are under increasing pressure from shareholders to deliver on key metrics.

Phrase: policy-driven orchestration
Interpretation: To use policies to increase the repeatability of key operational processes.
Why 1: Repeatability provides the opportunity to improve efficiency, quality and performance.
Why 2: Allows an operator to service more customers at less expense.
Why 3: Improves corporate profitability and customer perceptions.
Why 4: As CSPs are under increasing pressure from shareholders to deliver on key metrics.

Phrase: policy-driven automation
Interpretation: To use policies to increase the amount of automation that can be applied to key operational processes.
Why 1: Automated processes provide the opportunity to improve efficiency, quality and performance.
Why 2: Allows an operator to service more customers at less expense.
Why 3: Improves corporate profitability and customer perceptions.

Phrase: physical and virtual network functions
Interpretation: Our networks will continue to consist of physical devices, but we will increasingly introduce virtualised functionality.
Why 1: Physical devices will continue to exist into the foreseeable future, but virtualisation represents an exciting approach going forward.
Why 2: Virtual entities are easier to activate and manage (assuming sufficient capacity exists).
Why 3: Physical equipment supply, build, deploy and test cycles are much longer and more labour intensive.
Why 4: Virtual assets are more flexible, faster and cheaper to commission.
Why 5: Customer services can be turned up faster and cheaper.

Phrase: software, network, IT and cloud providers and developers
Interpretation: With this increase in virtualisation, we find an increasingly large and diverse array of suppliers contributing to our value-chain. These suppliers contribute via software, network equipment, IT functions and cloud resources.
Why 1: CSPs can access innovation and efficiency occurring outside their own organisation.
Why 2: CSPs can leverage the opportunities those innovations provide.
Why 3: CSPs can deliver more attractive offers to customers.
Why 4: Key metrics such as profitability and customer satisfaction are enhanced.

Phrase: rapidly automate new services
Interpretation: We want the flexibility to introduce new products and services far faster than we do today.
Why 1: CSPs can deliver more attractive offers to customers faster than competitors.
Why 2: Key metrics such as market share, profitability, customer satisfaction and cashflow are enhanced.

Phrase: support complete lifecycle management
Interpretation: The components that make up our value-chain are changing and evolving so quickly that we need to cope with these changes without impacting customers across any of their interactions with their service.
Why 1: Customer satisfaction is a key metric, and a customer’s experience spans the entire lifecycle of their service.
Why 2: CSPs don’t want customers to churn to competitors.
Why 3: Key metrics such as market share, profitability and customer satisfaction are enhanced.

Phrase: unifying member resources
Interpretation: To reduce the amount of duplicated and under-synchronised development currently being done by the member bodies of ONAP.
Why 1: Collaboration and sharing reduces the effort each member body must dedicate to their OSS.
Why 2: A reduced resource pool is required.
Why 3: Costs can be reduced whilst still achieving a required level of outcome from OSS.

Phrase: vibrant ecosystem
Interpretation: To increase the level of supplier interchangeability.
Why 1: To reduce dependence on any supplier/s.
Why 2: To improve competition between suppliers.
Why 3: Lower prices, greater choice and greater innovation tend to flourish in competitive environments.
Why 4: CSPs, as customers of the suppliers, benefit.

Phrase: globally shared architecture
Interpretation: To make networks, services and support systems easier to interconnect across the global communications network.
Why 1: Collaboration on common standards reduces the integration effort between each member at points of interconnect.
Why 2: A reduced resource pool is required.
Why 3: Costs can be reduced whilst still achieving interconnection benefits.
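To make “policy-driven orchestration / automation” less abstract, here’s a minimal sketch of the pattern in Python – a policy is just a condition / action pair evaluated against every incoming event. All event types, thresholds and actions below are invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Policy:
    name: str
    condition: Callable[[dict], bool]   # evaluated against each event
    action: Callable[[dict], None]      # fired when the condition holds


policies = [
    Policy(
        name="scale-out-on-congestion",
        condition=lambda e: e["type"] == "utilisation" and e["value"] > 0.9,
        action=lambda e: print(f"Spinning up extra VNF capacity on {e['node']}"),
    ),
    Policy(
        name="reroute-on-link-down",
        condition=lambda e: e["type"] == "link-down",
        action=lambda e: print(f"Recomputing paths around {e['node']}"),
    ),
]


def handle(event: dict) -> None:
    # Repeatability comes from applying the same policy set to every
    # event, with no human in the loop.
    for p in policies:
        if p.condition(event):
            p.action(event)


handle({"type": "utilisation", "value": 0.95, "node": "core-router-3"})
handle({"type": "link-down", "node": "agg-switch-7"})
```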

As indicated in earlier posts, ONAP is an exciting initiative for the CSP industry for a number of reasons. My fear is that ONAP becomes such a behemoth of technical complexity that it’s too unwieldy for any of the member bodies to use. I use the analogy of ATM versus Ethernet here, with ONAP being the equivalent of ATM in power and complexity. The question is whether there’s an Ethernet answer to the whys that ONAP is trying to solve.

I’d love to hear your thoughts.

(BTW. I’m not saying that the technologies the ONAP team is investigating are the wrong ones. Far from it. I just find it interesting that the mission is starting with a technical direction in mind. I see parallels with the OSS radar analogy.)

If your partners don’t have to talk to you then you win

“If your partners don’t have to talk to you then you win.”
Guy Lupo.

Put another way, the best form of customer service is no customer service (ie your customers and/or partners are so delighted with your automated offerings that they have no reason to contact you). They don’t want to contact you anyway (generally speaking). They just want to consume a perfectly functional and reliable solution.

In the deep, distant past, our comms networks required operators. But then we developed automated dialling / switching. In theory, the network looked after itself and people made billions of calls per year unassisted.

Something happened in the meantime though. Telco operators the world over started receiving lots of calls about their platform and products. You could say that they’re unwanted calls. The telcos even have an acronym called CVR – Call Volume Reduction – that describes their ambitions to reduce the number of customer calls that reach contact centre agents. Tools such as chatbots and IVR have sprung up to reduce the number of calls that an operator fields.

Network as a Service (NaaS), the context of Guy’s comment above, represents the next new tool that aims to drive CVR (amongst a raft of other benefits). NaaS theoretically allows customers to interact with network operators via impersonal contracts (in the form of APIs). The challenge will be in the reliability – ensuring that nothing falls between the cracks in any of the layers / platforms that combine to form the NaaS.

In the world of NaaS creation, Guy is exactly right – “If your partners [and customers] don’t have to talk to you then you win.” As always, it’s complexity that leads to gaps. The more complex the NaaS stack, the less likely you are to achieve CVR.

Designing an Operational Domain Manager (ODM)

A couple of weeks ago, Telstra and the TM Forum held an event in Melbourne on OSS for next gen architectures.

The diagram below comes from a presentation by Corey Clinger. It describes Telstra’s Operational Domain Manager (ODM) model that is a key component of their Network as a Service (NaaS) framework. Notice the API stubs across the top of the ODM? Corey went on to describe the TM Forum Open API model that Telstra is building upon.
Operational Domain Manager (ODM)

In a following session, Raman Balla indicated a perspective that differs from many existing OSS. The service owner (and service consumer) must be able to see all aspects of a given service (including all dimensions, lifecycle, etc) in a common repository / catalog, and it needs to be attribute-based. Raman also indicated that his aim in architecting NaaS is to standardise not only the service, but the entire experience around the service.
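To give a flavour of those API stubs, here’s a sketch of a TMF-style service order against an ODM. It follows the general shape of TM Forum’s Open API service ordering pattern (eg TMF641), but the endpoint, token and payload attributes are placeholders I’ve invented, not Telstra’s actual implementation:

```python
import json

import requests  # assumes the third-party requests package is installed

# Placeholder host and token; the payload is a simplified TMF641-style
# ServiceOrder, not any operator's actual schema.
ODM_BASE = "https://odm.example.com/tmf-api/serviceOrdering/v4"

order = {
    "externalId": "ORDER-0001",
    "orderItem": [{
        "action": "add",
        "service": {
            "serviceSpecification": {"name": "evpn-l2-service"},
            # Attribute-based definition, per the catalog point above.
            "serviceCharacteristic": [
                {"name": "bandwidth", "value": "1Gbps"},
                {"name": "aEnd", "value": "MEL-DC1"},
                {"name": "zEnd", "value": "SYD-DC2"},
            ],
        },
    }],
}

resp = requests.post(
    f"{ODM_BASE}/serviceOrder",
    headers={"Authorization": "Bearer <token>", "Content-Type": "application/json"},
    data=json.dumps(order),
    timeout=10,
)
print(resp.status_code, resp.json().get("state", "unknown"))
```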

In the world of NaaS, operators can no longer just focus separately on assurance or fulfillment or inventory / capacity, etc. As per DevOps, operators are accountable for everything.

Network slicing, another OSS activity

“One business customer, for example, may require ultra-reliable services, whereas other business customers may need ultra-high-bandwidth communication or extremely low latency. The 5G network needs to be designed to be able to offer a different mix of capabilities to meet all these diverse requirements at the same time.
From a functional point of view, the most logical approach is to build a set of dedicated networks each adapted to serve one type of business customer. These dedicated networks would permit the implementation of tailor-made functionality and network operation specific to the needs of each business customer, rather than a one-size-fits-all approach as witnessed in the current and previous mobile generations which would not be economically viable.
A much more efficient approach is to operate multiple dedicated networks on a common platform: this is effectively what “network slicing” allows. Network slicing is the embodiment of the concept of running multiple logical networks as virtually independent business operations on a common physical infrastructure in an efficient and economical way.”
GSMA’s Introduction to Network Slicing.

Engineering a network is an exercise in compromise. There are many different optimisation levers to pull to engineer a set of network characteristics. In the traditional network, it was a case of pulling all the levers to find a middle-ground set of characteristics that supported all of the operator’s service offerings.

QoS striping of traffic allowed for a level of differentiation of traffic handling, but the underlying network was still a balancing act of settings. Network virtualisation offers new opportunities. It allows unique segmentation via virtual networks, where each can be optimised for the specific use-cases of that network slice.

For years, I’ve been proposing the concept of telco offerings being like electricity networks – that we don’t need so many service variants. I should note that this analogy is not quite right. We do have a few different types of “electricity”, such as highly available (health monitoring), high-bandwidth (content streaming) and extremely low latency (rapid-reaction scenarios such as real-time sensor networks).
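Running with that analogy, a slice catalog could be as simple as a set of named optimisation profiles. The sketch below invents illustrative targets for the three “electricity types” just mentioned:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceProfile:
    """One network slice's optimisation targets (illustrative numbers only)."""
    name: str
    max_latency_ms: float
    min_bandwidth_mbps: float
    availability_pct: float


# Real slice SLAs would come from the product catalog, not hard-coded values.
SLICE_CATALOG = [
    SliceProfile("health-monitoring", max_latency_ms=100, min_bandwidth_mbps=1,
                 availability_pct=99.999),
    SliceProfile("content-streaming", max_latency_ms=50, min_bandwidth_mbps=25,
                 availability_pct=99.9),
    SliceProfile("realtime-sensors", max_latency_ms=5, min_bandwidth_mbps=2,
                 availability_pct=99.99),
]

for s in SLICE_CATALOG:
    print(f"{s.name}: <= {s.max_latency_ms} ms, "
          f">= {s.min_bandwidth_mbps} Mbps, {s.availability_pct}% available")
```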

Now what do we need to implement and manage all these network slices?? Oh that’s right, OSS! It’s our OSS that will help to efficiently coordinate all the slicing and dicing that’s coming our way… to optimise all the levers across all the different network slices!

Are we making our OSS lives easier?

As an implementer of OSS, what’s the single factor that makes it challenging for us to deliver on any of the three constraints of project delivery? Complexity. Or put another way, variants. The more variants, the less chance we have of delivering on time, cost or functionality.

So let me ask you, is our next evolution simpler? No, actually. At least, it doesn’t seem so to me.

For all their many benefits, are virtualised networks simpler? We can apply abstractions to give a simpler view to higher layers in the stack, but we’ve actually only introduced more layers. Virtualisation will also bring an even higher volume of devices, transactions, etc to monitor, so we’re going to have to develop complex ways of managing these factors in cohorts.

We’re big on automations to simplify the roles of operators. But automations don’t make the task simpler for OSS implementers. Once we build a whole bunch of complex automations it might give the appearance of being simpler. But under the hood, it’s not. There are actually more moving parts.

Are we making it simpler through repetition across the industry? Nope, with the proliferation of options we’re getting more diverse. For example, back in the day, we only had a small number of database options to store our OSS data in (I won’t mention the names, I’m sure you know them!). But what about today? We have relational databases of course, but also have so many more options. What about virtualisation options? Mediation / messaging options? Programming languages? Presentation / reporting options? The list goes on. Each different OSS uses a different suite of tools, meaning less standardisation.

Our OSS lives seem to be getting harder by the day!

Did we forget the OSS operating model?

When we have a big OSS transformation to undertake, we tend to start with the use cases / requirements, work our way through the technical solution and build up an implementation plan before delivering it (yes, I’ve heavily reduced the real number of steps there!).

However, we sometimes overlook the organisational change management part. That’s the process of getting the customer’s organisation aligned to assist with the transformation, not to mention being fully skilled up to accept handover into operations. I’ve seen OSS projects that were nearly perfect technically, but ultimately doomed because the customer wasn’t ready to accept handover. Seasoned OSS veterans probably already have plans in place for handling organisational change through stakeholder management, training, testing, thorough handover-to-ops processes, etc. You can find some hints on the Free Stuff pages here on PAOSS.

In addition, long-time readers here on PAOSS have probably already seen a few posts about organisational management, but there’s a new gotcha that I’d like to add to the mix today – the changing operating model. This one is often overlooked. The changes made in a next-gen OSS can often have profound effects on the to-be organisation chart. Roles and responsibilities that used to be clearly defined can become blurred or obsolete under the new solution.

This is particularly true for modern delivery models where cloud, virtualisation, as-a-service, etc change the dynamic. Demarcation points between IT, operations, networks, marketing, products, third-party suppliers, etc can need complete reconsideration. The most challenging part about understanding the re-mapping of operating models is that we often can’t even predict what they will be until we start using the new solution and refining our processes in-flight. We can start with a RACI and a bunch of “what if?” assumptions / scenarios to capture new operational mappings, but you can almost bet that it will need ongoing refinement.

How economies of unscale change the OSS landscape

“For more than a century, economies of scale made the corporation an ideal engine of business. But now, a flurry of important new technologies, accelerated by artificial intelligence (AI), is turning economies of scale inside out. Business in the century ahead will be driven by economies of unscale, in which the traditional competitive advantages of size are turned on their head.
Economies of unscale are enabled by two complementary market forces: the emergence of platforms and technologies that can be rented as needed. These developments have eroded the powerful inverse relationship between fixed costs and output that defined economies of scale. Now, small, unscaled companies can pursue niche markets and successfully challenge large companies that are weighed down by decades of investment in scale — in mass production, distribution, and marketing.”
Hemant Taneja with Kevin Maney in their Sloan Review article, “The End of Scale.”

There are two pathways I can envisage OSS playing a part in the economies of unscale indicated in the Sloan Review quote above.

The first is the changing way of working towards smaller, more nimble organisations, which includes increasing freelancing. There are already many modularised activities managed within an OSS, such as field work, designs, third-party service bundling, where unscale is potentially an advantage. OSS natively manages all these modules with existing tools, whether that’s ticketing, orchestration, provisioning, design, billing, contract management, etc.

Add smart contract management and John Reilly’s value fabric will undoubtedly increase in prevalence. John states that a value fabric is a mesh of interwoven, cooperating organizations and individuals, called parties, who directly or indirectly deliver value to customers. It gives the large, traditional network operators the chance to be more creative in their use of third parties when they look beyond their “Not Invented Here” syndrome of the past. It also provides the opportunity to develop innovative supply and procurement chains (meshes) that can generate strategic competitive advantage.

The second comes with an increasing openness to using third-party platforms and open-source OSS tools within operator environments. The OSS market is already highly fragmented, from multi-billion dollar companies (by market capitalisation) through to niche, even hobby, projects. However, there tended to be barriers to entry for the small or hobbyist OSS provider – they either couldn’t scale their infrastructure or they didn’t hold the credibility mandated by risk averse network operators.

As-a-Service platforms have changed the scale dynamic because they allow OSS developers to rent infrastructure on a pay-as-you-eat model. In other words, the more their customers consume, the more infrastructure an OSS supplier can afford to rent from platforms such as AWS. More importantly, this becomes possible because operators are now increasingly open to renting third-party services on shared (but compartmentalised / virtualised) infrastructure. BTW. When I say “infrastructure” here, I’m not just talking about compute / network / storage but also virtualisation, containerisation, databases, AI, etc, etc.

Similarly, the credibility barrier-to-entry is being pulled down like the Berlin Wall as operators are increasingly investing in open-source projects. There are large open-source OSS projects / platforms being driven by the carriers themselves (eg ONAP, OpenStack, OPNFV, etc) that are accommodative of smaller plug-in modules. Unlike the proprietary, monolithic OSS/BSS stacks of the past, these platforms are designed with collaboration and integration being front-of-mind.

However, there’s an element of “potential” in these economies of unscale. Andreas Hegers likens open-source to the wild west, with many settlers seeking to claim their patch of real estate on an uncharted map. As Andreas states, “In theory, vendor interoperability from open source should be convenient — even harmonious — with innovations being shared like recipes. Unfortunately for many, the system has not lived up to this reality.”

Where do you sit on the potential of economies of unscale and open-source OSS?

OSS / BSS security getting a little cloudy

“Many systems are moving beyond simple virtualization and are being run on dynamic private or even public clouds. CSPs will migrate many to hybrid clouds because of concerns about data security and regulations on where data are stored and processed.
We believe that over the next 15 years, nearly all software systems will migrate to clouds provided by third parties and be whatever cloud native becomes when it matures. They will incorporate many open-source tools and middleware packages, and may include some major open-source platforms or sub-systems (for example, the size of OpenStack or ONAP today).”
Dr Mark H Mortensen in an article entitled, “BSS and OSS are moving to the cloud: Analysys Mason” on Telecom Asia.

Dr Mortensen raises a number of other points relating to cloud models for OSS and BSS in the article linked above. Included are definitions of various cloud / virtualisation terms.

He also rightly points out that many OSS / BSS vendors are seeking to move to virtualised / cloud / as-a-Service delivery models (for reasons including maintainability, scalability, repeatability and other “ilities”).

The part that I find interesting with cloud models (I’ll use the term generically) is the positioning of the security control point(s). Let’s start by assuming a scenario where:

  1. The Active Network (AN) is “on-net” – the network that carries live customer traffic (the routers, switches, muxes, etc) is managed by the CSP / operator [noting, though, that these devices are possibly managed as virtual entities rather than owned].
  2. The “cloud” OSS/BSS is “off-net” – some vendors will insist on their multi-tenanted OSS/BSS residing within the public cloud.

The diagram below shows three separate realms:

  1. The OSS/BSS “in the cloud”
  2. The operator’s enterprise / DC realm
  3. The operator’s active network realm

as well as the Security Control Points (SCPs) between them.

OSS BSS Cloud Security Control Points

The most important consideration in this architecture is that the Active Network remains operational (ie carries customer traffic) even if the link to the DC and/or the link to the cloud is lost.

With that in mind, our second consideration is what aspects of network management need to reside within the AN realm. It’s not just the Active Network devices, but anything else that allows the AN to operate in an isolated state. This means that shared services like NTP / synch need a presence in the AN realm (even if not of the highest stratum within the operator’s time-synch solution).
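One way to sanity-check those decisions is to model which realm hosts each management function and flag anything the isolated AN would need but couldn’t reach. The functions, realms and placements below are invented for illustration:

```python
# Invented placement model: which realm hosts each management function.
PLACEMENT = {
    "ntp-server": "AN",
    "ems": "DC",            # is this the right realm? see the questions below
    "alarm-mgmt": "DC",
    "telemetry-store": "DC",
    "oss-bss": "CLOUD",
}

# Functions the AN must still reach if links to the DC / cloud are lost.
AN_SURVIVAL_SET = ["ntp-server", "ems"]


def isolation_risks() -> list[str]:
    """Return AN-critical functions placed outside the AN realm."""
    return [f for f in AN_SURVIVAL_SET if PLACEMENT.get(f) != "AN"]


risks = isolation_risks()
print("AN can run isolated" if not risks else f"At risk during isolation: {risks}")
```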

What about Element Managers (EMS) that look after the AN devices? How about collectors / probes? How about telemetry data stores? How about network health management tools like alarm and performance management? How about user access management (LDAP, AD, IAM, etc)? Do they exist in the AN or DC realm?

Then if we step up the stack a little to what I refer to as East-West OSS / BSS tools like ticket management, workforce management, even inventory management – do we collect, process, store and manage these within the DC or are we prepared to shift any of this functionality / data out to the cloud? Or do we prefer it to remain in the AN realm and ensure only AN privileged users have access?

Which OSS / BSS tools remain on-net (perhaps as private cloud) and which can (or must) be managed off-net (public cloud)?

Climb further up the stack and we get into the interesting part of cloud offerings. Not only do we potentially have the OSS/BSS (including East-West tools), but more excitingly, we could bring in services or solutions like content from external providers and bundle them with our own offerings.

We often hear about the tight security that’s expected (and offered) as part of the vendor OSS/BSS cloud solutions, but as you see, the tougher consideration for network management architects is actually the end-to-end security and where to position the security control points relative to all the pieces of the OSS/BSS stack.

The Goldilocks OSS story

We all know the story of Goldilocks and the Three Bears where Goldilocks chooses the option that’s not too heavy, not too light, but just right.

The same model applies to OSS – finding / building a solution that’s not too heavy, not too light, but just right. To be honest, we probably tend to veer towards the too heavy, especially over time. We put more complexity into our architectures, integrations and customisations… because we can… and it ends up burdening us and our solutions.

A perfect example is AT&T offering its ECOMP project (now part of the even bigger Linux Foundation Network Fund) up for open source in the hope that others would contribute and help mature it. As a fairytale analogy, it’s an admission that it’s too heavy even for one of the global heavyweights to handle by itself.

The ONAP Charter has some great plans including, “…real-time, policy-driven orchestration and automation of physical and virtual network functions that will enable software, network, IT and cloud providers and developers to rapidly automate new services and support complete lifecycle management.”

These are fantastic ambitions to strive for, especially at the Papa Bear end of the market. I have huge admiration for those who are creating and chasing bold OSS plans. But what about the large majority of customers that fall into the Goldilocks category? Is our field of vision so heavy (ie so grand and so far into the future) that we’re missing the opportunity to solve the business problems of our customers and make a difference for them with lighter solutions today?

TM Forum’s Digital Transformation World is due to start in just over two weeks. It will be fascinating to see how many of the presentations and booths consider the Goldilocks requirements. There probably won’t be many because it’s just not as sexy a story as one that mentions heavy solutions like policy-driven orchestration, zero-touch automation, AI / ML / analytics, self-scaling / self-healing networks, etc.

[I should also note that I fall into the category of loving to listen to the heavy solutions too!! ]

Training network engineers to code, not vice versa

Did any of you read the Light Reading link in yesterday’s post about Google creating automated network operations services? If you haven’t, it’s well worth a read.

If you did, then you may’ve also noticed a reference to Finland’s Elisa selling its automation smarts to other telcos. This is another interesting business model disruption for the OSS market, although I’ll reserve judgement on how disruptive it will be until Elisa sells to a few more operators.

What did catch my eye in the Elisa article (again by Light Reading’s Iain Morris) is this paragraph:
Automation has not been hassle-free for Elisa. Instilling a software culture throughout the organization has been a challenge, acknowledges [Kirsi] Valtari. Rather than recruiting software expertise, Elisa concentrated on retraining the people it already had. During internal training courses, network engineers have been taught to code in Python, a popular programming language, and to write algorithms for a self-optimizing network (or SON). “The idea was to get engineers who were previously doing manual optimization to think about automating it,” says Valtari. “These people understand network problems and so it is a win-win outcome to go down this route.”

It provides a really interesting perspective on the diagram below (from a 2014 post about the ideal skill-set for the future of networking).

There is an undoubted increase in the level of network / IT overlap (eg SDN). Most operators appear to be taking the path of hiring for IT and hoping new recruits will grow to understand networks. Elisa is going the opposite way, training its network engineers to code.
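To give a sense of what those retrained engineers might write, here’s a toy SON-style rule in Python – adjusting a cell’s antenna tilt based on a measured cell-edge KPI. The thresholds and step sizes are invented for illustration, not drawn from Elisa’s actual algorithms:

```python
def tune_tilt(current_tilt_deg: float, edge_sinr_db: float) -> float:
    """Toy SON rule: trade coverage against interference via antenna tilt.

    Thresholds and step sizes are invented for illustration; a real
    self-optimising network would derive them from measured KPIs.
    """
    if edge_sinr_db < 3.0:       # cell-edge users struggling: uptilt for coverage
        return max(current_tilt_deg - 0.5, 0.0)
    if edge_sinr_db > 15.0:      # plenty of margin: downtilt to cut interference
        return min(current_tilt_deg + 0.5, 10.0)
    return current_tilt_deg      # within the comfort band: leave it alone


tilt = 4.0
for sinr in [2.1, 2.8, 9.0, 16.5]:   # simulated hourly KPI samples
    tilt = tune_tilt(tilt, sinr)
    print(f"edge SINR {sinr} dB -> tilt {tilt} deg")
```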

With either path, if they then train their multi-talented engineers to understand the business (the red intersect), they’ll have OSS experts on their hands, right folks?? 😉

Automated Network Operations as a Service (ANOaaS)

“Google has started applying its artificial intelligence (AI) expertise to network operations and expects to make its tools available to companies building virtual networks on its global cloud platform.
That could be a troubling sign for network technology vendors such as Ericsson AB (Nasdaq: ERIC), Huawei Technologies Co. Ltd. and Nokia Corp. (NYSE: NOK), which now see AI in the network environment as a potential differentiator and growth opportunity…
Google already uses software-defined network (SDN) technology as the bedrock of this infrastructure and last week revealed details of an in-development “Google Assistant for Networking” tool, designed to further minimize human intervention in network processes.
That tool would feature various data models to handle tasks related to network topology, configuration, telemetry and policy.”
Iain Morris here on Light Reading.

This is an interesting, but predictable, turn of events isn’t it? If (when?) automated network operations as a service (ANOaaS) is perfected, it has the ability to significantly change the OSS space doesn’t it?

Let’s have a look at this from a few scenarios (and I’m considering ANOaaS from the perspective of any of the massive cloud providers who are also already developing significant AI/ML resource pools, not just Google).

Large Enterprise, Utilities, etc with small networks (by comparison to telco networks), where the network and network operations are simply a cost of doing business rather than core business. Virtual networks and ANOaaS seem like an attractive model for these types of customer (ignoring data sovereignty concerns and the myriad other local contexts for now). Outsourcing this responsibility significantly reduces CAPEX and head-count to run what’s effectively non-core business. This appears to represent a big disruptive risk for the many OSS vendors who service the Enterprise / Utilities market (eg Solarwinds, CA, etc, etc).

T2/3 Telcos with relatively small networks that tend to run lean operations. In this scenario, the network is core business, but having a team of ML/AI experts is hard to justify. Automations are much easier to build for homogeneous (consistent) infrastructure platforms (like those of the cloud providers) than for those carrying many different technologies (like T2/T3 telcos perhaps?). Combine complexity, lack of scale and lack of large ML/AI resource pools and it becomes hard for T2/T3 telcos to deliver cost-effective ANOaaS either internally or externally to their customer base. Perhaps outsourcing the network (ie VNO) and ANOaaS allows these operators to focus more on sales?

T1 Telcos have large networks, heterogeneous platforms and large workforces, and the network is core business. The question becomes whether they can build network cloud at the scale and price-point of Amazon, Microsoft, Google, etc. This is partly dependent upon internal processes, but also on what vendors like Ericsson, Huawei and Nokia can deliver, per the risk quoted above.

As you probably noticed, I just made up ANOaaS. Does a term already exist for this? How do you think it’s going to change the OSS and telco markets?

Networks lead. OSS are an afterthought. Time for a change?

In a recent post, we described how changing conditions in networks (eg topologies, technologies, etc) cause us to reconsider our OSS.

Networks always lead, and OSS (or any form of network management, including EMS/NMS) are always an afterthought. Often a distant afterthought.

But what if we spun this around? What if OSS initiated change in our networks / services? After all, OSS is the platform that operationalises the network. So instead of attempting to cope with a huge variety of network options (which introduces a massive number of variants and in turn, massive complexity, which we’re always struggling with in our OSS), what if we were to define the ways that networks are operationalised?

Let’s assume we want to lead. What has to happen first?

Network vendors tend to lead currently because they’re offering innovation in their networks, but more importantly in the associated services supported over the network. They’re prepared to take the innovation risk, knowing that operators are looking to invest in solutions they can offer to their customers (as products / services) for a profit. The modus operandi is for operators to look to network vendors, not OSS vendors / integrators, to help generate new revenues. It would take a significant perception shift for operators to break this nexus and seek out OSS vendors before network vendors. For a start, OSS vendors would have to create a revenue-generation story rather than the current tendency towards a cost-out business case.

ONAP provides an interesting new line of thinking though. As you know, it’s an open-source project in which multiple large network operators have banded together to build an innovative new approach to OSS (even if it is being driven by network change – the network virtualisation paradigm shift in particular). With a white-label, software-defined network as a target, we have a small opening. But to turn this into an opportunity, our OSS need to provide innovation in the services being pushed onto the SDN. That innovation has to come in the form of services / products that are readily monetisable by the operators.

Who’s up for this challenge?

As an aside:
If we did take the lead, would our OSS look vastly different to what’s available today? Would they unequivocally need to use the abstract model to cope with the multitude of scenarios?

Designing OSS to cope with greater transience

“There are three broad models of networking in use today. The first is the adaptive model where devices exchange peer information to discover routes and destinations. This is how IP networks, including the Internet, work. The second is the static model where destinations and pathways (routes) are explicitly defined in a tabular way, and the final is the central model where destinations and routes are centrally controlled but dynamically set based on policies and conditions.”
Tom Nolle here.

OSS of decades past worked best with static networks. Services / circuits were predominantly “nailed up” and (relatively) rarely changed after activation. This took the real-time aspect out of play and justified the significant manual effort required to establish a new service / circuit.

However, adaptive and centrally managed networks have come to dominate the landscape. In fact, I’m currently working on an assignment where DWDM, a technology that was once largely static, is being augmented with an SDN controller and dynamic routing (at the optical level, no less!).

This paradigm shift changes the fundamentals of how OSS operate. Apart from the physical layer, network connectivity is now far more transient, so our tools must be able to cope with that. Not only that, but the changes are too frequent to justify the manual effort of the past.

To tie in with yesterday’s post, we are again faced with the option of abstract / generic modelling or specific modelling.

Put another way, we have to either come up with adaptive / algorithmic mechanisms to deal with that transience (the specific model), or need to mimic “nailed-up” concepts (the abstract model).
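As a sketch of the abstract / “nailed-up” option, we can keep circuit-like records but give them validity windows, so the OSS stores the history of a transient topology rather than a single static truth (the schema below is invented for illustration):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConnectivityRecord:
    """A circuit-like record with a validity window instead of a static truth."""
    a_end: str
    z_end: str
    path: list[str]
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None = still active


history: list[ConnectivityRecord] = []


def reroute(old: ConnectivityRecord, new_path: list[str]) -> ConnectivityRecord:
    # Close off the old record rather than overwriting it -- the OSS keeps
    # the full history of what the controller did and when.
    now = datetime.now(timezone.utc)
    old.valid_to = now
    new = ConnectivityRecord(old.a_end, old.z_end, new_path, valid_from=now)
    history.append(new)
    return new


circuit = ConnectivityRecord("MEL", "SYD", ["MEL", "CBR", "SYD"],
                             datetime.now(timezone.utc))
history.append(circuit)
circuit = reroute(circuit, ["MEL", "ADL", "SYD"])
print(f"{len(history)} records; active path: {history[-1].path}")
```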

More on the implications of this tomorrow.

Is your data AI-ready (part 2)

Further to yesterday’s post, which posed the question of whether your data is AI-ready for virtualised network assurance use cases, I thought I’d add a few more notes.

The two reasons posed were:

  1. Our data sets haven’t had time to collect much elastic / dynamic network data yet
  2. Our data is riddled with error-prone, human-generated entries

On the latter point in particular, I sense that we’re going to have to completely re-architect the way we collect and store assurance data. We’ll almost certainly have to think in terms of automated assurance actions and related logging to avoid the errors of human data creation / logging. The question becomes whether it’s worth wrangling all of our old data into formats that the AI engines can cope with, or whether we just start afresh with new models. (This brings to mind the recent “perfect data” discussion.)
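As a sketch of what “automated assurance actions and related logging” might look like in practice – machine-generated, schema-checked records instead of free-text operator notes (the field names are invented for illustration):

```python
import json
from datetime import datetime, timezone

REQUIRED_FIELDS = {"event_id", "source", "action", "result", "timestamp"}


def log_assurance_action(event: dict) -> str:
    """Emit a machine-generated assurance record; reject anything free-form."""
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"Rejecting unstructured record; missing {missing}")
    return json.dumps(event, sort_keys=True)


record = log_assurance_action({
    "event_id": "a1b2c3",
    "source": "vnf-autoscaler",
    "action": "scale-out",
    "result": "success",
    "timestamp": datetime.now(timezone.utc).isoformat(),
})
print(record)   # consistent, parseable input for the AI engines downstream
```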

It will be one thing to identify patterns, but another thing entirely to identify optimum response activities and to automate those.

If we get these steps right, does it become logical that the NOC (network) and SOC (security operations centre) become conjoined… at least much more so than they tend to be today? In other words, does incident management merge network incidents and security incidents onto common analysis and response platforms? If so, does that imply another complete re-architecture? It certainly changes the operations model.

I’d love to hear your thoughts and predictions.

An OSS niche market opportunity?

“The survey found that 82 percent of service providers conduct less than half of customer transactions digitally, despite the fact that nearly 80 percent of respondents said they are moving forward with business-wide digital transformation programs of varying size and scale. This underscores a large perception gap in understanding, completing and benefiting from digitalization programs.

The study revealed that more than one-third of service providers have completed some aspect of digital transformation, but challenges persist; nearly three-quarters of service providers identify legacy systems and processes, challenges relating to staff and skillsets and business risk as the greatest obstacles to transforming digital services delivery.

Driving a successful digital transformation requires companies to transform myriad business and operational domains, including customer journeys, digital product catalogs, partner management platforms and networks via software-defined networking (SDN) and network functions virtualization (NFV).”
Survey from Netcracker and ICT Intuition.

It’s an interesting study. To re-iterate the key numbers and take-aways:

  1. 82% of responding service providers can (in theory) increase digital transactions by at least 50%. Digital transactions tend to be significantly cheaper for service providers than manual transactions. However, some customers will work the omni-channel experience to find the channel they’re most comfortable dealing with. In many cases, this means attempting to avoid digital experiences. As a side note, any attempt to become 100% digital is likely to require social / behavioural engineering of customers and/or to come with an associated churn rate.
  2. Nearly 75% of responding service providers identify legacy systems / processes, skillsets and business risk as their biggest challenges. This reads as putting a digital interface onto back-end systems like BSS / OSS tools. It’s less of a challenge for newer operators that were designed with digitalised customer interactions in mind. The other challenge for established operators is that digital front-ends are rarely designed to bolt onto their existing legacy back-end systems, so significant integration is needed.
  3. If an operator wants to build a digital transaction regime, it should expect an OSS / BSS transformation too.

To overcome these challenges, I’ve noticed that some operators have been building up separate (often low-cost) brands with digital-native front ends, back ends, processes and skills bases. These brands tend to target the ever-expanding digitally native generations and be seen as the stepping stone to obsoleting legacy solutions (and perhaps even legacy business models?).

I wonder whether this is a market niche for smaller OSS players to target and grow into whilst the big OSS brands chase the bigger-brother operator brands?

Interaction points with fast/slow processes

Further to yesterday’s post on fast / slow processes and factory platforms, a concept presented by Sylvain Denis of Orange in Melbourne last week, here’s a diagram from Sylvain’s presentation pack:

The yellow blocks represent the fast (automated) processes. The orange blocks represent the slow processes.

The next slide showed the human interaction points (blue boxes) into this API / factory stack.

Posing a Network Data Synchronisation Protocol (NDSP) concept

Data quality is one of the biggest challenges we face in OSS. A product could be technically perfect, but if the data being pumped into it is poor, then the user experience of the product will be awful – the OSS becomes unusable, and that in itself generates a data quality death spiral.

This becomes even more important for the autonomous, self-healing, programmable, cooperative networks being developed (think IoT, virtualised networks, Self-Organizing Networks). If we look at IoT networks for example, they’ll be expected to operate unattended for long periods, but with code and data auto-propagating between nodes to ensure a level of self-optimisation.

So today I’d like to pose a question. What if we could develop the equivalent of Network Time Protocol (NTP) for data? Just as NTP synchronises clocking across networks, Network Data Synchronisation Protocol (NDSP) would synchronise data across our networks through a feedback-loop / synchronisation algorithm.

Of course there are differences from NTP. NTP only tries to coordinate one data field (time) along a common scale (time as measured along a 64+64-bit continuum). The only parallel for network data is in life-cycle state changes (eg in-service, port up/down, etc).

For NTP, the stratum of the clock is defined (see image below from Wikipedia).

This has analogies with data, where some data sources can be seen to be more reliable than others (ie primary sources rather than secondary or tertiary sources). However, there are scenarios where stratum 2 sources (eg OSS) might push state changes down through stratum 1 (eg NMS) and into stratum 0 (the network devices). An example might be renaming of a hostname or pushing a new service into the network.
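Here’s a minimal sketch of how stratum-based reconciliation might work, including the “push down” exception just described. The strata, fields and rules are invented for illustration:

```python
# Illustrative strata: 0 = network devices, 1 = NMS/EMS, 2 = OSS inventory.
records = {
    ("pe1", "hostname"): {0: "PE1-MEL", 1: "PE1-MEL", 2: "pe1-melbourne"},
    ("pe1", "port-1/1/1"): {0: "down", 1: "up", 2: "up"},
}

# Fields where a higher stratum is authoritative and pushes down (eg renames).
PUSH_DOWN_FIELDS = {"hostname"}


def reconcile(field: str, by_stratum: dict[int, str]) -> str:
    """Pick the authoritative value for one field across strata."""
    if field in PUSH_DOWN_FIELDS:
        return by_stratum[max(by_stratum)]   # OSS-initiated change wins
    return by_stratum[min(by_stratum)]       # otherwise trust closest to the wire


for (node, field), by_stratum in records.items():
    print(f"{node}/{field}: {reconcile(field, by_stratum)}")
```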

One challenge would be the vastly different data sets and how to disseminate / reconcile them across the network without overloading it with management / communications packets. Another would be format consistency. I once had a device type with four different port naming conventions, and that was just within its own NMS! Imagine how many port-name variations (and translations) might exist across the multiple inventories in our networks. The good thing about the NDSP concept is that it might force greater consistency across different vendor platforms.

Another challenge would be that NDSP would become a huge security target, since it would have both the power to change configurations and reach throughout the network.

So what do you think? Has the NDSP concept already been developed? Have you implemented something similar in your OSS? What are the scenarios in which it could succeed? Or fail?

Can you re-skill fast enough to justify microservices?

“There’s some things that I’ve challenged my team to do. We have to be faster than the web scale players and that sounds audacious. I tell them you can’t go to the bus station and catch a bus that’s already left the station. We have to be faster than the people that we want to get to. And that sounds like an insane goal but that’s one of the goals we have. We have to speed up to catch the web scale players.”
John Donovan, AT&T, at this link.

Last week saw a series of articles appear here on the PAOSS blog around the accumulation of tech-debt and how microservices / Agile had the potential to accelerate that accumulation.

The part that I find most interesting about this new approach to telco (or more to the point, to the Digital Service Provider (DSP) model) is that it speaks of a shift to being software companies like the OTT players. Most telcos are definitely “digital” companies, but very few could be called “software” companies.

All telcos have developers on their payroll but how many would have software roles filling more than 5% of their workforce? How many would list their developer pools amongst a handful of core strengths? I’d hazard a guess that the roots of most telcos’ core strengths would’ve been formed decades ago.

Software-centric networks are on the rise. Rapid implementation models like DevOps and Agile are on the rise. API / Microservice interfaces to network domains (irrespective of being VNF, PNF, etc) are on the rise. Software, software, software.

In response, telcos are talking software. Talking, but how many are doing?

Organic transition of the workforce (ie boomers out, millennials in) isn’t going to refresh fast enough. Are telcos actively re-inventing their resource pool? Are they re-skilling on a grand scale, often tens of thousands of people, to cater for a future mode of operation where software is a core capability like it is at the OTT players? Re-skilling at a speed that’s faster than the web-scale bus?

If they can’t, or don’t, then perhaps software is not really the focus. Software isn’t their differentiator… they do have many other strengths to work with after all.

If so, then OSS, microservices, SDN / NFV, DevOps, etc are key operational requirements without being core differentiators. Should they therefore all be outsourced to trusted partners / vendors / integrators (rather than following the current insourcing trend), thus delegating responsibility for curating the tech-debt we spoke about last week?

I’m biased. I see OSS as a core differentiator (if done well), but few agree with me.