DTA is all wrapped up for another year

We’ve just finished the third and final day at TM Forum’s Digital Transformation Asia (https://dta.tmforum.org and #tmfdigitalasia ). Wow, talk about a lot happening!!

After spending the previous two days focusing on the lecture series, it would’ve been remiss of me to not catch up with the vendors and Catalyst presentations that had been on display for all three days. So that was my main focus for day 3. Unfortunately, I probably missed seeing some really interesting presentations, although I did catch the tail-end of the panel discussion, “Zero-touch – Identifying the First Steps Toward Fully Automated NFV/SDN,” which was ably hosted by George Glass (along with NFV/SDN experts Tomohiro Otani and Ir. Rizaludin Kaspin ). From the small amount I did see, it left me wishing that I could’ve experienced the entire discussion.

But on with the Catalysts, which are one of the most exciting assets in TM Forum’s arsenal IMHO. They connect carriers (as project champions) with implementers to deliver rapid prototypes on some of the carriers’ most pressing needs. They deliver some seriously impressive results in a short time, often with implementers only being able to devote part of their working hours (or after-hours) to the Catalyst.

As reported here, the winning Catalysts are:

1. Outstanding Catalyst for Business Impact
Telco Cloud Orchestration Plus, Using Open APIs on IoT
Champion: China Mobile
Participants: BOCO Inter-Telecom, Huawei, Hewlett Packard Enterprise, Nokia

2. Outstanding Catalyst for Innovation
5G Pâtisserie
Champions: Globe Telecom, KDDI Research, Singtel
Patricipants: Neural Technologies, Infosys, Ericsson

3. Outstanding New Catalyst
Artificial Intelligence for IT Operations (AIOps)
Champions: China Telecom, China Unicom, China Mobile
Participants: BOCO Inter-Telecom, Huawei, Si-Tech

These were all undoubtedly worthy winners, reward for the significant effort that has already gone into them. Three other Catalysts that I particularly liked are:

  • Transcend Boundaries – which demonstrates the use of Augmented Reality for the field workforce in particular, as championed by Globe. Collectively we haven’t even scratched the surface of what’s possible in this space, so it was exciting to see the concept presented by this Catalyst
  • NaaS in Action – which is building upon Telstra’s exciting Network as a Service (NaaS) initiative; and
  • Telco Big Data Security and Privacy Management Framework – the China Mobile led Catalyst that is impressive for the number of customers that are already signed up and generating revenues for CT.

BTW. The full list of live Catalysts can be found here.

For those who missed this year’s event, I can only suggest that you mark it in your diaries for next year. The TM Forum team is already starting to plan out next year’s event, one that will surely be even bigger and better than the one I’ve been privileged to have attended this week.

OSS that capture value, not just create it

I’ve just had a really interesting first day at TM Forum’s Digital Transformation Asia (https://dta.tmforum.org and #tmfdigitalasia ). The quality of presentations was quite high. Some great thought-provoking ideas!!

Nik Willetts kicked off his keynote with the following quote, which I’m paraphrasing, “Telcos need to start capturing value, not just creating it as they have for the last decade.”

For me, this is THE key takeaway for this event, above any of the other interesting technical discussions from day 1 (and undoubtedly on the agenda for the next 2 days too).

The telecommunications industry has made a massive contribution to the digital lifestyle that we now enjoy. It has been instrumental in adding enormous value to our lives and our economy. But all the while, telecommunications providers globally have been experiencing diminishing profitability and share-of-wallet (as described in this earlier post). Clearly the industry has created enormous value, but hasn’t captured as much as it would’ve liked.

The question to ask is how will our thinking and our OSS/BSS stacks help to contribute to capturing more value for our customers. As described in the share of wallet post above, the premium end of the value chain has always been in the content (think in terms of phone conversations in days gone by, or the myriad of comms techniques today such as email, live chat, blogs, etc, etc). That’s what the customer pays for – the experience – not the networks or systems that facilitate it.

Nik’s comments made me think of Andrew Carnegie. Monopolies such as the telecommunications organisations of the past and Andrew Carnegie’s steel business owned vast swathes of the value chain (Carnegie Steel Company owned the mines which extracted the raw materials needed to make steel, controlled the transportation used to deliver the materials and the product, and ran the mills used for steel production). Buyers didn’t care for the mines or mills or transportation. Customers were paying for the end product as it is what helped them achieve their goals, whether that was the railway tracks needed by the railroads or the beams needed by construction companies.

The Internet has allowed enormous proliferation of the premium-end of the telecommunications value chain. It’s too late to stuff that genie back into the bottle. But to Nik’s further comment, we can help customers achieve their goals by becoming their “do-it-yourself” digital partners.

Our customers now look to platforms like Facebook, Instagram, Google, WordPress, Amazon, etc to build their marketing, order capture, product / content delivery, commercial transactions, etc. I really enjoyed Monty Hong‘s presentation that showed how Telkomsel’s OSS/BSS is helping to embed Telkomsel into customers’ digital lifestyles / value-chains. It’s a perfect example of the biggest OSS loser proof discussed in yesterday’s post.

Are telco services and SLAs no longer relevant?

I wonder if we’re reaching the point where “telecommunication services” is no longer a relevant term? By association, SLAs are also a bust. But what are they replaced by?

A telecommunication service used to effectively be the allocation of a carrier’s resources for use by a specific customer. Now? Well, less so

  1. Service consumption channel alternatives are increasing, from TV and radio; to PC, to mobile, to tablet, to YouTube, to Insta, to Facebook, to a million others.
    Consumption sources are even more prolific.
  2. Customer contact channel alternatives are also increasing, from contact centres; to IVR, to online, to mobile apps, to Twitter, etc.
  3. A service bundle often utilises third-party components, some of which are “off-net”
  4. Virtualisation is increasingly abstracting services from specific resources. They’re now loosely coupled with resource pools and rely on high availability / elasticity to ensure customer service continuity. Not only that, but those resource pools might extend beyond the carrier’s direct control and out to cloud provider infrastructure

The growing variant-tree is taking the concept beyond the reach of “customer services” and evolves to become “customer experiences.”

The elements that made up a customer service in the past tended to fall within the locus of control of a telco and its OSS. The modern customer experience extends far beyond the control of any one company or its OSS. An SLA – Service Level Agreement – only pertains to the sub-set of an experience that can be measured by the OSS. We can aspire to offer an ELA – Experience Level Agreement – because we don’t have the mechanisms by which to measure or manage the entire experience yet.

The metrics that matter most for telcos today tend to revolve around customer experience (eg NPS). But aside from customer surveys, ratings and derived / contrived metrics, we don’t have electronic customer experience measurements.

Customer services are dead; Long live the customer experiences king… if only we can invent a way to measure the whole scope of what makes up customer experiences.

Telco services that are bigger, faster, better and the OSS that supports that

We all know of the tectonic shifts in the world of telco services, profitability and business models.

One common trend is for telcos to offer pipes that are bigger and faster. Seems like a commoditising business model to me, but our OSS still need to support that. How? Through enabling efficiency at scale. Building tools, GUIs, workflows, integrations, sales pipelines, etc that enable telcos march seamlessly towards offering ever bigger/faster pipes. An OSS/BSS stack that supports this could represent one of the few remaining sustainable competitive advantages, so any such OSS/BSS could be highly valuable to its owner.

But if the bigger/faster pipe model is commoditising and there’s little differentiation between competing telcos’ OSS/BSS on service activation, then what is the alternative? Services that are better? But what is “better”? More to the point, what is sustainably better (ie can’t be easily copied by competitors)? Services that are “better” are likely to come in many different forms, but they’re unlikely to be related to the pipe (except maybe reliability / SLA / QoS). They’re more likely to be in the “bundling,” which may include premium content, apps, customer support, third-party products, etc. An OSS/BSS that is highly flexible in supporting any mix of bundling becomes important. Product / service catalogs are one of many possible examples.

An even bigger differentiator is not bigger / faster / better, but different (if perceived by the market as being invaluably different). The challenge with being different is that “different” tends to be fleeting. It tends to only last for a short period of time before competitors catch up. Since many of the differences available to telco services are defined in software, the window of opportunity is getting increasingly short… except when it comes to the OSS/BSS being able to operationalise that differentiator. It’s not uncommon for a new feature to take 9+ months to get to market, with changes to the OSS/BSS taking up a significant chunk of the project’s critical path. Having an OSS/BSS stack that can repeatedly get a product / feature to market much faster than competing telcos provides greater opportunity to capture the market during the window of difference.

Facebook’s algorithmic feed for OSS

This is the logic that led Facebook inexorably to the ‘algorithmic feed’, which is really just tech jargon for saying that instead of this random (i.e. ‘time-based’) sample of what’s been posted, the platform tries to work out which people you would most like to see things from, and what kinds of things you would most like to see. It ought to be able to work out who your close friends are, and what kinds of things you normally click on, surely? The logic seems (or at any rate seemed) unavoidable. So, instead of a purely random sample, you get a sample based on what you might actually want to see. Unavoidable as it seems, though, this approach has two problems. First, getting that sample ‘right’ is very hard, and beset by all sorts of conceptual challenges. But second, even if it’s a successful sample, it’s still a sample… Facebook has to make subjective judgements about what it seems that people want, and about what metrics seem to capture that, and none of this is static or even in in principle perfectible. Facebook surfs user behaviour..”
Ben Evans
here.

Most of the OSS I’ve seen tend to be akin to Facebook’s old ‘chronological feed’ (where users need to sift through thousands of posts to find what’s most interesting to them).

The typical OSS GUI has thousands of functions (usually displayed on a screen all at once – via charts, menus, buttons, pull-downs, etc). But of all of those available functions, any given user probably only interacts with a handful.
Current-style OSS interface

Most OSS give their users the opportunity to customise their menus, colour schemes, even filters. For some roles such as network ops, designers, order entry operators, there are activity lists, often with sophisticated prioritisation and skills-based routing, which starts to become a little more like the ‘algorithmic feed.’

However, unlike the random nature of information hitting the Facebook feed, there is a more explicit set of things that an OSS user is tasked to achieve. It is a little more directed, like a Google search.

That’s why I feel the future OSS GUI will be more like a simple search bar (like Google) that will provide a direction of intent as well as some recent / regular activity icons. Far less clutter than the typical OSS. The graphs and activity lists that we know and love would still be available to users, but the way of interacting with the OSS to find the most important stuff quickly needs to get more intuitive. In future it may even get predictive in knowing what information will be of interest to you.
OSS interface of the future

OSS collaboration rooms. Getting to the coal-face

A number of years ago I heard about an OSS product that introduced collaborative rooms for network operators to collectively solve challenging network health events. It was in line with some of my own thinking about the use of collaboration techniques to solve cross-domain or complex events. But the concept hasn’t caught on in the way that I expected. I was curious why, so I asked around some friends and colleagues who are hands-on managing networks every day.

The answer showed that I hadn’t got close enough to understanding the psyche at the coal-face. It seems that operators have a preference for the current approach, the tick and flick of trouble tickets until the solution forms and the problem is solved.

This shows the psyche of collaboration at a micro scale. I wonder if it holds true at a macro scale too?

No CSP has an everywhere footprint (admittedly cloud providers are close to everywhere though, in part through global presence, in part through coverage of the access domain via their own networks and/or OTT connectivity). For customers that need to cross geo-footprints, carriers take a tick and flick approach in terms of OSS. The OSS of one carrier passes orders to the other carrier’s OSS. Each OSS stays within the bounds of its organisation’s locus of control (see this blog for further context).

To me, there seems to be an opportunity for carriers to get out of their silo. To leverage collaboration for speed, coverage, etc by designing offerings in OSS design rooms rather than standards workshops. A global product catalog sandpit as it were for carriers to design offerings in. Every carrier’s service offering / API / contract resides there for other carriers to interact with.

But once again, I may not be close enough to understanding the psyche at the coal-face. If you work at this coal-face, I’d love to get your opinions on why this would or would not work.

Are we better off waiting for OSS technology to catch up?

Yesterday’s post discussed Dave Duggal’s concept of 20th century OSS being all about centralizing command and control to gain efficiency through vertical integration and mass standardization, whilst 21st century OSS are about decentralization – gaining efficiency through horizontal integration of partner ecosystems and mass customization.

We talked about transitioning from a telco market driven by economies of scale (the 20th century benchmark) to a “market of one” (21st century target state), where fully personalised experience exists and is seamless across all channels.

Dave wrote the original article back in 2016. Two years on and some of the technology in our OSS is just starting to catch up to Dave’s concepts. To be completely honest, we still haven’t architected or built the decentralised OSS that truly offer wide-scale partner ecosystems or customer personalisation, particularly at a scale that is cost-viable.

So I’m going to ask a really pointed question. If our OSS are still better suited to 20th century markets and can’t handle the incalculable number of variants that come with a fully personalised customer experience, are we better off waiting for the technology to catch up before trying to build business models that cater to the “market of one?”

Why? Well, as Gadi Solotorevsky, Chief Technology Officer, cVidya in this post on TM Forum’s Inform says, “…digital customers aren’t known for their patience and or tolerance for errors (I should know – I’m one of them). And any serious glitch, e.g. an error in charging, will not only push them towards a competitor – did I mention how easy is to change digital service providers? It will probably find also its way to social media, causing a ripple effect. The same goes for the partners who are enabling operators to offer cool digital services in the first place.”

Better to have a business model that is simpler and repeatable / reliable at massive scale than attempt a 21st century model where it’s the fall-outs that are scaling.

I’d love to hear your thoughts.

BTW. Kudos to those organisations investing in the bleeding edge tech that are attempting to solve what Dave refers to as “the challenge of our times.” I’m certainly not going to criticise their bold efforts. Just highlighting the point that many operators have 21st century ambitions of their OSS whilst only having 20th century capabilities currently.

OSS feature parity. A functionality arms race

OSS Vendor 1. “I have 1 million features.” (Dr Evil puts finger in mouth)
OSS Vendor 2. “Yeah, well I have 1,000,001 features in my OSS.”

This is the arms-race that we see in OSS, just like almost any other tech product. I imagine that vendors get into this arms-race because they wish to differentiate. Better to differentiate on functionality than price. If there’s a feature parity, then the only differentiator is price. We all know that doesn’t end well!

But I often ask myself a few related questions:

  • Of those million features, how many are actually used regularly
  • As a vendor do you have logging that actually allows you to know what features are being used
  • Taking the Whale Curve perspective, even if being used, how many of those features are actually contributing to the objectives of the vendor
    • Do they clearly contribute towards making sales
    • Do customers delight in using them
    • Would customers be irate if you removed them
    • etc

Earlier this week, I spoke about a friend who created an alarm management tool by himself over a weekend. It didn’t have a million features, but it did have all of what I’d consider to be the most important ones. It did look like a lot of other alarm managers that are now on the market. The GUI based on alarm lists still pervades.

If they all look alike, and all have feature parity, how do you differentiate? If you try to add more features, is it safe to assume that those features will deliver diminishing returns?

But is an alarm list and the flicking of tickets the best way to manage network health?

What if, instead of seeking incremental improvement, someone went back to the most important requirements and considered whether the current approach is meeting those customer needs? I have a strong suspicion that customer feedback will indicate that there are definitely flaws to overcome, especially on high event volume networks.

Clever use of large data volumes provides a level of pre-cognition and automation that wasn’t available when simple alarm lists were first invented. This in turn potentially changes the way that operators can engage with network monitoring and management.

What if someone could identify a whole new user interface / approach that overcame the current flaws and exceeded the key requirements? Would that be more of a differentiator than adding a 1,000,002nd feature?

If you’re looking for a comparison, there were plenty of MP3 players on the market with a heap of features, many more than the iPod. We all know how that one played out!

Build an OSS and they will come… or sometimes not

Build it and they will come.

This is not always true for OSS. Let me recount a few examples.

The project team is disconnected from the users – The team that’s building the OSS in parallel to existing operations doesn’t (or isn’t able to) engage with the end users of the OSS. Once it comes time for cut-over, the end users want to stick with what they know and don’t use the shiny new OSS. From painful experience I can attest that stakeholder management is under-utilised on large OSS projects.

Turf wars – Different groups within a customer are unable to gain consensus on the solution. For example, the operational design team gains the budget to build an OSS but the network assurance team doesn’t endorse this decision. The assurance team then decides not to endorse or support the OSS that is designed and built by the design team. I’ve seen an OSS worth tens of millions of dollars turned off less than 2 years after handover because of turf wars. Stakeholder management again, although this could be easier said than done in this situation.

It sounded like a good idea at the time – The very clever OSS solution team keeps coming up with great enhancements that don’t get used, for whatever reason (eg non fit-for-purpose, lack of awareness of its existence by users, lack of training, etc). I’ve seen a customer that introduced over 500 customisations to an off-the-shelf solution, yet hundreds of those customisations hadn’t been touched by users within a full year prior to doing a utilisation analysis. That’s right, not even used once in the preceding 12 months. Some made sense because they were once-off tools (eg custom migration activities), but many didn’t.

The new OSS is a scary beast – The new solution might be perfect for what the customer has requested in terms of functionality. But if the solution differs greatly from what the operators are used to, it can be too intimidating to be used. A two-week classroom-based training course at the end of an OSS build doesn’t provide sufficient learning to take up all the nuances of the new system like the operators have developed with the old solution. Each significant new OSS needs an apprenticeship, not just a short-course.

It’s obsolete before it’s finishedOSS work in an environment of rapid change – networks, IT infrastructure, organisation models, processes, product offerings, regulatory shifts, disruptive innovation, etc, etc. The longer an OSS takes to implement, the greater the likelihood of obsolescence. All the more reason for designing for incremental delivery of business value rather than big-bang delivery.

What other examples have you experienced where an OSS has been built, but the users haven’t come?

OSS that are profitable, difficult, or important?

Apple became the first company to be worth a trillion dollars. They did that by spending five years single-mindedly focusing on doing profitable work. They’ve consistently pushed themselves toward high margin luxury goods and avoided just about everything else. Belying their first two decades, when they focused on breakthrough work that was difficult and perhaps important, nothing they’ve done recently has been either…
Profitable, difficult, or important — each is an option. A choice we get to make every day. ‘None of the above’ is also available, but I’m confident we can seek to do better than that
. ”
Seth Godin
in this post.

I encourage you to view the entire post at the link above. It gives definitions (and examples) of organisations that focus on profitable, difficult or important activities.

In OSS, the organisations that focus on the profitable are the ones investing heavily on glossy sales / marketing and only making incremental improvements to products that have been around for years.

Then there are others that are doing the difficult and innovative and complex work (ie the sexy work for all of us tech-heads). This recent article about ONAP talks about the fantastic tech-driven ambitions of that program, but then distills it down to the business objectives.

That leaves us with the important – the business needs / objectives – and this is where the customers come in. Speak with any OSS customer (or customer’s customer for that matter) and you’ll tend to find frustrations with their OSS. Frustration with complexity, time to deliver / modify, cost to deliver / modify, risks, functionality constraints, etc.

This is a simplification of course, but do you notice that as an industry, our keen focus on the profitable and difficult might just be holding us back from doing the important?

If ONAP is the answer, what are the questions?

ONAP provides a comprehensive platform for real-time, policy-driven orchestration and automation of physical and virtual network functions that will enable software, network, IT and cloud providers and developers to rapidly automate new services and support complete lifecycle management.
By unifying member resources, ONAP is accelerating the development of a vibrant ecosystem around a globally shared architecture and implementation for network automation–with an open standards focus–faster than any one product could on its own
.”
Part of the ONAP charter from onap.org.

The ONAP project is gaining attention in service provider circles. The Steering Committee of the ONAP project hints at the types of organisations investing in the project. The statement above summarises the mission of this important project. You can bet that the mission has been carefully crafted. As such, one can assume that it represents what these important stakeholders jointly agree to be the future needs of their OSS.

I find it interesting that there are quite a few technical terms (eg policy-driven orchestration) in the mission statement, terms that tend to pre-empt the solution. However, I don’t feel that pre-emptive technical solutions are the real mission, so I’m going to try to reverse-engineer the statement into business needs. Hopefully the business needs (the “why? why? why?” column below) articulates a set of questions / needs that all OSS can work to, as opposed to replicating the technical approach that underpins ONAP.

Phrase Interpretation Why? Why? Why?
real-time The ability to make instantaneous decisions Why1: To adapt to changing conditions
Why2: To take advantage of fleeting opportunities or resolve threats
Why 3: To optimise key business metrics such as financials
Why 4: As CSPs are under increasing pressure from shareholders to deliver on key metrics
policy-driven orchestration To use policies to increase the repeatability of key operational processes Why 1: Repeatability provides the opportunity to improve efficiency, quality and performance
Why 2: Allows an operator to service more customers at less expense
Why 3: Improves corporate profitability and customer perceptions
Why 4: As CSPs are under increasing pressure from shareholders to deliver on key metrics
policy-driven automation To use policies to increase the amount of automation that can be applied to key operational processes Why 1: Automated processes provide the opportunity to improve efficiency, quality and performance
Why 2: Allows an operator to service more customers at less expense
Why 3: Improves corporate profitability and customer perceptions
physical and virtual network functions Our networks will continue to consist of physical devices, but we will increasingly introduce virtualised functionality Why 1: Physical devices will continue to exist into the foreseeable future but virtualisation represents an exciting approach into the future
Why 2: Virtual entities are easier to activate and manage (assuming sufficient capacity exists)
Why 3: Physical equipment supply, build, deploy and test cycles are much longer and labour intensive
Why 4: Virtual assets are more flexible, faster and cheaper to commission
Why 5: Customer services can be turned up faster and cheaper
software, network, IT and cloud providers and developers With this increase in virtualisation, we find an increasingly large and diverse array of suppliers contributing to our value-chain. These suppliers contribute via software, network equipment, IT functions and cloud resources Why 1: CSPs can access innovation and efficiency occurring outside their own organisation
Why 2: CSPs can leverage the opportunities those innovations provide
Why 3: CSPs can deliver more attractive offers to customers
Why 4: Key metrics such as profitability and customer satisfaction are enhanced
rapidly automate new services We want the flexibility to introduce new products and services far faster than we do today Why 1: CSPs can deliver more attractive offers to customers faster than competitors
Why 2: Key metrics such as market share, profitability and customer satisfaction are enhanced as well as improved cashflow
support complete lifecycle management The components that make up our value-chain are changing and evolving so quickly that we need to cope with these changes without impacting customers across any of their interactions with their service Why 1: Customer satisfaction is a key metric and a customer’s experience spans the entire lifecyle of their service.
Why 2: CSPs don’t want customers to churn to competitors
Why 3: Key metrics such as market share, profitability and customer satisfaction are enhanced
unifying member resources To reduce the amount of duplicated and under-synchronised development currently being done by the member bodies of ONAP Why 1: Collaboration and sharing reduces the effort each member body must dedicate to their OSS
Why 2: A reduced resource pool is required
Why 3: Costs can be reduced whilst still achieving a required level of outcome from OSS
vibrant ecosystem To increase the level of supplier interchangability Why 1: To reduce dependence on any supplier/s
Why 2: To improve competition between suppliers
Why 3: Lower prices, greater choice and greater innovation tend to flourish in competitive environments
Why 4: CSPs, as customers of the suppliers, benefit
globally shared architecture To make networks, services and support systems easier to interconnect across the global communications network Why 1: Collaboration on common standards reduces the integration effort between each member at points of interconnect
Why 2: A reduced resource pool is required
Why 3: Costs can be reduced whilst still achieving interconnection benefits

As indicated in earlier posts, ONAP is an exciting initiative for the CSP industry for a number of reasons. My fear for ONAP is that it becomes such a behemoth of technical complexity that it becomes too unwieldy for use by any of the member bodies. I use the analogy of ATM versus Ethernet here, where ONAP is equivalent to ATM in power and complexity. The question is whether there’s an Ethernet answer to the whys that ONAP is trying to solve.

I’d love to hear your thoughts.

(BTW. I’m not saying that the technologies the ONAP team is investigating are the wrong ones. Far from it. I just find it interesting that the mission is starting with a technical direction in mind. I see parallels with the OSS radar analogy.)

Stop looking for exciting new features for your OSS

The iPhone disrupted the handset business, but has not disrupted the cellular network operators at all, though many people were convinced that it would. For all that’s changed, the same companies still have the same business model and the same customers that they did in 2006. Online flight booking doesn’t disrupt airlines much, but it was hugely disruptive to travel agents. Online booking (for the sake of argument) was sustaining innovation for airlines and disruptive innovation for travel agents.
Meanwhile, the people who are first to bring the disruption to market may not be the people who end up benefiting from it, and indeed the people who win from the disruption may actually be doing something different – they may be in a different part of the value chain. Apple pioneered PCs but lost the PC market, and the big winners were not even other PC companies. Rather, most of the profits went to Microsoft and Intel, which both operated at different layers of the stack. PCs themselves became a low-margin commodity with fierce competition, but PC CPUs and operating systems (and productivity software) turned out to have very strong winner-takes-all effects
.”
Ben Evans
on his blog about Tesla.

As usual, Ben makes some thought-provoking points. The ones above have coaxed me into thinking about OSS from a slightly perspective.

I’d tended to look at OSS as a product to be consumed by network operators (and further downstream by the customers of those network operators). I figured that if our OSS delivered benefit to the downstream customers, the network operators would thrive and would therefore be prepared to invest more into OSS projects. In a way, it’s a bit like a sell-through model.

But the ideas above give some alternatives for OSS providers to reduce dependence on network operator budgets.

Traditional OSS fit within a value-chain that’s driven by customers that wish to communicate. In the past, the telephone network was perceived as the most valuable part of that value-chain. These days, digitisation and competition has meant that the perceived value of the network has dropped to being a low-margin commodity in most cases. We’re generally not prepared to pay a premium for a network service. The Microsofts and Intels of the communications value-chain is far more diverse. It’s the Googles, Facebooks, Instagrams, YouTubes, etc that are perceived to deliver most value to end customers today.

If I were looking for a disruptive OSS business model, I wouldn’t be looking to add exciting new features within the existing OSS model. In fact, I’d be looking to avoid our current revenue dependence on network operators (ie the commoditising aspects of the communications value-chain). Instead I’d be looking for ways to contribute to the most valuable aspects of the chain (eg apps, content, etc). Or even better, to engineer a more exceptional comms value-chain than we enjoy today, with an entirely new type of OSS.

The OSS self-driving vehicle

I was lucky enough to get some time of a friend recently, a friend who’s running a machine-learning network assurance proof-of-concept (PoC).

He’s been really impressed with the results coming out of the PoC. However, one of the really interesting factors he’s been finding is how frequently BAU (business as usual) changes in the OSS data (eg changes in naming conventions, topologies, etc) would impact results. Little changes made by upstream systems effectively invalidated baselines identified by the machine-learning engines to key in on. Those little changes meant the engine had to re-baseline / re-learn to build back up to previous insight levels. Or to avoid invalidating the baseline, it would require re-normalising all of data prior to the identification of BAU changes.

That got me wondering whether DevOps (or any other high-change environment) might actually hinder our attempts to get machine-led assurance optimisation. But more to the point, does constant change (at all levels of a telco business) hold us back from reaching our aim of closed-loop / zero-touch assurance?

Just like the proverbial self-driving car, will we always need someone at the wheel of our OSS just in case a situation arises that the machines hasn’t seen before and/or can’t handle? How far into the future will it be before we have enough trust to take our hands off the OSS wheel and let the machines drive closed-loop processes without observation by us?

An alternate way of slicing OSS (part 2)

Last week we talked about an alternate way of slicing OSS projects. Today, we’ll look a little deeper and include some diagrams.

The traditional (aka waterfall) approach to delivering an OSS project sees one big-bang delivery of business value at the end of the implementation.
OSS project delivery via waterfall

The yellow arrows indicate the sequential nature of this style of delivery. The implications include:

  1. If the project runs out of funds before the project finishes, no (negligible) value is delivered
  2. If there’s no modularity of delivery then the project team must stay the course of the original project plan. There’s no room for prioritising or dropping or including delivery modules. Project plans are rarely perfect at first after all
  3. Any changes in project plan tend to have knock-on effects into the rest of the delivery
  4. There is only one true delivery of value, but milestones demonstrate momentum for the project… a key for change management and team morale
  5. Large deliverables represent the proverbial overload one segment of the project delivery team then under-utilises the rest in each stage.  This isn’t great for project flow or team utilisation

The alternate approach seeks to deliver in multiple phases by business value, not artefacts, as shown in the sample model below:
OSS project delivery via AgilePhased enhancements following a base platform build (eg Sandpit and/or Single-site above) could include the following, where each provides a tangible outcome / benefit for the business, thus maintaining perception of momentum (assurance use-cases cited):

  • Additional event collection (ie additional collectors / probes / mediation-devices can be added or configured)
  • Additional filters / sorting of events
  • Event prioritisation mapping / presentation
  • Event correlation
  • Fault suppression
  • Fault escalation
  • Alarm augmentation
  • Alarm thresholding
  • Root-cause analysis (intra, then inter-domain)
  • Other configurations such as latching, auto-acknowledgement, visualisation parameters, etc
  • Heart-beat function (ie devices are unreachable for a user-defined period)
  • Knowledge base (ie developing a database of activities to respond to certain events)
  • Interfacing with other systems (eg trouble-ticket, work-force management, inventory, etc)
  • Setup of roles/groups
  • Setup of skills-based routing
  • Setup of reporting
  • Setup of notifications (eg email, SMS, etc)
  • Naming convention refinements
  • etc, etc

The latter is a more Agile-style breakdown of work, but doesn’t need to be delivered using Agile methodology.

Of course there are pros and cons of each approach. I’d love to hear your thoughts and experiences with different OSS delivery approaches.

Expanding your bag of OSS tricks

Let me ask you a question – when you’ve expanded your bag of tricks that help you to manage your OSS, where have they typically originated?

By reading? By doing? By asking? Through mentoring? Via training courses?
Relating to technical? People? Process? Product?
Operations? Network? Hardware? Software?
Design? Procure? Implement / delivery? Test? Deploy?
By retrospective thinking? Creative thinking? Refinement thinking?
Other?

If you were to highlight the questions above that are most relevant to the development of your bag of tricks, how much coverage does your pattern show?

There are so many facets to our OSS (ie. tentacles on the OctopOSS) aren’t there? We have to have a large bag of tricks. Not only that, we need to be constantly adding new tricks too right?

I tend to find that our typical approaches to OSS knowledge transfer cover only a small subset (think about discussion topics at OSS conferences that tend to just focus on the technical / architectural)… yet don’t align with how we (or maybe just I) have developed capabilities in the past.

The question then becomes, how do we facilitate the broader learnings required to make our OSS great? To introduce learning opportunities for ourselves and our teams across vaguely related fields such as project management, change management, user interface design, process / workflows, creative thinking, etc, etc.

Zero touch network & Service Management (ZSM)

Zero touch network & Service Management (ZSM) is a next-gen network management approach using closed-loop principles hosted by ETSI. An ETSI blog has just demonstrated the first ZSM Proof of Concept (PoC). The slide deck describing the PoC, supplied by EnterpriseWeb, can be found here.

The diagram below shows a conceptual closed-loop assurance architecture used within the PoC
ETSI ZSM PoC.

It contains some similar concepts to a closed-loop traffic engineering project designed by PAOSS back in 2007, but with one big difference. That 2007 project was based on a single-vendor solution, as opposed to the open, multi-vendor PoC demonstrated here. Both were based on the principle of using assurance monitors to trigger fulfillment responses. For example, ours used SLA threshold breaches on voice switches to trigger automated remedial response through the OSS‘s provisioning engine.

For this newer example, ETSI’s blog details, “The PoC story relates to a congestion event caused by a DDoS (Denial of Service) attack that results in a decrease in the voice quality of a network service. The fault is detected by service monitoring within one or more domains and is shared with the end-to-end service orchestrator which correlates the alarms to interpret the events, based on metadata and metrics, and classifies the SLA violations. The end-to-end service orchestrator makes policy-based decisions which trigger commands back to the domain(s) for remediation.”

You’ll notice one of the key call-outs in the diagram above is real-time inventory. That was much harder for us to achieve back in 2007 than it is now with virtualised network and compute layers providing real-time telemetry. We used inventory that was only auto-discovered once daily and had to build in error handling, whilst relying on over-provisioned physical infrastructure.

It’s exciting to see these types of projects being taken forward by ETSI, EnterpriseWeb, et al.

Network slicing, another OSS activity

One business customer, for example, may require ultra-reliable services, whereas other business customers may need ultra-high-bandwidth communication or extremely low latency. The 5G network needs to be designed to be able to offer a different mix of capabilities to meet all these diverse requirements at the same time.
From a functional point of view, the most logical approach is to build a set of dedicated networks each adapted to serve one type of business customer. These dedicated networks would permit the implementation of tailor-made functionality and network operation specific to the needs of each business customer, rather than a one-size-fits-all approach as witnessed in the current and previous mobile generations which would not be economically viable.
A much more efficient approach is to operate multiple dedicated networks on a common platform: this is effectively what “network slicing” allows. Network slicing is the embodiment of the concept of running multiple logical networks as virtually independent business operations on a common physical infrastructure in an efficient and economical way.
.”
GSMA’s Introduction to Network Slicing.

Engineering a network is one of compromises. There are many different optimisation levers to pull to engineer a set of network characteristics. In the traditional network, it was a case of pulling all the levers to find a middle-ground set of characteristics that supported all their service offerings.

QoS striping of traffic allowed for a level of differentiation of traffic handling, but the underlying network was still a balancing act of settings. Network virtualisation offers new opportunities. It allows unique segmentation via virtual networks, where each can be optimised for the specific use-cases of that network slice.

For years, I’ve been posing the concept of telco offerings being like electricity networks – that we don’t need so many service variants. I should note that this analogy is not quite right. We do have a few different types of “electricity” such as highly available (health monitoring), high-bandwidth (content streaming), extremely low latency (rapid reaction scenarios such as real-time sensor networks), etc.

Now what do we need to implement and manage all these network slices?? Oh that’s right, OSS! It’s our OSS that will help to efficiently coordinate all the slicing and dicing that’s coming our way… to optimise all the levers across all the different network slices!

The OSS co-op business model

A co-operative is a member-owned business structure with at least five members, all of whom have equal voting rights regardless of their level of involvement or investment. All members are expected to help run the cooperative.”
Small Business WA.

The co-op business model has fascinated me since doing some tech projects in the dairy industry in the deep distant past. The dairy co-ops empower collaboration of dairy farmers where the might of the collective outweighs that of each individually. As the collective, they’ve been able to establish massive processing plants, distribution lines, bargaining power, etc. The dairy co-ops are a sell-side collaboration.

By contrast open source projects like ONAP represent an interesting hybrid – part buy-side collaboration (ie the service providers acquiring software to run their organisations) and part sell-side (ie the vendors contributing code to the project alongside the service providers).

I’ve long been intrigued by the potential for a pure sell-side co-operative in OSS.

As we all know, the OSS market is highly fragmented (just look the number of vendors / products on this page), which means inefficiency because of the duplicated effort across vendors. A level of market efficiency comes from mergers and acquisitions. In addition, some comes from vendors forming partnerships to offer more complete solutions to a given customer requirement list.

But the key to a true sell-side OSS co-operative would be in the definition above – “at least five members.” Perhaps it’s an open-source project that brings them together. Perhaps it’s an extended partnership.

As Tom Nolle stated in an article that prompted the writing of today’s post, “On the vendor side, commoditization tends to force consolidation. A vendor who doesn’t have a nice market share has little to hope for but slow decline. A couple such vendors (like Infinera and Coriant, recently) can combine with the hope that the combination will be more survivable than the individual companies were likely to be. Consolidation weeds out industry inefficiencies like parallel costly operations structures, and so makes the remaining players stronger.

Imagine for a moment if instead of having developers spread across 100 alarm management tools, that same developer pool can take a consolidated 5 alarm management products forward? Do you think we’d get better, more innovative, more complete products faster?

Having said that, co-ops have their weaknesses too.

What do you think? Could such a model work? Would it be a disaster?

Orchestration looks a bit like provisioning

The following is the result of a survey question posed by TM Forum:
Number 1 Driver for Orchestration

I’m not sure how the numbers tally, but conceptually the graph above paints an interesting perspective of why orchestration is important. The graph indicates the why.

But in this case, for me, the why is the by-product of the how. The main attraction of orchestration models is in how we can achieve modularity. All of the business outcomes mentioned in the graph above will only be achievable as a result of modularity.

Put another way, rather than having the integration spaghetti of an “old-school” OSS / BSS stack, orchestration (and orchestration plans) potentially provides the ability to provide clearer demarcation and abstraction all the way from product design down into transactions that hit the network… not to mention the meet-in-the-middle points between business units.

Demarcation points support catalog items (perhaps as APIs / microservices with published contracts), allowing building-block design of products rather than involvement of (and disputes between) business units all down the line of product design. This facilitates the speed (34%) and services on demand (28%) objectives stated in the graph.

But I used the term “old-school” with intent above. The modularity mentioned above was already achieved in some older OSS too. The ability to carve up, sequence, prioritise and re-construct a stream of service orders was already achievable by some provisioning + workflow engines of the past.

The business outcomes remain the same now as they were then, but perhaps orchestration takes it to the next level.

Using OSS machine learning to predict backwards not forwards

There’s a lot of excitement about what machine-led decisioning can introduce into the world of network operations, and rightly so. Excitement about predictions, automation, efficiency, optimisation, zero-touch assurance, etc.

There are so many use-cases that disruptors are proposing to solve using Artificial Intelligence (AI), Machine Learning (ML) and the like. I might have even been guilty of proposing a few ideas here on the PAOSS blog – ideas like closed-loop OSS learning / optimisation of common processes, the use of Robotic Process Automation (RPA) to reduce swivel-chairing and a few more.

But have you ever noticed that most of these use-cases are forward-looking? That is, using past data to look into the future (eg predictions, improvements, etc). What if we took a slightly different approach and used past data to reconcile what we’ve just done?

AI and ML models tend to rely on lots of consistent data. OSS produce or collect lots of consistent data, so we get a big tick there. The only problem is that a lot of our manually created OSS data is riddled with inconsistency, due to nuances in the way that each worker performs workflows, data entry, etc as well as our own personal biases and perceptions.

For example, zero-touch automation has significant cachet at the moment. To achieve that, we need network health indicators (eg alarms, logs, performance metrics), but also a record of interventions that have (hopefully) improved on those indicators. If we have inconsistent or erroneous trouble-ticketing information, our AI is going to struggle to figure out the right response to automate. Network operations teams tend to want to fix problems quickly, get the ticketing data populated as fast as possible and move on to the next fault, even if that means creating inconsistent ticketing data.

So, to get back to looking backwards, a machine-learning use-case to consider today is to look at what an operator has solved, compare it to past resolutions and either validate the accuracy of their resolution data or even auto-populate fields based on the operator’s actions.

Hat tip to Jay Fenton for the idea seed for this post.