Back in the days when I first started using OSS/BSS software tools, there was no way any respectable telco was going to use open-source software (the other oss, for which I’ll use lower-case in this article) in their OSS/BSS stacks. The arguments were plenty, and if we’re being honest, probably had a strong element of truth in many cases back then.
These arguments included:
Security – This is the most commonly cited aversion I’ve heard to open-source. Our OSS/BSS control our network, so they absolutely have to be secure. Secure across all aspects of the stack from network / infrastructure to data (at rest and in motion) to account access to applications / code, etc. The argument against open-source is that the code is open to anyone to view, so vulnerabilities can be identified by hackers. Another argument is that community contributors could intentionally inject vulnerabilities that aren’t spotted by the rest of the community
Quality – There is a perception that open-source projects are more hobby projects than professional. Related to that, hobbyists can’t expend enough effort to make the solution as feature-rich and/or user-friendly as commercial software
Flexibility – Large telcos tend to want to steer the products to their own unique needs via a lot of customisations. OSS/BSS transformation projects tend to be large enough that proprietary software vendors can be paid to make the requested changes. Choosing open-source implies accepting that the product (and its roadmap) is defined by its developer community unless you wish to develop your own updates
Support – Telcos run 24x7x365, so they often expect their OSS/BSS vendors to provide round-the-clock support as well. There’s a belief that open-source comes with a best-effort support model with no contracted service obligations. And if something does go drastically wrong, that open-source disclaims all responsibility and liability
Continuity – Telcos not only run 24x7x365, but also expect to maintain this cadence for decades to come. They need to know that they can rely on their OSS/BSS today but also expect a roadmap of updates into the future. They can’t become dependent upon a hobbyist or community that decides they don’t want to develop their open-source project anymore
Luckily, these perceptions around open-source have changed in telco circles in recent years. The success of open-source organisations like Red Hat (acquired by IBM for $34 billion on annual revenues of $3.4 billion) has shown that valuable business models can be underpinned by open-source. There are many examples of open-source OSS/BSS projects driving valuable business models and associated professionalism. The change in perception has possibly also been driven by shifts in application architectures, from monolithic OSS/BSS to more modular ones. Having smaller modules has opened the door to utilisation of building block solutions like the Apache projects.
So let’s look at the same five factors above again, but through the lens of the pros rather than the cons.
Security – There’s no doubt that security is always a challenge, regardless of being open-source or proprietary software, especially for an industry like OSS/BSS where all organisations are still investing more heavily in innovation (new features / capabilities) than in security optimisations. Clearly the openness of code means vulnerabilities are more easily spotted in open-source than in “walled-garden” proprietary solutions, not just by nefarious actors, but by its development community as well. Linus’ Law suggests that “given enough eyeballs, all bugs (and security flaws) are shallow.” The question for open-source OSS/BSS is whether there are actually many eyeballs. All commercially successful open-source OSS/BSS vendors that I’m aware of have their own teams of professional developers who control any changes to the code base, even on the rare occasions when there are community contributions. However, many modern open-source OSS/BSS leverage other open-source modules that do have many eyes (eg Linux, SNMP libraries, Apache projects, etc)
Quality – There’s no doubt that many open-source OSS/BSS have matured and found valuable business models to sustain them. With the profitable business model has come increased resources, professionalism and quality. With the increased modularity of modern architectures, open-source OSS/BSS projects are able to perform very specific niche functionalities. Contrast this with the monolithic proprietary solutions that have needed to spread their resources thinner across a much wider functional estate. Also successful open-source OSS/BSS organisations tend to focus on product development and product-related services (eg support), whereas the largest OSS/BSS firms tend to derive a much larger percentage of revenues from value-added services (eg transformations, customisations, consultancy, managed services, etc). The latter are more services-oriented companies than product companies.
Flexibility – There has been a significant shift in telco mindsets in recent years, from an off-the-shelf to a build-your-own OSS/BSS stack. Telcos like AT&T have seen the achievements of the hyperscalers, observed the increased virtualisation of networks and realised they needed to have more in-house software development skills. Having in-house developers and access to the code-base of open-source means that telcos have (almost) complete control over their OSS/BSS destinies. They don’t need to wait for proprietary vendors to acknowledge, quote, develop and release new feature requests. They can just slip the required changes into their CI/CD pipeline and prioritise according to resource availability
Support – Remember when I mentioned above that OSS/BSS organisations have found ways to build profitable business models around open-source software? In most cases, their revenues are derived from annual support contracts. The quality and coverage of their support (and the products that back it up) is directly tied to their income stream, so there’s commensurate professionalism assigned to support
Continuity – This is perhaps the most interesting one for me. There is the assumption that big, commercial software vendors are more reliable than open-source vendors. This may (or might not) be the case. Plenty of commercial vendors have gone out of business, just as plenty of open-source projects have burned out or dwindled away. To counter the first risk, telcos pay to enter into software escrow agreements with proprietary vendors to ensure critical fixes and roadmap can continue even in the event that a vendor ceases to operate. But the escrow contract may not cover the case where a commercial vendor chooses to obsolete a line of software or just fails to invest in new features or patches. Telcos are effectively paying an insurance fee to have access to the code for operational continuity purposes, but escrow may still not be as open as open-source, which is available under any scenario. But the more important continuity consideration is the data, because data is the reason OSS/BSS exist. When choosing a commercial provider, especially a cloud software / service provider, the data goes into a black box. What happens to the data inside the black box is proprietary and often what comes out of it is also. Telcos will tend to have far more control of their data destinies for operational continuity if using open-source solutions
Now, I’m not advocating one or the other for your particular situation. As cited above, there are clearly pros and cons for each approach as well as different products of best-fit for different operators. However, open-source can no longer be as summarily dismissed as it was when I first started on my OSS/BSS journey. There are many fine OSS and BSS products and vendors in our Blue Book OSS/BSS Vendor Directory that are worthy of your consideration too when looking into your next product or transformation.
The pandemic has been beneficial for the telco world in one way. For many who weren’t already aware, it’s now clear how incredibly important telecommunications providers are to our modern way of life. Not just for our ability to communicate with others, but our economy, the services we use, the products we buy and even more fundamentally, our safety.
Working in the telco industry, as I’m sure you do, you’ll also be well aware of all the rhetoric and politics around Chinese manufactured equipment (eg Huawei) being used in the networks of global telco providers. The theory is that having telecommunications infrastructure supplied by a third-party, particularly a third-party aligned with non-Western allies, puts national security interests at risk.
In this article, “5G: The outsourced elephant in the room,” Bert Hubert provides a brilliant look into the realities of telco network security that go far beyond just equipment supply. It breaks the national security threat into three key elements:
Spying (using compromised telco infrastructure to conduct espionage)
Availability (compromising and/or manipulating telco infrastructure so that it’s unable to work reliably)
Autonomy (being unable to operate a network or to recover from outages or compromises)
The first two are well understood and often discussed. The third is the real elephant in the room. The elephant OSS/BSS have a huge influence over (potentially). But we’ll get to that shortly.
Before we do, let’s summarise Bert’s analysis of security models. For 5G, he states that there’s an assumption that employees at national carriers design networks, buy equipment, install it, commission it and then hand it over to other employees to monitor and manage it. Oh, and to provide other specialised activities like lawful intercept, where a local legal system provides warrants to monitor the digital communications of (potentially) nefarious actors. Government bodies and taxpayers all assume the telcos have experienced staff with the expertise to provide all these services.
However, the reality is far different. Service providers have been outsourcing many of these functions for decades. New equipment is designed, deployed, configured, maintained and sometimes even financed by vendors for many global telcos. As Bert reinforces, “Just to let that sink in, Huawei (and their close partners) already run and directly operate the mobile telecommunication infrastructure for over 100 million European subscribers.“
But let’s be very clear here. It’s not just Huawei and it’s not just Chinese manufacturers. Nor is it just mobile infrastructure. It’s also cloud providers and fixed-line networks. It’s also American manufacturers. It’s also the integrators that pull these networks and systems together.
Bert also points out that CDRs (Call Detail Records) have been outsourced for decades. There’s a strong trend for billing providers to supply their tools via SaaS delivery models. And what are CDRs? Only metadata. Metadata that describes a subscriber’s activities and whereabouts. Data that’s powerful enough to be used to assist with criminal investigations (via lawful intercept). But where has CDR / bill processing been outsourced to? China and Israel mostly.
Now, let’s take a closer look at the autonomy factor, the real elephant in the room. Many design and operations activities have been offshored to jurisdictions where staff are more affordable. The telcos usually put clean-room facilities in place to ensure a level of security is applied to any data handled off-shore. They also put in place contractual protection mechanisms.
Those are moot points, but still not the key point here. As Bert brilliantly summarises, “any worries about [offshore actors] being able to disrupt our communications through backdoors ignore the fact that all they’d need to do to disrupt our communications.. is to stop maintaining our networks for us!“
There might be an implicit trust in “Western” manufacturers or integrators (eg Ericsson, Nokia, IBM) in designing, building and maintaining networks. However, these organisations also outsource / insource labour to international destinations where labour costs are cheaper.
If the R&D, design, configuration and operations roles are all outsourced, where do the telcos find the local resources with requisite skills to keep the network up in times when force majeure (eg war, epidemic, crime, strikes, etc) interrupts a remote workforce? How do local resources develop the required skills if the roles don’t exist locally?
Bert proposes that automation is an important part of the solution. He has a point. Many of the outsource arrangements are time and materials based contracts, so it’s in the resource suppliers’ best interests for activities to take a lot of time to maintain manually. He counters by showing how the hyperscalers (eg Google) have found ways of building automations so that their networks and infrastructure need minimal support crews.
Their support systems, unlike the legacy thinking of telco systems, have been designed with zero-touch / low-touch in mind.
If we do care about the stability, resiliency and privacy of our national networks, then something has to be done differently, vastly differently! Having highly autonomous networks, OSS, BSS and related systems is a start. Having a highly skilled pool of local resources that can research, design, build, commission, operate and improve these systems would also seem important. If the business models of these telcos can’t support the higher costs of these local resources, then perhaps national security interests might have to subsidise these skills?
I wonder if the national carriers and/or local OSS/BSS / Automation suppliers are lobbying this point? I know a few governments have inserted security regulatory systems and pushed them onto the telcos to adhere to, to ensure they have suitable cyber-security mechanisms. They also have lawful intercept provisions. But do any have local operational autonomy provisions? None that I’m aware of, but feel free to leave us a comment about any you’re aware of.
The aim of the show is to shine a light on the many brilliant people who work in the OSS industry.
We’ll interview experts in the field of OSS/BSS and telecommunications software. Guests represent the many facets of OSS including: founders, architects, business analysts, designers, developers, rainmakers, implementers, operators and much more, giving a 360 degree perspective of the industry.
We’ll delve into the pathways they’ve taken in achieving their god-like statuses, but also unlock the tips, tactics, methodologies and strategies they employ. Their successes and failures, challenges and achievements. We’ll look into the past, present and even seek to peer into what the future holds for the telco and OSS industries.
We posted an article in July entitled “OSS / BSS in the clouds,” which looked at the OSS, BSS and related telco infrastructure platforms being offered by AWS, Google, Microsoft and their partners. This followed a number of recent announcements made by the hyperscalers relating to their bigger pushes into telco. It had a particular focus on 5G models and the edge compute technologies that support them.
Following VMware’s announcement of its 5G Telco Cloud Portfolio on 1st Sept 2020, we’ve now retro-fitted VMware into the earlier post. Click on it here “OSS / BSS in the clouds,” to view the updated article.
It also mentioned that the first principle behind that advantage is simplicity (of systems, overheads, processes, offerings, etc). Many of these simplification factors are controllable by our OSS/BSS, but most just end up with us as a result of up-stream decisions that flow to us for resolution. These decisions (eg running 50+ mobile offering variants including grandfathered services) make our OSS/BSS far more complex.
Another example is omni-channel customer engagement, which includes:
Digital / websites
IVR (Interactive Voice Response)
USSD (Unstructured Supplementary Service Data)
I completely get that we want to allow customers (or potential customers) to choose the contact mechanism/s they’re most comfortable with. Tight coupling / orchestration across these channels is important for customer loyalty.
Unfortunately, the task of coordinating all of these systems is complex. Linking and tracking customer journeys through multiple channels is even more challenging. For example, websites and IVR channels don’t require customers to self-identify, making it difficult to cross-link with in-app touch-point data where identification is inherent. Some flows are personalised, some are interactive, some are batched.
Gaining consolidated insights from these disparate data sources is imperative for customer satisfaction.
Clearly a typical telco omni-channel story doesn’t represent simplicity, especially for the OSS/BSS that support it! That presents a barrier to achieving the cost of production advantage we discussed.
How do telcos compare with Apple, Amazon, Google and others? Do they offer as many customer contact channels? Do they have as many variants in their “extended supply chain?” Do they have as many potential fall-out points in their O2A, T2R, P2O, etc workflows? It appears to me that most telcos are at a structural disadvantage here (on costs and CX).
We’d love to hear your thoughts. Please leave us your perspectives in the comments section below.
We talked yesterday about the commoditisation of telco services and the part that OSS/BSS have to play in differentiation. We also talked about telcos retaining a few competitive advantages despite the share-of-wallet inroads made by OTT, software and cloud service providers recently. Managed services is one area where some of those advantages converge.
Quite a few years ago I worked with one of Australia’s largest telcos on a managed service account for one of Australia’s big four banks. The managed service contract was worth a couple of billion dollars. It covered the whole gamut of what a telco can offer. Voice, data, mobility and unified communications of course, but also branch / office comms fit-outs, international WANs, IT services, security, custom help-desk, dedicated service delivery management and much more.
Big companies tend to prefer to deal with big companies, especially when the services are complex and wide-ranging. A bank like this one could realistically only negotiate with a handful of telcos because only a few could provide viable size and scope to meet their needs… if they wanted to keep it to a consolidated procurement event.
This bank wasn’t this telco’s only large managed services contract. They had quite a few similar clients. The deals were all incredibly cut-throat between the other big telcos, but they were still profitable, especially once the variations started coming in. The telcos are not in a race to the bottom on managed services like the commodity data services we discussed yesterday.
I found it interesting that the telcos mainly focused on providing these managed service deals to the ASX200 (Australia’s top 200 companies by market capitalisation), give or take. So they service the top 200 (ish) with managed services and the long, long tail with retail services. But of course mid-market companies fall somewhere in the middle – big enough to require custom solutions, but too small to warrant the type of managed services the bank was getting.
So, now let me put the OSS/BSS spin on this situation (yes, it took me a while). At this telco, each of its big managed service contracts had completely bespoke OSS/BSS/portal solutions. Of course they had a core OSS/BSS – the factory that handled assurance and fulfilment for customers large and small – but every managed service had completely unique, satellite OSS/BSS/portals at their customer interface. For example, the core ran eTOM, the customer interface often demanded ITIL.
This telco couldn’t offer managed services to the mid-market because it was too expensive to set up the unique tooling. That meant there was almost no repeatability or economy of scale in processes, reporting, data science, skill portability, etc.
It bewildered me that they hadn’t invested the time into creating a cookie-cutter approach to all these satellite OSS/BSS/portals. Sure, each would need some customisation, but there was also significant potential for repeatability / consistency.
Now this may sound a bit bewildering to you too. But it’s not just the telco’s short-term thinking that leads to this situation. Most OSS/BSS vendors don’t build products with this type of multi-tenancy in mind.
The ability to spin up/down satellite OSS/BSS/portals, each with configurability but consistency, is not common. Importantly, nor is secure partitioning that ensures each customer only sees their data even when using shared infrastructure (including NOC / WOC / SOC as well as network infra and core OSS/BSS).
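To make the secure-partitioning idea a little more concrete, here is a minimal sketch in Python. It assumes a single shared store (standing in for shared infrastructure) and hands each managed-service customer a tenant-scoped view that can only ever return that customer's records. Every name in it (`Alarm`, `SharedAlarmStore`, `TenantView`, the tenant ids) is hypothetical and purely for illustration; a real OSS/BSS would enforce this at many more layers (authentication, database, network, NOC tooling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    tenant_id: str  # hypothetical id for the managed-service customer
    device: str
    severity: str


class SharedAlarmStore:
    """One shared store holding alarms for every tenant."""

    def __init__(self):
        self._alarms = []

    def add(self, alarm):
        self._alarms.append(alarm)

    def view_for(self, tenant_id):
        # Hand out a tenant-scoped view rather than the raw list.
        return TenantView(self, tenant_id)

    def _query(self, tenant_id):
        # The tenant filter is applied on every read.
        return [a for a in self._alarms if a.tenant_id == tenant_id]


class TenantView:
    """A satellite 'portal' handle bound to exactly one tenant."""

    def __init__(self, store, tenant_id):
        self._store = store
        self._tenant_id = tenant_id

    def alarms(self):
        return self._store._query(self._tenant_id)


store = SharedAlarmStore()
store.add(Alarm("bank-a", "router-1", "critical"))
store.add(Alarm("retailer-b", "switch-7", "minor"))
store.add(Alarm("bank-a", "firewall-2", "major"))

bank_portal = store.view_for("bank-a")
print([a.device for a in bank_portal.alarms()])
```

The design choice worth noting is that the tenant filter lives in one place (the view), so spinning up another satellite portal is just another call to `view_for` rather than another bespoke build.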
You know, I don’t recall ever hearing anyone else talk about this type of use-case. Multi-tenancy, yes, to an extent, but not at managed services scale.
Is this even a thing? Should it be? Could it be? You be the judge! I’d love to hear your thoughts / experiences. Please leave a comment below if your OSS/BSS stack supports this use-case or if your telco has developed a cookie-cutter approach. What approach do you use?
Since widespread deregulation of telecommunications globally, the passing of data has become a commodity. Perhaps it always was, but increased competition has steadily driven down dollar per bit. It’s likely to continue on that path too. Meanwhile the expected throughputs and consumption of data services is ramping ever-upwards, which requires investment in networks by their operators.
By definition, a commodity is a product/service that is indistinguishable between providers. The primary differentiator is price because there are no other features that make a buyer choose one provider over another.
At face value, that’s true of commodities such as oil or iron ore, just as it is for data. What could be less differentiated than ones and zeroes?
However, as the charts below show for oil and iron ore, commodities aren’t always a level playing field. Some suppliers have significant differentiation – via cost of production advantages over their competitors.
What if we created a $/bit graph of telcos similar to the oil graph above showing opex and capex splits? Which countries / operators would have significant advantage? Which telcos are like Rio Tinto, comfortable in the knowledge that no matter how low the spot price goes, their competitors will be deeply unprofitable (and possibly out of business) long before they will be?
Virtualisation of network infrastructure (SDN, NFV, cloud) has been the much-hyped saviour of telco business models. However, like me, you’ve probably started hearing news that savings from these architectures just aren’t eventuating. Cost-bases are possibly even going up because of their increased complexity. Either way, incremental cost reduction tends to have diminishing returns anyway.
It would seem that telcos are left with two / three choices:
Have structurally lower production costs, like Kuwait and Rio Tinto; or
Differentiate / Innovate, limiting exposure to the raw data / connectivity services that so many telcos still rely upon
OSS/BSS has a part to play with both of these options.
Structurally lower cost comes not from cost reduction but from having far simpler variant trees (ie smaller numbers of product offerings, network types, system integrations, service configuration styles, support models, obsolescence of legacy / grandfathered products, minimal process options, etc, etc). Some of this happens within the OSS/BSS, but a lot more of it stems from upstream decisions.
Differentiation / innovation means being able to innovate through experiences / insights / content / trust / partnerships / ecosystems / local-presence in ways that other organisations can’t. It’s unlikely to be in software alone or in cloud infrastructure because others have proven to do that far more effectively than telcos. As much as they wish otherwise, it’s just not in the DNA of many. Yet that’s where most attention seems to be. Meanwhile OSS/BSS are waiting to be the glue that can leverage the competitive advantages that telcos do still hold.
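As a rough illustration of the variant-tree point made under the first option, the sketch below multiplies out the number of options in each dimension to show how quickly end-to-end combinations grow, and how much trimming one dimension helps. It assumes the worst case where every option can pair with every other. The dimension names and counts are invented for illustration; only the figure of 50 offerings echoes the "50+ mobile offering variants" example mentioned earlier.

```python
from math import prod

# Hypothetical number of options per dimension of the variant tree.
variants = {
    "product_offerings": 50,
    "network_types": 4,
    "service_config_styles": 3,
    "support_models": 2,
}


def variant_paths(dimensions):
    """Count distinct combinations, assuming every option can pair with every other."""
    return prod(dimensions.values())


full = variant_paths(variants)
# Same tree, but with the product catalogue rationalised from 50 to 10 offerings.
simplified = variant_paths({**variants, "product_offerings": 10})

print(full, simplified)  # prints: 1200 240
```

Each of those paths is something the OSS/BSS potentially has to provision, assure, bill and test, which is why upstream simplification tends to matter more than incremental cost cutting.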
Our Blue Book OSS/BSS Vendors Directory provides a list of over 400 vendors. That clearly indicates that it’s a highly fragmented market. This amount of fragmentation hurts the industry in many ways, including:
Duplication – Let’s say 100 of the 400 vendors offer alarm / fault management capabilities. That means there are 100 teams duplicating effort in creating similar functionality. Isn’t that re-inventing the wheel, again and again? Wouldn’t the effort be better spread into developing new / improved functionality, rather than repeating what’s already available (more or less)? And it’s not just coding, but testing, training, etc. The talent pool overlaps on duplicated work at the expense of taking us faster into an innovative future
Profit-share – The collective revenues of all those vendors need to be spread across many investors. Consolidated profits would most likely lead to more coordination of innovation (and less duplication above). And just think about how much capability has been lost in tools developed by companies that are no longer commercially viable
Overhead – Closely related is that every one of these organisations has an overhead that supports the real work (ie product development, project implementation, etc). Consolidation would bring greater economies of scale
Consistency – With 400+ vendors, there are 400+ visions / designs / architectures / approaches. This means the cost and complexity of integration hurts us and our customers. The number of variants makes it impossible for everything to easily bolt together – notwithstanding the wonderful alignment mechanisms that TM Forum, MEF, etc create (via their products Frameworx, OpenAPIs, MEF 3.0 Framework, LSO, etc). At the end of the day, they create recommendations that vendors can interpret as they see fit. It seems the integration points are proliferating rather than consolidating
Repeatability and Quality – Repeatability tends to provide a platform for continual improvement. If you do something repeatedly, you have more opportunities to refine. Unfortunately, the bespoke nature of OSS/BSS implementations (and products) means there’s not a lot of repeatability. Linus’s Law of OSS defects also applies, with eyeballs spread across many code-bases. And the spread of our variant trees means that we can never have sufficient test coverage, meaning higher end-to-end failure / fall-out rates than should be acceptable
Portability – Because each product and implementation is so different, it can be difficult to readily transfer skills between organisations. An immensely knowledgeable, talented and valuable OSS expert at one organisation will likely still need to do an apprenticeship period at a new organisation before becoming nearly as valuable
Analysis Paralysis – If you’re looking for a new vendor / product, you generally need to consider dozens of alternatives. And it’s not like the decisions are easy. Each vendor provides a different set of functionality, pros and cons. It’s never a simple “apples-for-apples” comparison (although we at PAOSS have refined ways to make comparisons simpler). It’s certainly not like a cola-lover having to choose between Coke and Pepsi. The cost and ramifications of an OSS/BSS procurement decision are decidedly more significant too obviously
Requirement Spread – Because there are so many vendors with so many niches and such a willingness to customise, our customers tend to have few constraints when compiling a list of requirements for their OSS/BSS. As described in the Lego analogy, reducing the number of building blocks, perhaps counter-intuitively, can actually enhance creativity and productivity
Shared Insight – Our OSS/BSS collect eye-watering amounts of data. However, every data set is unique – collection / ETL approach, network under management, product offerings, even naming conventions, etc. This makes it challenging to truly benchmark between organisations, or share insights, or share seeded data for cognitive tools
However, I’m very cognisant that OSS come in all shapes and sizes. They all have nuanced requirements and need unique consideration. Yet many of our customers stand on a burning platform and desperately need us to create better outcomes for them.
From the points listed above, the industry is calling out for consolidation – especially in the foundational functionality that is so heavily duplicated – inventory / resource management, alarms, performance, workflows, service ordering, provisioning, security, infrastructure scalability, APIs, etc, etc.
If we had a consistent foundation for all to work on, we could then more easily take the industry forward. It becomes a platform for continuous improvement of core functionality, whilst allowing more widespread customisation / innovation at its edges.
But who could provide such a platform and lead its over-arching vision?
I don’t think it can be a traditional vendor. Despite there being 400+ vendors, I’m not aware of any that cover the entire scope of TM Forum’s TAM map. Nor do any hold enough market share currently to try to commandeer the foundational platform
TM Forum wouldn’t want to compromise their subscriber base by creating something that overlaps with existing offerings
Solution Integrators often perform a similar role today, combining a multitude of different OSS/BSS offerings on behalf of their customers. But none have a core of foundational products that they’ve rolled out to enough customers to achieve critical mass
I like the concept of what ONAP is trying to do to rally global carriers to a common cause. However, its size and complexity also worry me. That it’s a core component of the Linux Foundation (LF Networking) gives it more chance of creating a core foundation via collaborative means rather than dictatorial ones
We’d love to hear your thoughts. Is fragmentation a good thing or a bad thing? Do you suggest a better way? Leave us a message below.
Have you been tasked with designing process flows for a telecommunication network operator? Do these include end-to-end (E2E) processes that leverage one (or likely more) of your OSS/BSS tools along the journey? Perhaps you’ve even been tasked with setting a roadmap for OSS/BSS development and/or integration with other OSS/BSS?
This can be a daunting task because there are so many E2E processes that are required to run the business and operations of a network service provider. Where do you start?
Lucky for you and the rest of the industry, the TM Forum has provided a valuable set of tools that have become the benchmark for designing business processes for network operators globally. It’s known as the eTOM (enhanced Telecom Operations Map), which is part of the TM Forum’s Frameworx suite. eTOM originated in the 1990s when the TM Forum sought to assist in the understanding of external business linkages for interface design.
Due to the unique nature and requirements of each organisation, eTOM was always intended to be flexible, but also provide as much standardisation of process as possible. This would provide consistency between vendors / integrators of telco software like OSS/BSS, but also consistency in the delivery of services across points of interconnect (POI) between the networks of different service providers.
The main document (GB921) – A spreadsheet that contains a list of many (1,000+) atomic tasks (or decompositions) that can be used to create E2E flows from
Addendum D (GB921D) – A document that provides a hierarchical and functional grouping of tasks, with multiple levels of granularity (Level 0 is shown in the first diagram below)
(then a decomposition down to task level 3 can be formed, as shown in the second diagram below. Note: don’t worry about the details here, as we’ll get into that later)
Addendum E (GB921E) – A document that describes the design of E2E business streams from the atomic tasks described in GB921. The diagram below comes directly from GB921E. The “customer centric processes” on the left panel represent customer-initiated workflows (more on those later in this article). Meanwhile, the bottom right corner shows how the atomic tasks are linked to form an E2E process for each customer centric process (using L3 decompositions in this example):
Addendum F (GB921F) – A document that provides examples of end to end business processes, built from the atomic tasks provided in the previously mentioned documents. The E2E samples include Request to Answer (also known as R2A), Order to Payment (O2P), Request to Change (R2C) and many others, which we mention in this article about key business process acronyms in the OSS/BSS industry
Addendum G (GB921G) – A document that provides a guide on how to apply the eTOM process framework
There are multiple other addenda that can be used to assist you with the development of your organisation’s E2E processes and integrations, including Addendum W (GB921W), which describes how ITIL (an IT process framework) and eTOM can work together.
The important feature to understand here is that eTOM comprises:
A large list of tasks (GB921), which aren’t process flows in isolation, but need to be joined together as…
…Suggested sequences of tasks (eg GB921E, GB921F) to guide the creation of your E2E processes to meet your workflow requirements
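To make that split a little more concrete, here’s a minimal sketch of the idea in Python. It’s not taken from eTOM itself: the task IDs, task names and the build_flow helper are all made up for illustration, and a real catalogue would come from GB921. The point is simply that the catalogue is a flat list of tasks, and a flow is an ordered selection from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One atomic task from the task catalogue (IDs and names here are made up)."""
    task_id: str
    name: str


# Hypothetical catalogue: stands in for the large task list in GB921.
CATALOGUE = {
    "T1": Task("T1", "Capture customer request"),
    "T2": Task("T2", "Qualify request"),
    "T3": Task("T3", "Prepare answer"),
    "T4": Task("T4", "Deliver answer to customer"),
}


def build_flow(name, task_ids, catalogue=CATALOGUE):
    """Join catalogue tasks into an ordered E2E flow, rejecting unknown task IDs."""
    missing = [t for t in task_ids if t not in catalogue]
    if missing:
        raise KeyError(f"Tasks not in catalogue: {missing}")
    return {"name": name, "steps": [catalogue[t] for t in task_ids]}


# A hypothetical Request to Answer style flow assembled from the catalogue.
request_to_answer = build_flow("Request to Answer", ["T1", "T2", "T3", "T4"])
step_names = [step.name for step in request_to_answer["steps"]]
```

Keeping the catalogue separate from the flows means the same task can be reused in many different E2E processes, which is how the task list and the suggested sequences relate to each other.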
The art is in the building of E2E flows. The tasks are prescribed by eTOM. The processes only partially so. You have the option of forming them using the examples in GB921F, but you will probably also require input from your operational teams, business analysts, past experiences, inputs from vendors (eg OSS/BSS product functionality capabilities), etc. You’re left with infinite possibilities. So you might still be asking, where do I start?
Well, the industry tends to build E2E processes that are initiated by a customer (or an internal operator / engineer). They start with a trigger (X) and end with the outcome (Y) that the trigger was seeking (ie closing the loop). They are often, but not always, named using the “X to Y” convention. The link above provides many examples of these, such as Order to Activate (O2A), Order to Cash (O2C), Trouble to Resolve (T2R), etc.
GB921E provides the following examples:
Customer-Centric E2E business streams:
Request to Answer
Order to Payment
Usage to Payment
Request to Change
Termination to Confirmation
Complaint to Solution
Network E2E business streams:
Production Order to Acceptance
Trouble Ticket to Solution
Activation to Usage Data
Service Lifecycle Management
Resource Lifecycle Management
Product E2E business streams:
Idea to Business Plan
Idea to Business Proposal
Business Proposal to Launch
Assessment to Relaunch
Assessment to Retirement
Market Strategy to Campaign
Engaged Party Flows:
Let’s take a closer look at GB921E and how it helps to solve for the first process in the list above – R2A (Request to Answer) – using eTOM mappings.
Let’s first start with a Level 2 breakdown / mapping of R2A:
Then, this can be used to guide the Level 3 breakdown, which looks more like the E2E R2A process we’re expecting:
GB921E even provides the template for showing detailed information about the R2A process, as follows:
Note that the examples provided above were from eTOM release 20-5 (which includes GB921E v20.0.1). However, the eTOM document libraries are being refined constantly, with major version releases a couple of times each year, so refer back to the TM Forum eTOM page for the latest updates before embarking on your business process designs.
Note: We’ve developed a technique to document, benchmark and optimise operational processes directly from OSS/BSS activity logs. This helps you capture current-state process flows in BPMN format to assist with your eTOM / ITIL process mapping exercise (see sample below).
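For readers wondering what deriving a process flow from activity logs can look like in principle, here’s a minimal Python sketch of one common process-mining building block, the directly-follows count. To be clear, this is not our actual technique, and the log fields, case IDs and activity names below are hypothetical. Real OSS/BSS logs will need a lot more cleansing and correlation first.

```python
from collections import Counter, defaultdict

# Hypothetical activity log rows: (case_id, timestamp, activity).
# Real OSS/BSS logs will have different fields and need cleansing first.
LOG = [
    ("order-1", "2021-03-01T09:00", "Order received"),
    ("order-1", "2021-03-01T09:20", "Design service"),
    ("order-1", "2021-03-01T11:00", "Activate service"),
    ("order-2", "2021-03-01T09:05", "Order received"),
    ("order-2", "2021-03-01T09:40", "Design service"),
    ("order-2", "2021-03-01T10:10", "Fallout rework"),
    ("order-2", "2021-03-01T12:30", "Activate service"),
]


def directly_follows(log):
    """Count how often activity A is immediately followed by B within a case."""
    by_case = defaultdict(list)
    for case_id, timestamp, activity in log:
        by_case[case_id].append((timestamp, activity))

    counts = Counter()
    for events in by_case.values():
        events.sort()  # ISO-8601 strings in the same format sort chronologically
        activities = [activity for _, activity in events]
        counts.update(zip(activities, activities[1:]))
    return counts


edges = directly_follows(LOG)
```

Each counted pair becomes an arrow in a current-state flow diagram, and the counts show which paths (such as the rework loop in this made-up log) are the exception rather than the rule.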
Good luck on your journey of designing telco business processes for your organisation. If you require assistance, please don’t hesitate to contact us via the contact form below.
Have you noticed the recent up-tick in headlines around telco offerings by hyperscalers AWS, Google and Microsoft? Or the multi-cloud telco models, the middleware, supplied by VMware and Red Hat?
Whilst previous generations of wireless connectivity have focussed on voice and data capabilities, 5G is architected to better enable consumer business models. Edge compute (both on-prem / device-edge and provider edge) and related services to support 5G use-cases appear to be the leading driver behind recent announcements. These use-cases will need to be managed by our OSS/BSS for the telco operators and their customers.
Meanwhile, top-tier OSS/BSS users are also continuing to adopt cloud-native OSS/BSS initiatives, as described in this infographic from Analysys Mason / Amdocs. Analysys Mason estimates that over 90% of CSPs in North America, Asia–Pacific and Europe will have their OSS/BSS stacks running on cloud infrastructure by 2022, with well over 60% on hybrid cloud.
However, just how much of the CSP OSS/BSS stack will be on the cloud remains in question. According to TM Forum’s research, most CSPs have deployed less than 5% of their operations software in the public cloud.
In today’s article, we take a closer look into cloud offerings for OSS/BSS. The providers we’ll cover are hyperscalers:
AND Bosch, ThingsBoard, ThingPark, ThingLogix, etc, etc (if we extend into IoT device management)
AWS Marketplace tends to show the solutions that are more standardised / fixed-price in nature (Telecoms section in Marketplace). Many other OSS/BSS vendors such as Netcracker, CSG, Intraway and Camvio don’t appear in the AWS marketplace but have customisable, AWS-ready solutions for clients. These companies have their own sales arms obviously, but also train the AWS global salesforce in their products.
Helping telecommunications companies monetise 5G as a business services platform, including:
The Global Mobile Edge Cloud (GMEC) strategy, which will deliver a portfolio and marketplace of 5G solutions built jointly with telecommunications companies; an open cloud platform for developing network-centric applications; and a global distributed edge for deploying these solutions
Anthos for Telecom, which will bring its Anthos cloud application platform to the network edge, allowing telecommunications companies to run their applications wherever it makes the most sense. Anthos for Telecom—based on open-source Kubernetes—will provide an open platform for network-centric applications.
Empowering telecommunications companies to better engage their customers through data-driven experiences by:
Empowering telecommunications companies to transform their customer experiences through data- and AI-driven technologies. Google’s BigQuery platform provides a scalable data analytics solution—with machine learning built-in so telecommunications companies can store, process, and analyze data in real time, and build personalization models on top of this data
Contact Center AI assists telecommunications companies with customer service. Contact Center AI gives companies 24/7 access to conversational self-service, with seamless hand-offs to human agents for more complex issues. It also empowers human agents with continuous support during their calls by identifying intent and providing real-time, step-by-step assistance
AI and retail solutions including omni-channel marketing, sales and service, personalisation and recommendations, and virtual-agent presence in stores
Assisting them in improving operational efficiencies across core telecom systems. This allows operators to move OSS, BSS and network functions from their own environments to the Google Cloud
Deliver Amdocs solutions to Google Cloud: Amdocs will run its digital portfolio on Google Cloud’s Anthos, enabling communications service providers (CSPs) to deploy across hybrid and multi-cloud configurations
Develop new enterprise-focused 5G edge computing solutions: Amdocs and Google Cloud will create new industry solutions for CSPs to monetize over 5G networks at the edge
Help CSPs leverage data and analytics to improve services: Amdocs will make its Data Hub and Data Intelligence analytics solutions available on Google Cloud. Amdocs and Google Cloud will also develop a new, comprehensive analytics solution to help CSPs leverage data to improve the reliability of their services and customer experiences.
Partner on Site Reliability Engineering (SRE) services: The companies will share tools, frameworks, and best practices for SRE and DevOps
Microsoft has also announced an intention to better serve telecom operators at the convergence of cloud and comms networks through its Azure platform.
The diagram below, from this Microsoft blog, shows their coverage map of offerings for CSP customers (Azure for Operators):
The blog also indicates that their offering is built upon:
Interconnect – 170 points of presence and 20,000 peering connections globally. More than 200 operators are already integrated with the Azure network via their ExpressRoute service
Edge – offered by the Azure platform, whether at enterprise edge, network edge, network core or cloud
Network Functions – This is where Microsoft distinguishes itself from AWS and Google. It can offer network functions, particularly for 5G, via RAN and Mobile Packet Core offerings (see more about Microsoft’s Affirmed and Metaswitch acquisitions below)
Cloud – Incorporates a marketplace with capabilities including OSS/BSS, IoT (via IoT Central), Machine Learning / AI, Azure Cognitive Services (APIs/SDKs that help developers build cognitive intelligence across Decisions, Vision, Speech, Language and Web-search)
“We will continue to partner with existing suppliers, emerging innovators and network equipment partners to share roadmaps and explore expanded opportunities to work together, including in the areas of radio access networks (RAN), next-generation core, virtualized services, orchestration and operations support system/business support system (OSS/BSS) modernization,” states Yousef Khalidi in this Microsoft post.
VMware’s recently announced 5G Telco Cloud Portfolio has been designed to give network operators the platform to accelerate 5G and Edge implementation. Its key differentiator from the examples provided above is it allows operators to run containerised workloads across private, telco, edge and public clouds. This is seen as being an important feature allowing telcos to avoid cloud partner lock-in.
The press release above indicates that, “VMware is evolving its VMware vCloud NFV solution to Telco Cloud Infrastructure, providing CSPs a consistent and unified platform delivering consistent operations for both Virtual Network Functions (VNFs) and Cloud Native Network Functions (CNNFs) across telco networks. Telco Cloud Infrastructure is designed to optimize the delivery of network services with telco centric enhancements, supporting distributed cloud deployments… Tightly integrated with Telco Cloud Infrastructure, VMware’s Telco Cloud Automation intelligently automates the end-to-end lifecycle management of network functions and services to simplify operations and accelerate service delivery while optimizing resource utilization. Telco Cloud Automation also now supports infrastructure and Containers-as-a-Service (CaaS) management automation to streamline workload placement and deliver optimal infrastructure resource allocation. It also significantly simplifies the 5G and telco edge network expansions through zero-touch-provisioning (ZTP) whenever capacity is required.”
Red Hat’s origins, of course, are in open-source tools such as Red Hat Linux. They’ve now evolved to become a leading provider of open-source solutions for enterprise, delivering Linux, cloud, container and Kubernetes technology. Red Hat was acquired by IBM in 2019.
Red Hat’s telco offerings are built upon the premise that service providers will use more open-source, multi-vendor solutions to underpin their OSS, BSS and networks of the future. Red Hat aims to offer open infrastructure to facilitate service provider initiatives such as NFV, 5G, OpenRAN and Edge Compute. This includes coordination of telco and IT infrastructure, but also the applications and data that are intertwined with them.
Red Hat’s telco proposition is supported by:
OpenShift – a cloud compute platform as a service (PaaS), built upon Docker containers (containerised applications) that are managed by Kubernetes. This is particularly relevant to support telco cloud models that provide virtualised network functions. It also helps to deliver the edge compute infrastructure that’s becoming synonymous with 5G
OpenStack – a set of components, mostly deployed as infrastructure as a service (IaaS) that help manage compute, storage and networks. Some of the components are shown in the diagram below sourced from redhat.com
Ansible Automation Platform – to automate network configuration, fault remediation, security updates, and more
Marketplace – to assist service providers in finding, buying and deploying certified, container-based software
Telco Ecosystem Program – that brings together enterprise and community partners to deliver integrated telco solutions. Partners include Affirmed, Altiostar, Atos, Cisco, Ericsson, Amdocs, MYCOM OSI, Zabbix, Metaswitch, Nokia, Juniper and more
Consulting – offering service resources that include Innovation Labs, training and consulting
And other solutions such as Ceph Storage, Cloud Suite, Quay, JBoss suite, Integration, Insights, Fuse and more.
CSPs (Communications Service Providers) find themselves in a catch-22 position with cloud providers. Their own OSS/BSS and those of their suppliers have an increasing reliance on cloud provider services and infrastructure. Due to economies of scale, efficiency of delivery, scalability and a long-tail of service offerings (from the cloud providers and their marketplaces), CSPs aren’t able to compete. The complexity of public cloud (security, scalability, performance, interoperability, etc) also makes it a quandary for CSPs. It’s already a challenge (commercially and technically) to run the networks they do, but prohibitively difficult to expand coverage further to include public cloud.
Yet, by investing heavily in cloud services, CSPs are funding further growth of said cloud providers, thus making CSPs less competitive, but more reliant, on cloud services. Telco architects are becoming ever more adept at leveraging the benefits of cloud. An example is being able to spin up apps without having to wait for massive infrastructure projects to be completed first, which has been a massive dependency (ie time delay) for many OSS/BSS projects.
In the distant past, CSPs had the killer apps, being voice and WAN data. These services supported the long-tail of business (eg salespeople from every industry in the world would make sales calls via telephony services) and customers were willing to pay a premium for these services.
The long-tail of business is now omni-channel, and the killer apps are content, experiences, data and the apps that support them. Being the killer apps, whoever supplies them also takes the premium and share-of-wallet. AWS, Google and Microsoft are supplying more of today’s killer apps (or the platforms that support them) than CSPs are.
The risk for CSPs is that cloud providers and over the top players will squeeze most of the profits from massive global investments in 5G. This is exacerbated if telco architects get their cloud architectures wrong and OPEX costs spiral out of control. Whether architectures are optimal or not, CSPs will fund much of the cloud infrastructure. But if CSPs don’t leverage cloud provider offerings, the infrastructure will cost even more, take longer to get to market and constrain them to local presence, leaving them at a competitive disadvantage with other CSPs.
If I were a cloud provider, I’d be happy for CSPs to keep providing the local, physical, outside plant networks (however noting recent investments in local CSPs such as Amazon’s $2B stake in Bharti Airtel and Google’s $4.7 billion investment in Jio Platforms* not to mention Google Fiber and sub-sea fibre roll-outs such as this). It’s CAPEX intensive and needs a lot of human interaction to maintain / augment the widely distributed infrastructure. That means a lot is paid on non-effective time (ie the travel-time of techs travelling to site to fix problems, managing resources and/or coordinating repairs with property owners). Not only that, but there tends to be a lot of regulatory overhead managing local infrastructure / services as well as local knowledge / relationships. Governments want to ensure all their constituents have access to communications services at affordable prices. All the while, revenue per bit is continuing to drop, so merely shuffling bits around is a business model with declining profitability.
With declining profitability, operational efficiency improvements and cost reduction become even more important. OSS/BSS tools are vital for delivering improved productivity. But CSPs are faced with the challenge of transforming from legacy, monolithic OSS/BSS to more modern, nimble solutions. The more modular, flexible OSS/BSS of today and in future roadmaps are virtualised, microservice-based and designed for continuous delivery / DevOps. This is painting CSPs into a cloud-based future.
Like I said, a catch-22 for CSPs!
But another interesting strategy by Google is that its Anthos hybrid cloud platform will run multi-cloud workloads, including workloads on AWS and Microsoft Azure. Gartner predicts that >75% of midsize and large organisations will have adopted a multi-cloud and/or hybrid IT strategy by 2021 to prevent vendor lock-in. VMware (Dell) and Red Hat (IBM) are others creating multi-cloud / hybrid-cloud offerings. This gives the potential for CSPs to develop a near-global presence for virtualised telco functions. But will cloud providers get there before the telcos do?
For those of us supporting or delivering OSS/BSS, our future is in the clouds either way. It’s a rapidly evolving landscape, so watch this space.
* Note: Google is not the only significant investor in Jio:
Nope, they sell financial outcomes – they reduce downtime, they turn on revenue, they improve productivity by coordinating the workforce, etc…
But they only “sell money” if they can help stakeholders clearly see the money! I mean “actually” see it, not “read between the lines” see it! (so many benefits of OSS are intangible, so we have to help make the financial benefits more obvious).
They don’t sell network performance metrics or orchestration plans or AI or any other tech chatter. They sell money in the form of turning on customers that pay to use comms services. They sell insurance policies (ie service reliability) that keep customers from churning.
Or to think of it another way, could you estimate (in a dollar amount) the consequences of not having the OSS/BSS? What would the cost to your organisation be?
As discussed, some aspects of Operational Expenses are well known when kicking off a new OSS project (eg annual OSS license / support costs). Others can slip through the cracks – what I referred to as OPEX leakage (eg third-party software, ongoing maintenance of software customisations).
OPEX leakage might be an unfair phrase. If there’s a clear line of sight from the expenses to a profitable return, then it’s not leakage. If costs (of data, re-work, cloud services, applications, etc) are proliferating with no clear benefit, then the term “leakage” is probably fair.
I’ve seen examples of Agile and cloud implementation strategies where leakage has occurred. And even the supposedly “cheap” open-source strategies have led to surprises. OPEX leakage has caused project teams to scramble as their financial year progressed and budgets were unexpectedly being exceeded.
Oh, and one other observation to share that you may’ve seen examples of, particularly if you’ve worked on OSS in large organisations – Having OPEX incurred by one business unit but the benefit derived by different business units. This can cause significant problems for the people responsible for divisional budgets, even if it’s good for the business as a whole.
Let me explain by example: An operations delivery team needs extra logging capability so they stand up a new open-source tool. They make customisations so that log data can be collected for all of their network types. All log data is then sent to the organisation’s cloud instance. The operations delivery team now owns lifecycle maintenance costs. However, the cost of cloud (compute and storage) and data lake licensing have now escalated but Operations doesn’t foot that bill. They’ve just handed that “forever” budgetary burden to another business unit.
The opposite can also be true. The costs of build and maintain might be borne by IT or ops, but the benefits in revenue or CX (customer experience) are gladly accepted by business-facing units.
Both types of project could give significant whole-of-company benefit. But the unit doing the funding will tend to choose projects that are less effective if it means their own business unit will derive benefit (especially if individuals’ bonuses are tied to those results).
OSS can be powerful tools, giving and receiving benefit from many different business units. However, the more OPEX-centric OSS projects that we see today are introducing new challenges to get funded and then supported across their whole life-cycle.
PS. Just like diamonds bought at retail prices, there’s a risk that the financials won’t look so great a year after purchase. If that’s the case, you may have to seek justification on intangible benefits. 😉
PS2. Check out Robert’s insightful comment to the initial post, including the following question, “I wonder how many OSS procurements are justified on the basis of reducing the Opex only *of the current OSS*, rather than reducing the cost of achieving what the original OSS was created to do? The former is much easier to procure (but may have less benefit to the business). The latter is harder (more difficult analysis to do and change to manage, but payoff potentially much larger).”
Geoff Moore’s seminal book, “Crossing the Chasm,” described the psychological chasm between early buyers and the mainstream market.
Seth Godin cites Moore’s work, “Moore’s Crossing the Chasm helped marketers see that while innovation was the tool to reach the small group of early adopters and opinion leaders, it was insufficient to reach the masses. Because the masses don’t want something that’s new, they want something that works…
The lesson is simple:
– Early adopters are thrilled by the new. They seek innovation.
– Everyone else is wary of failure. They seek trust.”
I’d reason that almost all significant OSS buyer decisions fall into the “mainstream market” section in the diagram above. Why? Well, an organisation might have the 15% of innovators / early-adopters conceptualising a new OSS project. However, sign-off of that project usually depends on a team of approvers / sponsors. Statistics suggest that 85% of the team is likely to exist in a mindset beyond the chasm and outweigh the 15%.
The mainstream mindset is seeking something that works and something they can trust.
But OSS / digital transformation projects are hard to trust. They’re all complex and unique. They often fail to deliver on their promises. They’re rarely reliable or repeatable. They almost all require a leap of faith (and/or a burning platform) for the buyer’s team to proceed.
OSS sellers seek to differentiate from the 400+ other vendors (of course). How do they do this? Interestingly, by pitching their innovations and uniqueness mostly.
Do you see the gap here? The seller is pitching the left side of the chasm and the buyer cohort is on the right.
I wonder whether our infuriatingly lengthy sales cycles (often 12-18 months) could be reduced if only we could engineer our products and projects to be more mainstream, repeatable, reliable and trustworthy, whilst being less risky.
This is such a dilemma though. We desperately need to innovate, to take the industry beyond the chasm. Should we innovate by doing new stuff? Or should we do the old, important stuff in new and vastly improved ways? A bit of both??
Do we improve our products and transformations so that they can be used / performed by novices rather than designed for use by all the massive intellects that our industry seems to currently consist of?
I sometimes wonder whether OPEX is underestimated when considering OSS investments, or at least some facets (sorry, awful pun there!) of it.
Cost-out (aka head-count reduction) seems to be the most prominent OSS business case justification lever. So that’s clearly not underestimated. And the move to cloud is also an OPEX play in most cases, so it’s front of mind during the procurement process too. I’m nought for two so far! Hopefully the next examples are a little more persuasive!
Large transformation projects tend to have a focus on the up-front cost of the project, rightly so. There’s also an awareness of ongoing license costs (usually 20-25% of OSS software list price per annum). Less apparent costs can be found in the exclusions / omissions. This is where third-party OPEX costs (eg database licenses, virtualisation, compute / storage, etc) can be (not) found.
That’s why you should definitely consider preparing a TCO (Total Cost of Ownership) model that includes CAPEX and OPEX that’s normalised across all options when making a buying decision.
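As a rough illustration of what “normalised across all options” means, here’s a minimal TCO sketch in Python. Every figure, option name and cost category below is invented for the example, and the model is deliberately simplistic: it’s an undiscounted sum over an assumed five-year period with flat annual costs. A real model would likely add discounting, price escalation and one-off upgrade costs.

```python
def total_cost_of_ownership(capex, annual_opex_items, years):
    """Sum up-front CAPEX plus every recurring OPEX item over the evaluation period."""
    return capex + years * sum(annual_opex_items.values())


YEARS = 5  # assumed evaluation period, applied to every option

# All figures below are invented for illustration only.
options = {
    "Option A (commercial)": {
        "capex": 1_000_000,
        "opex": {
            "license_support": 200_000,  # 20% of an assumed 1,000,000 list price
            "third_party_db": 50_000,
            "compute_storage": 80_000,
        },
    },
    "Option B (open-source)": {
        "capex": 400_000,
        "opex": {
            "support_contract": 100_000,
            "customisation_upkeep": 150_000,
            "compute_storage": 120_000,
        },
    },
}

tco = {
    name: total_cost_of_ownership(o["capex"], o["opex"], YEARS)
    for name, o in options.items()
}
```

Listing each OPEX item by name, rather than as a single lump sum, is what makes the exclusions / omissions visible when comparing one option against another.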
But the more subtle OPEX leakage occurs through customisation. The more customisation from “off-the-shelf” capability, the greater the variation from baseline, the larger the ongoing costs of maintenance and upgrade. This is not just on proprietary / commercial software, but open-source products as well.
And choosing Agile almost implies ongoing customisation. One of the things about Agile is it keeps adding stuff (apps, data, functions, processes, code, etc) via OPEX. It’s stack-ranked, so it’s always the most important stuff (in theory). But because it’s incremental, it tends to be less closely scrutinised than during a CAPEX / procurement event. Unless carefully monitored, there’s a greater chance for OPEX leakage to occur.
And as we know about OPEX, like diamonds, they’re forever (ie the costs re-appear year after year).
“When experts are wrong, it’s often because they’re experts on an earlier version of the world.”
OSS experts are often wrong. Not only because of the “earlier version of the world” paradigm mentioned above, but also the “parallel worlds” paradigm that’s not explicitly mentioned. That is, they may be experts on one organisation’s OSS (possibly from spending years working on it), but have relatively little transferable expertise on other OSS.
It would be nice if the OSS world view never changed and we could just get more and more expert at it, approaching an asymptote of expertise. Alas, it’s never going to be like that. Instead, we experience a world that’s changing across some of our most fundamental building blocks.
“We are the sum total of our experiences.”
My earliest forays into OSS had a heavy focus on inventory. The tie-in between services, logical and physical inventory (and all use-cases around it) was probably core to me becoming passionate about OSS. I might even go as far as saying I’m “an Inventory guy.”
Those early forays occurred when there was a scarcity mindset in network resources. You provisioned what you needed and only expanded capacity within tight CAPEX envelopes. Managing inventory and optimising revenue using these scarce resources was important. We did that with the help of Inventory Management (IM) tools. Even end-users had a mindset of resource scarcity.
But the world has changed. We now operate with a cloud-inspired abundance mindset. We over-provision physical resources so that we can just spin up logical / virtual resources whenever we wish. We have meshed, packet-switched networks rather than nailed-up circuits. Generally speaking, cost per resource has fallen dramatically so we now get much higher port density, compute capacity, bits per dollar, etc. Customers of the cloud generation assume abundance of capacity that is even available in small consumption-based increments. In many parts of the world we can also assume ubiquitous connectivity.
So, as “an inventory guy,” I have to question whether the scarcity to abundance transformation might even fundamentally change my world-view on inventory management. Do I even need an inventory management solution or should I just ask the network for resources when I want to turn on new customers and assume the capacity team has ensured there’s surplus to call upon?
Is the enormous expense we allocate to building and reconciling a digital twin of the network (ie the data gathered and used by Inventory Management) justified? Could we circumvent many of the fallouts (and a multitude of other problems) that occur because the inventory data doesn’t accurately reflect the real network?
For example, in the old days I always loved how much easier it was to provision a customer’s mobile / cellular or IN (Intelligent Network) service than a fixed-line service. It was easier because fixed-line service needed a whole lot more inventory allocation and reservation logic and process. Mobile / IN services didn’t rely on inventory, only an availability of capacity (mostly). Perhaps the day has almost come where all services are that easy to provision?
Yes, we continue to need asset management and capacity planning. Yes, we still need inventory management for physical plant that has no programmatic interface (eg cables, patch-panels, joints, etc). Yes, we still need to carefully control the capacity build-out to CAPEX to revenue balance (even more so now in a lower-profitability operator environment). But do many of the other traditional Inventory Management and resource provisioning use cases go away in a world of abundance?
I’d love to hear your opinions, especially from all you other “inventory guys” (and gals)!! Are your world-views, expertise and experiences changing along these lines too or does the world remain unchanged from your viewing point?
OSS wear many hats and help many different functions within an organisation. One function that OSS assists might be surprising to some people – the CFO / Accounting function.
The traditional service provider business model tends to be CAPEX-heavy, with significant investment required on physical infrastructure. Since assets need to be depreciated and life-cycle managed, Accountants have an interest in the infrastructure that our OSS manage via Inventory Management (IM) tools.
I’ve been lucky enough to work with many network operators and see vastly different asset management approaches used by CFOs. These strategies have ranged from fastidious replacement of equipment as soon as depreciation cycles have expired through to building networks using refurbished equipment that has already passed manufacturer End-of-Life dates. These strategies fundamentally affect the business models of these operators.
Given that telecommunications operator revenues are trending lower globally, I feel it’s incumbent on us to use our OSS to deliver positive outcomes to global business models.
With this in mind, I found this article entitled, “Circular Economy at Work in Google Data Centers,” to be quite interesting. It cites, “Google’s circular approach to optimizing end of life of servers based on Total Cost of Ownership (TCO) principles have resulted in hundreds of millions per year in cost avoidance.”
Asset lifecycle management is not your typical focus area for OSS experts, but an area where we can help add significant value for our customers!
Some operators use dedicated asset management tools such as SAP. Others use OSS IM tools. Others reconcile between both. There’s no single right answer.
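To make the accountant's view a little more concrete, here is a minimal sketch of how inventory records could be turned into a depreciation view. It assumes straight-line depreciation with no salvage value, and the `Asset` fields, `book_value` and `fully_depreciated` names are hypothetical rather than taken from any particular IM or asset management tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Asset:
    # Hypothetical inventory record; real IM tools will carry far more fields.
    name: str
    purchase_cost: float
    in_service: date
    useful_life_years: int  # depreciation period agreed with accounting


def book_value(asset: Asset, as_of: date) -> float:
    """Straight-line book value, floored at zero (no salvage value assumed)."""
    age_years = (as_of - asset.in_service).days / 365.25
    remaining_fraction = max(0.0, 1.0 - age_years / asset.useful_life_years)
    return round(asset.purchase_cost * remaining_fraction, 2)


def fully_depreciated(assets: List[Asset], as_of: date) -> List[str]:
    """Names of assets whose depreciation cycle has expired."""
    return [a.name for a in assets if book_value(a, as_of) == 0.0]


inventory = [
    Asset("edge-router-01", 10000.0, date(2015, 1, 1), 5),
    Asset("agg-switch-07", 8000.0, date(2022, 1, 1), 8),
]
expired = fully_depreciated(inventory, date(2024, 1, 1))
```

Whether an expired asset is then replaced immediately or kept running is exactly the strategic choice described above; the sketch only surfaces the candidates.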
For a deeper dive into ideas where our OSS can help in asset lifecycle (which Google describes as its Circular Economy and seems to manage using its ReSOLVE tool), I really recommend reviewing the article link above.
If you need to develop such a tool using machine learning models, reach out to us and we’ll point you towards some tools equivalent to ReSOLVE to augment your OSS.
I must’ve written dozens of posts about us needing to collectively invest a lot more effort into UI / UX. I’ve written quite a few over the last few months especially. This one in particular springs to mind.
As an industry, we typically don’t do user experience journeys (UX) or user interfaces (UI) very well at all yet. I know thousands of OSS experts, but only 2 who specialise in UI / UX! That ratio is far too small….
… but then something dawned on me when writing the Autonomous Networking post earlier this week – All effort invested into UI (and most effort on UX) is pointless if we succeed in building autonomous networks. You get the implication, don’t you? Truly autonomous networks are machine-driven, so you don’t need users, UI or UX.
Oh, I should make one last point though. If you don’t expect to get all of your network operations activities to midday on the Autonomous Networks Clock within the next couple of years, then you probably should still invest in your UI / UX!!
….I want to make my network so observable, reliable, predictable and repeatable that I don’t need anyone to operate it.
That’s clearly a highly ambitious goal. Probably even unachievable if we say it doesn’t need anyone to run it. But I wonder whether this has to be the starting point we take on behalf of our network operator customers?
If we look at most networks, OSS, BSS, NOC, SOC, etc (I’ll call this whole stack “the black box” in this article), they’ve been designed from the ground up to be human-driven. We’re now looking at ways to automate as many steps of operations as possible.
If we were to instead design the black-box to be machine-driven, how different would it look?
In fact, before we do that, perhaps we have to take two unique perspectives on this question:
Retro-fitting existing black-boxes to increase their autonomy
Designing brand new autonomous black-boxes
I suspect our approaches / architectures will be vastly different.
The first will require an incredibly complex measure, command and control engine to sit over the top of the existing black box. It will probably also need to reach into many of the components that make up the black box and exert control over them. This approach has many similarities with what we already do in the OSS world. The only exception would be that we’d need to be a lot more “closed-loop” in our thinking. I should also re-iterate that this is incredibly complex because it inherits an existing “decision tree” of enormous complexity and adds further convolution.
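As a rough illustration of what “closed-loop” means here, the sketch below runs a single measure, compare and command iteration against a simulated link. Everything in it is an assumption for illustration: the `control_step` function, the `SimulatedLink` class, the action names and the 20% / 80% utilisation thresholds are all hypothetical, and a real engine would sit over far messier interfaces.

```python
from typing import Callable, Optional


def control_step(measure: Callable[[], float],
                 command: Callable[[str], None],
                 low: float, high: float) -> Optional[str]:
    """One closed-loop iteration: measure, compare to thresholds, act if needed."""
    value = measure()
    if value > high:
        action = "add_capacity"
    elif value < low:
        action = "release_capacity"
    else:
        return None  # within expected operational thresholds, nothing to do
    command(action)
    return action


class SimulatedLink:
    """Hypothetical stand-in for a network element with a programmable interface."""

    def __init__(self, load_gbps: float, capacity_gbps: float) -> None:
        self.load_gbps = load_gbps
        self.capacity_gbps = capacity_gbps

    def utilisation(self) -> float:
        return self.load_gbps / self.capacity_gbps

    def apply(self, action: str) -> None:
        if action == "add_capacity":
            self.capacity_gbps += 10
        elif action == "release_capacity":
            self.capacity_gbps = max(10, self.capacity_gbps - 10)


link = SimulatedLink(load_gbps=18, capacity_gbps=20)
first = control_step(link.utilisation, link.apply, low=0.2, high=0.8)
second = control_step(link.utilisation, link.apply, low=0.2, high=0.8)
```

The point of the sketch is that measurement and change are handled in the same loop, rather than being interrogated and pushed separately by a human operator.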
The second approach holds a great deal more promise. However, it will require a vastly different approach on many levels:
We have to take a chainsaw to the decision tree inside the black box. For example:
We start by removing as much variability from the network as possible. Think of this like other utilities such as water or power. Our electricity service only has one feed-type for almost all residential and business customers. Yet it still allows us great flexibility in what we plug into it. What if a network operator were to simply offer a “broadband dial-tone” service and let end users decide what they overlay on that bit-stream?
This reduces the “protocol stack” in the network (think of this in terms of the long list of features / tick-boxes on any router’s brochure)
As well as reducing network complexity, it drastically reduces the variables an end-user needs to decide from. The operator no longer needs 50 grandfathered, legacy products
This also reduces the decision tree in BSS-related functionality like billing, rating, charging, clearing-house
We achieve a (globally?) standardised network services catalog that’s completely independent of vendor offerings
We achieve a more standardised set of telemetry data coming from the network
In turn, this drives a more standardised and minimal set of service-impact and root-cause analyses
We design data input/output methods and interfaces (to the black box and to any of its constituent components) to have closed-loop immediacy in mind. At the moment we tend to have interfaces that allow us to interrogate the network and push changes into the network separately rather than tasking the network to keep itself within expected operational thresholds
We allow networks to self-regulate and self-heal, not just within a node, but between neighbours without necessarily having to revert to centralised control mechanisms like OSS
All components within the black-box, down to device level, are programmable. [As an aside, we need to consider how to make the physical network more programmable or reconcilable, considering that cables, (most) patch panels, joints, etc don’t have APIs. That’s why the physical network tends to give us the biggest data quality challenges, which ripples out into our ability to automate networks]
End-to-end data flows (ie controls) are to be near-real-time, not constrained by processing lags (eg 15 minute poll cycles, hourly log processing cycles, etc)
Data minimalism engineering. It’s currently not uncommon for network devices to produce dozens, if not hundreds, of different metrics. Most are never used by operators manually, nor are they likely to be used by learning machines. This increases data processing, distribution and storage overheads. If we only produce what is useful, then it should improve data flow times (see the previous point on near-real-time data flows). Therefore learning machines should be able to control which data sets they need from network devices and at what cadence. The learning engine can start off collecting all metrics, then progressively turn them off as it deems them unnecessary. This could also extend to controlling log-levels (ie how much granularity of data is generated for a particular log, event, performance counter)
Perhaps we even offer AI-as-a-service, whereby any of the components within the black-box can call upon a centralised AI service (and the common data lake that underpins it) to assist with localised self-healing, self-regulation, etc. This facilitates closed-loop decisions throughout the stack rather than just an over-arching command and control mechanism
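The data minimalism idea in the list above can be sketched in a few lines. This is a minimal illustration only, assuming a hypothetical `MetricSubscriptions` class and a simple rule (drop any metric the learning engine has not used for `idle_limit` consecutive cycles); a real learning engine would presumably use something smarter than a counter.

```python
from typing import Dict, Iterable, List, Set


class MetricSubscriptions:
    """Tracks which device metrics are still being collected (hypothetical)."""

    def __init__(self, metrics: Iterable[str], idle_limit: int = 3) -> None:
        self.idle_limit = idle_limit
        # Every metric starts enabled, with zero idle cycles recorded.
        self.idle_cycles: Dict[str, int] = {m: 0 for m in metrics}

    def enabled(self) -> List[str]:
        return sorted(self.idle_cycles)

    def end_cycle(self, used: Set[str]) -> List[str]:
        """Record which metrics were used this cycle and drop the stale ones."""
        dropped = []
        for metric in list(self.idle_cycles):
            if metric in used:
                self.idle_cycles[metric] = 0
            else:
                self.idle_cycles[metric] += 1
                if self.idle_cycles[metric] >= self.idle_limit:
                    del self.idle_cycles[metric]
                    dropped.append(metric)
        return sorted(dropped)


subs = MetricSubscriptions(["cpu", "fan_rpm", "rx_errors"], idle_limit=2)
dropped_first = subs.end_cycle({"cpu", "rx_errors"})
dropped_second = subs.end_cycle({"cpu"})
```

In practice the "drop" step would translate into telling the device to stop emitting that metric, or to lower its log-level, which is where the programmability point in the list comes in.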
I’m barely exposing the tip of the iceberg here. I’d love to get your thoughts on what else it will take to bring fully autonomous networks to reality.
Inability to serve the market (eg offerings, capacity, etc)
Inability to operate network assets profitably
In that article, we looked closely at a human factor and how current trends of open-source, Agile and microservices might actually exacerbate it. In yesterday’s article we looked at market-serving factors for us to investigate and monitor.
But let’s look at point 3 today. The profitability factors we could consider that reduce the chances of the big boss getting fired are:
Ability to see revenues in near-real-time (revenues are relatively easy to collect, so we use these numbers a lot. Much harder are profitability measures because of the shared allocation of fixed costs)
Ability to see cost breakdown (particularly which parts of the technical solution are most costly, such as what device types / topologies are failing most often)
Ability to measure profitability by product type, customer, etc
Are there more profitable or cost-effective solutions available?
Is there greater profitability that could be unlocked by simplification?
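To show why profitability is harder to measure than revenue, here is a small sketch of profit by product type once shared fixed costs are allocated. It assumes the simplest possible allocation rule (pro-rata by revenue share), and the `profit_by_product` function and the figures are purely hypothetical. Choose a different allocation rule and the "most profitable" product can change.

```python
from typing import Dict


def profit_by_product(revenue: Dict[str, float],
                      direct_cost: Dict[str, float],
                      shared_fixed_cost: float) -> Dict[str, float]:
    """Profit per product after allocating shared fixed cost by revenue share."""
    total_revenue = sum(revenue.values())
    result = {}
    for product, rev in revenue.items():
        share = rev / total_revenue if total_revenue else 0.0
        allocated = shared_fixed_cost * share  # assumption: pro-rata by revenue
        result[product] = rev - direct_cost.get(product, 0.0) - allocated
    return result


profits = profit_by_product(
    revenue={"broadband": 600.0, "voice": 400.0},
    direct_cost={"broadband": 200.0, "voice": 250.0},
    shared_fixed_cost=300.0,
)
```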
Inability to serve the market (eg offerings, capacity, etc)
Inability to operate network assets profitably
In that article, we looked closely at a human factor and how current trends of open-source, Agile and microservices might actually exacerbate it. In yesterday’s article we looked at the broader set of catastrophic failure factors for us to investigate and monitor.
But let’s look at some of the broader examples under point 2 today. The market-serving factors we could consider that reduce the chances of the big boss getting fired are:
Immediate visibility of key metrics by boss and execs (what are the metrics that matter, eg customer numbers, ARPU, churn, regulatory, media hot-buttons, network health, etc)
Response to “voice of customer” (including customer feedback, public perception, etc)
Human resources (incl up-skill for new tech, etc)
Ability to implement quickly / efficiently
Ability to handle change (to network topology, devices/vendors, business products, systems, etc)
Measuring end-to-end user experience, not just “nodal” monitoring
Scalability / Capacity (ability to serve customer demand now and into a foreseeable future)