The Theory of Evolution, OSS evolution

Evolution says that biological change is a property of populations — that every individual is a trial run of an experimental combination of traits, and that at the end of the trial, you are done and discarded, and the only thing that matters is what aggregate collection of traits end up in the next generation. The individual is not the focus, the population is. And that’s hard for many people to accept, because their entire perception is centered on self and the individual.”
FreeThoughtBlog.

Have we almost reached the point where the same can be said for OSS workflows? In the past (and the present?) we had pre-defined process flows. There may be an occasional if/else decision gate, but we could capture most variants on a process diagram. These pre-defined processes were / are akin to a production line.

Process diagrams are becoming harder to lock down as our decision trees get more complicated. Technologies proliferate, legacy product lines don’t get obsoleted, the number of customer contact channels increases. Not only that, but we’re now marketing to a segment of one, treating every one of our customers as unique, whilst trying not to break our OSS / BSS.

Do we have the technology yet that allows each transaction / workflow instance to just be treated as an experimental combination of attributes / tasks? More importantly, do we have the ability to identify any successful mutations that allow the population (ie the combination of all transactions) to get progressively better, faster, stronger.

It seems that to get to CX nirvana, being able to treat every customer completely uniquely, we need to first master an understanding of the population at scale. Conversely, to achieve the benefits of scale, we need to understand and learn from every customer interaction uniquely.

That’s evolution. The benchmark sets the pattern for future workflows until a variant / mutation identifies a better benchmark to establish the new pattern for future workflows, which continues.

The production line workflow model of the past won’t get us there. We need an evolution workflow model that is designed to accommodate infinite optionality and continually learn from it.

Does such a workflow tool exist yet? Actually, it’s more than a workflow tool. It’s a continually improving loop workflow.

The OSS proof-of-worth dilemma

Earlier this week we posted an article describing Sutton’s Law of OSS, which effectively tells us to go where the money is. The article suggested that in OSS we instead tend towards the exact opposite – the inverse-Sutton – we go to where the money isn’t. Instead of robbing banks like Willie Sutton, we break into a cemetery and aimlessly look for the cash register.

A good friend responded with the following, “Re: The money trail in OSS … I have yet to meet a senior exec. / decision maker in any telco who believes that any OSS component / solution / process … could provide benefit or return on any investment made. In telco, OSS = cost. I’ve tried very hard and worked with many other clever people also trying hard to find a way to pitch OSS which overcomes this preconception. BSS is often a little easier … in many cases it’s clear that “real money” flows through BSS and needs to be well cared for.”

He has a strong argument. The cost-out mentality is definitely ingrained in our industry.

We are saddled with the burden of proof. We need to prove, often to multiple layers of stakeholders, the true value of the (often intangible) benefits that our OSS deliver.

The same friend also posited, “The consequence is the necessity to establish beneficial working relationships with all key stakeholders – those who specify needs, those who design and implement solutions and those, especially, who hold or control the purse strings. [To an outsider] It’s never immediately obvious who these people are, nor what are their key drivers. Sometimes it’s ambition to climb the ladder, sometimes political need to “wedge” peers to build empires, sometimes it’s necessity to please external stakeholders – sometimes these stakeholders are vendors or government / international agencies. It’s complex and requires true consultancy – technical, business, political … at all levels – to determine needs and steer interactions.

Again, so true. It takes more than just technical arguments.

I’m big on feedback loops, but also surprised at how little they’re used in telco – at all levels.

  • We spend inordinate amounts of time building and justifying business cases, but relatively little measuring the actual benefits produced after we’ve finished our projects (or gaining the learnings to improve the next round of business cases)
  • We collect data in our databases, obliviously let it age, realise at some point in the future that we have a data quality issue and perform remedial actions (eg audits, fixes, etc) instead of designing closed-loop improvement cycles that ensure DQ isn’t constantly deteriorating
  • We’re great at spending huge effort in gathering / arguing / prioritising requirements, but don’t always run requirements traceability all the way into testing and operational rollout.
  • etc

Which leads us back to the burden of proof. Our OSS have all the data in the world, but how often do we use it to justify and persuade – to prove?

Our OSS products have so many modules and technical functionality (so much of it effectively duplicated from vendor to vendor). But I’ve yet to see any vendor product that allows their customer, the OSS operators, to automatically gather proof-of-worth stats (ie executive-ready reports). Nor have I seen any integrator build proof-of-worth consultancy into their offer, whereby they work closely with their customer to define and collect the metrics that matter. BTW. If this sounds hard, I’d be happy to discuss how I approach this task.

So let me leave you with three important questions today:

  1. Have you also experienced the overwhelming burden of the “OSS = cost” mentality
  2. If so, do you have any suggestions / experiences on how you’ve overcome it
  3. Does the proof-of-worth product functionality sound like it could be useful (noting that it doesn’t even have to be a product feature, but a custom report / portal using data that’s constantly coursing through our OSS databases)

OSS’ Rosetta Stone

When working on OSS projects, I find that linking or reference keys are so valuable at so many levels. Not just data management within the OSS database, but project management, design activities, task assignment / delivery, etc.

People might call things all sorts of different names, which leads to confusion.

Let me cite an example. When a large organisation has lots of projects underway and many people are maintaining project lists, but each with a slightly (or massively!!) different variant of the project name, there can be confusion, and possibly duplication. But introduce a project number and it becomes a lot easier to compare project lists (if everyone cites the project number in their documentation).

Trying to pattern match text / language can be really difficult. But if data sets have linking keys, we can easily use Excel’s vlookup function (or the human brain, or any other equivalent tool of choice) to make comparison a whole lot easier.

Linking keys could be activity codes in a WBS, a device name in a log file, a file number, a process number, part numbers in a designer’s pick-lists, device configuration code, etc.

Correlation and reconcile tasks are really important in OSS. An OSS does a lot of data set joins at application / database level. We can take a lead from code-level use of linking keys.

Just like the Rosetta stone proved to be the key to unlocking Egyptian hieroglyphs, introducing linking keys can be useful for translating different “languages” at many levels in OSS projects.

How OSS/BSS facilitated Telkomsel’s structural revenue changes

The following two slides were presented by Monty Hong of Indonesia’s Telkomsel at Digital Transformation Asia 2018 last week. They provide a fascinating insight into the changing landscape of comms revenues that providers are grappling with globally and the associated systems decisions that Telkomsel has made.

The first shows the drastic declines in revenues from Telkomsel’s traditional telco products (orange line), contrasted with the rapid rise in revenues from content such as video and gaming.
Telkomsel Revenue Curve

The second shows where Telkomsel is repositioning itself into additional segments of the content value-chain (red chevrons at top of page show where Telkomsel is playing).
Telkomsel gaming ecosystem

Telkomsel has chosen to transform its digital core to specifically cater for this new revenue model with one API ecosystem. One of the focuses of this transformation is to support a multi-speed architectural model. Traditional back-end systems (eg OSS/BSS and system of records) are expected to rarely change, whilst customer-facing systems are expected to be highly agile to cater to changing customer needs.

More about the culture of this change tomorrow.

Is your data getting too heavy for your OSS to lift?

Data mass is beginning to exhibit gravitational properties – it’s getting heavy – and eventually it will be too big to move.”
Guy Lupo
in this article on TM Forum’s Inform that also includes contributions from George Glass and Dawn Bushaus.

Really interesting concept, and article, linked above.

The touchpoint explosion is helping to make our data sets ever bigger… and heavier.

In my earlier days in OSS, I was tasked with leading the migration of large sets of data into relational databases for use by OSS tools. I was lucky enough to spend years working on a full-scope OSS (ie it’s central database housed data for inventory management, alarm management, performance management, service order management, provisioning, etc, etc).

Having all those data sets in one database made it incredibly powerful as an insight generation tool. With a few SQL joins, you could correlate almost any data sets imaginable. But it was also a double-edged sword. Firstly, ensuring that all of the sets would have linking keys (and with high data quality / reliability) was a data migrator’s nightmare. Secondly, all those joins being done by the OSS made it computationally heavy. It wasn’t uncommon for a device list query to take the OSS 10 minutes to provide a response in the PROD environment.

There’s one concept that makes GIS tools more inherently capable of lifting heavier data sets than OSS – they generally load data in layers (that can be turned on and off in the visual pane) and unlike OSS, don’t attempt to stitch the different sets together. The correlation between data sets is achieved through geographical proximity scans, either algorithmically, or just by the human eye of the operator.

If we now consider real-time data (eg alarms/events, performance counters, etc), we can take a leaf out of Einstein’s book and correlate by space and time (ie by geographical and/or time-series proximity between otherwise unrelated data sets). Just wondering – How many OSS tools have you seen that use these proximity techniques? Very few in my experience.

BTW. I’m the first to acknowledge that a stitched data set (ie via linking keys such as device ID between data sets) is definitely going to be richer than uncorrelated data sets. Nonetheless, this might be a useful technique if your data is getting too heavy for your OSS to lift (eg simple queries are causing minutes of downtime / delay for operators).

Are telco services and SLAs no longer relevant?

I wonder if we’re reaching the point where “telecommunication services” is no longer a relevant term? By association, SLAs are also a bust. But what are they replaced by?

A telecommunication service used to effectively be the allocation of a carrier’s resources for use by a specific customer. Now? Well, less so

  1. Service consumption channel alternatives are increasing, from TV and radio; to PC, to mobile, to tablet, to YouTube, to Insta, to Facebook, to a million others.
    Consumption sources are even more prolific.
  2. Customer contact channel alternatives are also increasing, from contact centres; to IVR, to online, to mobile apps, to Twitter, etc.
  3. A service bundle often utilises third-party components, some of which are “off-net”
  4. Virtualisation is increasingly abstracting services from specific resources. They’re now loosely coupled with resource pools and rely on high availability / elasticity to ensure customer service continuity. Not only that, but those resource pools might extend beyond the carrier’s direct control and out to cloud provider infrastructure

The growing variant-tree is taking the concept beyond the reach of “customer services” and evolves to become “customer experiences.”

The elements that made up a customer service in the past tended to fall within the locus of control of a telco and its OSS. The modern customer experience extends far beyond the control of any one company or its OSS. An SLA – Service Level Agreement – only pertains to the sub-set of an experience that can be measured by the OSS. We can aspire to offer an ELA – Experience Level Agreement – because we don’t have the mechanisms by which to measure or manage the entire experience yet.

The metrics that matter most for telcos today tend to revolve around customer experience (eg NPS). But aside from customer surveys, ratings and derived / contrived metrics, we don’t have electronic customer experience measurements.

Customer services are dead; Long live the customer experiences king… if only we can invent a way to measure the whole scope of what makes up customer experiences.

Intent to simplify our OSS

The left-hand panel of the triptych below shows the current state of interactions with most OSS. There are hundreds of variants inbound via external sources (ie multi-channel) and even internal sources (eg different service types). Similarly, there are dozens of networks (and downstream systems), each with different interface models. Each needs different formatting and integration costs escalate.
Intent model OSS

The intent model of network provisioning standardises the network interface, drastically simplifying the task of the OSS and the variants required for it to handle. This becomes particularly relevant in a world of NFVs, where it doesn’t matter which vendor’s device type (router say) can be handled via a single command intent rather than having separate interfaces to each different vendor’s device / EMS northbound interface. The unique aspects of each vendor’s implementation are abstracted from the OSS.

The next step would be in standardising the interface / data model upstream of the OSS. That’s a more challenging task!!

Facebook’s algorithmic feed for OSS

This is the logic that led Facebook inexorably to the ‘algorithmic feed’, which is really just tech jargon for saying that instead of this random (i.e. ‘time-based’) sample of what’s been posted, the platform tries to work out which people you would most like to see things from, and what kinds of things you would most like to see. It ought to be able to work out who your close friends are, and what kinds of things you normally click on, surely? The logic seems (or at any rate seemed) unavoidable. So, instead of a purely random sample, you get a sample based on what you might actually want to see. Unavoidable as it seems, though, this approach has two problems. First, getting that sample ‘right’ is very hard, and beset by all sorts of conceptual challenges. But second, even if it’s a successful sample, it’s still a sample… Facebook has to make subjective judgements about what it seems that people want, and about what metrics seem to capture that, and none of this is static or even in in principle perfectible. Facebook surfs user behaviour..”
Ben Evans
here.

Most of the OSS I’ve seen tend to be akin to Facebook’s old ‘chronological feed’ (where users need to sift through thousands of posts to find what’s most interesting to them).

The typical OSS GUI has thousands of functions (usually displayed on a screen all at once – via charts, menus, buttons, pull-downs, etc). But of all of those available functions, any given user probably only interacts with a handful.
Current-style OSS interface

Most OSS give their users the opportunity to customise their menus, colour schemes, even filters. For some roles such as network ops, designers, order entry operators, there are activity lists, often with sophisticated prioritisation and skills-based routing, which starts to become a little more like the ‘algorithmic feed.’

However, unlike the random nature of information hitting the Facebook feed, there is a more explicit set of things that an OSS user is tasked to achieve. It is a little more directed, like a Google search.

That’s why I feel the future OSS GUI will be more like a simple search bar (like Google) that will provide a direction of intent as well as some recent / regular activity icons. Far less clutter than the typical OSS. The graphs and activity lists that we know and love would still be available to users, but the way of interacting with the OSS to find the most important stuff quickly needs to get more intuitive. In future it may even get predictive in knowing what information will be of interest to you.
OSS interface of the future

Extending the OSS beyond a customer’s locus of control

While the 20th century was all about centralizing command and control to gain efficiency through vertical integration and mass standardization, 21st century automation is about decentralization – gaining efficiency through horizontal integration of partner ecosystems and mass customization, as in the context-aware cloud where personalized experience across channels is dynamically orchestrated.
The operational challenge of our time is to coordinate these moving parts into coherent and manageable value chains. Instead of building yet another siloed and brittle application stack, the age of distributed computing requires that we re-think business architecture to avoid becoming hopelessly entangled in a “big ball of CRUD”
.”
Dave Duggal
here on TM Forum’s Inform back in May 2016.

We’ve quickly transitioned from a telco services market driven by economies of scale (Dave’s 20th century comparison) to a “market of one” (21st century), where the market wants a personalised experience that seamlessly crosses all channels.

By and large, the OSS world is stuck between the two centuries. Our tools are largely suited to the 20th century model (in some cases, today’s OSS originated in the 20th century after all), but we know we need to get to personalisation at scale and have tried to retrofit them. We haven’t quite made the jump to the model Dave describes yet, although there are positive signs.

It’s interesting. Telcos have the partner ecosystems, but the challenge is that the entire ecosystem still tends to be centrally controlled by the telco. This is the so-called best-of-breed model.

In the truly distributed model Dave talks about, the telcos would get the long tail of innovation / opportunity by extending their value chain beyond their own OSS stack. They could build an ecosystem that includes partners outside their locus of control. Outside their CAPEX budget too, which is the big attraction. They telcos get to own their customers, build products that are attractive to those customers, gain revenues from those products / customers, but not incur the big capital investment of building the entire OSS stack. Their partners build (and share profits from) external components.

It sounds attractive right? As-a-service models are proliferating and some are steadily gaining take-up, but why is it still not happening much yet, relatively speaking? I believe it comes down to control.

Put simply, the telcos don’t yet have the right business architectures to coordinate all the moving parts. From my customer observation at least, there are too many fall-outs as a customer journeys hand off between components within the internally controlled partner ecosystem. This is especially when we talk omni-channel. A fully personalised solution leaves too many integration / data variants to provide complete test coverage. For example, just at the highest level of an architecture, I’ve yet to see a solution that tracks end-to-end customer journeys across all components of the OSS/BSS as well as channels such as online, IVR, apps, etc.

Dave rightly points out that this is the challenge of our times. If we can coherently and confidently manage moving parts across the entire value chain, we have more chance of extending the partner ecosystem beyond the telco’s locus of control.

OSS holds the key to network slicing

Network slicing opens new business opportunities for operators by enabling them to provide specialized services that deliver specific performance parameters. Guaranteeing stringent KPIs enables operators to charge premium rates to customers that value such performance. The flip side is that such agreements will inevitably come with tough contractual obligations and penalties when the agreed KPIs are not met…even high numbers of slices could be managed without needing to increase the number of operational staff. The more automation applied, the lower the operating costs. At 100 percent automation, there is virtually no cost increase with the number of slices. Granted this is a long-term goal and impractical in the short to medium term, yet even 50 percent automation will bring very significant benefits.”
From a paper by Nokia – “Unleashing the economic potential of network slicing.”

With typical communications services tending towards commoditisation, operators will naturally seek out premium customers. Customers with premium requirements such as latency, throughput, reliability, mobility, geography, security, analytics, etc.

These custom requirements often come with unique network configuration requirements. This is why network slicing has become an attractive proposition. The white paper quoted above makes an attempt at estimating profitability of network slicing including some sensitivity analyses. It makes for an interesting read.

The diagram below is one of many contained in the White Paper:
Nokia Network Slicing

It indicates that a significant level of automation is going to be required to achieve an equivalent level of operational cost to a single network. To re-state the quote, “The more automation applied, the lower the operating costs. At 100 percent automation, there is virtually no cost increase with the number of slices. Granted this is a long-term goal and impractical in the short to medium term, yet even 50 percent automation will bring very significant benefits.”

Even 50% operational automation is a significant ambition. OSS hold the key to delivering on this ambition. Such ambitious automation goals means we have to look at massive simplification of operational variant trees. Simplifications that include, but go far beyond OSS, BSS and networks. This implies whole-stack simplification.

OSS designed as a bundle, or bundled after?

Over the years I’m sure you’ve seen many different OSS demonstrations. You’ve probably also seen presentations by vendors / integrators that have shown multiple different products from their suite.

How integrated have they appeared to you?

  1. Have they seemed tightly integrated, as if carved from a single piece of stone?
  2. Or have they seemed loosely integrated, a series of obviously different stones joined together with some mortar?
  3. Or perhaps even barely associated, a series of completely different objects (possibly through product acquisition) branded under a common marketing name?

There are different pros and cons with each approach. Tight integration possibly suits a greenfields OSS. Looser integration perhaps better suits carve-off for best-of-breed customer architecture models.

I don’t know about you, but I always prefer to be given the impression that an attempt has been made to ensure consistency in the bundling. Consistency of user-interface, workflow, data modelling/presentation, reports, etc. With modern presentation layers, database technologies and the availability of UX / CX expertise, this should be less of a hurdle than it has been in the past.

Very little OSS data is ever actually used

We keep shiploads of data in our OSS don’t we? Just think about how much storage your OSS estate consumes.

Technically, it doesn’t cost much (relatively) to retain all that potential for insight generation with the cost of storage diminishing. The real cost of storing the data goes a little deeper than the $/Mb though. Other cost factors include data curation, cleansing, database search performance, etc.

There’s a whole field of study relating to this, named Information Lifecycle Management (ILM), but let’s look at it in terms of relevance to OSS.

We collect information across different timescales including real-time processing, short-term correlations, longer-term trending and long-term statutory / regulatory.

Information Lifecycle Management
Note: I suspect the “Less Archive” box actually should say “Less Active”.
Diagram above sourced from here.

But rather than blindly just storing everything, we could ask ourselves at what stage does each data sub-set lose relevance. As our OSS data ages, it can tend to deteriorate because the models it uses also deteriorate. Model deterioration factors, such as those described in this recent post about a machine-learning PoC and the following, are numerous:

  • Network devices change (including cards, naming conventions used, life-cycle upgrades, capacity, new alarm types, etc)
  • Network topologies change
  • Business processes change
  • Customer behaviours change
  • Product / Service offerings change
  • Regulations change
  • New datasets become available
  • Data model factors change to cope with gaps in original models

Each of these factors (and more) lead to deterioration in the usefulness of baseline data. This means the insight signals in the data becomes less clear, or at worst the baseline needs to be re-established, making old data invalid. If it’s invalid, then retention would appear to be pointless. Shifting it to the right through the storage types shown in the diagram above could also be pointless.

Very little of the OSS data you store is ever actually used, decreasingly so as it ages. Do you have a heatmap of what data you use in your OSS?

Where are the reliability hotspots in your OSS?

As you already know, there are two categories of downtime – unplanned (eg failures) and planned (eg upgrades / maintenance).

Planned downtime sounds a lot nicer (for operators) but the reality is that you could call both types “incidents” – they both impact (or potentially impact) the customer. We sometimes underestimate that fact.

Today’s question is whether you’re able to identify where the hotspots are in your OSS suite when you combine both types of downtime. Can you tell which outages are service-impacting?

In a round-about way, I’m asking whether you already have a dashboard that monitors uptime of all the components (eg applications, probes, middleware, infra, etc) that make up your complete OSS / BSS estate? If you do, does it tell you what you anecdotally know already, or are there sometimes surprises?

Does the data give you the evidence you need to negotiate with the implementers of problematic components (eg patch cadence, the need for reliability fixes, streamlining the patch process, reduction in customisations, etc)? Does it give you reason to make architectural changes (eg webscaling)?

OSS operationalisation at scale

We had a highly flexible network design team at a previous company. Not because we wanted to necessarily, but because we were forced to by the client’s allocation of work.

Our team was largely based on casual workers because there was little to predict whether we needed 2 designers or 50 in any given week. The workload being assigned by the client was incredibly lumpy.

But we were lucky. We only had design work. The lumpiness in design effort flowed down through the work stack into construction, test and deployment teams. The constructors had millions of dollars of equipment that they needed to mobilise and demobilise as the work ebbed and flowed. Unfortunately for the constructors, they’d prepared their rate cards on the assumption of a fairly consistent level of work coming through (it was a very big project).

This lumpiness didn’t work out for anyone in the delivery pipeline, the client included. It was actually quite instrumental in a few of the constructors going into liquidation. The client struggled to meet roll-out targets.

The allocation of work was being made via the client’s B/OSS stack. The B/OSS teams were blissfully unaware of the downstream impact of their sporadic allocation of designs. Towards the end of the project, they were starting to get more consistent and delivery teams started to get into more of a rhythm… just as the network was coming to the end of build.

As OSS builders, we sometimes get so wrapped up in delivering functionality that we can forget that one of the key requirements of an OSS is to operationalise at scale. In addition to UI / CX design, this might be something as simple as smoothing the effort allocation for work under our OSS‘s management.

OSS data Ponzi scheme

The more data you have, the more data you need to understand the data you have. You are engaged in a data ponzi scheme…Could it be in service assurance and IT ops that more data equals less understanding?
Phil Tee
in the opening address at the AIOps Symposium.

Interesting viewpoint right?

Given that our OSS hold shed-loads of data, Phil is saying we need lots of data to understand that data. Well, yes… and possibly no.

I have a theory that data alone doesn’t talk, but it’s great at answering questions. You could say that you need lots of data, although I’d argue in semantics that you actually need lots of knowledge / awareness to ask great questions. Perhaps that knowledge / awareness comes from seeding machine-led analysis tools (or our data scientists’s brains) with lots of data.

The more data you have, the more noise that you need to find signal in amongst. That means you have to ask more questions of your data if you want to drive a return that justifies the cost of collecting and curating it all. Machine-led analytics certainly assist us in handling the volume and velocity of data our OSS create / collect. That’s just asking the same question/s over and over. There’s almost no end to the questions that can be asked of our data, just a limit on the time in which we can ask it.

Does that make data a Ponzi scheme? A Ponzi scheme pays profits to earlier investors using funds obtained from newer investors. Eventually it must collapse the scheme eventually runs out of new investors to fund profits. In a data Ponzi scheme, it pays in insights from earlier (seed) data by obtaining new (streaming) data. The stream of data reaching an OSS never runs out. If we need to invest heavily in data (eg AI / ML, etc), at what point in the investment lifecycle will we stop creating new insights?

Help needed: IoT / OSS cross-over use cases

Hi PAOSS community.

I’d like to call in a favour today if I may. I’m on the hunt for any existing use-cases and / or project sites that have integrated a significant sensor network into their OSS and existing operational processes.

That includes a strategy for handling IoT-scale integration of data collection, event / alarm processing, device management, data contextualization, data analytics, end-to-end security and applications management / enablement within existing OSS tools.

I’m looking for examples where an OSS had previously managed thousands of (network) devices and is now managing hundreds of thousands of (IoT) devices. Not necessarily IoT devices of customers as services but within an operator’s own network.

Obviously that’s an unprecedented change in scale in traditional OSS terms, but will be commonplace if our OSS are to play a part in the management of large sensor networks in the future.

There’s an element of mutual exclusivity between what an IoT management platform and OSS needs to do, but there are also some similarities. I’d love to speak with anyone who has actually bridged the gap.

If your partners don’t have to talk to you then you win

If your partners don’t have to talk to you then you win.”
Guy Lupo
.

Put another way, the best form of customer service is no customer service (ie your customers and/or partners are so delighted with your automated offerings that they have no reason to contact you). They don’t want to contact you anyway (generally speaking). They just want to consume a perfectly functional and reliable solution.

In the deep, distant past, our comms networks required operators. But then we developed automated dialling / switching. In theory, the network looked after itself and people made billions of calls per year unassisted.

Something happened in the meantime though. Telco operators the world over started receiving lots of calls about their platform and products. You could say that they’re unwanted calls. The telcos even have an acronym called CVR – Call Volume Reduction – that describes their ambitions to reduce the number of customer calls that reach contact centre agents. Tools such as chatbots and IVR have sprung up to reduce the number of calls that an operator fields.

Network as a Service (NaaS), the context within Guy’s comment above, represents the next new tool that will aim to drive CVR (amongst a raft of other benefits). NaaS theoretically allows customers to interact with network operators via impersonal contracts (in the form of APIs). The challenge will be in the reliability – ensuring that nothing falls between the cracks in any of the layers / platforms that combine to form the NaaS.

In the world of NaaS creation, Guy is exactly right – “If your partners [and customers] don’t have to talk to you then you win.” As always, it’s complexity that leads to gaps. The more complex the NaaS stack, the less likely you are to achieve CVR.

What OSS environments do you need?

When we’re planning a new OSS, we tend to be focused on the production (PROD) environment. After all, that’s where it’s primary purpose is served, to operationalise a network asset. That is where the majority of an OSS‘s value gets created.

But we also need some (roughly) equivalent environments for separate purposes. We’ll describe some of those environments below.

By default, vendors will tend to only offer licensing for a small number of database instances – usually just PROD and a development / test environment (DEV/TEST). You may not envisage that you will need more than this, but you might want to negotiate multiple / unlimited instances just in case. If nothing else, it’s worth bringing to the negotiation table even if it gets shot down because budgets are tight and / or vendor pricing is inflexible relating to extra environments.

Examples where multiple instances may be required include:

  1. Production (PROD) – as indicated above, that’s where the live network gets managed. User access and controls need to be tight here to prevent catastrophic events from happening to the OSS and/or network
  2. Disaster Recovery (DR) – depending on your high-availability (HA) model (eg cold standby, primary / redundant, active / active), you may require a DR or backup environment
  3. Sandpit (DEV / TEST) – these environments are essential for OSS operators to be able to prototype and learn freely without the risk of causing damage to production environments. There may need to be multiple versions of this environment depending on how reflective of PROD they need to be and how viable it is to take refresh / updates from PROD (aka PROD cuts). Sometimes also known as non-PROD (NP)
  4. Regression testing (REG TEST) – regression testing requires a baseline data set to continually test and compare against, flagging any variations / problems that have arisen from any change within the OSS or networks (eg new releases). This implies a need for data and applications to be shielded from the constant change occurring on other types of environments (eg DEV / TEST). In situations where testing transforms data (eg activation processes), REG TEST needs to have the ability to roll-back to the previous baseline state
  5. Training (TRAIN) – your training environments may need to be established with a repeatable set of training scenarios that also need to be re-set after each training session. This should also be separated from the constant change occurring on dev/test environments. However, due to a shortage of environments, and the relative rarity of training needed at some customers, TRAIN often ends up as another DEV or TEST environment
  6. Production Support (PROD-SUP) – this type of environment is used to prototype patches, releases or defect fixes (for defects on the PROD environment) prior to release into PROD. PROD-SUP might also be used for stress and volume testing, or SVT may require its own environment
  7. Data Migration (DATA MIG) – At times, data creation and loading needs to be prototype in a non-PROD environment. Sometimes this can be done in PROD-SUP or even a DEV / TEST environment. On other occasions it needs its own dedicated environment so as to not interrupt BAU (business as usual) activities on those other environments
  8. System Integration Testing (SIT)OSS integrate with many other systems and often require dedicated integration testing environments

Am I forgetting any? What other environments do you find to be essential on your OSS?

Designing an Operational Domain Manager (ODM)

A couple of weeks ago, Telstra and the TM Forum held an event in Melbourne on OSS for next gen architectures.

The diagram below comes from a presentation by Corey Clinger. It describes Telstra’s Operational Domain Manager (ODM) model that is a key component of their Network as a Service (NaaS) framework. Notice the API stubs across the top of the ODM? Corey went on to describe the TM Forum Open API model that Telstra is building upon.
Operational Domain Manager (ODM)

In a following session, Raman Balla indicated an perspective that differs from many existing OSS. The service owner (and service consumer) must know all aspects of a given service (including all dimensions, lifecycle, etc) in a common repository / catalog and it needs to be attribute-based. Raman also indicated that the aim he has for architecting NaaS is to not only standardise the service, but the entire experience around the service.

In the world of NaaS, operators can no longer just focus separately on assurance or fulfillment or inventory / capacity, etc. As per DevOps, operators are accountable for everything.

Security and privacy as an OSS afterthought?

I often talk about OSS being an afterthought for network teams. I find that they’ll often design the network before thinking about how they’ll operationalise it with an OSS solution. That’s both in terms of network products (eg developing a new device and only thinking about building the EMS later), or building networks themselves.

It can be a bit frustrating because we feel we can give better solutions if we’re in the discussion from the outset. As OSS people, I’m sure you’ll back me up on this one. But we can’t go getting all high and mighty just yet. We might just be doing the same thing… but to security, privacy and analytics teams.

In terms of security, we’ll always consider security-based requirements (usually around application security, access management, etc) in our vendor / product selections. We’ll also include Data Control Network (DCN) designs and security appliance (eg firewalls, IPS, etc) effort in our implementation plans. Maybe we’ll even prescribe security zone plans for our OSS. But security is more than that (check out this post for example). We often overlook the end-to-end aspects such central authentication, API hardening, server / device patching, data sovereignty, etc and it then gets picked up by the relevant experts well into the project implementation.

Another one is privacy. Regulations like GDPR and the Facebook trials show us the growing importance of data privacy. I have to admit that historically, I’ve been guilty on this one, figuring that the more data sets I could stitch together, the greater the potential for unlocking amazing insights. Just one problem with that model – the more data sets that are stitched together, the more likely that privacy issues arise.

We increasingly have to figure out ways to weave security, privacy and analytics into our OSS planning up-front and not just think of them as overlays that can be developed after all of our key decisions have been made.