OSS feature parity. A functionality arms race

OSS Vendor 1. “I have 1 million features.” (Dr Evil puts finger in mouth)
OSS Vendor 2. “Yeah, well I have 1,000,001 features in my OSS.”

This is the arms-race that we see in OSS, just like almost any other tech product. I imagine that vendors get into this arms-race because they wish to differentiate. Better to differentiate on functionality than price. If there’s a feature parity, then the only differentiator is price. We all know that doesn’t end well!

But I often ask myself a few related questions:

  • Of those million features, how many are actually used regularly
  • As a vendor do you have logging that actually allows you to know what features are being used
  • Taking the Whale Curve perspective, even if being used, how many of those features are actually contributing to the objectives of the vendor
    • Do they clearly contribute towards making sales
    • Do customers delight in using them
    • Would customers be irate if you removed them
    • etc
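
As a rough illustration of the usage question above, here is a minimal Python sketch of a feature-usage tally. The log format (a timestamp paired with a feature name), the feature names and the 12-month window are all assumptions made for the example, not a description of any vendor's logging.

```python
from collections import Counter
from datetime import datetime, timedelta


def feature_usage(events, catalogue, now, window_days=365):
    """Count uses of each catalogued feature inside the look-back window.

    events: iterable of (timestamp, feature_name) tuples (hypothetical log format).
    catalogue: every feature the product ships, so unused ones show up as 0.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(name for ts, name in events if ts >= cutoff)
    return {name: counts.get(name, 0) for name in catalogue}


def unused_features(usage):
    """Features with no recorded use in the window, sorted by name."""
    return sorted(name for name, n in usage.items() if n == 0)


# Hypothetical sample data
now = datetime(2018, 6, 1)
events = [
    (datetime(2018, 5, 30), "alarm_list"),
    (datetime(2018, 5, 30), "alarm_list"),
    (datetime(2018, 4, 2), "ticket_export"),
    (datetime(2016, 1, 15), "topology_3d"),  # older than the window, so not counted
]
catalogue = ["alarm_list", "ticket_export", "topology_3d", "sms_gateway"]

usage = feature_usage(events, catalogue, now)
print(usage)
print(unused_features(usage))
```

Starting from the full catalogue rather than from the log is the design choice that matters here. A feature that is never used never appears in the log, so it only becomes visible if you list it up front.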

Earlier this week, I spoke about a friend who created an alarm management tool by himself over a weekend. It didn’t have a million features, but it did have all of what I’d consider to be the most important ones. It did look like a lot of other alarm managers that are now on the market. The GUI based on alarm lists still pervades.

If they all look alike, and all have feature parity, how do you differentiate? If you try to add more features, is it safe to assume that those features will deliver diminishing returns?

But is an alarm list and the flicking of tickets the best way to manage network health?

What if, instead of seeking incremental improvement, someone went back to the most important requirements and considered whether the current approach is meeting those customer needs? I have a strong suspicion that customer feedback will indicate that there are definitely flaws to overcome, especially on high event volume networks.

Clever use of large data volumes provides a level of pre-cognition and automation that wasn’t available when simple alarm lists were first invented. This in turn potentially changes the way that operators can engage with network monitoring and management.

What if someone could identify a whole new user interface / approach that overcame the current flaws and exceeded the key requirements? Would that be more of a differentiator than adding a 1,000,002nd feature?

If you’re looking for a comparison, there were plenty of MP3 players on the market with a heap of features, many more than the iPod. We all know how that one played out!

What if the OSS solution lies in its connections?

Imagine for a moment that you’re sitting in front of a pristine chess board, awaiting the opportunity to make your first move. All of the pieces have been exquisitely carved from stone, polished to a sheen. The rules of the game have been established for centuries, so you know exactly which piece is able to move in which sequences. Time to make the opening move.

You’ve studied the games of the masters who have preceded you and have planned your opening gambit, the procession of moves that will hopefully take you into a match-winning position. Due to your skills with modern automations, you’ve connected some of the chess pieces with delicate strings to implement your opening gambit with precision.

Unfortunately, after the first few moves, your strings are starting to pull the pieces out of position. Your opponent has countered well and you’re having to modify your initial plans. You introduce some additional pulleys and springs to help retain the rightful position of your pieces on the board and cope with unexpected changes in strategy. The automations are becoming ever more complex, taking more time to plan and implement than the actual next move.

The board is starting to devolve into unmanageable chaos.

Does this sound like the analogy of a modern OSS? It’s what I refer to as the chessboard analogy.

We’ve been at this OSS game for long enough to already have an understanding of all of the main pieces. TM Forum’s TAM provides this definition as a useful guide. The pieces are modular, elegant and quite well understood by its many players. The rules of the game haven’t really changed much. The main use cases of an OSS from decades ago (ie assure, fulfil, plan, build, etc) probably don’t differ significantly from those of today. This “should” set the foundations for interchangeability of applications.

We see programs of work like ONAP, where millions of lines of code are being developed to re-write the rules of the game. I’m a big advocate of many of the principles of ONAP, but I’m still not sure that such a massive re-write is what’s needed.

It’s not so much in the components of our OSS as in the connections between them where things tend to go awry.

“The foundation of all brilliance is seeing connections when no one else does.”
Richard Parkinson

This article distills ONAP from its answers back to the core questions. What if instead of seeking an entirely-new architectural stack, we focused on solving the core questions and the chessboard problem – the problem of connections?

Perhaps the answer to the connection problem lies in the interchangeable small grid OSS model discussed in yesterday’s article on planned OSS obsolescence.
But it probably also incorporates what ONAP calls, “real-time, policy-driven orchestration and automation,” to replace pre-defined processes. I wonder instead whether state-based transitions, being guided by intent/policy rules and feedback loops (ie learning systems) might hold the key. An evolving and learning solution that shares similarities with the electrical pathways in our brain, which strengthen the more they’re used and diminish if no longer used.
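
To make the “pathways that strengthen with use” idea a little more concrete, here is a minimal Python sketch. It is only one possible reading of the idea, not how ONAP or any existing OSS works. The class name, the state names, the reinforce / decay values and the `allowed` policy filter are all hypothetical.

```python
class TransitionWeights:
    """Tracks how strongly each state-to-state pathway has been reinforced."""

    def __init__(self, reinforce=0.2, decay=0.9):
        self.reinforce = reinforce  # strength added each time a pathway is used
        self.decay = decay          # multiplier applied to pathways leaving the same state
        self.weights = {}           # (state, next_state) -> strength

    def record(self, state, next_state):
        """Weaken every pathway leaving `state`, then strengthen the one just used."""
        for key in self.weights:
            if key[0] == state:
                self.weights[key] *= self.decay
        key = (state, next_state)
        self.weights[key] = self.weights.get(key, 0.0) + self.reinforce

    def preferred(self, state, allowed=None):
        """Strongest pathway out of `state`, optionally limited by a policy set."""
        options = {
            nxt: w
            for (src, nxt), w in self.weights.items()
            if src == state and (allowed is None or nxt in allowed)
        }
        return max(options, key=options.get) if options else None


# Hypothetical feedback from operations: three successful restarts, one dispatch
paths = TransitionWeights()
for _ in range(3):
    paths.record("alarm_raised", "auto_restart")
paths.record("alarm_raised", "dispatch_tech")

print(paths.preferred("alarm_raised"))
print(paths.preferred("alarm_raised", allowed={"dispatch_tech"}))
```

The policy / intent rules act as a filter on which transitions are permitted, while the feedback loop decides which of the permitted transitions is currently favoured.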

The future of work and its impact on OSS

Many years ago, I worked on a seriously big OSS transformation for one of the region’s biggest telcos. Everything was big on the project: the investment, the resources, the documentation. Everything except the outcomes. There was so much inefficiency that I often spoke about making one day of progress for every ten on site. Meetings, bureaucracy, impossible approval cycles, customer re-organisations, over-analysis, etc all added up to stagnation.

This contrasted so much with some of the amazing small teams I’ve worked alongside. Teams that worked cohesively, cleverly and just got stuff done with almost no resources. It’s one of the reasons I feel that the future of work, even for the very large organisations, will be via small teams. Outsourced to small, efficient teams / organisations. The gig economy, and the proliferation of tools that support it, make it an obvious approach to take, especially for very large organisations to leverage. Proof of work technologies, such as those building upon the discovery of blockchain, will provide further impetus to use smaller teams of experts.

Experts like a friend and colleague of mine who once built an alarm management tool in a weekend, by himself. It also happened to be more sophisticated than his employer’s existing tool that had taken years of combined developer effort by a larger team.

Maybe I’ll be proven wrong, but I see the transition to this model of work as being inevitable. The question I have is how to make our OSS more accommodating of this work model. Behemoth OSS stacks won’t. Highly modular OSS made up of many smaller components probably will, as long as they don’t succumb to the OSS chessboard analogy. The pulleys and strings will make it impossible for small, interchangeable teams to decipher and manage.

A small-grid OSS model is the one I’d be backing in.

OSS – like a duck on a pond

Let’s start with a basic question. “What does an OSS need to do?”

The basic answer is, “make operations easier.”

The real answer(s) is so much more nuanced than that of course. The term easier can also encapsulate other words such as faster, more accurate, more repeatable, cheaper, etc.

Designing, building, operating and maintaining a sizable network is extremely challenging, despite network operators around the world, and the vendors that supply to them, employing some of the best and brightest. So we design OSS and related tools / processes to make operations easier.

Yet I sometimes wonder whether we achieve that aim – to make operations easier. Seems to me that we tend to focus more on just replicating functions at a higher layer in the management stack. That is, moving the function to the OSS rather than EMS/NMS, without really making it much easier operationally.

Let’s start at the user interface (UI). How often are they intuitive enough for an experienced network operator to start doing tasks with negligible OSS expert guidance?
Let’s look at deployments. How often are the projects low on effort, risk, cost and complexity?
Let’s look at flexibility (ie in-flight modifications or transformations). How often do we actually deliver flexibility to our customers through our OSS? To ask the same as above, how often are our changes low on effort, risk, cost and complexity?

As a small step towards providing an answer, I wonder whether it’s a case of making the hard things look easy and the easy things look hard.

We want to make the really hard operational things much easier to do within an OSS because that’s the primary purpose of an OSS. That’s the example of a duck on a pond. The OSS is gliding along effortlessly across the top of the water, but under the water it is paddling furiously.

Conversely, we want to make the really easy operational things look hard to do within an OSS so that we’re not constantly being asked to build functionality / complexity into our OSS that doesn’t warrant being there. It diffuses the intent of the OSS. Just because we can, doesn’t mean we should.

Do the laws of physics prevent you from making an OSS pivot?

Aircraft carrier
Image linked from GCaptain.com.

As you already know, the word pivot has become common in the world of business, particularly the world of start-ups. It’s a euphemism for a significant change in strategic direction. In the context of today’s post, I love the word pivot because it implies a rapid change in direction, something that’s seemingly impossible for most of our OSS and the customers who use them.

I like to use analogies. It’s no coincidence that some of the analogies posted here on PAOSS relate to the challenge in making strategic change in our OSS. Here are just three of those analogies:

The OSS inertia principle relates classical physics with our OSS, where Force equals Mass x Acceleration (F = ma). In other words, the greater the mass (of your OSS), the more force must be applied to reach a given acceleration (ie to effect a change).

The OSS chess-board analogy talks about the rubber bands and pulleys (ie integrations) that enmesh the pieces on our OSS chessboard. This means that other pieces get dragged out of position whenever we try to move any individual piece and chaos ensues.

The aircraft carrier analogy compares OSS (and the CSPs they service) with navies of old. In days gone by, CSPs enjoyed command of the sea. Their boats were big, powerful and mobile enough to move around the world. However, their size requires significant planning to change course. The newer application and content communications models are analogous to the advent of aviation. The over the top (OTT) business model has the speed, flexibility, lower cost base and diversity of aircraft. Air supremacy has changed the competitive dynamic. CSPs and our OSS can’t quickly change from being a navy to being an airforce, so the aircraft carrier approach looks to the future whilst working within the constraints of the past.

When making day to day changes within, and to, your OSS does the ability to pivot ever come to mind?

Do you intentionally ensure it stays small, modular and limit its integrations to simplify your game of OSS chess?
If constrained by existing mass that you simply can’t eliminate, do you seek to transform via OSS’s aviation equivalents?
Or like many of the OSS around the world, are you just making them larger, enmeshed behemoths that will never be able to change the laws of physics and achieve a pivot?

Do any of our global target architectures represent such behemoths?

Build an OSS and they will come… or sometimes not

Build it and they will come.

This is not always true for OSS. Let me recount a few examples.

The project team is disconnected from the users – The team that’s building the OSS in parallel to existing operations doesn’t (or isn’t able to) engage with the end users of the OSS. Once it comes time for cut-over, the end users want to stick with what they know and don’t use the shiny new OSS. From painful experience I can attest that stakeholder management is under-utilised on large OSS projects.

Turf wars – Different groups within a customer are unable to gain consensus on the solution. For example, the operational design team gains the budget to build an OSS but the network assurance team doesn’t endorse this decision. The assurance team then decides not to endorse or support the OSS that is designed and built by the design team. I’ve seen an OSS worth tens of millions of dollars turned off less than 2 years after handover because of turf wars. Stakeholder management again, although this could be easier said than done in this situation.

It sounded like a good idea at the time – The very clever OSS solution team keeps coming up with great enhancements that don’t get used, for whatever reason (eg non fit-for-purpose, lack of awareness of its existence by users, lack of training, etc). I’ve seen a customer that introduced over 500 customisations to an off-the-shelf solution, yet hundreds of those customisations hadn’t been touched by users within a full year prior to doing a utilisation analysis. That’s right, not even used once in the preceding 12 months. Some made sense because they were once-off tools (eg custom migration activities), but many didn’t.

The new OSS is a scary beast – The new solution might be perfect for what the customer has requested in terms of functionality. But if the solution differs greatly from what the operators are used to, it can be too intimidating to be used. A two-week classroom-based training course at the end of an OSS build doesn’t provide sufficient learning to take up all the nuances of the new system like the operators have developed with the old solution. Each significant new OSS needs an apprenticeship, not just a short-course.

It’s obsolete before it’s finished – OSS work in an environment of rapid change – networks, IT infrastructure, organisation models, processes, product offerings, regulatory shifts, disruptive innovation, etc, etc. The longer an OSS takes to implement, the greater the likelihood of obsolescence. All the more reason for designing for incremental delivery of business value rather than big-bang delivery.

What other examples have you experienced where an OSS has been built, but the users haven’t come?

Falsely rewarding based on OSS existence rather than excellence

There’s a common belief that most jobs see people rewarded for presence rather than performance. That is, they’re encouraged to be on site from 9am to 5pm rather than being given free rein over their work schedules as long as key outcomes are met / exceeded.

In OSS vendor / product selection there’s a similar concept. Contracts are often awarded based on existence rather than excellence. When evaluating a product, if it’s able to do a majority of the functions in the long list of requirements then the box is ticked.

However, this doesn’t take into account that there are usually only a very small number of functions that any given customer’s OSS needs to perform at a very high level of efficiency. All the others are effectively just nice to have. That’s the 80/20 rule at work.

When guiding a customer through their vendor selections, I always take them through an exercise to identify the use-cases / functions that really matter. Then we ensure that the demos or proofs of concept focus closely on how excellent the OSS is at those most important factors.
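
The difference between scoring on existence and scoring on excellence can be shown with a small Python sketch. The vendors, use-case names, weights and 0-5 demo ratings below are all invented for illustration. In practice the weights would come out of the exercise described above.

```python
def tick_box_score(product, requirements):
    """Existence: fraction of requirements the product supports at all (rating > 0)."""
    return sum(1 for r in requirements if product.get(r, 0) > 0) / len(requirements)


def weighted_score(product, weights):
    """Excellence: 0-5 demo ratings weighted by how much each use-case matters."""
    total = sum(weights.values())
    return sum(product.get(r, 0) * w for r, w in weights.items()) / total


# Hypothetical weights: two use-cases carry 80% of the importance
weights = {
    "alarm_correlation": 40,
    "service_activation": 40,
    "report_builder": 10,
    "map_view": 5,
    "sms_gateway": 5,
}
requirements = list(weights)

# Hypothetical demo ratings (0 = absent, 5 = excellent)
vendor_a = {"alarm_correlation": 2, "service_activation": 2,
            "report_builder": 5, "map_view": 5, "sms_gateway": 5}
vendor_b = {"alarm_correlation": 5, "service_activation": 5,
            "report_builder": 3, "map_view": 0, "sms_gateway": 0}

for name, vendor in (("A", vendor_a), ("B", vendor_b)):
    print(name, tick_box_score(vendor, requirements), weighted_score(vendor, weights))
```

With these made-up numbers, vendor A ticks every box (1.0 versus 0.6), yet vendor B scores higher once the ratings are weighted (4.3 versus 2.6) because it excels at the two functions that matter most.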

OSS automations – just because we can, doesn’t mean we should

Automation is about using machines / algorithms to respond faster than humans can, or more efficiently than humans can, or more accurately than humans can… but only if the outcomes justify the costs. When it comes to automations, it’s a case of, “just because we can, doesn’t mean we should.”

The more complex the decision tree you’re trying to automate, the higher the costs and therefore the harder it becomes to cost-justify. So the first step in any automation is taking a lateral thinking approach to simplifying the decision tree.
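
A back-of-the-envelope Python sketch shows why simplifying the decision tree first can change the business case. It assumes, purely for illustration, that build cost grows linearly with the number of branches to automate. The dollar figures and run counts are hypothetical.

```python
def automation_payback_years(branches, cost_per_branch, runs_per_year, saving_per_run):
    """Years needed to recover the build cost; None if there is no annual saving."""
    build_cost = branches * cost_per_branch      # assumption: cost scales with branches
    annual_saving = runs_per_year * saving_per_run
    if annual_saving <= 0:
        return None
    return build_cost / annual_saving


# Hypothetical process: 2,000 runs a year, $25 saved per automated run
as_is = automation_payback_years(branches=40, cost_per_branch=5000,
                                 runs_per_year=2000, saving_per_run=25)
simplified = automation_payback_years(branches=8, cost_per_branch=5000,
                                      runs_per_year=2000, saving_per_run=25)

print(as_is)       # payback in years for the original decision tree
print(simplified)  # payback in years after simplifying the tree first
```

Under these assumptions the 40-branch tree takes 4 years to pay back, while the 8-branch version pays back in under a year, with exactly the same operational saving.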

This recent post highlighted a graph from Nokia’s Bell Labs and the financial dependency that network slicing has on operational automation:
Nokia Network Slicing

Let’s use the Toyota Five Whys technique to work our way through the implications of this:

Statement 0: As CSPs, we need to drastically reduce complexity in the processes / decision-trees across our whole organisation.

Why 1? So that we can apply significant levels of automation

Why 2? So that we can apply technologies / techniques such as network slicing or virtualisation that are cost-justifiable

Why 3? So that we can offer differentiated, premium services

Why 4? So that our offerings don’t become commodities

Why 5? So that we retain corporate profitability to return to shareholders and/or invest in further interesting projects

I love that we’re looking to all number of automation technologies / techniques to apply to our OSS. However, we’re bypassing the all-important statement 0. We’re starting at Why 1 and partially missing the cost-justifiable part of Why 2. If our automation projects don’t prove cost-justifiable, then we never get the chance to reach whys 3, 4 and 5.

Persona mapping for OSS PoCs

When selecting new applications for an OSS or to augment an existing OSS, it always makes sense to me to run a Proof of Concept. But what do we want to demonstrate in that PoC? For me, we want to run demonstrations of the factors (eg features, use-cases, processes, etc) that justify the investment.

A simple exercise you can use is to identify the personas / roles that interact with the OSS. This could include personas such as NOC operator, strategic planner, network engineer, order entry, field ops, data / analytics, application administrator, etc. The actual personas will differ within each organisation of course.

For each of those personas, we can identify and interview an individual that represents that persona.

Interview questions include:

  1. What are the key responsibilities of your role
  2. What is the most important goal / KPI for your role
  3. How does this OSS (or proposed OSS) support you meeting this goal
  4. Describe the single most important process / function that you perform using the OSS
  5. Why is it so important
  6. How often do you perform this process / function
  7. Please provide a short list of other important processes / functions you perform with this OSS

We can then build this into a matrix and seek to prioritise into a set of use-cases. Based on time and cost constraints, we can then build the top-n of those use-cases into implementation scenarios for the PoC.
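
One simple way to turn the interview answers into a prioritised matrix is sketched below in Python. The scoring rule (3 points for a persona’s single most important process, 1 point for each other process they list) is an assumption chosen for illustration, as are the personas’ answers.

```python
from collections import defaultdict


def prioritise(interviews, top_n, key_weight=3, other_weight=1):
    """Score use-cases across personas and return the top_n as (use_case, score)."""
    scores = defaultdict(int)
    for answers in interviews:
        scores[answers["key"]] += key_weight          # answer to question 4
        for use_case in answers["others"]:            # answers to question 7
            scores[use_case] += other_weight
    # Highest score first; ties broken alphabetically so the result is stable
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical interview results
interviews = [
    {"persona": "NOC operator", "key": "alarm triage",
     "others": ["ticket creation", "outage reporting"]},
    {"persona": "network engineer", "key": "capacity lookup",
     "others": ["alarm triage"]},
    {"persona": "field ops", "key": "ticket creation",
     "others": ["alarm triage"]},
]

print(prioritise(interviews, top_n=2))
```

The frequency answer (question 6) could be folded in as an extra multiplier if time in the PoC is tight and you want to favour the processes that are run most often.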

OSS operationalisation at scale

We had a highly flexible network design team at a previous company. Not because we wanted to necessarily, but because we were forced to by the client’s allocation of work.

Our team was largely based on casual workers because there was little to predict whether we needed 2 designers or 50 in any given week. The workload being assigned by the client was incredibly lumpy.

But we were lucky. We only had design work. The lumpiness in design effort flowed down through the work stack into construction, test and deployment teams. The constructors had millions of dollars of equipment that they needed to mobilise and demobilise as the work ebbed and flowed. Unfortunately for the constructors, they’d prepared their rate cards on the assumption of a fairly consistent level of work coming through (it was a very big project).

This lumpiness didn’t work out for anyone in the delivery pipeline, the client included. It was actually quite instrumental in a few of the constructors going into liquidation. The client struggled to meet roll-out targets.

The allocation of work was being made via the client’s B/OSS stack. The B/OSS teams were blissfully unaware of the downstream impact of their sporadic allocation of designs. Towards the end of the project, they were starting to get more consistent and delivery teams started to get into more of a rhythm… just as the network was coming to the end of build.

As OSS builders, we sometimes get so wrapped up in delivering functionality that we can forget that one of the key requirements of an OSS is to operationalise at scale. In addition to UI / CX design, this might be something as simple as smoothing the effort allocation for work under our OSS’s management.
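
As a minimal sketch of what smoothing the allocation could look like, the Python below releases work to downstream teams at a capped weekly rate and carries the remainder as a backlog. The weekly arrival figures and the cap of 20 designs are invented for illustration, and real work allocation would have to respect due dates and dependencies that this ignores.

```python
def smooth_release(arrivals, weekly_cap):
    """Release at most weekly_cap items per week, carrying any excess forward."""
    if weekly_cap <= 0:
        raise ValueError("weekly_cap must be positive")
    released, backlog = [], 0
    for arriving in arrivals:
        backlog += arriving
        out = min(backlog, weekly_cap)
        released.append(out)
        backlog -= out
    # Keep releasing after the last arrival until the backlog is cleared
    while backlog > 0:
        out = min(backlog, weekly_cap)
        released.append(out)
        backlog -= out
    return released


# Hypothetical lumpy allocation of designs per week
arrivals = [50, 2, 0, 40, 3, 25]
print(smooth_release(arrivals, weekly_cap=20))
```

The same 120 designs flow through in both cases, but the released profile swings between 8 and 20 per week rather than between 0 and 50, which is far easier for a construction crew to resource against.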

Help needed: IoT / OSS cross-over use cases

Hi PAOSS community.

I’d like to call in a favour today if I may. I’m on the hunt for any existing use-cases and / or project sites that have integrated a significant sensor network into their OSS and existing operational processes.

That includes a strategy for handling IoT-scale integration of data collection, event / alarm processing, device management, data contextualization, data analytics, end-to-end security and applications management / enablement within existing OSS tools.

I’m looking for examples where an OSS had previously managed thousands of (network) devices and is now managing hundreds of thousands of (IoT) devices. Not necessarily IoT devices of customers as services but within an operator’s own network.

Obviously that’s an unprecedented change in scale in traditional OSS terms, but will be commonplace if our OSS are to play a part in the management of large sensor networks in the future.

There’s an element of mutual exclusivity between what an IoT management platform and OSS needs to do, but there are also some similarities. I’d love to speak with anyone who has actually bridged the gap.

If your partners don’t have to talk to you then you win

“If your partners don’t have to talk to you then you win.”
Guy Lupo

Put another way, the best form of customer service is no customer service (ie your customers and/or partners are so delighted with your automated offerings that they have no reason to contact you). They don’t want to contact you anyway (generally speaking). They just want to consume a perfectly functional and reliable solution.

In the deep, distant past, our comms networks required operators. But then we developed automated dialling / switching. In theory, the network looked after itself and people made billions of calls per year unassisted.

Something happened in the meantime though. Telco operators the world over started receiving lots of calls about their platform and products. You could say that they’re unwanted calls. The telcos even have an acronym called CVR – Call Volume Reduction – that describes their ambitions to reduce the number of customer calls that reach contact centre agents. Tools such as chatbots and IVR have sprung up to reduce the number of calls that an operator fields.

Network as a Service (NaaS), the context within Guy’s comment above, represents the next new tool that will aim to drive CVR (amongst a raft of other benefits). NaaS theoretically allows customers to interact with network operators via impersonal contracts (in the form of APIs). The challenge will be in the reliability – ensuring that nothing falls between the cracks in any of the layers / platforms that combine to form the NaaS.

In the world of NaaS creation, Guy is exactly right – “If your partners [and customers] don’t have to talk to you then you win.” As always, it’s complexity that leads to gaps. The more complex the NaaS stack, the less likely you are to achieve CVR.

What OSS environments do you need?

When we’re planning a new OSS, we tend to be focused on the production (PROD) environment. After all, that’s where its primary purpose is served, to operationalise a network asset. That is where the majority of an OSS’s value gets created.

But we also need some (roughly) equivalent environments for separate purposes. We’ll describe some of those environments below.

By default, vendors will tend to only offer licensing for a small number of database instances – usually just PROD and a development / test environment (DEV/TEST). You may not envisage that you will need more than this, but you might want to negotiate multiple / unlimited instances just in case. If nothing else, it’s worth bringing to the negotiation table even if it gets shot down because budgets are tight and / or vendor pricing is inflexible relating to extra environments.

Examples where multiple instances may be required include:

  1. Production (PROD) – as indicated above, that’s where the live network gets managed. User access and controls need to be tight here to prevent catastrophic events from happening to the OSS and/or network
  2. Disaster Recovery (DR) – depending on your high-availability (HA) model (eg cold standby, primary / redundant, active / active), you may require a DR or backup environment
  3. Sandpit (DEV / TEST) – these environments are essential for OSS operators to be able to prototype and learn freely without the risk of causing damage to production environments. There may need to be multiple versions of this environment depending on how reflective of PROD they need to be and how viable it is to take refresh / updates from PROD (aka PROD cuts). Sometimes also known as non-PROD (NP)
  4. Regression testing (REG TEST) – regression testing requires a baseline data set to continually test and compare against, flagging any variations / problems that have arisen from any change within the OSS or networks (eg new releases). This implies a need for data and applications to be shielded from the constant change occurring on other types of environments (eg DEV / TEST). In situations where testing transforms data (eg activation processes), REG TEST needs to have the ability to roll-back to the previous baseline state
  5. Training (TRAIN) – your training environments may need to be established with a repeatable set of training scenarios that also need to be re-set after each training session. This should also be separated from the constant change occurring on dev/test environments. However, due to a shortage of environments, and the relative rarity of training needed at some customers, TRAIN often ends up as another DEV or TEST environment
  6. Production Support (PROD-SUP) – this type of environment is used to prototype patches, releases or defect fixes (for defects on the PROD environment) prior to release into PROD. PROD-SUP might also be used for stress and volume testing, or SVT may require its own environment
  7. Data Migration (DATA MIG) – At times, data creation and loading needs to be prototyped in a non-PROD environment. Sometimes this can be done in PROD-SUP or even a DEV / TEST environment. On other occasions it needs its own dedicated environment so as to not interrupt BAU (business as usual) activities on those other environments
  8. System Integration Testing (SIT) – OSS integrate with many other systems and often require dedicated integration testing environments
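
The REG TEST item above relies on two mechanics: comparing against a baseline and rolling back to it after a test has transformed data. A minimal Python sketch of both follows. The service records and the fake activation step are hypothetical, and a real environment would snapshot a database rather than a dictionary.

```python
import copy


def diff_against_baseline(baseline, current):
    """Return {key: (before, after)} for every record added, removed or changed."""
    variations = {}
    for key in sorted(baseline.keys() | current.keys()):
        before, after = baseline.get(key), current.get(key)
        if before != after:
            variations[key] = (before, after)  # None means the record was absent
    return variations


# Hypothetical baseline data set for the REG TEST environment
baseline = {"svc-001": "active", "svc-002": "pending"}

# Work on a copy so the baseline itself is never touched by a test run
working = copy.deepcopy(baseline)

# A test that transforms data, eg an activation process
working["svc-002"] = "active"
working["svc-003"] = "pending"
print(diff_against_baseline(baseline, working))

# Roll back to the baseline state before the next run
working = copy.deepcopy(baseline)
print(diff_against_baseline(baseline, working))
```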

Am I forgetting any? What other environments do you find to be essential on your OSS?

The OSS Ferrari analogy

A friend and colleague has recently been talking about a Ferrari analogy on a security project we’ve been contributing to.

The end customers have decided they want a Ferrari solution, a shiny new, super-specified new toy (or in this case toys!). There’s just one problem though. The customer has a general understanding of what it is to drive, but doesn’t have driving experience or a driver’s license yet (ie they have a general understanding of what they want but haven’t described what they plan to do with the shiny toys operationally once the keys are handed over).

To take a step further back, since the project hasn’t articulated exactly where the customers want to go with the solution, we’re asking whether a Ferrari is even the right type of vehicle to take them there. As amazing as Ferraris are, might it actually make more sense to buy a 4WD vehicle?

As indicated in yesterday’s post, sometimes the requirements gathering process identifies the goal-based expectations (ie the business requirements – where the customer wants to go), but can often just identify a set of product features (ie the functional requirements such as a turbo-charged V8 engine, mid-mount engine, flappy-paddle gear change, etc, etc). The latter leads to buying a Ferrari. The former is more likely to lead to buying the vehicle best-suited to getting to the desired destination.

The OSS Ferrari sounds nice, but…

Optimisation Support Systems

We’ve heard of OSS being an acronym for operational support systems, operations support systems, even open source software. I have a new one for you today – Optimisation Support Systems – that exists for no purpose other than to drive a mindset shift.

“I think we have to transition from “expectations” in a hype sense to “expectations” in a goal sense. NFV is like any technology; it depends on a business case for what it proposes to do. There’s a lot wrong with living up to hype (like, it’s impossible), but living up to the goals set for a technology is never unrealistic. Much of the hype surrounding NFV was never linked to any real business case, any specific goal of the NFV ISG.”
Tom Nolle in his blog here.

This is a really profound observation (and entire blog) from Tom. Our technology, OSS included, tends to be surrounded by “hyped” expectations – partly from our own optimistic desires, partly from vendor sales pitches. It’s far easier to build our expectations from hype than to actually understand and specify the goals that really matter. Goals that are end-to-end in manner and preferably quantifiable.

When embarking on a technology-led transformation, our aim is to “make things better,” obviously. A list of hundreds of functional requirements might help. However, having an up-front, clear understanding of the small number of use cases you’re optimising for tends to define much clearer goal-driven expectations.

Security and privacy as an OSS afterthought?

I often talk about OSS being an afterthought for network teams. I find that they’ll often design the network before thinking about how they’ll operationalise it with an OSS solution. That’s both in terms of network products (eg developing a new device and only thinking about building the EMS later), or building networks themselves.

It can be a bit frustrating because we feel we can give better solutions if we’re in the discussion from the outset. As OSS people, I’m sure you’ll back me up on this one. But we can’t go getting all high and mighty just yet. We might just be doing the same thing… but to security, privacy and analytics teams.

In terms of security, we’ll always consider security-based requirements (usually around application security, access management, etc) in our vendor / product selections. We’ll also include Data Control Network (DCN) designs and security appliance (eg firewalls, IPS, etc) effort in our implementation plans. Maybe we’ll even prescribe security zone plans for our OSS. But security is more than that (check out this post for example). We often overlook the end-to-end aspects such as central authentication, API hardening, server / device patching, data sovereignty, etc and it then gets picked up by the relevant experts well into the project implementation.

Another one is privacy. Regulations like GDPR and the Facebook trials show us the growing importance of data privacy. I have to admit that historically, I’ve been guilty on this one, figuring that the more data sets I could stitch together, the greater the potential for unlocking amazing insights. Just one problem with that model – the more data sets that are stitched together, the more likely that privacy issues arise.

We increasingly have to figure out ways to weave security, privacy and analytics into our OSS planning up-front and not just think of them as overlays that can be developed after all of our key decisions have been made.

New OSS functionality or speed and scale?

We all know that revenue per bit (of data transferred across comms networks) is trending lower. How could we not? It’s posited as one of the reasons for declining profitability of the industry. The challenge for telcos is how to engineer an environment of low revenue per bit but still be cost viable.

I’m sure there are differentiated comms products out there in the global market. However, for the many products that aren’t differentiated, there’s a risk of commoditisation. Customers of our OSS are increasingly moving into a paradigm of commoditisation, which in turn impacts the form our OSS must mould themselves to.

The OSS we deliver can either be the bane or the saviour. They can be a differentiator where otherwise there is none, for example by getting each customer’s order ready for service (RFS) faster than competitors, or by processing orders at scale, yet at a lower cost-base, through efficiencies / repeatability such as streamlined products, processes and automations.

OSS exist to improve efficiency at scale of course, but I wonder whether we lose sight of that sometimes? I’ve noticed that we have a tendency to focus on functionality (ie delivering new features) rather than scale.

This isn’t just the OSS vendors or implementation teams either by the way. It’s often apparent in customer requirements too. If you’ve been lucky enough to be involved with any OSS procurement processes, which side of the continuum was the focus – on introducing a raft of features, or narrowing the field of view down to doing the few really important things at scale and speed?

Just in time design

It’s interesting how we tend to go in cycles. Back in the early days of OSS, the network operators tended to build their OSS from the ground up. Then we went through a phase of using Commercial off-the-shelf (COTS) OSS software developed by third-party vendors. We now seem to be cycling back towards in-house development, but with collaboration that includes vendors and external assistance through open-source projects like ONAP. Interesting too how Agile fits in with these cycles.

Regardless of where we are in the cycle for our OSS, as implementers we’re always challenged with finding the Goldilocks amount of documentation – not too heavy, not too light, but just right.

The Agile Manifesto espouses, “working software over comprehensive documentation.” Sounds good to me! It perplexes me that some OSS implementations are bogged down by lengthy up-front documentation phases, especially if we’re basing the solution on COTS offerings. These can really stall the momentum of a project.

Once a solution has been selected (which often does require significant analysis and documentation), I’m more of a proponent of getting the COTS software stood up, even if only in a sandpit environment. This is where just-in-time (JIT) documentation comes into play. Rather than having every aspect of the solution documented (eg process flows, data models, high availability models, physical connectivity, logical connectivity, databases, etc, etc), we only need enough documentation for collaborative stakeholders to do their parts (eg IT to set up hardware / hosting, networks to set up physical connectivity, vendor to provide software, integrator to perform build, etc) to stand up a vanilla solution.

Then it’s time to start building trial scenarios through the solution. There’s usually quite a bit of trial and error in this stage, as we seek to optimise the scenarios for the intended users. Then we add a few more scenarios.

There’s little point trying to document the solution in detail before a scenario is trialled, but some documentation can be really helpful. For example, if the scenario is to build a small sub-section of a network, then draw up some diagrams of that sub-network that include the intended naming conventions for each object (eg device, physical connectivity, addresses, logical connectivity, etc). That allows you to determine whether there are unexpected challenges with naming conventions, data modelling, process design, etc. There are always unexpected challenges that arise!
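One way to surface naming-convention surprises early is to run the trial sub-network’s object names through a simple check before loading them. The sketch below is a minimal illustration in Python. The convention itself (three-letter site code, device role, two-digit index, eg `SYD-RTR-01`) and the function name are invented purely for this example, so substitute whatever convention your own diagrams propose.

```python
import re

# Hypothetical convention: <3-letter site>-<role>-<2-digit index>, eg "SYD-RTR-01".
DEVICE_NAME_PATTERN = re.compile(r"[A-Z]{3}-(RTR|SWT|OLT)-\d{2}")


def find_naming_violations(names):
    """Return the names that break the convention or duplicate an earlier name."""
    seen = set()
    violations = []
    for name in names:
        if not DEVICE_NAME_PATTERN.fullmatch(name) or name in seen:
            violations.append(name)
        seen.add(name)
    return violations


trial_names = ["SYD-RTR-01", "syd-rtr-02", "MEL-SWT-01", "SYD-RTR-01"]
print(find_naming_violations(trial_names))
```

Running a check like this against a small trial data set tends to expose real clashes (duplicates, inconsistent case, roles nobody planned for) rather than theoretical ones.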

I figure you’re better off documenting the real challenges than theorising on the “what if?” challenges, which is what often happens with up-front documentation exercises. There are always brilliant stakeholders who can imagine millions of possible challenges, but these often bog the design phase down.

With JIT design, once the solution evolves, the documentation can evolve too… if there is an ongoing reason for its existence (eg as a user guide, for a test plan, as a training cheat-sheet, a record of configuration for fault-finding purposes, etc).

Interestingly, the first value in the Agile Manifesto is, “individuals and interactions over processes and tools.” This is where the COTS vs in-house-dev comes back into play. When using COTS software, individuals, interactions and processes are partly driven by what the tools support. COTS functionality constrains us but we can still use Agile configuration and customisation to optimise our solution for our customers’ needs (where cost-benefit permits).

Having a working set of vanilla tools allows our customers to get a much better feel for what needs to be done rather than trying to understand the intent of up-front design documentation. And that’s the key to great customer outcomes – having the customers knowledgeable enough about the real solution (not hypothetical solutions) to make the most informed decisions possible.

Of course there are always challenges with this JIT design model too, especially when third-party contracts are involved!

Using risk reversal to design OSS

There’s a concept in sales called “risk reversal” that takes all of the customers’ likely issues with a product and provides answers to alleviate customer concerns. I believe we can apply the same concept to OSS, not just to sell them, but to design them.

To borrow from a risk register page here on PAOSS, the major categories of risk that appear on almost all OSS projects are:

  • Organisational change management – the OSS will touch almost all parts of a business and a large number of people within the organisation. If all parts of the business are not conditioned to the change then the implementation will not be successful even if the technical deliverables are faultless. Change management has many, many layers but one way to minimise change management is to make the products and processes highly intuitive. I feel that intuitive OSS will come from a focus on design and simplification rather than our current focus on constantly adding more features. The aim should be to create OSS that are as easy for operators to start using as office tools like spreadsheets, word processors, presentation applications, etc
  • Data integrity – the OSS is only as good as the data that is being fed to it. If the quality of data in the OSS database is poor then operational staff will quickly lose faith in the tools. The product-based techniques that can be used to overcome this risk include:
    • Design tools / data model to cope with poor data quality, but also flag it as low confidence for future repair
    • Reduce data relationships / dependencies (ie referential integrity) to ensure that quality problems don’t have a domino effect on OSS usability
    • Build checks and balances that ensure the data can be reconciled and quality remains high
    • Incorporate closed-loop processes to ensure data quality is continually improved, rather than the open-loop processes that tend to lead to data quality degradation
  • Application functionality mapping to real business needs – OSS have been around long enough to have all but run out of features for vendors to differentiate against. The truly useful functionality has arisen from real business needs. “Wish-list” functionality that adds little tangible business benefit or requires significant effort is just adding product and project risk
  • Northbound Interface / Integration – Costs and risks of integrations are significant on each OSS project. There are many techniques that can be used to reduce risk, such as Minimum Viable Data (ie fewer data types to collect across an interface), standardised destination mapping models, etc, but the industry desperately needs major innovation here
  • Implementation – there are so many sources of risk within this category, as is to be expected on any large, complex project. Taking the PMP approach to risk reduction, we can apply the Triple Constraint model
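Two of the data integrity techniques listed above, accepting poor data while flagging it as low confidence, and reconciling records in a closed loop, can be sketched in a few lines. This is a minimal illustration in Python and not any real OSS product’s API. The `InventoryRecord` fields, the `load_record` and `reconcile` functions and the "high" / "low" confidence values are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class InventoryRecord:
    """Hypothetical OSS inventory record; field names are illustrative only."""
    device_name: str
    ip_address: Optional[str] = None
    site: Optional[str] = None
    confidence: str = "high"  # "low" means flagged for future repair


def load_record(raw: dict) -> InventoryRecord:
    """Accept incomplete data rather than rejecting it, but flag it as low confidence."""
    record = InventoryRecord(
        device_name=raw.get("device_name", "UNKNOWN"),
        ip_address=raw.get("ip_address"),
        site=raw.get("site"),
    )
    if record.device_name == "UNKNOWN" or record.ip_address is None or record.site is None:
        record.confidence = "low"
    return record


def reconcile(inventory: list[InventoryRecord], discovered: dict[str, str]) -> list[str]:
    """Closed-loop check against discovered data (device name -> IP address).

    Mismatched addresses are corrected from the discovered data. Devices that
    were not discovered are kept but flagged as low confidence. Returns the
    names of the records that were corrected.
    """
    repaired = []
    for record in inventory:
        actual_ip = discovered.get(record.device_name)
        if actual_ip is None:
            record.confidence = "low"
        elif record.ip_address != actual_ip:
            record.ip_address = actual_ip
            repaired.append(record.device_name)
    return repaired
```

The design choice worth noting is that nothing is thrown away: incomplete or unverified records stay usable but carry a flag, so operators can see how far to trust them and repair work can be prioritised.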

Aggregated OSS buying models

Last week we discussed a sell-side co-op business model. Today we’ll look at buy-side co-op models.

In other industries, we hear of buying groups getting great deals through aggregated buying volumes. This is a little harder to achieve with products that are as uniquely customised as OSS. It’s possible that OSS buy-side aggregation could occur for operators that are similar in nature but don’t compete (eg regional operators). Having said that, I’ve yet to see any co-ops formed to gain OSS group-purchase benefits. If you have, I’d love to hear about it.

In OSS, there are three approaches that aren’t exactly co-op buying models but do aggregate the evaluation and buying decision.

The most obvious is for corporations that run multiple carriers under one umbrella such as Telefonica (see Telefonica’s various OSS / BSS contract notifications here), SingTel (group contracts here), Etisalat, etc. There would appear to be benefits in standardising OSS platforms across each of the group companies.

A far less formal co-op buying model I’ve noticed is the social-proof approach. This is where one, typically large, network operator in a region goes through an extensive OSS / BSS evaluation and chooses a vendor. Then there’s a domino effect where other, typically smaller, network operators also buy from the same vendor.

Even less formal again is using third-party organisations like Passionate About OSS to assist with a standard vendor selection methodology. The vendors selected aren’t standardised because each operator’s needs are different, but the product / vendor selection methodology builds on the learnings of past selection processes across multiple operators. The benefit comes in the evaluation and decision frameworks.