The Ineffective OSS Scoreboard Analogy

Imagine for a moment that you’re the coach of a sporting team. You train your team and provide them with a strategy for the game. You send them out onto the court and let them play.

The scoreboard gives you all of the stats about each player. Their points, blocks, tackles, heart-rate, distance covered, errors, etc. But it doesn’t show the total score for each team or the time remaining in the game. 

That’s exactly what most OSS reports and dashboards are like! You receive all of the transactional data (eg alarms, truck-rolls, device performance metrics, etc), but not how you’re collectively tracking towards team objectives (eg growth targets, risk reduction, etc). 

Yes, you could infer whether the team is doing well by reverse engineering the transactional data. Yes, you could then apply strategies against those inferences in the hope that it has a positive impact. But that’s a whole lot of messing around in the chaos of the coach’s box with the scores close (you assume) and the game nearing the end (possibly). You don’t really know when the optimal time is to switch your best players back into the game.

As coach with funding available, would you be asking your support team to give you more transactional tools / data or the objective-based insights?

Does this analogy help articulate the message from the previous two posts (Wed and Thurs)?

PS. What if you wanted to build a coach-bot to replace yourself in the near future? Are you going to build automations that close the feedback loop against transactional data or are you going to be providing feedback that pulls many levers to optimise team objectives?

One big requirement category most OSS can’t meet

We talked yesterday about a range of OSS products that are more outcome-driven than our typically transactional OSS tools. There’s not many of them around at this stage. I refer to them as “data bridge” products.
 
Our typical OSS tools help manage transactions (alarms, activate customers services, etc). They’re generally not so great at (directly) managing objectives such as:
  • Sign up an extra 50,000 customers along the new Southern network corridor this month
  • Optimise allocation of our $10M capital budget to improve average attainable speeds by 20% this financial year
  • Achieve 5% revenue growth in Q3
  • Reduce truck rolls by 10% in the next 6 months
  • Optimal management of the many factors that contribute to churn, thus reducing churn risk by 7% by next March
 
We provide tools to activate the extra 50,000 customers. We also provide reports / dashboards that visualise the numbers of activations. But we don’t tend to include the tools to manage ongoing modelling and option analysis to meet key objectives. Objectives that are generally quantitative and tied to time, cost, etc and possibly locations/regions. 
 
These objectives are often really difficult to model and have multiple inputs. Managing to them requires data that’s changing on a daily basis (or potentially even more often – think of how a single missed truck-roll ripples out through re-calculation of optimal workforce allocation).
 
That requires:
  • Access to data feeds from multiple sources (eg existing OSS, BSS and other sources like data lakes)
  • Near real-time data sets (or at least streaming or regularly updating data feeds)
  • An ability to quickly prepare and compare options (data modelling, possibly using machine-based learning algorithms)
  • Advanced visualisations (by geography, time, budget drawdown and any graph types you can think of)
  • Flexibility in what can be visualised and how it’s presented
  • Methods for delivering closed-loop feedback to optimise towards the objectives (eg RPA)
  • Potentially manage many different transaction-based levers (eg parallel project activities, field workforce allocations, etc) that contribute to rolled-up objectives / targets
 
You can see why I refer to this as a data bridge product right? I figure that it sits above all other data sources and provides the management bridge across them all. 
 
PS. If you want to know the name of the existing products that fit into the “data bridge” category, please leave us a message.

An OSS checksum

Yesterday’s post discussed two waves of decisions stemming from our increasing obsession with data collection.

“…the first wave had [arisen] because we’d almost all prefer to make data-driven decisions (ie decisions based on “proof”) rather than “gut-feel” decisions.

We’re increasingly seeing a second wave come through – to use data not just to identify trends and guide our decisions, but to drive automated actions.”

Unfortunately, the second wave has an even greater need for data correctness / quality than we’ve experienced before.

The first wave allowed for human intervention after the collection of data. That meant human logic could be applied to any unexpected anomalies that appeared.

With the second wave, we don’t have that luxury. It’s all processed by the automation. Even learning algorithms struggle with “dirty data.” Therefore, the data needs to be perfect and the automation’s algorithm needs to flawlessly cope with all expected and unexpected data sets.

Our OSS have always had a dependence on data quality so we’ve responded with sophisticated ways of reconciling and maintaining data. But the human logic buffer afforded a “less than perfect” starting point, as long as we sought to get ever-closer to the “perfection” asymptote.

Does wave 2 require us to solve the problem from a fundamentally different starting point? We have to assume perfection akin to a checksum of correctness.

Perfection isn’t something I’m very qualified at, so I’m open to hearing your ideas. 😉

 

Crossing the OSS chasm

Geoff Moore’s seminal book, “Crossing the Chasm,” described the psychological chasm between early buyers and the mainstream market.

Crossing the Chasm

Seth Godin cites Moore’s work, “Moore’s Crossing the Chasm helped marketers see that while innovation was the tool to reach the small group of early adopters and opinion leaders, it was insufficient to reach the masses. Because the masses don’t want something that’s new, they want something that works…

The lesson is simple:

– Early adopters are thrilled by the new. They seek innovation.

– Everyone else is wary of failure. They seek trust.”
 

I’d reason that almost all significant OSS buyer decisions fall into the “mainstream market” section in the diagram above.  Why? Well, an organisation might have the 15% of innovators / early-adopters conceptualising a new OSS project. However, sign-off of that project usually depends on a team of approvers / sponsors. Statistics suggest that 85% of the team is likely to exist in a mindset beyond the chasm and outweigh the 15%. 

The mainstream mindset is seeking something that works and something they can trust.

But OSS / digital transformation projects are hard to trust. They’re all complex and unique. They often fail to deliver on their promises. They’re rarely reliable or repeatable. They almost all require a leap of faith (and/or a burning platform) for the buyer’s team to proceed.

OSS sellers seek to differentiate from the 400+ other vendors (of course). How do they do this? Interestingly, by pitching their innovations and uniqueness mostly.

Do you see the gap here? The seller is pitching the left side of the chasm and the buyer cohort is on the right.

I wonder whether our infuriatingly lengthy sales cycles (often 12-18 months) could be reduced if only we could engineer our products and projects to be more mainstream, repeatable, reliable and trustworthy, whilst being less risky.

This is such a dilemma though. We desperately need to innovate, to take the industry beyond the chasm. Should we innovate by doing new stuff? Or should we do the old, important stuff in new and vastly improved ways? A bit of both??

Do we improve our products and transformations so that they can be used / performed by novices rather than designed for use by all the massive intellects that our industry seems to currently consist of?

 

 

 

 

Over 30 Autonomous Networking User Stories

The following is a set of user stories I’ve provided to TM Forum to help with their current Autonomous Networking initiative.

They’re just an initial discussion point for others to riff off. We’d love to get your comments, additions and recommended refinements too.

As a Head of Network Operations, I want to Automatically maintain the health of my network (within expected tolerances if necessary) So that Customer service quality is kept to an optimal level with little or no human intervention
As a Head of Network Operations, I want to Ensure the overall solution is designed with machine-led automations as a guiding principle So that Human intervention can not be easily engineered into the systems/processes
As a Head of Network Operations, I want to Automatically identify any failures of resources or services within the entire network So that All relevant data can be collected, logged, codified and earmarked for effective remedial action without human interaction
As a Head of Network Operations, I want to Automatically identify any degradation of resource or service performance within the network So that All relevant data can be collected, logged, codified and earmarked for effective remedial action without human interaction
As a Head of Network Operations, I want to Map each codified data set (for failure or degradation cases) to a remedial action plan So that Remedial activities can be initiated without human interaction
As a Head of Network Operations, I want to Identify which remedial activities can be initiated via a programmatic interface and which activities require manual involvement such as a truck roll So that Even manual activities can be automatically initiated
As a Head of Network Operations, I want to Ensure that automations are able to resolve all known failure / degradation scenarios So that Activities can be initiated for any failure or degradation and be automatically resolved through to closure (with little or no human intervention)
As a Head of Network Operations, I want to Ensure there is sufficient network resilience So that Any failure or degradation can be automatically bypassed (temporarily or permanently)
As a Head of Network Operations, I want to Ensure there is sufficient resilience within all support systems So that Any failure or degradation can be automatically bypassed (temporarily or permanently) to ensure customer service is maintained
As a Head of Network Operations, I want to Ensure that operator initiated changes (eg planned maintenance, software upgrades, etc) automatically generate change tracking, documentation and logging So that The change can be monitored (by systems and humans where necessary) to ensure there is minimal or no impact to customer services, but also to ensure resolution data is consistently recorded
As a Head of Network Operations, I want to Ensure that customer initiated changes (eg by raising an incident) automatically generate change tracking, documentation and logging So that The change can be monitored (by systems and humans where necessary) to ensure the incident is closed expediently, but also to ensure resolution data is consistently recorded
As a Head of Network Operations, I want to Initiate planned outages with or without triggering automated remedial activities So that The change agents can decide to use automations or not and ensure automations don’t adversely effect the activities that are scheduled for the planned outage window
As a Head of Network Operations, I want to Ensure that if an unplanned outage does occur, impacted customers are automatically notified (on first instance and via a communications sequence if necessary throughout the outage window) So that Customer experience can be managed as best possible
As a Head of Network Operations, I want to Ensure that if an unplanned outage does occur without a remedial action being triggered, a post-mortem analysis is initiated So that Automations can be revised to cope with this previously unhandled outage scenario
As a Head of Network Operations, I want to Ensure that even previously un-seen new fail scenarios can be handled by remedial automations So that Customer service quality is kept to an optimal level with little or no human intervention
As a Head of Network Operations, I want to Automatically monitor the effects of remedial actions So that Remedial automations don’t trigger race conditions that result in further degradation and/or downstream impacts
As a Head of Network Operations, I want to Be able to manually override any automations by following a documented sequence of events So that If a race condition is inadvertently triggered by an automation, it can be negated quickly and effectively before causing further degradation
As a Head of Network Operations, I want to Intentionally trigger network/service outages and/or degradations, including cascaded scenarios on an scheduled and/or randomised basis So that The resilience of the network and systems can be thoroughly tested (and improved if necessary)
As a Head of Network Operations, I want to Intentionally trigger network/service outages and/or degradations, including cascaded scenarios on an ad-hoc basis So that The resilience of the network and systems can be thoroughly tested (and improved if necessary)
As a Head of Network Operations, I want to Perform scheduled compliance checks on the network So that Expected configurations and policies are in place across the network
As a Head of Network Operations, I want to Automatically generate scheduled reports relating to the effectiveness of the network, services and automations So that The overall solution health (including automations) can be monitored
As a Head of Network Operations, I want to Automatically generate dashboards (in near-real-time) relating to the effectiveness of the network, services and automations So that The overall solution health (including automations) can be monitored
As a Head of Network Operations, I want to Ensure that automations are able to extend across all domains within the solution So that Remedial actions aren’t constrained by system hand-offs
As a Head of Network Operations, I want to Ensure configuration backups are performed automatically on all relevant systems (eg EMS, OSS, etc) So that A recent good solution configuration can be stored as protection in case automations fail and corrupt configurations within the system
As a Head of Network Operations, I want to Ensure configuration restores are performed and tested automatically on all relevant systems (eg EMS, OSS, etc) So that A recent good solution configuration can be reverted to in case automations fail and corrupt configurations within the system
As a Head of Network Operations, I want to Ensure automations are able to manage the entire service lifecycle (add, modify/upgrade, suspend, restore, delete) So that Customer services can evolve to meet client expectations with little or no human intervention
As a Head of Network Operations, I want to Have a design and architecture that uses intent-based and/or policy-based actions So that The complexity of automations is minimised (eg automations don’t need to consider custom rules for different device makes/models, etc)
As a Head of Network Operations, I want to Ensure as many components of the solution (eg EMS, OSS, customer portals, etc) have programmatic interfaces (even if manual activities are required in back-end processes) So that Automations can initiate remedial actions in near real time
As a Head of Network Operations, I want to Ensure all components and data flows within the solution are securely hardened (eg encryption of data in motion and at rest) So that The power of the autonomous platform can not be leveraged for nefarious purposes
As a Head of Network Operations, I want to Ensure that all required metrics can be automatically sourced from the network / systems in as near real time as feasible / useful So that Automations have the full set of data they need to initiate remedial actions and it is as up-to-date as possible for precise decision-making
As a Head of Network Operations, I want to Use the power of learning machines So that The sophistication and speed of remedial response is faster, more accurate and more reliable than if manual interaction were used
As a Head of Network Operations, I want to Record actual event patterns and replay scenarios offline So that Event clusters and response patterns can be thoroughly tested as part of the certification process prior to being released into production environments
As a Head of Network Operations, I want to Capture metrics that can be cross-referenced against event patterns and remedial actions So that Regressions and/or refinements can improve existing automations (ie continuous retraining of the model)
As a Head of Network Operations, I want to Be able to seed a knowledge base with relevant event/action data, whether the pattern source is from Production, an offline environment, a digital twin environment or other production-like environments So that The database is able to identify real scenarios, even if  scenarios are intentially initiated, but could potentially cause network degradation to a production environment
As a Head of Network Operations, I want to Ensure that programmatic interfaces also allow for revert / rollback capabilities So that Remedial actions that aren’t beneficial can be rolled back to the previous state; OR other remedial actions are performed, allowing the automation to revert to original configuration / state
As a Head of Network Operations, I want to Be able to initiate circuit breakers to override any automations So that If a race condition is inadvertently triggered by an automation, it can be negated quickly and effectively before causing further degradation
As a Head of Network Operations, I want to Manually or automatically generate response-plans (ie documented sequences of activities) for any remedial actions fed back into the system So that Internal (eg quality control) or external (eg regulatory) bodies can review “best-practice” remedial activities at any point in time
As a Head of Network Operations, I want to Intentionally trigger catastrophic network failures (in non-prod environments) So that We can trial many remedial actions until we find an optimal solution to seed the knowledge base with

We use time-stamping in OSS, but what about geo-stamping?

A slightly left-field thought dawned on me the other day and I’d like to hear your thoughts on it.

We all know that almost all telemetry coming out of our networks is time-stamped. Events, syslogs, metrics, etc. That makes perfect sense because we look for time-based ripple-out effects when trying to diagnose issues.

But therefore does it also make sense to geo-stamp telemetry data too? Just as time-based ripple-out is common, so too are geographic / topological (eg nearest neighbour and/or power source) ripple-out effects.

If you want to present telemetry data as a geo/topo overlay, you currently have to enrich the telemetry data set first. Typically that means identifying the device name that’s generating the data and then doing a query on huge inventory databases to find the location and connectivity that corresponds to that device.

It’s usually not a complex query, but consider how much processing power must go into enriching at the enormous scale of telemetry records.

For stationary devices (eg core routers), it might seem a bit absurd adding a fixed geo-code (which has to be manually entered into the device once) to every telemetry record, but it seems computationally far more efficient than data lookups (please correct me if I’m wrong here!). For devices that move around (eg routers on planes), hopefully they already have GPS sensors to provide geo-stamp data.

What do you think? Am I stating a problem that has already been solved and/or is not worth solving? Or does it have merit?

As a network owner….

….I want to make my network so observable, reliable, predictable and repeatable that I don’t need anyone to operate it.

That’s clearly a highly ambitious goal. Probably even unachievable if we say it doesn’t need anyone to run it. But I wonder whether this has to be the starting point we take on behalf of our network operator customers?

If we look at most networks, OSS, BSS, NOC, SOC, etc (I’ll call this whole stack “the black box” in this article), they’ve been designed from the ground up to be human-driven. We’re now looking at ways to automate as many steps of operations as possible.

If we were to instead design the black-box to be machine-driven, how different would it look?

In fact, before we do that, perhaps we have to take two unique perspectives on this question:

  1. Retro-fitting existing black-boxes to increase their autonomy
  2. Designing brand new autonomous black-boxes

I suspect our approaches / architectures will be vastly different.

The first will require a incredibly complex measure, command and control engine to sit over top of the existing black box. It will probably also need to reach into many of the components that make up the black box and exert control over them. This approach has many similarities with what we already do in the OSS world. The only exception would be that we’d need to be a lot more “closed-loop” in our thinking. I should also re-iterate that this is incredibly complex because it inherits an existing “decision tree” of enormous complexity and adds further convolution.

The second approach holds a great deal more promise. However, it will require a vastly different approach on many levels:

  1. We have to take a chainsaw to the decision tree inside the black box. For example:
    • We start by removing as much variability from the network as possible. Think of this like other utilities such as water or power. Our electricity service only has one feed-type for almost all residential and business customers. Yet it still allows us great flexibility in what we plug into it. What if a network operator were to simply offer a “broadband dial-tone” service and end users decide what they overlay on that bit-stream
    • This reduces the “protocol stack” in the network (think of this in terms of the long list of features / tick-boxes on any router’s brochure)
    • As well as reducing network complexity, it drastically reduces the variables an end-user needs to decide from. The operator no longer needs 50 grandfathered, legacy products 
    • This also reduces the decision tree in BSS-related functionality like billing, rating, charging, clearing-house
    • We achieve a (globally?) standardised network services catalog that’s completely independent of vendor offerings
    • We achieve a more standardised set of telemetry data coming from the network
    • In turn, this drives a more standardised and minimal set of service-impact and root-cause analyses
  2. We design data input/output methods and interfaces (to the black box and to any of its constituent components) to have closed-loop immediacy in mind. At the moment we tend to have interfaces that allow us to interrogate the network and push changes into the network separately rather than tasking the network to keep itself within expected operational thresholds
  3. We allow networks to self-regulate and self-heal, not just within a node, but between neighbours without necessarily having to revert to centralised control mechanisms like OSS
  4. All components within the black-box, down to device level, are programmable. [As an aside, we need to consider how to make the physical network more programmable or reconcilable, considering that cables, (most) patch panels, joints, etc don’t have APIs. That’s why the physical network tends to give us the biggest data quality challenges, which ripples out into our ability to automate networks]
  5. End-to-end data flows (ie controls) are to be near-real-time, not constrained by processing lags (eg 15 minute poll cycles, hourly log processing cycles, etc) 
  6. Data minimalism engineering. It’s currently not uncommon for network devices to produce dozens, if not hundreds, of different metrics. Most are never used by operators manually, nor are likely to be used by learning machines. This increases data processing, distribution and storage overheads. If we only produce what is useful, then it should improve data flow times (point 5 above). Therefore learning machines should be able to control which data sets they need from network devices and at what cadence. The learning engine can start off collecting all metrics, then progressively turning them off as they deem metrics unnecessary. This could also extend to controlling log-levels (ie how much granularity of data is generated for a particular log, event, performance counter)
  7. Perhaps we even offer AI-as-a-service, whereby any of the components within the black-box can call upon a centralised AI service (and the common data lake that underpins it) to assist with localised self-healing, self-regulation, etc. This facilitates closed-loop decisions throughout the stack rather than just an over-arching command and control mechanism

I’m barely exposing the tip of the iceberg here. I’d love to get your thoughts on what else it will take to bring fully autonomous network to reality.

For those starting out in OSS product, here’s a tip

For those starting out in product, here’s a tip: Design, Defaults*, Documentation, Details and Delivery really matter in software.”
Jeetu Patel here.

* Note that you can interpret “Defaults” to be Out-Of-The-Box functionality offered by the product.

Let’s break those 5 D-words down and describe why they really matter to the OSS industry shall we?

  • Design – The power of OSS product development tends to lie with engineering, ie the developers. I have huge admiration for the very clever and very talented engineers who create amazing products for us to use, buuutttttt……. I just have one reservation – is there a single OSS company that is design-driven? A single one that’s making intuitive, effective, beautiful experiences for their users? The obvious answer is of course engineering teams hold sway over design teams in OSS – how many OSS vendors even have a dedicated design department??? See this article for more.
  • Defaults – Almost every OSS I know of has an enormous amount of “out-of-the-box” functionality baked in. You could even say that most have too much functionality. There’s functionality that might be really important for one customer but never even used by any of the vendor’s other customers. It just represents bloat for all the other customers, and potentially a distraction for their operators. I’m still bemused to see vendors trying to differentiate by adding obscure new default features rather than optimising for “must-have” functions. See this article for more. However, I must add that I’m starting to see a shift in some OSS. They’re moving away from having baked-in functionality and are moving to more data-repository-driven architectures. Interesting!!
  • Documentation – This is a really interesting factor! Some vendors make almost no documentation available until a prospect becomes a paying customer. Other vendors make their documentation available for the general public online and dedicate significant effort to maintaining their information library. The low-doc approach espoused by Agile could be argued to be reducing document quality. However, it also reduces the chance of producing documentation that nobody will read ever! Personally, I believe vendors like Cisco have earnt a huge competitive advantage (in the networking space moreso than OSS) because of their training / certification (ie CCNA, etc) and self-learning (ie online documentation). See this article for more. As such, I’d tend to err on over-documenting for customer-facing collateral. And perhaps under-documenting for internal-facing collateral unless it’s likely to be used regularly and by many.
  • Details – This is another item where there are two ends to the spectrum. That might surprise some people who would claim that attention to detail is paramount. Well, yes…. in many cases, but certainly not all on OSS projects. Let me share a story on attention to detail on a past OSS project. And another story on seeking perfection. Sometimes we just need to find the right balance, and knowing when to prioritise resilience and when to favour precision becomes an art.
  • Delivery – I have two perspectives on this D-word. Firstly, the Steve Jobs inspired quote of “Real artists ship!” In other words, to laud the skill of shipping a product that provides value to the customer rather than holding off on a not-yet-perfected solution. But the second case is probably more important. OSS projects tend to be massive and complex transformation efforts. Our OSS are rarely self-installed like office software, so they require big delivery teams. Some products are easy to deliver/deploy. Others are a *&$%#! If you’re a product developer, please get out in the trenches with your delivery teams and find ways to make their job easier and/or more repeatable.

Net Simplicity Score (NSS) gets a little more complex

In last Tuesday’s post, I asked the community here on PAOSS and on TM Forum’s Engage platform for ideas about how you would benchmark complexity.

I also provided a reference to an old post that described the concept of a NSS (Net Simplicity Score) for our OSS/BSS.

Due to the complexity of factors that contribute to a complexity score, the NSS is a “catch-all” simplicity metric. Hopefully it will allow subtraction projects to be easily justified, just as the NPS (Net Promoter Score) metric has helped justify customer experience initiatives.

The NSS (Net Simplicity Score), could be further broken down into:

  • The NCSS (Net Customer Simplicity Score) – A ranking from 0 (lowest) to 10 (highest) how easy is it to choose and use the company / product / service? This is an external metric (ie the ranking of the level of difficulty that your customers face)
  • The NOSS (Net Operator Simplicity Score) – A ranking from 0 (lowest) to 10 (highest) how easy is it to choose and use the company / product / service? This is an internal metric (ie for operators to rank complexity of systems and their constituent applications / data / processes)

One interesting item of feedback came from Ronald Hasenberger. He rightly pointed out that just because something is simple for users to interact with, doesn’t mean it’s simple behind the scenes – often exactly the opposite. The iPod example I used in earlier posts is a case in point. The iPod was more intuitive than existing MP3 players, but a huge amount of design and engineering went into making it that way. The underlying “system” certainly wasn’t simple.

So perhaps there’s a third simplicity factor to add to the two bullets listed above:

  • The NSSS (Net System Simplicity Score) – and this one does require a more sophisticated algorithm than just an aggregate of perceptions. Not only that, but it’s the one that truly reflects the systems we design and build. I wonder whether the first two are an initial set of proxies that help drive complexity out of our solutions, but we need to develop Ronald’s third one to make the biggest impact?

Again, I’d love to hear your thoughts!

Opinions wanted – How to Benchmark OSS/BSS complexity

I’d love to ask you an important question…  how do we benchmark OSS/BSS complexity? To measure how complex our systems are and therefore provide a signpost for simplification.

A colleague has opined that the number of apps in a stack could be used a proxy. I can see where he’s going with that, but I feel that it doesn’t account for architectural differences such as monolith versus microservices.

I’d love to hear your thoughts via the comments box below.

FWIW, here are some additional thoughts from me, but please don’t let them bias your opinions:

  • For me, complexity relates to the efficiency of getting tasks done:
    • How much time to complete certain tasks
    • How many button clicks
    • How much swivel-chairing
    • How many CPU cycles for automated tasks
    • How much admin overhead
    • How much duplicated effort and/or rework
  • However, there are so many different tasks done within an OSS/BSS stack that it’s difficult to provide a complexity metric that compares one OSS/BSS stack with another. Or compares a single stack before/after changes are made
  • In some cases the complexity happens inside the OSS/BSS “black box” (eg tools within the suite aren’t seamlessly integrated, causing operators to perform dual-entry that leads to data inconsistency and downstream re-work)
  • In other cases the complexity is inherited from outside the black box (eg product offerings have hundreds of possible variants that are imperceptibly different in the customer’s eyes). I call this The OSS Pyramid of Pain
  • In many cases, the complexity of an OSS/BSS stack is less about the systems and integrations, and more about the complexity of The Decision Tree that spans the stack. The spread of the Decision Tree is impacted by:
    • The OSS/BSS applications
    • Support applications (eg authentication, security, data management, resilience / availability, etc)
    • System interfaces (internal and external)
    • User interfaces
    • Process designs
    • Product definitions
    • Work practices
    • Data models
    • Design rules
    • Network topologies
    • etc, etc
  • The more complex the Decision Tree, the more complex it is to transform our OSS. It loosely aligns with what I call this The Chessboard Analogy
  • The development strategy used also has an impact, be monolithic, best-of-breed, hosted or in-house developed. For example an in-house-developed solution is likely to have less functionality-bloat than a COTS (off-the-shelf) solution. The COTS solution needs to include additional functionality to enable it to support requirements of multiple customers
  • And finally, a benchmark is only as useful as the actions that it triggers. How do we codify a complexity metric that has the equally complex array of contributions described above?
  • Perhaps we could take a somewhat abstracted approach like the NPS (Net Promoter Score) does, thus creating a NSS (Net Simplicity Score)

As mentioned above, I’d love to hear your thoughts on how we can benchmark the level of complexity in our OSS/BSS. Please leave your comments below.

 

The digital transformation paradox twins

There’s an old adage that “the confused mind always says no.”

Consider this from your own perspective. If you’re in a state of confusion about something, are you likely to commit wholeheartedly or will you look to delay / procrastinate?

The paradox for digital transformation is that our projects are almost always complex, but complexity breeds confusion and uncertainty. Transformation may be urgently needed, but it’s really hard to persuade stakeholders and sponsors to commit to change if they don’t have a clear picture of the way forward.

As change agents, we face another paradox. It’s our task to simplify the messaging. but our messaging should not imply that the project will be simple. That will just set unrealistic expectations for our stakeholders (“but this project was supposed to be simple,” they say).

Like all paradoxes, there’s no perfect solution. However, one technique that I’ve found to be useful is to narrow down the choices. Not by discarding them outright, but by figuring out filters – ways to quick include or exclude branches of the decision tree.

Let’s take the example of OSS vendor selection. An organisation asks itself, “what is the best-fit OSS/BSS for our needs?” The Blue Book OSS/BSS Vendor Directory will show that there are well over 400 OSS/BSS providers to choose from. Confusion!

So let’s figure out what our needs are. We could dive into really detailed requirement gathering, but that in itself requires many complex decisions. What if we instead just use a few broad needs as our first line of filtering? We know we need an outside plant management tool. Our list of 400+ now becomes 20. There’s still confusion, but we’re now more targeted.

But 20 is still a lot to choose from. A slightly deeper level of filtering should allow us to get to a short list of 3-5. The next step is to test those 3-5 to see which does the best at fulfilling the most important needs of the organisation. Chances are that the best-fit won’t fulfil every requirement, but generally it will clearly fulfil more than any of the other alternatives. It’s best-fit, not perfect fit.

We haven’t made the project less complex, but we have simplified the decision. We’ve arrived at the “best” option, so the way forward should be clear right?

Unfortunately, it’s not always that easy. Even though the best way forward has been identified, there’s still uncertainties in the minds of stakeholders caused purely by the complexity of the upcoming project. I’ve seen examples where the choice of vendor has been clear, with the best-fit clearly surpassing the next-best, but the buyer is still indecisive. I completely get it. Our task as change agents is to reduce doubts and increase transformation confidence.

What will get your CEO fired? (part 4)

In Monday’s article, we suggested that the three technical factors that could get the big boss fired are probably only limited to:

  1. Repeated and/or catastrophic failure (of network, systems, etc)
  2. Inability to serve the market (eg offerings, capacity, etc)
  3. Inability to operate network assets profitably

In that article, we looked closely at a human factor and how current trends of open-source, Agile and microservices might actually exacerbate it. In yesterday’s article we looked at market-serving factors for us to investigate and monitor.

But let’s look at point 3 today. The profitability factors we could consider that reduce the chances of the big boss getting fired are:

  1. Ability to see revenues in near-real-time (revenues are relatively easy to collect, so we use these numbers a lot. Much harder are profitability measures because of the shared allocation of fixed costs)

  2. Ability to see cost breakdown (particularly which parts of the technical solution are most costly, such as what device types / topologies are failing most often)

  3. Ability to measure profitability by product type, customer, etc

  4. Are there more profitable or cost-effective solutions available

  5. Is there greater profitability that could be unlocked by simplification

Exactly what is an OSS’s “intuition age”

I’m currently reading a book entitled, “Jony Ive. The genius behind Apple’s greatest products.”

I’d like to share a paragraph with you from it (and probably expect a few more in coming days):

“…Apple’s internal culture heavily favored the engineers within the product groups. The design process was engineering driven. In the early days of Frog Design, the engineers had bent over backward to help implement the design team’s ambitions, but now the power had shifted. The different engineering groups gave their products in development to Brunner’s group, who were expected to merely “skin” them.

Brunner wanted to shift the power from engineering to design. He started thinking strategically… The idea was to get ahead of the engineering groups and start to make Apple more of a design-driven company rather than a marketing or engineering one.”

That’s an unbelievably insightful conclusion Robert Brunner made. If he wanted to turn Apple into a design-driven company, then he’d have to prepare design concepts that looked further into the future than where the engineers were up to. Products like the iPod and iPad are testimony that Brunner’s strategy worked.

We face the same situation in OSS today. The power of product development tends to lie with engineering, ie the developers. I have huge admiration for the very clever and very talented engineers who create amazing products for us to use, buuutttttt…….

I just have one reservation – is there a single OSS company that is design-driven? A single one that’s making intuitive, effective, beautiful experiences for their users? Of course engineering holds power over design in OSS – how many OSS vendors even have a dedicated design department???

Let me give a comparison (albeit a slightly unfair one). Both of my children were reasonably adept at navigating their way around our iPad (for multiple use cases) by the age of three. What would the equivalent “intuition age” be for navigating our OSS?

If you’re a product manager, have you ever tried it? Have you ever considered benchmarking it (or an equivalent usability metric) and seeing what you could do to improve it for your OSS products?

Interesting metrics from The Blue Book launch

When I first started the Passionate About OSS site / blog many years ago, I was lucky to get a handful of views per day. It’s grown by many multiples since then, fortunately.

The launch of The Blue Book OSS/BSS Vendor Directory generated some exciting metrics yesterday. The directory alone came within 5 pageviews of the highest count we’ve ever seen on PassionateAboutOSS.com (and PAOSS is up to nearly 2,500 posts now). That total appeared in only a 14-hour window because we didn’t go live with The Directory or metric collection until ~10am local time! The graphs are indicating that we should easily exceed PAOSS’s best ever count today.

If you were one of the many viewers who popped in from all around the world to look at The Directory, thank you! If you have any suggested improvements, we’d love to hear from you as we’re sure to be making many further tweaks in coming days/months.

But the most interesting fact about the launch yesterday was that a job posting appeared on UpWork to scrape all the data we’ve presented. On our very first day!! In fact a gentleman in the US reviewed bids and awarded the UpWork job all within about 14 hours of go-live.

That’s positive news because it means that at least one person must’ve thought the data was useful. 🙂

“The Blue Book OSS/BSS Vendor Directory” from Passionate About OSS has officially launched

We’re excited to announce that “The Blue Book OSS/BSS Vendor Directory” has officially gone live here at https://passionateaboutoss.com/directory

It provides a comprehensive directory of over 400 suppliers that produce OSS, BSS and/or related network management tools. Company details, product details and functionality classifications are included.

The Blue Book OSS / BSS Vendor Directory

Every network operator has a unique set of needs from their operational software – software that includes OSS (Operational Support Systems), BSS (Business Support Systems), NMS (Network Management Systems) and the many other related tools.

To service those many and varied needs, a large number of different products have been created by some very clever developers. But it’s a highly fragmented market. There are literally hundreds of product options out there and they all have different capabilities.

If you’re a typical buyer, how many of those products are you familiar with? Five? Ten? Fifty? How do you know whether the best-fit product or supplier is within the list you already know? Perhaps the best-fit is actually amongst the hundreds of other products and suppliers you’re not familiar with yet. How much time do you have to research each one and distill down to a short-list of possible candidates to service your specific needs? Where do you start? Lots of web searches? There has to be an easier way.

What if you’re a seller? These products tend to have lengthy life-cycles once they’ve been installed so it might be years before a prospect actually enters the buying phase. Yet there are so many prospects out there at different phases of their buying windows. There are bound to be some live ones at any time that suit your capabilities. The challenge for you as a supplier is how to make those prospects aware of you. You don’t have the time to establish trusted relationships with hundreds, perhaps even thousands, of buyers across the globe (or maybe just within your region/s). Wouldn’t you love to be presented with qualified prospects who are in (or nearing) their buying window?

Well we at Passionate About OSS have created The Blue Book OSS/BSS Vendor Directory to simplify the task of bringing buyers and sellers together. With over 400 suppliers listed (and climbing), we provide a single, comprehensive repository for searching, matching and connecting. The tools allow you to do it yourself, or we can help you using the approaches we’ve developed, used and refined over the years.

Now just click on “Directory” to start your journey of searching, matching and connecting (and updating your listing if you’re a supplier).

A lighter-touch OSS procurement approach (part 3)

We’ve spoken at length about TM Forum’s, “Time to kill the RFP? Reinventing IT procurement for the 2020s,” report so far this week. We’ve also spoken about the feeling that the OSS/BSS RFP (Request For Proposal) still has relevance in some situations… as long as it’s more of a lighter-touch than most. We’ve spoken about a more pragmatic approach that aims to find best available fit (for key objectives through stages of filtering) rather than perfect fit (for all requirements through detailed analyses). And I should note that “best available fit” includes measurement against these three contrarian procurement KPIs ahead of the traditional ones.

Yesterday’s post discussed how we get to a short list with minimal involvement of buyers and sellers, with the promise that we’d discuss the detailed analysis stage today.

It’s where we do use an RFP, but with thought given to the many pain-points cited so brilliantly by Mark Newman and team in the abovementioned TM Forum report.

The RFP provides the mechanism to firm up pricing and architecture, but is also closely tied to a PoC (Proof of Concept) demonstration. The RFP helps to prioritise the order in which PoCs are performed. PoCs tend to be very time consuming for buyer and seller. So if there’s a clear leader from the paper studies so far, then they will demonstrate first.

If there’s not a clear difference, or if the prime candidate’s demonstration identified significant gaps, then additional PoCs are run.

And to ensure the PoCs are run against the objectives that matter most, we use scenarios that were prioritised during part 1 of this series.

Next steps are to form the more detailed designs, commercials / contracts and ratify that the business case still holds up.

In yesterday’s post, I also promised to share our “starting-point” procurement methodology. I say starting point because each buyer situation is different and we tend to customise it to each buyer’s needs. It’s useful for starting discussions.

The overall methodology diagram is shown below:

PAOSS vendor selection process

A few key notes here:

  1. The process looks much heavier than it really is… if you use traditional procurement processes as an indicator
  2. We have existing templates for all the activities marked in yellow
  3. The activity marked in blue partially represents the project we’re getting really excited to introduce to you tomorrow

 

A lighter-touch OSS procurement approach (part 2)

Yesterday’s post described the approach to get from 400+ possible OSS/BSS suppliers/products down to a more manageable list without:

  1. Having to get into significant discussions with vendors (yet)
  2. Gathering all your stakeholders together to prepare a detailed list of requirements

We’ll call this “the long list,” which might consist of 5-20 suppliers. We use this evaluation technique (which we’ll share more about on Monday) to ensure we’ve looked at the broad market of suppliers rather than just the few the buyer already knows.

The next step we follow helps us to get to a much smaller list, which we’ll call “the short list.”

For this, we do need to contact vendors (the long list) and we do need to prepare a list of requirements to add to the objectives and key workflows we’ve previously identified. The requirements won’t need to be detailed, but will still probably number into the 100s – some from our pick-list, others customised to each client’s needs.

Then we engage in what we refer to as an EOI (Expression of Interest) phase. Our EOIs are not just a generic market capability analysis like many  buyers conduct. Ours seek indicative vendor compliance (to objectives and requirements) and indicative pricing based on the dimensions we supply. We’ve refined this model over the years to make it quite quick and (relatively) easy for vendors to respond to.

Using compliance to measure suitability and indicative pricing to plug in to our long-term TCO (Total Cost of Ownership) model, the long list usually becomes a clear short list of 1-5 very quickly.

Now we can get into detailed discussions with a very small number of best-fit suppliers without having wasted much time of buyer or seller. 

More on the detailed discussions tomorrow!

A lighter-touch OSS procurement approach (part 1)

You may have noticed that we’ve run a series of posts about OSS/BSS procurement, and about the RFP process by association.

One of the first steps in the traditional procurement process is preparing a strategy and detailed set of requirements.

As TM Forum’s, “Time to kill the RFP? Reinventing IT procurement for the 2020s,” report describes:
Before an RFP can be issued, the CSP’s IT or network team must produce a document detailing the strategy for implementing a technology or delivering a service, which is a lengthy process because of the number of stakeholders involved and the need to describe requirements in a way that satisfies them all.”

The problem with most requirements documents, the ones I’ve seen at least, is that they tend to get down into a deep, deep level of detail. And when it’s down in that level of detail, contrasting opinions from different stakeholders can make it really difficult to reach agreement. Have you ever been in a room with many high-value (and high cost) stakeholders spending days debating the semantics (and wording) of requirements? Every stakeholder group needs a say and needs to be heard.

The theory is that you need a great level of detail to evaluate supplier offerings for best-fit. Well, maybe, but not in the initial stages.

First things first – I seek to find out what’s really important for the organisation. That rarely comes from a detailed requirements spreadsheet, but by determining the things that are done most often and/or add the most value to the buyer’s organisation. I use persona mapping, long-tail and perhaps whale-curve mapping approaches to determine this.

Persona mapping means identifying all the groups within the buyer’s organisation that need to interact with the OSS/BSS (current and proposed). Then sitting with each group to determine what they need to achieve, who they need to interact with and what their workflows look like. That also gives a chance for all groups to be heard.

From this, we can collaboratively determine some high-level evaluation criteria, maybe only 15-20 to start with. You’d be surprised at how quickly this 15-20 criteria can help with initial supplier filtering.

Armed with the initial 15-20 evaluation criteria and the project we’re getting excited to launch on Monday, we can get to a relevant list of possible suppliers quite quickly. It allows us to do a broad market search to compile a list of suppliers, not just from the 5-10 suppliers the buyer already knows about, but from the 400+ suppliers/products available on the market. And we don’t even have to ask the suppliers to fill out any lengthy requirement response spreadsheets / forms yet.

We’ll continue the discussion over the next two days. We’ll also share our procurement methodology pack on Sunday.

Do I support the death penalty (of OSS RFPs)? Hmmm….

As per yesterday’s post, I’ll continue to reference a TM Forum report called, “Time to kill the RFP? Reinventing IT procurement for the 2020s” today. Mark Newman and the team have captured and discussed so many layers to the OSS/BSS procurement process.

There’s no doubt the current stereotypical RFP approach to procurement is broken. It needs to be done differently. That’s why we have been doing it differently with customers for years now (another hint regarding a project we’re getting excited to announce this Monday).

The TM Forum report is really powerful and well worth a read. There are a few additional (and somewhat random) thoughts that go through my head when considering the death of the RFP:

  1. The TM Forum report is primarily coming at the problem from the perspective of a carrier that is constantly steering the development of its own systems, as implied through this quote, “The fundamental problem with the RFP process is that in a fast-paced technology environment, where cloud and software are fast becoming preferred options, it is difficult for CSPs to describe in lengthy, written documents what they want and need. The processes are simply too complex and cumbersome to support modern, Agile methods of working.”
  2. That perspective is particularly applicable for some buyers, ones that have committed to having significant developer resources available to build exactly what they want. That could be in the form of in-house developers, contract developers, long-term panel arrangements with suppliers or similar
  3. Others, perhaps such as utilities, enterprise and some telcos want to focus on their core business and delegate OSS/BSS configuration and customisation to third-parties.
  4. Some of those rely on COTS (commercial off the shelf) software to leverage the benefits of innovation, cost and development time that have been spread across multiple customers. Their budgets simply don’t allow for custom-built solutions
  5. COTS, be it on-prem through to cloud service models, are almost never going to be a perfect fit for a buyer’s needs. They’re designed to generically suit many buyers, so a certain amount of bloat becomes part of the trade-off
  6. In recent weeks, I’ve seen two entirely in-house developed OSS/BSS. They fit their organisations like a glove and there’s almost no bloat at all. In fact it would be almost impossible for a COTS solution to replace what they’ve built. In both cases it’s taken a decade of ongoing development to get to that position. Most buyers don’t have that amount of time to get it right though unfortunately
  7. Commercial realities imply a pragmatic approach is taken to procurement – which product/s provide default capability that best aligns with the buyer’s most important objectives.
  8. RFPs often get bogged down at the far right-hand side of the long-tail of requirements (where impact tends to be negligible), or in trying to completely re-sculpt the solution to be the perfect fit (that it’s unlikely to ever be)
  9. In my experience at least, the best-fit (not perfect fit) solution, or very short list of solutions, usually becomes apparent fairly quickly [we’ll share more about how we do that tomorrow]. It’s then just a case of testing objectives, assumptions and gaps (eg via a proof-of-concept) and getting to a mutually beneficial commercial agreement
  10. As one respondent in the TM Forum report put it, “The RFP glorifies the process, not the outcome.” A healthy dose of outcome-driven pragmatism helps to reduce glorification of the RFP process
  11. Also in my experience at least, scope of works quotes from vendors (which RFPs tend to lead to) tend to be written in a waterfall style that don’t fit into Agile frameworks very effectively. That can be partially overcome by slicing and dicing the SoW in ways that are more conducive to Agile delivery
  12. With so much fragmentation in the OSS/BSS market already (there are over 400 in our vendor directory), that means the talent pool of creators is thinly spread. Many of those 400 have duplicated functionality, which isn’t great for the industry’s overall progress. Custom development for each different buyer spreads the talent pool even further… unless buyers can get economies of development scale through shared platforms like ONAP

In summary, I love the concept of avoiding massive procurement events. I still can’t help but think the RFP still fits in there somewhere for many buyers… as long as we ensure we glorify the outcomes and de-emphasise the process. It’s just that we use RFPs like a primitive instrument and inflict blunt-force trauma, rather than using surgical precision.

Lobbying hard for the death penalty for OSS RFPs

Earlier this year, the TM Forum published a really insightful report called, “Time to kill the RFP? Reinventing IT procurement for the 2020s.” There are so many layers to the OSS/BSS procurement discussion and Mark Newman and team have done a fantastic job of capturing them. We’ll expand on a few of those layers in a series of posts this week.

For example, section 2 articulates the typical RFI / RFP / RFQ approach. It’s clear to see why the typical approach is flawed. Yesterday’s post pondered whether procurement events are flawed from the initial KPIs that are set by buyers. Today we’ll take a look at the process that follows.

Two quotes from the TM Forum report frame some of the challenges with RFPs from buyer and seller viewpoints respectively:
QUOTE 1 (Buyer-side) – “CSPs normally distribute RFPs to a group of three to eight suppliers. These are most likely existing suppliers, previous vendors or companies the CSP is aware of through its own technology scouting. Suppliers are likely to include systems integrators who rely on other vendors to fulfill elements of the contract, and CSPs tend to invite bidders offering a range of options.
For example, they may invite a supplier that is likely to offer a good price, one that is a ‘safe’, low-risk option, and the incumbent supplier, which in many cases the CSP is looking to replace.
The document itself is likely to be several hundred pages long, a large portion of it comprising details of technology requirements, with suppliers asked to specify whether they comply with each requirement
.”
The question I’d ask about this process is how does the CSP choose 3-8 out of the 400+ vendors that supply the OSS/BSS market? Does their “own technology scouting” adequately discount the hundreds of others that could potentially be best-fit for their needs?

QUOTE 2 (Seller-side) – “We were holed up in our hotel for a month working feverishly on different aspects of the bid. We had 15 people there in total, and we were asked to come in for meetings with five different teams. The meetings go on and on, and you really have no idea when they’re going to finish.”
Let’s do the sums on this situation. 15 people x 25 days x $1500 per day (a round figure that includes accommodation, meals, etc) = $562,500. That’s over half a million dollars just for the seller-side of the post-RFP evaluation phase. Now let’s say there were 4 sellers going through this. [Just a small aside here – reading between the lines, do you suspect the buyer was taking the seller on a journey into the minutiae or focusing on what will move the needle for them? Re-read that through the lens of yesterday’s contrasting KPI perspectives]

You can see exactly why Mark has proposed that it’s, “Time to kill the RFP,” at least in its traditional form. These two quotes lobby hard for the death penalty. More on that tomorrow!

Also note that another hint was contained above in the lead-up to a project launch on Monday that we’re really excited about.