OSS – just in time rather than just in case

We all know that once installed, OSS tend to stay in place for many years. Too much effort to air-lift in. Too much effort to air-lift back out, especially if tightly integrated over time.

The monolithic COTS (commercial off-the-shelf) tools of the past would generally be commissioned and customised during the initial implementation project, with occasional integrations thereafter. That meant we needed to plan out what functionality might be required in future years and ask for it to be implemented, just in case. Between the baked-in functionality that is never needed and the just-in-case features that are possibly never used, we ended up with a lot of bloat in our OSS.

With the current approach of implementing core OSS building blocks, then utilising rapid release and microservice techniques, we have an ongoing enhancement train. This provides us with an opportunity to build just in time, to build only functionality that we know to be essential.

This has pluses and minuses. On the plus side, we have more opportunity to restrict delivery to only what’s needed. On the minus side, a just in time mindset can breed a stop-gap culture rather than strategic, long-term thinking. It’s always good to have long-term thinkers / planners on the team to steer the rapid release implementations (and reductions / refactoring) and avoid introducing a new cause of bloat.

Would you ever alarm your lab equipment?

Something curious dawned on me the other day – I wondered how many people / organisations actively manage alarms / alerts being generated by their lab equipment?

At first glance, this would seem silly. Lab environments are in constant flux, in all sorts of semi-configured situations, and therefore likely to be alarming their heads off at different times.

As such, it would seem even sillier to send alarms, syslogs, performance counters, etc from your lab boxes through to your production alarm management platform. Events on your lab equipment simply shouldn’t be distracting your NOC teams from handling events on your active network, right?

But here’s why I’m curious. In a word: DevOps. Does the DevOps model now mean that some lab equipment stays in a relatively stable state and performs near-mission-critical activities like automated regression testing? Therefore, does it make sense to selectively monitor / manage some lab equipment?

In what other scenarios might we wish to send lab alarms to the NOC (and I’m not talking about system integration testing type scenarios, but ongoing operational scenarios)?
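
For those who do decide to manage a subset of lab equipment, one lightweight approach is to whitelist only the nominated lab devices (eg the DevOps regression rigs) at the event collection / mediation layer, so that everything else in the lab never reaches the production alarm pipeline. The sketch below is purely illustrative – the field names, naming convention and whitelist are assumptions, not any particular mediation product’s API:

    # Illustrative filter: only explicitly whitelisted lab devices reach the NOC.
    MANAGED_LAB_DEVICES = {"lab-ci-router-01", "lab-regression-olt-02"}  # hypothetical names

    def should_forward_to_noc(event: dict) -> bool:
        """Decide whether a raw event should be forwarded to the production alarm pipeline."""
        host = event.get("source_host", "")
        if not host.startswith("lab-"):        # assumption: lab hosts share a naming prefix
            return True                        # production events always flow through
        return host in MANAGED_LAB_DEVICES     # lab events only if explicitly managed

    if __name__ == "__main__":
        events = [
            {"source_host": "core-router-17", "severity": "major"},
            {"source_host": "lab-ci-router-01", "severity": "critical"},  # DevOps regression rig
            {"source_host": "lab-sandbox-99", "severity": "critical"},    # ad-hoc lab box
        ]
        for e in events:
            print(e["source_host"], "->", "NOC" if should_forward_to_noc(e) else "lab-only")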

An OSS doomsday scenario

If I start talking about doomsday scenarios where the global OSS job industry is decimated, most people will immediately jump to the conclusion that I’m predicting an artificial intelligence (AI) takeover. AI could have a role to play, but is not a key facet of the scenario I’m most worried about.
[Image: OSS doomsday scenario]

You’d think that OSS would be quite a niche industry, but there must be thousands of OSS practitioners in my home town of Melbourne alone. That’s partly due to large projects currently being run in Australia by major telcos such as nbn, Telstra, SingTel-Optus and Vodafone, not to mention all the smaller operators. Some of these projects are likely to scale back in coming months / years, meaning fewer seats in a game of OSS musical chairs. But this isn’t the doomsday scenario I’m hinting at in the title either. There will still be many roles at the telcos and the vendors / integrators that support them.

There are hundreds of OSS vendors in the market now, with no single dominant player. It’s a really fragmented market that would appear to be ripe for M&A (mergers and acquisitions). Ripe for consolidation, but massive consolidation is still not the doomsday scenario because there would still be many OSS roles in that situation.

The doomsday scenario I’m talking about is one where only one OSS gains domination globally. But how?

Most traditional telcos have a local geographic footprint, with partners / subsidiaries in other parts of the world, but the costs and regulations of a wired or cellular footprint constrain their ability to reach all corners of the globe. All that uniqueness currently leads to the diversity of OSS offerings we see today. The doomsday scenario arises if one single network operator usurps all the traditional telcos and their legacy network / OSS / BSS stacks in one technological fell swoop.

How could a disruption of that magnitude happen? I’m not going to predict, but a satellite constellation such as SpaceX’s proposed Starlink has some of the hallmarks of such a scenario. By using low-earth orbit (LEO) satellites (ie lower latency than geostationary satellite solutions), point-to-point laser interconnects between them and peering / caching of data in the sky, it could fundamentally change the world of communications and OSS.

It has global reach, no need for carrier interconnect (hence no complex contract negotiations or OSS/BSS integration for that matter), no complicated lead-in negotiations or reinstatements, no long-haul terrestrial or submarine cable systems. None of the traditional factors that cost so much time and money to get customers connected and keep them connected (only the complication of getting and keeping the constellation of birds in the sky – but we’ll put that to the side for now). It would be hard for traditional telcos to compete.

I’m not suggesting that Starlink can or will be THE ubiquitous global communications network. What if Google, AWS or Microsoft added this sort of capability to their strengths in hosting / data? Such a model introduces a new, consistent network stack without the telcos’ tech debt burdens discussed here. The streamlined network model means the variant tree is millions of times simpler. And if the variant tree is that much simpler, so is the operations model and so is the OSS… with one distinct complication: it would need to scale for billions of customers rather than millions, and for trillions of events.

You might be wondering about all the enterprise OSS. Won’t they survive? Probably not. Comms networks are generally just an important means-to-an-end for enterprises. If the one global network provider were to service every organisation with local or global WANs, as well as all the hosting they would need, and hosted zero-touch network operations like Google is already pre-empting, would organisations have a need to build or own an on-premises OSS?

One ubiquitous global network, with a single pared back but hyperscaled OSS, most likely purpose-built with self-healing and/or AI as core constructs (not afterthoughts / retrofits as they are for existing OSS). How many OSS roles would survive that doomsday scenario?

Do you have an alternative OSS doomsday scenario that you’d like to share?

Hat tip again to Jay Fenton for pointing out what Starlink has been up to.

The OSS MoSCoW requirement prioritisation technique

Since the soccer World Cup is currently taking place in Russia, I thought I’d include reference to the MoSCoW technique in today’s blog. It could be used as part of your vendor selection processes for the purpose of OSS requirement prioritisation.

The term MoSCoW itself is an acronym derived from the first letter of each of four prioritization categories (Must have, Should have, Could have, and Won’t have), with the interstitial Os added to make the word pronounceable.”
Wikipedia.

It can be used to rank the importance an OSS operator gives to each of their requirements, such as the sample below:

Index | Description    | MoSCoW
1     | Requirement #1 | Must have (mandatory)
2     | Requirement #2 | Should have (desired)
3     | Requirement #3 | Could have (optional)
4     | Requirement #4 | Won’t have (not required)

But that’s only part of the story in a vendor selection process – the operator’s wish-list. This needs to be cross-referenced with a vendor or solution integrator’s ability to deliver to those wishes. This is where the following compliance codes come in:

  • FC – Fully Compliant – This functionality is available in the baseline installation of the proposed OSS
  • NC – Non-compliant – This functionality is not able to be supported by the proposed OSS
  • PC – Partially Compliant – This functionality is partially supported by the proposed OSS (a vendor description of what is / isn’t supported is required)
  • WC – Will Comply – This functionality is not available in the baseline installation of the proposed software, but will be provided via customisation of the proposed OSS
  • CC – Can Comply – This functionality is possible, but not part of the offer and can be provided at additional cost

So a quick example might look like the following:

Index | Description    | MoSCoW                | Compliance | Comment
1     | Requirement #1 | Must have (mandatory) | FC         |
2     | Requirement #2 | Should have (desired) | PC         | Can only deliver the functionality of 2a, not 2b in this solution

The yellow columns (Index, Description, MoSCoW) are created by the operator / customer; the blue cells (Compliance, Comment) are populated by the vendor / SI. I usually also add blue columns to indicate which product module delivers the compliance, as well as room to pose questions / assumptions back to the customer.
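
If you want to compare responses across vendors, one simple extension (my own convention, not a standard) is to assign numeric weights to each MoSCoW category and each compliance code, then sum the products. All the weights and requirements below are invented for illustration:

    # Illustrative vendor scoring: MoSCoW weight x compliance weight, summed per vendor.
    MOSCOW_WEIGHT = {"Must": 10, "Should": 5, "Could": 2, "Wont": 0}
    COMPLIANCE_WEIGHT = {"FC": 1.0, "WC": 0.7, "CC": 0.5, "PC": 0.4, "NC": 0.0}  # assumed values

    requirements = [
        {"id": 1, "moscow": "Must",   "vendor_a": "FC", "vendor_b": "PC"},
        {"id": 2, "moscow": "Should", "vendor_a": "PC", "vendor_b": "FC"},
        {"id": 3, "moscow": "Could",  "vendor_a": "NC", "vendor_b": "CC"},
    ]

    def score(vendor_key: str) -> float:
        """Sum MoSCoW-weighted compliance scores for one vendor's responses."""
        return sum(MOSCOW_WEIGHT[r["moscow"]] * COMPLIANCE_WEIGHT[r[vendor_key]]
                   for r in requirements)

    if __name__ == "__main__":
        for vendor in ("vendor_a", "vendor_b"):
            print(vendor, "=", round(score(vendor), 1))

In practice, any “Must have” marked NC would usually rule a response out regardless of its total, so treat weighted totals as a tie-breaker rather than the whole answer.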

BTW. I’ve only heard of the MoSCoW acronym recently but have been using a similar technique for years. My prioritisation approach is a little simpler. I just use Mandatory, Preferred, Optional.

The OSS dart-board analogy

The dartboard, by contrast, is not remotely logical, but is somehow brilliant. The 20 sector sits between the dismal scores of five and one. Most players aim for the triple-20, because that’s what professionals do. However, for all but the best darts players, this is a mistake. If you are not very good at darts, your best opening approach is not to aim at triple-20 at all. Instead, aim at the south-west quadrant of the board, towards 19 and 16. You won’t get 180 that way, but nor will you score three. It’s a common mistake in darts to assume you should simply aim for the highest possible score. You should also consider the consequences if you miss.”
Rory Sutherland, on Wired.

When aggressive corporate goals and metrics are combined with brilliant solution architects, we tend to aim for triple-20 with our OSS solutions, don’t we? The problem is, when it comes to delivery, we don’t tend to have the laser-sharp precision of a professional darts player, do we? No matter how experienced we are, there tend to be hidden surprises – some technical, some personal (or should I say inter-personal?), some contractual, etc – that deflect our aim.

The OSS dart-board analogy asks whether we should set the lofty goal of a triple-20 [yellow circle below], with high risk of dismal results if we miss (think too about the OSS stretch-goal rule); or whether we’re better to target the 19/16 corner of the board [blue circle below], which has scaled back objectives but a corresponding reduction in risk.

[Image: OSS dart-board analogy]
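
To put rough numbers on the trade-off: suppose an amateur aiming at treble-20 only finds the treble occasionally and otherwise sprays into the 20, 1 or 5 beds, whereas an aim at the 19/16 corner mostly lands somewhere in the 19, 16, 7, 8 or 3. The neighbouring numbers are real; the probabilities below are invented purely to illustrate the expected-score-versus-worst-case comparison:

    # Back-of-envelope comparison of two aiming strategies for an inaccurate thrower.
    def stats(outcomes):
        """outcomes: list of (probability, score) pairs. Returns (expected score, worst case)."""
        expected = sum(p * s for p, s in outcomes)
        worst = min(s for p, s in outcomes if p > 0)
        return expected, worst

    # Aiming at treble 20: occasionally brilliant, often dismal (1 or 5 either side of the 20).
    treble_20_aim = [(0.05, 60), (0.35, 20), (0.30, 1), (0.30, 5)]

    # Aiming at the 19/16 corner: rarely spectacular, but the misses (7, 8, 3) aren't disasters.
    corner_aim = [(0.30, 19), (0.25, 16), (0.20, 7), (0.15, 8), (0.10, 3)]

    for name, outcomes in [("treble-20 aim", treble_20_aim), ("19/16 corner aim", corner_aim)]:
        expected, worst = stats(outcomes)
        print(f"{name}: expected {expected:.1f} per dart, worst realistic outcome {worst}")

With these (invented) accuracy figures the corner aim wins on both expected score and downside – the darts equivalent of scaled back objectives with a corresponding reduction in risk.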

Roland Leners posed the following brilliant question, “What if we built OSS and IT systems around people’s willingness to change instead of against corporate goals and metrics? Would the corporation be worse off at the end?” in response to a recent post called, “Did we forget the OSS operating model?”

There are too many facets of Roland’s question to count here, but I suspect that in many cases the corporate goals / metrics are akin to the triple-20 focus, whilst the team’s willingness to change aligns to the 19/16 corner. And that is bound to reduce delivery risk.

I’d love to hear your thoughts!!

Using OSS machine learning to predict backwards not forwards

There’s a lot of excitement about what machine-led decisioning can introduce into the world of network operations, and rightly so. Excitement about predictions, automation, efficiency, optimisation, zero-touch assurance, etc.

There are so many use-cases that disruptors are proposing to solve using Artificial Intelligence (AI), Machine Learning (ML) and the like. I might have even been guilty of proposing a few ideas here on the PAOSS blog – ideas like closed-loop OSS learning / optimisation of common processes, the use of Robotic Process Automation (RPA) to reduce swivel-chairing and a few more.

But have you ever noticed that most of these use-cases are forward-looking? That is, using past data to look into the future (eg predictions, improvements, etc). What if we took a slightly different approach and used past data to reconcile what we’ve just done?

AI and ML models tend to rely on lots of consistent data. OSS produce or collect lots of consistent data, so we get a big tick there. The only problem is that a lot of our manually created OSS data is riddled with inconsistency, due to nuances in the way that each worker performs workflows, data entry, etc as well as our own personal biases and perceptions.

For example, zero-touch automation has significant cachet at the moment. To achieve that, we need network health indicators (eg alarms, logs, performance metrics), but also a record of interventions that have (hopefully) improved on those indicators. If we have inconsistent or erroneous trouble-ticketing information, our AI is going to struggle to figure out the right response to automate. Network operations teams tend to want to fix problems quickly, get the ticketing data populated as fast as possible and move on to the next fault, even if that means creating inconsistent ticketing data.

So, to get back to looking backwards, a machine-learning use-case to consider today is to look at what an operator has solved, compare it to past resolutions and either validate the accuracy of their resolution data or even auto-populate fields based on the operator’s actions.
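
As a purely illustrative sketch of that backwards-looking idea, the snippet below compares a newly closed ticket’s free-text resolution against past tickets (TF-IDF plus cosine similarity) and flags the ticket if the resolution code the operator selected doesn’t match the code of the most similar historical resolution. The ticket fields, codes and corpus are all hypothetical; a real implementation would train on your own ticket history and much richer features (actions taken, alarms cleared, config diffs, etc):

    # Hedged sketch: sanity-check a ticket's resolution code against similar past tickets.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical historical tickets: (resolution free-text, resolution code)
    history = [
        ("replaced faulty SFP on core router port", "HW_REPLACE"),
        ("optical transceiver swapped after light levels degraded", "HW_REPLACE"),
        ("re-seated line card and cleared alarm", "HW_RESEAT"),
        ("corrected BGP neighbour config and session re-established", "CONFIG_FIX"),
        ("rolled back erroneous ACL change", "CONFIG_FIX"),
    ]

    vectoriser = TfidfVectorizer()
    history_matrix = vectoriser.fit_transform([text for text, _ in history])

    def suggest_code(resolution_text: str) -> str:
        """Return the resolution code of the most similar past ticket."""
        similarities = cosine_similarity(vectoriser.transform([resolution_text]), history_matrix)[0]
        return history[similarities.argmax()][1]

    if __name__ == "__main__":
        new_text = "swapped out failed optical module on PE router"
        entered_code = "CONFIG_FIX"                 # what the operator hurriedly selected
        suggested = suggest_code(new_text)
        if suggested != entered_code:
            print(f"Flag for review: text looks like {suggested}, but {entered_code} was entered")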

Hat tip to Jay Fenton for the idea seed for this post.

The OSS farm equipment analogy

[Image: OSS end of financial year]
It’s an interesting season as we come up to the EOFY (end of financial year – on 30 June). Budget cycles are coming to an end. At organisations that don’t carry un-spent budgets into the next financial year, the looming EOFY triggers a use-it-or-lose-it mindset.

In some cases, organisations are almost forced to allocate funds to OSS investments even if they haven’t always had the time to identify requirements and / or model detailed return projections. That’s normally anathema to me because an OSS’s reputation is determined by the demonstrable value it creates for years to come. However, I can completely understand a client’s short-term objectives. The challenge we face is to minimise any risk of short-term spend conflicting with long-term objectives.

I take the perspective of allocating funds to build the most generally useful asset (BTW, I like Robert Kiyosaki’s simple definition of an asset: “in reality, an asset is only something that puts money in your pocket”). In the case of OSS, putting money in one’s pocket needs to consider earnings [or cost reductions] that exceed outgoings such as maintenance, licensing, operations, etc, as well as the cost of capital. Not a trivial task!

So this is where the farm equipment analogy comes in.

If we haven’t had the chance to conduct demand estimation (eg does the telco’s market want the equivalent of wheat, rice, stone fruit, etc) or product mix modelling (ie which mix of those products will bear optimal returns), then it becomes hard to predict what type of machinery is the best fit for our future crops. If we haven’t confirmed that we’ll focus efforts on wheat, then it could be a gamble to invest big in a combine harvester (yet). We probably also don’t want to invest capital and ongoing maintenance on a fruit tree shaker if our trees won’t begin bearing fruit for another few years.

Therefore, a safer investment recommendation would be on a general-purpose machine that is most likely to be useful for any type of crop (eg a tractor).

In OSS terminology, if you’re not sure if your product mix will provision 100 customers a day or 100,000 then it could be a little risky to invest in an off-the-shelf orchestration / provisioning engine. Still potentially risky, but less so, would be to invest in a resource and service inventory solution (if you have a lot of network assets), alarm management tools (if you process a lot of alarms), service order entry, workforce management, etc.

Having said that, a lot of operators already have a strong gut-feel for where they intend to get returns on their investment. They may not have done the numbers extensively, but they know their market roadmap. If wheat is your specialty, go ahead and get the combine harvester.

I’d love to get your take on this analogy. How do you invest capital in your OSS without being sure of the projections (given that we’re never sure of projections becoming reality)?

Did we forget the OSS operating model?

When we have a big OSS transformation to undertake, we tend to start with the use cases / requirements, work our way through the technical solution and build up an implementation plan before delivering it (yes, I’ve heavily reduced the real number of steps there!).

However, we sometimes overlook the organisational change management part. That’s the process of getting the customer’s organisation aligned to assist with the transformation, not to mention being fully skilled up to accept handover into operations. I’ve seen OSS projects that were nearly perfect technically, but ultimately doomed because the customer wasn’t ready to accept handover. Seasoned OSS veterans probably already have plans in place for handling organisational change through stakeholder management, training, testing, thorough handover-to-ops processes, etc. You can find some hints on the Free Stuff pages here on PAOSS.

In addition, long-time readers here on PAOSS have probably already seen a few posts about organisational management, but there’s a new gotcha that I’d like to add to the mix today – the changing operating model. This one is often overlooked. The changes made in a next-gen OSS can often have a profound impact on the to-be organisation chart. Roles and responsibilities that used to be clearly defined now become blurred or made obsolete by the new solution.

This is particularly true for modern delivery models where cloud, virtualisation, as-a-service, etc change the dynamic. Demarcation points between IT, operations, networks, marketing, products, third-party suppliers, etc can need complete reconsideration. The most challenging part about understanding the re-mapping of operating models is that we often can’t even predict what they will be until we start using the new solution and refining our processes in-flight. We can start with a RACI and a bunch of “what if?” assumptions / scenarios to capture new operational mappings, but you can almost bet that it will need ongoing refinement.
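
As a trivial illustration of that starting point, here’s one hypothetical “what if?” scenario (a virtualised network function failing on a third-party cloud) captured as RACI data, with a sanity check that every task has exactly one Accountable party. The roles, tasks and assignments are invented; the point is simply that the mapping can be made explicit and revisited as the operating model settles:

    # Hypothetical RACI fragment for one "what if?" scenario: VNF failure on third-party cloud.
    # R = Responsible, A = Accountable, C = Consulted, I = Informed.
    raci = {
        "Detect VNF failure":        {"NOC": "A", "Cloud provider": "R", "IT Ops": "I"},
        "Redeploy VNF instance":     {"NOC": "I", "Cloud provider": "R", "IT Ops": "A"},
        "Notify impacted customers": {"NOC": "R", "Service desk": "A", "Marketing": "C"},
    }

    def check_single_accountable(matrix: dict) -> None:
        """Flag any task that doesn't have exactly one Accountable role."""
        for task, roles in matrix.items():
            accountable = [role for role, code in roles.items() if code == "A"]
            if len(accountable) == 1:
                print(f"OK: '{task}' -> Accountable: {accountable[0]}")
            else:
                print(f"Review needed: '{task}' has {len(accountable)} Accountable roles")

    if __name__ == "__main__":
        check_single_accountable(raci)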

Microsoft to acquire GitHub

Microsoft to acquire GitHub for $7.5 billion.

Microsoft Corp. announced it has reached an agreement to acquire GitHub, the world’s leading software development platform where more than 28 million developers learn, share and collaborate to create the future. Together, the two companies will empower developers to achieve more at every stage of the development lifecycle, accelerate enterprise use of GitHub, and bring Microsoft’s developer tools and services to new audiences.

“Microsoft is a developer-first company, and by joining forces with GitHub we strengthen our commitment to developer freedom, openness and innovation,” said Satya Nadella, CEO, Microsoft. “We recognize the community responsibility we take on with this agreement and will do our best work to empower every developer to build, innovate and solve the world’s most pressing challenges.”

Under the terms of the agreement, Microsoft will acquire GitHub for $7.5 billion in Microsoft stock. Subject to customary closing conditions and completion of regulatory review, the acquisition is expected to close by the end of the calendar year.

GitHub will retain its developer-first ethos and will operate independently to provide an open platform for all developers in all industries. Developers will continue to be able to use the programming languages, tools and operating systems of their choice for their projects — and will still be able to deploy their code to any operating system, any cloud and any device.

Microsoft Corporate Vice President Nat Friedman, founder of Xamarin and an open source veteran, will assume the role of GitHub CEO. GitHub’s current CEO, Chris Wanstrath, will become a Microsoft technical fellow, reporting to Executive Vice President Scott Guthrie, to work on strategic software initiatives.

“I’m extremely proud of what GitHub and our community have accomplished over the past decade, and I can’t wait to see what lies ahead. The future of software development is bright, and I’m thrilled to be joining forces with Microsoft to help make it a reality,” Wanstrath said. “Their focus on developers lines up perfectly with our own, and their scale, tools and global cloud will play a huge role in making GitHub even more valuable for developers everywhere.”

Today, every company is becoming a software company and developers are at the center of digital transformation; they drive business processes and functions across organizations from customer service and HR to marketing and IT. And the choices these developers make will increasingly determine value creation and growth across every industry. GitHub is home for modern developers and the world’s most popular destination for open source projects and software innovation. The platform hosts a growing network of developers in nearly every country representing more than 1.5 million companies across healthcare, manufacturing, technology, financial services, retail and more.

Upon closing, Microsoft expects GitHub’s financials to be reported as part of the Intelligent Cloud segment. Microsoft expects the acquisition will be accretive to operating income in fiscal year 2020 on a non-GAAP basis, and to have minimal dilution of less than 1 percent to earnings per share in fiscal years 2019 and 2020 on a non-GAAP basis, based on the expected close time frame. Non-GAAP excludes expected impact of purchase accounting adjustments, as well as integration and transaction-related expenses. An incremental share buyback, beyond Microsoft’s recent historical quarterly pace, is expected to offset stock consideration paid within six months after closing. Microsoft will use a portion of the remaining ~$30 billion of its current share repurchase authorization for the purchase.

1.045 Trillion reasons to re-consider your OSS strategy

The global Internet of Things (IoT) market will be worth $1.1 trillion in revenue by 2025 as market value shifts from connectivity to platforms, applications and services. By that point, there will be more than 25 billion IoT connections (cellular and non-cellular), driven largely by growth in the industrial IoT market. The Asia Pacific region is forecast to become the largest global IoT region in terms of both connections and revenue.
Although connectivity revenue will grow over the period, it will only account for 5 per cent of the total IoT revenue opportunity by 2025, underscoring the need for operators to expand their capabilities beyond connectivity in order to capture a greater share of market value.”
GSMA Intelligence, referred to here.

Let’s look at these projected numbers. The GSMA Intelligence report forecasts that only 5 cents in every dollar of IoT spend (of a $1.1T market opportunity) will be allocated to connectivity. That leaves the other 95 per cent – $1.045T – on the table if network operators just focus on connectivity.

Traditional OSS tend to focus on managing connectivity – less so on managing marketplaces, customer-facing platforms and applications. Does that headline number – $1.045T – provide you with an incentive to re-consider what your OSS manages and future use cases?

[Image: IoT OSS market opportunity]

IoT requires slightly different OSS thinking:

  • Rather than integrating to a (relatively) small number of device types, IoT will have an almost infinite number of sensor types from a huge range of suppliers.
  • Rather than managing devices individually, their sheer volume means that devices will need to be increasingly managed in cohorts via policy controls (see the sketch after this list).
  • Rather than a fairly narrow set of network-comms based services, functionality explodes into diverse areas like metering, vehicle fleets, health-care, manufacturing, asset controls, etc, so IoT controllers will need to be developed by a much longer tail of suppliers (meaning open development platforms and/or scalable certification processes to integrate into the IoT controller platforms).
  • There are undoubtedly many, many additional differences.
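
To illustrate the cohort / policy bullet above, here’s a hypothetical policy applied to a whole class of sensors rather than to individual devices. The device attributes, selectors and actions are invented – real IoT device-management platforms each have their own policy models – but the pattern (select a cohort, apply rules, emit actions) is the point:

    # Hypothetical cohort policy: act on a class of devices, never on individuals.
    devices = [
        {"id": "meter-0001", "type": "power-meter", "fw": "1.2", "battery_pct": 81},
        {"id": "meter-0002", "type": "power-meter", "fw": "1.0", "battery_pct": 12},
        {"id": "tracker-07", "type": "vehicle-gps", "fw": "3.4", "battery_pct": 55},
    ]

    policy = {
        "name": "power-meter-maintenance",
        "cohort": lambda d: d["type"] == "power-meter",      # cohort selector
        "rules": [
            # simplistic string comparison of firmware versions - illustration only
            {"when": lambda d: d["fw"] < "1.2", "action": "schedule-fw-upgrade"},
            {"when": lambda d: d["battery_pct"] < 20, "action": "raise-battery-workorder"},
        ],
    }

    def apply_policy(policy: dict, fleet: list) -> None:
        """Evaluate a cohort policy across the fleet and print the resulting actions."""
        for device in filter(policy["cohort"], fleet):
            for rule in policy["rules"]:
                if rule["when"](device):
                    print(f"{policy['name']}: {rule['action']} -> {device['id']}")

    if __name__ == "__main__":
        apply_policy(policy, devices)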

Caveat: I haven’t evaluated the claims / numbers in the GSMA Intelligence report. This blog is just to prompt a thought-experiment around hypothetical projections.

The paint the fence automation analogy

There are so many actions that could be automated by / with / in our OSS. It can be hard to know where to start, can’t it? One approach is to look at where the largest amounts of manual effort are being expended by operators. Another way is to employ the “paint the fence” analogy.

When envisaging fulfilment workflows, it’s easiest to picture actions that start with a customer and wipe down through the OSS / BSS stack.

When envisaging assurance workflows, it’s easiest to picture actions that start in the network and wipe up through the OSS / BSS stack.
[Image: Paint the fence OSS analogy]

Of course there are exceptions to these rules, but to go a step further, wipe down = revenue, wipe up = costs. We want to optimise both through automation of course.

Like ensuring paint coverage when painting a fence, OSS automation has the potential to best improve Customer Experience coverage when we use brushstrokes down and up.

On the downstroke, it’s through faster service activations, quotes, response times, etc. On the upstroke, it’s through network reliability (downtime reduction), preventative maintenance, expedited notifications, etc.

You’ll notice that these are indicators that are favourable to the customers. I’m sure it won’t take much sleuthing to see the association to trailing metrics that are favourable to the network operators though, right?

How economies of unscale change the OSS landscape

For more than a century, economies of scale made the corporation an ideal engine of business. But now, a flurry of important new technologies, accelerated by artificial intelligence (AI), is turning economies of scale inside out. Business in the century ahead will be driven by economies of unscale, in which the traditional competitive advantages of size are turned on their head.
Economies of unscale are enabled by two complementary market forces: the emergence of platforms and technologies that can be rented as needed. These developments have eroded the powerful inverse relationship between fixed costs and output that defined economies of scale. Now, small, unscaled companies can pursue niche markets and successfully challenge large companies that are weighed down by decades of investment in scale — in mass production, distribution, and marketing.”
Hemant Taneja with Kevin Maney, in their Sloan Review article, “The End of Scale.”

There are two pathways by which I can envisage OSS playing a part in the economies of unscale indicated in the Sloan Review quote above.

The first is the changing way of working, towards smaller, more nimble organisations and increased freelancing. There are already many modularised activities managed within an OSS – such as field work, designs and third-party service bundling – where unscale is potentially an advantage. OSS natively manage all these modules with existing tools, whether that’s ticketing, orchestration, provisioning, design, billing, contract management, etc.

Add smart contract management and John Reilly’s value fabric will undoubtedly increase in prevalence. John states that a value fabric is a mesh of interwoven, cooperating organizations and individuals, called parties, who directly or indirectly deliver value to customers. It gives the large, traditional network operators the chance to be more creative in their use of third parties when they look beyond their “Not Invented Here” syndrome of the past. It also provides the opportunity to develop innovative supply and procurement chains (meshes) that can generate strategic competitive advantage.

The second comes with an increasing openness to using third-party platforms and open-source OSS tools within operator environments. The OSS market is already highly fragmented, from multi-billion dollar companies (by market capitalisation) through to niche, even hobby, projects. However, there tended to be barriers to entry for the small or hobbyist OSS provider – they either couldn’t scale their infrastructure or they didn’t hold the credibility mandated by risk-averse network operators.

As-a-Service platforms have changed the scale dynamic because they now allow OSS developers to rent infrastructure on a pay-as-you-eat model. In other words, the more their customers consume, the more infrastructure an OSS supplier can afford to rent from platforms such as AWS. More importantly, this becomes possible because operators are now increasingly open to renting third-party services on shared (but compartmentalised / virtualised) infrastructure. BTW. When I say “infrastructure” here, I’m not just talking about compute / network / storage, but also virtualisation, containerisation, databases, AI, etc.

Similarly, the credibility barrier-to-entry is being pulled down like the Berlin Wall as operators increasingly invest in open-source projects. There are large open-source OSS projects / platforms being driven by the carriers themselves (eg ONAP, OpenStack, OPNFV, etc) that are accommodative of smaller plug-in modules. Unlike the proprietary, monolithic OSS/BSS stacks of the past, these platforms are designed with collaboration and integration front-of-mind.

However, there’s an element of “potential” in these economies of unscale. Andreas Hegers likens open-source to the wild west, with many settlers seeking to claim their patch of real-estate on an uncharted map. Andreas states further, “In theory, vendor interoperability from open source should be convenient — even harmonious — with innovations being shared like recipes. Unfortunately for many, the system has not lived up to this reality.”

Where do you sit on the potential of economies of unscale and open-source OSS?

OSS / BSS security getting a little cloudy

Many systems are moving beyond simple virtualization and are being run on dynamic private or even public clouds. CSPs will migrate many to hybrid clouds because of concerns about data security and regulations on where data are stored and processed.
We believe that over the next 15 years, nearly all software systems will migrate to clouds provided by third parties and be whatever cloud native becomes when it matures. They will incorporate many open-source tools and middleware packages, and may include some major open-source platforms or sub-systems (for example, the size of OpenStack or ONAP today).”
Dr Mark H Mortensen, in an article entitled, “BSS and OSS are moving to the cloud: Analysys Mason” on Telecom Asia.

Dr Mortensen raises a number of other points relating to cloud models for OSS and BSS in the article linked above, including definitions of various cloud / virtualisation-related terms.

He also rightly points out that many OSS / BSS vendors are seeking to move to virtualised / cloud / as-a-Service delivery models (for reasons including maintainability, scalability, repeatability and other “ilities”).

The part that I find interesting with cloud models (I’ll use the term generically) is the positioning of the security control point(s). Let’s start by assuming a scenario where:

  1. The Active Network (AN) is “on-net” – the network that carries live customer traffic (the routers, switches, muxes, etc) is managed by the CSP / operator [Noting though, that these too are possibly managed as virtual entities rather than owned].
  2. The “cloud” OSS/BSS is “off-net” – some vendors will insist on their multi-tenanted OSS/BSS existing within the public cloud.

The diagram below shows three separate realms:

  1. The OSS/BSS “in the cloud”
  2. The operator’s enterprise / DC realm
  3. The operator’s active network realm

as well as the Security Control Points (SCPs) between them.

[Image: OSS / BSS cloud security control points]

The most important consideration of this architecture is that the Active Network remains operational (ie continues to carry customer traffic) even if the link to the DC and/or the link to the cloud is lost.

With that in mind, our second consideration is what aspects of network management need to reside within the AN realm. It’s not just the Active Network devices, but anything else that allows the AN to operate in an isolated state. This means that shared services like NTP / synch need a presence in the AN realm (even if not of the highest stratum within the operator’s time-synch solution).

What about Element Managers (EMS) that look after the AN devices? How about collectors / probes? How about telemetry data stores? How about network health management tools like alarm and performance management? How about user access management (LDAP, AD, IAM, etc)? Do they exist in the AN or DC realm?

Then if we step up the stack a little to what I refer to as East-West OSS / BSS tools like ticket management, workforce management, even inventory management – do we collect, process, store and manage these within the DC or are we prepared to shift any of this functionality / data out to the cloud? Or do we prefer it to remain in the AN realm and ensure only AN privileged users have access?

Which OSS / BSS tools remain on-net (perhaps as private cloud) and which can (or must) be managed off-net (public cloud)?
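
One way to make those placement decisions explicit (and testable) is to record each function’s realm and then check that nothing the Active Network depends on for isolated operation sits on the far side of an SCP. The function list, placements and survivability assumptions below are invented for illustration only:

    # Simplified placement model across the three realms, plus a survivability check:
    # can the AN keep operating if the links to the DC and cloud realms are cut?
    placement = {
        "NTP / synch (local stratum)": "AN",
        "Element managers (EMS)":      "AN",
        "Telemetry collectors":        "DC",
        "Alarm management":            "DC",
        "Ticket / workforce mgmt":     "CLOUD",
        "Service order entry":         "CLOUD",
    }

    # Functions the AN must still reach when isolated from DC and cloud (assumed list).
    needed_when_isolated = ["NTP / synch (local stratum)", "Element managers (EMS)", "Alarm management"]

    def check_isolation_survivability() -> None:
        for function in needed_when_isolated:
            realm = placement.get(function, "UNKNOWN")
            status = "OK" if realm == "AN" else f"RISK: lives in the {realm} realm"
            print(f"{function}: {status}")

    if __name__ == "__main__":
        check_isolation_survivability()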

Climb further up the stack and we get into the interesting part of cloud offerings. Not only do we potentially have the OSS/BSS (including East-West tools), but more excitingly, we could bring in services or solutions like content from external providers and bundle them with our own offerings.

We often hear about the tight security that’s expected (and offered) as part of the vendor OSS/BSS cloud solutions, but as you see, the tougher consideration for network management architects is actually the end-to-end security and where to position the security control points relative to all the pieces of the OSS/BSS stack.

A new phenomenon for IT

In the past, business-oriented groups have had ideas about what they want to do and then they come to us… Now, they want to know what technology can bring to the table and then they’ll work on the business plan.
So there’s a big gap here. It’s a phenomenon that’s been happening in the last year and it’s an uncomfortable place for IT. We’re not used to having to lead in that way. We have been more in the order-taker business.

Veenod Kurup, Liberty Global, Group CIO.

That’s a really thought-provoking insight from Veenod, isn’t it? Technology driving the business rather than business driving the technology. Technology as the business advantage.

I’ll be honest here – I never thought I’d see that day although I… guess… as e-business increases, the dependency on tech increases in lockstep. I’m passionate about tech, but also of the opinion that the tech is only a means to an end.

So if what Veenod says is reflective of a macro-trend across all industries, then he’s right in saying that we’re going to have some very uncomfortable situations for some tech experts. Many will have to widen their field of view from a tech-only vision to a business vision.

Maybe instead the business-oriented groups could just come to the OSS / BSS department and speak with our valuable tripods. After all, we own and run the information and systems where business (BSS) meets technology (OSS) right? Or as previously reiterated on this blog, we have the opportunity to take the initiative and demonstrate the business value within our OSS / BSS.

PS. Hat-tip to Dawn Bushaus for unearthing the quote from Veenod in her article on Inform.

Dematerialisation of OSS

In 1972, the Club of Rome in its report The Limits to Growth predicted a steadily increasing demand for material as both economies and populations grew. The report predicted that continually increasing resource demand would eventually lead to an abrupt economic collapse. Studies on material use and economic growth show instead that society is gaining the same economic growth with much less physical material required. Between 1977 and 2001, the amount of material required to meet all needs of Americans fell from 1.18 trillion pounds to 1.08 trillion pounds, even though the country’s population increased by 55 million people. Al Gore similarly noted in 1999 that since 1949, while the economy tripled, the weight of goods produced did not change.
Wikipedia on the topic of Dematerialisation.

The weight of OSS transaction volumes appears to be increasing year on year as we add more stuff to our OSS. The touchpoint explosion is amplifying this further. Luckily, our platforms / middleware, compute, networks and storage have all been scaling as well, so the increased weight has not been as noticeable as it might have been (even though we’ve all worked on OSS that have been buckling under the weight of transaction volumes, right?).

Does it also make sense that when there is an incremental cost per transaction (eg via the increasingly prevalent cloud or “as a service” offerings), we pay closer attention to transaction volumes because the cost is more visible to us, but not for “internal” transactions where there is little perceived incremental cost?

But it’s not so much the transaction processing volumes that are the problem directly. It’s more by implication. For each additional transaction there’s the risk of a hand-off being missed or mis-mapped or slowing down overall activity processing times. For each additional transaction type, there’s additional mapping, testing and regression testing effort as well as an increased risk of things going wrong.

Do you measure transaction flow volumes across your entire OSS suite? Does it provide an indication of where efficiency optimisation (ie dematerialisation) could occur and guide your re-factoring investments? Does it guide you on process optimisation efforts?
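
If you’re not measuring this today, even a blunt tally of transactions per hand-off point (source system to destination system) pulled from integration logs will show where the weight is accumulating. A toy sketch with invented log fields:

    # Toy sketch: tally transaction volumes per hand-off (source -> destination system).
    from collections import Counter

    log_records = [
        {"src": "order-entry", "dst": "inventory",  "type": "reserve-resource"},
        {"src": "order-entry", "dst": "inventory",  "type": "reserve-resource"},
        {"src": "inventory",   "dst": "activation", "type": "activate-service"},
        {"src": "assurance",   "dst": "ticketing",  "type": "raise-ticket"},
        {"src": "order-entry", "dst": "inventory",  "type": "reserve-resource"},
    ]

    handoff_volumes = Counter((r["src"], r["dst"]) for r in log_records)

    for (src, dst), count in handoff_volumes.most_common():
        print(f"{src} -> {dst}: {count} transactions")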

Vulnerability in OSS

All over the world – from America’s National Football League (NFL) to the National Basketball Association (NBA), from our own AFL to NRL – athletes and coaches are cultivating club cultures in which tales of personal hardship and woe are welcome, even desirable. All are clamouring to embrace the biggest buzzword in professional sport: vulnerability.
The most publicised incarnation of this shift was the “Triple H” sessions used at AFL winners Richmond last year, where once a fortnight a player stood and shared three personal stories about a Hero, Hardship and Highlight from their life.”
Konrad Marshall, GoodWeekend.

I’ve just done a quick rummage through the OSS and/or tech-related books in my bookshelves. Would you like to know what I noticed? None mention teams, teamwork or culture, let alone the V-word quoted above – vulnerability. Funny that.

I say funny, because each of the highest-performing teams I’ve worked with have also had great team culture. Conversely, the worst-performing teams I’ve worked with have seen ego overpower vulnerability and empathy. Does that resonate with your experiences too?

Professor Amy Edmondson of Harvard Business School refers to psychological safety as, “a team climate characterized by interpersonal trust and mutual respect in which people are comfortable being themselves.” Graeme Cowan states, “We become more open-minded, resilient, motivated, and persistent when we feel safe. Humor increases, as does solution-finding and divergent thinking — the cognitive process underlying creativity.”

Yet why is it that we don’t seem to rate team factors when it comes to OSS delivery? “Team bonding” in stylishly inverted commas can come across as a bit ridiculous, but informal culture building seems to be more valuable than any of the technical alignment workshops we tend to build into our project plans.

OSS are built by teams, for teams, clearly. They’re often built in politically charged situations. They’re also usually built in highly complex environments, where complexities abound in technology and process, but even more so within the people involved. Not only that, but they’re regularly built across divisional lines of business units or organisations over which (hopefully metaphorical) hand grenades can be easily thrown.

Underestimate psychological safety and vulnerability across the entire stakeholder group at your peril on OSS projects. We could benefit from looking outside the walls of OSS, to models used by sporting teams in particular, where team culture is invested in far more heavily because of the proven performance benefits they’ve delivered.

OSS, the great multipliers

Skills multiply labors by two, five, 10, 50, 100 times. You can chop a tree down with a hammer, but it takes about 30 days. That’s called labor. But if you trade the hammer in for an ax, you can chop the tree down in about 30 minutes. What’s the difference in 30 days and 30 minutes? Skills—skills make the difference.”
Jim Rohn, here.

OSS can be great labour multipliers. They can deliver baked-in “skills” that multiply labors by two, five, 10, 50, 100 times. They can be not just a hammer, not just an axe, but a turbo-charged chainsaw.

The more pertinent question to understand with our OSS though is, “Why are we chopping this tree down?” Is it the right tree to achieve a useful outcome? Do the benefits outweigh the cost of the hammer / axe / chainsaw?

What if our really clever OSS engineers come up with a brilliant design that can reduce a task from 30 days to 30 minutes… but it takes us 35 days to design/build/test/deploy the customisation for a once-off use? We’re clearly cutting down the wrong tree.

What if instead we could reduce this same task from 30 days to 1 day with just a quick analysis and process change? It’s nowhere near as sexy or challenging for us OSS engineers though. The very clever 30 minute solution is another case of “just because we can, doesn’t mean we should.”
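
The arithmetic behind “the wrong tree” is worth spelling out: a customisation only pays off when its one-off build effort is recovered across the number of times the task will actually be repeated. A toy comparison using the figures from the paragraphs above (the 2-day effort for the process change and the repetition counts are assumptions):

    # Toy break-even comparison of three options for a task currently taking 30 days.
    options = [
        # (name, one-off build effort in days, task duration in days after the change)
        ("Do nothing",                0,  30),
        ("Process change",            2,   1),
        ("Clever OSS customisation", 35,   0.02),   # 30 minutes is roughly 0.02 days
    ]

    def total_cost(build_days: float, task_days: float, repetitions: int) -> float:
        """Total effort = one-off build cost + per-execution cost x repetitions."""
        return build_days + task_days * repetitions

    for repetitions in (1, 10, 100):
        print(f"Task repeated {repetitions} time(s):")
        for name, build, task in options:
            print(f"  {name}: {total_cost(build, task, repetitions):.1f} days total")

For a once-off task the 35-day customisation clearly loses; repeat the task often enough and it eventually wins, which is exactly the question to ask before cutting down the tree.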

The strangler fig transformation analogy

You’re probably familiar with strangler figs, which grow on a host tree, often resulting in the death of the host. You’re probably less familiar with the strangler fig analogy as an OSS transformation or cutover model.

The concept is that there is a “host tree” (ie legacy system) that needs to be obsoleted and replaced, but it’s so dominant and integral (eg because of complex and/or meshed integrations) that a big-bang replacement is unviable (eg due to risk, costs, etc).

The strangler fig (ie new solution) is developed in parallel to the host tree and is progressively grown over time. Generally, it grows through step-wise enhancement / replacement. This approach is best suited to scenarios where there are lots of transaction types, fault types, process types, use-cases, etc that can be systematically switched from host to strangler.
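
In software terms, this is commonly implemented as a thin routing facade in front of the host system: each transaction / fault / process type is switched to the new solution one release at a time, while everything not yet migrated keeps flowing to the legacy stack. A minimal sketch (handler names and transaction types are placeholders):

    # Minimal strangler-fig routing facade: migrate transaction types one at a time.
    def legacy_handler(txn: dict) -> str:
        return f"legacy system handled {txn['type']} #{txn['id']}"

    def new_handler(txn: dict) -> str:
        return f"new system handled {txn['type']} #{txn['id']}"

    # Grown progressively: each release adds more transaction types to the strangler.
    MIGRATED_TYPES = {"modify-service", "cease-service"}

    def route(txn: dict) -> str:
        """Send migrated transaction types to the new solution, everything else to the host."""
        handler = new_handler if txn["type"] in MIGRATED_TYPES else legacy_handler
        return handler(txn)

    if __name__ == "__main__":
        for txn in [{"id": 101, "type": "new-connect"},
                    {"id": 102, "type": "modify-service"}]:
            print(route(txn))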

This approach can also be used for product consolidation (ie multiple products consolidated into one).

Clever use of automated regression testing can help with this evolving cutover approach.

What if every OSS project was a stretch goal?

What if the objectives of every large OSS project were actually perceived as a stretch goal by internal and external stakeholders of the project?

Sim Sitkin et al describe a stretch goal as, “We’re not talking about merely challenging goals. We’re talking about management moon shots—goals that appear unattainable given current practices, skills, and knowledge.”

Dymphna Boholt describes it thus:
The reality is that if everything lands in a project – if every element does what it needs to do when it was supposed to do it, and everything went off without a hitch, that’s actually a stunning success. It’s also incredibly rare.

But we set that impossibly high standard as our benchmark, and then think that everything short of that is a failure.

And in that way, the grading scale should almost be reversed, so that if you put them side by side, it’d look like:

Success = Stunning Success
Partial failure = Total Success
Total Failure = Success
Miserable Failure = Useful learning experience
Total Trainwreck = Failure.”

You know, I’d never, ever looked at OSS projects from this perspective before. I’d always taken the view that if a project I’d worked on didn’t deliver on all expectations (the triple constraint of cost, time, scope/functionality), then it had failed to meet expectations, no matter how big the achievements of the project team may have been.

Dymphna has a really interesting point though. The chances of everything going to plan on a large, complex OSS project are almost zero. There are always challenges to overcome, no matter how skillful the team, how great the planning. It’s why I call it the OctopOSS (just when you think you have all the tentacles tied down, another comes and whacks you on the back of the head).

What if we instead treated the definition of “success” on our OSS projects as an implausibly unlikely stretch goal, to act as a guiding vision for our team? What if we also re-set the expectation benchmark of stakeholders to Dymphna’s right-hand column, not their incumbent expectation at the left?

Would that be letting ourselves off the hook for “failing,” for not meeting our promises, for under-delivering? Or is it fair to resign ourselves to reality rather than delusional benchmarks? Is Dymphna’s right-hand scale effectively like passing an exam only because the bell-curve is used and your classmates have overwhelmingly scored even lower than you?

I’m asking myself, why is it that some of the projects I’m most proud of (of the achievements of my colleagues and me) are also perhaps the biggest failures (on time, cost, scope or customer usability)?

Are you as challenged by Dymphna’s perspective as I am? I’d love to hear your thoughts.

Getting lost in the flow of OSS

The myth is that people play games because they want to avoid challenging work. The reality is, people play games to engage in well-designed, challenging work. The only thing they are avoiding is poorly designed work. In essence, we are replacing poorly designed work with work that provides a more meaningful challenge and offers a richer sense of progress.
And we should note at this point that just because something is a game, it doesn’t mean it’s good. As we’ll soon see, it can be argued that everything is a game. The difference is in the design.
Really good games have been ruthlessly play-tested and calibrated to the point where achieving a state of flow is almost guaranteed for many. Play-testing is just another word for iterative development, which is essentially the conducting of progressive experiments.”
Dr Jason Fox, in his book, “The Game Changer.”

Reflect with me for a moment – when it comes to your OSS activities, in which situations do you consistently get into a state of flow?

For me, it’s in quite a few different scenarios, but one in particular stands out – building up a network model in an inventory management tool. This activity starts with building models / patterns of devices, services, connections, etc, then using the models to build a replica of the network, either manually or via data migration, within the inventory tool(s). I can lose complete track of time when doing this task. In fact I have almost every single time I’ve performed this task.

Whilst not being much of a gamer, I suspect it’s no coincidence that by far my favourite video game genre is empire-building strategy games like the Civilization series. Back in the old days, I could easily get lost in them for hours too. Could we draw a comparison from getting that same sense of achievement, seeing a network (of devices in OSS, of cities in the empire strategy games) grow rapidly as a result of your actions?

What about fans of first-person shooter games? I wonder whether they get into a state of flow on assurance activities, where they get to hunt down and annihilate every fault in their terrain?

What about fans of horse grooming and riding games? Well…. let’s not go there. 🙂

Anyway, enough of all these reflections and musings. I would like to share three concepts with you that relate to Dr Fox’s quote above:

  1. Gamification – I feel that there is MASSIVE scope for gamification of our OSS, but I’ve yet to hear of any OSS developers using game design principles.
  2. Play-testing – How many OSS are you aware of that have been, “ruthlessly play-tested and calibrated?” In almost every OSS situation I’ve seen, as soon as functionality meets requirements, we stop and move on to the next feature. We don’t pause and try a few more variants to see which is most likely to result in a great design, refining the solution, “to the point where achieving a state of flow is almost guaranteed for many.”
  3. Richer Progress – How many of our end-to-end workflows are designed with, “a richer sense of progress” in mind? Feedback tends to come through retrospective reporting (if at all), rarely through the OSS game-play itself. Chances are that our end-to-end processes actually flow through multiple un-related applications, so it comes back to clever integration design to deliver more compelling feedback. We simply don’t use enough specialist creative designers in OSS.