Ooops. The 3GPP network management omission

A recent discussion with a learned and respected OSS colleague reminded me that there had been a major omission from the PAOSS History / Standards page. With the buzz developing around 5G, not to mention some of the advanced features like network slicing and radio infrastructure virtualisation, the oversight was a big one. We’d forgotten to include radio network management standards.

We’ve filled that gap now by adding a section relating to the network and service management standards prepared by 3GPP.

But what’s now concerning me is, “what else is missing?”

Would you mind doing me a favour? Would you like to quickly skim through the link above and let me know if there’s anything else that needs to be added? I know you’re really busy and your time is valuable, so any input you might find time for would be greatly appreciated.

What are OSS “platform wrapper” roadblocks?

OSS can be cumbersome at times. Making change can be difficult. We tend to build layers of protections around them and the networks we manage. I get that. Change can be risky (although the protections are often implemented because the OSS and/or network platforms might not be as robust as they could be).

Contrast this with the OSS we want to create. We want to create a platform for rapid innovation, the platform that helps us and our clients generate opportunities and advantages.

For us to build a platform that allows our customers (and their customers) to revolutionise their markets, we might have to consider whether the protective layers around our OSS are stymying change. Things like firewall rules, change review boards, documentation, approvals, politics, individuals with a reticence to change, etc.

For example, Netflix takes a contrarian, whitelist approach to access by its engineers rather than a blacklist. It assumes that its engineers are professional enough to only use the tools that they need to get their tasks done. They enable their engineers to use commonly off-limits functionality such as adding their own DNS records (ie to support the stand-up of new infrastructure). But they also take a use-it-or-lose-it approach, monitoring the tools that the engineer uses and rescinding access to tools they haven’t used within 90 days. But if they do need access again, it’s as simple as a message on Slack to reinstate it.
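To make that tangible, here’s a minimal sketch (in Python, purely for illustration) of how a use-it-or-lose-it rule like that could be automated. The data structures and function names are my own inventions, not Netflix’s actual tooling.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of a use-it-or-lose-it access policy.
# Names and data shapes are invented; this is not Netflix's real tooling.

UNUSED_LIMIT = timedelta(days=90)

def access_to_rescind(grants, now=None):
    """Return the tool grants that haven't been used within the allowed window."""
    now = now or datetime.utcnow()
    return [g for g in grants if now - g["last_used"] > UNUSED_LIMIT]

grants = [
    {"engineer": "alice", "tool": "dns-records", "last_used": datetime(2019, 1, 2)},
    {"engineer": "alice", "tool": "deploy-pipeline", "last_used": datetime(2019, 4, 1)},
]

for grant in access_to_rescind(grants, now=datetime(2019, 4, 10)):
    # In practice this might raise a ticket or post to a chat channel,
    # so access can be reinstated with a simple message if it's needed again.
    print(f"Rescinding {grant['tool']} for {grant['engineer']} (unused for 90+ days)")
```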

This is just one small example of streamlining the platform wrapper. There are probably a million others.

When working on OSS projects as the integrator / installer, I’ve seen many of these “platform wrapper” roadblocks. I’m sure you have too. If you see them as the installer, chances are the ops team you hand over to will also experience these roadblocks.

Question though. Do you flag these platform wrapper roadblocks for improvement, or do you treat them as non-platform and therefore just live with them?

Only do the OSS that only you can do

A friend of mine has a great saying, “only do what only you can do.”

Do you think that this holds true for the companies undergoing digital transformation? Banks are now IT companies. Insurers are IT companies. Car manufacturers are now IT companies. Telcos are, well, some are IT companies.

We’ve spoken before about the skill transformations that need to happen within telcos if they’re to become IT companies. Some are actively helping their workforce to become more developer-centric. Some of the big telcos that I’ve been assisting in the last few years are embarking on bold Agile-led IT transformations. They’re cutting more of their own code and managing their own IT developments.

That’s exciting news for all of us in OSS. Even if it loses the name OSS in future, telcos will still need software that efficiently operationalises their networks. We have the overlapping skills in software, networks, business and operations.

But I wonder about the longevity of the in-house approach unless we focus clearly on the saying above. If all development is brought in-house, we end up with a lot of duplication across the industry. I’m not really sure it makes sense to do all the heavy-lifting of building custom OSS tools when that heavy-lifting has already been done elsewhere.

It’s the old ebb and flow between in-house and outsourced OSS.

In my very humble opinion, it’s not just a choice between in-house and outsourced that matters. The more important decisions are around choosing to only develop the tools in-house that only you can do (ie the strategic differentiators).

SolarWinds acquires Samanage for $350m

SolarWinds Sets Its Sights on the ITSM Market through Acquisition of Samanage and Introduction of a SolarWinds Service Desk Product.

SolarWinds announced that it has signed an agreement to acquire Samanage, an IT service desk solution company based in Cary, NC. Over the past 7 years, Samanage has built a strong, well-respected product guided by a customer-centricity that aligns well with SolarWinds’ mission and commitment to the technology professional community. SolarWinds plans to add the Samanage products to its IT Operations Management portfolio beginning in Q2 2019. The SaaS-based offering will complement the on-premise products the company offers today to serve the needs of IT organizations at businesses of all sizes – from the SMB to the large enterprise.

“For 20 years, SolarWinds has been committed to making IT look easy by arming technology pros with the powerful tools they need to solve today’s IT management challenges. We do this by responding to well-understood, everyday problems based on input and feedback from our customers and the IT professionals that we serve,” said Kevin Thompson, Chief Executive Officer, SolarWinds. “The IT Service Desk is core to any IT professional’s job and it is something that they interact with every day to serve their employees.”

According to IDC, IT Service Management (ITSM) represents an over $6 billion market today and is forecasted to reach over $8.5 billion by 2023 [1]. This size reflects the evolution of the ITSM market. ITSM is no longer the domain of large enterprises. Businesses of all sizes increasingly depend on technology to achieve optimal levels of productivity and efficiency, and drive business outcomes and success. There are very few providers who are positioned to serve the entire IT market, from small businesses to the Fortune 500, the way that SolarWinds and Samanage do. Mid-market and smaller businesses are underserved in the space, as existing offerings tend to focus on complex enterprise solutions that require dedicated staff and expensive professional services engagements.

Most IT departments continue to use phone (77%) and email (87%) as their main support channels, but by adopting service desk software they could reduce resolution time by 13%, improving not just IT service efficiency, but also employee productivity [2]. This is even more pronounced in small and mid-sized businesses. In a recent SolarWinds survey, IT pros indicated that cost (76%) and ease of use (84%) were the critical, driving factors in the selection of an ITSM offering [3]. This supports the need for a SolarWinds approach to ITSM – powerful, affordable, and easy to use products designed to solve problems the way that IT pros want them solved.

Thompson continued, “We believe that a powerful, market-leading ITSM solution offers us another compelling product to enhance our ability to serve IT professionals in organizations of all sizes while meaningfully expanding our total addressable market, including additional cross-sell opportunities within our large and expanding customer-base of more than 300,000 customers.”

“IT departments increasingly find themselves at the center of employee service and digital business transformation. As IT leaders pursue new technologies to transform their business, they have the ability to grow the role of service management from an IT help desk to intelligent employee service management across all departments,” said Doron Gordon, Founder & CEO, Samanage. “Deploying an employee service management mindset, coupled with an enterprise-wide service desk platform that supports it – like Samanage — can help increase employee productivity and better connect employees to their customers. We are excited about the opportunity to bring our products together with the reach and strength of SolarWinds to enable IT organizations in companies of all sizes to achieve better business outcomes.”

SolarWinds plans to acquire Samanage for a purchase price of $350 million in cash or approximately $329 million net of cash acquired. SolarWinds plans to fund the transaction primarily with its existing cash balance. The transaction is expected to close before the end of Q2 2019. SolarWinds will provide additional details about the acquisition and its expected impact to 2019 financial results on the company’s Q1 2019 Earnings Call scheduled for April 24, 2019.

A single glass of pain or single pane of glass??

Is your OSS a single pane of glass, or a single glass of pain?

You can tell I’m being a little flippant here. People often (perhaps idealistically) talk about OSS as being the single pane of glass (SPOG) to manage a network.

I say “idealistically” for a couple of reasons:

  1. There are usually many personas who interact with an OSS, each with vastly different user interface (UI) needs
  2. There is usually more than one OSS product in a client’s OSS suite, often from different vendors, with varying levels of integration

Where a single pane of glass can be a true ambition is as a consolidated health-status dashboard / portal. Invariably, this portal is used by executive / leader / manager personas who want to quickly see a single-screen health status that covers all networks and/or parts of the OSS suite. When things go wrong, this portal becomes the single glass of pain.

These single panes tend to be heavily customised for each organisation as every one has a unique set of metrics-that-matter. For those designing these panes, the key is to not just include vanity metrics, but to show information that the leader can action.
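To make the “actionable, not vanity” point concrete, here’s a rough sketch of the kind of worst-of rollup such a portal might perform, carrying the driving domain and a suggested action alongside the headline status. The domain names, statuses and actions are invented for the example.

```python
# Illustrative only: a worst-of health rollup for a single-pane portal.
# Domain names, statuses and suggested actions are invented for the example.

SEVERITY = {"OK": 0, "DEGRADED": 1, "OUTAGE": 2}

def rollup(domain_statuses):
    """Collapse per-domain health into one headline status, keeping the
    actionable detail (which domain, what to do) rather than hiding it."""
    worst = max(domain_statuses, key=lambda d: SEVERITY[d["status"]])
    return {
        "overall": worst["status"],
        "driven_by": worst["domain"],
        "suggested_action": worst.get("action", "No action required"),
    }

statuses = [
    {"domain": "Transport", "status": "OK"},
    {"domain": "Mobile Core", "status": "DEGRADED",
     "action": "Check capacity alarms on core sites"},
    {"domain": "OSS Suite", "status": "OK"},
]

print(rollup(statuses))
# {'overall': 'DEGRADED', 'driven_by': 'Mobile Core', 'suggested_action': 'Check capacity alarms on core sites'}
```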

But the interesting perspective here is whether the single glass of pain is even relevant within your organisation’s culture. It’s just my opinion, but I prefer coal-face workers to be empowered to take rapid recovery actions rather than requiring direction from high up in the org-chart. Coal-face workers generally have different tools with UIs that *should* help them monitor, manage and repair super-efficiently.

To get back to the “idealistic” comment above, each OSS UI needs to be fit-for-purpose for each unique persona (eg designers, product owners, network operations, etc). To me this implies that there is no single pane of glass…

I should caveat that by citing the example of an OSS search interface, something I’ve yet to see in OSS… although that’s just a front end to dozens of persona-specific panes of glass.

Unleashing the chaos monkeys on your OSS

I like to compare OSS projects with chaos theory. A single butterfly flapping its wings (eg a conversation with the client) can have unintended consequences that cause a tornado (eg the client’s users refusing to use a new OSS).

The day-to-day operation of a network and its management tools can be similarly sensitive to seemingly minor inputs. We can never predict or test for every combination of knock-on effects. This means that forecasting the future is impossible and failure is inevitable.

If we take these two statements to be true, it perhaps changes the way we engineer our OSS.

How many production OSS (and/or related EMS) do you know of whose operators have to tiptoe around the edges for fear of causing a meltdown? Conversely, how many do you know whose operators would quite happily trigger failures with confidence, knowing that their solution is robust and will recover without perceptibly impacting customers?

How many of you could confidently trigger scheduled or unscheduled outages of various types on your production OSS to introduce the machine learning seeding technique discussed yesterday?

Would you be prepared to unleash the chaos monkeys on your OSS / network like Netflix is prepared to do on its production systems?

Most OSS are designed around known errors, with mechanisms put in place to prevent them. Instead, I wonder whether we should design systems on the assumption that failure is inevitable, so recovery should be both rapid and automated.

It’s a subtle shift in thinking: spend less effort testing every scenario that might lead to OSS failure, and more effort intentionally triggering OSS failures to test for recovery.
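As a thought-starter only, a chaos-style recovery test might look something like the sketch below. The trigger_failure and is_service_healthy hooks are placeholders for whatever fault-injection and health-check mechanisms your environment actually provides, and (as the PS below suggests) a lab or digital twin may be the only safe place to run it.

```python
import time

# Sketch of a chaos-style recovery test: intentionally inject a failure,
# then check the platform recovers automatically within a target window.
# trigger_failure() and is_service_healthy() are placeholders for whatever
# fault-injection and health-check hooks your OSS / lab / digital twin offers.

RECOVERY_TARGET_SECS = 300

def run_recovery_test(trigger_failure, is_service_healthy, poll_secs=10):
    trigger_failure()                       # eg stop a process, drop a link in the twin
    start = time.time()
    while time.time() - start < RECOVERY_TARGET_SECS:
        if is_service_healthy():
            return {"recovered": True, "seconds": round(time.time() - start)}
        time.sleep(poll_secs)
    return {"recovered": False, "seconds": RECOVERY_TARGET_SECS}
```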

PS. Oh, and you’d rightly argue that a telco is very different from Netflix. There’s a lot more complexity in the networks, especially the legacy stacks. Many a telco would NEVER let anyone intentionally cause even the slightest degradation / failure in the network. This is where digital twin technology potentially comes into play.

An OSS without the shackles of topology

It’s been nearly two decades since I designed my first root-cause analysis (RCA) rule. It was completely reliant on network topology – more specifically, it relied on a network hierarchy to determine which alarms could be suppressed.

I had a really interesting discussion today with some colleagues who are using much more modern RCA techniques. I was somewhat surprised, but not surprised at all in hindsight, that their Machine Learning engine doesn’t even use topology data. It just looks at events and tries to identify patterns.

That’s a really interesting insight that hadn’t dawned on me before. But it’s an exciting one because it effectively unshackles our fault management tools from data quality perfection in our inventory / asset databases. It also possibly lessens the need for integrations that share topological data.
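A very crude illustration of what “just looking at events” can mean: group alarms into short time windows and count which combinations keep recurring, with no reference to topology at all. This is a toy sketch, not the engine my colleagues were using.

```python
from collections import Counter
from itertools import combinations

# Toy sketch of topology-free pattern spotting: bucket alarms into time
# windows and count recurring co-occurrences. Real RCA / ML engines are far
# more sophisticated; this just shows that no inventory data is needed.

def co_occurrence_patterns(alarms, window_secs=60):
    """alarms: list of (timestamp_secs, alarm_type) tuples. Returns a Counter
    of alarm-type pairs that appear together in the same time window."""
    buckets = {}
    for ts, alarm_type in alarms:
        buckets.setdefault(int(ts // window_secs), set()).add(alarm_type)
    pairs = Counter()
    for types in buckets.values():
        for pair in combinations(sorted(types), 2):
            pairs[pair] += 1
    return pairs

alarms = [
    (10, "PORT_DOWN"), (12, "LOS"), (15, "SERVICE_DEGRADED"),
    (400, "PORT_DOWN"), (405, "LOS"),
]
print(co_occurrence_patterns(alarms).most_common(3))
# [(('LOS', 'PORT_DOWN'), 2), ...]
```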

Equally interesting, the ML engine had identified over 4,000 patterns, but only a dozen had been codified and put into use so far. In other words, the machine was learning, but humans still needed to get involved in the process to confirm that the machine had learned correctly.

Makes me wonder whether the ML pre-seeding technique we discussed in an earlier post might actually be useful for confirmations at a greater scale than the team had achieved with 12 of 4000+ to date.

The standard approach is to let the ML loose and identify patterns. This is the reactive approach. The ML reacts to the alarms that are pushed up from the network. It looks at alarms and determines what the root cause is based on historical data. A human then has to check that the root cause is correct by reverse engineering the alarm stream (just like a network operator used to do before RCA tools came along) and comparing. If the comparison is successful, the person then approves this pattern.

My proposed alternate approach is the proactive method. If we proactively trigger a fault (e.g. pull a patch lead, take a port down, etc), we start from a position of already knowing what the root cause is. This has three benefits:
1. We can check if the ML’s interpretation of root cause is right
2. We’ve proactively seeded the ML’s data with this root cause example
3. We categorically know what the root cause is, unlike the reactive mode which only assumes the operator has correctly diagnosed the root cause

Then we just have to figure out a whole bunch of proactive failures to test safely. Where to start? Well, I’d speak with the NOC operators to find out what their most common root causes are and look to trigger those events.
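Here’s a minimal sketch of what that proactive seeding could look like: record the ground-truth root cause before triggering the fault, capture the alarm window that follows, and store the pair as a labelled example for seeding and/or confirming the ML engine. The function names are hypothetical placeholders for whatever fault-injection and alarm-collection interfaces you actually have.

```python
import time

# Hypothetical sketch of proactive seeding for an RCA / ML engine.
# trigger_fault(), collect_alarms() and store_labelled_pattern() stand in
# for whatever fault-injection, alarm-collection and training interfaces
# your environment actually exposes.

def seed_known_fault(fault, trigger_fault, collect_alarms, store_labelled_pattern,
                     settle_secs=120):
    """fault example: {"root_cause": "PATCH_LEAD_PULLED", "target": "switch-07 port 3"}"""
    started_at = time.time()
    trigger_fault(fault["target"])            # we already know the root cause
    time.sleep(settle_secs)                   # let the downstream alarms arrive
    alarm_window = collect_alarms(since=started_at)
    # The label is ground truth rather than an operator's diagnosis, so it can
    # be used both to seed training data and to verify the engine's own output.
    store_labelled_pattern(alarms=alarm_window, root_cause=fault["root_cause"])
```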

More tomorrow on intentionally triggering failures in production systems.

I sent you an OSS helicopter

There’s a fable of a man stuck in a flood. Convinced that God is going to save him, he says no to a passing canoe, boat, and helicopter that offer to help. He dies, and in heaven asks God why He didn’t save him. God says, “I sent you a canoe, a boat, and a helicopter!”
We all have vivid imaginations. We get a goal in our mind and picture the path so clearly. Then it’s hard to stop focusing on that vivid image, to see what else could work.
New technologies make old things easier, and new things possible. That’s why you need to re-evaluate your old dreams to see if new means have come along.”
Derek Sivers, here.

In the past, we could make OSS platform decisions with reasonable confidence that our choices would remain viable for many years. For example, in the 1990s if we decided to build our OSS around a particular brand of relational database then it probably remained a valid choice until after 2010.

But today, there are so many more platforms to choose from, not to mention the technologies that underpin them. And it’s not just the choices currently available but the speed with which new technologies are disrupting the existing tech. In the 1990s, it was a safe bet to use AutoCAD for outside plant visualisation without the risk of heavy re-tooling within a short timeframe.

If making the same decision today, the choices are far less clear-cut. And the risk that your choice will be obsolete within a year or two has skyrocketed.

With the proliferation of open-source projects, the decision has become harder again. That means the skill-base required to service each project has also spread thinner. In turn, decisions for big investments like OSS projects are based more on the critical mass of developers than the functionality available today. If many organisations and individuals have bought into a particular project, you’re more likely to get your new features developed than from a better open-source project that has less community buy-in.

We end up with two ends of a continuum to choose between. We can either chase every new bright shiny object and re-factor for each, or we can plan a course of action and stick to it even if it becomes increasingly obsoleted over time. The reality is that we probably fit somewhere between the two ends of the spectrum.

To be brutally honest I don’t have a solution to this conundrum. The closest technique I can suggest is to design your solution with modularity in mind, as opposed to the monolithic OSS of the past. That’s the small-grid OSS architecture model. It’s easier to replace one building than an entire city.
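One way to express that small-grid modularity in code is to keep each “building” behind a deliberately narrow interface, so a platform swap touches one adapter rather than the entire city. The sketch below is only indicative; the InventoryStore interface and its methods are invented for the example.

```python
from typing import Protocol

# Indicative sketch of small-grid modularity: the rest of the OSS codes
# against a narrow interface, so swapping the underlying platform means
# writing a new adapter, not re-factoring every consumer.

class InventoryStore(Protocol):
    def get_asset(self, asset_id: str) -> dict: ...
    def upsert_asset(self, asset: dict) -> None: ...

class RelationalInventory:
    """Adapter for today's platform choice (details omitted)."""
    def get_asset(self, asset_id: str) -> dict:
        return {"id": asset_id, "source": "relational"}
    def upsert_asset(self, asset: dict) -> None:
        pass

class GraphInventory:
    """Tomorrow's replacement building; consumers don't need to change."""
    def get_asset(self, asset_id: str) -> dict:
        return {"id": asset_id, "source": "graph"}
    def upsert_asset(self, asset: dict) -> None:
        pass

def enrich_alarm(alarm: dict, inventory: InventoryStore) -> dict:
    # Consumers only ever see the narrow interface.
    alarm["asset"] = inventory.get_asset(alarm["asset_id"])
    return alarm

print(enrich_alarm({"asset_id": "OLT-123"}, GraphInventory()))
```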

Life-cycles of key platforms are likely to now be a few years at best (rather than decades if starting in the 1990s). Hence, we need to limit complexity (as per the triple-constraint of OSS) and functionality to support the highest-value objectives.

I’m sure you face the same conundrums on a regular basis. Please leave a comment below to tell us how you overcome them.

Mythical OSS beasts – feature removal releases

Life can be improved by adding, or by subtracting. The world pushes us to add, because that benefits them. But the secret is to focus on subtracting…

No amount of adding will get me where I want to be. The adding mindset is deeply ingrained. It’s easy to think I need something else. It’s hard to look instead at what to remove.

The least successful people I know run in conflicting directions, drawn to distractions, say yes to almost everything, and are chained to emotional obstacles.

The most successful people I know have a narrow focus, protect against time-wasters, say no to almost everything, and have let go of old limiting beliefs.”
Derek Sivers, here.

I’m really curious here. Have you ever heard of an OSS product team removing a feature? Nope?? Me either!

I’ve seen products re-factored, resulting in changes to features. I’ve also seen products obsoleted and their replacements not offer all of the same features. But what about a version upgrade to an existing OSS product that has features subtracted? That never happens does it?? The adding mindset is deeply ingrained.

So let’s say we do want to go on a subtraction drive and remove some of the clutter from our OSS. I know plenty of OSS GUIs where subtraction is desperately needed BTW! But how do we know what to remove?

I have no data to back this up, but I would guess that almost every OSS would have certain functions that are not used, by any of their customers, in a whole year. That functionality was probably built for a specific use-case for a specific customer that no longer has relevance. Perhaps for a service type that is no longer desired by the market or a network type that will never be used again.

Question is, does your OSS have profiling instrumentation that allows you to measure what functionality is and isn’t used across your whole client base?

Can your products team readily produce a usage profile graph like the following that shows a list of functions (x-axis) by the number of times each function is used (y-axis) in a given time window? Per client? Across all clients?
Long-tail of OSS functionality use
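If your product doesn’t have that instrumentation, a rough starting point might be as simple as the sketch below: count invocations per function, per client, and sort ascending to expose the long tail. The names are illustrative only, and a real implementation would obviously persist and aggregate these counts properly.

```python
from collections import Counter
from functools import wraps

# Rough sketch of utilisation instrumentation (for client usage profiling,
# not code optimisation): count invocations per function, per client, so the
# product team can plot the long tail shown above.

usage = Counter()

def track_usage(func):
    @wraps(func)
    def wrapper(*args, client_id="unknown", **kwargs):
        usage[(client_id, func.__name__)] += 1
        return func(*args, **kwargs)
    return wrapper

@track_usage
def bulk_data_loader(source):
    return f"loaded from {source}"

bulk_data_loader("legacy-ems", client_id="client-a")
bulk_data_loader("legacy-ems", client_id="client-a")

# Ascending sort exposes the rarely used functions: the candidates for
# (careful) subtraction, noting the data-loader caveat below.
print(sorted(usage.items(), key=lambda kv: kv[1]))
```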

Leave us a comment below if you’ve ever seen this type of profiling instrumentation (not for code optimisation, but for identifying client utilisation levels) and/or systematic feature subtraction initiatives.

BTW. I should make the distinction that just because a function hasn’t been used in a while, doesn’t mean it should automatically be removed. Some functionality (eg data loaders) might be rarely used, but important to retain.

The humbling experience of OSS superstars

Years ago, I was so confident, and so naive. I was so sure that I was right and everyone else was wrong.
Unfortunately I was lucky and got successful, so that kept me ignorant of my shortcomings.
I sold my company, felt ready to do something new, and started to learn. But the more I learned, the more I realized how little I knew, and how dumb-lucky I had been. I continued learning until I felt like an absolute idiot…
So I’m glad my old confidence is gone, because it thought I was right, and maybe even great.”
Derek Sivers, here.

Confidence is a fascinating thing in OSS. There are so many facets to it. One person can rightfully build the confidence that comes from being the best in the world at one facet, but completely lack confidence in many of the other facets. Or be a virtuoso within their project / environment but be out of their depth in another organisation’s environment.

Yesterday’s post discussed how much of a bottleneck I became on my first OSS project. Most members of that project team had a reliance on the information / data that I knew how to find and interpret. Most members of the project team had problems that “my” information / data would help them to solve.

As they say, knowledge is power and I had the confidence that came with having that knowledge / power. My ego thrived on the importance of having that knowledge / power. That ego was built on helping the team solve many of the huge number of problems we faced. But as with most ego plays, I was being selfish in retrospect.

Having access to all of the problems, and helping to solve them, meant I was learning at a rapid rate (by my limited standards). That was exciting. Realising there were so many problems to solve in this field probably helped spawn my passion for OSS.

With the benefit of hindsight, I was helping, but also hindering in almost equal measures. Like Derek, “the more I learned, the more I realized how little I knew, and how dumb-lucky I had been.” To be honest, I’ve always been aware of just how little I know across the many facets of OSS. But the more I learn, the more I realise just how big the knowledge chasm is. And with the speed of change currently afoot, the chasm is only getting wider. Not just for me, but for all of us I suspect. Do you feel it too?

Also like Derek, my old naive confidence is gone. A humbler confidence now replaces it, one derived from many humbling experiences and failures (ie learning experiences) along the way.

There are superstars on every OSS project team. You’ve worked with many of them no doubt. But let me ask you a question. If you think back on those superstars, how many have also been bottlenecks on your projects? How many have thrived on the importance of being the bottleneck? Conversely, how many have mastered the art of not just being the superstar performer, but also the on-field leader who brings others into the game? The one who can make an otherwise dysfunctional OSS team function more cohesively.

Leave a comment below if you’d like to share a story of your experience dealing with OSS superstars (or being the superstar). What have you learned from that experience?

I was a huge bottleneck on my first OSS project

I became a problematic bottleneck on my first OSS project. It didn’t start that way, but it definitely ended that way. And I’ve been thinking ever since about how I could’ve managed that better.

I started out as a network subject matter expert but wasn’t a bottleneck in that role. However, the next two functions I absorbed were the source of the problem.

The first additional role was in becoming the unofficial document librarian. Most of the documents coming into our organisation came through me. Being inquisitive, I’d review each document and try to apply it to what my colleagues and I were trying to achieve. When the team had an information void, they’d come to me with the problem and I’d not just point them to the relevant document/s but dive into helping to solve the problem.

The next role was assisting to model network data into the OSS database. This morphed into becoming responsible for all of the data in the database. In those days, I didn’t have a Minimum Viable Data (MVD) mindset. Instead it was an ingest-it-all-and-figure-out-how-to-use-it-later mentality. When the team had a data void, they’d come to me with the problem and I’d not just point them to the relevant data and what it meant but dive into helping to solve their problem/s.

You can see how this is leading to being a bottleneck, can’t you?

I was effectively asking for all problems to be re-routed through me. Every person on the project (except possibly the project admins) relied on documentation and data. I averaged 85-hour weeks for about 2.5 years on that project, but still didn’t get close to servicing all the requests. Great as a learning exercise. Not great for the project.

Twenty years on, how would I do it better? Well, let me ask first, how would you do it better?

You possibly have many more ideas, but the two I’d like to leave you with are:

  • Figure out ways to make teaching more repeatable and self-learnt
  • Closely aligned, and more importantly, ask leading questions that help others solve their own problems

It can still feel less helpful not to dive in and solve the problem yourself, but stepping back undoubtedly improves overall team efficiency and growth.

Oh, and by the way, if you’re just starting out in OSS and want to speed up your own development into becoming an OSS linchpin – find your way into the document librarian and/or data management roles. After all these years on OSS projects, I still think these are the best places to launch into the learning curve from.

The use of drones by OSS

The last few days have been all about organisational structuring to support OSS and digital transformations. Today we take a different tack – a more technical diversion – onto how drones might be relevant to the field of OSS.

A friend recently asked for help to look into the use of drones in his archaeological business. This got me to thinking about how they might apply in cross-over with OSS.

I know they’re already used to perform really accurate 3D cable route / corridor surveying. Much cooler than the old surveyor diagrams on A1 sheets from the old days. Apparently experts in the field can even tell if there’s rock in the surveyed area by looking at the vegetation patterns, heat and LIDAR scans.

But my main area of interest is in the physical inventory. With accurate geo-tagging available on drones and the ability to GPS correct the data, it seems like a really useful technique for getting outside plant (OSP) data into OSS inventory systems. Or geo-correcting data for brownfields assets.
Drone-based cable corridor surveys
Have you heard of drone-based OSP asset identification and mapping data being fed into inventory systems yet? I haven’t, but it seems like the logical next step. Do you know anyone who has started to dabble in this type of work? If you do, please send me a note as I’d love to be introduced.
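If that data flow ever lands on your desk, the shaping step might conceptually be as simple as the sketch below: take geo-tagged survey points, apply a GPS correction, and turn them into inventory-ready asset records. The field names and the simple additive correction are assumptions for illustration, not any particular product’s schema (real corrections would use RTK/PPK or similar).

```python
# Illustrative sketch only: shaping drone-surveyed OSP points into records an
# inventory system could ingest. The field names and simple additive GPS
# correction are assumptions, not any particular product's schema.

def to_inventory_records(survey_points, lat_offset=0.0, lon_offset=0.0):
    """survey_points: list of dicts with 'lat', 'lon', 'alt' and 'asset_type'."""
    records = []
    for p in survey_points:
        records.append({
            "asset_type": p["asset_type"],        # eg PIT, POLE, CABLE_MARKER
            "latitude": p["lat"] + lat_offset,    # GPS-corrected position
            "longitude": p["lon"] + lon_offset,
            "altitude_m": p["alt"],
            "source": "drone-survey",
        })
    return records

points = [{"lat": -37.8101, "lon": 144.9623, "alt": 31.2, "asset_type": "PIT"}]
print(to_inventory_records(points, lat_offset=0.00001))
```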

Once loaded into the inventory system, with 3D geo-location, we then have the ability to visualise the OSP data with augmented reality solutions.

Any other applications for drone technology?

OSS orgitecture

So far this week we’ve been focusing on ways to improve the OSS transformation process. Monday provided 7 models for achieving startup-like efficiency for larger OSS transformations. Tuesday provided suggestions for speeding up the transition from OSS PoC to getting the solution into production, specifically strategies for absorbing an OSS PoC into production.

Both of these posts talk about the speed of getting things done outside the bureaucracy of big operators, big networks and big OSS. Today, as the post title suggests, we’re going to look at orgitecture – how re-designing the structure and culture of an organisation can help streamline digital transformations.

Do you agree with the premise that smaller entities (eg Agile autonomous groups, partners, consultants, etc) can get OSS tasks done more efficiently when operating at arms-length of the larger entity (eg the carrier)? I believe that this is a first principle of physics at play.

If you’ve worked under this arms-length arrangement in the past, you’ll also know that at some point those delivery outcomes need to get integrated back into the big entity. It’s what we referred to yesterday as absorption, where the level of integration effort falls on a continuum between minimally absorbed to fully absorbed.

OSS orgitecture is the re-architecture of the people, processes, culture and org structure to better allow for the absorption process. In the past, all the safety-checks (eg security, approvals, ops handover, etc) were designed on the assumption that internal teams were doing the work. They’re not always a great fit, especially when it comes to documentation review and approval.

For example, I have a belief that the effectiveness of documentation review and approval is inversely proportional to the number of reviewers (in most, but not all cases). Unfortunately, when an external entity is delivering, there tends to be inherently less trust than if an internal entity was delivering. As such, the safety-checks increase.

Another example is when the large organisation uses Agile delivery models, but uses supply partners to deliver scopes of work. The partners are able to assign effort in a sequential / waterfall manner, but can be delayed because they only get timeslices of attention from the client’s staff (ie resources are made available according to Agile sprint planning).

Security and cutover planning mechanisms such as Change Review Boards (CRB) have also been designed around old internal delivery models. They also need to be reconsidered to facilitate a pipeline of externally-implemented change.

Perhaps the biggest orgitecture factor is in getting multiple internal business units to work together effectively. In the old world we needed all the business units to reach consensus for a new product to come to market. Sales/Marketing/Products had to work with OSS/IT and Networks. Each of these units tends to have vastly different cultures and different cadences for getting their tasks done. Delivering a new product was as much an organisational challenge as it was a technical challenge and often took months. Those times-to-market are not feasible in a world of software where competitive advantages are fleeting. External entities can potentially help or hinder these timeframes. Careful design of small autonomous teams has the potential to improve abstraction at the interlocks, but culture remains the potential roadblock.

I’m excited by the opportunity for OSS delivery improvement coming from leveraging the gig economy. But if big OSS transformations are to make use of these efficiency gains, then we may also need to consider culture and process refinement as part of the change management.

Speeding up your OSS transition from PoC to PROD

In yesterday’s article, we discussed 7 models for achieving startup-like efficiency on large OSS transformations.

One popular approach is to build a proof-of-concept or sandpit quickly on cloud hosting or in lab environments. It’s fast for a number of reasons including reduced number of approvals, faster activation of infrastructure, reduced safety checks (eg security, privacy, etc), minimised integration with legacy systems and many other reasons. The cloud hosting business model is thriving for all of these reasons.

However, it’s one thing to speed up development of an OSS PoC and another entirely to speed up deployment to a PROD environment. As soon as you wish to absorb the PoC-proven solution back into PROD, all the items listed above (eg security sign-offs) come back into play. Something that took days/weeks to stand up in PoC now takes months to productionise.

Have you noticed that the safety checks currently being used were often defined for the old world? They often aren’t designed with the transition from cloud to PROD in mind. Similarly, the culture of design cross-checks and approvals can also be re-framed (especially when the end-to-end solution crosses multiple different business units). Lastly, and way outside my locus of competence, there’s the task of re-visiting security / privacy / deployment models to facilitate an easier transition.

One consideration to make is just how much absorption is required. For example, there are examples of services being delivered to the large entity’s subscribers by a smaller, external entity. The large entity then just “clips-the-ticket,” gaining a revenue stream with limited involvement. But the more common (and much more challenging) absorption model is for the partner to fold the solution back into the large entity’s full OSS/BSS stack.

So let’s consider your opportunity in terms of the absorption continuum that ranges between:

clip-the-ticket (minimally absorbed) <-----------|-----------> folded-in (fully absorbed)

Perhaps it’s feasible for your opportunity to fit somewhere in between (partially absorbed)? Perhaps part of that answer resides in the cloud model you decide to use (public, private, hybrid, cloud-managed private cloud) as well as the partnership model?

Modularity and reduced complexity (eg integrations) are also a factor to consider (as always).

I haven’t seen an ideal response to the absorption challenge yet, but I believe the solution lies in re-framing corporate culture and technology stacks. We’ll look at that in more detail tomorrow.

How about you? Have you or your organisation managed to speed up your transition from PoC to PROD? What techniques have you found to be successful?

Seven OSS transformation efficiency models

Do you work in a large organisation? Have you also worked in smaller organisations?
Where have you felt more efficient?

I’ve been lucky enough to work on some massive OSS transformations for large T1 telcos. But I’ve always noticed the inefficiency of working on these projects when embedded inside the bureaucracy of the beast. With all of the documentation, sign-offs, meetings, politics, gaining consensus, budget allocations, etc it can sometimes feel so inefficient. On some past projects, I’ve felt I can accomplish more in a day outside than a week or more inside the beast.

This makes sense when applying the fundamental law of physics F = M x a to OSS projects. In other words, the greater the mass (of the organisation), the more force must be applied to reach a given acceleration (ie to effect a change).

It’s one of the reasons I love working within a small entity (Passionate About OSS), but into big entities (the big telcos and utilities). It’s also why I strongly believe that the big entities need to better leverage smaller working groups to facilitate big OSS change. Not just OSS transformation, but any project where the size of the culture and technology stack are prohibitive.

Here are a few ways to bring a start-up’s efficiency to a big OSS transformation:

  1. Agile methodologies – If done well, Agile can be great at breaking transformations down into smaller, more manageable pieces. The art is in designing small autonomous teams / responsibilities and breakdown of work to minimise dependencies
  2. Partnerships – Using smaller, external partners to deliver outcomes (eg product builds or service offerings) that can be absorbed into the big organisation. There are varying levels of absorption here – from an external, “clip-the-ticket” offering to offerings that are fully absorbed into the large entity’s OSS/BSS stack
  3. Consultancies – Similar to partnerships, but using smaller teams to implement professional services
  4. Spin-out / spin-in teams – Separating small teams of experts out from the bureaucracy of the large organisation so that they can achieve rapid progress
  5. Smart contracts / RFPs – I love the potential for smart contracts to automate the offer of small chunks of work to trusted partners to bid upon and then deliver upon
  6. Externalised Proofs of Concept (PoC) – One of the big challenges in implementing for large organisations is all of the safety checks that slow progress. Many, such as security and privacy mechanisms, are completely justified for a production network. But when a concept needs to be proved, such as user journeys, product integrations, sand-pit environments, etc, then cloud-based PoCs can be brilliant
  7. Alternate brands – Have you also noticed that some of the tier-1 telcos have been spinning out low-cost and/or niche brands with much leaner OSS/BSS stacks, offerings and related culture lately? It’s a clever business model on many levels. Combined with the strangler fig transformation approach, this might just represent a pathway for the big brand to shed many of their OSS/BSS legacy constraints

Can you think of other models that I’ve missed?

The key to these strategies is not so much the carve-out, the process of getting small teams to do tasks efficiently, but the absorb-in process. For example, how to absorb a cloud-based PoC back into the PROD network, where all safety checks (eg security, privacy, operations acceptance, etc) still need to be performed. More on that in tomorrow’s post.

How to bring your art and your science to your OSS

In the last two posts, we’ve discussed repeatability within the field of OSS implementation – paint-by-numbers vs artisans and then resilience vs precision in delivery practices.

Now I’d like you to have a think about how those posts overlay onto this quote by Karl Popper:
Non-reproducible single occurrences are of no significance to science.”

Every OSS implementation is different. That means that every one is a non-reproducible single occurrence. But if we bring this mindset into our OSS implementations, it means we’re bringing an artisanal rather than a scientific method to the project.

I’m all for bringing more art, more creativity, more resilience into our OSS projects.

I’m also advocating more science though too. More repeatability. More precision. Whilst every OSS project may be different at a macro level, there are a lot of similarities in the micro-elements. There tends to be similarities in sequences of activities if you pay close attention to the rhythms of your projects. Perhaps our products can use techniques to spot and leverage similarities too.

In other words, bring your art and your science to your OSS. Please leave a comment below. I’d love to hear the techniques you use to achieve this.

OSS resilience vs precision

Resilience is what happens when we’re able to move forward even when things don’t fit together the way we expect. [OSS project anyone???] And tolerances are an engineer’s measurement of how well the parts meet spec.
One way to ensure that things work out the way you hope is to spend the time and money to ensure that every part, every form, every worker meets spec. Tighten your spec, increase precision and you’ll discover that systems become more reliable.
The other alternative is to embrace the fact that nothing is ever exactly on spec, and to build resilient systems.
You’ll probably find that while precision feels like the way forward, resilience, the ability to thrive when things go wrong, is a much safer bet.”
Seth Godin, here.

Yesterday’s post talked about the difference between having a team of artisans versus a team that paints by numbers. Seth’s blog provides a similar comparison. Instead of comparing by talent, Seth compares by attitude.

I’m really conflicted by Seth’s comparison.

From the side of past experience, resilience is a massive factor in overcoming the many obstacles faced on implementation projects. I’ve yet to work on an OSS project where all challenges were known at inception.

From the side of an ideal future, precision and repeatability are essential factors in improving the triple constraint of OSS delivery and increasing reliability for customers. And whilst talking about the future, the concept of network slicing (which holds the key for 5G) is dependent upon OSS repeatability and efficiency.

So which do we focus on? Building a vastly talented, experienced and resilient implementation team? Or building a highly reliable, repeatable implementation system? Both, most likely.

But what if you only get to choose one? Which do you focus on (for you and your team/system)?

The Mona Lisa of OSS

All OSS rely on workflows to make key outcomes happen. Outcomes like activating a customer order, resolving a fault, billing customers, etc. These workflows often touch multiple OSS/BSS products and/or functional capabilities. There’s not always a single-best-way to achieve an outcome.

If you’re responsible for your organisation’s workflows, do you want to build a paint-by-numbers approach where each process is repeatable?
Or do you want the bespoke paintings, which could unintentionally lead to a range in quality from Leonardo’s Mona Lisa to my 3-year-old’s finger painting?

Apart from new starters, who thrive on a paint-by-numbers approach at first, every person who uses an OSS wants to feel like an accomplished artisan. They want to have the freedom to get stuff done with their own unique brush-strokes. They certainly don’t want to follow a standard, pre-defined pattern day-in and day-out. That would be so boring and demoralising. I don’t blame them. I’d be exactly the same.

This is perhaps why some organisations don’t have documented workflows, or at least they only have loosely defined ones. It’s just too hard to capture all the possibilities on one swim-lane chart.

I’m all for having artisans on the team who are able to handle the rarer situations (eg process fall-outs) with bespoke processes. But bespoke processes should never be the norm. Continual improvement thrives on a strong level of repeatability.

To me, bespoke workflows are not necessarily an indication of a team of free spirited artists that need to be regimented, but of processes with too many variants. Click on this link to find recommendations for reducing the level of bespoke processes in your organisation.

Are processes bespoke or paint-by-numbers in your organisation?

BTW. We’ll take a slightly different perspective on workflow repeatability in tomorrow’s post.