The chains of integration are too light until…

“Chains of habit are too light to be felt until they are too heavy to be broken.”
Warren Buffett (although he attributed it to an unknown author, perhaps originating with Samuel Johnson).

What if I were to replace the word “habit” in the quote above with “OSS integration” or “OSS customisation” or “feature releases?”

The elegant quote reflects this image:

The chains of feature releases are light at t0. They’re much heavier at t0+100.

Like habits, if we could project forward and see the effects, would we allow destructive customisations to form in their infancy? The problem is that we don’t see them as destructive at infancy. We don’t see how they entangle.

Posing a Network Data Synchronisation Protocol (NDSP) concept

Data quality is one of the biggest challenges we face in OSS. A product could be technically perfect, but if the data being pumped into it is poor, then the user experience of the product will be awful – the OSS becomes unusable, and that in itself generates a data quality death spiral.

This becomes even more important for the autonomous, self-healing, programmable, cooperative networks being developed (think IoT, virtualised networks, Self-Organizing Networks). If we look at IoT networks for example, they’ll be expected to operate unattended for long periods, but with code and data auto-propagating between nodes to ensure a level of self-optimisation.

So today I’d like to pose a question. What if we could develop the equivalent of Network Time Protocol (NTP) for data? Just as NTP synchronises clocking across networks, Network Data Synchronisation Protocol (NDSP) would synchronise data across our networks through a feedback-loop / synchronisation algorithm.

Of course there are differences from NTP. NTP only tries to coordinate one data field (time) along a common scale (time as measured along a 64+64 bits continuum). The only parallel for network data is in life-cycle state changes (eg in-service, port up/down, etc).

For NTP, the stratum of the clock is defined (see image below from Wikipedia).

This has analogies with data, where some data sources can be seen to be more reliable than others (ie primary sources rather than secondary or tertiary sources). However, there are scenarios where stratum 2 sources (eg OSS) might push state changes down through stratum 1 (eg NMS) and into stratum 0 (the network devices). An example might be renaming of a hostname or pushing a new service into the network.
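As a minimal sketch of the stratum idea applied to data (this is not an existing protocol; the `Observation` record, the stratum numbering and the `reconcile` function are all hypothetical names I've made up for illustration), a reconciler could prefer the value reported by the lowest-stratum source and use recency as a tie-break:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One source's view of a single attribute (all fields are illustrative)."""
    source: str       # eg "device", "NMS", "OSS"
    stratum: int      # assumed scale: 0 = network device, 1 = NMS, 2 = OSS
    value: str        # the reported attribute value, eg a port state
    timestamp: float  # when the source last observed the value


def reconcile(observations):
    """Pick the most trusted value: lowest stratum first, newest as tie-break."""
    if not observations:
        raise ValueError("no observations to reconcile")
    return min(observations, key=lambda o: (o.stratum, -o.timestamp))


# Three conflicting views of the same port (invented sample data).
views = [
    Observation("OSS", 2, "in-service", 1000.0),
    Observation("NMS", 1, "port-down", 1005.0),
    Observation("device", 0, "port-up", 1003.0),
]
winner = reconcile(views)
print(winner.source, winner.value)
```

This sketch only covers the bottom-up case. The top-down pushes described above (eg a hostname rename initiated in the OSS) would need an explicit override rule, which is deliberately left out here.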

One challenge would be the vastly different data sets and how to disseminate / reconcile them across the network without overloading it with management / communications packets. A second would be format consistency. I once had a device type that had four different port naming conventions, and that was just within its own NMS! Imagine how many port name variations (and translations) might have existed across the multiple inventories that exist in our networks. The good thing about the NDSP concept is that it might force greater consistency across different vendor platforms.
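To give a feel for the translation problem, here is a rough sketch of normalising port names into one canonical form. The naming variants in the pattern and the `ge-shelf/slot/port` target format are assumptions invented for this example, not the conventions of any particular vendor or NMS:

```python
import re

# Hypothetical variants of the same port name; real conventions vary by vendor and NMS.
_PORT_PATTERN = re.compile(
    r"^\s*(?:gigabitethernet|gige|gi|ge)[\s\-_]*(\d+)/(\d+)/(\d+)\s*$",
    re.IGNORECASE,
)


def canonical_port(name):
    """Return the assumed canonical form 'ge-shelf/slot/port', or None if unrecognised."""
    match = _PORT_PATTERN.match(name)
    if match is None:
        return None  # unknown convention: flag for manual review rather than guess
    shelf, slot, port = (int(group) for group in match.groups())
    return f"ge-{shelf}/{slot}/{port}"


for raw in ["GigabitEthernet1/0/1", "Gi1/0/1", "ge-1/0/1", "GE 01/00/01", "xe-1/0/1"]:
    print(raw, "->", canonical_port(raw))
```

Every extra convention means another pattern to maintain, which is exactly why a protocol that forced a common format would be attractive.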

Another would be that NDSP would become a huge security target as it would have the power to change configurations and because of its reach through the network.

So what do you think? Has the NDSP concept already been developed? Have you implemented something similar in your OSS? What are the scenarios in which it could succeed? Or fail?

The concept of DevOps is missing one really important thing

There’s a concept that’s building a buzz across all digital industries – you may’ve heard of it – it’s a little thing called DevOps. Someone (most probably a tester) decided to extend it and now you might even hear the #DevTestOps moniker being mentioned.

In the ultimate of undeserved acknowledgements, I even get a reference on Wikipedia’s DevOps page. It references this DevOps life-cycle diagram from an earlier post that I can take no credit for:

However, there is one really important chevron missing from the DevOps infinite loop above. Can you picture what it might be?

If I show you this time series below, does it help identify what’s missing from the DevOps infinite loop? I refer to the diagram below as The Tech-Debt Wreck.
The increasing percentage of tech debt
If I give you a hint that it primarily relates to the grey band in the time series above, would that help?

Okay, okay. I’m sure you’ve guessed it already, but the big thing missing from the DevOps loop is pruning, or what I refer to as subtraction projects (others might call it re-factoring). Without pruning, the rapid release mantra of DevOps will take the digital world from t0 to t0+100 faster than at any time before in our history.

As a result, I’m advocating a variation on DevOps… or DevTestOps even… I want you to preach a revised version of the label – let’s start a movement called #DevTestPruneOps. Actually, the pruning should go at the start, before each dev / test cycle, but by calling it #PruneDevTestOps, I fear its lineage might get lost.

Torturous OSS version upgrades

Have you ever worked on an OSS where a COTS (Commercial Off-The-Shelf) solution has been so heavily customised that implementing the product’s next version upgrade has become a massive challenge? The solution has become so entangled that if the product were upgraded, it would break the customisations and/or integrations that are dependent upon that product.

This trickle-down effect is the perfect example of The Chess-board Analogy or The Tech-debt Wreck at work. Unfortunately, it is far too common, particularly in large, complex OSS environments.

The OSS then either has to:

  • skip the upgrade or
  • take a significant cost/effort hit and perform an upgrade that might otherwise be quite simple.

If the operator decides to take the “skip” path for a few upgrades in a row, then it gets further from the vendor’s baseline and potentially misses out on significant patches, functionality or security hardening. Then, when finally making the decision to upgrade, a much more complex project ensues.

It’s just one more reason why a “simple” customisation often has a much greater life-cycle cost than was initially envisaged.

How to reduce the impact?

  1. We’ve recently spoken about using RPA tools for pseudo-integrations, allowing the operator to leave the COTS product un-changed, but using RPA to shift data between applications
  2. Attempt to achieve business outcomes via data / process / config changes to the COTS product rather than customisations
  3. Enforce a policy of integration as a last resort as a means of minimising the chess-board implications (ie attempting to solve problems via processes, in data, etc before considering any integration or customisation)
  4. Enforcing modularity in the end-to-end architecture via carefully designed control points, microservices, etc

There are probably many other methods that I’m forgetting about whilst writing the article. I’d love to hear the approach/es you use to minimise the impact of COTS version upgrades. Similarly, have you heard of any clever vendor-led initiatives that are designed to minimise upgrade costs and/or simplify the upgrade path?

A summary of RPA uses in an OSS suite

This is the sixth and final post in a series about the four styles of RPA (Robotic Process Automation) in OSS.

Over the last few days, we’ve looked into the following styles of RPA used in OSS, their implementation approaches, pros / cons and the types of automation they’re best suited to:

  1. Automating repeatable tasks – using an algorithmic approach to completing regular, mundane tasks
  2. Streamlining processes / tasks – using an algorithmic approach to assist an operator during a process or as an alternate integration technique
  3. Predefined decision support – guiding operators through a complex decision process
  4. As part of a closed-loop system – that provides a learning, improving solution

RPA tools can significantly improve the usability of an OSS suite, especially for end-to-end processes that jump between different applications (in the many ways mentioned in the above links).

However, there can be a tendency to use the power of RPAs to “solve all problems” (see this article about automating bad processes). That can introduce a life-cycle of pain for operators and RPA admins alike. Like any OSS integration, we should look to keep the design as simple and streamlined as possible before embarking on implementation (subtraction projects).

The OSS / RPA parrot on the shoulder analogy

This is the fourth in a series about the four styles of RPA (Robotic Process Automation) in OSS.

The third style is Decision Support. I refer to this style as the parrot on the shoulder because the parrot (RPA) guides the operator through their daily activities. It isn’t true automation but it can provide one of the best cost-benefit ratios of the different RPA styles. It can be a great blend of human-computer decision making.

OSS processes tend to have complex decision trees and need different actions performed depending on the information being presented. An example might be a customer on-boarding, which includes credit and identity check sub-processes, followed by the customer service order entry.

The RPA can guide the operator to perform each of the steps along the process including the mandatory fields to populate for regulatory purposes. It can also recommend the correct pull-down options to select so that the operator traverses the correct branch of the decision tree of each sub-process.
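A minimal sketch of that guidance logic might look like the following. The step names, field names and the credit-score threshold are all invented for illustration; a real RPA tool would have its own way of modelling this:

```python
# Each step lists the fields the operator must fill in and a rule that picks the next step.
# Step names, field names and the credit threshold of 600 are hypothetical.
PROCESS = {
    "credit_check": {
        "mandatory": ["customer_id", "credit_score"],
        "next": lambda data: "identity_check" if data["credit_score"] >= 600 else "manual_review",
    },
    "identity_check": {
        "mandatory": ["id_document_number"],
        "next": lambda data: "order_entry",
    },
    "order_entry": {
        "mandatory": ["service_type", "service_address"],
        "next": lambda data: None,  # end of the process
    },
    "manual_review": {
        "mandatory": ["review_notes"],
        "next": lambda data: None,  # end of the process
    },
}


def advise(step, data):
    """Return (missing_fields, next_step) for the operator's current step."""
    rules = PROCESS[step]
    missing = [field for field in rules["mandatory"] if data.get(field) in (None, "")]
    if missing:
        return missing, step  # stay on this step until the mandatory fields are filled
    return [], rules["next"](data)


print(advise("credit_check", {"customer_id": "C1"}))
print(advise("credit_check", {"customer_id": "C1", "credit_score": 700}))
```

The parrot never completes the step itself. It only tells the operator what is still missing and which branch comes next, which is why this style sits short of true automation.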

This functionality can allow organisations to deliver less training than they would without decision support. It can be highly cost-effective in situations where:

  • There are many inexperienced operators, especially if there is high staff turnover such as in NOCs, contact centres, etc
  • It is essential to have high process / data quality
  • The solution isn’t intuitive and it is easy to miss steps, such as a process that requires an operator to swivel-chair between multiple applications
  • There are many branches on the decision tree, especially when some of the branches are rarely traversed, even by experienced operators

In these situations the cost of training can far outweigh the cost of building an OSS (RPA) parrot on each operator’s shoulder.

Using RPA as an alternate OSS integration

This is the third in a series about the four styles of RPA (Robotic Process Automation) in OSS.

The second of those styles is Streamlining processes / tasks by following an algorithmic approach to simplify processes for operators.

These can be particularly helpful during swivel-chair processes where multiple disparate systems are partially integrated but each needs the same data (ie reducing the amount of duplicated data entry between systems). As well as streamlining the process it also improves data consistency rates.

The most valuable aspect of this style of RPA is that it can minimise the amount of integration between systems, thus potentially reducing solution maintenance into the future. The RPA can even act as the integration technique where an API or its documentation isn’t available (think legacy systems here).

Using RPA to automate OSS activities

This is the second in a series about the four styles of RPA (Robotic Process Automation) in OSS.

The first of those styles is automating repeatable tasks by following an algorithmic approach to complete regular, mundane tasks.

Running an OSS involves many high value, challenging tasks for operators to perform. Unfortunately, it also involves many repetitive, simple (brain-dead?) tasks that need to be done too.

This might include collecting data from various sources and aggregating it into a single file or report for consumption by humans or machines. Other examples include admin clean-up tasks like accounts / tempfiles / processes / sessions and myriad simple process automations.

When we think of OSS automations, we often think of high value but complicated tasks like orchestrations, network self-healing, etc. They can be expensive and inflexible, not always delivering the perceived worth for the investment.

However, when thinking of RPA I think about the simplest stuff first. They are basic and consistent processes that are straightforward to define an algorithm for, making them the “low-hanging fruit” of OSS / RPA activities. They help to build momentum towards the bigger automation fish. Best of all, they free up your talented OSS operators to do more valuable activities.

Automating repeatable tasks is the most basic RPA style. We’ll step up the value chain with each additional style over the next few days.

It’s hard to do big things in a small way

“it’s hard to do big things in a small way, so I suspect incumbents have more of an advantage than they do in most industries.”
Nic Brisbourne

The quote above came from a piece about the rise of ConstructTech (ie building houses via means such as 3D printing). However, it is equally true of the OSS industry.

Our OSS tend to be behemoths, or at least the ones I work on seem to be. They’ve been developed over many years and have millions of sunk person-hours invested in them. And they’ve been customised to each client’s business like vines wrapped around a pillar. This gives enormous incumbency power and acts as a barrier to smaller innovators having a big impact in the world of OSS.

Want an example of it being hard to do big things in a small way? Ever heard of ONAP? AT&T is a massive telco with revenues to match, is committed to a more software-centric future and has developed millions of lines of code, yet it still needs the broader industry to help flesh out its vision for ONAP.

There are occasionally niche products developed but it’s definitely hard to do big things in a small way. The small grid analogy proposed earlier gives more room for the long tail of innovation, allowing smaller innovators to impact the larger ecosystem.

Write a comment below if you’d like to point out an outlier to this trend.

The two types of disruptive technologists

OSS is an industry that’s undergoing constant and massive change. But it still hasn’t been disrupted in the modern sense of that term. It’s still waiting to have its Uber/AirBnB-moment, where the old way becomes almost obsoleted by the introduction of a new way. OSS is not just waiting, but primed for disruption.

It’s a massive industry in terms of revenues, but it’s still far from delivering everything that customers want/need. It’s potentially even holding back the large-scale service provider industry from being even more influential / efficient in the current digital communications world. Our recent OSS Call for Innovation spelled out the challenges and opportunities in detail.

Today we’ll talk about the two types of disruptive technologists – one that assists change and one that hinders.

The first disruptive technologist is a rare beast – they’re the innovators who create solutions that are distinctly different from anything else in the market, changing the market (for the better) in the process. As discussed in this recent post, most of the significant changes occurring to OSS have been extrinsic (from adjacent industries like IT or networking rather than OSS). We need more of these.

The second disruptive technologist is all too common – they’re the technologists whose actions disrupt an OSS implementation. They’re usually well-intended, but can get in the way of innovation in two main ways:
1) By not looking beyond incremental change to existing solutions
2) By halting momentum through creating and resolving a million “what if?” scenarios

Most of us probably fall into the second category more often than the first. We need to reverse that trend individually and collectively though, don’t we?

Would you like to nominate someone who stands out as being the first type of disruptive technologist and why?

How “what if?” scenarios can halt a project

Let’s admit it; we’ve all worked on an OSS project that has gone into a period of extended stagnation because of a fear of the unknown. I call them “What if?” scenarios. They’re the scenarios where someone asks, “What if x happens?” and then the team gets side-tracked whilst finding an answer / resolution. The problem with “What if?” scenarios is that many of them will never happen, or will happen on such rare occasions that the impact will be negligible. They’re the opposite end of the Pareto Principle – they’re the 20% that take up the 80% of effort / budget / time. They need to be minimised and/or mitigated.

In some cases, the “what if?” questions come from a lack of understanding about the situation, the product suite and / or the future solution. That’s completely understandable because we can never predict all of the eventualities of an OSS project at the outset. That’s the OctopOSS at work – you think you have all of the tentacles under control, but another one always comes and whacks you on the back of the head.

The best way to keep the “what if?” questions from getting out of control is to give stakeholders a sandpit / MVP / rapid-prototype / PoC environment to interact with.

The benefit of the prototype environment is that it delivers something tangible, something that stakeholders far and wide can interact with and test assumptions, usefulness, usability, boundary cases, scalability, etc. Stakeholders get to understand the context of the product and get a better feeling for what the end solution is going to look like. That way, many of the speculative “what ifs?” are bypassed and you start getting into the more productive dialogue earlier. The alternative, the creation of a document or discussion, can devolve into an almost endless set of “what-if” scenarios and opinions, especially when there are large groups of (sometimes militant) stakeholders.

The more dangerous “what if?” questions come from the experts. They’re the ones who demonstrate their intellectual prowess by finding scenario after scenario that nobody else is considering. I have huge admiration for those who can uncover potential edge cases, race conditions, loopholes in code, etc. The challenge is that they can be extremely hard to document, test for and circumvent. They’re also often very difficult to quantify or prove a likelihood of occurrence, thus consuming significant resources.

Rather than divert resources to resolving all these “what if?” questions one-by-one, I try to seek a higher-order “safety-net” solution. This might be in the form of exception handling, try-catch blocks, fall-out analysis reports, etc. Or, it might mean assigning a watching brief on the problem and handling it only if it arises in future.
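As a rough sketch of what a safety net could look like in code, the wrapper below catches anything unexpected and records it in a fall-out list rather than trying to anticipate every scenario up front. The `activate` handler, the order fields and the sample orders are all hypothetical:

```python
def process_with_safety_net(orders, handler):
    """Run handler on every order; collect failures instead of stopping the batch."""
    completed, fallout = [], []
    for order in orders:
        try:
            completed.append(handler(order))
        except Exception as exc:  # deliberately broad: this is the safety net
            fallout.append({"order": order, "error": f"{type(exc).__name__}: {exc}"})
    return completed, fallout


def activate(order):
    """Hypothetical handler that only guards against one known bad input."""
    if order["bandwidth_mbps"] <= 0:
        raise ValueError("bandwidth must be positive")
    return f"{order['id']} activated"


orders = [
    {"id": "SO-1", "bandwidth_mbps": 100},
    {"id": "SO-2", "bandwidth_mbps": 0},
    {"id": "SO-3"},  # missing field: an edge case nobody asked "what if?" about
]
completed, fallout = process_with_safety_net(orders, activate)
for item in fallout:
    print(item["order"]["id"], "->", item["error"])
```

The fall-out list then becomes the input to a fall-out analysis report. If a particular error keeps recurring, that is the evidence that the scenario deserves dedicated handling.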

The evolving complexity of RCA

Root cause analysis (RCA) is one of the great challenges of OSS. As you know, it aims to identify the probable cause of an alarm-storm, where all alarms are actually related to a single fault.

In the past, my go-to approach was to start with a circuit hierarchy-based algorithm. If you had an awareness of the hierarchy of circuits, usually through inventory, and you had a lower-order fault (eg Loss of Signal on a transmission link caused by a cable break), then you could suppress all higher-order alarms (ie from bearers or tributaries that were dependent upon the L1 link). That worked well in the fixed networks of the distant past (think SDH / PDH), largely because it was repeatable between different customer environments.
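A minimal sketch of the circuit hierarchy approach is shown below. It assumes the inventory can be reduced to a simple "carried by" mapping, which is a big simplification, and the circuit names are invented:

```python
# Hypothetical inventory: each circuit maps to the lower-order circuit that carries it.
CARRIED_BY = {
    "trib-1": "bearer-A",
    "trib-2": "bearer-A",
    "bearer-A": "link-L1",
    "trib-3": "bearer-B",
    "bearer-B": "link-L2",
}


def ancestors(circuit):
    """Yield every lower-order circuit that the given circuit depends on."""
    seen = set()
    parent = CARRIED_BY.get(circuit)
    while parent is not None and parent not in seen:  # guard against inventory loops
        seen.add(parent)
        yield parent
        parent = CARRIED_BY.get(parent)


def root_causes(alarmed):
    """Keep only alarms with no alarmed circuit beneath them in the hierarchy."""
    alarmed = set(alarmed)
    return sorted(c for c in alarmed if not any(a in alarmed for a in ancestors(c)))


# A cable break on link-L1 raises alarms all the way up; trib-3 is an unrelated fault.
print(root_causes(["trib-1", "trib-2", "bearer-A", "link-L1", "trib-3"]))
```

Everything carried by `link-L1` is suppressed, while the `trib-3` alarm survives because nothing beneath it is alarmed.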

Packet-switching data networks changed that to an extent, because a data service could traverse any number of links, on-net or off-net (ie leased links). The circuit hierarchy approach was still applicable, but needed to be supplemented with other rules.

Now virtualised networking is changing it again. RCA loses a little relevance in the virtualised layer. Workloads and resource allocations are dynamic and transient, making them less suited to fixed algorithms. The objective now becomes self-healing – if a failure is identified, failed resources are spun down and new ones spun up to take the load. The circuit hierarchy approach loses relevance, but perhaps infrastructure hierarchy still remains useful. Cable breaks, server melt-downs, hanging controller applications are all examples of root causes that will cause problems in higher layers.

Rather than fixed-rules, machine-based pattern-matching is the next big hope to cope with the dynamically changing networks.

The number of layers and complexity of the network seems to be ever increasing, and with it RCA becomes more sophisticated…. If only we could evolve to simpler networks rather than more complex ones. Wishful thinking?

What is your OSS answer : question ratio?

Experts know a lot…. obviously.
They have lots of answers… obviously.

There are lots of OSS experts. Combined, they know A LOT!!

Powerful indeed, but not sure if that’s what we need right now. I feel like we’re in a bit of an OSS innovation funk. The biggest improvements in OSS are coming from outside OSS – extrinsic improvement.

Where’s the intrinsic improvement coming from? Do we need someone to shake it up (do we need everyone to shake it up?)? Do we need new thinking to identify and create new patterns? To re-organise and revolutionise what the experts already know. Or do we need to ask the massive questions that re-frame the situation for the experts?

So, considering this funky moment in time, is the real expert the one who knows lots of answers… or the person who can catalyse change by asking the best mind-shift questions?

May I ask you – As an OSS expert, are you prouder of your answers…. or your questions?

To tackle that from a different angle – What is your answer : question ratio? Are you such an important expert that your day is so full of giving brilliant answers that you have no time left to ruminate and develop brilliant questions?

If so, can we take some of your answer time back and re-prioritise it please?

In the words of Socrates, “I cannot teach anybody anything, I can only make them think.”

Customers don’t invest in OSS. What do they invest in?

“An organisation buys an OSS, not because it wants an Operational Support System, but because it wants Operational Support.”

So if our customers are not investing in our OSS, what are they actually investing in? Easy! They’re investing in the ability to solve their own problems and opportunities in future.

If we don’t actually understand operations, what chance do we have to deliver operational support? We keep hearing the term, “customer experience this,” “CX that,” so it must be important right? Operational support staff might be a few steps removed from us (intentionally or unintentionally) but they are our “real” customers and the only way we can develop a solution that empathises with them is by spending time with them and listening (not always easy for us know-it-all OSS builder-types).

And just because we have a history in ops doesn’t mean we can assume we know it this time. Operations are different at each organisation.

So, are we sure we understand the nature, extent and context of the unique problem/s that this customer needs to solve (not wants to solve)?

When low OSS performance is actually high performance

It’s not unusual for something to be positioned as the high performance alternative. The car that can go 0 to 60 in three seconds, the corkscrew that’s five times faster, the punch press that’s incredibly efficient…
The thing is, though, that the high performance vs. low performance debate misses something. High at what?
That corkscrew that’s optimized for speed is more expensive, more difficult to operate and requires more maintenance.
That car that goes so fast is also more difficult to drive, harder to park and generally a pain in the neck to live with.
You may find that a low-performance alternative is exactly what you need to actually get your work done. Which is the highest performance you can hope for.
Seth Godin in this article, What sort of performance?

Whether selecting a vendor / product, designing requirements or building an OSS solution, we can sometimes lose track of what level of performance is actually required to get the work done, can’t we?

How many times have you seen a requirement sheet that specifies a Ferrari, but you know the customer lives in a location with potholed and cobblestoned roads? Is it right to spec them – sell them – build them – charge them for a Ferrari?

I have to admit to being guilty of this one too. I have gotten carried away in what the OSS can do, nearer the higher performance end of the spectrum, rather than taking the more pragmatic view of what the customer really needs.

Automations, custom reports and integrations are the perfect OSS examples of low performance actually being high performance. We spend a truckload of money on these types of features to avoid manual tasks (curse having to do those manual tasks)… when a simple cost-benefit analysis would reveal that it makes a lot more sense to stay manual in many cases.
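The cost-benefit arithmetic can be sketched in a few lines. Every figure below is made up purely to show the calculation, and the sketch assumes an automated run costs nothing beyond build and upkeep:

```python
def breakeven_runs(build_cost, upkeep_per_year, manual_minutes, hourly_rate, years):
    """Number of task runs needed before automation is cheaper than staying manual."""
    total_automation_cost = build_cost + upkeep_per_year * years
    manual_cost_per_run = manual_minutes / 60 * hourly_rate
    # Assumption: the marginal cost of one automated run is treated as zero.
    return total_automation_cost / manual_cost_per_run


# All figures are invented for illustration only.
runs_needed = breakeven_runs(
    build_cost=50_000, upkeep_per_year=5_000,
    manual_minutes=15, hourly_rate=80, years=3,
)
print(round(runs_needed))
```

With these invented numbers the automation only pays for itself if the task is run more than 3,250 times over three years. A report that is pulled once a week (about 156 runs over the same period) would be far cheaper to leave manual.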

The double-edged sword of OSS/BSS integrations

…good argument for a merged OSS/BSS, wouldn’t you say?
John Malecki

The question above was posed in relation to Friday’s post about the currency and relevance of OSS compared with research reports, analyses and strategic plans as well as how to extend OSS longevity.

This is a brilliant, multi-faceted question from John. My belief is that it is a double-edged sword.

Out of my experiences with many OSS, one product stands out above all the others I’ve worked with. It’s an integrated suite of Fault Management, Performance Management, Customer Management, Product / Service Management, Configuration / orchestration / auto-provisioning, Outside Plant Management / GIS, Traffic Engineering, Trouble Ticketing, Ticket of Work Management, and much more, all tied together with the most elegant inventory data model I’ve seen.

Being a single vendor solution built on a relational database, the cross-pollination (enrichment) of data between all these different modules made it the most powerful insight engine I’ve worked with. With some SQL skills and an understanding of the data model, you could ask it complex cross-domain questions quite easily because all the data was stored in a single database. That edge of the sword made a powerful argument for a merged OSS/BSS.
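To illustrate the kind of cross-domain question I mean, here is a toy sketch using an in-memory SQLite database. The four tables and their columns are hypothetical stand-ins that I've invented for this example, not the actual product's data model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, heavily simplified tables from four different modules.
    CREATE TABLE device  (device_id INTEGER PRIMARY KEY, hostname TEXT);
    CREATE TABLE service (service_id INTEGER PRIMARY KEY, customer TEXT, device_id INTEGER);
    CREATE TABLE alarm   (alarm_id INTEGER PRIMARY KEY, device_id INTEGER, severity TEXT);
    CREATE TABLE ticket  (ticket_id INTEGER PRIMARY KEY, customer TEXT, status TEXT);

    INSERT INTO device  VALUES (1, 'core-rtr-01'), (2, 'edge-rtr-07');
    INSERT INTO service VALUES (10, 'Acme', 1), (11, 'Globex', 2);
    INSERT INTO alarm   VALUES (100, 1, 'critical'), (101, 2, 'minor');
    INSERT INTO ticket  VALUES (1000, 'Acme', 'open'), (1001, 'Globex', 'closed');
""")

# Which customers with an open trouble ticket are served by a device with a critical alarm?
rows = conn.execute("""
    SELECT DISTINCT s.customer, d.hostname
    FROM service s
    JOIN device d ON d.device_id = s.device_id
    JOIN alarm  a ON a.device_id = d.device_id AND a.severity = 'critical'
    JOIN ticket t ON t.customer = s.customer AND t.status = 'open'
    ORDER BY s.customer
""").fetchall()
print(rows)
```

One query spans inventory, service management, fault management and trouble ticketing. In a loosely coupled environment the same question would typically need an extract from each system and a reconciliation step before it could even be asked.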

Unfortunately, the level of cross-referencing that made it so powerful also made it really challenging to build an initial data set to facilitate all modules being inter-operable. By contrast, an independent inventory management solution could just pull data out of each NMS / EMS under management, massage the data for ingestion and then you’d have an operational system. The abovementioned solution also worked this way for inventory, but to get the other modules cross-referenced with the inventory required engineering rules, hand-stitched spreadsheets, rules of thumb, etc. Maintaining and upgrading also became challenges after the initial data had been created. In many cases, the clients didn’t have all of the data that was needed, so a data creation exercise needed to be undertaken.

If I had the choice, I would’ve done more of the cross-referencing at data level (eg via queries / reports) rather than entwining the modules together so tightly at application level… except in the most compelling cases. It’s an example of the chess-board analogy.

If given the option between merged (tightly coupled) and loosely coupled, which would you choose? Do you have any insights or experiences to share on how you’ve struck the best balance?

Bad OSS ego decisions

“A long, long time ago Dennis Haslinger told me that most of the most serious mistakes I would make in life would be bad ego decisions. I have found that to be true.”
Gary Halbert

OSS is an industry filled with highly intelligent people. In every country I’ve visited to work on OSS assignments, perhaps excluding Vietnam, my colleagues have been predominantly male. Dare I say it, do those two preceding facts imply a significant ego level exists on many (most?) OSS projects (or has it just been a coincidence that I’ve experienced)?

Given that OSS projects tend to cross business units, inter-departmental power plays like the one described in the Dilbert comic below can become just another potential pitfall.
Dilbert - I found a way to save a million dollars

To be honest, I can’t recall any examples where ego (mine or others’) has led to serious mistakes as such, but I’ve seen many cases where it’s led to serious stagnation and delays in project delivery that have been extremely costly.

One example is cited in this post, where the intellectual brilliance of one person caused a document to blow out from 30 pages to 150+, causing a 3+ month delay and more than $100k additional cost.

Stakeholder management and change management are highly underestimated factors in the success of OSS projects.

PS. The “intellectual brilliance” link above also talks about the possible benefits of smart contracts in OSS delivery. I wonder whether smart contracts will reduce some of the ego-related stagnation on OSS projects, or simply shift it from the delivery phase to the up-front smart contract agreement phase, thus introducing more “what if scenario” stagnation?

To reduce OSS dark data (or not)?

Dark data is the name for data that is collected but never used.
It’s said that 96-98% of all data is dark data (not that I can confirm or deny those claims).

Dark data forms the bottom layer in the DIKW hierarchy below (image sourced from here).
DIKW hierarchy

What would the dark data percentage be within OSS do you think? Or more specifically, your OSS?

If you’re not going to use it, then why collect it?

I have two conflicting trains of thought here:

  • The Minimum Viable Data perspective; and
  • It’s relatively cheap and easy to collect / store raw data if an interface is already built, so hoard it all just in case your data scientists (or automated data algorithms) ever need it

Where do you sit on the data collection spectrum?

Can you re-skill fast enough to justify microservices?

“There’s some things that I’ve challenged my team to do. We have to be faster than the web scale players and that sounds audacious. I tell them you can’t you can’t go to the bus station and catch a bus that’s already left the station by getting on a bus. We have to be faster than the people that we want to get to. And that sounds like an insane goal but that’s one of the goals we have. We have to speed up to catch the web scale players.”
John Donovan, AT&T at this link.

Last week saw a series of articles appear here on the PAOSS blog around the accumulation of tech-debt and how microservices / Agile had the potential to accelerate that accumulation.

The part that I find most interesting about this new approach to telco (or more to the point, to the Digital Service Provider (DSP) model) is that it speaks of a shift to being software companies like the OTT players. Most telcos are definitely “digital” companies, but very few could be called “software” companies.

All telcos have developers on their payroll but how many would have software roles filling more than 5% of their workforce? How many would list their developer pools amongst a handful of core strengths? I’d hazard a guess that the roots of most telcos’ core strengths would’ve been formed decades ago.

Software-centric networks are on the rise. Rapid implementation models like DevOps and Agile are on the rise. API / Microservice interfaces to network domains (irrespective of being VNF, PNF, etc) are on the rise. Software, software, software.

In response, telcos are talking software. Talking, but how many are doing?

Organic transition of the workforce (ie boomers out, millennials in) isn’t going to refresh fast enough. Are telcos actively re-inventing their resource pool? Are they re-skilling on a grand scale, often tens of thousands of people, to cater for a future mode of operation where software is a core capability like it is at the OTT players? Re-skilling at a speed that’s faster than the web-scale bus?

If they can’t, or don’t, then perhaps software is not really the focus. Software isn’t their differentiator… they do have many other strengths to work with after all.

If so then OSS, microservices, SDN / NFV, DevOps, etc are key operational requirements without being core differentiators. So therefore should they all be outsourced to trusted partners / vendors / integrators (rather than the current insourcing trend), thus delegating the responsibility for curating the tech-debt we spoke about last week?

I’m biased. I see OSS as a core differentiator (if done well), but few agree with me.

Is micro-strangulation underway within OSS?

Yesterday’s post spoke of how the accumulation of features was limiting us to small, incremental change.

The diagram below re-tells that story:
The increasing percentage of tech debt

You’ve probably noticed that microservices are the big buzz in our industry. They’re perceived as being the big white hope for our future. I have my reservations though.

If you’re at t0 in the chart above, microservices allow for rapid rollout of features, whole small-grid architectures even (in a Lean / MVP world). My reservations stem from the propensity for rapid release of microservices to amplify the accumulation of tech debt (ie the escalation of maintenance and testing in the chart above). They have the potential to take organisations to t0+100 really quickly.

The upside though is that replacement or re-factoring of smaller modules (ie microservices) should be easier than the change-out of monolithic software suites. The one caveat… we have to commit to a culture of subtraction projects being as important as feature releases.