The OSS self-driving vehicle

I was lucky enough to get some time of a friend recently, a friend who’s running a machine-learning network assurance proof-of-concept (PoC).

He’s been really impressed with the results coming out of the PoC. However, one of the really interesting factors he’s been finding is how frequently BAU (business as usual) changes in the OSS data (eg changes in naming conventions, topologies, etc) would impact results. Little changes made by upstream systems effectively invalidated baselines identified by the machine-learning engines to key in on. Those little changes meant the engine had to re-baseline / re-learn to build back up to previous insight levels. Or to avoid invalidating the baseline, it would require re-normalising all of data prior to the identification of BAU changes.

That got me wondering whether DevOps (or any other high-change environment) might actually hinder our attempts to get machine-led assurance optimisation. But more to the point, does constant change (at all levels of a telco business) hold us back from reaching our aim of closed-loop / zero-touch assurance?

Just like the proverbial self-driving car, will we always need someone at the wheel of our OSS just in case a situation arises that the machines hasn’t seen before and/or can’t handle? How far into the future will it be before we have enough trust to take our hands off the OSS wheel and let the machines drive closed-loop processes without observation by us?

An alternate way of slicing OSS (part 2)

Last week we talked about an alternate way of slicing OSS projects. Today, we’ll look a little deeper and include some diagrams.

The traditional (aka waterfall) approach to delivering an OSS project sees one big-bang delivery of business value at the end of the implementation.
OSS project delivery via waterfall

The yellow arrows indicate the sequential nature of this style of delivery. The implications include:

  1. If the project runs out of funds before the project finishes, no (negligible) value is delivered
  2. If there’s no modularity of delivery then the project team must stay the course of the original project plan. There’s no room for prioritising or dropping or including delivery modules. Project plans are rarely perfect at first after all
  3. Any changes in project plan tend to have knock-on effects into the rest of the delivery
  4. There is only one true delivery of value, but milestones demonstrate momentum for the project… a key for change management and team morale
  5. Large deliverables represent the proverbial overload one segment of the project delivery team then under-utilises the rest in each stage.  This isn’t great for project flow or team utilisation

The alternate approach seeks to deliver in multiple phases by business value, not artefacts, as shown in the sample model below:
OSS project delivery via AgilePhased enhancements following a base platform build (eg Sandpit and/or Single-site above) could include the following, where each provides a tangible outcome / benefit for the business, thus maintaining perception of momentum (assurance use-cases cited):

  • Additional event collection (ie additional collectors / probes / mediation-devices can be added or configured)
  • Additional filters / sorting of events
  • Event prioritisation mapping / presentation
  • Event correlation
  • Fault suppression
  • Fault escalation
  • Alarm augmentation
  • Alarm thresholding
  • Root-cause analysis (intra, then inter-domain)
  • Other configurations such as latching, auto-acknowledgement, visualisation parameters, etc
  • Heart-beat function (ie devices are unreachable for a user-defined period)
  • Knowledge base (ie developing a database of activities to respond to certain events)
  • Interfacing with other systems (eg trouble-ticket, work-force management, inventory, etc)
  • Setup of roles/groups
  • Setup of skills-based routing
  • Setup of reporting
  • Setup of notifications (eg email, SMS, etc)
  • Naming convention refinements
  • etc, etc

The latter is a more Agile-style breakdown of work, but doesn’t need to be delivered using Agile methodology.

Of course there are pros and cons of each approach. I’d love to hear your thoughts and experiences with different OSS delivery approaches.

Optimisation Support Systems

We’ve heard of OSS being an acronym for operational support systems, operations support systems, even open source software. I have a new one for you today – Optimisation Support Systems – that exists for no purpose other than to drive a mindset shift.

I think we have to transition from “expectations” in a hype sense to “expectations” in a goal sense. NFV is like any technology; it depends on a business case for what it proposes to do. There’s a lot wrong with living up to hype (like, it’s impossible), but living up to the goals set for a technology is never unrealistic. Much of the hype surrounding NFV was never linked to any real business case, any specific goal of the NFV ISG.”
Tom Nolle
in his blog here.

This is a really profound observation (and entire blog) from Tom. Our technology, OSS included, tends to be surrounded by “hyped” expectations – partly from our own optimistic desires, partly from vendor sales pitches. It’s far easier to build our expectations from hype than to actually understand and specify the goals that really matter. Goals that are end-to-end in manner and preferably quantifiable.

When embarking on a technology-led transformation, our aim is to “make things better,” obviously. A list of hundreds of functional requirements might help. However, having an up-front, clear understanding of the small number of use cases you’re optimising for tends to define much clearer goal-driven expectations.

Security and privacy as an OSS afterthought?

I often talk about OSS being an afterthought for network teams. I find that they’ll often design the network before thinking about how they’ll operationalise it with an OSS solution. That’s both in terms of network products (eg developing a new device and only thinking about building the EMS later), or building networks themselves.

It can be a bit frustrating because we feel we can give better solutions if we’re in the discussion from the outset. As OSS people, I’m sure you’ll back me up on this one. But we can’t go getting all high and mighty just yet. We might just be doing the same thing… but to security, privacy and analytics teams.

In terms of security, we’ll always consider security-based requirements (usually around application security, access management, etc) in our vendor / product selections. We’ll also include Data Control Network (DCN) designs and security appliance (eg firewalls, IPS, etc) effort in our implementation plans. Maybe we’ll even prescribe security zone plans for our OSS. But security is more than that (check out this post for example). We often overlook the end-to-end aspects such central authentication, API hardening, server / device patching, data sovereignty, etc and it then gets picked up by the relevant experts well into the project implementation.

Another one is privacy. Regulations like GDPR and the Facebook trials show us the growing importance of data privacy. I have to admit that historically, I’ve been guilty on this one, figuring that the more data sets I could stitch together, the greater the potential for unlocking amazing insights. Just one problem with that model – the more data sets that are stitched together, the more likely that privacy issues arise.

We increasingly have to figure out ways to weave security, privacy and analytics into our OSS planning up-front and not just think of them as overlays that can be developed after all of our key decisions have been made.

An alternate way of slicing OSS projects

One of the biggest challenges of big bang OSS project implementations is that all of the business value (ie the OSS and its data, workflows, integrations, etc) gets delivered at once, normally at the end of a lengthy exercise.

Ok, ok, so the delivery of value is not a challenge, it’s the implications of a big delivery of value that’s the challenge – implications that include:

  1. If the project runs out of funds before the project finishes, no value is delivered
  2. If there’s no modularity of delivery then the project team must stay the course of the original project plan. There’s no room for prioritising or dropping or including delivery modules. Project plans are rarely perfect at first after all
  3. Any changes in project plan tend to have knock-on effects into the rest of the delivery due to the sequential nature of typical project plans
  4. Any delivery of value represents a milestone, which in turn demonstrates momentum for the project… a key change management and team morale strategy
  5. Large deliverables represent the proverbial “pig in the python” – only one segment of the python (ie segment of the project delivery team) is engaged (hyper-engaged) whilst the other segments remain under-utilised.  This isn’t great for project flow or utilisation

When tasked with designing a project schedule, I’ve noticed that many vendors tend to follow the typical waterfall delivery and corresponding payment milestones (eg. design, then build, then test, then deploy, then hand over). The downside of this approach is that the business value (for the customer) is delivered at the end of the handover (ie big bang). There’s no business value in delivering design artefacts for example – the customer can’t use them to perform operational tasks.

The model I prefer sees incremental business value being delivered such as:

  • Proof of Concept (PoC) build
  • Sandpit build
  • Out of the box (OOTB) production build (ie. no customisations)
  • End-to-end use case #1 delivery (ie. design, build*, test, deploy, handover)
  • E2E use case #2 delivery
  • E2E use case #n delivery

where build* includes incremental configuration, customisation, integration, data migration, etc.

Just in time design

It’s interesting how we tend to go in cycles. Back in the early days of OSS, the network operators tended to build their OSS from the ground up. Then we went through a phase of using Commercial off-the-shelf (COTS) OSS software developed by third-party vendors. We now seem to be cycling back towards in-house development, but with collaboration that includes vendors and external assistance through open-source projects like ONAP. Interesting too how Agile fits in with these cycles.

Regardless of where we are in the cycle for our OSS, as implementers we’re always challenged with finding the Goldilocks amount of documentation – not too heavy, not too light, but just right.

The Agile Manifesto espouses, “working software over comprehensive documentation.” Sounds good to me! It perplexes me that some OSS implementations are bogged down by lengthy up-front documentation phases, especially if we’re basing the solution on COTS offerings. These can really stall the momentum of a project.

Once a solution has been selected (which often does require significant analysis and documentation), I’m more of a proponent of getting the COTS software stood up, even if only in a sandpit environment. This is where just-in-time (JIT) documentation comes into play. Rather than having every aspect of the solution documented (eg process flows, data models, high availability models, physical connectivity, logical connectivity, databases, etc, etc), we only need enough documentation for collaborative stakeholders to do their parts (eg IT to set up hardware / hosting, networks to set up physical connectivity, vendor to provide software, integrator to perform build, etc) to stand up a vanilla solution.

Then it’s time to start building trial scenarios through the solution. There’s usually quite a bit of trial and error in this stage, as we seek to optimise the scenarios for the intended users. Then we add a few more scenarios.

There’s little point trying to document the solution in detail before a scenario is trialled, but some documentation can be really helpful. For example, if the scenario is to build a small sub-section of a network, then draw up some diagrams of that sub-network that include the intended naming conventions for each object (eg device, physical connectivity, addresses, logical connectivity, etc). That allows you to determine whether there are unexpected challenges with naming conventions, data modelling, process design, etc. There are always unexpected challenges that arise!

I figure you’re better off documenting the real challenges than theorising on the “what if?” challenges, which is what often happens with up-front documentation exercises. There are always brilliant stakeholders who can imagine millions of possible challenges, but these often bog the design phase down.

With JIT design, once the solution evolves, the documentation can evolve too… if there is an ongoing reason for its existence (eg as a user guide, for a test plan, as a training cheat-sheet, a record of configuration for fault-finding purposes, etc).

Interestingly, the first value in the Agile Manifesto is, “individuals and interactions over processes and tools.” This is where the COTS vs in-house-dev comes back into play. When using COTS software, individuals, interactions and processes are partly driven by what the tools support. COTS functionality constrains us but we can still use Agile configuration and customisation to optimise our solution for our customers’ needs (where cost-benefit permits).

Having a working set of vanilla tools allows our customers to get a much better feel for what needs to be done rather than trying to understand the intent of up-front design documentation. And that’s the key to great customer outcomes – having the customers knowledgeable enough about the real solution (not hypothetical solutions) to make the most informed decisions possible.

Of course there are always challenges with this JIT design model too, especially when third-party contracts are involved!

Aggregated OSS buying models

Last week we discussed a sell-side co-op business model. Today we’ll look at buy-side co-op models.

In other industries, we hear of buying groups getting great deals through aggregated buying volumes. This is a little harder to achieve with products that are as uniquely customised as OSS. It’s possible that OSS buy-side aggregation could occur for operators that are similar in nature but don’t compete (eg regional operators). Having said that, I’ve yet to see any co-ops formed to gain OSS group-purchase benefits. If you have, I’d love to hear about it.

In OSS, there are three approaches that aren’t exactly co-op buying models but do aggregate the evaluation and buying decision.

The most obvious is for corporations that run multiple carriers under one umbrella such as Telefonica (see Telefonica’s various OSS / BSS contract notifications here), SingTel (group contracts here), etisalat, etc. There would appear to benefits in standardising OSS platforms across each of the group companies.

A far less formal co-op buying model I’ve noticed is the social-proof approach. This is where one, typically large, network operator in a region goes through an extensive OSS / BSS evaluation and chooses a vendor. Then there’s a domino effect where other, typically smaller, network operators also buy from the same vendor.

Even less formal again is by using third-party organisations like Passionate About OSS to assist with a standard vendor selection methodology. The vendors selected aren’t standardised because each operator’s needs are different, but the product / vendor selection methodology builds on the learnings of past selection processes across multiple operators. The benefits comes in the evaluation and decision frameworks.

An OSS data creation brain-fade

Many years ago, I made a data migration blunder that slowed a production OSS down to a crawl. Actually, less than a crawl. It almost became unusable.

I was tasked with creating a production database of a carrier’s entire network inventory, including data migration for a bunch of Nortel Passport ATM switches (yes, it was that long ago).

  • There were around 70 of these devices in the network
  • 14 usable slots in each device (ie slots not reserved for processing, resilience, etc)
  • Depending on the card type there were different port densities, but let’s say there were 4 physical ports per slot
  • Up to 2,000 VPIs per port
  • Up to 65,000 VCIs per VPI
  • The customer was running SPVC

To make it easier for the operator to create a new customer service, I thought I should script-create every VPI/VCI on every port on every devices. That would allow the operator to just select any available VPI/VCI from within the OSS when provisioning (or later, auto-provisioning) a service.

There was just one problem with this brainwave. For this particular OSS, each VPI/VCI represented a logical port that became an entry alongside physical ports in the OSS‘s ports table… You can see what’s about to happen can’t you? If only I could’ve….

My script auto-created nearly 510 billion VCI logical ports; over 525 billion records in the ports table if you also include VPIs and physical ports…. in a production database. And that was just the ATM switches!

So instead of making life easier for the operators, it actually brought the OSS‘s database to a near stand-still. Brilliant!!

Luckily for me, it was a greenfields OSS build and the production database was still being built up in readiness for operational users to take the reins. I was able to strip all the ports out and try again with a less idiotic data creation plan.

The reality was that there’s no way the customer could’ve ever used 2,000 x 65,000 VPI/VCI groupings I’d created on every single physical port. Put it this way, there were far less than 130 million services across all service types across all carriers across that whole country!

Instead, we just changed the service activation process to manually add new VPI/VCIs into the database on demand as one of the pre-cursor activities when creating each new customer service.

From that experience, I have reverted back to the Minimum Viable Data (MVD) mantra ever since.

OSS, with drama, without drama. Your choice

A recent blog from Seth Godin brought back some memories from a past project.

Two ways to solve a problem and provide a service.
With drama. Make sure the customer knows just how hard you’re working, what extent you’re going to in order to serve. Make a big deal out of the special order, the additional cost, the sweat and the tears.
Without drama. Make it look effortless.
Either can work. Depends on the customer and the situation.
Seth Godin here.

Over the course of the long-running and challenging project, I worked under a number of different Program Directors. The second last (chronologically) took the team barrel-chested down the “With Drama” path whilst the last took the “Without Drama” approach.

The “With Drama” approach was very melodramatic and political, but to be honest, was also really draining. It was draining because of the high levels of contact (eg meetings, reports, etc), reducing the amount of productive delivery time.

The “Without Drama” approach did make it look effortless, because by comparison it was effortless. The Program Director took responsibility for peer-level contact and cleared the way for the delivery team to focus on delivering. The team was still working well over 60 hour weeks, but it was now more clearly focused on delivery tasks. Interestingly, this approach brought a seemingly endless project to a systematic and clean conclusion (ie delivery) within about three months.

Now I’m not sure about your experiences or preferences, but I’d go with the “Without Drama” OSS delivery approach every time. The emotional intensity required of the “With Drama” approach just isn’t sustainable over long-running projects like our OSS projects tend to be.

What are your thoughts / experiences?

How an OSS is like an F1 car

A recent post discussed the challenge of getting a timeslice of operations people to help build the OSS. That post surmised, “as the old saying goes, you get back what you put in. In the case of OSS I’ve seen it time and again that operations need to contribute significantly to the implementation to ensure they get a solution that fits their needs.”

I have a new saying for you today, this time from T.D. Jakes, “You can’t be committed to the dream. You have to be committed to the process.”

If you’re representing an organisation that is buying an OSS solution from a vendor / integrator, please consider these two adages above. Sometimes we’re good at forming the dream (eg business requirements, business case, etc) and expecting the vendor to conduct almost all of the process. While our network operations teams are hired for the process of managing the network, we also need their significant input on the process of building / configuring an OSS. The vendor / integrator can’t just develop it in isolation and then hand it over to ops with a few days of training at the end.

The process of bringing a new OSS into an organisation is not like buying a road car. With an OSS, you can’t just place an order with some optional features like paint and trim specified, then expect to start driving it as soon as it leaves the vendor’s assembly line. It’s more like an F1 car where the driver is in constant communications with the pit-crew, changing and tweaking and refining to optimise the car to the driver’s unique needs (and in turn to hopefully optimise the results).

At least, that’s what current-state OSS are like. Perhaps in the future… we’ll strive to refine our OSS to be more like a road-car – standardised and intuitive enough for operators to drive straight off the assembly line.

A defacto spatial manager

Many years ago, I was lucky enough to lead a team responsible for designing a complex inside and outside plant network in a massive oil and gas precinct. It had over 120 buildings and more than 30 networked systems.

We were tasked with using CAD (Computer Aided Design) and Office tools to design the comms and security solution for the precinct. And when I say security, not just network security, but building access control, number plate recognition, coast guard and even advanced RADAR amongst other things.

One of the cool aspects of the project was that it was more three-dimensional than a typical telco design. A telco cable network is usually planned on x and y coordinates because the y coordinate is usually on one or two planes (eg all ducts are at say 0.6m below ground level or all catenary wires between poles are at say 5m above ground). However, on this site, cable trays ran at all sorts of levels to run around critical gas processing infrastructure.

We actually proposed to implement a light-weight OSS for management of the network, including outside plant assets, due to the easy maintainability compared with CAD files. The customer’s existing CAD files may have been perfect when initially built / handed-over, but were nearly useless to us because of all the undocumented that had happened in the ensuing period. However, the customer was used to CAD files and wanted to stay with CAD files.

This led to another cool aspect of the project – we had to build out defacto OSS data models to capture and maintain the designs.

We modelled:

  • The support plane (trayway, ducts, sub-ducts, trenches, lead-ins, etc)
  • The physical connectivity plane (cables, splices, patch-panels, network termination points, physical ports, devices, etc)
  • The logical connectivity plane (circuits, system connectivity, asset utilisation, available capacity, etc)
  • Interconnection between these planes
  • Life-cycle change management

This definitely gave a better appreciation for the type of rules, variants and required data sets that reside under the hood of a typical OSS.

Have you ever had a non-OSS project that gave you a better appreciation / understanding of OSS?

I’m also curious. Have any of you used designed your physical network plane in three dimensions? With a custom or out-of-the-box tool?

Taking SMEs out of ops to build an OSS

OSS are there to do just that – support operations. So as OSS implementers we have to do just that too.

But as the old saying goes, you get back what you put in. In the case of OSS I’ve seen it time and again that operations need to contribute significantly to the implementation to ensure they get a solution that fits their needs.

Just one problem here though. Operations are hired to operate the network, not build OSS. Now let’s assume the operations team does decide to commit heavily to your OSS build, thus taking away from network ops at some level (unless they choose to supplement the ops team).

That still leaves operations team leaders with a dilemma. Do they take certain SMEs out of ops to focus entirely on the OSS build (and thus act as nominees for the rest of the team) or do they cycle many of their ops people through the dual roles (at risk of task-switching inefficiency)?

There are pros and cons with each aren’t there? Which would you choose and why? Do you have an alternate approach?

Front-loading with OSS auto-discovery

Yesterday’s post discussed the merits of front-loading effort on knowledge transfer of new starters and automated testing, whilst acknowledging the challenges that often prevent that from happening.

Today we look at the front-loading benefits of building OSS / network auto-discovery tools.

We all know that OSS are only as good as the data we seed them with. As the old saying goes, garbage in, garbage out.

Assurance / network-health data is generally collected directly from the network in near real time, typically using SNMP traps, syslog events and similar. The network is generally the data master for this assurance-style data, so it makes sense to pull data from the network wherever possible (ie bottom-up data flows).

Fulfilment data, in the form of customer orders, network designs, etc are often captured in external systems first (ie as master) and pushed into the network as configurations (ie top-down data flows).
Bottom-up and top-down OSS data flows

These two flows meet in the middle, as they both tend to rely on network inventory and/or resources. Bottom-up – Network faults to be tied to inventory / resource / topology information (eg fibre cuts, port failure, device failure, etc) which are important for fault identification (Root Cause Analysis [RCA]). Similarly for top-down – customer services / circuits / contracts tend to consume inventory, capacity and/or other resources.

Looking end-to-end and correlating network health (assurance) to customer service health (fulfilment) (eg as Service Level Agreement [SLA] analysis, Service Impact Analysis [SIA]) tends to only be possible due to reconciliation via inventory / resource data sets as linking keys.

Seeding an OSS with inventory / resource data can be done via three methods:

  1. Data migration (eg script-loading from external sources such as spreadsheet or CSV files)
  2. Manual data creation
  3. Auto-discovery (ie collection of data directly from the network)
  4. (or a combination of the above)

Options 1 and 2 are probably the more traditional method of initial seeding of OSS databases, mainly because they tend to be faster to demonstrate progress.

Option 3 is the front-loading option that can be challenging in the short-to-medium term but will hopefully prove beneficial in the longer term (just like knowledge transfer and automated testing).

It might seem easy to just suck data directly out of the network, but the devil is definitely in the detail, details such as:

  • Choosing optimal timings to poll the network without saturating it (if notification aren’t being pushed by the network), not to mention session handling of these long-running data transfers
  • Building the mediation layer to perform protocol conversion and data mappings
  • Field translation / mapping to common naming standards. This can be much more difficult than it sounds and is key to allow the assurance and fulfilment meet-in-the-middle correlations to occur (as described above)
  • Reconciliation between the data being presented by the network and what’s already in the OSS database. In theory, the data presented by the network should always be right, but there are scenarios where it’s not (eg flapping ports giving the appearance of assets being present / absent from network audit data, assets in test / maintenance mode that aren’t intended be accepted into the OSS inventory pool, lifecycle transitions from planned to built, etc)
  • Discrepancy and exception handling rules
  • All of the above make it challenging for “siloed” data sets, but perhaps even more challenging is in the discovery and auto-stitching of cross-domain data sets (eg cross-domain circuits / services / resource chains)

The vexing question arises – do you front-load and seed via auto-discovery or perform a creation / migration that requires more manual intervention? In some cases, it could even be a combination of both as some domains are easier to auto-discover than others.

The OSS breathing coach analogy

To paraphrase the great Chinese philosopher Lao Tzu, resisting change is like trying to hold your breath – even if you’re successful, it won’t end well.”
Michael McQueen
here.

OSS is an interesting dichotomy. At one end of the scale, you have the breath holders – those who want the status quo to remain so that they can bring their OSS (and/or network) under control. At the other end, you have the hyperventilators – those who want to force a constant stream of change to overcome any perceived shortfalls in the current solution.

The more desirable state is probably a balance between breath holding and hyperventilation:

  • If you’re an OSS implementer (eg vendor, integrator, internal project delivery team), then you rely on change – as long as it’s enough to deliver an income, but not so much as to overwhelm you.
  • If you’re an OSS operator, then you long for an OSS that does its role perfectly and evolves at a manageable speed, allowing you stability.

The art of change management in OSS is to act as a breathing coach – to find the collective balance of respiration that suits the organisation whilst considering the natural tendencies of all of the different contributors to the project.

Just like breathing, change might seem simple, but is often completely underestimated as a result. But spend some time with any breathing coach or OSS change manager and you’ll find that there are many techniques that they call upon to find optimal balance.

OSS stepping stone or wet cement

Very often, what is meant to be a stepping stone turns out to be a slab of wet cement that will harden around your foot if you do not take the next step soon enough.”
Richelle E. Goodrich
.

Not sure about your parts of the world, but I’ve noticed the terms “tactical” (ie stepping stone solution) and “strategic” (ie long-term solution) entering the architectural vernacular here in Australia.

OSS seem to be full of tactical solutions. We’re always on a journey to somewhere else. I love that mindset – getting moving now, but still keeping the future in mind. There’s just one slight problem… how many times have we seen a tactical solution that was build years before? Perhaps it’s not actually a problem at all in some cases – the short-term fix is obviously “good enough” to have survived.

As a colleague insightfully pointed out last week – “if you create a tactical solution without also preparing a strategic solution, you don’t have a tactical solution, you have a solution.

When architecting your OSS solutions, do you find yourself more easily focussing on the tactical, the strategic, or is having an eye on both the essential part of your solution?

OSS compromise, not compromised

When you’ve got multiple powerful parties involved in a decision, compromise is unavoidable. The point is not that compromise is a necessary evil. Rather, compromise can be valuable in itself, because it demonstrates that you’ve made use of diverse opinions, which is a way of limiting risk.”
Chip and Dan Heath
in their book, Decisive.

This risk perspective on compromise (ie diversity of thought), is a fascinating one in the context of OSS.

Let’s just look at Vendor Selection as one example scenario. In the lead-up to buying a new OSS, there are always lots of different requirements that are thrown into the hat. These requirements are likely to come from more than one business unit, and from a diverse set of actors / contributors. This process, the OSS Thrashing process, tends to lead to some very robust discussions. Even in the highly unlikely event of every requirement being met by a single OSS solution, there are still compromises to be made in terms of prioritisation on which features are introduced first. Or which functionality is dropped / delayed if funding doesn’t permit.

The more likely situation is that each of the product options will have different strengths and weaknesses, each possibly aligning better or worse to some of the requirement contributor needs. By making the final decision, some requirements will be included, others precluded. Compromise isn’t an option, it’s a reality. The perspective posed by the Heath brothers is whether all requirement contributors enter the OSS vendor selection process prepared for compromise (thus diversity of thought) or does one actor / business-unit seek to steamroll the process (thus introducing greater risk)?

The OSS transformation dilemma

There’s a particular carrier that I know quite well that appears to despise a particular OSS vendor… but keeps coming back to them… and keeps getting let down by them… but keeps coming back to them. And I’m not just talking about support of their existing OSS, but whole new tools.

It never made sense to me… until reading Seth Godin’s blog today. In it, he states, “…this market segment knows that things that are too good to be true can’t possibly work, and that’s fine with them, because they don’t actually want to change–they simply want to be able to tell themselves that they tried. That the organization they paid their money to failed, of course it wasn’t their failure. Once you see that this short-cut market segment exists, you can choose to serve them or to ignore them. And you can be among them or refuse to buy in

It starts to makes sense. The same carrier has a tendency to spend big money on the big-4 consultants whenever an important decision needs to be made. If the big, ambitious project then fails, the carrier’s project sponsors can say that the big-4 organization they paid their money to failed.

Does that ring true of any telco you’ve worked with? That they don’t actually want to change–they simply want to be able to tell themselves that they tried (or be seen to have tried) with their OSS transformation?

Are we actually stuck in one big dilemma? Are our OSS transformations actually so hard that they’re destined to fail, yet are already failing so badly that we desperately need to transform them? If so, then Seth’s insightful observation gives the appearance of progress AND protection from the pain of failure.

Not sure about you, but I’ll take Seth’s “refuse to buy in” option and try to incite change.

Post Implementation Review (PIR)

Have you noticed that OSS projects need to go through extensive review to get funding of business cases? That makes sense. They tend to be a big investment after all. Many OSS projects fail, so we want to make sure this one doesn’t and we perform thorough planing / due-diligence.

But I do find it interesting that we spend less time and effort on Post Implementation Reviews (PIRs). We might do the review of the project, but do we compare with the Cost Benefit Analysis (CBA) that undoubtedly goes into each business case?

OSS Project Analysis Scales

Even more interesting is that we spend even less time and effort performing ongoing analysis of an implemented OSS
against against the CBA.

Why interesting? Well, if we took the time to figure out what has really worked, we might have better (and more persuasive) data to improve our future business cases. Not only that, but more chance to reduce the effort on the business case side of the scale compared with current approaches (as per diagrams above).

What do you think?

OSS – just in time rather than just in case

We all know that once installed, OSS tend to stay in place for many years. Too much effort to air-lift in. Too much effort to air-lift back out, especially if tightly integrated over time.

The monolithic COTS (off-the-shelf) tools of the past would generally be commissioned and customised during the initial implementation project, with occasional integrations thereafter. That meant we needed to plan out what functionality might be required in future years and ask for it to be implemented, just in case. Along with all the baked-in functionality that is never needed, and the just in case but possibly never used, we ended up with a lot of bloat in our OSS.

With the current approach of implementing core OSS building blocks, then utilising rapid release and microservice techniques, we have an ongoing enhancement train. This provides us with an opportunity to build just in time, to build only functionality that we know to be essential.

This has pluses and minuses. On the plus side, we have more opportunity to restrict delivery to only what’s needed. On the minus side, a just in time mindset can build a stop-gap culture rather than strategic, long-term thinking. It’s always good to have long-term thinkers / planners on the team to steer the rapid release implementations (and reductions / refactoring) and avoid a new cause of bloat.

Would you ever alarm your lab equipment?

Something curious dawned on me the other day – I wondered how many people / organisations actively manage alarms / alerts being generated by their lab equipment?

At first glance, this would seem silly. Lab environments are in constant flux, in all sorts of semi-configured situations, and therefore likely to be alarming their heads off at different times.

As such, it would seem even sillier to send alarms, syslogs, performance counters, etc from your lab boxes through to your production alarm management platform. Events on your lab equipment simply shouldn’t be distracting your NOC teams from handling events on your active network right?

But here’s why I’m curious, in a word, DevOps. Does the DevOps model now mean that some lab equipment stays in a relatively stable state and performs near-mission-critical activities like automated regression testing?? Therefore, does it make sense to selectively monitor / manage some lab equipment?

In what other scenarios might we wish to send lab alarms to the NOC (and I’m not talking about system integration testing type scenarios, but ongoing operational scenarios)?