The overlaps of DCIM with inventory, asset and config management

A regular reader of the PAOSS blog recently wrote, “I follow your blog with passion; the latest posts about Inventory are great [Ed. the reader is talking about this post about LNI and PNI and this one about Inventory vs Asset vs CMDB Management]. Could you do a post on Inside Plant vs Outside Plant vs Virtual network creation? We usually use CAD-based tools for Inside Plant design, both for TLC equipment, cabling, cross-connections, Distribution Frames, rooms, virtual rooms, row structures, etc, but also for power, conditioning, lighting, etc. We also use Network Inventory for Data Centre and server farm modelling. Outside Plant typically deals with GIS tools for cabling infrastructure. And now virtualisation of the network is also coming with NFV and SDN. What do you think?”

Great question.

In the post about Inventory vs Asset vs CMDB, we used the following Venn Diagram:

Unfortunately, there’s another circle that’s not shown on this diagram, but should be – the DCIM (Data Centre Infrastructure Management) circle. The overlaps between OSS and DCIM partially answer the questions above. We wrote a 5 part series on DCIM back in 2014 (part one, two, three, four, five), so perhaps it’s time for a re-visit.

The last of those five posts even included another Venn Diagram, as follows:

OSS, DCIM, ITSM Venn Diagram

Data Centre Infrastructure Management (DCIM) shares much of its DNA with OSS, but also has a number of unique differences.

Similarities:

  • IT and network device / inventory management
  • CSPs and Data Centres tend to have many Enterprise customers, and therefore a need to align with their IT service and life-cycle management (ITIL / ITSM) methodologies
  • Electronic data collection and storage to support fulfillment and assurance workflows
  • Analytics and operational decision support
  • Planning and design tools
  • Predictive modelling
  • Process and change management
  • Capacity planning, resource allocation and provisioning

Differences (ie what Data Centres have that traditional CSP networks don’t):

  • Facilities / Building Management Systems (FMS/BMS)
  • Energy / Power management
  • Environment and heat management (HVAC) including management of hot/cold zones
  • Data Centres tend to have less outside plant or inter-site connectivity* (ie most power and network connectivity tends to reside within the Data Centres)
  • However, Data Centre cable management has some slight differences. Network links are more likely to be managed within 3D spatial systems (x, y and height), if at all, rather than the 2D (x and y coordinates) typically plotted by most OSS inventory via GIS (Geographical Information Systems) or CAD (Computer Aided Design) drawings. Data Centre cables tend to be run in spatially-dense above-rack or below-floor trayways. By comparison, cables between sites tend to be less dense and at a fairly consistent height (eg a standard depth underground or a standard height when mounted on towers/poles aboveground)
  • Alternatively, DCs may manage spatial infrastructure through naming conventions such as rooms, rack-rows, racks and rack-positions rather than 3D spatial systems (a small sketch after this list shows both approaches)
  • Data Centres have traditionally had a higher proportion of virtualised assets than traditional CSPs, although that is now changing with the operator network embracing network virtualisation
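
To make those two location approaches concrete, here's a tiny Python sketch (all class and field names are my own invention, not any DCIM product's model) showing the same top-of-rack switch located by 3D coordinates and by naming convention:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpatialLocation:
    """3D spatial reference (x, y in metres from a site datum, z = height)."""
    x_m: float
    y_m: float
    z_m: float

@dataclass
class NamedLocation:
    """Naming-convention reference: room / rack-row / rack / rack unit."""
    room: str
    rack_row: str
    rack: str
    rack_unit: int

@dataclass
class DcDevice:
    device_id: str
    # A DCIM/PNI tool might populate either (or both) location styles.
    spatial: Optional[SpatialLocation] = None
    named: Optional[NamedLocation] = None

# Example: the same top-of-rack switch, located both ways.
tor_switch = DcDevice(
    device_id="TOR-SW-0042",
    spatial=SpatialLocation(x_m=12.5, y_m=3.2, z_m=1.8),
    named=NamedLocation(room="DC1-HALL-A", rack_row="R03", rack="R03-07", rack_unit=40),
)
print(tor_switch)
```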

 

So let’s now look at how it “might” all hang together, noting that each company is likely to be different depending on their systems and processes (a short sketch of one possible split follows the list):

  • DCIM manages facilities, building, power / PLCs and heating/cooling/HVAC
  • PNI manages physical connectivity (between sites and within the DC) as it can generally manage connectivity to physical ports on patch-panels / frames and physical devices (eg switches and routers) inside the DC. PNI also handles splicing and patching. PNI tools can generally also manage power cabling, although not everyone uses PNI for this
  • LNI (in conjunction with EMS [Element Management Systems] and virtual resource managers) will tend to manage the virtual / logical networks including resource management and orchestration
  • LNI will also tend to provide topological views of the network (often point-to-point links between physical/logical ports rather than the cable routes shown in PNI). LNI may also potentially include rack layouts and other forms of network visualisation. However, LNI tends to only partially show spatial presentation of the data (eg physical locations of “circuit” end-points rather than spatial location of all racks and equipment in 3D)
  • Related compute / storage infrastructure could be managed by DCIM, LNI, VIM, etc
  • And any of this could be cross-referenced as assets in the Asset Management System and/or Configuration Management Database (CMDB)
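
Here's one purely illustrative way of sketching that split in Python. The mapping below is hypothetical; your organisation's system-of-record assignments will almost certainly differ:

```python
# Hypothetical "system of record" mapping; every organisation's split will differ.
SYSTEM_OF_RECORD = {
    "building_facilities":  "DCIM",
    "power_and_hvac":       "DCIM",
    "physical_cabling":     "PNI",
    "patch_panels_splices": "PNI",
    "physical_devices":     "PNI",   # often mirrored into AMS / CMDB
    "logical_topology":     "LNI",
    "virtual_networks":     "LNI/VIM",
    "compute_storage":      "DCIM/LNI/VIM",
    "financial_asset_view": "AMS",
    "service_config_items": "CMDB",
}

def master_for(object_class: str) -> str:
    """Return the nominated master system for a given object class."""
    return SYSTEM_OF_RECORD.get(object_class, "unassigned")

print(master_for("physical_cabling"))   # -> PNI
```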

I can see that CAD might still be required for trayways, HVAC ducting, etc, because PNI isn’t really designed with that kind of 3D modelling in mind.

Having said that, I’d probably still attempt to get all connectivity and supporting infrastructure designed into a spatial visualisation tool like PNI rather than CAD. After all, connectivity of any type can be modelled as nodes and arcs (same as PNI). It’s just that ducting tends to have a greater 3D heft than the single line / arc of a typical comms cable.

Why is it important to have this data in a single spatial system rather than CAD? Well, I figure it should help future augmented reality (AR) use-cases like the ones described in the link.

So here’s the updated diagram:

* There are of course multi-site DC organisations that have links between their sites, but they tend to outsource their long-haul network links to traditional carriers.

The common data store trend

Some time back, we discussed  A modern twist on OSS architecture that is underpinned by a common data model.
 
Time to discuss this a little more visually.
 
As the blue boxes on the left side of the diagram below show, you may have many different data sources (some master, some slaved). You may have a single OSS tool (monolithic solution) or you may have many OSS tools (best-of-breed approach).
 
You may have multiple BSS, NMS and even direct connections to network devices. You may even have other sources of data that you’ve never used before such as weather patterns, lightning strikes, asset management prediction modelling, SCADA data, HVAC data, building access / security events, etc, etc.
 
The common data model allows you to aggregate those sets to provide insights that have never been readily accessible to you previously.
 
So let’s look at a few key points:
  1. Existing network layer systems (eg NMS, NE and their mediation devices) are currently sucking (near) real-time data (ie alarms and performance) out of the network and feeding it to an OSS directly. They may also be pushing inventory discovery data to the OSS, although probably loading it less frequently (typically once daily).
  2. The common data model provides a few options for data flows (a small sketch of option 2 follows this list):
    1. If the data store is performant enough, the network layer could feed real-time data to the data store, which on-forwards it to the OSS
    2. Multi-home the data from the network to the data store and OSS simultaneously
    3. Feed data from the network to the OSS, which may (or may not) process it before pushing to the data store
  3. Just a quick note regarding data flows: The network will tend to be the master for real-time / assurance flows. However, manual input tends to be the master for design/fulfil flows, so the OSS becomes the master of inventory data as per this link 
  4. The question then becomes where the data enrichment happens (ie appending inventory-related data to alarms) to help with root-cause and service-impact calculations. Enrichment / correlation probably needs to happen in the OSS‘s real-time engine, but it could source enrichment data directly from the network, from the OSS‘s inventory, or from the common data store 
  5. If the modern ETL tools (eg SNMP and syslog collectors, etc) allow you to do your own ETL to a common data store, a vendor OSS would only need one mediation device (ie to take data from the data store), rather than needing separate ones to pull from all the different NMS/EMS/NEs in your network. This has the potential to reduce mediation license costs from your OSS vendor
  6. Having said that, if you have difficult / proprietary interfaces that make it a challenge to do all of your own ETL then it might be best to let your OSS vendor build your mediation / ETL engines
  7. The big benefit of the common data store is you can choose a best-of-breed approach but still have a common data model to build Business Intelligence queries and reports around
  8. The common data store also takes load off the production OSS application / data servers. Queries and reports can be run against the common data platform, freeing up CPU cycles on the OSS for faster user interactions
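
To make option 2 (multi-homing) above a little more tangible, here's a toy Python sketch. The publish functions are stand-ins for whatever ETL / messaging technology you'd actually use:

```python
import json
import time

def publish_to_data_store(event: dict) -> None:
    # Stand-in for a write to your common data store (eg a Kafka topic or data lake).
    print("data-store <-", json.dumps(event))

def publish_to_oss(event: dict) -> None:
    # Stand-in for a push to the OSS's real-time engine.
    print("oss        <-", json.dumps(event))

def multi_home(event: dict) -> None:
    """Option 2: send the same network event to both destinations simultaneously."""
    publish_to_data_store(event)
    publish_to_oss(event)

# Example alarm event collected from the network layer.
alarm = {
    "timestamp": time.time(),
    "source": "NE-0099",
    "type": "LINK_DOWN",
    "severity": "CRITICAL",
}
multi_home(alarm)
```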

The Common Data Model is supported by a few key advancements:

  1. In the past, the mediation layer (ie getting data out of the network and into the OSS) was a challenge. Network operators didn’t tend to want to do this themselves. This introduced a dependency on software suppliers / integrators to build mediation devices and sell them to operators as part of their OSS/BSS solutions. But there’s been a proliferation of highly scalable ETL (Extract, Transform, Load) tools in recent years
  2. Many networks used to have proprietary interfaces that required significant expertise to integrate with. The increasing ubiquity of IP networking and common interfaces (eg SNMP and web interfaces like RESTful APIs, JSON, SOAP, XML) to the network layer makes ETL simpler (a small normalisation sketch follows this list)
  3. Massively scalable databases that don’t have as much dependency on relational integrity and can ingest data for myriad sources
  4. A proliferation of data visualisation tools that are user-friendly, rather than requiring users to be coders capable of writing complex SQL queries
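
Points 2 and 3 are really about normalising disparate feeds into one schema. Here's a minimal sketch of that idea (field names are invented and there are no real SNMP/REST calls):

```python
# Two records describing the same device, as they might arrive from
# an SNMP poller and a RESTful EMS respectively (field names are invented).
snmp_record = {"sysName": "bne-core-01", "ifInOctets": 123456, "ts": 1700000000}
rest_record = {"hostname": "bne-core-01", "rx_bytes": 123456, "collected_at": 1700000000}

def normalise(record: dict) -> dict:
    """Map either source format onto one common data model record."""
    return {
        "device":    record.get("sysName") or record.get("hostname"),
        "rx_octets": record.get("ifInOctets") or record.get("rx_bytes"),
        "timestamp": record.get("ts") or record.get("collected_at"),
    }

# Both feeds end up as the same common-model record.
assert normalise(snmp_record) == normalise(rest_record)
print(normalise(snmp_record))
```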
 

Softwarisation of 5G

As you have undoubtedly noticed, 5G is generating quite a bit of buzz in telco and OSS circles.

For many it’s just an n+1 generation of mobile standards, where n is currently 4 (well, the number of recent introductions into the market means n is probably now getting closer to 5  🙂  ).

But 5G introduces some fairly big changes from an OSS perspective. As usual with network transformations / innovations, OSS/BSS are key to operationalising (ie monetising) the tech. This report from TM Forum suggests that more than 60% of revenues from 5G use-cases will be dependent on OSS/BSS transformation.

And this great image from the 5G PPP Architecture Working Group shows how the 5G architecture becomes a lot more software-driven than previous architectures. Interesting how all 5 “software dimensions” are the domain of our OSS/BSS isn’t it? We could replace “5G architecture” with “OSS/BSS” in the diagram below and it wouldn’t feel out of place at all.

So, you may be wondering in what ways 5G will impact our OSS/BSS:

  • Network slicing – being able to carve up the network virtually, to generate network slices that are completely different functionally, means operators will be able to offer tailored, premium service offerings to different clients. This differs from the one-size-fits-all approach used previously. However, it also means that OSS/BSS complexity increases; it’s almost like you need an OSS/BSS stack for each network slice (a small sketch after this list illustrates the idea). Unless we can create massive operational efficiencies through automation, the cost to run the network will increase significantly. Definitely a no-no for the execs!!
  • Fibre deeper – since 5G will introduce increased cell density in many locations, and offer high throughput services, we’ll need to push fibre deeper into the network to support all those nano-cells, pico-cells, etc. That means an increased reliance on good outside plant (PNI – Physical Network Inventory) and workforce management (WFM) tools
  • Software defined networks, virtualisation and virtual infrastructure management (VIM) – since the networks become a lot more software-centric, that means there are more layers (and complexity) to manage.
  • Mobile Edge Compute (MEC) and virtualisation – 5G will help to serve use-cases that may need more compute at the edge of the radio network (ie base stations and cell sites). This means more cross-domain orchestration for our OSS/BSS to coordinate
  • And other use-cases where OSS/BSS will contribute including:
    • Multi-tenancy to support new business models
    • Programmability of disparate networks to create a homogenised solution (access, aggregation, core, mobile edge, satellite, IoT, cloud, etc)
    • Self-healing automations
    • Energy efficiency optimisation
    • Monitoring end-user experience
    • Zero-touch administration aspirations
    • Drone survey and augmented reality asset management
    • etc, etc
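
Here's a deliberately simplified Python sketch of that per-slice idea: a slice-aware resource record and a per-slice view filtered from shared inventory. All names and attributes are illustrative only:

```python
# Illustrative slice-aware inventory records (not any real data model).
resources = [
    {"id": "vRAN-001", "slice": "eMBB-premium",  "latency_ms": 20, "status": "ACTIVE"},
    {"id": "vUPF-007", "slice": "URLLC-factory", "latency_ms": 2,  "status": "ACTIVE"},
    {"id": "vUPF-008", "slice": "URLLC-factory", "latency_ms": 2,  "status": "DEGRADED"},
    {"id": "vCore-03", "slice": "mMTC-metering", "latency_ms": 50, "status": "ACTIVE"},
]

def slice_view(slice_name: str) -> list:
    """Return the subset of resources belonging to one network slice,
    ie one 'per-slice OSS view' carved out of the shared inventory."""
    return [r for r in resources if r["slice"] == slice_name]

for r in slice_view("URLLC-factory"):
    print(r["id"], r["status"])
```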

Fun times ahead for OSS transformations! I just hope we can keep up and allow the operator market to get everything it wants / needs from the possibilities of 5G.

An Asset Management / Inventory trick

Last week we discussed the nuances between Inventory, Asset and Config Management within an OSS stack. Each of these tools is designed to support functionality for different users / persona groups. However, they also tend to have significant functional overlap. Chances are your organisation doesn’t have separate dedicated tools for each.

So today I’m going to share a trick I’ve used in the past when I’ve only had a PNI (Physical Network Inventory) system to work with, but need to perform asset management style functionality.

Most inventory tools are great at storing the current state of a device that exists in a network. However, they don’t tend to be so great at an asset manager’s primary function – tracking the entire life-cycle of an asset from procurement to decommissioning and sparing / maintenance along the way.

Normally the PNI just records the locations of all the active network equipment – in buildings, exchanges, comms-huts, cabinets, etc. The trick I use is to create one or more additional locations representing warehouses. They may (or may not) correspond to the physical location of your real warehouse/s.

In almost all PNI systems, you have control over the status of a device (eg IN-SERVICE, etc). You can use this functionality to include statuses such as SPARE, UNDER REPAIR, etc, and switch a device between active network locations and the warehouse.

These status-change records give you the ability to pin-point the location of a given asset at any point in time. It also gives you trending stats, either as an individual device or as a cohort of devices (eg by make/model).

You can even build processes around it for check-in / check-out of the warehouse and maintenance scheduling.
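
Here's a minimal Python sketch of the trick, using invented statuses and locations. Each status/location change is recorded against the uniquely identified device, which is what later gives you the location trail and trending stats:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Device:
    serial: str                      # unique identifier (eg make/model + serial number)
    location: str                    # eg a network site, or the "warehouse" pseudo-location
    status: str                      # eg IN-SERVICE, SPARE, UNDER REPAIR
    history: list = field(default_factory=list)

    def move(self, location: str, status: str) -> None:
        """Record a status/location change so the full life-cycle trail is kept."""
        self.history.append((datetime.now(), self.location, self.status))
        self.location, self.status = location, status

router = Device(serial="MDL-X-12345", location="WAREHOUSE-BNE", status="SPARE")
router.move("EXCHANGE-SYD-04", "IN-SERVICE")   # deployed into the active network
router.move("WAREHOUSE-BNE", "UNDER REPAIR")   # pulled back to the warehouse for repair

# The trail: where the asset was, and in what state, at each point in time.
for when, loc, status in router.history:
    print(when, loc, status)
```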

I should point out that this works if your PNI allows you to uniquely identify a device (eg by make/model + serial number or perhaps a unique naming convention instance). If your PNI device records only show the current function of a device (eg a naming convention like SiteA-Router-0001), then you might lose sight of the device’s trail when it moves through life-cycle states (eg to the warehouse).

The differences between Inventory, Asset and Config Management in an OSS

We recently discussed the differences between PNI (Physical Network Inventory) and LNI (Logical Network Inventory) solutions that appear as part of many OSS stacks. 

As promised, today we’ll talk about the subtle differences between:

  • Inventory Management Systems 
  • Asset Management Systems and
  • Configuration Management Databases (CMDB)
  • We might even discuss Virtual Infrastructure Managers (VIM) and Resource Managers, as well as Config Managers (different from CMDB), too

Inventory vs Asset vs CMDB

To be honest, the diagram above doesn’t show adequate overlap. Each of these systems has a slightly different purpose, usually for a slightly different set of personas. However, they all play a part in managing the resources that make up an organisation’s Active Network (the network segment dedicated to carrying customer traffic, as opposed to internal corporate traffic).

Let’s start with Inventory Management Systems (IMS) because IMHO, these are the tools that were traditionally responsible for managing service-provider networks. These are the tools typically used by network planners, network engineers, capacity planners and other back-office operational staff.  As mentioned in the link above, these tools can be further broken down into:

  • PNI (Physical Network Inventory) – The physical devices like switches, routers, firewalls as well as the outside plant (OSP) like cables, joints, etc. Generally only used by operators with large, wide-spread networks of physical assets, especially outside plant.
  • LNI (Logical Network Inventory) – The set of objects that are formed using physical infrastructure (and possibly associations to other logical objects). This could include circuits, VLANs, and other overlay network topologies as well as the management of attributes like bandwidth, protocols and other network functionality

These tools tend to focus on the key physical/logical/virtual resources that comprise an operator’s active network (AN). However, they often also support functionality that crosses into other domains such as asset and config management.

Asset Management Systems (AMS), as the name implies, have a more “financial” purpose; where assets are objects of intrinsic financial value to an organisation. AMS tools tend to be used by the accounting and asset management teams. They’re used to track current value (purchase price minus depreciation), warranties, spares management, life-cycles / refresh / end-of-life of assets and their contracts, as well as reactive and predictive maintenance. AMS will tend to store information about most of the Active Network Physical devices. This means they will have records for the same devices as PNI, but often with different information / attributes. They won’t tend to store LNI-related data. However, AMS will often keep information about assets in addition to Active Network devices. This could include software licenses and more.

Configuration Management Database (CMDB) is more of an IT Service Management (ITSM) term. Like many IT concepts, ITSM is increasingly used in parts of service provider networks. CMDBs are databases of Configuration Items (CIs), where CIs can be logical or physical entities. CIs may (or may not) be physical devices (PNI) or logical resource entities (LNI) and may (or may not) represent tangible value (assets). The main purpose of CIs is to store information about IT services that allows other ITSM processes, such as Incident, Problem and Change Management, to be performed efficiently.
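
To illustrate the "same device, different attributes" point, here's a contrived sketch of how one router might appear in each repository (all field names are invented):

```python
# One physical router, three different views of it (illustrative only).
pni_view = {           # Physical Network Inventory: where it is and how it's connected
    "name": "SYD-PE-01",
    "site": "Sydney Exchange 04",
    "rack": "R12",
    "ports": 48,
}
ams_view = {           # Asset Management: what it's worth and when support expires
    "serial": "SN-998877",
    "purchase_price": 85000,
    "book_value": 34000,
    "warranty_expiry": "2026-06-30",
}
cmdb_view = {          # CMDB: a Configuration Item tied to ITSM processes
    "ci_id": "CI-004512",
    "ci_class": "network::router",
    "supported_services": ["VPN-GOLD-0007"],
    "change_freeze": False,
}

# A shared linking key (eg name or serial) lets you cross-reference the three.
print(pni_view["name"], ams_view["serial"], cmdb_view["ci_id"])
```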

Not only is there functional overlap between these systems, there’s often also terminology overlap and/or misalignments. Different vendors have different levels of functionality and support alternate use-cases, so the areas of overlap differ between organisations.

Oh, and I also promised to mention VIMs and Config Managers:

Virtual Infrastructure Managers (VIM) are responsible for managing the virtual resources made available by physical infrastructure like compute, storage and network devices. In some cases, VIMs instantiate virtual network functions (VNFs) or virtual machines (VMs) that can look almost identical to any other device stored in LNI, PNI, AMS and/or CMDB. In fact, instances of these VNFs and VMs may even appear in those systems.

Config Management (as opposed to, but also potentially overlapping with, CMDB) is all about managing the configurations of devices in the network (often both the active network and the corporate network). Each device, such as a router, has a configuration that tells the hardware how to function, where to route traffic, which packets to prioritise, where to send management logs (to the OSS), etc. Being able to monitor and manage these configurations centrally and consistently is the purpose of Config Managers. These are mostly used by network engineers to set policies and golden-configs (ie the config templates that all devices of that type must adhere to consistently). For example, you may have hundreds/thousands of devices in your network and want to re-point all management traffic to a new server as part of an OSS upgrade. Rather than configuring each device separately and manually, you can use the config management tool to push config changes out to the network.
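
As a toy example of that golden-config / re-pointing scenario, the Python sketch below renders one template change across a list of devices. The device list and push function are placeholders for your real config management tooling:

```python
# Hypothetical golden-config fragment: every device of this type must log to SYSLOG_HOST.
GOLDEN_CONFIG_TEMPLATE = "logging host {syslog_host}"

devices = ["SiteA-Router-0001", "SiteA-Router-0002", "SiteB-Router-0001"]

def push_config(device: str, config_line: str) -> None:
    # Stand-in for the real push mechanism (NETCONF, SSH, a vendor API, etc).
    print(f"{device}: applied '{config_line}'")

def repoint_management_traffic(new_syslog_host: str) -> None:
    """Push the updated golden-config line to every device, centrally and consistently."""
    config_line = GOLDEN_CONFIG_TEMPLATE.format(syslog_host=new_syslog_host)
    for device in devices:
        push_config(device, config_line)

repoint_management_traffic("10.0.0.50")   # eg the new OSS management server
```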

Leave us a message to describe how your organisation uses these (and other) tools.

Various forms of OSS Inventory

After reading other recent posts such as “Orders Down, Faults Up” and “How is OSS/BSS service and resource availability supposed to work?” an avid reader of the PAOSS blog posed the following brilliant question:

Do you have any thoughts on geospatial vs non geospatial network inventory systems? How often do you see physical plant mapping in a separate system from network inventory, with linkages or integrations between them, vs how often do you see physical and logical inventory being captured primarily in a geospatially oriented system?

Boy do I ever have some thoughts on this topic!! I’m sure you do too, so I’d love to hear what you think in the comments section below.

I was lucky. The first OSS/BSS that I worked on (all the way back in 2000), had both geo and non-geo (topology) views. It also had a brilliantly flexible data model that accommodated physical and logical inventory. All tightly integrated into one package. There aren’t many tools that can do all of that even today. Like I said, I was lucky to have this as a starting point!!

Like all things OSS/BSS, it starts with the personas and the key tasks they need to perform. Or from the supplier’s perspective, which customer personas they’re most actively targeting.

For example, if you have a significant Outside Plant (OSP) Network, then geo-positioning is vital. The exchanges and comms huts are easy enough to find, but pits, cable routes, easements, etc are often harder to find. It’s not uncommon for a field tech to waste time searching for a pit that’s covered in dirt, grass or snow. And knowing the exact cable route in geo view is helpful for sending field techs to the exact location of a fault (ie helping them to pinpoint the location of the bright yellow excavator that has just sliced through your inter-capital link). Geo-view is also important for OSP designers and the field workforce that builds the OSP network.

But other personas don’t care about seeing the detailed cable route. They just want to see a point-to-point topological link to represent physical connections between the ports on adjacent devices. This helps them to quickly understand the network or circuit / service view. They may also like to see an alarm overlay on the topology to quickly determine which parts of the network aren’t performing as expected. For these personas, seeing all the geo-detail just acts as visual noise that they need to subconsciously filter out to understand the topology view.

These personas also tend to want topological views of the network, not just the physical but the logical and virtual network / service overlays too.

In most cases that I can think of, the physical / OSP inventory tools show the physical devices (ports even) that the OSP network connects into. Their main focus is on the cables, joints, pits, pipes, catenaries, poles, lead-ins, patch-panels, patch-leads, splitters, etc. But showing the termination of cables onto active equipment (Inside Plant or ISP) is an important linking key between the physical and logical views.

The physical port (on the physical device) becomes the key demarcation between physical and logical worlds. The physical port connects physical cables / leads, but it also acts as the anchor point from which to create logical ports to which logical connections are made. As a result, the physical device and port tend to be shown in both physical (geo) and logical inventory tools. They also tend to be shown in both physical and logical network topology views.
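
Here's a minimal data-model sketch of that demarcation point: the physical port terminates a cable on the PNI side and anchors logical ports (and therefore logical connections) on the LNI side. Class and field names are illustrative, not any particular vendor's model:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PhysicalPort:
    device: str
    port_name: str                                    # eg "Gi0/0/1"
    cable_id: Optional[str] = None                    # PNI side: which cable terminates here
    logical_ports: List["LogicalPort"] = field(default_factory=list)   # LNI side: anchored logical ports

@dataclass
class LogicalPort:
    name: str                                         # eg a VLAN sub-interface
    parent: PhysicalPort

phys = PhysicalPort(device="SYD-PE-01", port_name="Gi0/0/1", cable_id="CBL-00881")
vlan_100 = LogicalPort(name="Gi0/0/1.100", parent=phys)
phys.logical_ports.append(vlan_100)

# The physical port is the linking key between the geo/PNI world and the topology/LNI world.
print(phys.cable_id, "<->", [lp.name for lp in phys.logical_ports])
```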

In the case of the original OSS/BSS I worked on, it had separate visualisation tools for geo, network and circuit/service, but all underpinned by a common data model.

What’s the best way? Different personas will have different perspectives of course. I prefer physical and logical inventories to be integrated out of the box (to allow simple cross-referencing, visually and in queries), but I also prefer them to have different views (eg geo, topology, network, circuit/service) to suit different situations.

I also find it helpful if each of those views allow the ability to drill down deeper into specific sections of the graph if necessary. I’d prefer not to have all of those different views overlaid onto a geo visualisation. Too much visual clutter IMHO, but others may love it that way.

Oh, and having separate LNI (Logical Network Inventory) and PNI (Physical Network Inventory) can be a tricky thing to reconcile. The LNI will almost always have programmatic interfaces (APIs) to collect data from, but will generally have to amalgamate many different sources. Meanwhile, the PNI consists of mostly passive equipment and therefore has no API to collect latest info from. I tend to use strategies at the above-mentioned demarcation point (ie physical ports) to help establish linking keys between LNI and PNI.

BTW. There’s one aspect of the question, “How often do you see physical plant mapping in a separate system from network inventory” that I haven’t fully answered. I’ll cover the question of asset management vs inventory management vs CMDB (Configuration Management Database) in more detail in an upcoming post. [Ed. See link here]

What’s in your OSS for me?

May I ask you a question?  Do the senior executives at your organisation ever USE your OSS/BSS?

I’d love to hear your answer.

My guess is that few, if any, do. Not directly anyway. They may depend on reports whose data comes from our OSS, but is that all?

Execs are ultimately responsible for signing off large budget allocations (in CAPEX and OPEX) for our OSS. But if they don’t see any tangible benefits, do the execs just see OSS as cost centres? And cost centres tend to become targets for cost reduction right?

Building on last week’s OSS Scoreboard Analogy, the senior execs are the head coaches of the team. They don’t need the transactional data our OSS are brilliant at collating (eg every network device’s health metrics). They need insights at a corporate objective level.

How can we increase the executives’ “what’s in it for me?” ranking of the OSS/BSS we implement? We can start by considering OSS design through the lens of senior executive responsibilities:

  • Strategy / objective development
  • Strategy execution (planning and ongoing management to targets)
  • Clear communication of priorities and goals
  • Optimising productivity
  • Risk management / mitigation
  • Optimising capital allocation
  • Team development

And they are busy, so they need concise, actionable information.

Do we deliver functionality that helps with any of those responsibilities? Rarely!

Could we? Definitely!

Should we? Again, I’d love to hear your thoughts!

 

The Ineffective OSS Scoreboard Analogy

Imagine for a moment that you’re the coach of a sporting team. You train your team and provide them with a strategy for the game. You send them out onto the court and let them play.

The scoreboard gives you all of the stats about each player. Their points, blocks, tackles, heart-rate, distance covered, errors, etc. But it doesn’t show the total score for each team or the time remaining in the game. 

That’s exactly what most OSS reports and dashboards are like! You receive all of the transactional data (eg alarms, truck-rolls, device performance metrics, etc), but not how you’re collectively tracking towards team objectives (eg growth targets, risk reduction, etc). 

Yes, you could infer whether the team is doing well by reverse engineering the transactional data. Yes, you could then apply strategies against those inferences in the hope that it has a positive impact. But that’s a whole lot of messing around in the chaos of the coach’s box with the scores close (you assume) and the game nearing the end (possibly). You don’t really know when the optimal time is to switch your best players back into the game.

As coach with funding available, would you be asking your support team to give you more transactional tools / data or the objective-based insights?

Does this analogy help articulate the message from the previous two posts (Wed and Thurs)?

PS. What if you wanted to build a coach-bot to replace yourself in the near future? Are you going to build automations that close the feedback loop against transactional data or are you going to be providing feedback that pulls many levers to optimise team objectives?

One big requirement category most OSS can’t meet

We talked yesterday about a range of OSS products that are more outcome-driven than our typically transactional OSS tools. There aren’t many of them around at this stage. I refer to them as “data bridge” products.
 
Our typical OSS tools help manage transactions (eg raising alarms, activating customer services, etc). They’re generally not so great at (directly) managing objectives such as:
  • Sign up an extra 50,000 customers along the new Southern network corridor this month
  • Optimise allocation of our $10M capital budget to improve average attainable speeds by 20% this financial year
  • Achieve 5% revenue growth in Q3
  • Reduce truck rolls by 10% in the next 6 months
  • Optimal management of the many factors that contribute to churn, thus reducing churn risk by 7% by next March
 
We provide tools to activate the extra 50,000 customers. We also provide reports / dashboards that visualise the numbers of activations. But we don’t tend to include the tools to manage ongoing modelling and option analysis to meet key objectives. Objectives that are generally quantitative and tied to time, cost, etc and possibly locations/regions. 
 
These objectives are often really difficult to model and have multiple inputs. Managing to them requires data that’s changing on a daily basis (or potentially even more often – think of how a single missed truck-roll ripples out through re-calculation of optimal workforce allocation).
 
That requires:
  • Access to data feeds from multiple sources (eg existing OSS, BSS and other sources like data lakes)
  • Near real-time data sets (or at least streaming or regularly updating data feeds)
  • An ability to quickly prepare and compare options (data modelling, possibly using machine-based learning algorithms)
  • Advanced visualisations (by geography, time, budget drawdown and any graph types you can think of)
  • Flexibility in what can be visualised and how it’s presented
  • Methods for delivering closed-loop feedback to optimise towards the objectives (eg RPA)
  • Potentially manage many different transaction-based levers (eg parallel project activities, field workforce allocations, etc) that contribute to rolled-up objectives / targets
 
You can see why I refer to this as a data bridge product right? I figure that it sits above all other data sources and provides the management bridge across them all. 
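
To make the "data bridge" idea a little more tangible, here's a toy sketch of tracking progress against one of the objectives listed earlier (the truck-roll reduction) from rolled-up transactional records. Everything here, from the target to the numbers, is invented:

```python
# Invented objective: reduce truck rolls by 10% over the baseline within 6 months.
BASELINE_TRUCK_ROLLS_PER_MONTH = 1000
TARGET_REDUCTION = 0.10

# Transactional records, eg rolled up monthly from workforce management data.
monthly_truck_rolls = {"month_1": 980, "month_2": 955, "month_3": 940}

def progress_to_target(actual: int) -> float:
    """Fraction of the reduction target achieved so far (1.0 = target met)."""
    achieved = (BASELINE_TRUCK_ROLLS_PER_MONTH - actual) / BASELINE_TRUCK_ROLLS_PER_MONTH
    return achieved / TARGET_REDUCTION

for month, rolls in monthly_truck_rolls.items():
    print(f"{month}: {progress_to_target(rolls):.0%} of the truck-roll objective")
```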
 
PS. If you want to know the name of the existing products that fit into the “data bridge” category, please leave us a message.

Do you want funding on an OSS project?

OSS tend to be very technical and transactional in nature. For example, a critical alarm happens, so we have to coordinate remedial actions as soon as possible. Or, a new customer has requested service so we have to coordinate the workforce to implement certain tasks in the physical and logical/virtual world. When you spend so much of your time solving transactional / tactical problems, you tend to think in a transactional / tactical way.
 
You can even see that in OSS product designs. They’ve been designed for personas who solve transactional problems (eg alarms, activations, etc). That’s important. It’s the coal-face that gets stuff done.
 
But who funds OSS projects? Are their personas thinking at a tactical level? Perhaps, but I suspect not on a full-time basis. Their thoughts might dive to a tactical level when there are outages or poor performance, but they’ll tend to be thinking more about strategy, risk mitigation and efficiency if/when they can get out of the tactical distractions.
 
Do our OSS meet project sponsor needs? Do our OSS provide functionality that help manage strategy, risk and efficiency? Well, our OSS can help with reports and dashboards that help them. But do reports and dashboards inspire them enough to invest millions? Could sponsors rightly ask, “I’m spending money, but what’s in it for me?”
 
What if we tasked our product teams to think in terms of business objectives instead of transactions? The objectives may include rolled-up transaction-based data and other metrics of course. But traditional metrics and activities are just a means to an end.
 
You’re probably thinking that there’s no way you can retrofit “objective design” into products that were designed years ago with transactions in mind. You’d be completely correct in most cases. So what’s the solution if you don’t have retrofit control over your products?
 
Well, there’s a class of OSS products that I refer to as being “the data bridge.” I’ll dive into more detail on these currently rare products tomorrow.

An OSS checksum

Yesterday’s post discussed two waves of decisions stemming from our increasing obsession with data collection.

“…the first wave had [arisen] because we’d almost all prefer to make data-driven decisions (ie decisions based on “proof”) rather than “gut-feel” decisions.

We’re increasingly seeing a second wave come through – to use data not just to identify trends and guide our decisions, but to drive automated actions.”

Unfortunately, the second wave has an even greater need for data correctness / quality than we’ve experienced before.

The first wave allowed for human intervention after the collection of data. That meant human logic could be applied to any unexpected anomalies that appeared.

With the second wave, we don’t have that luxury. It’s all processed by the automation. Even learning algorithms struggle with “dirty data.” Therefore, the data needs to be perfect and the automation’s algorithm needs to flawlessly cope with all expected and unexpected data sets.

Our OSS have always had a dependence on data quality so we’ve responded with sophisticated ways of reconciling and maintaining data. But the human logic buffer afforded a “less than perfect” starting point, as long as we sought to get ever-closer to the “perfection” asymptote.

Does wave 2 require us to solve the problem from a fundamentally different starting point? We have to assume perfection akin to a checksum of correctness.
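
Perfection may be unattainable, but a "checksum of correctness" could at least mean validating every record before an automation is allowed to act on it, and literally checksumming records to detect silent corruption between pipeline stages. A simple, hypothetical sketch:

```python
import hashlib
import json

REQUIRED_FIELDS = {"device_id", "site", "status"}

def is_fit_for_automation(record: dict) -> bool:
    """Gate an automation on basic data-quality checks (an invented rule set)."""
    return REQUIRED_FIELDS.issubset(record) and record["status"] in {"IN-SERVICE", "SPARE"}

def record_checksum(record: dict) -> str:
    """A literal checksum: detect silent changes/corruption between pipeline stages."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

good = {"device_id": "NE-0099", "site": "SYD-04", "status": "IN-SERVICE"}
dirty = {"device_id": "NE-0100", "site": None}

print(is_fit_for_automation(good), record_checksum(good)[:12])
print(is_fit_for_automation(dirty))   # False: a human (not the automation) should take a look
```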

Perfection isn’t something I’m very qualified at, so I’m open to hearing your ideas. 😉

 

Riffing with your OSS

Data collection and data science is becoming big business. Not just in telco – our OSS have always been one of the biggest data gatherers around – but across all sectors that are increasingly digitising (should I just say, “all sectors” because they’re all digitising?).

Why do you think we’re so keen to collect so much data?

I’m assuming that the first wave had mainly been because we’d almost all prefer to make data-driven decisions (ie decisions based on “proof”) rather than “gut-feel” decisions.

We’re increasingly seeing a second wave come through – to use data not just to identify trends and guide our decisions, but to drive automated actions.

I wonder whether this has the potential to buffer us from making key insights / observations about the business, especially senior leaders who don’t have the time to “science” their data? Have teams already cleansed, manipulated, aggregated and presented data, thus stripping out all the nuances before senior leaders ever even see your data?

I regretfully don’t get to “play” with data as much as I used to. I say regretfully because looking at raw data sets often gives you the opportunity to identify trends, outliers, anomalies and patterns that might otherwise remain hidden. Raw data also gives you the opportunity to riff off it – to observe and then ask different questions of the data.

How about you? Do you still get the opportunity to observe and hypothesise using raw OSS/BSS data? Or do you make your decisions using data that’s already been sanitised (eg executive dashboards / reports)?

 

Crossing the OSS chasm

Geoff Moore’s seminal book, “Crossing the Chasm,” described the psychological chasm between early buyers and the mainstream market.

Crossing the Chasm

Seth Godin cites Moore’s work, “Moore’s Crossing the Chasm helped marketers see that while innovation was the tool to reach the small group of early adopters and opinion leaders, it was insufficient to reach the masses. Because the masses don’t want something that’s new, they want something that works…

The lesson is simple:

– Early adopters are thrilled by the new. They seek innovation.

– Everyone else is wary of failure. They seek trust.”
 

I’d reason that almost all significant OSS buyer decisions fall into the “mainstream market” section in the diagram above.  Why? Well, an organisation might have the 15% of innovators / early-adopters conceptualising a new OSS project. However, sign-off of that project usually depends on a team of approvers / sponsors. Statistics suggest that 85% of the team is likely to exist in a mindset beyond the chasm and outweigh the 15%. 

The mainstream mindset is seeking something that works and something they can trust.

But OSS / digital transformation projects are hard to trust. They’re all complex and unique. They often fail to deliver on their promises. They’re rarely reliable or repeatable. They almost all require a leap of faith (and/or a burning platform) for the buyer’s team to proceed.

OSS sellers seek to differentiate from the 400+ other vendors (of course). How do they do this? Interestingly, mostly by pitching their innovations and uniqueness.

Do you see the gap here? The seller is pitching the left side of the chasm and the buyer cohort is on the right.

I wonder whether our infuriatingly lengthy sales cycles (often 12-18 months) could be reduced if only we could engineer our products and projects to be more mainstream, repeatable, reliable and trustworthy, whilst being less risky.

This is such a dilemma though. We desperately need to innovate, to take the industry beyond the chasm. Should we innovate by doing new stuff? Or should we do the old, important stuff in new and vastly improved ways? A bit of both??

Do we improve our products and transformations so that they can be used / performed by novices rather than designed for use by all the massive intellects that our industry seems to currently consist of?


A billion dollar bid

A few years ago I was lucky enough to be invited to lead a bid. I say lucky because the partner organisations are two of the most iconic firms in the tech industry. The bid was for bleeding-edge work, potentially worth well over a billion dollars. I was a little surprised to be honest. I mean, two tech titans, with many very, very clever people, much cleverer than me. Why would they need to look outside and engage me?

As it turned out, the answer became clear within the first few meetings. And whilst the project had little to do with OSS, it certainly had (has) parallels in the world of OSS.

Both of the organisations were highly siloed. Each product / capability silo had immense talent and immense depth to it. Our combined team had many PhDs who could discuss their own silo for hours, but could only point me in the general direction of what plugged into their products. 

Clearly, I was engaged to figure out the required end-to-end solution for the customer and then how to bolt the two sets of silos into that solution framework.

The same is true when looking for OSS solution gaps, in my experience at least. If you look into a domain or a product, the functionality / capability is usually quite well defined, understood and supported. For example, alarm / event managers are invariably very good at managing alarm / event lists.

If you’re going to find gaps, they’re more likely to be found in the end-to-end solution – in the handoffs, responsibility demarcation points, interfaces and processes that cross between silos. That’s why external consultancies can prove valuable for large organisations. They generally look into the cross-domain solution performance.

As you’d already know, the end-to-end solution is a combination of people, process and technology. Even so, as the “manager of managers,” I’m not sure our OSS tech is solving this problem as well as it could. Is there even a “glue” product that’s missing from our OSS/BSS stack?

Sure, we have some tools that fit this purpose – workflow engines, messaging buses, orchestration engines, data lakes, etc. Yet I still feel there’s an opportunity to do it far better. And the opportunity probably extends far beyond just OSS and into the broader IT industry.

What have you done to help solve this problem on your OSS suites?

PS. If you’re wondering what happened to the bid. Well, the team was excited to have made the shortlist of 3, but then the behemoths decided to withdraw from the race. Turns out that winning the bid could’ve jeopardised the even bigger supply contracts they already had with the client. Boggles the mind to think there were bigger contracts already in play!!

 

Inventory Management re-states its case

In a post last week we posed the question of whether Inventory Management still retains relevance. There are certainly use cases where it remains unquestionably needed. But perhaps there are others that are no longer required, relics of old-school processes and data flows.
 
If you have an extensive OSP (Outside Plant) network, you have almost no option but to store all this passive infrastructure in an Inventory Management solution. You don’t have the option of having an EMS (Element Management System) console / API to tell you the current design/location/status of the network. 
 
In the modern world of ubiquitous connectivity and overlay / virtual networks, Inventory Management might be less essential than it once was. For service qualification, provisioning and perhaps even capacity planning, everything you need to know is available on demand from the EMS/s. The network is a more correct version of the network inventory than an external repository (ie Inventory Management) can hope to be, even if you have great success with synchronisation.
 
But I have a couple of other new-age use-cases to share with you where Inventory Management still retains relevance.
 
One is for connectivity (okay, so this isn’t exactly a new-age use-case, but the scenario I’m about to describe is). If we have a modern overlay / virtual network, anything that stays within a domain is likely to be better served by its EMS equivalent, especially since connectivity is no longer as simple as physical connections or nearest neighbours with advanced routing protocols. But anything that goes cross-domain and/or off-net needs a mechanism to correlate, coordinate and connect. That’s the role the Inventory Manager is able to play (conceptually at least).
 
The other is for digital twinning. OSS (including Inventory Management) was the “original twin.” It was an offline mimic of the production network. But I cite Inventory Management as having a new-age requirement for the digital twin. I increasingly foresee the need for predictive scenarios to be modelled outside the production network (ie in the twin!). We want to try failure / degradation scenarios. We want to optimise our allocation of capital. We want to simulate and optimise customer experience under different network states and loads. We’re beginning to see the compute power that’s able to drive these scenarios (and more) at scale.
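
As a trivial illustration of that digital-twin use-case, the sketch below simulates a link failure against an offline copy of the service-to-link mapping to see which services would lose all of their paths. The services and links are invented:

```python
# Invented service-to-link mapping (an offline "twin" of part of the network).
services = {
    "VPN-GOLD-0007": ["L1"],          # single-homed on link L1: no protection
    "VPN-GOLD-0008": ["L1", "L3"],    # can ride either L1 or its protection link L3
}

def simulate_failure(failed_link: str) -> list:
    """Return the services left with no surviving candidate link if failed_link goes down."""
    return [svc for svc, candidate_links in services.items()
            if all(link == failed_link for link in candidate_links)]

print("Impacted by L1 failure:", simulate_failure("L1"))   # only the unprotected service
```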
 
Is it possible to handle these without an Inventory Manager (or equivalent)?

When OSS experts are wrong

“When experts are wrong, it’s often because they’re experts on an earlier version of the world.”
Paul Graham.
 
OSS experts are often wrong. Not only because of the “earlier version of the world” paradigm mentioned above, but also the “parallel worlds” paradigm that’s not explicitly mentioned. That is, they may be experts on one organisation’s OSS (possibly from spending years working on it), but have relatively little transferable expertise on other OSS.
 
It would be nice if the OSS world view never changed and we could just get more and more expert at it, approaching an asymptote of expertise. Alas, it’s never going to be like that. Instead, we experience a world that’s changing across some of our most fundamental building blocks.
 
“We are the sum total of our experiences.”
B.J. Neblett.
 
My earliest forays into OSS had a heavy focus on inventory. The tie-in between services, logical and physical inventory (and all use-cases around it) was probably core to me becoming passionate about OSS. I might even go as far as saying I’m “an Inventory guy.”
 
Those early forays occurred when there was a scarcity mindset in network resources. You provisioned what you needed and only expanded capacity within tight CAPEX envelopes. Managing inventory and optimising revenue using these scarce resources was important. We did that with the help of Inventory Management (IM) tools. Even end-users had a mindset of resource scarcity. 
 
But the world has changed. We now operate with a cloud-inspired abundance mindset. We over-provision physical resources so that we can just spin up logical / virtual resources whenever we wish. We have meshed, packet-switched networks rather than nailed-up circuits. Generally speaking, cost per resource has fallen dramatically, so we now get far higher port density and compute capacity per dollar, and a better dollar-per-bit ratio. Customers of the cloud generation assume an abundance of capacity that is available even in small, consumption-based increments. In many parts of the world we can also assume ubiquitous connectivity.
 
So, as “an inventory guy,” I have to question whether the scarcity to abundance transformation might even fundamentally change my world-view on inventory management. Do I even need an inventory management solution or should I just ask the network for resources when I want to turn on new customers and assume the capacity team has ensured there’s surplus to call upon?
 
Is the enormous expense we allocate to building and reconciling a digital twin of the network (ie the data gathered and used by Inventory Management) justified? Could we circumvent many of the fallouts (and a multitude of other problems) that occur because the inventory data doesn’t accurately reflect the real network?
 
For example, in the old days I always loved how much easier it was to provision a customer’s mobile / cellular or IN (Intelligent Network) service than a fixed-line service. It was easier because fixed-line service needed a whole lot more inventory allocation and reservation logic and process. Mobile / IN services didn’t rely on inventory, only an availability of capacity (mostly). Perhaps the day has almost come where all services are that easy to provision?
 
Yes, we continue to need asset management and capacity planning. Yes, we still need inventory management for physical plant that has no programmatic interface (eg cables, patch-panels, joints, etc). Yes, we still need to carefully control the capacity build-out to CAPEX to revenue balance (even more so now in a lower-profitability operator environment). But do many of the other traditional Inventory Management and resource provisioning use cases go away in a world of abundance?
 

 

I’d love to hear your opinions, especially from all you other “inventory guys” (and gals)!! Are your world-views, expertise and experiences changing along these lines too or does the world remain unchanged from your viewing point?
 
Hat tip to Garry for the seed of this post!

Google’s Circular Economy in OSS

OSS wear many hats and help many different functions within an organisation. One function that OSS assists might be surprising to some people – the CFO / Accounting function.

The traditional service provider business model tends to be CAPEX-heavy, with significant investment required on physical infrastructure. Since assets need to be depreciated and life-cycle managed, Accountants have an interest in the infrastructure that our OSS manage via Inventory Management (IM) tools.

I’ve been lucky enough to work with many network operators and see vastly different asset management approaches used by CFOs. These strategies have ranged from fastidious replacement of equipment as soon as depreciation cycles have expired through to building networks using refurbished equipment that has already passed manufacturer End-of-Life dates. These strategies fundamentally affect the business models of these operators.

Given that telecommunications operator revenues are trending lower globally, I feel it’s incumbent on us to use our OSS to deliver positive outcomes to global business models. 

With this in mind, I found this article entitled, “Circular Economy at Work in Google Data Centers,” to be quite interesting. It cites, “Google’s circular approach to optimizing end of life of servers based on Total Cost of Ownership (TCO) principles have resulted in hundreds of millions per year in cost avoidance.”

Google Asset Lifecycle

Asset lifecycle management is not your typical focus area for OSS experts, but an area where we can help add significant value for our customers!

Some operators use dedicated asset management tools such as SAP. Others use OSS IM tools. Others reconcile between both. There’s no single right answer.

For a deeper dive into ideas where our OSS can help in asset lifecycle (which Google describes as its Circular Economy and seems to manage using its ReSOLVE tool), I really recommend reviewing the article link above.

If you need to develop such a tool using machine learning models, reach out to us and we’ll point you towards some tools equivalent to ReSOLVE to augment your OSS.

Another OSS “forehead-slap” moment!

I don’t know about you, but I find this industry of ours has a remarkable ability to keep us humble. Barely a day goes by when I don’t have to slap my forehead and say, “uhhh…. of course!” (or perhaps, “D’oh!!”)

I had one such instance yesterday. I couldn’t figure out why a client’s telemetry / performance-management suite needed an inventory ingestion interface. Can you think of a reason (you probably can)???

My mind had followed the line of thinking that it was for reconciling with traditional inventory systems or perhaps some sort of topology reckoning. It’s far more rudimentary than that. 

Have you figured out what it might be used for yet?

Enrichment!

For example, if device names (hostnames) attached to the metrics aren’t human-readable, simple: just enrich the data with the human-readable alternate name. If you don’t know what device type is generating sub-sets of metrics, no problem, just enrich the data.

I’d heard of enrichment of alarms/events of course, but hadn’t followed that line of thinking for performance management before. Does your performance management stack allow you to enrich its data sets?
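
Here's a tiny sketch of that enrichment step: joining raw performance records against an inventory extract to add the human-readable name and device type. All identifiers are made up:

```python
# Inventory extract ingested by the performance-management tool (invented data).
inventory = {
    "ne-10-2-3-4": {"friendly_name": "Brisbane Core Router 01", "device_type": "PE router"},
    "ne-10-2-3-9": {"friendly_name": "Brisbane Edge Switch 07", "device_type": "access switch"},
}

raw_metric = {"hostname": "ne-10-2-3-4", "metric": "cpu_util", "value": 71.5}

def enrich(metric: dict) -> dict:
    """Append inventory attributes so the metric is human-readable in reports."""
    extra = inventory.get(metric["hostname"], {})
    return {**metric, **extra}

print(enrich(raw_metric))
```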

Seems obvious in hindsight! Smacked down again!!

I’d love to hear any anecdotes you have where OSS gave you a “forehead slap” moment.

Over 30 Autonomous Networking User Stories

The following is a set of user stories I’ve provided to TM Forum to help with their current Autonomous Networking initiative.

They’re just an initial discussion point for others to riff off. We’d love to get your comments, additions and recommended refinements too.

As a Head of Network Operations, I want to Automatically maintain the health of my network (within expected tolerances if necessary) So that Customer service quality is kept to an optimal level with little or no human intervention
As a Head of Network Operations, I want to Ensure the overall solution is designed with machine-led automations as a guiding principle So that Human intervention can not be easily engineered into the systems/processes
As a Head of Network Operations, I want to Automatically identify any failures of resources or services within the entire network So that All relevant data can be collected, logged, codified and earmarked for effective remedial action without human interaction
As a Head of Network Operations, I want to Automatically identify any degradation of resource or service performance within the network So that All relevant data can be collected, logged, codified and earmarked for effective remedial action without human interaction
As a Head of Network Operations, I want to Map each codified data set (for failure or degradation cases) to a remedial action plan So that Remedial activities can be initiated without human interaction
As a Head of Network Operations, I want to Identify which remedial activities can be initiated via a programmatic interface and which activities require manual involvement such as a truck roll So that Even manual activities can be automatically initiated
As a Head of Network Operations, I want to Ensure that automations are able to resolve all known failure / degradation scenarios So that Activities can be initiated for any failure or degradation and be automatically resolved through to closure (with little or no human intervention)
As a Head of Network Operations, I want to Ensure there is sufficient network resilience So that Any failure or degradation can be automatically bypassed (temporarily or permanently)
As a Head of Network Operations, I want to Ensure there is sufficient resilience within all support systems So that Any failure or degradation can be automatically bypassed (temporarily or permanently) to ensure customer service is maintained
As a Head of Network Operations, I want to Ensure that operator initiated changes (eg planned maintenance, software upgrades, etc) automatically generate change tracking, documentation and logging So that The change can be monitored (by systems and humans where necessary) to ensure there is minimal or no impact to customer services, but also to ensure resolution data is consistently recorded
As a Head of Network Operations, I want to Ensure that customer initiated changes (eg by raising an incident) automatically generate change tracking, documentation and logging So that The change can be monitored (by systems and humans where necessary) to ensure the incident is closed expediently, but also to ensure resolution data is consistently recorded
As a Head of Network Operations, I want to Initiate planned outages with or without triggering automated remedial activities So that The change agents can decide whether to use automations and ensure automations don’t adversely affect the activities that are scheduled for the planned outage window
As a Head of Network Operations, I want to Ensure that if an unplanned outage does occur, impacted customers are automatically notified (on first instance and via a communications sequence if necessary throughout the outage window) So that Customer experience can be managed as best possible
As a Head of Network Operations, I want to Ensure that if an unplanned outage does occur without a remedial action being triggered, a post-mortem analysis is initiated So that Automations can be revised to cope with this previously unhandled outage scenario
As a Head of Network Operations, I want to Ensure that even previously unseen failure scenarios can be handled by remedial automations So that Customer service quality is kept to an optimal level with little or no human intervention
As a Head of Network Operations, I want to Automatically monitor the effects of remedial actions So that Remedial automations don’t trigger race conditions that result in further degradation and/or downstream impacts
As a Head of Network Operations, I want to Be able to manually override any automations by following a documented sequence of events So that If a race condition is inadvertently triggered by an automation, it can be negated quickly and effectively before causing further degradation
As a Head of Network Operations, I want to Intentionally trigger network/service outages and/or degradations, including cascaded scenarios, on a scheduled and/or randomised basis So that The resilience of the network and systems can be thoroughly tested (and improved if necessary)
As a Head of Network Operations, I want to Intentionally trigger network/service outages and/or degradations, including cascaded scenarios on an ad-hoc basis So that The resilience of the network and systems can be thoroughly tested (and improved if necessary)
As a Head of Network Operations, I want to Perform scheduled compliance checks on the network So that Expected configurations and policies are in place across the network
As a Head of Network Operations, I want to Automatically generate scheduled reports relating to the effectiveness of the network, services and automations So that The overall solution health (including automations) can be monitored
As a Head of Network Operations, I want to Automatically generate dashboards (in near-real-time) relating to the effectiveness of the network, services and automations So that The overall solution health (including automations) can be monitored
As a Head of Network Operations, I want to Ensure that automations are able to extend across all domains within the solution So that Remedial actions aren’t constrained by system hand-offs
As a Head of Network Operations, I want to Ensure configuration backups are performed automatically on all relevant systems (eg EMS, OSS, etc) So that A recent good solution configuration can be stored as protection in case automations fail and corrupt configurations within the system
As a Head of Network Operations, I want to Ensure configuration restores are performed and tested automatically on all relevant systems (eg EMS, OSS, etc) So that A recent good solution configuration can be reverted to in case automations fail and corrupt configurations within the system
As a Head of Network Operations, I want to Ensure automations are able to manage the entire service lifecycle (add, modify/upgrade, suspend, restore, delete) So that Customer services can evolve to meet client expectations with little or no human intervention
As a Head of Network Operations, I want to Have a design and architecture that uses intent-based and/or policy-based actions So that The complexity of automations is minimised (eg automations don’t need to consider custom rules for different device makes/models, etc)
As a Head of Network Operations, I want to Ensure that as many components of the solution as possible (eg EMS, OSS, customer portals, etc) have programmatic interfaces (even if manual activities are required in back-end processes) So that Automations can initiate remedial actions in near real time
As a Head of Network Operations, I want to Ensure all components and data flows within the solution are securely hardened (eg encryption of data in motion and at rest) So that The power of the autonomous platform cannot be leveraged for nefarious purposes
As a Head of Network Operations, I want to Ensure that all required metrics can be automatically sourced from the network / systems in as near real time as feasible / useful So that Automations have the full set of data they need to initiate remedial actions and it is as up-to-date as possible for precise decision-making
As a Head of Network Operations, I want to Use the power of learning machines So that The sophistication and speed of remedial response is faster, more accurate and more reliable than if manual interaction were used
As a Head of Network Operations, I want to Record actual event patterns and replay scenarios offline So that Event clusters and response patterns can be thoroughly tested as part of the certification process prior to being released into production environments
As a Head of Network Operations, I want to Capture metrics that can be cross-referenced against event patterns and remedial actions So that Regressions and/or refinements can improve existing automations (ie continuous retraining of the model)
As a Head of Network Operations, I want to Be able to seed a knowledge base with relevant event/action data, whether the pattern source is from Production, an offline environment, a digital twin environment or other production-like environments So that The knowledge base is able to identify real scenarios, including intentionally initiated ones that would be too risky to trigger in a production environment because they could cause network degradation
As a Head of Network Operations, I want to Ensure that programmatic interfaces also allow for revert / rollback capabilities So that Remedial actions that aren’t beneficial can be rolled back to the previous state; OR other remedial actions are performed, allowing the automation to revert to original configuration / state
As a Head of Network Operations, I want to Be able to initiate circuit breakers to override any automations So that If a race condition is inadvertently triggered by an automation, it can be negated quickly and effectively before causing further degradation (a minimal sketch of this pattern appears after this list of stories)
As a Head of Network Operations, I want to Manually or automatically generate response-plans (ie documented sequences of activities) for any remedial actions fed back into the system So that Internal (eg quality control) or external (eg regulatory) bodies can review “best-practice” remedial activities at any point in time
As a Head of Network Operations, I want to Intentionally trigger catastrophic network failures (in non-prod environments) So that We can trial many remedial actions until we find an optimal solution to seed the knowledge base with
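
To make some of these stories more tangible, here’s a minimal sketch (in Python, with every name invented purely for illustration) of how an intent-based remedial automation might wrap each action with the monitoring, rollback and circuit-breaker safeguards described above. It isn’t any particular vendor’s API; it’s just one way the pieces could hang together:

    # Hypothetical sketch only: an intent-based remediation loop with a manual
    # circuit breaker and automatic rollback. All names are illustrative.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class RemedialAction:
        """One step of a remediation: how to apply it and how to revert it."""
        description: str
        apply: Callable[[], None]    # eg push config via a programmatic interface
        revert: Callable[[], None]   # eg restore the previous known-good config

    class CircuitBreaker:
        """Lets an operator halt automations if a race condition is suspected."""
        def __init__(self):
            self.tripped = False

        def trip(self):
            self.tripped = True

    def remediate(actions: list[RemedialAction],
                  service_is_healthy: Callable[[], bool],
                  breaker: CircuitBreaker) -> bool:
        """Apply actions in order; revert everything if health degrades or the
        circuit breaker is tripped. Returns True if the remediation held."""
        applied: list[RemedialAction] = []
        for action in actions:
            if breaker.tripped:
                break
            action.apply()
            applied.append(action)
            if not service_is_healthy():      # monitor the effect of each step
                break
        else:
            if service_is_healthy():
                return True                   # remediation succeeded end-to-end
        for action in reversed(applied):      # roll back in reverse order
            action.revert()
        return False

In practice, the apply/revert callables would be wired to the programmatic interfaces mentioned in the stories above, and the health check would draw on the near-real-time metrics feed.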

New OSS product – Restoration Manager

At Passionate About OSS, we’re lucky enough to count the utilities market as an important part of our client base. This probably puts us in the very small percentage of OSS exponents who work across both telco and utilities.

Utilities have a number of interesting and unique nuances compared with other OSS markets. Starting at the top, the network is core business for a telco, whereas the comms network only supports the core business of other utilities.

Despite their networks having vastly different functions, there are still many similarities between the operational support tools used by telcos and other utilities. Similarities include:

  • Network inventory is made up of nodes and arcs (nodes are routers vs pumps vs sub-stations; arcs are comms cables vs power cables vs pipes; see the sketch after this list)
  • All are CAPEX-heavy industries, so asset management is important from a financial (ie depreciation and “useful life remaining” modelling) as well as physical perspective
  • Assets need to be systematically life-cycle managed (ie commissioned, repaired, replaced, modified, maintained, decommissioned, etc)
  • A field workforce needs to be coordinated to keep the network in a healthy operational state
  • The network either provides (or supports) essential services, so rapid remediation of failed / degraded services is expected by customers
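
As a purely illustrative aside on the first bullet (with all names hypothetical), the same simple node/arc model can describe a telco network, a power network or a water network; only the vocabulary for the node and arc kinds changes:

    # Illustrative only: one node/arc inventory model spanning telco and utility assets
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        node_id: str
        kind: str        # eg "router", "pump", "sub-station"
        location: str

    @dataclass
    class Arc:
        arc_id: str
        kind: str        # eg "comms cable", "power cable", "pipe"
        a_end: str       # node_id at the A end
        b_end: str       # node_id at the B end

    @dataclass
    class Inventory:
        nodes: dict[str, Node] = field(default_factory=dict)
        arcs: dict[str, Arc] = field(default_factory=dict)

        def add_node(self, node: Node) -> None:
            self.nodes[node.node_id] = node

        def add_arc(self, arc: Arc) -> None:
            self.arcs[arc.arc_id] = arc

    # The same structure works whether the arcs carry bits, electrons or water
    inv = Inventory()
    inv.add_node(Node("N1", "router", "Exchange A"))
    inv.add_node(Node("N2", "sub-station", "Zone 3"))
    inv.add_arc(Arc("A1", "power cable", a_end="N2", b_end="N1"))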

Anyway, enough of the preamble. I find it interesting to observe the tools used by different utilities because they prompt alternative ways of thinking about our OSS.

Last week I observed a tool called a Restoration Manager. It is used widely in the power industry to handle fault restoration on power networks. It has little direct equivalent in comms network management.

Some ticket managers allow task templates to be developed and defined. Similarly, the Restoration Manager retains restoration plans, which are sequences of responses, but it goes further (see the sketch after this list) by:

  • Coordinating implementation of restoration plans in real time
  • Looking ahead before each step in the restoration plan to determine whether it is still useful or potentially harmful
  • Providing an indicator of whether the current network state is suited to being handled by each of the stored restoration plan/s
  • Coordinating restoration of planned or unplanned outages and even degradation events
  • Facilitating the use of AI and/or past restorations to create an optimal restoration plan
  • Documenting a proposed plan of action/s that can be audited by internal groups (eg engineering, QC, etc) or external groups (eg regulators)
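
Purely as a sketch (again in Python, with every name below invented rather than taken from any real product), the core of such a tool might boil down to three things: stored plans, an applicability check against the current network state, and a look-ahead test before each step is executed:

    # Hypothetical sketch of a Restoration Manager core: stored plans, an
    # applicability check and a look-ahead before each step. Names are invented.
    from dataclasses import dataclass
    from typing import Callable

    NetworkState = dict   # simplified stand-in for a real network state model

    @dataclass
    class RestorationStep:
        description: str
        still_useful: Callable[[NetworkState], bool]    # look-ahead test
        execute: Callable[[NetworkState], NetworkState]

    @dataclass
    class RestorationPlan:
        name: str
        applies_to: Callable[[NetworkState], bool]      # is the plan suited to the current state?
        steps: list[RestorationStep]

    def run_plan(plan: RestorationPlan, state: NetworkState, audit_log: list[str]) -> NetworkState:
        """Coordinate a stored restoration plan in real time, skipping steps the
        look-ahead deems no longer useful and logging every decision for audit."""
        if not plan.applies_to(state):
            audit_log.append(f"Plan '{plan.name}' not suited to current network state")
            return state
        for step in plan.steps:
            if not step.still_useful(state):            # look ahead before acting
                audit_log.append(f"Skipped (no longer useful): {step.description}")
                continue
            state = step.execute(state)
            audit_log.append(f"Executed: {step.description}")
        return state

The audit log in this sketch also hints at the last bullet: a documented trail of proposed and executed actions that internal or external reviewers could inspect.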

The restoration plans could be made to tie in with DRP (Disaster Recovery Plans), CRB (Change Request Boards), outage window sequencing/management and even security incident response.

What are your thoughts? Would a Restoration Manager be useful for our OSS stack (ie would it solve an existing, unsolved problem), or do we already have suitable ways of solving / avoiding the outage problem?