Today’s is the third part in a series about OSS/BSS testing (part 1, part 2).
Many people think about OSS/BSS testing in terms of application functionality and non-functional requirements. They also think about entry criteria / pre-requisites such as the environments, application builds / releases, test case development and maybe even the integrations required.
However, an often overlooked aspect of OSS/BSS functionality testing is the data that is required to underpin the tests.
Factors to be considered will include:
Programmatically collectable data – this refers to data that can be collected from data sources. Great examples are near-real-time alarm and performance data that can be collected by connecting to the network, either to real devices and NMS or simulators
Manually created or migrated data – this refers to data that is seeded into the OSS/BSS database. This could be manually created or script-loaded / migrated. Common examples of this are inventory data, especially information about passive infrastructure like buildings, cables, patch-panels, racks, etc. In some cases, even data that can be collected via a programmatic interface still needs to be augmented with manually created or migrated data
Base data – for consistency of test results, there usually needs to be a consistent data baseline to underpin the tests.
Reversion to baseline – If/when there are test cases that modify the base data set (eg provisioning a new service that reserves resources such as ports on a device), there may need to be a method of roll-back (or reinstatement) to base state. In other cases, a series of tests may require a cascading series of dependent changes (eg reserving a port before activating a service). These examples may need to be rolled-back as a sequence
Automated / Regression testing – if automations and/or regression testing is to be performed, automated reversion to consistent base data will also be required. Otherwise discrepancies will appear between automated test cycles
Migration of base data – for consistency of results between phases, there also needs to be consistency of data across the different environments on which testing is performed. This may require migration of base data between environments (see yesterday’s post about transitions between environments)
Multi-homing of data – particularly for real-time data, sources such as devices / NMS may issue data to more than one destination (eg different environments).
Reconciliation / Equivalency testing – When multi-homing of data sources is possible or when Current and Future PROD are running in parallel, equivalency testing can be performed. By performing comparison between the processed data in destination systems it allows for equivalency testing (eg between current and future state mediation devices / probes / MDDs). Transition planning will also be important here as we plan to migrate from Current PROD to Future PROD environments
Data Migration Testing – this is testing to confirm the end-to-end validity and completeness of migrated data sets
PROD data cuts – once a system is cut over to production status, it’s common to take copies of PROD data and load it onto lower environments. However, care should be taken that the PROD data doesn’t overwrite specially crafted base data (see “base data” dot-point above)
One of the most vital, but underestimated aspect of OSS/BSS project implementation is ensuring momentum is maintained. These large and complex projects are prone to stagnating at different stages, which can introduce pressure onto the implementation team.
As mentioned in yesterday’s post, the first in this week’s series, the test strategy and scheduling is regularly overlooked as a means of maintaining OSS project momentum. More specifically, careful planning of transitions between test phases and the environments that they’re run on, can demonstrate progress – where progress is seen through the introduction of business value.
The following diagram provides a highly stylised indicative timeline (x-axis) of activities, showing how to leverage multiple different environments (y-axis). See here for examples of additional environments that you may have on your project.
Data loads (reference data, symbolic data and/or real data extracts)
How builds and data loads can be cascaded between environments to reduce duplicated effort
Some other important call-outs from this stylised diagram include:
The Builds and Data Loads cascade from lower environments (eg from PROD-SUPPORT to PRE-PROD). Thought needs to be given as to which builds and data sets need to be cascaded from which environments
However, after PRE-PROD is handed over and becomes PROD, it is common for cuts of production data to be regularly loaded back into the lower environments so that they are PROD-like
Stand-up of new PROD environments is often a long lead-time item (because of size and complexity such as resilience architectures, security, etc), compared to lower environments. Environments such as DEV/TEST could be as easy to stand up as creating a new Virtual Machine/s on existing hosting
Different environments may have access to different data sources / integrations. For example, the lower environments may be connected to lab versions of devices and NMS/EMS. Alternatively, the network might be mimicked by simulators in non-PROD environments. Other integrations could be for active directory (AD), environment logging, patch management, etc. Sometimes production data sources can be connected to non-PROD OSS environments, but this is not so common
The item marked as Base-build on the PRE-PROD / PROD environment reflects the initial build and configuration of virtualisation, databases, storage, management networking, resilience / failover mechanisms, backup/restore, logging, security hardening and much more
Careful transition planning needs to go into PROD cutover. In the sample diagram, a final cut of data comes from the Current PROD environment to PRE-PROD before it becomes the Future PROD environment during official handover
You’ll notice though that there may still be a period of overlap between cutover from Current PROD to Future PROD. This is because there needs to be staged data source cutover in cases where data sources like network devices can’t multi-home data feeds to both environments in parallel.
On major software projects like the OSS you’re building, testing is an important phase of course. You’ll have undoubtedly incorporated testing into your planning. After all, testing is a key component of any Software Development Life Cycle (SDLC). There are various SDLC models / methodologies such as Waterfall, V-Model, Agile and others that you can consider.
Unfortunately, most OSS project teams tend to underestimate the testing phase, thinking it can just fit in around other major activities towards the end of the implementation. Experienced testers will suggest that they should be involved right from the requirement capture phase, because they’ll have to design test cases to prove that each requirement is met.
More importantly, your test strategy and test phase transitioning can play a major part in maintaining momentum through a project’s delivery phase. We’ll look into a number of related details in a series of posts this week.
Today we’ll look at the V-Model. It can be a helpful model for mapping requirements to test phases / cases. The diagram below, which comes from my book, Mastering your OSS, shows a simplified, sample version of the V-Model. It highlights the relationship between key test artefacts (eg plans / designs / specifications / requirements) on the left with the corresponding test phases on the right.
MTBF (Mean Time Between Failures) – the average elapsed time between failures of a system, service or device. It’s the basic measure of availability / reliability of the system / service / device. The higher, the better.
MTTR (Mean Time to Repair) – generally used to denote the average time to close a trouble ticket (to repair a failed system / service / device). It’s the basic measure of corrective action efficiency. The lower, the better.
Some also use MTTR as a Mean Time to Recover / Resolve (ie MTTD + MTTR in the diagram above) or Mean Time to Respond (MTTD in the diagram above to acknowledge an event and create a ticket). See why I get confused?
MTTD (Mean Time to Detect / Diagnose) – the average time taken from when an event is first generated and timestamped to when the NOC detects / diagnoses the cause and generates a ticket. The lower, the better.
MTTF (Mean Time to Failure) – the average system / service / device up-time
Today’s article pushes the vision a little further. If your CDM is built as part of an enterprise-wide data warehouse, then you may get the opportunity to think beyond the boundaries network and operations data.
Traditional OSS/BSS were built around highly structured, relational databases, usually designed by OSS/BSS product vendors. Each data architecture was designed to support the specific, baked-in use-cases offered by each product. It was like a building architect designing a building, let’s say a new wing of a university, from the ground up for a very specific purpose.
You don’t get the same luxury with your CDM. You have to take a multitude of existing platforms, applications and data models and attempt to turn them into a cohesive data set. This is a bit like taking a row of existing houses, extending and combining them to form a university wing. The “existing houses” might represent disparate OSS or BSS or network systems, but they could also be IT / data silos from various other parts of the organisation.
You know the latter “university” design will be compromised – discrepancies in data standards, data flows, siloed data knowledge, disjointed data governance, etc. However, it also comes with a big benefit. You can keep appending new sets of data that were never part of initial considerations, of any of the IT / data silos. It could be weather data, social trending, building approvals or so much more. It could be any data set that you think could unlock new insights.
But to form a coherent and valuable data set, you still need a common blueprint. As Stephanie Shen describes on towardsdatascience.com,
“…the following areas need to be considered and planned at this conceptual stage:
– The core data entities and data elements such as those about customers, products, sales.
– The output data needed by the clients and customers.
– The source data to be gathered and transformed or referenced to produce the output data.
– Ownership of each data entity and how it should be consumed and distributed based on business use cases.
– Security policies to be applied to each data entity.
– The relationships between the data entities, such as reference integrity, business rules, execution sequence.
– Standard data classification and taxonomy.
– Standards of data quality, operations, and Service Level Agreements (SLAs).
This conceptual level of design consists of the underlying data entities that support each business function.“
As with all valuable OSS tripods, your value in the CDM chain is being able to connect the silos of business, IT and operations.
Our OSS / BSS manage some of the world’s most vital comms infrastructure don’t they? That makes them pretty important assets to protect from cyber-intrusion. Therefore security is a key, but often underestimated, component of any OSS / BSS project.
Let me start by saying I’m no security expert. However, I have worked with quite a few experts tasked with securing my OSS projects and picked up a few ideas along the way. I’ll share a few of those ideas in today’s post.
We look at:
Security Trust Zones / Realms
Restricting Access to OSS / BSS systems and data
OSS / BSS Data Security
Real-time Security Logging / Monitoring
Security Testing / Hardening
Useful Security Standards
1. Security Trust Zones / Realms
For me, security starts with how you segment and segregate your network and related systems. The aim of segmentation / segregation is to restrict malicious access to sensitive data / systems. The diagram below shows a highly simplified three-realm design, starting at the bottom:
The operator’s Active Network realm – the network that carries live customer traffic and is managed by the CSP / operator [Noting though, that these are possibly managed as virtual and/or leased entities rather than owned]. It comprises the routers, switches, muxes, etc that make up the network. As such, this zone needs to be highly secure. Customers connect to the Active Network at the edge of the organisation’s network, often via CPE (Customer Premises Equipment), NTU (Network Termination Units) or similar. Dedicated Network Operation Centre (NOC) operator terminals tend to connect inside the Active Network
The operator’s Corporate / Enterprise realm – the network that houses the organisation’s corporate IT assets. This is where most corporate staff engage with core business services like desktop tools and so much more. If network operations staff need to connect to the Corporate / Enterprise realm but also reach into the Active Network realm, then an air-gap is usually established by the SCP between the two. This is bridged through technologies like Citrix, RDP (Remote Desktop Protocol) or similar
The Cloud / Internet realm – the external networks / infrastructure utilised by the organisation that are outside the organisation’s direct control. This includes Internet services, which many corporate users rely on of course. However, it may also include some important components of your OSS/BSS stack if provided as public cloud services, an increasingly common software supply model these days
You’ll also notice the all-important Security Control Points (SCP) like firewalls that provide segregation between the zones
In all likelihood, your security trust model will contain more than three zones, but these should be the absolute minimum.
The Active Network should be segregated from the Corporate / Enterprise network so that it can continue to provide service to customers even if the connection between them is lost (or intentionally severed if a security breach is identified).
This is where things get interesting. The Active Network and our Network Management stack rely on Shared Services such as DNS (Domain Naming System), NTP (Network Time Protocol), Identity / Access Management, Anti-Virus and more. These tend to be housed in Corporate / Enterprise realms. If we want the Active Network to be able to operate in complete standalone mode then we need to provide special consideration to the shared services architectures.
Aside: Traditionally, we’ve focused on perimeter defense and authenticated users are granted authorised access to a broad collection of resources. We now see the trend towards more remote users and cloud-based assets outside the enterprise-owned boundary in our OSS architectures. There’s currently debate around whether zero-trust architectures are required to segment more holistically – to restrict lateral movement within a network, assuming an attacker is already present on the network. The NIST ZTA draft discusses this emerging approach in more detail
Once we have the security trust zones identified, we now have to determine where our OSS / BSS / management stack resides within the zones. If we use the layers of the TMN Pyramid as a guide:
The Network Element Layer (NEL) is the heart of the Active Network
The EMS / NMS (Element / Network Management Systems) will also usually reside within the Active Network
The OSS / BSS are interesting. They have to interface with the network and EMS / NMS. But they also usually have to interface with corporate systems like data warehouses, reporting tools, etc. They’re so critical to managing the Active Network, they need to be highly secure. That means they could be placed inside the Active Network realm or even have their own special Central Management realm. In other cases, different components of the OSS / BSS might be spread across different realms.
Note that we also have to consider the systems (eg user portals, asset management systems, etc, etc) that our OSS / BSS need to interface to and where they reside in the trust model.
2. Restricting Access to OSS / BSS systems and data
We want to uniquely control who has access to what systems and data using our OSS / BSS stack.
The Security Trust model also impacts the architectures of Identity Management (Directory Services like Active Directory), User Access Management (UAM) and Privileged Access Management (PAM) solutions and how they control access to our OSS / BSS.
They serve three purposes:
To provide fine-grained management of access to privileged / restricted data and systems within our OSS / BSS
To simplify the administrative overhead of managing user access to our OSS / BSS by defining group-based user access policies
To log the activities of individual users whilst they use the OSS/BSS and related systems / networks
Most OSS / BSS allow user authentication via Directory Services these days. Most, but not all, also allow roles / privileges to be assigned via Directory Services. For example, RBAC (Role Based Access Control) is policy that is defined by our OSS / BSS applications. It controls what functions users / groups can perform via permission management. For central user administration purposes, it’s ideal that the Directory Service can pass role-based information to our OSS / BSS.
3. OSS / BSS Data Security
The first step in the data security process is to identify categories of data such as unclassified, confidential, secret, etc.
We then need to consider what security mechanisms need to be applied to each category. There are four main OSS / BSS data security considerations:
Data Anonymisation / Privacy – is the process of removing / redacting / encrypting personally identifiable information from the data sets stored in our OSS / BSS (particularly the latter). Our solutions need to store personal data such as names, addresses, contact details, billing details, etc. We can use techniques to control the pervasiveness of access to that data. For example, we may use a tightly restricted system to store personal details as well as a non-identifiable code (eg LocationID or ServiceID) for use by our other more widely accessed tools (eg PNI / LNI)
Encryption of data at rest – is the process of encrypting the large stores of data used by our OSS / BSS, whether a local database used by each application or in centralised data warehouses
Encryption of data in transit – is the process of encrypting data as it transits between components within your OSS/BSS stack (and possibly beyond). Techniques such as VPNs and IPSec protocols can be used. As we increasingly see OSS / BSS built as web-based applications, we’re using encrypted connections (eg HTTPS, SSL, TLS, etc) to protect our data
Physical security – is the process of restricting physical access to data stores (eg locked cabinets, facilities access management, etc). This isn’t always within our control as an OSS / BSS project team.
4. Real-time Security Logging / Monitoring
Ensure all systems in the management stack (OSS, BSS, NMS, EMS, the network, out-of-band management, etc) are logging to a central SIEM (Security Information and Event Management) tool. Oh, and don’t do what I saw one big bank do – they had so many hits occurring just on their IPS / IDS tool that they just left it sitting in the corner unmonitored and in the too-hard basket. By having the tools, they’d ticked their compliance box, but there was no checkbox asking them to actually look at the results or respond to the incidents identified!!
5. Patch Management
Software patch management is theoretically one of the simplest security management techniques to implement. It ensures you have the latest, hopefully most secure, version of all software.
OSS / BSS / Management stacks tend to have many, many different components. Not just at the obvious application level, but operating systems, third-party software (eg runtime environments, databases, application servers, message buses, antivirus software, syslog, etc).
Patch management is often well maintained by IT teams within the Corporate / Enterprise trust zone discussed above. They have access to the Internet to download patches and tools to help push updates out. However, the Active Network zone shouldn’t have direct access to the Internet, so routine patch management could be easily overlooked and/or difficult to implement. Sometimes the software components reside on servers that are rarely logged into and patches can be easily overlooked.
The other problem is that OSS / BSS applications are often heavily customised, making it hard to follow a standard upgrade path. I’ve seen OSS / BSS that haven’t been patched for years, even with something as simple as Jave runtime environments, because it causes the OSS / BSS to fail.
6. Security Testing / Hardening
Your organisation probably already has standards and checklists in place to ensure that all of your IT assets are as secure as possible. Your OSS / BSS environments are just one of those assets. However, as the “manager of managers” of your Active Network, the OSS / BSS is probably more important to secure than most.
Your organisation might also insist that all applications, including the OSS / BSS, are built on a hardened Standard Operating Environment (SOE). However, some suppliers provide OSS / BSS as appliances, built on their own environments. These then have to go through a hardening process in alignment with your corporate IT standards.
If using a vendor-supplied off-the-shelf application, it will be quite common for it to have a default admin account on the application and database. This makes it easier for the system implementation team to navigate their way around the solution when building it. However, one of the first steps in a hardening process is to rename or disable these built-in accounts.
As “manager of managers,” your OSS / BSS‘s primary purpose is to collect (or request) information from a variety of sources. Some of these sources reside in the Active Network. Others reside in the Corporate Network or elsewhere. As such, careful consideration needs to be given to what Ports / Protocols are allowed. Some systems will come pre-configured with default / open settings. However, these should be restricted to necessary protocols only, including SNMP, HTTPS, SSH, FTPS and/or similar.
Speaking of SNMP, its original design was inherently insecure as it uses a primitive method of authentication. It uses clear-text community strings to secure access to the management plane. Only version 3 of SNMP (ie SNMPv3) has the ability to authenticate and encrypt payloads, so this should be used wherever possible. Some of you may have legacy device types that precede SNMPv3 though. Alert TA17-156A provides suggestions to minimise exposure to SNMP abuse.
Also consider the environment on which you’re performing your security testing. As described in this post about OSS / BSS environments and test transitions, you’ll probably have multiple environments – PROD environments that are connected to the live Active Network devices and non-PROD environments that are connected to test lab devices and/or simulators. Where should you perform your penetration / security testing? Probably not on PROD, because you want to ensure the solution is already secure before letting it loose into Production. But you also want to ensure it’s the most PROD-like as possible. You could possibly use PRE-PROD (ie a state before a solution is cut-over to PROD), before it’s fully connected to the Active Network. Or, you could use the most PROD-like lower environment (eg Staging).
One other thing when conducting security tests and hardening – penetration testing often breaks things by injecting malicious code / data. Ensure you take a backup of any environment so you can roll-back to a working state after conducting your pen-tests.
7. Useful Security Standards
The following is a list of security standards that I’ve used in the past:
A regular reader of the PAOSS blog recently wrote, “I follow with passion your blog,latest post about Inventory are great [Ed. the reader is talking about this post about LNI and PNI and this one about Inventory vs Asset vs CMDB Management]. I ask you if possible have a post on Inside Plant vs Outside Plant vs Virtual network creation… we usually use CAD based tool for Inside Plant design both for TLC equipment, cabling, cross connection, Distribution Frame, rooms, virtual rooms, rows structure,etc but also for power, conditioning, lighfiring,etc. We also use Network Inventory for Datacenter and server farm modelling.Outside Plant typically deals with GIS tool for cabling infrastructure. And now also virtualizzation of Network is coming with NFV and SDN. What do you think about?”
Unfortunately, there’s another circle that’s not shown on this diagram, but should be – the DCIM (Data Centre Infrastructure Management) circle. The overlaps between OSS and DCIM partially answer the questions above. We wrote a 5 part series on DCIM back in 2014 (part one, two, three, four, five), so perhaps it’s time for a re-visit.
The last of those five posts even included another Venn Diagram, as follows:
Data Centre Infrastructure Management (DCIM) shares much of its DNA with OSS, but also has a number of unique differences.
IT and network device / inventory management
CSPs and Data Centres tend to have many Enterprise customers, and therefore a need to align with their IT service and life-cycle management (ITIL / ITSM) methodologies
Electronic data collection and storage to support fulfillment and assurance workflows
Analytics and operational decision support
Planning and design tools
Process and change management
Capacity planning, resource allocation and provisioning
Differences (ie what Data Centres have that traditional CSP networks don’t):
Facilities / Building Management Systems (FMS/BMS)
Energy / Power management
Environment and heat management (HVAC) including management of hot/cold zones
Data Centres tend to have less outside plant or inter-site connectivity* (ie most power and network connectivity tends to reside within the Data Centres)
However, Data Centre cable management have some slight differences. Network links are more likely to be managed within 3D spatial systems (x, y and height) if at all, rather than the 2D (x and y coordinates) typically plotted by most OSS inventory via GIS (Geographical Information Systems) or CAD (Computer Aided Design) drawings. Data Centre cables tend to be run in spatially-dense above-rack or below-floor trayways. By comparison, cables between sites tends to be less dense and at a fairly consistent height (eg a standard depth underground or a standard height when mounted on towers/poles aboveground)
Alternatively, DCs may manage spatial infrastructure through naming conventions such as rooms, rack-rows, racks, rack-position rather than 3D spatial systems
Data Centres have traditionally had a higher proportion of virtualised assets than traditional CSPs, although that is now changing with the operator network embracing network virtualisation
So let’s now look at how it “might” all hang together (noting that each company is likely to be different depending on their systems and processes):
DCIM manages facilities, building, power / PLCs and heating/cooling/HVAC
PNI manages physical connectivity (between sites and within the DC) as it can generally manage connectivity to physical ports on patch-panels / frames and physical devices (eg switches and routers) inside the DC. PNI also handles splicing and patching. PNI tools can generally also manage power cabling, although not everyone uses PNI for this
LNI (in conjunction with EMS [Element Management Systems] and virtual resource managers) will tend to manage the virtual / logical networks including resource management and orchestration
LNI will also tend to provide topological views of the network (often point-to-point links between physical/logical ports rather than the cable routes shown in PNI). LNI may also potentially include rack layouts and other forms of network visualisation. However, LNI tends to only partially show spatial presentation of the data (eg physical locations of “circuit” end-points rather than spatial location of all racks and equipment in 3D)
Related compute / storage infrastructure could be managed by DCIM, LNI, VIM, etc
And any of this could be cross-referenced as assets in the Asset Management System and/or Configuration Management Database (CMDB)
I can see that CAD might still be required for trayway, HVAC ducting, etc because PNI isn’t really designed with this in mind in 3D.
Having said that, I’d probably still attempt to get all connectivity and support designed into a spatial visualisation tool like PNI rather than CAD. Afterall, connectivity of any type can be modelled as nodes and arcs (same as PNI). It’s just that ducting tends to have a greater 3D heft than a single line / arc of a typical comms cable.
Why is it important to have this data in a single spatial system rather than CAD? Well, I figure it should help future augmented reality (AR) use-cases like the ones described in the link.
So here’s the updated diagram:
* There are of course multi-site DC organisations that have links between their sites, but they tend to outsource their long-haul network links to traditional carriers.
As the blue boxes on the left side of the diagram below show, you may have many different data sources (some master, some slaved). You may have a single OSS tool (monolithic solution) or you may have many OSS tools (best-of-breed approach).
You may have multiple BSS, NMS and even direct connections to network devices. You may even have other sources of data that you’ve never used before such as weather patterns, lightning strikes, asset management prediction modelling, SCADA data, HVAC data, building access / security events, etc, etc.
The common data model allows you to aggregate those sets to provide insights that have never been readily accessible to you previously.
So let’s look at a few key points
Existing network layer systems (eg NMS, NE and their mediation devices) are currently sucking (near)real-time (ie alarm and perf) data out of the network and feeding to an OSS directly. They may also be pushing inventory discovery data to the OSS, although probably only loading less frequently (once-daily typically) .
The common data model provides a few options for data flows:
If the data store is performant enough, the network layer could feed real-time data to the data store which on-forwards to OSS
multi-home the data from the network to the data store and OSS simultaneously
feed data from the network to the OSS, which may (or not) process before pushing to the data store
Just a quick note regarding data flows: The network will tend to be the master for real-time / assurance flows. However, manual input tends to be the master for design/fulfil flows, so the OSS becomes the master of inventory data as per this link
The question then becomes where the data enrichment happens (ie appending inventory-related data to alarms) to help with root-cause and service-impact calculations. Enrichment / correlation probably needs to happen in the OSS‘s real-time engine, but it could source enrichment data directly from the network, from the OSS‘s inventory, or from the common data store
If the modern ETL tools (eg SNMP and syslog collectors, etc) allow you to do your own ETL to a common data store, a vendor OSS would only need one mediation device (ie to take data from the data store), rather than needing separate ones to pull from all the different NMS/EMS/NE) in your network. This has the potential to reduce mediation license costs from your OSS vendor
Having said that, if you have difficult / proprietary interfaces that make it a challenge to do all of your own ETL then it might be best to let your OSS vendor build your mediation / ETL engines
The big benefit of the common data store is you can choose a best-of-breed approach but still have a common data model to build Business Intelligence queries and reports around
The common data store also takes load off the production OSS application / data servers. Queries and reports can be run against the common data platform, freeing up CPU cycles on the OSS for faster user interactions
The Common Data Model is supported by a few key advancements:
In the past, the mediation layer (ie getting data out of the network and into the OSS) was a challenge. Network operators didn’t tend to want to do this themselves. This introduced a dependency on software suppliers / integrators to build mediation devices and sell them to operators as part of their OSS/BSS solutions. But there’s been a proliferation of highly scalable ETL (Extract, Transform, Load) tools in recent years
Many networks used to have proprietary interfaces that required significant expertise to integrate with. The increasing ubiquity of IP networking and common interfaces (eg SNMP and web interfaces like RESTful, JSON, SOAP, XML) to the network layer makes ETL simpler.=
Massively scalable databases that don’t have as much dependency on relational integrity and can ingest data for myriad sources
A proliferation of data visualisation tools that are more user-friendly instead of having to be a coder capable of writing complex SQL queries
As you have undoubtedly noticed, 5G is generating quite a bit of buzz in telco and OSS circles.
For many it’s just an n+1 generation of mobile standards, where n is currently 4 (well, the number of recent introductions into the market mean n is probably now getting closer to 5 🙂 ).
But 5G introduces some fairly big changes from an OSS perspective. As usual with network transformations / innovations, OSS/BSS are key to operationalisation (ie monetising) the tech. This report from TM Forum suggests that more than 60% of revenues from 5G use-cases will be dependent on OSS/BSS transformation.
And this great image from the 5G PPP Architecture Working Group shows how the 5G architecture becomes a lot more software-driven than previous architectures. Interesting how all 5 “software dimensions” are the domain of our OSS/BSS isn’t it? We could replace “5G architecture” with “OSS/BSS” in the diagram below and it wouldn’t feel out of place at all.
So, you may be wondering in what ways 5G will impact our OSS/BSS:
Fibre deeper – since 5G will introduce increased cell density in many locations, and offer high throughput services, we’ll need to push fibre deeper into the network to support all those nano-cells, pico-cells, etc. That means an increased reliance on good outside plant (PNI – Physical Network Inventory) and workforce management (WFM) tools
Software defined networks, virtualisation and virtual infrastructure management (VIM) – since the networks become a lot more software-centric, that means there are more layers (and complexity) to manage.
Mobile Edge Compute (MEC) and virtualisation – 5G will help to serve use-cases that may need more compute at the edge of the radio network (ie base stations and cell sites). This means more cross-domain orchestration for our OSS/BSS to coordinate
And other use-cases where OSS/BSS will contribute including:
Multi-tenancy to support new business models
Programmability of disparate networks to create a homogenised solution (access, aggregation, core, mobile edge, satellite, IoT, cloud, etc)
Energy efficiency optimisation
Monitoring end-user experience
Zero-touch administration aspirations
Drone survey and augmented reality asset management
Fun times ahead for OSS transformations! I just hope we can keep up and allow the operator market to get everything it wants / needs from the possibilities of 5G.
Last week we discussed the nuances between Inventory, Asset and Config Management within an OSS stack. Each one of these tools are designed to supports functionality for different users / persona-groups. However, they also tend to have significant functional overlap. Chances are your organisation doesn’t have separate dedicated tools for each.
So today I’m going to share a trick I’ve used in the past when I’ve only had a PNI (Physical Network Inventory) system to work with, but need to perform asset management style functionality.
Most inventory tools are great at storing the current state of a device that exists in a network. However, they don’t tend to be so great at an asset manager’s primary function – tracking the entire life-cycle of an asset from procurement to decommissioning and sparing / maintenance along the way.
Normally the PNI just records the locations of all the active network equipment – in buildings, exchanges, comms-huts, cabinets, etc. The trick I use is to create an additional location/s for warehouses. They may (or may not) reside in the physical location of your real warehouse/s.
In almost all PNI systems, you have control over the status of the device (eg IN-SERVICE, etc). You can use this functionality to include status of SPARE, UNDER REPAIR, etc and switch a device between active network locations and the warehouse.
These status-change records give you the ability to pin-point the location of a given asset at any point in time. It also gives you trending stats, either as an individual device or as a cohort of devices (eg by make/model).
You can even build processes around it for check-in / check-out of the warehouse and maintenance scheduling.
I should point out that this works if your PNI allows you to uniquely identify a device (eg by make/model + serial number or perhaps a unique naming convention instance). If your PNI device records only show the current function of a device (eg a naming convention like SiteA-Router-0001), then you might lose sight of the device’s trail when it moves through life-cycle states (eg to the warehouse).
As promised, today we’ll talk about the subtle differences between:
Inventory Management Systems
Asset Management Systems and
Configuration Management Databases (CMDB)
We might even discuss Virtual Infrastructure (VIM) and Resource Managers as well as Config Managers (different from CMDB) too
To be honest, the diagram above doesn’t show adequate overlap. Each of these systems has a slightly different purpose, usually for a slightly different set of personas. However, they all play a part in managing the resources that make up an organisation’s Active Network (the network segment dedicated to carrying customer traffic, as opposed to internal corporate traffic).
Let’s start with Inventory Management Systems (IMS) because IMHO, these are the tools that were traditionally responsible for managing service-provider networks. These are the tools typically used by network planners, network engineers, capacity planners and other back-office operational staff. As mentioned in the link above, these tools can be further broken down into:
PNI (Physical Network Inventory) – The physical devices like switches, routers, firewalls as well as the outside plant (OSP) like cables, joints, etc. Generally only used by operators with large, wide-spread networks of physical assets, especially outside plant.
LNI (Logical Network Inventory) – The set of objects that are formed using physical infrastructure (and possibly associations to other logical objects). This could include circuits, VLANs, and other overlay network topologies as well as the management of attributes like bandwidth, protocols and other network functionality
These tools tend to focus on the key physical/logical/virtual resources that comprise an operator’s active network (AN). However, they often also support functionality that crosses into other domains such as asset and config management.
Asset Management Systems (AMS), as the name implies, have a more “financial” purpose; where assets are objects of intrinsic financial value to an organisation. AMS tools tend to be used by the accounting and asset management teams. They’re used to track current value (purchase price minus depreciation), warranties, spares management, life-cycles / refresh / end-of-life of assets and their contracts, as well as reactive and predictive maintenance. AMS will tend to store information about most of the Active Network Physical devices. This means they will have records for the same devices as PNI, but often with different information / attributes. They won’t tend to store LNI-related data. However, AMS will often keep information about assets in addition to Active Network devices. This could include software licenses and more.
Configuration Management Databases (CMDB) is more of an IT Service Management (ITSM) terminology. Like many IT concepts, ITSM has been increasingly used in parts of service provider networks. CMDBs are a database of Configuration Items (CIs), where CIs can be logical or physical entities. CIs may (or may not) be physical devices (PNI) or logical resource entities (LNI) and may (or may not) represent tangible values (assets). The main purpose of CIs is to store information about IT services that will allow other ITSM processes, such as Incident, Problem and Change Management, to be performed efficiently.
Not only is there functional overlap between these systems, there’s often also terminology overlap and/or misalignments. Different vendors have different levels of functionality and support alternate use-cases, so the areas of overlap differ between organisations.
Oh, and I also promised to mention VIMs and Config Managers:
Virtual Infrastructure Managers (VIM) are responsible for managing the virtual resources made available by physical infrastructure like compute, storage and network devices. In some cases, VIMs generate virtual network devices (VNFs) or virtual machines (VMs) that could look almost identical to any other device stored in LNI, PNI, AMS and/or CMDB. In fact, instances of these VNFs and VMs may even appear in those systems.
Config Management (as opposed to, but also potentially overlapping with, CMDB), is all about managing the configurations of devices in the network (often active network and corporate network). Each device, such as a router, has a configuration that tells the hardware how to function, where to route traffic, which packets to prioritise, where to send management logs (to the OSS), etc. Being able to monitor and manage these configurations centrally and consistently is the purpose for Config Managers. These are mostly used by network engineers to set policies and golden-configs (ie the config templates that all devices of that type must adhere to consistently). For example, you may have hundreds/thousands of devices in your network and want to re-point all management traffic to a new server as part of an OSS upgrade. Rather than configuring each device separately and manually, you can use the config management tool to push config changes out to the network.
Leave us a message to describe how your organisation use these (and other) tools.
“People in a tough spot often focus on their own problems, when the answer usually lies in fixing someone else’s.” Steve Schwarzman.
The telco industry is in a tough spot in many areas around the globe. Sadly, there were more stories of wholesale retrenchments here in Australia this week, including good friends. Revenues falling. Sentiment bleak.
Yet it’s not like most declining industries. Telecom services and remote communications are more in demand than ever before. It’s just that the dynamic has flipped. Previously there was a scarcity mindset around telco services. Now there’s an abundance (in many parts of the world). Connectivity is less of a problem now. Customers have stepped up the hierarchy of needs.
Maybe as industries, telco and OSS, we’re focusing on our own problems? The O in OSS, Operations, can trigger an inward-facing viewpoint (and retrenchments only exacerbate internalisation).
5G isn’t the solution if we’re just looking at it as a new, better model of connectivity. It might be if it can make us better at solving other people’s problems. I suspect OSS will have a big part to play in generating those solutions (or stymieing them!).
Hello, Trouble. It’s been a while since we last met. But I know you’re still out there. And I have a feeling you’re looking for me. You wish I’d forget ya.. Don’t ya trouble? Perhaps it is you, that has forgotten me. Perhaps I need to come find you. Remind you, who I am.
Sounds like an apt mindset for working in the OSS industry doesn’t it?
Or for marketing knives.
I’ll be honest. I like the OSS perspective better!
Following yesterday’s post about OSS Inventory, I received another great follow-up question from another avid reader of the PAOSS blog:
“Interesting thoughts Ryan! In addition to ‘faults up’, perhaps there is a case also (obvious?) for ‘discovery up’ to capture ongoing non-planned changes? Wondering have you come across any sort of reconciliation / adaptive inventory patterns like this? Workflow based? Autonomous? (Going to far into chaos theory territory ?“
Yes, we did exactly that with the same tool discussed yesterday that I used back in 2000. In fact, a very clever dev and I got that company’s first-ever auto-discovery tool working on site (using a product supplied by head-office). Discovering the nodals (ie equipment, cards, ports) was fairly easy. Discovering the connectivity within a domain (we started with SDH) was tricky, but achievable. Auto-discovering cross-domain connectivity (ie DSL circuits through physical, SDH transit, ATM and logical connectivity onto the IP cloud) was much trickier as we needed to find/make linking keys across different data sources.
It was definitely workflow based with a routine-driven back-end. We didn’t just want anything that was discovered to be automatically stuffed into (or removed from) the database (think flapping ports or equipment going down temporarily). It could’ve been automous, but we introduced a manual step to approve any of the discoveries made by each automated discovery iteration.
As you know, modern networks / EMS / VIM (resource managers) are much more discoverable. They need to be for modern orchestration and resilience techniques. I don’t think it would be quite so tricky to stitch circuits together as we’re no longer so circuit-oriented as back in 2000.
However, I’d be fascinated to hear from other readers how much of a problem they have trying to marry up different data sources for discovery purposes. I’d also love to hear whether they’re using fully autonomous discovery or using the manual intervention step for the same reason we were. I imagine most are automating, because orchestration plans just need to make use of whatever resources are being presented by the underlying resource managers in near-real-time.
PS. For those wondering what “discovery” is, it’s shown in the lower grey arrow in this diagram from “Orders down, Faults up“
Discovery is the process that allows data to be passed from NMS/EMS/NEs (ie the network or resource managers) directly into the inventory management database. It should be a more reliable and expedient way of sychronising the inventory with the live network.
The reason for the upper grey arrow is because not all networks have APIs that can be “discovered.” Passive equipment like cable joints and patch-panels don’t have programmatic interfaces. Therefore we need to find other ways to get that data into the Inventory Manager.
“Do you have any thoughts on geospatial vs non geospatial network inventory systems? How often do you see physical plant mapping in a separate system from network inventory, with linkages or integrations between them, vs how often do you see physical and logical inventory being captured primarily in a geospatially oriented system?“
Boy do I ever have some thoughts on this topic!! I’m sure you do too, so I’d love to hear what you think in the comments section below.
I was lucky. The first OSS/BSS that I worked on (all the way back in 2000), had both geo and non-geo (topology) views. It also had a brilliantly flexible data model that accommodated physical and logical inventory. All tightly integrated into one package. There aren’t many tools that can do all of that even today. Like I said, I was lucky to have this as a starting point!!
Like all things OSS/BSS, it starts with the personas and the key tasks they need to perform. Or from the supplier’s perspective, which customer personas they’re most actively targeting.
For example, if you have a significant Outside Plant (OSP) Network, then geo-positioning is vital. The exchanges and comms huts are easy enough to find, but pits, cable routes, easements, etc are often harder to find. It’s not uncommon for a field tech to waste time searching for a pit that’s covered in dirt, grass or snow. And knowing the exact cable route in geo view is helpful for sending field techs to the exact location of a fault (ie helping them to pinpoint the location of the bright yellow excavator that has just sliced through your inter-capital link). Geo-view is also important for OSP designers and the field workforce that builds the OSP network.
But other personas don’t care about seeing the detailed cable route. They just want to see a point-to-point topological link to represent physical connections between the ports on adjacent devices. This helps them to quickly understand the network or circuit / service view. They may also like to see an alarm overlay on the topology to quickly determine which parts of the network aren’t performing as expected. For these personas, seeing all the geo-detail just acts as visual noise that they need to subconsciously filter out to understand the topology view.
These personas also tend to want topological views of the network, not just the physical but the logical and virtual network / service overlays too.
In most cases that I can think of, the physical / OSP inventory tools show the physical devices (ports even) that the OSP network connects into. Their main focus is on the cables, joints, pits, pipes, catenaries, poles, lead-ins, patch-panels, patch-leads, splitters, etc. But showing the termination of cables onto active equipment (Inside Plant or ISP) is an important linking key between the physical and logical views.
The physical port (on the physical device) becomes the key demarcation between physical and logical worlds. The physical port connects physical cables / leads, but it also acts as the anchor point from which to create logical ports to which logical connections are made. As a result, the physical device and port tend to be shown in both physical (geo) and logical inventory tools. They also tend to be shown in both physical and logical network topology views.
In the case of the original OSS/BSS I worked on, it had separate visualisation tools for geo, network and circuit/service, but all underpinned by a common data model.
What’s the best way? Different personas will have different perspectives of course. I prefer for physical and logical inventories to be integrated out of the box (to allow simple cross-ref visually and in queries)…. but I also prefer for them to have different views (eg geo, topology, network, circuit/service) to suit different situations.
I also find it helpful if each of those views allow the ability to drill down deeper into specific sections of the graph if necessary. I’d prefer not to have all of those different views overlaid onto a geo visualisation. Too much visual clutter IMHO, but others may love it that way.
Oh, and having separate LNI (Logical Network Inventory) and PNI (Physical Network Inventory) can be a tricky thing to reconcile. The LNI will almost always have programmatic interfaces (APIs) to collect data from, but will generally have to amalgamate many different sources. Meanwhile, the PNI consists of mostly passive equipment and therefore has no API to collect latest info from. I tend to use strategies at the above-mentioned demarcation point (ie physical ports) to help establish linking keys between LNI and PNI.
BTW. There’s one aspect of the question, “How often do you see physical plant mapping in a separate system from network inventory” that I haven’t fully answered. I’ll cover the question of asset management vs inventory management vs CMDB (Configuration Management Database) in more detail in an upcoming post. [Ed. See link here]
These diagrams are really elegant and powerful for communicating with other data experts and delivery teams. It’s a data expert language.
Data experts are experts of the ETL (Extract, Transform, Load) process, but often have less expertise with the actual meaning and importance of the data sets themselves. For example, a data expert may know there’s a product offerings table, and each has 23 associated attributes (eg bandwidth, SLA class, etc) available. But they may have less understanding of the 245 product types that are housed in the product’s data table, and even less awareness of the meanings of the thousands of product attributes. You need to be a subject matter expert (SME) to understand that detail about the data. In some cases, the SME might be from your client and knows far more tribal knowledge than you.
We often need other SMEs (the products expert in this case) to help us understand what has to happen with the data during transformation. What do we keep, what do we change, what do we discard, etc.
Just one problem – SMEs might not always speak the same language as the data experts.
As elegant as it is, the data relationships diagram above might not be the most intuitive format for product experts to review and comment.
As with many aspects of Architecture and transformation, if we’re to understand, it’s best to communicate in our audience’s language.
In this case, it might be best to show data mappings as overlays on screenshots that the Product owner is familiar with:
Their current GUI
Existing sales order forms
Current report templates
Their next-generation GUI
New order forms
Post-Transform report templates
Such an approach might not look elegant to our data expert colleagues. The question is whether it quickly makes enough sense to the SMEs for you to elicit concise responses from them.
The “right” approach is not always the most effective.
I’d love to hear your tips, tricks and recommendations for speaking / listening in the audience’s language.
If you wish to publish news / press-releases on products, contract wins, changes in ownership structure, job advertisements, tradeshow attendance or any other related news, you can click here to create a news item (note that you have to be logged in and be an authorised listing owner – you can find a how-to guide here). If tagged to a vendor, these news items also appear in the listing of that vendor.
If you are an OSS/BSS buyer and wish to publish news about an Expression of Interest (EOI), Request for Proposal (RFP), or similar, please leave us a note at the Contact Page and we’ll publish your details.
Twitter Feed Each of the OSS/BSS vendors that have their Twitter handle registered now have their Twitter feed visible within their listing.
We hope that these two new features make it easier for buyers to find up-to-date information about the suppliers they’re most interested in.
We have plenty of additional new functionality under development and data being loaded, so be sure to check back in from time to time.
If you have any additional features that you think we should add, we’d love to hear from you, so please let us know via the Contact Page too.
As mentioned in a post about Service and Resource Availability last week, I do tend to think of OSS workflows around an “orders down, faults up,” flow direction. And that means customers (services) at the top, network (resources) at the bottom of the (TMN) pyramid.
I also think of inventory (yellow) as the point where Assurance / Faults (blue) and Fulfillment / Orders (purple) collide and enhance, as per the diagram below.
These are highly generic examples, but let’s take a closer look:
Assurance flow (blue) – an alarm or event in the network (NEL/NE layer) pushes up through the stack to the OSS as a fault. The inventory (network / service) helps to enrich the fault with additional information (eg device name, location, connectivity, correlation of this and other faults, etc) to help resolve the fault (either manually by operators or automatically by algorithms). It also helps associate the fault in the network/resource with the customer/s using those resources. This allows notifications to be issued to customers. Note that this simple flow doesn’t reflect examples such as an incident (ie when a customer notices a problem first and calls it in before the OSS has been able to issue a notification).
Fulfillment flow (purple) – a customer places an order (BML/BSS layer or above) and it pushes down through the stack, including changes in the network (NEL/NE layer). Once all the appropriate network changes have been made, the order is ready for use by the customer. Once again, inventory plays an important part, associating customer / service identifiers with suitable resources from the available resource pool. Generally the (customer facing) service orders won’t have the technology-specific details required to actually update the network configurations though. That’s where the inventory often helps to fill in the knowledge gaps and send technology-specific commands down into the network. [See Friday’s post for more information about CFS and RFS definitions and mappings]
Inventory flows (yellow) – an inventory is relevant to assurance and fulfillment flows if the BSS and network / resource layers don’t hold enough information to be fully processed. The enriching information stored by inventory must come from somewhere. Some of it comes from Discovery (usually an automated process of collecting from the network or other sources), or via Manual / Scripted Input (eg physical network designs including patch cables and splicing). Some data (eg splices) just can’t be collected automatically as they relate to passive equipment that has no programmatic interface. This data just has to be created manually. But arguably the more important inventory data is actually recording the mappings made from customers (services) to network (resources). Inventory solutions are often where these linking keys / relationships are recorded.
These flows also tend to indicate the direction of data mastery. Whilst the network itself is the source of truth, Fulfillment flows will start at the BSS and customer / service / order data will tend to be mastered there before orders are pushed into the network as provisioning commands. For assurance flows, the network will tend to be the master source of data, but with enrichment along it’s path northbound.
Just keep in mind that there are many exceptions to these examples. Data and processes can flow in many different ways. The diagram above is just useful for helping newcomers to understand some conceptual processes and data models / flows.
Back in the old days, there was really only one OSS build model – via big milestone/functionality delivery. You followed a waterfall style delivery where you designed the end-solution up-front, then tried to build, test and handover to that design. The business value was delivered at the end of the project (or perhaps major phases along the way). For the large operators, there may have been multiple projects in-flight at any one time, but the value was still only delivered at the end of each project.
Some clients still follow this model, particularly if they outsource build/transformation projects to suppliers. These clients tend to be smaller and have less simultaneous change underway. OSS build/transform projects tend to be occasional for these clients.
But the larger operators now tend to undertake a constant, Agile evolution of their OSS. There are constant transitions, delivery of value at a regular cadence (eg fortnightly sprints). The operating environments (and the constraints associated with them) tend to be in dynamic flow, at an ever increasing speed. There is no project end-state. The OSS simply doesn’t stay in one state for long because change is happening every day (give or take).
For this reason, I find it interesting that Architects tend to design solutions for a particular end-state. This can be a good thing because it stops Agile projects from meandering off track incrementally.
Unfortunately, this type of traditional solution design doesn’t fully suit modern delivery though. The traditional solution design doesn’t show the many stepping-stones that delivery teams have to implement – the multiple states required to transition from current-state to end-state. I’ve seen delivery teams being unable to determine a workable sequence of stepping stones that allow the designed end-state to be reached.
So, a challenge for the many clever Architects out there – help delivery teams out by designing the stepping stones, not just the ideal end-state. Update your design templates to describe all the intermediate states and the transition steps between them.
Include diagrams, pre-requisites, dependencies, etc as you normally do, but only as bite-sized chunks for easy consumption by delivery teams rather than even more massive documents than today.
After all, those intermediate states are only experienced briefly and then gone. Obsoleted. In fact all OSS delivery states are only transitory, so we may need to re-think our document template models. Pre-build, during-build and post-build documentation requirements are all waiting to be accommodated within a stepping-stone timeline in a new style of design template.
In most cases, it’s a greater skill to navigate an OSS journey rather than simply predict a destination.
I’d love to hear from Architects and Delivery Teams regarding how you’ve overcome this challenge at your organisation! What models do you use?