There’s an OSS Security Elephant in the Room!

The pandemic has been beneficial for the telco world in one way. For those who weren’t already aware, telecommunications is incredibly important to our modern way of life. It underpins not just our ability to communicate with others, but our economy, the services we use, the products we buy and, even more fundamentally, our safety.

Working in the telco industry, as I’m sure you do, you’ll also be well aware of all the rhetoric and politics around Chinese-manufactured equipment (eg Huawei) being used in the networks of global telco providers. The theory is that having telecommunications infrastructure supplied by a third party, particularly one that isn’t aligned with Western allies, puts national security interests at risk.

In his article, “5G: The outsourced elephant in the room,” Bert Hubert provides a brilliant look into the realities of telco network security that go far beyond equipment supply alone. He breaks the national security threat into three key elements:

  • Spying (using compromised telco infrastructure to conduct espionage)
  • Availability (compromising and/or manipulating telco infrastructure so that it’s unable to work reliably)
  • Autonomy (being unable to operate a network or to recover from outages or compromises)

The first two are well understood and often discussed. The third is the real elephant in the room, the elephant that OSS/BSS (potentially) have a huge influence over. But we’ll get to that shortly.

Before we do, let’s summarise Bert’s analysis of security models. For 5G, he states that there’s an assumption that employees at national carriers design networks, buy equipment, install it, commission it and then hand it over to other employees to monitor and manage it. Oh, and to provide other specialised activities like lawful intercept, where a local legal system provides warrants to monitor the digital communications of (potentially) nefarious actors. Government bodies and taxpayers all assume the telcos have experienced staff with the expertise to provide all these services.

However, the reality is far different. Service providers have been outsourcing many of these functions for decades. New equipment is designed, deployed, configured, maintained and sometimes even financed by vendors for many global telcos. As Bert reinforces, “Just to let that sink in, Huawei (and their close partners) already run and directly operate the mobile telecommunication infrastructure for over 100 million European subscribers.”

But let’s be very clear here. It’s not just Huawei and it’s not just Chinese manufacturers. Nor is it just mobile infrastructure. It’s also cloud providers and fixed-line networks. It’s also American manufacturers. It’s also the integrators that pull these networks and systems together. 

Bert also points out that CDRs (Call Detail Records) have been outsourced for decades. There’s a strong trend for billing providers to supply their tools via SaaS delivery models. And what are CDRs? Only metadata. Metadata that describes a subscriber’s activities and whereabouts. Data that’s powerful enough to be used to assist with criminal investigations (via lawful intercept). But where has CDR / bill processing been outsourced to? China and Israel mostly.

Now, let’s take a closer look at the autonomy factor, the real elephant in the room. Many design and operations activities have been offshored to jurisdictions where staff are more affordable. The telcos usually put clean-room facilities in place to ensure a level of security is applied to any data handled off-shore. They also put in place contractual protection mechanisms.

Those protections are debatable, but they’re still not the key point here. As Bert brilliantly summarises, “any worries about [offshore actors] being able to disrupt our communications through backdoors ignore the fact that all they’d need to do to disrupt our communications… is to stop maintaining our networks for us!”

There might be an implicit trust in “Western” manufacturers or integrators (eg Ericsson, Nokia, IBM) when it comes to designing, building and maintaining networks. However, these organisations also outsource / insource labour to international destinations where labour costs are cheaper.

If the R&D, design, configuration and operations roles are all outsourced, where do the telcos find the local resources with requisite skills to keep the network up in times when force majeure (eg war, epidemic, crime, strikes, etc) interrupts a remote workforce? How do local resources develop the required skills if the roles don’t exist locally?

Bert proposes that automation is an important part of the solution. He has a point. Many of the outsourcing arrangements are time-and-materials contracts, so it’s in the resource suppliers’ best interests for activities to remain manual and time-consuming. He contrasts this by showing how the hyperscalers (eg Google) have found ways of building automations so that their networks and infrastructure need minimal support crews.

Their support systems, unlike the legacy thinking of telco systems, have been designed with zero-touch / low-touch in mind.

If we do care about the stability, resiliency and privacy of our national networks, then something has to be done differently, vastly differently! Having highly autonomous networks, OSS, BSS and related systems is a start. Having a highly skilled pool of local resources that can research, design, build, commission, operate and improve these systems would also seem important. If the business models of these telcos can’t support the higher costs of these local resources, then perhaps national security interests might have to subsidise these skills?

I wonder whether the national carriers and/or local OSS / BSS / automation suppliers are lobbying on this point? I know a few governments have introduced security regulations and pushed them onto the telcos to adhere to, ensuring they have suitable cyber-security mechanisms. They also have lawful intercept provisions. But do any have local operational autonomy provisions? None that I’m aware of, but feel free to leave us a comment about any you’re aware of.

PS. Hat tip to Jay for the link to Bert’s post.

Uses of OSS Augmented Reality in the Data Centre

I was doing some research on Ubiquiti’s NMS/OSS tools yesterday and stumbled upon what appears to be a brilliant Augmented Reality (AR) capability.

I’ve converted it to the video shown below (apologies for the low-res – you can try clicking here for Ubiquiti’s full-res view).

I especially love how it uses the OLED at the left side of the chassis almost like a QR code to uniquely identify the chassis and then cross-link it with current status information pulled from Ubiquiti’s UNMS (presumably).
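
As a thought experiment, here’s a minimal sketch of the kind of cross-linking that would sit behind such an overlay: a chassis identifier decoded from the AR view is used to look up live status from a management system, and the response becomes the overlay payload. Everything here (the endpoint, the field names and the build_ar_overlay helper) is hypothetical, not Ubiquiti’s actual UNMS API.

```python
import requests

NMS_BASE_URL = "https://nms.example.com/api/v1"  # assumption: a generic NMS REST API


def build_ar_overlay(decoded_chassis_id: str, api_token: str) -> dict:
    """Given a chassis identifier decoded from the AR view (eg via the OLED / QR),
    fetch current status from the NMS and return the fields an AR overlay might render."""
    resp = requests.get(
        f"{NMS_BASE_URL}/devices/{decoded_chassis_id}",
        headers={"Authorization": f"Bearer {api_token}"},
        timeout=5,
    )
    resp.raise_for_status()
    device = resp.json()

    # Overlay payload: highest alarm severity, port states and key config parameters
    return {
        "chassis_id": decoded_chassis_id,
        "highest_alarm_severity": device.get("highest_alarm_severity", "unknown"),
        "ports": [{"name": p["name"], "state": p["state"]} for p in device.get("ports", [])],
        "associated_service": device.get("associated_service"),
    }
```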

As mentioned in this post from all the way back in 2014, the potential for this type of AR functionality is huge if / when linked to OSS / BSS data. Some of the potential use-cases for inside the data centre as cited in the 2014 article were:

  1. Tracing a cable (or patchlead) route through the complex cabling looms in a DC
  2. Showing an overlay of key information above each rack (and then each device inside the rack):
    1. Highest alarm severity within the rack (eg a flashing red beacon above each rack that has a critical alarm on any device within it, then a red beacon on any device inside the rack that has critical alarms)
    2. Operational state of each device / card / port within the rack
    3. Active alarms on each device
    4. Current port-state of each port
    5. Performance metrics relating to each device / port (either in current metric value or as a graph/s)
    6. Configuration parameters relating to each device / port (eg associated customer, service, service type or circuit name)
  3. Showing design / topology configurations for a piece of equipment
  4. Showing routing or far-end connectivity coming from a physical port (ie where the cable ultimately terminates, which could be in the same DC or in another state / country)

I believe that some of these features have since been implemented in working solutions or proofs-of-concept but I haven’t seen any out in the wild. Have you?

I’d love to hear from you if you’ve already used these Ubiquiti tools and/or have seen AR / OSS solutions actually being used in the field. What are your thoughts on their practicality?

What other use-cases can you think of? Note that the same 2014 article also discusses some AR use-cases that extend beyond the DC.

How To Optimise A Network Assurance GUI To Get Results

In the old-school world of network assurance, we just polled our network devices and aggregated all the events into an event list. But then our networks got bigger and too many events were landing in the list for our assurance teams to process.

The next fix was to apply filters. For example, that meant dropping the Info and Warning messages because they weren’t all that important anyway…. were they?

But still, the event list just kept scrolling off the bottom of the page. Ouch. So then we looked to apply correlation and suppression rules. That is, to apply a set of correlations so that some of the alarms could be bundled together into a single event, allowing the “child” events to be suppressed.
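
As a minimal illustration of that correlation / suppression idea (a toy rule, not any vendor’s rules engine), link-down events on a device that has just raised an unreachable event are rolled up as children of it and suppressed from the operator’s list:

```python
from datetime import timedelta

# Toy rule: LINK_DOWN events on a device that raised DEVICE_UNREACHABLE within the
# last 60 seconds are treated as children of that event and suppressed.
WINDOW = timedelta(seconds=60)


def correlate(events):
    """events: list of dicts with 'time' (datetime), 'device' and 'type'.
    Returns (visible, suppressed) event lists."""
    visible, suppressed = [], []
    unreachable = {}  # device -> time of its DEVICE_UNREACHABLE event

    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["type"] == "DEVICE_UNREACHABLE":
            unreachable[ev["device"]] = ev["time"]
            visible.append(ev)
        elif (ev["type"] == "LINK_DOWN" and ev["device"] in unreachable
              and ev["time"] - unreachable[ev["device"]] <= WINDOW):
            suppressed.append(ev)  # child event, rolled up under the parent
        else:
            visible.append(ev)     # uncorrelated, still shown to the operator
    return visible, suppressed
```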

Then we got a bit more advanced with our rules and performed root-cause analysis (RCA). Now, we’re moving to identifying patterns using learning algorithms… to reduce the volume of the event list. But with virtualised networks, higher-speed telemetry and increased network complexity, the list keeps growing and the rules have to get more “dynamic.”

Each of these approaches designs a more highly filtered lens through which a human operator can view the health of the network. The filters and rules effectively dumb down the information that lands with the operator to solve. The objective appears to be to develop a suitably dumbed-down solution that allows us to throw lots of minimally-trained (and cheaper) human operators at the (still) high transaction-count problem. That means the GUI is designed to filter out and dumb down too.

But here’s the thing. The alarm list harks back decades to when telcos were happy having a team of Engineers running the NOC, resolving lots of transactions. Fast forward to today and the telcos aspire to zero-touch assurance. That implies a solution that’s designed with machines in mind rather than humans. What we really want is for the network to self-heal based on all the telemetry it’s seeing.

Unfortunately, rare events can still happen. We still need human operators in the captain’s seat ready to respond when self-healing mechanisms are no longer able to self-heal.

So instead of dumbing-down and filtering out for a large number of dumbed-down and filtered out operators, perhaps we could consider doing the opposite.

Let’s continue to build networks and automations that take responsibility for the details of every transaction (even warning / info events). But let’s instead design a GUI that is used by a small number of highly trained operators, allowing them to see the overall network health posture and respond with dynamic tools and interactions. Preferably before the event using predictive techniques (that might just learn from all those warning / info events that we would’ve otherwise discarded).

Hat tip to Jay for some of the contrarian thoughts that seeded this post.

 

OSS Functionality – Is Your Focus In Anonymous Places?

Yesterday’s article asked whether OSS tend to be anonymous and poorly designed, and then considered how Jony Ive (who led the design of iPads, iPods and iPhones for Apple) might look at OSS design. Jony has described “going deep” – being big on focus, care and detail when designing products. The article looked at 8 care factors, some of which OSS vendors do very well and others, well, perhaps less so.

Today we’ll unpack this in more detail, using a long-tail diagram to help articulate some of the thoughts. [Yes, I love using long-tail diagrams to help prioritise many facets of OSS]

The diagram below comes from an actual client’s functionality usage profile.
Long tail of OSS

The x-axis shows a list of functionalities / use-cases. The y-axis shows the number of uses (it could equally represent usefulness or value or other scaling factor of your choice).

The colour coding is:

  • Green – This functionality exists in the product today
  • Red – This functionality doesn’t exist
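
As a minimal illustration of how such a profile might be assembled (the usage counts and function names below are invented), you simply rank functions by usage and tag each one green or red:

```python
# Hypothetical usage counts per function, plus whether each function exists in the product today
usage = {
    "create_service_order": 9400,
    "alarm_list_view": 7200,
    "bulk_data_import": 850,
    "circuit_trace": 610,
    "custom_report_x": 12,
    "legacy_protocol_y_support": 3,
}
exists = {"legacy_protocol_y_support": False}  # everything else defaults to True (green)

# Sort by usage to form the long tail, then tag each bar green (exists) or red (gap)
long_tail = sorted(usage.items(), key=lambda kv: kv[1], reverse=True)
for rank, (function, count) in enumerate(long_tail, start=1):
    colour = "green" if exists.get(function, True) else "red"
    print(f"{rank:>2}. {function:<28} uses={count:<6} {colour}")
```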

The key questions to ask of this long-tail graph are:

  1. What functionality is most important? What functionality “moves the needle” for the customers? But perhaps more importantly, what functionality is NOT important?
  2. What new functionality should be developed?
  3. What old functionality should be re-developed?

Let’s dive deeper.

#1 – What functionality is most important
In most cases, the big-impact demands on the left side of the graph are going to be important. They’re the things that customers need most and use most. These are the functions that were included in the MVP (minimum viable product) when it was first released. They’ve been around for years. All the competitors’ products also have these features because they’re so fundamental to customers. But, because they already exist, many vendors rarely come back to re-factor them.

There are other functions that are also important to customers, but might be used infrequently (eg data importers or bulk processing tools). These also “move the needle” for the customer.

#2 – What functionality should be developed (and what should not)
What I find interesting is that many vendors just add more and more functionality out at the far right side of the graph, adding to the hundreds of existing functions. They can then market all those extra features, rightly saying that their competitors don’t have these abilities…. But functionality at the far right rarely moves the needle, as described in more detail in this earlier post!

Figuring out what should be green (included) and what should be red (excluded) on this graph appears to be something Apple has done quite well with its products. OSS vendors… perhaps less so!! By the way, the less green, the less complexity (for users, developers, testers, integrators, etc), which is always an important factor for OSS implementations.

Yesterday’s post mentioned eight “care factors” to evaluate your OSS products / implementations by. But filtering out the right side of the long-tail (ie marking up more red boxes) might also help to articulate an important “don’t care factor.”

#3 – What functionality should be re-developed
Now this, for me, is the important question to ask. If the green boxes are most important, especially the ones at the far left of the graph, should these also be the ones that we keep coming back to, looking to improve them? Where usability, reliability, efficiency, de-cluttering, etc are most important?

I suspect that Apple develop hundreds of prototypes that focus on and care about the left side of the graph in incredible detail, whilst looking to reduce the green bars on the right side of the graph. My guess is that subsequent updates to their products also seek improvements to the left side…. whilst also adding some new features, turning some of the red boxes green, but rarely all the way out to the right edge of the graph.

Summary Questions

If you are in any way responsible for OSS product development, where is your “heat map” of attention on this long tail? Trending towards the left, middle or right?

But, another question vexes me. Do current functionality-based selection / procurement practices in OSS perpetuate the need to tick hundreds of boxes (ie the right side of the long tail), even though many of those functions don’t have a material impact? There’s a reason I’ve moved to a more “prioritised” approach to vendor selection in recent years, but I suspect the functionality check-boxes of the past are still a driving force for many.

OSS – Are they anonymous, poorly made objects?

“We’re surrounded by anonymous, poorly made objects. It’s tempting to think it’s because the people who use them don’t care – just like the people who make them. But what [Apple has] shown is that people do care. It’s not just about aesthetics. They care about things that are thoughtfully conceived and well made.”
Jony Ive (referenced here).

As you undoubtedly know, Jony Ive is the industrial design genius behind many of Apple’s ground-breaking products like the iPod, iPad, etc. You could say he knows a bit about designing products.

I’d love to see what he would make of our OSS. I suspect that he’d claim they (and I’m using the term “they” very generically here) are poorly made objects. But I’d also love to be a fly on the wall to watch how he’d go about re-designing them.

It would be doing OSS an extreme disservice to say they’re all poorly designed. From a functionality perspective, they are works of art and incredible technical achievements. They are this MP3 player:

Brilliantly engineered to provide the user with a million features.

But when compared with Apple’s iPods????

The iPods actually have less functionality than the MP3 player above. But that’s part of what makes them incredibly intuitive and efficient to use.

Looking at the quote above from Ive, there’s one word that stands out – CARE. As product developers, owners, suppliers, integrators, do we care sufficiently about our products? This question could be a little incendiary to OSS product people, but there are different levels of care that I’ve seen taken in the build of OSS products:

  • Care to ensure the product can meet the requirements specified? Tick, yes absolutely in almost every case. Often requiring technical brilliance to meet the requirement/s
  • Care to ensure the code is optimised? In almost all cases, yes, tick. Developers tend to have a passionate attention to detail. Their code must not just work, but work with the most efficient algorithm possible. Check out this story about a dilemma of OSS optimisation
  • Care to ensure the user experience is optimised? Well, to be blunt, no. Many of our OSS are not intuitive enough. Even if they were intuitive once, they’ve had so much additional functionality bolted on, creating a Frankenstein effect. Our products are designed and tested by people who are intimately familiar with our products’ workings. How often have you heard of products being tested by an external person, a layperson, or even a child to see how understandable they are? How often are product teams allowed the time to prototype many different UI design variants before releasing a new functionality? In most cases, it’s the first and only design that passed functional testing that’s shipped to customers. By contrast, do you think Apple allowed their first prototype of the iPod to be released to customers?
  • Care to ensure that bulk processing is optimised? This is often a fail too. OSS exist to streamline operations, to support high-volume transactions (eg fulfilment and assurance use-cases). But how many times have you seen user interfaces and test cases that are designed for a single transaction, rather than benchmarking the efficiency of transactions at massive scale? (See the simple benchmark sketch after this list)
  • Care to ensure the product can pass testing? Tick, yes, we submit our OSS to a barrage of tests, not to mention the creation of modern test automation suites
  • Care to ensure the product is subjected to context-sensitive testing? Not always. Check out this amusing story of a failure of testing.
  • Care to ensure that installation, integration and commissioning is simple and repeatable? This is an interesting one. Some OSS vendors know they’re providing tools to a self-service market and do a great job of ensuring their customers can become operational quickly. Others require an expert build / release team to be on site for weeks to commission a solution. Contrast this again with the iPad. It’s quick and easy to get the base solution (including Operating System and core apps) operational. It’s then equally easy to download additional functionality via the App Store. Admittedly, the iPad doesn’t need to integrate with legacy “apps” and interfaces in the same way that telco software needs to!! Eeeek!
  • Care about the customer? This can also be sporadic. Some product people / companies pay fastidious attention to the customers, the way they use the products / processes / data and the objectives they need to meet using the OSS. Others are happy to sit in their ivory towers, meet functional testing and throw the solution over the fence for the customers to deal with.
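
Picking up on the bulk-processing care factor above, even a crude throughput benchmark would expose single-transaction thinking. Here’s a minimal sketch, where create_order is a hypothetical stand-in for the function or API under test:

```python
import time


def create_order(payload):
    """Stand-in for the OSS function / API under test (hypothetical)."""
    time.sleep(0.002)  # simulate per-transaction processing cost


def benchmark(n=1_000):
    start = time.perf_counter()
    for i in range(n):
        create_order({"order_id": i, "service_type": "broadband"})
    elapsed = time.perf_counter() - start
    print(f"{n} transactions in {elapsed:.1f}s -> {n / elapsed:,.0f} tx/sec")


benchmark()
```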

What other areas have I missed? In what other ways do we (or not) take the level of care and focus that Apple does? Leave us a comment below.

How to Design an OSS / Telco Data Governance Framework

“Data governance constructs a strategy and a framework through which an organization (company as well as a state…) can recognize the value of its data assets (towards its own or global goals), implement a data management system that can best leverage it whilst putting in place controls, policies and standards that simultaneously protect data (regulation & laws), ensure its quality and consistency, and make it readily available to those who need it.”
TM Forum Data Governance Team.

I just noticed an article on TM Forum’s Inform Platform today entitled, “Telecoms needs a standardized, agile data governance framework,” by Dawn Bushaus. 

The Forum will publish a white paper on data governance later this month. It has been authored with participation from a huge number of companies including Antel Uruguay, ATC IP, BolgiaTen, Deloitte, Ernst & Young, Etisalat UAE, Fujitsu, Globe Telecom, Huawei Technologies, International Free and Open Source Solutions Foundation, International Software Techniques, KCOM Group, Liquid Telecom, Netcracker Technology, Nokia, Oracle, Orange, Orange Espagne, PT XL Axiata, Rogers Communications, stc, Tech Mahindra, Tecnotree, Telefonica Moviles, Telkom and Viettel. Wow! What a list of luminaries!! Can’t wait to read what’s in it. I’m sure I’ll need to re-visit this article after taking a look at the white paper.

It reminded me that I’ve been intending to write an article about data governance for way too long! We have many data quality improvement articles, but we haven’t outlined the steps to build a data governance policy.

One of my earliest forays into OSS was for a brand new carrier with a brand new network. No brownfields challenges (but plenty of greenfields challenges!!). I started as a network SME, but was later handed responsibility for the data migration project. Would’ve been really handy to have had a TM Forum data governance guide back then! But on the plus side, I had the chance to try, fail and refine, learning so much along the way.

Not least of those learnings was that every single other member on our team was dependent on the data I was injecting into various databases (data-mig, pre-prod, prod). From trainers, to testers, to business analysts, to developers and SMEs. Every person was being held up waiting for me to model and load data from a raft of different network types and topologies, some of which were still evolving as we were doing the migration. Data was the glue that held all the other disciplines together.

We were working with a tool that was very hierarchical in its data model. That meant that our data governance and migration plan was also quite hierarchical. But that suited the database (a relational DB from Oracle) and the network models (SDH, ATM, etc) available at that time, which were also quite hierarchical in nature.

When I mentioned “try, fail and refine” above, boy did I follow that sequence… a lot!! Like the time when I was modelling ATM switches that were capable of a VPI range of 0 to 255 and a VCI range of 0 to 65,535. I created a template that saw every physical port have 255 VPIs and each VPI have 65,535 VCIs. By the time I template-loaded this port-tree for each device in the network overnight, I’d jammed a gazillion unnecessary records into the ports table. Needless to say, any query on the ports table wasn’t overly performant after that data load. The table had to be truncated and re-built more sensibly!!
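
To put rough numbers on that: pre-creating every VPI / VCI combination meant roughly 255 x 65,535 ≈ 16.7 million child records per physical port. Multiply that by the number of ports per switch and the number of switches in the network, and the overnight template load was pushing hundreds of millions of rows into the table, almost none of which would ever carry a configured service.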

But I digress. This is a how-to, not a how-not-to. Here are a few hints to building a data governance strategy:

  1. Start with a WBS or mind-map to begin formalising what your data project needs to achieve and for whom. This WBS will also help form the basis of your data migration strategy
  2. Agile wasn’t in widespread use back when I first started (by that I mean that I wasn’t aware of it in 2000). However, the Agile project methodology is brilliantly suited to data migration projects. It’s also well suited to aligning with WBS in that both methods break down large, complex projects into a hierarchy of bite-sized chunks
  3. I take an MVD (Minimum Viable Data) approach wherever possible, not necessarily because it’s expensive to store data these days, but because the life-cycle management of the data can be. And yet the extra data points are just a distraction if they’re never being used
  4. Data Governance Frameworks should cover:
    1. Data Strategy (objectives, org structure / sponsors / owners / stewardship, knowledge transfer, metrics, standards / compliance, policies, etc)
    2. Regulatory Regime (eg privacy, sovereignty, security, etc) in the jurisdiction/s you’re operating in or even just internal expectation benchmarks
    3. Data Quality Improvement Mechanisms (ensuring consistency, synchronisation, availability, accuracy, usability, security)
    4. Data Retention (may overlap with regulatory requirements as well as internal policies)
    5. Data Models (aka Master Data Management – particularly if consolidating and unifying data sources)
    6. Data Migration (where “migration” incorporates collection, creation, testing, ingestion, ingestion / reconciliation / discovery pipelines, etc)
    7. Test Data (to ensure suitable test data can underpin testing, especially if automated testing is being used, such as to support CI/CD)
    8. Data Operations (ongoing life-cycle management of the data)
    9. Data Infrastructure (eg storage, collection networks, access mechanisms)
  5. Seek to “discover” data from the network where possible, but note there will be some instances where the network is master (eg current alarm state), yet other instances where the network is updated from an external system (eg network design being created in design tools and network configs are then pushed into the network) 
  6. There tend to be vastly different data flows and therefore data strategies for the different workflow types (ie assurance, fulfilment / charging / billing, inventory / resources) so consider your desired process flows
  7. Break down migration / integration into chunks, such as by domain, service type, device types, etc to suit regular small iterations to the data rather than big-bang releases
  8. I’ve always found that you build up your data in much the same way as you build up your network:
    1. Planning Phase:
      1. You start by figuring out what services you’ll be offering, which gives an initial idea about your service model, and the customers you’ll be offering them to
      2. That helps to define the type of network, equipment and topologies that will carry those services
      3. That also helps guide you on the naming conventions you’ll need to create for all the physical, logical and virtual components that will make up your network. There are many different approaches to naming conventions, but I always tend to start with ITU as a naming convention guide (click here for a link to our naming convention tool)
      4. But these are all just initial concepts for now. The next step, just like for the network engineers, is to build a small Proof of Concept (ie a small sub-set of the network / services / customers) and start trialling possible data models and namings
      5. Migration Strategy (eg list of environments, data model definition, data sources / flows, create / convert / cleanse / migration, load sequences with particular attention to cutover windows, test / verification of data sets, risks, dependencies, etc)
    2. Implementation Phase
      1. Reference data (eg service types, equipment types, speeds, connector types, device templates, etc, etc)
      2. Countries / Sites / Buildings / Rooms / Racks
      3. Equipment (depending on the granularity of your data, this could be at system, device, card/port, or serial number level of detail). This could also include logical / virtual resources (eg VNFs, apps, logical ports, VRFs, etc)
      4. Containment (ie easements, ducts, trays, catenary wires, towers, poles, etc that “contain” physical connections like cables)
      5. Physical Connectivity (cables, joints, patch-leads, radio links, etc – ideally port-to-port connectivity, but depends on the granularity of the equipment data you have)
      6. Map / geo-location of physical infrastructure
      7. Logical Connectivity (eg trails, VPNs, IP address assignments, etc)
      8. Customer Data
      9. Service Data (and SLA data)
      10. Power Feeds (noting that I’ve seen power networks cause over 50% of failures in some networks)
      11. Telemetry (ie the networks that help collect network health data for use by OSS)
      12. Other data source collection such as security, environmentals, etc
      13. Supplementary Info (eg attachments such as photos, user-guides, knowledge-bases, etc, hyperlinks to/from other sources, etc)
      14. Build Integrations / Configurations / Enrichments in OSS/BSS tools and or ingestion pipelines
      15. Implement and refine data aging / archiving automations (in-line with retention policies mentioned above)
      16. Establish data ownership rules (eg user/group policies)
      17. Implement and refine data privacy / masking automations (in-line with privacy policies mentioned above)
    3. Operations Phase
      1. Ongoing ingestion / discovery (of assurance, fulfilment, inventory / resource data sets)
      2. Continual improvement processes to avoid a data quality death spiral, especially for objects that don’t have a programmatic interface (eg passive assets like cables, pits, poles, etc). See big loop, little loop and synchronicity approaches, as well as the reconciliation sketch after this list. There are also many other data quality posts on our blog.
      3. Build and refine your rules engines (see post, “Step-by-step guide to build a systematic root-cause analysis (RCA) pipeline“)
      4. Build and refine your decision / insights engines and associated rules (eg dashboards / scorecards, analytics tools, notifications, business intelligence reports, scheduled and ad-hoc reporting for indicators such as churn prediction, revenue leakage, customer experience, operational efficiencies, etc)
      5. In addition to using reconciliation techniques to continually improve data quality, also continually verify compliance for regulatory regimes such as GDPR
      6. Ongoing refinement of change management practices
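
As a minimal sketch of the reconciliation idea referenced in the continual improvement item above (plain dictionaries and invented port names rather than any particular OSS data model), comparing discovered network data against inventory records produces the three work queues that keep data quality from spiralling:

```python
def reconcile(inventory: dict, discovered: dict):
    """inventory / discovered: {resource_id: attributes}.
    Returns the three work queues a reconciliation pipeline typically produces."""
    missing_from_inventory = {k: v for k, v in discovered.items() if k not in inventory}
    stale_in_inventory = {k: v for k, v in inventory.items() if k not in discovered}
    mismatched = {
        k: {"inventory": inventory[k], "discovered": discovered[k]}
        for k in inventory.keys() & discovered.keys()
        if inventory[k] != discovered[k]
    }
    return missing_from_inventory, stale_in_inventory, mismatched


# Example: one port exists in the network but not in inventory, and one record has drifted
inv = {"SW01/Gi0/1": {"speed": "1G"}, "SW01/Gi0/2": {"speed": "1G"}}
net = {"SW01/Gi0/1": {"speed": "10G"}, "SW01/Gi0/3": {"speed": "1G"}}
print(reconcile(inv, net))
```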

I look forward to seeing the TM Forum report when it’s released later this month, but I also look forward to hearing suggested adds / moves / changes to this OSS / BSS data governance article from you in the meantime.

If you need any help generating your own data governance framework, please contact us.

 

Launch of The Passionate About OSS Podcast

We’re excited to announce the launch of The Passionate About OSS Podcast.

The first batch of five episodes can be found here, with new episodes to be released on a weekly basis:

The Passionate About OSS Podcast

The aim of the show is to shine a light on the many brilliant people who work in the OSS industry.

We’ll interview experts in the field of OSS/BSS and telecommunications software. Guests represent the many facets of OSS including: founders, architects, business analysts, designers, developers, rainmakers, implementers, operators and much more, giving a 360 degree perspective of the industry.

We’ll delve into the pathways they’ve taken in achieving their god-like statuses, but also unlock the tips, tactics, methodologies and strategies they employ. Their successes and failures, challenges and achievements. We’ll look into the past, present and even seek to peer into what the future holds for the telco and OSS industries.

Canary Releases

Do you remember back in the old days of OSS/BSS releases into production? They were a little bit stressful…. especially for the Release Managers.

I remember one Release Manager who used to set up the packages, type out the commands, then his hands would be poised above the keyboard, literally shaking. He would then position his finger above the Enter key and look away from the screen whilst pressing it!

He then couldn’t look back at the screen for at least a couple of minutes… maybe taking a sneaky peek from time to time.

It was pretty hilarious to watch! 

It was like watching a nervous soccer fan at a big game being decided by penalties. Look away! Can’t watch!

Roll-backs upon fail were equally amusing to observe (for me at least).

But luckily, we can avoid these big-bang cutovers in many cases with load-balanced application architectures. This article by Danilo Sato describes Canary / Blue-Green Deployments of new / updated software.

The steps are summarised in Danilo’s diagrams below:

PRE-CUTOVER

CANARY RELEASE

POST CUTOVER

The canary release model provides a technique of migrating a few selected users (eg internal / project users) to the new solution. If something happens to the canary release and rollback is required, it’s simply a case of routing all users back to the old version.
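
As a minimal sketch of the routing decision at the heart of a canary release (illustrative only, not Danilo’s implementation), hand-picked canary users plus a small, deterministic slice of everyone else get the new version, and rollback is simply a matter of zeroing the percentage and emptying the cohort:

```python
import hashlib

CANARY_USERS = {"internal_tester_1", "project_user_2"}  # hand-picked canary cohort
CANARY_PERCENT = 5                                      # plus a small slice of everyone else


def route(user_id: str) -> str:
    """Return which version of the application this user should be routed to."""
    if user_id in CANARY_USERS:
        return "v2-canary"
    # A deterministic hash keeps each user on the same version between requests
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "v2-canary" if bucket < CANARY_PERCENT else "v1-stable"


# Rollback = set CANARY_PERCENT to 0 and empty CANARY_USERS; all traffic returns to v1-stable
print(route("internal_tester_1"), route("random_customer_42"))
```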

Danilo also states, “Canary releases can be used as a way to implement A/B testing due to similarities in the technical implementation. However, it is preferable to avoid conflating these two concerns: while canary releases are a good way to detect problems and regressions, A/B testing is a way to test a hypothesis using variant implementations. If you monitor business metrics to detect regressions with a canary [2], also using it for A/B testing could interfere with the results. “

OSS Sandpit – Power, Supervisory and Comms Networks combined in our Inventory Prototype

This article provides a tutorial for building a Power network with corresponding SCADA and comms network into the inventory module of our Personal OSS Sandpit Project.

This prototype includes components such as:

  • Power Network including:
    • Wind Turbines (WT)
    • Solar Panels
    • Transformers
    • Circuit Breakers (CB)
    • Substations (Generation, Transmission and Distribution)
    • Transmission Towers
    • Power Poles
    • Smart Meters / AMI
    • Inverters
    • Power cabling, including OPGW (Optical Groundwire, which bundles ground wire and optical cables together in a common sheath)
  • Power Supervisory network including:
    • IED (Intelligent Electronic Device) 
    • PLC (Programmable Logic Controller)
    • DAQ (Data Acquisition)
    • Meteorological Mast and weather station (for monitoring the conditions that the Wind Turbines are operating within)
  • A Communications network that carries traffic from:
    • VOIP Telephony
    • CCTV
    • Power Supervisory equipment (listed above)
  • Private hosting to support all of the applications used to manage this Power / Comms / Supervisory solution, including:
    • Power Management
    • Advanced Metering Infrastructure (AMI)
    • Asset Management
    • Fault Management
    • and many more

We’ve modelled the network on a small sub-section (circled in yellow) of the Terna network based on a network map shown in this link. Other than the high-level links shown in the diagram below, the rest of what’s modelled below is all hypothetical as we have no direct involvement with the Terna network.

We’ve modelled two off-shore windfarms, with the diagram below showing one of the off-shore platforms.

You’ll notice that there are two separate networks shown. The power network (in red) and the comms / supervisory (in yellow).

As shown in the diagram below, we’ve also modelled the networks all the way back to:

  • Residences that consume energy from the grid (the red power network), but also contribute back to the grid via solar micro-generation
  • Command and Control Centre where the power and comms / supervisory networks are managed from (yellow)

Note that the yellow link above between towers TW-01 and TW-02 shows the use of OPGW cables, which could be considered to represent both power (protective groundwire) and comms (optical fibres) networks.

Device Instances

First we start by building the locations and hierarchy of devices within them in Kuwaiba. The diagram below shows the new devices we’ve built to support this model:

The tree is only partially expanded. You’ll notice how these assets align with the conceptual diagrams above.

Applications

There are many applications required to manage these overlapping networks, as shown on the Private Cloud Hosting view below:

Meteorological Masts / Weather Station

We’ve modelled the Met Masts in Kuwaiba, where the following shows an image (set up as a background bitmap), overlaid with Mount Groups and connectivity within Kuwaiba’s Object View functionality.

The Mount Groups on this mast are loaded into Kuwaiba with weather sensors at different heights, as seen in the diagram below:

You’ll notice that the last 6 ports on the Data Acquisition device (DAQ) remain unused.

Connectivity

Next, we have to model the connectivity.  This includes three levels of connectivity – inter-site cabling, intra-site patching and then end-to-end connectivity.

We’ll start with the inter-site cabling in Kuwaiba’s OSP View:

Blue lines are the optical fibre cables and red represent the power links from the Wind Turbines to their off-shore platforms.

The following diagram takes a closer look at the power cables, both from the transformer on the off-shore platform, through to transmission, distribution and even lead-in cables to local residences:

The diagram below shows some local lead-in cables

Once all the inter-site connectivity has been created, we can see the topological view, which is almost identical to the first image above:

Intra-site connectivity, such as patching at the ODFs (Optical Distribution Frames), splicing and cable management allows us to build end-to-end connectivity. The diagram below shows a snippet of the rack layout view on the off-shore platform at Brindisi Wind Farm:

The diagram below shows how we manage the OPGW cable strung between towers TW-01 and TW-02. You’ll notice the “power ports” as well as the optical fibre tubes / strands:

Note that the “power port” naming used here could be misleading. Given that the groundwire is used to protect other conductors on the tower from lightning strikes, it doesn’t actually carry power as such. It’s also not a port, but the termination of a conductor. Power ports are part of the base Kuwaiba data model, probably intended to represent the power plugs on active equipment like modems and switches. We’ve simply re-purposed them to show connectivity of power cables and groundwires.

Now that we’ve created the connectivity, we can perform end-to-end tracing to ensure everything is properly connected.
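
Under the hood, this sort of end-to-end trace is just a traversal over the node / connection model. Here’s a minimal, tool-agnostic sketch (simplified object names loosely based on the diagrams above, not Kuwaiba’s actual API):

```python
from collections import defaultdict, deque

# Connections in the power network, modelled simply as edges between named objects
edges = [
    ("GEN-SUBSTATION", "TX-SUBSTATION-01"),
    ("TX-SUBSTATION-01", "DIST-SUBSTATION-01"),
    ("DIST-SUBSTATION-01", "POLE-07"),
    ("POLE-07", "RESIDENCE-C-0001"),  # this home also feeds back in via an inverter
    ("POLE-07", "RESIDENCE-C-0002"),
]

graph = defaultdict(list)
for a, b in edges:
    graph[a].append(b)


def trace(start):
    """Breadth-first trace of everything fed (directly or indirectly) from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(trace("GEN-SUBSTATION"))
```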

First we’ll trace the power tree from the Generation Sub-Station at Brindisi Smist:

You’ll notice that there are four homes highlighted, where customer C-0001 has additional microgeneration infrastructure and is supplying power back to the grid through an inverter.

Next, we trace the comms / supervisory network, which is more extensive:

Numbers 1 to 6 have been added manually to the screenshot to correspond to the end-points highlighted in yellow on the second and third diagrams above.

And finally, we’ll trace the resilient link from the Brindisi Wind Farm via CBL-02 (as opposed to the main cable CBL-01 that’s used to trace the other circuits above):

Double-click to take a closer look.

Customer Service Mappings

We’ve shown examples of how to map services and calculate service impact in our other earlier tutorials so we won’t cover these again here.

Summary

The demonstrated tool can be used for modelling any form of infrastructure (ie comms, power, water, IoT / sensors, etc). It’s just like an asset management system, but it also shows all the connectivity, geo-spatial positions and service / customer mappings. It’s far better as a design and ongoing infra management tool than the CAD drawings that are traditionally used.

I hope you enjoyed this introduction into how we’ve modelled sample overlapping power / comms / supervisory networks into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

 

A Few Caveats

Acknowledgements regarding modelling limitations above:

  1. Organisationally, Power and Comms will be run by different groups, using different tools that are optimised to each group’s requirements. On the flip-side, they’re all just nodes and arcs, so whether it’s telco, power, water, etc the assets and their connectivity should be easily modelled. However, real-time tools such as network health will be divergent in many cases, even though they share some common concepts
  2. There’s a default object in Kuwaiba called a “PowerPort,” which is probably intended to model power plugs on devices, as opposed to power cable management in the way I’ve used it. Any thoughts about a better name / terminology are welcomed (I have complete control over the data model, names, attributes, hierarchy, so I can make it anything that makes more sense BTW)
  3. A power cable will typically “terminate” the bus zone via droppers and then onto a transformer via bushings so it doesn’t really have ports as such, even though I’ve modelled it that way (mostly just for speed / ease initially, but I can take the time to model it more accurately upon request). 

If you think there are better ways of modelling this network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

OSS Sandpit – Telco Cloud / DC Inventory Prototype

This article provides a tutorial for building Telco Cloud / Data Centre components into the inventory module of our Personal OSS Sandpit Project.

This prototype has been a bit of a beast to build and includes components such as:

  • Hosting Services including:
    • IaaS (VMs, storage, network – FlexPod)
    • PaaS (ONTAP-AI a hosted AI solution, hosted voice)
    • SaaS (email, secure keys)
    • Internet ecosystems
    • Cloud provider ecosystems
    • CoLocation / Rack Management services
  • Leased Lines (submarine, terrestrial, customer tail and internet interconnect links)
  • Carrier MPLS network modelling, including:
    • IPAM (IP Address Management)
    • VPN / VLAN management
    • VRF and AS management
  • Virtualisation and application management
  • Equipment Layout art for:
    • Routers
    • Switches
    • Cisco FlexPod Chassis
    • NVIDIA ONTAP AI Chassis

This Telco Cloud prototype can be summarised as follows (noting that this is an invented network section):

You’ll notice:

  • A Telco Cloud network (light blue) with DCs in Melbourne, Sydney, Singapore, New York and London
  • These DCs are interconnected using MPLS over leased lines (note that each of the dotted lines actually has a resilient path, which we’ll describe a little later)
  • Each DC has a number of services / platforms at its disposal (as shown by the coloured dots)
  • A single customer with three sites (dark blue in Geelong, Queens and Kuala Lumpur)
  • The Customer Edge (CE) routers at these three sites are connected by local leased lines (yellow clouds) to the Provider Edge (PE) of the Telco Cloud
  • Ports, IP addresses / subnets and BGP AS numbers are identified

What you don’t see in this diagram are the submarine landing stations and submarine + terrestrial leased lines that are also modelled, but we’ll get to that later too.

In addition, the diagram below represents rack layouts at ML1 (Melbourne) DC.

Each of the other core sites is the same, except the ONTAP-AI (rack 05), which is only based in Melbourne DC.

Device Instances

First we start by building the locations and hierarchy of devices within them in Kuwaiba. The diagram below shows the new devices we’ve built to support this Telco Cloud model:

The tree is only partially expanded and only shows ML1. You’ll notice how these assets align with the conceptual rack layout view above.

This translates to the following FlexPod Chassis Rack View in Kuwaiba, where you’ll notice I’ve created Equipment Layout artwork for UCS, switches and storage:

While fiddling with face layouts, I accidentally stumbled on a cool little feature in Kuwaiba.

If you populate the “Model” field, you get the physical / connectivity mapping (of the UCS / Compute shelf in this case):

But if you leave the model field blank then you get the Logical / Virtualisation / Application view:

If you look closely at this view above (double-click if needed), then you’ll see where the various hosted customer services (IaaS, SaaS, PaaS) are stored.

The same can be seen on the AI as a Service (AIaaS). First, we see the physical layout of the NVIDIA DGX-1:

And then we can toggle to see the hosted customer AI services running on the DGX-1:

 

CoLocation (CoLo) Services

Speaking of customer services, we can also model CoLo services by allocating rackspace in the COLO rack, as follows:

Perhaps more importantly, we’ve also modelled the attributes of those services, including factors such as Power Feed, Space Type, Number of RUs, Access Type, Bandwidth, etc:

Connectivity

Next, we have to model the connectivity. 

Inter-Rack Connectivity

Firstly, we’ll start with the patching within the racks where we follow these wiring design guides from Cisco (FlexPod), which aligns with the FlexPod Chassis Rack View diagram above… 

…and NVIDIA (ONTAP-AI) respectively

 

Inter-DC Connectivity

Then we establish the Inter-DC connectivity, firstly starting with the Submarine and major terrestrial links between cities (double-click for a closer look):

Being an Australia-based Telco Cloud provider means there are extensive Leased Line links between Melbourne and Sydney:

And also terrestrial leased-lines across to the submarine landing station in Perth to support the links to Singapore:

Submarine Landing Station to DC Connectivity

However, we also have local leased lines that allow us to link the DCs to the Submarine landing stations:

From diverse landing stations at Beaconsfield (SY2) and Paddington (SY3) to the Sydney DC (SY1):

From diverse landing stations at Manasquan (US3) and Brookhaven (US4) to the New York DC (NY1). Note that I haven’t shown the US West-Coast landing stations at Morro Bay (US1) and Hillsboro (US2) but you can see the orange terrestrial links going off to them from US3 and US4 below.

I’ve only shown a single landing station in Singapore, which lands the SeaMeWe-3 and Australia-Singapore submarine cables. However, I’ve then shown diverse routes to the Singapore DC (SG1):

End-to-End DC Connectivity

All of the leased lines are now in place, but we now need to establish end-to-end routes through all these leases… connecting the dots as it were. Here are:

Diverse Routes showing all hops from ML1 to NY1….

…and Diverse Routes from ML1 to SG1

 

Customer Leased-line Connectivity

Then we build the leased lines to customer sites at Geelong (from ML1 DC), Queens (from NY1 DC) and Kuala Lumpur (from SG1 DC):

Internet Leased Lines

And finally the Leased Lines from the POI (Point of Interconnect) rack in the DC to local ISPs in each city. These are shown symbolically just to track their Leased Line identifiers rather than actual routes:

 

Mapping the MPLS Network

Now that all the physical connectivity is in place, we can record all the attributes of the MPLS network. You’ll remember that the first diagram above showed all the ports, IP addresses / subnets and BGP AS numbers.

Firstly, the IP / subnet allocations:

The right-hand pane shows the four major subnets (MPLS core, customer leases, Loopbacks and Customer Lease ranges respectively).

Just as a single example, the left-hand pane shows that 10.1.1.1 has been assigned to port Gi0/0/0/0 on the PE router at ML1 from the 10.1.1.0/28 subnet.
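
As an aside, this sort of assignment is easy to sanity-check programmatically. Here’s a minimal sketch using Python’s standard ipaddress module (illustrative only, not how Kuwaiba stores it):

```python
import ipaddress

subnet = ipaddress.ip_network("10.1.1.0/28")
assignment = {
    "address": ipaddress.ip_address("10.1.1.1"),
    "port": "Gi0/0/0/0",
    "device": "PE router at ML1",
}

# Validate that the assigned address really belongs to the subnet it was drawn from
assert assignment["address"] in subnet

# A /28 gives 16 addresses, 14 of them usable for hosts
print(f"{subnet} -> {subnet.num_addresses} addresses, {len(list(subnet.hosts()))} usable hosts")
```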

The diagram below then shows all of the MPLS Links in Kuwaiba (but note that I’ve manually overlaid the blue cloud to make it easier to see the core DC Interconnect network (with P routers in the POI rack in the DCs on its perimeter)). PE routers (in the PE rack in the DCs) are shown connecting to the P routers, but also connecting to the Customer Edge (CE) routers at customer sites. 

Then finally we show the MPLS attribute mappings.

The upper pane shows pools of VRFs (VPN00001 for the demonstrated customer’s network and another for the backbone network), as well as AS allocations.

The lower pane shows an example VRF and how it’s associated with:

  • Customer LAN IP subnet (172.16.1.0/26)
  • VLANs (VLAN10 and VLAN20 in this case)
  • PE routers on which the VRF resides
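
A minimal sketch of how those associations might be captured in code (an illustrative data structure with assumed PE router names, not Kuwaiba’s data model):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vrf:
    name: str
    customer_lan: str                                    # customer LAN IP subnet
    vlans: List[str] = field(default_factory=list)
    pe_routers: List[str] = field(default_factory=list)  # PEs on which the VRF resides


vpn00001 = Vrf(
    name="VPN00001",
    customer_lan="172.16.1.0/26",
    vlans=["VLAN10", "VLAN20"],
    pe_routers=["PE-ML1", "PE-NY1", "PE-SG1"],  # assumed names for the PE routers
)
print(vpn00001)
```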

Customer Service Mappings

The following shows Service Mappings for Customer 0001. They’ve racked up a lot of services here! This service inventory can be used to assist with billing the customer each month.

 

Summary

I hope you enjoyed this introduction into how we’ve modelled a sample Carrier Cloud / DC into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

More to come on SDN and SD-WAN in future articles.

If you think there are better ways of modelling this network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

OSS Sandpit – GPON Network Inventory Prototype

This article provides a tutorial for building GPON (Gigabit Passive Optical Network) components into the inventory module of our Personal OSS Sandpit Project. It also serves as a model for FTTH / FTTP (Fibre to the Home, Fibre to the Premises).

This prototype build includes components such as:

  • Passive Optical Network (cables, patch panels, splices, splitters, containment, multiports, other splice joints)
  • Active GPON equipment (OLT, ONT)

This GPON prototype can be summarised as follows (this invented network section has been visualised using Google Earth). Note that you should start from the Central Office / OLT site and trace outwards towards the ONT:

This is the physical representation of a typical GPON model such as the following:

Device Instances

The diagram below shows the new devices we’ve built to support this GPON network model:

This includes:

  • An OLT at the Central Office (Nokia ISAM FD 7302) with:
    • 18 Available card slots
      • Containing GPON cards (Nokia FGLT-8 cards)
        • 8 GPON ports
  • An FDH (Fibre Distribution Hub) that contains patch arrays and splitters
  • Manholes / Handholes containing:
    • Splice Joints (eg AJL, LJL types)
    • Fibre Loops
    • Multiports (MPT)
  • Cables (mains, distribution and lead-ins)
  • ONTs (Optical Network Terminal devices) on the customer premises

Physical Connections

There is quite a lot of patching and port mirroring required to create the physical connectivity between the customer sites and central office.

This includes a small section of local fibre network (LFN) drawn in Google Earth:

You’ll notice the 4 x Lead-in cables emanating from handhole HH-D-005. The green squares indicate handholes. The blue diamonds represent multiports and the blue arrowheads represent AJL / LJL splice joints.

These are shown as follows in the Kuwaiba OSP view:

The diagram below shows the conceptual view of the GPON network in the upper section, with corresponding physical path trace shown in the lower section (Points given for anyone who can spot the error in the text I’ve overlaid on the trace).

Click on the image above to show the more detailed trace from the OLT through the splitter at the FDH to the four homes. Note that the next version of Kuwaiba is expected to show a more fanned-out visual presentation of the trace data.

The same data can be shown in the physical tree view below. You’ll notice that four separate branches emanate from the Splitter inside the FDH, indicating a 1:4 split (although only 3 branches are visible below).

You’ll notice that in this instance, we have to perform a TraceDown (ie from OLT to premises) using the trace Physical Tree functionality. TraceUp (ie from premises to OLT) provides slightly spurious data.
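
That TraceDown / TraceUp asymmetry makes sense when you picture the underlying model: each element records a single upstream parent, so tracing up from a premises follows one path, while tracing down from the OLT has to fan out at the splitter. Here’s a minimal, tool-agnostic sketch (invented object names, not Kuwaiba’s implementation):

```python
# Each element records its single upstream parent; children are derived from that.
parent = {
    "SPLITTER-FDH-01": "OLT-PON-PORT-1",
    "ONT-PREM-1": "SPLITTER-FDH-01",
    "ONT-PREM-2": "SPLITTER-FDH-01",
    "ONT-PREM-3": "SPLITTER-FDH-01",
    "ONT-PREM-4": "SPLITTER-FDH-01",
}


def trace_up(node):
    """Premises -> OLT: a single path, following parent pointers."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def trace_down(node):
    """OLT -> premises: fans out 1:N at the splitter."""
    children = [c for c, p in parent.items() if p == node]
    return {node: [trace_down(c) for c in children]} if children else node


print(trace_up("ONT-PREM-3"))
print(trace_down("OLT-PON-PORT-1"))
```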

We’ve excluded details about creating fibre joints (eg AJLs, LJLs, etc) and splicing to simplify the scenario, but more details and screen-caps of cable management can be found in this earlier inventory article.

Service Modelling

The diagram below shows that we’ve modelled two service types for each customer here:

  • One for GPON line rental
  • The other is for a retail service – connection to the Internet via Carrier / ISP

Note that the right-hand pane shows customer details for this service.

The diagram below shows more Service Impact details (ie the resources that the GPON service is utilising):

Note that it displays these associations in alphabetical order, not in a traceUp or traceDown sequence.

Summary

I hope you enjoyed this introduction into how we’ve modelled a sample GPON network into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

If you think there are better ways of modelling this network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

OSS Sandpit – Fixed Wireless Network Inventory Prototype

This article provides a tutorial for building Fixed Wireless (FW) Network components into the inventory module of our Personal OSS Sandpit Project.

This prototype build includes components such as:

  • A fixed wireless core network
  • Radio Links across licensed and unlicensed (5 GHz and 24 GHz) bands
  • Line of Sight and Viewshed analysis of each Radio Link
  • Fibre links (including cable management)
  • Tower Management
  • Routing and Switching
  • Layer 2 and Layer 3 service modelling (eg VLANs, VPLS tunnels, etc)

This Fixed Wireless prototype can be summarised as follows:

The primary link is at the bottom of the image (ie 101C – RICH – SWIN – BOXH). The fibre leased-line (101C – BOXH) is for resilience only.

The main intent of this network is to carry L2 services (between Customer Sites 90001 and 90003) and L3 services (between Customer Site 90001 and DXM1). This is depicted in the service model diagram below:

We’ll revisit the modelling of these services later.

In this post we’ll describe the following use-cases:

  • Building Reference Data like data hierarchies, device types, etc
  • Performing Site Qualification and Line of Sight Analysis between locations
  • Creating Device Instances including buildings, towers, radios, etc
  • Creating Physical Connections between devices (eg radio links, fibre links)
  • Creating Customer Service Modelling 

 

Reference Data

Starting off with the data hierarchy, we had to develop some new building blocks (data classes) to support fixed wireless assets and new link types that allow us to quickly identify the differences in link types from high-level diagrams.

We’ve developed a custom data hierarchy as follows:

  • Country
    • Radio_Infra (to separate FW core network assets from other network assets)
      • Site (core network sites – 101C, RICH, SWIN, BOXH, SURH)
        • Buildings / Comms Rooms
          • Rack
            • Equipment
      •  Tower
        • Appurtenances (ie attachments to the tower)
          • Mount Groups (ie the frames / mounts that connect attachments to the towers / poles)
            • Equipment (including radio units)
    • City
      • Customer Sites
        • Devices (ODU / antenna, IDU, routers)

This required a few new templates, including Customer Sites and FW Core Sites.

Site Qualification and Line of Sight Analysis

In a Fixed Wireless network, effective communications links rely on line of sight between radio units.

We used Google Earth for Site Qualification and Line of Sight. Here is the plan view of our small section of network:

But we need to ensure each of these legs is visible:

Here’s a view of the 101C – RICH link:

Here’s another view of the same link showing clearance above the MCG light towers:

Here's the link from RICH – SWIN:

Here’s SWIN-BOXH:

And finally SWIN – SURH:

All clear on the link analyses above.
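As a rough cross-check on the Google Earth analysis, the first Fresnel zone radius can also be calculated for each hop using the standard formula F1 = 17.32 * sqrt(d1 * d2 / (f * d)) (distances in km, frequency in GHz, result in metres), with the usual rule of thumb of keeping at least 60% of F1 clear of obstructions. A minimal sketch with hypothetical hop distances:

import math

def first_fresnel_radius_m(d1_km: float, d2_km: float, freq_ghz: float) -> float:
    """First Fresnel zone radius (metres) at a point d1_km from one end and d2_km from the other."""
    d_km = d1_km + d2_km
    return 17.32 * math.sqrt((d1_km * d2_km) / (freq_ghz * d_km))

# Hypothetical example: an obstruction at the midpoint of a 3km, 24GHz hop.
radius = first_fresnel_radius_m(1.5, 1.5, 24.0)
required_clearance = 0.6 * radius   # common 60% rule of thumb
print(f"F1 radius: {radius:.2f} m, required clearance: {required_clearance:.2f} m")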

For comparison, the following diagram shows the corresponding core network, including fibre links, in Kuwaiba’s Outside Plant Module:

The colour-coding of the links in the diagram above is as follows:

  • Light Blue = 5GHz Radio Links (unlicensed, approx 150Mbps)
  • Green = 24GHz Radio Links (unlicensed, approx 1.4Gbps)
  • Orange = Licensed Radio Links
  • Royal Blue = Fibre Links

The fibre links required fibre management using Kuwaiba as shown below:

Note: The viewshed functionality in Google Earth provides a useful approximation of line of sight from a given point. The green shading in the example below approximates the areas where coverage can be achieved from BOX-MNT-01 (the first mount on tower 1 at the Box Hill core site).

Device Instances

We then create the devices in Kuwaiba to build the prototype network model shown in the first diagram above. Not all devices are shown.

Here’s a partial rack view of the first rack at 101C:

We can also drill down into patch management within the rack as follows:

Tower Management

The following diagram shows a simulation of the tower at 101 Collins St (not actual), specifically showing the mount groups and other key attributes we’ve modelled in Kuwaiba:

Attributes such as elevation, azimuth and horizontal offset have all been identified from the Line of Sight Analysis done earlier in Google Earth.
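If you wanted to derive those attributes programmatically rather than reading them off Google Earth, the azimuth (initial bearing) and distance between two coordinates are straightforward to calculate. A minimal sketch, using hypothetical coordinates rather than the actual site locations:

import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing (azimuth) in degrees from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return (math.degrees(math.atan2(x, y)) + 360) % 360

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi, dlon = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2
    return 6371000 * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# Hypothetical coordinates only (not the actual 101C / RICH sites).
site_a = (-37.8136, 144.9722)
site_b = (-37.8220, 144.9830)
print(f"Azimuth: {bearing_deg(*site_a, *site_b):.1f} deg, distance: {distance_m(*site_a, *site_b):.0f} m")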

Note that the attributes of 101C-MNT-01 are shown in the right pane below:

Physical and Logical Connections

There is quite a lot of patching and port mirroring required to create the connectivity between the core sites, customer sites and data centre.

The following diagrams show the three leased fibre lines (we’ve excluded fibre joints and splicing to simplify the scenario, but more details and screen-caps of cable management can be found in this earlier inventory article):

Fibre cable from 101C to BOXH:

Fibre cable from 101C to customer site 90003 (including identification of cable name in left pane):

A full trace all the way from Customer Site 90001 to the Data Centre (L3 service chain) is shown in the diagram below. Click to view in full size:

 

Service Modelling

Re-showing the second diagram above, you’ll notice that there are a number of important service points relating to the L2 (the red, upper path) and L3 (the blue, lower path) services offered over this FW network:

If you look closely at the diagram below, you’ll notice that all of these service points have been modelled into Kuwaiba for Customer 90001:

You’ll also notice the Layer 2 services have been expanded to also show service impact (ie which devices / circuits / cables each link in the service chain relies on). The full service impact is modelled for L3 services as well in this demo, but not expanded in the screen-cap.

We’ve also modelled the Fibre Links as Leased Line Services, including service impacts shown below:

Summary

I hope you enjoyed this introduction into how we’ve modelled a sample Fixed Wireless network into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

If you think there are better ways of modelling this network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

OSS Sandpit – Smart City / IoT Network Inventory Prototype

This article provides an example of building Smart City and IoT (Internet of Things) Network components into the inventory module of our Personal OSS Sandpit Project.

This prototype build includes components such as:

  • A Command and Control Centre (CCC)
  • Satellite Earth Stations
  • Smart Buildings including:
    • Compute / Hosting (VxBlocks)
    • Comms (incl Unified Comms and In-Building Coverage)
    • Security (eg CCTV, Access Control)
    • Building Management Systems (BMS)
    • Public Address / Audio-Visual
    • HVAC 
  • An optical fibre ring network and direct fibre backhaul links to radio towers
  • Towers / masts that are affixed with 5G and LoRa antenna and radio heads as well as point to point microwave antenna
  • 5G infrastructure (see this earlier post for more details on 5G setup)
  • LoRaWAN infrastructure (including LoRa antenna / gateway, LoRa network server,  app servers and the join server)
  • IoT Sensors including:
    • Power Management Systems
    • Smart Meters
    • Parking Sensors
    • Traffic Control Systems (TCS)
    • Variable Messaging Systems (VMS)
    • Tollway Systems
    • Rail Control Systems (RCS)
    • Vehicle Detection Systems (VDS)
    • IoT Asset / Logistics Management

This smart-city / IoT prototype can be summarised as follows:

This diagram replicates a smart city I helped to design a few years ago for Ha’il in Saudi Arabia. This smart city was intended to house around 500,000 people and align with an existing university. A dry dock, business park and airport were also features of the design that we prepared in conjunction with KEO Architects and Ernst & Young. It was a really interesting exercise in design and commercial modelling.

This smart city hasn’t been built yet, so the network you see modelled in the Inventory tool below is purely hypothetical.

Note that this network was built around a GPON network model, especially for the residential areas, but we’ll be covering that in a later prototype article.

In this post we’ll describe the following use-cases:

  • Building Reference Data like data hierarchies, device types, etc
  • Creating Device Instances including rack views and the virtualised layers within them
  • Creating Physical Connections between devices (eg fibre ring, radio network backhaul)
  • Creating Logical Connections between devices (eg LoRaWAN service mappings and LoRaWAN network layout)

 

Reference Data

Starting off with the data hierarchy, we had to develop some new building blocks (data classes) to support a more granular tower asset model and IoT sensors (as per yellow highlights below):

We’ve developed a custom data hierarchy as follows:

  • Country
    • Site (for key sites – Command & Control Centre, University, Airport, Business Centre and the Dry Dock)
      • Comms Room (Room)
        • Rack
          • Equipment (including VxBlocks, routers, ODFs, etc)
            • NFVI
              • NFVs (including 5G Core, Firewalls, LoRaWAN)
                • Apps (eg LoRa Network Server, Application Servers, etc)
    • Mobile Base Station (BTS) Sites
      • Comms Huts (Building)
        • Rack
          • Equipment
      • Tower
        • Appurtenances (ie attachments to the tower)
          • Mount Groups (ie the frames / mounts that connect attachments to the towers / poles)
            • Equipment (including remote units, LoRa gateways and antenna)
    • City
      • IoT Devices

This required a few new templates, including these BTS / LoRa sites:

More importantly, we needed to include additional classes and attributes to correctly model the towers. First we needed to add Mount Groups. These mounts hold the antenna and remote units that provide 5G coverage across the estate:

You'll notice in the left-hand pane there are three sector mounts [MNT-01 to 03] (all at 30m elevation above ground level) that provide 360 degree 5G coverage (ie 3 x 120 degree sector cells). You'll also notice above that MNT-01 has an azimuth of 0 degrees, MNT-02 has an azimuth of 120 degrees and MNT-03 has an azimuth of 240 degrees (not shown).

MNT-04 holds the LoRa Gateway (which is an omni-cell – providing 360 degree coverage at an elevation of 25m). Meanwhile, MNT-05 holds a point-to-point microwave radio antenna (at an elevation of 29m).

The right-hand pane shows the additional attributes required to model the tower mounts.
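With three sector mounts at azimuths of 0, 120 and 240 degrees, each sector nominally covers ±60 degrees either side of its boresight. The sketch below shows one way to work out which mount would nominally serve a given bearing from the tower; it's an illustration of the geometry only, not a radio planning calculation:

SECTOR_AZIMUTHS = {"MNT-01": 0, "MNT-02": 120, "MNT-03": 240}  # degrees, as modelled above

def angular_difference(a: float, b: float) -> float:
    """Smallest absolute angle between two bearings, in degrees."""
    return abs((a - b + 180) % 360 - 180)

def serving_sector(bearing_from_tower_deg: float) -> str:
    """Pick the sector mount whose boresight is closest to the bearing towards the endpoint."""
    return min(SECTOR_AZIMUTHS,
               key=lambda mount: angular_difference(SECTOR_AZIMUTHS[mount], bearing_from_tower_deg))

print(serving_sector(95))   # -> MNT-02 (120 degree sector)
print(serving_sector(350))  # -> MNT-01 (0 degree sector)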

This tower configuration is reflected in the diagram below:

Device Instances

We then create the devices to build the prototype network model shown in the first diagram above.

And a more detailed view of the IoT Management:

You’ll also see these VNFs and Apps reflected in the Rack Layout of the VxBlock below:

Physical and Logical Connections

There is quite a lot of patching and port mirroring required to create the connectivity between the CCC, key sites, Base Stations, Earth Station, Satellite and IoT Sensor Sites, as follows:

Ha’il City Overview

The diagram below is a GIS view of the estate from within the Kuwaiba inventory tool, which leverages Google Maps.

Note that this mirrors the first image above, with the exception of the Airport and Satellite which are off the top of the page.

You’ll also notice that there is a fibre ring (red lines) between key sites as well as point-to-point fibre backhaul to the BTS site. IoT sensors are also shown (LoRa radio connectivity not shown though).

Fibre link between CCC and BTS001

5G antenna and remote unit are at the left, connecting to the 5G Core at right.

One leg of the Fibre Ring

You’ll notice that this is the link between a router in the CCC (Data Centre – Comms Room 1) to the Comms Room at Ha’il University.

LoRaWAN Logical Connectivity

There are likely to be additional App Servers required, but two have been included for demonstration purposes.

 

Satellite Modelling

Refer to this earlier post for how to model Satellite networks and services.

 

IoT Service Modelling

The service view shows the Traffic Light service (MIoT) as well as the infrastructure it uses (ie the IoT Traffic Light device, the nearest LoRa Gateway to these fixed IoT devices and the LoRa network server in the CCC).

We could model connectivity of all the IoT sensors back through the LoRa Gateway with physical links, but instead we’ve simplified, showing only the service utilisation. 

Service Impact Analysis (SIA)

We can also use the service relationships to determine which customer services would be affected if the LoRa Gateway (HA001-LORA-01) failed. In the image above, there would be three traffic light services affected (see under “Uses” in the bottom pane).

Similar analysis could be done using the getAffectedServices API that we demonstrated in the OSS Sandpit Inventory Intro post.

Summary

I hope you enjoyed this introduction into how we’ve modelled a sample Smart City and IoT network into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

If you think there are better ways of modelling this Smart City network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

OSS Sandpit – Satellite Network Inventory Prototype

This article provides an example of building Satellite Network components into the inventory module of our Personal OSS Sandpit Project.

This prototype build includes components such as:

  • A Satellite
  • Earth Stations
  • Satellite Aggregation Site
  • Beams (including Beam to Earth Station mappings)
  • Customer services
  • Satellite Dishes
  • Satellite Receivers (ODU / IDU)
  • Satellite Modems
  • Leased Lines (backhaul)

Our prototype is summarised in the diagram below:

We describe this via the following use-cases:

  • Building Reference Data like data hierarchies, device types, connectivity types, containment, device layouts, templates, flexible data models, etc
  • Creating Device Instances including rack views and the virtualised layers within them
  • Creating Physical Connections between devices
  • Creating Logical Connections between devices

Reference Data

Starting off with the data hierarchy, we had to develop some new building blocks (data classes) to support the devices and multiplexing used in satellite networks (including those highlighted in yellow below):

In our prototype, we’ve developed a custom containment model as follows:

  • Country
    • Site (for head-end equipment)
      • System
        • Rack
          • Equipment
    • City (for customer sites)
      • CustomerSites
        • Equipment
    • Satellite_Infra
      • Satellite Earth Stations
        • Rack
          • Equipment
      • Aggregation Site
        • Transmission System
          • Rack
            • Equipment
      • Satellite
        • Beam (downlinks)
        • Uplinks (as VirtualPorts)

In a real situation, you probably wouldn’t bother to model to this level of detail as it just makes more data to maintain. We’ve just included this detail to show some of the attributes of our sample satellite network.

The satellite network also required some new templates, especially for the Earth Stations and Customer sites that have many devices and ports that you wouldn’t want to re-create each time.

Device Instances

We then create the devices to build the prototype network model shown in the first diagram above. This includes:

  • Satellite
  • Earth Stations
  • Satellite Aggregation Site
  • Satellite Dishes
  • Satellite Receivers (ODU / IDU)
  • Satellite Modems
  • Switches
  • Routers

The diagram below shows a small snapshot of the Geraldton Earth Station. The templates we created earlier helped to avoid re-creating these hierarchies for each Earth Station and Customer Site:

Earth Stations (including Infrastructure within Geraldton):

SkyMuster II Satellite (showing beams and transceivers / uplinks):

SkyMuster II Satellite Attributes:

Satellite Customer Sites:

Customer Site Details:

Physical and Logical Connections

There is quite a lot of patching and port mirroring required to create the connectivity between the Customer Sites, Aggregation Site, Earth Station and Satellite, as follows:

Head-end (Customer core site, Aggregation Site, Earth Station):

Earth Station to Satellite:

Satellite to Customer Sites (modelled as MPLS Links to handle many:one links to the satellite):

Customer Site (including identification of which beam it is mapped to):

Note: Whilst there would be significant outside plant (OSP) such as cables, joints and splices, especially between the Earth Station (in Geraldton, in Western Australia) and the Aggregation Site (in Eastern Creek, in New South Wales), we’ve only shown a point-to-point leased fibre link. To see how fibre cables, splice joints and ODFs are modelled, refer to the introduction to the OSS Sandpit inventory module.

Service Impact Analysis (SIA)

We can also use the service relationships to determine which customer services would be affected if the Geraldton router (GER-RTR-01) failed. In the example below, there would be two services affected (see under “Uses” in the bottom pane).

The upper pane shows all of the devices and links that Customer Service C-42-00001 relies upon for service.

Similar analysis could be done using the getAffectedServices API that we demonstrated in the OSS Sandpit Inventory Intro post.

Summary

I hope you enjoyed this brief introduction into how we’ve modelled a sample Satellite network into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

If you think there are better ways of modelling this Satellite network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

Root Cause by Hierarchy (RCH)

The challenging thing about establishing root-cause is that the rules tend to be fairly unique to each network. Vendors, topologies, interface specs, etc tend to be quite different, so the rules need to be customised to each network.

But there are a few rules that can be applied to any network. Yesterday we described a root-cause algorithm called Root Cause Trace (RCT). Today we’ll look at Root Cause by Hierarchy (RCH). 

When the cause of a fault happens within a domain, then it tends to be easier to resolve than if it goes cross-domain. RCT and RCH are examples of cross-domain RCA (Root Cause Analysis) techniques.

The diagram below shows two halves of a network sub-section. The upper half shows the physical connectivity (with “circuits” overlaid as dotted lines).

The bottom half shows how this connectivity can be drawn as network layers. They're not exactly OSI layers, but there are some parallels; how closely they relate to OSI depends on how you've structured your object / data hierarchy in your inventory (LNI / PNI) tool.

The concept behind RCH is that if you have an alarm on one of the lower-layers in the hierarchy, then it is the root-cause and all related alarms from upper layers can be associated / suppressed.

For example, if there are Loss of Signal alarms on the two Line Ports on the ADMs (SDH Add Drop Multiplexers), then it’s likely to be a break in the physical path that the Digital Link traverses [Note that you could apply the RCT rule from yesterday’s post to determine which patchlead, cable, joint or ODF is the likely culprit].

Therefore, any alarms coming from the Tributary Ports on the ADM, or any alarms emanating from the VNFs (ie SMF and UPF), result from the physical path break (ie they're in the higher layers).
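To make that logic concrete, here's a minimal sketch of an RCH pass (illustrative Python only, not how any particular tool implements it). It assumes each alarmed object carries a layer number taken from the inventory hierarchy (lower = more physical) and that the inventory can tell us which lower-layer objects each object relies upon:

# Illustrative RCH sketch. Assumed inputs:
#   alarms:    list of (object_id, layer) tuples; lower layer = closer to the physical path
#   relies_on: dict mapping object_id -> set of lower-layer object_ids it directly depends on

def transitive_dependencies(obj, relies_on):
    """All lower-layer objects that obj ultimately depends on."""
    seen, stack = set(), list(relies_on.get(obj, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(relies_on.get(dep, ()))
    return seen

def rch(alarms, relies_on):
    """Nominate lowest-layer alarms as root causes; suppress higher-layer alarms that rely on them."""
    lowest_layer = min(layer for _, layer in alarms)
    root_causes = {obj for obj, layer in alarms if layer == lowest_layer}
    suppressed = {obj for obj, layer in alarms
                  if obj not in root_causes
                  and transitive_dependencies(obj, relies_on) & root_causes}
    return root_causes, suppressed

# Hypothetical example loosely based on the scenario above: a lower-layer link alarm,
# plus a tributary-port alarm and a VNF alarm that both depend on that link.
relies_on = {"ADM1-TRIB-01": {"DIGITAL-LINK-01"}, "UPF-01": {"ADM1-TRIB-01"}}
alarms = [("DIGITAL-LINK-01", 1), ("ADM1-TRIB-01", 2), ("UPF-01", 4)]
print(rch(alarms, relies_on))  # ({'DIGITAL-LINK-01'}, {'ADM1-TRIB-01', 'UPF-01'})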

You'll notice that I've also shown the OSP (Outside Plant) containment layers, not because they necessarily have a direct impact on the example above, but because of scenarios like a backhoe cutting through a duct, multiple subducts and multiple cables. That would create an alarm storm extending far beyond the infrastructure shown in the diagram above. In that case, the damaged duct is the real root-cause, which has likely also damaged the sub-ducts and cables within it.

Note: For simplicity, I've excluded other layers of containment (eg buildings, pits, poles, towers, etc) from the diagram. I've also simplified the network to exclude an SDH ring (with protection) and intermediate routing points, etc. However, the RCH concept becomes even more helpful across those more complex, multi-layer, cross-domain network scenarios.

You may also recall from the earlier article, “Proximity and Root Cause” that we talked about how RCH could actually help to resolve some complex alarms between layers in virtualised networks like 5G and SDN. If NFVI, VIM, VNFM, NFVO and EMS/NMS all store information separately with no way of correlating between layers, then hierarchical data from LNI / PNI with Root Cause by Hierarchy analysis could come in handy. 

Root-Cause Trace (RCT)

Since we're on the theme of root-cause this week (see the 4 RCA proximity techniques article), I thought I'd share a root-cause technique that relies on topology proximity. It's referred to as Root Cause Trace (RCT).

Many of our networks (eg access networks like FTTx, mobile, HFC, etc) are tree or bus in nature, whether that's in the physical, logical or virtual / hierarchical sense. That makes them well suited to modelling in graph databases. In the past, network traces (eg tracing up the tree from NTD #1 to Pillar #1 in the example below) commonly took a long time, because the relational databases traditionally used needed multi-table joins across tables with lots of records, which required lots of compute. Traces tend to be much more efficient with the graph databases we use today.

The RCT technique I’ll describe today makes use of those graphs.

In the three examples below, you'll notice that I've used passive inventory (pillars, cables, joints). These device types can't send alarms; only the NTDs can. The NTDs marked in orange are currently in an alarmed state. So if the passive devices can't send alarms, how can we determine whether any of them is the root-cause of a fault?

Let's start with the example where there's a fault with distribution cable #1. Perhaps it's been cut by an excavator. As we can see, all of the NTDs downstream of Dist Cable #1 are impacted (in orange). Therefore, if we trace up from each of the NTDs in this graph to the pillar, we can see that the circled cable is on 3 of 3 trace-up paths. That's the common point of failure. You'll also notice that NTDs #6 and #12 are also alarmed, but they appear to be unrelated. This allows us to create a new fault (relating to Dist Cable #1), associate the alarms from NTD #1, #2 and #3 to the fault, and suppress those NTD alarms (but not those from NTD #6 or #12).
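For those who like to see the mechanics, here's a minimal sketch of the trace-up logic (illustrative only, not a product implementation). It assumes the access tree is available from the inventory graph as a simple child-to-parent map, and it nominates the most specific passive element whose downstream NTDs are all alarmed:

# Illustrative RCT sketch. parent maps each object to its upstream element; leaves are NTDs.
parent = {
    "NTD-01": "DIST-CABLE-01", "NTD-02": "DIST-CABLE-01", "NTD-03": "DIST-CABLE-01",
    "DIST-CABLE-01": "JOINT-01",
    "NTD-04": "JOINT-01", "NTD-05": "JOINT-01", "NTD-06": "JOINT-01",
    "JOINT-01": "PILLAR-01",
    "NTD-07": "PILLAR-01",
}

def trace_up(node):
    """Walk from a node up towards the pillar, returning the upstream elements passed through."""
    path = []
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def downstream_ntds(element):
    """All NTDs whose trace-up path passes through the given element."""
    return {ntd for ntd in parent if ntd.startswith("NTD") and element in trace_up(ntd)}

def rct(alarmed_ntds):
    """Nominate the element whose downstream NTDs are all alarmed (most alarms explained wins)."""
    alarmed = set(alarmed_ntds)
    candidates = []
    for element in {e for ntd in alarmed for e in trace_up(ntd)}:
        served = downstream_ntds(element)
        if served and served <= alarmed:
            candidates.append((len(served), element))
    if not candidates:
        return None, set()
    _, root = max(candidates)  # ties broken arbitrarily in this sketch
    return root, downstream_ntds(root)

root, suppressible = rct(["NTD-01", "NTD-02", "NTD-03", "NTD-06"])
print(root)          # DIST-CABLE-01 (3 of 3 downstream NTDs alarmed)
print(suppressible)  # {'NTD-01', 'NTD-02', 'NTD-03'}; NTD-06 stays unsuppressed as unrelated

In practice you'd run the traversal in the graph database itself rather than in application code, but the decision logic is the same.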

 

 

The scenario below shows an example where Joint #1 is impacted. Perhaps a car has hit a pole that it was attached to, causing damage to the joint. Therefore, all NTDs downstream of the joint will be in an alarmed state. If we perform the same set of trace-ups, we find that Joint #1 is the common point (all 6 of 6 downstream NTDs are alarmed).

 

Now, if we look at the scenario below, pillar #1 has been damaged. Perhaps it was directly hit by a car. All NTDs downstream of it are alarmed. The trace-ups all combine at Pillar #1 (14 of 14 downstream NTDs are alarmed). We create a fault on the pillar, attach all the NTD alarms to it, and then suppress them.

Just one extra point to note here. It’s possible that NTDs #6 and #12 already had unrelated problems before the car hit the pillar. Repairing the pillar may remove that root-cause, but any suppressions need to be removed (from all NTDs) so that unrelated alarms from NTDs #6 and #12 can be picked up.

 

 

Proximity and Root-Cause

When it comes to identifying root-cause (ie identifying the actual thing that's broken / degraded rather than all of the other things that are affected downstream), I tend to think of proximity:

  1. Proximity in topology (ie nearest neighbours)
  2. Proximity by geography
  3. Proximity in time
  4. Proximity by object hierarchy (think OSI stack)

When devices generate alarms / logs, they don’t tend to have much in the way of proximal information though. The proximal information tends to arrive via an enrichment process by our OSS (or maybe the NMS that sits between device and OSS).

The use of topology to assist with root-cause calculation is commonplace, so I won't go into it here. You've probably also seen alarm states visualised as map overlays, so there's nothing unusual there either. However, we will take a closer look at the last two items in the list above.

Alarms / logs are timestamped, but time proximity is only achieved when viewed relative to the timestamps of other alarms / logs. The human brain can easily process proximity in time, but only if we provide suitable visualisation. Sequencing by timestamp is easy enough, but I’m a little surprised that our tools don’t make more use of sliders that allow us to readily scrub backwards and forwards in time (on historical events, or perhaps even projected future events). Perhaps long poll cycles (ie the time interval between requesting information from a device) can cloud the effectiveness of time proximity.

Nonetheless, time-scrubbers do increase the power of topology / geo views of alarm data too. They allow us to more readily see the ripple-out effect and hence deduce roughly where the event occurred (like where a rock landed in a pond based on where the concentric rings are emanating from). 
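Even without a scrubber UI, a simple first pass at time proximity is to bucket alarms whose timestamps fall within a short window of each other. A minimal sketch, assuming alarms arrive as (timestamp, alarm_id) pairs and using a hypothetical 30-second gap threshold:

from datetime import datetime, timedelta

def cluster_by_time(alarms, max_gap=timedelta(seconds=30)):
    """Group alarms into clusters where consecutive timestamps are within max_gap of each other."""
    ordered = sorted(alarms, key=lambda a: a[0])
    clusters, current = [], []
    for ts, alarm_id in ordered:
        if current and ts - current[-1][0] > max_gap:
            clusters.append(current)
            current = []
        current.append((ts, alarm_id))
    if current:
        clusters.append(current)
    return clusters

# Hypothetical alarms: the first two land in one cluster, the third is far enough apart to stand alone.
alarms = [
    (datetime(2022, 1, 1, 10, 0, 0), "LOS ADM1"),
    (datetime(2022, 1, 1, 10, 0, 5), "LOS ADM2"),
    (datetime(2022, 1, 1, 10, 7, 0), "FAN FAIL RTR3"),
]
print(len(cluster_by_time(alarms)))  # 2 clusters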

Object hierarchy is another proximity technique that doesn’t tend to be used very often, mainly because Fault Management tools don’t tend to store that information. For example, if a cable has been cut (layer 1 in OSI), then it’s common for child alarms to come from higher layers (eg data link, network, transport, session). RCA (Root Cause Analysis) rules can easily determine the root-cause (cable cut) to correlate and suppress higher layer alarms… but only if they have a reliable object hierarchy to refer to. Our Inventory (LNI / PNI) solutions *should* be able to store object hierarchies.

It's interesting though. I'm hearing that our industry is having trouble identifying root cause between the layers in our modern virtualised networks like 5G. As I indicated in this post about a 5G inventory prototype, we'd probably never store applications, VNFs, NFVI, VIM, VNFM, NFVO, etc as layers in our inventory solution… unless doing so could actually add value to root-cause by object hierarchy proximity. Hmmm… I wonder?

I’d love to hear your thoughts on this one. Leave us a comment below.

BTW. If RCA interests you, you might like to take a look at this old post that describes the steps for building up a systematic RCA pipeline

OSS Information Overload, Underload

Our OSS/BSS collect a lot of information. But how much of it is used and in what ways? How do the users find the information they need to make decisions?

In some cases, our OSS completely overload the user with information. An example might be performance metrics. Our network might have hundreds of nodes, and each node is collecting dozens or hundreds of performance data points every few minutes. If we just present this as hundreds of adjacent time-series graphs then we're making the task difficult for our users. Finding the few actionable data points is like trying to find a paragraph of text printed somewhere in the room shown below. Our users might never find it.

Image from mastertechmold.com 

In other cases, we underwhelm our users, not giving them all the data they need. How often do you hear of network operations staff receiving an alarm and then connecting to the device via CLI (Command Line Interface) to perform additional diagnostics? Or a capacity planner jumping between inventory, utilisation graphs, calculators, maps, etc.

For the overwhelming cases, look to deliver roll-ups / drill-downs of information. For example, show a rolled-up heat-map of all metrics across the whole of the network (eg green, orange, red). But then let users progressively drill down (eg by region, by colour / severity, by domain, etc) to find what they’re seeking.
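As a sketch of that roll-up / drill-down idea (illustrative only, with hypothetical metric names and thresholds), each raw sample can be reduced to a severity colour, the worst colour can bubble up to region level for the heat-map, and the underlying records remain available for drill-down:

SEVERITY_ORDER = {"green": 0, "orange": 1, "red": 2}

def classify(value, orange_at, red_at):
    """Map a raw metric value onto a severity colour using hypothetical thresholds."""
    if value >= red_at:
        return "red"
    return "orange" if value >= orange_at else "green"

def roll_up(samples):
    """Worst severity colour per region (the heat-map view)."""
    worst = {}
    for s in samples:
        current = worst.get(s["region"], "green")
        if SEVERITY_ORDER[s["colour"]] >= SEVERITY_ORDER[current]:
            worst[s["region"]] = s["colour"]
    return worst

def drill_down(samples, region, min_colour="orange"):
    """Return the individual records behind a region's roll-up, filtered by severity."""
    return [s for s in samples
            if s["region"] == region and SEVERITY_ORDER[s["colour"]] >= SEVERITY_ORDER[min_colour]]

samples = [
    {"region": "VIC", "node": "RTR-01", "metric": "cpu_util", "value": 97, "colour": classify(97, 80, 95)},
    {"region": "VIC", "node": "RTR-02", "metric": "cpu_util", "value": 45, "colour": classify(45, 80, 95)},
    {"region": "NSW", "node": "RTR-03", "metric": "cpu_util", "value": 83, "colour": classify(83, 80, 95)},
]
print(roll_up(samples))            # {'VIC': 'red', 'NSW': 'orange'}
print(drill_down(samples, "VIC"))  # just the RTR-01 record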

For the underwhelming cases, walk the journey with the users – map the user personas, their most important activities, the data points they need to perform those activities (or generate via those activities) and determine whether supplementary data is required.

Personally, I’ve tended to find that cross-linked data has given me great insights. I like being able to query data and mash it up to test ad-hoc hypotheses. We don’t necessarily need all the cross-linked data baked into our OSS/BSS tools, but we do need great data query and visualisation tools.

That’s one of the reasons the data visualisation block features prominently in my OSS Sandpit architecture. 

When considering which tool to use, I looked beyond the norm as the typical telco data management tools tend to have a few limitations.

I decided to look into the types of tools used heavily in the finance industry, in part because I wanted to test Candlestick / Bollinger Band functionality, but also because finance handles massive data sets (often time-series data) and needs to present them in a way that actionable insights can be derived quickly and easily.
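For anyone unfamiliar with the technique, Bollinger Bands are just a rolling mean bracketed by a number of rolling standard deviations (typically two), which makes them handy for spotting outliers in noisy time-series such as utilisation or latency metrics. A minimal sketch over a plain Python list, with hypothetical latency samples:

from statistics import mean, stdev

def bollinger_bands(series, window=20, num_std=2.0):
    """Return (middle, upper, lower) band values for each point once the window is full."""
    bands = []
    for i in range(window - 1, len(series)):
        chunk = series[i - window + 1 : i + 1]
        mid = mean(chunk)
        sd = stdev(chunk)
        bands.append((mid, mid + num_std * sd, mid - num_std * sd))
    return bands

# Hypothetical latency samples (ms); the final spike breaches the upper band.
latency = [20, 21, 19, 22, 20, 21, 20, 19, 22, 21, 20, 21, 19, 20, 22, 21, 20, 21, 19, 20, 45]
mid, upper, lower = bollinger_bands(latency, window=20)[-1]
print(round(upper, 1), latency[-1] > upper)  # True -> worth flagging for investigation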

I’m currently envisaging different data scenarios in which to test graphing techniques like the following samples provided by Kx Dashboards using Vega:

… including:

  • Sankey diagrams – to show relative activity flow volumes
  • Candlestick / Bollinger – to show trends and exceptions (see BBs on ChartIQ, which is also integrated into Kx Dashboards)
  • Radar views – to map complex comparisons such as performance or utilisation of assets over multiple months
  • Word clouds – to show most common text / phrases in log files
  • Geo – overlay data such as customer counts, device counts, utilisation, percentage connections, network health, etc, etc onto map views
  • Radial convergence – to show the volume of interconnections between devices / ports
  • Hierarchical edge bundles – to show network hierarchies for root-cause-trace (RCT). This might include tying together the virtualisation stacks for networks like 5G, where it’s proving to be a challenge to identify root-cause.
  • Area grouping – to show predominant network connectivity between areas
  • Circular ties – to show graph data relationships
  • Not to mention all your typical graphing models such as
    • Bubble charts – to show relative volumes
    • Scatter-plots – to show network performance vs line length
    • Histograms
    • Bar charts
    • Pie charts
    • etc

OSS Sandpit – 5G Network Inventory Prototype

5G networks seem to be the big investment trend in telco at the moment. They come with a lot of tech innovation, such as network slicing and an increased use of virtualised network functions (VNFs). This article provides an example of building 5G Network components into the inventory module of our Personal OSS Sandpit Project.

This prototype build includes components such as:

  • Hosting infrastructure
  • NFVI / VIM (NFV Infrastructure and Virtualised Infrastructure Management)
  • A 5GCN (5G Core Network)
  • An IMS (IP Multimedia Subsystem)
  • An RIC (RAN Intelligent Controller)
  • Virtualised Network Functions (AUSF, AMF, NRF, CU, DU, etc – a more extensive list of examples is provided later in this article)
  • Mobile Edge Compute (MEC)
  • MEC Applications like gaming servers, CDN (Content Delivery Networks)
  • Radio Access Network (RAN) and Remote Radio Units (RRU)
  • Outside Plant for fibre fronthaul and backhaul
  • Patching between physical infrastructure
  • End to end circuits between DN (Data Network), IMS, 5GCN, gNodeB, RRU
  • Logical Modelling of 5G Reference Points

Our prototype (a Standalone 5G model) is summarised in the diagram below:

Or, if we’re to look at this as a multi-domain knowledge graph, the diagram below shows the same information, but as planes of data (domains) with interconnections.

We describe this via the following use-cases:

  • Building Reference Data like data hierarchies, device types, connectivity types, containment, device layouts, templates, flexible data models, etc
  • Creating Device Instances including rack views and the virtualised layers within them
  • Creating Physical Connections between devices
  • Creating Logical Connections between devices
  • Creating Network Slices in the form of services
  • Performing Service Impact Analysis (SIA)

Reference Data

Starting off with the data hierarchy, we had to develop some new building blocks (data classes) to support the virtualisation used in 5G networks. This included some new network slice types, virtualisation concepts and various other things:

In our prototype, we’ve developed a custom containment model as follows:

  • Country
    • Site
      • System (Network Domain)
        • Rack
          • Hosting
            • NFVI / VIM
              • VNF-Groupings (eg CU, DU, MEC, IMS, etc)
                • VNF
                  • Apps (like Gaming Servers)

In a real situation, you probably wouldn’t bother to model to this level of detail as it just makes more data to maintain. We’ve just included this detail to show some of the attributes of our sample 5G network.

5G also required some new templates, especially for core infrastructure that can house dozens of VNFs, 5G reference points and apps (eg games servers, CDN, etc) that you don’t want to recreate each time.

The 5G System architecture includes the following network functions (VNFs), among others:

  • Authentication Server Function (AUSF).
  • Access and Mobility Management Function (AMF).
  • Data Network (DN), e.g. operator services, Internet access or 3rd party services.
  • Unstructured Data Storage Function (UDSF).
  • Network Exposure Function (NEF).
  • Network Repository Function (NRF).
  • Network Slice Specific Authentication and Authorization Function (NSSAAF).
  • Network Slice Selection Function (NSSF).
  • Policy Control Function (PCF).
  • Session Management Function (SMF).
  • Unified Data Management (UDM).
  • Unified Data Repository (UDR).
  • User Plane Function (UPF).
  • UE radio Capability Management Function (UCMF).
  • Application Function (AF).
  • User Equipment (UE).
  • (Radio) Access Network ((R)AN).
  • 5G-Equipment Identity Register (5G-EIR).
  • Network Data Analytics Function (NWDAF).
  • CHarging Function (CHF).

Device Instances

We then create the devices to build the prototype network model shown in the first diagram above. This includes:

  • Hosting infrastructure
  • NFVI / VIM (NFV Infrastructure and Virtualised Infrastructure Management)
  • A 5GCN (5G Core Network)
  • An IMS (IP Multimedia Subsystem)
  • An RIC (RAN Intelligent Controller)
  • Virtualised Network Functions (AUSF, AMF, NRF, CU, DU, etc)
  • Mobile Edge Compute (MEC)
  • MEC Applications like gaming servers, CDN (Content Delivery Networks)
  • Radio Access Network (RAN) and Remote Radio Units (RRU)

The diagram below shows a small snapshot of the 5G Core. The templates we created earlier sure came in handy to avoid re-creating these hierarchies for each device type:

Note that the VirtualPorts are used for 5G reference points to support logical links, which we’ll cover later.

The diagrams below show the rack-layout views of core and edge hosting respectively. You’ll notice the hierarchy of device, NFVI, VNF-group, VNFs and applications are shown:

Physical Connections

To create the physical connectivity between core, edge and RRU, we’ve re-used the fibre cables, splice joints and ODFs that we demonstrated in the introduction to the OSS Sandpit inventory module.

In this case, we’ve just used fibres that were spare from last time and patched onto the 5G network’s physical infrastructure. The diagram below shows the physical path all the way from the Data Network (DN – aka a core router) to the transmitting antenna at site 2040.

This diagram includes router, core hosting, ODFs (optical patch panels), cables, splice joints, edge hosting, Radio Units and antenna, as well as fibre front and backhaul circuits.

Logical Connections

We also decided to create the various logical connections – for the most part these are interfaces between VNFs – via the standardised 5G Reference Points.

You can also find a reference to the various logical interfaces / reference-points in the top-right corner of the prototype diagram (first diagram above).

You can also see the full list of reference points from any given VNF, as shown in the example of the AMF below. You'll notice that these have already been set up as logical links to other components, as shown under "mplsLink" in the bottom pane (ie the top pane shows the "ports" on the AMF, while the bottom pane shows the logical links to other VNFs).

The upper pane shows the instance of AMF (on the core) and its various interface points (the A-end of each interface as VirtualPorts). The lower pane shows the relationships to Z-end components via logical circuits (note that I had to model them as MPLS links, which is not quite right, but it's the workaround needed in the tool).

You’ll also notice that the AMF is used by a number of network slices (under “uses” in the bottom pane), but we’ll get to that next.

Network Slices

Whilst not really technically correct, we've simulated some network slices in the form of "internal" services. To simplify, for each network slice type we've created a separate service terminating at each RRU. So, we've associated each RRU, Mobile Edge Infra (RAN), AMF (the Access and Mobility Management Function within the core) and the NSSF (the Network Slice Selection Function within the core) with these network slice "services."

Some samples are shown below.

BTW 3GPP has defined the following Slice Types:

  • MIoT – Massive Internet of Things – to support huge device counts with enhanced coverage and low power usage
  • URLLC – Ultra-Reliable Low-Latency Communications – to support low-latency, mission-critical applications
  • eMBB – Enhanced Mobile Broadband – to provide high-speed data for applications (eg video conferencing, etc), and
  • V2X – Vehicle to Everything
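In 3GPP terms, each of these slice types maps to a standardised Slice/Service Type (SST) value (1 = eMBB, 2 = URLLC, 3 = MIoT, 4 = V2X), which combines with an optional Slice Differentiator (SD) to form the S-NSSAI that identifies a slice. A minimal sketch of that mapping in Python (the SD value shown is hypothetical):

from typing import Optional

# Standardised SST values per 3GPP TS 23.501; the SD values used below are hypothetical examples.
SST = {"eMBB": 1, "URLLC": 2, "MIoT": 3, "V2X": 4}

def s_nssai(slice_type: str, sd: Optional[str] = None) -> dict:
    """Build an S-NSSAI structure for a given slice type and optional Slice Differentiator."""
    nssai = {"sst": SST[slice_type]}
    if sd is not None:
        nssai["sd"] = sd
    return nssai

print(s_nssai("URLLC"))              # {'sst': 2}
print(s_nssai("MIoT", sd="000001"))  # {'sst': 3, 'sd': '000001'}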

Service Impact Analysis (SIA)

We can also use the service relationships to determine which Network Slices would be affected if the AMF failed. In the example below, there would be seven slices affected (see under "Uses" in the bottom pane), including all of those supported via sites 2040 and 2052.

Similar analysis could be done using the getAffectedServices API that we demonstrated in the OSS Sandpit Inventory Intro post.

SigScale RIM

Over the last few weeks, I’ve also been using another open-source inventory management tool from SigScale called RIM (a Resource Inventory Manager designed to support service assurance use cases). It shines a light on mobile networks in particular.

The project creators authored the TM Forum best practice document IG1217 Resource Inventory of 3GPP NRM for Service Assurance which details the rationale for, and process of, mapping 3GPP information models to TM Forum’s TMF634 (Resource Catalog Mgmt) and TMF639 (Resource Inventory Mgmt) standards.

I plan to also use RIM’s REST interface (based on TM Forum’s OpenAPIs) to share data both ways with the Kuwaiba inventory module in the future. 
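To give a feel for what that data sharing could look like, the sketch below queries a TMF639-style Resource Inventory endpoint over REST. The base URL, API version, field names and filtering syntax are all assumptions for illustration only; the actual paths depend on the RIM deployment and its TMF conformance profile:

import requests

# Assumed base URL and TMF639-style path; adjust to the actual deployment.
BASE = "https://rim.example.com/tmf-api/resourceInventoryManagement/v4"

def list_resources(resource_type=None, limit=20):
    """Fetch a page of resources from a TMF639-style inventory endpoint."""
    params = {"limit": limit}
    if resource_type:
        params["@type"] = resource_type  # filter field assumed; syntax varies by implementation
    response = requests.get(f"{BASE}/resource", params=params, timeout=30)
    response.raise_for_status()
    return response.json()

for resource in list_resources(limit=5):
    print(resource.get("id"), resource.get("name"))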

Summary

I hope you enjoyed this brief introduction into how we’ve modelled a sample 5G network into the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.

If you think there are better ways of modelling the 5G network, if I’ve missed some of the nuances or practicalities, I’d love to hear your feedback. Leave us a note in the contact form below.

OSS Sandpit – Resource / Inventory Module

This article provides a description of the inventory baseline, one module of our Personal OSS Sandpit Project.

As outlined in the diagram below, this incorporates the Inventory solution (by Kuwaiba), the graph database that underpins it as well as its APIs and data query tools. The greyed out sections are to be described in separate articles.

OSS Sandpit Inventory Baseline

We’ve tackled inventory first as this provides the base data set about resources in the network that other tools rely upon.

As the baseline introduction to the inventory module, we’ll provide a quick introduction to the following use-cases:

  • Building Reference Data like data hierarchies, device types, connectivity types, containment, device layouts, templates, flexible data models, etc
  • Creating Device Instances including rack views
  • Creating Physical Connections between devices
  • Creating Logical Connections between devices
  • Creating Services and their relationships with resources / inventory
  • Creating Outside Plant Views on geo-maps that include buildings, pits, splice cases, cable management, splicing, towers, antenna, end-to-end L1 circuits
  • Assigning IP Addresses and subnets with an IPAM tool
  • Creating an MPLS network
  • Creating an SDH network
  • Data import / export / updates via APIs including Service Impact Analysis (SIA)
  • Data import / export / updates via a Graph Database Query Language

Reference Data

Kuwaiba has a highly flexible and extensible data model. We’ve added many custom data classes (eg device categories like routers, switches, etc) such as those shown below:

And selectively added custom attributes to each of the classes (such as the Router class below):

Once the classes are created, we then create the Containment model (ie hierarchy of data objects). In our prototype, we’ve developed a custom containment model as follows:

  • Country
    • Site
      • System (Network Domain)
        • Rack
          • Equipment and so on.

We’ve also created a series of data templates to simplify data entry, such as the Cisco ASR 9001 and Generic Router examples below:

But we can also create templates for other objects, such as cables. The following sample shows a 24 fibre cable with two loose-tubes, each containing 12 fibre strands. (Note that colour-coding on tubes and strands is important for splicing technicians and designers)

Site and Device Instances

Next, we created some sites and devices within the sites, as shown below:

You’ll notice that some devices are placed inside a rack whilst others aren’t. You’ll also notice the naming convention for all devices (eg site – system – type – index, where site = 2052, system = DIS (distribution), type = CD (CD player for messaging) and index = 01 (the first instance of CD player at this site)).
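Because the naming convention is consistent, device names can be generated or parsed programmatically when bulk-loading or validating data. A minimal sketch, assuming a hyphen-separated site-system-type-index format (the separator is an assumption on my part):

from dataclasses import dataclass

@dataclass
class DeviceName:
    site: str
    system: str
    dev_type: str
    index: int

def parse_device_name(name: str) -> DeviceName:
    """Parse names of the assumed form site-system-type-index, eg '2052-DIS-CD-01'."""
    site, system, dev_type, index = name.split("-")
    return DeviceName(site, system, dev_type, int(index))

def build_device_name(site: str, system: str, dev_type: str, index: int) -> str:
    return f"{site}-{system}-{dev_type}-{index:02d}"

print(parse_device_name("2052-DIS-CD-01"))        # DeviceName(site='2052', system='DIS', dev_type='CD', index=1)
print(build_device_name("2052", "DIS", "CD", 2))  # 2052-DIS-CD-02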

The tool even allows us to show rack layouts (ie equipment positions inside a rack):

And even patching-level details inside the rack (pink and blue lines represent patch-leads connecting to ports on the Cisco ASR 9001 router in rack position 2):

Physical Connections

Physical connections can take the form of patch-leads or via strands / conductors inside cables.

The diagram below represents a stylised optical fibre connection that we’ve created between a CODEC at site 2000 and another CODEC at site 2052. As you’ll also notice, it traverses two patch panels (ODFs – optical distribution frames), two splice joints and three optical fibre cables.

In our inventory tool, the stylised connection above presents as follows, where A and B have been added to indicate the patch-leads from the CODECs to patch-panels (ODFs):

Logical Connections

We can also represent logical and virtual connections. In the case below, we show a logical connection from the waveguide of an antenna, via the broadcast of that signal to a neighbouring site, which picks up the signal at the UAST (receiver).

Outside Plant Views

Outside plant (OSP) comprises the cables, joints, manholes, etc that help connect sites and equipment together. In the example below, we see the OSP view of the fibre circuits we described above in "Physical Connections." If you look closely at the GIS (map overlay) below, you'll spot sites 2000 and 2052, as well as the cables and splice joints. The lines show the physical route that the cables follow.

You may also have noticed that the green line is showing a radio broadcast link, which is point-to-point radio and therefore follows a straight line path from antenna to antenna.

Cable Management

Cable management and splicing / connections are supported, with tubes / strands being selected and then terminated at each end of the cable (in this case CABLE1 and its strands connect the splice case in the left pane with the ODF in the right pane). These can be managed on a strand-by-strand basis via the central pane. From the diagram, we can see that fibre 001 in CABLE 1 is connected to F1-001 in the splice case and 001-back on the ODF, per the A-end and B-end details in the bottom left corner.

From the naming convention, you’ll notice that there are two sets of cable “ports” in the splice case, as indicated by fibre numbers starting with F1 and F2 respectively.

Topology Views

The diagram below shows a topological view of the devices within a site, helping operators to visualise connectivity relationships.

Services

One of the most important roles that inventory solutions play is as a repository of equipment and capacity. They also assist in allocating available resources to customer services. In the example below, service number “2052-ABC_LR_97.3FM-BSO” has a dependency on a tower, antenna, antenna switch frame and many more devices. If any of these devices fails, it will impact this customer service, as we’ll describe in more detail below.

 

IP Address Management (IPAM) and IP Assignment

We can manage IP address ranges / subnets, such as the examples below:

And then allocate individual IP addresses to devices, such as assigning IP address 222.22.22.1 to the CODEC, as shown on the “Physical Connection” diagram above.
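As a lightweight illustration of the same idea outside the tool, Python's ipaddress module can carve up subnets and hand out the next free address. The subnet matches the example assignment above; the allocation logic itself is purely illustrative:

import ipaddress

subnet = ipaddress.ip_network("222.22.22.0/24")
assigned = {}  # ip -> device

def assign_next_free(device: str) -> str:
    """Assign the next unused host address in the subnet to a device."""
    for host in subnet.hosts():
        ip = str(host)
        if ip not in assigned:
            assigned[ip] = device
            return ip
    raise RuntimeError("Subnet exhausted")

print(assign_next_free("CODEC-A"))  # 222.22.22.1 (matching the assignment above)
print(assign_next_free("CODEC-B"))  # 222.22.22.2

# Carving the /24 into smaller per-site ranges is just as easy:
print(list(subnet.subnets(new_prefix=28))[:2])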

MPLS Network

The following provides a simple MPLS network cloud for a customer:

APIs (including Service Impact Analysis Query)

The solution has hundreds of in-built APIs that facilitate queries, additions, modifications and deletions of data.

The example shown below is getAffectedServices, which performs a service impact analysis. In this case, if we know that the device TEST-CD-02 fails, it will affect service number “2052-ABC_LR_97.3FM-BSO.” We can also look up the attributes of that service, which could include customer and customer contact details so that we can inform them their service is degraded and that repair processes have been initiated.

Note that the left-side pane is the Request and the right-side pane is the Response across the getAffectedServices API.
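To give a feel for scripting this rather than using the request/response panes shown above, here's a sketch of calling getAffectedServices from Python via a SOAP client (zeep). The WSDL URL, object class and identifier parameters are assumptions for illustration only; check the published WSDL of your deployment for the authoritative method signature and session handling:

from zeep import Client

# Assumed WSDL location; consult the actual Kuwaiba deployment for the real endpoint.
client = Client("http://kuwaiba.example.com/kuwaiba/KuwaibaService?wsdl")

# Parameter names below are illustrative assumptions (authentication / session handling omitted):
# we ask which services rely on the failed device so the NOC can notify affected customers.
affected = client.service.getAffectedServices(
    objectClass="GenericCommunicationsElement",  # assumed class name
    objectId="id-of-TEST-CD-02",                 # assumed identifier for the TEST-CD-02 device
)
for service in affected:
    print(service)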

Data management via queries of the Graph Database

This inventory tool uses a Neo4j graph database. Using Neo4j Browser, we can connect to the database and issue Cypher queries (Neo4j's query language, analogous to SQL, which allows you to read/write data from/to the database).

The screenshot below shows the constellation of linked data returned after issuing the Cypher query (MATCH (n:InventoryObjects…. etc)). The data can also be exported in other formats, not just the graphical form shown here.
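The same queries can also be scripted rather than run interactively in Neo4j Browser. Here's a minimal sketch using the official neo4j Python driver; the connection details and the returned property name are assumptions, while the InventoryObjects label comes from the query shown above:

from neo4j import GraphDatabase

# Assumed connection details for a local Neo4j instance backing the inventory tool.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Cypher equivalent of browsing a slice of the inventory graph (property name 'name' is assumed).
query = """
MATCH (n:InventoryObjects)
RETURN n.name AS name
LIMIT 25
"""

with driver.session() as session:
    for record in session.run(query):
        print(record["name"])

driver.close()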

I hope you enjoyed the brief introduction to the Inventory module of our Personal OSS Sandpit Project. Click on the link to step back to the parent page and see what other modules and/or use-cases are available for review.