By now I’m sure you’ve heard about graph databases. You may’ve even read my earlier article about the benefits graph databases offer when modelling network inventory when compared with relational databases. But have you heard the Graphene Database Analogy?
I equate OSS data migration and data quality improvement with graphene, which is made up of single layers of carbon atoms in hexagonal lattices (planes).
There are four concepts of interest with the graphene model:
- Data Planes – Preparing and ingesting data from siloes (eg devices, cards, ports) is relatively easy. ie building planes of data (black carbon atoms and bonds above)
- Bonds between planes – It’s the interconnections between siloes (eg circuits, network links, patch-leads, joints in pits, etc) that is usually trickier. So I envisage alignment of nodes (on the data plane or graph, not necessarily network nodes) as equivalent to bonds between carbon atoms on separate planes (red/blue/aqua lines above).
Alignment comes in many forms:
- Through spatial alignment (eg a joint and pit have the same geospatial position, so the joint is probably inside the pit)
- Through naming conventions (eg same circuit name associated with two equipment ports)
- Various other linking-key strategies
- Nodes on each data plane can potentially be snapped together (either by an operator or an algorithm) if you find consistent ways of aligning nodes that are adjacent across planes
- Confidence – I like to think about data quality in terms of confidence-levels. Some data is highly reliable, other data sets less so. For example if you have two equipment ports with a circuit name identifier, then your confidence level might be 4 out of 4* because you know the exact termination points of that circuit. Conversely, let’s say you just have a circuit with a name that follows a convention of “LocA-LocB-speed-index-type” but has no associated port data. In that case you only know that the circuit terminates at LocationA and LocationB, but not which building, rack, device, card, port so your confidence level might only be 2 out of 4.
- Visualisation – Having these connected panes of data allows you to visualise heat-map confidence levels (and potentially gaps in the graph) on your OSS data, thus identifying where data-fix (eg physical audits) is required
* the example of a circuit with two related ports above might not always achieve 4 out of 4 if other checks are applied (eg if there are actually 3 ports with that associated circuit name in the data but we know it should represent a two-ended patch-lead).
Note: The diagram above (from graphene-info.com) shows red/blue/aqua links between graphene layers as capturing hydrogen, but is useful for approximating the concept of aligning nodes between planesRead the Passionate About OSS Blog for more or Subscribe to the Passionate About OSS Blog by Email