“Data mass is beginning to exhibit gravitational properties – it’s getting heavy – and eventually it will be too big to move.”
Guy Lupo in this article on TM Forum’s Inform that also includes contributions from George Glass and Dawn Bushaus.
Really interesting concept, and article, linked above.
The touchpoint explosion is helping to make our data sets ever bigger… and heavier.
In my earlier days in OSS, I was tasked with leading the migration of large sets of data into relational databases for use by OSS tools. I was lucky enough to spend years working on a full-scope OSS (ie its central database housed data for inventory management, alarm management, performance management, service order management, provisioning, etc).
Having all those data sets in one database made it incredibly powerful as an insight generation tool. With a few SQL joins, you could correlate almost any data sets imaginable. But it was also a double-edged sword. Firstly, ensuring that all of the sets would have linking keys (and with high data quality / reliability) was a data migrator’s nightmare. Secondly, all those joins being done by the OSS made it computationally heavy. It wasn’t uncommon for a device list query to take the OSS 10 minutes to provide a response in the PROD environment.
There’s one concept that makes GIS tools inherently more capable of lifting heavy data sets than OSS – they generally load data in layers (that can be turned on and off in the visual pane) and, unlike OSS, don’t attempt to stitch the different sets together. The correlation between data sets is achieved through geographical proximity scans, either algorithmically or just by the human eye of the operator.
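To make the layer idea a bit more concrete, here's a minimal sketch of an algorithmic proximity scan between two independent layers. It's only an illustration, not how any particular GIS or OSS does it. The layer contents, the field names (`name`, `lat`, `lon`) and the 100 m radius are all assumptions made up for the example. Note that the two layers share no linking key at all.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    earth_radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def proximity_scan(layer_a, layer_b, radius_m):
    """Pair up items from two layers that sit within radius_m of each other."""
    matches = []
    for a in layer_a:
        for b in layer_b:
            d = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
            if d <= radius_m:
                matches.append((a["name"], b["name"], d))
    return matches


# Hypothetical layers: neither one carries an ID that refers to the other.
devices = [
    {"name": "router-1", "lat": -37.8136, "lon": 144.9631},
    {"name": "router-2", "lat": -33.8688, "lon": 151.2093},
]
roadworks = [
    {"name": "dig-permit-A", "lat": -37.8140, "lon": 144.9633},
]

nearby = proximity_scan(devices, roadworks, radius_m=100)
for device, permit, dist in nearby:
    print(f"{device} is within {dist:.0f} m of {permit}")
```

The nested loop compares every item in one layer with every item in the other, so it's only suitable for small layers. For genuinely heavy data you'd want a spatial index of some kind, but the principle is the same: the data sets stay separate and the correlation is done on location alone.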
If we now consider real-time data (eg alarms/events, performance counters, etc), we can take a leaf out of Einstein’s book and correlate by space and time (ie by geographical and/or time-series proximity between otherwise unrelated data sets). Just wondering – how many OSS tools have you seen that use these proximity techniques? Very few in my experience.
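As a rough sketch of what space-and-time correlation could look like, the example below pairs alarms with performance samples that occurred close together in both time and location. Everything in it is hypothetical: the record fields (`id`, `ts`, `lat`, `lon`), the sample data, the 5-minute window and the 500 m radius are assumptions chosen purely for illustration.

```python
import math
from datetime import datetime, timedelta


def distance_m(p, q):
    """Great-circle (haversine) distance in metres between two records."""
    earth_radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(p["lat"]), math.radians(q["lat"])
    dphi = phi2 - phi1
    dlam = math.radians(q["lon"] - p["lon"])
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def correlate(events_a, events_b, max_gap, radius_m):
    """Return (a_id, b_id) pairs that are close in BOTH time and space."""
    pairs = []
    for a in events_a:
        for b in events_b:
            close_in_time = abs(a["ts"] - b["ts"]) <= max_gap
            # Only compute the distance if the time check already passed.
            if close_in_time and distance_m(a, b) <= radius_m:
                pairs.append((a["id"], b["id"]))
    return pairs


# Hypothetical real-time feeds with no shared device ID between them.
alarms = [
    {"id": "alarm-17", "ts": datetime(2024, 1, 1, 10, 0),
     "lat": -37.8136, "lon": 144.9631},
]
perf_samples = [
    # Two minutes later and roughly 175 m away: should correlate.
    {"id": "kpi-a", "ts": datetime(2024, 1, 1, 10, 2),
     "lat": -37.8150, "lon": 144.9640},
    # Same place, but two hours later: too far apart in time.
    {"id": "kpi-b", "ts": datetime(2024, 1, 1, 12, 0),
     "lat": -37.8150, "lon": 144.9640},
    # Same minute, but in a different city: too far apart in space.
    {"id": "kpi-c", "ts": datetime(2024, 1, 1, 10, 1),
     "lat": -33.8688, "lon": 151.2093},
]

related = correlate(alarms, perf_samples,
                    max_gap=timedelta(minutes=5), radius_m=500)
print(related)
```

Only the sample that is near the alarm in both dimensions is paired with it. A match like this suggests a relationship rather than proving one, which is the trade-off against a properly stitched data set.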
BTW, I’m the first to acknowledge that a stitched data set (ie one joined via linking keys such as device ID) is definitely going to be richer than uncorrelated data sets. Nonetheless, proximity-based correlation might be a useful technique if your data is getting too heavy for your OSS to lift (eg simple queries are causing minutes of delay for operators).