“The more data you have, the more data you need to understand the data you have. You are engaged in a data ponzi scheme…Could it be in service assurance and IT ops that more data equals less understanding?”
Phil Tee in the opening address at the AIOps Symposium.
Interesting viewpoint right?
Given that our OSS hold shed-loads of data, Phil is saying we need lots of data to understand that data. Well, yes… and possibly no.
I have a theory that data alone doesn’t talk, but it’s great at answering questions. You could say that you need lots of data, although I’d argue in semantics that you actually need lots of knowledge / awareness to ask great questions. Perhaps that knowledge / awareness comes from seeding machine-led analysis tools (or our data scientists’s brains) with lots of data.
The more data you have, the more noise that you need to find signal in amongst. That means you have to ask more questions of your data if you want to drive a return that justifies the cost of collecting and curating it all. Machine-led analytics certainly assist us in handling the volume and velocity of data our OSS create / collect. That’s just asking the same question/s over and over. There’s almost no end to the questions that can be asked of our data, just a limit on the time in which we can ask it.
Does that make data a Ponzi scheme? A Ponzi scheme pays profits to earlier investors using funds obtained from newer investors. Eventually it must collapse the scheme eventually runs out of new investors to fund profits. In a data Ponzi scheme, it pays in insights from earlier (seed) data by obtaining new (streaming) data. The stream of data reaching an OSS never runs out. If we need to invest heavily in data (eg AI / ML, etc), at what point in the investment lifecycle will we stop creating new insights?Read the Passionate About OSS Blog for more or Subscribe to the Passionate About OSS Blog by Email