To reduce OSS dark data (or not)?

Dark data is the name for data that is collected but never used.
It’s said that 96-98% of all data is dark data (not that I can confirm or deny those claims).

Dark data forms the bottom layer in the DIKW (Data, Information, Knowledge, Wisdom) hierarchy shown below.
[Image: DIKW hierarchy]

What do you think the dark data percentage would be within OSS? Or, more specifically, within your OSS?
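
One rough way to put a number on that is sketched below. It assumes you can pull an inventory of what your OSS collects, along with some record of whether each data set has ever been read. Both are assumptions, and the `Dataset` fields, the `dark_data_percentage` helper and the sample figures are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical inventory record; a real OSS would need its own source for these.
    name: str
    size_gb: float
    read_count: int  # assumed counter: how many times anything has read this data


def dark_data_percentage(datasets):
    """Share of stored volume (by size) that has never been read."""
    total = sum(d.size_gb for d in datasets)
    if total == 0:
        return 0.0  # nothing collected, so nothing is dark
    dark = sum(d.size_gb for d in datasets if d.read_count == 0)
    return 100.0 * dark / total


# Illustrative values only, not measurements from any real system.
inventory = [
    Dataset("alarm_events", 40.0, 1200),
    Dataset("raw_syslog", 140.0, 0),
    Dataset("perf_counters", 20.0, 35),
]

print(f"Dark data: {dark_data_percentage(inventory):.1f}%")
```

Measuring by volume rather than by number of data sets is a design choice here; counting data sets instead would give a different (and possibly equally valid) answer.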

If you’re not going to use it, then why collect it?

I have two conflicting trains of thought here:

  • The Minimum Viable Data perspective; and
  • The hoarding perspective: raw data is relatively cheap and easy to collect and store once an interface is already built, so keep it all just in case your data scientists (or automated data algorithms) ever need it

Where do you sit on the data collection spectrum?
