My six laws of data integrity

Data integrity law #1 – As a data set is handled, its accuracy and integrity tend to degrade over time.

Data integrity law #2 – To prevent law #1 from making the data unusable, the data needs to be curated.

Data integrity law #3 – Curating data always carries a cost.

Data integrity law #4 – The more data there is, and the more referential integrity (i.e. cross-linking) it carries, the greater the cost of curating it.

Data integrity law #5 – If the same data is maintained in more than one place (without automated synchronisation), the decay described in law #1 is faster and the cost described in law #3 is higher.

Data integrity law #6 – To reduce costs and optimise integrity, retain only essential data, don’t duplicate it and keep cross-linking to a minimum.
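
To make law #5 a little more concrete, here is a minimal sketch of how two manually maintained copies of the same data can be compared for drift. The record names, the `inventory` and `billing` copies and the `find_drift` helper are all hypothetical, invented purely for illustration.

```python
_MISSING = object()  # sentinel so a missing record never equals a real one


def find_drift(copy_a, copy_b):
    """Return the keys whose records differ, or that exist in only one copy."""
    all_keys = set(copy_a) | set(copy_b)
    return sorted(
        key
        for key in all_keys
        if copy_a.get(key, _MISSING) != copy_b.get(key, _MISSING)
    )


# Two hypothetical copies of the same device data, maintained by hand.
inventory = {
    "RTR-01": {"site": "Sydney", "status": "active"},
    "RTR-02": {"site": "Perth", "status": "active"},
}
billing = {
    "RTR-01": {"site": "Sydney", "status": "active"},
    "RTR-02": {"site": "Perth", "status": "retired"},  # updated in one place only
    "RTR-03": {"site": "Darwin", "status": "active"},  # never added to inventory
}

drifted = find_drift(inventory, billing)
print(drifted)
```

Every key this returns is a record that someone has to investigate and reconcile, which is the curation cost of law #3 showing up because of duplication.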

The problem with law #6 is that it’s the cross-linking that often unearths the most dramatic insights.
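
As a rough illustration of why cross-linking costs effort to maintain, the sketch below checks a set of links for references to records that no longer exist. The device and circuit names and the `find_dangling_links` helper are assumptions made up for this example, not taken from any particular OSS product.

```python
def find_dangling_links(links, records):
    """Return the (source, target) links whose target record no longer exists."""
    return [(source, target) for source, target in links if target not in records]


# Hypothetical device records and circuit-to-device cross-links.
devices = {"RTR-01", "RTR-02"}
circuits_to_devices = [
    ("CCT-100", "RTR-01"),
    ("CCT-101", "RTR-02"),
    ("CCT-102", "RTR-09"),  # points at a device that has been removed
]

dangling = find_dangling_links(circuits_to_devices, devices)
print(dangling)
```

Each additional type of link needs a check like this one, so the curation effort grows with the amount of cross-linking even though the links themselves are where the insights come from.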

[Edit: Dougie Stevenson rightly suggested a seventh data integrity rule – always use data snapshots rather than production databases to work on your data for BI purposes such as building new reports]
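
One way to picture the snapshot rule is the sketch below, which uses Python's built-in `sqlite3` module and its `Connection.backup` method. It assumes, purely for illustration, that the production data lives in a SQLite database. The `devices` table and the `take_snapshot` helper are hypothetical, and a real OSS database would use its own export or replication tooling instead.

```python
import sqlite3


def take_snapshot(production, snapshot_path=":memory:"):
    """Copy the whole production database into a separate snapshot database."""
    snapshot = sqlite3.connect(snapshot_path)
    production.backup(snapshot)  # standard-library online backup (Python 3.7+)
    return snapshot


# Hypothetical stand-in for a production database.
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE devices (name TEXT PRIMARY KEY, status TEXT)")
production.executemany(
    "INSERT INTO devices VALUES (?, ?)",
    [("RTR-01", "active"), ("RTR-02", "retired")],
)
production.commit()

# Report building and experiments happen on the snapshot only.
snapshot = take_snapshot(production)
snapshot.execute("UPDATE devices SET status = 'scratch'")
snapshot.commit()

prod_statuses = [
    row[0] for row in production.execute("SELECT status FROM devices ORDER BY name")
]
snap_count = snapshot.execute("SELECT COUNT(*) FROM devices").fetchone()[0]
print(prod_statuses, snap_count)
```

Whatever happens to the snapshot while a report is being developed, the production records are left untouched.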

March 17, 2016
Ryan

If you found this article useful or valuable, subscribe (in the top-right corner of this page) and share. Let's spread the word and inspire more people to become passionate about OSS. Ryan is Passionate About OSS and has dedicated the last two decades to sharing his passion for OSS with the world. He is a founder, author, blogger, Engineer, connector and inquisitive learner about OSS and managing networks. To find out a little about his back-story and why he's so Passionate About OSS, click on the About Page. To connect with Ryan and the PAOSS team, click on the Contact page.

2 Responses

Douglas Stevenson says:

18/03/2016 at 4:08 am

Nice. I tend to agree.

When I do BI sorts of things, reporting, etc., I want to leave the reference data alone and use snapshots to do my sort of work.

In the snapshots – I consider them to be just that – SNAPSHOTS.

Anyway, Application data structures may be the right thing for Reporting… 😉

Ryan says:

18/03/2016 at 8:48 am

Great additional advice Dougie.
Especially in highly-available systems, like we tend to use, working with an offline snapshot of data is a great rule too!!
