What is the 1-10-100 Rule of Dirty Data?

The 1-10-100 rule states that the cost of identifying and correcting a data entry error grows exponentially the longer the error goes undetected.
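To put rough numbers on that, here is a minimal sketch. It assumes the commonly cited form of the rule, which this article does not spell out: about $1 per record to verify data at entry, $10 to correct it later, and $100 if the error is never fixed. The dollar figures and the `cleanup_cost` helper are illustrative assumptions, not measured costs.

```python
# Per-record costs in the commonly cited form of the 1-10-100 rule.
# These figures are illustrative assumptions, not numbers from this article.
COST_PER_RECORD = {"prevent": 1, "correct": 10, "failure": 100}


def cleanup_cost(bad_records: int, stage: str) -> int:
    """Return the illustrative cost of dealing with bad records at a given stage."""
    return bad_records * COST_PER_RECORD[stage]


if __name__ == "__main__":
    # Show how the same 200 bad records get more expensive at each stage.
    for stage in COST_PER_RECORD:
        print(f"{stage}: ${cleanup_cost(200, stage):,}")
```

Under those assumptions, 200 bad records cost $200 to catch at entry, $2,000 to clean up later, and $20,000 if they are left alone.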

How do you define dirty data?

Dirty data is like any other kind of dirt. It starts piling up immediately, and eventually, it becomes a problem. For many MSPs, an expensive, time-wasting problem.


The 1-10-100 rule was developed by George Labovitz and Yu Sang Chang in 1992.

Degrees of Dirty Data

If you’re cleaning up your dirty data manually, you’ll need to prioritize by importance. In principle, you can treat each data issue like an incident report and triage it, separating the high-priority issues from the low-priority ones.


The Worst Kind of Dirty Data 

Let’s say you have a vendor calling you asking you to sign a three-year term.  I know, unrealistic scenario, right? To make that decision effectively, you’d want not only perfect data on that vendor’s performance but also on how that performance compares to competing suppliers.  


By the logic of the 1-10-100 Rule, dirty data that you’re using to make decisions is the most important (i.e. expensive) kind, because those decisions could be flawed, and cost you a lot of money over time. Especially when you start compounding sub-optimal decisions.  


Better data = better decisions = better outcomes. 


Ain’t Nobody Got Time for That 

The next level of importance is data that wastes your time. Think of having the wrong contact for billing, or bad records that you need to go in and clean up manually. If you’re spending hours every week just reconciling records from different systems, those costs can add up.


Didn’t Need that Money Anyway 

Past that, there’s bad data that leaves money on the table. Like when you’re getting billed for 50 users and only billing the client for 45 because you don’t know about those other five users.  
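The 50-versus-45 gap above is the kind of thing a simple comparison can surface. Here is a minimal sketch, assuming you can export one list of users from the vendor invoice and another from your own client billing. The user IDs, the `find_unbilled_users` helper, and the per-seat price are all hypothetical.

```python
def find_unbilled_users(vendor_users: set[str], billed_users: set[str]) -> set[str]:
    """Return users the vendor charges you for that you are not billing to the client."""
    return vendor_users - billed_users


def monthly_leakage(unbilled_count: int, price_per_seat: float) -> float:
    """Estimate revenue left on the table each month (price is a hypothetical input)."""
    return unbilled_count * price_per_seat


if __name__ == "__main__":
    # Hypothetical exports: the vendor bills for 50 users, you bill the client for 45.
    vendor_users = {f"user{i:02d}" for i in range(1, 51)}
    billed_users = {f"user{i:02d}" for i in range(1, 46)}

    unbilled = find_unbilled_users(vendor_users, billed_users)
    print(sorted(unbilled))
    print(monthly_leakage(len(unbilled), price_per_seat=12.0))
```

In practice the hard part is getting both lists to use the same identifier for each user, which is itself a dirty data problem.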


Five Second Rule 

The last type of bad data lives in records that don’t have much of an impact. This stuff is like a wasp that lands on your sandwich. You’d rather it didn’t do that, but it’s not going to hurt you. Most MSPs don’t need to worry too much about this kind of dirty data.
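If you track data issues in a list, the four degrees above can double as a sort order. This is only a sketch: the `Severity` names and the sample issues are made up for illustration, and the ranking simply mirrors the order used in this article.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical labels for the four degrees of dirty data, most costly first."""
    DECISION = 1  # data you use to make decisions
    TIME = 2      # data that wastes your time
    REVENUE = 3   # data that leaves money on the table
    MINOR = 4     # data with little real impact


def triage(issues: list[tuple[Severity, str]]) -> list[tuple[Severity, str]]:
    """Return issues ordered from highest to lowest priority (stable within a level)."""
    return sorted(issues, key=lambda issue: issue[0])


if __name__ == "__main__":
    backlog = [
        (Severity.MINOR, "Old fax number on a closed account"),
        (Severity.REVENUE, "Client billed for 45 of 50 users"),
        (Severity.DECISION, "Vendor performance data incomplete before renewal"),
        (Severity.TIME, "Wrong billing contact on file"),
    ]
    for severity, description in triage(backlog):
        print(severity.name, "-", description)
```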

How Data Goes Bad

Data goes bad in a few different ways. The first is that data goes stale. It would be amazing if every record in your PSA were automagically updated whenever a change is made, but that’s not how it works. If a client has a new controller and you don’t know about it, you’ll just send an email to the old one. If that email address is still active, it could take weeks to sort it all out.

Then there’s data that is incorrect. I mean it was incorrect from the beginning.  If you’re trying to get hold of the controller at a client and have an incorrect email address, now you’re wasting time.  

A third kind is data that’s correct and up-to-date but has missing information. For example, you’ve got the name of the current controller at your client’s company, but don’t have an email address. Arguably better than having the wrong one because you don’t waste time sending an email that bounces right back, but still not ideal.  
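These three failure modes can be turned into a rough automated check. The sketch below makes several assumptions: a hypothetical `Contact` record with a `last_verified` date, a 180-day staleness threshold, and a simple shape test for email addresses. Note that it can only flag addresses that are visibly malformed. An address that is well-formed but belongs to the wrong person still needs a human to catch it.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumed threshold: treat a record as stale if nobody has verified it in 180 days.
STALE_AFTER = timedelta(days=180)

# Loose shape check only: something@something.something, no spaces.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass
class Contact:
    """Hypothetical contact record; field names are illustrative."""
    name: str
    email: Optional[str]
    last_verified: date


def classify(contact: Contact, today: date) -> list[str]:
    """Return the dirty data issues detected on a contact record."""
    issues = []
    if not contact.email:
        issues.append("missing")
    elif not EMAIL_SHAPE.match(contact.email):
        issues.append("incorrect")
    if today - contact.last_verified > STALE_AFTER:
        issues.append("stale")
    return issues


if __name__ == "__main__":
    controller = Contact("Pat Example", None, date(2023, 1, 15))
    print(classify(controller, today=date(2024, 6, 1)))
```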

The thing is, everybody has dirty data because it’s part of running a business. The key is to get on top of it and stay on top of it. Larger MSPs have people working on this full-time. That’s a source of competitive advantage. But there’s no reason why every MSP shouldn’t have better data to help them make better decisions.   

Together, we can get every MSP to a place where they don’t have any major dirty data issues and are making the best possible decisions as a result.