Customer data is often collected piecemeal, typically by a contact center team whose primary concern is something other than keeping exactly one record per customer. As a result, the database needs to be consolidated from time to time.

With the following two rules it is possible to conceive a 100% automated solution:

  • Any two customer records can be treated as the same customer if they share at least one piece of key information
  • If two customers don’t share any key information, nothing can be said about them, so they should be kept apart

So applying the equivalence relation and its transitive property to customer data, we get:

  • Equivalence: Person1 = Person2 if there exists at least one field F (key information) for which Person1.F = Person2.F
  • Transitivity: if Person1 = Person2 and Person2 = Person3 then Person1 = Person3

Note that transitivity covers more than the single-field case (if Person1.F = Person2.F and Person2.F = Person3.F, then Person1 = Person3). It also covers the case where the links go through different fields: if Person1.F = Person2.F and Person2.G = Person3.G, with F different from G, then Person1 = Person3. This is what makes the 100% automated solution possible.
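
One possible way to put the two rules into practice is a union-find (disjoint-set) structure. This is only a minimal sketch, not a reference implementation. The function name consolidate, the sample fields email and phone, and the decision that empty values never match are all assumptions made for illustration.

```python
from collections import defaultdict


def consolidate(records, key_fields):
    """Group record indices linked, directly or through a chain, by a shared key value."""
    parent = list(range(len(records)))

    def find(i):
        # Walk up to the root of i's group, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        # Merge the two groups; the smaller root index becomes the representative.
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # (field, value) -> index of the first record carrying that value
    for idx, rec in enumerate(records):
        for field in key_fields:
            value = rec.get(field)
            if not value:
                # Assumption: a missing or empty field is not "shared" information.
                continue
            key = (field, value)
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Hypothetical sample data: records 0 and 1 share an email (field F),
# records 1 and 2 share a phone (field G), record 3 shares nothing.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "A. Lee", "email": "ann@example.com", "phone": "555-0199"},
    {"name": "Ann L.", "email": "", "phone": "555-0199"},
    {"name": "Bob Ray", "email": "bob@example.com", "phone": "555-0123"},
]
groups = consolidate(customers, ["email", "phone"])
print(groups)  # [[0, 1, 2], [3]]
```

Records 0 and 2 have no field in common, yet they end up in the same group because each is linked to record 1 through a different field. Record 3 stays apart, as the second rule requires.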

(revised text from my own comment 12517678 at Experts-Exchange)