It wasn’t easy. Believe me. And honestly, I started with fewer duplicates than at any other company I’ve ever worked for.

 

Coming into Fliptop and getting to basically start the Marketo instance from scratch, I knew that I wanted to build things my way (the right way), and that included getting the database as clean as possible. There are many reasons why a clean, dupe-free database is a best practice. I think Elliot wrote a great piece on the why. I’m going to talk about the how.

 

Now, there is no such thing as a duplicate-free database. Actually, I just thought of what that database would look like: an EMPTY database would be a duplicate-free database.

 

There are tolerable levels of duplicates. According to Inga Romanoff, having between 5% and 10% of the entire database be duplicates is tolerable. I started my process with 4,200 duplicates out of an 85k database, representing 7% duplicates.

 

My first step was to narrow down how leads entered the system. Do this step first, as it is pointless to clean up the database without first stopping dirty data from entering the system. For me this meant switching all the forms on the Fliptop website and the blog from Salesforce web-to-lead forms to Marketo forms. Marketo automatically de-dupes leads if the email addresses match. Salesforce does not. I also had a bit of a problem with my engineering and customer success teams entering test leads into the system to test our own predictive scoring. I first cleaned up all those test leads and then built a data management campaign that runs on a monthly basis to delete test leads.
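
To make the "de-dupe on matching email" idea concrete, here is a minimal Python sketch of an upsert keyed on email address. It only illustrates the concept and is not how Marketo implements it; the function names and lead fields (`email`, `company`, `title`) are hypothetical, and I am assuming emails should be compared case-insensitively.

```python
def normalize_email(email):
    """Trim and lowercase so 'Ann@Example.com ' and 'ann@example.com' compare equal."""
    return email.strip().lower()


def upsert_lead(leads, new_lead):
    """Insert a lead, or merge it into the existing record with the same email.

    `leads` is a dict keyed by normalized email (a stand-in for the database).
    Blank incoming values never overwrite data already on the record.
    """
    key = normalize_email(new_lead["email"])
    existing = leads.get(key)
    if existing is None:
        # No match on email: this is a genuinely new lead.
        leads[key] = {**new_lead, "email": key}
        return leads[key]
    # Match on email: update the existing record instead of creating a duplicate.
    for field, value in new_lead.items():
        if field != "email" and value not in (None, ""):
            existing[field] = value
    return existing


leads = {}
upsert_lead(leads, {"email": "Ann@Example.com", "company": "Acme"})
upsert_lead(leads, {"email": "ann@example.com ", "title": "VP Marketing", "company": ""})
print(len(leads))  # one record, not two
```

A form that creates a new record on every submission skips the lookup step entirely, which is how the same person ends up in the database several times.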

 


 

Next, I gave my sales reps a tool for adding leads with full contact information. I turned to InsideView, as they integrate nicely with both Marketo and Salesforce. If the lead already exists in our system, InsideView will update it rather than creating a net new lead. My reps can research leads and add them into our CRM easily without creating a duplicate mess.

 

After closing down the avenues by which leads enter the system, I could turn to actually de-duping the database.

 

The process of getting to a place of zen and zero dupes relied in large part on a tool I found a LONG time ago called DemandTools, by CRM Fusion.

 

It is by no means the prettiest tool around, but it gets the job done. The tool comes with pre-built “scenarios” you can run to do sweeps of the database. Scenarios are basically matching criteria applied to the leads. The first sweep finds leads with the exact same email address. The next sweep finds leads with the same name and company name. With each sweep the criteria loosen up, like the teeth on a comb getting wider, so the duplicate matches become less precise. You can also de-dupe leads against contacts and even accounts with similar pre-built scenarios.
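
If it helps to picture the sweeps, here is a rough Python sketch of multi-pass matching, where each pass uses a looser key than the one before. This is my own illustration of the idea, not DemandTools' scenarios or API; the pass names, the record fields (`id`, `email`, `first`, `last`, `company`), and the rule of keeping the first record in each group as the survivor are all assumptions.

```python
from collections import defaultdict


def norm(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


# Ordered from strictest to loosest, like the teeth of the comb getting wider.
PASSES = [
    ("exact email", lambda r: (norm(r["email"]),)),
    ("name + company", lambda r: (norm(r["first"]), norm(r["last"]), norm(r["company"]))),
]


def find_duplicates(records):
    """Return a list of (pass_name, [record ids]) duplicate groups.

    After each pass, one survivor per group (the first record seen) stays in
    the pool, standing in for the merged record in later, looser passes.
    """
    remaining = list(records)
    groups = []
    for pass_name, key_fn in PASSES:
        buckets = defaultdict(list)
        for rec in remaining:
            buckets[key_fn(rec)].append(rec)
        survivors = []
        for recs in buckets.values():
            if len(recs) > 1:
                groups.append((pass_name, [r["id"] for r in recs]))
            survivors.append(recs[0])
        remaining = survivors
    return groups


records = [
    {"id": 1, "email": "ann@acme.com", "first": "Ann", "last": "Lee", "company": "Acme"},
    {"id": 2, "email": "ANN@acme.com", "first": "Ann", "last": "Lee", "company": "Acme"},
    {"id": 3, "email": "alee@gmail.com", "first": "ann", "last": "lee", "company": "ACME"},
    {"id": 4, "email": "bob@initech.com", "first": "Bob", "last": "Ray", "company": "Initech"},
]
print(find_duplicates(records))
# [('exact email', [1, 2]), ('name + company', [1, 3])]
```

Running the strict pass first means the confident matches are handled before the fuzzier ones, which are the groups most worth reviewing by hand before merging.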


I probably ran 10 or more sweeps using their different pre-built scenarios on just Leads, and then moved to de-duping Leads against Contacts to whittle the list down. In the end there were probably 200+ leads left in my "Possible Duplicates" smart list inside of Marketo that I de-duped by hand. I know this sounds tedious, but with only 200 left I felt I could see the light at the end of the tunnel, so I went for it. The result is that when I run the "Possible Duplicates" smart list in Marketo, I see "No leads were found."

 

So how do you handle how leads enter your system, and how do you manage duplicates?