Deduplication: How It Works, and When It Doesn't


Everyone knows that Marketo won't allow duplicates to be created, right?  But how does it work, and how do you find out what happened when you find yourself with duplicates?


First we'll talk about how and when it does work.


Deduplication works for three processes of lead creation in Marketo.


  • List Import
  • Form Fill-Out
  • Lead creation through API


All three of these processes use the Email Address field as the main identification key. When new data is submitted that includes an email address, Marketo looks through the lead records currently in your database for any record with a matching value in the Email Address field.  If it finds a record with the same Email Address value, the other field values included in the incoming "bundle" are written to the corresponding fields on that record.  If it does not find a record with a matching Email Address value, a new record is created in your database and the incoming information is written to that record.  Done. Bam! New Leads!
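The matching logic described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Marketo's implementation or API: the `Lead` class, the `upsert_by_email` function, and the exact string comparison on the email value are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Lead:
    """Hypothetical stand-in for a lead record (not a Marketo type)."""
    email: Optional[str]
    fields: Dict[str, str] = field(default_factory=dict)


def upsert_by_email(database: List[Lead], bundle: Dict[str, str]) -> Tuple[Lead, bool]:
    """Apply an incoming bundle and return (lead, created).

    Assumption: emails are compared as exact strings. The real matching
    rules may differ.
    """
    email = bundle.get("Email Address")
    other_values = {k: v for k, v in bundle.items() if k != "Email Address"}

    if email:
        for lead in database:
            if lead.email == email:
                # Match found: write the incoming values onto the existing record.
                lead.fields.update(other_values)
                return lead, False

    # No email to match on, or no match found: create a new record.
    new_lead = Lead(email=email or None, fields=other_values)
    database.append(new_lead)
    return new_lead, True


db: List[Lead] = []
upsert_by_email(db, {"Email Address": "ann@example.com", "First Name": "Ann"})
upsert_by_email(db, {"Email Address": "ann@example.com", "Company": "Acme"})
# db still holds one record, now carrying both First Name and Company.
```
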


Now, let's talk about when deduplication doesn't or can't work.


Because each of these processes relies on the Email Address field as the key piece of identification, it must be present on both sides for deduplication to work.  By that I mean that each bundle of information you submit to Marketo for lead creation must have an Email Address the system can check, and any record currently in your database at the time the bundle is submitted must have an Email Address to check against. If that field is not part of an incoming bundle, then a new lead will always be created.  If there are records in your database with blank Email Address fields, then incoming leads can't be checked against them.
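Since a bundle without an email address always creates a new lead, it can help to check a list for blank emails before importing it. Here is a minimal sketch of such a check, assuming the file is a CSV with a header row and a column named "Email Address". The function name, the column name, and the choice to treat whitespace-only cells as blank are all assumptions for illustration.

```python
import csv
import io
from typing import List


def rows_missing_email(csv_text: str, email_column: str = "Email Address") -> List[int]:
    """Return the 1-based data-row numbers whose email cell is absent or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row_number, row in enumerate(reader, start=1):
        # A missing column or a short row yields None; treat it like an empty cell.
        value = row.get(email_column) or ""
        if not value.strip():
            missing.append(row_number)
    return missing


sample = "First Name,Email Address\nAnn,ann@example.com\nBob,\nCara,  \n"
print(rows_missing_email(sample))  # [2, 3]
```

Any row this flags would be created as a new lead no matter what is already in the database, so it's worth fixing or removing those rows first.
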


A new lead may have all the same information as the original, but if the email values don't match at the time the second lead is created, then deduplication cannot occur.



How To Troubleshoot Duplicates


You can see here we have two duplicates.



The first was created manually.



And the second was created through a list import 10 minutes later.



When assessing duplicates it's best to start by looking at the New Lead activity for each duplicate.  Marketo will not deduplicate lead information that is synced from your CRM (Salesforce, Microsoft Dynamics, etc.).  It will take the CRM's word for it that the lead is valid and the information is good.  So if one of the duplicates was created by a process in Marketo, and the other was created by Salesforce or Dynamics, then you'll know why the duplication happened.  It's also possible to create leads manually in your Lead Database by clicking New from the bar at the top.  This process does not deduplicate against the leads already in your database, but if you're creating them manually then you'll most likely know if they're duplicates to start with.


The next step in troubleshooting is to look at the logs for any change in the Email Address field.  Remember that the Email Address values are only compared at the time the second lead is created, so if you find changes to the Email Address in the activity log of either duplicate, then you can be pretty sure they weren't the same to begin with.


We can see that our lead which was created through list import had a change made to the Email Address field after it was created.  This change was from [null] to "", and there weren't any other changes logged since the lead was created, so we can be sure that the lead was created without an email address.
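The two checks above (how each lead was created, and whether its Email Address changed afterward) can be expressed as a small helper. This is a sketch only: the activity dictionaries, their keys, and the source labels are hypothetical and do not reflect the format of a real Marketo activity log or API response.

```python
from typing import Dict, List, Optional

# Creation paths the article describes as NOT deduplicating.
# The label strings themselves are assumptions for this example.
NON_DEDUPLICATING_SOURCES = {"CRM Sync", "Manual"}

Activity = Dict[str, Optional[str]]


def explain_duplicate(activities: List[Activity]) -> List[str]:
    """Return human-readable findings for one lead's activity log."""
    findings = []
    for activity in activities:
        if activity["type"] == "New Lead":
            source = activity.get("source")
            if source in NON_DEDUPLICATING_SOURCES:
                findings.append(f"created via {source}, which does not deduplicate")
        elif activity["type"] == "Change Data Value" and activity.get("field") == "Email Address":
            old, new = activity.get("old"), activity.get("new")
            findings.append(f"Email Address changed from {old!r} to {new!r} after creation")
    return findings


# The two duplicates from the example above, in the hypothetical format.
first_lead = [{"type": "New Lead", "source": "Manual"}]
second_lead = [
    {"type": "New Lead", "source": "List Import"},
    {"type": "Change Data Value", "field": "Email Address", "old": None, "new": ""},
]

for log in (first_lead, second_lead):
    print(explain_duplicate(log))
```

An empty result for a lead means neither check explains the duplicate, and the logs need a closer look by hand.
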



These troubleshooting steps will help to identify the cause of any duplicate you have in your database, and help you to prevent duplicates in the future.
