Deduplication: How It Works, and When It Doesn't

John_Clark1
Level 10

Everyone knows that Marketo won't allow duplicates to be created, right?  But how does it work, and how do you find out what happened when you find yourself with duplicates?

 

First we'll talk about how and when it does work.

 

Deduplication works for three processes of lead creation in Marketo.

 

  • List Import
  • Form Fill-Out
  • Lead creation through API

 

All of these processes use the Email Address field as the main identification key. When new data is submitted that includes an email address, Marketo looks through the lead records currently in your database for any record with a matching value in the Email Address field.  If it finds a record with the same Email Address value, the other field values included in the incoming "bundle" are written to the proper fields on that record.  If it does not find a record with a matching Email Address value, a new record is created in your database and the incoming information is written to that record.  Done. Bam! New Leads!
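
The matching behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Marketo's actual implementation: the `upsert_lead` function, the list-of-dicts "database", and the exact string comparison on "Email Address" are all assumptions made for the example.

```python
def upsert_lead(database: list[dict], incoming: dict) -> dict:
    """Illustrative model: match on Email Address, otherwise create a new record."""
    email = incoming.get("Email Address")
    if email:  # a missing or blank email can never match anything
        for record in database:
            # Exact comparison is a simplification chosen for this sketch.
            if record.get("Email Address") == email:
                # Match found: write the other incoming values onto that record.
                record.update(
                    {k: v for k, v in incoming.items() if k != "Email Address"}
                )
                return record
    # No match (or no email supplied): a new record is created.
    new_record = dict(incoming)
    database.append(new_record)
    return new_record


db = []
upsert_lead(db, {"Email Address": "test@marketo.com", "First Name": "Jane"})
upsert_lead(db, {"Email Address": "test@marketo.com", "Company": "Acme"})
print(len(db))  # 1
```

The second call finds the record created by the first, so the database still holds a single lead that now also carries the Company value.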

 

Now, let's talk about when deduplication doesn't or can't work.

 

Because each of these processes relies on the Email Address field as the key piece of identification, it must be present on both sides for deduplication to work.  By that I mean that each bundle of information you submit to Marketo for lead creation must have an Email Address the system can check, and any record currently in your database at the time the bundle is submitted must have an Email Address to check against. If that field is not part of an incoming bundle, then a new lead will always be created.  If there are records in your database with blank Email Address fields, then incoming leads can't be checked against them.

 

A new lead may have all the same information as the original, but if the email values don't match at the time the second lead is created, then deduplication cannot occur.
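
Since a bundle with no email address always creates a new lead, it can help to check a list before importing it. Here is a minimal sketch, assuming a CSV file with a header row and a column named "Email Address". The column name and the `rows_missing_email` helper are hypothetical, so adjust them to match your own file.

```python
import csv
import io


def rows_missing_email(csv_text: str, email_column: str = "Email Address") -> list[int]:
    """Return the 1-based data-row numbers whose email column is absent or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row_number, row in enumerate(reader, start=1):
        value = row.get(email_column)
        # DictReader fills short rows with None, so handle both None and "".
        if value is None or not value.strip():
            missing.append(row_number)
    return missing


sample = "First Name,Email Address\nJane,jane@example.com\nJohn,\n"
print(rows_missing_email(sample))  # [2]
```

Any row reported here would be created as a new lead on import, whether or not a matching person already exists in your database.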

 

 

How To Troubleshoot Duplicates

 

You can see here we have two duplicates.

pastedImage_8.png

 

The first was created manually.

pastedImage_18.png

 

And the second was created through a list import 10 minutes later.

pastedImage_17.png

pastedImage_12.png

 

When assessing duplicates, it's best to start by looking at the New Lead activity for each duplicate.  Marketo will not deduplicate lead information that is synced from your CRM (Salesforce, Microsoft Dynamics, etc.).  It will take the CRM's word for it that the lead is valid and the information is good.  So if one of the duplicates was created by a process in Marketo, and the other was created by Salesforce or Dynamics, then you'll know why the duplication happened.  It's also possible to create leads manually in your Lead Database by clicking New from the bar at the top.  This process does not deduplicate against the leads already in your database, but if you're creating them manually then you'll most likely know if they're duplicates to start with.

 

The next step in troubleshooting is to look at the logs for any change in the Email Address field.  Remember that the Email Address values are only compared at the time the second lead is created, so if you find changes to the Email Address in the activity log of either duplicate, then you can be pretty sure they weren't the same to begin with.

 

We can see that our lead which was created through list import had a change made to the Email Address field after it was created.  This change was from [null] to "test@marketo.com", and there weren't any other changes logged since the lead was created, so we can be sure that the lead was created without an email address.

pastedImage_27.png
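
The same reasoning can be written down as a small helper. This sketch assumes you have copied a lead's activity log into a list of dictionaries, oldest first. The keys `type`, `field`, `old`, and `new` and the function name are made up for illustration and are not a Marketo export format.

```python
from typing import Optional


def email_at_creation(current_email: Optional[str], activities: list[dict]) -> Optional[str]:
    """Work out the Email Address a lead had when it was created.

    If the log contains a change to Email Address, the old value of the
    earliest such change is what the lead was created with. If there is no
    change at all, the current value is the value it was created with.
    """
    for activity in activities:  # assumed to be ordered oldest first
        if (
            activity.get("type") == "Change Data Value"
            and activity.get("field") == "Email Address"
        ):
            return activity.get("old")
    return current_email


log = [
    {"type": "New Lead"},
    {"type": "Change Data Value", "field": "Email Address",
     "old": None, "new": "test@marketo.com"},
]
print(email_at_creation("test@marketo.com", log))  # None
```

A result of `None` corresponds to the [null] value in the log above: the lead had no email address when it was created, so there was nothing to deduplicate against.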

 

These troubleshooting steps will help to identify the cause of any duplicate you have in your database, and help you to prevent duplicates in the future.


11 Comments
Anonymous
Not applicable

Hi John,

I'm wondering if you can clarify this for me:

  • The next step in troubleshooting is to look at the logs for any change in the Email Address field.  Remember that the Email Address values are only compared at the time the second lead is created, so if you find changes to the Email Address in the activity log of either duplicate, then you can be pretty sure they weren't the same to begin with.

We've come across a lot of duplicate leads where this is exactly what happened. Do you know what the reason is that Marketo won't merge duplicate records if the email address value changes after the lead is first created? Ex.

  1. At 2:00 pm on her laptop, Jane.Doe@gmail.com fills out a form and becomes a new lead in Marketo
  2. At 2:15 pm on her mobile device, Jane.Doe+100@gmail.com fills out a form and becomes a new lead in Marketo
  3. At 2:30 pm on her mobile device, Jane.Doe+100@gmail.com fills out a second form and this time changes her email to Jane.Doe@gmail.com. Her email updates in Marketo but does not merge with the original Jane.Doe@gmail.com created at 2:00 pm.

Is there a strategic reason Marketo wouldn't automatically merge the two records at this point?

This seems to happen quite frequently for us, and I'm trying to explain to our Salesforce admins why it happens and why Marketo can't properly dedupe these leads.

Thanks,

Megan

Anonymous
Not applicable

Thanks for the article John,

Our current problem is that duplicates are created when the data is synced from Salesforce to Marketo. Even though email addresses are identical, Marketo still creates duplicate accounts for every lead that is synced from CRM. Can you please help me to understand what would be the reason for this and how I can fix it?

Thanks,

Ani

John_Clark1
Level 10

Hi Megan,

Deduplication only happens when a record is created.  In the scenario you describe, the records are created with different addresses, and the system wasn't designed to automatically merge records if their addresses change and then match.

I can't say for certain why this was done, or even if it was specifically designed this way.  I would encourage you to post that function as an idea though, to get some feedback from both our product managers and other users.

John

John_Clark1
Level 10

Hi Ani,

Deduplication won't take place when leads are created from the Salesforce sync.  If you're seeing duplicates being created this way, then you most likely already have a record in Marketo when the second record syncs from Salesforce.  You'll need to look at how leads are being created in both systems, and if possible, try to limit lead creation to a single source.  Either have all of your leads created in Marketo, or all of them in Salesforce.

John

Anonymous
Not applicable

Thanks, John

Anonymous
Not applicable

Thanks John

Anonymous
Not applicable

John,

We have a problem with dupes being created when we are assigning leads to a Salesforce queue. We use "Sync Lead to SFDC" in the flow step and assign the lead to the appropriate person or, under certain circumstances, directly to the queue.

I am not sure why this creates dupes since it's supposed to sync the leads. Could you explain to me why this is happening and what is a better approach for us to try?


Thank you!
Jennifer

John_Clark1
Level 10

Hi Jennifer,

Most likely you have Contacts that are going through this campaign in Marketo.  Contacts cannot be members of a queue, so Marketo creates a duplicate lead and adds that to the queue instead.  What I would suggest is adding an SFDC Type filter to the smart list of your campaign to prevent Contacts from qualifying.

Anonymous
Not applicable

You're the bomb John!

Anonymous
Not applicable

Hi John,

We have the same problem. We are getting leads from a specific Marketo campaign when they fill out a form on a particular landing page. We use "Sync Lead to SFDC" in the flow step to assign the incoming lead to the respective sales owner in SFDC.

This creates dupes. To avoid this, you suggested adding an SFDC Type filter to the smart list. Which SFDC filter should be added to the smart list? We have person attributes and company attributes under the SFDC filters.

Thanks!