    Duplicates from list uploads

      I seem to have a lot of duplicates in my DB (based on the built-in duplicate check, i.e. the Email Address field), where one of each "pair" is a lead or a contact and the other comes from a list upload (flagged as neither a lead nor a contact, and not synced to SFDC). Since the 'Email Address' field is the duplicate-matching criterion, the duplicates cannot have been created by uploading the list of email addresses to the wrong email field (we have a few), right?


      I'm thinking that either
      - the lead/contact was created first, and then the list upload person was created as a duplicate
      - or of course the other way around


      Either way, I would like to understand under what circumstances this can happen (and try to stop this in the future).


      Any insights would help!

          If you read the docs:

          • Deduplication only occurs on List Import, API, and Form Fill.
          • Dupes can exist prior to original sync OR come in from SFDC.
          • Sometimes Dupes occur because a record was Deleted in either system and then recreated, and the earlier, left-behind record does not get reconnected.
          • You can manage this by analyzing your entry sources (sales uploads, etc.) and cutting off people's ability to add records. You can add processes to block or force deduping.
          • There are tons of threads on this topic with various tools and techniques that will help.
              Thanks Josh,


              I have read quite extensively on this, and in this case I am talking about list imports (I called them uploads in the OP), which, as you say, should not create dupes. Hence my question.

                  Hi Ludvig,

                  As Josh mentioned, dedupe occurs on list import. In your case, you can check the following to find the root cause of the issue:

                  - Check that the file is properly formatted: there should not be any spaces before or after the email address.

                  - Search for a lead/contact that you know has a duplicate, then look at the activity log for both records. It will show how each one was created.

                  - Another possible cause is the data in SFDC. SFDC distinguishes between Leads and Contacts, whereas Marketo does not and treats everything as a lead. If a person already exists in SFDC as a contact and you upload the same person to Marketo, a duplicate can be created once the sync between Marketo and SFDC runs.
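
                  For the first check, here is a minimal Python sketch that trims stray whitespace from the email column of an import file before upload. The header name "Email Address", the sample rows, and the function name are assumptions for illustration only; adjust them to your own file.

```python
import csv
import io

EMAIL_COLUMN = "Email Address"  # assumed header name in the import file


def strip_email_whitespace(rows, email_column=EMAIL_COLUMN):
    """Trim whitespace around the email value in each row.

    Returns the cleaned rows and the 1-based data-row numbers that were changed.
    """
    cleaned, fixed = [], []
    for number, row in enumerate(rows, start=1):
        original = row.get(email_column) or ""
        trimmed = original.strip()
        if trimmed != original:
            fixed.append(number)
        cleaned.append({**row, email_column: trimmed})
    return cleaned, fixed


# Hypothetical sample file: rows 1 and 2 carry a leading/trailing space.
sample = io.StringIO(
    "Email Address,First Name\n"
    " ann@example.com,Ann\n"
    "bob@example.com ,Bob\n"
    "cat@example.com,Cat\n"
)
rows = list(csv.DictReader(sample))
cleaned, fixed = strip_email_whitespace(rows)
print(fixed)  # [1, 2]
```

                  The list of changed row numbers also tells you whether whitespace is a plausible cause at all: if it comes back empty for your historic files, look at the other two points instead.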

                  You wrote: "You can manage this by analyzing your entry sources (sales uploads, etc) and cut off people from being able to add records. You can add processes to block or force deduping."

                  Can you share some examples of how we can force dedupe a contact upon re-entry?

                  We have these issues.

                  Hi Ludvig Brandt


                  Marketo applies its duplicate-matching criteria to the standard "Email Address" field only. If, while uploading the list, you map the email column to any email field other than the standard one, Marketo cannot match the rows against existing records and creates every row as a new record, with the data exactly as it is in the CSV file.


                  If you make sure to map the column to the standard Email Address field, the import will not create any duplicates.

                      Ludvig Brandt

                      Thanks. I am looking at some examples, and most of the time both dupes do have the "Email Address" field populated. One of each pair typically has "person created" as the first entry in its activity history. I suspect that in historic list imports, someone might not have ticked the "de-dupe" checkbox. Alternatively, the sync between SFDC and MKTO may have been activated or run AFTER the list import was made. Our typical workflow is to set the sync field to true in SFDC and then import a list of the target group, so if the sync has not yet finished (it sometimes takes 20 minutes), the import could create a new person and the sync would then create a duplicate, as you flagged, Abhishek Chandra.


                      Boiled it down to 3 likely reasons:
                      - Wrong email field
                      - Sync made after list import
                      - De-dupe option not clicked
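
                      To gauge how many pairs fit this pattern (one record synced to SFDC, one list-import-only record), an export of the database could be grouped by email. The sketch below is only an illustration: the keys "id", "email", and "sfdc_type" are hypothetical names for exported columns, and comparing emails case-insensitively is an assumption of the sketch, not a statement about how Marketo matches.

```python
from collections import defaultdict


def find_mixed_pairs(people):
    """Group people by email and keep groups that mix synced and unsynced records.

    A record counts as synced when its (hypothetical) "sfdc_type" value is non-empty.
    """
    groups = defaultdict(list)
    for person in people:
        # Assumption: trim and lowercase so near-identical emails land together.
        email = (person.get("email") or "").strip().lower()
        if email:
            groups[email].append(person)

    mixed = {}
    for email, group in groups.items():
        synced = [p["id"] for p in group if p.get("sfdc_type")]
        unsynced = [p["id"] for p in group if not p.get("sfdc_type")]
        if synced and unsynced:
            mixed[email] = {"synced": synced, "unsynced": unsynced}
    return mixed


# Hypothetical export rows.
people = [
    {"id": 1, "email": "ann@example.com", "sfdc_type": "Contact"},
    {"id": 2, "email": "Ann@example.com ", "sfdc_type": ""},
    {"id": 3, "email": "bob@example.com", "sfdc_type": "Lead"},
    {"id": 4, "email": "cat@example.com", "sfdc_type": ""},
]
print(find_mixed_pairs(people))
# {'ann@example.com': {'synced': [1], 'unsynced': [2]}}
```

                      Comparing the creation dates of the two records in each pair against the dates of the list imports would then help separate the "sync made after list import" case from the other two.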


                      Related: is there a way to choose which of the dupes is added to a program when the program members come from a list import? Right now it seems to be completely random. Say persons A and B have the same email but only B is synced to SFDC: if A is the one added to the program, that activity will not be synced to the CRM.