Bombproof your list imports with proxy fields

List imports are scary stuff.

There are a lot of things that can go wrong when you push a file of leads into your database. Duplicates. Overwritten values. Bad data. A single list import exposes your whole Marketo database to all of these risks at once.

But you can set up a line of defense for your more sensitive fields using what I call “proxy fields”, which help you intelligently manage how data is overwritten.

List imports are a daily task for us, and so we’ve gotten pretty strict on the data requirements for new leads, particularly with the lead status field. When we upload, we need to be able to distinguish between suspects/inquiries and truly qualified leads.

We block field updates for the lead status field, meaning it only accepts the first value from a list import, and won’t be changed by future imports. This helps us ensure a lead that’s already an MQL cannot move backwards and become a suspect from a later list import.

This is handy, but field blocking is a double-edged sword. Suppose a lead was originally imported as a suspect, but is now on an import list where we’ve marked them as an MQL. In that case, we actually want them to become an MQL, but since field blocking is in place, that lead’s status won’t be changed, and we’d be forced to fix it manually.

To work around this, we created a proxy field, called “Import Lead Status.” It replaces the standard lead status field on our import lists, and more intelligently manages the data, making sure leads can move forward in status, but not backwards.

Here’s how to set it up in about 10 minutes:

  • Create a new text field with a name distinct from the original field, such as “Import Lead Status.” This should be a Marketo-only field. Ensure no field updates are blocked. You may also want to set a few list import aliases for this field, as this is the field you’ll want all your list imports to map to in the future.
  • Create a smart campaign set to run every time a lead qualifies, with two triggers: one for anytime your new field’s value changes, and another for when the lead is created (since data value change triggers won’t fire for new leads). You’ll also want to include two filters:
    • A filter to exclude any lead statuses you do not want to change. You need to include this so your import lead status doesn’t move any leads backwards inadvertently. I decided that anything Marketing Qualified or beyond (SAL, SQL, etc) shouldn’t be changed.
    • A filter that specifies that the import lead status field isn’t empty (for the aforementioned “lead is created” trigger).
  • For your new smart campaign’s flow, you just need one step, “Change Data Value,” that sets the lead status by pulling the value from the proxy field with a field token.
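
As a rough illustration, the decision the smart campaign makes can be sketched in Python. This is not Marketo code (the campaign itself is configured in the Marketo UI), and the status names, the PROTECTED_STATUSES set, and the apply_import_status function are hypothetical stand-ins for the two filters and the flow step described above.

```python
from typing import Optional

# Hypothetical statuses that an import must never overwrite
# (the "Marketing Qualified or beyond" filter).
PROTECTED_STATUSES = {"MQL", "SAL", "SQL"}


def apply_import_status(lead_status: Optional[str],
                        import_lead_status: Optional[str]) -> Optional[str]:
    """Return the lead status after the proxy-field campaign has run."""
    # Filter: the proxy field must not be empty.
    if not import_lead_status:
        return lead_status
    # Filter: leave protected statuses alone so leads never move backwards.
    if lead_status in PROTECTED_STATUSES:
        return lead_status
    # Flow step: copy the proxy value into the real lead status field.
    return import_lead_status


print(apply_import_status("Suspect", "MQL"))   # suspect is promoted
print(apply_import_status("MQL", "Suspect"))   # MQL is left untouched
print(apply_import_status(None, "Inquiry"))    # new lead takes the imported value
```

The point of the sketch is the ordering: both filters are checked before the single flow step runs, which is what lets a lead move forward but never backwards.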


The nice thing about this process is it’s invisible to your other Marketo users. We have a standard import template that we require for any list uploads, and I have “lead status” and variations of it mapped to this proxy field as a safeguard, so to anyone importing lists, this is no different from how they managed the process before. Since we implemented this process, it has had a big impact on efficiency and accuracy for us.

Obviously this is just one example of a proxy field – you could do something similar for geographic fields, company names, phone numbers, and much more. If you’re doing something similar, I’d love to hear about it!

This post originally appeared on www.jeffrshearer.com

Level 5

Nice article Jeff.

Question about your process: do you allow your field marketers to upload lists (with the template) or is that something just the MA team will do?

Not applicable

Only the marketing automation team is allowed to upload lists. We used to allow field marketers to upload their lists, but backed away from that around 9-12 months ago, though not for the reasons you might expect. In most cases we didn't have issues with the quality of data being uploaded; our field team is pretty responsible about that sort of thing. The bigger issue is that when something goes wrong, it's hard to audit where the issue comes from, because we can't really see who uploaded what. So we try to manage it within as small a group as possible. While we can't eliminate the fire alarms, we can at least minimize the response time.

Level 4 - Champion Alumni

Great article, Jeff! Kimi Heskett shared this with me, and we've been able to use your logic/example to create a similar program for some Lead Source updates. A large majority of our leads have a Source value that was imported a long time ago and isn't really valuable from an ROI perspective. So we've created a program that updates the Lead Source with a more accurate value upon import if the original value is one of the old ones we want to phase out. One thing to add is that we've also created smart campaigns to NULL out the proxy fields after a certain period of time, so they start out clean for each new import. Thanks again for sharing!

Nice article Jeff.

However, in my experience the list upload template is the biggest problem creator, by something like 80-20 or 70-30. Whether we go the route you've outlined or clean up the values retroactively, it ultimately comes down to preventing the issue vs. fixing it after the fact.

The route I took was creating a list upload template file in Excel and locking some columns to be picklists so that things like Lead Source couldn't be improvised or mistyped. While this does the trick for the most part, it's also cumbersome for the team, since they have to migrate the list from their original source to the list upload template and then make the necessary changes before uploading. Would love to hear your thoughts on what you recommend for preventing the issue from occurring in the first place.

Not applicable

Great article, Jeff. How do you handle matching leads to the correct record when there are shared email addresses? We have had issues with this, with Marketo matching an import to the first record it finds with a particular email address. Unfortunately, our organisation has a use case where email addresses are shared, so there are many contact records with the same email address.

Not applicable

Thanks for the kind words Loren, and love hearing the use case for Lead Source!

Not applicable

Hey Andy, great comments, and for me, it really comes down to training and expectations with the teams that submit & format the lists. Only a handful of individuals at our organization are allowed to run list imports, so we can allow for a final list data check before uploading to help catch any inconsistencies. We also tend to kick lists back to the requestor if we see incorrectly formatted data in lists. That way we quickly condition these teams to use the proper values. This has helped tremendously, and I'd say the number of times I see issues is down considerably from even a year ago when we started doing this process.

To help catch anything that slips through the net, these proxy fields certainly help. Another process we have in place is what I call a "new lead processing queue". It's essentially a chain of smart campaigns that run upon lead creation (or could be set to run on data value change triggers) and run the leads through a series of data sanitization steps. This includes steps for standardizing country names (a HUGE issue for us), Lead Source, and yes, even Lead Status. Coincidentally, I'll actually be discussing this very topic in my session at the Summit, and also plan to write up my process here on the community in the next few weeks.
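
To give a rough idea of the shape of such a queue, here's a minimal sketch in Python. In Marketo this would be a chain of smart campaigns, not code; the field names, the alias tables, and the standardize and process_new_lead functions are all made up for illustration.

```python
from typing import Callable, Dict, List

Lead = Dict[str, str]

# Hypothetical alias tables; real ones would be much longer.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "uk": "United Kingdom",
}
LEAD_SOURCE_ALIASES = {"tradeshow": "Trade Show", "web": "Website"}


def standardize(field: str, aliases: Dict[str, str]) -> Callable[[Lead], Lead]:
    """Build a step that maps known variants of one field to a standard value."""
    def step(lead: Lead) -> Lead:
        if field not in lead:
            return lead
        raw = lead[field].strip()
        cleaned = dict(lead)  # copy so the input lead is not mutated
        cleaned[field] = aliases.get(raw.lower(), raw)
        return cleaned
    return step


# Steps run in order, like a chain of smart campaigns.
QUEUE: List[Callable[[Lead], Lead]] = [
    standardize("country", COUNTRY_ALIASES),
    standardize("lead_source", LEAD_SOURCE_ALIASES),
]


def process_new_lead(lead: Lead) -> Lead:
    """Run one lead through every sanitization step in the queue."""
    for step in QUEUE:
        lead = step(lead)
    return lead


print(process_new_lead({"country": " USA ", "lead_source": "tradeshow"}))
```

Values that aren't in an alias table pass through unchanged (apart from trimmed whitespace), so unknown variants still need a manual review.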

For now, I hope this explanation helps!

Not applicable

Hey Erica,

I wish I had a better answer for you, but that may be tricky to accomplish, and my guide above probably won't be any help on that front! When duplicates are present, I believe Marketo will associate the list import with the record that was most recently updated. So suppose you have a lead and a contact in your database, but the lead record was updated most recently, then the lead would be the record associated with your list import, and the contact would be unchanged.
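
If that belief about "most recently updated wins" is right, the matching would behave roughly like this Python sketch. This only illustrates the assumption and is not documented Marketo behavior; the record layout and the match_import_row function are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional

Record = Dict[str, object]


def match_import_row(email: str, records: List[Record]) -> Optional[Record]:
    """Pick the record an import row would update, ASSUMING the most
    recently updated record with the same email address wins."""
    candidates = [r for r in records
                  if str(r["email"]).lower() == email.lower()]
    if not candidates:
        return None  # no match: the import would create a new lead
    return max(candidates, key=lambda r: r["updated_at"])


records = [
    {"id": 1, "type": "contact", "email": "shared@example.com",
     "updated_at": datetime(2015, 1, 10)},
    {"id": 2, "type": "lead", "email": "shared@example.com",
     "updated_at": datetime(2015, 3, 2)},
]

match = match_import_row("Shared@Example.com", records)
print(match["id"], match["type"])  # the lead was updated most recently
```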

This really stems from the fact that Marketo uses the email address as the primary key. It may be possible to work with Marketo support to change which field is used as your primary key, but this sort of configuration can be a huge problem later if not executed correctly, so that would only be advisable as a last resort, and even then I'd probably hesitate to recommend it.

I'm sure you already have done this, but I recommend taking a close look at the root of the issue: why your company has chosen to structure their data in this fashion. Depending on the priorities of the business, it may make sense to rethink this structure, as it will pose a big hurdle for a lot of activities in Marketo and other systems.

In any case, I hope this helps!

I was thinking about creating those data sanitization campaigns as well, even though we have data enrichment running in real time. Can't wait to see your write-up!

Not applicable

I heard whispers of potentially adding in a second key as mentioned above.

Agreed with Jeff, take a look at the use case again. You may be able to implement a process with a second key, or do what we did at my company: we set up a "black box" of leads and tucked them into a partition that no one can touch. Leads get to this partition through our purposeful duplicate process.

Marketo associates activity with the original lead, since that record has the Munchkin tracking, etc.

Might not work for you, but hopefully it spurs some creative thought.