SOLVED

How can I prevent spam leads from entering Marketo?


Hi folks,

We're starting to experience some spam on our blog forms and are looking for a solution. I've seen articles about ReCaptcha and honeypots, but I'm not sure that either alone will solve our root issue, so I'm hoping there's a combined approach that could. We're trying to address our global forms proactively, before any potential escalation in spam attack volume.

My understanding (please correct me if I'm wrong) is that the ReCaptcha implementation found here does not prevent leads from entering Marketo. Instead, the ReCaptcha data is sent to Marketo via webhook and appended to the Lead record. You can then use that data to delete spam leads through a flow.

My understanding is also that honeypot fields are easy for a dedicated spammer to identify (even if they don't have an obvious name) and bypass. That said, this article implies that a honeypot can be used to prevent form submits from even happening - a desired result.

Goal:

Prevent Spam lead data from entering Marketo. This could look like spam leads not being able to submit Marketo forms OR preventing the data from form submits from reaching Marketo.

This is to make sure that:

  • Marketo's API is not impacted by sudden high inbound volume
  • Campaigns, etc. do not trigger and impact the API (with the current system setup, they would have to be updated one by one to filter out leads flagged as spam by ReCaptcha data)
  • Triggers, etc. are not delayed by a backlog
  • Spam leads do not require ongoing system cleansing, especially at high volume

Is this a viable solution?

  • Implement a hidden simple boolean true/false ReCaptcha field on the Marketo form
  • Include JavaScript similar to the honeypot article linked above, but for the ReCaptcha
  • If an automated spam script fills out the form, including the hidden ReCaptcha field, this will trigger the JavaScript to prevent the form from being able to submit OR filter out the data from ever reaching Marketo
  • Standard non-Spam leads will not need to fill out the ReCaptcha (e.g. if ReCaptcha is TRUE, the lead is Spam) and will pass through to Marketo

If this is not possible, is there some way to use a proxy in tandem with Marketo forms to prevent syncing bad data to the system? Other solutions?

Thanks so much for any help and ideas!

Cheers,

Julia

P.S. Sanford Whiteman, tagging you since I know you've been an invaluable resource on past ReCaptcha questions.

Accepted Solutions
Level 10 - Community Moderator

Re: How can I prevent spam leads from entering Marketo?

First, honeypot fields are ridiculous. Worse than useless. Anyone who continues to champion them just doesn't understand how forms work, or how the web (including malicious and legit actors) works in general.

Second, realize that reCAPTCHA never (in any system, not just Marketo) stops non-humans from submitting form data.  It cannot ever do this, because malicious actors do not use JavaScript. reCAPTCHA relies on an end user fingerprint, generated using JS and verified on your server via webhook, to determine whether the submission was from a human or machine (or, in the latest v3, whether they tilt toward human or machine, instead of a binary distinction). 

So reCAPTCHA can be used to intuit whether form data was submitted by human or machine, but it doesn't stop the data from being submitted.
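To make the "verified on your server" step concrete, here is a minimal sketch of what the webhook target might do with the token the browser generated. It assumes Google's siteverify endpoint and its documented `success` and `score` response fields; the function names and the 0.5 score threshold are illustrative choices, not part of any Marketo feature.

```python
import json
import urllib.parse
import urllib.request

SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def is_probably_human(result: dict, min_score: float = 0.5) -> bool:
    """Interpret a parsed siteverify response.

    v2 responses only carry `success`; v3 responses add a `score`
    between 0.0 (likely machine) and 1.0 (likely human).
    The 0.5 threshold is an assumption; tune it for your traffic.
    """
    if not result.get("success", False):
        return False
    score = result.get("score")
    if score is None:  # v2: binary outcome only
        return True
    return score >= min_score


def verify_token(secret: str, token: str, timeout: float = 5.0) -> dict:
    """POST the client token to siteverify and return the parsed JSON.

    `secret` is your reCAPTCHA secret key; `token` is the value the
    browser attached to the form submission.
    """
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(SITEVERIFY_URL, data=data, timeout=timeout) as resp:
        return json.load(resp)
```

Note that nothing here blocks the form post itself. The check only produces a verdict that later steps can act on.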

Now, in Marketo, you have the less-than-optimal reality that once form data is submitted, a lead is upserted before any other inspection of the payload can be done. In other systems, the form data can be inspected, again after submission, but before it enters the next stage of processing. In Marketo you can only inspect and attempt to subvert/revert the actions done before you're given a chance to check the reCAPTCHA fingerprint.

When plugging reCAPTCHA into Marketo, you need to tune your workflow so that form intake processes work in serial, not in parallel, so you always have control over the next step. You need to make sure that not just reCAPTCHA, but other prerequisites like an SMTP verify webhook, have a positive outcome before letting people move to the next step (i.e. making robust use of Request Campaign and the Webhook is Called trigger).  I just rolled out a robust reCAPTCHA implementation for a client that was a huge net win, because it taught them a lot about rogue processes they didn't even realize were running, in random order, on every form fill! The end result was a workflow that's (mostly) self-documenting and stops non-human leads from entering the system.
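The "serial, not parallel" idea can be sketched outside Marketo as a chain of checks where each step runs only if the previous one passed. This is an illustration in Python, not Marketo configuration; the check names and lead fields (`recaptcha_passed`, `email_deliverable`) are hypothetical stand-ins for the outcomes your webhooks would write to the lead.

```python
from typing import Callable, Dict, List, Tuple

Lead = Dict[str, object]
Check = Tuple[str, Callable[[Lead], bool]]


def run_intake(lead: Lead, checks: List[Check]) -> str:
    """Run checks strictly in order and stop at the first failure.

    Returns "accepted" if every check passes, otherwise
    "quarantined:<name of the failing check>".
    """
    for name, check in checks:
        if not check(lead):
            return f"quarantined:{name}"
    return "accepted"


# Hypothetical gate order: reCAPTCHA first, then the SMTP verification.
checks: List[Check] = [
    ("recaptcha", lambda lead: bool(lead.get("recaptcha_passed"))),
    ("smtp_verify", lambda lead: bool(lead.get("email_deliverable"))),
]

status = run_intake({"recaptcha_passed": True, "email_deliverable": False}, checks)
```

In Marketo terms, each check would be a webhook call, and the next campaign would only be requested once the previous webhook has returned a positive result.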

Is this a viable solution?

 

  • Implement a hidden simple boolean true/false ReCaptcha field on the Marketo form
  • Include JavaScript similar to the honeypot article linked above, but for the ReCaptcha
  • If an automated spam script fills out the form, including the hidden ReCaptcha field, this will trigger the JavaScript to prevent the form from being able to submit OR filter out the data from ever reaching Marketo
  • Standard non-Spam leads will not need to fill out the ReCaptcha (e.g. if ReCaptcha is TRUE, the lead is Spam) and will pass through to Marketo

No, this does not make sense. There's no such thing as a reCAPTCHA that operates entirely on the client side, and the last thing you want is a reCAPTCHA that acts more like a honeypot (i.e. is more fake)!



Re: How can I prevent spam leads from entering Marketo?

Hi Sanford Whiteman‌,

Love this - THANK YOU for such a detailed response! We're working on scoping a project to re-architect our Marketo instance to allow for more streamlined processes. Right now we have no control over what's firing when, so we need to start daisy chaining with Request Campaign and Webhook is Called even outside of the ReCaptcha issue. We'll make sure to factor this in as well.

"The end result was a workflow that's (mostly) self-documenting and stops non-human leads from entering the system." 

Is this referring to SFDC or other CRM as "the system" rather than Marketo? If we're able to use the triggers you mentioned in Marketo, my understanding is the form data/lead record should be in Marketo already, but beyond that point we can control how and where the data flows (i.e. not to the CRM). 

If this understanding is correct, and the data is already in Marketo, then it sounds like there's no way to prevent the form data from entering Marketo when we're using Marketo forms on landing pages? Aka our only option to prevent an attack from reaching our systems in the first place would be to use non-Marketo forms so the data can be inspected before pushing into Marketo?

Level 10 - Community Moderator

Re: How can I prevent spam leads from entering Marketo?

"The end result was a workflow that's (mostly) self-documenting and stops non-human leads from entering the system." 

 

Is this referring to SFDC or other CRM as "the system" rather than Marketo?

Yes, but a better phrasing would've been "stops non-human leads from entering CRM, and stops them from entering any additional Marketo flows if they do not first pass verification."  The client in this case quarantines the flagged leads and deletes them using a nightly batch.

Level 10 - Community Moderator

Re: How can I prevent spam leads from entering Marketo?

 Aka our only option to prevent an attack from reaching our systems in the first place would be to use non-Marketo forms so the data can be inspected before pushing into Marketo?

This isn't such a practical option. Whatever you build to do this naturally has its own DoS vulnerability due to the tight limits Marketo places on API endpoints.

In other words, a high rate of (legit) conversions can deny service to this and all your other API-connected apps (since calls are shared across all). And if you run afoul of anyone who's forging reCAPTCHA (using human "farms" overseas, which is a thing) they'll have an easier time denying service to your Marketo instance than if the data flowed directly into Marketo. The cure can well be worse than the disease. Which is not to say it can't be built successfully, I just don't think it's worth the effort. I think you can live with data flowing into Marketo if your flows are chaining properly. You haven't talked about the volume of spam leads you're experiencing, though.
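A rough back-of-the-envelope calculation shows why the shared API budget matters. The numbers below are assumptions based on commonly cited default Marketo REST limits (50,000 calls per day and 100 calls per 20 seconds); your subscription may differ, and the sketch assumes a naive proxy that spends one API call per form submission.

```python
def seconds_until_quota_exhausted(daily_quota: int, calls_per_second: float) -> float:
    """Time in seconds until a daily call quota is used up at a steady call rate."""
    if calls_per_second <= 0:
        raise ValueError("calls_per_second must be positive")
    return daily_quota / calls_per_second


# Assumed defaults; check the limits on your own subscription.
DAILY_QUOTA = 50_000
WINDOW_CALLS = 100
WINDOW_SECONDS = 20

# Highest sustained rate the rate limit allows: 100 / 20 = 5 calls per second.
max_rate = WINDOW_CALLS / WINDOW_SECONDS

# At that rate the whole daily quota is gone in 10,000 seconds (under 3 hours).
hours = seconds_until_quota_exhausted(DAILY_QUOTA, max_rate) / 3600
```

Under these assumptions, a few hours of traffic at the maximum allowed rate uses up the day's calls for every API-connected app, which is the denial-of-service risk described above. Batching several leads into one call would stretch the budget but does not remove the ceiling.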


Re: How can I prevent spam leads from entering Marketo?

Got it. If I'm reading right, Marketo doesn't provide a way to block these on the front end with their forms, and building a workaround externally would potentially block good leads just because of the API call limits. E.g. we could purchase more API calls at high cost from Marketo for one off syncs, but any batch limits for syncing would still be in place, calls may fail, and leads from failed calls won't be re-synced unless we have an ETL set up to automatically re-push until the data is accepted (potentially causing other issues, like backlogs).

Again, super appreciate your responses, Sanford Whiteman. This was incredibly helpful. We'll make sure to factor in ReCaptcha and these gotchas as we work towards the re-architecture project for Marketo to quarantine these leads on arrival in the system. We're not experiencing high volume attacks yet, but do see attacks in the millions in short timeframes on our product sometimes. Our Risk team takes care of those, and I'd like to tackle spam proactively before it becomes an issue for inbound pipeline on Marketing pages. The same spammer that's targeted product pages in the past is starting to come through Marketo forms, and right now we have nothing to mitigate the volume. My concern, if an attack occurs and leads come in at truly high rates, is what happens to Marketo when volume spikes (e.g. potential Marketo system slowdowns), as well as eventual contract pricing (e.g. if we're targeted consistently and we're above our record limits day to day, eventually we have to pay to house those records based on average record count/day even if they're being eliminated nightly).


Re: How can I prevent spam leads from entering Marketo?

Hi Sanford,

I've recently had exactly the same problem and ended up doing the same thing: having all leads go through a centralized "bot catcher" program first.

From what I can tell, at least in the attacks that I've seen, spammers don't *really* fill out the form. They emulate a native Marketo form fill, since it doesn't take much to do that: the Munchkin ID, the form ID and the list of required fields. The attacker must have parsed the Munchkin ID and the form ID from the HTML. The list of fields with API names is pretty standard across B2B forms, and all standard fields have the same API handles across all Marketo instances (if you've used one Marketo instance, you know them for pretty much every Marketo customer). The URL to submit leads to is no secret either. It's all out in the open.

Just throwing some stuff at the wall here:

Would it be viable to hide as much of this stuff as possible in tag managers, create a new form with a new ID from scratch, and maybe create a set of custom fields to replace the standard ones and make them required for a form submission?

Level 10 - Community Moderator

Re: How can I prevent spam leads from entering Marketo?

reCAPTCHA prevents the very situation you're describing.

The shortcoming with reCAPTCHA is that you need to make sure it's the gateway to *all* your other workflow steps. No trigger campaigns can run in parallel. It must act as if it were another outer layer. 

I have yet to see anyone roll out reCAPTCHA properly and, end user experience aside, not be satisfied.


Re: How can I prevent spam leads from entering Marketo?

We faced this issue multiple times over the last couple of months. Here's what we observed - initially, we saw a surge of 10-15k leads per day and these ended up as fake handraisers. This is what alerted us as we saw a sudden spike in the daily MQL reports.  

Naturally, we looped in our Digital Marketing team, as the leads seemed to be sourced from a particular form. They jumped right in with several proposed fixes:

  • Expansion of the existing honeypot solution 
  • reCaptcha
  • IP and email domain blocking on the web forms 

But nothing seemed to stop the incoming spam leads. That's when we compared the Google Analytics stats and noticed that the incoming web page hits didn't match the number of incoming spam leads. This raised the suspicion that the spam records could be hitting the MKTO endpoint directly.

We now have:
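The comparison described here can be reduced to a simple check: a form cannot legitimately receive more fills than the page hosting it had views. The sketch below assumes you can export per-day page views (from analytics) and per-day form fills (from Marketo) as dictionaries; the function name, the day labels and the numbers are made up purely for illustration.

```python
from typing import Dict, List


def days_with_direct_posts(
    page_views: Dict[str, int],
    form_fills: Dict[str, int],
    max_conversion: float = 1.0,
) -> List[str]:
    """Return the days on which form fills exceed what page views can explain.

    With max_conversion=1.0, a day is flagged only when fills outnumber
    views, which suggests submissions are bypassing the page entirely.
    """
    flagged = []
    for day in sorted(form_fills):
        views = page_views.get(day, 0)
        if form_fills[day] > views * max_conversion:
            flagged.append(day)
    return flagged


# Illustrative numbers only, not data from this thread.
page_views = {"day1": 400, "day2": 420}
form_fills = {"day1": 12, "day2": 11_000}
flagged = days_with_direct_posts(page_views, form_fills)
```

A lower `max_conversion` (say 0.5) would flag days earlier, at the cost of false alarms on pages that genuinely convert well.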

  • a daily report of the incoming leads matching the spam leads criteria
  • a campaign to mark these incoming spam leads as invalid, so they are not processed and progressed to the next stages
  • an ongoing effort to delete these leads - not just from MKTO, but the integrated systems as well 

But this is not viable. So we got on a call with the MKTO team to discuss the issue and check whether IP blocking or anything similar was possible. Apparently not. They told us that a long-term solution is being implemented and will be rolled out in Q1 2020 (tentative :-( and this was 2 months ago!). They recommended that we delete the affected form and create a new one, but this is not going to make the system any less vulnerable. It takes a few seconds to try different form IDs, as two are already exposed.

We are actively in discussion with MKTO CSM and Products team. Let me know if you'd like to discuss the details and I'm happy to jump on a call!  

Level 10 - Community Moderator

Re: How can I prevent spam leads from entering Marketo?

You can't have implemented reCAPTCHA correctly — simple as that. Properly integrated into your flow, all reCAPTCHA-failing leads will never be flagged as handraisers.

And as noted above, honeypot has never worked against an even mildly savvy attacker, so any attempt to "extend" it wouldn't do anything.