- the bots / hackers push data directly to the form's POST URL, bypassing the normal form submission by a person who clicks the "submit" button
- reCAPTCHA will not block spam bots in the scenario above. We have verified this using a script, and we were able to submit records over and over
Axel, reCAPTCHA never blocks spam bots from sending form data. That's not what it's ever been designed or advertised to do. And this is true of reCAPTCHA on all websites, not just Marketo LPs and/or forms.
reCAPTCHA allows you to verify on the server side whether a form was submitted by a human or not. If a submission doesn't pass the human test, you delete or quarantine it before passing it through any processes that would result in it being synced to another system. Unless you are getting very high volume (10s of thousands is not very high), this should not impact instance performance.
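To make the server-side check concrete, here is a minimal sketch in Python of calling Google's siteverify endpoint and deciding whether to accept or quarantine a submission. The function names `fetch_verdict` and `triage` are made up for illustration, and this is generic code, not anything Marketo-specific: how you wire it to your instance (for example, behind a webhook) is not shown.

```python
import json
import urllib.parse
import urllib.request

# Google's reCAPTCHA verification endpoint.
SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def fetch_verdict(secret, token, url=SITEVERIFY_URL, timeout=5.0):
    """POST your secret key and the user's token; return the parsed JSON reply."""
    body = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(url, data=body, timeout=timeout) as resp:
        return json.load(resp)


def triage(verdict):
    """Hypothetical helper: map a siteverify reply to 'accept' or 'quarantine'.

    Anything other than an explicit success is treated as not verified.
    """
    return "accept" if verdict.get("success") is True else "quarantine"
```

A caller would run `triage(fetch_verdict(secret, token))` and only let "accept" results continue toward anything that syncs to another system.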
- We use an email verification tool on our form as well. For this type of situation the results are very limited.
Email verification won't apply to bots, so the results will be more like zero than limited.
Hi Sanford
Thanks for your reply. We are getting 10s of thousands of these emails and therefore we have performance issues.
Any suggestions on how we can prevent this from happening?
Thanks
Axel
Thanks for your reply. We are getting 10s of thousands of these emails and therefore we have performance issues.
Are the perf issues actually from the form submissions, or from something else you're doing before checking if the lead is legit?
Hi Sanford
Happy new Year!
Sorry for the delay in my reply. The performance issues are due to the sync between Marketo and SFDC. As all the submissions are coming in via a program that syncs to SFDC, it impacts SFDC to the point that it stops working.
It is a pretty serious issue, and we feel that Marketo would be best suited to help prevent those issues instead of us finding ways around them.
But if you have a solution that works most of the time, then that would be a good start.
Thanks
Axel
Sorry for the delay in my reply. The performance issues are due to the sync between Marketo and SFDC. As all the submissions are coming in via a program that syncs to SFDC, it impacts SFDC to the point that it stops working.
It's essential, then, that you delay the addition of people to a synced program until after they've been verified.
Sanford
Delaying the sync does not solve the real issue, which is preventing thousands of rogue form submissions. The first time we spotted this, we had 25,000 submissions.
What would happen if this was done as combined attack on several customers' Marketo instances?
Delaying the sync does not solve the real issue, which is preventing thousands of rogue form submissions.
Why not? If the only Denial of Service is to the SFDC sync process, then that's what you have to throttle, and that's what you're doing by not adding unwanted people to the sync.
What would happen if this was done as combined attack on several customers' Marketo instances?
25,000 is a tiny drop by modern web standards: even a small virtual server can handle millions of POSTs per day.
So the question isn't the number of hits, it's preventing an amplification attack where a lightweight HTTP request gets expanded into a much longer-lived and resource-intensive process.
I'm not saying Marketo shouldn't be able to shed the load earlier (like by checking reCAPTCHA results before inserting into the db), but there's nothing impractical about funneling leads through lighter-weight processes so they don't hit heavier-weight ones.
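As a rough illustration of that funnel, here is a sketch in Python. The `Lead` and `Gate` classes and the `recaptcha_passed` flag are hypothetical stand-ins, not Marketo objects; the point is only that the heavyweight step runs for verified leads and nothing else.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Lead:
    email: str
    recaptcha_passed: bool  # hypothetical flag set by an earlier verification step


@dataclass
class Gate:
    # The heavyweight step, e.g. whatever adds a person to the synced program.
    sync: Callable[[Lead], None]
    quarantined: List[Lead] = field(default_factory=list)

    def admit(self, lead: Lead) -> bool:
        """Cheap check first; only verified leads reach the expensive step."""
        if not lead.recaptcha_passed:
            self.quarantined.append(lead)
            return False
        self.sync(lead)
        return True
```

With 25,000 unverified posts and one real one, `sync` is called once; the rest sit in `quarantined` where they can be deleted in bulk.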
Marketo's response about Google reCAPTCHA...
----
Regarding reCaptcha, since it is a third-party integration, we don't have formal documentation regarding setting it up, but I would recommend searching the Marketo community to see how other Marketo users have approached this, or working with reCaptcha for more information.
In addition to implementing ReCaptcha, you may want to consider adding JavaScript.
- Add JavaScript validation to the header of your landing pages. This checks to see if JavaScript is enabled on the browser - and, if not, redirects the lead to a page that advises them to enable it. Spam bots do not have JavaScript enabled, so this can cut down on spam submissions. This will minimize but not eliminate these submissions. You can also use JavaScript to do custom validation on any of the fields in your form, but keep in mind that you would need a developer's help for these solutions.
Here is an article on our community site that may be useful.
Title - Dealing with Spam or Bot Form Fillouts
Link - https://nation.marketo.com/docs/DOC-4755-how-to-setup-a-form-honeypot-field
We suggest working with a developer to implement these solutions as well as test them, as custom coding falls outside the scope of support.
-----
In other words, it's up to us to sort this out. Just frustrated that they really don't care and have no ambition to solve the most common issue with email forms (DoS attacks). Honeypot fields don't work. Bots are smart now.
They should implement a captcha system and put a toggle on/off on the form creation. It would help us tremendously and reduce the load on their server too.
I agree that some built-in support for reCAPTCHA would be nice. You would still have to supply your own Google site key and secret: Marketo can't use the same account for all subscribers' reCAPTCHA lookups because Google will rate-limit them very, very fast (you can even get yourself rate-capped within a single organization).
That said, adding the reCAPTCHA to a form is not too difficult, and it's a one-time (or few-times) procedure to set it up.
The problem is that if the underlying forms infrastructure remains the same, it doesn't matter if Marketo creates an automatic webhook callout for you and adds the widget to your form. That doesn't reduce the server load; it actually increases the load, since every form post results in another HTTP roundtrip to look up the reCAPTCHA status in addition to all the overhead of processing the form data. That's because (the way it works now) form data is accepted, queued for insertion, and inserted into the database before the webhook is called. There's no resource savings, only overhead.
If, on the other hand, the order of operations were changed, the reCAPTCHA endpoint could be called first and the data queued only on success, saving resources. But I'd rather see that pipeline be exposed as an API, not hard-coded to support reCAPTCHA only, so we could call whatever we want in the intermediate layer.
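To sketch what that changed order of operations could look like: the Python below runs a list of pluggable checks first and queues the form data only if all of them pass. Everything here (`accept_submission`, the `checks` list, `has_captcha_token`) is hypothetical and is not an existing Marketo API; `g-recaptcha-response` is the field name the reCAPTCHA widget adds to a form post.

```python
from typing import Callable, Dict, Iterable, List

FormData = Dict[str, str]
Check = Callable[[FormData], bool]


def accept_submission(form_data: FormData,
                      checks: Iterable[Check],
                      queue: List[FormData]) -> bool:
    """Run every check BEFORE queueing; queue the data only if all pass."""
    if all(check(form_data) for check in checks):
        queue.append(form_data)
        return True
    return False  # rejected up front: nothing queued, nothing inserted


def has_captcha_token(form_data: FormData) -> bool:
    # Stand-in check. A real one would send this token to the
    # reCAPTCHA verification endpoint and look at the reply.
    return bool(form_data.get("g-recaptcha-response"))
```

Because `checks` is just a list of callables, the intermediate layer isn't tied to reCAPTCHA: any other verification could be dropped in the same way.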
Hi Sanford,
Thanks for your reply.
The idea would be to have a toggle on/off on each form so that not all forms would do a reCaptcha call.
So Marketo could let you enter the Google API keys in your admin, so you set them once, and then it calls the API only if the toggle on a specific form is on. Not rocket science.
I have been following your suggestions from this thread Step by step guide to recaptcha
But I got stuck at the webhook stage. I have already created all the fields you suggested in the thread, but when I go to webhooks > response mappings and add a new response attribute, I cannot find LastReCAPTCHAServerStatus.
Any idea? I tried everything I think, deleted and created new fields etc.
Would appreciate your help Sanford, thanks