    How to correct typos in email addresses?

    Elliott Lowe

      We have a significant number of email addresses submitted via our non-Marketo forms that have typos in the domain field.  For example, instead of .com, they have the following.

      • CON
      • CIM
      • C
      • VOM
      • CPM
      • COMM
      • XOM
      • COOM
      • COML
      • CCOM
      • COME
      • OCM
      • CLM
      • COMI


      At some point, we're going to implement real-time email address validation in our forms using a service like Informatica, but in the meantime, does anyone have a recommendation on how best to do this (e.g. JavaScript code, a webhook to a service that will let us manipulate strings, etc.)?

        • Re: How to correct typos in email addresses?
          Sanford Whiteman

          Batching your whole DB through a webhook will bog down your instance something fierce (I'd say webhooks are better suited for responding to Interesting Moments such as form fillout, though this might change as Marketo beefs up 'hook performance).  If you can prefilter your db to exclude known TLDs, then a 'hook may be feasible.  I have an endpoint you can use for this (DM me if interested).
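
          As a rough illustration of that prefiltering step, here is a minimal Python sketch that flags only the addresses whose TLD is not in a known set, so that anything sent on for correction is a small subset of the database. The names `KNOWN_TLDS`, `tld_of` and `needs_review` are hypothetical, and the TLD set is deliberately abbreviated; a real run would load the full IANA list.

```python
# Hypothetical, abbreviated set of valid TLDs; load the full IANA list in practice.
KNOWN_TLDS = {"com", "net", "org", "info", "in", "cn"}


def tld_of(email):
    """Return the lowercased text after the last dot of the domain, or None."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or "." not in domain:
        return None
    return domain.rsplit(".", 1)[1].lower()


def needs_review(email, known_tlds=KNOWN_TLDS):
    """True when the address has no recognizable TLD or one that is not known."""
    tld = tld_of(email)
    return tld is None or tld not in known_tlds


emails = ["ann@example.com", "bob@example.con", "cy@example.org", "dee@example"]
suspects = [e for e in emails if needs_review(e)]
print(suspects)  # ['bob@example.con', 'dee@example']
```

          Only the `suspects` list would then need to go through a webhook or a manual cleanup.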


          But really you'd get equivalent satisfaction from exporting that same Smart List, touching up the email address using Excel or Google Sheets, and reimporting. 


          The caveat in both cases is that even though you can know for sure there's a typo if an email doesn't end with a (currently) valid TLD, you can't always extrapolate what TLD they meant to type.  For example, did ".con" mean ".com" or ".cn"? Is ".inf" supposed to be ".info" or ".in"?*  And so on.  Of course you can push people toward the most well-known TLDs for your geographic area (if that's sufficiently limited) and use physical and IP addresses as an additional guide.  This complexity is why more intelligent systems usually come at a cost, yet they still aren't 100% accurate.
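
          To make the ambiguity concrete, here is a minimal Python sketch that rewrites a TLD only when a single intended TLD is plausible and otherwise leaves the address alone for a human to check. The `CANDIDATES` table and the `propose_fix` function are hypothetical; treating "cim", "vom" and "xom" as unambiguous typos for "com" is an assumption for illustration, while the two ambiguous entries are the examples given above.

```python
# Hypothetical lookup: each mistyped TLD maps to the TLDs it might have meant.
CANDIDATES = {
    "cim": ["com"],          # assumed unambiguous
    "vom": ["com"],          # assumed unambiguous
    "xom": ["com"],          # assumed unambiguous
    "con": ["com", "cn"],    # ambiguous, per the example above
    "inf": ["info", "in"],   # ambiguous, per the example above
}


def propose_fix(email, candidates=CANDIDATES):
    """Return (address, status) where status is 'fixed', 'ambiguous' or 'unchanged'."""
    email = email.strip()
    head, dot, tld = email.rpartition(".")
    options = candidates.get(tld.lower(), [])
    if not dot or not options:
        return email, "unchanged"
    if len(options) > 1:
        # More than one plausible target: do not guess.
        return email, "ambiguous"
    return head + "." + options[0], "fixed"


for address in ["bob@example.cim", "bob@example.con", "bob@example.org"]:
    print(propose_fix(address))
```

          Rows that come back as "ambiguous" are the ones where geography or other data about the person would have to break the tie.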


          On the input side, even without paying for Informatica, simply matching the email address against the known TLDs will be a huge help: MktoForms2 :: Force known TLD


          Here's an interesting list from DomainTools of possible typos for "com" (I think they're in order of likelihood):


          * Note also that from an AI perspective, these pairs of possible typos have the same Levenshtein Distance, so a non-keyboard-aware software program sees them as, er, equally different.
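
          The footnote can be checked with a textbook edit-distance routine. This is a generic Python sketch, not any particular product's matching logic, and the `levenshtein` function name is just illustrative.

```python
def levenshtein(a, b):
    """Dynamic-programming edit distance (insert, delete, substitute, each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free when characters match
            ))
        prev = curr
    return prev[-1]


for typo, options in {"con": ["com", "cn"], "inf": ["info", "in"]}.items():
    print(typo, {opt: levenshtein(typo, opt) for opt in options})
# con {'com': 1, 'cn': 1}
# inf {'info': 1, 'in': 1}
```

          Both candidates in each pair are one edit away, so plain edit distance cannot pick between them; a keyboard-aware weighting (for instance, "n" sits next to "m") would be needed to prefer one.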

            • Re: How to correct typos in email addresses?
              Elliott Lowe

              I've resigned myself to using the Force.com Connector in Excel and manipulating the errant TLDs there, but I'm definitely interested in a webhook I can call whenever an email address contains one of these common typos.


              The list in my post was ordered by frequency of occurrence.  Below are the top 20 typos and their percent of the total.




              Really appreciate the form script, which I added to our Marketo LP templates with the following tweaks.

              • Removed the form embed code and replaced with 'MktoForms2.whenReady'
              • Revised the error message
              • Made the form submittable if the TLD is valid by changing the first occurrence of 'form.submittable(false)' to 'form.submittable(true)'


              Do you have to periodically manually update the list of valid TLDs by copying them from http://data.iana.org/TLD/tlds-alpha-by-domain.txt or do you have an automated way of maintaining the list in the code?  The list of valid TLDs must change frequently, as over 200 TLDs have been added since I checked it about 7 months ago.
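
              One possible way to automate this, sketched in Python under the assumption that a scheduled job outside Marketo can regenerate the list: the IANA file is plain text with a "#" comment line at the top and one uppercase TLD per line, so parsing it is short. The `parse_tld_list` and `fetch_tlds` names are hypothetical, and the sample header below is a placeholder rather than real file contents.

```python
from urllib.request import urlopen

IANA_TLD_URL = "http://data.iana.org/TLD/tlds-alpha-by-domain.txt"


def parse_tld_list(text):
    """Return the set of lowercased TLDs, skipping blank lines and '#' comments."""
    return {
        line.strip().lower()
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    }


def fetch_tlds(url=IANA_TLD_URL):
    """Download and parse the current list; intended for a scheduled job."""
    with urlopen(url, timeout=10) as resp:
        return parse_tld_list(resp.read().decode("utf-8"))


# Offline demonstration with a placeholder header line.
sample = "# Version (placeholder), Last Updated (placeholder)\nCOM\nNET\nORG\n"
print(sorted(parse_tld_list(sample)))  # ['com', 'net', 'org']
```

              The output of `fetch_tlds()` could then be written into whatever file the form script reads its TLD list from; how that file gets published is left open here.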

            • Re: How to correct typos in email addresses?
              Josh Hill

              You could also search for a list of bounced emails and manually correct them.


