A hackity-hack for “Ends With” and email domains

SanfordWhiteman
Level 10 - Community Moderator
Level 10 - Community Moderator

If you've been around the block with Marketo Smart Lists, you know there's no Ends With operator, only Starts With and Contains.[1]

This puts a damper on a common need: accurately searching for an email domain (@gmail.com, @example.co.uk) or TLD (firmographic clues like .edu, geographic ccTLDs like .cn).[2]

Some have attempted extravagant combos of Contains and Not Contains, which require a whole lot of prep just to determine that... they don't work. (Read my comments on the linked post for some examples of how such approaches are broken.)

There's a much easier way: maintain a custom field, here called Matchable Email, that always holds the value of {{Lead.Email Address}} followed immediately by two quotation marks "":

ss

ss

Then, to do a domain search, search that Matchable Email field for Contains @example.com"", which is equivalent to searching the original Email Address for Ends With @example.com:

ss

Pretty easy, right?

Why two quotation marks (“”)?

The key is to add a sequence of characters to the end of the email address that can never occur in the middle of the email address, so Contains @{{domain}}{{characters}} is functionally equivalent to Ends With @{{domain}}.

Finding those appropriate {{characters}} is a lot harder than it sounds. For awhile I was lazily appending $ to the end, because I like the fact that the dollar sign represents end-of-line in regular expressions, so it was easy to remember. But the email address "$ke$ha​@marketo.com$"​@gmail.com — note the quotation marks around the mailbox part, it wouldn't be valid without those — is a Gmail account, but would match Contains @marketo.com$.

Yes, RFC 5321 is just that generous. There are so many crazy-but-valid email addresses, however inadvisable it would be to use them in the real world, that it's hard to find something that, without exception, can only occur outside of a valid address and so can be used as your anchor point.[3]

I think I've found that something, though. Two quotation marks in a row "" can occur inside an email address, but they can never be preceded by a character that is a valid part of a domain name.

Let me explain.

First of all, as you may already be confused by this part, it's possible to have a quoted mailbox name (called a quoted-string in the standard). That's how you can add spaces on the mailbox side of the @: "sandy spacebot"@teknkl.com is a valid SMTP address.

You can also put quotation marks inside an already quoted mailbox name, but if you do so, you have to escape them with a backslash.  Thus "Clarence "Frogman""@henry.com" is not a valid email address, but if you escape the quotes as "Clarence \"Frogman\""@henry.com it is valid. Even though this address has two quotes in a row "" (see the characters right before the @?) they are by necessity preceded by a \.  And the \ can never be at the end of a domain name. 

Therefore you can accurately search the Matchable Email field for a string that Contains @gmail.com"", knowing that that sequence of characters cannot be found at the start or middle of the value, only at the end.

Enjoy!


Notes

[1] As a sometime database architect, I've never understood the technical reasoning and figure it must be just legacy-code-nobody-wants-to-touch syndrome. When searching strings, Starts With is faster than Ends With unless specific indexing is used; yet Contains and Ends With have equivalent performance — often terrible performance, don't get me wrong, but roughly the same either way. Plus, it's way easier to add indexing to speed up Ends With than it is to optimize Contains (an index on the reversed value in the first case, n-grams in the second case, FWIW). But here we are.

[2] My colleague EU points out that Marketo attempts to optimize a search for a domain-like pattern, one that begins with the character @, and turn it into an SMTP domain search. The problem is that it still doesn't work: The valid address "me​@gmail.com"​@outlook.com will (as we would expect given the concept of contains) match both Contains @gmail.com and Contains @outlook.com so it doesn't successfully emulate Ends With. It will also false negative on Contains @outlook.co, which is just plain wrong.

[3] The way to do this in a technically complete manner is to add an ASCII control character (like ASCII 30 RECORD SEPARATOR, one of my faves) which is never allowed, not even in quotes. But while you can append such a character with a specially hacked Change Data Value, searching for those characters is, unless it's just a one-time thing, effectively impossible. So we'll have to make do with "".

9469
17
17 Comments
SanfordWhiteman
Level 10 - Community Moderator

Psyched that people seem to really like this one!

Yoav_Guttman1
Level 4

Interesting!

This seems like it might open up the possibility for some hyper-focused account based strategy, so building lists or alerts based on @cocacola.com"" for example. That is something I have been working on for a while - creating an alert system for activity across different people within an account, ideally based on domain (though that is not a requirement). Sanford Whiteman - have any ideas on how to do something like that?

Ulf_Deeg
Level 3

implemented. All i can say

Josh_Hill13
Level 10 - Champion Alumni

Sanford Whiteman​  so does this run faster as well?

How would you handle the Custom Field? Is that a Formula or a String? Running a daily batch for this could work to some degree, however, very large databases may have trouble implementing this.

SanfordWhiteman
Level 10 - Community Moderator
  • Speed would be the same as any Contains, it's about having an accurate version of Ends With.
  • Matchable Email would be a String.
  • Depends on rate-of-change rather than sheer database size... one of my 2MM+ dbs only had 73 Email Address changes the whole last week (but every instance is different). If importing a large list, it would make sense to include the field as a column so you don't have to run a trigger.
SanfordWhiteman
Level 10 - Community Moderator

That is something I have been working on for a while - creating an alert system for activity across different people within an account, ideally based on domain (though that is not a requirement)

Sorry for the delay, Yoav. Not sure this specific trick gets you too much closer to your goal, though of course having an error-free domain match is part of it. The other part is pairing the domain with (presumably) a account rep, which means you need more than just the ability to filter on the domain, you also need to look it up in a table. This can be done via webhook.

Dana_F
Level 1

Sanford Whiteman​ forgive my ignorance, but when you create a campaign with a flow to "Change data value", what exactly does this do to the persons in my database?

SanfordWhiteman
Level 10 - Community Moderator

​Change Data Value ​sets a lead field (left hand side) to a specific value (right hand side). 

Dana_F
Level 1

Will this campaign only make changes to the field Matchable Email? I want to make sure it doesn't make any kind of adverse change to my database. Sanford Whiteman

SanfordWhiteman
Level 10 - Community Moderator

Of course. Only changes the field in the step. (You likely don't have to @ me btw. )