2018

Can't believe someone other than me has thought about cryptographic techniques in Marketo! But in this recent Nation thread, a user asked about attaching the SHA-256 hash of a recipient's email address to (email) links.

The aim being to send a secure representation of a lead's email to an external (non-Marketo) site, where it could be looked up as part of a 3rd-party subscription center.

In other words, a link would look like this:

http://nonmarketopage.example.com/?emailHash=284C36B4333F55511617FB225C62813316038522DF5C2AEC6B6B13B3D3774F5C

The target site's database would hold pre-computed hashes of all their registered users' emails, the hash being stored in a separate column from the email itself.[1]

A match would then be done hash-to-hash (the way hashes are always used, since a hash is a one-way function and can't be decoded back to the original input) to load the user's record.
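For illustration, here's roughly what that hash-to-hash lookup could look like on the receiving site. This is only a sketch under assumptions: it supposes the site runs Node.js, and `emailHash`, `usersByHash` and `findUserByUrl` are hypothetical names standing in for whatever the real back end uses (the Map plays the role of the pre-computed hash column).

```javascript
// Minimal sketch (Node.js): hash-to-hash lookup on the receiving site.
const crypto = require("crypto");

// Uppercase hex SHA-256 of the trimmed, lowercased address
function emailHash(email) {
  return crypto
    .createHash("sha256")
    .update(email.trim().toLowerCase(), "utf8")
    .digest("hex")
    .toUpperCase();
}

// Hypothetical stand-in for the database's pre-computed hash column
const usersByHash = new Map(
  ["sandy@figureone.com", "jeff@example.com"].map((email) => [
    emailHash(email),
    { email: email }
  ])
);

// Load the record named by the emailHash query parameter of an incoming URL
function findUserByUrl(url) {
  const hash = new URL(url).searchParams.get("emailHash") || "";
  // hex strings are compared case-insensitively by normalizing to uppercase
  return usersByHash.get(hash.toUpperCase()) || null;
}
```

Both sides have to agree on the normalization (here: trim plus lowercase) and the character encoding, or the hashes will never match.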

Mind you, I have no idea why this would actually be done as opposed to sending the URL-encoded email address:

http://nonmarketopage.example.com/?email=sandy%40figureone.com

Or the easily reversed, Base64-encoded (not encrypted) address, which wouldn't require a hash lookup:

http://nonmarketopage.example.com/?emailB64=c2FuZHlAZmlndXJlb25lLmNvbQ==
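To make "easily reversed" concrete, here's a small sketch showing that both alternatives round-trip back to the original address with no lookup at all. It assumes Node.js (for `Buffer`); the variable names are just for illustration.

```javascript
// Minimal sketch (Node.js): neither encoding hides anything.
const email = "sandy@figureone.com";

// URL-encoding: only the @ needs escaping here
const urlEncoded = encodeURIComponent(email); // "sandy%40figureone.com"

// Base64: an encoding, not encryption
const base64 = Buffer.from(email, "utf8").toString("base64"); // "c2FuZHlAZmlndXJlb25lLmNvbQ=="

// Anyone holding the link can reverse either one
const decodedFromUrl = decodeURIComponent(urlEncoded);
const decodedFromBase64 = Buffer.from(base64, "base64").toString("utf8");
```

Standard Base64 output can contain +, / and =, so it's prudent to pass it through encodeURIComponent before putting it in a query string.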

Being a realist, I suspect the justification is merely that they want the links to look more “technical” than they would with Base64 (?) and they don't truly have any need for security (as it makes little sense to care about the security of the recipient's own email address![2]). UPDATE: OP says the client now wants AES encryption instead of hashing, so they really are trying to securely transport stuff in the URL. That gives me a chance to bring out the big guns and show you how encryption is done in VTL... at some point.

Anyway, should you need it, here's a utility Velocimacro to gen a SHA-256 hash:

#**
    HashTool in VTL v2
    @copyright (c) 2018 Sanford Whiteman, FigureOne, Inc.
    @license MIT License: all reproductions of this software must include the data above
*#
#macro( SHA256_v2 $mktoField )
## reflect some dependencies
#set( $Class = $context.getClass() )
#set( $java = {} )
#set( $java.lang = {
     "StringBuilder" : $Class.forName("java.lang.StringBuilder"),
     "Appendable" : $Class.forName("java.lang.Appendable")
} )
#set( $java.security = {
    "MessageDigest" : $Class.forName("java.security.MessageDigest")
} )
#set( $java.util = {
    "Formatter" : $Class.forName("java.util.Formatter").getConstructor($java.lang.Appendable)
} )
## get an MD and make it go, then (important) reset MD singleton for reuse
#set( $MD = $java.security.MessageDigest.getInstance("SHA-256") )
#set( $digestBytes = $MD.digest( $mktoField.getBytes("utf8")) )
#set( $void = $MD.reset() )
## gen hex string representation
#set( $hexString = $java.lang.StringBuilder.newInstance() )
#set( $hexer = $java.util.Formatter.newInstance($hexString) )
#foreach( $B in $digestBytes )
#set( $void = $hexer.format("%02x",$B) )
#end
## return uppercase, but note a hex string should be treated as case-insensitive
$hexString.toString().toUpperCase()##
#end

After including the above in its own token (best to not pollute your user code with utility functions) use the macro like so:

#set( $emailHash = "#SHA256_v2($lead.Email.toLowerCase())" )
<a href="http://www.example.com/?emailHash=${emailHash}">Click here</a>

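Note that I toLowerCase()d the Email before passing it, as that's appropriate for that particular field (as I've written about before, though the mailbox part of an SMTP address is technically case-sensitive, addresses are commonly matched case-insensitively for sanity's sake). The target site must lowercase its stored emails before pre-computing hashes, too, or the two hashes won't match.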


NOTES

[1] Hope I'm not hoping for too much here. They'd better be already pre-computing hashes for all the stuff in their database, or this idea goes from merely frivolous to very bad. If they didn't pre-hash, they'd have to hash every record in the database, on-the-fly, for every lookup. This would absolutely destroy performance and be a sign that the back end was not well thought out.

[2] Long as the destination site runs SSL. But if the site doesn't run SSL then the connection could be intercepted and you have a lot worse problems than showing the attacker an email address.

The Forms 2.0 Email type — just like a standard HTML <input type="email"> — won't throw an error on email addresses like:

sandy@gmail

jeff@amazon

alejandra@zz

That is, domains with only a single DNS label to the right of the @ sign (@example), rather than multiple labels separated by dots (@example.com or @example.co.uk), are valid entries.

Confusing, sure. But no, it's not a bug. A mailbox @ a single-label domain isn't an “incomplete” or “partial” address, and while most such addresses happen to be invalid on the public internet, it's impossible to know whether they're valid using JavaScript alone.

To understand why, you have to know more about how SMTP domains are looked up in the global DNS.[1] (To my admittedly unreasonable dismay, nothing about SMTP or DNS is taught to marketing students!)

A laughably brief overview of SMTP and DNS

This is silly to try to go over quickly. But here are the basics:

  1. For an email address to be theoretically routable over the public net, it must have a domain part, the part to the right of the @.
  2. For an email address to be factually routable over the public net, the domain needs to have an MX record[2] in the global (public) DNS.
  3. A web browser cannot perform arbitrary DNS lookups on its own, so it cannot know whether a domain actually exists, let alone what records are in the DNS zone. It can only know if it's well-formed: that is, if it matches the string syntax rules for a domain.
  4. A single-label domain like gmail (or, for that matter, a single label like com) is no less a well-formed domain than one with multiple labels, like gmail.com or zyzzx.co.uk. Once looked up, these will differ greatly in terms of being private (assigned year-by-year to a company or other entity) or public (registered semi-permanently to a country or independent NIC operator).[3] And they may or may not be registered, or even legally eligible for registration, at a given point in time. But they are nevertheless all domains.
  5. A single-label domain, if it exists in the global DNS, may also have an MX record (and some, as we shall see, do). It's a rare case, yes, but it's technically valid, and since form inputs don't perform DNS lookups, there's no way to know whether the particular single-label domain somebody entered is routable or is a dead end.
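The MX check in the list above has to happen somewhere that can actually query DNS. As a rough sketch, assuming a Node.js back end (not anything built into Marketo or the browser), it might look like this. `domainOf` and `hasMxRecord` are hypothetical helper names; `dns.promises.resolveMx` is Node's standard resolver call. It deliberately ignores the A-record fallback mentioned in note [2].

```javascript
// Minimal sketch (Node.js, server-side): does the domain part have an MX record?
const dns = require("dns");

// Everything to the right of the last @, or "" if there is no @
function domainOf(email) {
  const at = email.lastIndexOf("@");
  return at === -1 ? "" : email.slice(at + 1);
}

// resolveMx is injectable so the logic can be exercised without network access
async function hasMxRecord(
  email,
  resolveMx = (domain) => dns.promises.resolveMx(domain)
) {
  const domain = domainOf(email);
  if (!domain) return false;
  try {
    const records = await resolveMx(domain);
    return records.length > 0;
  } catch (err) {
    // lookup errors (no such domain, no MX data, timeout) are treated as "no MX found"
    return false;
  }
}
```

Note that a single-label domain is passed to the resolver exactly like any other: the lookup, not the number of dots, decides the answer.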

In sum, the only thing a web browser alone can know about a domain, without invoking a remote lookup service, is that it follows proper syntax. It can't have commas instead of dots; it can't have anything other than a tiny subset of ASCII characters (though it may be displayed as if it has more exotic characters, which is another topic); it can't have an individual label longer than 63 characters; it can't have 2 dots in a row; the total length including dots can't exceed 253. That's pretty much it: one string that matches those requirements is as valid as any other.
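Those syntax rules are simple enough to sketch in a few lines. This is an approximation for illustration only (`isWellFormedDomain` is a hypothetical name, and it expects the ASCII form of the domain, with no trailing dot), but it shows that the number of labels never enters into it.

```javascript
// Minimal sketch: syntax-only check of a domain, no DNS involved.
function isWellFormedDomain(domain) {
  // total length, dots included, is capped at 253
  if (domain.length === 0 || domain.length > 253) return false;

  // each label: 1 to 63 chars, letters/digits/hyphens, no hyphen at either end
  const RE_LABEL = /^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$/;

  // two dots in a row produce an empty label, which fails the pattern
  return domain.split(".").every(function (label) {
    return RE_LABEL.test(label);
  });
}
```

By this test gmail, gmail.com and zyzzx.co.uk are all equally fine, while gmail..com and gmail,com are not.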

You might be getting it already, but let's go further

From the intro above, you should already have your mind opened.

But what exactly do I mean when I say a single-label domain may still be a publicly routable domain? Most likely, you've never sent or received mail from joe@gmail, only joe@gmail.com. But that's just because you haven't sent enough email. (Like, you haven't personally sent billions of messages!)

As of today, these TLDs have MX records, meaning the email address user@tld is publicly routable:

.ai .ax .cf .dm .gp .gt .hr .km .lk .mq .pa .sr .tt .ua .ws

These are mostly small island countries, like Anguilla, Dominica, and Trinidad and Tobago. With full respect to the governments of those countries, with relatively limited budgets they may have set the records up accidentally. But the list also includes Guatemala (.gt), Ukraine (.ua) and Sri Lanka (.lk) — so we must assume it's no accident that you can address mail to sirisena@lk.

I get that you still want to “fix” your forms, though

Even if you believe everything above, you probably still want to hand-wave it and require multi-label domains! So here's a little snippet if you insist:

MktoForms2.whenReady(function(form) {
  form.onValidate(function(nativeValid) {
    // let the built-in validation report its own errors first
    if (!nativeValid) return;

    var currentValues = form.getValues(),
        formEl = form.getFormElem()[0],
        emailEl = formEl.querySelector("[name=Email]"),
        // like the HTML5 email pattern, but requires at least one dot in the domain
        RE_EMAIL_ASCII_PUBLIC = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)+$/;

    // block submission until the Email value passes the stricter pattern
    form.submittable(false);
    if (!RE_EMAIL_ASCII_PUBLIC.test(currentValues.Email)) {
      form.showErrorMessage(
        "Must be valid email. <span class='mktoErrorDetail'>user@example.com</span>",
        MktoForms2.$(emailEl)
      );
    } else {
      form.submittable(true);
    }
  });
});

Obviously if you already have custom onValidate behaviors, you have to integrate this code with those. (When you're flipping the Forms 2.0 global submittable flag on and off, you need to make sure all conditions are considered together.)
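If you want to see what the stricter pattern accepts and rejects before wiring it into a form, you can exercise it on its own. The wrapper name `isMultiLabelEmail` below is made up for this sketch; the regex is the one from the snippet.

```javascript
// Minimal sketch: the same pattern as the form snippet, testable outside a form.
var RE_EMAIL_ASCII_PUBLIC = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)+$/;

// true only when the domain part has two or more dot-separated labels
function isMultiLabelEmail(value) {
  return RE_EMAIL_ASCII_PUBLIC.test(value);
}

var samples = ["sandy@gmail", "jeff@amazon", "sandy@gmail.com", "user@example.co.uk", "sirisena@lk"];
samples.forEach(function (sample) {
  console.log(sample + " -> " + isMultiLabelEmail(sample));
});
```

Keep in mind the tradeoff: this rejects sirisena@lk along with sandy@gmail, even though the former may be perfectly routable.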

A deep note about commercial gTLDs

OK, there is one glitch in my claim that browsers are completely justified in accepting all single-label domains in Email inputs.

In fact, newfangled domains like .space and .lawyer and .futbol are expressly prohibited from having MX records at the top level. From the guidebook for applying to run one of these “new gTLDs”:

ICANN receives a number of inquiries about use of various record types in a registry zone, as entities contemplate different business and technical models. Permissible zone contents for a TLD zone are:

• Apex SOA record.

• Apex NS records and in-bailiwick glue for the TLD’s DNS servers.

• NS records and in-bailiwick glue for DNS servers of registered names in the TLD.

• DS records for registered names in the TLD.

• Records associated with signing the TLD zone (i.e., RRSIG, DNSKEY, NSEC, and NSEC3).

In other words, non-essential records like A or MX records are not permitted at the apex level of these new gTLDs.

The guide does go on to imply that another record type might be permitted by special exception, but it seems awfully doubtful that any would ever be approved:

An applicant wishing to place any other record types into its TLD zone should describe in detail its proposal in the registry services section of the application. This will be evaluated and could result in an extended evaluation to determine whether the service would create a risk of a meaningful adverse impact on security or stability of the DNS.

So, if you know you're dealing with one of these new gTLDs, .lawyer for example, you in fact can be sure without a DNS lookup that joe@lawyer will not be routable over the net, since that top-level domain is not legally allowed to accept mail.

So browsers could, in theory, throw an error on <user@new gTLD> (since they do have a copy of the Public Suffix List internally and don't have to hit DNS for that). But, well, they don't. Go figure!
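If you wanted to do what browsers don't, the check could be sketched like this. Big assumption: `NEW_GTLDS` here is just the three examples named above, not a real or complete list, and `isBareNewGtldAddress` is a hypothetical name. A real implementation would need a maintained list of new gTLDs.

```javascript
// Minimal sketch: flag user@<new gTLD> addresses without any DNS lookup.
// Illustrative subset only: the three new gTLDs mentioned in the text.
var NEW_GTLDS = new Set(["space", "lawyer", "futbol"]);

function isBareNewGtldAddress(email) {
  var at = email.lastIndexOf("@");
  if (at === -1) return false;
  var domain = email.slice(at + 1).toLowerCase();
  // only a single-label domain that is itself a new gTLD qualifies
  return domain.indexOf(".") === -1 && NEW_GTLDS.has(domain);
}
```

So joe@lawyer would be flagged, while joe@smith.lawyer and sirisena@lk would be left alone.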


NOTES

[1] Plus, it helps to understand that SMTP delivery doesn't even require a DNS lookup. The SMTP standard predated DNS by a few years, and even today billions of messages are passed from server to server without DNS being consulted. But I'm going to ignore this other contributing factor today.

[2] Yes, for the purists, an MX or an A record. I'm not trying to overwhelm people here.

[3] Or of course the 3rd variant: private on the pure DNS level, as they're children of a public TLD, but to be treated as public for purposes of browser security. This is why the PSL exists: it tracks the effective public domains, a list which can't be derived via DNS alone.
