Without reading the official standards, can you describe the differences between a valid URL and a valid href? How about between the href attribute of an HTML A tag and the href IDL attribute of its Location object?
My guess is there are only a few people in the world who can recite these off the top of their heads: members of WHATWG or W3C. I certainly don’t know all this stuff by heart, but reading standards is fun.
Anyway, all these are valid A tags that link to the same destination URL:
But only the last 2 will be tracked by Marketo.
Indeed.
The href HTML attribute is defined as “a valid URL potentially surrounded by spaces.” After stripping leading and trailing spaces, it must be a valid URL string, but the spaces themselves are fine.[1]
In other words, a URL can’t start or end with spaces. But even though an <a href> becomes a URL by design, the href itself can have spaces.
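As a rough sketch of that definition: strip the surrounding ASCII whitespace (tab, line feed, form feed, carriage return, space), then see whether what remains parses as a URL. The helper names `stripAsciiWhitespace` and `looksLikeValidHref` and the example.com address are made up for illustration, and the sketch assumes absolute URLs only. Parsing successfully is also a looser test than the standard's "valid URL string," so this is an approximation, not a validator.

```typescript
// ASCII whitespace per the HTML standard: tab, LF, FF, CR, and space.
const ASCII_WHITESPACE_EDGES = /^[\t\n\f\r ]+|[\t\n\f\r ]+$/g;

// Hypothetical helper: remove the whitespace an href attribute may carry
// before and after the URL itself.
function stripAsciiWhitespace(raw: string): string {
  return raw.replace(ASCII_WHITESPACE_EDGES, "");
}

// Hypothetical helper: approximates "valid URL potentially surrounded by
// spaces" for absolute URLs. "Parses without error" is an assumption that
// stands in for the stricter "valid URL string" rule.
function looksLikeValidHref(raw: string): boolean {
  try {
    new URL(stripAsciiWhitespace(raw));
    return true;
  } catch {
    return false;
  }
}

const padded = "  https://www.example.com/page  ";
console.log(JSON.stringify(stripAsciiWhitespace(padded)));
console.log(looksLikeValidHref(padded)); // the padding does not hurt
console.log(looksLikeValidHref("   ")); // nothing left after stripping
```

The point of splitting it into two steps is that the padding and the URL are judged separately: the attribute may carry the spaces, the URL inside it may not.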
An even deeper detail is that when an <a> is parsed into a Location object, the Location object’s href property won’t have spaces. This is easy to demonstrate in the browser...
... but was difficult to find in the spec(s).
Finally, I found that a Location object is said to have a relevant Document, and any Document has a URL. That URL is derived using the Basic URL Parser, which explicitly has the 3rd step:
3. Remove any leading and trailing C0 control or space from input.
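That step can be sketched outside the DOM with the standard `URL` constructor, which is assumed here to run the same parser. "C0 control or space" means code points U+0000 through U+0020. The helper name `stripC0ControlOrSpace` and the example.com address are made up for illustration.

```typescript
// C0 control or space: code points U+0000 through U+0020 inclusive.
const C0_OR_SPACE_EDGES = /^[\u0000-\u0020]+|[\u0000-\u0020]+$/g;

// Hypothetical helper mirroring the "remove any leading and trailing
// C0 control or space" step, and nothing else the parser does.
function stripC0ControlOrSpace(input: string): string {
  return input.replace(C0_OR_SPACE_EDGES, "");
}

// The raw value, as it might sit in an href attribute, padding included.
const rawHref = "  https://www.example.com/page?x=1  ";

// The URL constructor parses the padded input without complaint.
const parsed = new URL(rawHref);

console.log(JSON.stringify(rawHref)); // still padded
console.log(JSON.stringify(parsed.href)); // serialized URL, no padding
console.log(parsed.href === stripC0ControlOrSpace(rawHref));
```

The raw string keeps its spaces while the parsed `href` loses them, which is the same split as between the attribute and the property.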
So one thing with the name href can have spaces, while another type of href cannot. Confusing!
So why are links with leading spaces left untracked? (And yes, I learned about this when a client messed up a big send this way.)
Because Marketo checks only the raw href attribute to see if something is a tracking-worthy link. If it doesn’t start with a sequence of letters followed by a colon (that includes not just http: and https: but also tel: and such), it’s thought to be some other kind of <a>, like a jump link within the email body, which shouldn’t be tracked.
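To make the rule concrete, here is a stand-in for that kind of check. This is not Marketo's actual code: the regular expression, the function name `wouldBeTracked`, and the sample href values are all assumptions based only on the behavior described above.

```typescript
// Hypothetical stand-in for the check described above: the raw attribute
// value must begin with one or more letters followed by a colon.
const STARTS_WITH_SCHEME = /^[a-z]+:/i;

// Tests the raw attribute value only. No trimming, no URL parsing.
function wouldBeTracked(rawHref: string): boolean {
  return STARTS_WITH_SCHEME.test(rawHref);
}

// Made-up href values. The first four resolve to the same destination.
const hrefs: string[] = [
  " https://www.example.com/page", // leading space
  "\nhttps://www.example.com/page", // leading line break
  "https://www.example.com/page ", // trailing space only
  "https://www.example.com/page", // no padding
  "tel:+15555550123", // non-HTTP scheme
  "#section-2", // jump link
];

for (const href of hrefs) {
  console.log(JSON.stringify(href), wouldBeTracked(href));
}
```

Under this stand-in rule, the two values with leading whitespace fail along with the jump link, while the trailing-space, unpadded, and tel: values pass. Trimming href values before the email goes out sidesteps the whole problem.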
Is this a bug? Possibly, but ultimately it’s an application-level decision. Is it worth fretting about, rather than just keeping in mind? To me, that’s a no.
[1] I’m not sure exactly why surrounding spaces are allowed — even I haven’t been around that long! Maybe it’s in the W3C mailing list archives from 20 years ago, but I’ve got stuff to do.☺