Use an established URL parser for UTM tracking (I’ll say it again!)

SanfordWhiteman
Level 10 - Community Moderator

I always tell folks to use a time-tested URL parsing library like URI.js to seek and save UTMs and other marketing attribution info (not all browsers support native URLSearchParams, plus URI.js is better anyway).

 

Don’t attempt to write your own or use an uncredited one-liner you found somewhere!

 

Almost broke my own rule just now, though. For an upcoming post where only one query param was necessary, I found this old code (credited as part of “IOWA Util” though no evidence of that project exists) and pasted it in before doing a double-take:

  function getURLParameter(param) {
    var queryString = window.location.search.substring(1);
    if (!queryString) return;

    var matches = new RegExp(param + "=([^&]*)").exec(queryString);
    if (!matches) return;

    var value = decodeURIComponent(matches[1]);
    return value;
  }

 

You see this general approach, new RegExp(param + "=([^&]*)").exec(), in hundreds of projects on GitHub. (That is, projects whose authors should be including a URL library but thought that was overkill.) It seems to have started spreading around 10 years ago.

 

But it’s broken in two ways.

 

The first way

See that part where a new RegExp is constructed from the query param name?

 

What happens if the param name you’re looking for has a period in it? Well, it may seem to work. getURLParameter("user.name") will return swhiteman for the following URL:

https://www.example.com/?user.name=swhiteman&user.id=12345

 

But if you create a new RegExp from a string without escaping special characters, the dot . represents any character. Which means for this URL:

https://www.example.com/?userfname=sandy&user.name=swhiteman

 

Calling getURLParameter("user.name") will return sandy. Oops!

 

Now, if you know about the implementation, then you can work with it by calling getURLParameter("user\\.name"). But then the function should be explicit about expecting a regExpEscapedParam, not just param.
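
To reproduce this outside a browser, here is a minimal sketch. getParamFromQuery is a hypothetical variant of the function above: the matching logic is unchanged, but the query string is passed in as an argument (my assumption, purely so it runs without window.location).

```javascript
// Hypothetical variant of getURLParameter: same regex logic, but the query
// string is an argument so the sketch runs without window.location.
function getParamFromQuery(queryString, param) {
  if (!queryString) return;

  var matches = new RegExp(param + "=([^&]*)").exec(queryString);
  if (!matches) return;

  return decodeURIComponent(matches[1]);
}

var query = "userfname=sandy&user.name=swhiteman";

// The unescaped dot matches any character, so "userfname" is found first.
var unescaped = getParamFromQuery(query, "user.name");

// Escaping the dot by hand matches the literal param name.
var escaped = getParamFromQuery(query, "user\\.name");

console.log(unescaped); // "sandy"
console.log(escaped);   // "swhiteman"
```

The escaped call only works because the caller knows a regex is built under the hood, which is exactly the kind of hidden contract a lookup function shouldn't have.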

 

The second way

Query parameter names can be URL-encoded just as much as query parameter values. But the code acts like only the latter is possible. Take this URL:

https://www.example.com/?user%5B%5D=sandy

 

That’s the encoded version of the href https://www.example.com/?user[]=sandy.

 

But calling getURLParameter("user[]") will return undefined. You’d have to know in advance that the param name is encoded, or always try both permutations. That’s far from practical.
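
A small sketch of that failure, again using a hypothetical getParamFromQuery variant that takes the query string as an argument instead of reading window.location.search (an assumption so it runs anywhere; the matching logic is the same as the original).

```javascript
// Hypothetical variant of getURLParameter: same regex logic, but the query
// string is an argument so the sketch runs without window.location.
function getParamFromQuery(queryString, param) {
  if (!queryString) return;

  var matches = new RegExp(param + "=([^&]*)").exec(queryString);
  if (!matches) return;

  return decodeURIComponent(matches[1]);
}

var encodedQuery = "user%5B%5D=sandy";

// Looking up the plain name fails: the name in the URL is still encoded.
var byPlainName = getParamFromQuery(encodedQuery, "user[]");

// Only the pre-encoded name finds the value.
var byEncodedName = getParamFromQuery(encodedQuery, "user%5B%5D");

console.log(byPlainName);   // undefined
console.log(byEncodedName); // "sandy"
```

Worth noting that [ and ] are regex metacharacters too, so the first problem compounds here: in a JavaScript regex, [] is an empty character class that never matches, so "user[]" would not be found even in an unencoded query string.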

 

A good URL parser doesn’t require you to pre-encode/pre-escape, nor pre-decode/pre-unescape, query param names. You just pass the name as it would exist outside of URL restrictions (i.e. unencoded).
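
As a rough illustration of that behavior, the sketch below uses the built-in URL and URLSearchParams API as a stand-in, only because it runs without dependencies. It is not a substitute for the URI.js recommendation, and native support isn't universal. getParam is a hypothetical helper name.

```javascript
// Hypothetical helper: look up one query param by its plain, unencoded name.
// The built-in URL API stands in here for a full URL library.
function getParam(href, name) {
  var value = new URL(href).searchParams.get(name);
  // searchParams.get() returns null when the param is absent; normalize
  // to undefined to mirror the original function's behavior.
  return value === null ? undefined : value;
}

// The dot is treated literally, so "userfname" is not confused with "user.name".
var dotted = getParam(
  "https://www.example.com/?userfname=sandy&user.name=swhiteman",
  "user.name"
);

// The encoded name in the URL is decoded before comparison.
var bracketed = getParam("https://www.example.com/?user%5B%5D=sandy", "user[]");

console.log(dotted);    // "swhiteman"
console.log(bracketed); // "sandy"
```

In both calls the name is passed exactly as you'd write it outside a URL, with no escaping or encoding on the caller's side.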

 

In sum

Use URI.js, or something equally robust!
