Don’t worry, it’s not your fault! The web was simply designed with an undue amount of trust in, well, people.
As the web has evolved, some data points that used to be available across sites have been determined to be security and/or privacy risks, and they’ve been made harder to get to, if not impossible.
When you click a hyperlinked <a> in a browser, the Referer header (yes, that misspelling is in the original standard, perhaps a harbinger of things to come!) tells the target URL about the source URL of the request.
So when you click to go from
the server running https://some.other.example sees this HTTP header:
Same goes for remote sub-resources fetched by a page, like images, scripts, CSS stylesheets, IFRAMEs, and Ajax requests. The server that owns those resources sees the Referer, so it knows who was asking for them. The Referer is also sent with other methods of navigation, like posting a form or JS-based changes to document.location.
The only exception used to be when the source page ran over SSL (https://www.some.example/?some_query_param=here) while the target page did not (http://www.example.com). In that case, the Referrer was not sent at all, to preserve the security of the source page rather than the privacy of either side. (The idea being that if the source URL was originally unreadable on the wire, it shouldn’t later be exposed as plain text to an eavesdropper.)
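To make the old rule concrete, here is a minimal Python sketch of it. This is only a model of the behavior described above, not browser code: the function name legacy_referer is made up for illustration, and it ignores edge cases such as credentials embedded in the URL.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def legacy_referer(source_url: str, target_url: str) -> Optional[str]:
    """Model the old default: full source URL, except on HTTPS -> HTTP.

    Hypothetical helper for illustration; real browsers handle more cases.
    """
    source = urlsplit(source_url)
    target = urlsplit(target_url)

    # Secure source, insecure target: send no Referer at all.
    if source.scheme == "https" and target.scheme == "http":
        return None

    # Otherwise send the source URL with path and query string.
    # The #fragment is dropped; it is never part of the Referer.
    return urlunsplit(
        (source.scheme, source.netloc, source.path, source.query, "")
    )
```

Calling legacy_referer with the SSL source page and the plain-HTTP target from the paragraph above returns None, while an HTTPS-to-HTTPS click returns the full source URL, query string included.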
Funny thing about the Olde Referrer Days (which only officially ended in late 2020, when Chrome 85 came out!) is that with the notable exception of giant search engines, there wasn’t wide concern about the privacy implications of Site B’s owner knowing someone was just looking at mypage.html on Site A.
Yes, Google and Bing and Yahoo made some special (and non-standardized) adjustments to their code so that a target site would only see a simplified Referer/document.referrer like “https://www.google.com” instead of the full search string. This was about keeping user-entered keywords private... well, that and preserving market share by making you go to the search engine’s console to see keyword trends!
Yet the web as a whole didn’t take any such precautions. So if you ran a personal blog, any site you linked to — say, a company you were critiquing — would know someone was on your blog first. (Some wanted the source to be exposed for affiliate marketing purposes, but most people probably didn’t.)
Likewise for corporate websites, where you might link to industry news sites or what-have-you. Given a choice, you wouldn’t want to share the last page your visitors were on. If you had a sufficiently skilled developer, you could do what search engines do, bouncing people off an interstitial page so only your domain (not path and query string) would be shown. But most companies didn’t do that. So in practice, most Referrers were being shared.
Things started to change when the W3C introduced the Referrer-Policy header (and a companion <meta> tag that has the same function). Referrer-Policy lets the source site determine exactly what will be shown in the Referer header when connecting to target sites: the full URL including query string, just the origin part (“https://www.example.com”), or perhaps nothing at all.
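As a rough illustration of what "setting the header" looks like on the source site, here is a tiny WSGI app using only the Python standard library. The app itself, its page content, and the choice of the origin policy value are assumptions for the example; your real site would set the same header in whatever server or CMS it already runs on.

```python
def app(environ, start_response):
    """Hypothetical source page that asks browsers to reveal only its origin."""
    body = (
        b"<!doctype html><title>Source page</title>"
        b"<a href='https://some.other.example/'>Go to the other site</a>"
    )
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # The policy applies to requests made from this page.
        # The <meta> equivalent would be:
        #   <meta name="referrer" content="origin">
        ("Referrer-Policy", "origin"),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, you could pass app to wsgiref.simple_server.make_server and click the link with your browser's network panel open.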
This feature has been supported in some form in all browsers released since 2012, which is impressive. Getting your site set up to support IE 11, original Edge 14-18, and Safari 11 is trickier than just focusing on later browsers, though, so in practice it’s more for Chromium Edge/Chrome/Firefox/Safari 12+.
However, prior to late 2020, you still needed to deliberately enable the header or <meta> tag if you didn’t want to reveal the full source URL. If you didn’t do anything, the default behavior would be up to the browser, and the browsers mostly sent the full URL except on an HTTPS-to-HTTP downgrade (the no-referrer-when-downgrade option), just like the old days.
(The first somewhat-harder-core exception was Safari, which starting from 13.3 forcibly sends the simple origin “https://www.example.com” to cross-origin sub-resources (CSS, JS, images, IFRAMEs) even when the site wants to send the full URL. But that didn’t affect marketing attribution efforts, which only deal with the journey between main documents.)
But with Chrome 85, the default changed quite drastically. And since a plurality, if not clear majority, of your visitors are surely using Chrome, you’re gonna notice.
Chrome’s default since version 85 is strict-origin-when-cross-origin. This means that when the two sites are on different origins (crucial note: https://www.example.com and https://pages.example.com do not have the same origin!), the target site will only see the Referrer “https://www.example.com” unless the source site is specifically configured to reveal more of its own visitor data.
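Here is a simplified Python model of that default, in case it helps to see the decision spelled out. The function names are hypothetical and the logic is a sketch of the policy as described above, not Chrome's actual code. One detail worth knowing: browsers serialize the origin-only value with a trailing slash.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """An origin is the (scheme, host, port) triple; subdomains differ."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def strict_origin_when_cross_origin(
    source_url: str, target_url: str
) -> Optional[str]:
    """Simplified model of the policy (hypothetical helper)."""
    source = urlsplit(source_url)
    target = urlsplit(target_url)

    # Same origin: the full URL (minus #fragment) is still sent.
    if origin_of(source_url) == origin_of(target_url):
        return urlunsplit(
            (source.scheme, source.netloc, source.path, source.query, "")
        )

    # The "strict" part: nothing is sent on an HTTPS -> non-HTTPS downgrade.
    if source.scheme == "https" and target.scheme != "https":
        return None

    # Cross-origin: only the origin survives, so path and query are gone.
    return f"{source.scheme}://{source.netloc}/"
```

For example, a click from https://www.example.com/mypage.html?utm_source=partner to a page on https://pages.example.com yields just the www.example.com origin, because the hostnames differ.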
The reason this all matters is you may have a Hidden field that Autofills from a Referrer Parameter:
In Marketo-speak, “Referrer Parameter” means a query parameter in the Referrer URL. That is, the document.referrer value is parsed and its name=value&name2=value2 params are made available.
But that only works when:
With Chrome’s newer default behavior, you will only see the origin of the source URL, which by definition never has a query string regardless of whether the previous pageview had a query string!
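A small Python sketch shows why the lookup comes up empty. This is not Marketo's implementation, just a model of "parse the referrer and read one query parameter"; the function name referrer_parameter and the sample URLs are assumptions for illustration.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def referrer_parameter(referrer: str, name: str) -> Optional[str]:
    """Return the first value of a query parameter in the referrer URL.

    Hypothetical helper: returns None when the referrer is empty, has no
    query string, or does not contain the requested parameter.
    """
    query = urlsplit(referrer).query
    values = parse_qs(query).get(name)
    return values[0] if values else None


# Full referrer, as browsers mostly sent it before the default changed.
full = "https://www.example.com/mypage.html?utm_source=partner&utm_medium=cpc"

# Origin-only referrer, as seen cross-origin under the newer default.
origin_only = "https://www.example.com/"
```

With these inputs, referrer_parameter(full, "utm_source") finds the partner value, while the same call on origin_only returns None, so the Hidden field has nothing to Autofill from.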
So a setup where an ad partner drives UTM-tagged traffic to their own site, which then links (without UTM tags) to your site, will not work unless you also coordinate with the partner to use the Referrer-Policy feature.
When you can’t “replay” an earlier touch across origins using an explicit Referrer-Policy, the data must be passed to your site either: