Researchers at Princeton have revealed two marketing firms are using the exploit.
You know that handy autofill feature that saves you time when you’re logging in somewhere online? Maybe think twice before using it – at least until the browsers can rectify an exploit that is allowing marketing firms to steal your email address.
The exploit takes advantage of a browser’s built-in login manager. And before you think this doesn’t affect you, all the most popular internet browsers have this feature: Google Chrome, Mozilla Firefox, Apple Safari and Microsoft Edge. They all have a feature that saves your email address and password for sites you regularly log in to.
Unfortunately, this is the internet. And in yet another example of “this is why we can’t have nice things,” a pair of marketing firms have figured out a way to abuse this feature by placing invisible forms that run secretly on over 1,100 websites.
New Uses for Old Vulnerabilities
The Princeton researchers who published their findings on Wednesday were quick to point out that this isn’t a new vulnerability; rather, it is a new way to use an old one.
The underlying vulnerability of login managers to credential theft has been known for years. Much of the past discussion has focused on password exfiltration by malicious scripts through cross-site scripting (XSS) attacks. Fortunately, we haven’t found password theft on the 50,000 sites that we analyzed. Instead, we found tracking scripts embedded by the first party abusing the same technique to extract email addresses for building tracking identifiers.
Here, courtesy of Freedom to Tinker, is how the exploit works:
Basically, it begins when you visit a site in earnest and log in. I personally use Chrome, and when I do this a message pops up asking whether I would like to save the password. From then on, whenever I go back to the site, I can fill in the email and password fields simply by clicking on them and selecting the correct credentials.
Where things get sneaky is that these third-party marketing firms inject invisible login forms on other pages. This tricks your login manager into filling in your email address and password. The scripts then hash the harvested email addresses and send them back to the third-party servers.
In this case, the two culprits were AdThink and OnAudience. You can check out the list of sites with hidden login scripts here.
So, what’s the value in storing hashed email addresses?
Email addresses are unique and persistent, and thus the hash of an email address is an excellent tracking identifier. A user’s email address will almost never change — clearing cookies, using private browsing mode, or switching devices won’t prevent tracking. The hash of an email address can be used to connect the pieces of an online profile scattered across different browsers, devices, and mobile apps. It can also serve as a link between browsing history profiles before and after cookie clears.
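To make the point above concrete, here is a minimal Python sketch of how a hashed email address behaves as a stable identifier. The helper name `email_tracking_id`, the use of SHA-256, and the lowercase/trim normalization are all assumptions for illustration; the article does not say which hashing algorithm or normalization the two scripts actually use.

```python
import hashlib


def email_tracking_id(email: str) -> str:
    """Hypothetical helper: turn an email address into a stable hex identifier.

    Assumption: the address is trimmed and lowercased before hashing,
    and SHA-256 stands in for whatever hash a real script might use.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# The same address typed on two devices yields the same identifier,
# regardless of cookies or private browsing mode.
laptop_id = email_tracking_id("Alice@example.com")
phone_id = email_tracking_id(" alice@example.com ")
print(laptop_id == phone_id)  # True
```

Because the digest depends only on the address itself, anyone who later obtains the same address from another source can recompute the hash and match it, which is why a hashed email is better described as pseudonymous than anonymous.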
UPDATE: A representative from OnAudience has contacted us with this statement:
“As a Big Data company, we do our best not only to collect sufficient amount of data about internet users but also to protect their privacy and security. As it is clearly visible in our scripts we are not gathering e-mail addresses or passwords. In fact we collect anonymous e-mail shortcuts generated by well-known and widely used hashing algorithm. This method is commonly used in modern marketing automation platforms and is supported by the leading ad technology providers. We used them for the sole purpose of e-mail retargeting using double opt-in mailing lists on behalf of our customers. In this case the script was gathering data for our legacy platform BehavioralEngine.
“Our DMP OnAudience.com is a completely different technology and uses other methods to gather information. Moreover, there is no exchange of data between BehavioralEngine and OnAudience. All data gathered by our DMP is automatically anonymised and processed in real time by its machine learning algorithms to ensure the highest precision in ad targeting and other marketing activities carried out for our clients. Digital information available in our data warehouse is never combined with any data, that would allow crackers to identify people online. Since we started our activity there has never been any incident of that sort although we process over 9 billion anonymous profiles of Internet users from around the globe.”
Don’t Expect Much to Change
So, here’s the million-dollar question: how does this get fixed? And the answer is, it probably doesn’t. It’s certainly not illegal for the two marketing firms to play this game, but I would definitely call it unethical. Any time you have to hide something, that usually means you’re not doing right by someone. You may roll your eyes at the idea that it’s just an email address, but this is the profile that AdThink is building on you:
Email addresses are just the tip of the iceberg.
As for how it gets fixed, you would have to start by convincing the browsers that this is actually a vulnerability. They don’t necessarily see it that way. That’s because web security rests on the Same Origin Policy, which is really just a nice way of saying that we won’t be using common sense in this discussion.
In this model, scripts and content from different origins (roughly, domains or websites) are treated as mutually untrusting, and the browser protects them from interfering with each other. However, if a publisher directly embeds a third-party script, rather than isolating it in an iframe, the script is treated as coming from the publisher’s origin. Thus, the publisher (and its users) entirely lose the protections of the same origin policy, and there is nothing preventing the script from exfiltrating sensitive information. Sadly, direct embedding is common — and, in fact, the default…
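The rule described in that passage can be sketched as a toy model in Python. This is a simplification for illustration only, not how any browser is implemented: an origin is treated as a (scheme, host, port) triple, and the function names and example URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports, so that https://site and https://site:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs share an origin only if scheme, host and port all match."""
    return origin(url_a) == origin(url_b)


def execution_origin(publisher_page: str, third_party_url: str,
                     isolated_in_iframe: bool) -> tuple:
    """Toy model: which origin does third-party code run under?

    Directly embedded scripts inherit the publisher page's origin;
    only content loaded as a separate iframe document keeps its own.
    """
    return origin(third_party_url if isolated_in_iframe else publisher_page)


page = "https://news.example/login"         # hypothetical publisher page
tracker = "https://tracker.example/t.js"    # hypothetical third-party URL

print(same_origin(page, tracker))               # False
print(execution_origin(page, tracker, False))   # ('https', 'news.example', 443)
print(execution_origin(page, tracker, True))    # ('https', 'tracker.example', 443)
```

In the directly embedded case the tracker's code is indistinguishable from the publisher's own as far as origin checks go, which is why it can read form fields that the login manager has filled in.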
Basically, the browsers argue that everything is working as intended. But more telling is the argument over who is at fault. Browsers like to push the blame onto publishers for embedding third-party scripts. That being said, the ways to fix this at the browser level are limited, and any fix could end up breaking something even bigger.
There is a whole range of calculations that go with building a top browser and while we might see a glaring problem, we don’t see all the interlocking pieces that could be affected by a change.
So, what can you do?
I’ll let the guys at Princeton handle this one:
Users can install ad blockers or tracking protection extensions to prevent tracking by invasive third-party scripts. The domains used to serve the two scripts (behavioralengine.com and audienceinsights.net) are blocked by the EasyPrivacy blocklist.
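As a rough illustration of what domain-based blocking amounts to, the sketch below checks whether a script URL is served from one of the two domains named above or from one of their subdomains. It is a simplified stand-in: real blocklists such as EasyPrivacy use a richer filter syntax, and the `is_blocked` helper and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit

# The two serving domains named in the researchers' advice.
BLOCKED_DOMAINS = {"behavioralengine.com", "audienceinsights.net"}


def is_blocked(script_url: str) -> bool:
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(script_url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in BLOCKED_DOMAINS)


# Hypothetical URLs, for illustration only.
print(is_blocked("https://static.audienceinsights.net/t.js"))  # True
print(is_blocked("https://notbehavioralengine.com/lib.js"))    # False
```

Matching on the exact host or a dot-prefixed suffix avoids accidentally blocking unrelated domains that merely end with the same characters.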