uBlock Origin for Firefox has a new feature to combat this

CodeDragon57 · April 11, 2021, 9:56am

CodeDragon57 · April 11, 2021, 9:59am

@wendell what do you think about this?

CodeDragon57 · April 11, 2021, 4:35pm

It’s in the title, but I did leave out the Firefox+uBlock. ’Cause who’s gonna use anything other than Firefox?

CodeDragon57 · April 11, 2021, 4:37pm

At the same time, this is mostly for people who don’t use uBlock Origin (on Firefox), and thus do not know about this.

max1220 · April 11, 2021, 5:12pm

Just wait: before long, instead of adding cross-site tracking at the DNS level, it will be built directly into the server as a reverse proxy, possibly for some important assets, to make filtering more difficult…

CodeDragon57 · April 11, 2021, 9:42pm

I have no idea what you just said…

bedHedd · April 11, 2021, 10:39pm

that’s the tl;dr for the article.

CNAME DNS tracking is the exploit; it was discovered in 2007/2010. It’s important for people who run uBlock on Firefox or on Chrome (which doesn’t have an API that uBlock can use to protect users from this exploit).

How it’s implemented

CNAME cloaking involves having a web publisher put a subdomain – e.g. trackyou.example.com – under the control of a third-party through the use of a CNAME DNS record. This makes a third-party tracker associated with the subdomain look like it belongs to the first-party domain, example.com.
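To make the mechanism concrete, here is a minimal Python sketch of how a blocker could "uncloak" such a subdomain: follow the CNAME chain and check whether any target falls under a known tracker domain. This is not uBlock Origin's actual code. The record table, the `tracker.invalid` names, and all function names are made up for illustration, and the DNS data is hard-coded rather than resolved over the network.

```python
# Hypothetical DNS data (hostname -> CNAME target); not real records.
CNAME_RECORDS = {
    "trackyou.example.com": "example.tracker-cdn.invalid",
    "example.tracker-cdn.invalid": "edge.tracker.invalid",
    "www.example.com": "example.com",
}

# Hypothetical blocklist of tracker base domains.
TRACKER_DOMAINS = {"tracker.invalid"}


def resolve_cname_chain(host, records, max_depth=10):
    """Follow CNAME records starting at host; return every name visited."""
    chain = [host]
    # max_depth guards against CNAME loops in the record table.
    while chain[-1] in records and len(chain) <= max_depth:
        chain.append(records[chain[-1]])
    return chain


def is_under(host, domain):
    """True if host is domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_cloaked_tracker(host, records, trackers):
    """True if any CNAME target behind host belongs to a listed tracker."""
    chain = resolve_cname_chain(host, records)
    return any(is_under(name, t) for name in chain[1:] for t in trackers)
```

With the sample data, `is_cloaked_tracker("trackyou.example.com", CNAME_RECORDS, TRACKER_DOMAINS)` is true even though the hostname itself looks first-party, while `www.example.com` is not flagged. The catch is that the blocker has to be able to see the CNAME chain in the first place, which is the API the post says Chrome lacks.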

Furthermore, this type of exploit is becoming more common, as many websites use a subdomain CNAME DNS record to make a third party look like it belongs to the website.

The boffins from Belgium studied the CNAME-based tracking ecosystem and found 13 different companies using the technique. They claim that the usage of such trackers is growing, up 21 per cent over the past 22 months, and that CNAME trackers can be found on almost 10 per cent of the top 10,000 websites.

Lots of sites use such scripts and are leaky.

What’s more, sites with CNAME trackers have an average of about 28 other tracking scripts. They also leak data due to the way web architecture works. The researchers found cookie data leaks on 7,377 sites (95%) out of the 7,797 sites that used CNAME tracking.

CodeDragon57 · April 12, 2021, 1:38am

I wasn’t talking about the article. I understood that perfectly. I was talking about not understanding max.

bedHedd · April 12, 2021, 1:40am

Oh yeah, I quoted you for other people who come along this thread.

The other stuff I wrote was adding to your tl;dr

CodeDragon57 · April 12, 2021, 1:54am

Doesn’t that defeat the purpose of a tl;dr?

bedHedd · April 12, 2021, 1:59am

Not really. It elaborates on your point. If people are interested, they can read a bit more either at my post or the article.

rcxb · April 12, 2021, 5:38am

That makes it too easy for site owners to forge traffic in a way that’s nearly impossible to detect by the advertisers, who are paying them based on the amount of traffic they send.

judahnator · April 12, 2021, 2:59pm

I keep hearing folks say this is an exploit. Even the “Security Now” podcast was saying it.

I am not so sure about that myself. This definitely smells like a “this is by design but we don’t like the design so we call it an exploit because it’s easier to put the responsibility on the anonymous bad actors” - sort of thing.

One solution would be to not assume that subdomains of a site should have the same cookie access rights as the parent domain unless both the main and subdomain share the same security certificate. If the certificates are different, assume different actors behind the scenes.

Of course this is defeated by the web host serving the tracking script as a first-party, but such is the cat-and-mouse game.

max1220 · April 12, 2021, 3:52pm

I meant the following setup:

I connect to https://whatever_website/

The website’s host has set up some resources on that website (could be anything, but ideally something that is loaded every time (not cached) and is required for the website to work).

When a request to that resource is made, the webserver “reverse-proxies” that connection to another webserver, owned by the advertisement/tracking company.
They can either answer themselves, or forward it back to the host server (double reverse proxy!).

They effectively circumvent all possible versions of same-origin policy, as well as cookie denial based on network parameters (as adblockers do). The client has no way of telling whether the resource comes from one server or another: not from the origin (that’s just the regular server from the user’s perspective), and not from the content (ideally it’s normal content, unrelated to tracking).
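A rough Python sketch of why host-based filtering can't see through this setup. Everything here is hypothetical: the routing table, the `.invalid` hostnames, and the function names are invented for illustration, and no real proxying is done. The point is only that the routing decision lives on the server, while the blocker only ever sees the URL the page asked for.

```python
from urllib.parse import urlsplit

# Hypothetical server-side routing table: request path -> real upstream.
# The client never sees this table.
PROXY_ROUTES = {
    "/assets/app.js": "https://collector.tracker.invalid/app.js",
}

# Hypothetical client-side blocklist of tracker hostnames.
BLOCKED_HOSTS = {"collector.tracker.invalid"}


def upstream_for(path):
    """Server side: where the request is really answered (None = served locally)."""
    return PROXY_ROUTES.get(path)


def host_blocker_allows(url):
    """Client side: a host-based filter only sees the URL it was asked to fetch."""
    return urlsplit(url).hostname not in BLOCKED_HOSTS


# What the browser requests vs. where the answer really comes from.
requested = "https://site.invalid/assets/app.js"
real_upstream = upstream_for(urlsplit(requested).path)
```

Here `host_blocker_allows(requested)` is true, because the hostname is the site's own, while `host_blocker_allows(real_upstream)` would be false if the client could ever see that URL. It can't, so the filter has nothing to match on.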

@rcxb I agree that this hypothetical scenario is horrible: They have to believe the proxy server’s headers to have the original IP and request data in them. But look at what they get in return: R/W access to all cookie data, server headers, running JS in your client etc; basically everything they usually use to “validate” (admittedly weakly) “real” users, except having the specific client-IP connect to them directly.

I’d like to mention that IP spoofing is real, and that this also can be circumvented by anyone with a botnet. Fundamentally, there is no way to differentiate a real-person user from a fake “automated” one - you can just make it difficult.

It also costs the hoster of the website more traffic, causes extra request latency, and is horrible for security. Sounds just like current tracking, except harder to detect.

Maybe it won’t be this exact scenario (the website needs to trust data from the tracker, and the tracker needs to trust data from the website owner; this is less than ideal).
But there is no hope at all for tracking-blockers long-term: as long as companies want to embed trackers, nature finds a way.

CodeDragon57 · April 15, 2021, 6:04pm

I’d be almost inclined to agree with this, but wouldn’t it be a huge task to index every single website’s subdomain(s) out there and figure out which ones have tracking in them and which ones do not? It seems like it’s already a huge undertaking as it is, even if bots/crawlers are involved.

CodeDragon57 · April 15, 2021, 6:06pm

P.S. I don’t know much about website crawlers or JS (i.e. the DOM). The extent of my JS knowledge can be contained in one class that I did in high school, and that class wasn’t meant to teach WebDev, just programming basics.

judahnator · April 15, 2021, 6:31pm

I don’t think indexing even needs to be a thing: just namespace the cookies by a key derived from the security certificate of the connection. Say an MD5 or SHA1 hash, for example. This would have the unfortunate side effect of invalidating cookies whenever you renewed your SSL certificate, but asking clients to just log back in doesn’t seem like a huge hassle.

Say you had www.example.com and example.com and they shared a certificate, let’s call it “foo”. Cookies could be namespaced under the TLD like normal, but also add the MD5 hash of the cert.

example_com/acbd1…/cookie_1
example_com/acbd1…/cookie_2

Since both have the same TLD and certificate hash, they can both see both of the cookies because they are looking in the “example_com/acbd1…” namespace.

Say instead you had example.com and 3rdparty.example.com, where the former used the “foo” cert and the latter used the “bar” cert. You would now have:

example_com/acbd1…/cookie_1
example_com/37b51…/cookie_2

Suddenly, even though they use the same TLD, the 3rd party can no longer see the cookies for the first party, because their cookie namespaces are different.
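The namespacing idea above could look roughly like this in Python. This is only a sketch of the proposal, not how any browser stores cookies: the `NamespacedCookieJar` class and its methods are hypothetical, and the byte strings `b"foo"` and `b"bar"` stand in for real certificate data. MD5 is used only because it is the hash named in the example.

```python
import hashlib
from collections import defaultdict


def cert_key(cert_bytes):
    """Namespace key derived from the certificate (MD5, as in the example)."""
    return hashlib.md5(cert_bytes).hexdigest()


class NamespacedCookieJar:
    """Cookies are stored per (site, certificate hash) instead of per site."""

    def __init__(self):
        self._store = defaultdict(dict)

    def _namespace(self, site, cert_bytes):
        return (site, cert_key(cert_bytes))

    def set(self, site, cert_bytes, name, value):
        self._store[self._namespace(site, cert_bytes)][name] = value

    def visible(self, site, cert_bytes):
        """Return only the cookies in the caller's own namespace."""
        return dict(self._store.get(self._namespace(site, cert_bytes), {}))


jar = NamespacedCookieJar()
# example.com and www.example.com share the "foo" cert, so one namespace.
jar.set("example.com", b"foo", "cookie_1", "a")
jar.set("example.com", b"foo", "cookie_2", "b")
# 3rdparty.example.com presents the "bar" cert, so a separate namespace.
jar.set("example.com", b"bar", "cookie_3", "c")
```

A connection presenting the “foo” cert sees `cookie_1` and `cookie_2`; one presenting “bar” sees only `cookie_3`. As noted, a renewed certificate would produce a new key and orphan the old cookies.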

anon7678104 · April 15, 2021, 7:23pm

yeah i thought google and amazon saying “sure, use our dns, look, they are faster, more secure and your isp won’t care…” was too good to be true.

CodeDragon57 · April 15, 2021, 7:24pm

I use https://nextdns.io for personal and https://cloudns.net for my Linode. (I used to use Cloudflare, but it had problems with Nextcloud).

E-waste · November 20, 2022, 4:51pm

I love NextDNS. Over 3/4 connections blocked. Of the ones that do get through, I don’t go much past 1500 per month.