@wendell what do you think about this?
It’s in the title, but I did leave out the Firefox+uBlock part. ’Cause who’s gonna use anything other than Firefox?
At the same time, this is mostly for people who don’t use uBlock Origin (on Firefox), and thus do not know about this.
Just wait: before long, instead of adding cross-site tracking at the DNS level, it will be inserted directly into the server as a reverse proxy, possibly for some important assets, to make filtering more difficult…
I have no idea what you just said…
that’s the tl;dr for the article.
CNAME DNS tracking is the exploit; it was discovered in 2007/2010. It’s important for people who run uBlock Origin on Firefox or on Chrome (Chrome doesn’t have an API that uBlock can use to protect users from this exploit).
How it’s implemented
CNAME cloaking involves having a web publisher put a subdomain – e.g. trackyou.example.com – under the control of a third-party through the use of a CNAME DNS record. This makes a third-party tracker associated with the subdomain look like it belongs to the first-party domain, example.com.
Furthermore, this type of exploit is becoming more common, as many websites use a CNAME DNS record on a subdomain to make a third party look like it belongs to the website.
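For anyone who wants to see the idea in code, here is a rough Python sketch of the check a blocker would have to do once it can see the CNAME chain. It is only an illustration under assumptions: the tracker suffix `tracker.example.net`, the function names, and the "last two labels" shortcut for the site's domain are all made up for this example, and the CNAME chain is passed in by hand instead of being resolved.

```python
# Hypothetical tracker suffixes, for illustration only (not a real blocklist).
TRACKER_SUFFIXES = ("tracker.example.net",)


def registrable_domain(host: str) -> str:
    """Naive shortcut: keep the last two labels.

    Real code should use the Public Suffix List (think example.co.uk).
    """
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def is_cname_cloaked(page_host, request_host, cname_chain,
                     tracker_suffixes=TRACKER_SUFFIXES):
    """True when a first-party-looking host points, via CNAME, at a known tracker."""
    # If the request is openly third-party, it is not "cloaked";
    # an ordinary hostname filter can already deal with it.
    if registrable_domain(page_host) != registrable_domain(request_host):
        return False
    # Walk the CNAME targets and compare each one against the tracker suffixes.
    for target in cname_chain:
        name = target.lower().rstrip(".")
        if any(name == s or name.endswith("." + s) for s in tracker_suffixes):
            return True
    return False


# trackyou.example.com looks first-party but is an alias for the tracker.
print(is_cname_cloaked("www.example.com", "trackyou.example.com",
                       ["abc.tracker.example.net."]))
# An ordinary first-party subdomain with no CNAME to a tracker.
print(is_cname_cloaked("www.example.com", "static.example.com", []))
```

The catch is the `cname_chain` argument: the extension can only fill it in if the browser lets it do DNS lookups, which is exactly the API difference mentioned above.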
The boffins from Belgium studied the CNAME-based tracking ecosystem and found 13 different companies using the technique. They claim that the usage of such trackers is growing, up 21 per cent over the past 22 months, and that CNAME trackers can be found on almost 10 per cent of the top 10,000 websites.
Lots of sites use such scripts and are leaky.
What’s more, sites with CNAME trackers have an average of about 28 other tracking scripts. They also leak data due to the way web architecture works. The researchers found cookie data leaks on 7,377 sites (95%) out of the 7,797 sites that used CNAME tracking.
I wasn’t talking about the article. I understood that perfectly. I was talking about not understanding max.
Oh yeah, I quoted you for other people who come along this thread.
The other stuff I wrote was adding to your tl;dr
Doesn’t that defeat the purpose of a tl;dr?
Not really. It elaborates on your point. If people are interested, they can read a bit more either at my post or the article.
That makes it too easy for site owners to forge traffic in a way that’s nearly impossible to detect by the advertisers, who are paying them based on the amount of traffic they send.
I keep hearing folks say this is an exploit. Even the “Security Now” podcast was saying it.
I am not so sure about that myself. This definitely smells like a “this is by design but we don’t like the design so we call it an exploit because it’s easier to put the responsibility on the anonymous bad actors” - sort of thing.
One solution would be to not assume that subdomains of a site should have the same cookie access rights as the parent domain unless both the main and subdomain share the same security certificate. If the certificates are different, assume different actors behind the scenes.
Of course this is defeated by the web host serving the tracking script as a first-party, but such is the cat-and-mouse game.
I meant the following setup:
I connect to https://whatever_website/
Website hoster has set up some resources on that website (could be anything, but ideally something that is loaded every time (not cached) and is required for the website to work).
When a request to that resource is made, the webserver “reverse-proxies” that connection to another webserver, owned by the advertisement/tracking company.
They can either answer themselves, or forward it back to the host server (double reverse proxy!).
They effectively circumvent all possible versions of same-origin policy, and of denying cookies based on network parameters (as adblockers do): the client has no way of telling whether the resource is from one server or another. Not from the origin (that’s just the regular server from the user’s perspective), and not from the content (ideally it’s normal content, unrelated to tracking).
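To show why the setup above is so hard to filter, here is a minimal Python sketch of a hostname-based block decision, roughly the kind of network-level check an adblocker relies on. The blocked host `tracker.example.net`, the URLs, and the function name are hypothetical and only there for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entry, for illustration only.
BLOCKED_HOSTS = {"tracker.example.net"}


def blocked_by_host(url, blocked_hosts=BLOCKED_HOSTS):
    """Block a request if its hostname is a listed host or a subdomain of one."""
    host = urlsplit(url).hostname or ""
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


# Classic third-party tracker: the hostname gives it away.
print(blocked_by_host("https://tracker.example.net/pixel.gif"))
# CNAME-cloaked subdomain: passes unless the blocker also resolves the CNAME.
print(blocked_by_host("https://trackyou.example.com/pixel.gif"))
# Reverse-proxied asset: same host as the page, and no DNS record to inspect.
print(blocked_by_host("https://www.example.com/assets/app.js"))
```

In the third case the forwarding happens entirely behind the webserver, so the only thing the client ever sees is the first-party hostname.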
@rcxb I agree that this hypothetical scenario is horrible: They have to believe the proxy server’s headers to have the original IP and request data in them. But look at what they get in return: R/W access to all cookie data, server headers, running JS in your client etc; basically everything they usually use to “validate” (admittedly weakly) “real” users, except having the specific client-IP connect to them directly.
I’d like to mention that IP spoofing is real, and that this also can be circumvented by anyone with a botnet. Fundamentally, there is no way to differentiate a real-person user from a fake “automated” one - you can just make it difficult.
It also costs the hoster of the website more traffic, causes extra request latency, and is horrible for security. Sounds just like current tracking, except harder to detect.
Maybe it won’t be this exact scenario (the website needs to trust data from the tracker, and the tracker needs to trust data from the website owner; this is less than ideal).
But there is no hope at all for tracking-blockers long-term: As long as companies want to embed them, nature finds a way.
I’d be almost inclined to agree with this, but wouldn’t it be a huge task to index every single website’s subdomain(s) out there and figure out which ones have tracking in them and which ones do not? It seems like it’s already a huge undertaking as it is, even if bots/crawlers are involved.
P.S. I don’t know much about website crawlers or JS (i.e. the DOM). The extent of my JS knowledge can be contained in one class that I did in high school. Though, it wasn’t to teach WebDev, but it was to teach programming basics.
I don’t think indexing even needs to be a thing; just namespace the cookies by a key derived from the security certificate of the connection. Say an MD5 or SHA1 hash, for example. This would have the unfortunate side effect of invalidating cookies whenever you renewed your SSL certificate, but asking clients to just log back in doesn’t seem like a huge hassle.
Say you had a first-party subdomain and example.com, and they shared a certificate, let’s call it “foo”. Cookies could be namespaced under the TLD like normal, but also add the MD5 hash of the cert.
Since both have the same TLD and certificate hash, they can both see both of the cookies because they are looking in the “example_com/abcd1…” namespace.
Say instead you had the site’s own host and 3rdparty.example.com, where the former used the “foo” cert and the latter used the “bar” cert. You would now have two separate namespaces, one per certificate hash.
Suddenly even though they use the same TLD the 3rd party can no longer see the cookies for the first party, because their cookie namespaces are different.
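Here is a minimal Python sketch of that namespacing idea, just to make it concrete. Assumptions: the `www.example.com` host and the function name are made up, the byte strings `b"foo"` and `b"bar"` stand in for real certificate bytes, the domain part is the naive "last two labels", and I used SHA-256 rather than MD5/SHA1 since any stable hash works for the illustration.

```python
import hashlib


def cookie_namespace(host: str, cert_bytes: bytes) -> str:
    """Build a cookie namespace from the site's domain plus a certificate hash."""
    # Naive domain part: last two labels, dots replaced ("example_com").
    # A real implementation would need the Public Suffix List.
    domain = "_".join(host.lower().rstrip(".").split(".")[-2:])
    # Short prefix of the certificate hash, just for readability here.
    cert_hash = hashlib.sha256(cert_bytes).hexdigest()[:5]
    return f"{domain}/{cert_hash}"


# Placeholder "certificates"; real code would hash the DER-encoded cert.
FOO_CERT = b"foo"
BAR_CERT = b"bar"

first_party = cookie_namespace("example.com", FOO_CERT)
same_cert = cookie_namespace("www.example.com", FOO_CERT)
third_party = cookie_namespace("3rdparty.example.com", BAR_CERT)

print(first_party == same_cert)    # shared cert: same namespace, cookies visible
print(first_party == third_party)  # different cert: separate namespaces
```

A browser doing this would look cookies up by the namespace string instead of by domain alone, so a subdomain with a different certificate simply never finds the first party’s cookies.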
yeah i thought google and amazon saying “sure, use our dns, look, they are faster, more secure and your isp won’t care…” was too good to be true.
I love NextDNS. Over 3/4 of connections blocked. Of the ones that do get through, I don’t go much past 1500 per month.