zlacker

[return to "Show HN: I made a privacy-first minimalist Google Analytics"]
1. Adriaa+8[view] [source] 2018-09-19 14:13:28
>>Adriaa+(OP)
Creator here. As a developer, I install analytics for clients, but I never feel comfortable installing Google Analytics because Google creates profiles for their visitors, and uses their information for apps (like AdWords). As we all know, big corporations unnecessarily track users without their consent. I want to change that.

So I built Simple Analytics. To ensure that it's fast, secure, and stable, I built it entirely using languages that I'm very familiar with. The backend is plain Node.js without any framework, the database is PostgreSQL, and the frontend is written in plain JavaScript.

I learned a lot while coding, like sending requests as JSON requires an extra (pre-flight) request, so in my script I use the "text/plain" content type, which does not require an extra request. The script is publicly available (https://github.com/simpleanalytics/cdn.simpleanalytics.io/bl...). It works out of the box with modern frontend frameworks by overwriting the "history.pushState"-function.

I am transparent about what I collect (https://simpleanalytics.io/what-we-collect) so please let me know if you have any questions. My analytics tool is just the start for what I want to achieve in the non-tracking movement.

We can be more valuable without exploiting user data.

◧◩
2. pdkl95+Yf[view] [source] 2018-09-19 16:13:14
>>Adriaa+8
> unnecessarily track users without their consent

Regardless of your intentions, you are collecting enough data to track users.

> I am transparent about what I collect ([URL])

That page doesn't mention that you are also collecting (and make no claim about storing) the globally-visible IP address (and any other data in the IP and TCP headers). This can be uniquely identifying; even when it isn't unique you usually only need a few bits of additional entropy to reconstruct[1] a unique tracking ID.

In my case, you're collecting and storing more than enough additional entropy to make a decent fingerprint because [window.innerWidth, window.innerHeight] == [847, 836]. Even if I resized the window, you could follow those changes simply by watching analytics events from the same IP that are temporally nearby (you are collecting and storing timestamps).

[1] An older comment where I discussed how this could be done (and why GA's supposed "anonymization" feature (aip=1) is a blatant lie): https://news.ycombinator.com/item?id=17170468

◧◩◪
3. harian+Nj[view] [source] 2018-09-19 16:41:36
>>pdkl95+Yf
Good comment! I only store the window.innerWidth metric. I updated the what we collect page (https://simpleanalytics.io/what-we-collect) to reflect the IP handling. We don't store them. And fingerprinting is something that would be definitely tracking, not on my watch!
◧◩◪◨
4. pdkl95+oV[view] [source] 2018-09-19 21:22:39
>>harian+Nj
> We don't collect and store IPs.

First, "IPs" might be confusing; "IP addresses" would be more accurate.

More importantly, you have to collect IP addresses (or any other value in the packet headers[1][2]) - even if you don't store it - if you want to receive any packets from the rest of the internet. Storage of those values is separate issue entirely, and it's good to hear that you are intending to NOT store IP addresses (and updating the documenting)!

Also, I strongly recommend using Drdrdrq's suggestion to lower the precision of the collected window dimensions, which should be done on the client i.e. "Math.floor(window.innerWidth/50)*50". This kind of bit-reduction makes fingerprinting a lot harder.

[1] https://en.wikipedia.org/wiki/IPv4#Header

[2] https://en.wikipedia.org/wiki/Transmission_Control_Protocol#...

◧◩◪◨⬒
5. Bjartr+UY[view] [source] 2018-09-19 22:08:19
>>pdkl95+oV
I would argue that in the conversational context "collect" is more a synonym for "store" than for "receive" or "see". Moreso in the context of a tracking system. In my opinion anyway.
[go to top]