Show HN: I made a privacy-first minimalist Google Analytics

>>Adriaa+(OP)
Creator here. As a developer, I install analytics for clients, but I never feel comfortable installing Google Analytics because Google creates profiles for their visitors, and uses their information for apps (like AdWords). As we all know, big corporations unnecessarily track users without their consent. I want to change that.

So I built Simple Analytics. To ensure that it's fast, secure, and stable, I built it entirely using languages that I'm very familiar with. The backend is plain Node.js without any framework, the database is PostgreSQL, and the frontend is written in plain JavaScript.

I learned a lot while coding, like sending requests as JSON requires an extra (pre-flight) request, so in my script I use the "text/plain" content type, which does not require an extra request. The script is publicly available (https://github.com/simpleanalytics/cdn.simpleanalytics.io/bl...). It works out of the box with modern frontend frameworks by overwriting the "history.pushState"-function.

I am transparent about what I collect (https://simpleanalytics.io/what-we-collect) so please let me know if you have any questions. My analytics tool is just the start for what I want to achieve in the non-tracking movement.

We can be more valuable without exploiting user data.

>>Adriaa+8
> unnecessarily track users without their consent

Regardless of your intentions, you are collecting enough data to track users.

> I am transparent about what I collect ([URL])

That page doesn't mention that you are also collecting (and make no claim about storing) the globally-visible IP address (and any other data in the IP and TCP headers). This can be uniquely identifying; even when it isn't unique you usually only need a few bits of additional entropy to reconstruct[1] a unique tracking ID.

In my case, you're collecting and storing more than enough additional entropy to make a decent fingerprint because [window.innerWidth, window.innerHeight] == [847, 836]. Even if I resized the window, you could follow those changes simply by watching analytics events from the same IP that are temporally nearby (you are collecting and storing timestamps).

[1] An older comment where I discussed how this could be done (and why GA's supposed "anonymization" feature (aip=1) is a blatant lie): https://news.ycombinator.com/item?id=17170468

>>pdkl95+Yf
I think there's value in at least distributing the data that's collected. I may not like that the analytics provider has my data, but it seems like a lesser evil if that provider isn't also the world's largest ad company and they aren't using it to build profiles behind the scenes to track my every move across a significant part of the Internet.

Given the choice between a lot of data about me given to a small provider and somewhat less data about me given to Google, I'd generally choose the former.

>>Lyndsy+kk
Thats no a good way to make a decision. Big,small doesn't matter. What matters is who is providing better security? When 2 parties big,small are collecting data ,then the party which can act on security vulnerabilities quickly and has great security engineers and dedicated teams like Project Zero- is the much better choice. People nowadays assume that a small,indie developer is a good guy. I am just pointing out that this is a very bad bias to have. Technicalities matter, security robustness matters. Google might be collecting data,but their security is really good. Good effort by this dev though.

>>sharce+Lq
I totally agree on the security aspect, but I think we're talking about different threat models.

Security matters if your concern is the data leaking to a potential malicious actor. The concern that I'm speaking to is the intended use of the data. Google is definitely going to use it for ad targeting and building a "shadow profile", but a small developer probably won't. This one says they won't, but even if they do they're likely to be much less effective than Google would be.

>>Lyndsy+qz
I'm curious what your concern with Google building this 'shadow profile' is if you're not worried about this data being leaked to a malicious actor - Is Google simply having this data a bad thing, and if so, why?

>>dzader+cH
Is that really a question? Google creates global profiles of everyone for tracking and advertising.

Having a random developer create a shadow profile isn't the same.

The scale is vastly different and can be used to track you from site to site.

>>wolco+8R
I know Google creates global profiles for tracking - and my question (which is the same as my original question) is why do you care? If that data is only used internally by google to serve you better ads why are you concerned with them having your data?

>>dzader+101
Even if a user trusts Google, because the data is digital and therefore permanent, there's no guarantee it will remain internal forever, whether that's because of a hack, a rogue employee, police/government pressure, or a change of ownership.

>>tagawa+Tg1
It seems to me that, with the exception of a rogue employee, all of those examples are at a greater risk of occurring with a small, independent provider. Google almost certainly has more security resources, more legal resources and political clout, and isn’t likely to be acquired any time soon.

I can’t say I love having Google track me, but I don’t feel any better about someone else doing it either.

>>seando+hr1
If the marketplace was full of independent trackers (which I'm not suggesting is a good idea, because third party trackers are bad in the first place), then as they get compromised, only a small subset of data is lost... The chance of losing everything or enough data to pair to your real identity is a lot lower. It's like IDs in physical activity. If you visit your bank they track you by a different id to the library, your medical record, etc, each might be lost individually and be upsetting, but do they reveal data about all the others? No.

Why is Google security better than anyone else? Monopolies often have more resource, but lack motive, because they are a monopoly. Without transparency we have no idea how secure Google's systems are, but we do know Google has been hacked before.

>>marich+yF1
Humans make systems. Teams like Project Zero (of Google) have contributed a ton to security. They prioritize security a lot.

zlacker