Regardless of your intentions, you are collecting enough data to track users.
> I am transparent about what I collect ([URL])
That page doesn't mention that you are also collecting (and make no claim about storing) the globally-visible IP address (and any other data in the IP and TCP headers). This can be uniquely identifying; even when it isn't unique you usually only need a few bits of additional entropy to reconstruct[1] a unique tracking ID.
In my case, you're collecting and storing more than enough additional entropy to make a decent fingerprint because [window.innerWidth, window.innerHeight] == [847, 836]. Even if I resized the window, you could follow those changes simply by watching analytics events from the same IP that are temporally nearby (you are collecting and storing timestamps).
[1] An older comment where I discussed how this could be done (and why GA's supposed "anonymization" feature (aip=1) is a blatant lie): https://news.ycombinator.com/item?id=17170468
Given the choice between a lot of data about me given to a small provider and somewhat less data about me given to Google, I'd generally choose the former.
I’m not the OP, but where is there evidence that they’re storing the IP? Sure it’s in the headers that they process but that doesn’t mean they’re storing it.
Security matters if your concern is the data leaking to a potential malicious actor. The concern that I'm speaking to is the intended use of the data. Google is definitely going to use it for ad targeting and building a "shadow profile", but a small developer probably won't. This one says they won't, but even if they do they're likely to be much less effective than Google would be.
I'd imagine it's difficult to do in depth analytics with tracking users...
Sometime like this https://stackoverflow.com/questions/34031251/javascript-libr...
This is true. The legal department for the healthcare web sites I maintain doesn't let me store or track IP addresses, even for analytics.
I'm only allowed to tally most popular pages, display language chosen, and date/time. There might be one or two other things, but it's all super basic.
Having a random developer create a shadow profile isn't the same.
The scale is vastly different and can be used to track you from site to site.
There is 'justice' in the blog creator using analytics data to to improve the experience of blog visitors: a user's data will, theoretically and in aggregate, create a better experience for that user in the future. The class of 'users who browse this page' gets a benefit from the cost of providing data.
Selling browsing information to advertisers is sort of 'anti-justice'. Using blog visitor data to track and more effectively manipulate those visitors elsewhere on the internet into paying people money. The blog visitor's external online experience is made worse by browsing that blog.
First, "IPs" might be confusing; "IP addresses" would be more accurate.
More importantly, you have to collect IP addresses (or any other value in the packet headers[1][2]) - even if you don't store it - if you want to receive any packets from the rest of the internet. Storage of those values is separate issue entirely, and it's good to hear that you are intending to NOT store IP addresses (and updating the documenting)!
Also, I strongly recommend using Drdrdrq's suggestion to lower the precision of the collected window dimensions, which should be done on the client i.e. "Math.floor(window.innerWidth/50)*50". This kind of bit-reduction makes fingerprinting a lot harder.
[1] https://en.wikipedia.org/wiki/IPv4#Header
[2] https://en.wikipedia.org/wiki/Transmission_Control_Protocol#...
This cannot be stressed enough. At my day job I write reasonably secure software on a team for big clients, then at home I write reasonably secure software independently for small clients.
Come new security issue, the big clients at day job get first priority. Not because they are big and not because they are paying more, but rather because as a team we can reallocate resources and work on issues in parallel. At home, there is only one Dotan to work on each independent client in series.
It wouldn't make sense to prioritize optimizing site design for the few people who are using a non-standard size.
I can’t say I love having Google track me, but I don’t feel any better about someone else doing it either.
Why is Google security better than anyone else? Monopolies often have more resource, but lack motive, because they are a monopoly. Without transparency we have no idea how secure Google's systems are, but we do know Google has been hacked before.
Also, Google knows how to make profiles and it knows the importance of that data amd keeping it safe. It is also somewhat answerable to Consumer groups,users,shareholders,regulatory bodies. Indie dev doesn't know how to make good profile, more likely to sell the data to make revenue. Not ridiculing indie devs, just ridiculing your assumptions that if a solo dev is an angel.
https://www.labnol.org/internet/sold-chrome-extension/28377/