They're not perfect (nothing is), but they're actually pretty good. Every task has to be completable within a sprint. If it's not, you break it down until you have a part that you expect is. Everyone has to unanimously agree on how many points a particular story (task) is worth. The process of coming to unanimous agreement is the difficult part, and where the real value lies. Someone says "3 points", and someone points out they haven't thought about how it will require X, Y, and Z. Someone else says "40 points" and they're asked to explain and it turns out they misunderstood the feature entirely. After somewhere from 2 to 20 minutes, everyone has tried to think about all the gotchas and all the ways it might be done more easily, and you come up with an estimate. History tells you how many points you usually deliver per sprint, and after a few months the team usually gets pretty accurate to within +/- 10% or so, since underestimation on one story gets balanced by overestimation on another.
It's not magic. It prevents you from estimating things longer than a sprint, because it assumes that's impossible. But it does ensure that you're constantly delivering value at a steady pace, and that you revisit the cost/benefit tradeoff of each new piece of work at every sprint, so you're not blindsided by everything being 10x or 20x slower than expected after 3 or 6 months.
Estimating in points that are basically a Fibonacci sequence keep estimation precision limited and avoids implying false guarantees. Yes, in the end the chosen stories are based on summing to a number of points that is roughly equivalent to what the team has achieved per-sprint in the recent past, so in that sense it is ultimately estimating days in the end. But again, for whatever psychological reason, people seem to be more realistic about the variance in actual delivered points per sprint, as opposed to when you try to measure things in hours or days. The points imply more of an estimated goal than a deadline guarantee, which helps keep both team expectations and management expectations more reasonable.
tl;dr: Psychology.
> and people tend not to be overly optimistic and forget to account for things like sickness, meetings, etc.
> But again, for whatever psychological reason, people seem to be more realistic about the variance in actual delivered points per sprint, as opposed to when you try to measure things in hours or days. The points imply more of an estimated goal than a deadline guarantee, which helps keep both team expectations and management expectations more reasonable.