zlacker

[parent] [thread] 38 comments
1. jonas2+(OP)[view] [source] 2025-02-07 18:05:31
ZIP codes are an emergent property of the mail delivery system. While the author might consider this a bad thing, this makes them "good enough" on multiple axes in practice. They tend to be:

- Well-known (everybody knows their zip code)

- Easily extracted (they're part of every address, no geocoding required)

- Uniform-enough (not perfect, but in most cases close)

- Granular-enough

- Contiguous-enough by travel time

Notably, the alternatives the author proposes all fail on one or more of these:

- Census units: almost nobody knows what census tract they live in, and it can be non-trivial to map from address to tract

- Spatial cells: uneven distribution of population, and arbitrary division of space (boundaries pass right through buildings), and definitely nobody knows what S2 or H3 cell they live in.

- Address: this option doesn't even make sense. Yes, you can geocode addresses, but you still need to aggregate by something.

replies(12): >>mattfo+O2 >>ellisv+23 >>ericra+X8 >>o11c+T9 >>JumpCr+5f >>walrus+8f >>hinkle+Oj >>michae+4o >>killjo+Jo >>raphma+QA >>stevag+Ur1 >>MathCo+J13
2. mattfo+O2[view] [source] 2025-02-07 18:20:19
>>jonas2+(OP)
Well, you hit on all the points that discuss the compromises that ZIP codes offer. Just because you have them in your data doesn't mean that they can produce anything useful. You are correct that no one knows what their census unit is (if you are thinking of someone entering this on a website), but collecting a location or address will be a lot better.

Fact is, a lot of web data contains a ZIP, but if you can collect something better it will usually produce better results. The exception is analyzing shipments, where ZIP codes are fine.

3. ellisv+23[view] [source] 2025-02-07 18:21:05
>>jonas2+(OP)
There are point process models, but, yes, it's much more common to want to aggregate to a spatial area.

Another consideration is what kind of reference information is available at different spatial units. There is plenty of Census Bureau data available by ZCTA, but some data may only be available at other aggregate units. ZIP codes are often used as political boundaries.

I'd also mention the "best" areal unit depends on the data. There is a well known phenomenon called the modifiable areal unit problem in which spatial effects appear and vanish at different spatial resolutions. It can sort of be thought of as a spatial variation of the ecological fallacy.

4. ericra+X8[view] [source] 2025-02-07 18:57:04
>>jonas2+(OP)
This is a tangent, but addresses are also way more complicated than most people realize - especially if you’re relying on a user to input a correct address or if you need to support multiple countries, somewhere with unique addresses like Queens[0], or you need to differentiate between units of a specific street address that uses something other than unit numbers for a unit designation.

At that point you need something like Smarty[1] to validate and parse addresses.

[0]: https://stackoverflow.com/questions/2783155/how-to-distingui...

[1]: https://www.smarty.com/

replies(5): >>nitwit+Ae >>rented+Om >>ghaff+Tr >>bob102+KD >>jonath+711
5. o11c+T9[view] [source] 2025-02-07 19:02:52
>>jonas2+(OP)
Also, "use a different grid" is only masking the problem, not actually fixing it.

The real problem is ever using an average without also specifying some sort of bounds. For median-based data, this probably means the upper and lower quartiles (or possibly other percentiles); for mean-based data, this probably means standard deviation.
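To illustrate the point about bounds, here is a minimal sketch using Python's standard `statistics` module. The two areas and their income figures are made up for illustration; the point is only that two areas can share the same mean while having very different spreads.

```python
import statistics


def summarize(values):
    """Return a centre-plus-spread summary for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "median": statistics.median(values),
        "quartiles": (q1, q3),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical household incomes (in $1000s) for two made-up areas.
uniform_area = [48, 50, 51, 52, 49, 50, 53, 47]
mixed_area = [12, 15, 18, 20, 80, 85, 90, 80]

for name, incomes in (("uniform", uniform_area), ("mixed", mixed_area)):
    print(name, summarize(incomes))
```

Both areas report the same mean, so only the quartiles and standard deviation reveal that the second one is really two different populations averaged together.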

◧◩
6. nitwit+Ae[view] [source] [discussion] 2025-02-07 19:28:32
>>ericra+X8
Yes, unfortunately, their assertion that everyone knows their zip code is wrong. People often write a neighboring code, and the post office just delivers it.

Similar issues for city name, of course.

replies(3): >>VWWHFS+di >>steeze+ds >>pauldd+nI
7. JumpCr+5f[view] [source] 2025-02-07 19:32:24
>>jonas2+(OP)
Would add that there are network effects with zip code data. If you collect H3 data, you have fewer sources with which to join.
8. walrus+8f[view] [source] 2025-02-07 19:32:31
>>jonas2+(OP)
In terms of "good enough", a Canadian postal code, broadly equivalent to a zip code, is much more granular and can often identify an individual apartment building, or single city block. Plenty of large office buildings in major Canadian cities also have their own postal code.

Functionally, it is closer to the "ZIP+4" extension that USPS uses for more granular routing of physical mail.

https://www.canadapost-postescanada.ca/cpc/en/support/articl...

https://en.wikipedia.org/wiki/Postal_codes_in_Canada

replies(3): >>ssl-3+Gj >>mattfo+1o >>throw0+1t
◧◩◪
9. VWWHFS+di[view] [source] [discussion] 2025-02-07 19:50:55
>>nitwit+Ae
Very common in NYC. People will use "New York, NY", "Queens, NY", or "Astoria, NY" interchangeably, and the post office will still just deliver it to the same place.
replies(1): >>ericra+Kc1
◧◩
10. ssl-3+Gj[view] [source] [discussion] 2025-02-07 20:00:39
>>walrus+8f
Sure, and in the States, ZIP+4 could once nail my postal location to a subset of 4 (of a group of 16) mailboxes within a particular set of entry doors on a particular apartment building.

But broadly speaking, nobody knows what their ZIP+4 is, while I imagine that most people in Canada know their postal code by heart.

It is interesting.

replies(1): >>bluGil+bo
11. hinkle+Oj[view] [source] 2025-02-07 20:01:54
>>jonas2+(OP)
Contiguous enough by data travel time as well. A few people will get 5 ms more latency than the exact optimal route, but it’s not like your routes are exactly optimal anyway.

And don’t forget sales tax. Which is state + county + city

replies(1): >>kstrau+Dy
◧◩
12. rented+Om[view] [source] [discussion] 2025-02-07 20:19:59
>>ericra+X8
An annoyance for me is that I've yet to see any address validator get my current home address right. They all insist my address is on the road that leads to my road rather than my actual road. It's understandable that they can't be 100% accurate given the scale / complexity of addresses.

Most sites/apps will let me override the validator, but a few won't. The most common ones that insist on using the wrong address are financial institutions that say the law requires them to have my proper physical address and therefore they go with the (incorrectly) validated version.

USPS does not do home delivery in our area, and UPS/FedEx/etc. usually figure it out given that street numbers alone uniquely identify properties in our town.

replies(2): >>killjo+4q >>jonath+u11
◧◩
13. mattfo+1o[view] [source] [discussion] 2025-02-07 20:27:09
>>walrus+8f
Yeah, but a ZIP+4 represents a collection of houses, not a polygon, so it's not useful for aggregations or statistical work.
14. michae+4o[view] [source] 2025-02-07 20:27:20
>>jonas2+(OP)
If you are worrying about address at all instead of tax or legal jurisdiction, it's probable that you as a business have a physical presence. You can probably correlate better by predicting which location a given address would likely interact with, if you don't already know from prior purchases/interactions which one they normally use. I would suggest actual purchase data followed by travel time.

Zip and distance as the crow flies often give shit data. My zip suggests I'm off in bum fuck, and since I'm on Puget Sound, things that are relatively near as the crow flies can actually be hours away.
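For reference, "as the crow flies" usually means a great-circle distance between two points (often ZIP centroids). A minimal sketch using the haversine formula follows; the coordinates in the usage line are hypothetical, and the function knows nothing about water, bridges, or ferries, which is exactly the weakness described above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical pair of points on opposite shores of a body of water.
print(round(haversine_km(47.60, -122.35, 47.60, -122.60), 1), "km in a straight line")
```

A short straight-line result says nothing about the drive or ferry ride between the two points, so travel time would have to come from a routing service or from observed behaviour.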

◧◩◪
15. bluGil+bo[view] [source] [discussion] 2025-02-07 20:27:49
>>ssl-3+Gj
The plus four changes all the time, so it isn't feasible to know it. Its use is that large mailers can get a discount by looking it up and presorting mail. If the mail coming into my post office has my mail next to my next-door neighbor's, that saves them a lot of time.
replies(1): >>kstrau+Sy
16. killjo+Jo[view] [source] 2025-02-07 20:30:07
>>jonas2+(OP)
ZIPs are also specifically used in a variety of medical, epidemiologic, public health contexts and HHS has explicit, fairly fine-grained rules on their use: https://www.hhs.gov/hipaa/for-professionals/special-topics/d...
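As a rough illustration of how such a rule can look in code: my understanding of the Safe Harbor de-identification method is that only the first three digits of a ZIP may be kept, and only when the area formed by all ZIPs sharing those three digits contains more than 20,000 people; otherwise the value becomes 000. The sketch below assumes that reading. The population table is a hypothetical stand-in for real Census figures, and the linked guidance remains the authority.

```python
def safe_harbor_zip3(zip_code, zip3_population, threshold=20_000):
    """Coarsen a 5-digit ZIP to its 3-digit prefix, or "000" if too sparse.

    zip3_population is a hypothetical lookup: 3-digit prefix -> population.
    Unknown prefixes are treated as too sparse to release.
    """
    zip3 = zip_code.strip()[:3]
    if zip3_population.get(zip3, 0) > threshold:
        return zip3
    return "000"


# Made-up populations, for illustration only.
populations = {"981": 500_000, "036": 15_000}
print(safe_harbor_zip3("98110", populations))
print(safe_harbor_zip3("03601", populations))
```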
◧◩◪
17. killjo+4q[view] [source] [discussion] 2025-02-07 20:37:56
>>rented+Om
Same! My wife ran a business from home during the pandemic and we actually went through the effort to work with Google Maps (they called us) to get it on the map. And of course USPS has no problem. But our address was originally a federal building with a letter, still only has a letter, no number, and there are now all sorts of work-arounds floating around on how to resolve addresses in our neighborhood. What's wild is the Post Office is literally down the street from our house, and our house predates the founding of most of the big delivery services, which all manage to deliver to us, given their preferred incantation. If I can't get the shipper to pass the right incantation to their shipping service, shenanigans ensue. My (least?) favorite was an item that went across the Pacific Ocean 3 times over the course of 3 months.
replies(1): >>jonath+A11
◧◩
18. ghaff+Tr[view] [source] [discussion] 2025-02-07 20:49:40
>>ericra+X8
Just last week I had to deal with the fact that my house has the wrong address in multiple databases because things changed when an interstate went in 40-something years ago. It's not a big change (Main St. vs. N Main St.), but it was enough to mess up various things. Not as much as when I moved in 30 years ago, but still enough to be wrong in old town and telco records. Took me a couple of days to get a permit issued to get electrical hooked back up after a fire as a result, because apparently some town clerk insisted the address wasn't valid.
replies(1): >>jwnacn+iI2
◧◩◪
19. steeze+ds[view] [source] [discussion] 2025-02-07 20:52:08
>>nitwit+Ae
This sounds like the person doesn't know the receiver's zip code. Why are you extending that to not knowing their own zip code? Are they mailing something to themselves?
replies(3): >>toast0+Zt >>wisty+TY >>tbrown+Oj1
◧◩
20. throw0+1t[view] [source] [discussion] 2025-02-07 20:57:22
>>walrus+8f
> In terms of "good enough", a Canadian postal code, broadly equivalent to a zip code, is much more granular and can often identify an individual apartment building, or single city block.

To the point that StatCan and other agencies have rules on the number of characters that are collected/disseminated with other data to make sure it's not too identifying:

* https://www.canada.ca/en/government/system/digital-governmen...

* https://www12.statcan.gc.ca/nhs-enm/2011/ref/DQ-QD/guide_2-e...
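One common way to coarsen a Canadian postal code is to keep only its first three characters, the Forward Sortation Area (FSA). The sketch below assumes the standard six-character "A1A 1A1" layout; the helper name is hypothetical and it does not check that the code actually exists, nor does it claim to implement any specific agency rule from the links above.

```python
def forward_sortation_area(postal_code):
    """Return the 3-character FSA of a Canadian postal code.

    Accepts input with or without the space or hyphen separator.
    Raises ValueError if the cleaned value is not 6 alphanumeric characters.
    """
    cleaned = postal_code.replace(" ", "").replace("-", "").upper()
    if len(cleaned) != 6 or not cleaned.isalnum():
        raise ValueError(f"not a 6-character postal code: {postal_code!r}")
    return cleaned[:3]


print(forward_sortation_area("k1a 0b1"))
```

Keeping only the FSA trades the building-level precision described above for an area that covers many more households.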

◧◩◪◨
21. toast0+Zt[view] [source] [discussion] 2025-02-07 21:04:12
>>steeze+ds
People often give out their mailing address, and may be misinformed about their zip code.

If you get close enough, it usually gets handled in the local sort, but not always.

On cities, the mailing address city really is the name of the post office that handles your delivery route. Often there's a relationship with the city you live in, but there's cases both ways --- I used to live outside city limits, we had a census designated place name, a municipal sanitary district and had a fire department at one time... but never a post office, so our mailing address used the nearby city name, where our post office resided. The place name had an incorporated city on the other side of the state, so using that wouldn't be great.

Nowadays, post offices often have a list of alternative place names, so where I live now, I can pick between the incorporated city name, the nearby large city where a post office that processes all my mail is located, or any of the numerous small post offices that once served my city.

replies(1): >>rascul+Y81
◧◩
22. kstrau+Dy[view] [source] [discussion] 2025-02-07 21:35:10
>>hinkle+Oj
... + special entertainment district + business renovation area + exception + exception + exception + ...
◧◩◪◨
23. kstrau+Sy[view] [source] [discussion] 2025-02-07 21:37:25
>>bluGil+bo
Is that still true? I would imagine any reasonably modern computer could map every physical address in a huge region to a (route number, stop number) pair. I wouldn't think the +4 would add a lot of value anymore.
replies(1): >>bluGil+KU
24. raphma+QA[view] [source] 2025-02-07 21:47:38
>>jonas2+(OP)
One more advantage: ZIP codes are a good trade-off if you want to gather anonymous data in a survey or provide anonymized data to an outside entity. For example, we recently conducted a survey on mobility patterns within our university. To offer respondents a reasonable amount of anonymity, we just asked for their (German) ZIP code and the location of their primary workplace. This allows us to determine the distance and approximate route people would take between home and university campus - to a degree that is sufficient for our goals.
◧◩
25. bob102+KD[view] [source] [discussion] 2025-02-07 22:07:43
>>ericra+X8
Addresses are a huge ordeal in banking. Easily one of the most tortured domain types when it comes to edge cases and integration pain.

Every customer I've worked with insisted on having all addresses run through the USPS verification API so they could get their bulk mailing discounts.

Even if you get the delivery/cost side under control, you still have to make sure you are talking about the right address from a logical perspective. Mailing, physical, seasonal, etc. address types add a whole extra dimension of fun.

◧◩◪
26. pauldd+nI[view] [source] [discussion] 2025-02-07 22:41:10
>>nitwit+Ae
They know their ZIP code far, far better than any other plausible geographic cell.
◧◩◪◨⬒
27. bluGil+KU[view] [source] [discussion] 2025-02-08 00:27:51
>>kstrau+Sy
Sorting everything outgoing by where it goes on the truck is valuable. Sure, computers can sort, but these are physical things, so mechanical limits apply.
◧◩◪◨
28. wisty+TY[view] [source] [discussion] 2025-02-08 01:09:58
>>steeze+ds
People more or less mail themselves parcels all the time, with online delivery.
replies(1): >>steeze+B01
◧◩◪◨⬒
29. steeze+B01[view] [source] [discussion] 2025-02-08 01:27:19
>>wisty+TY
Ha you make an excellent point actually. I wasn't even thinking of that.
◧◩
30. jonath+711[view] [source] [discussion] 2025-02-08 01:32:47
>>ericra+X8
Thanks for the shout-out. Founder of Smarty here.

Regarding the article, it really depends on the use case whether to use ZIP Code (TM), postal code, Canada Post Forward Sortation Area, lat/lon, Census Bureau block and tract, etc.

As has been noted, the ZIP Code is often good enough for aggregating data together and can be a good first step if you don’t know where to start.

◧◩◪
31. jonath+u11[view] [source] [discussion] 2025-02-08 01:37:15
>>rented+Om
Send your address to support@smarty.com and link to this HN thread. I’ll keep an eye out for it. I’d love to see what our system does with your address.

We have non-postal addresses and a lot of other mechanisms to help here. We also have contacts at the USPS and others to help fix addresses.

◧◩◪◨
32. jonath+A11[view] [source] [discussion] 2025-02-08 01:38:38
>>killjo+4q
I just replied to an earlier message on this thread with the same offer:

I’d love to have you email your mailing address to support@smarty.com with a link to this HN thread. We may be able to help fix some of this.

◧◩◪◨⬒
33. rascul+Y81[view] [source] [discussion] 2025-02-08 03:03:48
>>toast0+Zt
> On cities, the mailing address city really is the name of the post office that handles your delivery route.

Bigger cities can have multiple post offices and zip codes with the same mail address city.

◧◩◪◨
34. ericra+Kc1[view] [source] [discussion] 2025-02-08 03:54:33
>>VWWHFS+di
This is sort of apocryphal - and also anecdotal because I have my own personal experience living in an annexed Boston neighborhood to draw on - but in a lot of the towns/neighborhoods that have been annexed by Boston, people still use the neighborhood name[1] as the city name because you are more likely to get your package when you indicate which “Washington St,” “Boylston St,” etc. you actually live at.

According to one commenter on the subject:

  It doesn't matter, as long as the zip code is correct

[0]: https://www.city-data.com/forum/boston/601106-mailing-addres...

[1]: https://www.city-data.com/forum/boston/601106-mailing-addres...

◧◩◪◨
35. tbrown+Oj1[view] [source] [discussion] 2025-02-08 05:32:04
>>steeze+ds
I will occasionally still try to use the zip code for my old work address (from about a year before covid) when what I want is my home address.
36. stevag+Ur1[view] [source] 2025-02-08 07:25:31
>>jonas2+(OP)
> Easily extracted (they're part of every address, no geocoding required)

That's only true if you can also access the spatial boundaries of the zipcodes themselves.

In Australia, this turns out not to be true: the postal system considers their boundaries to be commercial confidential information and doesn't share them. The best we can do is the Australian Bureau of Statistics' approximations of them, which they dub "postal areas".

◧◩◪
37. jwnacn+iI2[view] [source] [discussion] 2025-02-08 20:44:32
>>ghaff+Tr
Here is a little-known (but very useful) piece of information.

The US Postal Service has a team of people who handle address updates. This team is localized to different regions so that they are generally aware of local nuances. If you need to talk to the USPS about getting an address issue resolved, simply go to this USPS AMS site and enter your ZIP code to find the team that handles addresses in that area:

https://postalpro.usps.com/ppro-tools/address-management-sys...

If they don't answer, leave a message. They have helped me thousands of times in my last 14 years working with address validations.

replies(1): >>ghaff+CW2
◧◩◪◨
38. ghaff+CW2[view] [source] [discussion] 2025-02-08 22:55:37
>>jwnacn+iI2
The USPS has always been correct since I moved in. It’s been local records and the telcos that have been the problem.

And in this case the fire companies had no problem finding my house in spite of the incorrect information in town records. As you suggest the field people on the ground generally know what the ground truth is.

39. MathCo+J13[view] [source] 2025-02-08 23:52:39
>>jonas2+(OP)
On the note of census units, the only reason we all know our ZIP code is because we have to know it. If census units were used as frequently as ZIP codes, I imagine they would quickly become more widely known as well.