They don't represent geography at all; they represent the organizational structure of USPS.
They work by making the address on a letter almost meaningless. For some smaller population zip codes you can practically just put the name and zip code down and achieve delivery.
You're not going to wind up with a situation where zip codes with the same regional marker end up on different coasts.
In other words, is it safe to assume that any entity in a zip code is less than x distance away from the closest entity in the same zip code?
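If you have geocoded addresses, you can test that assumption directly instead of guessing. A rough sketch, assuming each record is a hypothetical `(zip_code, lat, lon)` tuple; the function names are made up for illustration, and the brute-force pairwise loop is only reasonable for small ZIPs:

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def max_nearest_neighbor_km(records):
    """For each ZIP, return the largest distance from any point to its
    nearest neighbor in the same ZIP (None if the ZIP has a single point)."""
    by_zip = defaultdict(list)
    for zip_code, lat, lon in records:
        by_zip[zip_code].append((lat, lon))

    result = {}
    for zip_code, pts in by_zip.items():
        if len(pts) < 2:
            result[zip_code] = None  # no neighbor to compare against
            continue
        # O(n^2) per ZIP: for every point, find its closest same-ZIP point,
        # then keep the worst case.
        result[zip_code] = max(
            min(haversine_km(p, q) for j, q in enumerate(pts) if j != i)
            for i, p in enumerate(pts)
        )
    return result
```

If the worst-case value for a ZIP is larger than your x, the assumption fails for that ZIP.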
A 5+4 formatted ZIP code maps to just a handful of addresses. In cities with larger populations, the +4 could map to a single building, and in more sparsely populated places, it might include houses on a handful of roads.
For smaller datasets, ZIP+4 might as well be a unique household identifier. I just checked a 10 million address database and 60% of entries had a unique ZIP+4, so one other bit of PII would be enough to be a 99.99% unique identifier per person.
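A check like that is easy to reproduce on your own data. This is only a sketch of one way to count it, assuming the ZIP+4 values are already normalized strings; `unique_zip4_fraction` is a hypothetical helper name, not the query used for the 10 million address figure above:

```python
from collections import Counter

def unique_zip4_fraction(zip4_values):
    """Fraction of entries whose ZIP+4 appears exactly once in the dataset."""
    values = list(zip4_values)
    if not values:
        return 0.0
    counts = Counter(values)
    # Count entries (not distinct codes) whose ZIP+4 is shared with no one else.
    unique_entries = sum(1 for z in values if counts[z] == 1)
    return unique_entries / len(values)
```

Note that this counts entries rather than distinct codes, which is the number that matters if you care about how many records are singled out by ZIP+4 alone.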
With a geo-coded ZIP+4 database, you could locate people with a precision that's proportional to the population density of their region.
Please see: https://opencagedata.com/guides/how-to-think-about-postcodes...
I write this as someone who grew up in the ZIP code 09180
Couldn't this happen for military or proxy codes (PO boxes or other)?
Zip codes don't even need to be contiguous. A zip code is a set of mail delivery routes, not a polygon.
There are 5 cases where the assumption is violated:
- Non-contiguous areas
- Zip codes that are a single point (some big companies get their own zip with a single mailbox, e.g. GE in Schenectady, NY is zip 12345)
- Zip codes that are a single line (highway-based delivery routes)
- Overlapping boundaries (since mail routes are linear, choosing a polygon representation is arbitrary and often not unique in space)
- Residents of some zip codes are not stationary (e.g. houseboats)
In short, asking questions about the area of a zip code is a category error - zip codes do not have a uniform representation in space. And we should be highly skeptical of any geospatial analysis that assumes polygons.
What they do not have is any sort of spatial consistency; they are a convenience for mail sorting. So if you start analyzing patterns across zip codes, you are pulling in information that is likely useless for, or harmful to, answering your question.