zlacker

[parent] [thread] 1 comments
1. capabl+(OP)[view] [source] 2023-05-05 15:59:43
I don't work on the AT protocol and don't have any deeper insight into it; I just started reading about it a week or two ago and am still putting the pieces together myself. I linked the twitter thread not as a "Go read this you fucker" but more as "there's no point in me repeating what has already been written elsewhere". I'm just trying to help understanding, not convince you of something, I have zero horses in this race :)

But one thing I can answer directly, as I do have deeper expertise with it, is this:

> how in the world a hash would make it easier to sync content than a URL I don't know

URLs point to a location, while content hashes point to specific pieces of content. Moving from URLs to hashes as URIs gives you the benefit of being able to fetch the content from anywhere, and cache it indefinitely.

Basically any large distributed system out there, whether or not it deals with caching, is built on top of content-addressable blobs, as it reduces complexity by orders of magnitude.

Suddenly, you can tell any of your peers "Give me content X" and you don't really care where it comes from, as long as it is verifiably X. Contrast that with URLs, which point to a specific location somewhere, and someone has to serve it. If the server behind the URL is unresponsive, you cannot really fetch the content anymore.
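The "verifiably X" part is the whole trick: the address is derived from the bytes, so any peer's answer can be checked locally. A minimal sketch in Python, using plain SHA-256 (Bluesky/IPFS actually use multihash-based CIDs, but the idea is the same):

```python
import hashlib

def verify_content(address: str, blob: bytes) -> bool:
    """A blob is genuine iff its digest matches the address.

    It doesn't matter which peer served it; the requester
    verifies the bytes themselves. (Illustrative sketch only,
    not the actual CID scheme.)
    """
    return hashlib.sha256(blob).hexdigest() == address

# The content address is computed from the content itself:
blob = b"hello, distributed world"
address = hashlib.sha256(blob).hexdigest()

assert verify_content(address, blob)              # genuine bytes accepted
assert not verify_content(address, b"tampered")   # forgery rejected
```

Because verification is local, it is safe to accept the blob from an untrusted cache, a random peer, or a CDN.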

Content-addressing used in this manner is not new or invented by Bluesky, but an old concept that has served many use cases; caching is maybe the most common one, but definitely not the only one. Probably the first time I came across it was in Plan 9 (Venti) sometime around 2000. The first time I actually used it in production was with Tahoe-LAFS, which must have been around 2010 I think.

replies(1): >>vidarh+C92
2. vidarh+C92[view] [source] 2023-05-06 08:24:48
>>capabl+(OP)
You can treat a URL as a key into a content-addressable store just fine. Mastodon does just that. Yet that URL also tells you where to retrieve the content if it's not available locally, in a way that doesn't require tools to have any additional knowledge. If they do have additional knowledge, say of another caching layer or storage mechanism, they can use that just fine.

That is, I can just paste the URL for this article into my Mastodon instance; if it has it, it'll fetch it from local storage, and if it doesn't, it'll try to fetch it from the source. But there's nothing preventing a hierarchy of caches here, nor is there anything preventing peer-to-peer.
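That lookup discipline (local store first, origin as fallback) can be sketched in a few lines. This is a hypothetical illustration, not Mastodon's actual code; `fetch_origin` stands in for whatever HTTP client does the remote GET:

```python
from typing import Callable, Dict

class UrlKeyedStore:
    """Use the URL itself as the storage key: serve from local
    storage on a hit, otherwise fall back to fetching from the
    origin the URL names. (Hypothetical sketch.)"""

    def __init__(self, fetch_origin: Callable[[str], bytes]):
        self._cache: Dict[str, bytes] = {}
        self._fetch_origin = fetch_origin  # e.g. an HTTP GET

    def get(self, url: str) -> bytes:
        if url in self._cache:             # local hit: no network needed
            return self._cache[url]
        blob = self._fetch_origin(url)     # miss: the URL says where to go
        self._cache[url] = blob
        return blob

# Usage with a stubbed-out origin fetcher:
store = UrlKeyedStore(lambda url: f"content of {url}".encode())
first = store.get("https://example.com/post/1")   # fetched from "origin"
second = store.get("https://example.com/post/1")  # served from local store
assert first == second
```

Nothing in this shape rules out extra layers: the fallback could consult a shared cache hierarchy or a peer before ever hitting the origin.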

But while ActivityPub says that object ids "should" be HTTPS URLs for public objects, the basic requirement of ActivityStreams is just that the id is a unique URI, and there's nothing stopping an evolution of ActivityPub allowing URIs that point to, say, IPFS or similar by content hash instead of an HTTPS URL.
