Show HN: Octosphere, a tool to decentralise scientific publishing

>>crimso+(OP)
Nothing based on DOIs and ORCIDs will ever be properly decentralised.

You need content addressing and cryptographic signatures for that.

>>j-pb+IO1
Email is pretty decentralized without those things.

>>tbrown+6U1
And it is infamously insecure, full of spam, and struggles with attachments beyond 10 MB.

So thank you for bringing it up, it showcases well that a distributed system is not automatically a good distributed system, and why you want encryption, cryptographic fingerprints and cryptographic provenance tracking.

>>j-pb+eW1
And yet, it is a constantly used decentralized system which does not require content addressing, as you mentioned. You should elaborate why we need content addressing for a decentralized system instead of saying "10MiB limit + spam lol email fell off". Contemporary usage of the technologies you've mentioned doesn't seem to do much to reduce spam (see IPFS, which has hard content addressing). Please, share more.

>>leland+AY1
If you think email is still in widespread use because it’s doing a good job, rather than because of massive network effects and sheer system inertia, then we’re probably talking past each other - but let me spell it out anyway.

Email “works” in the same sense that fax machines worked for decades: it’s everywhere, it’s hard to dislodge, and everyone has already built workflows around it.

There is no intrinsic content identity, no native provenance, no cryptographic binding between “this message” and “this author”. All of that has to be bolted on - inconsistently, optionally, and usually not at all.

And even ignoring the cryptography angle: email predates “content as a first-class addressable object”. Attachments are in-band, so the sender pushes bytes and the receiver (plus intermediaries) must accept/store/scan/forward them up front. That’s why providers enforce tight size limits and aggressive filtering: the receiver is defending itself against other people’s pushes.

For any kind of information dissemination like email or scientific publishing you want the opposite shape: push lightweight metadata (who/what/when/signature + content hashes), and let clients pull heavy blobs (datasets, binaries, notebooks) from storage the publishing author is willing to pay for and serve. Content addressing gives integrity + dedup for free. Paying ~$1 per DOI for what is essentially a UUID is ridiculous by comparison.
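
To make the push-metadata / pull-blobs shape concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `BlobStore` is a toy in-memory store, `make_record` is a made-up record format, and SHA-256 hex digests stand in for whatever addressing scheme a real system would use. The signature field is left out because the Python standard library has no public-key signing API.

```python
import hashlib
import json


class BlobStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the bytes themselves.
        digest = hashlib.sha256(data).hexdigest()
        # Identical bytes always land on the same key, so storing twice dedups.
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # The puller re-hashes and does not need to trust the store.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("blob does not match its address")
        return data

    def __len__(self):
        return len(self._blobs)


def make_record(author: str, title: str, blob_hashes: list) -> str:
    """Lightweight metadata that would be pushed; heavy blobs are pulled by hash."""
    record = {"author": author, "title": title, "blobs": blob_hashes}
    return json.dumps(record, sort_keys=True)


store = BlobStore()
dataset = b"x,y\n1,2\n3,4\n"
h1 = store.put(dataset)
h2 = store.put(dataset)  # same bytes, same address, no second copy
record = make_record("alice", "Example dataset", [h1])
```

The record is a few hundred bytes no matter how large the dataset is, which is the decoupling being argued for: the receiver only accepts the small record up front and decides later whether to pull the blob.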

That decoupling (metadata vs blobs) is the missing primitive in email-era designs.

All of that makes email a bad template for a substrate of verifiable, long-lived, referenceable knowledge. Let's not forget that the context of this thread isn’t “is decentralized routing possible?”, it’s “decentralized scientific publishing” - which is not about decentralized routing, but decentralized truth.

Email absolutely is decentralized, but decentralization by itself isn’t enough. Scientific publishing needs decentralized verification.

What makes systems like content-addressed storage (e.g., IPFS/IPLD) powerful isn’t just that they don’t rely on a central server - it’s that you can uniquely and unambiguously reference the exact content you care about with cryptographic guarantees. That means:

- You can validate that what you fetched is exactly what was published or referenced, with no ambiguity or need to trust a third party.

- You can build layered protocols on top (e.g., versioning, merkle trees, audit logs) where history and provenance are verifiable.

- You don’t have to rely on opaque identifiers that can be reissued, duplicated, or reinterpreted by intermediaries.
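
As a rough illustration of the "layered protocols" point, the sketch below builds a hash-linked version log in Python. It is a simplified stand-in for the Merkle-style structures mentioned above, not how IPFS/IPLD actually encode data; `entry_hash`, `append_version`, and `verify_log` are hypothetical names, and JSON with sorted keys is an assumed canonical encoding.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    # Assumed canonical encoding: JSON with sorted keys.
    encoded = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_version(log: list, content: bytes) -> list:
    # Each entry commits to the content hash and to the previous entry.
    prev = entry_hash(log[-1]) if log else None
    entry = {"content": hashlib.sha256(content).hexdigest(), "prev": prev}
    return log + [entry]


def verify_log(log: list) -> bool:
    # Recompute every link; any edit to an earlier entry breaks the next link.
    for i, entry in enumerate(log):
        expected_prev = entry_hash(log[i - 1]) if i > 0 else None
        if entry["prev"] != expected_prev:
            return False
    return True


log = []
log = append_version(log, b"draft v1")
log = append_version(log, b"draft v2")
log = append_version(log, b"draft v3")

# Citing this single hash pins the entire history behind it.
head = entry_hash(log[-1])
```

A reference to `head` is enough to detect any rewrite of earlier versions, because changing one entry changes every hash that follows it.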

For systems that don’t rely on cryptographic primitives, like email or the current infrastructure using DOIs and ORCIDs as identifiers:

- There is no strong content identity - messages can be altered in transit.

- There is no native provenance - you can’t universally prove who authored something without added layers.

- There’s no simple way to compose these into a tamper-evident graph of scientific artifacts with rigorous references.

A truly decentralized scholarly publishing stack needs content identity and provenance. DOIs and ORCIDs help with discovery and indexing, but they are institutional namespaces, not cryptographically bound representations of content. Without content addressing and signatures, you’re mostly just trading one central authority for another.

It’s also worth being explicit about what “institutional namespace” means in practice here.

A DOI does not identify content. It identifies a record in a registry (ultimately operated under the DOI Foundation via registration agencies). The mapping from a DOI to a URL and ultimately to the actual bytes is mutable, policy-driven, and revocable. If the publisher disappears, changes access rules, or updates what they consider the “version of record”, the DOI doesn’t tell you what an author originally published or referenced - it tells you what the institution currently points to.
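
The difference between a mutable pointer and a content hash can be shown with a small Python sketch. The `registry` and `web` dictionaries are hypothetical stand-ins, the identifier and URLs are made up, and none of this models the real DOI resolution API; it only shows what happens when a registry entry is repointed.

```python
import hashlib

# Hypothetical registry: identifier -> current location (mutable by its operator).
registry = {"10.1234/example": "https://publisher.example/paper-v1"}

# Hypothetical stand-in for the network: location -> bytes served there.
web = {
    "https://publisher.example/paper-v1": b"original text",
    "https://publisher.example/paper-v2": b"revised text",
}


def resolve(identifier: str) -> bytes:
    # Returns whatever the registry points to right now.
    return web[registry[identifier]]


def matches(content: bytes, expected_sha256: str) -> bool:
    # Independent check: needs only the bytes and the hash, not the registry.
    return hashlib.sha256(content).hexdigest() == expected_sha256


# A citing author records the hash of the bytes they actually read.
cited_hash = hashlib.sha256(resolve("10.1234/example")).hexdigest()
before = matches(resolve("10.1234/example"), cited_hash)

# The registry operator repoints the identifier to a new version.
registry["10.1234/example"] = "https://publisher.example/paper-v2"
after = matches(resolve("10.1234/example"), cited_hash)
```

The identifier still resolves after the change, but only the recorded hash reveals that it no longer refers to the bytes that were cited.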

ORCID works similarly: a centrally governed identifier system with a single root of authority. Accounts can be merged, corrected, suspended, or modified according to organisational policy. There is no cryptographic binding between an ORCID, a specific work, and the exact bytes of that work that an independent third party can verify without trusting the ORCID registry.

None of this is malicious - these systems were designed for coordination and attribution, not for cryptographic verifiability. But it does mean they are gatekeepers in the precise sense that matters for decentralization:

Even if lookup/resolution is distributed, the authority to decide what an identifier refers to, whether it remains valid, and how conflicts are resolved is concentrated in a small number of organizations. If those organizations change policy, disappear, or disagree with you, the identifier loses its meaning - regardless of how many mirrors or resolvers exist.

If the system you build can’t answer “Is this byte-for-byte the thing the author actually referenced or published?” without trusting a gatekeeper, then it’s centralized in every meaningful sense that matters to reproducibility and verifiability.

Decentralised lookup without decentralised authority is just centralisation with better caching.
