And as another commenter has pointed out, HTML does exactly what you ask for. If it’s done correctly, it doesn’t contain font sizes or layout. Users can style HTML differently with custom CSS.
HTML is a digital format, but it aims to be a generic format for all document types, not just papers, so it carries a lot of extras that a paper format doesn't need.
For research papers, since they share the same structure, we can separate content from rendering even further.
For example, if you later want to connect a paper to an AI, do you want to send it <div class="abstract"> ... ?
Or use some nasty heuristic to extract the abstract, like document.getElementsByClassName("abstract")[0]?
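
To make the contrast concrete, here is a rough sketch of the two approaches. The `paper` object and its field names (`title`, `abstract`, `sections`) are made up for illustration and not taken from any existing paper format, and the regex stands in for the kind of class-name heuristic mentioned above so that the snippet runs outside a browser.

```javascript
// Hypothetical structured paper record; the field names are illustrative only.
const paper = {
  title: "An Example Paper",
  abstract: "We study a toy problem.",
  sections: [{ heading: "Introduction", text: "..." }],
};

// The same paper as HTML, where "abstract" is just a class-name convention.
const html = `
<html><body>
  <h1>An Example Paper</h1>
  <div class="abstract">We study a toy problem.</div>
  <div class="section"><h2>Introduction</h2><p>...</p></div>
</body></html>`;

// Heuristic extraction: works only if the author used exactly this markup.
function abstractFromHtml(source) {
  const match = source.match(/<div class="abstract">([\s\S]*?)<\/div>/);
  return match ? match[1].trim() : null;
}

// Structured access: the abstract is a named field, so nothing is guessed.
function abstractFromPaper(record) {
  return record.abstract ?? null;
}

const fromHtml = abstractFromHtml(html);
const fromPaper = abstractFromPaper(paper);
console.log(fromHtml === fromPaper);
```

The heuristic silently returns null as soon as a publisher writes `<section id="abstract">` instead, while the structured record keeps working because the abstract is part of the format rather than a styling convention.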