HTML as an Accessible Format for Papers

>>el3ctr+(OP)
I don't think HTML is the right approach. HTML is better than PDF, but it is still a format for displaying/rendering.

The actual paper content format should be separated from its rendering.

I.e., it should contain abstract, sections, equations, figures, citations etc., but it shouldn't have font sizes, layout etc.

The viewer platforms should then be able to style the content differently.

>>billco+gd
HTML alone is in fact not a format for displaying/rendering. Done properly, it is a structural representation of the content. (This is often called "semantic HTML".)

They are converting to HTML to make the content more accessible. Accessibility in this context means a11y; in effect, "more accessible" equates to "more compatible with screen readers".

While PDF documents can be made accessible, it is way easier to do it in HTML, where browsers build an actual accessibility tree from the DOM and expose it to screen readers.

>it should contain abstract, sections, equations, figures, citations etc.

So <article>, <section>, <math>, <figure>, <cite>, etc.
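
As a rough illustration of that mapping, here is a minimal Python sketch. The paper fragment, the `OutlineParser` class, and the `STRUCTURAL` set are all hypothetical, made up for this example; only the standard-library `html.parser` module is used. It walks a style-free fragment and records just the structural elements, which is roughly what a viewer would need before applying its own styling.

```python
from html.parser import HTMLParser

# Hypothetical paper fragment: structure only, no fonts or layout.
PAPER = """
<article>
  <h1>A Sample Paper</h1>
  <section>
    <h2>Abstract</h2>
    <p>We study <math><mi>x</mi></math> as in <cite>[1]</cite>.</p>
  </section>
  <section>
    <h2>Results</h2>
    <figure><figcaption>Figure 1: Example plot.</figcaption></figure>
  </section>
</article>
"""

# The elements named in the comment above (an illustrative subset).
STRUCTURAL = {"article", "section", "math", "figure", "cite"}


class OutlineParser(HTMLParser):
    """Record (nesting depth, tag) for each structural element."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        if tag in STRUCTURAL:
            self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        # Assumes every element in the fragment is explicitly closed.
        self.depth -= 1


parser = OutlineParser()
parser.feed(PAPER)
outline = parser.outline

for depth, tag in outline:
    print("  " * depth + tag)
```

Nothing in the fragment says how it should look; a stylesheet, or a different viewer, decides that separately.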

>>clucki+ak
Much of it is a structural representation of how to display the content.

>>benatk+Fk
In practice, sometimes. But in principle, hard disagree.

HTML was explicitly designed to semantically represent scientific documents. [1]

"HTML documents represent a media-independent description of interactive content. HTML documents might be rendered to a screen, or through a speech synthesizer, or on a braille display. To influence exactly how such rendering takes place, authors can use a styling language such as CSS." [2]

1: https://html.spec.whatwg.org/multipage/introduction.html#bac...

2: https://html.spec.whatwg.org/multipage/introduction.html#:~:...
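
To make the "media-independent" point concrete, here is a toy Python sketch. It is not how browsers or screen readers actually work; `BlockCollector`, `render_screen`, and `render_speech` are hypothetical names invented for this example, and only the standard-library `html.parser` module is used. The same markup is parsed once and then handed to two different renderers.

```python
from html.parser import HTMLParser

# Hypothetical document: no styling information at all.
DOC = "<article><h1>Title</h1><p>First paragraph.</p></article>"


class BlockCollector(HTMLParser):
    """Collect [tag, text] pairs for heading and paragraph elements."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self.current = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self.current = tag
            self.blocks.append([tag, ""])

    def handle_data(self, data):
        # Only keep text that sits inside a tracked element.
        if self.current:
            self.blocks[-1][1] += data

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None


def render_screen(blocks):
    """Visual-style rendering: headings in upper case, one block per line."""
    return "\n".join(text.upper() if tag == "h1" else text for tag, text in blocks)


def render_speech(blocks):
    """Linear rendering in the spirit of a speech synthesizer: announce headings."""
    return " ".join(
        f"Heading level one: {text}." if tag == "h1" else text
        for tag, text in blocks
    )


collector = BlockCollector()
collector.feed(DOC)
screen = render_screen(collector.blocks)
speech = render_speech(collector.blocks)
print(screen)
print(speech)
```

Both outputs come from the same source; only the renderer changes, which is the separation the spec passage describes.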
