All papers should be written in HTML/CSS or TeX and then simply converted to PDF.
Why are we even talking about this?
The problem is having the submissions be in TeX and converting that to HTML, when the only output has been PDF for so long.
The problem isn’t converting HTML to PDF, it’s making a giant portion of TeX/PDF-only papers available in HTML.
If you’re arguing that maybe TeX shouldn’t be the source format for papers, then I agree, but other than Typst (which also isn’t perfect at HTML output yet) there aren’t many widely accepted and widely used authoring formats for physics/math papers, which is what arXiv primarily hosts.
Either way it gets shoehorned.
HTML doesn't support the necessary features: citations in various formats, footnotes, references to automatically numbered figures and tables. I could go on and on.
HTML could certainly be extended to support those, but it hasn't been. That's why we're talking about this.