I did my PhD more than 20 years ago, and working with all these PostScript and PDF documents was annoying then. It's still annoying. These days people publish content as PDFs on websites and mostly not in printed media. People might print these or not. Twenty years ago, I definitely did. But it's weird how we stick with this. And PDFs are of course very unstructured and hard to make sense of programmatically as well.
I bet a lot of modern-day scientists don't actually print the articles they read anymore and instead read them on screen, or maybe on an iPad or e-reader. Print has become an edge case. Reading a PDF on a small e-reader is not ideal. Anything with columns is kind of awkward to deal with. There's a reason why most websites don't use columns: it kind of sucks as a UX. The optimal way to deliver text is a responsive layout that adapts to any screen size and lets you change the font size as well. A lot of scientific paper layouts are optimized to conserve a resource that is no longer relevant: paper real estate. Tiny fonts, multiple columns, etc.
Anyway, I like Simon's solution and how it kind of works. It's kind of funny how some of these LLMs can be so lazy. The thing with the references being omitted is hilarious. I see the same with ChatGPT, where it goes out of its way to never do exactly as you asked and instead just gives you bits and pieces of what you ask for until you beg it to just please FFing do as you're told?! I guess they are trying to save some tokens or GPU time.
The one-column format is fine on a large monitor, but on a small phone I prefer narrower columns, because a wide column would either make the text too small or it would require horizontal panning while reading.
So I consider the two-column format as better for phones, not worse.
I frequently read documents with many thousands of pages, which also contain many figures and tables.
A variable layout, at least for me, makes browsing and searching through such documents much more difficult.
I have never ever seen any advantage in having the text reflow to match whatever window happens to be temporarily used to display the text, except for ephemeral messages that I will never read again.
For anything that I will read multiple times, I want the text to retain the same layout, regardless of what device or window happens to display it. If necessary, I see no problem in adjusting the window to fit the text, instead of allowing changes in the text, which would interfere with my ability to remember it from previous readings.
I really hate it when people fail to provide their technical documentation as PDF documents and are content to just put it on some Web pages.
The usual two-column layout exists because having 40 to 60 characters per line in a single column is wasteful of paper. That is a real issue. But the solution is to make the PDF page narrower. Almost nobody prints these documents anyways; there's no good reason they need to conform to legacy sizes like A4 or letter paper commonly found in office printers. Just choose A5 as the size. People who really need to print can fit two A5 pages on one A4 page, and people who view these documents on a phone screen will also find A5 more convenient.
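For anyone who wants to check the two-A5-on-one-A4 claim, here is a rough sketch of the arithmetic. It assumes the standard ISO 216 dimensions in millimetres; the helper name fits_two_up is made up for illustration and is not from any library.

```python
# ISO 216 paper sizes in millimetres as (width, height), portrait orientation.
A4 = (210, 297)
A5 = (148, 210)


def fits_two_up(page, sheet):
    """Return True if two portrait copies of `page` fit side by side
    on `sheet` after the sheet is rotated to landscape."""
    page_w, page_h = page
    # Rotating the sheet to landscape swaps its width and height.
    sheet_w, sheet_h = sheet[1], sheet[0]
    return 2 * page_w <= sheet_w and page_h <= sheet_h


print(fits_two_up(A5, A4))  # True: 2 * 148 = 296 <= 297 and 210 <= 210
```

The fit is essentially exact because each A-series size is the previous one cut in half, so no scaling is needed when printing two-up.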