zlacker

25 comments
1. simonw+(OP)[view] [source] 2024-08-24 16:00:38
Google: you are a web company. Please learn to publish your research papers as web pages.
replies(5): >>orange+81 >>esteba+K1 >>lrem+R2 >>irrela+R3 >>jml7c5+ma
2. orange+81[view] [source] 2024-08-24 16:10:55
>>simonw+(OP)
I expected to see some eldritch CSS monstrosity, but no, it's just a PDF. A well-formatted one, at that.

What’s your issue there?

replies(2): >>simonw+x1 >>Voodoo+J1
◧◩
3. simonw+x1[view] [source] [discussion] 2024-08-24 16:13:43
>>orange+81
Reading two-column PDFs on a mobile phone sucks.

Plus I can't use web tools, like "Read this page" in Mobile Safari.

And copying and pasting is harder.

And I can't link to individual sections.

I'm honestly baffled by people who prefer PDFs for this kind of information. Are they printing them out on paper and going at them with a highlighter or something?

replies(3): >>lrem+f3 >>tmoert+N8 >>samatm+q21
◧◩
4. Voodoo+J1[view] [source] [discussion] 2024-08-24 16:15:28
>>orange+81
PDF content is not web-indexed. Their Google Scholar link doesn't even work either.
5. esteba+K1[view] [source] 2024-08-24 16:15:57
>>simonw+(OP)
Conference papers use templates. It's not like Google can choose.
replies(1): >>simonw+R1
◧◩
6. simonw+R1[view] [source] [discussion] 2024-08-24 16:16:43
>>esteba+K1
They can choose to publish it in both HTML and PDF.
replies(3): >>lrem+Z3 >>goodfi+sjc >>Donald+and
7. lrem+R2[view] [source] 2024-08-24 16:24:44
>>simonw+(OP)
That’s not a blog post. It's an academic preprint; I imagine the format is as prescribed.
◧◩◪
8. lrem+f3[view] [source] [discussion] 2024-08-24 16:27:24
>>simonw+x1
Indeed. That’s the easiest way to show your students/professor/coworkers which are the crucial bits.
9. irrela+R3[view] [source] 2024-08-24 16:32:22
>>simonw+(OP)
Seriously. It’s not like that was the actual purpose of HTML or anything.
◧◩◪
10. lrem+Z3[view] [source] [discussion] 2024-08-24 16:32:56
>>simonw+R1
Maybe. Maybe not. Depends on the publisher’s terms.
◧◩◪
11. tmoert+N8[view] [source] [discussion] 2024-08-24 17:15:36
>>simonw+x1
Just my personal take, but when I have to read something carefully, I find it easier to do on paper.

For example, I recently wrote an article about taking random samples using SQL. Even though I was writing it for my blog, which is HTML, I proofread the article by rendering it as a PDF doc, printing it out, and reviewing it with a blue pen in hand.

What surprised me is that I also found it easier to review the article on the screen when it was in PDF format. TeX just does a way better job of putting words on a page than does a web browser.

Actually, if you want to do the comparison yourself, I'll put both versions online:

HTML: https://blog.moertel.com/posts/2024-08-23-sampling-with-sql....

PDF: https://blog.moertel.com/images/public_html/blog/pix-2024060...

I don't think either version is hard to read, but if I had my choice, I'd read the PDF version. But maybe that's just me.

Let me know which you prefer.

replies(1): >>LeonB+Pg
12. jml7c5+ma[view] [source] 2024-08-24 17:27:07
>>simonw+(OP)
I really wish that browsers had developed first-class support for offline web page bundles. There's no way to share a page that is guaranteed to be self-contained and not hit the network, especially if you want to use javascript. It's particularly frustrating since browsers supported offline mode as far back as the 90s; it just needed to be combined with support for loading from zipped folders.

That simple change would've largely solved the academic paper problem decades ago. It's bizarre that it still isn't a feature.

replies(3): >>simonw+Ba >>abelch+Yw >>smsm42+B7j
◧◩
13. simonw+Ba[view] [source] [discussion] 2024-08-24 17:29:49
>>jml7c5+ma
One option here is to inline all assets (images etc.) as base64 data URIs. The HTML page ends up huge, but it will at least be self-contained.
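A minimal sketch of that inlining step, in Python. The function name `inline_images` and the `assets` mapping are made up for illustration; a real tool would parse the HTML properly and would also need to handle CSS, fonts, and scripts.

```python
import base64
import re


def inline_images(html: str, assets: dict) -> str:
    """Replace src="name" with a base64 data URI for every name found in assets.

    `assets` maps a file name to a (mime_type, raw_bytes) tuple.
    Names that are not in `assets` are left untouched.
    """
    def repl(match):
        name = match.group(2)
        if name not in assets:
            return match.group(0)  # unknown asset: keep the original attribute
        mime, data = assets[name]
        encoded = base64.b64encode(data).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{encoded}{match.group(3)}"

    # Naive pattern: only handles double-quoted src attributes.
    return re.sub(r'(src=")([^"]+)(")', repl, html)


page = '<img src="a.png"><img src="b.png">'
print(inline_images(page, {"a.png": ("image/png", b"hi")}))
```

Base64 adds roughly a third to the size of each asset, which is where the bloat comes from.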
replies(1): >>jml7c5+bc
◧◩◪
14. jml7c5+bc[view] [source] [discussion] 2024-08-24 17:42:35
>>simonw+Ba
Yes, but it's not guaranteed to be self-contained. I wouldn't want to open a random HTML file knowing that it could phone home, or that the content might break one day without me realizing. There's a practical and psychological aspect to sharing `steves_paper_2014.html` versus `steves_paper_2014.offlinesitebundle`. The latter feels safe and immutable.
replies(1): >>irq-1+6s
◧◩◪◨
15. LeonB+Pg[view] [source] [discussion] 2024-08-24 18:16:42
>>tmoert+N8
On a mobile phone, as a reader with photophobia, the PDF causes physical pain and is illegible, whereas the HTML is perfectly readable via reader mode (where text can be enlarged and dark mode settings are respected).
replies(1): >>tmoert+pi
◧◩◪◨⬒
16. tmoert+pi[view] [source] [discussion] 2024-08-24 18:28:16
>>LeonB+Pg
Thanks for sharing this perspective! HTML is a lot more accessible in general than PDF documents.

QQ: Do the math formulas render properly in reader mode for you? (On my test with Chrome, the answer seems to be no.)

replies(1): >>LeonB+9m
◧◩◪◨⬒⬓
17. LeonB+9m[view] [source] [discussion] 2024-08-24 18:54:53
>>tmoert+pi
I don’t think the formulas are rendered in reader view. (iOS Safari)

In the browser (iOS Safari) I use an extension (dark reader) to give it a dark theme, and the formulas render just fine there.

replies(1): >>LeonB+K21
◧◩◪◨
18. irq-1+6s[view] [source] [discussion] 2024-08-24 19:40:34
>>jml7c5+bc
What you want is an HTML tag or response header that restricts network access, which the browser can then enforce. Whether it's fully offline or a list of allowed domains, this would be great for security in general. Not so great for advertisers though.
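The closest existing mechanism is probably Content-Security-Policy, which can be sent as a response header or embedded as a `<meta>` tag. A rough sketch in Python of stamping a locked-down policy into a page; the helper name and the exact policy string are assumptions for illustration, not a vetted configuration.

```python
# Assumed policy: block everything by default, allow only inlined
# images (data: URIs) and inline styles.
OFFLINE_CSP = "default-src 'none'; img-src data:; style-src 'unsafe-inline'"


def add_offline_csp(html: str) -> str:
    """Insert a CSP <meta> tag right after <head>, or prepend it if there is no <head>."""
    meta = f'<meta http-equiv="Content-Security-Policy" content="{OFFLINE_CSP}">'
    if "<head>" in html:
        return html.replace("<head>", "<head>" + meta, 1)
    return meta + html


print(add_offline_csp("<html><head><title>paper</title></head><body></body></html>"))
```

This only limits what the page itself can load. It does not, for example, stop a reader from following a link out of the page.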
replies(1): >>mewpme+fMc
◧◩
19. abelch+Yw[view] [source] [discussion] 2024-08-24 20:19:01
>>jml7c5+ma
I believe this is what you are looking for (web archive format): https://en.wikipedia.org/wiki/WARC_(file_format)?oldformat=t...
replies(1): >>jml7c5+sD
◧◩◪
20. jml7c5+sD[view] [source] [discussion] 2024-08-24 21:17:12
>>abelch+Yw
Browsers don't have native support for opening WARC. It doesn't solve the safety problem either: you can still construct a WARC that phones home, AFAIK.

It's a great format for the problem it solves, but if browsers supported offline-only files the container format wouldn't (and shouldn't) need to be that complicated.

◧◩◪
21. samatm+q21[view] [source] [discussion] 2024-08-25 00:20:21
>>simonw+x1
Personally, I send them to GoodReader on a 13" iPad.

I don't know that I'd go so far as to say I 'prefer' this, but there are a lot of PDFs out there, this works fine, and it's a nice change of pace given how much time I spend in front of a monitor / laptop screen.

◧◩◪◨⬒⬓⬔
22. LeonB+K21[view] [source] [discussion] 2024-08-25 00:22:51
>>LeonB+9m
(Minor correction, the plugin is called ‘dark night’)
◧◩◪
23. goodfi+sjc[view] [source] [discussion] 2024-08-29 01:54:51
>>simonw+R1
If you replace arxiv.org in a link with ar5iv.org, it will auto-translate the paper to HTML if possible.
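As a small Python sketch of that host swap (the function name and the paper ID in the example are placeholders; whether a given paper actually converts is up to the service):

```python
from urllib.parse import urlsplit, urlunsplit


def to_ar5iv(url: str) -> str:
    """Swap the arxiv.org host for ar5iv.org; leave any other URL unchanged."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ("arxiv.org", "www.arxiv.org"):
        parts = parts._replace(netloc="ar5iv.org")
    return urlunsplit(parts)


# Placeholder paper ID, for illustration only.
print(to_ar5iv("https://arxiv.org/abs/1234.56789"))
```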
◧◩◪◨⬒
24. mewpme+fMc[view] [source] [discussion] 2024-08-29 07:30:47
>>irq-1+6s
Then you have to verify that the tag is there, right? But if it has another extension like .offlinebundle, you can know that browsers will not make any extra requests.
◧◩◪
25. Donald+and[view] [source] [discussion] 2024-08-29 13:22:26
>>simonw+R1
Translating LaTeX to HTML is not a straightforward process, unfortunately. Many people have tried to implement automated translation systems, but nothing has really worked out yet.

I think it's unfair to expect the research team to invest additional hours in learning how to make good websites, so to solve your problem would require hiring additional talent whose only job is to translate academic PDFs into accessible web pages. I don't think that's a bad idea, and certainly Google has the funds to do something like that, but I don't imagine they'd find it to be a good use of money. Accessibility is an afterthought for most major companies these days.

◧◩
26. smsm42+B7j[view] [source] [discussion] 2024-08-31 22:52:34
>>jml7c5+ma
Mail clients kinda do that (or at least they can, if asked to). Also, why would academic papers need JS anyway? CSS and images, I can get, but beyond that there's no need for anything fancier.
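For what it's worth, the mechanism mail clients use is MIME `multipart/related`, where the HTML refers to attached parts by `cid:` URL. A small sketch with Python's standard `email` package; the function name, the Content-ID, and the fake image bytes are placeholders.

```python
from email.message import EmailMessage


def build_bundle(html: str, image: bytes, cid: str) -> EmailMessage:
    """Wrap an HTML body and one PNG image into a multipart/related message.

    The HTML is expected to reference the image as src="cid:<cid>".
    """
    msg = EmailMessage()
    msg.set_content(html, subtype="html")
    # add_related converts the message to multipart/related and attaches the image.
    msg.add_related(image, maintype="image", subtype="png", cid=f"<{cid}>")
    return msg


bundle = build_bundle(
    '<p>Figure 1: <img src="cid:fig1@paper.example"></p>',
    b"\x89PNG placeholder bytes",  # not a real image
    "fig1@paper.example",
)
print(bundle.get_content_type())
```

Everything the page needs travels in the one message, so nothing has to be fetched to render it.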