zlacker

[return to "Google's new pipe syntax in SQL"]
1. aragon+Nda[view] [source] 2024-08-29 01:57:22
>>heyden+(OP)
> This remains a long-standing pet peeve of mine. PDFs like this are horrible to read on mobile phones, hard to copy-and-paste from ...

I've never understood why copying text from digitally native PDFs (created directly from digital source files, rather than by OCR-ing scanned images) is so often such a poor experience. Even PDFs produced from LaTeX often contain undesirable ligatures in the copied text like fi and fl. Text copied from some Springer journals sometimes lacks spaces between words or introduces unwanted spaces between letters in a word ... Is it due to something inherent in PDF technology?

◧◩
2. mjevan+aea[view] [source] 2024-08-29 02:00:53
>>aragon+Nda
Ligatures like fi, fl, ffi, ffl, etc. are font-specific glyph substitutions meant for rendering correctly on a screen or printer. PDF is intended to be a _rendered_ format, rather than a parseable format.

Well-formatted EPUB and HTML are generally intended to adapt to end-user needs and better fit the available layout space.

◧◩◪
3. WorldM+Ulb[view] [source] 2024-08-29 13:52:35
>>mjevan+aea
Though it's also a stuck legacy throwback. Modern advice would be to not send ligatures directly to the renderer and instead let the renderer query OpenType features (and Unicode/ICU algorithms) to build them itself. PDF's baking of some ligatures into its files seems like a backwards-compatibility mistake, kept to support ancient "dumb" PostScript fonts and pre-Unicode font encodings (or at least pre-Unicode Normalization Forms). It's also partly that PDF has always been confused about whether it is the final renderer in a stack or not.
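
For anyone stuck cleaning up copied text, here is a minimal sketch of how the Unicode Normalization Forms mentioned above relate to baked-in ligatures. It assumes the copied text contains the Latin ligature presentation forms (U+FB00 to U+FB04); `fold_ligatures` is a hypothetical helper name, not part of any PDF library. NFKC applies compatibility decompositions and so turns these characters back into plain letters, while NFC leaves them alone. Note that NFKC also rewrites other compatibility characters (superscripts, full-width forms, and so on), so it may be too aggressive for some text.

```python
import unicodedata


def fold_ligatures(text: str) -> str:
    """Replace compatibility characters, including Latin ligatures, with plain letters.

    Hypothetical helper for illustration: it simply applies NFKC normalization.
    """
    return unicodedata.normalize("NFKC", text)


# Text as it might come out of a PDF: U+FB01 (fi), U+FB02 (fl), U+FB03 (ffi).
copied = "The \ufb01rst \ufb02oor o\ufb03ce"

print(fold_ligatures(copied))  # The first floor office

# NFC keeps the ligature code point as-is, because it only applies
# canonical (not compatibility) mappings.
print(unicodedata.normalize("NFC", "\ufb01") == "\ufb01")  # True
```

This only helps when the PDF maps its ligature glyphs to those Unicode code points. It does nothing for missing or extra spaces, which come from how the text is positioned on the page rather than from the character encoding.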