The Dubai Debt Trap

>>ur-wha+dh
Disable JavaScript and the entire article loads on The Economist.

>>flerch+5q
Lynx is the best reader for The Economist.

>>Scound+sD
I don't love reading long articles in fixed-width fonts.

>>titano+bS
Then pipe the output to a PS/PDF generator.

For most modern Web publishing, this is mainly a matter of finding and extracting the <article> block, along with the metadata (title, byline, dateline).
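
As a rough illustration of that extraction step, here's a minimal sketch using only Python's standard-library html.parser. It assumes the page has a single top-level <article> element and that the <title> element is good enough as the headline. The names ArticleExtractor and extract_article are made up for this example, and byline/dateline handling is left out because that markup varies by site.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the <title> text and the text inside the first <article>.

    Hypothetical helper for illustration; real pages often need
    site-specific handling for bylines, datelines, and nested markup.
    """

    SKIP = {"script", "style", "noscript"}  # never article prose

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._article_depth = 0  # > 0 while inside <article>
        self._skip_depth = 0     # > 0 while inside a skipped element
        self._done = False       # True once the first <article> has closed

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self._article_depth == 0:
            self._in_title = True
        elif tag == "article" and not self._done:
            self._article_depth += 1
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "article" and self._article_depth:
            self._article_depth -= 1
            if self._article_depth == 0:
                self._done = True
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._article_depth and not self._skip_depth:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self.chunks.append(text)


def extract_article(html):
    """Return (title, body_text) for an HTML document given as a string."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), "\n\n".join(parser.chunks)
```

Each text node becomes its own chunk, so inline markup such as <em> will split a paragraph. That's fine for a sketch, but a real pipeline would want to track block-level elements before handing the text to a PS/PDF generator.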

html-xml-utils is quite useful for this.

I'd created a WaPo extractor that reduced page size by about 95% and stripped the nags, paywalls, etc. The endpoint was HTML, but it could just as easily have generated PDF or ePub if I'd wanted.
