zlacker

1. gouggo+(OP) 2023-08-15 06:34:18
> Then why web.archive.org isn't also banned?

Because web.archive.org is generally used for...

... things which aren't available from the original source anymore.

archive.is, by contrast, is generally used to bypass paywalls. The two websites have very distinct missions and use cases.

replies(1): >>dredmo+Kc2
2. dredmo+Kc2 2023-08-15 21:15:37
>>gouggo+(OP)
Whilst I agree with your characterisation as regards usage on HN, I will note that Archive Today is actually quite a useful archival tool, and often works on sites that the Internet Archive handles poorly.

I ran across an instance of this when the Diaspora* pod I was on (the original public node, as it happens) ceased operations. I found myself wanting to archive my own posts, and was caught in something of a dilemma:

- The Internet Archive's Wayback Machine has a highly scriptable method for submitting sites, in the form of a URL (see below). Once you have a list of pages you want to archive, you can chunk through those using your scripting tool of choice (for me, bash, and curl or wget typically). But it doesn't capture the comments on Diaspora* discussions. E.g., <https://web.archive.org/web/20220111031247/https://joindiasp...>

- Archive.Today does not have a mass-submission tool, and somewhat aggressively imposes CAPTCHAs at times. So the remaining option is manual submission, though that can be run off a pre-generated list of URLs, which somewhat streamlines the process. And it does capture the comments. E.g., <https://archive.is/9t61g>
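
For anyone wanting to try the Wayback Machine route, here is a rough Python sketch of chunking through a list of pages. It is not the script I actually used (that was bash with curl), and it rests on an assumption: that the Save Page Now endpoint takes the target URL appended to `https://web.archive.org/save/`. The names `build_save_url`, `http_get_status`, and `submit_all`, and the five-second pause, are all made up for illustration.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, Optional

# Assumption: the Wayback Machine's "Save Page Now" endpoint accepts
# the target URL appended to this prefix.
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(target: str) -> str:
    """Return the (assumed) submission URL for one page."""
    return SAVE_PREFIX + target.strip()


def http_get_status(url: str) -> int:
    """Fetch a URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.status


def submit_all(
    targets: Iterable[str],
    fetch: Callable[[str], int] = http_get_status,
    delay_seconds: float = 5.0,
) -> Dict[str, Optional[int]]:
    """Submit each target URL in turn; map it to a status, or None on failure."""
    results: Dict[str, Optional[int]] = {}
    for line in targets:
        target = line.strip()
        # Skip blank lines and '#' comments so a hand-edited list works.
        if not target or target.startswith("#"):
            continue
        try:
            results[target] = fetch(build_save_url(target))
        except OSError:
            # urllib's URLError and HTTPError are both OSError subclasses.
            results[target] = None
        # Pause between requests to stay polite to the archive.
        time.sleep(delay_seconds)
    return results
```

Feeding it an open file of URLs, one per line (say a hypothetical `urls.txt`), works because `submit_all` only needs an iterable of strings. Entries that come back as `None` are the ones to retry or submit by hand.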

So, if you are looking to archive material, Archive Today is useful, if somewhat tedious in bulk.

(Which is probably why the Internet Archive is the far more comprehensive Web archive.)
