https://blog.archive.org/2025/07/24/internet-archive-designa...
If the internet archive is already curated for content then yeah there is a 100% chance that there will be more curation of content.
Does anyone have any facts/citations on if this is a myth/coping mechanism I created, or reality?
I hope that all of the world libraries join with the internet archive into a global cooperative.
I also hope there is a secret sub-basement in a different dimension that contains powerful artifacts, guarded by a master librarian.
A man can dream, can’t he?
The submission says:
> These records account for “millions and millions of pages” that can take up entire floors of public libraries, Kahle said. San Diego’s public library gave up its federal depository status in 2020 because its government documents took up so much space and often went unused. [...] The GPO [...] has ramped up efforts to digitize the Federal Depository Library Program.
Does IA now have to store floors upon floors of paper copies of information, at least until it gets digitized? Or are they now merely obliged to host the digital materials insofar as they already exist? That sounds like what they are doing already for the whole web, and also apparently since 2022 when they started "Democracy’s Library, a free online compendium of government research and publications", just that now they're legally obliged to do this or something?
What I find on doi.gov[1] is "The mission of Federal depository libraries is to provide local, free access to information from the Federal government" and nothing really further on what this concretely means. Sounds like just an obligation though?
What I find on gpo.gov[2] is "The Federal Depository Library Program [ensures] that the American public has access to Government information in depository libraries". Could mean anything. The program ensures that, but let's assume that means the designated libraries ensure that, so then do these libraries get extra info that the public doesn't get (but in order to disseminate it to the public)? Makes no sense either.
The GPO page and the submission also say that "Members of Congress may designate up to two qualified libraries." Did they get picked and now it's IA's obligation, or did IA ask for this? What do they get out of it?
[1] https://www.doi.gov/library/collections/federal-documents
[2] https://www.gpo.gov/how-to-work-with-us/agency/services-for-...
Not sure what the appeal of the public library is, when you can have your own.
This is not to disparage the tremendous work done and being done by the IA, it's more of me lamenting the trend of our society and societies to mentally babysit people lest their minds get exposed to something bad, with the implicit assumption that adult humans can't be trusted to see some stupid bs and react with "that was some stupid bs. I am moving it into the stupid bs bucket of things I know about".
What does this mean? Can U.S. Senators unilaterally designate federal depositories?
> "...in response to the enclosed letter I received from the Founder and Digital Librarian of the Internet Archive, Mr. Brewster Kahle, I am designating the Internet Archive as a federal depository library in California."
Which seems a lot more agreeable than unilateral designation (which is also how I initially read this).
One selfish man unwilling to recognize he is doing more harm than good.
Libraries are constantly bringing in new materials and very few are capable of constantly increasing in size to match. I believe national libraries like the Library of Congress tend not to weed, but they do have to offload material to satellite locations and storage facilities.
https://publishers.org/wp-content/uploads/2024/09/2024.09.04...
Brewster has a friend in a state senator and he's trying to do what he can to preserve his section 108 privileges. He's removed over a million items in the past year after being repeatedly sued for copyright infringement, and leaked millions of private communications with patrons including passports and driver licenses. That's the undercurrent here.
Egos aside, the goal isn't to be a library: it's providing access to knowledge. But when your site is on the blocklist at public library terminals because you keep getting flagged for copyright violations and child pornography, maybe you're not on the path.
The "bot" is wrong. Most of the crawl data used by the Internet Archive, particularly the Alexa crawls, isn't publicly accessible. (This is because some of it includes archived pages which have since been suppressed by the site owner - removing those pages from the archived crawl data isn't practical.)
https://archive.org/details/alexacrawls
Common Crawl data is public, but less comprehensive than IA - https://commoncrawl.org/
Tweeting out promotional links to the pages with those materials, while asking for donations on the top of the page? Well, I don't know if that's contempt for artists or just lack of common sense. But when they ask you to take down the material and you refuse...
The depository thing is a distraction. And they do have a habit of sensationalizing things in blog posts. So I understand where that commenter is coming from. Internet Archive is under attack from many sides but much of it is self-inflicted.
It is unwise to push these latter points, even with the utmost care, without having awakened the masses and clarified your stances to decision-makers - it is unwise to be "right" in front of the immature. But the reputation damage remains about wisdom, not about pride.
> Under federal law, members of Congress can designate up to two qualified libraries for federal depository status.
For music, the Music Modernization Act set up a statutory process for making things available, even downloadable. Brewster and others celebrated the measure in blog posts and speaking gigs. Then didn't follow the process, didn't honor polite requests to stop, then got sued for $700 million.
Previously they did some seriously stupid things in their implementation of Controlled Digital Lending, and got the whole concept killed. Not even a debate, just destroyed on summary judgement without even a trial. This set the future many of us want back decades, and ruined a lot of proper efforts that were run much better than the well-intentioned but undermanaged Internet Archive.
Combined with them giving the finger to the fairly innovative and progressive music act, this caused damage not only to reputations, but also the culture.
Regarding copyright basics, we're likely to agree on many positions, including some radical ones. But Internet Archive cannot be a long-term archive, an activist organization, and an open library. There are different laws, risk profiles, and financial/management requirements for each.
And you can't beg people for donations to "save the internet" then set it all on fire to save a bunch of old records that already existed at the Library of Congress. Or act surprised that just because you scan them, it doesn't mean you can then make them available for unlimited download without permission. Again, archives behave differently from libraries. Although it's annoying to tech people, there are good reasons for it.
Brewster likes his honorary library status and degree but he and the site violate the majority of the librarian code of ethics. https://www.ala.org/tools/ethics
As the name implies, Internet Archive started as an archive. Which is very different from a library.
Running an archive is not particularly fun, it is very expensive, and you cannot monetize it without having rights to the things you're archiving. They've never offered research services or grants, and yet the monthly bill and tech debt just keep growing. Last year's hacks showed the state of things, and they leaked patron information and even passports and driver's licenses.
They tried to be a library but didn't follow the law. Hell, they even tried to be a bank at one point and got spanked hard by the feds there, too.
https://ncua.gov/newsroom/press-release/2016/internet-archiv...
With the $700 million lawsuit over old records it became clear that the whole thing is little more than a catch-all for things that Brewster Kahle finds interesting. He's got money and seems like a kind guy. But it's not a well-run organization and he's at retirement age without having put much of a dent in that mission.