1. OP said “what it is about, when was it published, the author, etc.” That’s what these mechanisms already cover. Consent is an interesting case, and I’ll admit something like ai.txt might handle it better, but my post was mostly responding to the OP.
2. These are all complex formats. If you want to ingest and process them, then you already have to build all the hard parts. Getting the metadata out is dead simple compared to parsing, decoding, and then processing an image, for example.
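
To make point 2 concrete, here is a rough sketch of pulling embedded text metadata out of a PNG using only the Python standard library. I picked PNG purely as an illustration; the function names (`png_text_metadata`, `tiny_png_with_text`) are mine, and it only handles uncompressed `tEXt` chunks, not `zTXt`, `iTXt`, EXIF, or XMP. Note that it never touches the pixel data.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_text_metadata(data: bytes) -> dict:
    """Collect tEXt key/value pairs by walking chunks; no pixel decoding."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    metadata = {}
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt body is: keyword, NUL separator, Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            metadata[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        offset += 12 + length  # length + type + body + CRC
    return metadata


def _chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Serialize one PNG chunk, including its CRC."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def tiny_png_with_text(pairs: dict) -> bytes:
    """Build a 1x1 grayscale PNG carrying the given tEXt pairs (demo helper)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one pixel
    chunks = [_chunk(b"IHDR", ihdr)]
    for key, value in pairs.items():
        chunks.append(_chunk(b"tEXt", key.encode("latin-1") + b"\x00" + value.encode("latin-1")))
    chunks += [_chunk(b"IDAT", idat), _chunk(b"IEND", b"")]
    return PNG_SIGNATURE + b"".join(chunks)


if __name__ == "__main__":
    sample = tiny_png_with_text({"Author": "Jane Doe", "Title": "Example"})
    print(png_text_metadata(sample))
```

The whole extractor is a loop over length-prefixed chunks. Actually rendering the image would mean inflating the `IDAT` stream, undoing the scanline filters, and handling color types and interlacing, which is where the real work is.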