We don't yet know how courts will rule on cases like Does v. GitHub (https://githubcopilotlitigation.com/case-updates.html). LLM-based systems are not even capable of practicing clean-room design (https://en.wikipedia.org/wiki/Clean_room_design). For a maintainer to accept code generated by an LLM is to put the entire community at risk, as well as to endorse a power structure that mocks consent.
This is similar to Judge Alsup’s ruling in the Anthropic books case that the training is “exceedingly transformative”. I would expect a reinterpretation of, or disagreement with, that holding in another case to be both problematic and likely to be overturned eventually.
I don’t actually think provenance is a problem on the axis you suggest if Alsup’s ruling holds. That said, I don’t think that’s the only copyright issue afoot: the Copyright Office’s writing on the copyrightability of machine-generated outputs essentially requires that the output fail the Feist tests for human copyrightability.
More interesting to me is how this might further realign the notion of copyrightability of human works as time goes on, moving from every trivial derivative bit of trash being potentially copyrightable toward some stronger notion of, following the Feist test, independent creation and creativity. It also raises a fairly immediate question in an open source setting: do many individual small patch contributions even pass those tests themselves? They may well not, although the general guidance is to set the bar low - but does a typo fix clear even that bar? There is a long way to go down this rabbit hole.
I swear the ML community rapidly changes its mind as to whether "training" an AI is comparable to human cognition, depending on whichever position benefits it at any given instant.