I think the gap between attributable knowledge and absorbed knowledge is pretty difficult to bridge. For news stuff, if I read the same general story in the NYT, the LA Times, and WaPo, I'll start to get confused about which bit I got from which publication. In some ways, being able to quote long passages verbatim is a failure to generalize that should be fixed rather than reinforced.
The other way to do it, though, is to clearly document the training data as a whole, even if you can't cite a specific entry in it for a particular bit of generated output. That would get useless quickly, though, as you'd eventually end up with one big citation: "The Internet".