Given how common and widespread misattribution of code is on GitHub, I'd say there is a strong argument that GitHub can be held responsible for this mess precisely because it is such a well-known issue. (I mean a moral argument rather than a legal one; I'm not an IP lawyer and will leave judgements regarding legal liability to the professionals.) Rolling out Copilot without addressing it, most likely as you suggest by actually spending more resources on vetting projects and tidying up training data, amounts to gross negligence on GitHub's part, since there is good reason to believe Copilot will exacerbate the problem significantly.