I've written _a lot_ of open source MIT licensed code, and I'm on the fence about that being part of the training data. I've published it as much for other people to use for learning purposes as I did for fun.
I also build and sell closed source commercial JavaScript packages, and more than likely those have ended up in the training data as well. Obviously without consent. So this is why I feel strong about making this separation between code and media, from my perspective it all has the same problem.