zlacker

[return to "GitHub accused of varying Copilot output to avoid copyright allegations"]
1. Shamel+ba 2023-06-10 14:56:50
>>belter+(OP)
Eh, their argument is simply that they tuned temperature settings to encourage the model to output slight variations on memorized data. But that's just one of many things you do with a language model, and it certainly doesn't imply intent to avoid copyright allegations.

It just implies they tuned it for user experience.
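
To make the temperature point concrete, here's a minimal sketch of temperature-scaled sampling over next-token logits (the numbers and function are made up for illustration, not anything from Copilot). Lower temperature sharpens the softmax toward the single most likely continuation, which is where verbatim memorized text shows up; higher temperature flattens it, so repeated runs drift into slight variations.

    import numpy as np

    def sample_token(logits, temperature=1.0, rng=None):
        """Sample one token index from raw logits after temperature scaling."""
        rng = rng or np.random.default_rng()
        scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
        probs = np.exp(scaled - scaled.max())   # numerically stable softmax
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

    logits = [2.5, 1.0, 0.2, -1.0]  # made-up next-token scores
    print([sample_token(logits, temperature=0.1) for _ in range(5)])  # near-greedy: mostly index 0
    print([sample_token(logits, temperature=1.5) for _ in range(5)])  # flatter: noticeable variation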

I was expecting there to be some discovery showing they deliberately fine-tuned the model to output modifications if and only if the code had a certain license.

2. Brian_+Ud 2023-06-10 15:20:46
>>Shamel+ba
Why else bother with such a setting? Is randomized output more likely to be correct or more useful?
3. slashd+df 2023-06-10 15:29:11
>>Brian_+Ud
I don't know much about AI, but I think one reason you might do that is to learn which variations are preferred (i.e., which are committed unmodified) so you can tune the model in the future. I don't know whether GitHub does that, but given they've cited how often code from Copilot is committed without modification, I assume they're measuring it in at least some cases.
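
If you did want to measure that, a toy version (purely hypothetical, not anything GitHub has described) could just compare each suggestion against what ended up committed:

    def unmodified_rate(pairs):
        """Fraction of (suggestion, committed_code) pairs kept verbatim,
        ignoring leading/trailing whitespace. The pairs are hypothetical."""
        if not pairs:
            return 0.0
        kept = sum(s.strip() == c.strip() for s, c in pairs)
        return kept / len(pairs)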