zlacker

[return to "GitHub accused of varying Copilot output to avoid copyright allegations"]
1. Shamel+ba[view] [source] 2023-06-10 14:56:50
>>belter+(OP)
Eh, their argument is simply that they tuned temperature settings to encourage the model to output slight variations on memorized data. But this is kind of just one of many things you do with a language model and certainly doesn’t imply intent to avoid copyright allegations.

Just implies they tuned it for user experience.
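For what it's worth, the temperature knob being described is just a scaling applied to the model's logits before sampling, something like this minimal sketch (the logit values and the `temperature=0.8` default are illustrative, not GitHub's actual settings):

```python
import math
import random

def sample_with_temperature(logits, temperature=0.8):
    """Sample a token index from raw logits, scaled by temperature.

    temperature < 1 sharpens the distribution (closer to always picking
    the most likely token, i.e. the memorized continuation);
    temperature > 1 flattens it, encouraging variation.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1  # guard against floating-point rounding
```

At a very low temperature the sampler degenerates to argmax, i.e. verbatim regurgitation of whatever the model considers most likely; raising it is the standard way to get "slight variations", whatever the motive.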

I was expecting there to be some discovery around them deliberately fine-tuning their model to output modifications if and only if the code had a certain license.

2. Brian_+Ud[view] [source] 2023-06-10 15:20:46
>>Shamel+ba
Why else bother with such a setting? Are randomized outputs more likely to be correct or more useful?
3. GuB-42+mr1[view] [source] 2023-06-10 22:40:36
>>Brian_+Ud
It is worthwhile for creative writing. For example, if you ask ChatGPT to write a short story, you want some originality. Even when asking for an explanation it can be useful, as you may want to regenerate a few times and pick the explanation that speaks to you the most.

But here we are talking about autocompleting code. I don't think programmers want the autocompleter to be creative. They want the exact same solution everyone uses, hopefully the right one, with only minor changes so that it matches their style and uses their own variable names. In my case, I am the programmer, I decide what to do; I just want my autocompleter to save me some keystrokes and copy-pasting boilerplate from the web, and the more it looks like existing code the better. I have enough work fixing my own bugs, thank you.

Speaking of bugs, how come everyone talks about code generation, which, I think, doesn't bring that much value? Sure, it saves a few keystrokes and some copy-pasting from StackOverflow, but I don't feel it is the thing programmers spend most of their time doing. Dealing with bugs is. By bugs, I mean both the big ones that have tickets and can take days to analyze and fix, and the ones that are just a normal part of writing code, like simple typos that result in compiler errors. I think machine learning could be of great help here.

Just a system that tells me "hey, look here, this is not what I expected to see" would be of great help. Unexpected doesn't mean there is a bug, but it is something worth paying attention to. I know it has been done, but few people seem to talk about it. Or maybe a classifier trained on bug-fix commits: if a piece of code looks like code that was changed in a bug-fix commit, there is a good chance it is also a bug. Have it integrated into the IDE, highlighting the suspicious part as I type, just as modern IDEs highlight compilation errors in real time.
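The bug-fix-commit classifier idea could be prototyped as something as simple as naive Bayes over code tokens, along these lines. Everything here is made up for illustration: the class name, the tokenizer, and the tiny training snippets (assignment-in-condition as the "bug" pattern); a real system would train on actual commit histories and use a proper lexer.

```python
import math
import re
from collections import Counter

def tokenize(code):
    """Crude token split; groups ==, !=, <=, >= so they survive as one token."""
    return re.findall(r"[A-Za-z_]\w*|==|!=|<=|>=|[^\sA-Za-z_]", code)

class BugSniffClassifier:
    """Naive Bayes over code tokens: label 1 = snippet looked like the
    'before' side of a bug-fix commit, label 0 = ordinary code.
    Purely illustrative, not a real product or published model."""

    def __init__(self):
        self.counts = {0: Counter(), 1: Counter()}
        self.totals = {0: 0, 1: 0}
        self.docs = {0: 0, 1: 0}

    def train(self, snippet, label):
        toks = tokenize(snippet)
        self.counts[label].update(toks)
        self.totals[label] += len(toks)
        self.docs[label] += 1

    def suspicion(self, snippet):
        """P(buggy | snippet) under naive Bayes with add-one smoothing."""
        vocab = set(self.counts[0]) | set(self.counts[1])
        v = len(vocab) or 1
        scores = {}
        for label in (0, 1):
            prior = self.docs[label] / max(1, self.docs[0] + self.docs[1])
            logp = math.log(max(prior, 1e-9))
            for t in tokenize(snippet):
                logp += math.log(
                    (self.counts[label][t] + 1) / (self.totals[label] + v))
            scores[label] = logp
        m = max(scores.values())
        e0, e1 = math.exp(scores[0] - m), math.exp(scores[1] - m)
        return e1 / (e0 + e1)
```

An IDE plugin would then run `suspicion()` over the code around the cursor and underline spans that score above some threshold, exactly like real-time compile-error highlighting, just probabilistic.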
