zlacker

Why else bother with such an input? Are randomizations more likely to be correct or more useful?

replies(8): >>ianbut+T >>slashd+j1 >>brooks+o1 >>2greml+y6 >>cubefo+Fc >>seanhu+Bv >>golemo+sO >>GuB-42+sd1

>>Brian_+(OP)
Potentially more correct, yes. It frees the model to choose lower-probability tokens to some degree (technically, it boosts their probabilities), and those tokens may be more correct depending on the task.

There are also sampling schemes, top_p and top_k, each of which restricts sampling to a set of the most probable tokens. Either one can help choose tokens that are not the single most probable (but still highly probable) yet more correct, and they are often used together for the best effect.
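To make the top_k / top_p idea concrete, here is a minimal sketch in plain Python. The function names are made up for illustration and are not taken from any particular library; real implementations usually work on logits and tensors, but the filtering logic is the same in spirit.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, renormalize."""
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches p, zero out the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]
```

With a toy distribution like [0.5, 0.3, 0.15, 0.05], top_k with k=2 keeps only the first two tokens, while top_p with p=0.9 keeps the first three. Sampling then happens over the filtered distribution, so unlikely tokens in the tail can never be picked.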

And then there are various decoding methods like beam search, where the best overall beam may not contain the best individual token at each step.

By default, a simple greedy search is used, which always chooses the highest-probability next token.
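A toy sketch of the difference between greedy search and beam search. The "model" here is a hypothetical lookup table from a prefix to next-token probabilities, invented purely for illustration; it stands in for a real language model.

```python
import math

# Hypothetical toy model: prefix (tuple of tokens) -> next-token probabilities.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def greedy(model, steps):
    """Always take the single most probable next token."""
    seq, logp = (), 0.0
    for _ in range(steps):
        tok, p = max(model[seq].items(), key=lambda kv: kv[1])
        seq, logp = seq + (tok,), logp + math.log(p)
    return seq, logp


def beam_search(model, steps, width):
    """Keep the `width` best partial sequences at every step."""
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = [
            (seq + (tok,), logp + math.log(p))
            for seq, logp in beams
            for tok, p in model[seq].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
    return beams[0]
```

Here greedy(TOY_MODEL, 2) commits to "a" first (0.6) and ends up with ("a", "x") at total probability 0.33, while beam_search(TOY_MODEL, 2, 2) keeps "b" alive and finds ("b", "x") at 0.36. The best sequence starts with a token that was not the best individual choice.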

>>Brian_+(OP)
I don't know much about AI, but I think one reason you might do that is to learn which variations are preferred (that is, which are committed unmodified) so you can tune the model in the future. I don't know if GitHub does that, but given they've cited how often code from Copilot is committed without modification, I assume they are measuring it at least in some cases.

replies(1): >>Brian_+o8

>>Brian_+(OP)
Huge topic, worth Googling. Short version is that too little randomness limits the solution space, so retrying suboptimal results yields the same problems.

>>Brian_+(OP)
Ye olde Bias-Variance tradeoff

>>slashd+j1
makes sense

>>Brian_+(OP)
Well, temperature 0 means the completion always picks the most "likely" (or "best", after fine-tuning) token, while temperature 1 means choosing each next token stochastically according to its probability (or "goodness", after fine-tuning). Usually some temperature in between is chosen, like 0.7. It's not a priori clear to me which is the best way to do it.
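For anyone who wants to see it spelled out, here is a minimal sketch of temperature sampling, assuming the common formulation where the logits are divided by the temperature before the softmax. The function names are hypothetical, and treating temperature 0 as a plain argmax is a convention assumed here to avoid dividing by zero.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Return the index of the chosen token."""
    if temperature == 0:
        # Assumed convention: temperature 0 means greedy (argmax) decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

With logits [2.0, 1.0, 0.0], the top token gets roughly 0.67 of the probability mass at temperature 1, about 0.87 at temperature 0.5, and about 0.51 at temperature 2. Lower temperatures concentrate mass on the likeliest token; higher ones flatten the distribution.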

>>Brian_+(OP)
Generally, the reason for adding randomness in machine learning is to avoid "local minima" in the search space of the optimization function(s) used for training the model. If your training produces a very smooth descent to an optimum, the model can converge on a solution that is not globally the best. Adding some randomness helps to avoid this.

Specifically for GPT models, the temperature parameter is used to get outputs which are a bit more "creative" and less deterministic. https://help.promptitude.io/en/ai-providers/gpt-temperature

>>Brian_+(OP)
Yes.

>>Brian_+(OP)
It is worthwhile with creative writing. For example, if you ask ChatGPT to write a short story, you want some originality. Even when asking for an explanation it can be useful, as you may want to try several different explanations to find the one that speaks to you the most.

But here we are talking about autocompleting code. I don't think programmers want the autocompleter to be creative. They want the exact same solution everyone uses, hopefully the right one, with only minor changes so that it matches their style and uses their own variable names. In my case, I am the programmer, I decide what to do, I just want my autocompleter to save me some keystrokes and copy-pasting boilerplate from the web, the more it looks like existing code the better. I have enough work fixing my own bugs, thank you.

Speaking of bugs, how come everyone talks about code generation, which, I think, doesn't bring that much value? Sure, it saves a few keystrokes and some copy-pasting from StackOverflow, but I don't feel like that is what programmers spend most of their time doing. Dealing with bugs is. By bugs, I mean the big ones that have tickets and can take days to analyze and fix, but also the ones that are just a normal part of writing code, like simple typos that result in compiler errors. I think that machine learning could be of great help here.

Just a system that tells me "hey, look here, this is not what I expected to see" would be of great help. Unexpected doesn't mean there is a bug, but it is something worth paying attention to. I know it has been done, but few people seem to talk about it. Or maybe a classifier trained on bug fix commits. If a piece of code looks like code that has been changed in a bug fix commit, there is a good chance it is also a bug. Have it integrated into the IDE, highlighting the suspicious part as I type, just as modern IDEs highlight compilation errors in real time.
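A very rough sketch of the "this is not what I expected to see" idea, using a bigram count model over code tokens as a stand-in for a real language model. Everything here (the function names, the threshold, the tiny corpus) is hypothetical and only meant to show the shape of the approach: flag a token when it is improbable given what came before it.

```python
from collections import Counter, defaultdict


def train_bigrams(token_sequences):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for tokens in token_sequences:
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts


def flag_unexpected(counts, tokens, threshold=0.1):
    """Return (position, token, probability) for tokens that are unlikely
    given the token before them."""
    flagged = []
    for pos, (prev, cur) in enumerate(zip(tokens, tokens[1:]), start=1):
        seen = counts.get(prev)
        if not seen:
            continue  # no statistics for this context, so stay silent
        prob = seen[cur] / sum(seen.values())
        if prob < threshold:
            flagged.append((pos, cur, prob))
    return flagged


# Hypothetical training corpus of already-tokenized lines.
corpus = [
    ["for", "i", "in", "range", "(", "n", ")", ":"],
    ["for", "j", "in", "range", "(", "m", ")", ":"],
    ["for", "k", "in", "range", "(", "n", ")", ":"],
]
model = train_bigrams(corpus)
suspicious = flag_unexpected(model, ["for", "i", "on", "range", "(", "n", ")", ":"])
```

On this toy input, only the typo "on" (position 2) is flagged, since "on" never follows "i" in the corpus. A flag means "unexpected", not "bug", which matches the point above: it is just a hint about where to look.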