If you use LLMs at very high temperature with samplers that actually keep the output coherent (e.g. min_p, or better ones like top-h or P-less decoding), then "regression to the mean" literally DOES NOT HAPPEN.
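For anyone who hasn't seen it, here's a minimal sketch of the min_p idea in plain NumPy. The default `p_base` value and the temperature-before-filter ordering are my assumptions; real implementations vary on both.

```python
import numpy as np

def min_p_sample(logits, temperature=2.0, p_base=0.1, rng=None):
    """Sample one token with min-p truncation: keep only tokens whose
    probability is at least p_base * (probability of the top token),
    then renormalize. The cutoff scales with the model's confidence,
    so the junk tail stays trimmed even at high temperature."""
    rng = rng or np.random.default_rng()
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                               # numerical stability
    probs = np.exp(z)
    probs /= probs.sum()
    probs[probs < p_base * probs.max()] = 0.0  # min-p filter
    probs /= probs.sum()                       # renormalize survivors
    return int(rng.choice(len(probs), p=probs))
```

The point is that the cutoff is relative to the top token's probability: cranking temperature flattens the distribution, but the implausible tail still gets dropped before sampling.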
Image generation has the same problem (lack of support for different SDE solvers, the diffusion counterpart of LLM sampling), but it has its own "coomer" tools, e.g. ComfyUI or Automatic1111.
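To make the analogy concrete, this is roughly what swapping the solver looks like in code, using Hugging Face diffusers; the model ID, prompt, and step count here are just illustrative:

```python
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# Load a pipeline with its default solver (the model ID is illustrative).
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")

# Swap in a multistep DPM-Solver variant, reusing the pipeline's
# existing scheduler config so the noise schedule stays consistent.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

image = pipe("a cabin in the woods", num_inference_steps=25).images[0]
```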
LLMs don't "reason" the way humans do; they predict the next token from statistical regularities in text. So raising the temperature is more likely to produce unexecutable pseudocode than a valid but more esoteric implementation of the problem.
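You can see the mechanism in a toy softmax demo (the logits here are made up, but the flattening effect is the real one):

```python
import numpy as np

def softmax_with_temperature(logits, T):
    """Softmax with temperature T: higher T flattens the distribution,
    shifting probability mass from the few syntactically valid next
    tokens into the long tail of invalid ones."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = np.array([5.0, 3.0, 1.0, 0.5])  # hypothetical next-token scores
for T in (0.5, 1.0, 2.0):
    print(T, softmax_with_temperature(logits, T).round(3))
# At T=0.5 the top token dominates; at T=2.0 the tail tokens gain
# enough mass that a bad token gets sampled far more often.
```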
Code that fails to compile or execute is my default expectation. That's why we feed compile and runtime errors back into the model after each proposal.
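Something like the loop below; `llm_complete` is a hypothetical stand-in for whatever model call you actually use, and the retry cap and timeout are my own additions:

```python
import subprocess
import sys
import tempfile

def llm_complete(prompt: str) -> str:
    """Hypothetical stand-in: call whatever model/API you actually use."""
    raise NotImplementedError

def generate_working_code(task: str, max_rounds: int = 5) -> str:
    prompt = f"Write a Python script that {task}. Return only code."
    for _ in range(max_rounds):          # hard cap: no infinite repair loops
        code = llm_complete(prompt)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        try:
            result = subprocess.run([sys.executable, path],
                                    capture_output=True, text=True, timeout=30)
            if result.returncode == 0:
                return code              # compiled and ran cleanly
            feedback = result.stderr     # compile error or runtime traceback
        except subprocess.TimeoutExpired:
            feedback = "script timed out after 30 seconds"
        # Feed the error back so the next proposal can repair it.
        prompt = (f"This script failed:\n{code}\n\nError:\n{feedback}\n"
                  "Fix it and return only the corrected code.")
    raise RuntimeError("no working code within the retry budget")
```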
I'd much rather the code sometimes not work than get stuck in infinite tool-calling loops.
So you can just, like, tweak it when it's working against your intent in either direction?