zlacker

In any training with code I've done, we've written a parser that validates samples against Tree-sitter grammars to make sure the code is at least syntactically valid for the known subset of languages we're training on.
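A minimal sketch of that kind of syntactic filter. To stay dependency-free it swaps in Python's built-in `ast` module for Tree-sitter and only handles Python source, so it is a stand-in for the idea and not the actual pipeline. The names `is_syntactically_valid` and `filter_valid` are made up for illustration.

```python
import ast


def is_syntactically_valid(source: str) -> bool:
    """Return True if `source` parses as Python (stand-in for a grammar check)."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError: malformed code; ValueError: e.g. null bytes on some versions.
        return False
    return True


def filter_valid(samples):
    """Keep only the samples that parse; hypothetical pre-training filter step."""
    return [s for s in samples if is_syntactically_valid(s)]


samples = [
    "def add(a, b):\n    return a + b\n",  # parses
    "def broken(:\n    return\n",          # does not parse
]
kept = filter_valid(samples)
```

This only rejects code that fails to parse. It says nothing about whether the code does what it claims.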

replies(1): >>yeptha+u2

>>Grimm1+(OP)
In which case the strategy shifts toward code that looks correct but isn't, using syntax shared between languages as well as language-specific gotchas.

replies(1): >>Grimm1+M5

>>yeptha+u2
Yeah, but if malicious intent is a concern you can just spin up a sandboxed instance and run the code there to check it first.

Really the thing is there's no way to ascribe correctness to a piece of code, right? Even humans fail at this. The only "correct" code is rote algorithmic code that has a well-defined method of operation. And there are likely a lot more correct examples of that, way more than you'd ever be able to poison.

You may be able to mislead, though, by putting names that say one thing on code that does another, but again you'd be fighting against the tide of correctly named things.
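To make the misleading-name case concrete, here is a rough sketch, assuming Python and a hand-written set of expected input/output pairs. The sample passes a syntax check, and only a behavioral check catches it. `matches_expected` and `cases` are hypothetical names, and the `exec` call stands in for the sandboxed run mentioned above; it is not itself a sandbox.

```python
import ast

# Named "add", but the body subtracts: valid syntax, wrong behavior.
misleading_source = "def add(a, b):\n    return a - b\n"

# The syntax check accepts it, since ast.parse only looks at structure.
ast.parse(misleading_source)


def matches_expected(func, cases):
    """Return True if func reproduces every (args, expected) pair."""
    return all(func(*args) == expected for args, expected in cases)


# Load the sample. For untrusted code this belongs inside a real sandbox.
namespace = {}
exec(misleading_source, namespace)

# Hypothetical expectations for something called "add".
cases = [((2, 3), 5), ((0, 0), 0)]
behaves_as_named = matches_expected(namespace["add"], cases)
```

The catch is that someone has to supply `cases`, which is the "no way to ascribe correctness" problem again for anything that isn't rote algorithmic code.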