zlacker

1. biophy+(OP)[view] [source] 2025-12-05 21:33:44
I thought adversarial testing like this was a routine part of software engineering. He's checking to see how flexible it is. Maybe prompting would help, but it would be cool if it was more flexible.
replies(2): >>Benjam+N8 >>genrad+zq
2. Benjam+N8[view] [source] 2025-12-05 22:26:42
>>biophy+(OP)
So what's the idea? What does a successful outcome look like for this test, in your mind? What should good software do? Respond and say there are 5 legs? Or question what kind of dog this even is? Or get confused by a nonsensical picture that doesn't quite match the prompt? Should it understand the concept of a dog and be able to tell you that this isn't a real dog?
replies(2): >>biophy+Ge >>menaer+S61
◧◩
3. biophy+Ge[view] [source] [discussion] 2025-12-05 23:04:02
>>Benjam+N8
No, it’s just a test case to demonstrate flexibility when faced with unusual circumstances.
4. genrad+zq[view] [source] 2025-12-06 00:35:21
>>biophy+(OP)
You're correct. However, midwits who don't fully understand all of this will latch on to one of the early difficult questions that was shown as an example, and then keep using it over and over without really knowing what they're doing, while the people developing and testing the model are doing far more complex things.
◧◩
5. menaer+S61[view] [source] [discussion] 2025-12-06 09:30:00
>>Benjam+N8
You know, I had a potential hire last week. I was interviewing a guy whose resume was really strong; it was exceptional in many ways, and his open-source code looked really tight. But at the beginning of the interview, I always show candidates the same silly code example with signed integer overflow undefined behavior baked in. I did the same here and asked him if he saw anything unusual in it, and he failed to detect it. We ended the round immediately and I gave a no-hire decision.
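
The snippet used in that interview isn't shown, so the following is only a hypothetical illustration of the kind of gotcha being described, not the actual example. It assumes C; the function names `add_overflows_buggy` and `add_overflows` are made up for this sketch. The first function tries to detect overflow after doing the addition, which is undefined behavior for signed `int`, so a compiler is allowed to assume it never happens and drop the check. The second does the comparison before adding, which stays well-defined.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical "silly example": looks like an overflow check for b > 0,
 * but a + b is evaluated first. If it exceeds INT_MAX, that is signed
 * integer overflow, which is undefined behavior in C, so an optimizer
 * may legally reduce this to "return false". */
bool add_overflows_buggy(int a, int b) {
    return a + b < a;
}

/* Well-defined alternative: compare against the limits before adding,
 * so no overflowing arithmetic is ever performed. */
bool add_overflows(int a, int b) {
    if (b > 0) {
        return a > INT_MAX - b;
    }
    if (b < 0) {
        return a < INT_MIN - b;
    }
    return false;
}
```

Whether spotting this on a whiteboard predicts anything is a separate question; the sketch only shows what "signed integer overflow undefined behavior baked in" might look like.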
replies(1): >>michae+9x1
◧◩◪
6. michae+9x1[view] [source] [discussion] 2025-12-06 14:19:35
>>menaer+S61
Does the ability to verbally detect gotchas in short conversations dealing only with text on a screen or white board really map to stronger candidates?

In actual situations you have documentation, an editor, tooling, and tests, and you are a tad less distracted than when dealing with a job interview and all the attendant stress. Isn't the fact that he actually produces quality code in real life a stronger signal of quality?
