zlacker

[return to "How does misalignment scale with model intelligence and task complexity?"]
1. jmtull+P8[view] [source] 2026-02-03 01:26:43
>>salkah+(OP)
The comments so far seem focused on taking cheap shots, but as somebody working on using AI to help people with hard, long-term tasks, I find it a valuable piece of writing.

- It's short and to the point

- It's actionable in the short term (make sure the tasks per session aren't too difficult) and useful for researchers in the long term

- It's informative on how these models work, informed by some of the best in the business

- It gives us a specific vector to look at, clearly defined ("coherence", or, more fun, "hot mess")

◧◩
2. kernc+rh[view] [source] 2026-02-03 02:22:14
>>jmtull+P8
Other actionable insights are:

- Merge amendments up into the initial prompt.

- Evaluate prompts multiple times (ensemble).
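The ensemble idea can be sketched in a few lines: run the same prompt several times and keep the majority answer. A minimal sketch — `query_model` is a hypothetical stand-in for whatever model API you use, stubbed here so the snippet runs standalone:

```python
from collections import Counter

def query_model(prompt: str) -> str:
    # Hypothetical model call; replace with your provider's API.
    # Stubbed with a fixed answer so the sketch runs standalone.
    return "42"

def ensemble(prompt: str, n: int = 5) -> str:
    """Evaluate the same prompt n times and keep the majority answer."""
    answers = [query_model(prompt) for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

print(ensemble("What is 6 * 7?"))
```

With a real (sampled, nonzero-temperature) model, the runs disagree and the majority vote is what buys you the extra reliability.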

◧◩◪
3. sandos+4e1[view] [source] 2026-02-03 10:36:59
>>kernc+rh
Sometimes, when stressed, I have used several models to verify each other's work. They usually find problems, too!

This is very useful for things that take time to verify: we have CI jobs that take 2-3 hours to run, and I hate it when they fail because of a syntax error.
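The cross-checking pattern is simple to wire up: hand one model's answer to the others and collect whatever problems they report. A minimal sketch with the reviewer calls stubbed as lambdas (hypothetical; in practice each would be a call to a different model):

```python
from typing import Callable, List

def review(candidate: str, reviewers: List[Callable[[str], str]]) -> List[str]:
    """Ask each reviewer model to check another model's output.

    A reviewer is any callable taking a prompt and returning a verdict
    string; anything other than "OK" is collected as a problem.
    """
    problems = []
    for ask in reviewers:
        verdict = ask(f"Find mistakes in the following answer:\n{candidate}")
        if verdict.strip().upper() != "OK":
            problems.append(verdict)
    return problems

# Stub reviewers standing in for real model calls (hypothetical):
reviewers = [lambda p: "OK",
             lambda p: "Possible off-by-one in the loop"]
print(review("for i in range(n + 1): ...", reviewers))
```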

◧◩◪◨
4. xmcqdp+Mz1[view] [source] 2026-02-03 13:13:07
>>sandos+4e1
Syntax errors should be caught by type checking / compiling / linting. That should not take 2-3 hours!
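For Python projects, a seconds-long gate like this in front of the slow pipeline catches pure syntax errors without waiting for CI. A minimal sketch using the stdlib `py_compile` module:

```python
import pathlib
import py_compile

def syntax_check(root: str = ".") -> list:
    """Byte-compile every .py file under root; return any syntax errors."""
    errors = []
    for path in pathlib.Path(root).rglob("*.py"):
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as err:
            errors.append(str(err))
    return errors
```

Run it as a pre-commit hook or as the first CI stage and fail fast on a non-empty result.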
[go to top]