1. Yes, GPT-4 Turbo is measurably getting lazier at coding. I benchmarked the last 2 updates to GPT-4 Turbo, and it got lazier with each one.
2. For coding, asking GPT-4 Turbo to emit code changes as unified diffs causes a 3X reduction in lazy coding.
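For anyone unfamiliar with the format in point 2, here is a minimal sketch of what a unified diff looks like, built with Python's standard `difflib` module. The `greet.py` file name and its contents are made up for illustration; this only shows the output format the model is asked to produce, not the benchmark setup itself.

```python
import difflib

# Hypothetical "before" and "after" versions of a small source file.
original = '''def greet(name):
    print("Hello " + name)
'''.splitlines(keepends=True)

updated = '''def greet(name):
    print(f"Hello, {name}!")
'''.splitlines(keepends=True)

# unified_diff yields the header lines, the @@ hunk marker, and then
# context lines (leading space), removals (-) and additions (+).
diff_text = "".join(
    difflib.unified_diff(
        original,
        updated,
        fromfile="a/greet.py",
        tofile="b/greet.py",
    )
)
print(diff_text)
```

The point of the format is that every changed line has to be written out explicitly with a `-` or `+` prefix, which leaves less room for a placeholder comment standing in for real code.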
Here are some articles that discuss these topics in much more detail.
- System prompt 1: https://sharegpt.com/c/osmngsQ
- System prompt 2: https://sharegpt.com/c/9jAIqHM
- System prompt 3: https://sharegpt.com/c/cTIqAil Note: I had to nudge ChatGPT on this one.
All of this is anecdotal, but perhaps this style of prompting would be useful to benchmark.
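If someone does want to benchmark prompts like these, one rough way to put a number on "laziness" is to count how many responses contain placeholder comments where code was elided. The sketch below is only an assumption about how that could be done: the regex patterns, `is_lazy` and `lazy_rate` are hypothetical names and heuristics, not taken from any existing benchmark.

```python
import re

# Hypothetical heuristics for comments that stand in for omitted code.
LAZY_PATTERNS = [
    # A comment that starts with an ellipsis, e.g. "# ... rest of the code ..."
    re.compile(r"^\s*(#|//)\s*\.\.\.", re.MULTILINE),
    # A comment that says the implementation is left out.
    re.compile(
        r"^\s*(#|//).*\b(rest of the code|implementation goes here)\b",
        re.IGNORECASE | re.MULTILINE,
    ),
]


def is_lazy(response: str) -> bool:
    """Return True if the response contains a placeholder-style comment."""
    return any(pattern.search(response) for pattern in LAZY_PATTERNS)


def lazy_rate(responses: list[str]) -> float:
    """Fraction of responses flagged as lazy (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(is_lazy(r) for r in responses) / len(responses)


# Example: one elided response and one complete response.
sample_responses = [
    "def f():\n    # ... rest of the code ...\n    pass\n",
    "def f():\n    return 1\n",
]
print(lazy_rate(sample_responses))
```

Running the same set of coding tasks under each system prompt and comparing the resulting rates would give at least a crude comparison, though a pattern match like this will miss some lazy responses and misflag some legitimate comments.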