>>laiysb+(OP)
I speculate that the agent's context retrieval algorithm is bad, so it does not give the LLM the right context. Today's models should suffice to get the job done.
>>esafak+ek
The cynic in me says that they were probably using an unreleased, state-of-the-art version of their best model, not available to normal customers, and that's the best it could do.