The fact that it sometimes fails at simple algorithms on large numbers, yet performs well on other, more complex algorithms with simple inputs, suggests to me that something on a fundamental level is still insufficient.
>>rdedev+(OP)
You're focusing too much on what the LLM can handle internally. No, LLMs aren't good at math, but they understand mathematical concepts and can use a program or tool to perform calculations.
Your argument is the equivalent of saying humans can't do math because they rely on calculators.
In the end what matters is whether the problem is solved, not how it is solved.
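To make the "use a tool" point concrete, here is a rough sketch of what such a calculator tool could look like on the host side. Everything here is hypothetical: `calculator_tool` is a name I made up, and it isn't the API of any real LLM framework. The assumption is that the model only has to produce the expression string, and the program does the exact arithmetic.

```python
import ast
import operator

# Hypothetical "calculator tool": the model emits an expression string and
# the host program evaluates it exactly with Python's arbitrary-precision ints.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}


def calculator_tool(expression: str) -> int:
    """Evaluate an integer expression using +, -, *, //, % without eval()."""

    def walk(node):
        # Plain integer literal (bools are rejected on purpose).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        # Binary operation from the allow-list above.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        # Unary minus, e.g. "-5".
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Anything else (names, calls, attributes) is refused.
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


# The kind of large-number case the model might get wrong on its own.
print(calculator_tool("123456789123456789 * 987654321987654321"))
```

The model never multiplies anything itself here; it only needs to recognize that a multiplication is required and hand it off.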