Can't speak to firmware code or complex cryptography, but my hunch is that if it's in its training dataset and you know enough to guide it, it's generally pretty useful.
Presumably humanity still has room to grow and not everything is already in the training set.
This rather suggests that the kinds of performance optimizations you ask for are very "standard".
If you really care about using the hardware effectively, optimizing the code involves so much more than what you describe.