For example, in NLP a huge amount of pre- and post-processing of data has to happen outside of the GPU.
The Python runtime is slow in general, but anyone using it for ML is not actually relying on the Python runtime for the heavy lifting. The popular ML/AI libraries for Python, like TensorFlow, PyTorch, and NumPy, are essentially thin Python wrappers on top of tens of thousands of lines of C/C++ code. People use Python because it's easy and it has a really good ecosystem of tools and libraries.
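If you want to see the effect for yourself, here is a rough sketch that uses only the standard library. It compares a loop run by the interpreter against the built-in `sum()`, whose loop runs in C inside CPython. The function names (`python_loop_sum`, `builtin_sum`, `compare`) are made up for this example, and the exact numbers will depend on your machine and Python version. It is the same idea as calling into NumPy or PyTorch, just at a much smaller scale.

```python
import timeit


def python_loop_sum(values):
    """Sum values with an explicit loop executed by the Python interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Sum values with the built-in sum(), whose loop runs in C in CPython."""
    return sum(values)


def compare(n=1_000_000, repeats=5):
    """Return (loop_seconds, builtin_seconds), taking the best of `repeats` runs."""
    data = list(range(n))
    loop_time = min(
        timeit.repeat(lambda: python_loop_sum(data), number=1, repeat=repeats)
    )
    builtin_time = min(
        timeit.repeat(lambda: builtin_sum(data), number=1, repeat=repeats)
    )
    return loop_time, builtin_time


if __name__ == "__main__":
    loop_time, builtin_time = compare()
    print(f"interpreter loop: {loop_time:.4f} s")
    print(f"built-in sum():   {builtin_time:.4f} s")
```

Both functions return the same result; only where the loop executes differs. On a typical CPython install the built-in version should come out noticeably faster, which is the whole reason the ML libraries push their inner loops down into compiled code.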