It depends on your task. If you are running a large language model, the bottleneck is likely to be in the ML part (the model inference itself). If the model is shallow, it could be the pre- or post-processing instead.
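The easiest way to find out is to time each stage separately. Here is a minimal sketch of that, not a definitive setup: `preprocess`, `infer`, and `postprocess` are hypothetical stand-ins for your own functions, and the `timed` helper is just something I made up for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
timings = defaultdict(float)

@contextmanager
def timed(stage):
    """Add the time spent inside the `with` block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start

# Hypothetical placeholder stages; swap in your real code.
def preprocess(text):
    return text.lower().split()

def infer(tokens):
    # Stand-in for the model call.
    return [len(t) for t in tokens]

def postprocess(scores):
    return sum(scores)

def run_pipeline(text):
    with timed("preprocess"):
        tokens = preprocess(text)
    with timed("inference"):
        scores = infer(tokens)
    with timed("postprocess"):
        result = postprocess(scores)
    return result

def slowest_stage():
    """Name of the stage with the largest accumulated time."""
    return max(timings, key=timings.get)

if __name__ == "__main__":
    for _ in range(1000):
        run_pipeline("Some example input text")
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.4f}s")
    print("slowest:", slowest_stage())
```

Run it over a realistic batch of inputs rather than a single call, since one-off measurements are noisy. If the model runs on a GPU, keep in mind that work may be launched asynchronously, so you would probably need to synchronize before stopping the timer to get meaningful numbers for the inference stage.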