Any of the polyhedral frameworks is reasonably good at splitting loop nests into parallelizable ones.
Then it's just a codegen problem.
But yes, ultimately, the user needs to be aware of how the language works, what is parallelizable and what isn't, and of the cost of the operations that they ask their computer to execute.
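
To make the "splitting loop nests" point concrete, here is a hand-written sketch of the kind of transformation meant (loop distribution), not the output of any actual polyhedral tool. The function names `fused` and `split` are made up for illustration. The first statement has no dependence between iterations; the second carries a dependence on the previous iteration, so only the first half is parallelizable once the loop is split.

```python
def fused(a, b):
    """One loop mixing an independent statement with a loop-carried one."""
    n = len(a)
    c = [0] * n
    s = [0] * n
    for i in range(n):
        c[i] = a[i] * b[i]                       # independent across iterations
        s[i] = (s[i - 1] if i > 0 else 0) + c[i]  # depends on iteration i - 1
    return c, s


def split(a, b):
    """Same computation after distributing the loop into two."""
    n = len(a)
    # Loop 1: no iteration reads another iteration's result,
    # so this one could be handed to a parallel map.
    c = [a[i] * b[i] for i in range(n)]
    # Loop 2: a running sum, so it stays sequential in this naive form.
    s = [0] * n
    running = 0
    for i in range(n):
        running += c[i]
        s[i] = running
    return c, s


if __name__ == "__main__":
    print(fused([1, 2, 3], [4, 5, 6]) == split([1, 2, 3], [4, 5, 6]))
```

Whether the split version is actually faster still depends on the cost of each operation relative to the overhead of going parallel, which is the part the user has to know about.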