On a GPU, double precision (fp64) can cost anywhere from 2 to 64 times as much as single precision (fp32), and typical consumer cards sit at a ratio of 16 or 32. Even on a CPU, fp64 workloads tend to run at about half the speed of fp32 on real data, because every value takes twice the memory bandwidth.
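As a rough illustration of the bandwidth argument, the sketch below compares the bytes that have to be moved for the same number of elements in fp32 and fp64. The element count `N` and the helper `buffer_bytes` are hypothetical names chosen for this example, and it assumes the usual 4-byte `float` and 8-byte `double` storage sizes.

```python
from array import array

N = 1_000_000  # hypothetical element count, chosen only for illustration


def buffer_bytes(typecode: str, n: int) -> int:
    """Return the bytes needed to store n elements of the given array typecode."""
    # itemsize is the per-element storage size in bytes:
    # "f" is a C float (fp32), "d" is a C double (fp64).
    return array(typecode).itemsize * n


fp32_bytes = buffer_bytes("f", N)
fp64_bytes = buffer_bytes("d", N)

# For a memory-bound loop, run time scales roughly with bytes moved,
# so this ratio is a crude estimate of the fp64 slowdown.
bandwidth_ratio = fp64_bytes / fp32_bytes

print(f"fp32: {fp32_bytes} bytes, fp64: {fp64_bytes} bytes, ratio: {bandwidth_ratio}")
```

This only models memory traffic. It says nothing about the GPU ratios above, which come from how much fp64 arithmetic hardware a given card has rather than from bandwidth.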