zlacker

1. NavinF (OP) 2023-05-16 22:47:06
> distributed training

Unfortunately this isn't really a thing. E.g. the latency of synchronizing gradients (and batch-norm statistics) across sites every step leaves your GPUs idle. Unless all your hardware is in the same building, training a single model would be so inefficient that it's not worth it.
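A rough back-of-envelope illustrates the point. All numbers below are illustrative assumptions (model size, link speed, step time), not measurements from any real setup:

```python
# Back-of-envelope: fraction of time GPUs sit idle waiting on
# cross-site gradient synchronization over a WAN link.
# Every constant here is an assumption for illustration only.

model_params = 1e9            # assumed 1B-parameter model
bytes_per_param = 2           # fp16 gradients
grad_bytes = model_params * bytes_per_param

wan_bandwidth = 1e9 / 8       # assumed 1 Gb/s inter-site link, in bytes/s
wan_rtt = 0.05                # assumed 50 ms round-trip latency

step_compute = 0.5            # assumed forward+backward time per step, seconds

# Naive synchronization: ship the full gradient out and the averaged
# gradient back once per step.
sync_time = 2 * grad_bytes / wan_bandwidth + wan_rtt

idle_fraction = sync_time / (sync_time + step_compute)
print(f"sync per step: {sync_time:.2f}s, GPUs idle {idle_fraction:.0%} of the time")
```

With these assumed numbers the sync takes tens of seconds per half-second compute step, so the GPUs are idle the vast majority of the time; inside a building, 100+ Gb/s interconnects shrink the same sync by orders of magnitude.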
