Usually random drawn from a certain distribution (ex: Gaussian with std. deviation 0.001, or a std. deviation dependent on the number of input/output units (Xavier initialization)).
For some tasks, you may wish to initialize using a network that was already trained on a different dataset, if you have reason to believe the new training task is similar to the previous task.