As other folks have commented, CUDA not being an open standard is a large part of the problem. That and the developers who target CUDA directly when writing Stable Diffusion algorithms—they are forcing the monopoly. Even at the cost of not being able to squeeze every ounce out of the GPU, portability greatly improves software access when people target Vulkan et al.