zlacker

1. stonog+(OP) 2025-12-04 21:33:58
Am I reading this wrong, or does this only support FP16 inputs and compare its performance against an FP32 solver?
replies(1): >>Bulat_+kT4
2. Bulat_+kT4 2025-12-06 10:31:48
>>stonog+(OP)
They compare HGEMM implementations. cuBLAS, at least, has HGEMM functions (e.g. cublasHgemm).

HGEMM means half-precision (i.e. FP16) general matrix multiplication.
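
To make the definition concrete, here is a rough pure-Python sketch of what an HGEMM computes: C = alpha * A * B + beta * C with every input and the output stored as FP16. The names to_fp16 and hgemm_reference are made up for illustration and are not part of any library. The sketch also assumes the sums are accumulated in higher precision and rounded to FP16 only at the end; real implementations may accumulate differently, so treat this as a reference for the semantics, not for performance or exact rounding behavior.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (returned as a float)."""
    # The "e" format code packs a value as a 16-bit binary16 float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def hgemm_reference(alpha, a, b, beta, c):
    """Naive C = alpha * A @ B + beta * C with FP16 inputs and FP16 output.

    Hypothetical helper for illustration only.
    Assumption: dot products are accumulated in Python floats (double
    precision) and rounded to FP16 once per output element.
    """
    m, k, n = len(a), len(b), len(b[0])

    # Quantize every input to FP16 first, as an HGEMM would receive them.
    a16 = [[to_fp16(v) for v in row] for row in a]
    b16 = [[to_fp16(v) for v in row] for row in b]
    c16 = [[to_fp16(v) for v in row] for row in c]
    alpha16, beta16 = to_fp16(alpha), to_fp16(beta)

    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = sum(a16[i][p] * b16[p][j] for p in range(k))
            # Round the final result back to FP16 storage.
            row.append(to_fp16(alpha16 * acc + beta16 * c16[i][j]))
        out.append(row)
    return out


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[0.0, 0.0], [0.0, 0.0]]
    print(hgemm_reference(1.0, a, b, 0.0, c))
    # FP16 keeps only about 3 decimal digits, so 0.1 is not stored exactly.
    print(to_fp16(0.1))
```

An SGEMM performs the same operation with FP32 storage, which is why comparing an FP16 kernel against an FP32 one would not be like for like.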
