zlacker


You seem to have conflated SIMD and emulation in the context of performance. ARM has its own SIMD instructions and doesn't take a performance hit when executing those natively. Translating x86 SIMD to ARM adds overhead, and that performance hit is due to emulation, not to SIMD itself.

replies(1): >>bigyab+82

>>N_Lens+(OP)
Both incur a performance hit. ARM NEON isn't fully analogous to modern AVX or SSE, so even a 1:1 native port will compile down to more instructions than on x86. This issue is definitely exacerbated when translating, but it is inherent to any comparison of the two.
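
To make the "not fully analogous" point concrete, here is a rough sketch of one commonly cited gap: SSE2 can collect the top bit of each byte into a 16-bit mask with a single intrinsic, while NEON has no single-instruction counterpart, so a native AArch64 port has to build the mask in several steps. The function name byte_msb_mask and the bit_of_lane table are made up for illustration, and the NEON sequence is just one possible way to do it, not necessarily the fastest.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* x86 path: one intrinsic gathers the high bit of all 16 bytes. */
static uint16_t byte_msb_mask(const uint8_t *p) {
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    return (uint16_t)_mm_movemask_epi8(v);
}

#elif defined(__aarch64__)
#include <arm_neon.h>

/* AArch64 NEON path: no direct movemask, so the mask is assembled by hand. */
static uint16_t byte_msb_mask(const uint8_t *p) {
    /* Hypothetical helper table: the bit each lane contributes to its half. */
    static const uint8_t bit_of_lane[16] = {1, 2, 4, 8, 16, 32, 64, 128,
                                            1, 2, 4, 8, 16, 32, 64, 128};
    uint8x16_t v = vld1q_u8(p);
    /* 0xFF in every lane whose high bit is set, 0x00 elsewhere. */
    uint8x16_t msb = vcltzq_s8(vreinterpretq_s8_u8(v));
    /* Keep only the lane-specific bit, then sum each 8-lane half. */
    uint8x16_t bits = vandq_u8(msb, vld1q_u8(bit_of_lane));
    uint8_t lo = vaddv_u8(vget_low_u8(bits));
    uint8_t hi = vaddv_u8(vget_high_u8(bits));
    return (uint16_t)(lo | ((uint16_t)hi << 8));
}

#else

/* Scalar fallback so the sketch still builds on other targets. */
static uint16_t byte_msb_mask(const uint8_t *p) {
    uint16_t m = 0;
    for (int i = 0; i < 16; i++) {
        m |= (uint16_t)((p[i] >> 7) << i);
    }
    return m;
}

#endif
```

All three paths should return the same mask for the same 16 input bytes. The difference is only in how many operations it takes to get there, which is the kind of gap that exists even before any translation layer is involved.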
