Modern processors implement the FMA (fused multiply-add) instruction, which can help reduce instruction count and mitigate rounding error in floating-point operations, because the product is not rounded before the addition. (https://stackoverflow.com/a/15933677)
Suppose fma dest, a, b, c translates to dest = a * b + c. This program could be implemented in 3 timesteps even if only one instruction can be performed at once:
    fma R0, R0, R0, $zero
    fma R0, R1, R1, R0
    fma R0, R2, R2, R0