This version uses the x87 FPU.
Code:
x.d = 1
y.d = 1.000001
t.l = ElapsedMilliseconds()
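!fld qword [v_y]      ; st0 = y
!fld qword [v_x]      ; st0 = x, st1 = y
!mov ecx, 100000000   ; loop counter: 100 million iterations
!mulx_loop:
!fmul st0, st1        ; st0 = st0 * st1, i.e. x = x * y
!dec ecx
!jnz mulx_loop        ; repeat until ecx reaches 0
!fstp qword [v_x]     ; store the result in x and pop it
!fstp st0             ; pop y so the FPU stack is left empty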
t = ElapsedMilliseconds() - t
Msg.s = "X= " + StrD(x) + #CRLF$
Msg + "Execution time: " + StrD(t / 1000, 2) + " Sec"
MessageRequester("", Msg)
The same loop can also be written with the SSE2 instruction MULSD.
Code:
x.d = 1
y.d = 1.000001
t.l = ElapsedMilliseconds()
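!movsd xmm0, [v_x]    ; xmm0 = x
!movsd xmm1, [v_y]    ; xmm1 = y
!mov ecx, 100000000   ; loop counter: 100 million iterations
!mulx_loop:
!mulsd xmm0, xmm1     ; xmm0 = xmm0 * xmm1, i.e. x = x * y
!dec ecx
!jnz mulx_loop        ; repeat until ecx reaches 0
!movsd [v_x], xmm0    ; store the result in x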
t = ElapsedMilliseconds() - t
Msg.s = "X= " + StrD(x) + #CRLF$
Msg + "Execution time: " + StrD(t / 1000, 2) + " Sec"
MessageRequester("", Msg)
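The two results differ slightly. The x87 FPU keeps the running product in an 80-bit extended-precision register (64-bit significand, assuming the precision-control field is left at extended precision) and rounds to a 64-bit double only on the final store, whereas MULSD rounds every product to a 64-bit double (53-bit significand).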
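
To see the difference directly, both loops can be run in one program and the two results subtracted. This is only a sketch: the variable names xfpu and xsse and the labels fpu_loop and sse_loop are made up for this example (the two loops need distinct labels once they share a source file), and the size of the difference depends on the x87 precision setting in effect, so no particular output is claimed.

```purebasic
; Sketch: run the x87 and SSE2 loops side by side and compare the results.
; xfpu, xsse, fpu_loop and sse_loop are illustrative names, not from the code above.
xfpu.d = 1
xsse.d = 1
y.d = 1.000001

; x87 version: the running product stays in st0 for the whole loop
!fld qword [v_y]
!fld qword [v_xfpu]
!mov ecx, 100000000
!fpu_loop:
!fmul st0, st1
!dec ecx
!jnz fpu_loop
!fstp qword [v_xfpu]   ; rounded to a 64-bit double only here
!fstp st0

; SSE2 version: every product is rounded to a 64-bit double
!movsd xmm0, [v_xsse]
!movsd xmm1, [v_y]
!mov ecx, 100000000
!sse_loop:
!mulsd xmm0, xmm1
!dec ecx
!jnz sse_loop
!movsd [v_xsse], xmm0

Msg.s = "x87  = " + StrD(xfpu) + #CRLF$
Msg + "SSE2 = " + StrD(xsse) + #CRLF$
Msg + "Diff = " + StrD(xfpu - xsse)
MessageRequester("", Msg)
```

If the difference comes out as zero, the x87 unit is probably running with its precision control set to 53 bits, in which case both loops round the same way at every step.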