It is currently Wed Jan 20, 2021 5:41 am

All times are UTC + 1 hour




Post new topic Reply to topic  [ 26 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Cube root
PostPosted: Sun Oct 02, 2011 11:10 am 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
From
http://www.asmcommunity.net/board/index ... ic=14769.0
Code:
Macro Cbrt(num)
  !jmp @f
  !.dat:dq 0.33333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333
  !@@:fld qword[.dat]
  !fld qword[v_#num#]
  !fabs
  !fyl2x                   ;->log2(Src1)*exponent
  !fld st0             ;copy the logarithm
  !frndint                 ;keep only the characteristic
  !fsub st1,st0          ;keeps only the mantissa
  !fxch                    ;get the mantissa on top
  !f2xm1                   ;->2^(mantissa)-1
  !fld1
  !faddp                   ;add 1 back
  !fscale                  ;scale it with the characteristic
  !fstp st1             ;copy result over and "pop" it
  !test byte[v_#num#+7],$80
  !jz @f
  !fchs                    ;restore sign
  !@@:fstp qword[v_#num#]
EndMacro
f.d=-26.4433
cbrt(f)
Debug f

Something better please? :?

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Mon Feb 27, 2012 3:22 pm 
Offline
New User
New User

Joined: Thu Feb 23, 2012 1:06 pm
Posts: 8
Code:
Global T.f = 19.0
Global A.f

!movq xmm0,qword[v_T]
!sqrtss xmm0,xmm0
!movq qword[v_A],xmm0
Debug A


requires SSE.


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Mon Feb 27, 2012 10:36 pm 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
evilgravedigger wrote:
Code:
Global T.f = 19.0
Global A.f

!movq xmm0,qword[v_T]
!sqrtss xmm0,xmm0
!movq qword[v_A],xmm0
Debug A


requires SSE.

sorry, it does not work. Teh result value is wrong.
Besides it does not work for negative values.

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Mon Feb 27, 2012 11:49 pm 
Offline
Enthusiast
Enthusiast

Joined: Sat Jul 09, 2011 7:57 am
Posts: 276
evilgravedigger wrote:
Code:
Global T.f = 19.0
Global A.f

!movq xmm0,qword[v_T]
!sqrtss xmm0,xmm0
!movq qword[v_A],xmm0
Debug A


requires SSE.


sqrtss return the square root not the cube one.

sse cuberoot : http://www.musicdsp.org/showone.php?id=206


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Tue Feb 28, 2012 9:38 am 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
Another good lesson to realize about Intel x86 line is a fully crap.

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Tue Feb 28, 2012 8:39 pm 
Offline
New User
New User

Joined: Thu Feb 23, 2012 1:06 pm
Posts: 8
xorc1zt wrote:
evilgravedigger wrote:
Code:
Global T.f = 19.0
Global A.f

!movq xmm0,qword[v_T]
!sqrtss xmm0,xmm0
!movq qword[v_A],xmm0
Debug A


requires SSE.


sqrtss return the square root not the cube one.

sse cuberoot : http://www.musicdsp.org/showone.php?id=206


Ow, sorry, my bad.


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Thu Dec 12, 2013 12:05 pm 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
Quote:
http://www.musicdsp.org/showone.php?id=206
:?:

Mmmm!!
Still waiting for a faster and bit-wider (64 bit float values) function than the first one posted here (you can use SSE, SSE2, 3, 4, 5, 6 .... ) 8)

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Sat Dec 14, 2013 12:24 pm 
Offline
Enthusiast
Enthusiast
User avatar

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
Ha, no problem :P !
Code:
;Cubic Root, Single Precision
;PB 5.21 LTS (x64)
;I use Heron: Loop {x = x * (((x * x * x) + 2 * a) / ((2 * x * x * x) + a))}

Test_Value.f = -12345.6789   ;Cubic_Root = -23,112042408247961097779983746659
Appr.f                       ;for all...

Procedure.f Cube_Root()
  !mov eax,[v_Test_Value]
  !and eax,7FFFFFFFh         ;approximation without sign
  ;now find a good approximation for the start-value; http://metamerist.com/cbrt/cbrt.htm - Kahan´s bit hack
  !mov [v_Appr],eax
  !movss xmm0,[v_Appr] 
  !movss xmm1,xmm0
  !lea rdx,[v_Appr]
  !mov eax,[rdx]
  !xor edx,edx
  !div dword[Value3]
  !add eax,[BitHack]
  !mov [v_Appr],eax
  !movss xmm2,[v_Appr]

  !addss xmm1,xmm1           ;xmm1=2*a=constant
  !mov ecx,2                 ;higher=more precision (if possible)
 !@@: 
  !movss xmm3,xmm2
  !mulss xmm3,xmm3
  !mulss xmm3,xmm2           ;xmm3=x*x*x
  !movss xmm4,xmm3
  !addss xmm4,xmm4           ;2*x*x*x
  !addss xmm4,xmm0           ;xmm0=a
  !addss xmm3,xmm1
  !divss xmm3,xmm4
  !mulss xmm2,xmm3
  !dec ecx 
 !jnz @b 
  ;set sign (if Test_Value negativ)
  !test byte[v_Test_Value+3],80h
 !jz @f
  !mulss xmm2,[Minus1]       ;restore sign
 !@@:
  !movss dword[v_Appr],xmm2
  !fld dword[v_Appr]
 ProcedureReturn
  !Minus1:  dd -1.0
  !BitHack: dd 709921077     ;for Kahan´s bit hack; http://metamerist.com/cbrt/cbrt.htm
  !Value3:  dd 3
EndProcedure

TA = ElapsedMilliseconds()
For i = 1 To 10000000
  Cbrt.f = Cube_Root()
Next
TE = ElapsedMilliseconds() - TA
MessageRequester("Cubic Root (Single-Precision) with SSE", StrF(Cbrt, 7) + #LFCR$ + Str(TE) + " ms for 10000000 Loops")


Code:
;Cubic Root, Double Precision
;PB 5.21 LTS (x64)
;I use Heron: Loop {x = x * (((x * x * x) + 2 * a) / ((2 * x * x * x) + a))}

Test_Value.d = -12345.6789   ;Cubic_Root = -23,112042408247961097779983746659
Appr.d                       ;for all...

Procedure.d Cube_Root()
  !mov rax,[v_Test_Value]
  !mov rdx,7FFFFFFFFFFFFFFFh
  !and rax,rdx               ;approximation without sign
  ;now find a good approximation for the start-value; http://metamerist.com/cbrt/cbrt.htm - Kahan´s bit hack
  !mov [v_Appr],rax
  !movsd xmm0,[v_Appr] 
  !movsd xmm1,xmm0
  !lea rdx,[v_Appr]
  !mov rax,[rdx]
  !xor rdx,rdx
  !div qword[Value3]
  !add rax,[BitHack]
  !mov [v_Appr],rax
  !movsd xmm2,[v_Appr]

  !addsd xmm1,xmm1           ;xmm1=2*a=constant
  !mov ecx,2                 ;higher=more precision (if possible)
 !@@: 
  !movsd xmm3,xmm2
  !mulsd xmm3,xmm3
  !mulsd xmm3,xmm2           ;xmm3=x*x*x
  !movsd xmm4,xmm3
  !addsd xmm4,xmm4           ;2*x*x*x
  !addsd xmm4,xmm0           ;xmm0=a
  !addsd xmm3,xmm1
  !divsd xmm3,xmm4
  !mulsd xmm2,xmm3
  !dec ecx 
 !jnz @b 
  ;set sign (if Test_Value negativ)
  !test byte[v_Test_Value+7],80h
 !jz @f
  !mulsd xmm2,[Minus1]       ;restore sign
 !@@:
  !movsd qword[v_Appr],xmm2
  !fld qword[v_Appr]
 ProcedureReturn
  !Minus1:  dq -1.0
  !BitHack: dq 3071306043645493248     ;715094163<<32, for Kahan´s bit hack; http://metamerist.com/cbrt/cbrt.htm
  !Value3:  dq 3
EndProcedure

TA = ElapsedMilliseconds()
For i = 1 To 10000000
  Cbrt.d = Cube_Root()
Next
TE = ElapsedMilliseconds() - TA
MessageRequester("Cubic Root (Double-Precision) with SSE2", StrD(Cbrt, 15) + #LFCR$ + Str(TE) + " ms for 10000000 Loops") 

Make your tests and have fun!
Helle


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Sat Dec 14, 2013 1:38 pm 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
:shock:

OooppS!

The speed of cube root results cricical needed for x86 32 bit systems.
x86 running at 64 bit are quite strange.

Sorry, can not test as I use 32bit.

However, watching at your code, it still seems slower (Only the part of your function before the first label '!@@:' is much slower than the entire function in my firts post, and still more for the AMD processors ;) ).
Have you tested it seriously?
Thanks!

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Sat Dec 14, 2013 3:52 pm 
Offline
Enthusiast
Enthusiast
User avatar

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
I made tests with PB 64-Bit (Cubic Root -12345.6789, 10000000 Loops, Procedure):
Psychophanta:
- Float: -23.1120453 / 484 ms
- Double : -23.112042408247959 / 491 ms
Helle:
- Float: -23.1120415 / 169 ms
- Double : -23.112042408247962 / 203 ms

Edit: CPU = Intel i7-4770K@4.0GHz


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Sat Dec 14, 2013 5:46 pm 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
Helle wrote:
I made tests with PB 64-Bit (Cubic Root -12345.6789, 10000000 Loops, Procedure):
Psychophanta:
- Float: -23.1120453 / 484 ms
- Double : -23.112042408247959 / 491 ms
Helle:
- Float: -23.1120415 / 169 ms
- Double : -23.112042408247962 / 203 ms

Edit: CPU = Intel i7-4770K@4.0GHz


Strange, very strange
Psychophanta:
- Float: -23.1120453 / 343 ms
- Double : -23.112042408247959 / 375 ms

Edit: CPU = AMD Phenom II 550, 3.0GHz (purchased in year 2009, no overcloked, no overcored)
:?: :!:

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Sun Dec 15, 2013 6:19 pm 
Offline
Enthusiast
Enthusiast
User avatar

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
Test-Code for x86 (e.g. with other CPU´s):
Code:
;For 32-Bit-Windows
;PB 5.21 LTS (x86)

Test_Value_D.d = -12345.6789 ;Cube Root = -23,112042408247961097779983746659
Appr_D.d                     ;for all...
Test_Value_F.f = -12345.6789
Appr_F.f

Procedure.f Cube_Root_F_H_x86()   ;only for x86
  !mov eax,[v_Test_Value_F]
  !and eax,7FFFFFFFh         ;without sign
  ;now find a good approximation for the start-value; http://metamerist.com/cbrt/cbrt.htm - Kahan´s bit hack
  !mov [v_Appr_F],eax
  !movss xmm0,[v_Appr_F] 
  !movss xmm1,xmm0
  !lea edx,[v_Appr_F]
  !mov eax,[edx]
  !xor edx,edx
  !div dword[Value3_F]
  !add eax,[BitHack_F]
  !mov [v_Appr_F],eax
  !movss xmm2,[v_Appr_F]

  !addss xmm1,xmm1           ;xmm1=2*a=constant
  !mov ecx,2                 ;higher=more precision (if possible)
 !@@: 
  !movss xmm3,xmm2
  !mulss xmm3,xmm3
  !mulss xmm3,xmm2           ;xmm3=x*x*x
  !movss xmm4,xmm3
  !addss xmm4,xmm4           ;2*x*x*x
  !addss xmm4,xmm0           ;xmm0=a
  !addss xmm3,xmm1
  !divss xmm3,xmm4
  !mulss xmm2,xmm3
  !dec ecx 
 !jnz @b 
  ;set sign (if Test_Value negativ)
  !test byte[v_Test_Value_F+3],80h
 !jz @f
  !mulss xmm2,[Minus1_F]     ;restore sign
 !@@:
  !movss dword[v_Appr_F],xmm2
  !fld dword[v_Appr_F]
 ProcedureReturn
  !Minus1_F:  dd -1.0
  !BitHack_F: dd 709921077   ;for Kahan´s bit hack (Float); http://metamerist.com/cbrt/cbrt.htm
  !Value3_F:  dd 3
EndProcedure

Procedure.d Cube_Root_D_H_x86()   ;only for x86
  !mov eax,dword[v_Test_Value_D+4]
  !mov edx,dword[v_Test_Value_D]
  !mov dword[v_Appr_D],edx
  !and eax,7FFFFFFFh         ;without sign
  ;now find a good approximation for the start-value; http://metamerist.com/cbrt/cbrt.htm - Kahan´s bit hack
  !mov dword[v_Appr_D+4],eax
  !movsd xmm0,[v_Appr_D] 
  !movsd xmm1,xmm0
  !lea edx,[v_Appr_D]
  !mov eax,[edx+4]
  !xor edx,edx
  !div dword[Value3_D]
  !add eax,[BitHack_D+4]
  !mov dword[v_Appr_D+4],eax
  !movsd xmm2,[v_Appr_D]

  !addsd xmm1,xmm1           ;xmm1=2*a=constant
  !mov ecx,2                 ;higher=more precision (if possible)
 !@@: 
  !movsd xmm3,xmm2
  !mulsd xmm3,xmm3
  !mulsd xmm3,xmm2           ;xmm3=x*x*x
  !movsd xmm4,xmm3
  !addsd xmm4,xmm4           ;2*x*x*x
  !addsd xmm4,xmm0           ;xmm0=a
  !addsd xmm3,xmm1
  !divsd xmm3,xmm4
  !mulsd xmm2,xmm3
  !dec ecx 
 !jnz @b 
  ;set sign (if Test_Value negativ)
  !test byte[v_Test_Value_D+7],80h
 !jz @f
  !mulsd xmm2,[Minus1_D]     ;restore sign
 !@@:
  !movsd qword[v_Appr_D],xmm2
  !fld qword[v_Appr_D]
 ProcedureReturn
  !Minus1_D:  dq -1.0
  !BitHack_D: dq 3071306043645493248   ;715094163<<32, for Kahan´s bit hack (Double); http://metamerist.com/cbrt/cbrt.htm
  !Value3_D:  dd 3
EndProcedure

Procedure.f Cube_Root_F_P()  ;for x86 and x64
  !fld dword[Dat_F]
  !fld dword[v_Test_Value_F]
  !fabs
  !fyl2x                     ;->log2(Src1)*exponent
  !fld st0                   ;copy the logarithm
  !frndint                   ;keep only the characteristic
  !fsub st1,st0              ;keeps only the mantissa
  !fxch                      ;get the mantissa on top
  !f2xm1                     ;->2^(mantissa)-1
  !fld1
  !faddp                     ;add 1 back
  !fscale                    ;scale it with the characteristic
  !fstp st1                  ;copy result over and "pop" it
  !test byte[v_Test_Value_F+3],$80
 !jz @f
  !fchs                      ;restore sign
 !@@:
 ProcedureReturn
  !Dat_F:     dd 0.33333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333 ;:-)
EndProcedure

Procedure.d Cube_Root_D_P()  ;for x86 and x64
  !fld qword[Dat_D]
  !fld qword[v_Test_Value_D]
  !fabs
  !fyl2x                     ;->log2(Src1)*exponent
  !fld st0                   ;copy the logarithm
  !frndint                   ;keep only the characteristic
  !fsub st1,st0              ;keeps only the mantissa
  !fxch                      ;get the mantissa on top
  !f2xm1                     ;->2^(mantissa)-1
  !fld1
  !faddp                     ;add 1 back
  !fscale                    ;scale it with the characteristic
  !fstp st1                  ;copy result over and "pop" it
  !test byte[v_Test_Value_D+7],$80
 !jz @f
  !fchs                      ;restore sign
 !@@:
 ProcedureReturn
  !Dat_D:     dq 0.33333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333 ;:-)
EndProcedure

TA_F_H = ElapsedMilliseconds()
For i = 1 To 10000000
  Cbrt_F_H_x86.f = Cube_Root_F_H_x86()
Next
TE_F_H = ElapsedMilliseconds() - TA_F_H
Result$ = "Test-Value : -12345.6789" + #LFCR$ + "Float Helle:" + #LFCR$ + StrF(Cbrt_F_H_x86, 7) + " / " + Str(TE_F_H) + " ms " + #LFCR$

TA_D_H = ElapsedMilliseconds()
For i = 1 To 10000000
  Cbrt_D_H_x86.d = Cube_Root_D_H_x86()
Next
TE_D_H = ElapsedMilliseconds() - TA_D_H
Result$ + "Double Helle:" + #LFCR$ + StrD(Cbrt_D_H_x86, 15) + " / " + Str(TE_D_H) + " ms " + #LFCR$

TA_F_P = ElapsedMilliseconds()
For i = 1 To 10000000
  Cbrt_F_P.f = Cube_Root_F_P()
Next
TE_F_P = ElapsedMilliseconds() - TA_F_P
Result$ + "Float Psychophanta:" + #LFCR$ + StrF(Cbrt_F_P, 7) + " / " + Str(TE_F_P) + " ms " + #LFCR$

TA_D_P = ElapsedMilliseconds()
For i = 1 To 10000000
  Cbrt_D_P.d = Cube_Root_D_P()
Next
TE_D_P = ElapsedMilliseconds() - TA_D_P
Result$ + "Double Psychophanta:" + #LFCR$ + StrD(Cbrt_D_P, 15) + " / " + Str(TE_D_P) + " ms " + #LFCR$ + #LFCR$

Result$ + "CPU: Intel i7-4770K@4.0GHz" + #LFCR$

;SetClipboardText(Result$)

MessageRequester("Cube Root x86, 10000000 Loops", Result$)

My results (with VirtualBox under Win7 x64):
Test-Value : -12345.6789
Float Helle:
-23.1120415 / 191 ms
Double Helle:
-23.112042408247962 / 362 ms; I have for Double_x86 too many overhead
Float Psychophanta:
-23.1120453 / 481 ms
Double Psychophanta:
-23.112042408247959 / 488 ms

CPU: Intel i7-4770K@4.0GHz


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Mon Dec 16, 2013 12:13 am 
Offline
Addict
Addict
User avatar

Joined: Wed Jun 11, 2003 9:33 pm
Posts: 4655
Location: Spa, relaxing and thinking, and learning...
Ok, the function you posted seems faster for intel i7 and x64 OS,
There would be good if someone could test it in an AMD.

_________________
http://www.zeitgeistmovie.com

While world=business:world+mafia:Wend


Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Mon Dec 16, 2013 7:43 am 
Offline
PureBasic Expert
PureBasic Expert

Joined: Sun Aug 08, 2004 5:21 am
Posts: 3710
Location: Netherlands
Psychophanta wrote:
Ok, the function you posted seems faster for intel i7 and x64 OS,
There would be good if someone could test it in an AMD.

Sorry, no AMD here.
Here's a small modification of the versions Helle posted.
Does that make a difference on AMD ?
Code:
Procedure.f Cube_Root_F_H_x86()   ;only for x86
  !mov eax,[v_Test_Value_F]
  !mov ecx, eax
  !and eax,0x7fffffff        ;without sign
  !and ecx,0x80000000        ;sign only
  ;now find a good approximation for the start-value; http://metamerist.com/cbrt/cbrt.htm - Kahan´s bit hack
  !movd xmm0, eax
  !movss xmm1, xmm0
  !mov edx, 0x55555555
  !mul edx
  !add edx, 0x2a508935
  !movd xmm2, edx
  !addss xmm1,xmm1           ;xmm1=2*a=constant
 
  !movss xmm3,xmm2
  !mulss xmm3,xmm3
  !mulss xmm3,xmm2           ;xmm3=x*x*x
  !movss xmm4,xmm3
  !addss xmm4,xmm4           ;2*x*x*x
  !addss xmm4,xmm0           ;xmm0=a
  !addss xmm3,xmm1
  !divss xmm3,xmm4
  !mulss xmm2,xmm3
 
  !movss xmm3,xmm2
  !mulss xmm3,xmm3
  !mulss xmm3,xmm2           ;xmm3=x*x*x
  !movss xmm4,xmm3
  !addss xmm4,xmm4           ;2*x*x*x
  !addss xmm4,xmm0           ;xmm0=a
  !addss xmm3,xmm1
  !divss xmm3,xmm4
  !mulss xmm2,xmm3
 
  !movd xmm3, ecx
  !por xmm2, xmm3            ;restore sign
  !movss dword[v_Appr_F],xmm2
  !fld dword[v_Appr_F]
 ProcedureReturn
EndProcedure
Code:
Procedure.d Cube_Root_D_HM_x86()   ;only for x86
  !movsd xmm0, [v_Test_Value_D]
  !pcmpeqb xmm4, xmm4
  !psrlq xmm4, 1
  !movsd xmm1, xmm0
  !pand xmm1, xmm4           ; without sign
  !pandn xmm4, xmm0          ; sign only
  !pshufd xmm0, xmm1, 1
  ;now find a good approximation for the start-value; http://metamerist.com/cbrt/cbrt.htm - Kahan´s bit hack
  !movq xmm2, [MulConst]
  !pmuludq xmm0, xmm2
  !paddd xmm0, xmm2
 
  !movsd xmm2,xmm0
  !mulsd xmm2,xmm2
  !mulsd xmm2,xmm0           ;xmm2=x*x*x
  !movsd xmm3,xmm2
  !addsd xmm3,xmm3           ;2*x*x*x
  !addsd xmm3,xmm1           ;xmm1=a
  !addsd xmm2,xmm1
  !addsd xmm2,xmm1
  !divsd xmm2,xmm3
  !mulsd xmm0,xmm2
 
  !movsd xmm2,xmm0
  !mulsd xmm2,xmm2
  !mulsd xmm2,xmm0           ;xmm2=x*x*x
  !movsd xmm3,xmm2
  !addsd xmm3,xmm3           ;2*x*x*x
  !addsd xmm3,xmm1           ;xmm1=a
  !addsd xmm2,xmm1
  !addsd xmm2,xmm1
  !divsd xmm2,xmm3
  !mulsd xmm0,xmm2
 
  !por xmm0, xmm4
  !movsd qword[v_Appr_D],xmm0
  !fld qword[v_Appr_D]
  ProcedureReturn
  !MulConst:  dd 0x55555555, 0x2a9f7893
EndProcedure

_________________
macOS 10.15 Catalina, Windows 10


Last edited by wilbert on Mon Dec 16, 2013 9:30 am, edited 1 time in total.

Top
 Profile  
Reply with quote  
 Post subject: Re: Cube root
PostPosted: Mon Dec 16, 2013 9:20 am 
Offline
Enthusiast
Enthusiast
User avatar

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
In the firma i have an AMD:

CPU: AMD FX-8120@3.6GHz

32-Bit-Windows:
Test-Value : -12345.6789
Float Helle:
-23.1120415 / 335 ms
Double Helle:
-23.112042408247962 / 943 ms
Float Psychophanta:
-23.1120453 / 1174 ms
Double Psychophanta:
-23.112042408247959 / 1150 ms

64-Bit-Windows:
Test-Value : -12345.6789
Float Helle:
-23.1120415 / 322 ms
Double Helle:
-23.112042408247962 / 378 ms
Float Psychophanta:
-23.1120453 / 1166 ms
Double Psychophanta:
-23.112042408247959 / 1203 ms


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 26 posts ]  Go to page 1, 2  Next

All times are UTC + 1 hour


Who is online

Users browsing this forum: No registered users and 2 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  

 


Powered by phpBB © 2008 phpBB Group
subSilver+ theme by Canver Software, sponsor Sanal Modifiye