Global T.f = 19.0
Procedure.f Sqrt(N.f)
!mov eax, [p.v_N]
!sub eax, $3F800000
!shr eax, 1
!add eax, $3F800000
!mov [esp-4], eax
!fld dword [esp-4]
CompilerIf #PB_Compiler_Debugger
ProcedureReturn
CompilerElse
!ret 4
CompilerEndIf
EndProcedure
#Tries = 50000000
time = GetTickCount_()
For I = 0 To #Tries
!movss xmm0,dword[v_T]
!sqrtss xmm0,xmm0
Next
MessageRequester("", Str(GetTickCount_()-time))
z.f
time = GetTickCount_()
For I = 0 To #Tries
z = I
Sqrt(z)
Next
MessageRequester("", Str(GetTickCount_()-time))
z.f
time = GetTickCount_()
For I = 0 To #Tries
z = I
Sqr(z)
Next
MessageRequester("", Str(GetTickCount_()-time))
I don't want to spoil the fun, Helle, but the speed comparison isn't fair.
From my point of view, for a fair comparison every method should assign the return value of the square-root function to a variable. A square-root function is useless if its return value is never used.
Besides that, either all methods should do the computation inline or all should call a procedure, since wrapping the computation in a procedure slows things down. Here the SSE routine is not wrapped in a procedure and is only half a routine, since no return value is stored. Therefore its speed can't be compared with that of the Sqrt procedure at the top.
Procedure.f Sqrt(N.f)
!mov eax, [p.v_N]
!sub eax, $3F800000
!shr eax, 1
!add eax, $3F800000
!mov [esp-4], eax
!fld dword [esp-4]
CompilerIf #PB_Compiler_Debugger
ProcedureReturn
CompilerElse
!ret 4
CompilerEndIf
EndProcedure
#Tries = 50000000
z.f
time = GetTickCount_()
For I = 0 To #Tries
!cvtsi2ss xmm1,[v_I]
!sqrtss xmm0,xmm1
Next
!movd [v_z],xmm0
MessageRequester("SSE", Str(GetTickCount_()-time)+#CRLF$+StrF(z))
z.f
time = GetTickCount_()
For I = 0 To #Tries
z = I
Sqrt(z)
Next
z=Sqrt(z)
MessageRequester("Procedure", Str(GetTickCount_()-time)+#CRLF$+StrF(z))
z.f
time = GetTickCount_()
For I = 0 To #Tries
z = I
Sqr(z)
Next
z=Sqr(z)
MessageRequester("PB", Str(GetTickCount_()-time)+#CRLF$+StrF(z))
This is not a fair comparison because you don't use a procedure for your test, so you have a lot less overhead. Put your SSE code in a procedure and compare the results.
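A minimal sketch of what that could look like, assuming 32-bit PureBasic with the FASM backend (so that p.v_N names the parameter, as in the Sqrt procedure above) and an SSE-capable CPU. The name SqrtSSE is hypothetical and not from this thread, and no timings are claimed here.

```purebasic
; Hypothetical SSE square root wrapped in a procedure, so that it pays
; the same call overhead as the Sqrt procedure it is compared against.
; Assumes 32-bit PureBasic with FASM inline assembly and SSE support.
Procedure.f SqrtSSE(N.f)
  !sqrtss xmm0, dword [p.v_N]   ; xmm0 = square root of the parameter
  !movss dword [p.v_N], xmm0    ; write the result back over the parameter
  ProcedureReturn N             ; let PureBasic return it as a normal float
EndProcedure

#Tries = 50000000
z.f
time = GetTickCount_()
For I = 0 To #Tries
  z = I
  z = SqrtSSE(z)                ; store the result so the call is not "half a routine"
Next
MessageRequester("SSE procedure", Str(GetTickCount_()-time)+#CRLF$+StrF(z))
```

Returning through ProcedureReturn N instead of a hand-written ret keeps the sketch independent of the debugger setting, at the cost of one extra store and load per call.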
No, it's you who isn't fair: you implemented Sqr() as an inline
function, so it would be a fair comparison to Helle's SSE code, which
is also inlined.
But of course you're right if you want to compare to Sqrt().
remi_meier wrote: No it's you who isn't fair, you implemented Sqr() as an inline
function and so it would be a fair comparison to Helle's SSE code that
is also inlined.
#Tries = 100000000
z.f
time = GetTickCount_()
For I = 0 To #Tries
z = I
!cvtsi2ss xmm1,[v_I]
!sqrtss xmm0,xmm1
Next
!movd [v_z],xmm0
MessageRequester("SSE", Str(GetTickCount_()-time)+#CRLF$+StrF(z))
z.f
time = GetTickCount_()
For I = 0 To #Tries
z = I
!mov eax, [v_z]
!sub eax, $3F800000
!shr eax, 1
!add eax, $3F800000
; Don't store the result
Next
!mov eax, [v_z]
!sub eax, $3F800000
!shr eax, 1
!add eax, $3F800000
!mov [v_z], eax
MessageRequester("Inline sqrt", Str(GetTickCount_()-time)+#CRLF$+StrF(z))