PureBasic Forums - English

Posted: **Fri Mar 04, 2016 9:33 pm**

Floating-point square functions for Float and Double (Powf2 and Powd2), a bit faster than PB's Pow(n, 2).
(Not a fair competition, though, as PB's Pow() is general-purpose, supporting both integer and float values and a variable exponent.)

Code: Select all

CompilerIf #PB_Compiler_Debugger
  CompilerError("Error - Turn off debugger for timings")  
CompilerEndIf

Procedure.f Powf2(value.f)
  ! fld  dword [p.v_value]   ;load 32bit Float
  ! fmul st, st              ;pow2 = value * value
  ! fstp dword [p.v_value]   ;store
  ProcedureReturn value
EndProcedure

Procedure.d Powd2(value.d)
  ! fld  qword [p.v_value]   ;load 64bit Double
  ! fmul st, st              ;pow2 = value * value
  ! fstp qword [p.v_value]   ;store
  ProcedureReturn value
EndProcedure


Define result.f, ftest.f = 7

Time1 = ElapsedMilliseconds()
For i = 1 To 9000000
  result = Pow(ftest, 2)
Next i
Time2 = ElapsedMilliseconds()
MessageRequester("Purebasic Pow()", Str(Time2-Time1))


Time1 = ElapsedMilliseconds()
For i = 1 To 9000000
  result = Powf2(ftest)
Next i
Time2 = ElapsedMilliseconds()
MessageRequester("asm Powf2()", Str(Time2-Time1))

Posted: **Fri Mar 04, 2016 9:52 pm**

A macro is much faster:

Code: Select all

Procedure.f Powf2(value.f)
  ! fld  dword [p.v_value]   ;load 32bit Float
  ! fmul st, st              ;pow2 = value * value
  ! fstp dword [p.v_value]   ;store
  ProcedureReturn value
EndProcedure

Procedure.d Powd2(value.d)
  ! fld  qword [p.v_value]   ;load 64bit Double
  ! fmul st, st              ;pow2 = value * value
  ! fstp qword [p.v_value]   ;store
  ProcedureReturn value
EndProcedure

Macro Powr2(value)
  (value * value)
EndMacro

Define result.f, ftest.f = 7

Time1 = ElapsedMilliseconds()
For i = 1 To 9000000
  result = Pow(ftest, 2)
Next i
Time2 = ElapsedMilliseconds()
MessageRequester("Purebasic Pow()", Str(Time2-Time1))

Time1 = ElapsedMilliseconds()
For i = 1 To 9000000
  result = Powf2(ftest)
Next i
Time2 = ElapsedMilliseconds()
MessageRequester("asm Powf2()", Str(Time2-Time1))

Time1 = ElapsedMilliseconds()
For i = 1 To 9000000
  result = Powr2(ftest)
Next i
Time2 = ElapsedMilliseconds()
MessageRequester("asm Powr2()", Str(Time2-Time1))

On my computer I got 96 ms, 77 ms and 31 ms, respectively.

Best regards
StarBootics
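
A side note on expression macros like Powr2: PureBasic macros are expanded as text, so the argument is pasted in exactly as written. The sketch below uses two hypothetical macros (SquareRaw and SquareSafe are illustrative names, not from the posts above) to show what can happen when the argument is an expression rather than a single variable. It assumes the debugger is enabled so that Debug prints.

```purebasic
; Hypothetical macros for illustration only
Macro SquareRaw(value)
  (value * value)          ; parameter pasted in without its own brackets
EndMacro

Macro SquareSafe(value)
  ((value) * (value))      ; each use of the parameter is bracketed
EndMacro

Define x.f = 3

Debug SquareRaw(x + 1)     ; expands to (x + 1 * x + 1), i.e. 3 + 3 + 1 = 7
Debug SquareSafe(x + 1)    ; expands to ((x + 1) * (x + 1)), i.e. 4 * 4 = 16
```

With a plain variable such as ftest both forms behave the same, which is the case timed above. Either way the argument is evaluated twice, so passing a function call to the macro would call it twice.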

Posted: **Fri Mar 04, 2016 10:02 pm**

Yes, a macro would be better, especially seeing as it's so tiny.

I'm not so sure about your multiply being faster, though...

What I'm doing is the following line (shown in three variations) in a "For i = 1 To 1.4 million" loop. Note the timings!

Code: Select all

 ft = Sqr(Abs(Pow(*p1\a,2) - Pow(*p2\a,2)))        ;=128, 127, 127
 ft = Sqr(Abs(Powf2(*p1\a) - Powf2(*p2\a)))        ;=19, 18, 18
 ft = Sqr(Abs(*p1\a * *p1\a) - (*p2\a * *p2\a))    ;=538, 540, 538

Posted: **Fri Mar 04, 2016 10:07 pm**

Hmm, I made a cut-down demo of what I'm doing, but this one agrees with your finding that the straight multiply is faster:

Code: Select all

Procedure.f Powf2(value.f)
  ! fld  dword [p.v_value]   ;load 32bit Float
  ! fmul st, st              ;pow2 = value * value
  ! fstp dword [p.v_value]   ;store
  ProcedureReturn value
EndProcedure


bufa = AllocateMemory(9000000)
bufb = AllocateMemory(9000000)
*p1.Ascii = bufa
*p2.Ascii = bufb


Time1 = ElapsedMilliseconds()
For i = 1 To 9000000
 ;ft = Sqr(Abs(Pow(*p1\a,2) - Pow(*p2\a,2)))        ;=257, 248, 249
 ;ft = Sqr(Abs(Powf2(*p1\a) - Powf2(*p2\a)))        ;=48, 50, 48
 ft = Sqr(Abs(*p1\a * *p1\a) - (*p2\a * *p2\a))     ;=34, 41, 33
 *p1+1: *p2+1
Next i
Time2 = ElapsedMilliseconds()

MessageRequester("Time", Str(Time2-Time1))

But that leaves me perplexed as to why it's performing differently in my other code! Completely different timings from what should be identical code. Well... at the end of the day I'll be saving a lot of time (I have to call this a lot), that's the main thing lol

Posted: **Fri Mar 04, 2016 10:25 pm**

Compared with the other two lines, your line should probably be

Code: Select all

ft = Sqr(Abs(*p1\a * *p1\a - *p2\a * *p2\a))

instead of

Code: Select all

ft = Sqr(Abs(*p1\a * *p1\a) - (*p2\a * *p2\a))
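
To make the difference between the two bracketings concrete, here is a minimal sketch with made-up sample values (a = 3 and b = 5 are assumptions, chosen so that b is larger than a). It assumes the debugger is enabled so that Debug prints.

```purebasic
; Hypothetical sample values, chosen so that b > a
Define a.f = 3, b.f = 5

; Abs() wraps only a * a, so the subtraction happens afterwards: Sqr(9 - 25) = Sqr(-16)
Define misplaced.f = Sqr(Abs(a * a) - (b * b))

; Abs() wraps the whole difference: Sqr(Abs(9 - 25)) = Sqr(16)
Define intended.f = Sqr(Abs(a * a - b * b))

Debug IsNAN(misplaced)   ; expected to be non-zero: square root of a negative number is NaN
Debug intended           ; square root of 16
```

In the cut-down demo the buffers come straight from AllocateMemory(), which zero-fills by default, so both forms end up taking Sqr(0) there. That might be part of why the demo and the real code timed so differently, but it is a guess and was not measured here.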

If you really need to do this calculation so often, you might consider converting the whole equation to asm.
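
As a rough sketch of what "the whole equation in asm" could look like, in the same x87 style as Powf2: the procedure name SqrAbsDiff is hypothetical, and this assumes the FASM-based ASM backend on x86/x64 (it has not been timed against the other variants).

```purebasic
; Hypothetical helper: returns Sqr(Abs(a*a - b*b)) using x87 instructions
Procedure.f SqrAbsDiff(a.f, b.f)
  ! fld  dword [p.v_a]     ;st0 = a
  ! fmul st0, st0          ;st0 = a * a
  ! fld  dword [p.v_b]     ;st0 = b, st1 = a * a
  ! fmul st0, st0          ;st0 = b * b
  ! fsubp st1, st0         ;st0 = a * a - b * b (pops the stack)
  ! fabs                   ;st0 = Abs(st0)
  ! fsqrt                  ;st0 = Sqr(st0)
  ! fstp dword [p.v_a]     ;store the result back into a
  ProcedureReturn a
EndProcedure

Debug SqrAbsDiff(3, 5)     ; Sqr(Abs(9 - 25)) = Sqr(16)
```

In the loop it would be called as ft = SqrAbsDiff(*p1\a, *p2\a). Whether this beats the plain multiply expression would need measuring, since the procedure call itself has a cost.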

Posted: **Fri Mar 04, 2016 10:49 pm**

Ahhh yes! Good catch, thank you.

Caffeine levels not good here for debugging.
The speed of the procedure is actually really good so far, with Pow() seemingly the only bottleneck. I wasn't sure how Sqr() would go but it seems fast, and likewise no hiccups from Abs().

Posted: **Sat Mar 05, 2016 6:41 am**

Keya wrote: I wasn't sure how Sqr() would go but it seems fast, and likewise no hiccups from Abs()

Pow() requires multiple instructions, Sqr() only one (fsqrt). How fast that fsqrt instruction is depends a lot on the CPU architecture, but on a modern Intel CPU it's quite fast.

Posted: **Sat Mar 05, 2016 3:19 pm**

I'm not into ASM, but with your Powf2(value.f) you can't do a Pow(value, 1.3) or any other exponent, or am I missing something?

Oops, yeah, sorry, I saw it just a little later: "Power of 2"...

Floating-point Power of 2 (powf2 & powd2)
