Page 1 of 2
PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 3:16 pm
by BarryG
Hi all, I'm a fan of
Dave Plummer on YouTube (he wrote "Task Manager" on Windows, and lots of other Windows things) and he just posted a video about the speed of calculating prime numbers in Python, C#, and C++ (see the results
here). He posted the source for the three languages on
Github. My obvious question is: can someone convert one of those sources to PureBasic to compare the speed to his results? Would be very interesting to see. Make sure you post your code here to compare. Thanks!
[Edit] The test is done over 5 seconds, to compare properly with his results.
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 4:10 pm
by Axolotl
i will have a look to the codes. usually i am to slow on doing this... the professionals here will beat me with ther solutions i guess.
stay tuned.
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 4:24 pm
by NicTheQuick
Just translated it for you:
Code: Select all
EnableExplicit
Global Dim rawbits.a(0)
Global sieveSize.i
Procedure validateResults(actualCount.i)
Protected count.i
Select sieveSize
Case 10: count = 1
Case 100: count = 25
Case 1000: count = 168
Case 10000: count = 1229
Case 100000: count = 9592
Case 1000000: count = 78498
Case 10000000: count = 664579
Case 100000000: count = 5761455
Default: ProcedureReturn #False
EndSelect
ProcedureReturn Bool(count = actualCount)
EndProcedure
Procedure GetBit(index.i)
If index & 1
index / 2
ProcedureReturn rawbits(index / 8) & (1 << (index % 8))
EndIf
ProcedureReturn #False
EndProcedure
Procedure ClearBit(index.i)
If index & 1
index / 2
rawbits(index / 8) & ~(1 << (index % 8))
EndIf
EndProcedure
Procedure prime_sieve(n.i)
sieveSize = n
ReDim rawbits.a(n / 8)
FillMemory(@rawbits(0), n / 8 + 1, $ff, #PB_Ascii)
EndProcedure
Procedure runSieve()
Protected factor.i = 3
Protected q = Sqr(sieveSize)
Protected num.i
While (factor < q)
num = factor
While num < sieveSize
If GetBit(num)
factor = num
Break
EndIf
num + 1
Wend
num = factor * 3
While num < sieveSize
ClearBit(num)
num + factor * 2
Wend
factor + 2
Wend
EndProcedure
Procedure printResults(showResults.i, duration.d, passes.i)
Protected num.i
If showResults
Print("2, ")
EndIf
Protected count.i = 1
For num = 3 To sieveSize
If GetBit(num)
If showResults
Print(Str(num) + ", ")
EndIf
count + 1
EndIf
Next
If showResults
PrintN("")
EndIf
PrintN("Passes: " + passes +
", Time: " + duration + " ms" +
", Avg: " + StrD(duration / passes, 3) + " ms" +
", Limit: " + sieveSize +
", Count: " + count +
", Valid: " + Str(validateResults(count)))
EndProcedure
Procedure main(limit.i = 1000000, duration.i = 10000)
Protected passes.i = 0
Protected tStart.i = ElapsedMilliseconds()
While ElapsedMilliseconds() - tStart < duration
prime_sieve(limit)
runSieve()
passes + 1
Wend
Protected tD.i = ElapsedMilliseconds() - tStart
printResults(#False, tD, passes)
EndProcedure
OpenConsole()
main(1000000, 10000)
Input()
CloseConsole()
And this is the output on my old laptop:
Code: Select all
Passes: 246, Time: 10021 ms, Avg: 40.736 ms, Limit: 1000000, Count: 78498, Valid: 1
And in case you are curious, this is with debugger on:
Code: Select all
Passes: 30, Time: 10190 ms, Avg: 339.667 ms, Limit: 1000000, Count: 78498, Valid: 1
I also compiled the C++ code using g++ on Linux and got this result:
Code: Select all
Passes: 1741, Time: 10000 ms, Avg: 5.744, Limit: 1000000, Count: 78498, Valid: 1
As you can see it is nearly 7 times faster. I gues the main reason for that is automatic inlining of functions and better optimization for GetBit and ClearBit.
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 4:30 pm
by tft
on my maschine ....... Passes 329, Time 10004 ms, AVG= 30.407
and yet ???
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 4:58 pm
by BarryG
The test is meant to be done over a 5-second period, not 10. My result over 5 seconds is:
Passes: 182, Time: 5001 ms, Avg: 27.478 ms, Limit: 1000000, Count: 78498, Valid: 1
(And thanks for the converted code, NicTheQuick!).
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 5:24 pm
by fluent
PB 5.62 x86, debugger off:
Passes: 395, Time: 5009 ms, Avg: 12.681 ms, Limit: 1000000, Count: 78498, Valid: 1
PB 5.62 x64, debugger off:
Passes: 122, Time: 5039 ms, Avg: 41.303 ms, Limit: 1000000, Count: 78498, Valid: 1
Looks like 64-bit is slower??
(win 10 x64 - 5-year-old low-end ASUS laptop)
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 5:31 pm
by skywalk
Ha! This is quite timely with v6 approaching.
Please post your OS versions and whitelisting of PB app in use.
And there is no process priority applied to the test?
Re: PureBasic speed test compared to this video
Posted: Thu Mar 25, 2021 6:51 pm
by NicTheQuick
With macros and the avoidance of modulo you actually can make it pretty fast. And that is what a C compiler usually does automatically for you:
Code: Select all
EnableExplicit
Global Dim rawbits.a(0)
Global sieveSize.i
Procedure validateResults(actualCount.i)
Protected count.i
Select sieveSize
Case 10: count = 1
Case 100: count = 25
Case 1000: count = 168
Case 10000: count = 1229
Case 100000: count = 9592
Case 1000000: count = 78498
Case 10000000: count = 664579
Case 100000000: count = 5761455
Default: ProcedureReturn #False
EndSelect
ProcedureReturn Bool(count = actualCount)
EndProcedure
Macro GetBit(index)
((index & 1) And rawbits(index >> 4) & (1 << ((index & $f) >> 1)))
EndMacro
Macro ClearBit(index)
rawbits(index >> 4) & ~(1 << ((index & $f) >> 1))
EndMacro
Procedure prime_sieve(n.i)
sieveSize = n
ReDim rawbits.a(n / 8)
FillMemory(@rawbits(0), n / 8 + 1, $ff, #PB_Ascii)
EndProcedure
Procedure runSieve()
Protected factor.i = 3
Protected q = Sqr(sieveSize)
Protected num.i
While (factor < q)
num = factor
While num < sieveSize
If GetBit(num)
factor = num
Break
EndIf
num + 1
Wend
num = factor * 3
While num < sieveSize
; We can be sure that num is always an odd number
ClearBit(num)
num + factor * 2
Wend
factor + 2
Wend
EndProcedure
Procedure printResults(showResults.i, duration.d, passes.i)
Protected num.i
If showResults
Print("2, ")
EndIf
Protected count.i = 1
For num = 3 To sieveSize
If GetBit(num)
If showResults
Print(Str(num) + ", ")
EndIf
count + 1
EndIf
Next
If showResults
PrintN("")
EndIf
PrintN("Passes: " + passes +
", Time: " + duration + " ms" +
", Avg: " + StrD(duration / passes, 3) + " ms" +
", Limit: " + sieveSize +
", Count: " + count +
", Valid: " + Str(validateResults(count)))
EndProcedure
Procedure main(limit.i = 1000000, duration.i = 5000)
Protected passes.i = 0
Protected tStart.i = ElapsedMilliseconds()
While ElapsedMilliseconds() - tStart < duration
prime_sieve(limit)
runSieve()
passes + 1
Wend
Protected tD.i = ElapsedMilliseconds() - tStart
printResults(#False, tD, passes)
EndProcedure
OpenConsole()
main(1000000, 5000)
Input()
CloseConsole()
Old code over 5 seconds:
Code: Select all
Passes: 129, Time: 5001 ms, Avg: 38.767 ms, Limit: 1000000, Count: 78498, Valid: 1
New version over 5 seconds:
Code: Select all
Passes: 1690, Time: 5000 ms, Avg: 2.959 ms, Limit: 1000000, Count: 78498, Valid: 1
C-Code over 5 seconds on the same machine:
Code: Select all
Passes: 909, Time: 5000 ms, Avg: 5.501 ms, Limit: 1000000, Count: 78498, Valid: 1
C-Code over 5 seconds with -O3 flag for best optimization:
Code: Select all
Passes: 4431, Time: 5000 ms, Avg: 1.128 ms, Limit: 1000000, Count: 78498, Valid: 1
Btw my CPU: Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz with 8MB L3 cache.
It's a Thinkpad W530.
Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 2:31 am
by Rinzwind
PB needs inline procedures as manual option at least if the compiler is not smart enough to detect when to apply it.
Same as with easy byref parameters (way more BASIC than having to deal with pointers). I guess it is too much to ask for for current asm emitter logic? Too much old, sticky and cold spaghetti?
Anyway good to see that macro's saved PB's ass

. They could also do with some practical extra's btw.
Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 2:41 am
by skywalk
Macro's are inlined code.

Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 10:46 am
by NicTheQuick
skywalk wrote:Macro's are inlined code.

Not really. Macros are more like generic programming in C(++).
For example you can not create Macros with a non trivial return value. This only works if the macro itself is a one liner. And then you also have the issue that parameters which are used multiple times were also evaluated multiple times. See here:
Code: Select all
Global counter = 0
Procedure count()
counter + 1
ProcedureReturn counter
EndProcedure
Macro tripleAMacro(a)
((a) + (a) + (a))
EndMacro
Procedure tripleAProcedure(a)
ProcedureReturn ((a) + (a) + (a))
EndProcedure
; Calls count() once
Debug tripleAProcedure(count())
counter = 0
; Calls count() three times
Debug tripleAMacro(count())
Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 12:53 pm
by skywalk
I agree, but I was referring to your macro code in the speed test. l often inline code for speed at the expense of shorter procedures.
Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 1:05 pm
by NicTheQuick
skywalk wrote:I agree, but I was referring to your macro code in the speed test. l often inline code for speed at the expense of shorter procedures.
In this case the inlining doubled the speed. If I change GetBit and ClearBit back to a procedure I get this result:
Code: Select all
Passes: 876, Time: 5002 ms, Avg: 5.710 ms, Limit: 1000000, Count: 78498, Valid: 1
which is ~52% of the speed of the Macro version.
If I then replace the shifts back to division (">> 4" -> "/ 16") it makes a much bigger difference:
Code: Select all
Passes: 190, Time: 5010 ms, Avg: 26.368 ms, Limit: 1000000, Count: 78498, Valid: 1
Finally replacing the bitwise ANDs with modulo again ("& $f" -> "% 16") it gets even slower:
Code: Select all
Passes: 125, Time: 5013 ms, Avg: 40.104 ms, Limit: 1000000, Count: 78498, Valid: 1
I think Purebasic should do the optimization with the modulo and shifts automatically. Then you would not need to make it manually.
But if Purebasic evolves to a compiler for C-Code instead FASM, we do not longer care about that I guess.
Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 1:29 pm
by Axolotl
@NicTheQuick:
if i understand you correctly, these are the fastest routines based on procedure, right?
Code: Select all
; Macro GetBit(index)
; ((index & 1) And rawbits(index >> 4) & (1 << ((index & $f) >> 1)))
; EndMacro
Procedure GetBit(index.i)
If index & 1
ProcedureReturn rawbits(index >> 4) & (1 << ((index & $f) >> 1))
EndIf
ProcedureReturn #False
; If index & 1
; index / 2
; ProcedureReturn rawbits(index / 8) & (1 << (index % 8))
; EndIf
; ProcedureReturn #False
EndProcedure
; Macro ClearBit(index)
; rawbits(index >> 4) & ~(1 << ((index & $f) >> 1))
; EndMacro
Procedure ClearBit(index.i)
rawbits(index >> 4) & ~(1 << ((index & $f) >> 1))
; If index & 1
; index / 2
; rawbits(index / 8) & ~(1 << (index % 8))
; EndIf
EndProcedure
@BarryG: told you

(my solution was very similar to NicTheQuick's, so i refrained from publishing it.)
Re: PureBasic speed test compared to this video
Posted: Fri Mar 26, 2021 1:40 pm
by NicTheQuick
Axolotl wrote:@NicTheQuick:
if i understand you correctly, these are the fastest routines based on procedure, right?
The fastest should be these:
Code: Select all
Procedure GetBit(index)
ProcedureReturn Bool((index & 1) And rawbits(index >> 4) & (1 << ((index & $f) >> 1)))
EndProcedure
Procedure ClearBit(index)
rawbits(index >> 4) & ~(1 << ((index & $f) >> 1))
EndProcedure
They are branchless (best for optimization within the CPU itself) and do not use any division or modulo (which is a division by itself).