Wayne Diamond wrote: Basically what I meant is that this:
Code: Select all
For x = 1 to Whatever
Do Nothing
Loop
... is basically going to compile to the same assembly-level
code, regardless of which compiler you use,
Nope, that's wrong.
Let's take these 2 examples:
Code: Select all
; code 1
For a = 1 To 100000000
Next
; code 2
For a = 1 To 100000000
x + 1
Next
The BEST compilers fully optimize this. Code 1 does nothing,
so the compiler removes the loop completely and uses:
'a = 100000000' instead.
The second code is also useless, so the compiler replaces it
with: 'a = 100000000 : x + 100000000'
This is what the best compilers do, awesome optimizations.
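To make that concrete, here is a rough sketch in C (picked only because the optimizing compilers I tested against are C/C++ ones). The function names are made up for illustration, and whether a given compiler really performs this rewrite depends on the compiler and its optimization settings. It only shows that the loop and the closed form give the same 'x':

```c
#include <stdint.h>

/* Code 2 as written: x is incremented once per iteration. */
int64_t add_in_loop(int64_t x, int64_t n)
{
    for (int64_t a = 1; a <= n; a++) {
        x += 1;
    }
    return x;
}

/* What an optimizer can reduce the loop to: no loop at all.
   For n <= 0 the loop body never runs, so x stays unchanged. */
int64_t add_closed_form(int64_t x, int64_t n)
{
    return n > 0 ? x + n : x;
}
```

Both functions return the same value for any 'n', but the second one takes the same time whether 'n' is 10 or 100000000.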
GOOD compilers use 2 registers for the loop variable 'a'
and 'x'. In the loop, the 2 registers are incremented, and right
after the loop, 'a' and 'x' are both written to memory (in the
place where the variable is saved in memory).
PureBasic (and some other non-optimizing compilers)
run it like it is, both variables are in memory and are incremented
and compared from memory 100000000 times.
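The difference between the two styles can be imitated in C. This is only a sketch under assumptions: 'volatile' is used to force every access to go through memory (roughly what the non-optimized loop does), the names 'v_a' and 'v_x' are borrowed from the "[v_a]" naming style, and the function names are invented. An optimizing compiler may shorten the second function even further than shown.

```c
#include <stdint.h>

/* 'volatile' forces a real memory read/write on every access. */
static volatile int32_t v_a;
static volatile int32_t v_x;

/* Non-optimized style: both variables live in memory and are
   incremented and compared there on every iteration. */
int32_t loop_in_memory(int32_t n)
{
    v_x = 0;
    for (v_a = 1; v_a <= n; v_a = v_a + 1) {
        v_x = v_x + 1;
    }
    return v_x;
}

/* Register style: plain locals that the compiler is free to keep
   in registers; memory is written only once, after the loop. */
int32_t loop_in_registers(int32_t n)
{
    int32_t a;
    int32_t x = 0;
    for (a = 1; a <= n; a++) {
        x++;
    }
    v_a = a;   /* write back once */
    v_x = x;   /* write back once */
    return x;
}
```

Both functions compute the same result; the only difference is how often memory is touched inside the loop.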
Wayne, this IS a *big* difference (and you should know that).
Don't be surprised that PB loses against VB and nearly all other
compilers in this, it's just logical.
(also tested against 3 C/C++ compilers, PB is always
last and much slower than any other compiler)
Good compilers have a built-in optimizer that analyzes the code
and loops, and the best compilers check everything possible,
even pairing of generated ASM code and all the important speed stuff.
Not to mention MMX/SSE/SSE2 directly in the generated main code,
if possible and if the compile target allows it.
PureBasic is just a 'stupid' (from the technical point of view)
translator that replaces "a + 1" with "INC dword [v_a]", there
is no analyzer and optimizer for real logic.
Optimized and intelligent register allocation? No way with PB!
Good compilers would check what is within the loop (function
calls, plain calculations only,...) and decide how many free
registers are needed.
If all registers are currently in use, they push 2 registers that
are not needed in the loop onto the stack and put the 2 variables in them.
Now the loop runs completely in the fast registers, and at the end of
the loop the 2 registers are written to the memory variables ('a' + 'x')
and the 2 saved regs are popped back from the stack.
You say these tests are senseless because they do nothing, and
I agree in general (and I have told people so too), BUT:
loops (repeat...until, for...next, while...wend) are *basic* things
in every language and are used very heavily.
Program flow statements are one of the _fundamentals_ of
programming languages, so they must be very highly optimized.
If you want to compare the speed of languages, you nearly
always use loops. Not empty ones, but loops with contents.
The non-optimized loops in PureBasic will *always* slow down
every test, so chances to win the test are much lower.
People tend to believe PureBasic is fast because it can generate
ASM, and most people don't know ASM, so they have respect
and automatically think it's fast (btw: even old GWbasic is very
fast on the 1GHz or 3GHz CPUs we have today).
Well, 99% of all compilers generate ASM internally and assemble
it by an assembler after generation. Sometimes the assembler
is built-in (so people don't see that process), sometimes it's an
external assembler (Borland BASM for Delphi, MASM for Microsoft,
gas (GNU assembler),...).
The big difference: some compilers generate very good ASM code,
others don't.
You can't expect too much from compilers that don't optimize
such fundamental things as loops.
PureBasic IS fast, and we have very fast processors today.
0.97 MHz CPUs are a thing of the past (remember the C64?).
Many things have to get done in the pbcompiler before you can call it a
good or a very good compiler, but the users can't change that
anyway.
From a technical point of view, the PB compiler is 'OK', but not
good. The very good thing in PB is the many libs with some
hundred commands... and all that platform independent (not perfect
today, but it's getting better every release (Win+Linux)).
BTW, before I forget:
On multitasking operating systems like Win and Linux (well, every
modern OS), the tests many people do are wrong anyway.
Task switches happen every few milliseconds and
there are always processes running in the background (the operating
system itself, for example).
On Windows, you have to give your test code realtime priority
to get better results and to be fair:
Code: Select all
SetPriorityClass_(GetCurrentProcess_(),#REALTIME_PRIORITY_CLASS)
; test code here
SetPriorityClass_(GetCurrentProcess_(),#NORMAL_PRIORITY_CLASS)
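Raising the priority is one way to deal with task switches. A different, portable trick (not the method above, just a common complement to it) is to run the test several times and keep the fastest run, since interruptions can only make a run slower. A minimal sketch in C, assuming a made-up workload 'test_code' in place of the real test and using clock() from the standard library:

```c
#include <time.h>

/* Written through 'volatile' so the work is not optimized away. */
volatile unsigned long sink;

/* Hypothetical workload standing in for "test code here". */
void test_code(void)
{
    for (unsigned long i = 0; i < 1000000UL; i++) {
        sink = sink + 1;
    }
}

/* Runs fn 'runs' times and returns the fastest time in seconds.
   Returns -1.0 if runs is zero or negative. */
double best_time_seconds(void (*fn)(void), int runs)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        clock_t start = clock();
        fn();
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best) {
            best = elapsed;
        }
    }
    return best;
}
```

Calling best_time_seconds(test_code, 5) runs the workload five times and reports the best of the five timings.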
And the second (very important) thing:
Guys tend to compare speed only, but speed is not everything.
Sometimes there is a very good reason some code is slower,
for example:
If you compare speed of string operations, PureBasic speed should
be OK - but strings are not thread safe in PB (well, nothing
in PB is really thread safe).
Some other compilers make every 'object' (strings) thread-safe,
because many applications use threads. Threads are used very
often to run things in background and to run things at the same
time (multiprocessor), the newest CPUs get extra features for
performing better with threads.
This depends on the compiler you test with (and if you can
enable/disable thread safety), but is very important - you
should remember this while testing.
The fastest code is absolutely useless if it just crashes or its
data gets corrupted or mixed up within threads.
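To show where the extra cost of thread safety comes from, here is a small sketch in C. Everything in it is hypothetical (the 'SafeString' type, the function names, the fixed 64-byte buffer, the simple spinlock built from C11 'atomic_flag'). It is not how any particular compiler implements its strings. It only shows that the thread-safe version has to do extra work around every single operation:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size string with its own lock. */
typedef struct {
    atomic_flag lock;
    size_t len;
    char data[64];
} SafeString;

void safe_init(SafeString *s)
{
    atomic_flag_clear(&s->lock);
    s->len = 0;
    s->data[0] = '\0';
}

/* Fast, but NOT thread safe: two threads appending at the same
   time can mix up 'len' and 'data'. Returns 1 on success, 0 if
   the text does not fit. */
int append_unsafe(SafeString *s, const char *text)
{
    size_t n = strlen(text);
    if (s->len + n >= sizeof s->data) {
        return 0;
    }
    memcpy(s->data + s->len, text, n + 1);  /* copies the '\0' too */
    s->len += n;
    return 1;
}

/* Thread-safe variant: the same work, plus taking and releasing
   the lock on every call. That overhead is the price of safety. */
int append_safe(SafeString *s, const char *text)
{
    while (atomic_flag_test_and_set(&s->lock)) {
        /* spin until the lock is free */
    }
    int ok = append_unsafe(s, text);
    atomic_flag_clear(&s->lock);
    return ok;
}
```

In a single-threaded speed test the unsafe version will win, which is exactly why such a comparison says little if the other compiler pays for the lock by default.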