Optional Speed Optimizations

Posted: Wed May 05, 2004 7:00 pm
by Thomas
 
Using inline assembler brings more speed, but the readability of the code is lost. So I try to stick to PB commands all the time.

Analyzing the generated assembler code, it's obvious that the compiler is a bit stupid sometimes (even though the compiler generates really good code compared to others).

So, for example, code like this

a=12
b=a+1

first stores 12 in memory (variable a), then loads this value from memory again, adds 1 to it and saves the result to memory (variable b).

Would it be possible, e.g. by using compiler directives, to skip reloading variable a (its value is still in a register) and to store variable b only when no further computation follows or the register is needed for something else?

I know that PB cannot produce hand-optimized assembler code, but is there still room for some *optional* speed enhancements for time-critical routines, or has the limit been reached?

Especially when introducing new variable types like fp64 I fear the performance could drop because of the additional flexibility.
 

Posted: Thu May 06, 2004 10:16 am
by Fred
The optimisation limit hasn't been reached yet, and there is still a lot of room for improvement. As for the new types, they won't ever slow down the current code for the sake of flexibility; new checks will be added in the compiler to handle them without sacrificing the current speed.

Posted: Thu May 06, 2004 11:27 am
by fweil
Thomas,
So, for example, code like this

a=12
b=a+1

first stores 12 in memory (variable a), then loads this value from memory again, adds 1 to it and saves the result to memory (variable b).
You are right that it is possible to optimize.

But first, why does the compiler store the variable to memory before incrementing the value?

Because you, as the programmer, coded it like that!

If you coded :

b = 12 + 1, then the compiler would store 12 in a register, then increment it before storing it to memory.

And if you coded b = 13, it would save some more steps.

Why did you write a = 12 and then b = a + 1?

Probably because you intended to use both a and b later. The compiler cannot decide to change this, although one possible optimization exists: load 12 into a register and detect that the corresponding variable is used again in the next step.

So the generated code in this case should be, e.g.:

MOV eax, 12
MOV [v_a], eax
INC eax
MOV [v_b], eax

That is what the optimized version would look like.

But what would happen if you decided to write:

a = function(12)
b = a + 1

or anything more complicated ...

Then the executable code has to call something else in the meantime, before storing the value of b. And in that case, the registers will have changed a lot.

Well, optimization is possible, but it will make the compiler more complicated.
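The trade-off described above can be sketched as a tiny peephole pass. This is only an illustration in Python, not how the PureBasic compiler actually works: the tuple format `(op, dst, src)`, the function names, and the callee `some_function` are all hypothetical, and the pass assumes that any call may overwrite every register.

```python
def is_mem(operand):
    """True for a memory operand such as '[v_a]'."""
    return operand is not None and operand.startswith("[")


def drop_redundant_loads(instructions):
    """Remove loads of a variable whose value is still in the target register."""
    held = {}  # memory operand -> register known to hold the same value
    out = []
    for op, dst, src in instructions:
        if op == "CALL":
            # Assumption: the callee may change any register or variable,
            # so everything we knew about register contents is forgotten.
            held.clear()
        elif op == "MOV" and is_mem(dst):
            # Store: afterwards the source register mirrors the variable.
            held[dst] = src
        elif op == "MOV" and is_mem(src):
            if held.get(src) == dst:
                continue  # value is already in that register, skip the load
            held = {m: r for m, r in held.items() if r != dst}
            held[src] = dst
        else:
            # Any other instruction overwrites its destination register.
            held = {m: r for m, r in held.items() if r != dst}
        out.append((op, dst, src))
    return out


# a = 12 : b = a + 1, compiled naively with a reload of a
simple = [
    ("MOV", "eax", "12"),
    ("MOV", "[v_a]", "eax"),
    ("MOV", "eax", "[v_a]"),  # redundant reload
    ("INC", "eax", None),
    ("MOV", "[v_b]", "eax"),
]

# Hypothetical variation: a call sits between the store and the reload
with_call = [
    ("MOV", "eax", "12"),
    ("MOV", "[v_a]", "eax"),
    ("CALL", "some_function", None),
    ("MOV", "eax", "[v_a]"),  # must stay, eax may have changed
    ("INC", "eax", None),
    ("MOV", "[v_b]", "eax"),
]

if __name__ == "__main__":
    for name, code in (("simple", simple), ("with_call", with_call)):
        print(name)
        for op, dst, src in drop_redundant_loads(code):
            print("   ", op, dst if src is None else f"{dst}, {src}")
```

In the first sequence the reload disappears and the four-instruction version remains; in the second the pass has to keep every instruction, which is the extra bookkeeping that makes the compiler more complicated.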

KRgrds

Posted: Thu May 06, 2004 2:39 pm
by Fred
fweil wrote:If you coded :

b = 12 + 1, then the compiler would store 12 in a register, then increment it before storing it to memory.
Fortunately, this one is optimized at compile time :wink:
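Folding a constant expression such as 12 + 1 at compile time can be sketched roughly like this. It is a minimal illustration in Python (3.9 or later, for `ast.unparse`), not PureBasic's implementation: it borrows Python's own expression parser as a stand-in for the compiler front end, and the names `fold` and `fold_assignment` are made up for the example.

```python
import ast
import operator

# Operators this sketch knows how to evaluate ahead of time.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def fold(node):
    """Return the expression with constant sub-expressions pre-computed."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            # Both sides are known now, so compute the result immediately.
            return ast.Constant(_OPS[type(node.op)](left.value, right.value))
        return ast.BinOp(left, node.op, right)
    return node  # variables and plain constants stay as they are


def fold_assignment(source):
    """Fold the right-hand side of a 'name = expression' line."""
    target, expr = source.split("=", 1)
    folded = fold(ast.parse(expr.strip(), mode="eval").body)
    return f"{target.strip()} = {ast.unparse(folded)}"


if __name__ == "__main__":
    print(fold_assignment("b = 12 + 1"))     # constant on both sides
    print(fold_assignment("b = a + 1"))      # a is only known at run time
    print(fold_assignment("b = a + 2 * 3"))  # only the constant part folds
```

Only expressions whose operands are all literals collapse to a single value; as soon as a variable is involved, the work has to be left for run time.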

Posted: Thu May 06, 2004 2:53 pm
by fweil
Fred,

More ... it is optimized and clever because the compiler generates :


; a = 12
MOV dword [v_a],12
; b = a + 1
MOV ebx,dword [v_a]
INC ebx
MOV dword [v_b],ebx
; c = 1 + a
MOV ebx,dword [v_a]
INC ebx
MOV dword [v_c],ebx

Taking such a simple case is interesting, because it shows that you had to handle both possible ways of writing the addition of 1 to a memory variable (a + 1 and 1 + a).

I hope folks understand how difficult it can be to design a language while leaving coders as free as possible to write the way they like.

Anyway, I use PureBasic, love it, and am finding my way with it.
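One way a code generator could arrive at the same instructions for both spellings is to reorder the operands before emitting anything. The following Python sketch is an assumption about the general technique, not PureBasic's actual code generator: `lower_add` is a hypothetical helper, it only handles one variable plus one integer constant, and the `ADD ebx,N` fallback for constants other than 1 is a guess that does not come from the listing above.

```python
def lower_add(target, left, right):
    """Emit x86-style lines for 'target = left + right'."""
    # Addition is commutative, so put the variable first
    # no matter which way round the programmer wrote it.
    if left.isdigit():
        left, right = right, left
    if left.isdigit() or not right.isdigit():
        raise ValueError("this sketch only handles variable + constant")
    lines = [f"MOV ebx,dword [v_{left}]"]
    # Adding exactly 1 uses INC, as in the listing; other constants use ADD.
    lines.append("INC ebx" if int(right) == 1 else f"ADD ebx,{right}")
    lines.append(f"MOV dword [v_{target}],ebx")
    return lines


if __name__ == "__main__":
    for target, left, right in (("b", "a", "1"), ("c", "1", "a")):
        print(f"; {target} = {left} + {right}")
        print("\n".join(lower_add(target, left, right)))
```

With the operands normalised first, the rest of the generator only ever sees the variable-plus-constant form.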

KRgrds

Posted: Fri Nov 19, 2010 5:04 pm
by Mok
fweil wrote:More ... it is optimized and clever because the compiler generates :

[...]
MOV ebx,dword [v_a]
INC ebx
MOV dword [v_b],ebx
On the German board they said that INC needs more cycles than ADD [reg], 1...
I just wanted to point this out...

Posted: Fri Nov 19, 2010 5:53 pm
by Thorium
Mok wrote:
fweil wrote:More ... it is optimized and clever because the compiler generates :

[...]
MOV ebx,dword [v_a]
INC ebx
MOV dword [v_b],ebx
On the German board they said that INC needs more cycles than ADD [reg], 1...
I just wanted to point this out...
The code is from 2004. ;)
Rules for assembler optimization change with every CPU generation.
Besides, INC does not need more cycles than ADD on most CPUs. Intel just recommends not using INC anymore because Intel will no longer optimize that instruction. Currently INC and ADD run at exactly the same speed on most CPUs.

Posted: Fri Nov 19, 2010 9:46 pm
by Mok
Thorium wrote: The code is from 2004. ;)
:o
OK, sorry, somehow I didn't notice that I had clicked a search result instead of a board thread :oops: