PureBasic Forums - English

Posted: **Wed Sep 23, 2009 1:26 pm**

Procedure overloading: http://www.purebasic.fr/english/viewtop ... 47&start=8
Call(C)Function(Fast): http://www.purebasic.fr/english/viewtopic.php?t=37491
...2 very painful problems. 1 simple solution:

Code: Select all

CallFunctionFast(*Ptr, {Val("123")}.l)
Procedure.L Over(P.l)
; ...proc code...
EndProcedure
Procedure.b Over(P.b)
; ...proc code...
EndProcedure
Define Test.q
Over({Test}.b)
Over({Test}.l)

What was done here? Well, we just forced a conversion of those expressions, to ensure that the result has the correct type. That's all! No need to restrict floats in Call(C)Function(Fast) anymore. No need to hesitate about adding overloading (which is very useful as a replacement for OOP). Even more:

Code: Select all

FractalRGB = {X ! Y}.A << 8

...That, BTW, was the reason why I was forced to use ASM for the core of my screensaver.

Q: Why don't you just use your own conversion procedures with the desired return type?
A: I'm not that heartless. Forcing my CPU to create a stack frame, push parameters, save the return address, jump, pop parameters, release the stack frame and finally return... And what for? Just to execute a few bytes of ASM instructions? Too cruel; I would definitely be bashed by the 'Community of CPU-protectors' for such activities.

...So, what would you say? A small and very useful addition, no?

Posted: **Fri Sep 25, 2009 10:35 am**

Oh, just found another usage today:

Code: Select all

Define.i a = 1, b = 2, c = 4, d
d = {(a / b)}.f * c

...Now try it with our current possibilities.

Posted: **Fri Sep 25, 2009 10:42 am**

While this would be very useful, I think it looks extremely messy. Compiler procedures like ToFloat(), ToLong() etc. would be more readable, imo.

Posted: **Fri Sep 25, 2009 10:51 am**

Compiler procedures like ToFloat(), ToLong() etc. would be more readable, imo.

Mmm, well, it's a variant too, yet mine is more compact)

Posted: **Fri Sep 25, 2009 12:08 pm**

CallFunctionFast() is obsolete. With the prototypes, this is handled gracefully:

Code: Select all

Procedure GetLong(A.l)
  Debug A
EndProcedure

Prototype ProtoGetLong(A.l)

*Fptr.ProtoGetLong = @GetLong()

; Conversion from quad/float to long automatically done by prototype
*Fptr(Val("123"))
*Fptr(12.34)

A: I'm not that heartless. Forcing my CPU to create a stack frame

The PB compiler doesn't generate stack frames.

Code: Select all
FractalRGB = {X ! Y}.A << 8
...That, BTW, was the reason why I was forced to use ASM for the core of my screensaver.

You weren't forced to use ASM. You could have assigned the result of X ! Y to a temporary variable of type A. Instead you chose to use ASM.
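
For what it's worth, the temporary-variable approach could look roughly like this. It is only a sketch: the values of X and Y are invented for illustration, and Tmp is a hypothetical helper name.

```purebasic
; Sketch of the temporary-variable workaround (X and Y values are made up).
Define.i X = $1234, Y = $00FF
Define.i FractalRGB
Define Tmp.a              ; unsigned 8-bit temporary (hypothetical name)

Tmp = X ! Y               ; XOR; storing into a .a variable keeps only the low byte
FractalRGB = Tmp << 8     ; move that byte into bits 8..15

Debug Hex(FractalRGB)     ; only shown when the debugger is enabled
```

The cost is one extra variable and one extra store, which is exactly the overhead the original post wanted to avoid.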

Melissa wrote: Oh, just found another usage today:
Code: Select all
Define.i a = 1, b = 2, c = 4, d
d = {(a / b)}.f * c
...Now try it with our current possibilities.

In my opinion, languages which use / for both integer division and float division have a problem in the design. Pascal uses / for float division and div for integer division, which solves this much more nicely.
In PB you can force fpu evaluation by putting 0.+ at the start of the expression, so your expression can be rewritten like this:

Code: Select all

d = {(a / b)}.f * c
   V
d = 0.+ a / b * c
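
A small side-by-side sketch of the difference, assuming the same a, b and c as above. The names d1 and d2 are made up for the comparison, and the trick is spelled with an explicit 0.0 here.

```purebasic
; Sketch: plain integer evaluation versus forced FPU evaluation.
Define.i a = 1, b = 2, c = 4
Define.i d1, d2           ; hypothetical names for the two results

d1 = a / b * c            ; all-integer expression: 1 / 2 truncates to 0, so d1 is 0
d2 = 0.0 + a / b * c      ; leading float forces FPU evaluation: 0.5 * 4, stored as 2

Debug d1
Debug d2
```

The extra 0.0 costs one fadd, which is the "junk opcode" complaint raised in the reply that follows.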

Posted: **Fri Sep 25, 2009 2:57 pm**

CallFunctionFast() is obsolete

There is still plenty of useful code that uses it. Froggerprogger's FMod wrapper, for example.

The PB compiler doesn't generate stack frames.

Wait... What o_O? How are locals\parameters\return addresses stored then?

You weren't forced to use ASM. You could have assigned the result of X ! Y to a temporary variable of type A. Instead you chose to use ASM.

Well, I just thought that I could split the expression in ASM as well, without needing to allocate a temporary variable in that case. When you do something 1024*768 times (with 30-40 FPS as the target), every CPU cycle is critical (especially if it's a time-consumer like MOVZX).

In PB you can force fpu evaluation by putting 0.+ at the start of the expression, so your expression can be rewritten like this:

Oh, so we messed up the expression (remember, no actual addition is needed) and got a junk opcode ('fadd'). And all that instead of 1 (bloody) compiler directive... BTW, we may not want the result as float any further; think about cases like:

Code: Select all

Define.i a = 1, b = 2, c = 4, d
d = {{a / b}.f * c}.i + 1

Posted: **Sun Sep 27, 2009 9:26 pm**

Melissa wrote:
CallFunctionFast() is obsolete
There is still plenty of useful code that uses it. Froggerprogger's FMod wrapper, for example.

You're beating a dead horse. There were problems with CallFunctionFast(), and they were solved with the addition of prototypes. There is no reason to solve the same problem twice.

The PB compiler doesn't generate stack frames.
Wait... What o_O? How are locals\parameters\return addresses stored then?

On the stack, of course. But there is no "frame". If you compile with GCC, it will use the ebp register as a stack frame pointer. PB doesn't do that. All accesses are relative to the stack pointer esp, there are no explicit frames.

In PB you can force fpu evaluation by putting 0.+ at the start of the expression, so your expression can be rewritten like this:
Oh, so we messed up the expression (remember, no actual addition is needed) and got a junk opcode ('fadd'). And all that instead of 1 (bloody) compiler directive... BTW, we may not want the result as float any further; think about cases like:
Code: Select all
Define.i a = 1, b = 2, c = 4, d
d = {{a / b}.f * c}.i + 1

Floating point multiplication and addition are generally almost as fast as integer math, however, the conversion from floating point to integer is very expensive, so it seems to me like you're just slowing your code down.

Generating efficient code when mixing floating point and integers is quite difficult, because the actual conversion is very expensive. Adding the directive is probably simple, but it won't automatically give you fast code. And also, it's not in the spirit of basic, in my opinion.

Posted: **Fri Oct 02, 2009 7:28 am**

There were problems with CallFunctionFast(), and they were solved with the addition of prototypes. There is no reason to solve the same problem twice.

Yes, they were, but now there is another one: backward compatibility with some code is completely broken, and it doesn't seem fun to completely rewrite it with prototypes...

If you compile with GCC, it will use the ebp register as a stack frame pointer. PB doesn't do that. All accesses are relative to the stack pointer esp, there are no explicit frames.

Those per-call stack sections are called 'frames' anyway, AFAIR. Just a different implementation.

Floating point multiplication and addition are generally almost as fast as integer math,

4-5 times slower, according to my tests.

however, the conversion from floating point to integer is very expensive, so it seems to me like you're just slowing your code down.

Of course, of course... Well, now just read my sample again: the f->i conversion would happen there anyway (floats cannot be properly stored in an integer variable); I just want to manually force it to happen before the addition of 1 (to change 'fadd' into 'inc').
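
With today's syntax, the same ordering can be approximated by splitting the statement in two. This is a sketch only: Tmp is a hypothetical temporary, and the 0.0 trick from the earlier reply stands in for the proposed {..}.f, so it still carries the extra fadd.

```purebasic
; Sketch: do the float part first, convert once, then add 1 as an integer.
Define.i a = 1, b = 2, c = 4, d
Define.i Tmp              ; hypothetical integer temporary

Tmp = 0.0 + a / b * c     ; evaluated on the FPU, converted to integer when stored
d = Tmp + 1               ; ordinary integer addition, no float math involved

Debug d
```

Whether this is actually faster than a single float expression is the open question in this thread.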

Posted: **Fri Oct 02, 2009 10:28 am**

Excuse me, but I didn't understand this:

Code: Select all

CallFunctionFast(*Ptr, {Val("123")}.l)
Procedure.L Over(P.l)
; ...proc code...
EndProcedure
Procedure.b Over(P.b)
; ...proc code...
EndProcedure
Define Test.q
Over({Test}.b)
Over({Test}.l)

I think you made a mistake somewhere. If you have a new idea that seems interesting, I don't see it (again! I'm really nuts...).
1) The *Ptr address is 0; I suppose the compiler should look up the right address... But you must add a line like this:
*Ptr = [[ The new magic function ]]
2) If you want good code, mind the order of the lines. If you don't care about it, you must add other lines, such as declarations, to get that freedom.
3) Maybe the solution is in this third problem: two procedures and only one name. No good! Two procedures = two names, quite simply!
i.e.:
Declare.I Calculate_B(Var.S)
Declare.I Calculate_L(Var.S)

If you replace the .L you want to add to the CallFunctionFast() argument with _L, you don't need to add so many lines to tell the compiler what you want.

Melissa wrote: 4-5 times slower, according to my tests.

I think your tests are not accurate. These tests must take the duration of the Val() function into account too. I am curious to see these tests. Don't worry about execution speed in PureBasic: if you code correctly, you get better results than in C++ with fewer typed characters! Shorter and quicker code, quite simply!

Posted: **Fri Oct 02, 2009 1:04 pm**

Melissa wrote:
Floating point multiplication and addition are generally almost as fast as integer math,
4-5 times slower, according to my tests.

Your tests are wrong.

Code: Select all

; Turn off debugger
#Tries = 10000000

; Integer multiply: the FASM 'repeat' directive unrolls the statement 100 times
time = ElapsedMilliseconds()
For U = 0 To #Tries
  !repeat 100
  a = b * c
  !end repeat
Next
MessageRequester("", Str(ElapsedMilliseconds()-time))

; Float multiply, unrolled the same way
time = ElapsedMilliseconds()
For U = 0 To #Tries
  !repeat 100
  f.f = g.f * h.f
  !end repeat
Next
MessageRequester("", Str(ElapsedMilliseconds()-time))

however, the conversion from floating point to integer is very expensive, so it seems to me like you're just slowing your code down.
Of course, of course... Well, now just read my sample again: the f->i conversion would happen there anyway (floats cannot be properly stored in an integer variable); I just want to manually force it to happen before the addition of 1 (to change 'fadd' into 'inc').

When I say it won't go faster due to float to integer conversion, I mean it. One of the reasons FP to integer is expensive is that you cannot move data from the fpu to the cpu. It must go through memory. So you have to store the float to memory (in a temp location), load it from memory onto the cpu, do an inc, and move it back to memory yet again. Moreover, the temp location has to be allocated somewhere.
Instead of two memory accesses you have four. Instead of two register accesses (for add and store) you now have 6, for alloc temp, store, load (and dealloc temp), inc and store. How is that going to be faster??

Also, I don't understand what you really want. Look at this: "{a / b}.f". Here you want the division (the INSIDE of the {}) to be done on the fpu. While here "{a.f * c}.i + 1" you want the addition (the OUTSIDE of the {}) to be done as integer math. You don't clearly define whether the {} should change the type of the expression inside before calculation is done or after.
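
The two readings give different results, which a sketch in current syntax makes visible. The variable names below are invented for the illustration, and each reading is emulated with temporaries since {} itself does not exist.

```purebasic
; Sketch: two possible meanings of {a / b}.f * c, emulated with temporaries.
Define.i a = 1, b = 2, c = 4
Define.i Quotient, ConvertAfter, ConvertBefore   ; hypothetical names
Define.f AsFloat

; Reading 1: evaluate the inside with integer math, then convert the result.
Quotient = a / b                   ; integer division: 0
AsFloat = Quotient                 ; 0.0
ConvertAfter = AsFloat * c         ; 0.0 * 4, stored as 0

; Reading 2: treat the operands as floats first, then evaluate on the FPU.
ConvertBefore = 0.0 + a / b * c    ; 0.5 * 4, stored as 2

Debug ConvertAfter
Debug ConvertBefore
```

A directive would have to pick one of these meanings and document it.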

Let's assume for a moment you declared your variable a as float. The generated code would look like this:

Code: Select all

  FLD    dword [v_a]
  FIDIV  dword [v_b]
  FIMUL  dword [v_c]
  FADD   dword [F1] ; + 1 ; 1 memory access, 1 register access
  FISTP  dword [v_d] ; 1 memory access, 1 register access

If a was an integer and you had forced float division, the first FLD would say FILD.

Now assume that we convert the fadd to an inc. It would look like this:

Code: Select all

  !FLD    dword [v_a]
  !FIDIV  dword [v_b]
  !FIMUL  dword [v_c]
  !push   eax           ; 2 register access (eax+esp), 1 memory access ([esp])
  !fistp  dword [esp]   ; 1 register access (st0), 1 memory access ([esp])
  !pop    eax           ; 2 register access (eax+esp), 1 memory access ([esp])
  !inc    eax           ; 1 register access
  !mov    dword [v_d], eax ; 1 memory access

In practice, the codes have almost the same speed, however, the first one (with fadd) is slightly faster.

Forcing type for expression.
