Re: Simple Question on Inline ASM

Posted: Fri Dec 04, 2009 5:41 pm
by Thorium
freak wrote:
Fred wrote: This is not necessarily true. By blowing up the code size with macros you can easily make the code slower due to caching effects. Calls have become really cheap in today's CPUs (because of branch prediction with call-return stacks etc). So worrying about this kind of thing is a waste of time and even dangerous. Write your code, benchmark it and then you can start thinking about these kinds of optimizations if you really need them (chances are you won't).
Just tested it and the macro was slightly faster than the procedure call without parameters. But just a few cycles, nothing to worry about.
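A minimal sketch of how such a macro-versus-procedure comparison could be timed. The names `BumpMacro`, `BumpProc` and `counter` and the iteration count are made up for illustration and are not the code that was actually tested. Compile with the debugger disabled, otherwise the timings say little.

```purebasic
EnableExplicit

#Iterations = 100000000         ; assumed count, adjust for your machine

Global counter.q                ; global, so the procedure needs no parameters

Macro BumpMacro(target)         ; hypothetical macro: body is pasted at each use
  target + 1
EndMacro

Procedure BumpProc()            ; hypothetical procedure: same work behind a call
  counter + 1
EndProcedure

Define i.i, startTime.q, macroMs.q, procMs.q, text.s

; Time the macro version (expanded inline, no call)
startTime = ElapsedMilliseconds()
For i = 1 To #Iterations
  BumpMacro(counter)
Next
macroMs = ElapsedMilliseconds() - startTime

; Time the procedure version (one call per iteration)
startTime = ElapsedMilliseconds()
For i = 1 To #Iterations
  BumpProc()
Next
procMs = ElapsedMilliseconds() - startTime

text = "Macro: " + Str(macroMs) + " ms" + #LF$
text + "Procedure: " + Str(procMs) + " ms"
MessageRequester("Macro vs. procedure call", text)
```

The loop overhead is part of both measurements, so only the difference between the two numbers is meaningful, and it will vary from CPU to CPU.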
Btw, on optimization and caching effects: does the PB compiler unroll loops? That's something where you can blow up your code size and get a big performance gain.

Re: Simple Question on Inline ASM

Posted: Fri Dec 04, 2009 6:05 pm
by newtheogott
Does the PB compiler unroll loops? That's something where you can blow up your code size and get a big performance gain.
I'd be careful with such CPU-level optimisation tips, and I would not really recommend that Fred spend much time on this topic, because this sort of optimisation is heavily CPU dependent.
Unrolling loops is just an example. If you unroll a loop, the whole loop may no longer fit in the L1 cache, etc.
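To make the term concrete, here is a small sketch of manual unrolling by a factor of 4. The array size, pass count and variable names are assumptions chosen for illustration. Whether the unrolled version is faster depends on the CPU, which is exactly the point under discussion. Compile with the debugger disabled.

```purebasic
EnableExplicit

#Count  = 4096                  ; assumed array size, must be a multiple of 4
#Passes = 50000                 ; assumed repeat count so the timings are measurable

Dim values.l(#Count - 1)
Define i.i, pass.i, startTime.q, text.s
Define rolledSum.q, unrolledSum.q, rolledMs.q, unrolledMs.q

For i = 0 To #Count - 1         ; fill the array with small test values
  values(i) = i & 255
Next

; Rolled: one element per iteration
startTime = ElapsedMilliseconds()
For pass = 1 To #Passes
  For i = 0 To #Count - 1
    rolledSum + values(i)
  Next
Next
rolledMs = ElapsedMilliseconds() - startTime

; Unrolled by 4: fewer compare-and-jump steps, but a larger loop body
startTime = ElapsedMilliseconds()
For pass = 1 To #Passes
  For i = 0 To #Count - 4 Step 4
    unrolledSum + values(i) + values(i + 1) + values(i + 2) + values(i + 3)
  Next
Next
unrolledMs = ElapsedMilliseconds() - startTime

text = "Rolled: " + Str(rolledMs) + " ms, sum " + Str(rolledSum) + #LF$
text + "Unrolled x4: " + Str(unrolledMs) + " ms, sum " + Str(unrolledSum)
MessageRequester("Loop unrolling", text)
```

Both loops must produce the same sum. If they do not, the unrolled bounds are wrong.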

This sort of optimisation can also become obsolete or, even worse, counterproductive with each new CPU generation.
The new CPUs have an advanced loop buffer.
This makes the pipeline shorter. But Nehalem goes a step farther. Nehalem's loop stream detector is at the end of the pipeline. When it sees a loop, the microprocessor can shut down everything except the loop stream detector, which sends out the appropriate instructions to a buffer
Small loops do not get decoded multiple times internally but are replayed from the loop buffer.

Tom's Hardware:
With the Nehalem architecture, Intel has improved the functionality of the Loop Stream Detector. First of all the buffer is larger—it can now store 28 instructions. But what’s more, its position in the pipeline has changed. In Conroe, it was located just after the instruction fetch phase. It’s now located after the decoders; this new position allows a larger part of the pipeline to be disabled. The Nehalem’s Loop Stream Detector no longer stores x86 instructions, but rather µops. In this sense, i
As you can read, small loops of around 28 instructions run in a kind of "turbo mode", so by unrolling beyond this size you could possibly lose speed.
I think in the long term it is much better to take the time to choose the best algorithm than to try to squeeze the final cycle out of the CPU.

About the GOSUB: this is just a difference in compiler design between Powerbasic and Purebasic. In Powerbasic I can use CALL (Label) inside a procedure. In the same way I could write GOSUB, and looking at the generated ASM output, the compiler would just emit a simple CALL instruction in both cases. This is possible because the whole design of the generated code seems to be different from the Purebasic code, maybe because the Purebasic compiler is based on a multi-platform intermediate design.
Therefore other rules apply in Purebasic, and there will be other commands to use for such situations.
I was surprised that there is a GOSUB in the help file and even a "FakeReturn", but it says "only in the main procedure". Now that I know this is by design, the discussion is pointless, as said.

Re: Simple Question on Inline ASM

Posted: Sat Dec 05, 2009 2:29 am
by Thorium
I have a Nehalem (Core i7) and I got a 15% speed-up from unrolling. Something that does not work at all anymore on Nehalem is prefetching. The automatic prefetcher is extremely good. I never get better performance if I prefetch data.

Re: Simple Question on Inline ASM

Posted: Fri Feb 19, 2016 1:35 pm
by newtheogott
I have just seen that GOSUB is now implemented.
Maybe old threads like this one can be deleted then?

GOSUB in PureBasic

Re: Simple Question on Inline ASM

Posted: Sun Apr 24, 2016 11:02 am
by charvista
newtheogott wrote: I have just seen that GOSUB is now implemented.
Gosub was already implemented a long time ago.
It is just (still) not allowed inside procedures:
Gosub may only be used within the main body of the source code, and may not be used within procedures.
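For reference, a minimal sketch of Gosub used in the main body, where it is allowed. The label name `ShowValue` and the variable `a` are chosen for illustration only. Run it with the debugger enabled to see the Debug output.

```purebasic
Define a.i

a = 1
Gosub ShowValue        ; jump to the label and remember where to come back to
a = 2
Gosub ShowValue        ; same subroutine, now sees a=2
End                    ; stop here so execution does not fall into the subroutine

ShowValue:
  Debug "called with a=" + Str(a)
Return                 ; continue after the Gosub that got us here
```

The subroutine sees `a` directly because it runs in the same scope as the code that called it.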

Re: Simple Question on Inline ASM

Posted: Tue Apr 26, 2016 1:55 pm
by PeterJ
I personally think a "gosub"-like function within procedures has some benefits:
- it lets you structure complex code as perform step1, perform step2, etc., while keeping the called sub-procedures separate
- it lets you reuse code where just some "input" variables change, without duplicating it
Certainly all this can be achieved with "real" procedures; the downside is that the variables of the "main" procedure are not in scope in the sub-procedure and must be passed to it.
I definitely agree that performance is not the issue for either solution.
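To illustrate the "real procedure" alternative and its downside, here is a small sketch in which the caller's variables are bundled into a structure and passed by pointer. `StepState`, `Sub1` and `RunSteps` are hypothetical names used only for this example.

```purebasic
EnableExplicit

Structure StepState                 ; hypothetical bundle of the caller's variables
  a.i
  total.i
EndStructure

; The sub-procedure cannot see the caller's locals, so it receives a pointer
Procedure Sub1(*state.StepState)
  Debug "called with a=" + Str(*state\a)
  *state\total = *state\total + *state\a
EndProcedure

Procedure RunSteps()
  Protected state.StepState         ; fields start at 0
  state\a = 1
  Sub1(@state)                      ; every shared variable must travel via state
  state\a = 2
  Sub1(@state)
  Debug "total=" + Str(state\total) ; 1 + 2 = 3
EndProcedure

RunSteps()
```

This works, but every variable the sub-procedure needs has to be moved into the structure, which is the overhead a Gosub inside the procedure would avoid.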

To knock up a simple GoSub proposal I use 3 Macros:

Code:

; ----------------------------------------------------------------------------------------------
; GOSUB inside procedures  
; ----------------------------------------------------------------------------------------------
; Call internal Procedure 
Macro Fcall(proc)
CompilerIf Defined(*bback,#PB_Variable)=0
   Define *bback         ; Define return address pointer, just once
CompilerEndIf  
*bback=?Return_#Proc#MacroExpandedCount  ; save return address
Goto PROC_#proc                          ; Goto sub procedure  
Return_#proc#MacroExpandedCount#:        ; Define Return Label 
EndMacro
; ----------------------------------------------------------------------------------------------
; Define internal Procedure 
; ----------------------------------------------------------------------------------------------
Macro Fproc(proc)
Define *r#proc           ; Define pointer to save return address   
; ----- handle return process in sub procedure  ------------ 
RET_#proc#:              ; Return will be processed here
*bback=*r#proc           ; restore return address
*r#proc=0                ; reset return address 
 !jmp dword [p.p_bback]  ; return to caller (x86 only; an x64 build needs qword here)
; ----- Entry for sub procedure  --------------------------- 
Proc_#proc#:             ; Entry of sub procedure call 
*r#proc=*bback           ; save return address 
EndMacro                 ; normal coding starts ... 
; ----------------------------------------------------------------------------------------------
; Return from internal Procedure 
; ----------------------------------------------------------------------------------------------
Macro FReturn(proc)
 Goto RET_#proc          ; Goto return handling in FPROC 
EndMacro
To test it, you can use the following program:

Code:

Procedure Test()
  a=1           ; set variable a
  Fcall(sub1)   ; call sub SUB1 
  a=2           ; set variable a again  
  Fcall(sub1)   ; call SUB1 again
  ProcedureReturn   
; Sub Procedure    
  FProc(sub1)   ; Define SUB1
  Debug "called with a="+Str(a)
  FReturn(sub1) ; Return from SUB1 
EndProcedure 

Test()          ; run the test procedure
Because the return is done with a JMP through the saved address, FCALL can be used several times and will always return to the statement after the respective FCALL.

I agree, it depends pretty much on your programming style; you may like or dislike it, but that's the way I prefer it.

Re: Simple Question on Inline ASM

Posted: Tue Apr 26, 2016 3:57 pm
by Danilo

Re: Simple Question on Inline ASM

Posted: Wed Apr 27, 2016 5:47 am
by PeterJ
Congrats Danilo, this is really nice coding!