 Post subject: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 1:25 pm 
Enthusiast

Joined: Sun Dec 02, 2007 12:11 pm
Posts: 256
Location: Germany
Hello everyone,

Before I ask for compiler optimizations, I would like to understand why the compiler currently generates code like that shown in the following examples. Maybe the PureBasic team can elaborate a little on the reasons?

Let's assume the following PureBasic code:
Code:
Procedure Test()
  Protected A, B
  A = Random(100)
  B = A
  ProcedureReturn B
EndProcedure
Which results in the following ASM code:
Code:
; Procedure Test()
macro MP0{
_Procedure0:
  PS0=64
  XOR    rax,rax
  PUSH   rax
  PUSH   rax
  SUB    rsp,40
; Protected A, B
; A = Random(100)
  PUSH   qword 100
  POP    rcx
  CALL   PB_Random_THREAD
  MOV    qword [rsp+40],rax
; B = A
  PUSH   qword [rsp+40]
  POP    rax
  MOV    qword [rsp+48],rax
; ProcedureReturn B
  MOV    rax,qword [rsp+48]
  JMP   _EndProcedure1
; EndProcedure
  XOR    rax,rax
_EndProcedure1:
  ADD    rsp,56
  RET
}


Example 1. The following extract from the code above:
Code:
; A = Random(100)
  PUSH   qword 100
  POP    rcx
  CALL   PB_Random_THREAD
  MOV    qword [rsp+40],rax
Could be shortened to:
Code:
; A = Random(100)
  MOV    rcx,qword 100
  CALL   PB_Random_THREAD
  MOV    qword [rsp+40],rax


Example 2. The following extract from the code above:
Code:
; B = A
  PUSH   qword [rsp+40]
  POP    rax
  MOV    qword [rsp+48],rax
Could be shortened to:
Code:
; B = A
  MOV    rax,qword [rsp+40]
  MOV    qword [rsp+48],rax
But it could also be done as:
Code:
; B = A
  PUSH   qword [rsp+40]
  POP    qword [rsp+48]


Question: Why does the current PB compiler make use of the stack and registers to move variable contents like this?

Thanks in advance!

[compiled using PB 5.11 x64 on Windows]
[Edit 1: updated to clarify code examples]


Last edited by mback2k on Sun May 12, 2013 4:54 pm, edited 2 times in total.

 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 2:12 pm 
Addict

Joined: Tue Nov 09, 2010 10:15 pm
Posts: 1487
Deleted.


Last edited by Tenaja on Sun May 12, 2013 5:30 pm, edited 1 time in total.

 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 4:49 pm 
Enthusiast

Joined: Sun Dec 02, 2007 12:11 pm
Posts: 256
Location: Germany
Thanks for your answer, but I don't think you understood the purpose of those examples. Examples 1 and 2 show the result of a single compiled PureBasic line; therefore, it is indeed possible to optimize the code produced within a single compiler pass.

1. Yes, I understand the purpose of those PUSHes you are talking about and I actually didn't question those.
2. Yes, it is a single-pass compiler, but each example is the result of a single line.
3. I am not talking about the irrelevant variable, but about the way the value is moved from A to B. Please take a look at each example and the optimization proposals below each of them.

I am sorry if my initial post was not clear enough and hope that this clarified what I am talking about. tl;dr: I am talking about ASM optimizations for a single PureBasic line.

Thanks.


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 5:18 pm 
Addict

Joined: Tue Nov 09, 2010 10:15 pm
Posts: 1487
I just looked at the 32-bit asm, and it does use just the push/call combination instead of the push/pop/call.
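
For illustration, the difference comes down to the calling conventions: on 32-bit x86 the argument can stay on the stack, while the x64 Windows convention expects the first integer argument in rcx. A rough sketch of the two shapes (the 32-bit lines are an assumption on my part, not actual compiler output, and the symbol name is borrowed from the x64 listing above):
Code:
; A = Random(100) - 32-bit shape (assumed): the argument is consumed from the stack
  PUSH   dword 100
  CALL   PB_Random_THREAD

; A = Random(100) - 64-bit shape (from the listing above): the argument must end up in rcx
  PUSH   qword 100
  POP    rcx
  CALL   PB_Random_THREAD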

Hopefully, if Fred can improve this in the 64-bit output, he will. I was wondering why some algorithms are slower in 64-bit mode; perhaps he only improved the 32-bit compiler at one time.

And btw, the fact that it is on one line (or command) is irrelevant to optimizing. What matters is where the asm instructions are generated, and whether Fred wants to complicate the procedures that generate those instructions.


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 5:31 pm 
Addict

Joined: Tue Nov 09, 2010 10:15 pm
Posts: 1487
I just saw this in the History file, under the heading dated 1st December 2009, Version 4.40:
Quote:
- Added: Peephole optimizer to 64-bit versions to produce better code


If this is true, then Fred can certainly improve the code. If he has the time.


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 8:41 pm 
Addict

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3228
Location: New Zealand
Maybe this article on x64 might help explain:
http://codejury.com/a-walk-in-x64-land/



 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 9:00 pm 
Administrator

Joined: Fri May 17, 2002 4:39 pm
Posts: 13142
Location: France
The PureBasic compiler was created around a stack-call-based model (useful for x86), so the whole pipeline uses this, and it's a very big task to add a new mode to handle a register-based model (useful for x64). So for now, it still uses the stack and pops values into registers as needed before calling the function. Even if it does look ugly (and I do agree it would be better without that), it won't be a big performance bottleneck, as everything stays in the cache. The peephole optimizer already takes care of several standard cases to produce smaller output, and I could add more if needed.
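
For illustration, if peephole rules covering the two patterns from the first post were applied, the body of Test() would shrink to roughly the following (a sketch that simply combines the two rewrites already proposed above, not actual compiler output):
Code:
; A = Random(100)
  MOV    rcx,qword 100
  CALL   PB_Random_THREAD
  MOV    qword [rsp+40],rax
; B = A
  MOV    rax,qword [rsp+40]
  MOV    qword [rsp+48],rax
; ProcedureReturn B
  MOV    rax,qword [rsp+48]
  JMP    _EndProcedure1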


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Sun May 12, 2013 10:08 pm 
Enthusiast

Joined: Sun Dec 02, 2007 12:11 pm
Posts: 256
Location: Germany
Thanks for the insight, Fred. I actually thought that the stack call based model is the (historic) reason. Thanks for the confirmation. :-)

I think it would be great if you could add the case of the second example to the peephole optimizer. Could you also elaborate a little bit on what is possible with the peephole optimizer?

I also spotted another case. Please take a look at the x64 ASM code generated from the following code and compare the assignments between the two With-sections:
Code:
Structure TestStruc
  A.Integer
  B.Integer
EndStructure

Define Test.TestStruc

With Test
  \A\i = 1
  \B\i = 2
EndWith

Define *Ptr.TestStruc = @Test

With *Ptr
  \A\i = 1
  \B\i = 2
EndWith

The first With-section results in the following code:
Code:
; With Test
; \A\i = 1
  LEA    rbp,[v_Test]
  MOV    qword [rbp],1
; \B\i = 2
  MOV    qword [rbp+8],2
; EndWith
The second one results in the following code:
Code:
; With *Ptr
; \A\i = 1
  MOV    rbp,qword [p_Ptr]
  MOV    qword [rbp],1
; \B\i = 2
  MOV    rbp,qword [p_Ptr]
  MOV    qword [rbp+8],2
; EndWith

It would be great if the rbp register could be set only once, even for pointers (and probably other dynamic types, like lists, etc.).
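
For example, the hoped-for output for the second With-section would be something like the following (simply the redundant load removed; a sketch, not actual compiler output):
Code:
; With *Ptr
  MOV    rbp,qword [p_Ptr]
; \A\i = 1
  MOV    qword [rbp],1
; \B\i = 2
  MOV    qword [rbp+8],2
; EndWith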

I know that those are very small optimizations which do not matter that much. But I still think that small optimizations can contribute to an overall better product.
Thanks in advance!


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Mon May 13, 2013 10:17 am 
Administrator

Joined: Fri May 17, 2002 4:39 pm
Posts: 13142
Location: France
Basically, the peephole optimizer parses the generated assembly code, detects standard patterns, and, if they match, replaces them in place with smaller/better code.
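
As a rough illustration only (this is just a sketch of the idea written in PureBasic, not the compiler's actual internals), a single rule such as "PUSH value / POP register -> MOV register,value" could be expressed like this:
Code:
; Sketch of one peephole rule: collapse "PUSH x" immediately followed by "POP r"
; into "MOV r,x". Returns the replacement line, or "" if the pattern does not match.
Procedure.s PeepholePushPop(Line1.s, Line2.s)
  Protected Src.s, Dst.s
  If Left(LTrim(Line1), 5) = "PUSH " And Left(LTrim(Line2), 4) = "POP "
    Src = Trim(Mid(LTrim(Line1), 6)) ; operand of the PUSH
    Dst = Trim(Mid(LTrim(Line2), 5)) ; operand of the POP
    ProcedureReturn "  MOV    " + Dst + "," + Src
  EndIf
  ProcedureReturn ""
EndProcedure

Debug PeepholePushPop("  PUSH   qword 100", "  POP    rcx") ; gives "  MOV    rcx,qword 100"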


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Fri Nov 08, 2013 8:57 pm 
Enthusiast

Joined: Sun Dec 02, 2007 12:11 pm
Posts: 256
Location: Germany
Thanks for implementing the first optimization mentioned in each of my posts above. What about the second ones?
Code:
; B = A
  PUSH   qword [rsp+40]
  POP    rax
  MOV    qword [rsp+48],rax

Code:
; With *Ptr
; \A\i = 1
  MOV    rbp,qword [p_Ptr]
  MOV    qword [rbp],1
; \B\i = 2
  MOV    rbp,qword [p_Ptr]
  MOV    qword [rbp+8],2
; EndWith


 Post subject: Re: Possible Compiler Optimizations
PostPosted: Wed Dec 06, 2017 7:20 pm 
Enthusiast

Joined: Thu Nov 26, 2015 6:52 pm
Posts: 134
Location: Italy
I wanted to ask for some clarification about Fred's statement:

Fred wrote:
Basically, the peephole optimizer parses the generated assembly code, detects standard patterns, and, if they match, replaces them in place with smaller/better code.


... so, does this imply that when the PureBasic.asm file is produced via the compiler switch /COMMENTED, the assembly code in it is still unoptimized?

If this is the case, does it mean that any manual edits made to the PureBasic.asm file will also undergo optimization when compiling it with /REASM?
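
For reference, the workflow being asked about would look roughly like this with the command-line compiler (the invocation below is an assumption on my part; please check the compiler documentation for the exact usage on your platform):
Code:
rem 1) Compile and keep the commented assembly output (PureBasic.asm)
pbcompiler /COMMENTED mysource.pb

rem 2) Hand-edit the generated PureBasic.asm as desired

rem 3) Rebuild, reusing the (edited) PureBasic.asm instead of regenerating it
pbcompiler /REASM mysource.pb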

Also, this thread is a few years old; have things changed since then regarding compiler optimizations?


