Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 12:47 am
by Josh
mk-soft wrote:PureBasic is a one-pass compiler, so there may be one PUSH/POP too many when using constants.
Somehow I can't imagine that constants influence the asm code in any way. Isn't a constant something like a simple macro that is immediately replaced with its value when the code is read?

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 1:24 am
by Tenaja
This might be more to your liking...
PureBasic is a one-pass compiler, so there may be one PUSH/POP too many when using it.

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 7:07 am
by BarryG
The problem is that PureBasic has always been focused on compilation speed and on being a fast one-pass compiler. This means little things like useless PUSH/POPs get ignored, we have to Declare procedures, and uncalled procedures sometimes get included for absolutely no reason (depending on their code). Fast compilation may have been a nice selling point a decade ago, but computers are fast enough these days that a two-pass compiler should be the norm, so that we get less bloat thanks to better peephole optimization and don't have to worry about Declaring procedures or about uncalled procedures being included. I know hell will freeze over before this happens, but I just wanted to give my two cents.
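
To make the "useless PUSH/POP" point concrete, here is a minimal sketch of a peephole pass, written in Python rather than PureBasic. It assumes a simplified instruction list of (opcode, operand) tuples; the name remove_push_pop is hypothetical, and nothing here reflects how the PureBasic compiler actually works.

```python
def remove_push_pop(instructions):
    """Drop PUSH r / POP r pairs on the same register that end up adjacent.

    Such a pair leaves both the register and the stack pointer unchanged,
    so it can be removed. Pairs that become adjacent after an inner pair
    is removed (PUSH a, PUSH b, POP b, POP a) are removed as well.
    """
    out = []
    for op, arg in instructions:
        if op == "POP" and out and out[-1] == ("PUSH", arg):
            out.pop()              # cancel the matching PUSH
        else:
            out.append((op, arg))  # keep everything else untouched
    return out
```

A real optimizer would also have to respect jump targets between the two instructions; this sketch has no labels, so that case cannot arise here. A PUSH followed by a POP into a different register is left alone, although it could be rewritten as a register move.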

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 9:23 am
by Rinzwind
@BarryG
100% agree. A 2-pass compiler must be the norm. For PB too. Its advantages are too many to ignore.

Also, it is sad that PB chose to use null-terminated C strings instead of a handy length prefix. This makes string processing slow, because the length has to be found by scanning for the terminator.
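
As a rough illustration of the difference, the following Python sketch models both layouts as byte buffers. The function names and the 4-byte little-endian header are assumptions made for the example; they do not describe PureBasic's or any other language's actual string layout.

```python
import struct

def cstr_len(buf: bytes) -> int:
    """Length of a NUL-terminated string: scan until the 0 byte, O(n)."""
    n = 0
    while buf[n] != 0:
        n += 1
    return n

def make_prefixed(text: bytes) -> bytes:
    """Store a 4-byte little-endian length in front of the payload."""
    return struct.pack("<I", len(text)) + text

def prefixed_len(buf: bytes) -> int:
    """Length of a length-prefixed string: read the header, O(1)."""
    return struct.unpack_from("<I", buf, 0)[0]
```

With the prefix, the length is a single read no matter how long the string is, and a 0 byte inside the payload is just data. With the terminator, every length query walks the buffer, and an embedded 0 byte silently cuts the string short.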

Banned C functions: strcpy, strcat, strncpy, strncat, sprintf, vsprintf, gmtime, localtime, ctime, ctime_r, asctime, asctime_r
https://news.ycombinator.com/item?id=26347867
"For reasons that were never clearly articulated, the prefix approach was considered odd, backwards, and to have numerous downsides, at least where I learned C. In hindsight, I can only cringe at that attitude. Strings as added in later Pascal, about 40 years ago now, were memory safe in a way that C strings still are not."

"In the case of C, it's a design decision Dennis Ritchie made that came down to the particular instruction set of PDP-11, that could efficiently process zero terminated strings.
So a severely memory limited architecture of the 70s led to blending of data with control - which is never a safe idea, see naked SQL."

"Whenever I review C code, I first look at the string function uses. Almost always I'll find a bug. It's usually an off by one error dealing with the terminating 0. It's also always a tangled bit of code, and slow due to repeatedly running strlen.
But strings in BASIC are so simple. They just work. I decided when designing D that it wouldn't be good unless string handling was as easy as in BASIC."

Anyway, the topic is not helpful in any way without specific details.

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 11:06 am
by TI-994A
Everything wrote:PB is almost perfect and I want to remove the word "almost".
If you are satisfied no matter what, shame on you :|
Sadly, this cannot hold true for any language. :lol:

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 11:15 am
by Josh
BarryG wrote:... and uncalled procedures sometimes get included for absolutely no reason (depending on their code).
I also looked into this once and never found procedures included for no reason. As soon as you have created a reference to a procedure, it will always be included, because a compiler simply cannot determine whether the procedure will be accessed by the reference at runtime.

I would also prefer a two-pass compiler, but you can't blame the one-pass design for things that have nothing to do with it.

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 11:34 am
by BarryG
Josh wrote:a compiler simply cannot determine whether the procedure will be accessed by the reference at runtime
A one-pass compiler can't, but a two-pass compiler can - I coded one in the past for PureBasic as a test, so I know with 100% certainty that it works. A two-pass compiler can easily see that there are no calls to the procedure below, and therefore the compiler simply doesn't include that procedure on the second pass (as if it was commented out, or removed). That's what my two-pass compiler did, and it worked great. I really need to write it again.

Code:

Procedure Never_Called_But_Included()
  MessageRequester("This is in the compiled exe","")
EndProcedure

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 12:24 pm
by Josh
Did you even read what I wrote?
BarryG wrote:
Josh wrote:a compiler simply cannot determine whether the procedure will be accessed by the reference at runtime
A one-pass compiler can't, but a two-pass compiler can
No, no compiler can do that, no matter if one-pass or two-pass compiler. If the procedure is called by reference, the access is determined only at runtime. How should the compiler know then?

Code:

a = 5

Procedure MyProc()
  MessageRequester ("", "Procedure called")
EndProcedure

x = @Myproc()
y = x + 2
y + a

CallFunctionFast (y-7)

This is just a simple example, one where the compiler could theoretically still determine whether the procedure is called or not. But just change the value of the variable 'a'. If this value can come from anywhere (user input, for instance), the compiler can no longer determine for sure whether the procedure is called.

I think in PB there is only one case where a procedure is included without need: when it is called from a procedure that itself is never called. But I can live with this exceptional case, because the additional procedure has no negative impact on the program.


The compiler can't do miracles. But if your self-written two-pass compiler can do that, why don't you use it?

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 1:54 pm
by BarryG
In your example, my two-pass compiler would've included the procedure due to the "x = @Myproc()" line. If that line wasn't there, then my two-pass compiler would've omitted the "Myproc" procedure entirely. That's what I mean. No different to how a human optimizing the source would decide to omit the procedure if the "x = @Myproc()" line wasn't there. So when I say "called procedure", I don't mean conditionally called as you showed; I simply mean a call to the procedure anywhere in the source code - whether it's ever done at runtime or not.

Re: [NOT] Optimizing compiler

Posted: Thu Mar 11, 2021 7:06 pm
by Josh
BarryG wrote:In your example, my two-pass compiler would've included the procedure due to the "x = @Myproc()" line. If that line wasn't there, then my two-pass compiler would've omitted the "Myproc" procedure entirely.
This is exactly what the Pb compiler does and even manages this in one pass. If the procedure is not called anywhere, it is not included in the exe.

Re: [NOT] Optimizing compiler

Posted: Fri Mar 12, 2021 1:09 am
by Tenaja
Where PB falls short on the "included procedures" front is the simplistic method it uses, which is just a flag indicating whether a procedure is "referenced anywhere". Some compilers use a call tree, which allows them to omit called procedures if the call is in an uncalled procedure. This is easy with two passes. (But you clearly can't use that method for libraries you compile ahead of time... Nor when only compiling files that have changed, like Ninja [as opposed to Make].)
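
The difference between the two methods can be sketched in a few lines of Python. This is only a model under stated assumptions: the program is given as a dictionary mapping each procedure to the procedures it mentions, "<main>" is a made-up name for the top-level code, and both function names are hypothetical. It is not a description of PB's internals.

```python
def referenced_anywhere(calls):
    """Flag method: keep every procedure that any code mentions at all."""
    kept = set()
    for callees in calls.values():
        kept.update(callees)
    return kept

def reachable_from_roots(calls, roots=("<main>",)):
    """Call-tree method: keep only procedures reachable from the roots."""
    kept, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        for callee in calls.get(name, ()):
            if callee not in kept:
                kept.add(callee)
                stack.append(callee)
    return kept

# Hypothetical program: Helper is only called from NeverCalled.
program = {
    "<main>": ["Used"],
    "Used": [],
    "NeverCalled": ["Helper"],
    "Helper": [],
}
```

On this input the flag method keeps Helper because something mentions it, while the call-tree method drops it along with NeverCalled. Taking a procedure's address (as with @MyProc()) would have to count as an edge too, since the call may then happen at runtime through the pointer.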

Re: [NOT] Optimizing compiler

Posted: Fri Mar 12, 2021 4:15 am
by BarryG
Josh wrote:If the procedure is not called anywhere, it is not included in the exe.
Not true. Obsolete code inside an uncalled procedure can sometimes be included in the exe. My other post (with MessageRequester) clearly and demonstrably shows this.

Here's another example: the following code creates a 148 KB executable, because the unused code inside the uncalled procedure with the PNG commands is included.

Remove the procedure, like you would if manually optimizing the code, and the executable becomes a tiny 5 KB instead.

Code:

Procedure Never_Used()
  UsePNGImageEncoder()
  UsePNGImageDecoder()
  ; Assume an image load routine is here.
  ; I haven't coded it for this example.
EndProcedure

MessageRequester("Hi","This exe is huge for no reason!")

Re: [NOT] Optimizing compiler

Posted: Fri Mar 12, 2021 8:00 am
by Josh
BarryG wrote:
Josh wrote:If the procedure is not called anywhere, it is not included in the exe.
Not true. Obsolete code inside an uncalled procedure can ...
What did I write in my post yesterday at 12:24?

BarryG wrote:So when I say "called procedure", I don't mean conditionally called as you showed; I simply mean a call to the procedure anywhere in the source code - whether it's ever done at runtime or not.
This also applies to my statement.

Re: [NOT] Optimizing compiler

Posted: Fri Mar 12, 2021 5:25 pm
by mk-soft
UsePNGImageEncoder is not a function, but a compiler directive that causes the encoder to be linked. It therefore does not belong inside a procedure; it only has to appear before the first use.

Re: [NOT] Optimizing compiler

Posted: Fri Mar 12, 2021 5:47 pm
by chi
Everything wrote:Is it possible to do something else in the field of code optimization?
You could try pbOptimizer, although there aren't many ASM optimizations yet (like removing PUSH/POP). Maybe I'll add them later?!