Short-circuit boolean evaluation to speed up your IFs

luis · Post by **luis** » Mon Nov 09, 2009 5:00 pm

Hi all

I started examining the code generated by the compiler, out of curiosity and because knowing how PB translate our source code into assembly can help us to write (sometimes) better code.

The first thing I examined is the IF statement and the way PB evaluate the boolean expression after it.
I thought could be interesting so I posted this... some beginners could find it interesting and maybe not only them

This is the example code:

Code: Select all

start = ElapsedMilliseconds()

For k = 0 To 999999
    If (k < 10000) And (Sin(Random(6)) * Cos(Random(6)) > 0.5) ; faster 
    ;;; If (Sin(Random(6)) * Cos(Random(6)) > 0.5) And (k < 10000)  ; slower 
        j + 1
    EndIf
Next

finish = ElapsedMilliseconds() - start

MessageRequester("Boolean Short-circuit eval", "time = " + Str(finish) + ", j = " + Str(j))

PB (from what I saw) use a short-circuit evaluation of the expression after the IF, like the C language does.
This means if an expression is composed by many sub-expressions ANDed in the IF statement, the evaluation of the expression is stopped when a subexpression is false. Hence the name short-circuit. Because there is no need to evaluate what it's to the right of the false sub-expression to know the main expression will be a nice FALSE.

This is the opposite of languages using full boolean evaluation where all the components of the main expression are always evaluated.

Look at a fragment of the ASM generated for the code above (I added some comments):

Code: Select all

; For k = 0 To 999999
  MOV    dword [v_k],0
_For1:
  MOV    eax,999999
  CMP    eax,dword [v_k]
  JL    _Next2
; If (k < 10000) And (Sin(Random(6)) * Cos(Random(6)) > 0.5) ; faster 
  MOV    ebx,dword [v_k] 	; now we evaluate the first sub-expression
  CMP    ebx,10000 		; compare to 10000
  JGE    No0 			; if it is not less than 10000 we skip the HEAVY second expression
  PUSH   dword 6
  CALL  _PB_Random@4
  MOV    dword [esp-4],eax
  FILD   dword [esp-4]
  FSIN
  SUB    esp,8
  FSTP   qword [esp+0]
  PUSH   dword 6
  CALL  _PB_Random@4
  FLD    qword [esp+0]
  ADD    esp,8
  MOV    dword [esp-4],eax
  FILD   dword [esp-4]
  FCOS
  FMULP  st1,st0
  FCOMP  qword [D1]
  FNSTSW ax
  TEST   ah,41h
  JNE    No0
Ok0:
  MOV    eax,1
  JMP    End0
No0:					; and here we are, a lot of code skipped
  XOR    eax,eax
End0:
  AND    eax,eax
  JE    _EndIf4
; If (Sin(Random(6)) * Cos(Random(6)) > 0.5) And (k < 10000)  ; slower 
; j + 1
  INC    dword [v_j]
; EndIf
_EndIf4:
; Next
_NextContinue2:
  INC    dword [v_k]
  JMP   _For1
_Next2:

As you can see the short-circuit evaluation can SOMETIMES save a lot of time. When ? When the first sub-expression is false, in this case. What can we learn from this ?

If we know (and very often we do) a certain sub-expression will be false the majority of the time we are going to test for it, we should put that sub-exp at the left of our main expression.

The code above demonstrate this. Please note the code is silly, but I wanted to keep it simple to not hide the important message inside a clever but obscure algorithm. The only purpose here is to easily show the relevant point.

If you look at the code, should be evident the first sub-exp (k < 10000) will be false most of the time and we can predict that in advance easily. The second sub-exp is heavier to show the gain we can have not evaluating it when it's not needed to. Sometimes it will evaluate to true, sometimes not.

Try to enable alternatively the two IF statements, and run the program (without debugger would be better to have a more realistic idea). You will see how by simply swapping the two sub-exps you can save a lot of computations.

You can do a lot of good things to speed up your code if you spend some time examining the compiler output !

Dreamland Fantasy · Post by **Dreamland Fantasy** » Tue Nov 10, 2009 1:33 pm

Hi Luis,

That's very interesting to know. Nice work!

Usually what I do is if I know a certain condition is going to be false in the majority of cases is something like this:

Code: Select all

If a = 0
  If b = c * d
    ...code...
  EndIf
EndIf

I can certainly use what you posted to help tidy up some of my code a bit.

Kind regards,

Francis

luis · Post by **luis** » Tue Nov 10, 2009 3:53 pm

Hi Francis

Dreamland Fantasy wrote:Usually what I do is if I know a certain condition is going to be false in the majority of cases is something like ...

You are perfectly right, both are equivalent and in fact the way to simulate short-circuit evaluation in languages supporting only full evaluation is to separate the test in two or more IFs.

Your way is already ok, there is no really need to change your code this way performance - wise, unless you prefer the AND form for other reasons (readability or whatever).

This was more to let people know how things works with PB specifically, and to make clear that the order of the sub-expressions is not irrelevant if you use AND in your IFs.

Using the splitted IF is equivalent and safe with any language.

Demivec · Post by **Demivec** » Wed Nov 11, 2009 3:13 am

luis wrote:This was more to let people know how things works with PB specifically, and to make clear that the order of the sub-expressions is not irrelevant if you use AND in your IFs.

PureBasic also uses short-circuit evaluation when OR is used in an IF statement. It does this by stopping at the first True evaluation. With regard to the ordering of sub-expressions it makes possible to write a statement such as this:

Code: Select all

If *pointer And *poointer\count < 15
  ;action
EndIf

This will evaluate only evaluate the second sub-expression if the first sub-expression was true. If every sub-expression was evaluated (as in some other languages) first it could result in an IMA when *pointer = #Null.

luis · Post by **luis** » Wed Nov 11, 2009 11:35 am

Yes, it's something you see constantly in C programs

Psychophanta · Post by **Psychophanta** » Fri Nov 13, 2009 2:40 pm

Nice, but you all forgot to say the evaluation is performed from left to right in the source line statements.

luis · Post by **luis** » Fri Nov 13, 2009 2:52 pm

I didn't forget it. I didn't say it.

It's quite clear to me looking at the ASM code above, its explanation and the suggested way to construct the full expression.

But if you prefer to say it, it's ok with me

Psychophanta · Post by **Psychophanta** » Sat Nov 14, 2009 1:21 am

To you is clear, but think that most newbies don't understand ASM, so it is important to point that the evaluation is performed left to right, just in case.

PureBasic Forums - English

Short-circuit boolean evaluation to speed up your IFs

Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs

Re: Short-circuit boolean evaluation to speed up your IFs