IF+IF is quicker than IF+AND or IF+OR

Posted: Sun Apr 26, 2009 8:58 am
by Ollivier
I have the gourdin.

Instead of writing this:

Code:

If Condition1 And Condition2
   ...
EndIf
It executes faster if you write this:

Code:

If Condition1
   If Condition2
      ...
   EndIf
EndIf
... and put the condition that is most often false in Condition1.
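
The ordering advice can be illustrated by counting how often each test actually runs. This is only a minimal sketch: the helper names IsRare and IsCommon and the chosen divisors are hypothetical, not taken from any real program, and it measures call counts rather than time.

```purebasic
; Hypothetical helpers: IsRare is true for 1 value in 100, IsCommon for 1 in 2.
Global RareCalls.i, CommonCalls.i

Procedure.i IsRare(n.i)
  RareCalls + 1                 ; count every evaluation
  If n % 100 = 0
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

Procedure.i IsCommon(n.i)
  CommonCalls + 1               ; count every evaluation
  If n % 2 = 0
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

Hits.i = 0
For n = 1 To 1000
  If IsRare(n)                  ; most often false, so it is tested first
    If IsCommon(n)              ; only reached when IsRare(n) is true
      Hits + 1
    EndIf
  EndIf
Next n

Debug "IsRare calls: " + Str(RareCalls)
Debug "IsCommon calls: " + Str(CommonCalls)
Debug "Hits: " + Str(Hits)
```

Run with the debugger enabled to see the Debug output. With these assumed conditions the inner test runs only for the 10 values that pass the outer one; swapping the two tests would make the inner test run 500 times for the same number of hits.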



A second observation concerns the Or operator.

Instead of writing this:

Code:

Macro MyMacro()
...
EndMacro

If Condition1 Or Condition2
   MyMacro()
EndIf
It's quicker to manually invert the two conditions and write this:

Code:

Macro MyMacro()
...
EndMacro

If (Not Condition1) ; Inverted here ('=' becomes '<>', etc.)
   If (Not Condition2) ; Same here
      ; Nothing to do
   Else
      MyMacro()
   EndIf
Else
   MyMacro()
EndIf
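
A quick way to check that the nested form really behaves like the Or form is to run both over all four combinations of the two conditions. This is a minimal sketch under stated assumptions: the result variables OrResult and NestedResult stand in for the macro call, and the inversion is written as a comparison with 0 rather than with Not.

```purebasic
Mismatches.i = 0
For Condition1 = 0 To 1
  For Condition2 = 0 To 1

    ; Form 1: plain Or
    OrResult.i = #False
    If Condition1 Or Condition2
      OrResult = #True
    EndIf

    ; Form 2: nested Ifs on the inverted conditions
    NestedResult.i = #False
    If Condition1 = 0
      If Condition2 = 0
        ; both conditions are false: nothing to do
      Else
        NestedResult = #True
      EndIf
    Else
      NestedResult = #True
    EndIf

    If OrResult <> NestedResult
      Mismatches + 1
    EndIf
  Next Condition2
Next Condition1

Debug "Mismatches: " + Str(Mismatches)
```

The sketch only checks logical equivalence; it says nothing about which form is faster.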
Ollivier

Posted: Sun Apr 26, 2009 10:54 am
by blueznl
Are you sure? Did you actually benchmark this?

Posted: Sun Apr 26, 2009 10:55 am
by Kaeru Gaman
I suspect it would depend heavily on the complexity of the conditions and the size of the code sections to execute...?

how did you test it?


... and why is it a request?

Posted: Sun Apr 26, 2009 1:09 pm
by Michael Vogel
Kaeru Gaman wrote: [...] and why is it a request?
Good question :!:

Maybe Ollivier is asking for better code optimization?
Or better control over logical operators (NOT :wink:)?
Or just some way to keep user-defined variables in CPU registers when possible?
Or something completely different?

Michael

Posted: Sun Apr 26, 2009 5:19 pm
by Kaeru Gaman

Posted: Mon Apr 27, 2009 2:35 am
by lexvictory
Kaeru Gaman wrote:
That's my favourite part of that movie! :lol:

Posted: Sat Jun 06, 2009 2:52 pm
by Ollivier
Here is a benchmark.
I get a difference of about -33% (XP, 1 GHz, single core).

Code:

;*******************
; And gate test
;*******************

DisableDebugger
   
; FIRST MODULE

   Delay(1)
   Begin1 = ElapsedMilliseconds()
   For A = 0 To 8000
      For B = 0 To 8000
         If A And B 
         
         EndIf
      Next B
   Next A
   Finish1 = ElapsedMilliseconds()

; SECOND MODULE

   Delay(1)
   Begin2 = ElapsedMilliseconds()
   For A = 0 To 8000
      For B = 0 To 8000
         If A
            If B
            
            EndIf
         EndIf
      Next B
   Next A
   Finish2 = ElapsedMilliseconds()

   Duration1.F = Finish1 - Begin1
   Duration2.F = Finish2 - Begin2
   MessageRequester("AND Gate tests", "First module : " + Str(Finish1 - Begin1) + Chr(10) + "Second module : " + Str(Finish2 - Begin2) + Chr(10) + "Duration difference = " + Str(Int(100.0 * Duration2 / Duration1) - 100) + "%")
   

Posted: Sat Jun 06, 2009 3:04 pm
by AND51
I ran this test several times with the debugger disabled (don't rely on DisableDebugger alone!) and my results vary from 0% to -38% (average: -20%).

Testing conditions: PB 4.30 x86 @ Vista x64.
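
Since single runs vary this much, one option is to repeat both loops several times in the same program and compare the totals. This is a minimal sketch, assuming an arbitrary pass count (#Passes is a made-up constant) and the same loop bound as the benchmark above. It reports raw totals only.

```purebasic
DisableDebugger

#Passes = 5       ; assumption: arbitrary number of repetitions
#Limit  = 8000    ; same loop bound as the benchmark above

Define.i Pass, A, B, Start, Total1, Total2

For Pass = 1 To #Passes

  ; If + And
  Start = ElapsedMilliseconds()
  For A = 0 To #Limit
    For B = 0 To #Limit
      If A And B
      EndIf
    Next B
  Next A
  Total1 + (ElapsedMilliseconds() - Start)

  ; If + If
  Start = ElapsedMilliseconds()
  For A = 0 To #Limit
    For B = 0 To #Limit
      If A
        If B
        EndIf
      EndIf
    Next B
  Next A
  Total2 + (ElapsedMilliseconds() - Start)

Next Pass

MessageRequester("AND Gate tests", "If+And total: " + Str(Total1) + " ms" + Chr(10) + "If+If total: " + Str(Total2) + " ms")
```

Alternating the two loops within each pass spreads any background load over both measurements instead of letting it hit only one of them.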

Posted: Sat Jun 06, 2009 3:18 pm
by Kaeru Gaman
Kaeru Gaman wrote: I suspect it would depend heavily on the complexity of the conditions and the size of the code sections to execute
with the simplest conditions and no code in between, what would be the point of testing it?

anyhow,
it would be nice to know WHY this result occurs,
and it IS nice to know that more-and-simpler lines are faster than fewer-and-more-complex lines.

Posted: Sat Jun 06, 2009 7:20 pm
by Demivec
Here's some assembly from the speed test code, with each module's code listed side by side, for x86:

Code:

;                                                    
; FIRST MODULE                                       ; SECOND MODULE                  
;                                                    ;                                
                 
; If a And b                                         ; If a                           
  CMP    dword [v_a],0                                 CMP    dword [v_a],0           
  JE     No0                                           JE    _EndIf12                 
                                                     ; If b                   
  CMP    dword [v_b],0                                 CMP    dword [v_b],0           
  JE     No0                                           JE    _EndIf14                 
Ok0:                                                         
  MOV    eax,1                                                                       
  JMP    End0                                                               
No0:                                                                        
  XOR    eax,eax                                                            
End0:                                                                       
  AND    eax,eax                                                 
  JE    _EndIf6  
  
; Code inside conditional goes here                  ; Code inside conditional goes here
                                                  
; EndIf                                              ; EndIf               
                                                       _EndIf14:             
                                                     ; EndIf               
_EndIf6:                                               _EndIf12: 
The First Module's assembled code seems to have about twice as many instructions. My guess is that the simplified method of the Second Module can only be applied to conditions chained with And. When other operators (Or and Not) are mixed in, the translation gets more complicated. As a result, the more general method of the First Module is used for all multi-condition expressions, at the cost of some speed.

@Ollivier: if your method could be extended (or specified more completely) somewhat I think it would be adopted by Fred (I hope), barring some other complication. :wink:

Posted: Sun Jun 07, 2009 1:55 am
by Rescator
I got 2% to 6%, averaging 4%.
So the first module was always faster here.

Phenom X3 2.1 GHz.

Posted: Sun Jun 07, 2009 2:07 am
by Rescator
Haha, here's a head scratcher indeed. The previous post was with PureBasic x64.

When I ran the test again with PureBasic x86 I consistently got -27%.

:shock:

The only thing I can think of is that "complex" code runs better on x64 because twice (!) as many registers are available, right?

Posted: Sun Jun 07, 2009 12:28 pm
by Ollivier
@Rescator

I deduce that the problem is solved in the x64 version, which I can't test at the moment.

Ollivier

Posted: Mon Jun 08, 2009 8:54 am
by Trond
The test is flawed. With both loops running from 0 to 8000, the condition is false only 16,001 times and true 64,000,000 times, which is not very balanced.
Also, there would realistically be some code inside the condition, so let's add some. Suddenly the And version is faster.
I think the If version is faster with an empty body because the processor's branch prediction detects that the check is pointless and optimizes based on that.

Code:

; FIRST MODULE

Delay(1)
Begin1 = ElapsedMilliseconds()
For A = 0 To 10000
  For B = 0 To 10000
     If A&1 And B&1
       somecode+goes-here
     EndIf
  Next B
Next A
Finish1 = ElapsedMilliseconds()

; SECOND MODULE

Delay(1)
Begin2 = ElapsedMilliseconds()
For A = 0 To 10000
  For B = 0 To 10000
     If A&1
        If B&1
          somecode+goes-here
        EndIf
     EndIf
  Next B
Next A
Finish2 = ElapsedMilliseconds()

Duration1.d = Finish1 - Begin1
Duration2.d = Finish2 - Begin2
MessageRequester("AND Gate tests", "First module : " + Str(Finish1 - Begin1) + Chr(10) + "Second module : " + Str(Finish2 - Begin2) + Chr(10) + "Duration difference = " + Str(Int(100.0 * Duration2 / Duration1) - 100) + "%")

Posted: Mon Jun 08, 2009 12:45 pm
by Demivec
@Trond: With your code I get -23% to -28%.

I'm running Windows XP SP3 (x86) on an AMD 64 3500+.

With Ollivier's code I got -17% to -53%.