PureBench32 progress laughable, consistency question!

djes
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

Hmmm. You're right, though in my mind alignment is still good to have for comparable results. A more useful thing to add is high task priority:

Code: Select all

SetPriorityClass_(  GetCurrentProcess_(), #HIGH_PRIORITY_CLASS)
Road Runner
User
Posts: 48
Joined: Tue Oct 07, 2003 3:10 pm

Post by Road Runner »

Both code and data alignment are important to timings, but I can't imagine a modern compiler not aligning 32-bit data, such as longs and floats, on a 32-bit (4-byte) boundary. Some compilers may misalign larger data such as doubles, which should sit on an 8-byte boundary for best performance but are often only 4-byte aligned.
Maybe you need to check data alignment, if only to exclude it as a problem.
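
One quick way to rule data alignment in or out is to print the low bits of a few variable addresses. This is only a minimal sketch: the variable names are made up for the example, and it says nothing about code alignment.

Code: Select all

; Hypothetical test variables, one per type mentioned above
Global TestLong.l
Global TestFloat.f
Global TestDouble.d

OpenConsole()
; 0 means the variable sits on the boundary being tested
PrintN("long,   address & 3: " + Str(@TestLong & 3))
PrintN("float,  address & 3: " + Str(@TestFloat & 3))
PrintN("double, address & 7: " + Str(@TestDouble & 7))
Input()
CloseConsole()

A double that reports 4 on the last line would be the 4-byte-aligned case described above.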
djes
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

We were talking about code alignment, not data alignment. We could also discuss rearranging the code to suit the task, but I don't have enough experience with x86 assembly for that.
Helle
Enthusiast
Posts: 178
Joined: Wed Apr 12, 2006 7:59 pm
Location: Germany

Post by Helle »

For alignment up to 16 you can use this macro:

Code: Select all

!macro Blign value ;Begin Macro
!{
!a = value - (($ - $$) mod value) ;$ = current offset address, $$ = base address of the section

!if a = value 
  !a = 0             
!end if

!if a=1
  !irp value, $90  ;NOP, the well-known standard-value                              
  !\{
  !DB value
  !\} 
!end if

!if a=2
  !irp value, $8B, $C0  ;MOV EAX,EAX
  !\{
  !DB value
  !\} 
!end if

!if a=3
  !irp value, $8D, $40, $00  ;LEA EAX, dword ptr[EAX + 00]
  !\{
  !DB value
  !\} 
!end if

!if a=4
  !irp value, $8D, $44, $20, $00  ;LEA EAX, dword ptr[EAX]
  !\{
  !DB value
  !\} 
!end if

!if a=5
  !irp value, $66, $8D, $54, $22, $00  ;LEA DX, word ptr[EDX] 
  !\{
  !DB value
  !\} 
!end if

!if a=6
  !irp value, $8D, $80, $00, $00, $00, $00  ;LEA EAX, dword ptr[EAX + 00000000]
  !\{
  !DB value
  !\} 
!end if

!if a=7
  !irp value, $8D, $04, $05, $00, $00, $00, $00  ;LEA EAX, dword ptr[EAX + 00000000]
  !\{
  !DB value
  !\} 
!end if

!if a=8
  !irp value, $66, $8D, $04, $05, $00, $00, $00, $00  ;LEA AX, word ptr[EAX + 00000000]
  !\{
  !DB value
  !\} 
!end if

!if a=9
  !irp value, $90, $66, $8D, $04, $05, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if

!if a=10
  !irp value, $8B, $C0, $66, $8D, $04, $05, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if

!if a=11
  !irp value, $66, $8D, $54, $22, $00, $8D, $80, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if

!if a=12
  !irp value, $66, $8D, $54, $22, $00, $8D, $04, $05, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if

!if a=13
  !irp value, $66, $8D, $54, $22, $00, $66, $8D, $04, $05, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if

!if a=14
  !irp value, $8D, $80, $00, $00, $00, $00, $66, $8D, $04, $05, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if

!if a=15
  !irp value, $8D, $04, $05, $00, $00, $00, $00, $66, $8D, $04, $05, $00, $00, $00, $00
  !\{
  !DB value
  !\} 
!end if
!}                 ;End Macro
;========================================================================================

;- Test

Global TimeToRun.l = 5000 
Global IntIterations.l 
Global FpIterations.l 

OpenConsole() 

    Timer.d = ElapsedMilliseconds() 
    IntIterations = 0 
    
    PrintN("Starting...") 
    Delay(20) 
    
    While ElapsedMilliseconds() < Timer + TimeToRun 
        IntA.l = 1234 
        IntB.l = 4321 
        IntIterations = IntIterations + 1 
        For i = 1 To 1000 
            iAns.l = IntA + IntB 
            iAns = IntA - IntB 
            iAns = IntB % IntA 
            iAns = IntA * IntB 
            IntA = IntA + i 
            IntB = IntB - i 
        Next      
    Wend 

    PrintN("Integer Iterations: " + Str(IntIterations)) 
    
    Timer.d = ElapsedMilliseconds() 
    FpIterations = 0 
    Delay(100)  ;let the cpus calm down 
    
    While ElapsedMilliseconds() < Timer + TimeToRun 
      !Blign 8 
        FPA.d = 1234.1234 
        FPB.d = 4321.4321 
        FpIterations = FpIterations + 1 
        For i = 1 To 1000 
            fAns.d = FPA + FPB 
            fAns = FPA - FPB 
            fAns = FPB / FPA 
            fAns = FPA * FPB 
            FPA = FPA + i 
            FPB = FPB - i 
        Next      
    Wend 
    
    PrintN("Floating Point Iterations: " + Str(FPIterations)) 
    Input() 

CloseConsole() 

But it only has the proper effect in asm code...
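
For reference, the padding arithmetic at the top of the macro can be checked in plain PB. This is just a sketch of the formula; PadBytes is a name invented for this example and is not part of the macro.

Code: Select all

; Same calculation as the macro: filler bytes needed to reach the next boundary
Procedure.l PadBytes(Offset.l, Alignment.l)
  Protected a.l
  a = Alignment - (Offset % Alignment)
  If a = Alignment           ; already on the boundary, nothing to add
    a = 0
  EndIf
  ProcedureReturn a
EndProcedure

OpenConsole()
PrintN(Str(PadBytes(13, 8)))   ; 13 % 8 = 5, so 3 filler bytes
PrintN(Str(PadBytes(16, 8)))   ; already aligned, so 0
PrintN(Str(PadBytes(1, 16)))   ; 15, the longest filler the macro provides
Input()
CloseConsole()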

Regards,
Helle
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Perhaps it's not an alignment issue?

Starting...
Integer Iterations: 543985
Floating Point Iterations: 419598

Starting...
Integer Iterations: 629234
Floating Point Iterations: 92945

I used your macro code, changed the .f to .d, and ran it again. I have no issue with the floating point times, but the integer times are out by over 10% again. :?

It's always the same though: in all these tests, using doubles rather than floats makes the ints run faster :?:
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Road Runner
User
Posts: 48
Joined: Tue Oct 07, 2003 3:10 pm

Post by Road Runner »

Paul,
you're going to have to look at the underlying ASM code to see what's going on.
Create the simplest demo you can that demonstrates the problem, then compile it and look at the ASM to see how the compiled code for the integers differs in the 2 cases.
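
As a starting point, such a demo might look like the sketch below. It is an assumption, not a confirmed reproduction: it keeps only the integer loop from the benchmark, does a fixed amount of work instead of running for a fixed time, and uses a single float variable as the thing to switch between builds. #Loops and its value are invented for this example.

Code: Select all

#Loops = 200000              ; assumed amount of work, adjust to taste

Global IntA.l, IntB.l, iAns.l, i.l, n.l
Global FPA.f = 1234.1234     ; switch .f to .d for the second build

OpenConsole()
Start.l = ElapsedMilliseconds()
For n = 1 To #Loops
  IntA = 1234
  IntB = 4321
  For i = 1 To 1000
    iAns = IntA + IntB
    iAns = IntA - IntB
    iAns = IntB % IntA
    iAns = IntA * IntB
    IntA = IntA + i
    IntB = IntB - i
  Next
Next
PrintN("Integer loop: " + Str(ElapsedMilliseconds() - Start) + " ms")
Input()
CloseConsole()

If the two builds time the same, the float code has to be added back piece by piece until the difference returns; the ASM of that smallest failing pair is the one worth comparing.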
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

How do I view the ASM?

My ASM skills aren't good, but I can read it well enough to make some comparisons (so I should be able to check the INT differences if there are any, which there shouldn't be).

Is there some switch in the compiler to dump the ASM code?
djes
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

No problem here changing .f to .d

Code: Select all

SetPriorityClass_(  GetCurrentProcess_(), #HIGH_PRIORITY_CLASS) 

Global TimeToRun.l = 5000
Global IntIterations.l
Global FpIterations.l

OpenConsole()

 
Goto f
!SECTION '.testf' CODE READABLE EXECUTABLE ALIGN 4096
f:

    IntIterations = 0
    Timer.d = ElapsedMilliseconds()

    While ElapsedMilliseconds() < Timer + TimeToRun
        IntA.l = 1234
        IntB.l = 4321
        IntIterations = IntIterations + 1
        For i = 1 To 1000
            iAns.l = IntA + IntB
            iAns = IntA - IntB
            iAns = IntB % IntA
            iAns = IntA * IntB
            IntA = IntA + i
            IntB = IntB - i
        Next     
    Wend

    PrintN("Integer Iterations: " + Str(IntIterations)) 

    Delay(100)  ;let the cpus calm down
 
Goto g
!SECTION '.testf' CODE READABLE EXECUTABLE ALIGN 4096
g:

    FpIterations = 0
    Timer.d = ElapsedMilliseconds()

    While ElapsedMilliseconds() < Timer + TimeToRun
        FPA.d = 1234.1234
        FPB.d = 4321.4321
        FpIterations = FpIterations + 1
        For i = 1 To 1000
            fAns.d = FPA + FPB
            fAns = FPA - FPB
            fAns = FPB / FPA
            fAns = FPA * FPB
            FPA = FPA + i
            FPB = FPB - i
        Next     
    Wend
   
    PrintN("Floating Point Iterations: " + Str(FPIterations))

Goto f

Input()

CloseConsole() 


End
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

This is good, in the sense that it's the reverse of before, so there's no weird connection between the types, and it points to alignment again.

If I use your code I get this:

Integer Iterations: 570102
Floating Point Iterations: 428964
... repeat

Then if I change the .d to .f I get:

Integer Iterations: 691000
Floating Point Iterations: 428520
... repeat

Previously longs were better with doubles; now they're better with floats. A type change is still changing the perf of an unrelated area of code, though. I guess it can't be helped.
Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

pdwyer wrote: How do I view the ASM?

My ASM skills aren't good, but I can read it well enough to make some comparisons (so I should be able to check the INT differences if there are any, which there shouldn't be).

Is there some switch in the compiler to dump the ASM code?

I wrote a tool (a bat file) to show the asm of any open file from the IDE. (Note: it does not use compilation options such as threadsafe.)

Arguments: "%TEMPFILE" (in quotes)

Code: Select all

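@echo off
rem /d also switches drive if the tool is started from another one
cd /d "c:\program files\purebasic\compilers"
rem /COMMENTED makes the compiler write a commented PureBasic.asm
pbcompiler %1 /COMMENTED %2
rem on a compile error, keep the window open so the message can be read
if errorlevel 1 goto wait
start PureBasic.asm
exit
:wait
pause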

Helle
Enthusiast
Posts: 178
Joined: Wed Apr 12, 2006 7:59 pm
Location: Germany

Post by Helle »

I think it's a problem with the alignment in the data section and the order of the variables. Try this:

Code: Select all

;PB_DataPointer.l            ;already exists, created by PB
Global TimeToRun.l = 5000 
Global IntIterations.l 
Global FpIterations.l 
Global IntA.l
Global IntB.l
Global iAns.l
Global Timer.l
Global i.l
Global X0.l                  ;Dummy
Global X1.l                  ;Dummy
Global X2.l                  ;with the dummies we have 12 longs

 
OpenConsole() 

    Timer.l = ElapsedMilliseconds() 
    IntIterations = 0 
    
    PrintN("Starting...Single") 
    Delay(20) 
   
    While ElapsedMilliseconds() < Timer + TimeToRun 
        IntA.l = 1234 
        IntB.l = 4321 
        IntIterations = IntIterations + 1 
        For i = 1 To 1000 
            iAns.l = IntA + IntB 
            iAns = IntA - IntB 
            iAns = IntB % IntA 
            iAns = IntA * IntB 
            IntA = IntA + i 
            IntB = IntB - i 
        Next      
    Wend 

    PrintN("Integer Iterations: " + Str(IntIterations)) 
    
    Timer.l = ElapsedMilliseconds() 
    FpIterations = 0 
    Delay(100)  ;let the cpus calm down 
  
    While ElapsedMilliseconds() < Timer + TimeToRun 
        FPA.f = 1234.1234 
        FPB.f = 4321.4321 
        FpIterations = FpIterations + 1 
        For i = 1 To 1000 
            fAns.f = FPA + FPB 
            fAns = FPA - FPB 
            fAns = FPB / FPA 
            fAns = FPA * FPB 
            FPA = FPA + i 
            FPB = FPB - i 
        Next      
    Wend 
    
    PrintN("Floating Point Iterations: " + Str(FPIterations)) 
    Input() 

CloseConsole() 

Code: Select all

;PB_DataPointer.l            ;already exists, created by PB
Global X0.l                  ;Dummy
Global TimeToRun.l = 5000 
Global IntIterations.l 
Global FpIterations.l 
Global IntA.l
Global IntB.l
Global iAns.l
Global Timer.l
Global i.l
Global X1.l                  ;Dummy
Global X2.l                  ;with the dummies we have 12 longs
  
 
OpenConsole() 

    Timer.l = ElapsedMilliseconds() 
    IntIterations = 0 
    
    PrintN("Starting...Double") 
    Delay(20) 
  
    While ElapsedMilliseconds() < Timer + TimeToRun 
        IntA = 1234 
        IntB = 4321 
        IntIterations = IntIterations + 1 
        For i = 1 To 1000 
            iAns = IntA + IntB 
            iAns = IntA - IntB 
            iAns = IntB % IntA    ;Modulo or Div?
            iAns = IntA * IntB 
            IntA = IntA + i 
            IntB = IntB - i 
        Next      
    Wend 

    PrintN("Integer Iterations: " + Str(IntIterations)) 
    
    Timer = ElapsedMilliseconds() 
    FpIterations = 0 
    Delay(100)  ;let the cpus calm down 
   
    While ElapsedMilliseconds() < Timer + TimeToRun 
        FPA.d = 1234.1234 
        FPB.d = 4321.4321 
        FpIterations = FpIterations + 1 
        For i = 1 To 1000 
            fAns.d = FPA + FPB 
            fAns = FPA - FPB 
            fAns = FPB / FPA 
            fAns = FPA * FPB 
            FPA = FPA + i 
            FPB = FPB - i 
        Next      
    Wend 
    
    PrintN("Floating Point Iterations: " + Str(FPIterations)) 
    Input() 

CloseConsole() 
Regards,
Helle
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

This seems to work! :)

Could you explain it a bit? You padded the global vars with dummies, but I don't understand it well enough to use it in a different situation. Is this only because there isn't much code, and would any number of globals beyond this not be a problem?

Thanks
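
One way to see what the dummies change is to print where the globals end up. The sketch below reuses the declaration order of the "Double" listing and assumes PB lays globals out in that order, which the two listings suggest but which is not confirmed in this thread. The 16-byte block used for the offset is also just an assumption for illustration.

Code: Select all

Global X0.l                  ; dummy placed first, as in the "Double" listing
Global TimeToRun.l = 5000
Global IntIterations.l
Global FpIterations.l
Global IntA.l
Global IntB.l
Global iAns.l
Global Timer.l
Global i.l
Global X1.l
Global X2.l

OpenConsole()
; Hex() shows the address, & 15 the offset inside an assumed 16-byte block
PrintN("X0   at $" + Hex(@X0)   + "  offset " + Str(@X0 & 15))
PrintN("IntA at $" + Hex(@IntA) + "  offset " + Str(@IntA & 15))
PrintN("IntB at $" + Hex(@IntB) + "  offset " + Str(@IntB & 15))
PrintN("iAns at $" + Hex(@iAns) + "  offset " + Str(@iAns & 15))
PrintN("i    at $" + Hex(@i)    + "  offset " + Str(@i & 15))
Input()
CloseConsole()

Moving X0 back to the end of the list, as in the "Single" listing, and running it again would show whether the offsets of the loop variables really shift.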
Demivec
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Post by Demivec »

With Helle's code I get these results on my AMD Athlon 64 3500+ (1.79 GHz):

For singles:
  • Integer Iterations: 189721
    Floating Point Iterations: 414949
For doubles:
  • Integer Iterations: 186721
    Floating Point Iterations: 212254

With these tests, Doubles are now half as fast as Floats, and Integers are consistent between the two tests. It's still a little confusing that Floats and Doubles are both faster than Integers.
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Are your int scores okay? That seems weird.

Starting...Double
Integer Iterations: 701282
Floating Point Iterations: 420483

Starting...Single
Integer Iterations: 700125
Floating Point Iterations: 420459

This method isn't working too well. I might go back to the idea of having the benchmark do some real work, like encryption and compression. Perhaps this kind of simple math is a poor test because of things like CPU cache hits on nearly identical code run in a loop.
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Something more like this maybe... (needs thread safe turned on)

Starting...
Single thread iterations: 92542
8 Thread iterations: 377289
Difference: 307.69%
Done!

Code: Select all


#SubOpCount = 1000

Global gIterations.l
Global MtxIterates.l

Declare.l CryptSpeed(MilliSecondsToRun) 

Declare Main()

MtxIterates = CreateMutex()
Main()


Procedure Main()

    OpenConsole()

        PrintN("Starting...")
        TimeToRun.l = 2000

        CreateThread(@CryptSpeed(),TimeToRun)
        Delay(TimeToRun + 200)
        
        PrintN("Single thread iterations: " + Str(gIterations))
        TmpIterations = gIterations
        Delay(20)

        gIterations = 0 
        
        For i = 1 To 8
            CreateThread(@CryptSpeed(),2000)
            Delay(20)
        Next
        
        Delay(2000 + 1000)
                
        PrintN("8 Thread iterations: " + Str(gIterations))
        PrintN("Difference: "+ StrF((gIterations/TmpIterations * 100) -100,2) + "%" )
        PrintN("Done!")
        Input()
    
    CloseConsole()
EndProcedure


Procedure CryptSpeed(MilliSecondsToRun.l)

    *Buffer = AllocateMemory(1024)
    Protected IterationCount.l = 0
    Protected Timer.l = ElapsedMilliseconds()
    
    While ElapsedMilliseconds() < Timer + MilliSecondsToRun
    
        IterationCount = IterationCount + 1
        
        For i = 0 To 1023
            PokeB(*Buffer + i,i % 255)
        Next
            
        Result = CRC32Fingerprint(*Buffer, 1024)   
        Results.s = MD5Fingerprint(*Buffer, 1024)
        Results.s = DESFingerprint(PeekS(*Buffer), PeekS(*Buffer))  
    Wend
    
    FreeMemory(*Buffer)
    
    LockMutex(MtxIterates)
    gIterations = gIterations + IterationCount
    UnlockMutex(MtxIterates)
    
EndProcedure
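
The "Difference" line is the percentage gain of the 8-thread total over the single-thread total. A small sketch of that arithmetic with the figures from the run above, using doubles so the division is not truncated (the variable names are invented for this example):

Code: Select all

SingleRun.d = 92542          ; single thread iterations from the run above
EightThreads.d = 377289      ; 8 thread iterations from the run above
Gain.d = (EightThreads / SingleRun * 100) - 100

OpenConsole()
PrintN("Difference: " + StrD(Gain, 2) + "%")   ; prints Difference: 307.69%
Input()
CloseConsole()

A gain of about 308% means the eight threads together did roughly four times the work of the single thread.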

