Page 3 of 3

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 11:25 am
by Fred
Tenaja wrote:
Shield wrote:Well, C# and Java are compilers and they do a great job about optimizing the binaries.
Although they output bytecode, it's still machine code while the application is running and therefore no penalties,
and even better, "just in time" optimization. :)

I'd also like to see more compiler features rather than new libraries, though. :)

Edit: oh, Fred was faster there. :)
Well, sure, sorta...but not really.

C++ consistently is 3x faster than Java and the .NET crew. (At least in the current benchmarks I was researching last month.)

While I wouldn't anticipate a "cafe basic" to be as fast as C++ (although there is no technical reason it cannot be), I would certainly expect it to be faster than anything that puts out a resemblance of bytecode.
As I said, JAVA is compiled. It's not becuase the executable is a bytecode than it gets interpreted. Where did you find your benchmarks ? The few i found doesn't see such a difference, and in some case JAVA is faster the C++. For example, a recent one:

http://blog.cfelde.com/2010/06/c-vs-java-performance/

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 12:20 pm
by luis
c4s wrote:Yes, this would be great! Honestly I thought this already happens but as it it doesn't work for includes?!
Danilo already explained this, but anyway there is a summary I wrote some time ago:

http://www.purebasic.fr/english/viewtop ... 03#p300303

It would be really great if PB could remove the unused procedures, and report the unused vars too for that matter.

Both things, while immensely appreciated by some of us, shouldn't be terrible difficult to implement... in any case the effort would be well spent... IMHO.

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 1:12 pm
by Fred
Without a multiple pass compiler, it's actually difficult to detect references accross procedures, that's why it's not done for now.

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 5:45 pm
by Tenaja
Fred wrote:As I said, JAVA is compiled. It's not because the executable is a bytecode than it gets interpreted. Where did you find your benchmarks ? The few i found doesn't see such a difference, and in some case JAVA is faster the C++. For example, a recent one:

http://blog.cfelde.com/2010/06/c-vs-java-performance/
I am fully aware of the differences between interpreters and virtual machines. That's why Java is slow to get started.

I was searching diligently a couple months ago, and had not found your link. In this benchmark test set (which allows comparing numerous compilers), Java beats C++ in only one test:
http://shootout.alioth.debian.org/u64q/ ... &lang2=gpp

I do admit, I misquoted the 3x--I must have been remembering the C# scores--but C++ is still consistently and noticeably faster.


...on the other hand, like you said, Java has had an institution behind it, whereas gcc has users behind it--who knows how good its optimizations are. I remember with one target I used it on, the compiler didn't compensate for double-sized variables that the cpu could handle. Every use of double-words resulted in about 4x the instructions as hand written asm code.

Here I found yet another benchmark, from just after yours:
http://www.codeproject.com/KB/dotnet/Ru ... mance.aspx
Of course, this one is a little more Windows oriented, and it is primarily looking at startup and UI initialization. Nevertheless, this is a quote from the article:
As you can easily see, C++ is a clear winner in all categories but with some caveats to that: the caller process 'helps' the C++ tests by preloading some common dlls and also there are no calls made to the garbage collector.
As in the previous non UI tests, C++/MFC appears to perform a lot better than the other frameworks.
One thing I found interesting with this test
http://shootout.alioth.debian.org/
was that C# appeared to take the biggest hit (or be the least optimized) for 64-bit

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 5:51 pm
by Trond
For the stripping of procedures (which does not really affect speed). With a data structure that acts both as a map/list and as tree at the same time, it can be done quite elegantly:

Code: Select all

;- Procedure node
Structure SProc
  List *Calls.SProc()
  IsCalled.i ; Whether to include this procedure in the asm output
EndStructure

;- All procedures with their calls are stored in this list
NewMap AllProcedures.SProc()
; Except a "special" variable for collecting calls that occur outside any procedure
TopLevel.SProc

;- Add our procedures
; This can happen gradually during the single pass compilation
For I = 'a' To 'g'
  AllProcedures(Chr(i))
Next

;- Add procedure calls
; This can happen gradually during the single pass compilation.
; Because procedures cannot be called before they are defined/declared
; this and the previous step can be intertwined in a single pass!

; b() calls b()
AddElement(AllProcedures("b")\Calls())
AllProcedures("b")\Calls() = FindMapElement(AllProcedures(), "b")

; d() calls c()
AddElement(AllProcedures("d")\Calls())
AllProcedures("d")\Calls() = FindMapElement(AllProcedures(), "c")

; e() calls f()
AddElement(AllProcedures("e")\Calls())
AllProcedures("e")\Calls() = FindMapElement(AllProcedures(), "f")

; f() calls e()
AddElement(AllProcedures("f")\Calls())
AllProcedures("f")\Calls() = FindMapElement(AllProcedures(), "e")

; g() calls g()
AddElement(AllProcedures("g")\Calls())
AllProcedures("g")\Calls() = FindMapElement(AllProcedures(), "g")

; Toplevel calls f
AddElement(TopLevel\Calls())
TopLevel\Calls() = FindMapElement(AllProcedures(), "f")

; Toplevel calls g
AddElement(TopLevel\Calls())
TopLevel\Calls() = FindMapElement(AllProcedures(), "g")

;- Set \IsCalled to 0 for all procedures
; IsCalled is initialized to 0, so we don't need to do anything special

;- Traverse the map as a tree, to mark called procedures
Procedure TraverseAndMark(*N.SProc)
  ForEach *N\Calls()
    If Not *N\Calls()\IsCalled
      *N\Calls()\IsCalled = 1
      TraverseAndMark(*N\Calls())
    EndIf
  Next
EndProcedure

TraverseAndMark(TopLevel)

;- Traverse map as map again, to know which procedures were marked
ForEach AllProcedures()
  If AllProcedures()\IsCalled
    Debug MapKey(AllProcedures())
  EndIf
Next
The relationships between the procedures was drawn from this code:

Code: Select all

Procedure a() 
  ; NOT NEEDED
  ; never referenced  
EndProcedure

Procedure b()
  ; NOT NEEDED
  ; referenced only by itself (recursive)
  b()
EndProcedure

Procedure c()
  ; NOT NEEDED
  ; referenced by a function never referenced: d()
EndProcedure

Procedure d()
  ; NOT NEEDED
  ; never referenced
  c()
EndProcedure

Declare f()
Procedure e()
  ; NEEDED
  ; referenced and called by f()
  ; and also we have a mutual recursion here, to really test the algorithm
  f()
EndProcedure

Procedure f()
  ; NEEDED
  e()
EndProcedure

Procedure g()
  ; NEEDED
  ; referenced by itself and toplevel
  g()
EndProcedure

f()
g()

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 8:38 pm
by Blood
All these benchmarks are useless to argue over. Just use the right tool for the job at hand and forget favoritism. This is the mark of a good developer! It's pointless arguing, sometimes it's best to use one, sometimes it's best to use the another. Next question....

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 8:56 pm
by blueznl
Fred wrote:Without a multiple pass compiler, it's actually difficult to detect references accross procedures, that's why it's not done for now.
Weeeeeeelllllll.... Even I could do it using CodeCaddy, but indeed that's more than one pass :-) and crappy written at that ;-)

Still, would it be that bad to do an additonal pass when creating a stand alone executable, as opposed to including the unused stuff when doing a simple test/run?

Edit: must have another look at Trond's code, he's smarter than me, anyway... <whistling innocently>

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 10:46 pm
by X
Fred wrote:Without a multiple pass compiler, it's actually difficult to detect references accross procedures, that's why it's not done for now.
oo! Is this a hint to what is coming? Or am I reading too much into this?

Re: ...PureBasic 2x slower than Java?

Posted: Tue Nov 15, 2011 11:12 pm
by luis
@trond

I was trying too to check if it's possible with a single pass and I came with something very similar to yours I see... almost the same in fact.

Code: Select all

Structure T_CALL_LIST
 flgCalled.i
 List CallList$()
EndStructure

Global NewList TopLevel$()
Global NewMap Procedures.T_CALL_LIST()

; simulate single pass  parse of a PB file

; add proc a() from its body declaration
AddMapElement(Procedures(), "a") 

; add proc b() from its body declaration
AddMapElement(Procedures(), "b") 
; we found it call itself, so we don't add that info

; add proc c() from its body declaration
AddMapElement(Procedures(), "c")

; add proc d() from its body declaration
AddMapElement(Procedures(), "d")
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "c" ; 

; add proc f() from its declare statement
AddMapElement(Procedures(), "f") ; 

; add proc e() from its declare statement
AddMapElement(Procedures(), "e") ; 

; add proc z() from its body declaration
AddMapElement(Procedures(), "z")
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "f" ; 
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "e" ; 

; we found proc e() body, it's already in the map so we retrieve it
FindMapElement(Procedures(),"e")
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "f" 
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "z" 

; we found proc f() body, it's already in the map so we retrieve it
FindMapElement(Procedures(),"f")
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "e"

; add proc g() from its body declaration
AddMapElement(Procedures(), "g")
; we found it call itself, so we don't add that info

; top level call to the proc f()
AddElement(TopLevel$())
TopLevel$() = "f" 

; top level call to the proc g()
AddElement(TopLevel$())
TopLevel$() = "g" 


Procedure FollowChain (proc$)
 Protected called$
 
 ForEach Procedures(proc$)\CallList$()
    called$ = Procedures()\CallList$()
    If Procedures(called$)\flgCalled = 0
        Procedures(called$)\flgCalled = 1
        FollowChain(called$)
    EndIf
 Next 
EndProcedure

; mark them
ForEach TopLevel$()    
    Procedures(TopLevel$())\flgCalled = 1    
    FollowChain(TopLevel$())       
Next

; show them
ForEach Procedures()
    If Procedures()\flgCalled        
        Debug MapKey(Procedures())
    EndIf
Next
The code to parse (taken from your sample and modified a little):

Code: Select all

Procedure a()
  ; NOT NEEDED
  ; never called 
EndProcedure

Procedure b()
  ; NOT NEEDED
  ; referenced only by itself (recursive)
  b()
EndProcedure

Procedure c()
  ; NOT NEEDED
  ; referenced by a function never called: d()
EndProcedure

Procedure d()
  ; NOT NEEDED
  ; never called
  c()
EndProcedure

Declare f()
Declare e()

Procedure z()
  ; NEEDED
  ; called by e()
f()
e()
EndProcedure

Procedure e()
  ; NEEDED
  ; called by f(), z()
  f()
  z()
EndProcedure

Procedure f()  
  ; NEEDED
  ; called by toplevel, e(), z()
  e()
EndProcedure

Procedure g()
  ; NEEDED
  ; referenced by itself and called by toplevel
  g()
EndProcedure

f()
g()
output (the stuff to keep):

Code: Select all

[23:05:42] e
[23:05:42] f
[23:05:42] g
[23:05:42] z

It seems to work. At this point I don't understand why Fred says it's hard to do.
Am I missing something?