
Posted: Sun Dec 03, 2006 9:36 pm
by Trond
Can you test this? If Blitz is still faster, then it's cheating inside the square root function.

Code:

n.l
x.f

OpenConsole() 
start = ElapsedMilliseconds() 

; For n = 1 To 30000000
; x = 1/Sqr(100) 
; Next

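; For n = 1 To 50000000
!  MOV    dword [v_n], 1
!  MOV    eax, 1               ; loop counter lives in eax instead of [v_n]
!_For1:
!  CMP    eax, 50000000
!  JNL    _Next2               ; leave the loop once eax reaches 50000000
; x = 1/Sqr(100) 
  ;FLD    dword [F1]
!  FLD    qword [D2]           ; load the operand
!  FSQRT
!  FDIVR  qword [D1]           ; st0 = [D1] / st0
  ;FDIVR  dword [F2]
!  FSTP   dword [v_x]          ; store to x and pop the FPU stack
; Next
!_NextContinue2:
!  INC    eax
!  JMP    _For1
!_Next2:
!  MOV    dword [v_n], eax     ; write the counter back to n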

stop = ElapsedMilliseconds() 
PrintN(Str(stop-start)) 
Input()

End
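DataSection
!F1: DD 1120403456     ; 100.0 as a 32-bit float (only used by the commented-out FLD)
!F2: DD 1065353216     ; 1.0 as a 32-bit float (only used by the commented-out FDIVR)
!D1: DD 0, 1072693248  ; 1.0 as a 64-bit double
!D2: DD 0, 1072693248  ; also 1.0 as a double; 100.0 would be DD 0, 1079574528
EndDataSection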

Posted: Sun Dec 03, 2006 10:32 pm
by AND51
@ Derek (1st post):

With debugger: 5500 ms
Without debug: 1219 ms

My CPU: 3.4 GHz DualCore

Posted: Sun Dec 03, 2006 10:55 pm
by thefool
Tons of bg apps running, AMD Athlon 64 3800+ single core (on a 32-bit system, so it should make no difference):

Without debugger: 719 ms

Bg apps such as FL Studio (resource hogging, not too many VSTs loaded though), MSN, and lots of other stuff were running.

Posted: Sun Dec 03, 2006 10:59 pm
by netmaestro
AMD 3800+ X2 dual core:

4140
1031

I was eating pizza at the time.

Posted: Sun Dec 03, 2006 11:11 pm
by thefool
Why is mine so fast? It's weird.
The latest code from Trond gave this result. The very first code posted was still under 900 ms!

Posted: Sun Dec 03, 2006 11:15 pm
by netmaestro
Trond's code gives 797 here, 11000 with debugger.

Posted: Sun Dec 03, 2006 11:28 pm
by GeoTrail
Derek, your first code I get:
With debugger 4266
Without 1047
Derek wrote:
PB=2360 and 8192.000000

Blitz=1291 and 8192.0

Probably right about the precision but I still think something is up with PB and its way that it works with dual cores!
My result:
1031
8192.000000

Posted: Mon Dec 04, 2006 12:35 am
by MikeB
I just ran Trond's code on my old 1.5 GHz Pentium 4, as I am soon getting a new computer with the 2.4 GHz Core 2 Duo 6600 and was wondering what sort of speed increase I am likely to get.

The result however seems rather weird -
With debugger 33508
Without 2874

Why such a big difference? Nearly 12 times as long with the debugger!
Other people's results show something just over 4 times.

Posted: Mon Dec 04, 2006 3:22 am
by AND51
netmaestro wrote:AMD 3800+ X2 dual core:

4140
1031

I was eating pizza at the time.
You ate a pizza in 5171 ms? :shock: :roll:

Posted: Mon Dec 04, 2006 4:26 am
by jack
More than likely the reason Blitz is faster is that it optimizes x=1/Sqr(100) to a constant instead of recalculating it.
Btw, on my new Mac Pro running Windows XP under Parallels Desktop (virtual machine) I get 4747 and 1712. Trond's code: 9624 and 190.
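
If a compiler does fold the expression like that, the effect is roughly what you get by hoisting it out of the loop by hand. A minimal PureBasic sketch, assuming the 50,000,000-iteration loop used elsewhere in this thread; the variable name `inv` is made up for illustration, and whether Blitz really does this is a guess, not something confirmed here:

```purebasic
; Sketch only: hoist the loop-invariant 1/Sqr(100) by hand.
; "inv" is a hypothetical name, not from the original code.
n.l
x.f
inv.f = 1 / Sqr(100)   ; computed once, outside the loop

OpenConsole()
start = ElapsedMilliseconds()

For n = 1 To 50000000
  x = inv              ; plain store, no square root or division per iteration
Next

stop = ElapsedMilliseconds()
PrintN(Str(stop - start))
Input()
End
```

Timing this against the plain `x = 1/Sqr(100)` loop on the same machine would show how much of the gap such folding could account for.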

Posted: Mon Dec 04, 2006 10:13 am
by Derek
With Trond's code I get 266.

@MikeB, it seems like you don't get double the speed, more like half. :(

Posted: Mon Dec 04, 2006 10:21 am
by Derek
So normal PB code works best on AMDs

Netmaestro's xp3800+x2 = 1031
My 6300 = 2390

but Trond's optimized code is better on Intels

Netmaestro = 797
Mine= 266

How do you go about optimizing your programs to get the best out of them?

Posted: Wed Dec 06, 2006 7:58 pm
by MikeB
Derek wrote:With Trond's code I get 266.

@MikeB, it seems like you don't get double the speed, more like half. :(
Not quite sure what you mean. My results:

With debugger 33508
Without 2874

33508 ÷ 2874 = 11.659

As I said, nearly 12 times as long using the debugger as without. Apparently the debugger runs very slowly on a 1.5 GHz Pentium 4.
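
To put that factor next to the other results, here is a small sketch that only divides timings already posted in this thread (no new data). Note that the roughly 4x figures came from Derek's first code, while netmaestro's numbers for Trond's code (11000 and 797) are the like-for-like comparison:

```purebasic
; Sketch only: debugger slowdown = time with debugger / time without.
; All timings (ms) are the ones posted in this thread.
OpenConsole()
PrintN("Trond's code, MikeB:      " + StrF(33508.0 / 2874.0, 2))
PrintN("Trond's code, netmaestro: " + StrF(11000.0 / 797.0, 2))
PrintN("Derek's code, AND51:      " + StrF(5500.0 / 1219.0, 2))
PrintN("Derek's code, GeoTrail:   " + StrF(4266.0 / 1047.0, 2))
Input()
```

The two ratios for Trond's code land between about 12 and 14, and the two for Derek's first code between about 4 and 4.5, so the large factor seems to go with the code being run rather than with the Pentium 4 alone.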

Posted: Wed Dec 06, 2006 9:09 pm
by Derek
With the first code I get 2390 and rescator gets 1172 so I said half the speed. It's just a joke though.

Obviously you wouldn't have a program that needs to calculate 50,000,000 square roots and nothing else.

The programs that I have used with the Core 2 Duo have blown away the AMD 3000 64-bit that I used to have.

I'm sure you will find the same.

Posted: Wed Dec 06, 2006 9:44 pm
by dracflamloc
When comparing results from different manufacturers you need to take into consideration that AMD and Intel use different algorithms for pipelining, and certain things that might flush the pipe on Intel might not on AMD. This would lead to a huge performance loss. This kind of thing actually happens a lot, and AMD's better algorithms are what allow them to compare with Intel's performance with slower FSBs, among other things.

Intel has its good spots too of course, so there really is no "one CPU to rule them all".