Floats faster than doubles here!

Psychophanta
Always Here
Posts: 5153
Joined: Wed Jun 11, 2003 9:33 pm
Location: Anare

Post by Psychophanta »

A few weeks ago I wrote this routine to subtract one angle value from another (subtracting angles needs some care, because the result has to be brought back into range).
Try this code (run it, then replace the first line with Define .d and run it again) and compare:

Code: Select all

Define .f ; <- REPLACE BY .d AND SEE THE DIFFERENCE
Procedure.d SubstractAnglesASM(angle1.d,angle2.d); <- performs: angle1-angle2
  !fldpi
  !fadd st0,st0 ;<- 2*pi in st0
  !fld qword[p.v_angle1]; <- angle1 in st0, 2*pi in st1
  !fprem1; <- remainder1 in st0, 2*pi still in st1
  !fstp st1; <- overwrites 2*pi and pops: remainder1 in st0
  !fldpi
  !fadd st0,st0 ;<- 2*pi in st0,remainder1 in st1
  !fld qword[p.v_angle2];<- angle2 in st0, 2*pi in st1, remainder1 in st2
  !fprem1; <- remainder2 in st0, 2*pi in st1, remainder1 in st2
  !fstp st1; <- remainder2 in st0, remainder1 in st1
  !fsubp st1,st0;  <-  REPLACE THIS LINE BY !faddp st1,st0 TO GET AddAngles() INSTEAD OF SubstractAngles()
  ;To get the minimal angle:
  !fldpi
  !fcomi st1
  !jnc near @f
  !fadd st0,st0 ;<-  2*pi in st0
  !fsubp st1,st0 ;<- result-2*pi
  ProcedureReturn
  !@@:fchs
  !fcomi st1
  !jc near @f
  !fadd st0,st0 ;<-  -2*pi in st0
  !fsubp st1,st0 ;<- result+2*pi in st0
  ProcedureReturn
  !@@:fstp st0
  ProcedureReturn
EndProcedure
Procedure.d WrapAngleSigned(angle.d); <- wraps a value into the [-Pi,Pi] range
  !fldpi
  !fadd st0,st0; <- 2*pi in st0
  !fld qword[p.v_angle]
  !fprem1
  !fstp st1
  ProcedureReturn
EndProcedure
Procedure.d SubstractAngles(angle1,angle2); <- performs: angle1-angle2
  angle1=WrapAngleSigned(angle1)
  angle2=WrapAngleSigned(angle2)
  res.d=angle1-angle2
  ;To get the minimal angle:
  If res.d<-#PI:res.d+2*#PI
  ElseIf res.d>#PI:res.d-2*#PI
  EndIf
  ProcedureReturn res.d
EndProcedure

;It works in radians. 
#Max=1000000
;First test
i=0 :i1=#Max/2
Tps=ElapsedMilliseconds() 
While i < #Max 
angleNew=SubstractAnglesASM(i,i1)
i + 0.1:i1-0.1
Wend 
Total1=ElapsedMilliseconds()-Tps 

; 
;Second test
i=0 :i1=#Max/2
Tps=ElapsedMilliseconds() 
While i < #Max 
angleNew=SubstractAngles(i,i1) 
i + 0.1 :i1-0.1
Wend 
Total2=ElapsedMilliseconds()-Tps 

MessageRequester("Test","SubstractAnglesASM = " + Str(Total1) + #LFCR$ + "SubstractAngles = " + Str(Total2),0)
I'd also like to take this opportunity to measure the speed difference on a Pentium 4. So P4 owners (like Comtois), could you please post your results?
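
For readers who don't read x87 assembly, the wrap-and-subtract idea can be sketched in plain PureBasic. This is only an illustration, not the routine posted above: the names WrapAngleRef() and SubtractAnglesRef() are made up here, it assumes a PureBasic version that provides #PB_Round_Nearest, and it wraps the difference once instead of wrapping each operand first. At exact half-turn ties it may round differently from fprem1.

```purebasic
; Sketch only: WrapAngleRef() and SubtractAnglesRef() are hypothetical names,
; not part of the routines posted above. Angles are in radians.
Procedure.d WrapAngleRef(angle.d)
  ; Subtract the nearest whole number of turns, leaving a value in about [-Pi, Pi].
  Protected turns.d = Round(angle / (2 * #PI), #PB_Round_Nearest)
  ProcedureReturn angle - turns * 2 * #PI
EndProcedure

Procedure.d SubtractAnglesRef(angle1.d, angle2.d)
  ; Wrapping the raw difference once yields the minimal signed angle.
  ProcedureReturn WrapAngleRef(angle1 - angle2)
EndProcedure

; 0.1 - (2*Pi - 0.1) wraps around to a value close to 0.2
Debug SubtractAnglesRef(0.1, 2 * #PI - 0.1)
```

For very large inputs the posted approach (reduce each operand, then subtract) and this one can differ in the last digits, since the rounding happens at different points.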
http://www.zeitgeistmovie.com

while (world==business) world+=mafia;
thefool
Always Here
Posts: 5875
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

Why shouldn't doubles be slower? They are twice the size of floats, and handling larger variables in memory is slower.

However, I tried this on my 3800+ 64-bit (no 64-bit Windows though :( ) and the results only varied by a few hundred ms.
va!n
Addict
Posts: 1104
Joined: Wed Apr 20, 2005 12:48 pm

Post by va!n »

Here are my results (corrected: debugger now disabled :oops:):

Code: Select all

Define.d
SubstractAnglesASM =  672
SubstractAngles    = 1156

Define.f
SubstractAnglesASM =  496
SubstractAngles    = 1031
Seems there is some great speed potential. ASM routines are about 2x faster! ;)
Last edited by va!n on Tue Mar 14, 2006 5:06 pm, edited 1 time in total.
va!n aka Thorsten

Intel i7-980X Extreme Edition, 12 GB DDR3, Radeon 5870 2GB, Windows7 x64,

Post by Psychophanta »

Thefool, in http://www.purebasic.fr/english/viewtopic.php?t=19402 El_Choni explains why 80-bit floats are faster than 64-bit ones.
The same reasoning would explain why 64-bit floats should be faster than 32-bit ones.
So my conclusion, I think, is that the compiler does not handle doubles well enough :cry:
Fred
Administrator
Posts: 18351
Joined: Fri May 17, 2002 4:39 pm
Location: France

Post by Fred »

According to my asm FPU doc, using single floats for FSTP uses 7 clocks, while double uses 8 clocks (and treal = 6 clocks). That's just one instruction, but if it generalizes, then doubles are slower than floats. Just compare the two commented asm files to see whether anything is wrong when using doubles, instead of crying :wink:.

Post by Psychophanta »

Well, OK then.
I am glad if PB compiles doubles well, at least as efficiently as 32-bit floats.

However, your explanation means that we should not use doubles unless they are needed.
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

Hi Psychophanta,

It looks like doubles in Pure are as accurate as doubles in any other language. They all use the same standard. Not sure about the speed thing. I guess it is a matter of whether precision considerations outweigh speed considerations, or vice versa.
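
The precision side of that trade-off shows up in the benchmark itself. Between 524288 and 1048576, adjacent 32-bit floats are 0.0625 apart, so a float counter cannot advance by exactly 0.1 in that range. With Define .f the loop counter i in the first post is a float, so the float and double runs may not execute the same number of iterations. A minimal sketch to check this, assuming the counters are stored back to memory on every step (the variable names are made up for the example):

```purebasic
; Sketch only: counts how many +0.1 steps each type needs to reach #Limit.
; stepsF and stepsD are hypothetical counter names.
#Limit = 1000000

Define f.f = 0, d.d = 0
Define stepsF.q = 0, stepsD.q = 0

While f < #Limit
  f + 0.1        ; result is rounded to the nearest representable float
  stepsF + 1
Wend

While d < #Limit
  d + 0.1        ; double keeps the step very close to 0.1 throughout
  stepsD + 1
Wend

MessageRequester("Steps", "float: " + Str(stepsF) + #LF$ + "double: " + Str(stepsD))
```

If the two counts differ, part of the measured gap between Define .f and Define .d comes from the amount of work done rather than from the cost per operation.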
@}--`--,-- A rose by any other name ..

Post by Psychophanta »

Yes, but
Fred wrote:using single floats for FSTP, uses 7 clocks, while double uses 8 clocks (and treal = 6 clocks).
A big reason to listen to this request:
http://www.purebasic.fr/english/viewtopic.php?t=19402 :roll:

Post by Fred »

va!n wrote:Here are my results:

Code: Select all

Define.d
SubstractAnglesASM =  6671
SubstractAngles    = 17985

Define.f
SubstractAnglesASM =  6299
SubstractAngles    = 16577
Seems there is some great speed potential. ASM routines are about 3x faster! :shock:
What about disabling the debugger when doing a speed test ? :lol:
Comtois
Addict
Posts: 1432
Joined: Tue Aug 19, 2003 11:36 am
Location: Doubs - France

Post by Comtois »

Define.f
---------------------------
Test
---------------------------
SubstractAnglesASM = 1625

SubstractAngles = 1718
---------------------------
OK
---------------------------
Define.d
---------------------------
Test
---------------------------
SubstractAnglesASM = 2250

SubstractAngles = 2563
---------------------------
OK
---------------------------
Please correct my english
http://purebasic.developpez.com/

Post by Psychophanta »

Comtois wrote:Define.f
---------------------------
Test
---------------------------
SubstractAnglesASM = 1625

SubstractAngles = 1718
---------------------------
OK
---------------------------
Define.d
---------------------------
Test
---------------------------
SubstractAnglesASM = 2250

SubstractAngles = 2563
---------------------------
OK
---------------------------
Uhh!
The difference is greater on the Pentium 4 :shock:
Every day I prefer AMD over Intel more and more.
However, I've heard that SSE and SSE2 are better optimized on Intel chips.

Post by va!n »

Fred wrote: What about disabling the debugger when doing a speed test ? :lol:
Oops... :lol: results corrected now :wink:
Bonne_den_kule
Addict
Posts: 841
Joined: Mon Jun 07, 2004 7:10 pm

Post by Bonne_den_kule »

Define.f
---------------------------
Test
---------------------------
SubstractAnglesASM = 1687

SubstractAngles = 2328
---------------------------
OK
---------------------------
Define.d
---------------------------
Test
---------------------------
SubstractAnglesASM = 2234

SubstractAngles = 3328
---------------------------
OK
---------------------------
Athlon XP 2500+ (1.81 GHz)

Post by thefool »

Athlon 64 3800+ (on a 32-bit OS :( )

define.f
---------------------------
Test
---------------------------
SubstractAnglesASM = 375

SubstractAngles = 672
---------------------------
OK
---------------------------
define.d
---------------------------
Test
---------------------------
SubstractAnglesASM = 531

SubstractAngles = 922
---------------------------
OK
---------------------------

Edit:
Results were taken with Winamp playing and various other background programs running.
(And thanks to Fred for supporting my statement about why floats are faster :P )
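
To compare the posted results across machines, it helps to look at the ratio of double time to float time rather than at raw milliseconds. The sketch below only redoes the arithmetic on the numbers posted in this thread; Ratio() is a hypothetical helper introduced for the example.

```purebasic
; Sketch only: Ratio() is a hypothetical helper; the timings are the ones posted above.
Procedure.s Ratio(timeDouble.d, timeFloat.d)
  ; Double time divided by float time, formatted with two decimals.
  ProcedureReturn StrD(timeDouble / timeFloat, 2)
EndProcedure

Debug "va!n            ASM " + Ratio(672, 496)   + "   plain " + Ratio(1156, 1031)
Debug "Comtois         ASM " + Ratio(2250, 1625) + "   plain " + Ratio(2563, 1718)
Debug "Bonne_den_kule  ASM " + Ratio(2234, 1687) + "   plain " + Ratio(3328, 2328)
Debug "thefool         ASM " + Ratio(531, 375)   + "   plain " + Ratio(922, 672)
```

With these inputs every ratio falls between roughly 1.1 and 1.5. Given the background load mentioned above, small differences between machines should probably not be over-interpreted.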
dagcrack
Addict
Posts: 1868
Joined: Sun Mar 07, 2004 8:47 am
Location: Argentina

Post by dagcrack »

Is ElapsedMilliseconds() still using a low-resolution timer? If it is, shame on you guys.

Use a high-res timer for your benchmarks already... It's no use to keep using a timer that has a 10 ms discrepancy. Jesus.

By the way, did you know that binary shift operations are almost twice as fast on AMD processors as on Intel? ;) Go figure (I'm serious: I've run several tests with high-resolution timers on several processors, and it's a known fact as well). And about doubles, aren't you working with TWICE the memory, etc.? They should be slower, then.
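
One way to follow that advice on Windows is to time with the performance counter through the WinAPI calls QueryPerformanceFrequency_() and QueryPerformanceCounter_(). This is a minimal sketch, assuming a Windows build of PureBasic; the loop is only a placeholder workload and the variable names are made up for the example.

```purebasic
; Sketch only (Windows): times a placeholder loop with the performance counter.
Define freq.q, t0.q, t1.q
QueryPerformanceFrequency_(@freq)   ; counter ticks per second
QueryPerformanceCounter_(@t0)

Define x.d = 0, n.l
For n = 1 To 1000000                ; placeholder workload: put the code under test here
  x + 0.1
Next

QueryPerformanceCounter_(@t1)
; Convert the tick difference to milliseconds, shown with three decimals.
MessageRequester("Timing", StrD((t1 - t0) * 1000.0 / freq, 3) + " ms")
```

Repeating the measurement several times and keeping the smallest value is a common way to reduce the influence of background programs.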
! Black holes are where God divided by zero !
My little blog!
(Not for the faint hearted!)