Fastest way to get the decimals of a float?

superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

Yes, the debugger is off. However, I forgot to mention a key factor: my benchmark makes a small attempt to reproduce real-life conditions in that the input data is randomly generated. The code under test is the same in both cases, and what I'm measuring is not so much the absolute time (well, I am measuring it) as the relation between test A and test B (the ratio).

That said, I don't think my test is flawed.

The test cases:

Code: Select all

Macro TestCaseA
	Define.f temp 					= Random(10)+Random(10)*0.1
	Define.f temp_decimals
	decimals2(temp, temp_decimals) 
EndMacro

Macro TestCaseB
	Define.f temp 					= Random(10)+Random(10)*0.1
	Define.f temp_decimals 	= decimals(temp) 
EndMacro
Since the first line is the same in both macros, it's pretty much ruled out of the equation, even though the benchmarks take much longer. (And for good measure, before each test the random seed is reset to the same value used for the first case - and no, I'm not counting that in the timing.)

Another thing: I'm running the process at realtime priority class; the results differ by quite a lot if I don't.

The reason I asked where to put the fstp is simply that, with the way processors predict branches and so on, I think keeping all of the FPU instructions together and in pairs tends to run better - though that might not be the case with this code.

I'll simplify the test scenario and post the code soon.

dioxin
User
Posts: 97
Joined: Thu May 11, 2006 9:53 pm

Post by dioxin »

superadnim,
superadnim wrote: even though the benchmarks take much longer
That's the answer, "even though it takes much longer".
For the sake of argument, let's say the INT() method takes 500 clks, the ASM method takes 50 clks, and generating your random numbers takes 2,000 clks.

Total time would then be 2,500 vs 2,050, a saving of around 20%, but the ASM is actually 10 times faster than the INT() it replaces.

My guess is that generating your random numbers is taking most of the time and is masking the savings from the ASM.
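
Dioxin's arithmetic, expressed as a quick throwaway PureBasic sketch - the clock counts are just his assumed figures from the example above, not measurements:

Code: Select all

overhead.d  = 2000    ; clks assumed for generating the random input
intMethod.d = 500     ; clks assumed for the INT() version
asmMethod.d = 50      ; clks assumed for the ASM version

Debug (overhead + intMethod) / (overhead + asmMethod) ; ~1.22 - overall times differ by only ~20%
Debug intMethod / asmMethod                           ; 10.0  - but the replaced code itself is 10x faster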
superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

I don't understand your argument; the way I see it, the random calls negate each other. A real benchmark should take real-world conditions into consideration, not just "how fast does this code execute by itself", because the latter is no good in reality - or is it?

Code: Select all

;---

Macro decimals( _n_ ) : (_n_-Int(_n_)) : EndMacro 

Macro decimals2( _n_, _result_ )
	
	!push $1f7f0000      			; FP control word needed to round to zero 
	!fstcw [esp]         			; Save the current FP control word 
	!fldcw [esp+2]       			; Set the FP to round to zero 
	!fld dword[v_#_n_] 	 			; Load the FP with the number to work on 
	!fld st0             			; Replicate/push that number on the FP stack 
	!frndint             			; Round number to leave integer 
	!fsubp st1,st0       			; Subtract from original to leave fraction part (and pop stack) 
	!fldcw [esp]         			; Restore original FP control word 
	!fstp dword[v_#_result_]
	!add esp,4           			; Clean up CPU stack 
	
EndMacro

;---

Macro TestNumber
	Random(10)+Random(10)*0.1 ; 123.456 ; 
EndMacro

Macro TestCaseA
	Define.f a_temp 					= TestNumber
	Define.f a_temp_decimals
	decimals2(a_temp, a_temp_decimals) 
EndMacro

Macro TestCaseB
	Define.f b_temp 					= TestNumber
	Define.f b_temp_decimals 	= decimals(b_temp) 
EndMacro

;---

#BENCHMARK_ITERATIONS = 100000000;0;0
SetPriorityClass_( GetCurrentProcess_(), #REALTIME_PRIORITY_CLASS)

Macro InitTestCase
	Delay(1000)
	RandomSeed(123456789)
EndMacro

;---

Define.i a_time_old, a_time_new
Define.i b_time_old, b_time_new

InitTestCase

a_time_old = ElapsedMilliseconds()
For i=1 To #BENCHMARK_ITERATIONS
	TestCaseA
Next
a_time_new = ElapsedMilliseconds()

InitTestCase

b_time_old = ElapsedMilliseconds()
For i=1 To #BENCHMARK_ITERATIONS
	TestCaseB
Next
b_time_new = ElapsedMilliseconds()

;---

Define.i a_result, b_result
Define.f a_percent, b_percent

a_result 	= ( a_time_new - a_time_old )
b_result 	= ( b_time_new - b_time_old )
a_percent = (( a_result / b_result ) * 100)
b_percent = (( b_result / a_result ) * 100)

Define.s result

result + #CRLF$
result + "A= " 	+ Str(a_result) + "ms" + #LF$
result + "B= " 	+ Str(b_result) + "ms" + #LF$
result + #CRLF$
result + StrF( a_percent, 2 ) +"%"+ " ( "+StrF( 100-a_percent, 2 )+"% )" + #LF$
result + StrF( b_percent, 2 ) +"%"+ " ( "+StrF( 100-b_percent, 2 )+"% )" + #LF$
result + #CRLF$

MessageRequester( "benchmark result", result )

No need for a higher-resolution timer (tried it, same results... the scale is big enough). There should be a min or max routine to make more sense of the results, but I left that out.

If you benchmark the random calls of each test on their own, you'll see they take the same time in both cases (within a margin of 0.1 due to the low resolution of the timer), hence they are "negating" each other in the test I just pasted.

So there is no need to initialize the random numbers in an array at all, though it would make for a better test.

Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

superadnim wrote: So there is no need to initialize the random numbers in an array at all, though it would make for a better test.
Think about it once more, please.

But I'm afraid it's much worse than that. Both codes actually give the sign in addition to the decimals (the only thing asked for was the decimals), so they give the wrong result for negative values.

Better slow and correct than slow and wrong. :wink:
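
A quick standalone snippet (values chosen here for illustration, using the same decimals() macro as above) that shows the sign problem Trond describes:

Code: Select all

Macro decimals( _n_ ) : (_n_-Int(_n_)) : EndMacro

Define.f x = -3.25
Debug decimals(x)       ; roughly -0.25 - the sign tags along with the fraction
Debug Abs(decimals(x))  ; roughly  0.25 - one simple way to drop the sign again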
superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

You can get rid of the top bit (the sign bit) and that'll do it; it's one extra instruction, as I recall.

Anyway, the results I got from this test are that test A is 24.90% faster than test B using the random inputs, and with the constant float (123.456) I get 39.80% for test A against test B.

Since in reality I won't be using a constant... I think the first test is the only valid one here. :lol:
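
The extra instruction is presumably fabs; here is an untested sketch of decimals2 with the sign stripped that way (the macro name decimals2_abs is made up):

Code: Select all

Macro decimals2_abs( _n_, _result_ )
	!push $1f7f0000          ; FP control word needed to round to zero
	!fstcw [esp]             ; Save the current FP control word
	!fldcw [esp+2]           ; Set the FP to round to zero
	!fld dword[v_#_n_]       ; Load the FP with the number to work on
	!fabs                    ; Drop the sign bit so negative input still gives a positive fraction
	!fld st0                 ; Replicate/push that number on the FP stack
	!frndint                 ; Round number to leave integer
	!fsubp st1,st0           ; Subtract from original to leave fraction part (and pop stack)
	!fldcw [esp]             ; Restore original FP control word
	!fstp dword[v_#_result_] ; Store the fraction part
	!add esp,4               ; Clean up CPU stack
EndMacro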

superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

Trond wrote: Think about it once more, please.
I did, and since benchmarking the Random() calls by themselves in tests A and B gave the same result in each case (equality), it is safe to assume they are not biasing or skewing the results. It's just an extra few milliseconds I DON'T care about, since all I'm doing is taking a percentage of the results; whether the percentage comes from bigger or smaller times doesn't matter, since the extra added number is assumed constant in both cases.
Last edited by superadnim on Sat Nov 15, 2008 5:20 pm, edited 3 times in total.

Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

superadnim wrote:
Trond wrote: Think about it once more, please.
I did, and since benchmarking the Random() calls by themselves in tests A and B gave the same result in each case (equality), it is safe to assume they are not biasing or skewing the results. It's just an extra few milliseconds I DON'T care about, since all I'm doing is taking a percentage of the results; whether the percentage comes from bigger or smaller times doesn't matter, since the extra added number is assumed constant in both cases.
That's exactly why you care.

; Case 1
Define.f
NumA = 100
NumB = 200
C = NumA / NumB
Debug C

; Case with added constant value
Define.f
NumA = 100 + 800
NumB = 200 + 800
C = NumA / NumB
Debug C
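
; Case 1 prints 0.5, the case with the added constant prints 0.9:
; adding the same constant to both numbers drags the ratio toward 1 and hides the real difference.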
superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

But that's not what I'm doing, look:

Code: Select all

; Case 1 
Define.f 
NumA = 100
NumB = 200
result_a = (NumB-NumA)

; Case with added constant value 
Define.f 
NumA = 100 + 800
NumB = 200 + 800
result_b = (NumB-NumA)

Debug ((result_a / result_b) * 100)
Debug ((result_b / result_a) * 100)
The only problem I see is that you could get rounding errors when using floats if the constant being added is too big.

Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

Read your own code. It makes absolutely no sense.
superadnim
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

Trond wrote: Read your own code. It makes absolutely no sense.
You're right, you aren't making any sense.

Please shed some light on the obvious?

Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

superadnim wrote: Please shed some light on the obvious?
Your variable a_result corresponds to my NumA with the added constant value. Think about it.


Code: Select all

Macro decimals( _n_ ) : (_n_-Int(_n_)) : EndMacro

Macro decimals2( _n_, _result_ )
   
   !push $1f7f0000               ; FP control word needed to round to zero
   !fstcw [esp]                  ; Save the current FP control word
   !fldcw [esp+2]                ; Set the FP to round to zero
   !fld dword[v_#_n_]              ; Load the FP with the number to work on
   !fld st0                      ; Replicate/push that number on the FP stack
   !frndint                      ; Round number to leave integer
   !fsubp st1,st0                ; Subtract from original to leave fraction part (and pop stack)
   !fldcw [esp]                  ; Restore original FP control word
   !fstp dword[v_#_result_]
   !add esp,4                    ; Clean up CPU stack
   
EndMacro


Macro decimals3(n, result)
  ! fld1                    ; Push 1.0, kept on the stack as the "rounded up" correction value
  ! fld dword [v_#n]        ; Push the number to work on
  ! fabs                    ; Use the absolute value so the sign never reaches the result
  ! fld st0                 ; Duplicate |n| on the FP stack
  ! frndint                 ; Round |n| to an integer using the current rounding mode
  ! fcomi st0, st1          ; Compare the rounded value with |n| (sets the CPU flags)
  ! fldz                    ; Push 0.0 as the default correction
  ! fcmovnbe st0, st3       ; If the rounding went up, take 1.0 as the correction instead
  ! fsub st1, st0           ; rounded - correction = integer part of |n|
  ! fstp st0                ; Pop the correction
  ! fsubp st1, st0          ; |n| - integer part = fraction (and pop)
  ! fstp dword[v_#result]   ; Store the fraction
  ! fstp st0                ; Pop the leftover 1.0 so the FP stack is left clean
EndMacro


; SetPriorityClass_( GetCurrentProcess_(), #REALTIME_PRIORITY_CLASS)

#Tries = 10000000

temp.f = 6184927.2348192
temp2.f = -8234.2343
temp3.f = 0.0
temp4.f = 0.432
result.f

timeA = GetTickCount_()
For U = 0 To #Tries
  decimals2(temp, result)
  decimals2(temp2, result)
  decimals2(temp3, result)
  decimals2(temp4, result)
Next
timeA = GetTickCount_() - timeA

timeB = GetTickCount_()
For U = 0 To #Tries
  result = decimals(temp)
  result = decimals(temp2)
  result = decimals(temp3)
  result = decimals(temp4)
Next
timeB = GetTickCount_() - timeB

timeC = GetTickCount_()
For U = 0 To #Tries
  decimals3(temp, result)
  decimals3(temp2, result)
  decimals3(temp3, result)
  decimals3(temp4, result)
Next
timeC = GetTickCount_() - timeC

r.s = "decimals2: " + Str(timeA) + #CRLF$
r.s + "decimals: " + Str(timeB) + #CRLF$
r.s + "decimals3: " + Str(timeC) + #CRLF$

MessageRequester("", r)
Last edited by Trond on Sat Nov 15, 2008 7:35 pm, edited 1 time in total.
dioxin
User
User
Posts: 97
Joined: Thu May 11, 2006 9:53 pm

Post by dioxin »

Trond,
your method would appear more impressive if you called decimals3 somewhere instead of calling decimals2 twice!
Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Post by Trond »

Thanks! :oops: It should blow some doors now. :twisted: Let's just hope no bugs surface...
Little John
Addict
Posts: 4777
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Post by Little John »

superadnim wrote: I don't understand your argument; the way I see it, the random calls negate each other.
The way you see it is just mathematically wrong. The random calls add to the time, and afterwards, in order to get the percentage, you are doing a division. This post by dioxin explains the situation pretty well.

Regards, Little John
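
For reference, a rough, untested sketch of the array-based test dioxin and Little John are hinting at (the names #SAMPLES, samples() and so on are made up, and it assumes the decimals() and decimals2() macros from earlier in the thread are already defined): the random numbers are generated up front, so the timed loops contain little besides the decimal extraction itself.

Code: Select all

#SAMPLES = 1000000

Define.f n, out
Define.i i, timeA, timeB

; Pre-fill the input data so Random() never runs inside a timed loop
Dim samples.f(#SAMPLES - 1)
RandomSeed(123456789)
For i = 0 To #SAMPLES - 1
	samples(i) = Random(10) + Random(10) * 0.1
Next

timeA = ElapsedMilliseconds()
For i = 0 To #SAMPLES - 1
	n = samples(i)        ; copy to a named variable because decimals2 addresses v_n directly
	decimals2(n, out)
Next
timeA = ElapsedMilliseconds() - timeA

timeB = ElapsedMilliseconds()
For i = 0 To #SAMPLES - 1
	n = samples(i)        ; same copy here so the loop overhead stays identical
	out = decimals(n)
Next
timeB = ElapsedMilliseconds() - timeB

Define.s r
r = "decimals2: " + Str(timeA) + " ms" + #CRLF$
r + "decimals : " + Str(timeB) + " ms"
MessageRequester("benchmark", r)

With the Random() calls out of the timed region, the ratio between the two times reflects only the two decimal-extraction methods.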