Page 1 of 1

fast 2D distance between points

Posted: Thu Nov 13, 2003 2:48 am
by griz

Code: Select all

; /// return approx distance between 2 points ///
; /// usage : approx_distance(x1-x2,y1-y2)    ///
Procedure approx_distance(xx,yy)
  If xx < 0 : xx = -xx : EndIf 
  If yy < 0 : yy = -yy : EndIf
  If xx < yy
    min = xx : max = yy
  Else
    min = yy : max = xx
  EndIf 
  ProcedureReturn (( max << 8 ) + ( max << 3 ) - ( max << 4 ) - ( max << 1 ) + ( min << 7 ) - ( min << 5 ) + ( min << 3 ) - ( min << 1 )) >> 8 
EndProcedure 
The code above is 3 times quicker than the standard SQR based method from some of my tests. The results do take a slight accuracy hit but this may be acceptable in your project. For those interested here's more information : http://www.flipcode.com/articles/articl ... ance.shtml

In a real time application (like a game) you may prefer to avoid calling a procedure but instead integrate the code directly in the main loop (where possible/feasible) for a speed up.

Can anyone offer quicker or better ways of accomplishing this? Any tips? Perhaps some inline assembly? :D

Posted: Thu Nov 13, 2003 11:55 am
by Psychophanta
Can anyone offer quicker or better ways of accomplishing this? Any tips? Perhaps some inline assembly?
I can offer. This is quicker and better implemented:

Code: Select all

Procedure approx_distance(xx,yy)
  xx=Abs(xx)
  yy=Abs(yy)
  If xx<yy 
    ProcedureReturn (yy<<8+yy<<3-yy<<4-yy<<1+xx<<7-xx<<5+xx<<3-xx<<1)>>8
  Else
    ProcedureReturn (xx<<8+xx<<3-xx<<4-xx<<1+yy<<7-yy<<5+yy<<3-yy<<1)>>8 
  EndIf
EndProcedure 
I mean it is much faster, it is using your algorithm (which is not very accurate in the resulting distance), and more elegant. :wink:


Surely if i use sqr assembler instruction, it would be even faster and smaller size. I'll try.

Posted: Thu Nov 13, 2003 12:19 pm
by Psychophanta
Here you have a comparative:

Code: Select all

; /// return approx distance between 2 points /// 
; /// usage : approx_distance(x1-x2,y1-y2)    /// 
Procedure approx_distance0(xx,yy) 
  If xx < 0 : xx = -xx : EndIf 
  If yy < 0 : yy = -yy : EndIf 
  If xx < yy 
    min = xx : max = yy 
  Else 
    min = yy : max = xx 
  EndIf 
  ProcedureReturn (max<<8+max<<3-max<<4-max<<1+min<<7-min<<5+min<<3-min<<1)>>8 
EndProcedure 
; /// return approx distance between 2 points /// 
; /// usage : approx_distance(x1-x2,y1-y2)    /// 
Procedure approx_distance(xx,yy)
  xx=Abs(xx)
  yy=Abs(yy)
  If xx<yy 
    ProcedureReturn (yy<<8+yy<<3-yy<<4-yy<<1+xx<<7-xx<<5+xx<<3-xx<<1)>>8
  Else
    ProcedureReturn (xx<<8+xx<<3-xx<<4-xx<<1+yy<<7-yy<<5+yy<<3-yy<<1)>>8 
  EndIf
EndProcedure

Procedure.f distance(xx.f,yy.f)
  !finit
  !fld dword[esp] ;push xx to FPU stack (to st0)
  !fmul st0,st0
  !fld dword[esp+4] ;push yy value to FPU stack (to st0) (xx is now in st1)
  !fmul st0,st0
  !faddp
  !fsqrt
EndProcedure 


;-PROVE IT:
cx1=13:cx2=782:cy1=345:cy2=386

tt=gettickcount_()
For t=1 To 1000000
  approx_distance0(cx2-cx1,cy2-cy1)
Next
Debug gettickcount_()-tt

tt=gettickcount_()
For t=1 To 1000000
  approx_distance(cx2-cx1,cy2-cy1)
Next
Debug gettickcount_()-tt

tt=gettickcount_()
For t=1 To 1000000
  distance(cx2-cx1,cy2-cy1)
Next
Debug gettickcount_()-tt


Debug ""
Debug approx_distance0(cx2-cx1,cy2-cy1)
Debug approx_distance(cx2-cx1,cy2-cy1)
Debug distance(cx2-cx1,cy2-cy1)
Keep in mind that ASM in PB is not so fast as it shoud be, because compiler,,,... don't know why :?: Ask Fred :arrow:

Posted: Thu Nov 13, 2003 1:35 pm
by griz
Thanks Psychophanta, very nice.

approx_distance() is now about 40% quicker on my PC :!:

I see the assembly version isn't that much quicker but it does return a much more accurate result. Perhaps you might write an ASM version using the same (bit shifting) method as approx_distance()? I wonder if that would be quicker still? approx_distance_asm()?

Again, thanks for the great feedback.

For an experiment I avoided procedure calls with the next bit of code (to add to your timed tests). This is a slight speed improvement, easily worth another 10%. How would you impliment your distance() procedure unrolled like this?

Code: Select all

tt=gettickcount_() 
For t=1 To 1000000 
  xx=Abs(cx2-cx1) : yy=Abs(cy2-cy1) 
  If xx<yy 
    dis=(yy<<8+yy<<3-yy<<4-yy<<1+xx<<7-xx<<5+xx<<3-xx<<1)>>8 
  Else 
    dis=(xx<<8+xx<<3-xx<<4-xx<<1+yy<<7-yy<<5+yy<<3-yy<<1)>>8 
  EndIf 
Next 
Debug gettickcount_()-tt 

Posted: Thu Nov 13, 2003 2:54 pm
by Fred
Psychophanta wrote:Keep in mind that ASM in PB is not so fast as it shoud be, because compiler,,,... don't know why :?: Ask Fred :arrow:
Sorry, I don't understand 8O

Posted: Thu Nov 13, 2003 3:11 pm
by Psychophanta
I see the assembly version isn't that much quicker but it does return a much more accurate result. Perhaps you might write an ASM version using the same (bit shifting) method as approx_distance()? I wonder if that would be quicker still? approx_distance_asm()?
I guess a bit-shifting ASM version shouldn't be faster, because SQR in asm is a very simple calculation. However i'll do it to test speed.
For an experiment I avoided procedure calls with the next bit of code (to add to your timed tests). This is a slight speed improvement, easily worth another 10%. How would you impliment your distance() procedure unrolled like this?
If you eliminate function calls, the speed is increased, but structured source code is loss.
Here you have, but it is slow:

Code: Select all

cx1=34:cx2=7890:cy1=345:cy2=45
res.l
tt=gettickcount_() 
!finit
For t=1 To 1000000 
  !fld dword[v_cx1] ;push xx to FPU stack (to st0) 
  !fsub dword[v_cx2];substract
  !fmul st0,st0 ;xx^2
  !fld dword[v_cy1] ;push yy value to FPU stack (to st0) (xx is now in st1) 
  !fsub dword[v_cy2];substract
  !fmul st0,st0 ;yy^2
  !faddp ;xx^2+yy^2
  !fsqrt ;hypotenuse, this is, SQR(xx^2+yy^2)
  !fstp dword[v_res]
Next 
Debug gettickcount_()-tt 


Debug res

Posted: Thu Nov 13, 2003 3:25 pm
by Psychophanta
I said:
Keep in mind that ASM in PB is not so fast as it shoud be, because compiler,,,... don't know why Ask Fred
Fred said:
Sorry, I don't understand
Fred, it is difficult for me to believe this fact:

Code: Select all

r.l=-234.234234 

ff=gettickcount_() 
For t=0 To 10000000 
  Abs(r) 
Next 
Debug gettickcount_()-ff 
; --------------------------------- 
ff=gettickcount_() 
For t=0 To 10000000 
  !fld dword[v_r] 
  !fabs 
Next 
Debug gettickcount_()-ff
Here is faster Abs() than the 2 ASM instructions :!:
Help me, teach me to believe it :P :twisted: :roll: :wink: :!:

Posted: Thu Nov 13, 2003 3:33 pm
by Pupil
@Psychophanta
i told you why here:
viewtopic.php?t=7853&postdays=0&postorder=asc&start=30

Code: Select all

r.l=-234.234234
DisableDebugger
ff=gettickcount_()
For t=0 To 10000000
  Abs(r)
Next
EnableDebugger
Debug gettickcount_()-ff
DisableDebugger
; ---------------------------------
ff=gettickcount_()
For t=0 To 10000000
  !fld dword[v_r]
  !fabs
  !fstp st0
Next
EnableDebugger
Debug gettickcount_()-ff

Posted: Thu Nov 13, 2003 4:04 pm
by Fred
Damn, pupil was faster :twisted:

Posted: Thu Nov 13, 2003 4:35 pm
by Psychophanta
Sorry Pupil, i din't see your answer. :oops:

I didn't know that to clear FPU stack was important for speed. :oops:

You said: "You will see that it's already as fast as it gets because the command Abs() is translated into asm directly(inline) so there isn't any call to a function at all"

I think all PB built-in functions, at least as much as possible, should be like this (without calling) :wink:

Thank you very much :D



For you, griz, here is exactly your algorithm for hypotenuse, made in ASM:

Code: Select all

Procedure.l approx_distanceASM(xx.l,yy.l) 
  !mov eax,dword[esp]
  !mov ebx,dword[esp+4]
  !cmp eax,0
  !jnl @f
  !neg eax
  !@@:
  !cmp ebx,0
  !jnl @f
  !neg ebx
  !@@:
  !cmp eax,ebx
  !jnl @f
  !xchg eax,ebx  
  !@@:
  !mov edx,eax
  !shl edx,1
  !shl eax,3
  !sub eax,edx
  !shl edx,3
  !sub eax,edx
  !shl edx,4
  !add eax,edx
  !;
  !mov edx,ebx
  !shl edx,1
  !shl ebx,3
  !sub ebx,edx
  !shl edx,4
  !sub ebx,edx
  !shl edx,2
  !add ebx,edx
  !;
  !add eax,ebx
  !shr eax,8
  ProcedureReturn
EndProcedure 


;-PROVE IT: 
cx1=13:cx2=782:cy1=345:cy2=386 

tt=gettickcount_() 
For t=1 To 1000000 
  approx_distanceASM(cx2-cx1,cy2-cy1) 
Next 
Debug gettickcount_()-tt 

Debug approx_distanceASM(cx2-cx1,cy2-cy1) 

Posted: Thu Nov 13, 2003 5:30 pm
by Psychophanta
Here is the final conclusion. As i thought, the exact distance (the SQR with floats, this is, Phythagoras algorithm) is the faster one and the more accurate in calculation :wink:

Code: Select all

; /// return approx distance between 2 points ///
; /// usage : approx_distance(x1-x2,y1-y2)    ///

Procedure approx_distance(xx,yy)
  xx=Abs(xx)
  yy=Abs(yy)
  If xx<yy
    ProcedureReturn (yy<<8+yy<<3-yy<<4-yy<<1+xx<<7-xx<<5+xx<<3-xx<<1)>>8
  Else
    ProcedureReturn (xx<<8+xx<<3-xx<<4-xx<<1+yy<<7-yy<<5+yy<<3-yy<<1)>>8
  EndIf
EndProcedure

Procedure.f distance(xx.f,yy.f)
  !;finit
  !fld dword[esp] ;push xx to FPU stack (to st0)
  !fmul st0,st0
  !fld dword[esp+4] ;push yy value to FPU stack (to st0) (xx is now in st1)
  !fmul st0,st0
  !faddp
  !fsqrt
EndProcedure

Procedure.l approx_distanceASM(xx.l,yy.l)
  !mov eax,dword[esp]
  !mov ebx,dword[esp+4]
  !cmp eax,0
  !jnl @f
  !neg eax
  !@@:
  !cmp ebx,0
  !jnl @f
  !neg ebx
  !@@:
  !cmp eax,ebx
  !jnl @f
  !xchg eax,ebx  
  !@@:
  !mov edx,eax
  !shl edx,1
  !shl eax,3
  !sub eax,edx
  !shl edx,3
  !sub eax,edx
  !shl edx,4
  !add eax,edx
  !;
  !mov edx,ebx
  !shl edx,1
  !shl ebx,3
  !sub ebx,edx
  !shl edx,4
  !sub ebx,edx
  !shl edx,2
  !add ebx,edx
  !;
  !add eax,ebx
  !shr eax,8
  ProcedureReturn
EndProcedure

;-PROVE IT:
DisableDebugger
cx1=13:cx2=782:cy1=345:cy2=286

tt=gettickcount_()
For t=1 To 1000000
  approx_distance(cx2-cx1,cy2-cy1)
Next
EnableDebugger
Debug gettickcount_()-tt

DisableDebugger
tt=gettickcount_()
!finit
For t=1 To 1000000
  distance(cx2-cx1,cy2-cy1)
Next
EnableDebugger
Debug gettickcount_()-tt

DisableDebugger
tt=gettickcount_()
For t=1 To 1000000
  approx_distanceASM(cx2-cx1,cy2-cy1)
Next 
EnableDebugger
Debug gettickcount_()-tt


Debug ""
Debug approx_distance(cx2-cx1,cy2-cy1)
Debug distance(cx2-cx1,cy2-cy1)
Debug approx_distanceASM(cx2-cx1,cy2-cy1)
griz, take the best one for you.

And here can see the speed difference between call or not call to function:

Code: Select all

Procedure.f distance(xx.f,yy.f)
  !fld dword[esp] ;push xx to FPU stack (to st0)
  !fmul st0,st0
  !fld dword[esp+4] ;push yy value to FPU stack (to st0) (xx is now in st1)
  !fmul st0,st0
  !faddp
  !fsqrt
EndProcedure

DisableDebugger
cx1=13:cx2=782:cy1=345:cy2=286
xx.f=cx2-cx1:yy.f=cy2-cy1
result.f

tt=gettickcount_()
!finit
For t=1 To 1000000
  !fld dword[v_xx] ;push xx to FPU stack (to st0)
  !fmul st0,st0 ;xx^2
  !fld dword[v_yy] ;push yy value to FPU stack (to st0) (xx is now in st1)
  !fmul st0,st0 ;yy^2
  !faddp ;xx^2+yy^2
  !fsqrt ;hypotenuse, this is, SQR(xx^2+yy^2)
  !fstp dword[v_result]
Next
EnableDebugger
Debug gettickcount_()-tt

DisableDebugger
tt=gettickcount_()
!finit
For t=1 To 1000000
  distance(xx,yy)
Next
EnableDebugger
Debug gettickcount_()-tt

Debug ""
Debug result
Debug distance(cx2-cx1,cy2-cy1)

Posted: Thu Nov 13, 2003 7:39 pm
by griz
Thanks again Psychophanta!

Yes it does seem that your SQR based ASM version is the best. I wonder if this holds true on slower, older computers? When I get a chance I'll test this out and let you know. For now, here's the fastest/most accurate/simplest way :

Code: Select all

Procedure.f distance(xx.f,yy.f) 
  !;finit 
  !fld dword[esp] ;push xx to FPU stack (to st0) 
  !fmul st0,st0 
  !fld dword[esp+4] ;push yy value to FPU stack (to st0) (xx is now in st1) 
  !fmul st0,st0 
  !faddp 
  !fsqrt 
EndProcedure 

Posted: Thu Nov 13, 2003 7:47 pm
by Kale
How is this function called? i guess the distance between 2 points is 2 sets of xy coordinates. So say point 1 is (x=5, y=10) and point 2 is (x=15, y=20) to find the distance between them do i call the function like this:

Code: Select all

distance(15 - 5, 20 - 10)
:?:

Posted: Thu Nov 13, 2003 7:57 pm
by Psychophanta
YES 8)

or:

distance(5-15,10-20)

Posted: Fri Nov 14, 2003 1:44 am
by Kale
I thought so, Thankyou. :)