Speed of PureBasic

Road Runner · Post by **Road Runner** » Sun Jan 25, 2004 8:38 pm

Psychophanta,

at least RoadRunner lied when he said:

Oh dear, we don't need to get into this sort of nonsense, do we?
In case of confusion, the "other PB" is PowerBASIC for Windows V7.02.

The code as posted was compiled as is, with all the default compiler settings.
The only changes made were to the type suffixes: the other PB doesn't require .f and .l suffixes, and the variables were declared with the line:

DIM yy AS EXT, zz AS EXT, xx AS LONG

The posted ASM was obtained from OllyDbg, but I added the BASIC statements at the end because Davies said he wasn't familiar with ASM, so I thought it would help him understand if ASM and BASIC were side by side.

If all you want is some "PureBASIC beats the other PB" code, then: PureBASIC compiles the assignment of SINGLE floats more efficiently.

a.f=1.23

compiles to a single MOV DWORD address,immediate instruction in PureBASIC, but it compiles to FLD sourceaddress, FSTP destinationaddress in the other PB.

"the other PB" lacks too, coz extended precision floats are intended to be 80 bit, not 64.

The other PB includes 32 bit (SINGLE), 64 bit (DOUBLE) and 80 bit (EXT) FP numbers. It doesn't lack in that respect. Are you talking about a different PB?

Is "For xx.l = 1 To 100000000" the same for you as "FOR xx=100000000 to 1 step-1"?

But this is at the heart of the discussion. One compiler sees an opportunity to make a (small) improvement and does it. The other compiler doesn't.
In the code as posted, the change from counting up to counting down results in the correct answer more quickly.
Comparing true like-for-like code on my PC, the other PB runs that loop in 14 clks compared to PureBASIC's 18 clks.
Allowing register variables, the other PB runs it at 80-bit precision in 5.5 clks/loop.
These differences are significant when running physics simulations which currently take "2 or 3 weeks" to complete.

dell_jockey · Post by **dell_jockey** » Sun Jan 25, 2004 9:03 pm

Syntax Error,

Code:

Would somebody care to test this HeapSort code and see what speed they get?

On a 2 GHz Dell Latitude C840 with XP Prof., it took 280 ms, running it as an executable from the command line with debugging code disabled.

Psychophanta · Post by **Psychophanta** » Sun Jan 25, 2004 10:42 pm

@Road Runner:
PureBasic is a little lacking where floats are concerned; I noticed it as soon as I got to know PB and told the author, Fred, about it. All the users know about this.
But try this next:

PowerBasic for Windows VERSION 7.02:

Code:

#COMPILE EXE

FUNCTION PBMAIN () AS LONG

DIM starttime AS DWORD
DIM elapsedtime AS DWORD
DIM a(10) AS DWORD
DIM b AS DWORD
DIM i AS DWORD
b=2

For i=1 To 500000000
    a(1)=a(2)+a(3)
    a(b)=a(1)-a(2)
    a(2)=17+i
    a(3)=100-a(2)
Next

     MSGBOX "Hello, World!"

END FUNCTION

Compiled .exe size = 7168 bytes.
On Win2000 Pro AMD Athlon 900 MHz it takes 24.2 secs.

The same program in PureBasic:

Code:

Dim a.l(10)
b.l=2
For i.l=1 To 500000000
a(1)=a(2)+a(3)
a(b)=a(1)-a(2)
a(2)=17+i
a(3)=100-a(2)
Next

Messagebox_(0,"Hello World!!","",#MB_ICONINFORMATION)

Compiled .exe size = 4640 bytes.
On Win2000 Pro AMD Athlon 900 MHz (under exactly the same conditions as before) it takes 10.7 secs (less than half).

Convinced?!
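
To compare these wall-clock figures with the clocks-per-loop numbers quoted earlier in the thread, the conversion is simple arithmetic. A rough sketch in Python (the helper name `clocks_per_iteration` is made up for illustration; it assumes the CPU runs at its nominal 900 MHz throughout and that program startup time is negligible):

```python
# Hypothetical helper, not from the thread: convert a wall-clock time into
# approximate CPU clocks per loop iteration.
def clocks_per_iteration(seconds, cpu_hz, iterations):
    return seconds * cpu_hz / iterations

CPU_HZ = 900e6            # Athlon 900 MHz, as stated in the post
ITERATIONS = 500_000_000  # loop count used in both listings

powerbasic = clocks_per_iteration(24.2, CPU_HZ, ITERATIONS)
purebasic = clocks_per_iteration(10.7, CPU_HZ, ITERATIONS)

print(f"PowerBASIC: {powerbasic:.2f} clks/loop")
print(f"PureBasic:  {purebasic:.2f} clks/loop")
print(f"ratio:      {powerbasic / purebasic:.2f}x")
```

Each loop body here contains four array statements, so these per-loop figures are not directly comparable with timings of a single floating-point multiply.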

blueznl · Post by **blueznl** » Sun Jan 25, 2004 10:51 pm

hey, psycho, better try it without the messagebox part, same difference?

Road Runner · Post by **Road Runner** » Sun Jan 25, 2004 11:15 pm

Psychophanta,

I don't see how your code relates to Davies' question which is how to get the best performance for FPU intensive calculations in week long physics simulations.

Convinced?!

Of what? That I lied?? No.

Psychophanta · Post by **Psychophanta** » Sun Jan 25, 2004 11:32 pm

Syntax Error,

Code:
Would somebody care to test this HeapSort code and see what speed they get?

On a 2 GHz Dell Latitude C840 with XP Prof., it took 280 ms, running it as an executable from the command line with debugging code disabled.

Something strange must be happening there; on an Athlon 900 MHz it takes 200 ms for 10000 elements... 8O

Psychophanta · Post by **Psychophanta** » Sun Jan 25, 2004 11:42 pm

Road Runner, I understood everything you said; I asked "Convinced?!" because there is such a big difference in speed between PowerBasic and PureBasic when handling memory pointers.

I don't have PowerBasic installed right now, so could you try comparing the same array code, but with 32-bit floats instead of longs?

Post by **Dare2** » Mon Jan 26, 2004 12:44 am

Heya Syntax Error:

10 times for each:

Avg for debug-off PB, exe created:
184.4

Avg for IBASIC 2.02A, exe (bind) done.
16824.2

Avg for BlitzBasic
Unavailable: when I had to reinstall (way back) and had lost my key, the beggars wouldn't help out and tried to sell me Blitz3D instead.

Road Runner · Post by **Road Runner** » Mon Jan 26, 2004 1:21 am

Psychophanta,
I only have the demo version of PB so I can't compare the two timings directly, but looking at the ASM produced by each, it looks unlikely that the timings you gave are accurate. Not that it matters for the discussion on physics simulation.
Changing a() to 32 bit float makes the code take exactly twice as long in the other PB.

Also, program size: they both take up 8k on disk on my PC, as they are bigger than one disk cluster (4k) and smaller than two.
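
The cluster arithmetic can be checked in a couple of lines. A small Python sketch, assuming the 4k (4096-byte) cluster size mentioned above; `size_on_disk` is just an illustrative name:

```python
def size_on_disk(file_bytes, cluster_bytes=4096):
    """Round a file size up to a whole number of disk clusters."""
    clusters = -(-file_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes

# The two .exe sizes reported earlier in the thread.
for name, size in (("PowerBASIC exe", 7168), ("PureBasic exe", 4640)):
    print(f"{name}: {size} bytes -> {size_on_disk(size)} bytes on disk")
```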

davies · Post by **davies** » Mon Jan 26, 2004 3:09 am

Hi everyone,

Thanks for all the replies but there still doesn't appear to be any definitive answer. Having now spent a few days looking for fast BASIC compilers it seems the following claim to be very quick:

PureBasic
PowerBasic
HotBasic

As mentioned before, unlike most users, I am only interested in floating point calculations. Unfortunately there don't appear to be any definitive benchmark comparisons between the different flavours.

One thing that seems clear is that the fastest performance for floating point calculations is achieved using the FPU instruction set. Does anyone know whether PureBasic automatically uses the FPU instruction set for such calculations or would I have to struggle and do it myself in ASM?

If I was to write most of the simulation in straight BASIC with just the key calculations in inline ASM, can anyone please tell me whether the procedure for adding inline ASM is the same for each flavour, and which would be easiest to use?

Finally, I'm a total newbie to ASM. Do you have any recommendations for books to learn ASM, including the FPU instruction set? I would prefer a book rather than online resources.

Many thanks in advance.

Post by **Dare2** » Mon Jan 26, 2004 3:40 am

@davies

Best move you'll make is to go to asm, and if you're on Windows then MASM, FASM and GoAsm seem ok.

Masm is my preferred.

Some links

http://win32asm.cjb.net/
http://www.masm32.com/
http://www.movsd.com/
http://www.masmforum.com/

Some very clued up people in the asm arena. On the forums, watch the egos though - there seem to be two main camps.

For what you're doing you might find that much is available already in the tutorials and examples.

[EDIT]
Iczelion's tutorials are good. First link above.

Also look for The Art Of Assembly Language Programming by Randall Hyde, free online book - two versions, one uses his HL asm, the other is standard. But I can't remember where, and am too lazy to look. Webster springs to mind but the .com appears to be the dictionary.

[EDIT AGAIN]
http://webster.cs.ucr.edu/Page_asm/ArtOfAsm.html

Danilo · Post by **Danilo** » Mon Jan 26, 2004 8:37 am

@davies:
You have 2 methods to choose from when it comes to
ASM with PureBasic.
The first method is inlineASM. After enabling it in the compiler
options menu, you can use ASM directly in your code:

Code:

; ENABLE InlineASM in compiler options!
xyz.l
MOV xyz, 12
Debug xyz

The second way is !directASM. With !directASM, everything after
a '!' at the start of a line is not touched by the PB compiler and is
passed directly to the underlying assembler, FASM.
Because the compiler doesn't touch your !directASM code, you can't
use variables, labels and so on by their plain PB names. It's easy,
though: you have to add 'v_' in front of variable names and 'l_' in
front of label names, that's it.
I prefer this way because it lets me use FASM directly, but
you can also use inlineASM if you prefer.

I don't know if it works with the demo version of PB,
but here is a small test to see how easily !directASM works:

Code:

Global xx.l, yy.f, zz.f 

#count = 8

EnableDebugger
Debug "PB:"
DisableDebugger

For a = 1 To #count
  yy.f = 1.000001 
  zz.f = 1
  For xx.l = 1 To 10000000
    zz.f = zz.f * yy.f
  Next
  EnableDebugger
  Debug zz
  DisableDebugger
Next a


EnableDebugger
Debug "!directASM:"
DisableDebugger

For a = 1 To #count
  yy.f = 1.000001
  zz.f = 1
  !FLD  dword [v_zz] ; load variable zz
  !FLD  dword [v_yy] ; load variable yy
  !MOV  dword ECX,10000000
!loop_label:
  !FMUL ST1,ST
  !DEC  dword ECX
  !JNZ  loop_label
  !FFREE ST
  !FINCSTP
  !FSTP dword [v_zz] ; store variable zz
  EnableDebugger
  Debug zz
  DisableDebugger
Next a

So you have access to both worlds directly from 1 source code.
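
For readers wondering why precision matters in a loop like `zz = zz * yy`, here is a rough Python sketch. It is only an emulation under stated assumptions: Python floats are 64-bit, 32-bit storage is imitated by rounding through `struct`, the x87's 80-bit internal format is not modelled, and the loop count is cut from the listing's 10,000,000 to 100,000 so it runs quickly. The names `to_f32` and `repeated_product` are made up for this sketch.

```python
import struct

def to_f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def repeated_product(factor, count, rounder=lambda v: v):
    """Multiply 1.0 by `factor` `count` times, rounding after every step."""
    factor = rounder(factor)
    acc = rounder(1.0)
    for _ in range(count):
        acc = rounder(acc * factor)
    return acc

COUNT = 100_000  # assumption: reduced from the thread's 10,000,000

as_double = repeated_product(1.000001, COUNT)          # 64-bit throughout
as_single = repeated_product(1.000001, COUNT, to_f32)  # stored as 32-bit

print("1.000001 stored as 32-bit:", to_f32(1.000001))
print("64-bit result:", as_double)
print("32-bit result:", as_single)
```

The 32-bit run drifts away from the 64-bit one for two reasons: 1.000001 itself cannot be stored exactly in 32 bits, and every stored intermediate result is rounded again.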

With the compiler option "/COMMENTED" the compiler generates
and outputs an ASM file from your BASIC source, absolutely
needed if you want to optimize your FPU code (and you could
also learn some ASM from it, could help you at start).

Can't help you with a paper book, sorry davies. All my ASM books
are written in German, so they wouldn't help you much.

Maybe some other guys can recommend you an english book for
_learning_ ASM.

As a reference you could order Intel's processor manuals, which
include system architecture, optimization and more (4 books).
You can download these books as .PDF and you can order the
4 paper books for free. Yes, absolutely free.
See URL: Intel Pentium 4 manuals

You can download the first 4 books as .pdf directly, or you can
use "Request a hardcopy of this document." to have the books
sent to your home for free. It takes roughly 1 - 2 weeks for them
to arrive, depending on where you live.

Psychophanta · Post by **Psychophanta** » Mon Jan 26, 2004 11:25 am

Davies wrote:

As mentioned before, unlike most users, I am only interested in floating point calculations. Unfortunately there don't appear to be any definitive benchmark comparisons between the different flavours.

One thing that seems clear is that the fastest performance for floating point calculations is achieved using the FPU instruction set. Does anyone know whether PureBasic automatically uses the FPU instruction set for such calculations or would I have to struggle and do it myself in ASM?

If I was to write most of the simulation in straight BASIC with just the key calculations in inline ASM, can anyone please tell me whether the procedure for adding inline ASM is the same for each flavour, and which would be easiest to use?

Here are some of my ASM-optimized routines that are useful for physics simulations. These are 2D vector calculations. The functions are written in ASM and optimized as far as possible for speed. If you want still more speed, simply put the Procedure body in the place where it is called, replacing the call to the procedure.
Each sample has a PROVE section so you can watch what it does:

Code:

;Assembler function to find the projected vector of a given vector onto another one.

;NOTE: This function doesn't use any trigonometric function, but geometric calculation.
; Author: Psychophanta
; Date: 24 Dec 2003
Procedure Project_Vector()
  ;-Project_Vector ASM Function:
  !fld dword[v_v+20];<-Horizontal coord of vector to project onto.
  !fst st1;<-Make a copy of it in st1.
  !fld st0;<-Make a copy of it in st1.
  !fmul st1,st0;<-Horizontal coord^2 in st1.
  !fmul dword[v_v];<- st0 = Horizontal coord of vector * Horizontal coord of vector to project onto.
  !fld dword[v_v+24];<-Vertical coord of vector to project onto.
  !fst st4;<-Make a copy of it in st4.
  !fld st0;<-Make a copy of it in st1.
  !fmul st0,st0;<-Vertical coord^2 to st0.
  !faddp st3,st0;<-Vertical coord in st0. (Horizontal coord of vector * Horizontal coord of vector to project onto) in st1. (Horizontal coord^2+Vertical coord^2) in st2.
  !fmul dword[v_v+4];<- st0 = Vertical coord of vector * Vertical coord of vector to project onto.
  !faddp st1,st0;<-(Vertical coord of vector * Vertical coord of vector to project onto) + (Horizontal coord of vector * Horizontal coord of vector to project onto) now in st0. (Horizontal coord^2+Vertical coord^2) in st1. 
  !fdivrp st1,st0;<-Constant result to multiply by vector to project onto, to get the wanted vector.
  !fmul st1,st0;<-Horizontal coord result in st1.
  !fmul st0,st2;<-Vertical coord result in st0.
  !fstp dword[v_v+4]
  !fst dword[v_v]
  ;*End Project_Vector ASM Function
EndProcedure
Structure vector
  c1.f;<-horizontal coord
  c2.f;<-vertical coord
  Length.f;<-length(modulo)
  Angle.f;<- angle
  LateralForce.f;<-Side force factor. It is the tangent of the angle to rotate.
                   ;A 1 here means an inclination deviation of 45 degrees,
                   ;from 0 to +-1 implicates a deviation of 0 to +-45 degrees,
                   ;from +-1 to +-infinite implicates a deviation of +-45 to +-90 degrees.
  p1.f;<-horizontal coord of vector to project onto
  p2.f;<-vertical coord of vector to project onto
EndStructure

;-PROVE IT:

;-INITS:
bitplanes.b=32:RX.w=1024:RY.w=768:#PI=3.14159265
If InitMouse()=0 Or InitSprite()=0 Or InitKeyboard()=0
  MessageRequester("Error","Can't open DirectX",0)
  End
EndIf
While OpenScreen(RX.w,RY.w,bitplanes.b,"Balls")=0
  If bitplanes.b>16:bitplanes.b-8
  ElseIf RY.w>600:RX.w=800:RY.w=600
  ElseIf RY.w>480:RX.w=640:RY.w=480
  ElseIf RY.w>400:RX.w=640:RY.w=400
  ElseIf RY.w>240:RX.w=320:RY.w=240
  ElseIf RY.w>200:RX.w=320:RY.w=200
  Else:MessageRequester("VGA limitation","Can't open Screen!",0):End
  EndIf
Wend
;-MAIN:
CX.w=RX.w/2:CY.w=RY.w/2
c1.f=100:c2.f=200;<-initial vector
v.vector\p1=300:v.vector\p2=30;<-vector to project onto.
Repeat
  ExamineKeyboard()
  ExamineMouse()
  If MouseButton(1)=0 And MouseButton(2)=0:c1.f+MouseDeltaX():c2.f+MouseDeltaY()
  ElseIf MouseButton(2):CX.w+MouseDeltaX():CY.w+MouseDeltaY()
  ElseIf MouseButton(1):v.vector\p1+MouseDeltaX():v.vector\p2+MouseDeltaY()
  EndIf
  v.vector\c1=c1.f:v.vector\c2=c2.f
  modp.f=Sqr(v.vector\p1*v.vector\p1+v.vector\p2*v.vector\p2)
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  ClearScreen(0,0,0)
  StartDrawing(ScreenOutput())
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$CCCCCC)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$CCCCCC)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$CCCCCC)
  Line(CX.w,CY.w,v.vector\p1,v.vector\p2,$AADDCC)
    Line(CX.w+v.vector\p1,CY.w+v.vector\p2,-v.vector\p1*10/modp.f+v.vector\p2*10/modp.f,-v.vector\p2*10/modp.f-v.vector\p1*10/modp.f,$AADDCC)
    Line(CX.w+v.vector\p1,CY.w+v.vector\p2,-v.vector\p1*10/modp.f-v.vector\p2*10/modp.f,-v.vector\p2*10/modp.f+v.vector\p1*10/modp.f,$AADDCC)
  Project_Vector()
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$11FFFF)
  StopDrawing()
  FlipBuffers()
Until KeyboardPushed(#PB_Key_Escape)
CloseScreen()
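
For readers who don't follow FPU stack code, the arithmetic that Project_Vector performs can be sketched in a few lines of Python. This is a plain restatement of the formula read off the ASM comments, not a translation of the author's code; the function name `project` is made up, and the sample values are the ones set in the MAIN section:

```python
def project(c, p):
    """Project 2D vector c onto 2D vector p: ((c . p) / (p . p)) * p."""
    dot = c[0] * p[0] + c[1] * p[1]       # c . p
    norm_sq = p[0] * p[0] + p[1] * p[1]   # |p|^2
    k = dot / norm_sq                     # the constant the ASM gets from FDIVRP
    return (k * p[0], k * p[1])

c = (100.0, 200.0)  # initial vector (c1, c2) from the sample
p = (300.0, 30.0)   # vector to project onto (p1, p2) from the sample

proj = project(c, p)
print("projection:", proj)
```

No trigonometric function is needed, which matches the NOTE at the top of the listing. A zero-length p would divide by zero here.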

Code:

;Assembler function to find the component of a given vector that is perpendicular (orthogonal) to another one.

;NOTE: This function doesn't use any trigonometric function, but geometric calculation.
; Author: Psychophanta
; Date: 24 Dec 2003
Procedure Ortogonal_Project_Vector()
  ;-Ortogonal_Project_Vector ASM Function:
  !fld dword[v_v+20];<-Horizontal coord of vector to project onto.
  !fst st1;<-Make a copy of it in st1.
  !fld st0;<-Make a copy of it in st1.
  !fmul st1,st0;<-Horizontal coord^2 in st1.
  !fmul dword[v_v+4];<- st0 = Vertical coord of vector * Horizontal coord of vector to project onto.
  !fld dword[v_v+24];<-Vertical coord of vector to project onto.
  !fst st4;<-Make a copy of it in st4.
  !fld st0;<-Make a copy of it in st1.
  !fmul st0,st0;<-Vertical coord^2 to st0.
  !faddp st3,st0;<-Vertical coord in st0. (Vertical coord of vector * Horizontal coord of vector to project onto) in st1. (Horizontal coord^2+Vertical coord^2) in st2.
  !fmul dword[v_v];<- st0 = Horizontal coord of vector * Vertical coord of vector to project onto.
  !fsubp st1,st0;<-(Vertical coord of vector * Horizontal coord of vector to project onto) - (Horizontal coord of vector * Vertical coord of vector to project onto) now in st0. (Horizontal coord^2+Vertical coord^2) in st1. 
  !fdivrp st1,st0;<-Constant result to multiply by vector to project onto, to get the wanted vector.
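  !fmul st1,st0;<-(Constant * Horizontal coord of vector to project onto) in st1. This is the Vertical coord result.
  !fmul st0,st2;<-(Constant * Vertical coord of vector to project onto) in st0. After the fchs below, this is the Horizontal coord result.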
  !fchs
  !fstp dword[v_v]
  !;<-moving previous fchs here would result in a opposed projected vector
  !fst dword[v_v+4]
  ;*End Ortogonal_Project_Vector ASM Function
EndProcedure
Structure vector
  c1.f;<-horizontal coord
  c2.f;<-vertical coord
  Length.f;<-length(modulo)
  Angle.f;<- angle
  LateralForce.f;<-Side force factor. It is the tangent of the angle to rotate.
                   ;A 1 here means an inclination deviation of 45 degrees,
                   ;from 0 to +-1 implicates a deviation of 0 to +-45 degrees,
                   ;from +-1 to +-infinite implicates a deviation of +-45 to +-90 degrees.
  p1.f;<-horizontal coord of vector to project onto
  p2.f;<-vertical coord of vector to project onto
EndStructure

;-PROVE IT:

;-INITS:
bitplanes.b=32:RX.w=1024:RY.w=768:#PI=3.14159265
If InitMouse()=0 Or InitSprite()=0 Or InitKeyboard()=0
  MessageRequester("Error","Can't open DirectX",0)
  End
EndIf
While OpenScreen(RX.w,RY.w,bitplanes.b,"Balls")=0
  If bitplanes.b>16:bitplanes.b-8
  ElseIf RY.w>600:RX.w=800:RY.w=600
  ElseIf RY.w>480:RX.w=640:RY.w=480
  ElseIf RY.w>400:RX.w=640:RY.w=400
  ElseIf RY.w>240:RX.w=320:RY.w=240
  ElseIf RY.w>200:RX.w=320:RY.w=200
  Else:MessageRequester("VGA limitation","Can't open Screen!",0):End
  EndIf
Wend
;-MAIN:
CX.w=RX.w/2:CY.w=RY.w/2
c1.f=100:c2.f=200;<-initial vector
v.vector\p1=300:v.vector\p2=30;<-vector to project onto.
Repeat
  ExamineKeyboard()
  ExamineMouse()
  If MouseButton(1)=0 And MouseButton(2)=0:c1.f+MouseDeltaX():c2.f+MouseDeltaY()
  ElseIf MouseButton(2):CX.w+MouseDeltaX():CY.w+MouseDeltaY()
  ElseIf MouseButton(1):v.vector\p1+MouseDeltaX():v.vector\p2+MouseDeltaY()
  EndIf
  v.vector\c1=c1.f:v.vector\c2=c2.f
  modp.f=Sqr(v.vector\p1*v.vector\p1+v.vector\p2*v.vector\p2)
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  ClearScreen(0,0,0)
  StartDrawing(ScreenOutput())
  Locate(0,0):DrawText("Lateral Amount: "+StrF(v.vector\c2*v.vector\p1-v.vector\c1*v.vector\p2))
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$CCCCCC)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$CCCCCC)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$CCCCCC)
  Line(CX.w,CY.w,v.vector\p1,v.vector\p2,$AADDCC)
    Line(CX.w+v.vector\p1,CY.w+v.vector\p2,-v.vector\p1*10/modp.f+v.vector\p2*10/modp.f,-v.vector\p2*10/modp.f-v.vector\p1*10/modp.f,$AADDCC)
    Line(CX.w+v.vector\p1,CY.w+v.vector\p2,-v.vector\p1*10/modp.f-v.vector\p2*10/modp.f,-v.vector\p2*10/modp.f+v.vector\p1*10/modp.f,$AADDCC)
  Ortogonal_Project_Vector()
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$11FFFF)
  StopDrawing()
  FlipBuffers()
Until KeyboardPushed(#PB_Key_Escape)
CloseScreen()
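
The arithmetic in Ortogonal_Project_Vector can be restated in Python as follows. This is a sketch based on reading the instruction sequence, not the author's own code; the name `orthogonal_component` is made up and the sample values come from the MAIN section:

```python
def orthogonal_component(c, p):
    """Return the part of 2D vector c that is perpendicular to 2D vector p."""
    cross = c[1] * p[0] - c[0] * p[1]     # c2*p1 - c1*p2, the "Lateral Amount"
    norm_sq = p[0] * p[0] + p[1] * p[1]   # |p|^2
    k = cross / norm_sq
    # (-p2, p1) is p turned by 90 degrees; scale it by k.
    return (-k * p[1], k * p[0])

c = (100.0, 200.0)  # initial vector (c1, c2) from the sample
p = (300.0, 30.0)   # reference vector (p1, p2) from the sample

perp = orthogonal_component(c, p)
print("orthogonal component:", perp)
```

Subtracting this result from c leaves a vector parallel to p, so the two pieces together rebuild c.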

Code:

;Assembler function to rotate a given vector due to a perpendicular attraction or repulsion force.
;Resulting vector has SAME LENGTH and modified inclination.
;The lateral force is given as a factor which:
; -from 0 to +-1 implicates a deviation of 0 to +-45 degrees of the given vector,
; -from +-1 to +-infinite implicates a deviation of +-45 to +-90 degrees of the given vector.
;So then, note that the given factor is the tangent of the angle to rotate.
;NOTE: This function doesn't use any trigonometric function, but geometric calculation.
; Author: Psychophanta
; Date: 24 Dec 2003
Procedure Rotate_Vector_by_Side_Force()
;-Rotate_Vector_by_Side_Force ASM Function:
!fld dword[v_v+16];<-SideForceFactor
!fld st0;<-Make a copy of it in st1
!fmul st0,st0;<-SideForceFactor^2
!fld1;<-push a 1 in st0
!faddp st1,st0;<-Now (SideForceFactor^2 + 1) in st0. SideForceFactor in st1.
!fsqrt  ;<-Now Sqr(SideForceFactor^2 + 1) in st0. SideForceFactor in st1.
!fld dword[v_v+4];<-Vertical coord in st0. Sqr(SideForceFactor^2 + 1) in st1. SideForceFactor in st2.
!fld dword[v_v];<-horizontal coord in st0. Vertical coord in st1. Sqr(SideForceFactor^2 + 1) in st2. SideForceFactor in st3.
!fld st1;<-y in st0 and in st2. x coord in st1. Sqr(SideForceFactor^2 + 1) in st3. SideForceFactor in st4.
!fmul st0,st4;<-(y * SideForceFactor) in st0.
!fsubr st0,st1;<-(x - y * SideForceFactor) in st0.
!fdiv st0,st3;<-That's the new x coord (x - y * SideForceFactor)/Sqr(SideForceFactor^2 + 1) in st0.
!fstp dword[v_v];<-store it. Now x coord in st0. y in st1. Sqr(SideForceFactor^2 + 1) in st2. SideForceFactor in st3.
!fmulp st3,st0;<-(x * SideForceFactor) in st2. y in st0. Sqr(SideForceFactor^2 + 1) in st1.
!faddp st2,st0;<-(y + x * SideForceFactor) in st1. Sqr(SideForceFactor^2 + 1) in st0.
!fdivp st1,st0;<-the wanted value y coord (y + x * SideForceFactor)/Sqr(SideForceFactor^2 + 1) is now in st0.
!fstp dword[v_v+4];<-store it.
;*End Rotate_Vector_by_Side_Force ASM Function
EndProcedure
Structure vector
  c1.f;<-horizontal coord
  c2.f;<-vertical coord
  Length.f;<-length(modulo)
  Angle.f;<- angle
  LateralForce.f;<-Side force factor. It is the tangent of the angle to rotate.
                   ;A 1 here means an inclination deviation of 45 degrees,
                   ;from 0 to +-1 implicates a deviation of 0 to +-45 degrees,
                   ;from +-1 to +-infinite implicates a deviation of +-45 to +-90 degrees.
EndStructure
;-PROVE IT:

;-INITS:
bitplanes.b=32:RX.w=1024:RY.w=768:#PI=3.14159265
If InitMouse()=0 Or InitSprite()=0 Or InitKeyboard()=0
  MessageRequester("Error","Can't open DirectX",0)
  End
EndIf
While OpenScreen(RX.w,RY.w,bitplanes.b,"Balls")=0
  If bitplanes.b>16:bitplanes.b-8
  ElseIf RY.w>600:RX.w=800:RY.w=600
  ElseIf RY.w>480:RX.w=640:RY.w=480
  ElseIf RY.w>400:RX.w=640:RY.w=400
  ElseIf RY.w>240:RX.w=320:RY.w=240
  ElseIf RY.w>200:RX.w=320:RY.w=200
  Else:MessageRequester("VGA limitation","Can't open Screen!",0):End
  EndIf
Wend
;-MAIN:
CX.w=RX.w/2:CY.w=RY.w/2
v.vector\c1=-43;<-vector x coordinate
v.vector\c2=27;<-vector y coordinate
v.vector\LateralForce=-1/50;<-Side force factor
Repeat
  ExamineKeyboard()
  ExamineMouse()
  If MouseButton(1)=0 And MouseButton(2)=0:v.vector\LateralForce+MouseDeltaX()/1000
  ElseIf MouseButton(2):CX.w+MouseDeltaX():CY.w+MouseDeltaY()
  ElseIf MouseButton(1):v.vector\c1+MouseDeltaX():v.vector\c2+MouseDeltaY()
  EndIf
  ClearScreen(0,0,0)
  StartDrawing(ScreenOutput())
  Locate(0,0):DrawText(StrF(v.vector\LateralForce)+" units")
  Rotate_Vector_by_Side_Force()
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$11FFFF)
  StopDrawing()
  FlipBuffers()
Until KeyboardPushed(#PB_Key_Escape)
CloseScreen()
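
The rotation formula spelled out in the ASM comments can be checked against an ordinary trigonometric rotation. A Python sketch, with the made-up name `rotate_by_side_force` and the starting values from the MAIN section:

```python
import math

def rotate_by_side_force(x, y, t):
    """Rotate (x, y) by the angle whose tangent is t, without using sin/cos."""
    s = math.sqrt(t * t + 1.0)  # Sqr(SideForceFactor^2 + 1)
    return ((x - y * t) / s, (y + x * t) / s)

x, y = -43.0, 27.0  # vector coordinates from the sample
t = -1.0 / 50.0     # side force factor from the sample

rx, ry = rotate_by_side_force(x, y, t)
print("rotated:", (rx, ry))
print("length before:", math.hypot(x, y), "after:", math.hypot(rx, ry))
```

Since cos(a) = 1/s and sin(a) = t/s when tan(a) = t, this is the standard 2D rotation, which is why the length stays the same.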

Code:

;Assembler function to modulate a given vector, this is, to transform a given vector to another
;with a given length (modulo):
; Author: Psychophanta
; Date: 22 Dec 2003
Procedure Modulate_Vector()
;-Get_Modulated_Vector ASM Function: <-To obtain vector (v.vector\c1,v.vector\c2) with a length of v.vector\length.
!fld dword[v_v+8];<-the New Length wanted for the vector
!fld dword[v_v];<-horizontal coord
!fld st0;<-horizontal coord in st1 too
!fmul st0,st0;<- x^2
!fld dword[v_v+4];<-vertical coord
!fld st0;<- in st1 too
!fmul st0,st0;<- y^2
!fadd st0,st2;<- x^2+y^2 now in st0. NewLength in st4
!fsqrt;<-Sqr(x^2+y^2) now in st0
!fdivp st4,st0;<-NewLength/Sqr(x^2+y^2) is the value to multiply each old vector component to get the new ones.
!fstp st1;<-mov st0 to st1 and pop, this is flush st1
!fmul st0,st2;<-new vertical coord now in st0
!fstp dword[v_v+4];<-new vertical coord
!fmulp st1,st0
!fstp dword[v_v];<-new horizontal coord
;*End Get_Modulated_Vector ASM Function
EndProcedure
Structure vector
  c1.f;<-horizontal coord
  c2.f;<-vertical coord
  length.f;<-length (modulo)
EndStructure
;-PROVE IT:

;-INITS:
bitplanes.b=32:RX.w=1024:RY.w=768:#PI=3.14159265
If InitMouse()=0 Or InitSprite()=0 Or InitKeyboard()=0
  MessageRequester("Error","Can't open DirectX",0)
  End
EndIf
While OpenScreen(RX.w,RY.w,bitplanes.b,"Balls")=0
  If bitplanes.b>16:bitplanes.b-8
  ElseIf RY.w>600:RX.w=800:RY.w=600
  ElseIf RY.w>480:RX.w=640:RY.w=480
  ElseIf RY.w>400:RX.w=640:RY.w=400
  ElseIf RY.w>240:RX.w=320:RY.w=240
  ElseIf RY.w>200:RX.w=320:RY.w=200
  Else:MessageRequester("VGA limitation","Can't open Screen!",0):End
  EndIf
Wend
;-MAIN:
CX.w=RX.w/2:CY.w=RY.w/2
c1.f=-43;<-vector x coordinate
c2.f=27;<-vector y coordinate
v.vector\length=30;<-wanted length
Repeat
  ExamineKeyboard()
  ExamineMouse()
  If MouseButton(1)=0 And MouseButton(2)=0:v.vector\length+MouseDeltaX()
  ElseIf MouseButton(2):CX.w+MouseDeltaX():CY.w+MouseDeltaY()
  ElseIf MouseButton(1):c1.f+MouseDeltaX():c2.f+MouseDeltaY()
  EndIf
  v.vector\c1=c1.f:v.vector\c2=c2.f
  ClearScreen(0,0,0)
  StartDrawing(ScreenOutput())
  Locate(0,0):DrawText(StrF(v.vector\length)+" units")
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$CCCCCC)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$CCCCCC)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$CCCCCC)
  Modulate_Vector()
  modc.f=Sqr(v.vector\c1*v.vector\c1+v.vector\c2*v.vector\c2)
  Line(CX.w,CY.w,v.vector\c1,v.vector\c2,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f+v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f-v.vector\c1*10/modc.f,$11FFFF)
    Line(CX.w+v.vector\c1,CY.w+v.vector\c2,-v.vector\c1*10/modc.f-v.vector\c2*10/modc.f,-v.vector\c2*10/modc.f+v.vector\c1*10/modc.f,$11FFFF)
  StopDrawing()
  FlipBuffers()
Until KeyboardPushed(#PB_Key_Escape)
CloseScreen()
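
What Modulate_Vector computes is a rescale of the vector to a wanted length. A Python sketch of the same arithmetic, with the made-up name `with_length` and the sample's starting values:

```python
import math

def with_length(x, y, new_length):
    """Scale (x, y) so that its length becomes new_length, keeping direction."""
    k = new_length / math.sqrt(x * x + y * y)  # NewLength / Sqr(x^2 + y^2)
    return (k * x, k * y)

x, y = -43.0, 27.0  # vector coordinates from the sample
wanted = 30.0       # wanted length from the sample

nx, ny = with_length(x, y, wanted)
print("rescaled:", (nx, ny), "length:", math.hypot(nx, ny))
```

A zero-length input vector raises ZeroDivisionError in this sketch, so a real simulation would need to guard against that case.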

If you want more, just ask.
I hope these are useful for showing you how it's done.

Psychophanta · Post by **Psychophanta** » Mon Jan 26, 2004 11:42 am

Davies wrote:

Finally, I'm a total newbie to ASM. Do you have any recommendations for books to learn ASM, including the FPU instruction set? I would prefer a book rather than online resources.

I STROOONGLY recommend that you:
- Go to http://flatassembler.net/
- Download the version "for WIN32 gui"
- Send to a printer the Fasm.pdf file.

It explains all the useful x86 stuff (including 3DNow, MMX, SSE, SSE2) in a simple and quickly understandable way. Really, I can't imagine a better way to explain it, and I congratulated the author, Tomasz Grysztar, on it, as well as asking him for some improvements to FASM.

Post by **Dare2** » Tue Jan 27, 2004 1:11 am

@davies

I recall seeing some 80bit maths functions written in asm for masm by some very clever people and placed into public domain. These handle advanced maths as well as simple arithmetic.

I have been trying to find those again, but failed.

Posting here in case this jogs someone else's memory and they know where these are.

I was thinking you could make them into a dll used by a PB front end, and let the dll do the grunt work.

However, as Danilo pointed out, you could make them inline as well. Not sure which would be better, but people like Danilo and Psychophanta are streets ahead of me in ability so whatever they suggest would be best.

Anyhow, the code is out there somewhere, probably a link or zip in one of the more popular asm sites.

Cya.