Assembler optimizing - simple questions
Posted: Mon Apr 13, 2009 4:10 pm
by Michael Vogel
I just started to add some speed to one of my programs, but I'm not sure whether everything I did works correctly...
I'm also unsure whether such code would be "thread safe", "water resistant", etc.
This should be OK, I believe...
Code: Select all
Procedure.l Max(a.l, b.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_a]
!mov ecx,dword[p.v_b]
!cmp eax,ecx
!jg skip
!mov eax,ecx
!skip:
ProcedureReturn
CompilerElse
If a>b
ProcedureReturn a
Else
ProcedureReturn b
EndIf
CompilerEndIf
EndProcedure
This one too (because I stole it from this forum)...
Code: Select all
Procedure.l Check(x.l,min.l,max.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_x] ; load the value into eax
!cmp eax,dword[p.v_min] ; compare eax with argument 2 (min)
!jng l_low ; jump to l_low if value <= min
!cmp eax,dword[p.v_max] ; compare eax with argument 3 (max)
!jnl l_high ; jump to l_high if value >= max
!jmp l_term ; value lies strictly between min and max, so return it unchanged
!l_low:
!mov eax,dword[p.v_min] ; value is at or below min, so return min in eax
!jmp l_term ; jump to the end
!l_high:
!mov eax,dword[p.v_max] ; value is at or above max, so return max in eax
!l_term:
ProcedureReturn
CompilerElse
If x<min
ProcedureReturn min
ElseIf x>max
ProcedureReturn max
Else
ProcedureReturn x
EndIf
CompilerEndIf
EndProcedure
But what about this one: how could I optimize it to run faster?
Code: Select all
#AllowAssembler=1
Procedure.l Limit(x.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_x]
!cmp eax,0
!jng l_low
!cmp eax,255
!jnl l_high
!jmp l_term
!l_low:
!mov eax,0
!jmp l_term
!l_high:
!mov eax,255
!l_term:
ProcedureReturn
CompilerElse
If x<0
ProcedureReturn 0
ElseIf x>255
ProcedureReturn 255
Else
ProcedureReturn x
EndIf
CompilerEndIf
EndProcedure
Debug limit(-5)
Debug limit(0)
Debug limit(1)
Debug limit(254)
Debug limit(255)
Debug limit(256)
Debug limit(99999)
Please give me a hint,
Michael
Posted: Mon Apr 13, 2009 9:26 pm
by dioxin
Please give me a hint
Try using the conditional move instructions, as they'll cut out all the jumps:
Code: Select all
'syntax may need revising to work with PB
mn = 0 'the lower bound
mx = 255 'the upper bound
!mov eax,YourInputNumber
!cmp eax,mx 'compare with top limit
!cmovg eax,mx 'if greater then set value to top limit
!cmp eax,mn 'compare with bottom limit
!cmovl eax,mn 'if lower, set value to bottom limit
Posted: Tue Apr 14, 2009 7:15 am
by Michael Vogel
dioxin wrote:[...]
!cmovl eax,mn 'if lower, set value to bottom limit
[...]
Thanks,
just two questions...
...are the cmovl and cmovg instructions available on all current CPUs?
...why the hell is the assembler code not faster than the BASIC source?
Thanks,
Michael
Code: Select all
#AllowAssembler=0
Procedure.l Limit(x.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_x]
!cmp eax,0 ; compare with top limit
!cmovg eax,0 ; if greater then set value to top limit
!cmp eax,255 ; compare with bottom limit
!cmovl eax,255 ; if lower, set value to bottom limit
ProcedureReturn
CompilerElse
If x<0
ProcedureReturn 0
ElseIf x>255
ProcedureReturn 255
Else
ProcedureReturn x
EndIf
CompilerEndIf
EndProcedure
DisableDebugger
x=ElapsedMilliseconds()
For i=0 To 999999999
Debug limit(-5)
Debug limit(0)
Debug limit(1)
Debug limit(254)
Debug limit(255)
Debug limit(256)
Debug limit(99999)
Next i
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))
Posted: Tue Apr 14, 2009 7:32 am
by Deeem2031
Michael Vogel wrote:
...why the hell is the assembler code not faster than the BASIC source?
No wonder, if you write "DisableDebugger" and put all your test calls after a "Debug". Or did you really want to time an empty For loop?
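To illustrate the point, here is a minimal timing sketch that assigns each result to a variable instead of passing it to Debug, so the calls are still compiled when the debugger is off. The names LimitBasic, #Loops, n and t are made up for this example, and the loop count is an arbitrary assumption:

```purebasic
; Hypothetical timing harness; LimitBasic() mirrors the plain-BASIC branch from the thread.
Procedure.l LimitBasic(x.l)
  If x < 0
    ProcedureReturn 0
  ElseIf x > 255
    ProcedureReturn 255
  EndIf
  ProcedureReturn x
EndProcedure

#Loops = 9999999            ; assumed iteration count, adjust to taste

Define n.l, i.l, t.l

t = ElapsedMilliseconds()
For i = 0 To #Loops
  n = LimitBasic(-5)        ; a plain assignment, so the call is not dropped with the debugger off
  n = LimitBasic(256)
  n = LimitBasic(99999)
Next i
t = ElapsedMilliseconds() - t

MessageRequester("Limit timing", Str(t) + " ms")
```

Swapping LimitBasic for the assembler variant in the same loop should then give comparable numbers.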

Posted: Tue Apr 14, 2009 8:00 am
by Michael Vogel
Deeem2031 wrote:Michael Vogel wrote:
...why the hell is the assembler code not faster than the BASIC source?
No wonder, if you write "DisableDebugger" and put all your test calls after a "Debug". Or did you really want to time an empty For loop?

What a shame. I really thought that the expression would still be evaluated, just not displayed.
Changing the Debug lines to "n=Limit...." has changed everything: now I get an invalid operand message in the line "!cmovg eax,0".
Not my day.
Posted: Tue Apr 14, 2009 9:43 am
by Helle
The CMOVxx instructions don't accept immediate values as the source, only registers or memory operands (variables):
Code: Select all
!XOR EDX, EDX ;set EDX to zero
!CMOVG EAX,EDX
Regards
Helle
Posted: Tue Apr 14, 2009 11:37 am
by Michael Vogel
Helle wrote:The CMOVxx instructions don't accept immediate values as the source, only registers or memory operands (variables):
Code: Select all
!XOR EDX, EDX ;set EDX to zero
!CMOVG EAX,EDX
Regards
Helle
Thanks, got it
~15% more speed now
Code: Select all
Procedure.l Limit(x.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_x]
!xor edx,edx ; set EDX to zero
!cmp eax,0 ; compare with bottom limit
!cmovl eax,edx ; if lower, set value to bottom limit (0)
!mov edx,255 ; load top limit
!cmp eax,edx ; compare with top limit
!cmovg eax,edx ; if greater, set value to top limit (255)
ProcedureReturn
CompilerElse
If x<0
ProcedureReturn 0
ElseIf x>255
ProcedureReturn 255
Else
ProcedureReturn x
EndIf
CompilerEndIf
EndProcedure
Posted: Tue Apr 14, 2009 6:54 pm
by Trond
Code: Select all
Procedure.l Limit_SSE(x.l)
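!movss xmm0, dword [p.v_x] ; load x into the low dword of xmm0 (upper lanes are zeroed)
!movss xmm1, dword [null] ; xmm1 = 0, used as filler for the pack instructions
!packssdw xmm0, xmm1 ; dwords -> words with signed saturation (-32768..32767)
!packuswb xmm0, xmm1 ; words -> bytes with unsigned saturation (0..255)
!movss dword [p.v_x], xmm0 ; store the clamped value back into x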
ProcedureReturn x
!null dd 0
EndProcedure
By the way, DisableDebugger in the code is not enough, you need to disable it from the menu, since only that turns on the optimizer.
Also, a macro would probably be much faster.
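A rough sketch of what the macro route could look like in plain PureBasic, where the macro body is pasted at every call site and so avoids the procedure call. LimitInline and its two parameters are hypothetical names, not something from the posts above:

```purebasic
; Hypothetical macro: clamps value to 0..255 and stores the result in result.
; The body is expanded inline, so there is no call/return overhead.
Macro LimitInline(result, value)
  result = value
  If result < 0
    result = 0
  ElseIf result > 255
    result = 255
  EndIf
EndMacro

Define n.l

LimitInline(n, 99999)
Debug n               ; 255
LimitInline(n, -5)
Debug n               ; 0
```

Whether this actually beats the procedure versions would have to be measured; the macro also grows the code at each use.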
Posted: Wed Apr 15, 2009 7:41 pm
by Michael Vogel
Trond wrote:By the way, DisableDebugger in the code is not enough, you need to disable it from the menu, since only that turns on the optimizer.
Yes, I did that (all the time); anyhow, I saw no difference in the timings here...
Trond wrote:Also, a macro would probably be much faster.
Next key word for a slow performer...
... interesting point #1: the assembler procedure is a few percent slower than the BASIC version
... interesting (for an assembler newbie like me): what do I have to do to get a macro working? (Even apart from the duplicate label issue, everything else seems to be wrong in this code.)
Michael
Code: Select all
#AllowAssembler=0
Macro AbsI(n)
PUSHF
MOV eax,n
BT eax,31
!JNC short Mskip
NEG eax
!Mskip:
POPF
EndMacro
Procedure.l PAbsI(n.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_n]
!bt eax,31; check if the top bit is set (-ve)
!jnc short skip; if not, just skip the next instruction
!neg eax; negate the number
!skip:
ProcedureReturn
CompilerElse
If n<0
n!-1
n+1
EndIf
ProcedureReturn n
CompilerEndIf
EndProcedure
DisableDebugger
x=ElapsedMilliseconds()
z.l
For i=0 To 9999999;9
z=AbsI(-5)
z=AbsI(0)
z=AbsI(9)
z=AbsI(99999)
Next i
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))
Posted: Thu Apr 16, 2009 8:30 am
by Helle
An example with an ASM-Macro:
Code: Select all
#AllowAssembler=1 ;1=ASM-Macro
!Macro AbsI n, z
! { ;begin macro
;! PUSHF ;why?
! MOV eax,n
! AND eax,eax ;check MSB
! JNS @f
! NEG eax
!@@:
! MOV [v_z],eax
;! POPF
! } ;end macro
Procedure.l PAbsI(n.l)
CompilerIf #AllowAssembler
!mov eax,dword[p.v_n]
!bt eax,31; check if the top bit is set (-ve)
!jnc short skip; if not, just skip the next instruction
!neg eax; negate the number
!skip:
ProcedureReturn
CompilerElse
If n<0
n!-1
n+1
EndIf
ProcedureReturn n
CompilerEndIf
EndProcedure
DisableDebugger
x=ElapsedMilliseconds()
z.l
CompilerIf #AllowAssembler
For i=0 To 9999999;9
!AbsI -5, z
!AbsI 0, z
!AbsI 9, z
!AbsI 99999, z
Next i
CompilerElse
For i=0 To 9999999;9
z=PAbsI(-5)
z=PAbsI(0)
z=PAbsI(9)
z=PAbsI(99999)
Next i
CompilerEndIf
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))
Regards
Helle
Posted: Thu Apr 16, 2009 9:06 am
by Michael Vogel
Thanks Helle,
I also just found some nice code of yours in the German forum, which will help me a lot. I've modified it and am posting it here; it may be useful for others as well.
Code: Select all
; Define ShowAsmInfo
OpenWindow(0,0,0,960,200,"ASM Information")
Global AsmInfoID=TextGadget(0,0,0,960,500,"")
Global AsmInfoText.s
Macro WaitKey
Repeat
Until WaitWindowEvent(10)=#WM_CHAR
EndMacro
Macro ShowAsmInfo(Text)
#DecimalMode=#True
Global VarAX.l ;declare all variables, for tidiness
Global VarBX.l
Global VarCX.l
Global VarDX.l
Global VarDI.l
Global VarSI.l
Global VarBP.l
Global VarSP.l
Global FA.b ;C, P, A, Z, S, D, O - Flags
Global FC.b
Global FD.b
Global FO.b
Global FP.b
Global FS.b
Global FZ.b
!PUSHFD ;save the flags
!PUSHAD ;save all registers
!MOV [v_VarAX],eax ;write all registers to variables
!MOV [v_VarBX],ebx
!MOV [v_VarCX],ecx
!MOV [v_VarDX],edx
!MOV [v_VarDI],edi
!MOV [v_VarSI],esi
!MOV [v_VarBP],ebp
!MOV [v_VarSP],esp
!PUSHFD ;push the 32-bit EFLAGS register onto the stack
!POP eax ;and load it into EAX
!MOV [v_FC],0 ;for the carry flag; clear the variable first
!SHR eax,1 ;shift right; the last bit shifted out lands in the carry flag
!ADC [v_FC],0 ;add the carry flag's value to the variable
!MOV [v_FP],0 ;for the parity flag
!SHR eax,2
!ADC [v_FP],0
!MOV [v_FA],0 ;for the auxiliary flag
!SHR eax,2
!ADC [v_FA],0
!MOV [v_FZ],0 ;for the zero flag
!SHR eax,2
!ADC [v_FZ],0
!MOV [v_FS],0 ;for the sign flag
!SHR eax,1
!ADC [v_FS],0
!MOV [v_FD],0 ;for the direction flag
!SHR eax,3
!ADC [v_FD],0
!MOV [v_FO],0 ;for the overflow flag
!SHR eax,1
!ADC [v_FO],0
Delay(1) ;otherwise it does not work directly after a MessageRequester
;hWnd=GetForegroundWindow_()
CompilerIf #DecimalMode
AsmInfoText+"C="+Str(FC)+" P="+Str(FP)+" A="+Str(FA)+" Z="+Str(FZ)+" S="+Str(FS)+" D="+Str(FD)+" O="+Str(FO)+" EAX="+Str(VarAX)+" EBX="+Str(VarBX)+" ECX="+Str(VarCX)+" EDX="+Str(VarDX)+" EDI="+Str(VarDI)+" ESI="+Str(VarSI)+" EBP="+Str(VarBP)+" ESP="+Str(VarSP)+" ["+Text+"]"+#CRLF$
CompilerElse
AsmInfoText+"C="+Str(FC)+" P="+Str(FP)+" A="+Str(FA)+" Z="+Str(FZ)+" S="+Str(FS)+" D="+Str(FD)+" O="+Str(FO)+" EAX="+Hex(VarAX)+" EBX="+Hex(VarBX)+" ECX="+Hex(VarCX)+" EDX="+Hex(VarDX)+" EDI="+Hex(VarDI)+" ESI="+Hex(VarSI)+" EBP="+Hex(VarBP)+" ESP="+Hex(VarSP)+" ["+Text+"]"+#CRLF$
CompilerEndIf
SetWindowText_(AsmInfoID,@AsmInfoText)
!POPAD ;restore the saved registers
!POPFD ;restore the saved flags
EndMacro
; EndDefine
Global DxMul.l=1000
Global DxMem.l=123
Global x.l=50
Global y.l=100
Global Screenx.l=2000
Global Screeny.l=1000
MOV eax,x
MOV ecx,y
CMP eax,ScreenX ; x > ScreenX ?
JG exitMP32
CMP ecx,ScreenY ; y > ScreenY ?
JG exitMP32
OR eax,ecx ; x OR y < 0?
JS exitMP32
;MOV eax,DxMul ; DxMul
;IMUL ecx ; y*DxMul
IMUL ecx,DxMul ; y*DxMul
; SHL ecx,2 ; x*4
ShowAsmInfo("")
MOV eax,x ; x
ShowAsmInfo("x")
SHL eax,2 ; x*4
ShowAsmInfo("x*4")
ADD eax,ecx ; x*4+y*DxMul
ShowAsmInfo("x*4+y*dxmul")
ADD eax,DxMem ; x*4+y*DxMul+DxMem
ShowAsmInfo("+dxmem")
!exitMP32:
WaitKey
Posted: Thu Apr 16, 2009 9:19 am
by Michael Vogel
Helle wrote:An example with an ASM-Macro [...]
Assembler macros seem to be very restrictive, so it's necessary to use exactly the variable that was used in the macro itself:
Code: Select all
!Macro AbsI n,z
! { ;begin macro
! MOV eax,n
! AND eax,eax ;check MSB
! JNS @f
! NEG eax
!@@:
! MOV [v_z],eax
! } ;end macro
z.l
m.l
!AbsI 99,m
!AbsI -5, z
Debug m
Debug z
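A likely explanation, offered as a sketch rather than a verified fix: inside the macro body `v_z` is a single symbol, so FASM never substitutes the `z` argument and every call writes to the variable z. FASM's `#` operator pastes a macro argument onto a name, which should let the target vary. AbsInto and target are hypothetical names, and the `v_` prefix for main-scope variables is assumed from the code earlier in this thread:

```purebasic
; Assumption: main-scope PureBasic variables are visible to FASM as v_<name>,
; as in the code earlier in the thread. AbsInto and target are hypothetical names.
!macro AbsInto n, target
!{
!  mov eax, n
!  test eax, eax          ; set the sign flag from eax
!  jns @f                 ; non-negative: nothing to do
!  neg eax
!@@:
!  mov [v_#target], eax   ; # pastes the argument onto v_, giving v_m, v_z, ...
!}

Define z.l, m.l

!AbsInto 99, m
!AbsInto -5, z

Debug m
Debug z
```

With this form each call should store into the variable named in its second argument instead of always hitting z.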
Posted: Thu Apr 16, 2009 1:13 pm
by Helle
What's the problem? Your PB-Code was
Another version:
Code: Select all
#AllowAssembler=1 ;1=ASM-Macro
!Macro AbsI n
! { ;begin macro
! MOV eax,n
! AND eax,eax ;check MSB
! JNS @f
! NEG eax
!@@:
! } ;end macro
DisableDebugger
x=ElapsedMilliseconds()
xyz.l
CompilerIf #AllowAssembler
For i=0 To 9999999;9
!AbsI -5
!MOV [v_xyz],eax
!AbsI 0
!MOV [v_xyz],eax
!AbsI 9
!MOV [v_xyz],eax
!AbsI 99999
!MOV [v_xyz],eax
Next i
CompilerElse
;For i=0 To 9999999;9
; z=PAbsI(-5)
; z=PAbsI(0)
; z=PAbsI(9)
; z=PAbsI(99999)
;Next i
CompilerEndIf
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))
Regards
Helle
Posted: Thu Apr 16, 2009 2:59 pm
by dioxin
Probably faster because it avoids the jump, but it uses the edx register as well:
Code: Select all
!mov eax,MyValue
!cdq
!xor eax,edx
!sub eax,edx
'eax now contains ABS(MyValue)
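For reference, the same four instructions wrapped in the procedure pattern used earlier in this thread might look like the sketch below. AbsCdq is a hypothetical name, and the `p.v_n` parameter access assumes the 32-bit PureBasic compiler discussed here:

```purebasic
; Hypothetical wrapper around the CDQ trick; assumes the p.v_<name> parameter
; naming used by the other procedures in this thread (32-bit PureBasic).
Procedure.l AbsCdq(n.l)
  !mov eax,dword[p.v_n]   ; load the argument
  !cdq                    ; edx = 0 if eax >= 0, -1 (all bits set) if eax < 0
  !xor eax,edx            ; flips every bit of eax only when it was negative
  !sub eax,edx            ; subtracting -1 adds 1, completing the two's-complement negate
  ProcedureReturn         ; the result is already in eax
EndProcedure

Define r.l
r = AbsCdq(-5)
Debug r
r = AbsCdq(99999)
Debug r
```

One caveat that applies to every variant in this thread: the most negative 32-bit value (-2147483648) has no positive counterpart, so it comes back unchanged.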