Assembler optimizing - simple questions

Posted: Mon Apr 13, 2009 4:10 pm
by Michael Vogel
I just started speeding up one of my programs, but I'm not sure whether everything I did works correctly...

I'm also unsure whether such code is "thread safe", "water resistant", etc.

This should be ok, I believe...

Code: Select all

Procedure.l Max(a.l, b.l)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_a]
		!mov ecx,dword[p.v_b]
		!cmp eax,ecx
		!jg skip
		!mov eax,ecx
		!skip:
		ProcedureReturn

	CompilerElse

		If a>b
			ProcedureReturn a
		Else
			ProcedureReturn b
		EndIf

	CompilerEndIf

EndProcedure
This also (because I stole it from this forum :wink: )...

Code: Select all

Procedure.l Check(x.l,min.l,max.l)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_x]	; load the value into eax
		!cmp eax,dword[p.v_min]	; compare eax with argument 2 (min)
		!jng l_low					; jump to low if value is less than or equal to min
		!cmp eax,dword[p.v_max]	; compare eax with argument 3 (max)
		!jnl l_high					; jump to high if value is greater than or equal to max
		!jmp l_term				; value lies strictly between min and max, so jump to the end
		!l_low:
		!mov eax,dword[p.v_min]	; value is too low, so move min into eax, which is returned
		!jmp l_term				; jump to terminate
		!l_high:
		!mov eax,dword[p.v_max]	; value is too high, so move max into eax, which is returned
		!l_term:
		ProcedureReturn

	CompilerElse

		If x<min
			ProcedureReturn min
		ElseIf x>max
			ProcedureReturn max
		Else
			ProcedureReturn x
		EndIf

	CompilerEndIf

EndProcedure
But what about this one: how could I optimize it to run faster?

Code: Select all

#AllowAssembler=1

Procedure.l Limit(x.l)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_x]
		!cmp eax,0
		!jng l_low
		!cmp eax,255
		!jnl l_high
		!jmp l_term
		!l_low:
		!mov eax,0
		!jmp l_term
		!l_high:
		!mov eax,255
		!l_term:

		ProcedureReturn

	CompilerElse

		If x<0
			ProcedureReturn 0
		ElseIf x>255
			ProcedureReturn 255
		Else
			ProcedureReturn x
		EndIf

	CompilerEndIf

EndProcedure


Debug limit(-5)
Debug limit(0)
Debug limit(1)
Debug limit(254)
Debug limit(255)
Debug limit(256)
Debug limit(99999)
Please give me a hint,
Michael

Posted: Mon Apr 13, 2009 9:26 pm
by dioxin
Michael Vogel wrote: Please give me a hint
Try using the conditional move instructions, as they'll cut out all the jumps.

Code: Select all

'syntax may need revising to work with PB
mn = 0   'the lower bound
mx = 255 'the upper bound

!mov eax,YourInputNumber
!cmp eax,mx      'compare with top limit
!cmovg eax,mx    'if greater then set value to top limit
!cmp eax,mn      'compare with bottom limit
!cmovl eax,mn    'if lower, set value to bottom limit  

Posted: Tue Apr 14, 2009 7:15 am
by Michael Vogel
dioxin wrote:[...]
!cmovl eax,mn 'if lower, set value to bottom limit
[...]
Thanks,
just two questions...
...are the cmovl and cmovg instructions available on all current CPUs?
...why the hell the assembler code is not faster than the basic source? :evil:

Thanks,
Michael

Code: Select all

#AllowAssembler=0

Procedure.l Limit(x.l)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_x]
		!cmp eax,0			; compare with top limit
		!cmovg eax,0		; if greater then set value to top limit
		!cmp eax,255		; compare with bottom limit
		!cmovl eax,255	; if lower, set value to bottom limit

		ProcedureReturn

	CompilerElse

		If x<0
			ProcedureReturn 0
		ElseIf x>255
			ProcedureReturn 255
		Else
			ProcedureReturn x
		EndIf

	CompilerEndIf

EndProcedure

DisableDebugger
x=ElapsedMilliseconds()
For i=0 To 999999999
	Debug limit(-5)
	Debug limit(0)
	Debug limit(1)
	Debug limit(254)
	Debug limit(255)
	Debug limit(256)
	Debug limit(99999)
Next i
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))

Posted: Tue Apr 14, 2009 7:32 am
by Deeem2031
Michael Vogel wrote: ...why the hell the assembler code is not faster than the basic source? :evil:
No wonder, if you write "DisableDebugger" and put all your test calls after a "Debug" - or did you really want to test an empty For loop? :lol:
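
A minimal sketch of a timing loop that avoids this trap, assuming the plain BASIC Limit() from the thread: each result is stored in a variable, so the calls remain in the program even with the debugger off. The names sum, start and elapsed, the loop count and the input range are illustrative choices, not taken from the posts.

```purebasic
; Sketch only: Limit() is the plain BASIC variant from the thread.
Procedure.l Limit(x.l)
  If x < 0
    ProcedureReturn 0
  ElseIf x > 255
    ProcedureReturn 255
  EndIf
  ProcedureReturn x
EndProcedure

DisableDebugger

Define i.l, sum.l
Define start.q, elapsed.q

start = ElapsedMilliseconds()
For i = 0 To 9999999
  ; the result is added to sum, so the call is ordinary code, not a Debug statement
  sum + Limit(i - 5000000)
Next i
elapsed = ElapsedMilliseconds() - start

; showing sum as well makes it obvious that the loop really did some work
MessageRequester("Timing", Str(elapsed) + " ms (checksum " + Str(sum) + ")")
```

Feeding varying inputs rather than constants also keeps the comparison between the BASIC and the assembler variant a little more honest.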

Posted: Tue Apr 14, 2009 8:00 am
by Michael Vogel
Deeem2031 wrote:
Michael Vogel wrote: ...why the hell the assembler code is not faster than the basic source? :evil:
No wonder, if you write "DisableDebugger" and put all your test calls after a "Debug" - or did you really want to test an empty For loop? :lol:
How embarrassing :oops: I really thought the expression would still be evaluated, just not displayed.

Changing the Debug lines to "n=Limit...." has changed everything: now I get an invalid operand message on the line "!cmovg eax,0" :lol:

Not my day :cry:

Posted: Tue Apr 14, 2009 9:43 am
by Helle
The CMOVcc instructions don't accept immediate values, only registers or variables (memory operands):

Code: Select all

!XOR EDX, EDX   ;set EDX to zero (do this before the CMP, since XOR changes the flags)
!CMOVG EAX,EDX
Regards
Helle

Posted: Tue Apr 14, 2009 11:37 am
by Michael Vogel
Helle wrote: The CMOVcc instructions don't accept immediate values, only registers or variables (memory operands):

Code: Select all

!XOR EDX, EDX   ;set EDX to zero
!CMOVG EAX,EDX
Regards
Helle
Thanks, got it :D

~15% more speed now :wink:

Code: Select all

Procedure.l Limit(x.l)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_x]
		!xor edx,edx		; set EDX to zero
		!cmp eax,0			; compare with bottom limit
		!cmovl eax,edx	; if lower, set value to bottom limit (0)
		!mov edx,255		; EDX = 255 (MOV leaves the flags untouched)
		!cmp eax,edx		; compare with top limit
		!cmovg eax,edx	; if greater, set value to top limit (255)

		ProcedureReturn

	CompilerElse

		If x<0
			ProcedureReturn 0
		ElseIf x>255
			ProcedureReturn 255
		Else
			ProcedureReturn x
		EndIf
	
	CompilerEndIf

EndProcedure

Posted: Tue Apr 14, 2009 6:54 pm
by Trond

Code: Select all

Procedure.l Limit_SSE(x.l)
  !movss xmm0, dword [p.v_x]
  !movss xmm1, dword [null]
  !packssdw xmm0, xmm1
  !packuswb xmm0, xmm1
  !movss dword [p.v_x], xmm0
  
  ProcedureReturn x
  !null dd 0
EndProcedure
By the way, DisableDebugger in the code is not enough; you need to disable the debugger from the menu, since only that turns on the optimizer.

Also, a macro would probably be much faster.
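
As a rough sketch of the macro idea, assuming a plain PureBasic macro rather than assembler: the body is expanded in place at every use, so there is no call and return overhead. LimitVar is a hypothetical name, and unlike the procedure it clamps the variable passed to it instead of returning a value.

```purebasic
; Sketch only: LimitVar is a hypothetical name, not from the thread.
; The macro body is pasted in at each use, so no procedure call is made.
Macro LimitVar(var)
  If var < 0
    var = 0
  ElseIf var > 255
    var = 255
  EndIf
EndMacro

Define a.l = -5
Define b.l = 99999
Define c.l = 42

LimitVar(a)   ; below the range, clamped to the lower bound
LimitVar(b)   ; above the range, clamped to the upper bound
LimitVar(c)   ; already inside 0..255, left unchanged

Debug a
Debug b
Debug c
```

Because the argument is substituted textually, it has to be something that can be assigned to (a variable), not a constant or an expression.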

Posted: Wed Apr 15, 2009 7:41 pm
by Michael Vogel
Trond wrote:By the way, DisableDebugger in the code is not enough, you need to disable it from the menu, since only that turns on the optimizer.
Yes, I did that (all the time :roll:); anyhow, I saw no difference in the timings here...
Trond wrote:Also, a macro would probably be much faster.
Next key word for a slow performer 8)...
... interesting point #1: the assembler procedure is a few percent slower than the BASIC version
... interesting point #2 (for an assembler newbie like me): what do I need to do to get a macro working? (Even apart from the duplicate label issue, everything else seems to be wrong in this code.)

Michael

Code: Select all

#AllowAssembler=0

Macro AbsI(n)

	PUSHF
	MOV eax,n
	BT eax,31
	!JNC short Mskip
	NEG eax
	!Mskip:
	POPF

EndMacro

Procedure.l PAbsI(n.l)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_n]
		!bt eax,31;		check if the top bit is set (-ve)
		!jnc short skip;	if not, just skip the next instruction
		!neg eax;			negate the number
		!skip:
		ProcedureReturn

	CompilerElse

		If n<0
			n!-1
			n+1
		EndIf
		ProcedureReturn n

	CompilerEndIf

EndProcedure

DisableDebugger
x=ElapsedMilliseconds()
z.l
For i=0 To 9999999;9
	z=AbsI(-5)
	z=AbsI(0)
	z=AbsI(9)
	z=AbsI(99999)
Next i
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))

Posted: Thu Apr 16, 2009 8:30 am
by Helle
An example with an ASM-Macro:

Code: Select all

#AllowAssembler=1  ;1=ASM-Macro

!Macro AbsI n, z
! {                ;begin macro 
;!   PUSHF          ;why?
!   MOV eax,n
!   AND eax,eax    ;check MSB
!   JNS @f
!   NEG eax
!@@:
!   MOV [v_z],eax
;!   POPF
! }                ;end macro

Procedure.l PAbsI(n.l) 

   CompilerIf #AllowAssembler 

      !mov eax,dword[p.v_n] 
      !bt eax,31;      check if the top bit is set (-ve) 
      !jnc short skip;   if not, just skip the next instruction 
      !neg eax;         negate the number 
      !skip: 
      ProcedureReturn 

   CompilerElse 

      If n<0 
         n!-1 
         n+1 
      EndIf 
      ProcedureReturn n 

   CompilerEndIf 

EndProcedure 

DisableDebugger 
x=ElapsedMilliseconds() 
z.l 

CompilerIf #AllowAssembler 
  For i=0 To 9999999;9 
   !AbsI -5, z
   !AbsI 0, z
   !AbsI 9, z
   !AbsI 99999, z
  Next i 
 CompilerElse
  For i=0 To 9999999;9 
   z=PAbsI(-5) 
   z=PAbsI(0) 
   z=PAbsI(9) 
   z=PAbsI(99999) 
  Next i 

CompilerEndIf 

x-ElapsedMilliseconds() 
MessageRequester("!",Str(-x)) 
Regards
Helle

Posted: Thu Apr 16, 2009 9:06 am
by Michael Vogel
Thanks Helle,
I also just found some nice code of yours in the German forum, which will help me a lot. I've modified it and am posting it here; it may be useful for others as well :wink:

Code: Select all


; Define ShowAsmInfo

	OpenWindow(0,0,0,960,200,"ASM Information")
	Global AsmInfoID=TextGadget(0,0,0,960,500,"")
	Global AsmInfoText.s


	Macro WaitKey
		Repeat
		Until WaitWindowEvent(10)=#WM_CHAR
	EndMacro

	Macro ShowAsmInfo(Text)

		#DecimalMode=#True

		Global VarAX.l          ;declare all variables, for the sake of tidiness
		Global VarBX.l
		Global VarCX.l
		Global VarDX.l
		Global VarDI.l
		Global VarSI.l
		Global VarBP.l
		Global VarSP.l
		Global FA.b             ;C, P, A, Z, S, D, O - Flags
		Global FC.b
		Global FD.b
		Global FO.b
		Global FP.b
		Global FS.b
		Global FZ.b

		!PUSHFD                 ;save the flags
		!PUSHAD                 ;save all registers
		!MOV [v_VarAX],eax      ;write all registers to variables
		!MOV [v_VarBX],ebx
		!MOV [v_VarCX],ecx
		!MOV [v_VarDX],edx
		!MOV [v_VarDI],edi
		!MOV [v_VarSI],esi
		!MOV [v_VarBP],ebp
		!MOV [v_VarSP],esp

		!PUSHFD                 ;push the 32-bit EFLAGS register onto the stack
		!POP eax                ;and load it into the EAX register

		!MOV [v_FC],0           ;for the carry flag; set the variable to zero first
		!SHR eax,1              ;shift right; the last bit shifted out ends up in the carry flag
		!ADC [v_FC],0           ;add the value of the carry flag to the variable

		!MOV [v_FP],0           ;for the parity flag
		!SHR eax,2
		!ADC [v_FP],0

		!MOV [v_FA],0           ;for the auxiliary flag
		!SHR eax,2
		!ADC [v_FA],0

		!MOV [v_FZ],0           ;for the zero flag
		!SHR eax,2
		!ADC [v_FZ],0

		!MOV [v_FS],0           ;for the sign flag
		!SHR eax,1
		!ADC [v_FS],0

		!MOV [v_FD],0           ;for the direction flag
		!SHR eax,3
		!ADC [v_FD],0

		!MOV [v_FO],0           ;for the overflow flag
		!SHR eax,1
		!ADC [v_FO],0

		Delay(1)                ;otherwise it does not work directly after a MessageRequester

		;hWnd=GetForegroundWindow_()

		CompilerIf #DecimalMode
			AsmInfoText+"C="+Str(FC)+"  P="+Str(FP)+"  A="+Str(FA)+"  Z="+Str(FZ)+"  S="+Str(FS)+"  D="+Str(FD)+"  O="+Str(FO)+"   EAX="+Str(VarAX)+"  EBX="+Str(VarBX)+"  ECX="+Str(VarCX)+"  EDX="+Str(VarDX)+"  EDI="+Str(VarDI)+"  ESI="+Str(VarSI)+"  EBP="+Str(VarBP)+"  ESP="+Str(VarSP)+"     ["+Text+"]"+#CRLF$
		CompilerElse
			AsmInfoText+"C="+Str(FC)+"  P="+Str(FP)+"  A="+Str(FA)+"  Z="+Str(FZ)+"  S="+Str(FS)+"  D="+Str(FD)+"  O="+Str(FO)+"   EAX="+Hex(VarAX)+"  EBX="+Hex(VarBX)+"  ECX="+Hex(VarCX)+"  EDX="+Hex(VarDX)+"  EDI="+Hex(VarDI)+"  ESI="+Hex(VarSI)+"  EBP="+Hex(VarBP)+"  ESP="+Hex(VarSP)+"     ["+Text+"]"+#CRLF$
		CompilerEndIf
		SetWindowText_(AsmInfoID,@AsmInfoText)

		!POPAD                  ;restore the saved registers
		!POPFD                  ;restore the saved flags

	EndMacro

; EndDefine

Global DxMul.l=1000
Global DxMem.l=123

Global x.l=50
Global y.l=100
Global Screenx.l=2000
Global Screeny.l=1000

MOV eax,x
MOV ecx,y
CMP eax,ScreenX		; x > ScreenX ?
JG exitMP32
CMP ecx,ScreenY		; y > ScreenY ?
JG exitMP32
OR eax,ecx			; x OR y < 0?
JS exitMP32

;MOV eax,DxMul		; DxMul
;IMUL ecx				; y*DxMul
IMUL ecx,DxMul		; y*DxMul

;		SHL ecx,2				; x*4
ShowAsmInfo("")
MOV eax,x				; x
ShowAsmInfo("x")
SHL eax,2				; x*4
ShowAsmInfo("x*4")
ADD eax,ecx			; x*4+y*DxMul
ShowAsmInfo("x*4+y*dxmul")
ADD eax,DxMem			; x*4+y*DxMul+DxMem
ShowAsmInfo("+dxmem")
!exitMP32:

WaitKey

Posted: Thu Apr 16, 2009 9:19 am
by Michael Vogel
Helle wrote:An example with an ASM-Macro [...]
Assembler macros seem to be very restrictive: you have to use exactly the variable that is hard-coded in the macro itself. In the code below, both calls store their result in z, and m is never written:

Code: Select all

!Macro AbsI n,z
! {                ;begin macro
!   MOV eax,n
!   AND eax,eax    ;check MSB
!   JNS @f
!   NEG eax
!@@:
!   MOV [v_z],eax
! }                ;end macro

z.l
m.l

!AbsI 99,m
!AbsI -5, z

Debug m
Debug z
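
One possible way around this, sketched under two assumptions: FASM's "#" operator, which concatenates names inside a macro body, and PureBasic's v_ prefix for main-scope variables as used elsewhere in the thread. The parameter name dest is an illustrative choice. Written as v_z, the name is a single symbol, so the macro parameter never gets substituted into it; v_#dest builds the symbol from the argument instead.

```purebasic
; Sketch only: relies on FASM's "#" concatenation inside macros and on
; PureBasic naming main-scope variables v_<name>. "dest" is a hypothetical
; parameter name, not from the thread.
!macro AbsI n, dest
! {
!   mov eax, n
!   test eax, eax        ; set the sign flag from eax
!   jns @f               ; not negative: skip the negate
!   neg eax
!@@:
!   mov [v_#dest], eax   ; v_ and the argument are joined into one symbol
! }

z.l
m.l

!AbsI 99, m              ; intended to store into m
!AbsI -5, z              ; intended to store into z

Debug m
Debug z
```

This only works for variables declared in the main scope; locals inside a procedure are addressed differently (p.v_ in the thread's examples).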

Posted: Thu Apr 16, 2009 1:13 pm
by Helle
What's the problem :D ? Your PB code was

Code: Select all

 z=PAbsI(-5)
...
Another version:

Code: Select all

#AllowAssembler=1  ;1=ASM-Macro

!Macro AbsI n
! {                ;begin macro 
!   MOV eax,n
!   AND eax,eax    ;check MSB
!   JNS @f
!   NEG eax
!@@:
! }                ;end macro

DisableDebugger 
x=ElapsedMilliseconds() 
xyz.l 

CompilerIf #AllowAssembler 
  For i=0 To 9999999;9 
   !AbsI -5
   !MOV [v_xyz],eax
   !AbsI 0
   !MOV [v_xyz],eax  
   !AbsI 9
   !MOV [v_xyz],eax  
   !AbsI 99999
   !MOV [v_xyz],eax  
  Next i 
 CompilerElse
  ;For i=0 To 9999999;9 
  ; z=PAbsI(-5) 
  ; z=PAbsI(0) 
  ; z=PAbsI(9) 
  ; z=PAbsI(99999) 
  ;Next i 

CompilerEndIf 

x-ElapsedMilliseconds() 
MessageRequester("!",Str(-x)) 
Regards
Helle

Posted: Thu Apr 16, 2009 2:59 pm
by dioxin
This is probably faster because it avoids the jump, but it uses the EDX register as well:

Code: Select all

!mov eax,MyValue
!cdq
!xor eax,edx
!sub eax,edx
'eax now contains ABS(MyValue)
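
For readers who want to see why this works, here is a sketch of the same sign-mask arithmetic in plain PureBasic. BranchlessAbs and mask are hypothetical names. CDQ fills EDX with copies of EAX's sign bit, which gives 0 for non-negative values and -1 for negative ones; the arithmetic shift below produces the same mask. XOR with -1 flips all bits, and subtracting -1 adds one, which together is the two's complement negation.

```purebasic
; Sketch only: BranchlessAbs is a hypothetical name, not from the thread.
Procedure.l BranchlessAbs(n.l)
  ; ">>" is an arithmetic shift in PureBasic, so the sign bit is replicated:
  ; mask is -1 for negative n and 0 otherwise (the value CDQ leaves in EDX)
  Protected mask.l = n >> 31
  ; "!" is XOR in PureBasic: (n XOR mask) - mask
  ProcedureReturn (n ! mask) - mask
EndProcedure

Debug BranchlessAbs(-5)
Debug BranchlessAbs(0)
Debug BranchlessAbs(99999)
```

As with the assembler version, the most negative 32-bit value has no positive counterpart and comes back unchanged.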