Assembler optimizing - expert section

Just starting out? Need help? Post your questions and find answers here.
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Assembler optimizing - expert section

Post by Michael Vogel »

This is really a complicate point, but maybe there's an assembler expert who can give me an idea what would be the best way to solve that...

I wrote a small tool to view tracks in 3D which works fine quite a while, but I think now about optimizing the pixel set routine which is called millions of time from an antialiasing line routine. In fact, there are more different functions to set a pixel because of different possible color depths...

Code: Select all

; DxFunction gets address of pixel function corresponding to the screen depth...
; e.g. DxFunction=@MergePixel**(); **=16, 24 or 32
And this is the part which will be done very often:

Code: Select all

If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY
CallFunctionFast(DxFunction,x1,y1,r,g,b,255)
EndIf
Here are the function which set a pixel (*DX and DxMem are defined globally)

Code: Select all

Procedure MergePixel32(x,y,r,g,b,w)
	;If x>=0 And x<ScreenX And y>=0 And y<ScreenY
	*DX=DxMem+x<<2+y*DxMul
	*DX\l=ColMix(*DX\l,b,g,r,w)
	;EndIf
EndProcedure
Procedure MergePixel24(x,y,r,g,b,w)
	;If x>=0 And x<ScreenX And y>=0 And y<ScreenY
	*DX=DxMem+x*3+y*DxMul
	*DX\l=ColMix(*DX\l,b,g,r,w)|(*DX\l&$FF000000)
	;EndIf
EndProcedure
Procedure MergePixel16(x,y,r,g,b,w)
	;If x>=0 And x<ScreenX And y>=0 And y<ScreenY
	*DX=DxMem+x<<1+y*DxMul
	*DX\w=ColMurx(*DX\w,r,g,b,w)
	;EndIf
EndProcedure
And here are the used macros...

Code: Select all

Macro ColRGB(r,g,b)
	( r&$ff + (g&$ff)<<8 + (b&$ff)<<16 )
EndMacro
Macro ColRed(col)
	(col&$ff)
EndMacro
Macro ColGreen(col)
	(col>>8&$ff)
EndMacro
Macro ColBlue(col)
	(col>>16&$ff)
EndMacro
Macro ColMix(c,r,g,b,w)
	; 24/32 Bit-Pixel: bbbbbbbb|gggggggg|rrrrrrrr
	( (r*w+ColRed(c)*(255-w))>>8 + (g*w+ColGreen(c)*(255-w))&$ff00 + (b*w+ColBlue(c)*(255-w))<<8&$ff0000 )
EndMacro
Macro ColMurx(c,r,g,b,w)
	; 16 Bit-Pixel:  rrrr|rggg|ggxb|bbbb    (wobei "x" bei grün verwendet wird)
	( (r*w+(c>>8&$f8)*(255-w))&$f800 + ((g*w+(c>>3&$f8)*(255-w))&$f800)>>5 + (b*w+(c<<3&$f8)*(255-w))>>11 )
EndMacro
And my questions, of course...
• where should be started converting to assembler code?
• would it be effective to try to get all macros converted to assembler?
• is there an easy way to speed up the needed check if x|y is within the screen?
• any other ideas?

Thanks for all tips,
Michael
dioxin
User
User
Posts: 97
Joined: Thu May 11, 2006 9:53 pm

Post by dioxin »

Code: Select all

If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY 
CallFunctionFast(DxFunction,x1,y1,r,g,b,255) 
EndIf
Untested but should show you one way:

Code: Select all

!mov eax,xCoOrdinate    'get x
!mov ecx,yCoOrdinate    'get y
!cmp eax,ScreenSizeX    'is x greater than the screen x size?
!jg ExitIf              'yes, then exit
!cmp ecx,ScreenSizeY    'is y greater than the screen y size?
!jg ExitIf              'yes, then exit
!or eax,ecx             'quick check for <0, OR x and y and if the sign bit is set then one of them was negative
!js ExitIf              'either x or y was negative so the point if off the screen, exit.
'If it gets here then the point is on the screen
CallFunctionFast(DxFunction,x1,y1,r,g,b,255)
ExitIf:
Your code could be made a lot faster if written in ASM. All parts of it look suitable for conversion.
You should start with the most used parts first as that's where there'll be most time to gain.
freak
PureBasic Team
PureBasic Team
Posts: 5940
Joined: Fri Apr 25, 2003 5:21 pm
Location: Germany

Post by freak »

Don't use CallFunctionFast(), use a prototype instead. Its one less level of jumping/indirection in the call.
quidquid Latine dictum sit altum videtur
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Post by Michael Vogel »

dioxin wrote:Untested but should show you one way: [...]
Thanks, dioxin, thats (I believe) the first time, where an assembler code is definitely much faster than the PB original code! In my selfmade attempts I got only results which needed the same time as before :cry:

Code: Select all

#AllowAssembler=0

Global ScreenX=GetSystemMetrics_(#SM_CXSCREEN)
Global ScreenY=GetSystemMetrics_(#SM_CYSCREEN)

Procedure AntiLineXY(x1,y1,x2,y2,col)
	CompilerIf #AllowAssembler
		!mov eax,dword[p.v_x1]
		!mov ecx,dword[p.v_y1]
		!cmp eax,v_ScreenX		; x > ScreenX ?
		!jg exit1
		!cmp ecx,v_ScreenY		; y > SreenY ?
		!jg exit1
		!or eax,ecx				; x OR y < 0?
		!jg exit1
		; #Null
		!exit1:
	CompilerElse
		If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY
			; #Null
		EndIf
	CompilerEndIf
EndProcedure

DisableDebugger
RandomSeed(0)
x=ElapsedMilliseconds()
z.l
For i=0 To 99999999;9
	a=Random(ScreenX+ScreenY)
	b=Random(ScreenX)
	AntiLineXY(a,b,0,0,0)
	AntiLineXY(a,b,0,0,0)
	AntiLineXY(ScreenX-a,b-100,0,0)
	AntiLineXY(ScreenX-a,b-100,0,0,0)
Next i
x-ElapsedMilliseconds()
MessageRequester("!",Str(-x))
The assembler version needs around 15s here, the basic code 18s !

Just an additional question - when I have to use dword[p.v_var] and when [v_var] only?

Thanks,
Michael


EDITED
It's fast, but I'll get wrong results... it seems, that the or eac,ecx trick does not work :cry:

Code: Select all

#AllowAssembler=1

Global ScreenX=GetSystemMetrics_(#SM_CXSCREEN)
Global ScreenY=GetSystemMetrics_(#SM_CYSCREEN)

Procedure AntiLineXY(x1,y1,x2,y2,col)
	CompilerIf #AllowAssembler
		!mov eax,dword[p.v_x1]
		!mov ecx,dword[p.v_y1]
		!cmp eax,v_ScreenX		; x > ScreenX ?
		!jg exit1
		!cmp ecx,v_ScreenY		; y > SreenY ?
		!jg exit1
		!or eax,ecx				; x OR y < 0?
		!jg exit1
		Debug "Plot"
		!exit1:
	CompilerElse
		If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY
			Debug "Plot"
		EndIf
	CompilerEndIf
EndProcedure

AntiLineXY(20,20,0,0,0)
Last edited by Michael Vogel on Tue Apr 14, 2009 7:54 am, edited 1 time in total.
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Post by Michael Vogel »

freak wrote:Don't use CallFunctionFast(), use a prototype instead. Its one less level of jumping/indirection in the call.
Thanks freak for the hint, I will have a search in the forum to get some more infos...

Michael
Helle
Enthusiast
Enthusiast
Posts: 178
Joined: Wed Apr 12, 2006 7:59 pm
Location: Germany
Contact:

Post by Helle »

With the OR-instruction use JS (test if MSB is set). The MSB ist after OR =1, if one or boths operands are negativ:

Code: Select all

#AllowAssembler=1 

Global ScreenX=GetSystemMetrics_(#SM_CXSCREEN) 
Global ScreenY=GetSystemMetrics_(#SM_CYSCREEN) 

Procedure AntiLineXY(x1,y1,x2,y2,col) 
   CompilerIf #AllowAssembler 
      !mov eax,dword[p.v_x1] 
      !mov ecx,dword[p.v_y1] 
      !cmp eax,v_ScreenX      ; x > ScreenX ? 
      !jg exit1 
      !cmp ecx,v_ScreenY      ; y > SreenY ? 
      !jg exit1 
      !or eax,ecx            ; x OR y < 0? 
      !js exit1 
      Debug "Plot" 
      !exit1: 
   CompilerElse 
      If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY 
         Debug "Plot" 
      EndIf 
   CompilerEndIf 
EndProcedure 

AntiLineXY(20,20,0,0,0)
Gruss
Helle
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

Here's a code to visually see optim. Red line-> Asm; blue line-> PB. The lower the slower. (don't forget DirectX9 in compiler options)

Code: Select all

;*****************************************************************************
;* Rasterline and VBL Sync
;* By djes (djes@free.fr) 03/24/2009
;* Thanx to Stefan Moebius for VBL sync examples
;* Note : I do not check functions return values : I know it's bad!!!
;*****************************************************************************

Structure D3DRASTER_STATUS
  InVBlank.l
  ScanLine.l
EndStructure

;*****************************************************************************

Define.D3DRASTER_STATUS raster
Define.IDirect3DDevice9 D3Ddevice_interface

;*****************************************************************************

Global ScreenX=1024;GetSystemMetrics_(#SM_CXSCREEN)
Global ScreenY=768;GetSystemMetrics_(#SM_CYSCREEN)

Procedure AntiLineXY_asm(x1,y1,x2,y2,col)
  !mov eax,dword[p.v_x1]
  !mov ecx,dword[p.v_y1]
  !cmp eax,v_ScreenX      ; x > ScreenX ?
  !jg exit1
  !cmp ecx,v_ScreenY      ; y > SreenY ?
  !jg exit1
  !or eax,ecx            ; x OR y < 0?
  !js exit1
     Plot(x1, y1, col)
  !exit1:
EndProcedure

Procedure AntiLineXY(x1,y1,x2,y2,col)
  If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY
     Plot(x1, y1, col)
  EndIf
EndProcedure

;*****************************************************************************
;wait for the vblank to start - takes all the cpu
;attend le début de la synchro vbl
Procedure vblank_wait()
  Shared D3Ddevice_interface, raster, exit
  Repeat
    ExamineKeyboard()
    If KeyboardPushed(#PB_Key_Escape)
      exit=#True
    EndIf
    D3Ddevice_interface\GetRasterStatus(0, @raster)
  Until raster\InVBlank=#True Or exit
EndProcedure

;*****************************************************************************
;wait for the vblank to finish - takes all the cpu
;attend la fin de la synchro vbl
Procedure vblank_end_wait()
  Shared D3Ddevice_interface, raster, exit
  Repeat
    ExamineKeyboard()
    If KeyboardPushed(#PB_Key_Escape)
      exit=#True
    EndIf
    D3Ddevice_interface\GetRasterStatus(0, @raster)
  Until raster\InVBlank=#False Or exit
EndProcedure

;*****************************************************************************
;look for the raster pos and draw a coloured line
;récupère la position du raster et y affiche une ligne de la couleur indiquée
Procedure rasterline(color.l)
  Shared raster, D3Ddevice_interface
  StartDrawing(ScreenOutput())
  D3Ddevice_interface\GetRasterStatus(0, @raster)
  LineXY(0, raster\ScanLine, 1023, raster\ScanLine, color)
  StopDrawing()
EndProcedure

;*****************************************************************************
;Open screen and get the device - Ouvre l'écran et récupère le device
Procedure init_display()
   Shared D3Ddevice_interface
   OpenScreen(1024, 768, 32, "")
   !extrn _PB_Screen_Direct3DDevice
   !MOV dword EAX, [_PB_Screen_Direct3DDevice]
   !MOV dword [v_D3Ddevice_interface],EAX
EndProcedure

;*****************************************************************************
;Check if the user has switched (and that we have lost focus) (ALT-TAB)
;Vérifie si l'utilisateur n'a pas changé d'appli (en nous faisant donc perdre le focus) (ALT-TAB)
Procedure check_lostfocus()
 
  If IsScreenActive() = 0

    ReleaseMouse(1)
   
    ;we're lowering our priority - on baisse notre priorité
    SetPriorityClass_(  GetCurrentProcess_(), #NORMAL_PRIORITY_CLASS)
    SetThreadPriority_( GetCurrentThread_() , #THREAD_PRIORITY_NORMAL)
   
    ;Wait til our window is coming back to foreground - Attend que notre fenêtre repasse au premier plan
    Repeat
      ;now our events are to be processed with WaitWindowEvent, else IsScreenActive() will never say that our window has the focus again
      ;it should be written in the doc!
      ;maintenant les événements sont à gérer avec WaitWindowEvent, sinon IsScreenActive() ne dit jamais que notre fenêtre a à nouveau le focus
      ;il faudrait mettre ça dans la doc!!!!
      WaitWindowEvent()   
    Until IsScreenActive() <> 0
   
    ReleaseMouse(0)
    ;Better recreate the screen - il vaut mieux recréer l'écran
    CloseScreen()
    init_display()
    ;and the sprites too (have to!) - et les sprites aussi (indispensable)
    ;give to the system some time to rest - laisse un peu le temps au système de récupérer
    Delay(2000)
   
    ;We're waiting for the synchro a new time - On réattend la synchro
    vblank_wait()
    SetPriorityClass_(  GetCurrentProcess_(), #HIGH_PRIORITY_CLASS)
    SetThreadPriority_( GetCurrentThread_() , #THREAD_PRIORITY_ABOVE_NORMAL)

  EndIf

EndProcedure

;*****************************************************************************
;-                                START
;*****************************************************************************

InitSprite()
InitKeyboard()
InitMouse()

init_display()

exit=#False

;we're giving max priority to our process - on donne une priorité maximale à notre processus
SetPriorityClass_(  GetCurrentProcess_(), #REALTIME_PRIORITY_CLASS)
SetThreadPriority_( GetCurrentThread_() , #THREAD_PRIORITY_TIME_CRITICAL)     ;warning : keyboard lock

;*****************************************************************************
;-                                MAIN LOOP
;*****************************************************************************

Repeat

  ;wait for the sync - attends la synchro
  vblank_wait()

  ;flip the buffers without the PB's sync - flippe le buffer sans utiliser la synchro PB
  FlipBuffers(#PB_Screen_NoSynchronization)

  ;during VBL, lower a bit our priority to give time to OS (especially to handle the keyboard)
  ;pendant la VBL, baisse un peu notre priorité pour donner un peu de temps au système (pour gérer le clavier surtout)
  SetPriorityClass_(  GetCurrentProcess_(), #HIGH_PRIORITY_CLASS)
  SetThreadPriority_( GetCurrentThread_() , #THREAD_PRIORITY_NORMAL)

  ExamineKeyboard()
  ExamineMouse()
 
  If KeyboardPushed(#PB_Key_Escape)
    exit=#True
  EndIf

  check_lostfocus()

  ;we're waiting for the VBL end (could be better!!!)
  ;on attend la fin de la VBL (à améliorer) pour être sûr de ne pas sauter une frame
  vblank_end_wait()

  ;we're giving max priority to our process - on donne une priorité maximale à notre processus
  SetPriorityClass_(  GetCurrentProcess_(), #REALTIME_PRIORITY_CLASS)
  SetThreadPriority_( GetCurrentThread_() , #THREAD_PRIORITY_TIME_CRITICAL)     ;warning : keyboard lock

  ClearScreen(0)

  ;--- CODE TO BENCHMARK

  db=1-db

  If db=0
 
    StartDrawing(ScreenOutput())
    For i = 0 To 150000
      AntiLineXY_asm(1, 1, 10, 767, RGB($FF, $FF, $FF))
    Next i
    StopDrawing()

    ;draws a rasterline to see (visually) how much time our fx is taking
    ;affiche une rasterline pour visualiser combien de temps prend notre effet
    rasterline($FF0000)

  Else
  
    StartDrawing(ScreenOutput())
    For i = 0 To 150000
      AntiLineXY(1, 1, 10, 767, RGB($FF, $FF, $FF))
    Next i
    StopDrawing()

    ;draws a rasterline to see (visually) how much time our fx is taking
    ;affiche une rasterline pour visualiser combien de temps prend notre effet
    rasterline($0000FF)

  EndIf

Until exit

CloseScreen()

End 
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Post by Michael Vogel »

freak wrote:Don't use CallFunctionFast(), use a prototype instead. Its one less level of jumping/indirection in the call.
Thanks freak , learned again something new...

...here are some parts of the new code (which seems to run a tick faster now as well :wink: -- using remi meiers analyzer doesnt work when the assembler routines are active, so I do not have exact values)

Code: Select all

Prototype MergePixelType(x,y,r,g,b,w)
Global MergePixelRoutine.MergePixelType
:
Procedure MergePixel16(x,y,r,g,b,w)
	CompilerIf #AllowAssembler
		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX		; x > ScreenX ?
		JG exitMP16
		CMP ecx,ScreenY		; y > SreenY ?
		JG exitMP16
		OR eax,ecx		; x OR y < 0?
		JS exitMP16
		
		MOV eax,DxMul		; DxMul
		IMUL ecx			; y*DxMul
		MOV ecx,x			; x
		SHL ecx,1			; x*2
		ADD eax,ecx		; x*2+y*DxMul
		MOV x,eax			; -> x

		*DX=DxMem+x		; x<<1+y*DxMul
		*DX\w=ColMurx(*DX\w,r,g,b,w)
		!exitMP16:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x<<1+y*DxMul
		*DX\w=ColMurx(*DX\w,r,g,b,w)
		; Plot(x,y,ColMix(Point(x,y),r,g,b,w))
		; SetPixel_(hdc,x,y,col)
		; EndIf
	CompilerEndIf
EndProcedure
:
If DxPix>=#PB_PixelFormat_32Bits_RGB
	MergePixelRoutine=@MergePixel32()
ElseIf DxPix>=#PB_PixelFormat_24Bits_RGB
	MergePixelRoutine=@MergePixel24()
ElseIf DxPix>=#PB_PixelFormat_15Bits
	MergePixelRoutine=@MergePixel16()
Else
	OptAntialiasing=0
EndIf

:
MergePixelRoutine(x1,y1,r,g,b,255)
:
MergePixelRoutine(x2,y2,r,g,b,255)
:

The only part (beside writing optimized assembler code, but this will need some time) which could be done for a speed up seems to be the macro CallMurx, but that's heavy stuff...

Code: Select all

Macro ColMurx(c,r,g,b,w,iw)
	;
	; 16 Bit-Pixel:  rrrr|rggg|ggxb|bbbb    (wobei "x" bei grün verwendet wird)
	( (r*w+(c>>8&$f8)*(iw))&$f800 + ((g*w+(c>>3&$f8)*(iw))&$f800)>>5 + (b*w+(c<<3&$f8)*(iw))>>11 )
EndMacro
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Post by Michael Vogel »

Hurray,
my first routine works! It's for setting a (antialized) pixel in 32 bit color depth and speeds up a line function by around 50%...

The code is not perfect (I'm doing my first steps), but 50% is better than nothing :lol:

Michael

Code: Select all

Procedure MergePixel32(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX		; x > ScreenX ?
		JG exitMP32
		CMP ecx,ScreenY		; y > SreenY ?
		JG exitMP32
		OR eax,ecx			; x OR y < 0?
		JS exitMP32

		;*DX=DxMem+x<<2+y*DxMul
		IMUL ecx,DxMul		; y*DxMul
		MOV eax,x				; x
		SHL eax,2				; x*4
		ADD eax,ecx			; x*4+y*DxMul
		ADD eax,DxMem		; DxMem+x<<2+y*DxMul
		MOV *DX,eax			; -> *DX

		;*DX\l=ColMix(*DX\l,b,g,r,w,255-w)
		MOV ebx,*DX\l		; color
		MOV edx,255	; 255
		MOV ecx,w		; w (ecx)
		SUB edx,ecx	; 255-w (edx)

		MOV eax,ebx	; color
		AND eax,$ff	; cr=Red(color)
		MUL dl			; cr*(255-w)

		MOV edi,eax	; parken
		MOV eax,r		; r
		MUL cl			; r*w
		ADD eax,edi	; r*w+cr*(255-w)
		SAR eax,8		; [r*w+cr*(255-w)] >>8

		MOV esi,eax	; rot
		

		MOV eax,ebx	; color
		SAR eax,8		; >>8
		AND eax,$ff	; cg=Green(color)
		MUL dl			; cr*(255-w)

		MOV edi,eax	; parken
		MOV eax,g		; g
		MUL cl			; g*w
		ADD eax,edi	; g*w+cg*(255-w)
		AND eax,$ff00	; [g*w+cg*(255-w)] &$ff00

		ADD esi,eax;	grün|rot


		MOV eax,ebx	; color
		SAR eax,16	; >>16
		AND eax,$ff	; cb=Blue(color)
		MUL dl			; cb*(255-w)

		MOV edi,eax	; parken
		MOV eax,b		; b
		MUL cl			; b*w
		ADD eax,edi	; b*w+cb*(255-w)
		AND eax,$ff00	; [b*w+cb*(255-w)] &$ff00
		SAL eax,8

		ADD esi,eax;	blau|grün|rot

		MOV *DX\l,esi
		
		!exitMP32:

	CompilerElse

		If x>=0 And x<ScreenX And y>=0 And y<ScreenY
			*DX=DxMem+x<<2+y*DxMul
			*DX\l=ColMix(*DX\l,b,g,r,w,255-w)
		EndIf

	CompilerEndIf

EndProcedure
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

:!: What is an "antialiased point" :?:

Could you post some complete code so that we can test and help?
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Post by Michael Vogel »

djes wrote::!: What is an "antialiased point" :?:

Could you post some complete code so that we can test and help?
Here we are - I use the point function to adapt the color of one single point of a antialiasing line...

The MergePoint16 and MergePoint24 routines still have errors inside, working on that :?

Michael

Code: Select all

EnableExplicit

#AllowAssembler=1

Global OptAntialiasing=#True
Global Directmemory=#True

Global Dim AntiA(255)
Global Dim AntiB(255)

Global ScreenX=GetSystemMetrics_(#SM_CXSCREEN)
Global ScreenY=GetSystemMetrics_(#SM_CYSCREEN)

Structure DxStructure
	StructureUnion
		l.l
		w.w
	EndStructureUnion
EndStructure

Global DxMem,DxMul,DxPix;		Plot-Beschleunigung...
Global *DX.DxStructure

Declare MergePixel32(x,y,r,g,b,w)
Declare MergePixel24(x,y,r,g,b,w)
Declare MergePixel16(x,y,r,g,b,w)
Declare MergePixel(x,y,r,g,b,w)

Prototype MergePixelType(x,y,r,g,b,w)
Global MergePixelRoutine.MergePixelType

Macro ColRGB(r,g,b)
	( r&$ff + (g&$ff)<<8 + (b&$ff)<<16 )
EndMacro
Macro ColRed(col)
	(col&$ff)
EndMacro
Macro ColGreen(col)
	(col>>8&$ff)
EndMacro
Macro ColBlue(col)
	(col>>16&$ff)
EndMacro
Macro ColMix(c,r,g,b,w,iw)
	;
	; 24/32 Bit-Pixel: bbbbbbbb|gggggggg|rrrrrrrr
	( (r*w+ColRed(c)*(iw))>>8 + (g*w+ColGreen(c)*(iw))&$ff00 + (b*w+ColBlue(c)*(iw))<<8&$ff0000 )
	;( (r*w+ColRed(c)*(255-w))>>8 + (g*w+ColGreen(c)*(255-w))&$ff00 + (b*w+ColBlue(c)*(255-w))<<8&$ff0000 )
	;
	; nach  [w*a+(255-w)*b]/256 = [w*(a-b)+255*b]/256 ~ [w+(a-b)]/256 + b  sparen wir eine Multiplikation...
	;( (w*(r-ColRed(c)))>>8 + (w*(g-ColGreen(c)))&$ff00 + (w*(b-ColBlue(c)))<<8&$ff0000 + c)
EndMacro
Macro ColMurx(c,r,g,b,w,iw)
	;
	; 16 Bit-Pixel:  rrrr|rggg|ggxb|bbbb    (wobei "x" bei grün verwendet wird)
	( (r*w+(c>>8&$f8)*(iw))&$f800 + ((g*w+(c>>3&$f8)*(iw))&$f800)>>5 + (b*w+(c<<3&$f8)*(iw))>>11 )
	;( (r*w+(c>>8&$f8)*(255-w))&$f800 + ((g*w+(c>>3&$f8)*(255-w))&$f800)>>5 + (b*w+(c<<3&$f8)*(255-w))>>11 )
	;
	; nach  [w*a+(255-w)*b]/256 = [w*(a-b)+255*b]/256 ~ [w+(a-b)]/256 + b  sparen wir eine Multiplikation...
	;( (w*(r-c>>8&$f8)+c)&$f800 + ((w*(g-c>>3&$f8))>>5+c)&$7c0 + (w*(b-c<<3&$f8))>>11 +c&$1f )
EndMacro
Procedure MergePixel32(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX		; x > ScreenX ?
		JG exitMP32
		CMP ecx,ScreenY		; y > SreenY ?
		JG exitMP32
		OR eax,ecx			; x OR y < 0?
		JS exitMP32

		;*DX=DxMem+x<<2+y*DxMul
		IMUL ecx,DxMul		; y*DxMul
		MOV eax,x				; x
		SHL eax,2				; x*4
		ADD eax,ecx			; x*4+y*DxMul
		ADD eax,DxMem		; DxMem+x<<2+y*DxMul
		MOV *DX,eax			; -> *DX

		;*DX\l=ColMix(*DX\l,b,g,r,w,255-w)
		MOV ebx,*DX\l		; color
		MOV edx,255	; 255
		MOV ecx,w		; w (ecx)
		SUB edx,ecx	; 255-w (edx)

		MOV eax,ebx	; color
		AND eax,$ff	; cr=Red(color)
		MUL dl			; cr*(255-w)

		MOV edi,eax	; parken
		MOV eax,r		; r
		MUL cl			; r*w
		ADD eax,edi	; r*w+cr*(255-w)
		SAR eax,8		; [r*w+cr*(255-w)] >>8

		MOV esi,eax	; rot


		MOV eax,ebx	; color
		SAR eax,8		; >>8
		AND eax,$ff	; cg=Green(color)
		MUL dl			; cr*(255-w)

		MOV edi,eax	; parken
		MOV eax,g		; g
		MUL cl			; g*w
		ADD eax,edi	; g*w+cg*(255-w)
		AND eax,$ff00	; [g*w+cg*(255-w)] &$ff00

		ADD esi,eax;	grün|rot


		MOV eax,ebx	; color
		SAR eax,16	; >>16
		AND eax,$ff	; cb=Blue(color)
		MUL dl			; cb*(255-w)

		MOV edi,eax	; parken
		MOV eax,b		; b
		MUL cl			; b*w
		ADD eax,edi	; b*w+cb*(255-w)
		AND eax,$ff00	; [b*w+cb*(255-w)] &$ff00
		SAL eax,8

		ADD esi,eax;	blau|grün|rot

		MOV *DX\l,esi

		!exitMP32:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x<<2+y*DxMul
		*DX\l=ColMix(*DX\l,b,g,r,w,255-w)
		; EndIf

	CompilerEndIf

EndProcedure
Procedure MergePixel24(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX		; x > ScreenX ?
		JG exitMP24
		CMP ecx,ScreenY		; y > SreenY ?
		JG exitMP24
		OR eax,ecx			; x OR y < 0?
		JS exitMP24

		IMUL eax,DxMul		; y*DxMul
		;MUL ecx,3
		;ADD eax,ecx			; x*3+y*DxMul
		ADD eax,ecx			; y*DxMul+x
		ADD eax,ecx			; y*DxMul+2*x
		ADD eax,ecx			; y*DxMul+3*x
		MOV x,eax				; -> x

		*DX=DxMem+x		; anstelle von x*3+y*DxMul
		*DX\l=ColMix(*DX\l,b,g,r,w,255-w)|(*DX\l&$FF000000)
		!exitMP24:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x*3+y*DxMul
		*DX\l=ColMix(*DX\l,b,g,r,w,255-w)|(*DX\l&$FF000000)
		; EndIf

	CompilerEndIf

EndProcedure
Procedure MergePixel16(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX		; x > ScreenX ?
		JG exitMP16
		CMP ecx,ScreenY		; y > SreenY ?
		JG exitMP16
		OR eax,ecx			; x OR y < 0?
		JS exitMP16

		;*DX=DxMem+x<<1+y*DxMul
		IMUL ecx,DxMul		; y*DxMul
		MOV eax,x				; x
		SHL eax,1				; x*2
		ADD eax,ecx			; x*2+y*DxMul
		ADD eax,DxMem		; DxMem+x<<1+y*DxMul
		MOV *DX,eax			; -> *DX

		;*DX\w=ColMurx(*DX\w,r,g,b,w,255-w)
		MOV bx,*DX\w	; color
		MOV edx,255	; 255
		MOV ecx,w		; w (ecx)
		SUB edx,ecx	; 255-w (edx)

		MOV eax,ebx	; color
		SAR eax,8		; <<8
		AND eax,$f8	; cr=Red(color)
		MUL dl			; cr*(255-w)

		MOV edi,eax	; parken
		MOV eax,r		; r
		MUL cl			; r*w
		ADD eax,edi	; r*w+cr*(255-w)
		AND eax,$f800

		MOV esi,eax	; rot


		MOV eax,ebx	; color
		SAR eax,3		; >>3
		AND eax,$f8	; cg=Green(color)
		MUL dl			; cr*(255-w)

		MOV edi,eax	; parken
		MOV eax,g		; g
		MUL cl			; g*w
		ADD eax,edi	; g*w+cg*(255-w)
		SAR eax,5		; >>5
		AND eax,$ff00	; [g*w+cg*(255-w)] &$ff00

		ADD esi,eax;	grün|rot


		MOV eax,ebx	; color
		SAL eax,3		; <<3
		AND eax,$f8	; cb=Blue(color)
		MUL dl			; cb*(255-w)

		MOV edi,eax	; parken
		MOV eax,b		; b
		MUL cl			; b*w
		ADD eax,edi	; b*w+cb*(255-w)
		AND eax,$ff00	; [b*w+cb*(255-w)] &$ff00
		SAR eax,11	; >>11

		ADD eax,esi;	blau|grün|rot

		MOV *DX\w,ax

		!exitMP16:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x<<1+y*DxMul
		*DX\w=ColMurx(*DX\w,r,g,b,w,255-w)
		; Plot(x,y,ColMix(Point(x,y),r,g,b,w,255-w))
		; SetPixel_(hdc,x,y,col)
		; EndIf

	CompilerEndIf

EndProcedure
Procedure MergePixel(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_x]
		!mov ecx,dword[p.v_y]
		!cmp eax,v_ScreenX		; x > ScreenX ?
		!jg exitMP
		!cmp ecx,v_ScreenY		; y > SreenY ?
		!jg exitMP
		!or eax,ecx				; x OR y < 0?
		!js exitMP
		Plot(x,y,ColMix(Point(x,y),r,g,b,w,255-w))
		!exitMP:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		Plot(x,y,ColMix(Point(x,y),r,g,b,w,255-w))
		; EndIf

	CompilerEndIf

EndProcedure

Procedure AntiLineXY(x1,y1,x2,y2,col)

	If OptAntialiasing

		Protected r = ColRed(col)
		Protected g = ColGreen(col)
		Protected b = ColBlue(col)
		Protected xd=x2-x1
		Protected yd=y2-y1
		Protected x,y,xf,yf
		Protected grad,w

		;Debug Str(x1)+", "+Str(y1)+", "+Str(x2)+", "+Str(y2)
		CompilerIf #AllowAssembler
			MergePixelRoutine(x1,y1,r,g,b,255)
		CompilerElse
			If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY
				MergePixelRoutine(x1,y1,r,g,b,255)
			EndIf
		CompilerEndIf


		CompilerIf #AllowAssembler
			MergePixelRoutine(x2,y2,r,g,b,255)
		CompilerElse
			If x2>=0 And x2<ScreenX And y2>=0 And y2<ScreenY
				MergePixelRoutine(x2,y2,r,g,b,255)
			EndIf
		CompilerEndIf
		If xd=0 Or yd=0
			LineXY(x1,y1,x2,y2,col)
			ProcedureReturn
		EndIf

		If Abs(xd)>Abs(yd)
			If (x1>x2)
				Swap x1,x2
				Swap y1,y2
				xd=-xd
				yd=-yd
			EndIf

			grad=yd<<16/xd
			yf=y1<<16

			For x=x1+1 To x2-1
				yf+grad
				w=(yf>>8)&$FF
				y=yf>>16

				CompilerIf #AllowAssembler
					MergePixelRoutine(x,y,r,g,b,AntiA(w))
					MergePixelRoutine(x,y+1,r,g,b,AntiB(w))
				CompilerElse
					If x>=0 And x<ScreenX
						If y>=0 And y<ScreenY
							MergePixelRoutine(x,y,r,g,b,AntiA(w))
						EndIf
						If y>=-1 And y+1<ScreenY
							MergePixelRoutine(x,y+1,r,g,b,AntiB(w))
						EndIf
					EndIf
				CompilerEndIf

			Next
		Else
			If (y1>y2)
				Swap x1,x2
				Swap y1,y2
				xd=-xd
				yd=-yd
			EndIf
			grad=xd<<16/yd
			xf=x1<<16
			For y=y1+1 To y2-1
				xf+grad
				w=(xf>>8)&$FF
				x=xf>>16

				CompilerIf #AllowAssembler
					MergePixelRoutine(x,y,r,g,b,AntiA(w))
					MergePixelRoutine(x+1,y,r,g,b,AntiB(w))
				CompilerElse
					If y>=0 And y<ScreenY
						If x>=0 And x<ScreenX
							MergePixelRoutine(x,y,r,g,b,AntiA(w))
						EndIf
						If x>=-1 And x+1<ScreenX
							MergePixelRoutine(x+1,y,r,g,b,AntiB(w))
						EndIf
					EndIf
				CompilerEndIf

			Next
		EndIf
	Else
		LineXY(x1,y1,x2,y2,col)
	EndIf

EndProcedure

Procedure.l Limit(x.l)

	CompilerIf #AllowAssembler
		!mov eax,dword[p.v_x]
		!xor edx,edx		; set EDX to zero
		!cmp eax,0			; compare with top limit
		!cmovl eax,edx	;
		!mov edx,255		; 255
		!cmp eax,edx		; compare with bottom limit
		!cmovg eax,edx	; if lower, set value to bottom limit
		ProcedureReturn

	CompilerElse

		If x<0
			ProcedureReturn 0
		ElseIf x>255
			ProcedureReturn 255
		Else
			ProcedureReturn x
		EndIf

	CompilerEndIf

EndProcedure
Procedure AntiTable()
	Protected dummy,k
	Dummy=5<<4
	For k=0 To 255
		AntiA(k)=Limit(255-k+Dummy)
		AntiB(k)=Limit(k+Dummy)
	Next k
EndProcedure

Procedure Demo()
	Protected i,win

	win=OpenWindow(0,0,0,ScreenX,ScreenY,"",#PB_Window_ScreenCentered | #PB_Window_BorderLess)
	InitSprite()
	OpenWindowedScreen(win,0,0,ScreenX,ScreenY,0,0,0)

	StartDrawing(ScreenOutput())

	If DirectMemory

		DxMem=DrawingBuffer()
		DxMul=DrawingBufferPitch()
		DxPix=DrawingBufferPixelFormat(); liefert bei einigen Notebooks leider 0...

		If DxPix=0
			i=GetDC_(0)
			Select GetDeviceCaps_(i,#BITSPIXEL)
			Case 32
				DxPix=#PB_PixelFormat_32Bits_RGB
			Case 24
				DxPix=#PB_PixelFormat_24Bits_RGB
			Case 15,16
				DxPix=#PB_PixelFormat_16Bits
			EndSelect
			ReleaseDC_(0,i)
		EndIf

		If DxPix>=#PB_PixelFormat_32Bits_RGB
			MergePixelRoutine=@MergePixel32()
		ElseIf DxPix>=#PB_PixelFormat_24Bits_RGB
			MergePixelRoutine=@MergePixel24()
		ElseIf DxPix>=#PB_PixelFormat_15Bits
			MergePixelRoutine=@MergePixel16()
		Else
			OptAntialiasing=0
			End
		EndIf

	Else
		MergePixelRoutine=@MergePixel()
		End
	EndIf

	For i=0 To 49
		AntiLineXY(Random(screenx),Random(screeny),Random(screenx),Random(screeny),$c0c0e0)
		AntiLineXY(Random(screenx),Random(screeny),Random(screenx),Random(screeny),$c0e0c0)
		AntiLineXY(Random(screenx),Random(screeny),Random(screenx),Random(screeny),$e0c0c0)
	Next i

	StopDrawing()
	FlipBuffers(#PB_Screen_SmartSynchronization)

	Repeat
	Until GetAsyncKeyState_(#VK_SPACE)

EndProcedure

AntiTable()
Demo()
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

There's a lot of things to optimize. You should consider a good clipping algorithm instead of doing test every point. Here's one I've adapted some times ago:

Code: Select all

;fast done Liang-Barsky line clipping algo
;by djes 09/2007

x1 = - 200
y1 = - 300
x2 = 1300
y2 = 1400

minx = 0
miny = 0
maxx = 1024
maxy = 768


t0.f = 0
t1.f = 1

dx.f = x2 - x1
dy.f = y2 - y1

;*************************
p.f = - dx
q.f = - (minx - x1)
r.f = q/p

If p = 0 And q < 0
  Goto ENDCLIP
EndIf

If p < 0
  If r > t1
    Goto ENDCLIP
  EndIf
  If r > t0
    t0 = r
  EndIf
ElseIf p > 0
  If r < t0
    Goto ENDCLIP
  EndIf
  If r < t1
    t1 = r
  EndIf
EndIf

;*************************
p.f = dx
q.f = maxx - x1
r.f = q/p

If p = 0 And q < 0
  Goto ENDCLIP
EndIf

If p < 0
  If r > t1
    Goto ENDCLIP
  EndIf
  If r > t0
    t0 = r
  EndIf
ElseIf p > 0
  If r < t0
    Goto ENDCLIP
  EndIf
  If r < t1
    t1 = r
  EndIf
EndIf

;*************************
p.f = - dy
q.f = - (miny - y1)
r.f = q/p

If p = 0 And q < 0
  Goto ENDCLIP
EndIf

If p < 0
  If r > t1
    Goto ENDCLIP
  EndIf
  If r > t0
    t0 = r
  EndIf
ElseIf p > 0
  If r < t0
    Goto ENDCLIP
  EndIf
  If r < t1
    t1 = r
  EndIf
EndIf

;*************************
p.f = dy
q.f = maxy - y1
r.f = q/p

If p = 0 And q < 0
  Goto ENDCLIP
EndIf

If p < 0
  If r > t1
    Goto ENDCLIP
  EndIf
  If r > t0
    t0 = r
  EndIf
ElseIf p > 0
  If r < t0
    Goto ENDCLIP
  EndIf
  If r < t1
    t1 = r
  EndIf
EndIf

;*************************
ENDCLIP:

If t1 < 1
  x2 = x1 + t1*dx
  y2 = y1 + t1*dy
EndIf

If t0 > 0
  x1 = x1 + t0*dx
  y1 = y1 + t0*dy
EndIf

Debug x1
Debug y1
Debug x2
Debug y2
User avatar
Michael Vogel
Addict
Addict
Posts: 2797
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Post by Michael Vogel »

djes wrote:There's a lot of things to optimize. You should consider a good clipping algorithm instead of doing test every point.
Thanks, that will need some time to check if I can adapt your code to be 100% safe for my AntiLine function, at the moment I have a small problem I have to check...

When hitting the right border, my antialiasing code makes a (wrong) point at the left side of the screen. This could be a problem, when this will below the screen area (and out of the screen memory range)...

Michael

Code: Select all

EnableExplicit

#AllowAssembler=1

Global OptAntialiasing=#True
Global Directmemory=#True

Global Dim AntiA(255)
Global Dim AntiB(255)

Global ScreenX=GetSystemMetrics_(#SM_CXSCREEN)
Global ScreenY=GetSystemMetrics_(#SM_CYSCREEN)

Structure DxStructure
	StructureUnion
		l.l
		w.w
	EndStructureUnion
EndStructure

Global DxMem,DxMul,DxPix;      Plot-Beschleunigung...
Global *DX.DxStructure

Declare MergePixel32(x,y,r,g,b,w)
Declare MergePixel24(x,y,r,g,b,w)
Declare MergePixel16(x,y,r,g,b,w)
Declare MergePixel(x,y,r,g,b,w)

Prototype MergePixelType(x,y,r,g,b,w)
Global MergePixelRoutine.MergePixelType

Macro ColRGB(r,g,b)
	( r&$ff + (g&$ff)<<8 + (b&$ff)<<16 )
EndMacro
Macro ColRed(col)
	(col&$ff)
EndMacro
Macro ColGreen(col)
	(col>>8&$ff)
EndMacro
Macro ColBlue(col)
	(col>>16&$ff)
EndMacro
Macro ColMix(c,r,g,b,w,iw)
	;
	; 24/32 Bit-Pixel: bbbbbbbb|gggggggg|rrrrrrrr
	( (r*w+ColRed(c)*(iw))>>8 + (g*w+ColGreen(c)*(iw))&$ff00 + (b*w+ColBlue(c)*(iw))<<8&$ff0000 )
	;( (r*w+ColRed(c)*(255-w))>>8 + (g*w+ColGreen(c)*(255-w))&$ff00 + (b*w+ColBlue(c)*(255-w))<<8&$ff0000 )
	;
	; nach  [w*a+(255-w)*b]/256 = [w*(a-b)+255*b]/256 ~ [w+(a-b)]/256 + b  sparen wir eine Multiplikation...
	;( (w*(r-ColRed(c)))>>8 + (w*(g-ColGreen(c)))&$ff00 + (w*(b-ColBlue(c)))<<8&$ff0000 + c)
EndMacro
Macro ColMurx(c,r,g,b,w,iw)
	;
	; 16 Bit-Pixel:  rrrr|rggg|ggxb|bbbb    (wobei "x" bei grün verwendet wird)
	( (r*w+(c>>8&$f8)*(iw))&$f800 + ((g*w+(c>>3&$f8)*(iw))&$f800)>>5 + (b*w+(c<<3&$f8)*(iw))>>11 )
	;( (r*w+(c>>8&$f8)*(255-w))&$f800 + ((g*w+(c>>3&$f8)*(255-w))&$f800)>>5 + (b*w+(c<<3&$f8)*(255-w))>>11 )
	;
	; nach  [w*a+(255-w)*b]/256 = [w*(a-b)+255*b]/256 ~ [w+(a-b)]/256 + b  sparen wir eine Multiplikation...
	;( (w*(r-c>>8&$f8)+c)&$f800 + ((w*(g-c>>3&$f8))>>5+c)&$7c0 + (w*(b-c<<3&$f8))>>11 +c&$1f )
EndMacro
Procedure MergePixel32(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX      ; x > ScreenX ?
		JG exitMP32
		CMP ecx,ScreenY      ; y > SreenY ?
		JG exitMP32
		OR eax,ecx         ; x OR y < 0?
		JS exitMP32

		;*DX=DxMem+x<<2+y*DxMul
		IMUL ecx,DxMul      ; y*DxMul
		MOV eax,x            ; x
		SHL eax,2            ; x*4
		ADD eax,ecx         ; x*4+y*DxMul
		ADD eax,DxMem      ; DxMem+x<<2+y*DxMul
		MOV *DX,eax         ; -> *DX

		;*DX\l=ColMix(*DX\l,b,g,r,w,255-w)
		MOV ebx,*DX\l      ; color
		MOV edx,255   ; 255
		MOV ecx,w      ; w (ecx)
		SUB edx,ecx   ; 255-w (edx)

		MOV eax,ebx   ; color
		AND eax,$ff   ; cr=Red(color)
		MUL dl         ; cr*(255-w)

		MOV edi,eax   ; parken
		MOV eax,r      ; r
		MUL cl         ; r*w
		ADD eax,edi   ; r*w+cr*(255-w)
		SAR eax,8      ; [r*w+cr*(255-w)] >>8

		MOV esi,eax   ; rot


		MOV eax,ebx   ; color
		SAR eax,8      ; >>8
		AND eax,$ff   ; cg=Green(color)
		MUL dl         ; cr*(255-w)

		MOV edi,eax   ; parken
		MOV eax,g      ; g
		MUL cl         ; g*w
		ADD eax,edi   ; g*w+cg*(255-w)
		AND eax,$ff00   ; [g*w+cg*(255-w)] &$ff00

		ADD esi,eax;   grün|rot


		MOV eax,ebx   ; color
		SAR eax,16   ; >>16
		AND eax,$ff   ; cb=Blue(color)
		MUL dl         ; cb*(255-w)

		MOV edi,eax   ; parken
		MOV eax,b      ; b
		MUL cl         ; b*w
		ADD eax,edi   ; b*w+cb*(255-w)
		AND eax,$ff00   ; [b*w+cb*(255-w)] &$ff00
		SAL eax,8

		ADD esi,eax;   blau|grün|rot

		MOV *DX\l,esi

		!exitMP32:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x<<2+y*DxMul
		*DX\l=ColMix(*DX\l,b,g,r,w,255-w)
		; EndIf

	CompilerEndIf

EndProcedure
Procedure MergePixel24(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX      ; x > ScreenX ?
		JG exitMP24
		CMP ecx,ScreenY      ; y > SreenY ?
		JG exitMP24
		OR eax,ecx         ; x OR y < 0?
		JS exitMP24

		IMUL eax,DxMul      ; y*DxMul
		;MUL ecx,3
		;ADD eax,ecx         ; x*3+y*DxMul
		ADD eax,ecx         ; y*DxMul+x
		ADD eax,ecx         ; y*DxMul+2*x
		ADD eax,ecx         ; y*DxMul+3*x
		MOV x,eax            ; -> x

		*DX=DxMem+x      ; anstelle von x*3+y*DxMul
		*DX\l=ColMix(*DX\l,b,g,r,w,255-w)|(*DX\l&$FF000000)
		!exitMP24:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x*3+y*DxMul
		*DX\l=ColMix(*DX\l,b,g,r,w,255-w)|(*DX\l&$FF000000)
		; EndIf

	CompilerEndIf

EndProcedure
Procedure MergePixel16(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		MOV eax,x
		MOV ecx,y
		CMP eax,ScreenX      ; x > ScreenX ?
		JG exitMP16
		CMP ecx,ScreenY      ; y > SreenY ?
		JG exitMP16
		OR eax,ecx         ; x OR y < 0?
		JS exitMP16

		;*DX=DxMem+x<<1+y*DxMul
		IMUL ecx,DxMul      ; y*DxMul
		MOV eax,x            ; x
		SHL eax,1            ; x*2
		ADD eax,ecx         ; x*2+y*DxMul
		ADD eax,DxMem      ; DxMem+x<<1+y*DxMul
		MOV *DX,eax         ; -> *DX

		;*DX\w=ColMurx(*DX\w,r,g,b,w,255-w)
		MOV bx,*DX\w   ; color
		MOV edx,255   ; 255
		MOV ecx,w      ; w (ecx)
		SUB edx,ecx   ; 255-w (edx)

		MOV eax,ebx   ; color
		SAR eax,8      ; <<8
		AND eax,$f8   ; cr=Red(color)
		MUL dl         ; cr*(255-w)

		MOV edi,eax   ; parken
		MOV eax,r      ; r
		MUL cl         ; r*w
		ADD eax,edi   ; r*w+cr*(255-w)
		AND eax,$f800

		MOV esi,eax   ; rot


		MOV eax,ebx   ; color
		SAR eax,3      ; >>3
		AND eax,$f8   ; cg=Green(color)
		MUL dl         ; cr*(255-w)

		MOV edi,eax   ; parken
		MOV eax,g      ; g
		MUL cl         ; g*w
		ADD eax,edi   ; g*w+cg*(255-w)
		SAR eax,5      ; >>5
		AND eax,$ff00   ; [g*w+cg*(255-w)] &$ff00

		ADD esi,eax;   grün|rot


		MOV eax,ebx   ; color
		SAL eax,3      ; <<3
		AND eax,$f8   ; cb=Blue(color)
		MUL dl         ; cb*(255-w)

		MOV edi,eax   ; parken
		MOV eax,b      ; b
		MUL cl         ; b*w
		ADD eax,edi   ; b*w+cb*(255-w)
		AND eax,$ff00   ; [b*w+cb*(255-w)] &$ff00
		SAR eax,11   ; >>11

		ADD eax,esi;   blau|grün|rot

		MOV *DX\w,ax

		!exitMP16:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		*DX=DxMem+x<<1+y*DxMul
		*DX\w=ColMurx(*DX\w,r,g,b,w,255-w)
		; Plot(x,y,ColMix(Point(x,y),r,g,b,w,255-w))
		; SetPixel_(hdc,x,y,col)
		; EndIf

	CompilerEndIf

EndProcedure
Procedure MergePixel(x,y,r,g,b,w)

	CompilerIf #AllowAssembler

		!mov eax,dword[p.v_x]
		!mov ecx,dword[p.v_y]
		!cmp eax,v_ScreenX      ; x > ScreenX ?
		!jg exitMP
		!cmp ecx,v_ScreenY      ; y > SreenY ?
		!jg exitMP
		!or eax,ecx            ; x OR y < 0?
		!js exitMP
		Plot(x,y,ColMix(Point(x,y),r,g,b,w,255-w))
		!exitMP:

	CompilerElse

		; If x>=0 And x<ScreenX And y>=0 And y<ScreenY
		Plot(x,y,ColMix(Point(x,y),r,g,b,w,255-w))
		; EndIf

	CompilerEndIf

EndProcedure

Procedure AntiLineXY(x1,y1,x2,y2,col)

	If OptAntialiasing

		Protected r = ColRed(col)
		Protected g = ColGreen(col)
		Protected b = ColBlue(col)
		Protected xd=x2-x1
		Protected yd=y2-y1
		Protected x,y,xf,yf
		Protected grad,w

		;Debug Str(x1)+", "+Str(y1)+", "+Str(x2)+", "+Str(y2)
		CompilerIf #AllowAssembler
			MergePixelRoutine(x1,y1,r,g,b,255)
		CompilerElse
			If x1>=0 And x1<ScreenX And y1>=0 And y1<ScreenY
				MergePixelRoutine(x1,y1,r,g,b,255)
			EndIf
		CompilerEndIf


		CompilerIf #AllowAssembler
			MergePixelRoutine(x2,y2,r,g,b,255)
		CompilerElse
			If x2>=0 And x2<ScreenX And y2>=0 And y2<ScreenY
				MergePixelRoutine(x2,y2,r,g,b,255)
			EndIf
		CompilerEndIf
		If xd=0 Or yd=0
			LineXY(x1,y1,x2,y2,col)
			ProcedureReturn
		EndIf

		If Abs(xd)>Abs(yd)
			If (x1>x2)
				Swap x1,x2
				Swap y1,y2
				xd=-xd
				yd=-yd
			EndIf

			grad=yd<<16/xd
			yf=y1<<16

			For x=x1+1 To x2-1
				yf+grad
				w=(yf>>8)&$FF
				y=yf>>16

				CompilerIf #AllowAssembler
					MergePixelRoutine(x,y,r,g,b,AntiA(w))
					MergePixelRoutine(x,y+1,r,g,b,AntiB(w))
				CompilerElse
					If x>=0 And x<ScreenX
						If y>=0 And y<ScreenY
							MergePixelRoutine(x,y,r,g,b,AntiA(w))
						EndIf
						If y>=-1 And y+1<ScreenY
							MergePixelRoutine(x,y+1,r,g,b,AntiB(w))
						EndIf
					EndIf
				CompilerEndIf

			Next
		Else
			If (y1>y2)
				Swap x1,x2
				Swap y1,y2
				xd=-xd
				yd=-yd
			EndIf
			grad=xd<<16/yd
			xf=x1<<16
			For y=y1+1 To y2-1
				xf+grad
				w=(xf>>8)&$FF
				x=xf>>16

				CompilerIf #AllowAssembler
					MergePixelRoutine(x,y,r,g,b,AntiA(w))
					MergePixelRoutine(x+1,y,r,g,b,AntiB(w))
				CompilerElse
					If y>=0 And y<ScreenY
						If x>=0 And x<ScreenX
							MergePixelRoutine(x,y,r,g,b,AntiA(w))
						EndIf
						If x>=-1 And x+1<ScreenX
							MergePixelRoutine(x+1,y,r,g,b,AntiB(w))
						EndIf
					EndIf
				CompilerEndIf

			Next
		EndIf
	Else
		LineXY(x1,y1,x2,y2,col)
	EndIf

EndProcedure
Procedure ClipAntiLineXY(x1,y1,x2,y2,col)
	;fast done Liang-Barsky line clipping algo
	;by djes 09/2007

	Protected minx=0
	Protected miny=0
	Protected maxx = screenx
	Protected maxy = screeny
	Protected t0.f
	Protected t1.f
	Protected dx.f
	Protected dy.f
	Protected p.f
	Protected q.f
	Protected r.f

	t0.f = 0
	t1.f = 1

	dx.f = x2 - x1
	dy.f = y2 - y1

	;*************************
	p.f = - dx
	q.f = - (minx - x1)
	r.f = q/p

	If p = 0 And q < 0
		Goto ENDCLIP
	EndIf

	If p < 0
		If r > t1
			Goto ENDCLIP
		EndIf
		If r > t0
			t0 = r
		EndIf
	ElseIf p > 0
		If r < t0
			Goto ENDCLIP
		EndIf
		If r < t1
			t1 = r
		EndIf
	EndIf

	;*************************
	p.f = dx
	q.f = maxx - x1
	r.f = q/p

	If p = 0 And q < 0
		Goto ENDCLIP
	EndIf

	If p < 0
		If r > t1
			Goto ENDCLIP
		EndIf
		If r > t0
			t0 = r
		EndIf
	ElseIf p > 0
		If r < t0
			Goto ENDCLIP
		EndIf
		If r < t1
			t1 = r
		EndIf
	EndIf

	;*************************
	p.f = - dy
	q.f = - (miny - y1)
	r.f = q/p

	If p = 0 And q < 0
		Goto ENDCLIP
	EndIf

	If p < 0
		If r > t1
			Goto ENDCLIP
		EndIf
		If r > t0
			t0 = r
		EndIf
	ElseIf p > 0
		If r < t0
			Goto ENDCLIP
		EndIf
		If r < t1
			t1 = r
		EndIf
	EndIf

	;*************************
	p.f = dy
	q.f = maxy - y1
	r.f = q/p

	If p = 0 And q < 0
		Goto ENDCLIP
	EndIf

	If p < 0
		If r > t1
			Goto ENDCLIP
		EndIf
		If r > t0
			t0 = r
		EndIf
	ElseIf p > 0
		If r < t0
			Goto ENDCLIP
		EndIf
		If r < t1
			t1 = r
		EndIf
	EndIf

	;*************************
	ENDCLIP:

	If t1 < 1
		x2 = x1 + t1*dx
		y2 = y1 + t1*dy
	EndIf

	If t0 > 0
		x1 = x1 + t0*dx
		y1 = y1 + t0*dy
	EndIf

	Debug x1
	Debug y1
	Debug x2
	Debug y2
	AntiLineXY(x1,y1,x2,y2,col)

EndProcedure

Procedure.l Limit(x.l)

	CompilerIf #AllowAssembler
		!mov eax,dword[p.v_x]
		!xor edx,edx      ; set EDX to zero
		!cmp eax,0         ; compare with top limit
		!cmovl eax,edx   ;
		!mov edx,255      ; 255
		!cmp eax,edx      ; compare with bottom limit
		!cmovg eax,edx   ; if lower, set value to bottom limit
		ProcedureReturn

	CompilerElse

		If x<0
			ProcedureReturn 0
		ElseIf x>255
			ProcedureReturn 255
		Else
			ProcedureReturn x
		EndIf

	CompilerEndIf

EndProcedure
Procedure AntiTable()
	Protected dummy,k
	Dummy=5<<4
	For k=0 To 255
		AntiA(k)=Limit(255-k+Dummy)
		AntiB(k)=Limit(k+Dummy)
	Next k
EndProcedure

Procedure Demo()
	Protected i,win

	win=OpenWindow(0,0,0,ScreenX,ScreenY,"",#PB_Window_ScreenCentered | #PB_Window_BorderLess)
	InitSprite()
	OpenWindowedScreen(win,0,0,ScreenX,ScreenY,0,0,0)

	StartDrawing(ScreenOutput())

	If DirectMemory

		DxMem=DrawingBuffer()
		DxMul=DrawingBufferPitch()
		DxPix=DrawingBufferPixelFormat(); liefert bei einigen Notebooks leider 0...

		If DxPix=0
			i=GetDC_(0)
			Select GetDeviceCaps_(i,#BITSPIXEL)
			Case 32
				DxPix=#PB_PixelFormat_32Bits_RGB
			Case 24
				DxPix=#PB_PixelFormat_24Bits_RGB
			Case 15,16
				DxPix=#PB_PixelFormat_16Bits
			EndSelect
			ReleaseDC_(0,i)
		EndIf

		If DxPix>=#PB_PixelFormat_32Bits_RGB
			MergePixelRoutine=@MergePixel32()
		ElseIf DxPix>=#PB_PixelFormat_24Bits_RGB
			MergePixelRoutine=@MergePixel24()
		ElseIf DxPix>=#PB_PixelFormat_15Bits
			MergePixelRoutine=@MergePixel16()
		Else
			OptAntialiasing=0
			End
		EndIf

	Else
		MergePixelRoutine=@MergePixel()
		End
	EndIf


	AntiLineXY(0,-4,screenx,screeny-4,$c0c0e0)
	ClipAntiLineXY(0,-2,screenx,screeny-2,$f0c0c0)

	StopDrawing()
	FlipBuffers(#PB_Screen_SmartSynchronization)

	Repeat
	Until GetAsyncKeyState_(#VK_SPACE)

EndProcedure

AntiTable()
Demo()
Post Reply