PureBasic Forum
https://www.purebasic.fr/english/

[Solved] Fast Alpha Blending -percent based needed
https://www.purebasic.fr/english/viewtopic.php?f=35&t=68513
Page 1 of 3

Author:  walbus [ Sun May 21, 2017 3:00 pm ]
Post subject:  [Solved] Fast Alpha Blending -percent based needed

Hi,
as a last step toward quickly drawing very large animated GIF frames directly on a canvas (the ORBO GIF, for example), it would help to speed up
a PB-based function a little.
So I would like to ask for help converting this or a similar function to ASM.

Code:
Procedure Color_Mix(color1.l, color2.l, percent.l)
  r= ((Red(color1)*percent)/100) + ((Red(color2)*(100-percent)) / 100)
  g= ((Green(color1)*percent)/100) + ((Green(color2)*(100-percent)) / 100)
  b= ((Blue(color1)*percent)/100) + ((Blue(color2)*(100-percent)) / 100)
  ProcedureReturn RGB(r,g,b)
EndProcedure

Author:  wilbert [ Sun May 21, 2017 6:08 pm ]
Post subject:  Re: Fast Alpha Blending -percent based needed

This thread contains a mix procedure:
https://www.purebasic.fr/english/viewtopic.php?f=35&t=66220
It is not percent-based but requires a value from 0 to 255.
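
To feed a percentage into a routine that expects 0 to 255, the value has to be rescaled first. A minimal Python sketch of that rescaling (the helper name percent_to_alpha is hypothetical and does not come from the linked thread):

```python
def percent_to_alpha(percent: int) -> int:
    """Map a percentage (0..100) to the 0..255 range, rounding to nearest."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in 0..100")
    # Adding 50 before the integer division by 100 rounds to nearest.
    return (percent * 255 + 50) // 100


print(percent_to_alpha(0), percent_to_alpha(50), percent_to_alpha(100))  # 0 128 255
```

For drawing many pixels with the same blend factor, the conversion only needs to happen once per frame, not once per pixel.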

Author:  walbus [ Sun May 21, 2017 6:16 pm ]
Post subject:  Re: Fast Alpha Blending -percent based needed

Many thanks, Wilbert, this is what I wanted!
I am having a lot of fun with the ORBO GIF; have a look here:
https://www.reddit.com/r/orbo/

Author:  wilbert [ Sun May 21, 2017 6:33 pm ]
Post subject:  Re: Fast Alpha Blending -percent based needed

walbus wrote:
I am having a lot of fun with the ORBO GIF; have a look here:
https://www.reddit.com/r/orbo/

Nice quality images :)

Author:  netmaestro [ Sun May 21, 2017 8:17 pm ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

It's not easy to beat wilbert with a solution and nearly impossible to beat the speed of his code. But I worked on this, dammit, so you are getting it. On my machine it runs more than 2x as fast as the posted PB procedure (remember to turn the debugger off). Also, bear in mind it's x86 only.
Code:
Procedure Color_Mix_asm(color1.l, color2.l, percent.l)
  ; netmaestro May 2017
 
  Protected.b r, g, b
 
  ; r = Red(color1)*percent / 100
  !mov eax, [p.v_color1]
  !and eax, 0xFF
  !imul eax, [p.v_percent]
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !mov [p.v_r], al
  ; r + Red(color2)*(100-percent) / 100
  !mov eax, [p.v_color2]
  !and eax, 0xFF
  !mov ebx, 0x64
  !sub ebx, [p.v_percent]
  !imul eax, ebx
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !add [p.v_r], al
 
 ; g = Green(color1)*percent / 100
  !mov eax, [p.v_color1]
  !shr eax, 8
  !and eax, 0xFF
  !imul eax, [p.v_percent]
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !mov [p.v_g], al
  ; g + Green(color2)*(100-percent) / 100
  !mov eax, [p.v_color2]
  !shr eax, 8
  !and eax, 0xFF
  !mov ebx, 0x64
  !sub ebx, [p.v_percent]
  !imul eax, ebx
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !add [p.v_g], al
   
  ; b = Blue(color1)*percent / 100
  !mov eax, [p.v_color1]
  !shr eax, 16
  !and eax, 0xFF
  !imul eax, [p.v_percent]
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !mov [p.v_b], al
  ; b + Blue(color2)*(100-percent) / 100
  !mov eax, [p.v_color2]
  !shr eax, 16
  !and eax, 0xFF
  !mov ebx, 0x64
  !sub ebx, [p.v_percent]
  !imul eax, ebx
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !add [p.v_b], al
 
  ; ProcedureReturn RGB(r, g, b)
  !xor eax, eax
  !mov al, [p.v_b]
  !shl eax, 16
  !mov ah, [p.v_g]
  !mov al, [p.v_r]
 
  ProcedureReturn
 
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= ((Red(color1)*percent)/100) + ((Red(color2)*(100-percent)) / 100)
  g= ((Green(color1)*percent)/100) + ((Green(color2)*(100-percent)) / 100)
  b= ((Blue(color1)*percent)/100) + ((Blue(color2)*(100-percent)) / 100)
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#White, #Blue, 36),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#White, #Blue, 36),#PB_Long), 6, "0")
;
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
MessageRequester("PB Code Version", Str(ElapsedMilliseconds()-s))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))


Author:  wilbert [ Mon May 22, 2017 6:46 am ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

netmaestro wrote:
I worked on this, dammit, so you are getting it. On my machine it runs more than 2x as fast as the posted PB procedure (remember to turn the debugger off). Also, bear in mind it's x86 only.

That's a more literal conversion :)
Any reason why you use esp ?
On OSX, offsets to esp can be different from Windows.
For cross platform compatibility, it's better to use [p.v_color1] instead of [esp+16].

To get the byte value from a color, you could also have used movzx.
It would result in a few lines less code but probably wouldn't make a significant difference when it comes to speed (div has the biggest impact).

Author:  walbus [ Mon May 22, 2017 8:00 am ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

@Wilbert
I have now added your routine to my shapes engine,
together with other ASM routines for color distance and invisible color handling, also from you.
Everything works very, very well!
Again, many thanks for your friendly help!

Author:  netmaestro [ Mon May 22, 2017 8:01 pm ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

Ok, good point on the esp, I had forgotten that macOS uses it differently. I made the change to named vars but I don't know how to implement movzx to streamline the code. Any light you can shed on it would be appreciated.

Author:  wilbert [ Mon May 22, 2017 8:30 pm ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

netmaestro wrote:
I made the change to named vars but I don't know how to implement movzx to streamline the code. Any light you can shed on it would be appreciated.

movzx allows you to load a byte (or word) into a 32-bit register. The upper 24 bits are cleared for a byte load (zx means zero extend).
This way you don't need to use a shift and a mask to get the red, green or blue value.

example:
Code:
  !movzx eax, byte [p.v_color1]; red
  !movzx eax, byte [p.v_color1 + 1]; green
  !movzx eax, byte [p.v_color1 + 2]; blue
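
The byte offsets work because x86 stores a 32-bit value in little-endian order, so the red byte of a PureBasic color sits at the lowest address. A small Python model of the two extraction styles (a sketch only; the function names are hypothetical):

```python
def channel_by_shift(color: int, index: int) -> int:
    """Shift-and-mask extraction, as in the first asm version."""
    return (color >> (8 * index)) & 0xFF


def channel_by_byte(color: int, index: int) -> int:
    """Byte-at-offset extraction, which is what movzx with 'byte [addr + index]' reads."""
    return color.to_bytes(4, "little")[index]


color = 0x336699  # blue = 0x33, green = 0x66, red = 0x99 (red in the low byte)
for i, name in enumerate(("red", "green", "blue")):
    assert channel_by_shift(color, i) == channel_by_byte(color, i)
    print(name, hex(channel_by_byte(color, i)))
```

Both functions return the same channel value; movzx simply gets there in one instruction.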

Author:  netmaestro [ Mon May 22, 2017 9:57 pm ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

Thanks, I think I got it. Lots of streamlining although as you predicted, not a noticeable improvement in speed, though it does execute in approx. 40% of the time the PureBasic code version takes:
Code:
Procedure Color_Mix_asm(color1.l, color2.l, percent.l)
  ; netmaestro May 2017
 
  Protected result=0
 
  !xor ecx, ecx
  !@@:
  !movzx eax, byte [p.v_color1 + ecx]
  !imul eax, [p.v_percent]
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !mov [p.v_result + ecx], al
  !movzx eax, byte [p.v_color2 + ecx]
  !mov ebx, 0x64
  !sub ebx, [p.v_percent]
  !imul eax, ebx
  !cdq
  !mov ebx, 0x64
  !idiv ebx
  !add [p.v_result + ecx], al
  !inc ecx
  !cmp ecx, 0x2
  !jle @b
   
  ProcedureReturn result
 
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= ((Red(color1)*percent)/100) + ((Red(color2)*(100-percent)) / 100)
  g= ((Green(color1)*percent)/100) + ((Green(color2)*(100-percent)) / 100)
  b= ((Blue(color1)*percent)/100) + ((Blue(color2)*(100-percent)) / 100)
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#White, #Black, 17),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#White, #Black, 17),#PB_Long), 6, "0")
;
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
MessageRequester("PB Code Version", Str(ElapsedMilliseconds()-s))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))

Author:  wilbert [ Tue May 23, 2017 5:43 am ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

netmaestro wrote:
Thanks, I think I got it. Lots of streamlining although as you predicted, not a noticeable improvement in speed, though it does execute in approx. 40% of the time the PureBasic code version takes:

Nice idea, that loop :)
As for speed improvement, probably the only way to get a significant increase is to get rid of the idiv instruction.

Author:  netmaestro [ Tue May 23, 2017 7:28 pm ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

Quote:
As for speed improvement, probably the only way to get a significant increase is to get rid of the idiv instruction.
I found some algorithms on the web for dividing by 100 using only add and shift. I picked one and implemented it in asm for this procedure and it was actually slower. So I tried another shorter one and now it reduces execution time vs. PB code from 40% to 25%:
Code:
Procedure Color_Mix_asm(color1.l, color2.l, percent.l)
  ; netmaestro May 2017
 
  Protected result=0
 
  !xor ecx, ecx
  !@@:
  !movzx eax, byte [p.v_color1 + ecx]
  !imul eax, [p.v_percent]
  !mov edx, eax    
  !shr edx, 5    
  !add edx, eax   
  !shl edx, 2    
  !add eax, edx   
  !shr eax, 9    
  !mov [p.v_result + ecx], al
  !movzx eax, byte [p.v_color2 + ecx]
  !mov ebx, 0x64
  !sub ebx, [p.v_percent]
  !imul eax, ebx
  !mov edx, eax    
  !shr edx, 5    
  !add edx, eax   
  !shl edx, 2    
  !add eax, edx   
  !shr eax, 9    
  !add [p.v_result + ecx], al
  !inc ecx
  !cmp ecx, 0x2
  !jle @b
 
  ProcedureReturn result
 
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= ((Red(color1)*percent)/100) + ((Red(color2)*(100-percent)) / 100)
  g= ((Green(color1)*percent)/100) + ((Green(color2)*(100-percent)) / 100)
  b= ((Blue(color1)*percent)/100) + ((Blue(color2)*(100-percent)) / 100)
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#Red, #Green, 80),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#Red, #Green, 80),#PB_Long), 6, "0")
;
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
e=ElapsedMilliseconds()-s
MessageRequester("PB Code Version", Str(e))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))

There may be a more efficient way to do this but I'm not finding it... Although actually it's executing 10 million times in 124 ms here. That's pretty fast.
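
Read as arithmetic, the shift-and-add sequence computes (5*n + 4*(n >> 5)) >> 9, which is roughly n * 41 / 4096 and therefore slightly more than n / 100. A small Python model (a sketch, not part of the posted code; the function name is hypothetical) suggests the result can come out one higher than exact integer division for some of the larger products:

```python
def div100_shift_add(n: int) -> int:
    """Model of the mov/shr/add/shl/add/shr sequence applied to eax = n."""
    edx = n >> 5           # shr edx, 5
    edx += n               # add edx, eax
    edx <<= 2              # shl edx, 2
    return (n + edx) >> 9  # add eax, edx / shr eax, 9


n = 255 * 98  # a full-intensity channel weighted with 98 percent
print(div100_shift_add(n), n // 100)  # 250 249

# Products of a byte (0..255) and a percentage (0..100) lie in 0..25500.
mismatches = [m for m in range(25501) if div100_shift_add(m) != m // 100]
print(len(mismatches) > 0)  # True
```

For blending pixels an off-by-one in a channel is usually invisible, but it means the output is not guaranteed to be bit-identical to the PB procedure.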

Author:  wilbert [ Wed May 24, 2017 6:27 am ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

netmaestro wrote:
There may be a more efficient way to do this but I'm not finding it... Although actually it's executing 10 million times in 124 ms here. That's pretty fast.

Nice division algorithm. :D
Even with the same algorithm you can speed things up.
You can first add the components and then divide (see the adapted PB routine).
It's also not necessary to use ebx (which officially should be preserved) or to allocate a variable for the result.
The last change I made is using a local label instead of an anonymous label, because anonymous asm labels aren't supported on OSX (nasm/yasm instead of fasm).
Code:
Procedure Color_Mix_asm(color1.l, color2.l, percent.l)
  ; netmaestro May 2017
 
  !xor ecx, ecx
  !.loop:
 
  !mov eax, 0x64
  !sub eax, [p.v_percent]
  !movzx edx, byte [p.v_color2 + ecx]
  !imul edx, eax
  !movzx eax, byte [p.v_color1 + ecx]
  !imul eax, [p.v_percent]
  !add eax, edx
 
  !mov edx, eax   
  !shr edx, 5     
  !add edx, eax   
  !shl edx, 2     
  !add eax, edx   
  !shr eax, 9     
  !mov [p.v_color1 + ecx], al
 
  !inc ecx
  !cmp ecx, 0x2
  !jle .loop
 
  !mov eax, [p.v_color1]
  ProcedureReturn
 
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= (Red(color1)*percent + Red(color2)*(100-percent)) / 100
  g= (Green(color1)*percent + Green(color2)*(100-percent)) / 100
  b= (Blue(color1)*percent + Blue(color2)*(100-percent)) / 100
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#Red, #Green, 80),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#Red, #Green, 80),#PB_Long), 6, "0")
;
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
e=ElapsedMilliseconds()-s
MessageRequester("PB Code Version", Str(e))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))
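
Adding first and dividing once also changes the rounding slightly, because only one truncation happens instead of two. A Python sketch of both formulas for a single channel (the function names are hypothetical; integer division of non-negative values truncates here just as in the PB code):

```python
def mix_divide_twice(c1: int, c2: int, percent: int) -> int:
    """Each weighted term is truncated separately, as in the original procedure."""
    return (c1 * percent) // 100 + (c2 * (100 - percent)) // 100


def mix_divide_once(c1: int, c2: int, percent: int) -> int:
    """The weighted terms are summed first and truncated once."""
    return (c1 * percent + c2 * (100 - percent)) // 100


# Blending a full-intensity channel with itself at 50 percent:
print(mix_divide_twice(255, 255, 50))  # 254
print(mix_divide_once(255, 255, 50))   # 255
```

With a single division, mixing a color with itself returns the same color, which the two-division form does not always do.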


Using a multiply with a shift to divide seems to be a bit faster.
There might be rounding differences between the two approaches; haven't checked for that.
Code:
Procedure Color_Mix_asm(color1.l, color2.l, percent.l)
  ; netmaestro May 2017
 
  !xor ecx, ecx
  !.loop:
 
  !mov eax, 0x64
  !sub eax, [p.v_percent]
  !movzx edx, byte [p.v_color2 + ecx]
  !imul edx, eax
  !movzx eax, byte [p.v_color1 + ecx]
  !imul eax, [p.v_percent]
  !add eax, edx
 
  !imul eax, 167773
  !shr eax, 24
  !mov [p.v_color1 + ecx], al
 
  !inc ecx
  !cmp ecx, 0x2
  !jle .loop
 
  !mov eax, [p.v_color1]
  ProcedureReturn
 
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= (Red(color1)*percent + Red(color2)*(100-percent)) / 100
  g= (Green(color1)*percent + Green(color2)*(100-percent)) / 100
  b= (Blue(color1)*percent + Blue(color2)*(100-percent)) / 100
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#Red, #Green, 80),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#Red, #Green, 80),#PB_Long), 6, "0")
;
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
e=ElapsedMilliseconds()-s
MessageRequester("PB Code Version", Str(e))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))
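
The rounding question can be checked by brute force outside PureBasic. The Python sketch below models the imul/shr pair on a 32-bit register and compares it with exact integer division over every sum the blend can produce, assuming percent stays within 0..100 (the names MAGIC and div100_mul_shift are hypothetical):

```python
MAGIC = 167773  # the constant from the asm above; equals ceil(2**24 / 100)


def div100_mul_shift(n: int) -> int:
    """Model of 'imul eax, 167773' followed by 'shr eax, 24' on a 32-bit register."""
    return ((n * MAGIC) & 0xFFFFFFFF) >> 24


# The summed products of the blend lie in 0..25500 (255 * 100 at most).
assert 25500 * MAGIC < 2**32  # no 32-bit overflow in this range
assert all(div100_mul_shift(n) == n // 100 for n in range(25501))
print("multiply-and-shift matches n // 100 for 0..25500")
```

Within that range the multiply-and-shift version agrees with truncating division by 100; values of percent outside 0..100 are not covered by this check.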

Author:  netmaestro [ Wed May 24, 2017 6:41 am ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

Excellent work wilbert. You've consolidated the task nicely and the divide is better yet. It's executing 10m times in 85-90 msec here now, down from 120-124 with my latest. Thanks for the instructive input.

Author:  Demivec [ Wed May 24, 2017 7:34 am ]
Post subject:  Re: [Solved] Fast Alpha Blending -percent based needed

Just as a side note, the original PureBasic version can be improved further by simplifying the code to:
Code:
Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  Protected p.f, r, g, b
  p = percent / 100
  r= (Red(color1) - Red(color2)) * p + Red(color2)
  g= (Green(color1) - Green(color2)) * p + Green(color2)
  b= (Blue(color1) - Blue(color2)) * p + Blue(color2)
  ProcedureReturn RGB(r,g,b)
EndProcedure


I had hoped to improve the assembler version by implementing this same idea there, but I didn't see a way to readily do so, though I did try ;)

wilbert's implementation incorporates a similar idea with the hoped-for speed improvements and thus more than meets the initial goal. Thanks wilbert.

@Edit: corrected the p variable type to be a float. It was correct in the production code, honest. :)
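
The simplification rests on the identity c1*p + c2*(100 - p) = (c1 - c2)*p + 100*c2, which leaves a single multiplication by the percentage per channel. A Python sketch that checks the identity on integer numerators (the function names are hypothetical, and the model deliberately avoids floats, so it says nothing about how PureBasic rounds the float result):

```python
def weighted_sum(c1: int, c2: int, percent: int) -> int:
    """Numerator of the two-multiplication form, before dividing by 100."""
    return c1 * percent + c2 * (100 - percent)


def rearranged(c1: int, c2: int, percent: int) -> int:
    """Same numerator with a single multiplication by percent."""
    return (c1 - c2) * percent + 100 * c2


assert all(
    weighted_sum(c1, c2, p) == rearranged(c1, c2, p)
    for c1 in (0, 1, 127, 128, 255)
    for c2 in (0, 1, 127, 128, 255)
    for p in range(101)
)
print(rearranged(255, 0, 36) / 100)  # 91.8
```

One thing to keep in mind is that c1 - c2 can be negative, so an integer or assembler version of this form needs signed arithmetic for that step.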

All times are UTC + 1 hour