[Solved] Fast Alpha Blending -percent based needed

Bare metal programming in PureBasic, for experienced users
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: [Solved] Fast Alpha Blending -percent based needed

Post by wilbert »

Demivec wrote:Just as a side note, the original PureBasic version can be improved further by simplifying the code to:

Code: Select all

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  Protected p, r, g, b
  p = percent / 100
  r= (Red(color1) - Red(color2)) * p + Red(color2)
  g= (Green(color1) - Green(color2)) * p + Green(color2)
  b= (Blue(color1) - Blue(color2)) * p + Blue(color2)
  ProcedureReturn RGB(r,g,b)
EndProcedure
In this case p has to be a float instead of integer :wink:
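
To illustrate the point: with `p` declared as an integer it can only ever hold 0 or 1 for percentages between 0 and 100, so the two colors never really mix. Below is a minimal sketch of one possible fix. `Color_Mix_float` is a made-up name, and this is not necessarily how Demivec corrected his own post.

```purebasic
; Sketch: the same delta formula, but with a float blend factor.
; Color_Mix_float is a hypothetical name, not taken from the thread.
Procedure Color_Mix_float(color1.l, color2.l, percent.l)
  Protected p.f, r, g, b
  p = percent / 100.0   ; float division, so 80 percent becomes 0.8
  r = (Red(color1) - Red(color2)) * p + Red(color2)
  g = (Green(color1) - Green(color2)) * p + Green(color2)
  b = (Blue(color1) - Blue(color2)) * p + Blue(color2)
  ProcedureReturn RGB(r, g, b)
EndProcedure

Debug RSet(Hex(Color_Mix_float(#Red, #Green, 80), #PB_Long), 6, "0")
```

The float result is converted back to an integer when it is assigned to `r`, `g` and `b`, so the return value is still an ordinary RGB color.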
Windows (x64)
Raspberry Pi OS (Arm64)
Demivec
Addict
Posts: 4086
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Post by Demivec »

wilbert wrote:In this case p has to be a float instead of integer :wink:
That was the problem I ran into with my assembler version too . . . until it dawned on me a little ways into testing.

Either way I corrected this in the previous post.

Post by wilbert »

@Demivec, your idea of a delta with a single multiplication works great.
It can be applied to asm code as well.

Code: Select all

Macro M_Color_Mix(channel)
  !movzx eax, byte [p.v_color1 + channel]
  !movzx edx, byte [p.v_color2 + channel]
  !sub eax, edx
  !imul eax, ecx
  !add eax, 0x800000
  !shr eax, 24
  !add eax, edx
  !mov [p.v_color1 + channel], al
EndMacro

Procedure Color_Mix_asm(color1.l, color2.l, percent.l)

  !mov ecx, [p.v_percent]
  !imul ecx, 167772
  M_Color_Mix(0)
  M_Color_Mix(1)
  M_Color_Mix(2)
  !mov eax, [p.v_color1]
  ProcedureReturn
  
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= (Red(color1)*percent + Red(color2)*(100-percent)) / 100
  g= (Green(color1)*percent + Green(color2)*(100-percent)) / 100
  b= (Blue(color1)*percent + Blue(color2)*(100-percent)) / 100
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#Red, #Green, 80),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#Red, #Green, 80),#PB_Long), 6, "0")
; 
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
e=ElapsedMilliseconds()-s
MessageRequester("PB Code Version", Str(e))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))
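
For readers less fluent in x86, the arithmetic of `M_Color_Mix` can be modeled in plain PureBasic. The multiplier 167772 is approximately 2^24 / 100, so `percent * 167772` is the blend factor as a fixed-point number with 24 fraction bits. Adding 0x800000 (2^23) before shifting right by 24 rounds to the nearest integer. The procedure below is only an illustrative sketch: `Color_Mix_fixed` is a hypothetical name, and it assumes that PureBasic's `>>` is an arithmetic shift (it keeps the sign), which the negative deltas rely on.

```purebasic
; Plain PureBasic model of the arithmetic in M_Color_Mix (illustration only).
; Quads are used so the intermediate product cannot overflow on 32-bit builds.
Procedure Color_Mix_fixed(color1.l, color2.l, percent.l)
  Protected scale.q = percent * 167772   ; roughly percent / 100 * 2^24
  Protected c1.q, c2.q, result.q, shift
  For shift = 0 To 16 Step 8             ; red, green, blue
    c1 = (color1 >> shift) & $FF
    c2 = (color2 >> shift) & $FF
    ; delta * scale, plus 2^23 for rounding, then drop the 24 fraction bits
    ; (assumes >> keeps the sign when the delta is negative)
    result = result | ((c2 + (((c1 - c2) * scale + $800000) >> 24)) << shift)
  Next
  ProcedureReturn result
EndProcedure

Debug RSet(Hex(Color_Mix_fixed(#Red, #Green, 80), #PB_Long), 6, "0")
```

This is meant to explain the constants, not to compete on speed; the asm version avoids the loop and the quad arithmetic.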
walbus
Addict
Posts: 929
Joined: Sat Mar 02, 2013 9:17 am

Post by walbus »

Interesting new variants.

Here is a slightly modified fast macro from user eesau; it takes an alpha value (0 to 256) rather than a percentage.

Code: Select all

 Macro RGB(red, green, blue) : (((blue<<8+green)<<8)+red) : EndMacro ; Macro by eesau
  Macro Red(color)   : (color&$FFFFFF>>16) : EndMacro
  Macro Green(color) : (color&$FFFF)>>8    : EndMacro
  Macro Blue(color)  : (color>>16)         : EndMacro
  Macro AlphaBlend(color_1, color_2, alpha)
    RGB(((Red(color_2)*alpha+Red(color_1)*(256-alpha))>>8),
        ((Green(color_2)*alpha+Green(color_1)*(256-alpha))>>8),
        ((Blue(color_2)*alpha+Blue(color_1)*(256-alpha))>>8))
  EndMacro

Post by wilbert »

For comparison, here is also an SSE2 version which takes a percent value from 0 to 100.

Code: Select all

Macro M_Color_Mix(channel)
  !movzx eax, byte [p.v_color1 + channel]
  !movzx edx, byte [p.v_color2 + channel]
  !sub eax, edx
  !imul eax, ecx
  !add eax, 0x800000
  !shr eax, 24
  !add eax, edx
  !mov [p.v_color1 + channel], al
EndMacro

Procedure Color_Mix_asm(color1.l, color2.l, percent.l)

  !mov ecx, [p.v_percent]
  !imul ecx, 167772
  M_Color_Mix(0)
  M_Color_Mix(1)
  M_Color_Mix(2)
  !mov eax, [p.v_color1]
  ProcedureReturn
  
EndProcedure

Procedure Color_Mix_SSE2(color1.l, color2.l, percent.l)
  !mov eax, [p.v_percent]
  !imul eax, 167772
  !shr eax, 8
  !movd xmm0, [p.v_color1]
  !movd xmm1, [p.v_color2]
  !movd xmm2, eax
  !punpcklbw xmm0, xmm0
  !punpcklbw xmm1, xmm1
  !pshuflw xmm2, xmm2, 0
  !pcmpeqw xmm3, xmm3
  !pxor xmm3, xmm2
  !pmulhuw xmm0, xmm2
  !pmulhuw xmm1, xmm3
  !paddw xmm0, xmm1
  !psrlw xmm0, 8
  !packuswb xmm0, xmm0
  !movd eax, xmm0
  ProcedureReturn
EndProcedure

Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
  r= (Red(color1)*percent + Red(color2)*(100-percent)) / 100
  g= (Green(color1)*percent + Green(color2)*(100-percent)) / 100
  b= (Blue(color1)*percent + Blue(color2)*(100-percent)) / 100
  ProcedureReturn RGB(r,g,b)
EndProcedure

; Debug RSet(Hex(Color_Mix_pb(#Red, #Green, 80),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_asm(#Red, #Green, 80),#PB_Long), 6, "0")
; Debug RSet(Hex(Color_Mix_SSE2(#Red, #Green, 80),#PB_Long), 6, "0")
; 
; End

CompilerIf #PB_Compiler_Debugger
  MessageRequester("Notice:", "Please turn off the debugger for this test")
  End
CompilerEndIf

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_pb(#Green,#Blue, 50)
Next
e=ElapsedMilliseconds()-s
MessageRequester("PB Code Version", Str(e))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_asm(#Green,#Blue, 50)
Next
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))

s=ElapsedMilliseconds()
For i=1 To 10000000
  Color_Mix_SSE2(#Green,#Blue, 50)
Next
MessageRequester("SSE2 Version", Str(ElapsedMilliseconds()-s))
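
As a reading aid, here is a scalar sketch of what `Color_Mix_SSE2` appears to compute per channel. The weight `(percent * 167772) >> 8` is roughly percent / 100 * 65536, `punpcklbw x, x` turns each byte b into the 16-bit word b * 257, `pmulhuw` keeps the high 16 bits of each product, and `psrlw 8` scales the sum back to a byte. `Color_Mix_SSE2_model` is a hypothetical name, and the model covers only the three color bytes, whereas the SSE2 code also processes the fourth byte.

```purebasic
; Scalar model of the SSE2 blend above (illustration only, hypothetical name).
Procedure Color_Mix_SSE2_model(color1.l, color2.l, percent.l)
  Protected s.q = (percent * 167772) >> 8   ; 16-bit weight, about percent / 100 * 65536
  Protected t.q = 65535 - s                 ; complement weight (pcmpeqw + pxor)
  Protected w1.q, w2.q, result.q, shift
  For shift = 0 To 16 Step 8
    w1 = ((color1 >> shift) & $FF) * 257    ; byte duplicated into both halves of a word
    w2 = ((color2 >> shift) & $FF) * 257
    ; high 16 bits of each product (pmulhuw), summed, then scaled back to a byte
    result = result | (((((w1 * s) >> 16) + ((w2 * t) >> 16)) >> 8) << shift)
  Next
  ProcedureReturn result
EndProcedure

Debug RSet(Hex(Color_Mix_SSE2_model(#Red, #Green, 80), #PB_Long), 6, "0")
```

Unlike the scalar asm version, this path truncates instead of rounding, so the two are not guaranteed to give bit-identical results for every input.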

Post by walbus »

Wow, nice new code!

Another one:
The macro above is very fast; I use it for drawing shapes on a canvas.

Code: Select all

  Macro RGB(red, green, blue) : (((blue<<8+green)<<8)+red) : EndMacro ; Macro by eesau
  Macro Red(color)   : (color&$FFFFFF>>16) : EndMacro
  Macro Green(color) : (color&$FFFF)>>8    : EndMacro
  Macro Blue(color)  : (color>>16)         : EndMacro
  Macro AlphaBlend(color_1, color_2, alpha)
    RGB(((Red(color_2)*alpha+Red(color_1)*(256-alpha))>>8),
        ((Green(color_2)*alpha+Green(color_1)*(256-alpha))>>8),
        ((Blue(color_2)*alpha+Blue(color_1)*(256-alpha))>>8))
  EndMacro
  
   Procedure AlphaBlend_(color_1, color_2, mix); mix [0, 255] ; By wilbert
    !movd xmm0, [p.v_color_1]
    !movd xmm1, [p.v_color_2]
    !movd xmm2, [p.v_mix]
    !punpcklbw xmm0, xmm0
    !punpcklbw xmm1, xmm1
    !punpcklbw xmm2, xmm2
    !pcmpeqw xmm3, xmm3
    !pshuflw xmm2, xmm2, 0
    !pxor xmm3, xmm2
    !pmulhuw xmm1, xmm2
    !pmulhuw xmm0, xmm3
    !paddw xmm0, xmm1
    !psrlw xmm0, 8
    !packuswb xmm0, xmm0
    !movd eax, xmm0
    ProcedureReturn
  EndProcedure

s=ElapsedMilliseconds()
For i=1 To 10000000
 x= AlphaBlend($AAAA,#Blue, 50)
Next
Debug Hex(x)
e=ElapsedMilliseconds()-s
MessageRequester("PB Code Version", Str(e))

s=ElapsedMilliseconds()
For i=1 To 10000000
  x= AlphaBlend_($AAAA,#Blue, 50)
Next
Debug Hex(x)
MessageRequester("asm Version", Str(ElapsedMilliseconds()-s))


Post by wilbert »

walbus wrote:The macro above is very fast; I use it for drawing shapes on a canvas.
It sure is.
Here's a variation on your macro but it doesn't seem to make a big difference.

Code: Select all

  Macro AlphaBlend(color_1, color_2, alpha)
     ((((color_2 & $FF00FF)*alpha+(color_1 & $FF00FF)*(256-alpha))>>8 & $FF00FF) |
     (((color_2 & $FF00)*alpha+(color_1 & $FF00)*(256-alpha))>>8 & $FF00))
  EndMacro
It must be the function call which slows things down.
I'm sure that if it were a procedure blending an array of pixels, asm would be faster. :wink:
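
To make the masking trick in the macro easier to follow, here is the same computation written out step by step as a procedure. This is only a sketch: `AlphaBlend_steps` is a hypothetical name, and `alpha` is assumed to be in the range 0 to 256.

```purebasic
; Step-by-step version of the masked blend (illustration only).
Procedure AlphaBlend_steps(color_1, color_2, alpha)
  Protected rb, g
  ; Red and blue sit in bits 0-7 and 16-23. The empty byte between them
  ; absorbs the extra 8 bits produced by weights that sum to 256.
  rb = (color_2 & $FF00FF) * alpha + (color_1 & $FF00FF) * (256 - alpha)
  ; Green is handled on its own so its product cannot spill into blue.
  g = (color_2 & $FF00) * alpha + (color_1 & $FF00) * (256 - alpha)
  ; Shift both back down and mask away the leftover fraction bits.
  ProcedureReturn ((rb >> 8) & $FF00FF) | ((g >> 8) & $FF00)
EndProcedure

Debug Hex(AlphaBlend_steps($AAAA, #Blue, 50))
```

Two multiplications per color replace the three of the per-channel macro, at the price of the call overhead discussed above.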

Here's also a macro for blending all four RGBA channels instead of only RGB.

Code: Select all

  Macro AlphaBlendRGBA(color_1, color_2, alpha)
     ((((color_2 & $FF00FF)*alpha+(color_1 & $FF00FF)*(256-alpha))>>8 & $FF00FF) |
     (((color_2 >>8 & $FF00FF)*alpha+(color_1 >> 8 & $FF00FF)*(256-alpha)) & $FF00FF00))
  EndMacro
Last edited by wilbert on Sat May 27, 2017 6:42 am, edited 1 time in total.

Post by walbus »

Hi wilbert, thanks again. Your macro looks better and is much more readable.
Yes, but I am surprised that it makes so much of a difference.

Both macros need only ~14 ms on my older i7.

Regards, Werner
Michael Vogel
Addict
Posts: 2666
Joined: Thu Feb 09, 2006 11:27 pm

Post by Michael Vogel »

I wrote a procedure some years ago which is as fast as the procedure 'Color_Mix_asm' (at least when scaling from 0 to 255, because 'n' does not need to be rescaled then)...

Code: Select all

#loop=10000000
a=#Green
b=#Blue
z=50

Macro M_Color_Mix(channel)
	!movzx eax, byte [p.v_color1 + channel]
	!movzx edx, byte [p.v_color2 + channel]
	!sub eax, edx
	!imul eax, ecx
	!add eax, 0x800000
	!shr eax, 24
	!add eax, edx
	!mov [p.v_color1 + channel], al
EndMacro
Procedure Color_Mix_asm(color1.l, color2.l, percent.l)
	!mov ecx, [p.v_percent]
	!imul ecx, 167772
	M_Color_Mix(0)
	M_Color_Mix(1)
	M_Color_Mix(2)
	!mov eax, [p.v_color1]
	ProcedureReturn
EndProcedure
Procedure Color_Mix_SSE2(color1.l, color2.l, percent.l)
	!mov eax, [p.v_percent]
	!imul eax, 167772
	!shr eax, 8
	!movd xmm0, [p.v_color1]
	!movd xmm1, [p.v_color2]
	!movd xmm2, eax
	!punpcklbw xmm0, xmm0
	!punpcklbw xmm1, xmm1
	!pshuflw xmm2, xmm2, 0
	!pcmpeqw xmm3, xmm3
	!pxor xmm3, xmm2
	!pmulhuw xmm0, xmm2
	!pmulhuw xmm1, xmm3
	!paddw xmm0, xmm1
	!psrlw xmm0, 8
	!packuswb xmm0, xmm0
	!movd eax, xmm0
	ProcedureReturn
EndProcedure
Procedure Color_Mix_pb(color1.l, color2.l, percent.l)
	r= (Red(color1)*percent + Red(color2)*(100-percent)) / 100
	g= (Green(color1)*percent + Green(color2)*(100-percent)) / 100
	b= (Blue(color1)*percent + Blue(color2)*(100-percent)) / 100
	ProcedureReturn RGB(r,g,b)
EndProcedure

Procedure.i ColorScale(ColA,ColB,n)

	;n*255
	;n/100
	ProcedureReturn  ( ((ColA&$FF)*n+(ColB&$FF)*(255-n))>>8 ) | ( (((ColA&$FF00)*n+(ColB&$FF00)*(255-n))>>8)&$FF00 ) | (((ColA>>8&$FF00)*n+(ColB>>8&$FF00)*(255-n)) & $FF0000)

EndProcedure

CompilerIf #PB_Compiler_Debugger
	MessageRequester("Notice:", "Please turn off the debugger for this test")
	End
CompilerEndIf

m.s=""
s=ElapsedMilliseconds()
For i=1 To #loop
	r=Color_Mix_pb(a,b,z)
Next
e=ElapsedMilliseconds()-s
m+"PB"+#TAB$+Str(e)+#TAB$+Hex(r)+#CR$

s=ElapsedMilliseconds()
For i=1 To #loop
	r=Color_Mix_asm(a,b,z)
Next
m+"ASM"+#TAB$+Str(ElapsedMilliseconds()-s)+#TAB$+Hex(r)+#CR$

s=ElapsedMilliseconds()
For i=1 To #loop
	r=Color_Mix_SSE2(a,b,z)
Next
m+"SSE2"+#TAB$+Str(ElapsedMilliseconds()-s)+#TAB$+Hex(r)+#CR$

s=ElapsedMilliseconds()
z_=z*255/100
For i=1 To #loop
	r=ColorScale(a,b,z*255/100); speeds up by using r=ColorScale(a,b,z_)
Next
m+"BIRD"+#TAB$+Str(ElapsedMilliseconds()-s)+#TAB$+Hex(r)

MessageRequester(": )",m)

Post by wilbert »

Michael Vogel wrote:I wrote a procedure some years ago which is as fast as the procedure 'Color_Mix_asm' (at least when scaling from 0 to 255, because 'n' does not need to be rescaled then)...
The best (fastest) approach depends a lot on what exactly you want to do.
Blending an array of pixels is a different problem compared to only two color values.
The range of the blend value (0 - 100), (0 - 255), (0 - 256) or (0.0 - 1.0) also makes a big difference.
If you need to do a lot of iterations with the same blend value it's indeed wise to pre-calculate like you are doing in your benchmark :)

Post by walbus »

Precalculating is a must wherever it is possible.
On a real picture it is always best to compare color1 with color2 first.
If color1 = color2, a function call is not needed.
This is also important for color distance calculations.
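
A minimal sketch of that pre-comparison, assuming the pixels are held in two one-dimensional arrays of 24-bit RGB values and `alpha` is in the range 0 to 256. `BlendArrays` and the array layout are assumptions made for this example; the blend expression is the masked macro from earlier in the thread, written inline.

```purebasic
; Sketch: blend src into dst, skipping pixels that are already identical.
; BlendArrays is a hypothetical name; alpha is assumed to be 0..256.
Procedure BlendArrays(Array dst.l(1), Array src.l(1), alpha)
  Protected i, last = ArraySize(dst())
  If ArraySize(src()) < last : last = ArraySize(src()) : EndIf
  For i = 0 To last
    If dst(i) <> src(i)   ; equal RGB colors blend to themselves, so skip the math
      dst(i) = ((((src(i) & $FF00FF) * alpha + (dst(i) & $FF00FF) * (256 - alpha)) >> 8) & $FF00FF) | ((((src(i) & $FF00) * alpha + (dst(i) & $FF00) * (256 - alpha)) >> 8) & $FF00)
    EndIf
  Next
EndProcedure
```

Whether the extra comparison pays off depends on how many pixels are actually equal in the images being blended.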
Last edited by walbus on Thu May 25, 2017 6:05 pm, edited 1 time in total.

Post by Michael Vogel »

wilbert wrote:If you need to do a lot of iterations with the same blend value it's indeed wise to pre-calculate like you are doing in your benchmark :)
If I were to precalculate, I would define a 256x256 matrix, which costs 64 KB but then needs only a few shift operations - but you're right, everything depends on the needs... :wink:
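
One possible shape of such a lookup table, purely as a sketch: `MulTable`, `InitMulTable` and `ColorScale_lut` are hypothetical names, and `n` is assumed to be in the range 0 to 255. Each entry stores a channel value already multiplied by a weight, so blending a channel becomes two lookups and one addition.

```purebasic
; Sketch of the 256x256 lookup idea (hypothetical names, n in 0..255).
; MulTable(c, n) holds c * n / 255.
Global Dim MulTable.a(255, 255)   ; 256 * 256 bytes = 64 KB

Procedure InitMulTable()
  Protected c, n
  For c = 0 To 255
    For n = 0 To 255
      MulTable(c, n) = (c * n) / 255
    Next
  Next
EndProcedure

Procedure.i ColorScale_lut(ColA, ColB, n)
  Protected m = 255 - n
  Protected r = MulTable(ColA & $FF, n) + MulTable(ColB & $FF, m)
  Protected g = MulTable((ColA >> 8) & $FF, n) + MulTable((ColB >> 8) & $FF, m)
  Protected b = MulTable((ColA >> 16) & $FF, n) + MulTable((ColB >> 16) & $FF, m)
  ProcedureReturn r | (g << 8) | (b << 16)
EndProcedure

InitMulTable()
Debug Hex(ColorScale_lut(#Green, #Blue, 127))
```

This trades multiplications for six memory reads per pixel, so it would need to be benchmarked before assuming it is faster.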

About my *255/100 line in the benchmark, I find it interesting that the following variants result in different timings:

The slowest:

Code: Select all

Procedure.i ColorScale(ColA,ColB,n)
	n*255
	n/100
	ProcedureReturn  ...
EndProcedure

For i=1 To #loop
	r=ColorScale(a,b,z)
Next
Slightly faster:

Code: Select all

Procedure.i ColorScale(ColA,ColB,n)
	n=n*255/100
	ProcedureReturn  ...
EndProcedure
The quickest:

Code: Select all

Procedure.i ColorScale(ColA,ColB,n)
	ProcedureReturn  ...
EndProcedure

For i=1 To #loop
	r=ColorScale(a,b,z*255/100)
Next
Even better:

Code: Select all

For i=1 To #loop
	r=ColorScale(a,b,z*2.55)
Next

Post by wilbert »

Michael Vogel wrote:If I were to precalculate, I would define a 256x256 matrix, which costs 64 KB but then needs only a few shift operations - but you're right, everything depends on the needs... :wink:

About my *255/100 line in the benchmark, I find it interesting that the following variants result in different timings:
Interesting idea, a 256x256 matrix with lookup. I wonder if in practice it would be faster or not.
Multiplication is very fast these days and with a lookup you have more memory access.

As for your *255/100, division is an operation which takes a lot of time. When you multiply by 2.55 you have no division 8)

Post by walbus »

For/Next is the slowest of all loop types, so it is unsuitable for speed optimization.
Last edited by walbus on Thu May 25, 2017 8:41 pm, edited 3 times in total.

Post by Michael Vogel »

wilbert wrote:As for your *255/100, division is an operation which takes a lot of time. When you multiply by 2.55 you have no division 8)
Yep, but I thought floating point would slow it down. More surprising is why 'n*255/100' is faster than 'n*255 : n/100', and why the same formula runs quicker outside the procedure than inside?!