HSL/HSV to RGB

Piero
Addict
Posts: 1166
Joined: Sat Apr 29, 2023 6:04 pm
Location: Italy

Re: HSL/HSV to RGB

Post by Piero »

wilbert wrote: Wed Sep 13, 2023 1:39 pm unfortunately I'm not familiar enough with it at the moment to write such code. :(
Thanks anyway; this seems to have started an interesting thread...
I hope Fred will implement some kind of superfast "hue" color mapping if possible; I think it would be very useful, e.g. for games...
SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

wilbert wrote:
While the conversion to and from YIQ colorspace does produce different results from what you might expect, it does do a better job in respecting the luminance of the source image when adjusting the hue (at least to my eyes)
Yes, that might be the reason! But anyway, I want to understand the code and know what it is doing and why.
I also want to use these basics for other transformations.

But there is a significant difference in the color transformation between your code and my classic code, which I can't explain.

I can't follow your code at all, because you are doing some optimizations that are good for speed but hard to understand,
especially the layout of the matrix. At the moment it looks to me as if you moved some entries to other positions to optimize for the
SSE instructions!

What I will try next:
I will try to recalculate the correct parameters from the individual vector-matrix multiplications. But when I do this on paper,
I always end up with some mistakes!

I guess this is the correct math for it, step by step:

Code: Select all

; RGB -> YIQ conversion:
; [ Y ]     [ 0.299   0.587   0.114 ] [ R ]
; [ I ]  =  [ 0.596  -0.275  -0.321 ] [ G ]
; [ Q ]     [ 0.212  -0.523   0.311 ] [ B ]

; Y = 0.299 *R + 0.587 *G + 0.114 *B
; I = 0.596 *R - 0.275 *G - 0.321 *B
; Q = 0.212 *R - 0.523 *G + 0.311 *B

; Hue Shift
; h = hue shift in degrees
; s = saturation multiplier
; v = value multiplier

; VSU = v * s * Cos(h*#PI/180)
; VSW = v * s * Sin(h*#PI/180)

; [ Y' ]     [ V   0    0   ] [ Y ]
; [ I' ]  =  [ 0  VSU  -VSW ] [ I ]
; [ Q' ]     [ 0  VSW   VSU ] [ Q ]

; Y' = V *Y + 0       + 0
; I' = 0    + VSU *I  - VSW *Q
; Q' = 0    + VSW *I  + VSU *Q

; YIQ -> RGB conversion:
; [ R' ]     [ 1   0.956   0.621 ] [ Y' ]
; [ G' ]  =  [ 1  -0.272  -0.647 ] [ I' ]
; [ B' ]     [ 1  -1.105   1.702 ] [ Q' ]

; R' = Y' + 0.956 *I' + 0.621 *Q'
; G' = Y' - 0.272 *I' - 0.647 *Q'
; B' = Y' - 1.105 *I' + 1.702 *Q'
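
As a quick sanity check of the two conversion matrices above, the following sketch converts pure red to YIQ and straight back without any hue shift. The variable names and the test color are chosen only for this illustration. Because the coefficients are rounded to three digits, the result should come back close to, but not exactly, 255, 0, 0.

Code: Select all

EnableExplicit

Define.f R, G, B, Y, I, Q

R = 255 : G = 0 : B = 0            ; pure red as test input (assumption for this example)

; RGB -> YIQ with the coefficients listed above
Y = 0.299 * R + 0.587 * G + 0.114 * B
I = 0.596 * R - 0.275 * G - 0.321 * B
Q = 0.212 * R - 0.523 * G + 0.311 * B

; YIQ -> RGB with no hue shift in between
Debug "R = " + StrF(Y + 0.956 * I + 0.621 * Q, 1)
Debug "G = " + StrF(Y - 0.272 * I - 0.647 * Q, 1)
Debug "B = " + StrF(Y - 1.105 * I + 1.702 * Q, 1)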

SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

Where is the Bug?

Here is step-by-step code for the transformation. But the result is far from correct!

Code: Select all


EnableExplicit

Procedure.l HueShift(Color.l, h.f, s.f, v.f)
    
  Protected.l R, G, B, R_, G_, B_
  Protected.f Y, I, Q, Y_, I_, Q_
  Protected.f VSU, VSW
  
  R = Red(Color)
  G = Green(Color)
  B = Blue(Color)
  
  Debug "RGB(" + Str(R) + ", " + Str(G) + ", " + Str(B) + ")"
;   Debug "R = " + Str(R)
;   Debug "G = " + Str(G)
;   Debug "B = " + Str(B)
 
  ; RGB -> YIQ conversion:
  ; [ Y ]     [ 0.299   0.587   0.114 ] [ R ]
  ; [ I ]  =  [ 0.596  -0.275  -0.321 ] [ G ]
  ; [ Q ]     [ 0.212  -0.523   0.311 ] [ B ]
  
  Y = 0.299 *R + 0.587 *G + 0.114 *B
  I = 0.596 *R - 0.275 *G - 0.321 *B
  Q = 0.212 *R - 0.523 *G + 0.311 *B
  
  Debug #Null$
  Debug "Y = " + StrF(Y, 3)
  Debug "I = " + StrF(I, 3)
  Debug "Q = " + StrF(Q, 3)

  ; Hue Shift
  ; h = hue shift in degrees
  ; s = saturation multiplier
  ; v = value multiplier
  
  VSU = v * s * Cos(h*#PI/180)
  VSW = v * s * Sin(h*#PI/180)
  
  Debug #Null$
  Debug "VSU = " + StrF(VSU, 3)
  Debug "VSW = " + StrF(VSW, 3)

  ; [ Y' ]     [ V   0    0   ] [ Y ]
  ; [ I' ]  =  [ 0  VSU  -VSW ] [ I ]
  ; [ Q' ]     [ 0  VSW   VSU ] [ Q ]
  
  Y_ = v *Y               ; + 0       + 0
  I_ = VSU *I  - VSW *Q
  Q_ = VSW *I  + VSU *Q
  
  Debug #Null$
  Debug "Y' = " + StrF(Y_, 3)
  Debug "I' = " + StrF(I_, 3)
  Debug "Q' = " + StrF(Q_, 3)

  ; YIQ -> RGB conversion:
  ; [ R' ]     [ 1   0.956   0.621 ] [ Y' ]
  ; [ G' ]  =  [ 1  -0.272  -0.647 ] [ I' ]
  ; [ B' ]     [ 1  -1.105   1.702 ] [ Q' ]
  
  R_ = Y_ + 0.956 *I_ + 0.621 *Q_
  G_ = Y_ - 0.272 *I_ - 0.647 *Q_
  B_ = Y_ - 1.105 *I_ + 1.702 *Q_
  
  If R_ >255
    R_ =255
  ElseIf R_ <0
    R_ = 0
  EndIf
  
  If G_ >255
    G_ =255
  ElseIf G_ <0
    G_ = 0
  EndIf
  
  If B_ >255
    B_ =255
  ElseIf B_ <0
    B_ = 0
  EndIf

    
  Debug #Null$
  Debug "RGB'(" + Str(R_) + ", " + Str(G_) + ", " + Str(B_) + ")"
 
  ProcedureReturn RGBA(R_, G_, B_, Alpha(Color))
EndProcedure

#PbFw_COL_Red         = 255          ; RGB(255,0,0)
#PbFw_COL_Green       = 65280        ; RGB(0,255,0)
#PbFw_COL_Blue        = 16711680     ; RGB(0,0,255)

#PbFw_COL_Yellow      = 65535        ; RGB(255,255,0) 
#PbFw_COL_Cyan        = 16776960     ; RGB(0,255,255) 


Debug "HueShift red, 120°"
HueShift(#PbFw_COL_Red, 120, 1, 1)
Debug "-----------------------------"
Debug #Null$
Debug "HueShift green, 120°"
HueShift(#PbFw_COL_Green, 120, 1, 1)
Debug "-----------------------------"

Debug #Null$
Debug "HueShift blue, 120°"
HueShift(#PbFw_COL_Blue, 120, 1, 1)
Debug "-----------------------------"

Debug #Null$
Debug "HueShift red, 180°"
HueShift(#PbFw_COL_Red, 180, 1, 1)
Debug "-----------------------------"

Debug #Null$
Debug "HueShift green, 180°"
HueShift(#PbFw_COL_Green, 180, 1, 1)
Debug "-----------------------------"

Debug #Null$
Debug "HueShift blue, 180°"
HueShift(#PbFw_COL_Blue, 180, 1, 1)

The Debug Result
HueShift red, 120°
RGB(255, 0, 0)

Y = 76.245
I = 151.980
Q = 54.060

VSU = -0.500
VSW = 0.866

Y' = 76.245
I' = -122.807
Q' = 104.589

RGB'(24, 42, 255)
-----------------------------

HueShift green, 120°
RGB(0, 255, 0)

Y = 149.685
I = -70.125
Q = -133.365

VSU = -0.500
VSW = 0.866

Y' = 149.685
I' = 150.560
Q' = 5.952

RGB'(255, 105, 0)
-----------------------------

HueShift blue, 120°
RGB(0, 0, 255)

Y = 29.070
I = -81.855
Q = 79.305

VSU = -0.500
VSW = 0.866

Y' = 29.070
I' = -27.753
Q' = -110.541

RGB'(0, 108, 0)
-----------------------------

HueShift red, 180°
RGB(255, 0, 0)

Y = 76.245
I = 151.980
Q = 54.060

VSU = -1.000
VSW = 0.000

Y' = 76.245
I' = -151.980
Q' = -54.060

RGB'(0, 153, 152)
-----------------------------

HueShift green, 180°
RGB(0, 255, 0)

Y = 149.685
I = -70.125
Q = -133.365

VSU = -1.000
VSW = 0.000

Y' = 149.685
I' = 70.125
Q' = 133.365

RGB'(255, 44, 255)
-----------------------------

HueShift blue, 180°
RGB(0, 0, 255)

Y = 29.070
I = -81.855
Q = 79.305

VSU = -1.000
VSW = 0.000

Y' = 29.070
I' = 81.855
Q' = -79.305

RGB'(58, 58, 0)
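
One way to read the debug output above is to look at where the three primaries sit in the I/Q plane. The sketch below (the helper name IQAngle is made up for this illustration) prints the angle of each primary using the same RGB -> YIQ coefficients. With these coefficients red should land at roughly 20°, blue at roughly 136° and green at roughly -118° (that is, 242°). So the primaries are not 120° apart in this plane, and a +120° rotation starting from red ends up near blue rather than green, which matches the RGB'(24, 42, 255) result.

Code: Select all

EnableExplicit

; Hypothetical helper: angle of a color in the I/Q plane, in degrees
Procedure.f IQAngle(R.f, G.f, B.f)
  Protected.f I, Q
  I = 0.596 * R - 0.275 * G - 0.321 * B
  Q = 0.212 * R - 0.523 * G + 0.311 * B
  ProcedureReturn Degree(ATan2(I, Q))   ; PureBasic's ATan2() takes (x, y): here x = I, y = Q
EndProcedure

Debug "red:   " + StrF(IQAngle(255, 0, 0), 1)
Debug "green: " + StrF(IQAngle(0, 255, 0), 1)
Debug "blue:  " + StrF(IQAngle(0, 0, 255), 1)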
wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

SMaag wrote: Wed Sep 13, 2023 5:18 pm I can't follow your code at all, because you are doing some optimizations that are good for speed but hard to understand,
especially the layout of the matrix. At the moment it looks to me as if you moved some entries to other positions to optimize for the
SSE instructions!
That is right. :wink:

Because the internal order of the color components is different on macOS and Windows, let's name them c1, c2, c3 and c4.
When you first load a pixel into an xmm register with movd, the 16 bytes from most significant to least significant are ...
00-00-00-00-00-00-00-00-00-00-00-00-c4-c3-c2-c1
The next step is to expand the bytes to words (punpcklbw). After that, the 8 words are
00-00 - 00-00 - c4-c3 - c2-c1
After the pshufd instructions to shuffle them

xmm0 = c02-c01 - c02-c01 - c02-c01 - c02-c01
xmm1 = c04-c03 - c04-c03 - c04-c03 - c04-c03

Matrix values

xmm2 = m42-m41 - m32-m31 - m22-m21 - m12-m11
xmm3 = m44-m43 - m34-m33 - m24-m23 - m14-m13

after 2 x pmaddwd and 1 paddd, the 4 dwords in xmm0 are

c4' - c3' - c2' - c1'

c1' = c01*m11 + c02*m12 + c03*m13 + c04*m14
c2' = c01*m21 + c02*m22 + c03*m23 + c04*m24
c3' = c01*m31 + c02*m32 + c03*m33 + c04*m34
c4' = c01*m41 + c02*m42 + c03*m43 + c04*m44

It is indeed a bit difficult to get the matrix values in the right positions of xmm2 and xmm3.
I hope I didn't swap matrix rows and columns in my explanation because I'm not very used to working with matrix multiplication.
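
The arithmetic behind one output component can also be written out in scalar form. This is only an illustration of what the two pmaddwd steps and the paddd compute for c1'. The procedure name, the test values and the 8.8 fixed-point scale (256 = 1.0) are assumptions for this example and are not taken from the actual SSE code.

Code: Select all

EnableExplicit

; Hypothetical scalar version of one output component.
; The parameters are longs here but are meant to hold 16-bit word values.
Procedure.l DotWords(c1.l, c2.l, c3.l, c4.l, m1.l, m2.l, m3.l, m4.l)
  Protected.l lo, hi
  lo = c1 * m1 + c2 * m2    ; partial sum pmaddwd forms from the xmm0 / xmm2 word pairs
  hi = c3 * m3 + c4 * m4    ; partial sum pmaddwd forms from the xmm1 / xmm3 word pairs
  ProcedureReturn lo + hi   ; paddd adds the two partial sums
EndProcedure

; Identity weight 256 (= 1.0 in the assumed 8.8 format) on the first component only
Debug DotWords(255, 0, 0, 0, 256, 0, 0, 0) >> 8   ; 255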
Windows (x64)
Raspberry Pi OS (Arm64)
SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

Now I found the basic problem of the calculation.

In the YIQ color space, a rotation of red by 120° should give green, and another 120° should give blue.

Y represents the rotation of the grey vector from 0..360°

Y = 0.299 *R + 0.587 *G + 0.114 *B
So Y is the formula for greyscale conversion, but the rotation is wrong because 0.299+0.587+0.114 = 1.0.
If R=255, G=255, B=255 we get the maximum result of 255, but the maximum rotation is 360. The correct
120°/240° offsets for green and blue are missing!

This color space stuff is so fucking complicated that you can't trust code which you just copy from the internet.
Here we are at the well-known programmer's knowledge dilemma (it's digital: 0 or 1).
If you have professional working code you need 0 knowledge, but if not, you need 100% knowledge.

Here is a much better description of color manipulation, from Paul Haeberli:
https://graficaobscura.com/matrix/index.html

Here is the Wikipedia link on who Paul Haeberli is:
https://en.wikipedia.org/wiki/Paul_Haeberli

I guess that, as he is a computer graphics specialist from the early days, we can trust his description.

Here is the short description of how to do hue rotation:

1. Rotate the grey vector into positive Z
2. Rotate the hue
3. Rotate the grey vector back into place

The resulting matrix will rotate the hue of the input RGB colors. A rotation of 120.0 degrees will exactly map Red into Green, Green into Blue and Blue into Red. This transformation has one problem, however, the luminance of the input colors is not preserved. This can be fixed with the following refinement:
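
For the simple rotation in steps 1 to 3 (not the luminance refinement), the three rotations can also be collapsed into a single rotation about the grey axis (1,1,1). The sketch below uses the standard axis-angle (Rodrigues) form for that axis. The procedure name is made up for this example and the code is not taken from matrix.c. With a right-handed rotation of 120°, pure red should come out as pure green, up to float rounding.

Code: Select all

EnableExplicit

; Hypothetical helper: rotate an RGB triple by 'deg' degrees about the grey axis (1,1,1)
Procedure HueRotateGreyAxis(deg.f, r.f, g.f, b.f, *outR.Float, *outG.Float, *outB.Float)
  Protected.f c, s, d, p, q
  c = Cos(Radian(deg))
  s = Sin(Radian(deg)) / Sqr(3.0)
  d = (1.0 + 2.0 * c) / 3.0     ; diagonal elements of the rotation matrix
  p = (1.0 - c) / 3.0 + s       ; off-diagonal elements, one cyclic direction
  q = (1.0 - c) / 3.0 - s       ; off-diagonal elements, the other direction
  *outR\f = d * r + q * g + p * b
  *outG\f = p * r + d * g + q * b
  *outB\f = q * r + p * g + d * b
EndProcedure

Define.f r, g, b
HueRotateGreyAxis(120, 255, 0, 0, @r, @g, @b)
Debug StrF(r, 1) + ", " + StrF(g, 1) + ", " + StrF(b, 1)   ; approximately 0, 255, 0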
SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

Now I have a demo which works nearly correctly. It is the original code from Paul Haeberli, matrix.c.
Thanks to PureBasic, the simpler C, it was easy to convert to PB.

One small error I could not find until now is the hue rotation angle. Rotating red by 120° has to give green, but it gives blue here.

UPDATE: now hue rotation is working correctly!
I changed xrotatemat, yrotatemat and zrotatemat to the standard 3D graphics matrices and adapted the direction in the hue code.
I also fixed some type mismatches in the matrix elements!


I guess that's code we can start to optimize!

For testing: go to Procedure Test() and activate the routine you want. I started with the simple hue rotation without saturation correction!

Code: Select all

; https://graficaobscura.com/matrix/index.html

; this file is a fork of matrix.c from Paul Haeberli from 1993
; https://graficaobscura.com/matrix/matrix.c

; 2023/09/14 S.Maag : translated to PureBasic

EnableExplicit

#RLUM = 0.3086    ; red luminance weight
#GLUM = 0.6094    ; green luminance weight
#BLUM = 0.0820    ; blue luminance weight

; Intel Little Endian RGBA-ByteOrder in Memory
#OFFSET_R = 0
#OFFSET_G = 1
#OFFSET_B = 2
#OFFSET_A = 3

; Big Endian Byte-Order in Memory
; #OFFSET_R = 3
; #OFFSET_G = 2
; #OFFSET_B = 1
; #OFFSET_A = 0

Structure TColorMatrix4f ; color transformation matrix, 4x4 single-precision floats
  m.f[0] ; array view of all 16 elements, index 0..15
  m00.f : m01.f : m02.f : m03.f ; 0..3
  m10.f : m11.f : m12.f : m13.f ; 4..7
  m20.f : m21.f : m22.f : m23.f ; 8..11
  m30.f : m31.f : m32.f : m33.f ; 12..15
EndStructure

Structure TColor
  cb.a[0]       ; Access as Array 4Bytes [0..3]                 
  Color.l      ; Access as 32Bit Long
EndStructure

Procedure applymatrix(*lptr, *mat.TColorMatrix4f, NoOfPixel.i)
 ; use a matrix to transform colors.

  Protected.i ir, ig, ib, r, g, b
  Protected *cptr.TColor ; unsigned char

  *cptr = *lptr 
  
  ; loop over all NoOfPixel pixels; the counter is decremented at the end of the loop body
  While NoOfPixel > 0
  	ir = *cptr\cb[#OFFSET_R]
  	ig = *cptr\cb[#OFFSET_G]
  	ib = *cptr\cb[#OFFSET_B]
  	
  	With *mat
    	r = ir * \m00 + ig * \m10 + ib * \m20 + \m30 
    	g = ir * \m01 + ig * \m11 + ib * \m21 + \m31 
    	b = ir * \m02 + ig * \m12 + ib * \m22 + \m32 
    EndWith
  
  	If r <0 : r = 0 : EndIf  
  	If r>255 : r = 255 : EndIf
  	  
  	If g <0 : g = 0 : EndIf 
  	If g >255 : g = 255 : EndIf 
  	  
  	If b <0 : b = 0 : EndIf 
  	If b >255 : b = 255 : EndIf 
  	  
  	*cptr\cb[#OFFSET_R] = r
  	*cptr\cb[#OFFSET_G] = g
  	*cptr\cb[#OFFSET_B] = b
  	*cptr + SizeOf(Long)
  	NoOfPixel-1
  Wend
  ProcedureReturn *mat  
EndProcedure

Procedure.i matrixmult_new(*A.TColorMatrix4f, *B.TColorMatrix4f, *C.TColorMatrix4f)
  Protected OUT.TColorMatrix4f
  
  OUT\m00 = *A\m00 * *B\m00  +  *A\m10 * *B\m01  +  *A\m20 * *B\m02  +  *A\m30 * *B\m03  
	OUT\m01 = *A\m01 * *B\m00  +  *A\m11 * *B\m01  +  *A\m21 * *B\m02  +  *A\m31 * *B\m03
	OUT\m02 = *A\m02 * *B\m00  +  *A\m12 * *B\m01  +  *A\m22 * *B\m02  +  *A\m32 * *B\m03
	OUT\m03 = *A\m03 * *B\m00  +  *A\m13 * *B\m01  +  *A\m23 * *B\m02  +  *A\m33 * *B\m03
	
	OUT\m10 = *A\m00 * *B\m10  +  *A\m10 * *B\m11  +  *A\m20 * *B\m12  +  *A\m30 * *B\m13
	OUT\m11 = *A\m01 * *B\m10  +  *A\m11 * *B\m11  +  *A\m21 * *B\m12  +  *A\m31 * *B\m13
	OUT\m12 = *A\m02 * *B\m10  +  *A\m12 * *B\m11  +  *A\m22 * *B\m12  +  *A\m32 * *B\m13
	OUT\m13 = *A\m03 * *B\m10  +  *A\m13 * *B\m11  +  *A\m23 * *B\m12  +  *A\m33 * *B\m13

	OUT\m20 = *A\m00 * *B\m20  +  *A\m10 * *B\m21  +  *A\m20 * *B\m22  +  *A\m30 * *B\m23
	OUT\m21 = *A\m01 * *B\m20  +  *A\m11 * *B\m21  +  *A\m21 * *B\m22  +  *A\m31 * *B\m23
	OUT\m22 = *A\m02 * *B\m20  +  *A\m12 * *B\m21  +  *A\m22 * *B\m22  +  *A\m32 * *B\m23
	OUT\m23 = *A\m03 * *B\m20  +  *A\m13 * *B\m21  +  *A\m23 * *B\m22  +  *A\m33 * *B\m23

	OUT\m30 = *A\m00 * *B\m30  +  *A\m10 * *B\m31  +  *A\m20 * *B\m32  +  *A\m30 * *B\m33
	OUT\m31 = *A\m01 * *B\m30  +  *A\m11 * *B\m31  +  *A\m21 * *B\m32  +  *A\m31 * *B\m33
	OUT\m32 = *A\m02 * *B\m30  +  *A\m12 * *B\m31  +  *A\m22 * *B\m32  +  *A\m32 * *B\m33
	OUT\m33 = *A\m03 * *B\m30  +  *A\m13 * *B\m31  +  *A\m23 * *B\m32  +  *A\m33 * *B\m33
  	
	CopyStructure(Out, *C, TColorMatrix4f)
	ProcedureReturn *C
EndProcedure
  
Procedure.i matrixmult (*a.TColorMatrix4f, *b.TColorMatrix4f, *c.TColorMatrix4f)
; multiply two matrices

  Protected  x, y
  Protected temp.TColorMatrix4f
  
;      For(y=0; y<4 ; y++)
;         For(x=0 ; x<4 ; x++) {
;             temp[y][x] = b[y][0] * a[0][x]
;                        + b[y][1] * a[1][x]
;                        + b[y][2] * a[2][x]
;                        + b[y][3] * a[3][x];
 
  For y=0 To 3 ; y<4 ; y++)
    For x=0 To 3 ; x<4 ; x++) {
      temp\m[y*4 + x] = *b\m[y*4 +0] * *a\m[0+x]  + *b\m[y*4 +1] * *a\m[4+x]  + *b\m[y*4 +2] * *a\m[8+x] + *b\m[y*4 +3] * *a\m[12+x]
    Next
  Next
  
  CopyStructure(temp, *c, TColorMatrix4f)               
  ProcedureReturn *c
EndProcedure
                 
Procedure.i identmat(*mat.TColorMatrix4f)
  ; make an identity matrix
  With *mat
    \m00 = 1 :  \m01 = 0 : \m02 = 0 : \m03 = 0
    \m10 = 0 :  \m11 = 1 : \m12 = 0 : \m13 = 0
    \m20 = 0 :  \m21 = 0 : \m22 = 1 : \m23 = 0
    \m30 = 0 :  \m31 = 0 : \m32 = 0 : \m33 = 1  
  EndWith
  
  ProcedureReturn *mat
EndProcedure

Procedure.i xformpnt(*mat.TColorMatrix4f, x.f, y.f, z.f, *tx.float, *ty.float, *tz.float)
  ; transform a 3D point using a matrix
  With *mat
    *tx\f = x * \m00 + y * \m10 + z * \m20 + \m30 
    *ty\f = x * \m01 + y * \m11 + z * \m21 + \m31 
    *tz\f = x * \m02 + y * \m12 + z * \m22 + \m32 
  EndWith
  
  ProcedureReturn *mat
EndProcedure

Procedure.i cscalemat(*mat.TColorMatrix4f , rscale.f ,gscale.f , bscale.f)
; make a color scale matrix
  Protected mmat.TColorMatrix4f
  
  With mmat
    \m00 = rscale : \m01 = 0      : \m02 = 0      : \m03 = 0   
    \m10 = 0      : \m11 = gscale : \m12 = 0      : \m13 = 0  
    \m20 = 0      : \m21 = 0      : \m22 = bscale : \m23 = 0   
    \m30 = 0      : \m31 = 0      : \m32 = 0      : \m33 = 1
  EndWith
  
  matrixmult(mmat, *mat, *mat)  
  ProcedureReturn *mat    
EndProcedure
  
Procedure.i lummat(*mat.TColorMatrix4f)
; make a luminance matrix
  Protected   mmat.TColorMatrix4f
  Protected.f rwgt, gwgt, bwgt

  rwgt = #RLUM
  gwgt = #GLUM
  bwgt = #BLUM
    
  With mmat
    \m00 = rwgt : \m01 = rwgt : \m02 = rwgt : \m03 = 0   
    \m10 = gwgt : \m11 = gwgt : \m12 = gwgt : \m13 = 0    
    \m20 = bwgt : \m21 = bwgt : \m22 = bwgt : \m23 = 0    
    \m30 = 0    : \m31 = 0    : \m32 = 0    : \m33 = 1
  EndWith
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat  
EndProcedure

Procedure.i saturatemat(*mat.TColorMatrix4f, sat.f)
; make a saturation matrix

  Protected mmat.TColorMatrix4f ; float mmat[4][4];
  Protected.f a, b, c, d, e, f, g, h, i
  Protected.f rwgt, gwgt, bwgt
  
  rwgt = #RLUM
  gwgt = #GLUM
  bwgt = #BLUM
  
  a = (1.0-sat)*rwgt + sat
  b = (1.0-sat)*rwgt
  c = (1.0-sat)*rwgt
  d = (1.0-sat)*gwgt
  e = (1.0-sat)*gwgt + sat
  f = (1.0-sat)*gwgt
  g = (1.0-sat)*bwgt
  h = (1.0-sat)*bwgt
  i = (1.0-sat)*bwgt + sat
    
  With mmat 
    \m00 = a  : \m01 = b  : \m02 = c : \m03 = 0   
    \m10 = d  : \m11 = e  : \m12 = f : \m13 = 0   
    \m20 = g  : \m21 = h  : \m22 = i : \m23 = 0
    \m30 = 0  : \m31 = 0  : \m32 = 0 : \m33 = 1
  EndWith 
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

Procedure.i offsetmat(*mat.TColorMatrix4f ,roffset.f ,goffset.f ,boffset.f)
  ; offset r, g, And b
  Protected mmat.TColorMatrix4f ; float mmat[4][4];
 
  With mmat
    \m00 = 1        : \m01 = 0        : \m02 = 0        : \m03 = 0
    \m10 = 0        : \m11 = 1        : \m12 = 0        : \m13 = 0
    \m20 = 0        : \m21 = 0        : \m22 = 1        : \m23 = 0
    \m30 = roffset  : \m31 = goffset  : \m32 = boffset  : \m33 = 1    
  EndWith
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

; HSV ColorSpace hue for a Saturation of 100% and a Value of 100%
; 0 = red : 60 = yellow : 120 = green : 180 = cyan : 240 = blue : 300 = Magenta
; red=HSV(0,1,1) yellow=HSV(60,1,1) green=HSV(120,1,1); blue=HSV(240,1,1) magenta=HSV(300,1,1) purple=HSV(300,1,0.5) white=HSV(0,0,1) grey=HSV(0,0,0.5)
; but it is left rotation system.
;
;                      +y grass green (90°)
;                      |
;                      |
;                      |
;                      |
; <------------------------------------------>
; -x cyan (180°)       |                 +x red (0°)
;                      |
;                      |
;                      |
;                      -y violet (270°)
;                        
;

Procedure.i xrotatemat(*mat.TColorMatrix4f, rs.f , rc.f)
  ; rotate about the x (red) axis
    
  ; S.Maag: changed rotation orientation to be compatible with the standard rotation matrix for 3D graphics 

  ; Rotation X                      
  ; | 1     0      0    0 |  
  ; | 0    cos   -sin   0 |  
  ; | 0    sin    cos   0 |  
  ; | 0     0      0    1 |  

  Protected mmat.TColorMatrix4f ; float mmat[4][4]
  
   With mmat    
    \m00 = 1.0
    \m01 = 0.0
    \m02 = 0.0
    \m03 = 0.0
    
    \m10 = 0.0
    \m11 = rc
    \m12 = -rs      ; org: rs
    \m13 = 0.0
    
    \m20 = 0.0
    \m21 = rs       ; org: -rs
    \m22 = rc
    \m23 = 0.0
    
    \m30 = 0.0
    \m31 = 0.0
    \m32 = 0.0
    \m33 = 1.0
  EndWith
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

Procedure.i yrotatemat(*mat.TColorMatrix4f, rs.f , rc.f)
  ; rotate about the y (green) axis
  
    ; S.Maag: changed rotation orientation to be compatible with the standard rotation matrix for 3D graphics 

    ; Rotation Y 
    ; | cos   0    sin   0 | 
    ; |  0    1     0    0 |
    ; |-sin   0    cos   0 | 
    ; |  0    0     0    1 | 

  Protected mmat.TColorMatrix4f ; float mmat[4][4]
  
  With mmat    
        
    \m00 = rc
    \m01 = 0.0
    \m02 = rs     ; org: -rs
    \m03 = 0.0
    
    \m10 = 0.0
    \m11 = 1.0
    \m12 = 0.0
    \m13 = 0.0
    
    \m20 = -rs    ; org: rs
    \m21 = 0.0
    \m22 = rc
    \m23 = 0.0
    
    \m30 = 0.0
    \m31 = 0.0
    \m32 = 0.0
    \m33 = 1.0
  EndWith
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

Procedure.i zrotatemat(*mat.TColorMatrix4f, rs.f , rc.f)
  ; rotate about the z (blue) axis
  Protected mmat.TColorMatrix4f ; float mmat[4][4]
  
    ; S.Maag: changed rotation orientation to be compatible with the standard rotation matrix for 3D graphics 

    ; Rotation Z (counterclockwise)
    ; |cos   -sin   0   0 | 
    ; |sin    cos   0   0 |  
    ; | 0      0    1   0 | 
    ; | 0      0    0   1 | 

  With mmat 
    \m00 = rc
    \m01 = -rs     ; org: rs
    \m02 = 0.0
    \m03 = 0.0
    
    \m10 = rs    ; org: -rs
    \m11 = rc     
    \m12 = 0.0
    \m13 = 0.0
    
    \m20 = 0.0
    \m21 = 0.0
    \m22 = 1.0
    \m23 = 0.0
    
    \m30 = 0.0
    \m31 = 0.0
    \m32 = 0.0
    \m33 = 1.0
  EndWith
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

Procedure.i zshearmat(*mat.TColorMatrix4f, dx.f, dy.f)
  ;shear z using x And y.
  Protected mmat.TColorMatrix4f ; float mmat[4][4]
  
  With mmat    
    \m00 = 1.0 : \m01 = 0.0 : \m02 = dx  : \m03 = 0.0
    \m10 = 0.0 : \m11 = 1.0 : \m12 = dy  : \m13 = 0.0
    \m20 = 0.0 : \m21 = 0.0 : \m22 = 1.0 : \m23 = 0.0
    \m30 = 0.0 : \m31 = 0.0 : \m32 = 0.0 : \m33 = 1.0
 EndWith
  
  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

Procedure.i simplehuerotatemat(*mat.TColorMatrix4f, rot.f)
  ; simple hue rotation. This changes luminance 
    Protected.f mag
    Protected.f xrs, xrc
    Protected.f yrs, yrc
    Protected.f zrs, zrc
    
    ; S.Maag: changed x and y rotation direction and 
    ; changed xrotmat, yrotmat, zrotmat to the 3D standard rotation matrix
; /* rotate the grey vector into positive Z */
    mag = Sqr(2.0)
    xrs = -1.0/mag    ; org: 1.0/mag
    xrc = 1.0/mag
    xrotatemat(*mat, xrs, xrc)

    mag = Sqr(3.0)
    yrs = 1.0/mag;    ; org: -1.0/mag
    yrc = Sqr(2.0)/mag
    yrotatemat(*mat, yrs, yrc)

; /* rotate the hue */
;     zrs = Sin(rot* #PI/180.0)
;     zrc = Cos(rot* #PI/180.0)
    zrs = Sin(Radian(rot))
    zrc = Cos(Radian(rot))
    zrotatemat(*mat, zrs, zrc)

; /* rotate the grey vector back into place */
    yrotatemat(*mat, -yrs, yrc)
    xrotatemat(*mat, -xrs, xrc)
    
    ProcedureReturn *mat
EndProcedure
  
Procedure.i huerotatemat(*mat.TColorMatrix4f, rot.f)
  ; rotate the hue, While maintaining luminance.
   
  Protected mmat.TColorMatrix4f ; float mmat[4][4]
  Protected.f mag
  Protected.f lx, ly, lz
  Protected.f xrs, xrc
  Protected.f yrs, yrc
  Protected.f zrs, zrc
  Protected.f zsx, zsy

  identmat(mmat)
  
  ; S.Maag: changed x and y rotation direction and 
  ; changed xrotmat, yrotmat, zrotmat to the 3D standard rotation matrix

; /* rotate the grey vector into positive Z */
  mag = Sqr(2.0)
  xrs = -1.0/mag  ; org: 1.0/mag 
  xrc = 1.0/mag
  xrotatemat(mmat, xrs, xrc)
  
  mag = Sqr(3.0)
  yrs = 1.0/mag   ; -1.0/mag
  yrc = Sqr(2.0)/mag
  yrotatemat(mmat, yrs, yrc)

; /* shear the space To make the luminance plane horizontal */
  xformpnt(mmat, #RLUM, #GLUM, #BLUM, @lx, @ly, @lz)
  zsx = lx/lz
  zsy = ly/lz
  zshearmat(mmat, zsx, zsy)

; /* rotate the hue */
;   zrs = Sin(rot* #PI/180.0)
;   zrc = Cos(rot* #PI/180.0)
  zrs = Sin(Radian(rot))
  zrc = Cos(Radian(rot))
  zrotatemat(mmat, zrs, zrc)

; /* unshear the space to put the luminance plane back */
  zshearmat(mmat, -zsx, -zsy)

; /* rotate the grey vector back into place */
  yrotatemat(mmat, -yrs, yrc)
  xrotatemat(mmat, -xrs, xrc)

  matrixmult(mmat, *mat, *mat)
  ProcedureReturn *mat
EndProcedure

; Create application window

Define mat.TColorMatrix4f
Define Event
Define *PixelBuffer, PixelCount
Define ticks1
Define t1
Define File.s

Procedure Test(*lpBuf, Pixels)
  Protected mat.TColorMatrix4f
  Protected.f rot, sat, v
  
  rot = GetGadgetState(5)
  sat = GetGadgetState(7)/100
  v = GetGadgetState(9)/100
  
  Debug "rotation = " + rot
  Debug "saturation = " +sat 
  
  identmat(mat)
  ;offsetmat(mat, -50.0, -50.0, -50.0)  ;	/* offset color */ 
  ;cscalemat(mat, 1.4, 1.5, 1.6)      ;	/* scale the colors */
  ;saturatemat(mat,  sat)            ; /* saturate by 2.0 */
  ;huerotatemat(mat, rot)           ; /* rotate the hue 10 */
  
  ; first we test the simple hue rotation (no luminance correction)
  simplehuerotatemat(mat, rot)
  ; printmat(mat)
  applymatrix(*lpBuf, mat, Pixels)
EndProcedure

UseJPEGImageDecoder()
UsePNGImageDecoder()

#WindowWith = 1210
#WindowHeight = 1000

#AreaWith = 600
#AreaHeight = 400

If OpenWindow(0, 0, 0, #WindowWith, #WindowHeight, "HSV Color Transform", #PB_Window_SystemMenu|#PB_Window_ScreenCentered)
  If CreateStatusBar(0, WindowID(0))
    AddStatusBarField(#PB_Ignore)
    StatusBarText(0, 0, "Nothing processed yet")
  EndIf
  
  ScrollAreaGadget(0, 0, 10, #AreaWith, #AreaHeight, 0, 0, 10, #PB_ScrollArea_Flat|#PB_ScrollArea_Center)
  ImageGadget(1, 0, 0, 0, 0, 0)
  CloseGadgetList()
  
  ScrollAreaGadget(2, #AreaWith + 2, 10, #AreaWith, #AreaHeight, 10, 0, 10, #PB_ScrollArea_Flat|#PB_ScrollArea_Center)
  ImageGadget(3, 0, 0, 0, 0, 0)
  CloseGadgetList()
  
  TextGadget(4, 12, 430, 98, 30, "Hue rotation")
  SpinGadget(5, 10, 450, 100, 30, -180, 360, #PB_Spin_Numeric)
  TextGadget(6, 132, 430, 98, 30, "Saturation")
  SpinGadget(7, 130, 450, 100, 30, 0, 200, #PB_Spin_Numeric)
  TextGadget(8, 252, 430, 98, 30, "Value")
  SpinGadget(9, 250, 450, 100, 30, 0, 200, #PB_Spin_Numeric)
  SetGadgetState(5, 0)
  SetGadgetState(7, 100)
  SetGadgetState(9, 100)

  ButtonGadget(10, 380, 450, 120, 30, "Apply transform")
  ButtonGadget(11, 780, 450, 120, 30, "Load image")  
  Repeat
    Event = WaitWindowEvent()
    If Event = #PB_Event_Gadget
      Select EventGadget()
        Case 10:
          If IsImage(0)
            If CreateImage(1, ImageWidth(0), ImageHeight(0), 32) And StartDrawing(ImageOutput(1))
              ; make a 32 bit copy of the loaded image
              DrawingMode(#PB_2DDrawing_AllChannels)
              DrawImage(ImageID(0), 0, 0)
              ; get the buffer address
              *PixelBuffer = DrawingBuffer()
              PixelCount = OutputHeight() * DrawingBufferPitch() >> 2
              ; set the transform matrix
              
              ; apply the transform matrix
              t1 = ElapsedMilliseconds()
              
              ;PixelCount = ColorTranform(@m, *PixelBuffer, *PixelBuffer, PixelCount) 
              Test(*PixelBuffer, PixelCount)
              t1 = ElapsedMilliseconds()-t1
              
              StopDrawing()
              SetGadgetState(3, ImageID(1))
              SetGadgetAttribute(2, #PB_ScrollArea_InnerWidth, ImageWidth(1))
              SetGadgetAttribute(2, #PB_ScrollArea_InnerHeight, ImageHeight(1))
              StatusBarText(0, 0, "Processed "+Str(PixelCount)+" pixels in "+Str(t1)+" ms")
            EndIf
          EndIf
        Case 11:
          File = OpenFileRequester("Select image file", "", "Image file | *.png;*.jpg;*.jpeg", 0)
          If File And LoadImage(0, File)
            SetGadgetState(1, ImageID(0))
            SetGadgetAttribute(0, #PB_ScrollArea_InnerWidth, ImageWidth(0))
            SetGadgetAttribute(0, #PB_ScrollArea_InnerHeight, ImageHeight(0))
          EndIf
      EndSelect
    EndIf
  Until Event = #PB_Event_CloseWindow
  
EndIf    

wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

SMaag wrote: Thu Sep 14, 2023 4:35 pm One small error I could not find until now is the hue rotation angle.
It now is just the opposite of the hue rotation from my code and the hue rotation in Affinity Photo 2 (the graphics application I'm using).
A hue rotation of -60 degrees in your test code is about the same as a rotation of +60 degrees in the other ones.
SMaag wrote: Thu Sep 14, 2023 4:35 pm I guess that's a code we can start to optimize!
I suppose it is possible to combine the transformations, like the author of the other matrix also did?
SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

@wilbert: could you test again? (See the previous post with the code.) I updated it. Now it looks OK! There is a bug in the original matrix.c! The rotation direction for hue is inverted. Red rotated by 120° becomes blue, not green! The problem was in the Z rotation matrix,
because the x, y and z matrices are inverted in rotation direction compared to the standard 3D matrices!
I changed the sin components in all 3 directions x, y, z to be compatible with standard 3D operations. Additionally I had to change the sign of the sin component at calculation time! See the comments in the code!

If the function is OK, I would integrate it into my Color or Image module and use SSE vector instructions from 3D operations.

The bottleneck of my previous code (68 ms) with float vectors was the RGB to float conversion!

Here is an example for testing this effect.
If R, G, B are converted to float it is slow; if R, G, B are INTs it is 5 times faster (50 ms vs. 11 ms).

Code: Select all

EnableExplicit

Define I, col.l, t1
Define.f r, g, b
Define.f ir, ig, ib   ; if ir, ig, ib is an INT the Code is 5 times faster!

#Pixels = 1920 * 1080

Structure TColorMatrix4f ; color transformation matrix, 4x4 single-precision floats
  m.f[0] ; array view of all 16 elements, index 0..15
  m00.f : m01.f : m02.f : m03.f ; 0..3
  m10.f : m11.f : m12.f : m13.f ; 4..7
  m20.f : m21.f : m22.f : m23.f ; 8..11
  m30.f : m31.f : m32.f : m33.f ; 12..15
EndStructure

Define mat.TColorMatrix4f

For I = 0 To 15
  mat\m[I] = 1  
Next


t1 = ElapsedMilliseconds()
For I = 1 To #Pixels
  col = I
  
  ir = Red(col)
  ig = Green(col)
  ib = Blue(col)
  
  With mat
    	r = ir * \m00 + ig * \m10 + ib * \m20 + \m30 
    	g = ir * \m01 + ig * \m11 + ib * \m21 + \m31 
    	b = ir * \m02 + ig * \m12 + ib * \m22 + \m32 
  EndWith

Next
t1 = ElapsedMilliseconds()-t1

MessageRequester("Time", Str(t1))

wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

SMaag wrote: Fri Sep 15, 2023 8:46 pm @wilbert: could you test again? (See the previous post with the code.) I updated it. Now it looks OK! There is a bug in the original matrix.c! The rotation direction for hue is inverted. Red rotated by 120° becomes blue, not green! The problem was in the Z rotation matrix.
A rotation of 120 or 240 looks okay now but a rotation of 60 or 180 doesn't.
A red color rotated +60 degrees should become yellow but it doesn't look that way.
wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

I tried to do an HSL to RGB conversion with integer math as well.
You have to make sure yourself that the input is within the correct range, but it is pretty fast I guess.

Code: Select all

Procedure HSLToRGB(h, s = 100, l = 50)
  
  ; h[0,360], s[0,100], l[0,100]
  
  Protected.l c, x, r, g, b
  
  c = l
  If l > 50
    c = 100 - c
  EndIf
  c*s
  l = 100*l - c
  c+c
  
  h = (h * 1118481) >> 10
  If h & $10000
    x = c
    c = ((~h & $ffff) * x) >> 16
  Else
    x = ((h & $ffff) * c) >> 16
  EndIf
  h >> 17
  
  If h < 1
    r = c+l : g = x+l : b = l
  ElseIf h = 1
    r = l : g = c+l : b = x+l
  Else
    r = x+l : g = l : b = c+l
  EndIf  
  
  r = (r*106955+$200000) >> 22
  g = (g*106955+$200000) >> 22
  b = (b*106955+$200000) >> 22  
  
  ProcedureReturn b<<16 | g<<8 | r
EndProcedure

or with inline C (looks to be slightly faster)

Code: Select all

Procedure HSLToRGB_C(h, s = 100, l = 50)
  
  ; h[0,360], s[0,100], l[0,100]
  
  !  long v_b, v_c, v_g, v_r, v_x;
  
  !  v_c = v_l;
  !  if (v_l > 50) v_c = 100 - v_c;
  !  v_c *= v_s;
  !  v_l = (100 * v_l) - v_c;
  !  v_c += v_c;
  
  !  v_h = (v_h * 1118481) >> 10;
  !  if (v_h & 0x10000) {
  !    v_x = v_c;
  !    v_c = ((~v_h & 0xffff) * v_x) >> 16;
  !  } else {
  !    v_x = ((v_h & 0xffff) * v_c) >> 16;
  !  }
  !  v_h = v_h >> 17;
  
  !  if (v_h < 1) {
  !    v_r = v_c + v_l;
  !    v_g = v_x + v_l;
  !    v_b = v_l;
  !  } else if (v_h == 1) {
  !    v_r = v_l;
  !    v_g = v_c + v_l;
  !    v_b = v_x + v_l;
  !  } else {
  !    v_r = v_x + v_l;
  !    v_g = v_l;
  !    v_b = v_c + v_l;
  !  }
  
  !  v_r = (v_r * 106955 + 0x200000) >> 22;
  !  v_g = (v_g * 106955 + 0x200000) >> 22;
  !  v_b = (v_b * 106955 + 0x200000) >> 22;
  
  !  v_h = (v_b<<16) | (v_g<<8) | v_r; 
  
  ProcedureReturn h
EndProcedure
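For anyone wondering about the magic numbers, here is the same algorithm restated as a standalone C function with the fixed-point steps written out as comments. The interpretation of the constants is a reading of the code above, not an official explanation, and `hsl_to_rgb_int` is just a name chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Integer HSL -> RGB, h in [0,360], s and l in [0,100].
   Returns the colour packed as 0x00BBGGRR, like the procedures above. */
static uint32_t hsl_to_rgb_int(int32_t h, int32_t s, int32_t l)
{
    int32_t c, x, r, g, b;

    /* The routine does no range checking, so the caller must do it. */
    assert(h >= 0 && h <= 360 && s >= 0 && s <= 100 && l >= 0 && l <= 100);

    c = (l > 50) ? 100 - l : l;   /* min(l, 100 - l)                     */
    c *= s;                       /* half the chroma, scale 0..5000      */
    l = 100 * l - c;              /* offset for all channels, 0..10000   */
    c += c;                       /* full chroma, scale 0..10000         */

    /* 1118481 is about 2^26 / 60, so after >> 10 the value is
       h / 60 in 16.16 fixed point: sextant in the high bits,
       position inside the sextant in the low 16 bits. */
    h = (h * 1118481) >> 10;
    if (h & 0x10000) {            /* odd sextant: falling ramp */
        x = c;
        c = ((~h & 0xffff) * x) >> 16;
    } else {                      /* even sextant: rising ramp */
        x = ((h & 0xffff) * c) >> 16;
    }
    h >>= 17;                     /* pair of sextants: 0, 1 or 2 */

    if (h < 1)       { r = c + l; g = x + l; b = l;     }
    else if (h == 1) { r = l;     g = c + l; b = x + l; }
    else             { r = x + l; g = l;     b = c + l; }

    /* 106955 is about 255 * 2^22 / 10000; 0x200000 is 0.5 in the
       same 22-bit scale, so this maps 0..10000 to 0..255 with rounding. */
    r = (r * 106955 + 0x200000) >> 22;
    g = (g * 106955 + 0x200000) >> 22;
    b = (b * 106955 + 0x200000) >> 22;

    return ((uint32_t)b << 16) | ((uint32_t)g << 8) | (uint32_t)r;
}
```

All intermediate products stay below 2^31 for inputs in the stated ranges, which is why 32-bit variables are enough.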
SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

I tried to adapt wilbert's SSE version from 128-bit to 256-bit SSE, processing 2 pixels at the same time.
It works, but the result is not what I expected!
The 256-bit version is 40x slower than the 128-bit version. That means the 256-bit SSE version is even slower than the classic version without the SSE extension.
I can't understand why!

Here is the modified code for the 128-bit and 256-bit versions of the ApplyTransform() function.
Copy this code into wilbert's demo project to test it!

Code: Select all

; An unbelievable effect! The 256-bit version, which processes 2 pixels at the same time, is 40x slower than the 128-bit version,
; which processes single pixels!
; 128-bit = 1.45 ms, 256-bit = 58 ms; I expected it to be faster than the 128-bit version!

Macro ASM_ColorTransform_SSE256(REGA, REGD, REGC) 
   ; Procedure ColorTranform(*Matrix.TColorMatrix, *InPixels, *OutPixels, NumPixels)
   
    !vpxor ymm5, ymm5, ymm5              ; ymm5 = 0
    !mov REGA, [p.p_Matrix]
    
     ;!movdqu xmm2, [REGA]
    !movdqu xmm0, [REGA]
    !VPSLLDQ ymm2, ymm0, 16
    !VPOR ymm2, ymm2, ymm0
    
    ;!movdqu xmm3, [REGA + 16]
    !movdqu xmm0, [REGA+16]
    !VPSLLDQ ymm3, ymm0, 16
    !VPOR ymm3, ymm3, ymm0
   
    ;!movdqu xmm4, [REGA + 32]
    !movdqu xmm0, [REGA+32]
    !VPSLLDQ ymm4, ymm0, 16
    !VPOR ymm4, ymm4, ymm0
    
    !mov REGA, [p.p_InPixels]
    !mov REGD, [p.p_OutPixels]
    
    !mov REGC, [p.v_NumPixels]
    !sub REGC, 2                 ; NumPixels - 2 (index of the last pixel pair)
    !jc .wend                     ; skip the loop if NumPixels < 2
    
    !.while:
    ;                             ; reciprocal throughput
    !movq xmm1, [REGA + REGC *4]  ; 1;    load pixel
    !vpunpcklbw ymm1, ymm1, ymm5         ; 1:    zero extend bytes to words (ymm5 = 0)
    !vpshufd ymm0, ymm1, 0         ; 1;    ymm0 [c1c0 c1c0 c1c0 c1c0]
    !vpshufd ymm1, ymm1, 01010101b ; 1;    shuffle =85 ymm1 [c3c2 c3c2 c3c2 c3c2]
    !vpmaddwd ymm0, ymm0, ymm2           ; 0.5;  multiply and add
    !vpmaddwd ymm1, ymm1, ymm3           ; 0.5;  multiply and add
    !vpaddd ymm0, ymm0, ymm1             ; 0.25; add together
    !vpaddd ymm0, ymm0, ymm4             ; 0.25; add constant
    !vpsrad ymm0, ymm0, 14               ; 0.5;  reduce to byte range
    !vpackssdw ymm0, ymm0, ymm0          ; 0.5;  convert 32s > 16s
    !vpackuswb ymm0, ymm0, ymm0          ; 0.5;  convert 16s > 8u
    !movq [REGD + REGC *4], xmm0  ; 1;    store pixel
    
    !sub REGC, 2
    !jnc .while
    !.wend:                      ; Sum=8.75 ticks; Avg=8.75/14 = 0.625 ticks per command
  
 EndMacro

 Macro ASM_ColorTransform_SSE128(REGA, REGD, REGC) 
   ; Procedure ColorTranform(*Matrix.TColorMatrix, *InPixels, *OutPixels, NumPixels)
   
    !pxor xmm5, xmm5              ; xmm5 = 0
    !mov REGA, [p.p_Matrix]
    !movdqu xmm2, [REGA]         ; Matrix Element m11..m24 = Line 1 and Line 2
    !movdqu xmm3, [REGA + 16]    ; Matrix Element m31..m44 = Line 3 and Line 4
    !movdqu xmm4, [REGA + 32]    ; Matrix Element a0..a3
    
    !mov REGA, [p.p_InPixels]
    !mov REGD, [p.p_OutPixels]
    
    !mov REGC, [p.v_NumPixels]
    !sub REGC, 1                  ; NumPixels - 1
    !jc .wend
    
    !.while:
    ;                             ; reciprocal throughput
    !movd xmm1, [REGA + REGC *4]  ; 1;    load pixel
    !punpcklbw xmm1, xmm5         ; zero extend bytes to words (xmm5 = 0)
    !pshufd xmm0, xmm1, 0         ; xmm0 [c1c0 c1c0 c1c0 c1c0]
    !pshufd xmm1, xmm1, 85        ; xmm1 [c3c2 c3c2 c3c2 c3c2]
    !pmaddwd xmm0, xmm2           ; multiply and add
    !pmaddwd xmm1, xmm3           ; multiply and add
    !paddd xmm0, xmm1             ; add together
    !paddd xmm0, xmm4             ; add constant
    !psrad xmm0, 14               ; reduce to byte range
    !packssdw xmm0, xmm0          ; convert 32s > 16s
    !packuswb xmm0, xmm0          ; convert 16s > 8u
    !movd [REGD + REGC *4], xmm0  ; 1;    store pixel
    
    !sub REGC, 1
    !jnc .while
    !.wend:                      ; Sum=8.75 ticks; Avg=8.75/14 = 0.625 ticks per command
  
 EndMacro
 
 Procedure ApplyTransform(*Matrix.TColorMatrix, *InPixels, *OutPixels, NumPixels)
  
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    ASM_ColorTransform_SSE128(RAX, RDX, RCX)
    ;ASM_ColorTransform_SSE256(RAX, RDX, RCX)
   
  CompilerElseIf #PB_Compiler_Processor = #PB_Processor_x32
    
    ASM_ColorTransform_SSE128(EAX, EDX, ECX)   
    ;ASM_ColorTransform_SSE256(EAX, EDX, ECX)   
    
  CompilerEndIf
  
  ProcedureReturn  NumPixels
EndProcedure
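For readers who find the SSE loop hard to follow, this is a scalar C model of the arithmetic it performs per pixel: a 4x4 multiply in 2.14 fixed point, an added constant, a shift by 14 and saturation to a byte. The struct layout and the names `ColorMatrixModel` and `apply_transform_scalar` are hypothetical; the real TColorMatrix stores its entries in the interleaved order that pmaddwd needs, which is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration: m[i][k] is the weight of input
   channel i for output channel k in 2.14 fixed point (1.0 == 16384),
   a[k] is an additive constant in the same scale. */
typedef struct {
    int16_t m[4][4];
    int32_t a[4];
} ColorMatrixModel;

/* Combined effect of packssdw followed by packuswb: clamp to 0..255. */
static uint8_t clamp_u8(int32_t v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

static void apply_transform_scalar(const ColorMatrixModel *mx,
                                   const uint8_t *in, uint8_t *out,
                                   size_t num_pixels)
{
    for (size_t p = 0; p < num_pixels; p++) {
        const uint8_t *c = in + 4 * p;          /* 4 bytes per pixel */
        for (int k = 0; k < 4; k++) {
            int32_t acc = mx->a[k];             /* paddd with constant */
            for (int i = 0; i < 4; i++)
                acc += c[i] * mx->m[i][k];      /* pmaddwd + paddd     */
            /* psrad 14: assumes >> on a negative int is an arithmetic
               shift, which holds for the usual compilers. */
            out[4 * p + k] = clamp_u8(acc >> 14);
        }
    }
}
```

With an identity matrix (16384 on the diagonal, zeros elsewhere) the output equals the input, which is a quick way to check that a matrix was set up in the right scale.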

wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

SMaag wrote: Sun Sep 17, 2023 6:08 am The 256-bit version is 40x slower than the 128-bit version. That means the 256-bit SSE version is even slower than the classic version without the SSE extension.
I can't understand why!
I also did some tests with AVX2 and it was also much slower.
But that wasn't the only problem I ran into. The instructions also don't work as I expected.

For example PACKUSWB xmm0, xmm0 packs the 8 signed words of the source register into the 8 lowest bytes of the destination register.
What I expected is that VPACKUSWB ymm0, ymm0 would pack the 16 signed words of ymm0 into the lowest 16 bytes of ymm0 but this isn't the case. The words occupying bits [127:0] are packed into bits [63:0] but bits [255:128] are packed into [191:128] instead of [127:64].
So simply adding a V in front of the opcodes like you did doesn't produce the right result.
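To make the lane behaviour concrete, here is a small scalar model in C of what the 256-bit pack does. It is only a model written for illustration (`model_vpackuswb_256` and `sat_u8` are made-up names), not code that uses the real instruction.

```c
#include <stdint.h>

/* Saturate a signed word to an unsigned byte, as PACKUSWB does. */
static uint8_t sat_u8(int16_t w)
{
    return (uint8_t)(w < 0 ? 0 : (w > 255 ? 255 : w));
}

/* Scalar model of VPACKUSWB dst, a, b on 256-bit registers.
   Each 128-bit lane is packed on its own: the 8 words of a's lane go
   to the low 8 bytes of that lane, the 8 words of b's lane to the
   high 8 bytes of the same lane. Nothing crosses the lane boundary. */
static void model_vpackuswb_256(const int16_t a[16], const int16_t b[16],
                                uint8_t dst[32])
{
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 8; i++) {
            dst[lane * 16 + i]     = sat_u8(a[lane * 8 + i]);
            dst[lane * 16 + 8 + i] = sat_u8(b[lane * 8 + i]);
        }
    }
}
```

When the same register is used for both sources, words 8..15 end up in bytes 16..23 and not in bytes 8..15. Getting them next to the first eight bytes would need an extra cross-lane permute (VPERMQ, for example).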

That's why I explored a different approach in my previous post
viewtopic.php?p=607263#p607263
The unpack and pack instructions in this code are working on two pixels at once.
Other instructions are duplicated to handle two pixels, but this still has the advantage that the CPU less often has to wait for the result of the previous instruction before it can proceed. In the end, this approach gives a nice performance boost while still using only SSE2 instructions.
wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

@SMaag
Maybe this is also interesting to you.
https://stackoverflow.com/questions/524 ... ate-filter
This is how Chromium's CSS does it.
It's different again: more like my original code, but with different weights and with the sign of the hue rotation angle switched.
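As a rough sketch, the matrix looks like this in C. The coefficients are the ones I recall from the W3C definition of hue rotation for SVG/CSS filters, so please check them against the linked answer before relying on them. `hue_rotate_matrix` is a made-up name, and the cosine and sine are passed in so that the snippet needs no math library.

```c
/* Build the 3x3 hue rotation matrix m from the cosine and sine of the
   angle. Coefficients quoted from memory of the W3C hueRotate
   definition (assumption); rows produce R', G', B' from R, G, B. */
static void hue_rotate_matrix(double cos_a, double sin_a, double m[3][3])
{
    static const double base[3][3] = {
        { 0.213, 0.715, 0.072 },
        { 0.213, 0.715, 0.072 },
        { 0.213, 0.715, 0.072 }
    };
    static const double cm[3][3] = {
        {  0.787, -0.715, -0.072 },
        { -0.213,  0.285, -0.072 },
        { -0.213, -0.715,  0.928 }
    };
    static const double sm[3][3] = {
        { -0.213, -0.715,  0.928 },
        {  0.143,  0.140, -0.283 },
        { -0.787,  0.715,  0.072 }
    };
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            m[i][j] = base[i][j] + cos_a * cm[i][j] + sin_a * sm[i][j];
}
```

At 0 degrees (cos = 1, sin = 0) this gives the identity matrix, and every row sums to 1 for any angle, so gray values are left unchanged.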
SMaag
Enthusiast
Posts: 353
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: HSL/HSV to RGB

Post by SMaag »

wilbert wrote: Mon Sep 18, 2023 7:44 am @SMaag
Maybe this is also interesting to you.
https://stackoverflow.com/questions/524 ... ate-filter
This is how chromium css does it.
I saw something similar in the .NET documentation too. They do it with a 5x5 matrix.

I have now tested your 2-pixel code. On my PC it's about 20% faster than the single-pixel code (1.15 ms vs. 1.45 ms).
The 2-pixel version is not that much faster! It shows that your single-pixel code is already very well optimized.

Until now I assumed that SSE is always faster than scalar code, but that is not always the case!
I guess the standard integer SSE operations are the best-optimized ones in the CPU, because that is where vendors try to win graphics benchmarks.

On the Intel community forum I found a reason for the speed difference between SSE, AVX and scalar code.
It seems to be a cache issue.
It is not at all unusual for AVX code on Sandy Bridge & Ivy Bridge to be slightly slower than SSE code for data that is not contained in the L1 cache.

For L1-contained data, AVX vector code was by far the fastest, followed by SSE vector code, followed by scalar code.

For L2-contained data, the SSE vector code was 2%-5% faster than the AVX vector code (as I expected), but the *scalar* code
was 10% to 40% faster than the SSE vector code and 15%-60% faster than the AVX vector code.

For L3-contained data (using a single threaded benchmark test), the SSE vector code was 3% to 14% faster than the AVX vector code,
but again the *scalar* code was fastest: 50%-60% faster than the SSE vector code and 60%-80% faster than the AVX vector code.

For data in local memory (using a single-threaded benchmark test), the SSE vector code was 1%-3% faster than the AVX vector code,
while the *scalar* code was 4%-9% faster than the SSE vector code and 7%-12% faster than the AVX vector code.
wilbert
PureBasic Expert
Posts: 3944
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: HSL/HSV to RGB

Post by wilbert »

SMaag wrote: Mon Sep 18, 2023 8:38 am On the Intel community forum I found a reason for the speed difference between SSE, AVX and scalar code.
It seems to be a cache issue.
That's interesting.