Packing bytes: a little help from my friends
Re: Packing bytes: a little help from my friends
After some thinking, maybe it's wiser to look first for speed enhancements that don't require asm.
I think there's quite some speed to gain just optimizing the basic code and that keeps everything working even on x64.
I think there's quite some speed to gain just optimizing the basic code and that keeps everything working even on x64.
Windows (x64)
Raspberry Pi OS (Arm64)
Raspberry Pi OS (Arm64)
Re: Packing bytes: a little help from my friends
netmaestro wrote: still looking for Linux.
Works on Linux!
Windows 11, Manjaro, Raspberry Pi OS


Re: Packing bytes: a little help from my friends
Here's my attempt to improve the speed a bit without using asm.
Code: Select all
;========================================================================================
;{ NEUQUANT COLORTABLE GENERATION SECTION
;========================================================================================
;
; NeuQuant Neural-Net Quantization Algorithm
; Copyright (c) 1994 Anthony Dekker
;
; NEUQUANT Neural-Net quantization algorithm by Anthony Dekker, 1994.
; See "Kohonen neural networks for optimal colour quantization"
; in "Network: Computation in Neural Systems" Vol. 5 (1994) pp 351-367.
; for a discussion of the algorithm.
; See also http://members.ozemail.com.au/~dekker/NEUQUANT.HTML
;
; Any party obtaining a copy of these files from the author, directly or
; indirectly, is granted, free of charge, a full and unrestricted irrevocable,
; world-wide, paid up, royalty-free, nonexclusive right and license to deal
; in this software and documentation files (the "Software"), including without
; limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
; and/or sell copies of the Software, and to permit persons who receive
; copies from any such party to do so, with the only requirement being
; that this copyright notice remain intact.
;
; Ported to PureBasic 4.31-4.40 by Luis on Oct 2009
; http://luis.no-ip.net
; The code in this include *should* be cross / platform. Tested on Windows 32/64 bit.
; Pfew! A lot or pointers' arithmetic to convert !
;
; Pseudo code usage example for PB:
;
; [ obtain a pointer to the color data bitmap in memory ]
;
; // please note the color data must be a sequence of BGRA quadruplets (32 bit/pixel) and that the A channel is ignored
; // allocated space would be = (4 * ImageWidth * ImageHeight)
;
; NQ_InitNetwork() ; initialize network
; NQ_Learn (*ImageBuf, MemorySize(*ImageBuf), [1..30]) ; learning
; NQ_UnbiasNetwork() ; network unbias
;
; // at this point the optimized palette has been calculated and it's in g_NQ_Network()
; // g_NQ_Network(0..255)\iBGRC[0] ; BLUE
; // g_NQ_Network(0..255)\iBGRC[1] ; GREEN
; // g_NQ_Network(0..255)\iBGRC[2] ; RED
;
; NQ_SortNetwork() ; optimize the palette order for the lookup phase
;
; // now you loop through every pixel in the original 24/32 bit image and retrieve the palette index to be used in the 8 bit image
;
; [read B, G, R data from source image ]
; iIndex = NQ_LookupPalette(B, G, R)
; [ write iIndex to the destination image ]
CompilerIf Defined(MAXLONG, #PB_Constant)=0
#MAXLONG = $7FFFFFFF
CompilerEndIf
CompilerIf Defined(RGBQUAD, #PB_Structure)=0
Structure RGBQUAD
rgbBlue.b
rgbGreen.b
rgbRed.b
rgbReserved.b
EndStructure
CompilerEndIf
Structure T_RGBTRIPLE
rgbtBlue.a
rgbtGreen.a
rgbtRed.a
EndStructure
Structure T_PIXEL
iBGRC.i[4]
EndStructure
Declare.i _NQ_Contest (b, g, r)
Declare _NQ_AlterSingle (alpha, i, b, g, r)
Declare _NQ_AlterNeigh (rad, i, b, g, r)
Declare NQ_InitNetwork ()
Declare NQ_UnbiasNetwork()
Declare NQ_SortNetwork()
Declare.i NQ_LookupPalette (b,g,r)
Declare.i NQ_Learn (*ImageBuffer.T_RGBTRIPLE, iBufferLen, iSample)
#NQ_netsize = 256 ; number of colours used
; four primes near 500 - assume no image has a length so large
; that it is divisible by all four primes
#NQ_prime1 = 499
#NQ_prime2 = 491
#NQ_prime3 = 487
#NQ_prime4 = 503
#NQ_minpicturebytes = 3 * #NQ_prime4 ; minimum size for input image
; Network Definitions
#NQ_maxnetpos = #NQ_netsize - 1
#NQ_netbiasshift = 4 ; bias for colour values
#NQ_ncycles = 100 ; no. of learning cycles
; Defs for freq and bias
#NQ_intbiasshift = 16 ; bias for fractions
#NQ_intbias = 1 << #NQ_intbiasshift
#NQ_gammashift = 10 ; gamma = 1024
#NQ_gamma = 1 << #NQ_gammashift
#NQ_betashift = 10
#NQ_beta = #NQ_intbias >> #NQ_betashift ; beta = 1/1024
#NQ_betagamma = #NQ_intbias << (#NQ_gammashift - #NQ_betashift)
; defs for decreasing radius factor
#NQ_initrad = #NQ_netsize >> 3 ; for 256 cols, radius starts
#NQ_radiusbiasshift = 6 ; at 32.0 biased by 6 bits
#NQ_radiusbias = 1 << #NQ_radiusbiasshift
#NQ_initradius = #NQ_initrad * #NQ_radiusbias
#NQ_radiusdec = 15 ; and decreases by a factor of 1/#NQ_radiusdec each cycle (originally was 30)
; defs for decreasing alpha factor
#NQ_alphabiasshift = 10 ; alpha starts at 1.0
#NQ_initalpha = 1 << #NQ_alphabiasshift
; radbias and alpharadbias used for radpower calculation
#NQ_radbiasshift = 8
#NQ_radbias = 1 << #NQ_radbiasshift
#NQ_alpharadbshift = #NQ_alphabiasshift + #NQ_radbiasshift
#NQ_alpharadbias = 1 << #NQ_alpharadbshift
Global Dim g_NQ_Network.T_PIXEL (#NQ_netsize - 1) ; the network itself
Global Dim g_NQ_NetIndex(256 - 1) ; for network lookup - really 256
Global Dim g_NQ_Bias (#NQ_netsize - 1) ; bias and freq arrays for learning
Global Dim g_NQ_Freq (#NQ_netsize - 1) ; bias and freq arrays for learning
Global Dim g_NQ_RadPower (#NQ_initrad - 1) ; radpower for precomputation
Procedure.i _NQ_Contest (b, g, r)
Protected i, dist, a, biasdist, betafreq
Protected bestpos, bestbiaspos, bestd, bestbiasd
Protected *p.Integer, *f.Integer, *n.T_PIXEL
; finds closest neuron (min dist) and updates freq
; finds best neuron (min dist-bias) and returns position
; for frequently chosen neurons, g_NQ_Freq(i) is high and g_NQ_Bias(i) is negative
; g_NQ_Bias(i) = gamma*((1/#NQ_netsize)-g_NQ_Freq(i))
bestd = #MAXLONG
bestbiasd = bestd
bestpos = -1
bestbiaspos = bestpos
*p = @g_NQ_Bias()
*f = @g_NQ_Freq()
For i = 0 To #NQ_netsize - 1
*n = @g_NQ_Network(i)
dist = *n\iBGRC[0] - b : If dist < 0 : dist = -dist : EndIf
a = *n\iBGRC[1] - g : If a < 0 : a = -a : EndIf
dist + a
a = *n\iBGRC[2] - r : If a < 0 : a = -a : EndIf
dist + a
If dist < bestd : bestd = dist : bestpos = i : EndIf
biasdist = dist - (*p\i >> (#NQ_intbiasshift - #NQ_netbiasshift))
If biasdist < bestbiasd : bestbiasd = biasdist: bestbiaspos = i : EndIf
betafreq = *f\i >> #NQ_betashift
*f\i - betafreq : *f + SizeOf(Integer)
*p\i + (betafreq << #NQ_gammashift) : *p + SizeOf(Integer)
Next
g_NQ_Freq(bestpos) + #NQ_beta
g_NQ_Bias(bestpos) - #NQ_betagamma
ProcedureReturn bestbiaspos
EndProcedure
; Move neuron i towards biased (b,g,r) by factor alpha
Procedure _NQ_AlterSingle (alpha, i, b, g, r)
Protected *n.T_PIXEL = @g_NQ_Network(i) ; alter hit neuron
*n\iBGRC[0] - (alpha * (*n\iBGRC[0] - b)) / #NQ_initalpha
*n\iBGRC[1] - (alpha * (*n\iBGRC[1] - g)) / #NQ_initalpha
*n\iBGRC[2] - (alpha * (*n\iBGRC[2] - r)) / #NQ_initalpha
EndProcedure
; Move adjacent neurons by precomputed alpha*(1-((i-j)^2/[r]^2)) in g_NQ_RadPower[|i-j|]
Procedure _NQ_AlterNeigh (rad, i, b, g, r)
Protected j, k, lo, hi, a
Protected *p.T_PIXEL, *q.Integer
lo = i - rad : If lo < -1 : lo = -1 : EndIf
hi = i + rad : If hi > #NQ_netsize : hi = #NQ_netsize : EndIf
j = i + 1
k = i - 1
*q = @g_NQ_RadPower()
While (j < hi) Or (k > lo)
*q + SizeOf(Integer) : a = *q\i
If j < hi
*p = @g_NQ_Network(j)
*p\iBGRC[0] - (a * (*p\iBGRC[0] - b)) / #NQ_alpharadbias
*p\iBGRC[1] - (a * (*p\iBGRC[1] - g)) / #NQ_alpharadbias
*p\iBGRC[2] - (a * (*p\iBGRC[2] - r)) / #NQ_alpharadbias
j + 1
EndIf
If k > lo
*p = @g_NQ_Network(k)
*p\iBGRC[0] - (a * (*p\iBGRC[0] - b)) / #NQ_alpharadbias
*p\iBGRC[1] - (a * (*p\iBGRC[1] - g)) / #NQ_alpharadbias
*p\iBGRC[2] - (a * (*p\iBGRC[2] - r)) / #NQ_alpharadbias
k - 1
EndIf
Wend
EndProcedure
; Initialise network in range (0,0,0) To (255,255,255) and set parameters
Procedure NQ_InitNetwork ()
Protected i, *p.T_PIXEL
For i = 0 To #NQ_netsize - 1
*p = @g_NQ_Network(i)
*p\iBGRC[0] = (i << (#NQ_netbiasshift + 8)) / #NQ_netsize
*p\iBGRC[1] = *p\iBGRC[0]
*p\iBGRC[2] = *p\iBGRC[0]
g_NQ_Freq(i) = #NQ_intbias/#NQ_netsize ; 1/netsize
g_NQ_Bias(i) = 0
Next
EndProcedure
; Unbias network to give byte values 0..255 and record position i to prepare for sort
Procedure NQ_UnbiasNetwork()
Protected i, j, t
For i = 0 To #NQ_netsize - 1
For j = 0 To 2
t = (g_NQ_Network(i)\iBGRC[j] + (1 << (#NQ_netbiasshift - 1))) >> #NQ_netbiasshift
If (t > 255) : t = 255 : EndIf
g_NQ_Network(i)\iBGRC[j] = t
Next
g_NQ_Network(i)\iBGRC[3] = i ; record colour no
Next
EndProcedure
; Insertion sort of network and building of g_NQ_NetIndex(0..255) (to do after unbias)
Procedure NQ_SortNetwork()
Protected i, j, smallpos, smallval
Protected *p.T_PIXEL, *q.T_PIXEL
Protected previouscol, startpos
previouscol = 0
startpos = 0
For i = 0 To #NQ_netsize - 1
*p = @g_NQ_Network(i)
smallpos = i
smallval = *p\iBGRC[1] ; index on g
; find smallest in i..#NQ_netsize - 1
For j = i+1 To #NQ_netsize - 1
*q = @g_NQ_Network(j)
If *q\iBGRC[1] < smallval ; index on g
smallpos = j
smallval = *q\iBGRC[1] ; index on g
EndIf
Next
*q = @g_NQ_Network(smallpos)
; swap p (i) and q (smallpos) entries
If i <> smallpos
Swap *q\iBGRC[0], *p\iBGRC[0]
Swap *q\iBGRC[1], *p\iBGRC[1]
Swap *q\iBGRC[2], *p\iBGRC[2]
Swap *q\iBGRC[3], *p\iBGRC[3]
EndIf
; smallval entry is now in position i
If smallval <> previouscol
g_NQ_NetIndex(previouscol) = (startpos+i) >> 1
For j = previouscol + 1 To smallval - 1
g_NQ_NetIndex(j) = i
Next
previouscol = smallval
startpos = i
EndIf
Next
g_NQ_NetIndex(previouscol) = (startpos + #NQ_maxnetpos) >> 1
For j = previouscol + 1 To 255
g_NQ_NetIndex(j) = #NQ_maxnetpos ; really 256
Next
EndProcedure
; Search for BGR values 0..255 (after net is unbiased) and return colour index
Procedure.i NQ_LookupPalette (b, g, r)
Protected i, j, dist, a, best, bestd
Protected *p.T_PIXEL
bestd = 1000 ; biggest possible dist is 256*3
best = -1
i = g_NQ_NetIndex(g) ; index on g
j = i - 1 ; start at g_NQ_NetIndex(g) and work outwards
While i < #NQ_netsize Or j >= 0
If i < #NQ_netsize
*p = @g_NQ_Network(i)
dist = *p\iBGRC[1] - g ; inx key
If dist >= bestd
i = #NQ_netsize ; stop iter
Else
i + 1
If dist < 0: dist = -dist : EndIf
a = *p\iBGRC[0] - b : If a < 0 : a = -a : EndIf
dist + a
If dist < bestd
a = *p\iBGRC[2] - r : If a < 0 : a = -a : EndIf
dist + a
If dist < bestd : bestd = dist : best = *p\iBGRC[3] : EndIf
EndIf
EndIf
EndIf
If j >= 0
*p = @g_NQ_Network(j)
dist = g - *p\iBGRC[1] ; inx key - reverse dif
If dist >= bestd
j = -1 ; stop iter
Else
j - 1
If dist < 0 : dist = -dist : EndIf
a = *p\iBGRC[0] - b : If a < 0 : a = -a : EndIf
dist + a
If dist < bestd
a = *p\iBGRC[2] - r : If a < 0 : a = -a : EndIf
dist + a
If dist < bestd : bestd = dist : best = *p\iBGRC[3] : EndIf
EndIf
EndIf
EndIf
Wend
ProcedureReturn best
EndProcedure
; Main Learning Loop
Procedure.i NQ_Learn (*ImageBuffer.T_RGBTRIPLE, iBufferLen, iSample)
Protected i, j, b, g, r
Protected radius, rad, alpha, iStep, delta, samplepixels, alphadec
Protected *lim
If iBufferLen < #NQ_minpicturebytes
ProcedureReturn 0
EndIf
alphadec = 30 + ((iSample - 1) / 3)
*lim = *ImageBuffer + iBufferLen
samplepixels = iBufferLen / (4 * iSample)
delta = samplepixels / #NQ_ncycles
alpha = #NQ_initalpha
radius = #NQ_initradius
If samplepixels = 0 Or delta = 0
ProcedureReturn 0
EndIf
rad = radius >> #NQ_radiusbiasshift
If rad <= 1 : rad = 0 : EndIf
For i = 0 To rad - 1
g_NQ_RadPower(i) = alpha * (((rad * rad - i * i) * #NQ_radbias) / (rad*rad))
Next
If ((iBufferLen % #NQ_prime1) <> 0)
iStep = 4 * #NQ_prime1
ElseIf ((iBufferLen % #NQ_prime2) <> 0)
iStep = 4 * #NQ_prime2
ElseIf ((iBufferLen % #NQ_prime3) <> 0)
iStep = 4 * #NQ_prime3
Else
iStep = 4 * #NQ_prime4
EndIf
i = 0
While (i < samplepixels)
b = (*ImageBuffer\rgbtBlue &$FF) << #NQ_netbiasshift
g = (*ImageBuffer\rgbtGreen &$FF) << #NQ_netbiasshift
r = (*ImageBuffer\rgbtRed &$FF) << #NQ_netbiasshift
j = _NQ_Contest(b,g,r)
_NQ_AlterSingle (alpha,j,b,g,r)
If (rad) : _NQ_AlterNeigh(rad,j,b,g,r) : EndIf ; alter neighbours
*ImageBuffer + iStep
If *ImageBuffer >= *lim : *ImageBuffer - iBufferLen : EndIf
i + 1
If i % delta = 0
alpha - (alpha / alphadec)
radius - (radius / #NQ_radiusdec)
rad = radius >> #NQ_radiusbiasshift
If rad <= 1 : rad = 0 : EndIf
For j = 0 To rad - 1
g_NQ_RadPower(j) = alpha * (((rad * rad - j * j) * #NQ_radbias) / (rad*rad))
Next
EndIf
Wend
ProcedureReturn 1
EndProcedure
;========================================================================================
;{ FLOYD-STEINBERG DITHERING RESIDENTS
;========================================================================================
; Copyright (c) 2009 the authors listed at the following URL, and/or
; the authors of referenced articles or incorporated external code:
; http://en.literateprograms.org/Floyd-Steinberg_dithering_(C)?action=history&offset=20080916082812
;
; Permission is hereby granted, free of charge, to any person obtaining
; a copy of this software and associated documentation files (the
; "Software"), to deal in the Software without restriction, including
; without limitation the rights to use, copy, modify, merge, publish,
; distribute, sublicense, and/or sell copies of the Software, and to
; permit persons to whom the Software is furnished to do so, subject to
; the following conditions:
;
; The above copyright notice and this permission notice shall be
; included in all copies or substantial portions of the Software.
;
; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
; IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
; CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
; TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
; SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
;
; Retrieved from: http://en.literateprograms.org/Floyd-Steinberg_dithering_(C)?oldid=14630
;
; Ported to PureBasic code by Lloyd Gallant (netmaestro) on October 1, 2009
;
Structure RGBQUAD_U ; Unsigned char RGBQUAD
rgbBlue.a
rgbGreen.a
rgbRed.a
rgbReserved.a
EndStructure
Structure PALETTE ; Used to index the colortable
index.RGBQUAD_U[256]
EndStructure
Structure COLORBITS_RGBQUAD ; Used to index the source colorbits
pixels.RGBQUAD_U[0]
EndStructure
Structure COLORBITS_1BYTE_PERPIXEL ; Used to index the target colorbits
pixels.a[0]
EndStructure
Macro plus_truncate_uchar(a, b)
If a+b<0 : a=0 : ElseIf a+b>255 : a=255 : Else : a+b : EndIf
EndMacro
Macro compute_disperse(channel)
error = (*colorbits\pixels[x + y*w]\channel - *palette\index[index]\channel ) >> 1 ; shift to soften
If (x + 1) < w : plus_truncate_uchar(*colorbits\pixels[(x+1) + (y+0)*w]\channel, (error*7) >> 4) : EndIf
If (y + 1) < h
If (x - 1) > 0
plus_truncate_uchar(*colorbits\pixels[(x-1) + (y+1)*w]\channel, (error*3) >> 4)
plus_truncate_uchar(*colorbits\pixels[(x+0) + (y+1)*w]\channel, (error*5) >> 4)
If (x + 1 < w) : plus_truncate_uchar(*colorbits\pixels[(x+1)+(y+1)*w]\channel,(error*1) >> 4) : EndIf
EndIf
EndIf
EndMacro
;}
Procedure.i CreatePalette_NeuQuant (*ImageBuf, iQuality=10)
; By luis Oct 2009
; Modified by netmaestro to accept buffer as
; parameter instead of image handle Oct 2013
Protected *Palette, k
Protected Dim tPaletteTable.RGBQUAD(256)
If *ImageBuf
NQ_InitNetwork()
If NQ_Learn(*ImageBuf, MemorySize(*ImageBuf), iQuality)
NQ_UnbiasNetwork()
For k = 0 To 255
With tPaletteTable(k)
\rgbBlue = g_NQ_Network(k)\iBGRC[0]
\rgbGreen = g_NQ_Network(k)\iBGRC[1]
\rgbRed = g_NQ_Network(k)\iBGRC[2]
\rgbReserved = 0
EndWith
Next
*Palette = AllocateMemory (256 * SizeOf(RGBQUAD))
CopyMemory(@tPaletteTable(), *Palette, MemorySize(*Palette))
EndIf
EndIf
ProcedureReturn *Palette
EndProcedure
;=============================================================
;{ Save as GIF Section
;=============================================================
Structure RGB
Blue.a
Green.a
Red.a
EndStructure
Structure COLORMAPINDEX
color.l
newindex.a
EndStructure
Structure GIF_HEADER
bytes.a[6]
EndStructure
Structure LOGICAL_SCREEN_DESCRIPTOR
Width.w
Height.w
PackedByte.a
BackgroundColorIndex.a
PixelAspectRatio.a
EndStructure
Structure GRAPHICS_CONTROL_EXTENSION
Sntroducer.a
Label.a
BlockSize.a
PackedByte.a
DelayTime.w
TransparentColorIndex.a
BlockTerminator.a
EndStructure
Structure IMAGE_DESCRIPTOR
Separator.a
ImageLeft.w
ImagTop.w
ImageWidth.w
ImageHeight.w
PackedByte.a
EndStructure
Structure code
code.l
size.l
EndStructure
DataSection
gifheader:
Data.a $47, $49, $46, $38, $39, $61
graphicscontrolextension:
Data.a $21, $F9, $04, $00, $00, $00, $00, $00
applicationextension:
Data.b $21, $FF, $0B, $4E, $45, $54, $53, $43, $41, $50, $45, $32, $2E, $30, $03, $01, $00, $00, $00
EndDataSection
UsePNGImageDecoder()
UseJPEGImageDecoder()
UseTIFFImageDecoder()
dither = 1
pattern$ = "PNG, BMP, JPEG, TIFF|*.png;*.bmp;*.jpg;*.jpeg;*.tiff|PNG (*.png)|*.png|BMP (*.bmp)|*.bmp|JPEG (*.jpg)|*.jpg|TIFF (*.tif)|*.tif"
inpath$ = OpenFileRequester("Choose an image to convert to GIF:","",pattern$, 0)
prompt$ = RemoveString(GetFilePart(inpath$),"."+GetExtensionPart(inpath$))
prompt$ +".gif"
If FileSize(inpath$) < 1
MessageRequester("Info:","No file selected. Ending...")
End
EndIf
If inpath$
hImageIn = LoadImage(#PB_Any, inpath$)
Else
MessageRequester("Error:", "Cannot open this image. Ending...")
End
EndIf
w=ImageWidth(hImageIn)
h=ImageHeight(hImageIn)
StartDrawing(ImageOutput(hImageIn))
If w*h<1600
*colorbits.COLORBITS_RGBQUAD = AllocateMemory(1600*4)
Else
*colorbits.COLORBITS_RGBQUAD = AllocateMemory(w*h*4)
EndIf
cc=0
For j=0 To h-1
For i=0 To w-1
With *colorbits\pixels[cc]
\rgbRed = Red(Point(i,j))
\rgbGreen = Green(Point(i,j))
\rgbBlue = Blue(Point(i,j))
EndWith
cc+1
Next
Next
StopDrawing()
*palette.PALETTE = CreatePalette_NeuQuant(*colorbits)
*bits8.COLORBITS_1BYTE_PERPIXEL = AllocateMemory(w*h)
NQ_SortNetwork() ; for NQ_LookupPalette to work properly
; Set and optionally dither the colorbits
For y = 0 To h-1
For x = 0 To w-1
index.a = NQ_LookupPalette (*colorbits\pixels[x + y*w]\rgbBlue, *colorbits\pixels[x + y*w]\rgbGreen, *colorbits\pixels[x + y*w]\rgbRed)
*bits8\pixels[x + y*w] = index
If dither
compute_disperse(rgbRed)
compute_disperse(rgbGreen)
compute_disperse(rgbBlue)
EndIf
Next
Next
Dim colors.l(255)
Numcolors = 256
CopyMemory(*palette, @colors(), MemorySize(*palette))
FreeMemory(*palette)
sz_colorbits = MemorySize(*bits8)
;Count actual colors used in the image
NewMap colormap.COLORMAPINDEX()
For i=*bits8 To *bits8+sz_colorbits-1
If AddMapElement(colormap(), RSet(Hex(PeekA(i),#PB_Byte),2,"0"), #PB_Map_ElementCheck)
colormap(RSet(Hex(PeekA(i),#PB_Byte),2,"0"))\color = colors(PeekA(i))
EndIf
Next
NumColors = MapSize(colormap())
;If needed, shrink colortable To actual colors used And remap the colorbits To the new table
If NumColors < 256
Dim colors2.l(MapSize(colormap())-1)
cc=0
ForEach colormap()
colors2(cc) = colormap()\color
colormap()\newindex = cc
cc+1
Next
For i=*bits8 To *bits8+sz_colorbits-1
PokeA(i, colormap(RSet(Hex(PeekA(i),#PB_Byte),2,"0"))\newindex)
Next
Else
Dim colors2.l(ArraySize(colors()))
CopyArray(colors(), colors2())
EndIf
FreeArray(colors())
Select NumColors
Case 1 To 3
bytes_colortable = 12
sz_colortable = 1
min_codesize = 2
Case 4 To 7
bytes_colortable = 24
sz_colortable = 2
min_codesize = 3
Case 8 To 15
bytes_colortable = 48
sz_colortable = 3
min_codesize = 4
Case 16 To 31
bytes_colortable = 96
sz_colortable = 4
min_codesize = 5
Case 32 To 63
bytes_colortable = 192
sz_colortable = 5
min_codesize = 6
Case 64 To 127
bytes_colortable = 384
sz_colortable = 6
min_codesize = 7
Case 128 To 256
bytes_colortable = 768
sz_colortable = 7
min_codesize = 8
EndSelect
pattern$ = "GIF (*.gif)|*.gif;"
outpath$ = SaveFileRequester("Choose a path to save the .gif file:",prompt$,pattern$, 0)
outpath$ = RemoveString(outpath$, ".gif")
If outpath$
outpath$ + ".gif"
If Not CreateFile(0, outpath$)
MessageRequester("Error:","Problem creating image file... ending.")
End
EndIf
Else
MessageRequester("","No save location chosen... ending.")
End
EndIf
*write_header.GIF_HEADER = AllocateMemory(SizeOf(GIF_HEADER))
CopyMemory(?gifheader, *write_header, SizeOf(GIF_HEADER))
WriteData(0, *write_header, MemorySize(*write_header))
*write_logical_screen_descriptor.LOGICAL_SCREEN_DESCRIPTOR = AllocateMemory(SizeOf(LOGICAL_SCREEN_DESCRIPTOR))
With *write_logical_screen_descriptor
\width = w
\height = h
\PackedByte = $80|(sz_colortable<<4)|sz_colortable
EndWith
WriteData(0, *write_logical_screen_descriptor, MemorySize(*write_logical_screen_descriptor))
*colortable = AllocateMemory(bytes_colortable)
*writeptr.RGB = *colortable
For i=0 To ArraySize(colors2())
*writeptr\Red = Red(colors2(i))
*writeptr\Green = Green(colors2(i))
*writeptr\Blue = Blue(colors2(i))
*writeptr+SizeOf(RGB)
Next
WriteData(0, *colortable, MemorySize(*colortable))
*write_graphics_control_extension.GRAPHICS_CONTROL_EXTENSION = AllocateMemory(SizeOf(GRAPHICS_CONTROL_EXTENSION))
CopyMemory(?graphicscontrolextension, *write_graphics_control_extension, SizeOf(GRAPHICS_CONTROL_EXTENSION))
WriteData(0, *write_graphics_control_extension, MemorySize(*write_graphics_control_extension))
*write_application_extension = AllocateMemory(27)
CopyMemory(?applicationextension, *write_application_extension, 19)
;WriteData(0, *write_application_extension, MemorySize(*write_application_extension))
*write_image_descriptor.IMAGE_DESCRIPTOR = AllocateMemory(SizeOf(IMAGE_DESCRIPTOR))
With *write_image_descriptor
\Separator = $2C
\ImageWidth = w
\ImageHeight = h
EndWith
WriteData(0, *write_image_descriptor, MemorySize(*write_image_descriptor))
; Encode the colorbits
Procedure EncodeBits(*buf.Long, *bits8.Ascii, min_codesize, num_colors)
#SizeOf2Char = SizeOf(character) << 1
Structure TwoChar
StructureUnion
s.s{2}
CompilerIf #PB_Compiler_Unicode
cc.l
CompilerElse
cc.w
CompilerEndIf
EndStructureUnion
EndStructure
Static Dim hex_bytes.TwoChar(255)
NewMap CodeTable.l(4096)
Protected ib.s{8192}
Protected *buf_.Long, *ib.TwoChar = @ib, *bits8end = *bits8 + MemorySize(*bits8)
Protected.l i, l, bitpos, codesize, clrcode = 1 << min_codesize
Protected.l endcode = clrcode + 1, nextcode = clrcode + 2
If hex_bytes(0)\cc = 0
For i = 0 To 255
hex_bytes(i)\s = Right(Hex(i|256), 2)
Next
EndIf
CodeTable("00") = $1000 : codesize = min_codesize + 1
i = 1 : While i < num_colors
CodeTable(hex_bytes(i)\s) = i : i + 1
Wend
*buf\l = clrcode
bitpos = codesize
While *bits8 < *bits8end
*ib\cc = hex_bytes(*bits8\a)\cc : *ib + #SizeOf2Char : *ib\cc = 0
i = CodeTable(ib)
If i = 0
*buf_ = *buf + bitpos >> 3 : *buf_\l | l << (bitpos & 7) : bitpos + codesize
CodeTable(ib) = nextcode
codesize + nextcode >> codesize : nextcode + 1
If codesize > 12
*buf_ = *buf + bitpos >> 3 : *buf_\l | clrcode << (bitpos & 7) : bitpos + 12
nextcode = endcode + 1
ClearMap(CodeTable())
CodeTable("00") = $1000 : codesize = min_codesize + 1
i = 1 : While i < num_colors
CodeTable(hex_bytes(i)\s) = i : i + 1
Wend
EndIf
*ib = @ib : i = *bits8\a : *ib\cc = hex_bytes(i)\cc : *ib + #SizeOf2Char : *ib\cc = 0
EndIf
l = i & $fff
*bits8 + 1
Wend
*buf_ = *buf + bitpos >> 3 : *buf_\l | l << (bitpos & 7) : bitpos + codesize
*buf_ = *buf + bitpos >> 3 : *buf_\l | endcode << (bitpos & 7) : bitpos + codesize
ProcedureReturn ((bitpos + 7) & -8) >> 3
EndProcedure
*buf = AllocateMemory(1024*1024*10)
*readbuf = *buf
cctotal = EncodeBits(*buf, *bits8, min_codesize, NumColors)
WriteByte(0, min_codesize)
While cctotal > 255
WriteByte(0, 255)
WriteData(0, *readbuf, 255)
*readbuf + 255
cctotal - 255
Wend
WriteByte(0, cctotal)
WriteData(0, *readbuf, cctotal)
WriteByte(0, $00)
WriteByte(0, $3B)
CloseFile(0)
;========================================================================
;
;========================================================================Windows (x64)
Raspberry Pi OS (Arm64)
Raspberry Pi OS (Arm64)
- netmaestro
- PureBasic Bullfrog

- Posts: 8452
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Re: Packing bytes: a little help from my friends
Thanks for working on it! A significant portion of the processing time is taken up by the neural network and that would be a major undertaking to optimize. However, the version you posted speeds up the encoding part quite a bit. In some larger more complex files the speed gain is quite marked. One file that takes around 810 ms to process with the original code takes 275 with this version. Now to take it through the full spectrum of testing and make sure it's solid. (but it seems so.)
BERESHEIT
Re: Packing bytes: a little help from my friends
You should especially check the larger image that didn't work when you tried switching the hex strings from 4 to 2 bytes.netmaestro wrote:Now to take it through the full spectrum of testing and make sure it's solid. (but it seems so.)
The updated code I posted also uses 2 bytes but builds the total string used for the map in a different way.
Windows (x64)
Raspberry Pi OS (Arm64)
Raspberry Pi OS (Arm64)
Re: Packing bytes: a little help from my friends
This might even further reduce the speed but needs some testing.
Code: Select all
Procedure EncodeBits(*buf.Long, *bits8.Ascii, min_codesize, num_colors)
#SizeOf2Char = SizeOf(character) << 1
Structure TwoChar
StructureUnion
s.s{2}
CompilerIf #PB_Compiler_Unicode
cc.l
CompilerElse
cc.w
CompilerEndIf
EndStructureUnion
EndStructure
Static Dim hex_bytes.TwoChar(255)
NewMap CodeTable.l(4096)
Protected ib.s{8192}
Protected *buf_.Long, *ib.TwoChar = @ib, *bits8end = *bits8 + MemorySize(*bits8)
Protected.l i, l, bitpos, codesize = min_codesize + 1, clrcode = 1 << min_codesize
Protected.l endcode = clrcode + 1, nextcode = clrcode + 2
If hex_bytes(0)\cc = 0
For i = 0 To 255
hex_bytes(i)\s = Right(Hex(i|256), 2)
Next
EndIf
*buf\l = clrcode
bitpos = codesize
If *bits8 <> *bits8end
l = *bits8\a: *ib\cc = hex_bytes(l)\cc : *ib + #SizeOf2Char : *bits8 + 1
While *bits8 < *bits8end
*ib\cc = hex_bytes(*bits8\a)\cc : *ib + #SizeOf2Char: *ib\cc = 0
i = CodeTable(ib)
If i = 0
*buf_ = *buf + bitpos >> 3 : *buf_\l | l << (bitpos & 7) : bitpos + codesize
CodeTable(ib) = nextcode
codesize + nextcode >> codesize : nextcode + 1
If codesize > 12
*buf_ = *buf + bitpos >> 3 : *buf_\l | clrcode << (bitpos & 7) : bitpos + 12
nextcode = endcode + 1: codesize = min_codesize + 1
ClearMap(CodeTable())
EndIf
*ib = @ib : i = *bits8\a : *ib\cc = hex_bytes(i)\cc : *ib + #SizeOf2Char
EndIf
l = i
*bits8 + 1
Wend
*buf_ = *buf + bitpos >> 3 : *buf_\l | l << (bitpos & 7) : bitpos + codesize
EndIf
*buf_ = *buf + bitpos >> 3 : *buf_\l | endcode << (bitpos & 7) : bitpos + codesize
ProcedureReturn ((bitpos + 7) & -8) >> 3
EndProcedureWindows (x64)
Raspberry Pi OS (Arm64)
Raspberry Pi OS (Arm64)
- netmaestro
- PureBasic Bullfrog

- Posts: 8452
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Re: Packing bytes: a little help from my friends
Still testing the first version, so far no failures. I briefly tested the last proc you posted for speed against its predecessor and found no significant improvement. As I'm just starting to understand what's going on with the first one, I'll stick with that. Your procedure-izing the EncodeBits routine is a help, I was actually engaged in doing just that when you posted your faster version. It's necessary for structure as I move to animation. It's going to be:
Code: Select all
write header
write logical screen descriptor
write graphics control extension
foreach image
write image descriptor
write local color table
write image data
next
write application extension
write comment extension
write trailer byteBERESHEIT
Re: Packing bytes: a little help from my friends
My last proc takes advantage of the fact that a single pixel lookup always returned the initial value.netmaestro wrote:Still testing the first version, so far no failures. I briefly tested the last proc you posted for speed against its predecessor and found no significant improvement.
Map("01") = 1
...
Map("09") = 9
...
Since the outcome is known, the initial colors don't have to be stored in the code table, only the ones that represent a pattern of multiple pixels.
It makes clearing the code table easier and while num_colors is still in the procedure declaration, it could be removed since it's no longer used in that version.
Speedwise a small improvement is to be expected in complex images where the code table is renewed a lot. Also it would be a bit easier to port to asm if needed but you already mentioned that other parts of the code consume more time.
Windows (x64)
Raspberry Pi OS (Arm64)
Raspberry Pi OS (Arm64)
- netmaestro
- PureBasic Bullfrog

- Posts: 8452
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Re: Packing bytes: a little help from my friends
Pored over this for a while now and came to a couple of conclusions.
One, I have to write my own decoder as the one from Localmotion34 uses windows imaging and also doesn't work quite right in some cases.
Two, in order to do that I have to do the encoding process backwards which means I have to understand the process fully. This is a problem with Wilbert's faster version because I haven't had any luck going backwards with it. As the faster version is only running 1.4 x faster than the SetBits version from Idle (and not 5 times faster or such) that's an acceptable compromise for me.
To that end I've written and tested a GetBits macro to read the bits that Idle's SetBits macro wrote. This is what I have:
My question is, is this as streamlined (fast) as it can be written? Or could it be optimized. I've been looking so hard and long at it I can't tell anymore. I consider myself lucky to have gotten it working at all 
One, I have to write my own decoder as the one from Localmotion34 uses windows imaging and also doesn't work quite right in some cases.
Two, in order to do that I have to do the encoding process backwards which means I have to understand the process fully. This is a problem with Wilbert's faster version because I haven't had any luck going backwards with it. As the faster version is only running 1.4 x faster than the SetBits version from Idle (and not 5 times faster or such) that's an acceptable compromise for me.
To that end I've written and tested a GetBits macro to read the bits that Idle's SetBits macro wrote. This is what I have:
Code: Select all
Macro Setbits(buf,value,bitcount) ; By Idle
*tb.long
*tb = buf + (pos >> 3)
*tb\l | (value << (pos & 7))
pos + bitcount
EndMacro
Macro GetBits(buf, var, bitcount) ; By Idle & netmaestro
*tb.Long
*tb = buf + (pos >> 3)
var = ( *tb\l & ( ( 1 << bitcount )-1 ) << ( pos & 7 ) ) >> ( pos & 7 )
pos + bitcount
EndMacro
*mem = AllocateMemory(100)
SetBits(*mem, 1019, 10)
SetBits(*mem, 1020, 10)
SetBits(*mem, 1021, 10)
SetBits(*mem, 1022, 10)
SetBits(*mem, 1023, 10)
SetBits(*mem, 2043, 11)
SetBits(*mem, 2044, 11)
SetBits(*mem, 2045, 11)
SetBits(*mem, 2046, 11)
SetBits(*mem, 2047, 11)
SetBits(*mem, 4091, 12)
SetBits(*mem, 4092, 12)
SetBits(*mem, 4093, 12)
SetBits(*mem, 4094, 12)
SetBits(*mem, 4095, 12)
SetBits(*mem, 7, 3)
pos=0
result.l
For i=1 To 5
GetBits(*mem, result, 10) : Debug result
Next
For i=1 To 5
GetBits(*mem, result, 11) : Debug result
Next
For i=1 To 5
GetBits(*mem, result, 12) : Debug result
Next
GetBits(*mem, result, 3) : Debug result
BERESHEIT
Re: Packing bytes: a little help from my friends
This one from my bitvector class may be better
Code: Select all
Macro GetBits(buf, pos, bitcount)
*tb.long
mask = ($FFFFFFFF >> (32-bitcount))
If *buf
*ti = *buf + ((pos)>>3)
shift = (index & $07)
ProcedureReturn (*tb\l & (mask << shift)) >> shift
EndIf
EndMacro
Windows 11, Manjaro, Raspberry Pi OS


- netmaestro
- PureBasic Bullfrog

- Posts: 8452
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Re: Packing bytes: a little help from my friends
It looks quite similar to one of my earlier tries, though not equal because mine didn't work. I guess it's meant to be called from inside a procedure? (there's a ProcedureReturn in it) And what is 'index'?
BERESHEIT
Re: Packing bytes: a little help from my friends
sorry wasn't awake there!
Code: Select all
Macro GetBits(buf,value,bitcount,pos)
*tb.long
mask = ($FFFFFFFF >> (32-bitcount))
If buf
*ti = buf + ((pos)>>3)
shift = (pos & $07)
value= (*tb\l & (mask << shift)) >> shift
EndIf
EndMacro
Windows 11, Manjaro, Raspberry Pi OS


Re: Packing bytes: a little help from my friends
Just like with the encoding routine, it could probably be optimized after you create a working version of a decoding routine.netmaestro wrote:This is a problem with Wilbert's faster version because I haven't had any luck going backwards with it.
Windows (x64)
Raspberry Pi OS (Arm64)
Raspberry Pi OS (Arm64)
- netmaestro
- PureBasic Bullfrog

- Posts: 8452
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Re: Packing bytes: a little help from my friends
Makes sense. When I have a working version using my GetBits routine I'll turn it over for "turbocharging" 
BERESHEIT
- Kwai chang caine
- Always Here

- Posts: 5502
- Joined: Sun Nov 05, 2006 11:42 pm
- Location: Lyon - France
Re: Packing bytes: a little help from my friends
Never i can thanks you, in all my life, if a day you can create a GIF Encoder/Decoder.NetMaestro wrote:Once this is finished KCC can have his Gif animator
If this day come, even if i'm a french "Frog eater"....I try to give them their thighs now
Excuse me to not have testing before, but currently i not really programming
so i have tested it in an old PC with XP PRO SP3 and v5.20NetMaestro wrote:Could you please test this code and see how crossplatform it is? And let me know of any bugs/failures.
After a long time, i have nice GIF.
She is nearly perfect, except the front of the nice woman, now have a paint like a sqaw

See yourself :

The original JPG french singer

And the result

And the code of WILBERT do the same thing.
Thank you to all of you masters, trying to give PB, this decoder that lacks
The happiness is a road...Not a destination