HEX string generating - extreme fast

walbus
Addict
Posts: 929
Joined: Sat Mar 02, 2013 9:17 am

Post by walbus »

For the latest string encryption in QAES, PB's hex generation was far too slow.

I wrote this little routine to generate large hex strings very quickly.

Test it; it is very quick and works in both ASCII and Unicode builds.

Werner Albus - http://www.quick-aes-256.de - http://www.nachtoptik.de

Code: Select all

*buffer=AllocateMemory(1e6) ; buffer to encode
len_string_bytes=MemorySize(*buffer)
RandomData(*buffer, len_string_bytes)

; Lookup table: the 2-digit hex text for every byte value 00..FF
Repeat
  string_1$+RSet(Hex(i), 2, "0")
  i+1
Until i=256

time=ElapsedMilliseconds()

string$=Space(len_string_bytes*2) ; preallocate the result (2 characters per byte)
If SizeOf(character)>1 ; Unicode: 2 characters = 4 bytes, copied as one long
  Repeat
    PokeL(@string$+iii, PeekL(@string_1$+PeekA(*buffer+iiii)*4))
    iiii+1 : iii+4
  Until iiii=len_string_bytes
Else ; ASCII: 2 characters = 2 bytes, copied as one word
  Repeat
    PokeW(@string$+iii, PeekW(@string_1$+PeekA(*buffer+iiii)*2))
    iiii+1 : iii+2
  Until iiii=len_string_bytes
EndIf

Debug (ElapsedMilliseconds()-time)
Debug string$
time=ElapsedMilliseconds()

string$="" : i=0 ; PB variant: append the Hex() output byte by byte
Repeat
  string$+RSet(Hex(PeekA(*buffer+i)), 2, "0")
  i+1
Until i=len_string_bytes

Debug (ElapsedMilliseconds()-time)
Last edited by walbus on Sat Apr 23, 2016 10:18 pm, edited 2 times in total.
infratec
Always Here
Posts: 6817
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Post by infratec »

Hi,

if the debugger is enabled, you never get correct timings.
You should use MessageRequester() to output the time.

And << 1 is always faster than * 2,
and << 2 is faster than * 4.

And your Hex() version is slow because you keep appending to a string.
It's not Hex() itself that is so slow.

Better comparison:

Code: Select all

*buffer=AllocateMemory(1e6) ; Buffer to encoding
len_string_bytes=MemorySize(*buffer)
RandomData(*buffer, len_string_bytes)

Repeat
  string_1$+RSet(Hex(i), 2, "0")
  i+1
Until i=256


time=ElapsedMilliseconds()
string$=Space(len_string_bytes*2)
iiii=0
iii = 0
Repeat
  CompilerIf #PB_Compiler_Unicode
    PokeL(@string$+iii, PeekL(@string_1$+PeekA(*buffer+iiii) << 2))
    iii+4
  CompilerElse
    PokeW(@string$+iii, PeekW(@string_1$+PeekA(*buffer+iiii) << 1))
    iii+2
  CompilerEndIf
  iiii+1
Until iiii=len_string_bytes
MessageRequester("Info", Str(ElapsedMilliseconds()-time))
Debug string$


Debug "----"


time=ElapsedMilliseconds()
string$=Space(len_string_bytes*2)
iiii=0
iii = 0
Repeat
  PokeS(@string$ + iii, RSet(Hex(PeekA(*buffer+iiii)), 2, "0"))
  CompilerIf #PB_Compiler_Unicode
    iii + 4
  CompilerElse
    iii + 2
  CompilerEndIf
  iiii+1
Until iiii=len_string_bytes
MessageRequester("Info", Str(ElapsedMilliseconds()-time))
Debug string$
And switch off Debug for the real time :wink:

Bernd
Last edited by infratec on Mon Apr 18, 2016 10:27 pm, edited 1 time in total.

Post by walbus »

@Bernd
Getting the exact time is not important.
I have to support sizes up to about 2e9.
What matters is the result, which makes things tricky.

Regards, Werner
Last edited by walbus on Tue Apr 19, 2016 3:31 pm, edited 1 time in total.

Post by infratec »

No,

it takes 10 seconds.

Ok, your version is faster.
But the point is not Hex() :wink:

Post by walbus »

You must have a fast computer.
On my computer it seems to run endlessly, so the difference matters a lot.
Most users have computers like mine, I think.
Yep, if you want to go to the moon, you can use an old rocket, or you can ask Scotty to beam you there.
Last edited by walbus on Sat Apr 23, 2016 10:21 pm, edited 1 time in total.
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Post by wilbert »

Bin2Data (for turning a file into a DataSection) uses SSE2 to convert to hex.
http://www.purebasic.fr/english/viewtop ... 27&t=49196
It's a very fast method for a large amount of data.
Windows (x64)
Raspberry Pi OS (Arm64)

Post by wilbert »

Here's an even faster one (SSE2) :wink:

Code: Select all

DeclareModule HexData
  
  Declare.s HexData(*mem, size, lowercase = #False)
  
EndDeclareModule

Module HexData
  
  DisableDebugger
  EnableExplicit
  EnableASM
  
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    Macro rax : eax : EndMacro
    Macro rcx : ecx : EndMacro
    Macro rdx : edx : EndMacro
    Macro rsp : esp : EndMacro  
  CompilerEndIf
  
  Macro M_movdqa(r1, r2)
    !movdqa r1, r2
  EndMacro
  
  Macro M_movdqu(r1, r2)
    !movdqu r1, r2
  EndMacro
  
  Procedure.s HexData(*mem, size, lowercase = #False)
    Protected Dim b.u(size + 30)
    ; init xmm registers
    !mov eax, [p.v_lowercase]
    !shl eax, 29
    !and eax, 0x20000000
    !or eax, 0x0739300f
    !movd xmm0, eax
    !punpcklbw xmm0, xmm0
    !punpcklwd xmm0, xmm0
    !pshufd xmm2, xmm0, 00000000b
    !pshufd xmm3, xmm0, 01010101b
    !pshufd xmm4, xmm0, 10101010b
    !pshufd xmm5, xmm0, 11111111b
    ; load source, destination and size
    mov rax, [p.a_b]
    mov rcx, [p.v_size]
    mov rdx, [p.p_mem]
    add rcx, rdx
    shr rdx, 4
    shl rdx, 4
    ; backup xmm6 and xmm7
    M_movdqu ([rsp - 16], xmm6)
    M_movdqu ([rsp - 32], xmm7)
    ; main loop
    !.l_hexdata_loop:
    M_movdqa (xmm0, [rdx])
    !movdqa xmm1, xmm0
    !psrld xmm0, 4
    !pand xmm0, xmm2
    !pand xmm1, xmm2
    !por xmm0, xmm3
    !por xmm1, xmm3
    !movdqa xmm6, xmm0
    !movdqa xmm7, xmm1
    !pcmpgtb xmm6, xmm4
    !pcmpgtb xmm7, xmm4
    !pand xmm6, xmm5
    !pand xmm7, xmm5
    !paddb xmm0, xmm6
    !paddb xmm7, xmm1
    !movdqa xmm1, xmm0
    !punpcklbw xmm0, xmm7
    !punpckhbw xmm1, xmm7
    M_movdqu ([rax], xmm0)
    M_movdqu ([rax + 16], xmm1)
    add rdx, 16
    add rax, 32
    cmp rdx, rcx
    !jb .l_hexdata_loop
    ; restore xmm6 and xmm7
    M_movdqu (xmm6, [rsp - 16])
    M_movdqu (xmm7, [rsp - 32])
    *mem & 15 : b(*mem + size) = 0
    ProcedureReturn PeekS(@b(*mem), -1, #PB_Ascii)
  EndProcedure
  
EndModule



; test module
A.q = $0123456789abcdef
Debug HexData::HexData(@A, 8, #True)

Dim Buffer.b(100)
RandomData(@Buffer(), 100)
Debug HexData::HexData(@Buffer(), 100, #True)

Edit:
Newer versions of PureBasic use Unicode internally.
The code above creates the result in ASCII and converts it to Unicode when returning it.
While this is very fast on macOS, it seems to be slower on Windows and Linux.
If you are using Windows or Linux, you might want to try the Unicode version below and see if it is faster.

Code: Select all

; Unicode only !!!

DeclareModule HexData
  
  Declare.s HexData(*mem, size, lowercase = #False)
  
EndDeclareModule

Module HexData
  
  DisableDebugger
  EnableExplicit
  EnableASM
  
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    Macro rax : eax : EndMacro
    Macro rcx : ecx : EndMacro
    Macro rdx : edx : EndMacro
    Macro rsp : esp : EndMacro  
  CompilerEndIf
  
  Macro M_movdqa(r1, r2)
    !movdqa r1, r2
  EndMacro
  
  Macro M_movdqu(r1, r2)
    !movdqu r1, r2
  EndMacro
  
  Procedure.s HexData(*mem, size, lowercase = #False)
    Protected Dim b.l(size + 30)
    ; init xmm registers
    !mov eax, [p.v_lowercase]
    !shl eax, 29
    !and eax, 0x20000000
    !or eax, 0x0739300f
    !movd xmm0, eax
    !punpcklbw xmm0, xmm0
    !punpcklwd xmm0, xmm0
    !pshufd xmm2, xmm0, 00000000b
    !pshufd xmm3, xmm0, 01010101b
    !pshufd xmm4, xmm0, 10101010b
    !pshufd xmm5, xmm0, 11111111b
    ; load source, destination and size
    mov rax, [p.a_b]
    mov rcx, [p.v_size]
    mov rdx, [p.p_mem]
    add rcx, rdx
    shr rdx, 4
    shl rdx, 4
    ; backup xmm6 and xmm7
    M_movdqu ([rsp - 16], xmm6)
    M_movdqu ([rsp - 32], xmm7)
    ; main loop
    !.l_hexdata_loop:
    M_movdqa (xmm0, [rdx])
    !movdqa xmm1, xmm0
    !psrld xmm0, 4
    !pand xmm0, xmm2
    !pand xmm1, xmm2
    !por xmm0, xmm3
    !por xmm1, xmm3
    !movdqa xmm6, xmm0
    !movdqa xmm7, xmm1
    !pcmpgtb xmm6, xmm4
    !pcmpgtb xmm7, xmm4
    !pand xmm6, xmm5
    !pand xmm7, xmm5
    !paddb xmm0, xmm6
    !paddb xmm7, xmm1
    !movdqa xmm1, xmm0
    !punpcklbw xmm0, xmm7
    !punpckhbw xmm1, xmm7
    !pxor xmm7, xmm7
    !movdqa xmm6, xmm0
    !punpcklbw xmm0, xmm7
    !punpckhbw xmm6, xmm7
    M_movdqu ([rax], xmm0)
    M_movdqu ([rax + 16], xmm6)
    !movdqa xmm6, xmm1
    !punpcklbw xmm1, xmm7
    !punpckhbw xmm6, xmm7
    M_movdqu ([rax + 32], xmm1)
    M_movdqu ([rax + 48], xmm6)
    add rdx, 16
    add rax, 64
    cmp rdx, rcx
    !jb .l_hexdata_loop
    ; restore xmm6 and xmm7
    M_movdqu (xmm6, [rsp - 16])
    M_movdqu (xmm7, [rsp - 32])
    *mem & 15 : b(*mem + size) = 0
    ProcedureReturn PeekS(@b(*mem))
  EndProcedure  
  
EndModule



; test module
A.q = $0123456789abcdef
Debug HexData::HexData(@A, 8, #True)

Dim Buffer.b(100)
RandomData(@Buffer(), 100)
Debug HexData::HexData(@Buffer(), 100, #True)
Last edited by wilbert on Thu Nov 07, 2019 8:36 am, edited 5 times in total.

Post by walbus »

Many thanks wilbert

I'll install your code now !

Post by wilbert »

walbus wrote:I'll install your code now !
I updated the module code above with an additional lowercase flag.
Some people prefer lowercase, others uppercase; this way you avoid the extra time the PB conversion functions would take.
Little John
Addict
Posts: 4519
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Post by Little John »

How was it possible that I missed that code by wilbert? :-)
Thank you very much!
Lord
Addict
Posts: 847
Joined: Tue May 26, 2009 2:11 pm

Post by Lord »

Maybe I'm wrong, but shouldn't

Code: Select all

A.q = $0123456789abcdef
Debug HexData::HexData(@A, 8, #True)
give "0123456789abcdef"

and

Code: Select all

A.q = $0123456789abcdef
Debug HexData::HexData(@A, 16, #True)
return "00000000000000000123456789abcdef"
instead of
"efcdab89674523011700000000000000"?

Post by wilbert »

Lord wrote:Maybe I'm wrong, but shouldn't

Code: Select all

A.q = $0123456789abcdef
Debug HexData::HexData(@A, 8, #True)
give "0123456789abcdef"
No, Intel is little-endian. The least significant bytes are stored first.
So the 64-bit integer $0123456789abcdef is stored in memory as
ef cd ab 89 67 45 23 01

This procedure doesn't return a hex value but returns a string of all bytes of a memory area as they are stored in memory.
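
To see the byte order for yourself, here is a minimal sketch. It assumes a little-endian CPU such as x86/x64, and the variable names are only for illustration:

```purebasic
; Sketch: print the bytes of a quad in the order they sit in memory.
Define A.q = $0123456789abcdef
Define i, out$ = ""
For i = 0 To 7
  ; PeekA reads one unsigned byte; RSet left-pads the hex text to 2 digits
  out$ + RSet(Hex(PeekA(@A + i)), 2, "0") + " "
Next
Debug LCase(Trim(out$)) ; on a little-endian CPU: ef cd ab 89 67 45 23 01
```

Hex(A), by contrast, formats the numeric value, so it shows the most significant digits first.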

Post by Lord »

wilbert wrote:...
This procedure doesn't return a hex value but returns a string of all bytes of a memory area as they are stored in memory.
Oh, I see.
I must have missed the exit from hex-string-highway to hex-data-highway. :wink:

Post by wilbert »

I updated my module code above by adding a Unicode-only version, which might be faster on Windows and Linux.
swhite
Enthusiast
Posts: 726
Joined: Thu May 21, 2009 6:56 pm

Post by swhite »

Hi Wilbert

I tested your newer Unicode version in PB 5.71 on Windows, and the time for 100,000 iterations was 17 ms, so it is 4 ms faster than the previous version and about 30 times faster than my PB code. Thank you very much.

It would be helpful to have a PokeH() function as well that takes a hex string and converts it to the binary ASCII values. :)

Simon
Simon White
dCipher Computing
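
A rough sketch of what such a reverse helper could look like. PokeH() is not a PureBasic built-in; the name HexToMem and its return convention are made up here for illustration. The loop favours clarity over speed and does not validate that the characters are hex digits:

```purebasic
; Hypothetical helper (not part of PureBasic): writes the bytes encoded by hex$ to *mem.
; Returns the number of bytes written, or -1 if hex$ has an odd length.
; The caller must make sure *mem has room for Len(hex$) / 2 bytes.
Procedure.i HexToMem(*mem, hex$)
  Protected n = Len(hex$), i
  If n & 1 : ProcedureReturn -1 : EndIf
  For i = 0 To n / 2 - 1
    ; Val() parses a string with a "$" prefix as hexadecimal
    PokeA(*mem + i, Val("$" + Mid(hex$, i * 2 + 1, 2)))
  Next
  ProcedureReturn n / 2
EndProcedure

Define *buf = AllocateMemory(8)
If *buf
  Debug HexToMem(*buf, "efcdab8967452301") ; 8
  Debug Hex(PeekQ(*buf))                   ; 123456789ABCDEF on a little-endian CPU
  FreeMemory(*buf)
EndIf
```

For large inputs a table-driven or SSE2 approach like the ones earlier in the thread would likely be much faster; this version only shows the idea.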