Hex2Dec (hex string to decimal)

Lunasole
Addict
Posts: 1091
Joined: Mon Oct 26, 2015 2:55 am
Location: UA

Post by Lunasole »

I guess there are already enough implementations of this, but I'm going to post one more ^^
It takes a hex string of any length and turns it into a byte array.

So in general it is not meant for something like "convert hex to a long". I made it to convert hash strings (which are very long, don't fit into any numeric type, and can vary in size) to their raw form.
I also had the idea of converting into an array of integers/quads and so on, but that is more complicated to do.

Here are 2 variants; both of them should work fine in Unicode and ASCII modes.

Short and clear version:

Code: Select all

; outdated, no longer relevant
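
The short version is not shown in the post. Since the later replies discuss Mid() and string concatenation, here is a minimal sketch of what such a version could look like. It is a hypothetical reconstruction, not the author's removed code; the name Hex2DecSimple is made up, and the return value follows the convention of the optimized version (highest array index).

```purebasic
; Hypothetical Mid()-based sketch, NOT the author's original short version.
; Out()    unsigned byte array that receives the result (resized here)
; Hex$     string with hex data
; RETURN:  highest index of Out()
Procedure Hex2DecSimple(Array Out.a(1), Hex$)
  Protected i, n = Len(Hex$)
  If n = 0                          ; empty input: keep a single zero byte
    ReDim Out(0)
    Out(0) = 0
    ProcedureReturn 0
  EndIf
  Protected last = ((n + 1) >> 1) - 1   ; two hex digits per byte, rounded up
  ReDim Out(last)
  For i = 0 To last
    ; "$" tells Val() to parse the digits as hexadecimal
    Out(i) = Val("$" + Mid(Hex$, i * 2 + 1, 2))
  Next
  ProcedureReturn ArraySize(Out())
EndProcedure

Dim Demo.a(0)
Debug Hex2DecSimple(Demo(), "C3DB14")   ; highest index
Debug Demo(0)                           ; first byte as a decimal number
```

It is easy to read, but every Mid() call and every concatenation creates a temporary string, which is what the optimized version avoids.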

Optimized version. For even more speed, use the assembler code by @wilbert or something like it ^^

Code: Select all

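; convert hex string into raw bytes [optimized]
; Out()		unsigned byte array that receives the result (resized by the procedure)
; Hex$		string with hex data
; RETURN:	byte values are placed in the Out() array, the highest index of Out() is returned
Procedure Hex2Dec (Array Out.a (1), Hex$)
	; scratch string: "$" (hex prefix for Val) plus room for two digits
	Protected t$ = "$  "
	; pointer to the current input character
	Protected *c.Character = @Hex$
	; pg = output index, p = digit position inside t$ (1 or 2)
	Protected pg, p = 1
	; highest output index: length rounded up to even, halved, minus 1 (0 for an empty string)
	Protected out_len = Len(Hex$) : out_len + out_len % 2 : out_len * 0.5 - Bool(out_len)
	ReDim Out(out_len)
	While *c\c
		If p > 2
			; two digits collected: store the byte and clear the digit slots
			Out(pg) = Val(t$)
			PokeC(@t$ + SizeOf(Character), 0)
			PokeC(@t$ + SizeOf(Character) * 2, 0)
			p = 1
			pg + 1
		EndIf
		; copy the current character into the next digit slot of t$
		PokeC(@t$ + p * SizeOf(Character), *c\c)
		p + 1
		*c + SizeOf(Character)
	Wend
	; store the last (possibly single-digit) byte
	Out(pg) = Val(t$)
	ProcedureReturn ArraySize(Out())
EndProcedure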


; test/example: Debug shows the highest index of Key(), i.e. the number of bytes minus 1
Dim Key.a(0)
Debug Hex2Dec(Key(), "")		; 0
Debug Hex2Dec(Key(), "F")		; 0
Debug Hex2Dec(Key(), "FFF")		; 1
Debug Hex2Dec(Key(), "FFFF")	; 1
Debug Hex2Dec(Key(), "FFFFF")	; 2
Debug Hex2Dec(Key(), "FFFFFF")	; 2
Last edited by Lunasole on Sat Feb 18, 2017 3:05 am, edited 7 times in total.
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"
RSBasic
Moderator
Posts: 1228
Joined: Thu Dec 31, 2009 11:05 pm
Location: Gernsbach (Germany)

Post by RSBasic »

Very useful

Post by Lunasole »

Fixed it to work correctly in ASCII mode and removed the second variant that used a memory pointer.
I also shortened it a bit more ^^
walbus
Addict
Posts: 929
Joined: Sat Mar 02, 2013 9:17 am

Post by walbus »

@Lunasole
Hi, try this tricky little piece of code; it is part of my quick-aes-256.

I think it is more than a thousand times faster, possibly much more :shock:

Code: Select all

Procedure Hex2bin(*source, *destination, length)
  Protected i, ii, iii, length_minus_1=length-1
  Static only_one
  
  If length<1 : ProcedureReturn 0 : EndIf 
  
  CompilerIf #PB_Compiler_Unicode
    
    If Not only_one
      only_one=1
      Static Dim hex_field.l(255)
      For i=0 To 255
        string$=RSet(Hex(i), 2, "0")
        hex_field(i)=PeekL(@string$)
      Next i
    EndIf
    For i=0 To length_minus_1 Step 4
      For ii=0 To 255
        If PeekL(*source+i)=hex_field(ii)
          PokeA(*destination+iii, ii)
          iii+1
          Break
        EndIf
      Next ii
    Next i
    
  CompilerElse
    
    If Not only_one
      only_one=1
      Static Dim hex_field.w(255)
      For i=0 To 255
        string$=RSet(Hex(i), 2, "0")
        hex_field(i)=PeekW(@string$)
      Next i
    EndIf
    For i=0 To length_minus_1 Step 2
      For ii=0 To 255
        If PeekW(*source+i)=hex_field(ii)
          PokeA(*destination+iii, ii)
          iii+1
          Break
        EndIf
      Next ii
    Next i
    
  CompilerEndIf
  
  ProcedureReturn length
EndProcedure

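; Test. The lookup table is built with Hex(), which produces upper-case digits,
; so the input has to be upper-case as well or no pair will ever match.
a$="C3DB14C065F55203B033C81A697B97C5"
For i=1 To 18 : a$+a$ : Next i
Beep_(400,400) ; Windows API: beep before the conversion starts
*buffer=AllocateMemory(StringByteLength(a$))
hex2bin(@a$, *buffer, MemorySize(*buffer))
Beep_(800,400) ; beep again once it has finished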

Post by Lunasole »

walbus wrote: @Lunasole
Hi, try this tricky little piece of code; it is part of my quick-aes-256.

I think it is more than a thousand times faster, possibly much more :shock:
It looks that way ^^ But I got something like an infinite loop when running it (an unrealistically large value is passed as the Hex2bin length argument, and it just keeps counting in the iii/ii loops).

As for the performance of my variant: it takes ~500 ms to convert 100,000 MD5 hashes on a very, very OLD processor (weaker than the oldest Pentium 4). On a modern CPU that should be practically instant, so its performance is fine for most cases, and I didn't find a reason to optimize it further (which would make the code longer and more complex).
wilbert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Post by wilbert »

If it's fast enough, it's fast enough :)
The main issue with Mid is that it's slow on large strings but if you only need to convert MD5 hashes, it's not a big problem.
If you would need to convert hex strings with millions of characters, there are better approaches but if the job you need to do only takes 500 msec on a P4, that seems fast enough :wink:
Windows (x64)
Raspberry Pi OS (Arm64)

Post by walbus »

Yep, it's the Mid function.
It gets progressively slower as the string length increases.

But I use an ASM routine from wilbert myself :wink:

In my quick-aes-256, the suite's editor gadget converts hex-based AES-encrypted text files to binary 'on demand', both small and very large files.

Post by Lunasole »

wilbert wrote: The main issue with Mid is that it's slow on large strings but if you only need to convert MD5 hashes, it's not a big problem.
Hm, I didn't know about that. I just tested it, and the speed of mid(string, 1) really is very different from mid(string, 10000).
That's damn strange. I thought that Val() was the slow part and that Mid() simply jumped to the right place in the string, the way you would with a pointer (base address + string position * char size).

Thanks, then there is something to optimize here ^^
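
For anyone who wants to repeat that test, a rough timing sketch could look like the one below. The string length and loop count are arbitrary choices, it should be run with the debugger disabled, and the numbers will differ between machines and possibly between PB versions.

```purebasic
; Rough timing sketch; sizes and names are arbitrary assumptions.
Define s$ = RSet("", 20000, "F")   ; 20000-character test string
Define i, t1, t2, t3, tmp$

t1 = ElapsedMilliseconds()
For i = 1 To 1000000
  tmp$ = Mid(s$, 1, 2)             ; two characters from the start of the string
Next
t2 = ElapsedMilliseconds()
For i = 1 To 1000000
  tmp$ = Mid(s$, 10000, 2)         ; two characters from far into the string
Next
t3 = ElapsedMilliseconds()

MessageRequester("Mid timing", "pos 1: " + Str(t2 - t1) + " ms, pos 10000: " + Str(t3 - t2) + " ms")
```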

Post by wilbert »

walbus wrote:But, i self use a ASM routine from wilbert :wink:
Did I post it somewhere on the forum ?
I couldn't find it myself.
Lunasole wrote: Hm, I didn't know about that. I just tested it, and the speed of mid(string, 1) really is very different from mid(string, 10000).
That's damn strange. I thought that Val() was the slow part and that Mid() simply jumped to the right place in the string, the way you would with a pointer (base address + string position * char size).
No, PB doesn't store string lengths so if you use Mid with a starting position of 10000, it needs to scan all 9999 characters before that to see if it doesn't encounter a 0 byte which terminates the string.
You are jumping two characters at a time so if you need to work with a 16 character hex string, it needs to scan 0 + 2 + 4 + 6 + 8 + 10 + 12 + 14 = 56 additional characters compared to straight jumping to the right position. You can imagine how the number grows for larger strings.
So in a loop like the example you posted, it makes a huge difference.
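
The counting argument above can be checked with a few lines of PB. This is only a sketch of the arithmetic (ExtraScan is a made-up name) and does not measure Mid() itself. For an even length n the sum 0 + 2 + ... + (n - 2) equals n * (n - 2) / 4, so the extra work grows roughly with the square of the string length.

```purebasic
; Sum of the characters skipped when stepping through a string two characters
; at a time and re-scanning from the start on every step.
; ExtraScan is a hypothetical helper name used only for this illustration.
Procedure.q ExtraScan(length.q)
  Protected pos.q, total.q
  For pos = 0 To length - 2 Step 2
    total + pos                  ; characters in front of the current pair
  Next
  ProcedureReturn total
EndProcedure

Debug ExtraScan(16)   ; 56, the figure from the post above
Debug ExtraScan(32)   ; 240, an MD5 hex string
```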

Post by walbus »

It looks like PB always searches for the string terminator at the end of the string first.
So each Mid call must also find the end of the string first.
PB's string handling is OK but very slow, I think.
The only way to get speed is to use other, trickier approaches.

Post by walbus »

@wilbert
I think I once asked you whether you could make a tool for me.

Code: Select all

;- Module Hex2Bin from wilbert -

DeclareModule Hex2Bin
  ; Number of bytes is length of string divided by 2
  Declare Hex2Bin(*source, *destination, length)
  
EndDeclareModule

Module Hex2Bin
  
  DisableDebugger
  EnableExplicit
  
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    Macro rax : eax : EndMacro
    Macro rcx : ecx : EndMacro
    Macro rdx : edx : EndMacro
    Macro rsp : esp : EndMacro  
  CompilerEndIf
  
  Macro M_movq(r1, r2)
    !movq r1, r2
  EndMacro
  
  Macro M_movdqu(r1, r2)
    !movdqu r1, r2
  EndMacro
  
  Procedure Hex2Bin(*source, *destination, length)
    EnableASM
    !mov eax, 0x0f0940
    !movd xmm0, eax
    !punpcklbw xmm0, xmm0
    !punpcklwd xmm0, xmm0
    !pshufd xmm2, xmm0, 00000000b
    !pshufd xmm3, xmm0, 01010101b
    !pshufd xmm4, xmm0, 10101010b
    mov rax, *source
    mov rdx, *destination
    mov rcx, length
    And rcx, -8
    add rcx, rdx
    !Hex2Bin_loop:
    M_movdqu (xmm0, [rax])
    CompilerIf #PB_Compiler_Unicode
      M_movdqu (xmm1, [rax + 16])
      !packuswb xmm0, xmm1
      add rax, 32
    CompilerElse
      add rax, 16
    CompilerEndIf
    !movdqa xmm1, xmm0
    !pcmpgtb xmm1, xmm2
    !pand xmm1, xmm3
    !paddb xmm0, xmm1
    !pand xmm0, xmm4
    !movdqa xmm1, xmm0
    !psllw xmm1, 12
    !por xmm0, xmm1
    !psrlw xmm0, 8
    !packuswb xmm0, xmm0
    M_movq ([rdx], xmm0)
    add rdx, 8
    cmp rdx, rcx
    !jb Hex2Bin_loop
    DisableASM
  EndProcedure
  
EndModule


Post by Lunasole »

walbus wrote: PB's string handling is OK but very slow, I think.
The only way to get speed is to use other, trickier approaches.
It is "reasonably fast" :) I.e. in most applications it is enough. Generally, extra optimizations are justified only when there are really noticeable performance problems; otherwise they are mostly a waste of time, since no real improvement is visible when using the program.


Here I've made some changes to avoid Mid() and concatenation. Len() is an additional slowdown too, but I've left it in so the procedure stays convenient and doesn't require setting the output array size manually :3

Also, I was wrong about the 500 ms (that was measured when I forgot to turn off the debugger).
With the debugger disabled, the result in that 100k MD5 test is ~300 ms for the previous version and ~200 ms for this one. The difference should grow a lot with longer strings.

Code: Select all

 [added to the first post]

Post by walbus »

A small example:
To encrypt a movie, my encryption tool generates about 1,000,000 different 256-bit hashes.
Here I also use wilbert's SHA3 module, for a speed-up of about 40%!
Furthermore, without some tricks and a lot of experience, you can forget about writing fast code.

The best way is always a simple way; always try to find a simple way, it is the fastest!

Post by Lunasole »

walbus wrote: A small example:
To encrypt a movie, my encryption tool generates about 1,000,000 different 256-bit hashes.
Here I also use wilbert's SHA3 module, for a speed-up of about 40%!
Furthermore, without some tricks and a lot of experience, you can forget about writing fast code.

I come from the old Commodore C64 and Atari ST.
On those old computers my tools were mostly up to 20 times faster than other tools.
The best way is always a simple way; always try to find a simple way, it is the fastest!
You are right, but it all depends on the purpose.
Years ago I liked "optimization for optimization's sake" and spent many hours just to improve speed a bit more; nowadays I don't think that is justified in practice. I still like nice code that runs fast and doesn't eat 50% of your PC's RAM, but it makes no difference whether one program internally works 10,000 times faster than another if both feel fast in use.

So speed is not the only criterion; you should also look at the code itself: how clear is it, and how easy is it to maintain or change? ASM code scores very poorly on those counts and always takes much more time to learn, edit and debug. The price of the speed improvement becomes very high. Nowadays I often cannot understand some highly optimized code I wrote years ago, because it delivers speed at the expense of the readability and clarity of the implemented algorithms.
That's the main reason why I almost never turn to ASM and do only "high-level" optimizations. It is generally better to write such a function in C, use a C compiler (which has evolved for over 30 years and offers very good automatic optimizations, especially for code that does a lot of math), compile it to a simple static library and use that in PB. But even this is rarely justified, I think.

So doing extra optimizations yourself the hard way is not rational in many cases nowadays, but of course it can be fine in fundamental libraries (like your AES), engines, etc. That kind of software obviously differs from typical applications/games, where "reasonably fast" code is enough.
And of course it is always fine if you simply enjoy doing it. I do, but not enough to go down to assembler level, or to write code with bit shifts etc. instead of clear expressions.
Last edited by Lunasole on Sat Aug 20, 2016 7:43 pm, edited 1 time in total.

Post by wilbert »

@walbus, thanks for posting the code.
For those interested, it's optimized for working with hex strings with a length which is a multiple of 16 characters (processing 8 bytes at a time).
It's very fast but if you need to process arbitrary length zero terminated strings, a different kind of optimization would be required.
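
To make the length restriction concrete, here is a small sketch of the arithmetic, based on the `And rcx, -8` line in the module posted above. The variable names are made up for illustration, and the idea of handling the remainder with a scalar fallback is an assumption, not part of the module.

```purebasic
; Illustrative arithmetic only; hexChars, bytes, blockBytes and tailBytes are made-up names.
Define hexChars = 40                    ; e.g. a SHA-1 hash: 40 hex characters
Define bytes = hexChars >> 1            ; 20 output bytes (two hex digits per byte)
Define blockBytes = bytes & -8          ; rounded down to whole 8-byte blocks: 16
Define tailBytes = bytes - blockBytes   ; 4 bytes that a separate scalar routine would have to convert

Debug blockBytes
Debug tailBytes
```

For an MD5 hash (32 characters, 16 bytes) the tail is empty, which is why the module fits that use case well.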
Lunasole wrote:I liked "optimizations for optimizations"
I still do :lol:
I guess it's a leftover from starting with a computer with a 3.5 MHz processor.
You are right ASM is more difficult to debug. Even without using ASM, there's still room for optimization but a bit less of course.
Here's a bit manipulation example to convert two hex characters to a byte value.

Code: Select all

Procedure.a ValByteStringU(*ByteStringU.Long); Unicode version
  ProcedureReturn ((*ByteStringU\l + *ByteStringU\l >> 6 & $00010001 * 9) & $000f000f * $10000100) >> 24
EndProcedure

Procedure.a ValByteStringA(*ByteStringA.Word); Ascii version
  ProcedureReturn ((*ByteStringA\w + *ByteStringA\w >> 6 & $0101 * 9) & $0f0f * $1001) >> 8
EndProcedure



t1 = ElapsedMilliseconds()
For i = 1 To 20000000
  a.a = Val("$A8")
Next
t2 = ElapsedMilliseconds()
For i = 1 To 20000000
  a.a = ValByteStringU(@"A8")
Next
t3 = ElapsedMilliseconds()

MessageRequester("", Str(t2-t1) + " vs " + Str(t3 - t2))
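
For readers who find the one-liner hard to follow, here is the ASCII variant spelled out step by step with explicit parentheses. It is a sketch for explanation, not a replacement: HexPairToByte is a made-up name, and it takes the two character codes as numbers so that it behaves the same in ASCII and Unicode builds. The grouping follows PB's operator priorities as I understand them (shifts bind tighter than &, which binds tighter than *, which binds tighter than +); this differs from C, so the original expression cannot be copied to C unchanged.

```purebasic
; Step-by-step version of the ASCII bit trick; HexPairToByte is a hypothetical helper.
; first / second are the character codes of the two hex digits.
Procedure.a HexPairToByte(first.i, second.i)
  ; the two characters as they would sit in memory (little-endian word)
  Protected w.i = first | (second << 8)
  ; bit 6 of a character code is set for A-F / a-f and clear for 0-9
  Protected letters.i = (w >> 6) & $0101
  ; adding 9 to a letter turns its low nibble into 10..15 ('A' = $41 -> $4A)
  w = w + (letters * 9)
  ; keep only the two nibbles
  w = w & $0F0F
  ; multiplying by $1001 places the first nibble directly above the second
  w = (w * $1001) >> 8
  ProcedureReturn w & $FF
EndProcedure

Debug HexPairToByte(Asc("A"), Asc("8"))   ; 168 ($A8)
Debug HexPairToByte(Asc("1"), Asc("f"))   ; 31 ($1F)
```

Note that, like the original, this does no validation: characters outside 0-9, A-F and a-f produce meaningless values.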