MD5FingerprintBin()

Got an idea for enhancing PureBasic? New command(s) you'd like to see?
Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

Why? Because this code is so ugly, but sadly something I use a lot on some programs!

A function similar to this one is what I'd love to see. PHP5 has it for MD5 and SHA too, I believe.

The other benefit is that binary should be much faster right? (or does the md5 implementation actually do it as hex internally?)

Code: Select all

EnableExplicit

Procedure.i MD5FingerprintBin(*buf,len.i,*md5bin) ;writes the 16 byte md5 binary fingerprint to *md5bin and returns *md5bin
 Protected hex.s, x.i, a.i, b.i, i.i
 hex=UCase(MD5Fingerprint(*buf,len)) ;32 hex digits, upper case
 i=0
 For x=0 To 31 Step 2
  a=PeekC(@hex+x*SizeOf(Character)) ;high nibble digit, works in ASCII and Unicode builds
  If a<58 ;'0' to '9'
   a=a-48
  Else ;'A' to 'F'
   a=a-55
  EndIf
  b=PeekC(@hex+(x+1)*SizeOf(Character)) ;low nibble digit
  If b<58
   b=b-48
  Else
   b=b-55
  EndIf
  PokeB(*md5bin+i,(a<<4)+b)
  i+1
 Next
 ProcedureReturn *md5bin
EndProcedure


Define *md5bin,text$
*md5bin=AllocateMemory(16)

text$="Test!"
If *md5bin ;AllocateMemory() returns 0 on failure
 MD5FingerprintBin(@text$,StringByteLength(text$),*md5bin) ;returns *md5bin
EndIf

Post by Rescator »

As I know Fred hates bumps, I took this opportunity to update/change the example to something better. (the old example was unsafe in multi-threaded uses etc.)

@PB Team: If you folks don't wanna do this one, then could you point me to the md5 source you based this on so that I can get similar behavior/speed for a bin variant as the native hex one has?

For those wondering why I need a bin variant, beyond possibly speed improvement (no string/hex handling overhead etc.)...
When building a combined key from MD5 hashes, I can for example fit twice as many hashes into 1024 bytes if they are binary (16 bytes each, so 64 hashes) rather than hex (32 single-byte characters each, so only 32 hashes), which means the hex form gives half the potential strength.
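The size argument above can be checked with a few lines. This is only a sketch of the arithmetic; the constant names are made up for illustration, and the hex size assumes one byte per character (ASCII).

```purebasic
; Hypothetical constant names, used only for this illustration
#KeyBytes    = 1024 ; size of the combined key buffer
#MD5BinBytes = 16   ; raw MD5 digest
#MD5HexBytes = 32   ; 32 hex digits at one byte each (ASCII assumed)

Debug #KeyBytes / #MD5BinBytes ; number of binary hashes that fit
Debug #KeyBytes / #MD5HexBytes ; number of hex hashes that fit
```

In a Unicode build a hex string takes 2 bytes per character in memory, so the gap would be even larger there.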

PHP5's bin support: http://no.php.net/manual/en/function.md5.php
Fred
Administrator
Posts: 16581
Joined: Fri May 17, 2002 4:39 pm
Location: France

Post by Fred »

The MD5 is indeed calculated in binary format; only the result is transformed into a string, so there would be absolutely no speed gain from a 'bin' function. BTW, your procedure code can be made somewhat simpler:

Code: Select all

Procedure.i MD5FingerprintBin(*buf,len.i,*md5bin) ;returns a pointer to a 16 byte md5 binary fingerprint (.i so the pointer is not truncated on 64-bit)
 Protected hex.s, x.i
 hex.s = MD5Fingerprint(*buf, len)

 For x=0 To 31 Step 2
  PokeB(*md5bin+x/2, Val("$"+Mid(hex, x+1, 2)))
 Next

 ProcedureReturn *md5bin
EndProcedure
Last edited by Fred on Thu May 14, 2009 1:50 pm, edited 1 time in total.
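
For readers on newer PureBasic versions, where later posts in this thread use UseMD5Fingerprint() and Fingerprint() instead of MD5Fingerprint(), the same hex-to-binary trick might look roughly like the sketch below. MD5Bin is a hypothetical helper name, not a built-in command, and the test input "abc" is written as ASCII bytes so the result does not depend on the in-memory string format.

```purebasic
EnableExplicit
UseMD5Fingerprint()

; MD5Bin is a hypothetical helper name, not a built-in command.
; Writes the 16 raw digest bytes to *md5bin; returns *md5bin, or 0 on failure.
Procedure.i MD5Bin(*buf, len.i, *md5bin)
  Protected hex.s, x.i
  hex = Fingerprint(*buf, len, #PB_Cipher_MD5) ; 32 hex digits
  If Len(hex) <> 32
    ProcedureReturn 0
  EndIf
  For x = 0 To 15
    ; the "$" prefix makes Val() parse the two digits as hexadecimal
    PokeA(*md5bin + x, Val("$" + Mid(hex, x * 2 + 1, 2)))
  Next
  ProcedureReturn *md5bin
EndProcedure

; Hash the three ASCII bytes "abc" rather than the in-memory string
Define *src = AllocateMemory(4)  ; 3 characters plus terminating null
Define *out = AllocateMemory(16)
If *src And *out
  PokeS(*src, "abc", -1, #PB_Ascii)
  If MD5Bin(*src, 3, *out)
    Debug Hex(PeekA(*out)) ; first byte of the digest
  EndIf
EndIf
```

This still goes bin -> hex -> bin internally; it only hides the round trip behind one call.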
DoubleDutch
Addict
Posts: 3219
Joined: Thu Aug 07, 2003 7:01 pm
Location: United Kingdom

Post by DoubleDutch »

PokeB(*md5bin+x/2, Val("$"+Mid(hex, x+1, 2)))
Nice trick to convert back from hex. I'll pinch that.
https://deluxepixel.com <- My Business website
https://reportcomplete.com <- School end of term reports system

Post by Rescator »

Fred wrote: The MD5 is indeed calculated in binary format, only the result is transformed into string, so it will have absolutely no speed gain to do a 'bin' function.
Yeah but exposing the non-transformed (raw as PHP calls it) result would actually speed things up for situations like this:

(concept only)
Define *md5bin,text$,bytes.i
*md5bin=AllocateMemory(256)

text$="This is a test, a long line of text! Like a passphrase!"
bytes=StringByteLength(text$) ;length in bytes, not characters
MD5FingerprintBin(@text$,bytes,*md5bin)
MD5FingerprintBin(@text$+1,bytes-1,*md5bin+16)
MD5FingerprintBin(@text$+2,bytes-2,*md5bin+32)
MD5FingerprintBin(@text$+3,bytes-3,*md5bin+48)
MD5FingerprintBin(@text$+4,bytes-4,*md5bin+64)

Luckily one would not need to build a key like this that often,
but it still feels silly that something that originally is a binary md5 is turned to hex, then turned to binary again. A raw or Bin variant would avoid the hex stuff fully.

Maybe I'm just nitpicking but... :P

Oh and thanks for the hex/bin tip there, never thought of that one before :)
Rinzwind
Enthusiast
Posts: 636
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: MD5FingerprintBin()

Post by Rinzwind »

Late to the party. You're not nitpicking. The fingerprint functions seem to follow some kind of backwards thinking. One should start with a PB built-in function that returns the result as a byte pointer, and after that add convenience functions that translate a byte pointer into a hex string representation. And to be complete, a function that transforms a hex string into a byte pointer. Right?

As of now, we need things like (viewtopic.php?t=63309) which should not be necessary.

Code: Select all

  Global Dim HexTable.a(127)
  Procedure InitHexTable()
    Protected i
    For i = '0' To '9'
      HexTable(i) = (i -'0')
    Next
    For i = 'a' To 'f'
      HexTable(i) = (i -'a' + 10)
    Next
    For i = 'A' To 'F'
      HexTable(i) = (i -'A' + 10)
    Next
    For i = 0 To 15
      HexTable(i) = Asc(Hex(i))
    Next
  EndProcedure
  
  Structure DoubleChar
    c1.c
    c2.c
  EndStructure
  
  Procedure.s MemToHex(*mem.ascii, memlen)
    Protected *pos.DoubleChar
    Protected shex.s
    shex = Space(memlen * 2)
    *pos = @shex
    While memlen > 0
      *pos\c1 = HexTable(*mem\a >> 4)
      *pos\c2 = HexTable(*mem\a & $f)
      *pos +SizeOf(DoubleChar)
      *mem +1
      memlen -1
    Wend
    ProcedureReturn shex
  EndProcedure
  
  Procedure HexToMem(*phex.DoubleChar, *mem=0, memlen=0)
    Protected *pos.ascii
    Protected hexlen
    hexlen = MemoryStringLength(*phex)
    If hexlen & %1
      ProcedureReturn 0
    EndIf
    If *mem = 0
      memlen = hexlen / 2
      *mem = AllocateMemory(memlen)     
    ElseIf memlen=0
      memlen = MemorySize(*mem)
    EndIf
    If memlen >= hexlen/2
      *pos = *mem
      While *phex\c1 > 0 And *phex\c2 > 0
        *pos\a = HexTable(*phex\c1 & $7f) << 4 + HexTable(*phex\c2 & $7f)
        *phex +SizeOf(DoubleChar)
        *pos +1
      Wend
      ProcedureReturn *mem
    EndIf
    ProcedureReturn 0   
  EndProcedure
  
  Procedure.s HexToStr(*hex.DoubleChar)
    Protected str.s, len
    len = MemoryStringLength(*hex) / SizeOf(character) / 2
    str = Space(len)
    HexToMem(*hex, @str, StringByteLength(str))
    ProcedureReturn str
  EndProcedure  
Controller
User
Posts: 28
Joined: Thu Jul 22, 2004 5:26 am
Location: Germany

Re: MD5FingerprintBin()

Post by Controller »

I would also appreciate an MD5FingerprintBin() function...
Saki
Addict
Posts: 830
Joined: Sun Apr 05, 2020 11:28 am
Location: Pandora

Re: MD5FingerprintBin()

Post by Saki »

You can emulate that.
The result is then no longer an MD5 hash, but a hash based on MD5.
The collision safety is probably even higher, as significant diffusion is added.
And it is very fast.

In practice this construction is useful, for example, for the blockwise transfer of files.
The individual hashes then automatically add up to a new one for each block,
without further intervention.

Since what matters here is preimage resistance, MD5 is also absolutely OK.

The problem is that it is hard to sell to a person who doesn't understand it.
If they ask which hash method is used and you say an MD5-based one,
a lot of people will probably start hyperventilating, because they think they know something very clever:
that MD5 is broken.
Which is absolute nonsense in this context.

It's probably not that easy to understand how this works.

Code: Select all

Define text$="Hello i am a md5 based 16 byte binary hash"
Define *source=@text$
Define *hash_buffer=AllocateMemory(16)

Procedure BinHash16(*source, source_length, *hash_buffer)
  UseMD5Fingerprint() ; By Saki
  Define md5$=Fingerprint(*source, source_length, #PB_Cipher_MD5)
  FillMemory(*hash_buffer, 16, 0)
  AESEncoder(*hash_buffer, *hash_buffer, 16, @md5$, 256, 0, #PB_Cipher_ECB)
EndProcedure

BinHash16(@text$, StringByteLength(text$), *hash_buffer)
ShowMemoryViewer(*hash_buffer, 16) ; Binary md5 based hash
Peace on Earth
Mijikai
Addict
Posts: 1360
Joined: Sun Sep 11, 2016 2:17 pm

Re: MD5FingerprintBin()

Post by Mijikai »

On Windows, the Crypt API could be an option (CryptAcquireContextW -> CryptGetHashParam).
The API is pretty neat as it supports most hashes. I used it before, but I also wrote a routine that generates a hex string.
Anyway, there is (almost) no speed benefit from not converting the result to a string, so I don't see the problem tbh.
The PB functions work just fine and I suggest using them unless you have a solid reason not to.
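
A rough, Windows-only sketch of the Crypt API route mentioned above, from CryptAcquireContext to CryptGetHashParam. It assumes PureBasic exposes these API calls with its usual trailing underscore. MD5BinWinAPI and the #Sketch_ constants are hypothetical names introduced here; the values are the ones from the Windows SDK headers. Treat it as an illustration, not as tested production code.

```purebasic
CompilerIf #PB_Compiler_OS = #PB_OS_Windows

  ; Windows SDK values under hypothetical local names
  #Sketch_PROV_RSA_FULL       = 1
  #Sketch_CRYPT_VERIFYCONTEXT = $F0000000
  #Sketch_CALG_MD5            = $8003
  #Sketch_HP_HASHVAL          = 2

  ; Hypothetical helper: writes the 16 raw MD5 bytes to *md5bin.
  ; Returns *md5bin on success, 0 on failure.
  Procedure.i MD5BinWinAPI(*buf, len.i, *md5bin)
    Protected hProv.i, hHash.i, size.l = 16, ok.i = #False
    If CryptAcquireContext_(@hProv, 0, 0, #Sketch_PROV_RSA_FULL, #Sketch_CRYPT_VERIFYCONTEXT)
      If CryptCreateHash_(hProv, #Sketch_CALG_MD5, 0, 0, @hHash)
        If CryptHashData_(hHash, *buf, len, 0)
          ; copies the digest straight into the caller's buffer, no hex step
          ok = CryptGetHashParam_(hHash, #Sketch_HP_HASHVAL, *md5bin, @size, 0)
        EndIf
        CryptDestroyHash_(hHash)
      EndIf
      CryptReleaseContext_(hProv, 0)
    EndIf
    If ok
      ProcedureReturn *md5bin
    EndIf
    ProcedureReturn 0
  EndProcedure

CompilerEndIf
```

The caller is expected to pass a buffer of at least 16 bytes, for example from AllocateMemory(16).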
GPI
PureBasic Expert
Posts: 1394
Joined: Fri Apr 25, 2003 6:41 pm

Re: MD5FingerprintBin()

Post by GPI »

I wrote a faster "dehex" routine. Run the example with the debugger disabled to get valid timings.

Code: Select all

UseMD5Fingerprint()

Procedure DeHex(*str.character,*result.ascii)
  Protected.l flip
  Protected.A val,hiByte
  While *str\c >0
    val = *str\c - '0'
    If val > 9
      val - ('A'-'0'-10)
      If val > 15
        val - ('a'-'A')
        If val > 15
          Break 
        EndIf
      EndIf
    EndIf
    Debug val
    If flip
      *result\a = (hiByte << 4) | val
      ;Debug ">>" + *result\a+" " + Str(hiByte<<8)+" "+val
      *result + SizeOf(ascii)
      flip=#False
    Else
      hiByte = val
      flip=#True
    EndIf
    *str+SizeOf(Character)
  Wend
  ProcedureReturn *result
EndProcedure
Procedure.s ToHex(*dat.ascii, len)
  Protected.l i
  Protected.s str=""
  For i=1 To len
    str+Right("0"+Hex(*dat\a),2)
    *dat + SizeOf(ascii)
  Next
  ProcedureReturn str
EndProcedure



str.s = UCase(StringFingerprint("Something",#PB_Cipher_MD5))
  buflen = (Len(str)+1)/2
  *dest1 = AllocateMemory(buflen)
  
  
timer= ElapsedMilliseconds()
For i=0 To 1000000
  DeHex(@str, *dest1)
Next
timer = ElapsedMilliseconds()-timer

*dest2 = AllocateMemory(buflen)
  
timer2= ElapsedMilliseconds()
For i=0 To 1000000

 For x=0 To 31 Step 2
   PokeB(*dest2+x/2, Val("$"+Mid(str, x+1, 2)))
  Next
  
  Next
timer2 = ElapsedMilliseconds()-timer2



MessageRequester("test","md5  :"+str +#LF$ +
                        "deHex:"+ tohex(*dest1,buflen)+" "+timer+"ms"+#LF$ +
                        "val  :"+ tohex(*dest2,buflen)+" "+timer2+"ms"+#LF$+
                        "")
Keya
Addict
Posts: 1891
Joined: Thu Jun 04, 2015 7:10 am

Re: MD5FingerprintBin()

Post by Keya »

A hash in hex format is only useful for display purposes, whereas for processing and storage it's better in its original binary format. It should be up to the programmer to convert a binary hash to hex. It's frustrating that at the moment we have to decode a hexadecimal hash to recover the binary version (bin -> hex -> bin), especially as returning the binary directly would require less code, and especially as the code to do so already exists within PB but isn't exposed. This is why I don't use the Fingerprint() functions.

Re: MD5FingerprintBin()

Post by Saki »

Well, my function above is hard to figure out, but it also doesn't require NextFingerprint() and is extremely fast.

It doesn't get any simpler or more efficient in the PB environment, I think.

A similar solution can be created for all hashes.
Peace on Earth

Re: MD5FingerprintBin()

Post by Mijikai »

I posted some hashing code for Windows (Hash to Bin).
The code supports several hashing algorithms.
This time I used the newest Crypto API (BCrypt).

Code:
https://www.purebasic.fr/english/viewto ... 05#p570605

Have fun 8)

Re: MD5FingerprintBin()

Post by Rinzwind »

Here we are again. Still no changes. I had to convert PB's hex string to binary again lately, which PB does not even offer a function for; that would be the least to offer when it chooses hex strings as results. Can this lib get some attention please? Add some other commonly used algorithms like HMAC while you're at it. Then again, listening to and taking advice from the community is not a thing with PB.
BarryG
Addict
Posts: 3266
Joined: Thu Apr 18, 2019 8:17 am

Re: MD5FingerprintBin()

Post by BarryG »

Rinzwind wrote: Mon Nov 15, 2021 4:29 am
listening to and taking advice from the community is not a thing with PB.
It used to be, years ago. Fred was extremely active here. Kinda makes me sad that he doesn't anymore. Even the Blog doesn't get the love that it used to. I know times change and the language is going through a LOT of work at the moment, though. But even a weekly Blog update about what's currently going on would be good. Customers need that interaction.