
All times are UTC + 1 hour




 Post subject: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 4:48 am
User

Joined: Wed Dec 17, 2014 11:54 am
Posts: 49
This code works fine in Ascii:

Code:
a$="test" : Debug MD5Fingerprint(@a$,Len(a$)) ; 098f6bcd4621d373cade4e832627b4f6

And the result perfectly matches what http://www.md5.cz says.

But if I switch the compiler to Unicode, the result is incorrect: 84afc5c978db956e578615db0f111ed4

I know there are procedures in these forums that show how to "fix" it, but really the command itself should do it internally. I think it's important for the MD5FingerPrint() command to ALWAYS output a result that matches what the website above says, regardless of the compiler setting. Otherwise there's no point in even having an MD5FingerPrint() command, if we're just going to wrap it in a "fix" procedure, right?


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 6:44 am
PureBasic Expert

Joined: Sun Aug 08, 2004 5:21 am
Posts: 3707
Location: Netherlands
The website you mention returns an MD5 hash for a string.
PureBasic returns an MD5 hash for a memory buffer.
When compiling in Ascii mode these happen to be the same, but you can't expect the result to be the same in Unicode mode, where each character occupies two bytes.
All you are doing is passing a memory pointer to the MD5Fingerprint procedure. How should the procedure know you are passing a Unicode string?
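
To make the string-versus-buffer point concrete, here is a minimal sketch in Python rather than PureBasic, for illustration only. It assumes that a Unicode build stores strings as two-byte little-endian characters (the UCS-2 layout mentioned later in this thread), and that passing Len(a$) as the size therefore covers only the first half of the buffer.

```python
import hashlib

text = "test"

# One byte per character, as in an Ascii build.
ascii_bytes = text.encode("ascii")
# Two bytes per character, assumed layout of a Unicode build.
utf16_bytes = text.encode("utf-16-le")

ascii_md5 = hashlib.md5(ascii_bytes).hexdigest()
utf16_md5 = hashlib.md5(utf16_bytes).hexdigest()

# A size of 4 on the 8-byte buffer hashes only the bytes for "te".
truncated = utf16_bytes[:len(text)]
truncated_md5 = hashlib.md5(truncated).hexdigest()

print(ascii_bytes, ascii_md5)
print(utf16_bytes, utf16_md5)
print(truncated, truncated_md5)
```

The same four characters give three different byte sequences, so three different hashes are to be expected. Only the first one is what a website hashing the text "test" reports.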

_________________
macOS 10.15 Catalina, Windows 10


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 7:34 am
User

Joined: Wed Dec 17, 2014 11:54 am
Posts: 49
If a human puts text into that website, it outputs a known MD5 hash for it. I would assume the MD5FingerPrint() command would return the same human-expected result.


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 8:27 am
PureBasic Expert

Joined: Sun Aug 08, 2004 5:21 am
Posts: 3707
Location: Netherlands
PB Fanatic wrote:
I would assume the MD5FingerPrint() command would return the same human-expected result.

It would if the procedure were MD5Fingerprint(String.s), but the procedure is MD5Fingerprint(*Buffer, Size).
What it currently returns might not be what you assumed, but it does exactly what the help file says: it generates an MD5 hash from a memory buffer.

_________________
macOS 10.15 Catalina, Windows 10


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 8:39 am
User

Joined: Wed Dec 17, 2014 11:54 am
Posts: 49
True that, but the manual has this example:

Code:
; Example: string as memory buffer
test.s = "This is a test string!"
Debug MD5Fingerprint(@test, StringByteLength(test))

This produces two different results, depending on the compiler mode (Ascii or Unicode). But it should return the SAME result, because we're testing a string, just as you said.


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 9:20 am
Enthusiast

Joined: Fri Apr 20, 2012 8:09 pm
Posts: 299
@PB Fanatic
You are not testing a string. You are sending a memory location to the MD5 function, and that memory holds single bytes per character in Ascii mode but double bytes in Unicode mode.
I'm assuming you are only interested in having your MD5 match the MD5 from PHP (or from websites, or from Ascii mode).
This should work for you:
Code:
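Procedure.s md5ascii(s.s) ; build a single-byte copy of the string, then hash it
  Protected i, mbuf, result.s
  mbuf = AllocateMemory(Len(s) + 1) ; one byte per character plus the terminating 0
  If mbuf
    For i = 0 To Len(s) - 1
      PokeB(mbuf + i, Asc(Mid(s, i + 1, 1))) ; keeps only the low byte of each character
    Next
    PokeB(mbuf + Len(s), 0)
    result = MD5Fingerprint(mbuf, Len(s)) ; hash the characters only, not the 0 byte
    FreeMemory(mbuf) ; the original version never released the buffer
  EndIf
  ProcedureReturn result
EndProcedure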

a$="test" : Debug MD5Fingerprint(@a$,StringByteLength(a$)) ; 098f6bcd4621d373cade4e832627b4f6 in Ascii mode only

Debug md5ascii(a$)

Just remember: text$ is a string variable, while @text$ is a pointer to a place in memory.

Norm.

_________________
google Translate;Makes my jokes fall flat- Fait mes blagues tombent à plat- Machte meine Witze verpuffen- Eh cumpari ci vo sunari


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 9:50 am
User

Joined: Wed Dec 17, 2014 11:54 am
Posts: 49
normeus wrote:
assuming that you are only interested in the fact that your MD5 should match MD5 from PHP

That's exactly my goal, actually. My website returns an MD5 hash of a string that I pass to it by PHP, and it didn't match what PureBasic was giving me with MD5FingerPrint(). Now I see I will have to wrap a procedure around the MD5FingerPrint() command to get the same result, which is a shame. I wish PureBasic just had a simple straightforward MD5 command like in PHP with no concern of Ascii vs Unicode and memory poking.


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 10:17 am
Addict

Joined: Tue Oct 09, 2007 2:15 am
Posts: 1144
Code:
EnableExplicit

Procedure.s md5(String.s)
 
  Protected *Buffer, Result.s = ""
 
  If String <> ""
    *Buffer = AllocateMemory(StringByteLength(String, #PB_Ascii) + 1) ; +1: PokeS also writes a terminating 0 byte
    If *Buffer
      PokeS(*Buffer, String, StringByteLength(String, #PB_Ascii), #PB_Ascii)
      Result = MD5Fingerprint(*Buffer, StringByteLength(String, #PB_Ascii))
      FreeMemory(*Buffer) 
    EndIf
  EndIf
 
  ProcedureReturn Result
 
EndProcedure

Define String.s = "The quick brown fox â jumps over the lazy dog."

;: PHP function generates -> 66e7a64aef34d4148d8bde4aa2976ab9

Debug "PHP -> 66e7a64aef34d4148d8bde4aa2976ab9"
Debug "PB  -> " + md5(String)


Check it with the compiler switched to Ascii and then to Unicode...

Edit: And very important: PHP doesn't include the 0 byte at the end of a string when it calculates MD5 hashes, so this procedure leaves the 0 byte out as well...
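
The effect of the terminating 0 byte is easy to check outside PureBasic. The following is a small Python sketch for illustration only, and the variable names are made up for this example:

```python
import hashlib

payload = "test".encode("ascii")

# Hash of the characters alone, which is what PHP's md5() sees.
without_zero = hashlib.md5(payload).hexdigest()
# Hash of the same characters plus a trailing 0 byte.
with_zero = hashlib.md5(payload + b"\x00").hexdigest()

print(without_zero)
print(with_zero)
print(without_zero == with_zero)
```

A size that is off by one byte is enough to produce a completely different hash, so the size passed to the hash function should cover the characters only.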

_________________
PureBasic 5.72 LTS (Windows x86/x64) | Windows10 Pro x64 | Z370 Extreme4 | i7 8770k | 64GB RAM | iChill GeForce RTX 2080 Super | HAF XF Evo
English is not my native language... (I often use DeepL to translate my texts.)


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 10:22 am
PureBasic Expert

Joined: Sun Aug 08, 2004 5:21 am
Posts: 3707
Location: Netherlands
Small procedure
Code:
Procedure.s MD5AsciiFingerprint(s.s)
  Protected a.s=s:ProcedureReturn MD5Fingerprint(@a,PokeS(@a,s,-1,#PB_Ascii)) 
EndProcedure


Similar but with format specification like STARGÅTE posted below
Code:
Procedure.s md5(s.s, fmt = #PB_UTF8)
  Protected Dim a.a(StringByteLength(s,fmt)+1):ProcedureReturn MD5Fingerprint(@a(),PokeS(@a(),s,-1,fmt))
EndProcedure


Quote:
I wish PureBasic just had a simple straightforward MD5 command like in PHP with no concern of Ascii vs Unicode and memory poking.

There are lots of users who depend on Unicode support because their language contains characters outside the Ascii range.
If only Ascii were supported, that would be a problem. If Unicode is required, the hash can be based on either UCS-2 or UTF-8.
The PB command is very flexible and lets the user make that choice.

_________________
macOS 10.15 Catalina, Windows 10


Last edited by wilbert on Sun Jan 04, 2015 7:44 pm, edited 2 times in total.

 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 12:59 pm
Addict

Joined: Thu Jan 10, 2008 1:30 pm
Posts: 1321
Location: Germany, Glienicke
PB Fanatic wrote:
I wish PureBasic just had a simple straightforward MD5 command like in PHP with no concern of Ascii vs Unicode and memory poking.

In PHP you have the same problem if you change the encoding of your source file from Ascii to UTF-8:
Code:
echo md5('Äpfel');
This gives 16114a0b3232bc9a8f978311387e74f2 if your file is saved as UTF-8, or d1c5faac7b530be151406b478f36bfb1 if it is saved as Ascii.
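
The reason is visible in the bytes themselves. Here is a short Python sketch for illustration only. It assumes that "ascii" above means a single-byte code page such as Latin-1, since Ä is not part of 7-bit ASCII:

```python
import hashlib

word = "Äpfel"

# Encode the same text two ways and hash each byte sequence.
encoded = {name: word.encode(name) for name in ("utf-8", "latin-1")}
digests = {name: hashlib.md5(data).hexdigest() for name, data in encoded.items()}

for name, data in encoded.items():
    print(name, len(data), data, digests[name])
```

In UTF-8 the Ä takes two bytes, in Latin-1 it takes one, so the hash function receives six bytes in one case and five in the other.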

Here is my function for MD5 with an optional Flag to define the format.
Code:
Procedure.s MD5(String.s, Flags.i=#PB_UTF8)
 
  Protected Length.i = StringByteLength(String, Flags)
  Protected *Buffer  = AllocateMemory(Length)
  Protected Result.s
 
  PokeS(*Buffer, String, #PB_Default, Flags|#PB_String_NoZero)
  Result = MD5Fingerprint(*Buffer, Length)
  FreeMemory(*Buffer)
 
  ProcedureReturn Result
 
EndProcedure

Debug MD5("Äpfel", #PB_Ascii)
Debug MD5("Äpfel", #PB_UTF8)
Debug MD5("Äpfel", #PB_Unicode)

 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sat Jan 03, 2015 1:19 pm
User

Joined: Wed Dec 17, 2014 11:54 am
Posts: 49
wilbert wrote:
Small procedure
Code:
Procedure.s MD5AsciiFingerprint(s.s)
  Protected a.s=s:ProcedureReturn MD5Fingerprint(@a,PokeS(@a,s,-1,#PB_Ascii)) 
EndProcedure

I like this! Thanks! :D


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sun Jan 04, 2015 7:27 pm
Addict

Joined: Thu Jun 07, 2007 3:25 pm
Posts: 3965
Location: Berlin, Germany
wilbert wrote:
Similar but with format specification like STARGÅTE posted below
Code:
Procedure.s md5(s.s, fmt = #PB_UTF8)
  Dim a.a(StringByteLength(s,fmt)+1):ProcedureReturn MD5Fingerprint(@a(),PokeS(@a(),s,-1,fmt))
EndProcedure


Hello wilbert, that's nice. :-) Thank you! I'll put that code into my private string library.

However, the code contains a glitch that can cause an unwanted effect in a program that contains a Global Array a.a():

Code:
Global Dim a.a(2)

Procedure.s md5(s.s, fmt = #PB_UTF8)
  Dim a.a(StringByteLength(s,fmt)+1):ProcedureReturn MD5Fingerprint(@a(),PokeS(@a(),s,-1,fmt))
EndProcedure

For i = 0 To ArraySize(a())
   a(i) = i
   Debug a(i)
Next

Debug ""
Debug md5("Äpfel")
Debug ""

For i = 0 To ArraySize(a())
   Debug a(i)
Next

So it's better to use

Code:
  Protected Dim ...

in the Procedure.

wilbert wrote:
There are lots of users who depend on Unicode support because their language contains characters outside the Ascii range.
If only Ascii were supported, that would be a problem. If Unicode is required, the hash can be based on either UCS-2 or UTF-8.
The PB command is very flexible and lets the user make that choice.

I know this was your reply to PB Fanatic, and all that you wrote is true, of course.

However, IMHO PB's built-in MD5Fingerprint() function should simply provide the option to pass a format parameter.
Then it wouldn't be necessary for us to write a wrapper in order to get this important option.
( But this is a feature request by me. I can't see a bug here. )

Thanks again!

_________________
Please excuse my flawed English. My native language is PureBasic.
 Post subject: Re: MD5FingerPrint in Unicode
Posted: Sun Jan 04, 2015 7:48 pm
PureBasic Expert

Joined: Sun Aug 08, 2004 5:21 am
Posts: 3707
Location: Netherlands
Little John wrote:
So it's better to use

Code:
  Protected Dim ...

in the Procedure.

Thanks for mentioning it. I changed the procedure in my post above.
The help file states that a Dim is always local. It wasn't clear to me that it can conflict with a global array.

You are right that an optional parameter would be best.
If it could be set to Ascii, Unicode, UTF8 or Binary, with Binary as the default, it wouldn't break backward compatibility and would be a useful addition.
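
As a rough illustration of that idea, here is a sketch in Python rather than PureBasic. The function name md5_fingerprint and its fmt parameter are hypothetical and do not correspond to any existing PureBasic command. Leaving fmt out plays the role of the "Binary" default, so existing buffer-based calls keep working unchanged.

```python
import hashlib

def md5_fingerprint(data, fmt=None):
    """Hypothetical helper: hash raw bytes, or a string encoded as fmt."""
    if isinstance(data, str):
        if fmt is None:
            # Binary is the default, so a string needs an explicit format.
            raise TypeError("pass fmt (e.g. 'ascii', 'utf-8') to hash a string")
        data = data.encode(fmt)
    # Bytes are hashed exactly as given; fmt is ignored for them.
    return hashlib.md5(data).hexdigest()

print(md5_fingerprint(b"test"))
print(md5_fingerprint("test", "ascii"))
print(md5_fingerprint("test", "utf-8"))
print(md5_fingerprint("test", "utf-16-le"))
```

The design choice in this sketch is that the caller, not the hash function, decides which byte representation of the text gets hashed, and the default never reinterprets a buffer.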

_________________
macOS 10.15 Catalina, Windows 10


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Mon Jan 05, 2015 9:22 am
User

Joined: Wed Dec 17, 2014 11:54 am
Posts: 49
Little John wrote:
However, IMHO PB's built-in MD5Fingerprint() function should simply provide the option to pass a format parameter.
Then it wouldn't be necessary for us to write a wrapper in order to get this important option.

Yes indeed, that's what I'd like to see, too.


 Post subject: Re: MD5FingerPrint in Unicode
Posted: Mon Jan 05, 2015 4:41 pm
Addict

Joined: Thu Jan 10, 2008 1:30 pm
Posts: 1321
Location: Germany, Glienicke
Little John wrote:
However, IMHO PB's built-in MD5Fingerprint() function should simply provide the option to pass a format parameter.
Then it wouldn't be necessary for us to write a wrapper in order to get this important option.
( But this is a feature request by me. I can't see a bug here. )


MD5Fingerprint() is a function for a memory buffer (not directly for strings!).
So it is not the job of this memory function to "change" the format of the buffer.

The same rule applies to all similar functions, such as:
CRC32Fingerprint, SHA1Fingerprint, AESDecoder, AESEncoder, Base64Decoder, Base64Encoder and so on.
All these functions are memory functions, and it is the job of the user to convert a string to a buffer with PokeS() rather than passing @String.
It's a bad habit to use strings as memory buffers.
