PeekS byte length?

IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Post by IdeasVacuum »

PB5.51 x86

Code:

sStr.s = "1ABCDEFGHJ2abcdefghj3ABCDEFGHJ4abcdefghj5ABCDEFGHJ"
*Buf = AllocateMemory(128)

PokeS(*Buf, sStr, -1, #PB_UTF8)
sVal.s = PeekS(*Buf, 32, #PB_UTF8 | #PB_ByteLength)
FreeMemory(*Buf)
Debug sVal
Debug StringByteLength(sVal)
I was expecting PeekS to return a string 32 bytes long, but it returns a string 32 characters long?
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Lunasole
Addict
Posts: 1091
Joined: Mon Oct 26, 2015 2:55 am
Location: UA

Re: PeekS byte length?

Post by Lunasole »

IdeasVacuum wrote: PB5.51 x86

Code:

sStr.s = "1ABCDEFGHJ2abcdefghj3ABCDEFGHJ4abcdefghj5ABCDEFGHJ"
*Buf = AllocateMemory(128)

PokeS(*Buf, sStr, -1, #PB_UTF8)
sVal.s = PeekS(*Buf, 32, #PB_UTF8 | #PB_ByteLength)
FreeMemory(*Buf)
Debug sVal
Debug StringByteLength(sVal)
I was expecting PeekS to return a string 32 bytes long, but it returns a string 32 characters long?
That looks OK to me: UTF-8 uses 1 byte per character for plain ASCII text like this when you write it into a custom buffer, a file, etc.
The characters take 2 bytes each only when stored inside PB strings (which are Unicode).
Just specify the format as well, StringByteLength(sVal, #PB_UTF8), to get the expected 32 from it.

Code:

sStr.s = "1ABCDEFGHJ2abcdefghj3ABCDEFGHJ4abcdefghj5ABCDEFGHJ"
*Buf = AllocateMemory(128)

PokeS(*Buf, sStr, -1, #PB_UTF8)             
ShowMemoryViewer(*Buf, 128)
CallDebugger

sVal.s = PeekS(*Buf, 32, #PB_UTF8 | #PB_ByteLength)
FreeMemory(*Buf)
Debug sVal
Debug StringByteLength(sVal)
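To make the different lengths explicit, here is a minimal sketch that measures the same 32-character result in each format. It assumes a Unicode build of PureBasic (as with PB5.51); the variable names are only illustrative and are not taken from the posts above.

```purebasic
; Sketch: one 32-character, ASCII-only string measured three ways.
sVal.s = "1ABCDEFGHJ2abcdefghj3ABCDEFGHJ4a"

Debug Len(sVal)                            ; 32 characters
Debug StringByteLength(sVal, #PB_UTF8)     ; 32 bytes: ASCII characters take 1 byte each in UTF-8
Debug StringByteLength(sVal, #PB_Unicode)  ; 64 bytes: 2 bytes per character inside a PB string

; A non-ASCII character needs more than 1 byte in UTF-8.
sAccent.s = Chr(233)                       ; U+00E9, "e" with an acute accent
Debug Len(sAccent)                         ; 1 character
Debug StringByteLength(sAccent, #PB_UTF8)  ; 2 bytes
```

So character count and UTF-8 byte count only coincide here because the test string is pure ASCII.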
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"
Re: PeekS byte length?

Post by IdeasVacuum »

You are right, I should have qualified the type:

Code:

StringByteLength(sVal, #PB_UTF8)
...does return 32

However, other functions such as AESEncoder will 'see' that string as Unicode? In which case its size needs to be 32 bytes in Unicode if, for example, it is used as a Key.

It would be useful if PeekS could optionally return a Unicode string of n bytes.
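One possible way to get that result with the existing commands, sketched under the assumption of a Unicode build and ASCII-only text in the buffer: a Unicode string of n bytes holds n / 2 characters, so PeekS can be asked for 16 characters instead of 32 bytes. The name nBytes is hypothetical.

```purebasic
; Sketch: read 16 characters so the resulting PB string occupies 32 bytes in memory.
sStr.s = "1ABCDEFGHJ2abcdefghj3ABCDEFGHJ4abcdefghj5ABCDEFGHJ"
*Buf = AllocateMemory(128)

If *Buf
  PokeS(*Buf, sStr, -1, #PB_UTF8)
  nBytes = 32                                  ; wanted size of the Unicode string, in bytes
  sVal.s = PeekS(*Buf, nBytes / 2, #PB_UTF8)   ; without #PB_ByteLength the length counts characters
  FreeMemory(*Buf)

  Debug sVal                                   ; 1ABCDEFGHJ2abcde
  Debug StringByteLength(sVal, #PB_Unicode)    ; 32
EndIf
```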
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
TI-994A
Addict
Posts: 2512
Joined: Sat Feb 19, 2011 3:47 am
Location: Singapore

Re: PeekS byte length?

Post by TI-994A »

IdeasVacuum wrote: It would be useful if PeekS could optionally return a Unicode string of n bytes.
I believe that the PokeS()/PeekS() functions work on the string length, Len(), and not the byte length. :wink:
Texas Instruments TI-99/4A Home Computer: the first home computer with a 16bit processor, crammed into an 8bit architecture. Great hardware - Poor design - Wonderful BASIC engine. And it could talk too! Please visit my YouTube Channel :D
netmaestro
PureBasic Bullfrog
Posts: 8425
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: PeekS byte length?

Post by netmaestro »

IdeasVacuum wrote: However, other functions such as AESEncoder will 'see' that string as Unicode?
The AES commands don't take strings as input and they don't output any strings. All inputs and outputs are pointers to binary data (which may well contain strings) except keysize and mode, which are integers. If the string library ran up to the AES library and bit it on the ass it wouldn't recognize it.
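As a rough illustration of that point, the sketch below prepares the key and the input as raw bytes first and then hands only pointers and sizes to AESEncoder(). It is not a recommended setup: the buffer names are hypothetical, the zero-filled initialization vector is a placeholder only, and the default CBC mode is assumed.

```purebasic
; Sketch: AESEncoder() receives pointers and sizes, never string types.
Key$   = "1ABCDEFGHJ2abcdefghj3ABCDEFGHJ4a"     ; 32 ASCII characters -> 32 UTF-8 bytes
Plain$ = "A message of thirty-two letters."     ; 32 ASCII characters -> 32 UTF-8 bytes

KeySize   = StringByteLength(Key$, #PB_UTF8)    ; key length in bytes
PlainSize = StringByteLength(Plain$, #PB_UTF8)  ; AES input must be at least 16 bytes

*Key    = AllocateMemory(KeySize + 1)           ; +1 for the null terminator PokeS writes
*Plain  = AllocateMemory(PlainSize + 1)
*Cipher = AllocateMemory(PlainSize)
*IV     = AllocateMemory(16)                    ; zero-filled placeholder, not suitable for real use

If *Key And *Plain And *Cipher And *IV
  PokeS(*Key, Key$, -1, #PB_UTF8)               ; the coder decides the byte layout of the key
  PokeS(*Plain, Plain$, -1, #PB_UTF8)

  If AESEncoder(*Plain, *Cipher, PlainSize, *Key, KeySize * 8, *IV)
    Debug "Encoded " + Str(PlainSize) + " bytes with a " + Str(KeySize * 8) + "-bit key"
  EndIf

  FreeMemory(*Key)
  FreeMemory(*Plain)
  FreeMemory(*Cipher)
  FreeMemory(*IV)
EndIf
```

Passing @Key$ instead of *Key would hand the same command the in-memory Unicode bytes of the string, so the byte layout is decided entirely by the caller.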
BERESHEIT
Re: PeekS byte length?

Post by Lunasole »

IdeasVacuum wrote: However, other functions such as AESEncoder will 'see' that string as Unicode?
It looks like yes. If you pass *buffer_with_utf8string, they will read it as UTF-8, 1 byte per character. If you pass @StringVariable$, they will read Unicode, which might weaken security when used with ciphers because of the many repeating zero bytes at known positions.
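The zero bytes are easy to see by dumping the memory behind a string variable. This is a small sketch assuming a Unicode build; S$ is just an example name.

```purebasic
; Sketch: print each byte of a PB string exactly as it is stored in memory.
S$ = "ABCD"

For i = 0 To StringByteLength(S$, #PB_Unicode) - 1
  Debug PeekA(@S$ + i)   ; 65, 0, 66, 0, 67, 0, 68, 0
Next
```

For ASCII text every second byte is zero, which is the pattern described above.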
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"
Re: PeekS byte length?

Post by netmaestro »

Seriously, it looks like yes? It's no. No, pointers to binary data only, size determined by keysize for the key and MemorySize for the buffers. Where you're getting this utf8 stuff is a mystery. Do you see string types somewhere in the AES command prototypes? If you have difficulty determining buffer size for stringbytelength reasons or for any other reason, trust me the AES library doesn't know a thing about it. Such considerations are for the coder to solve in advance, not the cipher library. It knows pointers and lengths and nothing more. Again, no-strings-in-or-out. Nostrings. InorOut. Give me a tune and I'll sing it to you. But I won't dance. No way. You have to draw the line somewhere.
BERESHEIT
Re: PeekS byte length?

Post by Lunasole »

netmaestro wrote:Seriously, it looks like yes? It's no. No, pointers to binary data only, size determined by keysize for the key and MemorySize for the buffers. Where you're getting this utf8 stuff is a mystery. Do you see string types somewhere in the AES command prototypes? If you have difficulty determining buffer size for stringbytelength reasons or for any other reason, trust me the AES library doesn't know a thing about it. Such considerations are for the coder to solve in advance, not the cipher library. It knows pointers and lengths and nothing more. Again, no-strings-in-or-out. Nostrings. InorOut. Give me a tune and I'll sing it to you. But I won't dance. No way. You have to draw the line somewhere.
I know what you are talking about, but you should read more carefully before writing so much.
He asked how AESEncoder() and other functions will read a regular PB string, and they read its content as Unicode, because built-in strings use the Unicode format. If you don't know that, you can easily think that "S$ = PeekS (#PB_UTF8)" will make S$ UTF-8 internally.
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"