PureBasic Forums - English

Posted: **Thu Sep 24, 2009 9:31 pm**

First of all: I don't know too much about Unicode.

Anyway I realized that unicode strings normally begin with $FFFE. But now I have one that has $FF00FE at the beginning and I don't understand what this means.

Any ideas?

Posted: **Thu Sep 24, 2009 10:12 pm**

It is a file header for unicode or other code types.
You can for example use notepad to open this file and use save as to check the current filetype.

The 2 or 3 bytes have a specific order and value to indicate the specific type.
Old-fashioned ANSI files don't have this header.

Posted: **Fri Sep 25, 2009 8:13 am**

not seeing yours

http://en.wikipedia.org/wiki/Byte-order_mark

Posted: **Fri Sep 25, 2009 9:50 am**

Unicode strings don't have any special beginning.

$FFFE is at the start of some unicode files. It means the file uses UCS2 encoding (the same as PB uses for unicode strings).

$FF00FE is something different entirely. It probably isn't a unicode file, or it's a unicode file without a byte order mark. Or you simply read the value wrong, and it's really $FF FE 00 00 or $00 00 FE FF, which means UTF-32.
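To make the byte order mark comparison concrete, here is a minimal sketch in Python (not PureBasic) that checks a buffer against the common BOM prefixes. The names `BOMS` and `detect_bom` are made up for this illustration. Note that a buffer starting with $FF $00 $FE matches none of the known marks.

```python
# BOM prefixes and the encodings they usually indicate.
# Order matters: the UTF-32 LE mark begins with the UTF-16 LE mark,
# so the longer prefixes have to be tested first.
BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no known BOM is found."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0
```

This is only a heuristic: UTF-16 LE text whose first character is a NUL would be reported as UTF-32 LE, so the result should be treated as a hint rather than proof.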

Posted: **Fri Sep 25, 2009 10:10 am**

pdwyer wrote:not seeing yours

http://en.wikipedia.org/wiki/Byte-order_mark

That's why I'm confused.

Trond wrote:Unicode strings don't have any special beginning.

$FFFE is at the start of some unicode files. It means the file uses UCS2 encoding (the same as PB uses for unicode strings).

$FF00FE is something different entirely. It probably isn't a unicode file, or it's a unicode file without a byte order mark. Or you simply read the value wrong, and it's really $FF FE 00 00 or $00 00 FE FF, which means UTF-32.

To be more precise: The string I'm reading is an ID3 tag. The tag has a unicode flag, so the whole string must be in unicode as far as I know.
So normally the unicode tags I found began with $FFFE, but now I found an MP3 file where every tag has $FF00FE at the beginning.

I really don't know what this means.
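For context, the "unicode flag" in an ID3v2 text frame is the first byte of the frame body: $00 means ISO-8859-1 and $01 means UTF-16 with a BOM, while ID3v2.4 adds $02 (UTF-16BE without BOM) and $03 (UTF-8). The following is a rough Python sketch of decoding such a frame body. The names `ID3_TEXT_ENCODINGS` and `decode_id3_text` are hypothetical, and the sketch assumes the frame body has already been extracted from the tag.

```python
# Encoding byte of an ID3v2 text frame mapped to a Python codec name.
ID3_TEXT_ENCODINGS = {
    0: "latin-1",    # ISO-8859-1
    1: "utf-16",     # UTF-16 with BOM; Python's codec consumes the BOM
    2: "utf-16-be",  # ID3v2.4 only: UTF-16BE without BOM
    3: "utf-8",      # ID3v2.4 only
}


def decode_id3_text(frame_body: bytes) -> str:
    """Decode a text frame body: one encoding byte followed by the text."""
    if not frame_body:
        return ""
    codec = ID3_TEXT_ENCODINGS.get(frame_body[0])
    if codec is None:
        raise ValueError("unknown ID3 text encoding byte: %d" % frame_body[0])
    # Drop the optional terminating NUL character(s) after decoding.
    return frame_body[1:].decode(codec).rstrip("\x00")
```

With this layout, a Unicode frame normally starts with $01 $FF $FE, which matches the $FFFE seen at the start of most tags.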

Posted: **Fri Sep 25, 2009 10:18 am**

c4s wrote:
pdwyer wrote:not seeing yours

http://en.wikipedia.org/wiki/Byte-order_mark
That's why I'm confused.

Trond wrote:Unicode strings don't have any special beginning.

$FFFE is at the start of some unicode files. It means the file uses UCS2 encoding (the same as PB uses for unicode strings).

$FF00FE is something different entirely. It probably isn't a unicode file, or it's a unicode file without a byte order mark. Or you simply read the value wrong, and it's really $FF FE 00 00 or $00 00 FE FF, which means UTF-32.
To be more precise: The string I'm reading is an ID3 tag. The tag has a unicode flag, so the whole string must be in unicode as far as I know.
So normally the unicode tags I found began with $FFFE, but now I found an MP3 file where every tag has $FF00FE at the beginning.

I really don't know what this means.

Well, ID3 is messed up by a lot of editors (including Windows Media Player, which doesn't save the tag size of APIC as a syncsafe integer). Maybe it's not your fault, but the fault of the editor you used for the file.
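Besides a faulty editor, there is another possibility that nobody in the thread confirmed: the ID3v2 unsynchronisation scheme. When the unsynchronisation flag is set (bit 7 of the flags byte in the tag header), the writer inserts a $00 after every $FF that is followed by $00 or by a byte of the form %111xxxxx, so a $FF $FE byte order mark is stored as $FF $00 $FE. Assuming that is what happened to this file, a sketch of undoing it in Python could look like this (the function names are invented for the example):

```python
def is_unsynchronised(tag_header: bytes) -> bool:
    """Check bit 7 of the flags byte (offset 5) of a 10-byte ID3v2 header."""
    return len(tag_header) >= 6 and bool(tag_header[5] & 0x80)


def remove_unsynchronisation(data: bytes) -> bytes:
    """Undo unsynchronisation by dropping the $00 inserted after each $FF."""
    # Every $FF $00 pair in unsynchronised data is an $FF plus a padding
    # byte, so a left-to-right replace restores the original bytes.
    return data.replace(b"\xff\x00", b"\xff")
```

Applied to a frame text that starts with $FF $00 $FE, this gives back $FF $FE followed by ordinary UTF-16 data. It should only be applied when the flag is actually set, otherwise genuine $FF $00 sequences would be damaged.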

Posted: **Fri Sep 25, 2009 5:41 pm**

I'm working from memory, but I'm fairly sure I've seen this used to determine whether the file uses little-endian or big-endian byte ordering.

Posted: **Fri Sep 25, 2009 5:43 pm**

Hm, I just found out that the following will handle the string as I need it:

Content.s = PeekS(*MemID, Size, #PB_UTF8)

...Because it gives the "best" result whether or not the executable is compiled as unicode and whether or not the string is unicode, right?
At least it can handle $FF00FE and stuff.

strange beginning of Unicode String..
