strange beginning of Unicode String..
Posted: Thu Sep 24, 2009 9:31 pm
by c4s
First of all: I don't know too much about Unicode.
Anyway I realized that unicode strings normally begin with $FFFE. But now I have one that has $FF00FE at the beginning and I don't understand what this means.
Any ideas?
Re: strange beginning of Unicode String..
Posted: Thu Sep 24, 2009 10:12 pm
by Edwin Knoppert
It is a file header for Unicode or other encodings.
You can, for example, open this file in Notepad and use "Save As" to check the current file type.
The 2 or 3 bytes have a specific order and value that indicate the specific type.
Old-fashioned ANSI files don't have this header.
Re: strange beginning of Unicode String..
Posted: Fri Sep 25, 2009 8:13 am
by pdwyer
Re: strange beginning of Unicode String..
Posted: Fri Sep 25, 2009 9:50 am
by Trond
Unicode strings don't have any special beginning.
$FFFE is at the start of some unicode files. It is the byte order mark (the bytes $FF $FE) and means the file uses little-endian UCS-2/UTF-16 encoding (the same as PB uses for unicode strings).
$FF00FE is something different entirely. It probably isn't a unicode file, or it's a unicode file without a byte order mark. Or you simply read the value wrong, and it's really $FF FE 00 00 or $00 00 FE FF, which means UTF-32.
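A minimal sketch of the byte order mark check described above, written in Python rather than PureBasic purely for illustration. The name detect_bom and the table BOMS are hypothetical, not part of any library. Note that $FF $FE $00 $00 is ambiguous (UTF-32 little-endian, or UTF-16 little-endian followed by a NUL character); this sketch assumes UTF-32 in that case by testing the longer marks first.

```python
# Sketch only: detect_bom and BOMS are hypothetical names for illustration.
# Longer marks come first so that FF FE 00 00 is not mistaken for FF FE.
BOMS = [
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM matches."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# The three bytes from the question do not match any byte order mark.
print(detect_bom(b"\xff\x00\xfe"))
print(detect_bom(b"\xff\xfeH\x00i\x00"))
```

A buffer with no recognised mark yields (None, 0), which is what the $FF $00 $FE sequence from the first post produces.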
Re: strange beginning of Unicode String..
Posted: Fri Sep 25, 2009 10:10 am
by c4s
That's why I'm confused.
To be more precise: The string I'm reading is an ID3 tag. The tag has a unicode flag, so the whole string must be in unicode as far as I know.
So normally the unicode tags I found began with $FFFE, but now I found an MP3 file where every tag has $FF00FE at the beginning.
I really don't know what this means.
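One possible explanation, which nobody in the thread confirms: ID3v2 defines an "unsynchronisation" scheme in which the writer inserts a $00 byte after every $FF that is followed by a byte of $E0 or higher (or by $00), so a byte order mark $FF $FE would be stored as $FF $00 $FE. Whether this particular file has the unsynchronisation flag set is an assumption. A rough sketch of undoing the scheme, in Python for illustration (remove_unsync is a hypothetical name):

```python
# Sketch only: remove_unsync is a hypothetical helper for illustration.
# It assumes the data really was written with ID3v2 unsynchronisation.
def remove_unsync(data: bytes) -> bytes:
    """Drop the $00 byte that unsynchronisation inserts after each $FF."""
    return data.replace(b"\xff\x00", b"\xff")


raw = b"\xff\x00\xfeH\x00i\x00"   # stored text, starting with FF 00 FE
fixed = remove_unsync(raw)         # now starts with the usual FF FE
print(fixed.decode("utf-16"))      # the utf-16 codec consumes the BOM
```

If the bytes after this step start with $FF $FE again, the text can be read as ordinary little-endian UTF-16.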
Re: strange beginning of Unicode String..
Posted: Fri Sep 25, 2009 10:18 am
by DarkDragon
Well, ID3 is messed up by a lot of editors (including Windows Media Player, which doesn't save the tag size of APIC as a syncsafe integer). Maybe it's not your fault, but the fault of the editor you used for the file.
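For reference, a syncsafe integer in ID3v2 stores 7 bits per byte with the top bit of each byte always zero, so four bytes carry 28 bits. A minimal sketch of decoding one, in Python for illustration (decode_syncsafe is a hypothetical name, and rejecting bytes with the top bit set is a choice made here, not something the thread specifies):

```python
# Sketch only: decode_syncsafe is a hypothetical helper for illustration.
def decode_syncsafe(b: bytes) -> int:
    """Decode a 4-byte ID3v2 syncsafe integer (7 useful bits per byte)."""
    if len(b) != 4:
        raise ValueError("a syncsafe size is exactly 4 bytes")
    if any(x & 0x80 for x in b):
        # A set top bit means the value was not written as syncsafe.
        raise ValueError("top bit set: not a syncsafe integer")
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


print(decode_syncsafe(bytes([0x00, 0x00, 0x02, 0x01])))  # 2 * 128 + 1
```

A size field written as a plain 32-bit integer may contain bytes of $80 or higher, which is one way to notice that an editor did not use the syncsafe form.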
Re: strange beginning of Unicode String..
Posted: Fri Sep 25, 2009 5:41 pm
by akj
I'm working from memory, but I'm fairly sure I've seen this used to determine whether the file uses little-endian or big-endian byte ordering.
Re: strange beginning of Unicode String..
Posted: Fri Sep 25, 2009 5:43 pm
by c4s
Hm, I just found out that the following will handle the string as I need it:
Code:
Content.s = PeekS(*MemID, Size, #PB_UTF8)
...because it will display the "best" result whether or not the executable is compiled as unicode and whether or not the string is unicode, right?
At least it can handle $FF00FE and stuff.