Question about variable endianness

miso · Post by **miso** » Thu Jan 23, 2025 9:38 pm

Hello all.

Do PureBasic variables use the same endianness on every platform/CPU, or are there differences?
In PureBasic, if I allocate some memory, poke some longs into it, send it over the network to a client also written in PB, and peek the longs there, is there a possibility that they will be read in the wrong byte order?

Thanks in advance for any answers.

Quin · Post by **Quin** » Thu Jan 23, 2025 9:43 pm

https://www.purebasic.fr/english/viewtopic.php?p=600345

miso · Post by **miso** » Thu Jan 23, 2025 9:55 pm

Thank you, Quin! (I should have searched before asking...)

infratec · Post by **infratec** » Sun Jan 26, 2025 4:57 pm

The problem is that a lot of network-related stuff uses network byte order, which is big endian.

https://www.sciencedirect.com/topics/co ... byte-order

So you need to convert PB stuff.
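A minimal sketch of such a conversion, assuming 32-bit longs are the only values that need it. `PokeLongBE` and `PeekLongBE` are hypothetical helper names, not built-in PureBasic functions. They address each byte explicitly, so the buffer ends up in network byte order no matter how the host stores its longs.

```purebasic
; Hypothetical helpers (not part of PureBasic): store and load a long
; in network byte order (most significant byte first).
Procedure PokeLongBE(*Buffer, Value.l)
  PokeA(*Buffer,     (Value >> 24) & $FF)
  PokeA(*Buffer + 1, (Value >> 16) & $FF)
  PokeA(*Buffer + 2, (Value >> 8) & $FF)
  PokeA(*Buffer + 3, Value & $FF)
EndProcedure

Procedure.l PeekLongBE(*Buffer)
  Protected Result.l
  Result = PeekA(*Buffer) << 24
  Result = Result | (PeekA(*Buffer + 1) << 16)
  Result = Result | (PeekA(*Buffer + 2) << 8)
  Result = Result | PeekA(*Buffer + 3)
  ProcedureReturn Result
EndProcedure

*Mem = AllocateMemory(4)
If *Mem
  PokeLongBE(*Mem, $11223344)
  Debug Hex(PeekA(*Mem))                 ; 11 (most significant byte comes first)
  Debug Hex(PeekLongBE(*Mem), #PB_Long)  ; 11223344
  FreeMemory(*Mem)
EndIf
```

Because the helpers never read or write the long as a whole, they need no endianness check of their own.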

miso · Post by **miso** » Sun Jan 26, 2025 5:17 pm

Thank you, Infratec! Looking through all this, I think I will be all right with only little endian (and no check for endianness) as long as only PureBasic programs are involved in using my packets/data. ARM can do both, but Fred said PB is little endian on all platforms.

I don't want to cause trouble for future me though, so I might put a byte order mark bit in my packets...
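One way such a marker could look, as a sketch only: the sender writes a known 16-bit value in its native order at the start of the packet, and the receiver checks whether it reads that value back or its byte-swapped twin. The marker value `$FEFF`, the constant name `#ByteOrderMark`, and the 6-byte packet layout are all assumptions made for illustration. A multi-byte marker is used because a value stored in a single byte looks the same in either byte order.

```purebasic
#ByteOrderMark = $FEFF ; hypothetical marker value, reads as $FFFE when byte-swapped

*Packet = AllocateMemory(6)
If *Packet
  ; Sender side: marker and payload, both written in native byte order
  PokeU(*Packet, #ByteOrderMark)
  PokeL(*Packet + 2, 123456)

  ; Receiver side: inspect the marker before trusting the payload
  Select PeekU(*Packet)
    Case #ByteOrderMark
      Debug "Same byte order, payload = " + Str(PeekL(*Packet + 2))
    Case $FFFE
      Debug "Opposite byte order, every multi-byte field must be swapped"
    Default
      Debug "Not a valid packet"
  EndSelect

  FreeMemory(*Packet)
EndIf
```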

benubi · Post by **benubi** » Tue Jan 28, 2025 4:21 pm

Yes, I was once "confused" and thought ARM would have a different endianness than x86/x64 Windows.

In theory this should test endianness in case it will be needed but that's only in theory ("fun fact" I wrote dozens of procedures at first, assuming there could be problems that will possibly never exist, lol)

Code: Select all

If 1 & $1 = 1 ;  "first" byte/lo byte
  ; little endian ; least important byte (first) = 1
Else 
 ; Big endian ; first byte (most important) = 0
EndIf

STARGÅTE · Post by **STARGÅTE** » Tue Jan 28, 2025 7:38 pm

benubi wrote: Tue Jan 28, 2025 4:21 pm Yes, I was once "confused" and thought arm would be a different Endianness than on x64-x86 windows.

In theory this should test endianness in case it will be needed but that's only in theory ("fun fact" I wrote dozens of procedures at first, assuming there could be problems that will possibly never exist, lol)
Code: Select all
If 1 & $1 = 1 ;  "first" byte/lo byte
  ; little endian ; least important byte (first) = 1
Else 
 ; Big endian ; first byte (most important) = 0
EndIf

How should this code check the endianness? 1 and $1 are the same number, both 1, so 1 & 1 is always 1.
At the numerical level, endianness is not visible, because it only concerns how numbers are stored in memory.

miso · Post by **miso** » Tue Jan 28, 2025 7:56 pm

This should work (written by Boddhi, I just changed PeekB to PeekA):

Code: Select all

Procedure.a Fc_ProcessorTest() ; #True if Intel, #False if Motorola
  Protected.w Value ; => 2 bytes variable type needed

  Value=1
  ; 1 will be coded '00 01' under Motorola and '01 00' under Intel
  ; We test first byte
  If PeekA(@Value)=Value:ProcedureReturn #True:EndIf
  ; or 
  ; ProcedureReturn Bool(PeekA(@Value)=Value)
EndProcedure

If Fc_ProcessorTest()
  Debug "Intel/ARM processor"
Else
  Debug "Motorola processor"
EndIf

benubi · Post by **benubi** » Wed Jan 29, 2025 10:14 pm

STARGÅTE wrote: Tue Jan 28, 2025 7:38 pm How should this code check the endianness? 1 and $1 are the same number, both 1, so 1 & 1 is always 1.
At the numerical level, endianness is not visible, because it only concerns how numbers are stored in memory.

Allegedly not when you use the 0x or $ operators; as I said, I couldn't test it. This may also depend on how the compiler "sees" a $00000001.

But here you give the "physical" description of the bits, that's why file id's and tags work inter-operably, like in PNG or RIFF formats and so on. Because they all use that operator for the constant declaration in their C sources; they would come out inverted otherwise.
I stumbled over a little C code that claimed to test in that way. So perhaps it needs a little correction like more zeros to make it fail proof, but I bet it should magically work (it's not 1 on big endian systems).

Code: Select all

#IsLittleEndian = 1 & $0001            ; word 0001h or 1000h
#IsBigEndian    = ~#IsLittleEndian & 1 ; 1
Debug "Is Little? "+#IsLittleEndian
Debug "Is Big? "+ #IsBigEndian

Plot twist (or not) to make it more profound or confusing... The values are already all in big endian format via the decimal system, which may be additionally misleading. In CSS it's possible to define colors with the # operator: #RRGGBB and #RGB (intuitive), whereas in PureBasic speak it would look like $BBGGRR <- RR least significant byte (when the format is RGB and not BGR).

$BGR - It's the inverse order of the byte order, because the notation is inherited from the "big endian" decimal system.

STARGÅTE · Post by **STARGÅTE** » Wed Jan 29, 2025 10:44 pm

Dear benubi,

you are mixing up the number representation (source code) with the memory representation (runtime).

In PureBasic the $-prefix just indicates that the following digits are hexadecimal. It has nothing to do with the storage!
Therefore, a $1 or $0001 or $00000001 is always a number 1 and a $1000 or $00001000 is always a 4096, on all systems.
It's just the number representation in the source code, and the least significant digits are always on the right.
The $-notation is (in PureBasic) not the memory representation of a number.

In little endian, the number $DEADBEEF is stored as [EF][BE][AD][DE] from low to high memory location.
In big endian, the same number $DEADBEEF is stored as [DE][AD][BE][EF] from low to high memory location.
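The two layouts above can be made visible at runtime by reading the variable back one byte at a time. This is only a sketch; the variable names are arbitrary.

```purebasic
Define Value.l = $DEADBEEF
Define i, Dump.s

; Walk from the lowest to the highest memory address of the variable
For i = 0 To 3
  Dump = Dump + "[" + RSet(Hex(PeekA(@Value + i)), 2, "0") + "]"
Next

Debug Dump ; [EF][BE][AD][DE] on a little-endian system
```

The number in the source code is the same in both cases; only the order in which `PeekA` encounters its bytes differs.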

benubi · Post by **benubi** » Wed Jan 29, 2025 11:05 pm

How "sad". But then it must work with 0x in C? And #ID's in PureBasic would have to be defined via macros all the time if one would like to preserve compatibility?

mk-soft · Post by **mk-soft** » Wed Jan 29, 2025 11:38 pm

It doesn't matter whether you define it in C or PB. It is always stored in memory in low-high byte notation.
This holds regardless of whether it is Intel, ARM or M1 (macOS) code. The byte order follows the OS, and all of them use low-high byte notation. ARM also offers high-low byte notation as an option, but it is not used with Linux and co.

With Modbus/TCP, the word is transferred in high-low byte notation and must be converted. This also applies to data exchange with S7-PLC for raw data, as the S7-PLC also uses high-low byte notation.

The check is therefore not relevant when programming.
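For the Modbus/TCP case, the conversion of a single 16-bit word could look like the sketch below. `SwapWord` is a hypothetical helper name, not a built-in function, and the sketch assumes the word was read from the received buffer in native (low-high) order first.

```purebasic
; Hypothetical helper: exchange the high and low byte of a 16-bit word
Procedure.u SwapWord(Value.u)
  ProcedureReturn ((Value << 8) | (Value >> 8)) & $FFFF
EndProcedure

Debug Hex(SwapWord($1234)) ; 3412
```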

benubi · Post by **benubi** » Thu Jan 30, 2025 11:46 am

That's all very logical in fact. The more I think about it, the less feasible "my idea" seems, and it somehow must happen the way you guys describe it. I trust you on that anyway.

So in fact checking against constants for such ID's across different byte orders can't really work. You need a runtime procedure that receives an arbitrary constant #ID as a parameter, does the byte-order check, and then returns the ID "number" unchanged or inverted, because you can't fix the byte order syntactically in any programming language.

An alternative could be to use DataSection, with only .a and .b types allowed. This would also need a runtime function, but the result would be of "constant" nature and with a portable byte order. Probably not so uncommon for GUID's and such.
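A sketch of that DataSection idea, using the first four bytes of the PNG signature as the example tag. The label name `FileTag` is made up for illustration. Since every entry is a single byte, the order in memory is exactly the order written in the source, whatever the byte order of the CPU.

```purebasic
DataSection
  FileTag:
  Data.a $89, 'P', 'N', 'G' ; bytes in the exact order they appear in the file
EndDataSection

Define i, Dump.s

; Read the tag back byte by byte from the label address
For i = 0 To 3
  Dump = Dump + RSet(Hex(PeekA(?FileTag + i)), 2, "0") + " "
Next

Debug Dump ; 89 50 4E 47
```

A received tag could then be checked against `?FileTag` with `CompareMemory()` instead of comparing it to a numeric constant.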

BTW

I wonder how "mixed endianness" GUID's work. I believe that's where my interest in that ordering started a few months ago; the "new" GUID partition tables, to be more precise. One way I imagine a mixed GUID would be to divide it in two quads and make quad[1] = Inverted(quad[0]). That's all very obscure and occult to me.

Edit: Or a mixed-endian GUID works like building it with runtime functions something like New_GUID(...)

Code: Select all

DataSection:
  GUID_NativeEndian:
  Data.l 1; Data1.l
  Data.w 2; Data2.w
  Data.w 3; Data3.w
  Data.b 0,1,2,3,4,5,6,7; Data4.b[8]
EndDataSection
