Question about variable endianness

Post by miso »

Hello all.

Do PureBasic variables use the same endianness on every platform/CPU, or are there differences?
In PureBasic, if I allocate some memory, poke some longs into it, send it over the network to a client also written in PB, and peek the longs there, is there a possibility that they will be read in the wrong byte order?

Thanks in advance for any answers.

Post by Quin »


Post by miso »

Thank you, Quin! (Should have searched before asking... :oops: )

Post by infratec »

The problem is that a lot of network-related stuff uses network byte order, which is big endian.

https://www.sciencedirect.com/topics/co ... byte-order

So you need to convert the PB values.
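
A minimal sketch of such a conversion for a 32-bit long, assuming a little endian host. The procedure name ReverseLong is hypothetical and not part of PureBasic; it only reverses the four bytes of the value.

```purebasic
; Hypothetical helper: reverse the byte order of a 32-bit long.
Procedure.l ReverseLong(Value.l)
  Protected Result.l
  ; Move each byte to its mirrored position; the masks discard sign-extension bits.
  Result = ((Value & $FF) << 24) | ((Value & $FF00) << 8) | ((Value >> 8) & $FF00) | ((Value >> 24) & $FF)
  ProcedureReturn Result
EndProcedure

Define *Buffer = AllocateMemory(4)
If *Buffer
  ; On a little endian host, the swapped value lands in memory in big endian order.
  PokeL(*Buffer, ReverseLong($11223344))
  Debug Hex(PeekA(*Buffer))   ; first byte in memory: 11 on a little endian host
  FreeMemory(*Buffer)
EndIf
```

The receiver would apply the same procedure after PeekL() to get the native value back.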

Post by miso »

Thank you, infratec! Looking through all this, I think I will be all right with only little endian (and no check for endianness) as long as only PureBasic programs are involved in using my packets/data. ARM can do both, but Fred said PB is little endian on all platforms.

I don't want to make trouble for future me though, so I might put a byte order mark bit in my packets...
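
One way such a marker could look, as a rough sketch only: here it is a whole long instead of a single bit, and the constant #OrderMark, its value and the 8-byte packet layout are all made up for illustration.

```purebasic
#OrderMark = $01020304   ; hypothetical marker value, asymmetric on purpose

Define *Packet = AllocateMemory(8)
If *Packet
  ; Sender: write the marker and the payload in native byte order.
  PokeL(*Packet, #OrderMark)
  PokeL(*Packet + 4, 123456)

  ; Receiver: the marker tells whether the sender used the same byte order.
  Select PeekL(*Packet)
    Case #OrderMark
      Debug "Same byte order, payload = " + Str(PeekL(*Packet + 4))
    Case $04030201
      Debug "Opposite byte order, swap every value before use"
    Default
      Debug "Not a valid packet"
  EndSelect
  FreeMemory(*Packet)
EndIf
```

Between two PureBasic programs the first case is the one that should be hit, so the check costs little and only guards against future surprises.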

Post by benubi »

Yes, I was once "confused" and thought ARM would have a different endianness than x86/x64 Windows.

In theory this should test endianness in case it is ever needed, but that's only in theory ("fun fact": I wrote dozens of procedures at first, assuming there could be problems that will possibly never exist, lol)

Code: Select all

If 1 & $1 = 1 ; "first" byte / low byte
  ; Little endian: least significant byte (first) = 1
Else
  ; Big endian: first byte (most significant) = 0
EndIf

Post by STARGÅTE »

benubi wrote: Tue Jan 28, 2025 4:21 pm
Yes, I was once "confused" and thought ARM would have a different endianness than x86/x64 Windows.

In theory this should test endianness in case it is ever needed, but that's only in theory ("fun fact": I wrote dozens of procedures at first, assuming there could be problems that will possibly never exist, lol)

Code: Select all

If 1 & $1 = 1 ; "first" byte / low byte
  ; Little endian: least significant byte (first) = 1
Else
  ; Big endian: first byte (most significant) = 0
EndIf

How should this code check the endianness? 1 and $1 are the same number, both 1, so 1 & 1 is always 1.
At the numerical level, endianness is not visible, because it only concerns how numbers are stored in memory.

Post by miso »

This should work (written by Boddhi; I just changed PeekB to PeekA):

Code: Select all

Procedure.a Fc_ProcessorTest() ; #True if little endian (Intel/ARM), #False if big endian (Motorola)
  Protected.w Value ; a 2-byte variable type is needed

  Value = 1
  ; 1 is stored as '00 01' on Motorola (big endian) and as '01 00' on Intel (little endian)
  ; We test the first byte in memory
  If PeekA(@Value) = Value : ProcedureReturn #True : EndIf
  ; or
  ; ProcedureReturn Bool(PeekA(@Value) = Value)
EndProcedure

If Fc_ProcessorTest()
  Debug "Intel/ARM processor"
Else
  Debug "Motorola processor"
EndIf

Post by benubi »

STARGÅTE wrote: Tue Jan 28, 2025 7:38 pm
How should this code check the endianness? 1 and $1 are the same number, both 1, so 1 & 1 is always 1.
At the numerical level, endianness is not visible, because it only concerns how numbers are stored in memory.

Allegedly not when you use the 0x or $ operators; as I said, I couldn't test it. This may also depend on how the compiler "sees" a $00000001.

But here you give the "physical" description of the bits; that's why file IDs and tags work interoperably, as in the PNG or RIFF formats and so on. They all use that operator for the constant declarations in their C sources; they would come out inverted otherwise.
I stumbled over a little piece of C code that claimed to test it that way. So perhaps it needs a little correction, like more zeros, to make it fail-proof, but I bet it should magically work (it's not 1 on big endian systems).

Code: Select all

#IsLittleEndian = 1 & $0001            ; word 0001h or 1000h
#IsBigEndian    = ~#IsLittleEndian & 1 ; 1
Debug "Is Little? "+#IsLittleEndian
Debug "Is Big? "+ #IsBigEndian

Plot twist (or not), to make it more profound or confusing... The values are already all in big endian format via the decimal system, which may be additionally misleading. In CSS it's possible to define colors with the # operator: #RRGGBB and #RGB (intuitive), whereas in PureBasic it would look like $BBGGRR, with RR as the least significant byte (when the format is RGB and not BGR).

$BGR: it's the inverse of the byte order, because the notation is inherited from the "big endian" decimal system.

Post by STARGÅTE »

Dear benubi,

You are mixing up the number representation (source code) with the memory representation (runtime).

In PureBasic, the $ prefix just indicates that the following digits are hexadecimal. It has nothing to do with storage!
Therefore, $1, $0001 and $00000001 are always the number 1, and $1000 and $00001000 are always 4096, on all systems.
It's just the number representation in the source code, and the least significant digits are always on the right.
The $ notation is (in PureBasic) not the memory representation of a number.

In little endian, the number $DEADBEEF is stored as [EF][BE][AD][DE] from low to high memory location.
In big endian, the same number $DEADBEEF is stored as [DE][AD][BE][EF] from low to high memory location.
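
The memory layout can be made visible by reading the variable byte by byte. This is only a small sketch; the variable names are arbitrary.

```purebasic
Define Value.l = $DEADBEEF
Define i, Dump.s

; Read the four bytes of the long from the lowest to the highest address.
For i = 0 To 3
  Dump = Dump + "[" + RSet(Hex(PeekA(@Value + i)), 2, "0") + "]"
Next

Debug Dump   ; [EF][BE][AD][DE] on a little endian machine
```

The source code literal is the same on every system; only the order of the bytes printed here would differ on a big endian machine.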

Post by benubi »

How "sad". But then it must work with 0x in C? And would #IDs in PureBasic have to be defined via macros all the time if one wanted to preserve compatibility?

Post by mk-soft »

It doesn't matter whether you define it in C or PB. It is always stored in memory in low-high byte notation.
That holds regardless of whether it is Intel, ARM or M1 (macOS) code. The basis is taken from the OS, and this is the same for all of them: low-high byte notation. ARM also has the option of high-low byte notation, but this is not used with Linux and co.

With Modbus/TCP, a word is transferred in high-low byte notation and must be converted. This also applies to raw data exchange with an S7-PLC, as the S7-PLC also uses high-low byte notation.
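
A possible sketch of that conversion for a 16-bit word, written byte by byte so that it does not depend on the host byte order. The names PokeWordBE and PeekWordBE are hypothetical helpers, not PureBasic functions.

```purebasic
; Hypothetical helper: write a 16-bit word in high-low (big endian) order.
Procedure PokeWordBE(*Mem, Value.u)
  PokeA(*Mem, (Value >> 8) & $FF)   ; high byte first
  PokeA(*Mem + 1, Value & $FF)      ; low byte second
EndProcedure

; Hypothetical helper: read a 16-bit word stored in high-low order.
Procedure.u PeekWordBE(*Mem)
  ProcedureReturn (PeekA(*Mem) << 8) | PeekA(*Mem + 1)
EndProcedure

Define *Buffer = AllocateMemory(2)
If *Buffer
  PokeWordBE(*Buffer, $1234)
  Debug Hex(PeekA(*Buffer))        ; 12, the high byte is stored first
  Debug Hex(PeekWordBE(*Buffer))   ; 1234
  FreeMemory(*Buffer)
EndIf
```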

The check is therefore not relevant when programming.

Post by benubi »

That's all very logical in fact. The more I think about it, the less feasible "my idea" seems, and it must somehow happen the way you guys describe it. I trust you on that anyway ;)

So in fact, checking against constants for such IDs across different byte orders can't really work. You need a runtime procedure that receives an arbitrary constant #ID as a parameter, does the byte-order check, and then returns the ID "number" unchanged or inverted, because you can't otherwise fix the byte order directly and syntactically in any programming language.

An alternative could be to use a DataSection with only .a and .b types allowed. This would also need a runtime function, but the result would be of a "constant" nature and have a portable byte order. Probably not so uncommon for GUIDs and such.
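
A rough sketch of that idea, assuming a 4-byte tag: the label FileTag, the tag bytes and the procedure TagMatches are invented for illustration. Because the tag is stored as single bytes, its order in memory is fixed and a plain memory compare is enough.

```purebasic
DataSection
  FileTag:                       ; hypothetical 4-byte tag "RIFF", one byte per entry
  Data.a $52, $49, $46, $46
EndDataSection

; Hypothetical helper: compare the first 4 bytes of a buffer with the tag.
Procedure.i TagMatches(*Mem)
  ProcedureReturn CompareMemory(*Mem, ?FileTag, 4)
EndProcedure

Define *Buffer = AllocateMemory(4)
If *Buffer
  ; Simulate a received header by writing the four characters one by one.
  PokeA(*Buffer, 'R') : PokeA(*Buffer + 1, 'I')
  PokeA(*Buffer + 2, 'F') : PokeA(*Buffer + 3, 'F')

  If TagMatches(*Buffer)
    Debug "Tag matches"
  Else
    Debug "Tag does not match"
  EndIf
  FreeMemory(*Buffer)
EndIf
```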

BTW :arrow: I wonder how "mixed endianness" GUIDs work. I believe that's where my interest in byte ordering started a few months ago; the "new" GUID partition tables, to be more precise. One way I imagine a mixed GUID would be to divide it into two quads and make quad[1] = Inverted(quad[0]). That's all very obscure and occult to me :lol:

Edit: Or a mixed-endian GUID works like building it with runtime functions, something like New_GUID(...)

Code: Select all

DataSection
  GUID_NativeEndian:
  Data.l 1                       ; Data1.l
  Data.w 2                       ; Data2.w
  Data.w 3                       ; Data3.w
  Data.b 0, 1, 2, 3, 4, 5, 6, 7  ; Data4.b[8]
EndDataSection