How to align memory?

Mistrel · Post by **Mistrel** » Fri Oct 08, 2010 7:41 am

I've been having some unexpected memory access violations with InterlockedCompareExchange. I believe this is due to memory alignment, according to MSDN:

The parameters for this function must be aligned on a 32-bit boundary; otherwise, the function will behave unpredictably on multiprocessor x86 systems and any non-x86 systems. See _aligned_malloc.

I've been looking at pcfreak's code as a solution:

http://www.purebasic.fr/english/viewtop ... 12&t=41127

However, if I want to align a long on a 32-bit system, what do I specify for the alignment in bytes? Also, what would I specify for an integer on a 64-bit system? I don't follow exactly how alignment works.

Thorium · Post by **Thorium** » Fri Oct 08, 2010 8:59 am

4 bytes.

Alignment means the memory address is divisible by a fixed number.
Alignment is for performance. A memory access is fastest if the address is aligned to a multiple of the size of the data type being accessed. So if you access a long, the memory should be aligned to 4 bytes = 32 bits.
You can also align code, which can speed up loops. For code you should use the native register size for alignment.
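To make "divisible by a fixed number" concrete, here is a minimal sketch of an alignment check. The procedure name IsAligned is made up for this example, and it assumes the alignment is a power of two, so the test can be done with a bit mask instead of a division.

```purebasic
; Hypothetical helper: returns #True if *address is a multiple of alignment.
; Assumes alignment is a power of two (2, 4, 8, 16, ...).
Procedure.i IsAligned(*address, alignment.i)
  If *address & (alignment - 1)
    ProcedureReturn #False ; some low bits are set, so not a multiple
  EndIf
  ProcedureReturn #True
EndProcedure

Define value.l
Define *buffer = AllocateMemory(64)

If *buffer
  Debug IsAligned(@value, 4)      ; is the long variable on a 4-byte boundary?
  Debug IsAligned(*buffer, 4)     ; is the allocated block on a 4-byte boundary?
  Debug IsAligned(*buffer + 1, 4) ; one byte further cannot be aligned if *buffer is
  FreeMemory(*buffer)
EndIf
```

A check like this can be placed in front of a call that demands aligned parameters to see whether alignment is really the cause of a crash.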

Mistrel · Post by **Mistrel** » Fri Oct 08, 2010 9:18 am

Is it still 4 bytes for 64-bit?

inc. · Post by **inc.** » Fri Oct 08, 2010 11:07 am

64-bit -> 8-byte alignment.
Some instructions even expect 16-byte alignment (128 bits).

Code: Select all

; Allocates size bytes aligned to a 16-byte boundary.
; The byte just below the returned pointer stores the offset (1..16)
; back to the start of the block that AllocateMemory really returned.
Procedure AllocateMemoryAligned(size.l)
  Protected diff.l, *ptr.Long = #Null
  If size > ($7FFFFFFF-16)
    ProcedureReturn #Null
  EndIf
  *ptr = AllocateMemory(size+16)
  If Not *ptr
    ProcedureReturn #Null
  EndIf
  diff = ((-*ptr - 1)&15) + 1 ; distance to the next 16-byte boundary, 1..16
  *ptr = *ptr + diff
  PokeB(*ptr-1, diff)
  ProcedureReturn *ptr
EndProcedure

Procedure FreeMemoryAligned(*ptr)
  If *ptr
    FreeMemory(*ptr - PeekB(*ptr-1))
  EndIf
EndProcedure

; Resizes an aligned block. Returns #Null on failure; the old block stays valid then.
Procedure ReAllocateMemoryAligned(*ptr, size.l)
  Protected oldDiff.l, newDiff.l, *raw
  If size > ($7FFFFFFF-16)
    ProcedureReturn #Null
  EndIf
  If Not *ptr
    ProcedureReturn AllocateMemoryAligned(size)
  EndIf
  oldDiff = PeekB(*ptr-1)
  *raw = ReAllocateMemory(*ptr-oldDiff, size+16)
  If Not *raw
    ProcedureReturn #Null
  EndIf
  ; The block may have moved to an address with a different alignment,
  ; so the offset has to be recomputed and the data shifted if it changed.
  newDiff = ((-*raw - 1)&15) + 1
  If newDiff <> oldDiff
    MoveMemory(*raw+oldDiff, *raw+newDiff, size)
  EndIf
  PokeB(*raw+newDiff-1, newDiff)
  ProcedureReturn *raw+newDiff
EndProcedure

Procedure zeroAllocateMemoryAligned(size.l)
  Protected *ptr = AllocateMemoryAligned(size)
  If *ptr And size > 0
    FillMemory(*ptr, size, 0) ; zero the usable part of the block
  EndIf
  ProcedureReturn *ptr
EndProcedure

test = zeroAllocateMemoryAligned(1024)

Debug test

FreeMemoryAligned(test)
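The offset computed in AllocateMemoryAligned, ((-*ptr - 1)&15) + 1, is not obvious at first sight. As far as I can tell it is the same value as 16 - (*ptr & 15): the number of bytes up to the next 16-byte boundary, always between 1 and 16, so there is always at least one byte below the aligned address to store the offset in. A small sketch comparing both forms, using made-up sample addresses instead of real pointers:

```purebasic
; Compare both forms of the offset for a few sample addresses.
; The addresses are plain numbers chosen for illustration only.
Define address.i, a.i, b.i

For address = 4096 To 4112 Step 4
  a = ((-address - 1) & 15) + 1 ; form used in AllocateMemoryAligned
  b = 16 - (address & 15)       ; equivalent, easier to read
  Debug Str(address) + ": " + Str(a) + " / " + Str(b) + " -> " + Str(address + a)
Next
```

For 4100 both forms give 12, which leads to 4112. For an address that is already aligned, such as 4096, they give 16 rather than 0, so 16 bytes are skipped.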

Mistrel · Post by **Mistrel** » Fri Oct 08, 2010 11:10 am

So alignment according to the register size is literally the number of bytes in the total bits supported by the processor?

16-bit is 2 bytes, 32-bit 4 bytes, 64-bit is 8 bytes?

cas · Post by **cas** » Fri Oct 08, 2010 1:16 pm

Isn't all memory returned by AllocateMemory already aligned? We only need to worry about alignment when we do pointer math (for example, *mem+k in a loop will not always be aligned). Here on a 64-bit OS everything is aligned to a 64-bit boundary, so there should be no problem with a function that needs memory aligned to a 32-bit boundary.

Code: Select all

EnableExplicit
DisableDebugger
Define k,count,*mem
For k=0 To 1000000
  *mem=AllocateMemory(20+Random(20))
  If *mem%8<>0 ;check if it is not aligned to 64bit boundary
    count+1
  EndIf
  ;FreeMemory(*mem) ;don't free memory!
Next
EnableDebugger
Debug count

Thorium · Post by **Thorium** » Fri Oct 08, 2010 6:32 pm

Mistrel wrote: So alignment according to the register size is literally the number of bytes in the total bits supported by the processor?

16-bit is 2 bytes, 32-bit 4 bytes, 64-bit is 8 bytes?

No.
It's as I wrote: the size of the data type you want to access tells you the best alignment.

word = 2 bytes
long = 4 bytes
quad = 8 bytes
integer = SizeOf(Integer)

The 16-byte alignment is for loading XMM registers. They are 128 bits wide, so it's a long quad or whatever you call a 16-byte number.

This is according to Intel's optimization guide.
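The sizes in the list above can be printed directly, using PureBasic's predefined Word, Long, Quad and Integer structures. This is only a quick sketch. The last value depends on whether you compile for x86 or x64.

```purebasic
; Natural alignment of a basic type = its size in bytes.
Debug SizeOf(Word)    ; 2
Debug SizeOf(Long)    ; 4
Debug SizeOf(Quad)    ; 8
Debug SizeOf(Integer) ; 4 with the x86 compiler, 8 with the x64 compiler
```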

Mistrel · Post by **Mistrel** » Sat Oct 09, 2010 2:20 am

Thorium wrote: The size of the datatype you want to access is what tells you the best alignment.

Would you give an example of how this would apply to a structure which uses several differently sized data types? I've never understood where padding should be applied and why.

Thorium · Post by **Thorium** » Sat Oct 09, 2010 8:05 am

In a structure you can simply add alignment bytes.

Code: Select all

Structure TestAlign
  Blub.b

  ;Blub breaks the alignment, as it is only 1 byte big

  Align1.b[3]

  ;we just insert a byte array with 3 elements to get 4-byte alignment for the next structure element

  Bla.l
EndStructure
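The effect of the padding bytes can be checked with OffsetOf. This is a sketch with two structure names made up for the comparison (Unpadded and Padded). It assumes the default structure layout, where the compiler inserts no padding of its own.

```purebasic
Structure Unpadded ; hypothetical: same members, no alignment bytes
  Blub.b
  Bla.l
EndStructure

Structure Padded   ; hypothetical: with the 3 alignment bytes
  Blub.b
  Align1.b[3]
  Bla.l
EndStructure

Debug OffsetOf(Unpadded\Bla) ; 1, so the long is not on a 4-byte boundary
Debug OffsetOf(Padded\Bla)   ; 4, so the long is on a 4-byte boundary
Debug SizeOf(Unpadded)       ; 5
Debug SizeOf(Padded)         ; 8
```

This only describes offsets inside the structure. The long ends up on a real 4-byte boundary only if the structure itself starts at an aligned address.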

Mistrel · Post by **Mistrel** » Sat Oct 09, 2010 8:48 am

Is this correct?

Code: Select all

Structure Test
  A.w
  B.b
  C.b ; <- no alignment needed?
  Pad1.b[3]
  D.l
  Pad2.b[4] ; <- should be aligned for 8 bytes?
  E.q
  F.b
  Pad3.b[7] ; <- should be aligned for 8 bytes?
  G.q
EndStructure

Also, why is alignment so important for 64-bit compiling but didn't seem to be an issue with 32-bit?

Wikipedia also mentions:

It is important to note that the last member is padded with the number of bytes required that the total size of the structure should be a least common multiple of the size of a largest structure member.

Does this apply to PureBasic as well?

Thorium · Post by **Thorium** » Sat Oct 09, 2010 11:10 am

Mistrel wrote: Is this correct?

Code: Select all

Structure Test
  A.w
  B.b
  C.b ; <- no alignment needed? Correct.
  Pad1.b[3] ; This is not needed, D is already aligned to 4 bytes: A (2) + B (1) + C (1) = 4
  D.l
  Pad2.b[4] ; <- should be aligned for 8 bytes? 8-byte alignment only matters on x64 or if you use MMX registers on x86. Once Pad1 is removed, D ends at offset 8, so E is already aligned and this padding is not needed either.
  E.q
  F.b
  Pad3.b[7] ; <- should be aligned for 8 bytes? Correct, same remark about x64/MMX as for E.
  G.q
EndStructure
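Instead of counting by hand, the number of padding bytes in front of a member can be computed from the current offset and the member's size. This is a minimal sketch. The procedure name PaddingFor is made up, and the offsets are those of the structure above with the unneeded 3-byte pad dropped.

```purebasic
; Hypothetical helper: padding bytes needed so that a member with the given
; alignment starts at a multiple of that alignment.
Procedure.i PaddingFor(offset.i, alignment.i)
  ProcedureReturn (alignment - offset % alignment) % alignment
EndProcedure

Debug PaddingFor(4, 4)  ; D.l after A.w, B.b, C.b (offset 4):   0
Debug PaddingFor(8, 8)  ; E.q after D.l (offset 8):             0
Debug PaddingFor(17, 8) ; G.q after E.q and F.b (offset 17):    7
```

So in this layout only the 7 bytes after F are actually required to keep every member on its natural boundary.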

Also, why is alignment so important for 64-bit compiling but didn't seem to be an issue with 32-bit?

It isn't important. It's just for performance, and Intel improved misaligned memory accesses a lot with the Core i7 and is still improving them. I don't know how things are on AMD.

Mistrel wrote: Wikipedia also mentions:

It is important to note that the last member is padded with the number of bytes required that the total size of the structure should be a least common multiple of the size of a largest structure member.

As far as I understand it, that text does not make any sense.
A structure doesn't need size padding. It could improve the performance of structure copying, but it is not important at all, except if an API function or library specifically asks for size padding or alignment. Code can be written in a way that it only works with aligned data; that is only done to gain some performance.
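One case where size padding does change something is several structures stored back to back, for example in an array. Here is a sketch with two made-up structures (Item and ItemPadded), assuming the default layout without compiler padding and elements placed directly one after another:

```purebasic
Structure Item       ; hypothetical: 8 + 1 = 9 bytes
  Value.q
  Flag.b
EndStructure

Structure ItemPadded ; hypothetical: same members plus 7 tail bytes = 16 bytes
  Value.q
  Flag.b
  Tail.b[7]
EndStructure

; The second element starts SizeOf() bytes after the first one, so this is
; the misalignment of Value in element 1 relative to an aligned element 0.
Debug SizeOf(Item) % 8       ; 1, Value of the second element is misaligned
Debug SizeOf(ItemPadded) % 8 ; 0, Value stays on an 8-byte boundary
```

Without the tail bytes only the first element keeps its quad aligned. With them every element does.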

Mistrel · Post by **Mistrel** » Sat Oct 09, 2010 11:50 am

I encounter a lot of posts regarding padding and Win32 structures, such as this:

http://www.purebasic.fr/english/viewtop ... 32#p288332

Is this more of a requirement of programming with Win32 on a 64-bit processor rather than a restriction imposed by the architecture itself?

Thorium · Post by **Thorium** » Sat Oct 09, 2010 12:28 pm

Mistrel wrote: Is this more of a requirement of programming with Win32 on a 64-bit processor rather than a restriction imposed by the architecture itself?

There is no restriction on the CPU; there are even special instructions to load XMM registers with unaligned data. So even SSE2 doesn't need 16-byte alignment, but it's faster if the data is aligned.

This is a specification of the WinAPI, not only for 64-bit; it applies to 32-bit as well: http://msdn.microsoft.com/en-us/library/aa296569.aspx

And it seems to be a C specification.

djes · Post by **djes** » Sat Oct 30, 2010 12:14 am

Thorium wrote: There is no restriction on the CPU; there are even special instructions to load XMM registers with unaligned data. So even SSE2 doesn't need 16-byte alignment, but it's faster if the data is aligned.

movups permits unaligned data, but try addps with a memory operand after that and you'll realize that you still need aligned data.
