Coding - AGP Writes and Compiler Optimizations

va!n
Addict
Posts: 1104
Joined: Wed Apr 20, 2005 12:48 pm


Post by va!n »

I found an interesting article by chaos/farbrausch about AGP coding and optimisations. I think it may be very interesting for some people here, so here we go...



AGP Writes and Compiler Optimizations

Whenever you write into a vertex or index buffer, it is very likely that you are directly accessing AGP memory. You probably know that you should write in sequential order. This is truly important: even exchanging two DWORDs can halve your performance.

I had to find that out the hard way when I wrote an inner loop like this:

Code:

  sInt x,z;
  sU32 col;
  sF32 *fp;
  BlaVertex *v;
  
  [...]

  for(z=0;z<=LS_BATCHVERTS;z++)
  {
    for(x=0;x<=LS_BATCHVERTS;x++)
    {
      fp[0] = px+x*sx;
      fp[1] = (v->HD+br->Data->Base)*ScaleH;
      fp[2] = pz+z*sz;
      ((sU32 *) fp)[3] = col;
      fp+=4;
      v++;
    }
  }
I expected this to perform well, but it didn't. When I looked at the assembly code, I found that the compiler (VC++) had decided to reschedule the writes: it wrote the color before the z component to save a cycle somewhere.

Fortunately, declaring the write pointer as volatile solved the problem. This tells the compiler that every read or write through the pointer must occur exactly as specified, in order with respect to other volatile accesses. It does not mean that the pointer variable itself is excluded from optimization; things like fp+=4; work as before.

Code:

  sInt x,z;
  sU32 col;
  volatile sF32 *fp;
  BlaVertex *v;
  
  [...]

  for(z=0;z<=LS_BATCHVERTS;z++)
  {
    for(x=0;x<=LS_BATCHVERTS;x++)
    {
      fp[0] = px+x*sx;
      fp[1] = (v->HD+br->Data->Base)*ScaleH;
      fp[2] = pz+z*sz;
      ((volatile sU32 *) fp)[3] = col;
      fp+=4;
      v++;
    }
  }
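
For anyone who wants to try the idea outside the original engine, here is a minimal, self-contained sketch in plain C. It is not the article's code: the Word union, WORDS_PER_VERTEX and fill_grid are hypothetical names made up for illustration, the height is passed in as a constant instead of being read from v->HD, and an ordinary array stands in for the locked vertex buffer. The union is used so that the color word can be stored without casting a float pointer. Note that volatile only constrains what the compiler emits; it makes no promise about what the hardware does afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit slot: either a float coordinate or a packed color. */
typedef union {
    float    f;
    uint32_t u;
} Word;

enum { WORDS_PER_VERTEX = 4 };  /* x, y, z, color: 16 bytes per vertex */

/*
 * Fills a (side+1) x (side+1) grid of vertices, writing each vertex as four
 * consecutive 32-bit words. Because the destination is accessed through a
 * pointer to volatile, the compiler must emit the four stores in source
 * order. The pointer variable itself (wp) is not volatile, so the increment
 * can still be optimized as usual.
 * Returns the number of vertices written.
 */
static size_t fill_grid(volatile Word *dst, int side,
                        float px, float pz, float sx, float sz,
                        float height, uint32_t col)
{
    assert(dst != NULL);
    volatile Word *wp = dst;

    for (int z = 0; z <= side; z++) {
        for (int x = 0; x <= side; x++) {
            wp[0].f = px + (float)x * sx;  /* x position */
            wp[1].f = height;              /* y position (assumed constant here) */
            wp[2].f = pz + (float)z * sz;  /* z position */
            wp[3].u = col;                 /* packed color, written last */
            wp += WORDS_PER_VERTEX;
        }
    }
    return (size_t)(wp - dst) / WORDS_PER_VERTEX;
}
```

Calling fill_grid(buf, 1, 0.0f, 0.0f, 1.0f, 2.0f, 5.0f, color) on a 16-word array writes four vertices front to back. Whether this actually helps on a given compiler and driver is something you would have to confirm by looking at the generated assembly, as the article did.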
va!n aka Thorsten

Intel i7-980X Extreme Edition, 12 GB DDR3, Radeon 5870 2GB, Windows7 x64,
jack
Addict
Posts: 1358
Joined: Fri Apr 25, 2003 11:10 pm

Post by jack »

thanks for sharing :)
KarLKoX
Enthusiast
Posts: 681
Joined: Mon Oct 06, 2003 7:13 pm
Location: France

Post by KarLKoX »

There is a similar trick inside the source code of MPlayer (the *nix multimedia player), which uses AGP memory for a very fast memcpy (through prefetching).
Very interesting indeed :)
"Qui baise trop bouffe un poil." P. Desproges

http://karlkox.blogspot.com/