AGP Writes and Compiler Optimizations
Whenever you write into a vertex or index buffer, it is very likely that you are directly accessing AGP memory. You probably know that you should write in sequential order. This is truly important: even exchanging two DWORDs can halve your performance.
I found that out the hard way when I wrote an inner loop like this:
sInt x,z;
sU32 col;
sF32 *fp;
BlaVertex *v;
[...]
for(z=0;z<=LS_BATCHVERTS;z++)
{
  for(x=0;x<=LS_BATCHVERTS;x++)
  {
    fp[0] = px+x*sx;
    fp[1] = (v->HD+br->Data->Base)*ScaleH;
    fp[2] = pz+z*sz;
    ((sU32 *) fp)[3] = col;
    fp+=4;
    v++;
  }
}
The stores look sequential in the source, but the optimizer is free to reorder ordinary stores, so they need not reach memory in that order. Fortunately, declaring the write pointer as volatile solved the problem. This tells the compiler that every read or write through the pointer must occur exactly as specified, in order with respect to other volatile accesses. It does not exclude the pointer variable itself from optimization; things like fp+=4; work as before.
sInt x,z;
sU32 col;
volatile sF32 *fp;
BlaVertex *v;
[...]
for(z=0;z<=LS_BATCHVERTS;z++)
{
  for(x=0;x<=LS_BATCHVERTS;x++)
  {
    fp[0] = px+x*sx;
    fp[1] = (v->HD+br->Data->Base)*ScaleH;
    fp[2] = pz+z*sz;
    ((volatile sU32 *) fp)[3] = col;
    fp+=4;
    v++;
  }
}
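For anyone who wants to try the idea outside my engine, here is a minimal self-contained sketch of the same pattern. It is not the original code: FillGrid, FloatBits, kDwordsPerVertex and the height array are made-up names, and a plain array stands in for the locked vertex buffer. The sketch writes every DWORD through a volatile sU32-style pointer and bit-copies the floats, which avoids the pointer cast in the loop above. The two static_asserts just spell out that volatile applies to the memory pointed to, not to the pointer variable. Keep in mind that volatile only constrains the compiler; what the CPU does with the stores afterwards is a separate matter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Assumed vertex layout: x, y, z as floats plus one packed 32-bit colour,
// i.e. four DWORDs per vertex, like the loop above.
constexpr std::size_t kDwordsPerVertex = 4;

// volatile qualifies the pointee, not the pointer variable itself,
// so the compiler may still keep the pointer in a register.
static_assert(std::is_volatile<std::remove_pointer<volatile float *>::type>::value,
              "the pointed-to float is volatile");
static_assert(!std::is_volatile<volatile float *>::value,
              "the pointer variable is not volatile");

// Copy the bit pattern of a float into a DWORD (hypothetical helper).
inline std::uint32_t FloatBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Fill count*count vertices strictly front to back.
// dst stands in for the pointer returned by locking a vertex buffer.
void FillGrid(volatile std::uint32_t *dst, int count,
              float px, float pz, float sx, float sz,
              const float *height, std::uint32_t col)
{
    assert(dst != nullptr && height != nullptr);
    volatile std::uint32_t *fp = dst;
    for (int z = 0; z < count; z++)
    {
        for (int x = 0; x < count; x++)
        {
            // Four volatile stores in ascending address order; the compiler
            // must emit them in exactly this order.
            fp[0] = FloatBits(px + x * sx);
            fp[1] = FloatBits(height[z * count + x]);
            fp[2] = FloatBits(pz + z * sz);
            fp[3] = col;
            fp += kDwordsPerVertex;   // ordinary pointer arithmetic, still optimizable
        }
    }
}
```

With a real buffer you would pass the locked pointer as dst and never read back from it; reading is only done here to check the result against a normal array.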

