By using Delay() you are effectively suspending your program and freeing the CPU for other tasks (check CPU usage; it probably drops from 100% to near 0%). Windows only wakes a sleeping program on a scheduler timer tick, and those ticks are typically 10..20 ms apart (about 15.6 ms, i.e. 64 ticks per second, by default), so once you use Delay() you get a much lower figure.

The numbers you are quoting confuse program "loop" rate with screen refresh rate. FPS itself can be confusing depending on what you are measuring (e.g. updating physics, game AI, or the actual display redraw). Loop time includes the display refresh time AND any inherent delay waiting to refresh the display (on calling FlipBuffers()). It gets even more confusing when you introduce double/triple buffering, but it is possibly much simpler if you have activated vsync.

Any delay on calling FlipBuffers() represents time unavailable to your other code, unless you push the FlipBuffers() call as near to a vsync as possible. The holy grail seems to be "phase locking" the refresh of the display with vsync, but with no vsync set in the driver. I've had limited success doing this, specifically because of Delay().
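To put rough numbers on the Delay() effect, here is a small model in Python (not PureBasic, just arithmetic). It assumes the default Windows timer tick of 1/64 s (15.625 ms) and that a sleeping loop only wakes on a tick. The function name `loop_rate_with_delay`, the 1 ms delay, and the per-loop work time are all hypothetical, picked for illustration; the real scheduler is less tidy than this.

```python
import math

# Assumption: default Windows timer tick of 64 ticks per second (15.625 ms).
TICK_MS = 1000.0 / 64.0


def loop_rate_with_delay(work_ms, delay_ms, tick_ms=TICK_MS):
    """Estimate loops per second for a loop that does work_ms of work and
    then sleeps for delay_ms, where wake-ups only happen on timer ticks."""
    if delay_ms <= 0:
        # No sleep: the loop runs as fast as the work allows.
        return 1000.0 / work_ms
    # The loop wakes on the first tick at or after (work + requested delay).
    period_ms = math.ceil((work_ms + delay_ms) / tick_ms) * tick_ms
    return 1000.0 / period_ms


# Hypothetical per-loop work time, taken from a 2431 loops/s free-running rate.
work_ms = 1000.0 / 2431

print(loop_rate_with_delay(work_ms, 1))   # short delay, rounded up to one tick
print(loop_rate_with_delay(work_ms, 0))   # no delay at all
```

Under these assumptions any delay shorter than one tick gives the same loop rate of 64 per second, which would explain why a tiny Delay() costs so much more than the value you pass in.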
64 FPS with the delay and 2431 FPS without the delay. Doesn't seem weird to me, like Psychophanta said.
Got it up to 4495 FPS without ClearScreen(255,255,255), hehe.
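Those FPS figures are easier to compare as time per loop. A quick sketch in Python (plain arithmetic, not PureBasic); the helper name `frame_time_ms` is made up for illustration, and it assumes the only difference between the two runs is the ClearScreen() call.

```python
def frame_time_ms(fps):
    """Convert a loop rate in frames per second to milliseconds per loop."""
    return 1000.0 / fps


with_clear = frame_time_ms(2431)     # loop including ClearScreen()
without_clear = frame_time_ms(4495)  # loop without ClearScreen()

# Assumption: the whole difference is the cost of ClearScreen() itself.
clear_cost = with_clear - without_clear

print(f"with ClearScreen:    {with_clear:.3f} ms per loop")
print(f"without ClearScreen: {without_clear:.3f} ms per loop")
print(f"ClearScreen cost:    {clear_cost:.3f} ms per loop")
```

So the jump from 2431 to 4495 FPS is only about 0.19 ms saved per loop, tiny next to the roughly 15.6 ms per loop that 64 FPS works out to.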