FileBufferSize speed test results!


Post by Rescator »

Ref: http://www.purebasic.fr/english/viewtopic.php?t=37074

The test code is near the end of this post, and some comments on the test follow the results.
Buffer sizes are in bytes, in power-of-two steps from 4KB to 1MB.
Timings are in milliseconds, so 19086 = 19.086 seconds.
FileBuffersSize() results:
Buffer: 4096
ReadByte: 19086
ReadChar: 10309
ReadWord: 10034
ReadLong: 5405
ReadQuad: 3311
ReadFloat: 5602
ReadDouble: 3392
ReadData: 484

Buffer: 8192
ReadByte: 18874
ReadChar: 10068
ReadWord: 9862
ReadLong: 5213
ReadQuad: 3109
ReadFloat: 5385
ReadDouble: 3240
ReadData: 397

Buffer: 16384
ReadByte: 18716
ReadChar: 9983
ReadWord: 9730
ReadLong: 5094
ReadQuad: 3009
ReadFloat: 5267
ReadDouble: 3108
ReadData: 328

Buffer: 32768
ReadByte: 18698
ReadChar: 9959
ReadWord: 9729
ReadLong: 5087
ReadQuad: 3000
ReadFloat: 5273
ReadDouble: 3111
ReadData: 404

Buffer: 65536
ReadByte: 18677
ReadChar: 9945
ReadWord: 9740
ReadLong: 5071
ReadQuad: 2987
ReadFloat: 5249
ReadDouble: 3095
ReadData: 383

Buffer: 131072
ReadByte: 18674
ReadChar: 9969
ReadWord: 9697
ReadLong: 5058
ReadQuad: 2980
ReadFloat: 5243
ReadDouble: 3084
ReadData: 367

Buffer: 262144
ReadByte: 18664
ReadChar: 9955
ReadWord: 9725
ReadLong: 5066
ReadQuad: 2980
ReadFloat: 5247
ReadDouble: 3081
ReadData: 400

Buffer: 524288
ReadByte: 18843
ReadChar: 9996
ReadWord: 9779
ReadLong: 5132
ReadQuad: 3037
ReadFloat: 5306
ReadDouble: 3125
ReadData: 443

Buffer: 1048576
ReadByte: 18984
ReadChar: 10143
ReadWord: 9921
ReadLong: 5208
ReadQuad: 3128
ReadFloat: 5396
ReadDouble: 3226
ReadData: 576

In the test I set the core affinity (in Task Manager, right-click a process) to the second core of my 3-core CPU, with cores 0 and 2 unchecked.

ReadData size was the same as the filebuffer size in the tests.

As you can see, nothing beats ReadData(), so the conclusion is: if you know what to read and how much of it you need, use ReadData().
It is advised that the ReadData() chunk size and the file buffer size match.
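
A minimal sketch of that advice, assuming a hypothetical helper named ReadAllMatched() and a placeholder file path (neither is part of the test code in this post). It sets the PB file buffer to the same size as the ReadData() chunk and reads the file to the end:

```purebasic
EnableExplicit

; Hypothetical helper (not from the original test code): reads a whole file
; with ReadData() in chunks of chunksize bytes, using a PB file buffer of
; the same size. Returns the number of bytes read, or -1 on failure.
Procedure.q ReadAllMatched(file$, chunksize.i)
  Protected file.i, *mem, total.q, got.i
  *mem = AllocateMemory(chunksize)
  If *mem = 0
    ProcedureReturn -1
  EndIf
  file = ReadFile(#PB_Any, file$)
  If file = 0
    FreeMemory(*mem)
    ProcedureReturn -1
  EndIf
  FileBuffersSize(file, chunksize) ; file buffer size matches the chunk size
  While Eof(file) = #False
    got = ReadData(file, *mem, chunksize) ; number of bytes actually read
    If got <= 0
      Break ; stop on a read error instead of looping forever
    EndIf
    total + got
    ; ... process 'got' bytes at *mem here ...
  Wend
  CloseFile(file)
  FreeMemory(*mem)
  ProcedureReturn total
EndProcedure

Define file$ = "testfile.bin" ; placeholder path, replace with a real file
MessageRequester("ReadAllMatched()", Str(ReadAllMatched(file$, 65536)) + " bytes read")
```

The last chunk of a file is usually shorter than chunksize, which is why the sketch counts the value returned by ReadData() rather than assuming a full chunk each time.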

In general, the larger the file buffer, the faster the reads, but only up to a point. On my system the gains level off around 128KB to 256KB, and the timings get slightly worse again at 512KB and 1MB. The limit could be the OS filesystem or the hard drive's buffer.

32KB, 64KB and 128KB seem to be among the better general choices.
Certain data might benefit from other buffer sizes, especially with ReadData(), for example when streaming audio, video or other data.

My advice is to sync up the sizes to create a double, triple or cascading buffering effect; that way you avoid overbuffering or underbuffering, which might reduce performance.

Another thing to keep in mind is that the larger the buffer, the fewer disk reads are needed, which means less filesystem access and thus less CPU use.
If a lot of disk activity or simultaneous processing is expected, larger buffers are advised.

A large test file is advised.
I used dxsdk_march2008.exe (the DirectX March 2008 SDK installer), which is 452751KB.

Feel free to post your own speed tests in this thread;
maybe we can find a good "default" buffer size that works well across many systems. The PureBasic default is currently 4096 bytes (4KB).

Code: Select all

DisableDebugger
EnableExplicit

Procedure.s test(bufsize.i,file$)
 #File1=1
 Protected start.l,stop.l,text$,*mem
 text$="Buffer: "+Str(bufsize)+#CRLF$

 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadByte(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadByte: "+Str(stop-start)+#CRLF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadCharacter(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadChar: "+Str(stop-start)+#CRLF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadWord(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadWord: "+Str(stop-start)+#CRLF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadLong(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadLong: "+Str(stop-start)+#CRLF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadQuad(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadQuad: "+Str(stop-start)+#CRLF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadFloat(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadFloat: "+Str(stop-start)+#CRLF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadDouble(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadDouble: "+Str(stop-start)+#CRLF$
 
 *mem=AllocateMemory(bufsize)
 If *mem
  If ReadFile(#File1,file$)
   FileBuffersSize(#File1,bufsize)
   start=timeGetTime_()
   While Eof(#File1)=#False
    ReadData(#File1,*mem,bufsize)
   Wend
   stop=timeGetTime_()
   CloseFile(#File1)
  EndIf
  FreeMemory(*mem)
 EndIf
 text$+"ReadData: "+Str(stop-start)+#CRLF$
 text$+#CRLF$
 ProcedureReturn text$
EndProcedure

timeBeginPeriod_(1)

Define file$,text$
file$="E:\Downloads\dxsdk_march2008.exe"

text$="FileBuffersSize() results:"+#CRLF$

text$+test(4096,file$)
text$+test(8192,file$)
text$+test(16384,file$)
text$+test(32768,file$)
text$+test(65536,file$)
text$+test(131072,file$)
text$+test(262144,file$)
text$+test(524288,file$)
text$+test(1048576,file$)

MessageRequester("test()",text$)

timeEndPeriod_(1)

Post by Rescator »

And just to show that no filebuffer is worse than the default:

Test file: Application Compatibility Toolkit.msi (10958KB)
FileBuffersSize() results:
Buffer: 0
ReadByte: 51657
ReadChar: 25951
ReadWord: 25959
ReadLong: 12975
ReadQuad: 6533
ReadFloat: 13277
ReadDouble: 6517

Ugh! Even Vista SP1's filesystem can't help here. I'm glad I tested on a smaller file.

I am surprised at how linear the difference is, though. An OS filesystem thing, maybe? 2-byte reads are twice as fast as 1-byte reads? Are these truly raw hard drive reads? Fascinating!

Now the ReadData() test.

The same test file as in the first post:
dxsdk_march2008.exe (the DirectX March 2008 SDK installer), which is 452751KB.
FileBuffersSize(,0) results:
Buffer: 0

Chunksize: 4096
ReadData: 764

Chunksize: 8192
ReadData: 492

Chunksize: 16384
ReadData: 352

Chunksize: 32768
ReadData: 346

Chunksize: 65536
ReadData: 316

Chunksize: 131072
ReadData: 302

Chunksize: 262144
ReadData: 304

Chunksize: 524288
ReadData: 318

Chunksize: 1048576
ReadData: 345

What is interesting here is that with file buffers set to 0 (disabled), there seems to be a penalty for small ReadData() chunk sizes.
Once you get up to 16KB chunks, the lack of file buffering does not seem to be that bad, and from 32KB up it even looks like a slight speed increase compared with the first post. (The file buffer read-ahead code is avoided, I guess?)

But since this was an almost half-GB file and the timings are so small, the gain is minuscule.

One benefit of disabling file buffers when you read larger chunks with ReadData(), though, is that you use less memory (no file buffer memory is needed).

Again the sweet spot seems to be 32KB, 64KB or 128KB, at least on my system, so it is probably a hard drive buffer/memory thing, or the OS filesystem.

So if others get similar differences, then 32KB, 64KB or 128KB is advised, and if you use ReadData(), the same chunk sizes with file buffers set to 0 might give a tiny performance increase.

To avoid spending too much time on too many tests, I intentionally did not test ReadData() chunk sizes that differ from the file buffer size, such as a 128KB file buffer with 64KB ReadData() chunks.

But if these values are anything to go by, I assume the difference would be no more dramatic than the one between ReadData() with and without file buffers.
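
For anyone who wants to try the untested combinations, here is a rough sketch. The helper name TimeCombo() and the file path are made up for illustration, and it uses ElapsedMilliseconds() instead of timeGetTime_() so that it is not Windows-only, which may cost some timer precision on Windows:

```purebasic
EnableExplicit

; Hypothetical helper: times one sequential pass over file$ with separate
; PB file buffer and ReadData() chunk sizes. Returns milliseconds, or -1
; if the file or the memory could not be obtained.
Procedure.q TimeCombo(file$, bufsize.i, chunksize.i)
  Protected file.i, *mem, start.q, elapsed.q = -1
  *mem = AllocateMemory(chunksize)
  If *mem
    file = ReadFile(#PB_Any, file$)
    If file
      FileBuffersSize(file, bufsize) ; 0 disables PB's own file buffering
      start = ElapsedMilliseconds()
      While Eof(file) = #False
        If ReadData(file, *mem, chunksize) <= 0
          Break ; stop on a read error
        EndIf
      Wend
      elapsed = ElapsedMilliseconds() - start
      CloseFile(file)
    EndIf
    FreeMemory(*mem)
  EndIf
  ProcedureReturn elapsed
EndProcedure

Define file$ = "testfile.bin" ; placeholder path, use a large file
Define text$
text$ + "128KB buffer, 64KB chunks: " + Str(TimeCombo(file$, 131072, 65536)) + #LF$
text$ + "64KB buffer, 128KB chunks: " + Str(TimeCombo(file$, 65536, 131072)) + #LF$
text$ + "No buffer, 64KB chunks: " + Str(TimeCombo(file$, 0, 65536)) + #LF$
MessageRequester("TimeCombo()", text$)
```

As with the other tests, compile it without the debugger. Keep in mind that the first pass may be slower than later ones if the OS caches the file.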

Code: Select all

DisableDebugger
EnableExplicit

Procedure.s test1(file$)
 #File1=1
 Protected start.l,stop.l,text$,*mem,bufsize.i
 bufsize=0
 text$="Buffer: "+Str(bufsize)+#LF$

 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadByte(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadByte: "+Str(stop-start)+#LF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadCharacter(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadChar: "+Str(stop-start)+#LF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadWord(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadWord: "+Str(stop-start)+#LF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadLong(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadLong: "+Str(stop-start)+#LF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadQuad(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadQuad: "+Str(stop-start)+#LF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadFloat(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadFloat: "+Str(stop-start)+#LF$
 
 If ReadFile(#File1,file$)
  FileBuffersSize(#File1,bufsize)
  start=timeGetTime_()
  While Eof(#File1)=#False
   ReadDouble(#File1)
  Wend
  stop=timeGetTime_()
  CloseFile(#File1)
 EndIf
 text$+"ReadDouble: "+Str(stop-start)+#LF$
 text$+#LF$
 ProcedureReturn text$
EndProcedure

Procedure.s test2(bufsizea.i,file$)
 #File1=1
 Protected start.l,stop.l,text$,*mem,bufsize.i
 bufsize=0
 text$="Chunksize: "+Str(bufsizea)+#LF$

 *mem=AllocateMemory(bufsizea)
 If *mem
  If ReadFile(#File1,file$)
   FileBuffersSize(#File1,bufsize)
   start=timeGetTime_()
   While Eof(#File1)=#False
    ReadData(#File1,*mem,bufsizea)
   Wend
   stop=timeGetTime_()
   CloseFile(#File1)
  EndIf
  FreeMemory(*mem)
 EndIf
 text$+"ReadData: "+Str(stop-start)+#LF$
 text$+#LF$
 ProcedureReturn text$
EndProcedure

timeBeginPeriod_(1)

Define file$,text$

text$="FileBuffersSize() results:"+#LF$

MessageRequester("test()","start")

file$="E:\Downloads\Application Compatibility Toolkit.msi"
text$+test1(file$)
file$="E:\Downloads\dxsdk_march2008.exe"
text$+test2(4096,file$)
text$+test2(8192,file$)
text$+test2(16384,file$)
text$+test2(32768,file$)
text$+test2(65536,file$)
text$+test2(131072,file$)
text$+test2(262144,file$)
text$+test2(524288,file$)
text$+test2(1048576,file$)

MessageRequester("test()",text$)

timeEndPeriod_(1)

Post by pdwyer »

Remember that you are fiddling with PB's file buffers here and not the OS through some API behind the scenes. After this the OS does its own fun and games with buffering, so it's going to be difficult to measure exactly.
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein

Post by Rescator »

That's why I want others to run these tests too.
XP, Vista, Win7, Mac, Linux. Combined with the various HDs people have, we should get a "feel" for where a good all-round value would be.

I.e.: which PureBasic file buffer size gives good all-round performance across systems, OSes and HDs?

If I wasn't lazy right now, I would have made a test app that people could compile and run (with the debugger off, obviously) and post the results here, and then I would compile the data into a mergeable table to get a final result.

Post by DoubleDutch »

I think that from what I've seen 16k should be the default.

Post by jamirokwai »

Here are the results for Mac OS X. I had to replace the Windows-only function timeGetTime_() with ElapsedMilliseconds(). Does that influence the results? The ReadData time is somewhat low...

System: Core 2 Duo 2.16GHz, 2GB RAM, GMA950, 120GB FUJITSU MHW2120BH, running Mac OS X 10.5.7 (iTunes was playing in the background).

File: 42MB disk image, Merlin2.dmg, from the local hard drive.


-----

Code: Select all

FileBuffersSize() results:
Buffer: 0
ReadByte: 10640
ReadChar: 10636
ReadWord: 5308
ReadLong: 2745
ReadQuad: 1369
ReadFloat: 2739
ReadDouble: 1381

Chunksize: 4096
ReadData: 71

Chunksize: 8192
ReadData: 68

Chunksize: 16384
ReadData: 65

Chunksize: 32768
ReadData: 64

Chunksize: 65536
ReadData: 65

Chunksize: 131072
ReadData: 65

Chunksize: 262144
ReadData: 66

Chunksize: 524288
ReadData: 68

Chunksize: 1048576
ReadData: 75
Regards,
JamiroKwai

Post by Rescator »

Cool. It's a shame you didn't test a larger file, though; the timings are so small that the differences could be coincidental.
On a modern PC a file of a few hundred MB would be better; a CD image, a really huge zip, a TV series episode or similar is ideal.

The results from test2() are probably the most interesting for seeing OS/HD performance.
If you are interested in the PB buffering and optimal values, then the source in the first post is more useful.

As for replacing timeGetTime_() with ElapsedMilliseconds() on Mac and Linux,
that should be OK, as I seem to recall someone stating that it has millisecond accuracy on those platforms. Only on Windows is it necessary to do what I do in this code.
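
One way to keep a single source for all platforms is a small timer wrapper, sketched below. NowMs() is a hypothetical name and the Delay(50) call only stands in for the code being timed:

```purebasic
EnableExplicit

; Hypothetical wrapper: millisecond timer that uses timeGetTime_() on
; Windows and ElapsedMilliseconds() on the other platforms.
Procedure.q NowMs()
  CompilerIf #PB_Compiler_OS = #PB_OS_Windows
    ProcedureReturn timeGetTime_()
  CompilerElse
    ProcedureReturn ElapsedMilliseconds()
  CompilerEndIf
EndProcedure

CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  timeBeginPeriod_(1) ; request 1 ms timer resolution, as in the test code
CompilerEndIf

Define start.q = NowMs()
Delay(50) ; stand-in for the code being timed
Define elapsed.q = NowMs() - start
MessageRequester("timing", Str(elapsed) + " ms")

CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  timeEndPeriod_(1) ; restore the previous timer resolution
CompilerEndIf
```

The CompilerIf blocks are resolved at compile time, so the Windows API calls are never seen by the Mac or Linux compiler.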

PS! Make sure you choose the non-debug option in the menu when compiling, as the DisableDebugger keyword does not fully disable the debugger.


However, I felt inspired, and if you check this thread: http://www.purebasic.fr/english/viewtopic.php?t=37516
You'll see a proper "Test tool" that produces data sets that can better be used to do some serious speed analysis comparisons.

Post by jamirokwai »

Rescator wrote: Cool. It's a shame you didn't test a larger file though, the results are so small that the difference could be coincidental.

Here are the results for a 1.6GB virtual image file, freshly copied, so it should be without fragments. It seems to me that from a chunk size of 16384 on, the chunk size isn't that important anymore...

Code: Select all

FileBuffersSize() results:
Buffer: 0
ReadByte: 129145
ReadChar: 127812
ReadWord: 71014
ReadLong: 57540
ReadQuad: 57516
ReadFloat: 57828
ReadDouble: 57555

Chunksize: 4096
ReadData: 61068

Chunksize: 8192
ReadData: 69812

Chunksize: 16384
ReadData: 57893

Chunksize: 32768
ReadData: 59925

Chunksize: 65536
ReadData: 57782

Chunksize: 131072
ReadData: 57592

Chunksize: 262144
ReadData: 57626

Chunksize: 524288
ReadData: 57913

Chunksize: 1048576
ReadData: 57413
Regards,
JamiroKwai

Post by Rescator »

Yup! You are correct.

If you look at http://www.purebasic.fr/english/viewtopic.php?t=37518
especially the graphs in later posts, you'll see that although it varies slightly, there seems to be a peak at 16KB or in the tests right after it, but beyond a certain point a larger size may even cause a penalty.

As Fred pointed out in that thread, the tool only does sequential read tests,
I'm fiddling with a random read test now so keep an eye on that thread.

This whole thing reminds me of the Goldilocks story,
where one bed was too hard, one too soft and one just right; one porridge too cold, one too hot and one just right, etc.
It's the same here: too small and you get a performance penalty, too large and you get a penalty as well.

Then again, half the fun is finding out what is "just right", well, on most folks' systems anyway :P There will always be the odd system where nothing makes sense, heh.

Post by pdwyer »

I wonder if the FS makes much of a difference here. On the Windows platform, does everyone use NTFS now, or do people still use FAT32?

Post by Rescator »

Well, if you checked the charts for v2.3 of the test tool and now look at the new ones (read my last post in that thread), you'll see that the filesystem and the cache it uses have a huge impact.