Improved FileBuffersSize features.

Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Improved FileBuffersSize features.

Post by Rescator »

I think something like the following would make sense:

OpenFile(#File, Filename$, FileBufferSize = 4096)

The same for ReadFile() and CreateFile() as well.

Why? Simple, really.
I pretty much always end up calling FileBuffersSize() right after opening a file anyway, as I tend to default to 65536 instead of the default 4096 size.
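
What I end up writing today looks roughly like this (just a sketch; the file number, the file name and the 65536 value are placeholders, assuming the two-argument FileBuffersSize(#File, Size) form):

If ReadFile(0, "example.dat")     ; open an existing file for reading
  FileBuffersSize(0, 65536)       ; extra call just to raise PB's buffer from the 4096 default
  ; ... ReadByte() / ReadString() / ReadData() calls here ...
  CloseFile(0)
EndIf

With the proposed optional parameter, that FileBuffersSize() line would simply fold into the OpenFile()/ReadFile()/CreateFile() call.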

Obviously FileBuffersSize() is still useful; it's just that in 99% of situations I only need to set it when opening/creating a file. I can't recall the last time I needed to change it "while" working with a file, or needed to disable the buffer (by setting the size to 0).

Oh, and adding a GetFileBuffersSize() function probably wouldn't hurt either. :)

And maybe even a DefaultFileBuffersSize() ? ;)

PS! Oh, and I've always wondered... why is it called FileBuffersSize() and not FileBufferSize() :?: or SetFileBufferSize()?
PPS! Oh, and what happens if one changes the buffer size while working with a file? Like reducing it: any danger of losing read data? (The manual does not mention that scenario.)
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

You realise that this buffer is a PB-built buffer and has nothing to do with the OS file buffering?

To be honest, when I was building a benchmark tool I played with this setting a lot and read some NTFS-related docs, and I had a hard time measuring any performance difference from it at all, so I stopped using it. It seems that OS buffering in WinXP and later trumps it. Not sure about Linux, though.

(I'm not against the request, I'm just wondering if you are getting as much benefit as you believe you are from this feature)
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
nco2k
Addict
Posts: 1344
Joined: Mon Sep 15, 2003 5:55 am

Post by nco2k »

If OSVersion() = #PB_OS_Windows_ME : End : EndIf
Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

@pdwyer Before Fred added this, stuff like ReadByte() was horribly slow.

And if you do not use it, PB 4+ defaults to 4096.
I think NTFS uses 4096-byte blocks or similar, though.

How this works is simple: if the buffer is set to 4096, then whether you read 1 byte or 4096 bytes, 4096 bytes will be read. That is a huge benefit if you are using ReadByte(); otherwise the filesystem would have to read an entire block just to get each byte.

It's kinda like a pre-population of a buffer.

I forget whether NTFS uses 512- or 4096-byte blocks, though.
But obviously, reading 4096 bytes once when you need to ReadByte() the first 100 of them, say, is much faster than doing 100 separate reads of 4096-byte filesystem blocks, right?

You may only need the first 100 bytes, but the filesystem still needs to read the entire 4096-byte block those bytes are in, or worse, those 100 bytes may straddle the boundary between two blocks. Obviously it's way faster to read them from PB's memory in that case, right?

If you use 65536 as the buffer and the file is, say, 50K, then once the buffer is filled, ReadByte(), ReadLong(), ReadString() etc. don't touch the disk at all; you're reading from the buffer instead. Minimal overhead.
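
A rough sketch of how one could compare the two sizes, for what it's worth (the file name is a placeholder, and timings will depend heavily on OS caching):

Procedure.l TimeReadBytes(File$, BufferSize.l)
  ; Read the whole file byte by byte with the given PB buffer size, return elapsed milliseconds.
  Protected Start.l, Dummy.b
  If ReadFile(0, File$)
    FileBuffersSize(0, BufferSize)
    Start = ElapsedMilliseconds()
    While Eof(0) = 0
      Dummy = ReadByte(0)
    Wend
    CloseFile(0)
    ProcedureReturn ElapsedMilliseconds() - Start
  EndIf
  ProcedureReturn -1
EndProcedure

Debug "4096 byte buffer:  " + Str(TimeReadBytes("test.dat", 4096)) + " ms"
Debug "65536 byte buffer: " + Str(TimeReadBytes("test.dat", 65536)) + " ms"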

The reason you can set the buffer to 0, though, is for those who use their own file caching.

I'm not sure why I tend to default to 64K, but I think (again) it was due to some NTFS block size stuff or filesystem buffers.

If you use ReadData() and read in chunks of a few MB, you'll most likely not see much of a speed change normally, although if you're streaming data it might be wise to set the file buffer to match the stream buffer; that way PB will already have the next batch of data that needs to be streamed in memory.
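
Roughly like this (the chunk size and file name are just examples):

#ChunkSize = 65536                   ; example stream chunk size

If ReadFile(1, "stream.dat")
  FileBuffersSize(1, #ChunkSize)     ; match PB's file buffer to the chunk size
  *Chunk = AllocateMemory(#ChunkSize)
  While Eof(1) = 0
    BytesRead = ReadData(1, *Chunk, #ChunkSize)  ; returns how many bytes were actually read
    ; ... hand *Chunk / BytesRead to the streaming routine ...
  Wend
  FreeMemory(*Chunk)
  CloseFile(1)
EndIf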


The only way to speed it up further is to implement your own double buffering and flip pointers in your streaming routines. In that case, setting the file buffer to 0 might be wise to reduce overhead.
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

To be honest, I've never used ReadByte(); I always use ReadData() with a block for a structure, or the whole file, or whatever, so perhaps I haven't seen the benefits you describe.
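
For example, something like this (the structure and file name are made up):

Structure FileHeader
  Magic.l
  Version.l
  RecordCount.l
EndStructure

Define Header.FileHeader

If ReadFile(2, "records.dat")
  ReadData(2, @Header, SizeOf(FileHeader))  ; read the whole header block in one call
  Debug "Records: " + Str(Header\RecordCount)
  CloseFile(2)
EndIf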

I did spend some time with this on benchmarking and flushing buffers, though, and I suspect that you are not getting quite as much of a benefit as you may think. Not because PB's feature doesn't work, but because the OSes have started doing this to a certain degree too, and the gap has closed somewhat.

I would be interested to see a code sample where a measurable difference between 4096 and 65536 could be detected. (I mean that sincerely, not sarcastically)
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »
