
A few questions regarding Files and Linked Lists

Posted: Thu Nov 07, 2019 12:45 am
by Pud
Greetings all.

"How often have I said that when you have excluded the impossible whatever remains,
however improbable, must be the truth." said Sherlock Homes

I'm chasing a destructive and expensive bug on a Windows system. Simple ASCII strings
separated by CRLF are stored on a network share, loaded into a linked list when needed,
and saved back again when required. I always use EnableExplicit.

I need to confirm the following, as stupid as it might seem…

1. A linked list (of simple strings) defined inside a procedure: when the procedure
finishes, is the list freed with no further action?

2. Does CloseFile() also imply FlushFileBuffers()?

3. A file created with CreateFile() or OpenFile() using default settings (except #PB_Ascii)
cannot be opened again by anyone else on the network?

4. If I delete file "A" on a network share and immediately delete file "B", is this always
the order in which the OS actually removes them?

Anyone aware of any "gotchas" with linked lists and network sequential file access?

Cheers.

Re: A few questions regarding Files and Linked Lists

Posted: Thu Nov 07, 2019 1:35 am
by skywalk
Can you elaborate on your bug exactly?
Are you returning text stripped of its line endings?
Are you getting double line endings?

1. The List() is effectively gone, as long as you don't define Global NewList List() within your Procedure or pass it in by reference in the parameter list (see the sketch after this list).
2. CloseFile() says "Once the file is closed, it may not be used anymore. Closing a file ensures the buffer will effectively be put to the disk."
FlushFileBuffers() is a separate command.
3. I would test this case since network locking of files is complex.
Best to create guaranteed unique filenames to avoid conflict.
4. File deletion is also complex. Best not to be dependent on deletion order.
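
For point 1, a minimal sketch (the procedure and list names are illustrative):

EnableExplicit

Procedure LocalListDemo()
  ; Declared inside the procedure, so the list is local; PureBasic frees
  ; it automatically when the procedure returns, no FreeList() needed.
  NewList temp$()
  AddElement(temp$())
  temp$() = "hello"
EndProcedure

LocalListDemo()
; temp$() is out of scope here and its memory has been released.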

Re: A few questions regarding Files and Linked Lists

Posted: Thu Nov 07, 2019 9:45 am
by captain_skank
Maybe a stupid question, but are you using a Windows file server or a Linux (native or via Samba) file server?

I've had problems in the past with Samba and random file anomalies (in both Samba 3 and 4).

Re: A few questions regarding Files and Linked Lists

Posted: Thu Nov 07, 2019 11:05 am
by Pud
captain_skank wrote:Maybe a stupid question, but are you using a Windows file server or a Linux (native or via Samba) file server?

I've had problems in the past with Samba and random file anomalies (in both Samba 3 and 4).

Yes indeed. It runs from a NAS file server. However, I migrated everything to a Windows server and had the same issues, so, for various reasons, I migrated back again.

I don't think it's a buffering error, as the corruption is in neat records. What I mean is: imagine a list of around 3000 text records, all about 500 bytes long and nicely sorted. Twenty users are hammering away at it, and after a few days half a dozen or so consecutive records will suddenly duplicate themselves somewhere further down the list, neatly overwriting whatever was there!

My write to disc is centralised in just one smallish procedure with a simple locking mechanism that (seems) to work faultlessly. Even if that failed, concurrent writes are protected by the OS's "one file open at a time" rule, aren't they?
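
For illustration, a typical lock-file mechanism of that sort might look like the sketch below (not the actual code in use, and the names are made up; it leans on PureBasic opening files exclusively, which, per skywalk's point 3, is exactly the behaviour to verify over a network share):

EnableExplicit

; Hypothetical lock-file sketch. CreateFile() should fail while another
; client still holds the lock file open; that assumption is the part
; worth testing over SMB.
Procedure AcquireLock(lockname$)
  Protected h, tries
  For tries = 1 To 100
    h = CreateFile(#PB_Any, lockname$)
    If h
      ProcedureReturn h          ; keeping the handle open = holding the lock
    EndIf
    Delay(100)                   ; someone else has it; wait and retry
  Next
  ProcedureReturn 0              ; gave up after ~10 seconds
EndProcedure

Procedure ReleaseLock(h, lockname$)
  CloseFile(h)
  DeleteFile(lockname$)
EndProcedure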

Thanks for the response, and for skywalk's too.

Re: A few questions regarding Files and Linked Lists

Posted: Thu Nov 07, 2019 12:01 pm
by captain_skank
I seem to remember something about disabling SMBv3 or SMBv1 on the client, but I can't exactly recall where I saw that.

I had similar problems with random file locking, files becoming corrupted, etc., but that only happened when users were copying/saving files from SolidWorks to the server.

In the end it was such a ball ache that I set them up on their own Windows server and the problem never recurred.

I've been using multiple Samba servers for over 18 years now and it's the only problem I've ever had.

It may be worth checking which version of Samba your NAS is running.

Re: A few questions regarding Files and Linked Lists

Posted: Tue Dec 10, 2019 3:59 pm
by Pud
OK. One month on and my problem persists. Extensive testing and error trapping (no errors) seems to point to the procedure below.
The list is just simple ASCII text, with each line 1k to 5k long. To exclude networking errors I first copy the file to a unique local
temporary file and work on it there.

What happens every now and again is that a few, sometimes a dozen or so, consecutive records duplicate themselves and overwrite other
records halfway through the file. This happens at random and does not seem connected to the records being worked on.
There is sometimes minor data corruption, but mostly it's whole 'neat' records.

This can happen on any one of 20+ PCs. All are running Avast, but with all relevant folders excluded.
To do this deliberately, if the list were the culprit, I would have to move the current list element to somewhere random for a few records
and then move it back again to continue.
If the file writing is the culprit, then something is manipulating the file pointer with a similar result.

I've even tried making a small console application to handle just the few file access routines I need, so as to exclude problems I might
have in the application.
My gut feeling is a buffering issue, as there doesn't seem to be anything else left! I've tried turning off PB's buffering, but saving times
go from a couple of seconds to thirty or so, which is no good.

A minor reward is offered for information leading to the catching of this bug. (:


Procedure SaveList(filename$, List lsave$(), type = #PB_Ascii)
  Protected h

  h = CreateFile(#PB_Any, filename$, type)
  If h
    ForEach lsave$()
      If Trim(lsave$()) <> ""          ; skip blank records
        WriteStringN(h, lsave$())
      EndIf
    Next
    CloseFile(h)
  Else
    errorlog("SaveList(). Cannot createfile " + GetFilePart(filename$))
  EndIf
EndProcedure
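
A more defensive variant worth trying, sketched here on the assumption that a write-to-temp-then-swap is acceptable; a torn or interleaved write then never lands in the live file (SaveListSafe and the ".tmp" suffix are illustrative; errorlog() is the same helper as above):

; Sketch of a safer SaveList(): write to a temporary name, close cleanly,
; then swap it over the live file.
Procedure SaveListSafe(filename$, List lsave$(), type = #PB_Ascii)
  Protected h, tmp$ = filename$ + ".tmp"

  h = CreateFile(#PB_Any, tmp$, type)
  If h
    ForEach lsave$()
      If Trim(lsave$()) <> ""
        WriteStringN(h, lsave$())
      EndIf
    Next
    CloseFile(h)                 ; flushes the buffer to disk
    DeleteFile(filename$)        ; remove the old file first: RenameFile() may fail
                                 ; if the target already exists
    If RenameFile(tmp$, filename$) = 0
      errorlog("SaveListSafe(). Cannot rename " + GetFilePart(tmp$))
    EndIf
  Else
    errorlog("SaveListSafe(). Cannot createfile " + GetFilePart(tmp$))
  EndIf
EndProcedure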

Re: A few questions regarding Files and Linked Lists

Posted: Wed Dec 11, 2019 1:42 am
by normeus
Since you have the list in memory, you should be able to read the file you created back into a list and then compare.
It doubles the time, but you'll be sure it was written correctly.

Also, use:

Code:

FlushFileBuffers(h)  ; force the write to disk before closing the file (CloseFile() does the same, but it couldn't hurt)
CloseFile(h)
RenameFile(filename$, filename$)  ; if the file is usable you can rename it to itself, to make sure writing is done
; here, open the text file using OpenFile() and compare it to the list in memory
I am assuming that each of the 20 computers you mentioned has its own file; otherwise you have one file open and 19 other computers waiting for it to be done.
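
A minimal sketch of that read-back comparison (the name is illustrative; it assumes the file was written by SaveList() above, which skips blank records):

; Sketch: re-read the file just written and compare it line by line
; against the in-memory list. Returns #True only if everything matches.
Procedure VerifySavedList(filename$, List lsave$(), type = #PB_Ascii)
  Protected h, ok = #True

  h = ReadFile(#PB_Any, filename$, type)
  If h = 0
    ProcedureReturn #False
  EndIf
  ForEach lsave$()
    If Trim(lsave$()) <> ""                  ; blank records were skipped on write
      If Eof(h) Or ReadString(h) <> lsave$()
        ok = #False                          ; short file or mismatching line
        Break
      EndIf
    EndIf
  Next
  If Eof(h) = 0                              ; file has extra lines left over
    ok = #False
  EndIf
  CloseFile(h)
  ProcedureReturn ok
EndProcedure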


Norm.

Re: A few questions regarding Files and Linked Lists

Posted: Thu Dec 12, 2019 12:54 am
by Pud
Thanks for the reply.

Yes, I have created a 'test for dups' procedure and am currently inserting it around various bits of code to see if
I can determine the exact point of corruption. I've tried FlushFileBuffers() and have also tried setting the size of the file buffer
to something longer than the file, so there are no buffer-boundary updates.
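
(For reference, that buffer change is a one-liner; FileBuffersSize() is the relevant command, and the size here is illustrative:)

FileBuffersSize(h, 8 * 1024 * 1024)  ; inside the save procedure, right after CreateFile()
                                     ; succeeds: a buffer larger than the whole file means
                                     ; no flush at a buffer boundary mid-write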

Renaming a file to itself is an interesting one: a nice, simple fix to try. I sincerely hope it wins you the reward (:

Yes, the main database file is copied to a temp file on the local hard drive, worked on there, and then copied back.

The problem is that the users are hammering away at this all day, and I can go several days without an issue.
Today the corruption happened twice! ARRRRRRRRRRRR.

11/12/2019 @ 11:37:33
-
Checking Jobsheets0000.md Start=57000 End=59532
-
Checking Invoices0000.md Start=66000 End=67875
-
Checking CreditNotes0000.md Start=4000 End=4079
-
Checking Purchases0000.md Start=87000 End=91179
-
Checking CollectionNotes0000.md Start=51000 End=55032
54664 is duplicated
54665 is duplicated
54666 is duplicated
54667 is duplicated
54668 is duplicated
54669 is duplicated
54670 is duplicated
54671 is duplicated
54672 is duplicated
54673 is duplicated
51375 '{ }' mismatch
51376 is missing
51377 is missing
51378 is missing
51379 is missing
51380 is missing
51381 is missing
51382 is missing
51383 is missing
51384 is missing
51385 is missing
-
Checking Production0000.md Start=1000 End=3023
-

Re: A few questions regarding Files and Linked Lists

Posted: Thu Dec 12, 2019 6:28 pm
by kpeters58
Hmm - that sounds very much like an issue I had for the longest time years ago (a number of radiology clinics with up to 10 PCs each writing image metadata to a central location).
It would go well for up to a week or longer, but eventually "wondrous" things would happen: Duplicate lines, missing lines, garbage lines - some sort of file corruption.

In the end, I did give up and rewrote it to use a database server (in their case I was able to use SQLite with a retry loop around the writes to handle SQLite's locking) - it works flawlessly to this day (10+ years). The clinics each only have an average of < 200 patients a day, so the locking conflicts almost never arise and if they do, my retry loop handles them easily. I use SQLite's in-memory tables locally, so I get all the integrity checking I put in the DDL for free (such as unique constraints etc.)
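
In PureBasic terms that retry loop really is tiny; a sketch, with a placeholder database file, table, and timings:

EnableExplicit
UseSQLiteDatabase()

; Sketch: retry a write until SQLite's transient "database is locked"
; error clears. "central.db" and the log table are illustrative.
Procedure RetryUpdate(db, sql$)
  Protected tries
  For tries = 1 To 50
    If DatabaseUpdate(db, sql$)
      ProcedureReturn #True
    EndIf
    Delay(50)                    ; back off briefly, then try again
  Next
  ProcedureReturn #False         ; still failing after ~2.5 seconds
EndProcedure

Define db = OpenDatabase(#PB_Any, "central.db", "", "")
If db
  RetryUpdate(db, "CREATE TABLE IF NOT EXISTS log(line TEXT)")
  If RetryUpdate(db, "INSERT INTO log(line) VALUES('hello')") = 0
    Debug DatabaseError()
  EndIf
  CloseDatabase(db)
EndIf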

Your app sounds like it might be far more I/O-intensive; in that case I would have chosen a real multi-user DB like PostgreSQL or MySQL/MariaDB.

Re: A few questions regarding Files and Linked Lists

Posted: Fri Dec 13, 2019 3:14 am
by Rinzwind
Just a generic hint: write a basic test client that reads and writes sequentially and randomly. Run multiple instances, and/or use threads to simulate simultaneous access. Try to replicate the issue in its most basic form.
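
For example, something along these lines, run as several simultaneous instances against the same share (a sketch; the path and counts are illustrative):

EnableExplicit

; Brute-force test client sketch: append one tagged record, read the
; whole file back, repeat. Run several copies at once and check the
; result for duplicated, missing or garbled lines.
#TestFile$ = "\\nas\share\stress_test.txt"

Define i, h
Define line$
Define tag$ = Str(ElapsedMilliseconds())      ; crude per-instance tag

For i = 1 To 10000
  h = OpenFile(#PB_Any, #TestFile$, #PB_Ascii) ; opens or creates the file
  If h
    FileSeek(h, Lof(h))                        ; jump to the end to append
    WriteStringN(h, tag$ + " " + Str(i))
    CloseFile(h)
  EndIf
  h = ReadFile(#PB_Any, #TestFile$, #PB_Ascii)
  If h
    While Eof(h) = 0
      line$ = ReadString(h)                    ; sequential read of every record
    Wend
    CloseFile(h)
  EndIf
Next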