I have a folder with thousands of html files that I need to read and process.
The script runs well and fast at first, then performance suddenly drops and it gets slower and slower.
I suspect the ReadFile() function but I'm not sure.
Here is my code:
Procedure.s ExtractString(str.s, prefix.s, suffix.s) ; returns the text between two delimiters, or "" if either is missing
  Protected prefix_pos, suffix_pos, start
  prefix_pos = FindString(str, prefix)
  If prefix_pos = 0
    ProcedureReturn ""
  EndIf
  start = prefix_pos + Len(prefix) ; first character after the prefix
  suffix_pos = FindString(str, suffix, start) ; look for the suffix only after the prefix
  If suffix_pos = 0
    ProcedureReturn ""
  EndIf
  ProcedureReturn Mid(str, start, suffix_pos - start)
EndProcedure
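OpenConsole()
text.s = ""
cin.s = ""
If ExamineDirectory(0, "FICHES\", "*") ; folder containing subfolders
  While NextDirectoryEntry(0)
    ; skip plain files and the "." and ".." entries instead of assuming they come first
    If DirectoryEntryType(0) <> #PB_DirectoryEntry_Directory Or DirectoryEntryName(0) = "." Or DirectoryEntryName(0) = ".."
      Continue
    EndIf
    subfolder.s = "FICHES\" + DirectoryEntryName(0) ; each subfolder contains about 10k html files
    If ExamineDirectory(1, subfolder, "*.htm")
      While NextDirectoryEntry(1)
        counter + 1
        If ReadFile(2, subfolder + "\" + DirectoryEntryName(1)) ; only read and close the file if it was opened
          text = ReadString(2, #PB_File_IgnoreEOL) ; read the whole file into one string
          CloseFile(2)
          cin = ExtractString(text, ~"cin\" value=\"", ~"\" id=")
          PrintN(Str(counter) + " : " + cin)
        Else
          PrintN(Str(counter) + " : could not open " + DirectoryEntryName(1))
        EndIf
      Wend
      FinishDirectory(1)
      Delay(1000) ; pause for one second after each subfolder
    EndIf
  Wend
  FinishDirectory(0)
EndIf
Input()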