
Read a text file into 64k string buffers in memory

Posted: Sun Jan 25, 2004 4:26 am
by NelsonN
Code updated for 5.20+

Ok, this is not the most elegant solution, but it works.

If you need to read a text file larger than PureBasic's current string limit (64k), the following will help. It reads the file into multiple string buffers and never splits a line of text across two buffers, so a search and replace run on each buffer always sees whole lines.

I needed this because I wanted to search and replace in memory in one pass for a CGI application.

Code: Select all

BufferP.w = 0
Buffers.w = 0

CompilerIf #PB_Compiler_OS = #PB_OS_Linux
  tEOL.s = Chr(10) ; Linux line ending is LF only
CompilerElse
  tEOL.s = Chr(13)+Chr(10)
CompilerEndIf

If ReadFile(0, "c:FileToOpen.txt")
  SizeOfFile.l = FileSize("c:FileToOpen.txt")
  Buffers = Int(Round(SizeOfFile/63999, #PB_Round_Up)) ; estimated number of buffers
  Dim MyString.s(Buffers)
  BufferP = 0
  If SizeOfFile > 63999
    BytesRead.l = 0
    tBytesRead.l = 0
    FPointer1.l = 0
    FPointer2.l = 0
    MyString(0) = ""
    Repeat
      FPointer1 = tBytesRead
      fLine.s = ReadString(0)
      FPointer2 = Loc(0) ; Loc() is needed because ReadString does not return bytes read
      BytesRead = FPointer2 - FPointer1
      tBytesRead = tBytesRead + BytesRead
      If BytesRead+Len(tEOL)+Len(MyString(BufferP)) > 63999
        BufferP = BufferP + 1 ; previous buffer filled, advance to the next
        If BufferP > ArraySize(MyString()) ; buffers are rarely filled completely,
          ReDim MyString(BufferP)          ; so the estimate above can be too low
        EndIf
        MyString(BufferP) = "" ; make sure buffer is empty
      EndIf
      MyString(BufferP) = MyString(BufferP) + fLine + tEOL
    Until Eof(0) ; stop once the whole file has been read
  Else
    ; Read smaller files with a ReadData type of input, will be faster
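    ; Sketch only, assuming a single-byte (ASCII/ANSI) file: pull the whole
    ; file into memory with one ReadData() call and copy it into buffer 0.
    *Mem = AllocateMemory(SizeOfFile + 1)
    If *Mem
      ReadData(0, *Mem, SizeOfFile)
      MyString(0) = PeekS(*Mem, SizeOfFile, #PB_Ascii)
      FreeMemory(*Mem)
    EndIf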
  EndIf
  CloseFile(0) ; close only if ReadFile() succeeded
EndIf

; process the buffers
; rename read file to .bak
; write modified data in buffers to new file
; (I used MD5 to test modification of the data
; as ReplaceString does not tell me if it made changes)

Dim MyString.s(0) ; remove buffers from memory
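
The commented steps above (process the buffers, keep a backup, write the result) could look roughly like the following. This is only a sketch, not the code from the original CGI application: the file names, the sample buffer contents and the "abc"/"def" strings are made up, and a plain before/after string compare stands in for the MD5 check.

```purebasic
; Stand-alone sketch: two small sample buffers instead of a real file.
Dim MyString.s(1)
MyString(0) = "first line abc" + #CRLF$
MyString(1) = "second line" + #CRLF$

Define OldText.s, i.i
Changed = #False

; process the buffers
For i = 0 To ArraySize(MyString())
  OldText = MyString(i)
  MyString(i) = ReplaceString(MyString(i), "abc", "def")
  If MyString(i) <> OldText ; compare instead of MD5 to detect a change
    Changed = #True
  EndIf
Next

If Changed
  ; rename read file to .bak (hypothetical file names)
  RenameFile("FileToOpen.txt", "FileToOpen.bak")
  ; write modified data in buffers to new file
  If CreateFile(1, "FileToOpen.txt")
    For i = 0 To ArraySize(MyString())
      WriteString(1, MyString(i)) ; each buffer already ends with the EOL sequence
    Next
    CloseFile(1)
  EndIf
EndIf
```

In the real program this would run before the final Dim that frees the buffers.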




Posted: Sun Jan 25, 2004 5:24 pm
by Kale
Why not use 1 big buffer?

Code: Select all

If ReadFile(1, "BigFile.txt")
    LengthOfFile = Lof(1)
    *Buffer = AllocateMemory(LengthOfFile)
    If *Buffer <> 0
        ReadData(1, *Buffer, LengthOfFile)
    EndIf
    CloseFile(1)
EndIf

Posted: Mon Jan 26, 2004 8:28 pm
by NelsonN
Kale wrote:
> Why not use 1 big buffer?

Because I am using PureBasic's string manipulation functions, and they have a problem with strings larger than 64k.

Posted: Mon Jan 26, 2004 11:21 pm
by Kale
oh yeah, :)

Posted: Tue Jan 27, 2004 10:06 am
by PB
> I am using the string manipulation functions of PureBasic and, also,
> they have a problem with strings larger than 64k.

See: viewtopic.php?p=45054

Posted: Tue Jan 27, 2004 1:36 pm
by Kale
From above link:
> Anyway, this trick doesn't work with some functions:
>
> ; CRASH 1 - ReplaceString()
> A$ = ReplaceString(A$,"abc","def")

:?

Posted: Tue Jan 27, 2004 4:32 pm
by PB
I know, but also from the above link:
> the two crashes you showed can easily be replaced with Procedures to
> get them working again, and the benefit of being able to use long strings
> would be worth it (IMO)

So just create your own "ReplaceString" procedure until PureBasic is
able to use large strings natively. ;)
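
For illustration, a home-made replacement might look something like this. It is a rough sketch, not the procedure from the linked thread: the name MyReplaceString is made up, it assumes PureBasic 5.20+ syntax, and it relies only on FindString(), Mid() and Len(). Whether those functions cope with the long-string trick is not something this sketch checks.

```purebasic
; Hypothetical stand-in for ReplaceString(): replaces every occurrence
; of Find in Text with Replacement and returns the new string.
Procedure.s MyReplaceString(Text.s, Find.s, Replacement.s)
  Protected Result.s, Pos.i, Start.i = 1

  If Find = "" ; nothing to search for, return the text unchanged
    ProcedureReturn Text
  EndIf

  Pos = FindString(Text, Find, Start)
  While Pos > 0
    ; copy the text before the match, then the replacement
    Result + Mid(Text, Start, Pos - Start) + Replacement
    Start = Pos + Len(Find) ; continue searching after the match
    Pos = FindString(Text, Find, Start)
  Wend

  ; append whatever follows the last match
  ProcedureReturn Result + Mid(Text, Start)
EndProcedure

A$ = "abc-xyz-abc"
A$ = MyReplaceString(A$, "abc", "def")
Debug A$
```

Building the result piece by piece like this is simple rather than fast; for very large strings a version that writes into a preallocated memory buffer would probably be quicker.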