Read text files (all formats)
Posted: Wed Mar 26, 2025 7:14 pm
Hi, I wrote a PureBasic tool to search quickly within source files (or other text files), which uses the command ReadString().
Unfortunately, some text file formats aren't supported, so I wrote the following code as a quick workaround for big-endian (UTF-16 BE) files.
This solution has to allocate memory twice (ReAllocateMemory and PeekS). I thought about reading the file directly into the string, but then I'd need something like Buffer=Space(FileSize>>#PB_Compiler_Unicode), which is slower than just allocating memory. Any other ideas, short of dropping the string altogether and working only on a memory buffer?
Maybe the byte swapping could also be improved, any ideas?
Code: Select all
#File=1
FileName.s="test.txt"
FileSize=FileSize(FileName)
Buffer.s
If ReadFile(#File,FileName,#PB_File_SharedRead)
  m=ReadStringFormat(#File); detect and skip the BOM (Byte Order Mark)
  ; Debug Str(m)+": "+Files(Tool\CheckFiles)\Name
  If m%24>=#PB_UTF16BE; formats that ReadString() can't handle
    If m=#PB_UTF16BE
      BufferSize=FileSize
      *Buffer=ReAllocateMemory(*Buffer,BufferSize,#PB_Memory_NoClear)
      If *Buffer
        BufferSize=ReadData(#File,*Buffer,BufferSize); returns the number of bytes actually read
        ; ShowMemoryViewer(*Buffer,BufferSize)
        m=BufferSize
        If m
          *BufferA=*Buffer
          *BufferB=*Buffer+1
          While m>1; swap the two bytes of each 16-bit unit, skip an odd trailing byte
            n=PeekA(*BufferA)
            PokeA(*BufferA,PeekA(*BufferB))
            PokeA(*BufferB,n)
            m-2
            *BufferA+2
            *BufferB+2
          Wend
          ; ShowMemoryViewer(*Buffer,BufferSize)
          Buffer=PeekS(*Buffer,BufferSize>>1)
        Else
          Buffer=""
        EndIf
      Else
        Buffer=""
      EndIf
    Else
      Debug "PANIC - Filetype not supported"
      Buffer=""
    EndIf
  Else
    Buffer=ReadString(#File,m|#PB_File_IgnoreEOL)
  EndIf
  CloseFile(#File)
EndIf
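One idea for the byte swap would be to work on whole 16-bit units instead of two separate byte pointers. This is only an untested sketch: it assumes the built-in Unicode structure (with its \u field) for the 16-bit access, and SwapBytes16 is a made-up helper name, not a PureBasic command.

```purebasic
; Sketch: swap the two bytes of every 16-bit unit in a memory block, in place.
; SwapBytes16 is a hypothetical helper name, not part of PureBasic.
Procedure SwapBytes16(*Mem, Size)
  Protected *p.Unicode=*Mem
  Protected *End=*Mem+(Size & ~1) ; ignore an odd trailing byte
  While *p<*End
    ; rotate the unsigned 16-bit value by 8 bits
    *p\u=((*p\u<<8)|(*p\u>>8))&$FFFF
    *p+2 ; pointer arithmetic is in bytes
  Wend
EndProcedure
```

It would replace the While/Wend loop with a single call such as SwapBytes16(*Buffer,BufferSize). Whether it is actually faster than the PeekA/PokeA version would need to be measured.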