ReplaceString for large Files

BackupUser
PureBasic Guru
PureBasic Guru
Posts: 16777133
Joined: Tue Apr 22, 2003 7:42 pm

Post by BackupUser »

Restored from previous forum. Originally posted by Rings.

; ReplaceString for large Files
;
; by Siegfried (Siggi) Rings
; Note, this is in PureBasic , not in Assembler
; but Assembler can speed up this routine.
; btw it is fast enough and beats every VB indeed.
; Length of Search and Replace must be the same !!!!
;
Procedure ReplaceFileString(File.s,Search.s,Replace.s)
  If Len(Search)<>Len(Replace)
    ;Length of Search and Replace must be the same !!!!
    ProcedureReturn -2
  EndIf

  ;MessageRequester("Info",File+Chr(13)+Search+Chr(13)+Replace,0)

  pResult.l=-1
  If ReadFile(0, File)
    FileLength = Lof()
    If FileLength And AllocateMemory(0, FileLength , 0)
      Mempointer=UseMemory(0)
      ReadData(Mempointer, FileLength) ; Read the whole file into the memory buffer
      ;So now we have all data from the file in memory
      T=0 ;Position in the memory buffer
      P=0 ;Position in the search string
      L=Len(Search)
      ; MessageRequester("Info",Str(FileLength)+":"+Str(L),0)
      NochMal: ;Loop label ("once more")
      a=PeekB(UseMemory(0)+T) ;Read a byte
      If a=Asc(Mid(Search,P+1,1)) ;Is the byte the same as in Search?
        If P=L-1
          ;Okay, found; now replace it
          For H=1 To L
            PokeB(UseMemory(0)+T-L+H,Asc(Mid(Replace,H,1)) )
          Next H
          P=0 ;Reset pointer in the search string
          pResult=pResult+1 ;Increment our result counter
          MessageRequester("Info","Found at Position "+Str(T-L),0)
        Else
          P=P+1
          T=T+1
          Goto NochMal
        EndIf
      Else
        P=0
      EndIf
      T=T+1
      If T<FileLength ;More bytes left? (this test is reconstructed)
        Goto NochMal
      EndIf
      ; [The forum restoration dropped everything between "If T<" and ">-1":
      ;  the rest of the procedure (presumably writing the buffer back to the
      ;  file and returning pResult) and the call that sets Result are missing.]
If Result>-1
  MessageRequester("Info",Str(Result+1)+" has been replaced!",0)
Else
  MessageRequester("Info",Str(Result+1)+" ERROR",0)
EndIf



Siggi

Restored from previous forum. Originally posted by Pupil.

Nice Rings!

I've made a routine much like yours. I don't know if mine is faster, smaller or whatever, but it works at least. The only problem is that it's not as well commented as yours, my bad :wink:
Included in the code listing is a small example of how to call the procedure.
With some more work it will be possible to handle search and replace strings of different sizes. One solution could be to go through the whole search procedure and put markers in a dynamic list wherever a replacement should go, then save each fragment between the markers and put the replacement in between them. I hope you got that and that my explanation wasn't too bad.

Code:

Structure CharType
  Char.b
EndStructure

Procedure FindAndReplace_Mem(*buffer, memlength.l, search.s, replace.s)
length.l=Len(search)
If length.l<>Len(replace.s)
  ProcedureReturn #FALSE
EndIf
*mem.CharType = *buffer
*string_s.CharType = @search.s : stringpos.l=1
*string_r.CharType = @replace.s
memmax.l=*mem+memlength
stringbase_s.l=*string_s
stringbase_r.l=*string_r
stringmax_s.l=*string_s+length-1
stringmax_r.l=*string_r+length-1
Repeat
  If *mem\Char = *string_s\Char
    *mem+1 : *string_s+1
    memold.l=*mem
    loop=#TRUE
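    While *mem<memmax
    ; [The rest of the search loop was lost when the post was restored. The
    ;  listing resumes at the loop's closing condition, presumably:]
Until *mem>=memmax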
ProcedureReturn #TRUE
EndProcedure

If ReadFile(0,"data.txt")
  l.l=Lof()
  *data=AllocateMemory(0,l,0)
  ReadData(*data,l) : CloseFile(0)
  If FindAndReplace_Mem(*data, l, "n", "_")
    If CreateFile(0,"data1.txt")
      WriteData(*data,l) : CloseFile(0)
    EndIf
  EndIf
Endif

I have edited and updated the code a bit! It is possibly faster now.
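
The marker idea described above for search and replace strings of different lengths might look roughly like the sketch below. It is not the routine from this thread: it is written for a current PureBasic version rather than the older syntax used in the listings here, ReplaceVarLength and all of its parameters are made-up names, and it assumes a version that provides Ascii() for building the byte buffers in the example.

```purebasic
; Sketch only: all names are illustrative, syntax is for a current PureBasic version.
; Returns a newly allocated buffer (release it with FreeMemory) and stores
; its size in *newLength. Returns 0 on failure.
Procedure.i ReplaceVarLength(*buffer, bufferLength.i, *search, searchLength.i, *replace, replaceLength.i, *newLength.Integer)
  Protected NewList markers.i()   ; start offsets of all hits
  Protected pos.i, last.i, outPos.i, size.i, *out

  If searchLength <= 0
    ProcedureReturn 0
  EndIf

  ; Pass 1: walk through the buffer and put a marker at every hit
  While pos <= bufferLength - searchLength
    If CompareMemory(*buffer + pos, *search, searchLength)
      AddElement(markers())
      markers() = pos
      pos = pos + searchLength
    Else
      pos = pos + 1
    EndIf
  Wend

  ; Pass 2: copy fragment, replacement, fragment, ... into a new buffer
  size = bufferLength + ListSize(markers()) * (replaceLength - searchLength)
  *out = AllocateMemory(size + 1)   ; +1 so that an empty result is still a valid allocation
  If *out = 0
    ProcedureReturn 0
  EndIf
  ForEach markers()
    If markers() > last
      CopyMemory(*buffer + last, *out + outPos, markers() - last)
    EndIf
    outPos = outPos + (markers() - last)
    If replaceLength > 0
      CopyMemory(*replace, *out + outPos, replaceLength)
    EndIf
    outPos = outPos + replaceLength
    last = markers() + searchLength
  Next
  If bufferLength > last            ; fragment after the last marker
    CopyMemory(*buffer + last, *out + outPos, bufferLength - last)
  EndIf

  *newLength\i = size
  ProcedureReturn *out
EndProcedure

; Example call on a small in-memory buffer
Define text$ = "one two one"
Define *text = Ascii(text$)
Define *s = Ascii("one")
Define *r = Ascii("three")
Define newLength.i
Define *result = ReplaceVarLength(*text, Len(text$), *s, 3, *r, 5, @newLength)
If *result
  Debug PeekS(*result, newLength, #PB_Ascii)   ; three two three
  FreeMemory(*result)
EndIf
FreeMemory(*text) : FreeMemory(*s) : FreeMemory(*r)
```

Building a new buffer in the second pass avoids shifting the remaining data for every single replacement, at the cost of holding the old and new buffer in memory at the same time.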




Edited by - Pupil on 16 January 2002 18:16:30

Restored from previous forum. Originally posted by Rings.
Pupil wrote:
I've made a routine much like yours. I don't know if mine is faster, smaller or whatever, but it works at least.

Yes, it works, and that's it.
Nice code, Pupil. You make more use of pointers, which I know is faster indeed, but more complex to read and understand (not for me, but for all those who only know QBasic or VB).
If you have more snippets, just post them here.
We can all learn from great stuff.
thx

Siggi

Restored from previous forum. Originally posted by Paul.

You can also upload them to the PB Resources Site (as PB or TXT files) so others can find them easily (and more quickly than by going through all the posts here when looking for old ones).

Restored from previous forum. Originally posted by Pupil.
Paul wrote:
You can also upload them to the PB Resources Site (as PB or TXT files) so others can find them easily (and more quickly than by going through all the posts here when looking for old ones).

OK, I've uploaded it. I hope you're not mad at me for not zipping it and including a readme file :wink:

Restored from previous forum. Originally posted by Pupil.
Rings wrote:
Nice code, Pupil. You make more use of pointers, which I know is faster indeed, but more complex to read and understand (not for me, but for all those who only know QBasic or VB).

Thanks! Pointers, yes. When I look at my code now, I have a hard time following what's going on myself!! :wink:

Rings wrote:
If you have more snippets, just post them here.
We can all learn from great stuff.

I figure I'll complete this find-and-replace stuff with a routine that can handle search and replace strings of different lengths, but I think I will do that this coming weekend.

Rings!
One small thing I noticed while running your 'ReplaceString' program:

If you have a data file containing the following:
hello hello helloj
and search for "hello helloj" and want to replace it with "012345678901", the output is this:
hello hello helloj
It's the same! But the output should really be this:
hello 012345678901
It should not be so hard to fix: probably just store the current position before the 'similarity loop' and restore it if there was no hit in this loop.
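
The reason the match is missed is that the partial match on the first "hello hello" consumes the bytes where the real match begins. A minimal sketch of the fix suggested above could look like this. It is not Rings' routine: it is written for a current PureBasic version, ReplaceInBuffer and its parameters are made-up names, and it assumes Ascii() is available to build the byte buffers. The start position `pos` is the stored value; after a failed comparison the search simply resumes one byte after it.

```purebasic
; Sketch only: illustrative names, syntax for a current PureBasic version.
; Replaces every occurrence of the pattern at *search with the equally long
; pattern at *replace and returns the number of replacements (-1 on bad input).
Procedure.i ReplaceInBuffer(*buffer, bufferLength.i, *search, *replace, patternLength.i)
  Protected pos.i, count.i

  If patternLength <= 0
    ProcedureReturn -1
  EndIf

  While pos <= bufferLength - patternLength
    ; Compare the whole pattern against the buffer at the current start position
    If CompareMemory(*buffer + pos, *search, patternLength)
      CopyMemory(*replace, *buffer + pos, patternLength)
      pos = pos + patternLength   ; continue after the replaced text
      count = count + 1
    Else
      pos = pos + 1               ; no hit: go back to one byte after the start position
    EndIf
  Wend

  ProcedureReturn count
EndProcedure

; The example from this post
Define text$ = "hello hello helloj"
Define *text = Ascii(text$)
Define *s = Ascii("hello helloj")
Define *r = Ascii("012345678901")
Define hits = ReplaceInBuffer(*text, Len(text$), *s, *r, 12)
Debug PeekS(*text, Len(text$), #PB_Ascii)   ; hello 012345678901
Debug hits                                  ; 1
FreeMemory(*text) : FreeMemory(*s) : FreeMemory(*r)
```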

Restored from previous forum. Originally posted by Rings.
Pupil wrote:
If you have a data file containing the following:
hello hello helloj
and search for "hello helloj" and want to replace it with "012345678901", the output is this:
hello hello helloj
It's the same! But the output should really be this:
hello 012345678901
It should not be so hard to fix: probably just store the current position before the 'similarity loop' and restore it if there was no hit in this loop.

Yes, that may be true.
But note that I wrote this example to show Ralf that it is not so difficult to work with memory blocks instead of strings (64K barrier).
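
For readers on a current PureBasic version, loading a whole file into a memory block as described here might look like the sketch below. The listings in this thread use the older numbered-memory syntax; this sketch uses the current pointer-based calls instead, and LoadFileToMemory, the temporary file name and the sample text are made up for illustration.

```purebasic
; Sketch only: illustrative names, syntax for a current PureBasic version.
; Loads a whole file into a memory block. Returns the block (release it with
; FreeMemory) and stores its size in *length. Returns 0 on failure.
Procedure.i LoadFileToMemory(fileName$, *length.Integer)
  Protected file.i, size.i, *mem

  file = ReadFile(#PB_Any, fileName$)
  If file
    size = Lof(file)
    If size > 0
      *mem = AllocateMemory(size)
      If *mem
        If ReadData(file, *mem, size) = size
          *length\i = size
        Else
          FreeMemory(*mem)        ; short read: give the block back
          *mem = 0
        EndIf
      EndIf
    EndIf
    CloseFile(file)
  EndIf
  ProcedureReturn *mem
EndProcedure

; Example: write a small test file, then load it back as a memory block
Define name$ = GetTemporaryDirectory() + "replace_demo.txt"
Define out.i = CreateFile(#PB_Any, name$)
If out
  WriteString(out, "hello hello helloj", #PB_Ascii)
  CloseFile(out)
EndIf

Define length.i
Define *block = LoadFileToMemory(name$, @length)
If *block
  Debug Str(length) + " bytes loaded"
  Debug PeekS(*block, length, #PB_Ascii)
  FreeMemory(*block)
EndIf
DeleteFile(name$)
```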

Pupil wrote:
I figure I'll complete this find-and-replace stuff with a routine that can handle search and replace strings of different lengths.

Please do.
thx

Siggi

Restored from previous forum. Originally posted by Ralf.

You see, I am a total beginner,
but I will study your two examples and hope I can get them into my
little brain.
Thanks anyway.

Restored from previous forum. Originally posted by Rings.

That is why we post here.
Getting better with a little help from my friends...

Siggi

Restored from previous forum. Originally posted by Ralf.

Exactly. :)

Restored from previous forum. Originally posted by Ralf.

Thanks a lot!
I tried both examples and both worked.
Now I will study them.