Fast search in files

Posted: Wed Sep 13, 2006 2:08 pm
by raska
Hi:

I am trying to find out how fast PB can be at comparing and searching in 2 files at a time...

I've tested the ReadString command and it is fast enough, but when looking for words in the line the process is slow...

My question is whether there is a command to get the words in a line, something like

Code:

a$=ReadString(0)

firstword$=word$(a$,1)
secondword$=word$(a$,2)

and so on...

I've tried to get every word in the line by calling Mid(a$,count,1) one character at a time, and it is very slow.
Any ideas to accelerate the search?

I am talking about large files of more than 10 MB with approx. 50,000 lines.

This is what I wrote, which gives very slow results:

Code:

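OpenFile(0, "h:\test.obj")

While Eof(0) = 0
  a$ = ReadString(0)
  If Left(a$, 2) = "v "   ; only lines starting with "v " are of interest
    Gosub num
  EndIf

  ; reset the buffers that num fills before reading the next line
  dx$ = ""
  dy$ = ""
  dz$ = ""
Wend
CloseFile(0)
End

num:
  pos = 2
  ; first number: collect characters up to the next space
  Repeat
    pos = pos + 1
    x$ = Mid(a$, pos, 1)
    dx$ = dx$ + x$
  Until x$ = " "
  ; second number
  Repeat
    pos = pos + 1
    x$ = Mid(a$, pos, 1)
    dy$ = dy$ + x$
  Until x$ = " "
  ; third number
  Repeat
    pos = pos + 1
    x$ = Mid(a$, pos, 1)
    dz$ = dz$ + x$
  Until Val(x$)=0
Return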

P.S. The structure of every line with the searched values is:

v number number number

So I search for "v " in every line and, if it is found, the subroutine num tries to get the 3 numbers that follow.

Posted: Wed Sep 13, 2006 4:19 pm
by raska
Answering myself.....

Sorry for the mistake...

I was testing the file with the debugger on..

Now it takes just 1 second to compare two 10 MB files. LOL

What a beautiful speed. :D

I will continue testing this thing and if all is as good as this, one more PB addict will be soon here. :D

Regards!!
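
For anyone who wants to reproduce the measurement, here is a minimal timing sketch. It is not the code used above: the file name is a placeholder, and the loop only reads lines without comparing anything. It relies on the `#PB_Compiler_Debugger` compiler constant to warn when the program was built with the debugger enabled, since timings taken that way are not representative.

```purebasic
; Hypothetical timing harness; "test.obj" is a placeholder file name.
CompilerIf #PB_Compiler_Debugger
  Debug "Debugger is enabled: expect much slower timings than in the final executable"
CompilerEndIf

start = ElapsedMilliseconds()
lines = 0

If ReadFile(0, "test.obj")
  While Eof(0) = 0
    a$ = ReadString(0)   ; read one line; real code would process it here
    lines + 1
  Wend
  CloseFile(0)
EndIf

elapsed = ElapsedMilliseconds() - start
MessageRequester("Timing", Str(lines) + " lines read in " + Str(elapsed) + " ms")
```

Running the same source once with and once without the debugger should make the difference visible.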

By the way, there was an error in the previous code.

The line
Until Val(x$)=0

must be
Until x$=Chr(0)

If there is a good way to detect the end of the line, please let me know. :D
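
One possible way to detect the end of the line, sketched here as an assumption rather than a tested answer, is to compare the position against `Len(a$)` instead of inspecting the character itself. The sample line and the `num$()` array are made up for illustration.

```purebasic
; Sketch: split "v number number number" character by character,
; using Len() to know where the line ends.
a$ = "v 1.5 -2.25 3.75"   ; made-up sample line
lineLen = Len(a$)
pos = 2                    ; position of the space after "v"
Dim num$(2)                ; hypothetical array for the three numbers

For n = 0 To 2
  While pos < lineLen
    pos + 1
    x$ = Mid(a$, pos, 1)
    If x$ = " "
      Break                ; a space ends the current number
    EndIf
    num$(n) = num$(n) + x$
  Wend
Next

Debug num$(0)
Debug num$(1)
Debug num$(2)
```

Because the inner loop stops once `pos` reaches `lineLen`, the last number does not need a terminating character.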

Nevertheless, I would still like to know if there is a faster way to get the words inside a line without having to use Mid.

Posted: Wed Sep 13, 2006 4:56 pm
by Trond
I don't know if this works or not, but you can try it:

Code:

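#MEMSIZE = 1024 * 1024 * 5 ; 5 MB

OpenFile(0, "c:\cmldr")

*S = AllocateMemory(#MEMSIZE) ; buffer for one chunk of the file
Find.s = #LF$ + "v "          ; line break followed by "v " (assumes an ASCII build, 1 byte per character)
Nums.s
Dx.s
Dy.s
Dz.s

While Eof(0) = 0
  If Loc(0) >= 3
    FileSeek(0, Loc(0) - 3)   ; step back so a match split across two chunks is not missed
  EndIf
  Length = ReadData(0, *S, #MEMSIZE) ; number of bytes actually read
  For I = 0 To Length - 3
    If CompareMemory(*S + I, @Find, 3)
      J = I + 1               ; skip the LF, so J points at "v"
      While I < Length And PeekB(*S + I) <> #CR
        I + 1                 ; advance to the end of the line (CR+LF line endings assumed)
      Wend
      Nums = PeekS(*S + J, I - J) ; the whole line "v x y z"
      Dx = StringField(Nums, 2, " ")
      Dy = StringField(Nums, 3, " ")
      Dz = StringField(Nums, 4, " ")
    EndIf
  Next
Wend
FreeMemory(*S)
CloseFile(0)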


Posted: Wed Sep 13, 2006 7:24 pm
by raska
Thanks Trond,

I will try to play with the memory commands. I had no idea those commands existed.

Btw, I found the function I was looking for. It speeds up the program a lot.

When a line containing "v " is found, I only need to do this to get the values: :D

Code:

dx$=StringField(a$,2," ")
dy$=StringField(a$,3," ")
dz$=StringField(a$,4," ")

I think this BASIC is a very good choice. :D
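
Putting the pieces from this thread together, the whole read loop might look like the sketch below. It is only an illustration: the file name is a placeholder, the `x`, `y`, `z` and `count` variables are hypothetical, and it assumes the numbers are separated by exactly one space (extra spaces would likely shift the field indexes).

```purebasic
; Sketch: read a file line by line and extract the three numbers
; from every "v number number number" line with StringField.
Define.d x, y, z
count = 0

If ReadFile(0, "test.obj")             ; placeholder file name
  While Eof(0) = 0
    a$ = ReadString(0)
    If Left(a$, 2) = "v "
      x = ValD(StringField(a$, 2, " ")) ; field 1 is the "v" itself
      y = ValD(StringField(a$, 3, " "))
      z = ValD(StringField(a$, 4, " "))
      count + 1
    EndIf
  Wend
  CloseFile(0)
  Debug Str(count) + " lines starting with v parsed"
Else
  Debug "Could not open the file"
EndIf
```

Using `ReadFile` instead of `OpenFile` opens the file read-only, which is enough when nothing is written back.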