
Re: Compare String

Posted: Wed May 01, 2013 9:40 pm
by idle
You need to test the map approach so that it sorts the data while it is being read in from the file; using it afterwards won't be faster.
However, making it faster will probably require a thread and a buffer for the file stream, so the stream can be processed asynchronously.

Something like this, but optimized:

Code:

; Compile with the thread-safe option enabled: strings are used inside a thread.
Global NewMap Ma.s() 
Global NewList listCombined() 
Global NewList Buffer.s()
Global mbuffer = CreateMutex()
Global sbuffer = CreateSemaphore()
Global quit 

Procedure Filllist(s1.s) 
  If Not FindMapElement(Ma(),s1)
    AddMapElement(Ma(),s1) 
    AddElement(listCombined())
    Ma() = s1 
    listCombined() = @ma()
  EndIf    
EndProcedure    

Procedure Push(input.s) 
   LockMutex(mbuffer)
   LastElement(Buffer()) 
   AddElement(Buffer())
   buffer() = input
   UnlockMutex(mbuffer) 
   SignalSemaphore(sbuffer)
EndProcedure 

Procedure.s Pop()
   Protected out.s 
   LockMutex(mBuffer)
    If ListSize(buffer())
      FirstElement(Buffer())
      out = Buffer()
      DeleteElement(buffer())
   EndIf 
   UnlockMutex(mbuffer)
   ProcedureReturn out 
EndProcedure    

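Procedure Sortdata(void.i) 
  Protected input.s 
  Repeat  
    WaitSemaphore(sbuffer)    ; one signal per pushed string, plus a final one after quit is set
    input = Pop() 
    If input <> ""
      Filllist(input)
    EndIf    
  Until quit And input = ""   ; keep draining until the buffer is empty so no string is lost
EndProcedure   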
    
Procedure ReadFromFile() 
  Protected a, thread
   
  thread = CreateThread(@Sortdata(),0) 
   
  For a = 1 To 100 
    Push("some string " + Str(a))    
    Delay(1) ; simulate a slow read; note that file reads go through a buffer (4096 bytes by default)
  Next 

  For a = 50 To 150 
    Push("some string " + Str(a)) 
    Delay(1) 
  Next  
  
  quit = 1 
  SignalSemaphore(sbuffer) 
  
  If thread
    WaitThread(thread) ; let the worker finish before the combined list is read
  EndIf
EndProcedure   
  
  
Define *ps.string 

Debug "the combined lists output" 

ReadFromFile() 

ForEach listCombined() 
   *ps = listCombined() 
   Debug *ps\s 
Next    
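
For comparison, the plain map approach without a thread could look roughly like the sketch below, which filters duplicates while the lines are being read. The file names "list1.txt" and "list2.txt" and the names Seen(), Combined() and AddFile() are made up for illustration; it assumes one string per line and skips empty lines.

```purebasic
; Sketch only: de-duplicate lines while they are read, single-threaded.
; "list1.txt" and "list2.txt" are hypothetical file names.
EnableExplicit

Global NewMap Seen.i()        ; keys are the strings seen so far
Global NewList Combined.s()   ; unique strings in the order they were first read

Procedure AddFile(fileName.s)
  Protected file, line.s
  file = ReadFile(#PB_Any, fileName)
  If file
    While Eof(file) = 0
      line = ReadString(file)
      ; only keep a line the first time it shows up
      If line <> "" And FindMapElement(Seen(), line) = 0
        AddMapElement(Seen(), line)
        AddElement(Combined())
        Combined() = line
      EndIf
    Wend
    CloseFile(file)
  EndIf
EndProcedure

AddFile("list1.txt")
AddFile("list2.txt")

ForEach Combined()
  Debug Combined()
Next
```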

Re: Compare String

Posted: Thu May 02, 2013 8:14 am
by martin66119
Hi idle,

Thank you very much for your help. Before I change the code completely, I would like to try the function CompareMemoryString(...), but I do not know how to use it. Could you please help me modify the posted code?

Thank you very much for any help in advance.

Re: Compare String

Posted: Thu May 02, 2013 11:46 pm
by idle
You've stated that you have two lists of strings and need to find duplicates,
but you haven't made it clear what you're doing or what you intend to do after finding the duplicates.

What are you trying to achieve, e.g. a spell checker?
What do you intend to do about the duplicates: eliminate them to create a single data set, build a dictionary ...?
How many strings are in these lists, <10000 or >10000?
What sort of access do you want to the items in the lists: direct access, or don't you care?

As for CompareMemoryString usage:

Code:

result = CompareMemoryString(@"hello",@"hello") 
Select result 
   Case #PB_String_Equal
      Debug "equal"   
   Case #PB_String_Lower
      Debug "first string is lower"
   Case #PB_String_Greater   
      Debug "first string is greater"
EndSelect

;or 
If Not CompareMemoryString(@"hello",@"hello") 
  debug "equal"
EndIf 

; or implement your own, but since you need to know the lengths, passing the strings in will probably be slower 
; returns 1 if equal  
Procedure CompString(s1.s,s2.s) 
  Protected l1,l2,*pa.Character,*pb.Character,pos,result=1 
  l1 = Len(s1)
  l2 = Len(s2) 
  If l1 <> l2 
    result = 0  
  Else 
    *pa = @s1 
    *pb = @s2
    While pos < l1
      If *pa\c <> *pb\c   ; compare one character (1 byte in ASCII mode, 2 bytes in Unicode mode)
        result = 0 
        Break 
      EndIf 
      *pa + SizeOf(Character)
      *pb + SizeOf(Character) 
      pos + 1
    Wend 
  EndIf 
  ProcedureReturn result 
EndProcedure
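
Applied to two lists, CompareMemoryString could be used roughly as in the sketch below. The list names First() and Second() and the sample strings are made up for illustration, and #PB_String_NoCase is only there to show the optional mode parameter. Each element is copied into a plain string variable first, so that @a and @b point at the string data. Note that every string of one list is compared against every string of the other, so this will be slow for large lists.

```purebasic
; Sketch only: report strings of First() that also occur in Second().
; List names and sample data are hypothetical.
EnableExplicit

NewList First.s()
NewList Second.s()
Define a.s, b.s

AddElement(First())  : First()  = "apple"
AddElement(First())  : First()  = "pear"
AddElement(Second()) : Second() = "Pear"
AddElement(Second()) : Second() = "plum"

ForEach First()
  a = First()
  ForEach Second()
    b = Second()
    ; case-insensitive comparison; drop the third parameter for an exact match
    If CompareMemoryString(@a, @b, #PB_String_NoCase) = #PB_String_Equal
      Debug "duplicate: " + a
      Break   ; one match is enough, go on with the next string of First()
    EndIf
  Next
Next
```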