Page 2 of 3

Re: Find a string in *Memory

Posted: Fri Mar 29, 2024 11:52 pm
by charvista
@Infratec
For your information, as I first posted my problem with *Mem = AllocateMemory(500000000) [Five hundred million] I tried your code with the CompareMemory() removed, and the result is:
First: Europe in 62ms
Last : Oceania in 13ms
This is incredibly fast ! :mrgreen:

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 1:29 am
by skywalk
Yes, mine used comparememory calls also. I'm going to inline that call like infratec's code. 8)

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 10:44 am
by infratec
Added the possibility to return the offset.
Added the possibility to set the start address. So you can search fast for the next entry.

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 11:24 am
by charvista
Added the possibility to return the offset.
Added the possibility to set the start address. So you can search fast for the next entry.
Very interesting. A big thank you for your help, infratec.
I would also like to thank AZJIO and Caronte3D for their support :wink:
Fred might be interested for this, as it is a 'missing' function in the language....

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 12:07 pm
by idle
This is still probably the best option with asm backend
https://www.purebasic.fr/english/viewto ... 48#p467148
use BM for needles more than 16 bytes and SSE find for needles < 16 bytes

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 2:19 pm
by charvista
Dear infratec
I am sorry to tell you after trying to include your code in my programs, that I get *Here=0 when trying to get the Second$. (fine with First$, Last$ and NextToLast$ however).
It works well with *Mem=Allocate(10000000) [Ten million] but when I change to 50000000 (Fifty million], *Here returns 0....

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 2:34 pm
by infratec
Bug:

Code: Select all

*Here = FindStringInMemory(*Mem, HW$, #False, #False, *Here + 1)
Fixed:

Code: Select all

*Here = FindStringInMemory(*Mem, HW$, #False, #False, *Here + 2)
I don't know how it can work with a smaller memory, since the entry point points never to a correct character.

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 2:58 pm
by infratec
New version with automatic increase by 2 for next search if it already points to a search entry:

Code: Select all

EnableExplicit

DisableDebugger

Procedure.i FindMemoryString(*Memory, String$, FindFromEnd.i=#False, ReturnOffset.i=#False, *StartPtr.Character=#Null)
  
  Protected.i StringByteLength, Result
  Protected *MemoryEnd, *Ptr.Character, *FoundPtr, *Ptr2.Character, *String.Character
  
  
  StringByteLength = StringByteLength(String$)
  *String = @String$
  
  If FindFromEnd
    
    If *StartPtr
      *Ptr = *StartPtr - StringByteLength
    Else
      *Ptr = *Memory + MemorySize(*Memory) - StringByteLength
    EndIf
    
    Repeat
      
      If *Ptr\c = *String\c
        *Ptr2 = *Ptr + 2
        *String + 2
        While *Ptr2\c = *String\c
          *Ptr2 + 2
          *String + 2
        Wend
        If *String\c = #Null
          *FoundPtr = *Ptr
          Break
        EndIf
        *String = @String$
      EndIf
      
      *Ptr - 2
    Until *Ptr <= *Memory
    
  Else
    
    *MemoryEnd = *Memory + MemorySize(*Memory) - StringByteLength
    
    If *StartPtr
      If *StartPtr\c = *String\c
        *Ptr = *StartPtr + 2
      Else
        *Ptr = *StartPtr
      EndIf
    Else
      *Ptr = *Memory
    EndIf
    
    Repeat
      
      If *Ptr\c = *String\c
        *Ptr2 = *Ptr + 2
        *String + 2
        While *Ptr2\c = *String\c
          *Ptr2 + 2
          *String + 2
        Wend
        If *String\c = #Null
          *FoundPtr = *Ptr
          Break
        EndIf
        *String = @String$
      EndIf
      
      *Ptr + 2
    Until *Ptr >= *MemoryEnd
    
  EndIf
  
  If ReturnOffset
    Result = *FoundPtr - *Memory
  Else
    Result = *FoundPtr
  EndIf
  
  ProcedureReturn Result
  
EndProcedure




Define.q StartTime, EndTime, Summ
Define HW$, Result$
Define *Memory, *Here


HW$ = "Hello World of "
*Memory = AllocateMemory(50000000)
If *Memory
  Debug "Mem allocated"
  PokeS(*Memory + 18205400, HW$ + "Europe !")
  PokeS(*Memory + 26055450, HW$ + "America!")
  PokeS(*Memory + 36750820, HW$ + "Asia   !")
  PokeS(*Memory + 39046460, HW$ + "Africa !")
  PokeS(*Memory + 46750820, HW$ + "Oceania!")
  
  Repeat
    StartTime = ElapsedMilliseconds()
    *Here = FindMemoryString(*Memory, HW$, #False, #False, *Here)
    EndTime = ElapsedMilliseconds()
    If *Here
      Result$ + PeekS(*Here + StringByteLength(HW$), 7) + " in " + Str(EndTime - StartTime) + "ms" + #LF$
      Summ + (EndTime - StartTime)
    EndIf
  Until Not *Here
  Result$ + "Summ: " + Str(Summ) + "ms" + #LF$
  
  
  Result$ + #LF$ + "Reverse" + #LF$
  
  *Here = #Null
  Summ = 0
  Repeat
    StartTime = ElapsedMilliseconds()
    *Here = FindMemoryString(*Memory, HW$, #True, #False, *Here)
    EndTime = ElapsedMilliseconds()
    If *Here
      Result$ + PeekS(*Here + StringByteLength(HW$), 7) + " in " + Str(EndTime - StartTime) + "ms" + #LF$
      Summ + (EndTime - StartTime)
    EndIf
  Until Not *Here
  Result$ + "Summ: " + Str(Summ) + "ms" + #LF$
  
  FreeMemory(*Memory)
EndIf

EnableDebugger

Debug Result$
Interestingly is searching backwards much faster. At least on my PC.
forward 67ms , reverse 39ms (ASM backend)
forward 65ms , reverse 37ms (C backend)
forward 26ms , reverse 11ms (C backend optimized)

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 3:55 pm
by charvista
Your new version, unmodified, when run on my computer, it gives:
Europe in 6ms
America in 3ms
Asia in 4ms
Africa in 0ms
Oceania in 3ms
Summ: 16ms

Reverse
Oceania in 1ms
Africa in 4ms
Asia in 1ms
America in 5ms
Europe in 3ms
Summ: 14ms
Wow, if only we could travel around the globe at that speed! :mrgreen:
Perhaps it's because I'm using 13th Gen Intel(R) Core(TM) i9-13900K 3.00 GHz
Ok, now I am going to test it with my program and see if it works...

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 4:56 pm
by skywalk
Booo, infratec is faster than AI!

What is the best way to account for utf-8 or ansi memory block?

Duplicate this function with new name or add more function parameters?

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 5:36 pm
by charvista
Small question, as I am unable to make it working in my program for now.
The *StartAddress.Character, is that based on the last *Ptr value? I need to use the offset of the last search, based on *Mem.
With other words, your version does not work when using ReturnOffset.i=#True
Would this possibly affect the search speed?

Re: Find a string in *Memory

Posted: Sat Mar 30, 2024 6:45 pm
by infratec
But it is half so fast, because increase is only 1 and not 2

Code: Select all

EnableExplicit

DisableDebugger

Procedure.i FindMemory(*Memory, *ToFind.Ascii, FindFromEnd.i=#False, MemoryLength.i=0, ToFindLength.i=0, ReturnOffset.i=#False, *StartPtr.Ascii=#Null)
  
  Protected.i ByteLength, Result
  Protected *MemoryEnd, *Ptr.Ascii, *FoundPtr, *Ptr2.Ascii, *ToFindTmp.Ascii, *ToFindEnd
  
  
  If ToFindLength = 0
    ByteLength = MemorySize(*ToFind)
  Else
    ByteLength = ToFindLength
  EndIf
  
  *ToFindTmp = *ToFind
  *ToFindEnd = *ToFind + ByteLength
  
  If FindFromEnd
    
    If *StartPtr
      *Ptr = *StartPtr - ByteLength
    Else
      *Ptr = *Memory + MemorySize(*Memory) - ByteLength
    EndIf
    
    Repeat
      
      If *Ptr\a = *ToFindTmp\a
        *Ptr2 = *Ptr + 1
        *ToFindTmp + 1
        While *Ptr2\a = *ToFindTmp\a
          *Ptr2 + 1
          *ToFindTmp + 1
          If *ToFindTmp = *ToFindEnd
            *FoundPtr = *Ptr
            Break 2
          EndIf
        Wend
        *ToFindTmp = *ToFind
      EndIf
      
      *Ptr - 1
    Until *Ptr <= *Memory
    
  Else
    
    *MemoryEnd = *Memory + MemorySize(*Memory) - ByteLength
    
    If *StartPtr
      If *StartPtr\a = *ToFind\a
        *Ptr = *StartPtr + 1
      Else
        *Ptr = *StartPtr
      EndIf
    Else
      *Ptr = *Memory
    EndIf
    
    Repeat
      
      If *Ptr\a = *ToFindTmp\a
        *Ptr2 = *Ptr + 1
        *ToFindTmp + 1
        While *Ptr2\a = *ToFindTmp\a
          *Ptr2 + 1
          *ToFindTmp + 1
          If *ToFindTmp = *ToFindEnd
            *FoundPtr = *Ptr
            Break 2
          EndIf
        Wend
        *ToFindTmp = *ToFind
      EndIf
      
      *Ptr + 1
    Until *Ptr >= *MemoryEnd
    
  EndIf
  
  If ReturnOffset
    Result = *FoundPtr - *Memory
  Else
    Result = *FoundPtr
  EndIf
  
  ProcedureReturn Result
  
EndProcedure




Define.q StartTime, EndTime, Summ
Define HW$, Result$
Define *Memory, *Here, *HW


HW$ = "Hello World of "
*HW = UTF8(HW$)
If *HW
  *Memory = AllocateMemory(50000000)
  If *Memory
    Debug "Mem allocated"
    PokeS(*Memory + 18205400, HW$ + "Europe !", -1, #PB_UTF8|#PB_String_NoZero)
    PokeS(*Memory + 26055450, HW$ + "America!", -1, #PB_UTF8|#PB_String_NoZero)
    PokeS(*Memory + 36750820, HW$ + "Asia   !", -1, #PB_UTF8|#PB_String_NoZero)
    PokeS(*Memory + 39046460, HW$ + "Africa !", -1, #PB_UTF8|#PB_String_NoZero)
    PokeS(*Memory + 46750820, HW$ + "Oceania!", -1, #PB_UTF8|#PB_String_NoZero)
    
    Repeat
      StartTime = ElapsedMilliseconds()
      *Here = FindMemory(*Memory, *HW, #False, 0, MemorySize(*HW) - 1, #False, *Here)
      EndTime = ElapsedMilliseconds()
      If *Here
        Result$ + PeekS(*Here + MemorySize(*HW) - 1, 7, #PB_UTF8) + " in " + Str(EndTime - StartTime) + "ms" + #LF$
        Summ + (EndTime - StartTime)
      EndIf
    Until Not *Here
    Result$ + "Summ: " + Str(Summ) + "ms" + #LF$
    
    
    Result$ + #LF$ + "Reverse" + #LF$
    
    *Here = #Null
    Summ = 0
    Repeat
      StartTime = ElapsedMilliseconds()
      *Here = FindMemory(*Memory, *HW, #True, 0, MemorySize(*HW) - 1, #False, *Here)
      EndTime = ElapsedMilliseconds()
      If *Here
        Result$ + PeekS(*Here + MemorySize(*HW) - 1, 7, #PB_UTF8) + " in " + Str(EndTime - StartTime) + "ms" + #LF$
        Summ + (EndTime - StartTime)
      EndIf
    Until Not *Here
    Result$ + "Summ: " + Str(Summ) + "ms" + #LF$
    
    FreeMemory(*Memory)
  EndIf
  FreeMemory(*HW)
EndIf

EnableDebugger

Debug Result$

Re: Find a string in *Memory

Posted: Sun Mar 31, 2024 1:31 am
by idle
you could try BMsearch.
positives skips through input by patten length
drawbacks you need to tell it the sizes
you can call it repeatedly with pos+1

Code: Select all

;BMSearch 
;idle pb 6.10 

EnableExplicit 

Structure ara
  a.a[0] 
EndStructure

Procedure BMsearch(*pinput,inlen,*pat.ara,palen,pos=0,mode=#PB_Unicode)
  
  Protected i,t,skip,*input.ara = *pinput 
  Static Dim skiptable(256)
  
  If mode = #PB_Unicode
    pos << 1 
  EndIf   
  
  inlen-pos 
  *input+pos 
  
  If pos = 0  
    For i = 0 To 255
      Skiptable(i) = -1;
    Next  
    
    t= palen-1
    For i = 0 To t
      skiptable(*pat\a[i]) = i
    Next     
  EndIf 
  
  i=0
  skip=0
  While (skip <= (inlen - palen))
    i = palen - 1;
    
    While (i >= 0 And *pat\a[i] = *input\a[skip + i])
      i-1 
    Wend   
    
    If (i < 0) 
      If mode = #PB_Unicode 
        ProcedureReturn ((skip+pos) >> 1)  
      Else 
        ProcedureReturn skip + pos 
      EndIf 
        
;       If palen < inlen ;if you want to continue search to add to a list rather than return  
;         skip + palen - Skiptable(*input\a[skip+palen]) 
;       Else 
;         skip + 1;
;       EndIf 
      
    Else
      t = i - Skiptable(*input\a[skip+i])   
      If t > 1 
        skip + t 
      Else 
        skip + 1
      EndIf 
    EndIf   
    
  Wend  
    
EndProcedure 

Global Input.s =  "thequickbrownfoxjumpsoverthelazydogthequickbrownfoxjumpsoverthelazydog"
Global Pat.s =  "quick"
Global pos 

pos = BMsearch(@input,StringByteLength(input),@pat,StringByteLength(pat)) 
Debug pos 
pos = BMsearch(@input,StringByteLength(input),@pat,StringByteLength(pat),pos+1) 
Debug pos 
pos = BMsearch(@input,StringByteLength(input),@pat,StringByteLength(pat),pos+1) 
Debug pos



Re: Find a string in *Memory

Posted: Sun Mar 31, 2024 8:56 am
by Little John

Re: Find a string in *Memory

Posted: Sun Mar 31, 2024 12:06 pm
by infratec
@Little John

I also used first CompareMemory() but ... it is much slower if you have to call a subroutine in your loop.