Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 4:40 pm
by highend
@mk-soft

Wow!

At most 20 ms, depending on the input string and the size defined for the fixed-length string, and down to 11 ms (e.g. if the size is limited to 4) ^^

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 4:42 pm
by Oso
This reduces the time to around 15-20 ms for me, and it sometimes completes in less than that. You need to allocate some memory and pass a pointer to it as workspace. This reduces the work, since there is no string handling until the final string is returned. :D

Code:

OpenConsole()
EnableExplicit

Procedure.s GetTrailingNumbersFromString8(*StrPtr.Character, *NewStr.Character)
  
  Protected *NewPtr.Character = *NewStr                                 ; Set moving pointer to start of buffer
  
  While *StrPtr\c                                                       ; Loop until the terminating null character
    Select *StrPtr\c                                                    ; Check the current (non-zero) character
      Case '0' To '9'                                                   ; Numeric range 0 to 9 (character codes 48 to 57)
        *NewPtr\c = *StrPtr\c                                           ; Copy into buffer at *NewPtr
        *NewPtr + SizeOf(Character)                                     ; Advance buffer position by one character
      Default
        *NewPtr = *NewStr                                               ; Reset buffer position to start
    EndSelect

    *StrPtr + SizeOf(Character)                                         ; Next character to evaluate
  Wend
  
  ProcedureReturn PeekS(*NewStr, (*NewPtr - *NewStr) / SizeOf(Character)) ; Return only the characters written to the buffer
  
EndProcedure

Define.s demoText = "123Caption86"
Define loop.i
Define *NewStr = AllocateMemory(100)
Define t.q     = ElapsedMilliseconds()

For loop.i = 1 To 100000
  GetTrailingNumbersFromString8(@demoText, *NewStr)
Next

PrintN("Time = " + Str(ElapsedMilliseconds() - t.q) + "ms")
PrintN("Result = " + GetTrailingNumbersFromString8(@demoText, *NewStr))

FreeMemory(*NewStr)

Print("Press <ENTER> : ")
Input()
EDIT: Simplified by removing result.s; this now reduces the time to less than 10 ms.

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 4:53 pm
by mk-soft
@Oso
Protected result.s{1024} is built on the stack, and the generated ASM does not call a string function for it.

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:09 pm
by mk-soft
Optimization 8 with static string buffer

Update 2

Code:

;-TOP by mk-soft

CompilerIf #PB_Compiler_Thread
  ; Static String Buffer Threadsafe
  Threaded StringBuffer.s{2048}
CompilerElse
  ; Static String Buffer
  Global StringBuffer.s{2048}
CompilerEndIf

Procedure.s GetNumbersFromString8(*StrPtr.Character)
  Protected *result.Character
  
  *result = @StringBuffer
  While *StrPtr\c
    Select *StrPtr\c
      Case '0' To '9'
        *result\c = *StrPtr\c
        *result + SizeOf(Character)
      Default
        *result = @StringBuffer
    EndSelect
    *StrPtr + SizeOf(Character)
  Wend
  *result\c = #Null
  ProcedureReturn StringBuffer
  
EndProcedure

Define.s demoText = "1234Caption64"
r1.s = GetNumbersFromString8(@demoText)
Debug r1

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:20 pm
by highend
@mk-soft

Two questions to fully understand what you're doing there:

Code:

      Default
        *result = @result
You're setting the pointer back to the start of the string here?

Let's say the demo string is: 25468test123
Then "25468" is already in the result
The current char is now 't', the pointer is moved back to the start

The loop continues and "123" is captured
The result still contains "12368" because the "68" is a leftover from the leading numbers stored previously

And now:

Code:

*result\c = #Null
You're adding the terminating 0 character (in the 4th position, directly after "123"), which finally limits our result to only "123"

Correct?
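
The walk-through above can be reproduced with a small sketch. It uses the same "25468test123" example and the same loop shape as the posted procedure, but the variable names (buffer, source, *src, *out) and the buffer size of 16 are assumptions made only for this illustration. It prints the buffer once before and once after the terminating 0 is written.

```purebasic
EnableExplicit

Define buffer.s{16}                     ; fixed-length workspace (size chosen for this demo only)
Define source.s = "25468test123"
Define *src.Character = @source         ; read position
Define *out.Character = @buffer         ; write position

While *src\c
  Select *src\c
    Case '0' To '9'
      *out\c = *src\c                   ; store the digit
      *out + SizeOf(Character)
    Default
      *out = @buffer                    ; non-digit: jump back to the buffer start
  EndSelect
  *src + SizeOf(Character)
Wend

Debug PeekS(@buffer)                    ; 12368 -> "68" is left over from "25468"
*out\c = #Null                          ; terminate directly after the last digit written
Debug PeekS(@buffer)                    ; 123
```

The old digits are never erased; the final terminator simply cuts them off.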

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:23 pm
by Oso
mk-soft wrote: Sat Dec 30, 2023 5:09 pm Optimization 8 with static string buffer

Update
Definitely simpler. Is there an advantage to Threaded over Global? They seem to perform the same, though it is difficult to tell exactly because each test run varies. The allocated-memory version is the fastest; I'm getting < 10 ms for that.

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:25 pm
by mk-soft
You got it exactly right.

Can you show your runtime measurement and tell us how method 8 performs? Once without and once with the threadsafe option.

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:29 pm
by mk-soft
Oso wrote: Sat Dec 30, 2023 5:23 pm
mk-soft wrote: Sat Dec 30, 2023 5:09 pm Optimization 8 with static string buffer

Update
Definitely simpler. Is there an advantage to Threaded over Global? They seem to perform the same, though it is difficult to tell exactly because each test run varies. The allocated-memory version is the fastest; I'm getting < 10 ms for that.
I have added a second update that uses the threadsafe compiler option.
Threaded is only needed if you work with threads. Each thread then gets its own string buffer, which makes the function threadsafe.
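
To illustrate the "own copy per thread" point, here is a minimal sketch. It deliberately uses a plain integer instead of the string buffer to stay short, and the names PerThread, Result and Worker are hypothetical, not taken from the code in this thread. If you use strings inside threads, the threadsafe compiler option should be enabled as well.

```purebasic
EnableExplicit

Threaded PerThread.i                ; every thread gets its own independent copy
Global Dim Result.i(1)              ; slots 0 and 1, each written by exactly one worker

Procedure Worker(index.i)
  Protected n.i
  For n = 1 To 1000
    PerThread + index + 1           ; changes only this thread's copy
  Next
  Result(index) = PerThread
EndProcedure

Define t0.i = CreateThread(@Worker(), 0)
Define t1.i = CreateThread(@Worker(), 1)
WaitThread(t0)
WaitThread(t1)

Debug Result(0)                     ; 1000
Debug Result(1)                     ; 2000
Debug PerThread                     ; 0 -> the main thread's copy was never touched
```

With Global instead of Threaded, both workers would add to the same variable and the two results would depend on how the threads interleave.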

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:36 pm
by highend
Ok, thanks!

About 13 ms for both variants (with or without "Create threadsafe executable"). No real difference.

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 5:48 pm
by mk-soft
Thank you ;)

Once again 4 times faster than the other methods. And threadsafe :mrgreen:

Playing with code in PureBasic is fun 8)

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 7:00 pm
by SMaag
We had a similar discussion about removing non-word characters in the summer of this year.

As an alternative, here is my final code from that discussion, which is very easy to adapt to whichever characters you want to keep.
It is a pure pointer version that removes the characters directly in the original string, so it does not reallocate the string memory
and does not copy the string. This is an advantage if you deal with millions of short strings.

But anyway, the mk-soft version is, like everything from mk-soft, very professional, and I guess it is the better solution for your case!

Code:

 Procedure RemoveNonWordChars(*String.Character)
    ; ============================================================================
    ; NAME: RemoveNonWordChars
    ; DESC: Attention! This is a pointer version! Be sure to call it with a
    ; DESC: correct String-Pointer
    ; DESC: Removes NonWord Characters from the String
    ; DESC: The String will be shorter after
    ; DESC: (question at PB-Forum: https://www.purebasic.fr/english/viewtopic.php?t=82139)
    ; VAR(*String) : Pointer to String
    ; RET: - 
    ; ============================================================================
    
    Protected *pWrite.Character = *String
  	
    Macro RemoveNonWordChars_KeepChar()
    	If *pWrite <> *String     ; if  WritePosition <> ReadPosition
    		*pWrite\c = *String\c   ; Copy the Character from ReadPosition to WritePosition => compacting the String
    	EndIf
    	*pWrite + SizeOf(Character) ; set new Write-Position 
    EndMacro
    
    If Not *String
      ProcedureReturn
    EndIf
    
  	While *String\c     ; While Not NullChar
  	  
  	  Select *String\c
  	      
        ; ----------------------------------------------------------------------
        ; Characters to keep
        ; ----------------------------------------------------------------------             
  	      
        ; If we check for the most probable characters first, we speed up the operation
        ; because we minimize the number of checks to do!
       ; Case 'a' To 'z'   ; keep  a to z
        ;  RemoveNonWordChars_KeepChar()		; local Macro _KeepChar()
          
       ; Case 'A' To 'Z'   ; keep  A to Z
       ;   RemoveNonWordChars_KeepChar()				
                 
   	    Case '0' To '9'   ; keep 0 to 9
          RemoveNonWordChars_KeepChar()				
          
        ; Case '_'          ; keep '_'
        ;  RemoveNonWordChars_KeepChar()				
                        
      EndSelect
      
      *String + SizeOf(Character) ; Set Pointer to NextChar
  		
    Wend
  	
  	; If *pWrite is not at the end of the original *String,
  	; we removed some characters and must write a 0-termination as the new end of string
  	If *String <> *pWrite
  		*pWrite\c = 0
  	EndIf
  
  EndProcedure
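
A short usage sketch of the in-place idea, assuming a reduced, hypothetical procedure named KeepDigitsInPlace() that follows the same read-pointer/write-pointer pattern but is not the procedure posted above. Note that this kind of filter keeps every digit in the string, not only the trailing ones, so the result differs from the trailing-number functions in this thread.

```purebasic
EnableExplicit

; Hypothetical, reduced variant: compacts the string in place, keeping only 0-9
Procedure KeepDigitsInPlace(*Read.Character)
  Protected *Write.Character = *Read

  If *Read = 0
    ProcedureReturn
  EndIf

  While *Read\c
    If *Read\c >= '0' And *Read\c <= '9'
      *Write\c = *Read\c                ; copy the kept character to the write position
      *Write + SizeOf(Character)
    EndIf
    *Read + SizeOf(Character)
  Wend

  *Write\c = 0                          ; new end of string
EndProcedure

Define text.s = "1234Caption64"
KeepDigitsInPlace(@text)
Debug text                              ; 123464
Debug Len(text)                         ; 6
```

Because the string is overwritten, pass a copy if the original text is still needed afterwards.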


Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 7:19 pm
by AZJIO

Code:

EnableExplicit
DisableDebugger

CompilerIf #PB_Compiler_Thread
  ; Static String Buffer Threadsafe
	Threaded StringBuffer.s{2048}
CompilerElse
  ; Static String Buffer
	Global StringBuffer.s{2048}
CompilerEndIf

; mk-soft
Procedure.s GetNumbersFromString8(*StrPtr.Character)
	Protected *result.Character

	*result = @StringBuffer
	While *StrPtr\c
		Select *StrPtr\c
			Case '0' To '9'
				*result\c = *StrPtr\c
				*result + SizeOf(Character)
			Default
				*result = @StringBuffer
		EndSelect
		*StrPtr + SizeOf(Character)
	Wend
	*result\c = #Null
	ProcedureReturn StringBuffer

EndProcedure

; AZJIO (corrupts the original string)
Procedure.s GetNumbersFromString9(*Source.Character)
	Protected *Ptr, flgStart = 1

	While *Source\c
		If *Source\c >= '0' And *Source\c <= '9'
			If flgStart
				*Ptr = @*Source\c
				flgStart = 0
			EndIf
		Else
			flgStart = 1
		EndIf
		*Source + SizeOf(Character)
	Wend

	If Not *Ptr
		ProcedureReturn ""
	EndIf

	*Source = *Ptr
	While *Source\c
		If *Source\c > '9' Or *Source\c < '0'
			*Source\c = #Null
			Break
		EndIf
		*Source + SizeOf(Character)
	Wend
	ProcedureReturn PeekS(*Ptr)

EndProcedure

; AZJIO
Procedure.s GetNumbersFromString10(*Source.Character)
	Protected *Ptr, flgStart = 1, length

	While *Source\c
		If *Source\c >= '0' And *Source\c <= '9'
			If flgStart
				*Ptr = @*Source\c
				flgStart = 0
			EndIf
		Else
			flgStart = 1
		EndIf
		*Source + SizeOf(Character)
	Wend

	If Not *Ptr
		ProcedureReturn ""
	EndIf

	*Source = *Ptr
	While *Source\c
		If *Source\c >= '0' And *Source\c <= '9'
			length + 1
		Else
			Break
		EndIf
		*Source + SizeOf(Character)
	Wend
	ProcedureReturn PeekS(*Ptr, length)

EndProcedure


Define demoText.s, r1.s, r2.s, r3.s, StartTime, i, t1, t2, t3

demoText = "1234Caption64"

StartTime = ElapsedMilliseconds()
For i = 0 To 1000000
	r1.s = GetNumbersFromString8(@demoText)
Next
t1 = ElapsedMilliseconds() - StartTime

demoText = "1234Caption64"

StartTime = ElapsedMilliseconds()
For i = 0 To 1000000
	r2.s = GetNumbersFromString9(@demoText)
Next
t2 = ElapsedMilliseconds() - StartTime

demoText = "1234Caption64"

StartTime = ElapsedMilliseconds()
For i = 0 To 1000000
	r3.s = GetNumbersFromString10(@demoText)
Next
t3 = ElapsedMilliseconds() - StartTime

EnableDebugger
Debug t1
Debug t2
Debug t3
Debug r1
Debug r2
Debug r3

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 8:22 pm
by highend
@SMaag

Thanks for adding this to the discussion

I do not plan to test it for speed but it adds some additional value (so I've added it to the string functions I use) :D

Re: Get only trailing numbers from a string?

Posted: Sat Dec 30, 2023 8:37 pm
by AZJIO
I have changed the behavior of this function compared with the previous ones. Previously, I looked for the last number in the string (it may be in the middle of the string, as long as no other numbers follow it). Now I look for a number at the very end of the string (no other characters follow the number).

Code:

EnableExplicit
DisableDebugger

Procedure.s GetNumbersFromString11(*Source.Character, length)
	Protected *i.Character, *SourceEnd.Character
	*SourceEnd =  *Source + SizeOf(Character) * (length - 1)
	For *i = *SourceEnd To *Source Step  - SizeOf(Character)
		If *i\c > '9' Or *i\c < '0'
			*i + SizeOf(Character)
			ProcedureReturn PeekS(@*i\c)
		EndIf
	Next
	ProcedureReturn PeekS(@*Source\c)
EndProcedure


Define demoText.s, r1.s, StartTime, i, t1

demoText = "1234Caption64"

StartTime = ElapsedMilliseconds()
For i = 0 To 1000000
	r1.s = GetNumbersFromString11(@demoText, Len(demoText))
Next
t1 = ElapsedMilliseconds() - StartTime

demoText = "1234Caption64"

EnableDebugger
Debug t1
Debug r1
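
The difference between "last number in the string" and "number at the end of the string" only shows up when text follows the final digits. The sketch below contrasts the two behaviors; LastNumber() and TrailingNumber() are hypothetical helper names written for this comparison, not the functions posted in this thread.

```purebasic
EnableExplicit

; Hypothetical helper: the last run of digits, wherever it sits in the string
Procedure.s LastNumber(*p.Character)
  Protected *last.Character, lastLen.i, inRun.i
  While *p\c
    If *p\c >= '0' And *p\c <= '9'
      If Not inRun                      ; a new run of digits starts here
        *last = *p
        lastLen = 0
        inRun = #True
      EndIf
      lastLen + 1
    Else
      inRun = #False
    EndIf
    *p + SizeOf(Character)
  Wend
  If *last
    ProcedureReturn PeekS(*last, lastLen)
  EndIf
  ProcedureReturn ""
EndProcedure

; Hypothetical helper: digits only if nothing else follows them
Procedure.s TrailingNumber(*p.Character)
  Protected *start.Character
  While *p\c
    If *p\c >= '0' And *p\c <= '9'
      If *start = 0
        *start = *p
      EndIf
    Else
      *start = 0                        ; a non-digit discards the current run
    EndIf
    *p + SizeOf(Character)
  Wend
  If *start
    ProcedureReturn PeekS(*start)
  EndIf
  ProcedureReturn ""
EndProcedure

Define a.s = "abc12def"
Define b.s = "1234Caption64"
Debug LastNumber(@a)                    ; 12
Debug TrailingNumber(@a)                ; (empty string)
Debug LastNumber(@b)                    ; 64
Debug TrailingNumber(@b)                ; 64
```

For inputs that end in digits, such as the demo string used throughout the thread, both variants return the same result.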

Re: Get only trailing numbers from a string?

Posted: Sun Dec 31, 2023 12:20 pm
by mk-soft
Ok,
without string buffer. I don't know why they did it the same way as usual ...

Last update ;)

Code:

;-TOP by mk-soft

Procedure.s GetNumbersFromString(*StrPtr.Character)
  Protected *StartPtr.Character
  
  While *StrPtr\c
    Select *StrPtr\c
      Case '0' To '9'
        If *StartPtr = 0
          *StartPtr = *StrPtr
        EndIf
      Default
        If *StartPtr
          *StartPtr = 0
        EndIf
    EndSelect
    *StrPtr + SizeOf(Character)
  Wend
  
  If *StartPtr
    ProcedureReturn PeekS(*StartPtr)
  Else
    ProcedureReturn ""
  EndIf
  
EndProcedure

Define.s demoText = "1234Caption64"
r1.s = GetNumbersFromString(@demoText)
Debug r1