Get only trailing numbers from a string?

highend
Enthusiast
Posts: 162
Joined: Tue Jun 17, 2014 4:49 pm

Post by highend »

I can easily do this with a regex, but I'd like to do it without one.

This one gets all numbers from the whole string:

Code: Select all

Procedure.s GetNumbersFromString(*StrPtr.Character)
  Protected.s result

  While *StrPtr\c
    Select *StrPtr\c
      Case 48 To 57
        result + Chr(*StrPtr\c)
    EndSelect

    *StrPtr + 2
  Wend

  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption64"
Debug GetNumbersFromString(@demoText) ; Output = 1264
In my use case I only want the trailing numbers (e.g. 64 from the example above).

How do I traverse the string in reverse (in the same fast manner as it's done in the existing function)?
Oso
Enthusiast
Posts: 595
Joined: Wed Jul 20, 2022 10:09 am

Post by Oso »

Hello Highend, you could just add two lines that reset the output whenever a non-numeric character is found, as below. If you want to scan in reverse, you would first have to find the terminating null character in order to know where to start, so I think it makes more sense to avoid doing that.

Code: Select all

Procedure.s GetNumbersFromString(*StrPtr.Character)
  
  Protected.s result

  While *StrPtr\c
    Select *StrPtr\c
      Case 48 To 57
        result + Chr(*StrPtr\c)
      Default ;                <----- Addition
        result = "" ;          <----- Addition
    EndSelect

    *StrPtr + 2
  Wend

  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption64"
Debug GetNumbersFromString(@demoText) ; Output = 64
Marc56us
Addict
Posts: 1600
Joined: Sat Feb 08, 2014 3:26 pm

Post by Marc56us »

Quick and dirty sample. Basic only, no pointers, so easy to understand for beginners :wink:

Code: Select all

Define.s demoText = "12Caption64"

Debug "--- " + demoText
demoText = ReverseString(demoText)
Debug "--- " + demoText

For i = 1 To Len(demoText)
    nb = Val(Mid(demoText, i, 1))
    If nb > 0 And nb < 9
        nb$ + Str(nb)
    Else
        Break
    EndIf
Next i

nb = Val(ReverseString(nb$))
Debug "Value : " + nb

Code: Select all

--- 12Caption64
--- 46noitpaC21
Value : 64
You can also read the string from the end to the beginning and then reverse the result.

Post by Oso »

Marc56us wrote: Sat Dec 30, 2023 9:54 am demoText = ReverseString(demoText)
Well, that's one I hadn't seen before, Marc56us :o

Post by Marc56us »

Oso wrote: Sat Dec 30, 2023 10:28 am
Marc56us wrote: Sat Dec 30, 2023 9:54 am demoText = ReverseString(demoText)
Well, that's one I hadn't seen before, Marc56us :o
I've got a big secret: the F1 key :mrgreen:
I've read (and reread) ALL the help functions (out of curiosity) except the 3D functions, which I don't use.
I've tested all examples of all functions.
:wink:

Post by Oso »

Marc56us wrote: Sat Dec 30, 2023 10:45 am I've got a big secret: the F1 key :mrgreen:
I've read (and reread) ALL the help functions (out of curiosity, and tested all examples) except the 3D functions, which I don't use. :wink:
Ah, I had a feeling there might be something wrong with my keyboard :D


Post by Oso »

But if you really do want to scan in reverse, Highend (and because it makes me feel like a grown-up, using pointers... :lol: ) you could do it like this. Maybe there's a quicker way I haven't thought of. Since you're walking backwards, each character has to be prepended to 'result' instead of appended.

Code: Select all

Procedure.s GetNumbersFromString(*StrPtr.Character)
  
  Protected.s result
  Protected  *LastPtr.Character = *StrPtr
  
  While *LastPtr\c                                                      ; Start last ptr at the beginning, looking for zero byte
    *LastPtr + 2                                                        ; Next char.
  Wend
  *LastPtr - 2                                                          ; Adjust because we're now on the zero byte

  While *LastPtr >= *StrPtr                                             ; Loop again but from the last character, in reverse, until start
    Select *LastPtr\c
      Case 48 To 57
        result = Chr(*LastPtr\c) + result                               ; Add the character to the start (not the end)
      Default
        Break                                                           ; Drop out the loop, as we've found a non-numeric
    EndSelect

    *LastPtr - 2                                                        ; Prev. character
  Wend

  ProcedureReturn result
  
EndProcedure

Define.s demoText = "12Caption64"
Debug GetNumbersFromString(@demoText) ; Output = 64

Post by Oso »

And with due regard for Marc56us's nifty ReverseString(), you could use that too: append each digit and reverse the result once at the end. It might be quicker, because we wouldn't be shuffling the string on every iteration.

Code: Select all

        result + Chr(*LastPtr\c)                                        ; Add the character
                .
                .
                .
  ProcedureReturn ReverseString(result)                                 ; Reverse the output
 

Post by highend »

Thanks for the input!

@Marc56us
Sorry, your version won't work. The digit 0 is treated like any non-numeric character, and <= 9 would be required to catch the 9.

@Oso
I've timed all variants with a loop of 100,000 runs per function (without the debugger, of course). The results are:

Code: Select all

Result (function 1): 6481 [92.197 ms]
Result (function 2): 6481 [106.495 ms]
Result (function 3): 6481 [133.019 ms]
Result (function 4): 6481 [42.623 ms]
Result (function 5): 6481 [45.636 ms]
So walking to the end of the string and then scanning it in reverse (via pointers) is currently the fastest method.
Variant 5 (appending to the output string and returning the reversed string at the end, instead of prepending) is minimally slower.
That might change if the test string were significantly longer, though...

The only way to speed up functions 4/5 further might be to advance the pointer to the last character of the string without a loop (if that's even possible) before walking over the characters in reverse order.

Code: Select all

Define.i hRegex
hRegex = CreateRegularExpression(#PB_Any, "\d+$", #PB_RegularExpression_MultiLine)

Procedure.s GetTrailingNumbersFromString1(hRegex, string.s)
  Protected.s result

  If ExamineRegularExpression(hRegex, string)
    While NextRegularExpressionMatch(hRegex)
      result = RegularExpressionMatchString(hRegex)
    Wend
  EndIf

  ProcedureReturn result
EndProcedure


Procedure.s GetTrailingNumbersFromString2(string.s)
  Protected.i i
  Protected.s char, result

  For i = Len(string) To 1 Step -1
    char = Mid(string, i, 1)
    Select char
      Case "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"
        result + char
      Default
        Break
    EndSelect
  Next

  ProcedureReturn ReverseString(result)
EndProcedure


Procedure.s GetTrailingNumbersFromString3(*StrPtr.Character)
  Protected.s result

  While *StrPtr\c
    Select *StrPtr\c
      Case 48 To 57 ; 0 - 9 as ascii
        result + Chr(*StrPtr\c)

      Default
        result = ""
    EndSelect

    *StrPtr + 2
  Wend

  ProcedureReturn result
EndProcedure


Procedure.s GetTrailingNumbersFromString4(*StrPtr.Character)
  Protected.s result
  Protected  *LastPtr.Character = *StrPtr

  While *LastPtr\c                                                      ; Start last ptr at the beginning, looking for zero byte
    *LastPtr + 2                                                        ; Next char.
  Wend
  *LastPtr - 2                                                          ; Adjust because we're now on the zero byte

  While *LastPtr >= *StrPtr                                             ; Loop again but from the last character, in reverse, until start
    Select *LastPtr\c
      Case 48 To 57
        result = Chr(*LastPtr\c) + result                               ; Add the character to the start (not the end)
      Default
        Break                                                           ; Drop out the loop, as we've found a non-numeric
    EndSelect

    *LastPtr - 2                                                        ; Prev. character
  Wend

  ProcedureReturn result
EndProcedure


Procedure.s GetTrailingNumbersFromString5(*StrPtr.Character)
  Protected.s result
  Protected  *LastPtr.Character = *StrPtr

  While *LastPtr\c                                                      ; Start last ptr at the beginning, looking for zero byte
    *LastPtr + 2                                                        ; Next char.
  Wend
  *LastPtr - 2                                                          ; Adjust because we're now on the zero byte

  While *LastPtr >= *StrPtr                                             ; Loop again but from the last character, in reverse, until start
    Select *LastPtr\c
      Case 48 To 57
        result + Chr(*LastPtr\c)                                        ; Append the character; the whole result is reversed once at the end
      Default
        Break                                                           ; Drop out the loop, as we've found a non-numeric
    EndSelect

    *LastPtr - 2                                                        ; Prev. character
  Wend

  ProcedureReturn ReverseString(result)
EndProcedure
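
On the question of reaching the last character without a hand-written loop: a minimal sketch, assuming PureBasic's MemoryStringLength() (which returns the length in characters) and PeekS(). PureBasic strings are null-terminated, so the length call still has to scan for the terminator internally; it just does so in library code. The procedure name GetTrailingNumbersFast is hypothetical and was not part of the timings above.

```purebasic
; Hypothetical variant: jump to the end with MemoryStringLength(), walk back
; over the digits, then copy the tail once with PeekS() (no concatenation).
Procedure.s GetTrailingNumbersFast(*StrPtr.Character)
  Protected *Pos.Character

  If *StrPtr = 0 ; @ of an empty string can be a null pointer
    ProcedureReturn ""
  EndIf

  ; Point at the terminating null character
  *Pos = *StrPtr + MemoryStringLength(*StrPtr) * SizeOf(Character)

  While *Pos > *StrPtr
    *Pos - SizeOf(Character) ; step back one character
    If *Pos\c < '0' Or *Pos\c > '9'
      *Pos + SizeOf(Character) ; non-digit: the number starts right after it
      Break
    EndIf
  Wend

  ProcedureReturn PeekS(*Pos) ; reads up to the terminating null
EndProcedure

Define.s demoText = "12Caption64"
Debug GetTrailingNumbersFast(@demoText) ; Output = 64
```

Because the digits are copied in one go, nothing is prepended or reversed; whether that is actually faster than functions 4/5 would have to be measured.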

Post by Oso »

highend wrote: Sat Dec 30, 2023 11:58 am

Code: Select all

Result (function 1): 6481 [92.197 ms]
Result (function 2): 6481 [106.495 ms]
Result (function 3): 6481 [133.019 ms]
Result (function 4): 6481 [42.623 ms]
Result (function 5): 6481 [45.636 ms]
Very useful to see these comparisons. I'm surprised at the disparity in time between (3) and (4/5), because there isn't much difference in the loop logic. It suggests that checking for a non-zero \c Unicode character (3) is notably slower than comparing the two pointers (4/5). I think (3) ought to be quicker though, so it's perhaps worth another test.

I suppose that in (2) the Len() call in the For/Next loop doesn't matter from a performance point of view, because it's only the start value of the counter rather than the end value, and therefore doesn't need to be evaluated on every loop iteration.

Have you tried compiling in C with optimisation? I've recently been performance testing with optimisation and it just blows everything away.

Anyway, interesting to see this and it really shows that whatever is best for the requirement, is usually the right choice. I would use (2) for an end-user application, similar to Marc56us's, because ease of maintenance by others is necessary and speed is not so important. Personally, I would be very unlikely to use Regex.

Post by highend »

I've used v6.04 LTS for the tests. Normal backend, x64

Compiling it with the C-Backend [with [x] Optimize generated code] makes most functions (slightly) slower :shock:
But the second function suffers the most...
The only one that gains a little bit of speed is the regex one

Code: Select all

Result (function 1): 6481 [85.745 ms]
Result (function 2): 6481 [132.206 ms]
Result (function 3): 6481 [139.788 ms]
Result (function 4): 6481 [46.160 ms]
Result (function 5): 6481 [51.860 ms]
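
The timing code itself is not shown in the thread. A minimal sketch of how such a measurement could be set up, assuming ElapsedMilliseconds() as the timer and "12Caption64" as the test string (both are assumptions; the fractional values above suggest a higher-resolution timer was used). The function under test is the forward-scanning variant, and the name GetTrailingNumbers is only illustrative.

```purebasic
EnableExplicit

; Function under test: the forward-scanning variant from this thread.
Procedure.s GetTrailingNumbers(*StrPtr.Character)
  Protected.s result

  While *StrPtr\c
    Select *StrPtr\c
      Case '0' To '9'
        result + Chr(*StrPtr\c)
      Default
        result = "" ; a non-digit discards what was collected so far
    EndSelect
    *StrPtr + SizeOf(Character)
  Wend

  ProcedureReturn result
EndProcedure

#Runs = 100000 ; same loop count as mentioned in the thread

Define.s demoText = "12Caption64" ; assumed test string
Define.s result
Define.q startTime, elapsed
Define.i i

startTime = ElapsedMilliseconds()
For i = 1 To #Runs
  result = GetTrailingNumbers(@demoText)
Next
elapsed = ElapsedMilliseconds() - startTime

; Debug prints nothing when the debugger is off, so show the result in a dialog.
MessageRequester("Timing", "Result: " + result + " [" + Str(elapsed) + " ms]")
```

Compile without the debugger, as in the thread, otherwise the debugger overhead dominates the measurement.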

Post by Marc56us »

We don't have the same test data, nor the same machine. So I suggest you try this variant (which can be improved).
(Yes, 0 and 9 are OK now.)

Code: Select all

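Procedure$ Get_Last_Number(Txt$)
    Txt$ = ReverseString(Txt$)   
    For i = 1 To Len(Txt$)
        nb = Asc(Mid(Txt$, i, 1))
        If nb >= 48 And nb <= 57   ; character codes of "0" to "9"
            nb$ + Chr(nb)
        Else
            Break                  ; first non-digit ends the trailing number
        EndIf
    Next i
    ; Returning after the loop also covers strings that consist of digits only
    ProcedureReturn ReverseString(Txt$) + " => " + ReverseString(nb$)
EndProcedure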

Debug Get_Last_Number("12Caption64")
Debug Get_Last_Number("12Caption064")
Debug Get_Last_Number("12Caption0123456789")

; 12Caption64 => 64
; 12Caption064 => 064
; 12Caption0123456789 => 0123456789
PS. Instead of using ReverseString twice, we can process the string from right to left

Post by highend »

@Marc56us
The slightly optimised version below takes about 75 ms (again with 100k loops).
Your last posted version takes about 85 ms.

Code: Select all

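Procedure.s GetTrailingNumbersFromString6(string.s)
  Protected.i i, num
  Protected.s result

  For i = Len(string) To 1 Step -1
    num = Asc(Mid(string, i, 1))
    If num >= 48 And num <= 57 ; character codes of "0" to "9"
      result + Chr(num)
    Else
      Break ; first non-digit ends the trailing number
    EndIf
  Next

  ; Return after the loop so that all-digit strings are handled too
  ProcedureReturn ReverseString(result)
EndProcedure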
boddhi
Enthusiast
Posts: 524
Joined: Mon Nov 15, 2010 9:53 pm

Post by boddhi »

Marc56us wrote: Quick and dirty sample. Basic only, no pointers, so easy to understand for beginners :wink:
Even dirtier? In one line:

Code: Select all

Debug ReverseString(Str(Val(ReverseString("12Caption64"))))
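Drawback: zeros at the end of the number get lost (e.g. "12Caption640" gives 64), because after the first ReverseString() they are leading zeros, which Val() drops...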
mk-soft
Always Here
Posts: 6224
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Post by mk-soft »

I don't think numbers have more than 1000 digits, so a fixed-length result buffer written through a pointer avoids the string concatenation:

Code: Select all

Procedure.s GetNumbersFromString7(*StrPtr.Character)
  
  Protected result.s{1024}
  Protected *result.Character
  
  *result = @result
  While *StrPtr\c
    Select *StrPtr\c
      Case '0' To '9'
        *result\c = *StrPtr\c
        *result + SizeOf(Character)
      Default
        *result = @result
    EndSelect
    *StrPtr + SizeOf(Character)
  Wend
  *result\c = #Null
  ProcedureReturn result
  
EndProcedure

Define.s demoText = "1234Caption64"
Debug GetNumbersFromString7(@demoText)