Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 9:18 am
by highend
I can easily do this with a regex, but I'd like to do it without one.
This procedure extracts every digit from the whole string:
Code: Select all
Procedure.s GetNumbersFromString(*StrPtr.Character)
  Protected.s result
  While *StrPtr\c
    Select *StrPtr\c
      Case 48 To 57
        result + Chr(*StrPtr\c)
    EndSelect
    *StrPtr + 2
  Wend
  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption64"
Debug GetNumbersFromString(@demoText) ; Output = 1264
In my use case I only want the trailing numbers (e.g. 64 from the example above).
How do I traverse the string in reverse, as quickly as the existing function walks it forwards?
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 9:51 am
by Oso
Hello highend, you could simply add two lines that reset the output whenever a non-numeric character is found, as below. If you want to scan in reverse, you would first have to find the terminating null character to know where to start, so I think it makes more sense to avoid that.
Code: Select all
Procedure.s GetNumbersFromString(*StrPtr.Character)
  Protected.s result
  While *StrPtr\c
    Select *StrPtr\c
      Case 48 To 57
        result + Chr(*StrPtr\c)
      Default       ; <----- Addition
        result = "" ; <----- Addition
    EndSelect
    *StrPtr + 2
  Wend
  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption64"
Debug GetNumbersFromString(@demoText) ; Output = 64
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 9:54 am
by Marc56us
Quick and dirty sample. Basic only, no pointers, so easy to understand for beginners
Code: Select all
Define.s demoText = "12Caption64"
Debug "--- " + demoText
demoText = ReverseString(demoText)
Debug "--- " + demoText
For i = 1 To Len(demoText)
  nb = Val(Mid(demoText, i, 1))
  If nb > 0 And nb < 9
    nb$ + Str(nb)
  Else
    Break
  EndIf
Next i
nb = Val(ReverseString(nb$))
Debug "Value : " + nb
Code: Select all
--- 12Caption64
--- 46noitpaC21
Value : 64
You can also read the string from the end to the beginning and then reverse the result.
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 10:28 am
by Oso
Marc56us wrote: Sat Dec 30, 2023 9:54 am
demoText = ReverseString(demoText)
Well, that's one I hadn't seen before, Marc56us

Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 10:45 am
by Marc56us
Oso wrote: Sat Dec 30, 2023 10:28 am
Marc56us wrote: Sat Dec 30, 2023 9:54 am
demoText = ReverseString(demoText)
Well, that's one I hadn't seen before, Marc56us
I've got a big secret: the F1 key
I've read (and reread) ALL the help functions (out of curiosity) except the 3D functions, which I don't use.
I've tested all examples of all functions.

Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 10:51 am
by Oso
Marc56us wrote: Sat Dec 30, 2023 10:45 am
I've got a big secret: the F1 key
I've read (and reread) ALL the help functions (out of curiosity, and tested all examples) except the 3D functions, which I don't use.
Ah, I had a feeling there might be something wrong with my keyboard

Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 11:00 am
by Oso
But if you really do want to count in reverse, highend (and because it makes me feel like a grown-up, using pointers...) you could do it like this. Maybe there's a quicker way I haven't thought of. Since you're reading the digits back to front, you need to prefix 'result' with each character instead of appending.
Code: Select all
Procedure.s GetNumbersFromString(*StrPtr.Character)
  Protected.s result
  Protected *LastPtr.Character = *StrPtr
  While *LastPtr\c ; Start last ptr at the beginning, looking for zero byte
    *LastPtr + 2 ; Next char.
  Wend
  *LastPtr - 2 ; Adjust because we're now on the zero byte
  While *LastPtr >= *StrPtr ; Loop again but from the last character, in reverse, until start
    Select *LastPtr\c
      Case 48 To 57
        result = Chr(*LastPtr\c) + result ; Add the character to the start (not the end)
      Default
        Break ; Drop out the loop, as we've found a non-numeric
    EndSelect
    *LastPtr - 2 ; Prev. character
  Wend
  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption64"
Debug GetNumbersFromString(@demoText) ; Output = 64
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 11:17 am
by Oso
And, with due regard for Marc56us's nifty ReverseString(), you could use that as well: append each digit and reverse the result once at the end. It might be quicker, because we wouldn't be shuffling the string on every iteration.
Code: Select all
result + Chr(*LastPtr\c) ; Add the character
.
.
.
ProcedureReturn ReverseString(result) ; Reverse the output
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 11:58 am
by highend
Thanks for the input!
@Marc56us
Sorry, your version won't work: the digit 0 is treated like any non-numeric character, and the check would need <= 9 to catch the 9.
@Oso
I've timed all variants with a loop of 100,000 runs per function (without the debugger, of course). The results are:
Code: Select all
Result (function 1): 6481 [92.197 ms]
Result (function 2): 6481 [106.495 ms]
Result (function 3): 6481 [133.019 ms]
Result (function 4): 6481 [42.623 ms]
Result (function 5): 6481 [45.636 ms]
So walking to the end of the string and then scanning backwards (via pointers) is currently the fastest method.
Variant 5 (appending the digits and reversing the result once at the end, instead of prepending each one) is marginally slower.
That might change if the test string were significantly longer, though...
The only way to speed up functions 4/5 further might be to jump to the last character of the string (via pointer) without a loop, if that's even possible, before scanning backwards.
Code: Select all
Define.i hRegex
hRegex = CreateRegularExpression(#PB_Any, "\d+$", #PB_RegularExpression_MultiLine)

Procedure.s GetTrailingNumbersFromString1(hRegex, string.s)
  Protected.s result
  If ExamineRegularExpression(hRegex, string)
    While NextRegularExpressionMatch(hRegex)
      result = RegularExpressionMatchString(hRegex)
    Wend
  EndIf
  ProcedureReturn result
EndProcedure

Procedure.s GetTrailingNumbersFromString2(string.s)
  Protected.i i, num
  Protected.s char, result
  For i = Len(string) To 1 Step -1
    char = Mid(string, i, 1)
    Select char
      Case "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"
        result + char
      Default
        Break
    EndSelect
  Next
  ProcedureReturn ReverseString(result)
EndProcedure

Procedure.s GetTrailingNumbersFromString3(*StrPtr.Character)
  Protected.s result
  While *StrPtr\c
    Select *StrPtr\c
      Case 48 To 57 ; 0 - 9 as ascii
        result + Chr(*StrPtr\c)
      Default
        result = ""
    EndSelect
    *StrPtr + 2
  Wend
  ProcedureReturn result
EndProcedure

Procedure.s GetTrailingNumbersFromString4(*StrPtr.Character)
  Protected.s result
  Protected *LastPtr.Character = *StrPtr
  While *LastPtr\c ; Start last ptr at the beginning, looking for zero byte
    *LastPtr + 2 ; Next char.
  Wend
  *LastPtr - 2 ; Adjust because we're now on the zero byte
  While *LastPtr >= *StrPtr ; Loop again but from the last character, in reverse, until start
    Select *LastPtr\c
      Case 48 To 57
        result = Chr(*LastPtr\c) + result ; Add the character to the start (not the end)
      Default
        Break ; Drop out the loop, as we've found a non-numeric
    EndSelect
    *LastPtr - 2 ; Prev. character
  Wend
  ProcedureReturn result
EndProcedure

Procedure.s GetTrailingNumbersFromString5(*StrPtr.Character)
  Protected.s result
  Protected *LastPtr.Character = *StrPtr
  While *LastPtr\c ; Start last ptr at the beginning, looking for zero byte
    *LastPtr + 2 ; Next char.
  Wend
  *LastPtr - 2 ; Adjust because we're now on the zero byte
  While *LastPtr >= *StrPtr ; Loop again but from the last character, in reverse, until start
    Select *LastPtr\c
      Case 48 To 57
        result + Chr(*LastPtr\c) ; Append the character (digits end up in reverse order)
      Default
        Break ; Drop out the loop, as we've found a non-numeric
    EndSelect
    *LastPtr - 2 ; Prev. character
  Wend
  ProcedureReturn ReverseString(result) ; Put the digits back in their original order
EndProcedure
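On the question of reaching the last character without a hand-written loop: one option is to ask MemoryStringLength() for the length and compute the end pointer from it. The following is only a sketch; the procedure name is made up for illustration, and MemoryStringLength() still has to scan for the terminating null internally, so whether it is actually faster than the While loop in functions 4/5 is an assumption that would need to be measured.

```purebasic
; Hypothetical variant of function 4: the forward While loop is replaced
; by a call to MemoryStringLength(). Any speed gain is unverified.
Procedure.s GetTrailingNumbersNoScanLoop(*StrPtr.Character)
  Protected.s result
  Protected.i length
  Protected *LastPtr.Character

  If *StrPtr = 0 ; guard against a null string pointer
    ProcedureReturn ""
  EndIf

  length = MemoryStringLength(*StrPtr) ; length in characters, without the null
  *LastPtr = *StrPtr + (length - 1) * SizeOf(Character) ; last character

  While *LastPtr >= *StrPtr ; scan backwards until the start of the string
    Select *LastPtr\c
      Case '0' To '9'
        result = Chr(*LastPtr\c) + result ; prepend to keep the original order
      Default
        Break ; first non-digit ends the trailing number
    EndSelect
    *LastPtr - SizeOf(Character)
  Wend

  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption64"
Debug GetTrailingNumbersNoScanLoop(@demoText) ; Output = 64
```

For an empty string the length is 0, so *LastPtr starts one character before *StrPtr and the loop body never runs.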
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 12:28 pm
by Oso
highend wrote: Sat Dec 30, 2023 11:58 am
Code: Select all
Result (function 1): 6481 [92.197 ms]
Result (function 2): 6481 [106.495 ms]
Result (function 3): 6481 [133.019 ms]
Result (function 4): 6481 [42.623 ms]
Result (function 5): 6481 [45.636 ms]
Very useful to see these comparisons. I'm surprised at the disparity in time between (3) and (4/5) because there isn't much difference in the loop logic. It probably demonstrates that checking for a non-zero \c unicode character (3) is notably slower than comparing the two pointers (4/5). I think it ought to be quicker though, and perhaps worth another test.
I suppose that in (2) the Len() check inside the For/Next loop perhaps doesn't matter in this case, from a performance point of view, because it's only the start counter, rather than the target, which therefore doesn't need to be checked on every loop iteration.
Have you tried compiling in C with optimisation? I've recently been performance testing with optimisation and it just blows everything away.
Anyway, interesting to see this and it really shows that whatever is best for the requirement, is usually the right choice. I would use (2) for an end-user application, similar to Marc56us's, because ease of maintenance by others is necessary and speed is not so important. Personally, I would be very unlikely to use Regex.
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 12:38 pm
by highend
I've used v6.04 LTS for the tests (normal backend, x64).
Compiling with the C backend (with "Optimize generated code" enabled) makes most functions slightly slower, and the second function suffers the most.
The only one that gains a little speed is the regex one.
Code: Select all
Result (function 1): 6481 [85.745 ms]
Result (function 2): 6481 [132.206 ms]
Result (function 3): 6481 [139.788 ms]
Result (function 4): 6481 [46.160 ms]
Result (function 5): 6481 [51.860 ms]
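The harness behind these timings was not posted. A minimal sketch of how such a measurement could be set up is shown here; the run count comes from the thread, but the test string, the constant name and the stand-in procedure are assumptions. ElapsedMilliseconds() only gives whole milliseconds, so the fractional values above must have come from a finer timer.

```purebasic
; Sketch of a timing harness. #Runs and demoText are assumptions;
; compile and run it with the debugger disabled, as in the thread.
#Runs = 100000

; Stand-in for the function under test (same logic as function 3)
Procedure.s TrailingDigits(*StrPtr.Character)
  Protected.s result
  While *StrPtr\c
    Select *StrPtr\c
      Case '0' To '9'
        result + Chr(*StrPtr\c)
      Default
        result = "" ; a non-digit resets the collected digits
    EndSelect
    *StrPtr + SizeOf(Character)
  Wend
  ProcedureReturn result
EndProcedure

Define.s demoText = "12Caption6481" ; assumed input, chosen to yield 6481
Define.s result
Define.i i, t0, elapsed

t0 = ElapsedMilliseconds()
For i = 1 To #Runs
  result = TrailingDigits(@demoText)
Next
elapsed = ElapsedMilliseconds() - t0

; Debug prints nothing without the debugger, so use a requester instead
MessageRequester("Timing", "Result: " + result + " [" + Str(elapsed) + " ms]")
```

Swapping in another procedure and keeping everything else the same gives comparable numbers on one machine; absolute values will differ between machines and backends.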
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 2:59 pm
by Marc56us
We don't have the same test data, nor the same machine. So I suggest you try this variant (which can be improved).
(Yes, 0 and 9 are OK now.)
Code: Select all
Procedure$ Get_Last_Number(Txt$)
  Protected i, nb, nb$
  Txt$ = ReverseString(Txt$)
  For i = 1 To Len(Txt$)
    nb = Asc(Mid(Txt$, i, 1))
    If nb >= 48 And nb <= 57 ; '0' to '9'
      nb$ + Chr(nb)
    Else
      Break ; first non-digit ends the trailing number
    EndIf
  Next i
  ; Returning after the loop also covers strings that consist of digits only
  ProcedureReturn ReverseString(Txt$) + " => " + ReverseString(nb$)
EndProcedure

Debug Get_Last_Number("12Caption64")
Debug Get_Last_Number("12Caption064")
Debug Get_Last_Number("12Caption0123456789")
; 12Caption64 => 64
; 12Caption064 => 064
; 12Caption0123456789 => 0123456789
PS. Instead of using ReverseString twice, we can process the string from right to left
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 3:13 pm
by highend
@Marc56us
The slightly optimised version below takes about 75 ms (again with 100k loops).
Your last posted version takes about 85 ms.
Code: Select all
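Procedure.s GetTrailingNumbersFromString6(string.s)
  Protected.i i, num
  Protected.s result
  For i = Len(string) To 1 Step -1
    num = Asc(Mid(string, i, 1))
    If num >= 48 And num <= 57 ; '0' to '9'
      result + Chr(num)
    Else
      Break ; first non-digit ends the trailing number
    EndIf
  Next
  ; Returning after the loop also covers strings that consist of digits only
  ProcedureReturn ReverseString(result)
EndProcedure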
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 3:32 pm
by boddhi
Marc56us wrote:
Quick and dirty sample. Basic only, no pointers, so easy to understand for beginners
Even dirtier? In one line:
Code: Select all
Debug ReverseString(Str(Val(ReverseString("12Caption64"))))
Drawback: zeros at the very end of the string disappear, because they become leading zeros after the first ReverseString() and Val() drops them...
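A quick check of which zeros get lost, as a hedged sketch (the test strings are illustrative and not from the thread). Val() reads digits from the start of the reversed string and stops at the first non-digit, so a zero that ends the original string becomes a leading zero after the first ReverseString() and is dropped, while a zero at the front of the trailing number survives:

```purebasic
; Test strings are illustrative assumptions
Debug ReverseString(Str(Val(ReverseString("12Caption064")))) ; 064 (zero kept)
Debug ReverseString(Str(Val(ReverseString("12Caption640")))) ; 64  (final zero lost)
Debug ReverseString(Str(Val(ReverseString("12Caption"))))    ; 0   (no trailing digits at all)
```

The last line also shows that a string without trailing digits yields "0" rather than an empty string, which may or may not matter for a given use case.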
Re: Get only trailing numbers from a string?
Posted: Sat Dec 30, 2023 4:25 pm
by mk-soft
I don't think numbers have more than 1000 digits
Code: Select all
Procedure.s GetNumbersFromString7(*StrPtr.Character)
  Protected result.s{1024}
  Protected *result.Character
  *result = @result
  While *StrPtr\c
    Select *StrPtr\c
      Case '0' To '9'
        *result\c = *StrPtr\c
        *result + SizeOf(Character)
      Default
        *result = @result
    EndSelect
    *StrPtr + SizeOf(Character)
  Wend
  *result\c = #Null
  ProcedureReturn result
EndProcedure

Define.s demoText = "1234Caption64"
Debug GetNumbersFromString7(@demoText)
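Combining the ideas in this thread, the trailing digits can also be returned without building a string character by character: they run up to the terminating null, so once the first trailing digit is located, PeekS() can read from there to the end. This is a sketch only; the procedure name is hypothetical and it has not been timed against the functions above.

```purebasic
; Hypothetical variant: locate the first trailing digit, then read the
; rest of the string in one go with PeekS(). Not benchmarked.
Procedure.s GetTrailingDigits(*StrPtr.Character)
  Protected *Scan.Character = *StrPtr

  If *StrPtr = 0 ; guard against a null string pointer
    ProcedureReturn ""
  EndIf

  ; Walk forward to the terminating null character
  While *Scan\c
    *Scan + SizeOf(Character)
  Wend

  ; Walk backwards for as long as the previous character is a digit
  While *Scan > *StrPtr
    *Scan - SizeOf(Character)
    If *Scan\c < '0' Or *Scan\c > '9'
      *Scan + SizeOf(Character) ; step back onto the first trailing digit
      Break
    EndIf
  Wend

  ProcedureReturn PeekS(*Scan) ; reads up to the terminating null
EndProcedure

Define.s demoText = "1234Caption64"
Debug GetTrailingDigits(@demoText) ; Output = 64
```

Unlike the fixed-length buffer, this has no 1024-character limit, and a string made up entirely of digits is returned whole.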