Page 1 of 1
Remove double spaces
Posted: Sun Aug 28, 2022 10:58 am
by AZJIO
While there are problems if single characters
Code: Select all
EnableExplicit
Procedure.s StripChar(String$, flag, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
Protected Len1, Len2, i, j, NotFirstSpace = 1, SavePointer = 1, Added = 0
Protected *memChar, *c.Character, *jc.Character
Protected NewList Parts.i()
Protected tmp$ = ""
Len1 = Len(TrimChar$)
Len2 = Len(String$)
If Not Asc(String$)
ProcedureReturn ""
EndIf
*c.Character = @String$
*memChar = @TrimChar$
*jc.Character = *memChar
For j = 1 To Len1
If *c\c = *jc\c
Added = 1
Break
EndIf
*jc + SizeOf(Character)
Next
For i = 1 To Len2
*jc.Character = *memChar
For j = 1 To Len1
If *c\c = *jc\c
If NotFirstSpace
*c\c = 0
SavePointer = 1
Else
NotFirstSpace = 1
*c\c = 32
EndIf
Break
EndIf
*jc + SizeOf(Character)
Next
If *c\c
If SavePointer And NotFirstSpace
SavePointer = 0
NotFirstSpace = 0
If AddElement(Parts())
Parts() = *c
EndIf
EndIf
EndIf
*c + SizeOf(Character)
Next
tmp$ = ""
If Added
tmp$ + " "
EndIf
ForEach Parts()
tmp$ + PeekS(Parts())
Next
ProcedureReturn tmp$
EndProcedure
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #CRLF$ + #TAB$ + " "
Debug "|" + StripChar(String$, 4) + "|"
String$ ="h w t b o a n e n e r"
Debug "|" + StripChar(String$, 4) + "|"
There is an idea to write in the same line, ignoring spaces. When I have free time I will try.
Code: Select all
*c.Character = @String$
*new.Character = @String$
Re: Remove double spaces
Posted: Sun Aug 28, 2022 11:29 am
by STARGÅTE
I think you have to set NotFirstSpace = 0 whenever If *c\c is called, not only if SavePointer is true:
Code: Select all
If *c\c
If SavePointer And NotFirstSpace
SavePointer = 0
NotFirstSpace = 0
If AddElement(Parts())
Parts() = *c
EndIf
EndIf
NotFirstSpace = 0
EndIf
However, a regular expression should be easier:
Code: Select all
Enumeration
#Regex
EndEnumeration
CreateRegularExpression(#Regex, "(?<=(\s))(\1)+")
Define String$
String$ =" " + #LF$ + #LF$ + " Hello World! " + #LF$ + #TAB$ + " "
Debug "|" + ReplaceRegularExpression(#Regex, String$, "") + "|"
String$ ="h w t b o a n e n e r"
Debug "|" + ReplaceRegularExpression(#Regex, String$, "") + "|"
Please keep also in mind, the #CRLF$ is
not a single character, it is #CR$ and #LF$.
This means, #CRLF$ + #CRLF$ is not shorten to #CRLF$, but #CR$+#CR$ and #LF$+#LF$
Re: Remove double spaces
Posted: Sun Aug 28, 2022 12:36 pm
by mk-soft
Good start,
I have revised it complete ...
Update
Code: Select all
;-TOP by mk-soft, v1.01.2, 28.08.2022
EnableExplicit
Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character
If Not Bool(String)
ProcedureReturn ""
EndIf
*c.Character = @String
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
Result + ReplaceString
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
FirstFound = #False
Result + Chr(*c\c)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
Debug "[" + TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ") + "]"
String$ ="h w t b o a n e n e r"
Debug "[" + TrimChars(String$, "-") + "]"
CompilerEndIf
Re: Remove double spaces
Posted: Mon Aug 29, 2022 4:29 am
by AZJIO
mk-soft wrote: Sun Aug 28, 2022 12:36 pm
I have revised it complete ...
Compared the execution speed, my version works 10 times faster. I checked the execution speed because string operations are slow. I also have string operations, but they are in large fragments. I wanted to replace string copying with memory copying method, but it doesn't work, I don't understand why.
Code: Select all
; ForEach Parts()
; tmp$ + PeekS(Parts())
; Next
; ProcedureReturn tmp$
Protected tmp$ = Space(Len2)
Protected *Point, *Result.String = @tmp$
*Point = @*Result\s
ForEach Parts()
CopyMemoryString(Parts(), @*Point)
; CompareMemory(Parts(), @*Point , StringByteLength(PeekS(Parts())))
Next
String$ = PeekS(@*Point)
For -> While - yes, I'm wrong, I stepped on the same rake again
I also had a variant (see below), but in it each word received a pointer. In the previous version, only a double space caused a pointer to be obtained. That is, the entire text, if there are no double spaces in it, then it was one pointer, so for the sake of speed I developed the previous version. But if you rewrite to the same line, then each character will have to be copied into memory byte by byte, I don’t know how fast it works. I liked listing bytes and now I try to use this method everywhere.
Code: Select all
EnableExplicit
Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
Protected i, j, SavePointer = 1
Protected *memChar, *c.Character, *jc.Character
Protected NewList Parts.i()
Protected tmp$ = ""
If Not Asc(String$)
ProcedureReturn ""
EndIf
*c.Character = @String$
*memChar = @TrimChar$
While *c\c
*jc.Character = *memChar
While *jc\c
If *c\c = *jc\c
SavePointer = 1
*c\c = 0
Break
EndIf
*jc + SizeOf(Character)
Wend
If *c\c And SavePointer
SavePointer = 0
If AddElement(Parts())
Parts() = *c
EndIf
EndIf
*c + SizeOf(Character)
Wend
ForEach Parts()
tmp$ + PeekS(Parts()) + " "
Next
ProcedureReturn tmp$
EndProcedure
Define String$
String$ = " " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
Debug "|" + StripChar(String$) + "|"
String$ = "h w t b o a n e n e r"
Debug "|" + StripChar(String$) + "|"
STARGÅTE wrote: Sun Aug 28, 2022 11:29 am
However, a regular expression should be easier:
Increases the program by 200 kb. I write 60-80 kb programs, so I try to do it without regular expressions.
Re: Remove double spaces
Posted: Mon Aug 29, 2022 6:32 am
by netmaestro
I don't know, this seems pretty simple unless I missed something:
Code: Select all
string$ = "Hello World, How Are You Today?"
While FindString(string$, " ")
string$ = ReplaceString(string$, " ", " ")
Wend
Debug string$
Re: Remove double spaces
Posted: Mon Aug 29, 2022 6:50 am
by BarryG
I was going to suggest what netmaestro suggested, but then I assumed someone would say it's too slow for large strings, so I didn't bother.
Re: Remove double spaces
Posted: Mon Aug 29, 2022 7:10 am
by jacdelad
I also use FindString, adapted to the need of the inputstring:
Code: Select all
;-TOP by mk-soft, v1.01.2, 28.08.2022
EnableExplicit
Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character
If Not Bool(String)
ProcedureReturn ""
EndIf
*c.Character = @String
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
Result + ReplaceString
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
FirstFound = #False
Result + Chr(*c\c)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result
EndProcedure
Procedure.s TrimChars_(String.s)
ReplaceString(String,#CR$," ",#PB_String_InPlace)
ReplaceString(String,#LF$," ",#PB_String_InPlace)
ReplaceString(String,#TAB$," ",#PB_String_InPlace)
While FindString(String," ")
String=ReplaceString(String," "," ")
Wend
ProcedureReturn String
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
DisableDebugger
OpenConsole()
Define output.s,count,tick.q=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN(Str(tick))
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimChars_(String$)
Next
tick=ElapsedMilliseconds()-tick
PrintN(Str(tick))
Input()
CompilerEndIf
39.7 vs. 2.9 seconds.
Re: Remove double spaces
Posted: Mon Aug 29, 2022 10:51 am
by mk-soft
AZJIO wrote: Mon Aug 29, 2022 4:29 am
Compared the execution speed, my version works 10 times faster. I checked the execution speed because string operations are slow. I also have string operations, but they are in large fragments. I wanted to replace string copying with memory copying method, but it doesn't work, I don't understand why.
First of all, this was about a working function, not about speed.
I don't know how you came up with 10 times faster. There's something wrong with your test.
If you want it to be faster, see TrimCharsShort.
Disable Debugger
Update
- Bugfix
Code: Select all
;-TOP by mk-soft, v1.01.2, 28.08.2022
EnableExplicit
Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character
If Not Bool(String)
ProcedureReturn ""
EndIf
*c.Character = @String
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
Result + ReplaceString
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
FirstFound = #False
Result + Chr(*c\c)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result
EndProcedure
Procedure.s TrimCharsShort(String.s, FindChar.s = " ")
Protected Result.s, FirstFound, Found, cnt
Protected *findChar, *c.Character, *fc.Character, *rc.character
If Not Bool(String)
ProcedureReturn ""
EndIf
Result = Space(Len(String))
*rc = @Result
*c.Character = @String
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
*rc\c = ' '
*rc + SizeOf(character)
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
FirstFound = #False
*rc\c = *c\c
*rc + SizeOf(character)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure
; ****
; by jacdelad
Procedure.s TrimChars_(String.s)
ReplaceString(String,#CR$," ",#PB_String_InPlace)
ReplaceString(String,#LF$," ",#PB_String_InPlace)
ReplaceString(String,#TAB$," ",#PB_String_InPlace)
While FindString(String," ")
String=ReplaceString(String," "," ")
Wend
ProcedureReturn String
EndProcedure
; ----
; by AZJIO
Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
Protected i, j, SavePointer = 1
Protected *memChar, *c.Character, *jc.Character
Protected NewList Parts.i()
Protected tmp$ = ""
If Not Asc(String$)
ProcedureReturn ""
EndIf
*c.Character = @String$
*memChar = @TrimChar$
While *c\c
*jc.Character = *memChar
While *jc\c
If *c\c = *jc\c
SavePointer = 1
*c\c = 0
Break
EndIf
*jc + SizeOf(Character)
Wend
If *c\c And SavePointer
SavePointer = 0
If AddElement(Parts())
Parts() = *c
EndIf
EndIf
*c + SizeOf(Character)
Wend
ForEach Parts()
tmp$ + PeekS(Parts()) + " "
Next
ProcedureReturn tmp$
EndProcedure
;-Test
CompilerIf #PB_Compiler_IsMainFile
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
DisableDebugger
OpenConsole()
Define output.s,count,tick.q
PrintN("TrimChars")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN(Str(tick))
PrintN("[" + output + "]")
PrintN("TrimCharsShort")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimCharsShort(String$, #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN(Str(tick))
PrintN("[" + output + "]")
PrintN("TrimChars_")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimChars_(String$)
Next
tick=ElapsedMilliseconds()-tick
PrintN(Str(tick))
PrintN("[" + output + "]")
PrintN("StripChar")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=StripChar(String$, #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN(Str(tick))
PrintN("[" + output + "]")
Input()
CompilerEndIf
P.S.
At your version missing first space ...
Re: Remove double spaces
Posted: Mon Aug 29, 2022 11:42 am
by mk-soft
jacdelad wrote: Mon Aug 29, 2022 7:10 am
I also use FindString, adapted to the need of the inputstring:
...
39.7 vs. 2.9 seconds.
Disable Debugger
TrimChars
1524
[ Hello World! ]
TrimCharsShort
681
[ Hello World! ]
TrimChars_
1270
[ Hello World! ]
StripChar
1190
[Hello World! ]
To test with the debugger, you can also switch off the debugger within the function for runtime-intensive functions.
Code: Select all
Procedure foo()
DisableDebugger
; Do any
ProcedureReturn 1
EnableDebugger
EndProcedure
Re: Remove double spaces
Posted: Mon Aug 29, 2022 2:08 pm
by jacdelad
mk-soft wrote: Mon Aug 29, 2022 11:42 am
jacdelad wrote: Mon Aug 29, 2022 7:10 am
I also use FindString, adapted to the need of the inputstring:
...
39.7 vs. 2.9 seconds.
Disable Debugger
TrimChars
1524
[ Hello World! ]
TrimCharsShort
681
[ Hello World! ]
TrimChars_
1270
[ Hello World! ]
StripChar
1190
[Hello World! ]
To test with the debugger, you can also switch off the debugger within the function for runtime-intensive functions.
Code: Select all
Procedure foo()
DisableDebugger
; Do any
ProcedureReturn 1
EnableDebugger
EndProcedure
...but my debugger was disabled, see the code I posted.
Re: Remove double spaces
Posted: Mon Aug 29, 2022 2:14 pm
by Demivec
AZJIO wrote: Mon Aug 29, 2022 4:29 am
I also had a variant (see below), but in it each word received a pointer. In the previous version, only a double space caused a pointer to be obtained. That is, the entire text, if there are no double spaces in it, then it was one pointer, so for the sake of speed I developed the previous version. But if you rewrite to the same line, then each character will have to be copied into memory byte by byte, I don’t know how fast it works. I liked listing bytes and now I try to use this method everywhere.
Code: Select all
EnableExplicit
Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
Protected i, j, SavePointer = 1
Protected *memChar, *c.Character, *jc.Character
Protected NewList Parts.i()
Protected tmp$ = ""
If Not Asc(String$)
ProcedureReturn ""
EndIf
*c.Character = @String$
*memChar = @TrimChar$
While *c\c
*jc.Character = *memChar
While *jc\c
If *c\c = *jc\c
SavePointer = 1
*c\c = 0
Break
EndIf
*jc + SizeOf(Character)
Wend
If *c\c And SavePointer
SavePointer = 0
If AddElement(Parts())
Parts() = *c
EndIf
EndIf
*c + SizeOf(Character)
Wend
ForEach Parts()
tmp$ + PeekS(Parts()) + " "
Next
ProcedureReturn tmp$
EndProcedure
Define String$
String$ = " " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
Debug "|" + StripChar(String$) + "|"
String$ = "h w t b o a n e n e r"
Debug "|" + StripChar(String$) + "|"
Here is a faster modified version of mk-soft's version that now removes the string functions for concatenation:
Code: Select all
Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character
If Not Bool(String)
ProcedureReturn ""
EndIf
*c.Character = @String
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
Result + ReplaceString
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
FirstFound = #False
Result + Chr(*c\c)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result
EndProcedure
;modification of mk-soft's TrimChars() by Demivec
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character, *rc.character
If Not Bool(String)
ProcedureReturn ""
EndIf
Result = Space(Len(String))
*rc = @Result
*c.Character = @String
*findChar = @FindChar
CopyMemoryString(@"", @*rc)
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
CopyMemoryString(@ReplaceString)
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
FirstFound = #False
*rc\c = *c\c
*rc + SizeOf(character)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
#doublingMax = 5
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
For doubling = 0 To #doublingMax
DisableDebugger
OpenConsole()
Define output.s,count,tick.q
PrintN("Input length: " + Str(Len(String$)))
If doubling < 4
PrintN("TrimChars")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
EndIf
PrintN("replaceTrimChars")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
String$ + String$
PrintN("------------------")
Next
Input()
CompilerEndIf
Here is the test output, notice the times increase for the modified version only double when the string length doubles while the unmodified version's times grow at a much faster rate as the string length grows . I only show execution times for the longer string lengths with the faster subroutine after it is evident that the unmodified version is no longer in the running:
Code: Select all
Input length: 42
TrimChars
Time: 1790
Output length: 14
[ Hello World! ]
replaceTrimChars
Time: 639
Output length: 14
[ Hello World! ]
------------------
Input length: 84
TrimChars
Time: 3686
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars
Time: 1075
Output length: 27
[ Hello World! Hello World! ]
------------------
Input length: 168
TrimChars
Time: 8781
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars
Time: 1920
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 336
TrimChars
Time: 24409
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars
Time: 3615
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 672
replaceTrimChars
Time: 6979
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 1344
replaceTrimChars
Time: 13661
Output length: 417
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
When included in the test with the other procedures (in mk-soft's test code above) it compares well with mk-soft's TrimCharsShort() procedure since they are virtually identical but it still allows specifying a custom replace string:
Code: Select all
TrimChars
1787
[ Hello World! ]
replaceTrimChars
635
[ Hello World! ]
TrimCharsShort
614
[ Hello World! ]
TrimChars_
1262
[ Hello World! ]
StripChar
6729
[Hello World! ]
Re: Remove double spaces
Posted: Mon Aug 29, 2022 4:12 pm
by mk-soft
@Demives,
Works so well as long as you limit the ReplaceString to one character.
Otherwise it can lead to a memory access error.
Better so ..
Code: Select all
;-TOP by mk-soft, v1.03.1, 29.08.2022
Procedure.s TrimCharsToChar(String.s, ReplaceChar.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character, *rc.character, *tc.character
If Not Bool(String)
ProcedureReturn ""
EndIf
Result = Space(Len(String))
*rc = @Result
*c = @String
*tc = @ReplaceChar
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
If *tc\c
*rc\c = *tc\c
*rc + SizeOf(character)
EndIf
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack
*rc\c = *c\c
*rc + SizeOf(character)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure
; ----
;modification of mk-soft's TrimChars() by Demivec
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character, *rc.character
If Not Bool(String)
ProcedureReturn ""
EndIf
Result = Space(Len(String))
*rc = @Result
*c.Character = @String
*findChar = @FindChar
CopyMemoryString(@"", @*rc)
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
CopyMemoryString(@ReplaceString)
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack (by mk-soft)
;FirstFound = #False
*rc\c = *c\c
*rc + SizeOf(character)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
#doublingMax = 4
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
For doubling = 0 To #doublingMax
DisableDebugger
OpenConsole()
Define output.s,count,tick.q
PrintN("Input length: " + Str(Len(String$)))
PrintN("TrimCharsToChar")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimCharsToChar(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
PrintN("replaceTrimChars")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
String$ + String$
PrintN("------------------")
Next
Input()
CompilerEndIf
Re: Remove double spaces
Posted: Tue Aug 30, 2022 3:58 am
by Demivec
mk-soft wrote: Mon Aug 29, 2022 4:12 pm
@Demives,
Works so well as long as you limit the ReplaceString to one character.
Otherwise it can lead to a memory access error.
Better so ..
Thanks for catching that. Since I originally wanted to offer the ability to replace a single character or group of characters in the FindChar.s parameter with a string of one or more characters I have made one last modification to replaceTrimChars() to include this feature which could be done in the original TrimChar(). There is some time penalty to include this functionality but I think it is a nice option. The single character option is still covered nicely by TrimCharsToChar():
Code: Select all
;-TOP by mk-soft, v1.03.1, 29.08.2022
Procedure.s TrimCharsToChar(String.s, ReplaceChar.s = " ", FindChar.s = " ")
Protected Result.s, FirstFound, Found
Protected *findChar, *c.Character, *fc.Character, *rc.character, *tc.character
If Not Bool(String)
ProcedureReturn ""
EndIf
Result = Space(Len(String))
*rc = @Result
*c = @String
*tc = @ReplaceChar
*findChar = @FindChar
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
If *tc\c
*rc\c = *tc\c
*rc + SizeOf(character)
EndIf
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack
*rc\c = *c\c
*rc + SizeOf(character)
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure
; ----
;modification of mk-soft's TrimChars() by Demivec
;Replaces all groups of characters in FindChar with the entire ReplaceString.
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
Protected FirstFound, Found
Protected *findChar, *c.Character, *fc.Character, *rc.character
Protected len_String, len_ReplaceString, len_Result, toggle, maxLen_Result
Dim Result.s(1)
If Not Bool(String)
ProcedureReturn ""
EndIf
len_String = Len(String) * 2
maxLen_Result = len_String
toggle = 0
Result(toggle) = Space(maxLen_Result)
*rc = @Result(toggle)
*c.Character = @String
*findChar = @FindChar
len_ReplaceString = Len(ReplaceString)
len_Result = 0
CopyMemoryString(@"", @*rc)
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
If len_Result + len_ReplaceString > maxLen_Result
toggle ! 1 ;toggle now indexes alternate string
maxLen_Result + len_String
Result(toggle) = Space(maxLen_Result)
CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
*rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
EndIf
CopyMemoryString(@ReplaceString)
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack (by mk-soft)
If len_Result + SizeOf(character) > maxLen_Result
toggle ! 1 ;toggle now indexes alternate string
maxLen_Result + len_String
Result(toggle) = Space(maxLen_Result)
CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
*rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
EndIf
*rc\c = *c\c
*rc + SizeOf(character)
len_Result + 1
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result(toggle)
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
#doublingMax = 4
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
For doubling = 0 To #doublingMax
DisableDebugger
OpenConsole()
Define output.s,count,tick.q
PrintN("Input length: " + Str(Len(String$)))
PrintN("TrimCharsToChar")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimCharsToChar(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
PrintN("replaceTrimChars with single character ' '")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
PrintN("replaceTrimChars with string ' <-----> '")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = replaceTrimChars(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
String$ + String$
PrintN("------------------")
Next
Input()
CompilerEndIf
output:
Code: Select all
Input length: 42
TrimCharsToChar
Time: 733
Output length: 14
[ Hello World! ]
replaceTrimChars with single character ' '
Time: 746
Output length: 14
[ Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 809
Output length: 38
[ <-----> Hello <-----> World! <-----> ]
------------------
Input length: 84
TrimCharsToChar
Time: 1071
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 1229
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 1323
Output length: 67
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 168
TrimCharsToChar
Time: 1917
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 2171
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 2389
Output length: 125
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 336
TrimCharsToChar
Time: 3579
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 4031
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 4482
Output length: 241
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 672
TrimCharsToChar
Time: 6862
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 7782
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 8431
Output length: 473
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Re: Remove double spaces
Posted: Tue Aug 30, 2022 12:32 pm
by mk-soft
Actually, further optimisation does not bring anything more. Only code style and the last nano-seconds.
So here is my last version and thanks to Demives for the ideas.
Update v1.03.3
Code: Select all
;-TOP by mk-soft, v1.03.3, 30.08.2022
Procedure.s TrimCharsToString(String.s, ReplaceString.s = " ", FindChars.s = " ")
DisableDebugger
Protected result.s, firstFound, found, size_max, size_replace
Protected *result, *findChars, *c.Character, *fc.Character, *rc.character, *tc.character, *ofs
If Not Bool(String)
ProcedureReturn ""
EndIf
size_max = StringByteLength(String) + SizeOf(Character)
size_replace = StringByteLength(ReplaceString)
*result = AllocateMemory(size_max + size_replace * 4, #PB_Memory_NoClear)
*rc = *result
*c = @String
*tc = @ReplaceString
*findChars = @FindChars
While *c\c
*fc = *findChars
While *fc\c
If *c\c = *fc\c
If Not firstFound
firstFound = #True
If *tc\c
CopyMemory(*tc, *rc, size_replace)
*rc + size_replace
If *rc - *result >= size_max
*ofs = *rc - *result
size_max = MemorySize(*result)
*result = ReAllocateMemory(*result, size_max + size_replace * 4, #PB_Memory_NoClear)
*rc = *result + *ofs
EndIf
EndIf
EndIf
found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not found
If firstFound : firstFound = #False : EndIf
*rc\c = *c\c
*rc + SizeOf(character)
Else
found = #False
EndIf
*c + SizeOf(Character)
Wend
*rc\c = 0
result = PeekS(*result)
FreeMemory(*result)
ProcedureReturn result
EnableDebugger
EndProcedure
;modification of mk-soft's TrimChars() by Demivec
;Replaces all groups of characters in FindChar with the entire ReplaceString.
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
DisableDebugger
Protected FirstFound, Found
Protected *findChar, *c.Character, *fc.Character, *rc.character
Protected len_String, len_ReplaceString, len_Result, toggle, maxLen_Result
Dim Result.s(1)
If Not Bool(String)
ProcedureReturn ""
EndIf
len_String = Len(String) * 2
maxLen_Result = len_String
toggle = 0
Result(toggle) = Space(maxLen_Result)
*rc = @Result(toggle)
*c.Character = @String
*findChar = @FindChar
len_ReplaceString = Len(ReplaceString)
len_Result = 0
CopyMemoryString(@"", @*rc)
While *c\c
*fc.Character = *findChar
While *fc\c
If *c\c = *fc\c
If Not FirstFound
FirstFound = #True
If len_Result + len_ReplaceString > maxLen_Result
toggle ! 1 ;toggle now indexes alternate string
maxLen_Result + len_String
Result(toggle) = Space(maxLen_Result)
CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
*rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
EndIf
CopyMemoryString(@ReplaceString)
EndIf
Found = #True
Break
EndIf
*fc + SizeOf(Character)
Wend
If Not Found
If FirstFound : FirstFound = #False : EndIf
If len_Result + SizeOf(character) > maxLen_Result
toggle ! 1 ;toggle now indexes alternate string
maxLen_Result + len_String
Result(toggle) = Space(maxLen_Result)
CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
*rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
EndIf
*rc\c = *c\c
*rc + SizeOf(character)
len_Result + 1
Else
Found = #False
EndIf
*c + SizeOf(Character)
Wend
ProcedureReturn Result(toggle)
EnableDebugger
EndProcedure
; ****
;-Test
CompilerIf #PB_Compiler_IsMainFile
CompilerIf #PB_Compiler_Debugger
CompilerWarning "For runtime test disable debugger!"
CompilerEndIf
#doublingMax = 2
Define String$
String$ =" " + #CRLF$ + #CRLF$ + " Hello World! " + #LFCR$ + #TAB$ + " "
For doubling = 0 To #doublingMax
OpenConsole()
Define output.s,count,tick.q
PrintN("Input length: " + Str(Len(String$)))
PrintN("TrimCharsToString with single character ' '")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output=TrimCharsToString(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
PrintN("replaceTrimChars with single character ' '")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
PrintN("TrimCharsToString with string ' <-----> '")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = TrimCharsToString(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
PrintN("replaceTrimChars with string ' <-----> '")
tick=ElapsedMilliseconds()
For count=1 To 1000000
output = replaceTrimChars(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
Next
tick=ElapsedMilliseconds()-tick
PrintN("Time: " + Str(tick))
PrintN("Output length: " + Str(Len(output)))
PrintN("[" + output + "]")
String$ + String$
PrintN("------------------")
Next
Input()
CompilerEndIf