Page 1 of 1

Remove double spaces

Posted: Sun Aug 28, 2022 10:58 am
by AZJIO
While there are problems if single characters

Code: Select all

EnableExplicit

Procedure.s StripChar(String$, flag, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected Len1, Len2, i, j, NotFirstSpace = 1, SavePointer = 1, Added = 0
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""

	Len1 = Len(TrimChar$)
	Len2 = Len(String$)

	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	*jc.Character = *memChar
	For j = 1 To Len1
		If *c\c = *jc\c
			Added = 1
			Break
		EndIf
		*jc + SizeOf(Character)
	Next

	For i = 1 To Len2
		*jc.Character = *memChar

		For j = 1 To Len1
			If *c\c = *jc\c
				If NotFirstSpace
					*c\c = 0
					SavePointer = 1
				Else
					NotFirstSpace = 1
					*c\c = 32
				EndIf
				Break
			EndIf
			*jc + SizeOf(Character)
		Next

		If *c\c
			If SavePointer And NotFirstSpace
				SavePointer = 0
				NotFirstSpace = 0
				If AddElement(Parts())
					Parts() = *c
				EndIf
			EndIf
		EndIf
		*c + SizeOf(Character)
	Next


	tmp$ = ""

	If Added
		tmp$ + " "
	EndIf
	ForEach Parts()
		tmp$ + PeekS(Parts())
	Next
	ProcedureReturn tmp$
EndProcedure


Define String$
String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #CRLF$ + #TAB$ + "    "
Debug "|" + StripChar(String$, 4) + "|"
String$ ="h w t b o a n e n e r"
Debug "|" + StripChar(String$, 4) + "|"
There is an idea to write in the same line, ignoring spaces. When I have free time I will try.

Code: Select all

	*c.Character = @String$
	*new.Character = @String$

Re: Remove double spaces

Posted: Sun Aug 28, 2022 11:29 am
by STARGÅTE
I think you have to set NotFirstSpace = 0 whenever If *c\c is called, not only if SavePointer is true:

Code: Select all

		If *c\c
			If SavePointer And NotFirstSpace
				SavePointer = 0
				NotFirstSpace = 0
				If AddElement(Parts())
					Parts() = *c
				EndIf
			EndIf
			NotFirstSpace = 0
		EndIf
However, a regular expression should be easier:

Code: Select all

Enumeration
	#Regex
EndEnumeration

CreateRegularExpression(#Regex, "(?<=(\s))(\1)+")

Define String$
String$ ="       " + #LF$ + #LF$ + "    Hello      World!   " + #LF$ + #TAB$ + "    "
Debug "|" + ReplaceRegularExpression(#Regex, String$, "") + "|"
String$ ="h w t b o a n e n e r"
Debug "|" + ReplaceRegularExpression(#Regex, String$, "") + "|"
Please keep also in mind, the #CRLF$ is not a single character, it is #CR$ and #LF$.
This means, #CRLF$ + #CRLF$ is not shorten to #CRLF$, but #CR$+#CR$ and #LF$+#LF$

Re: Remove double spaces

Posted: Sun Aug 28, 2022 12:36 pm
by mk-soft
Good start,

I have revised it complete ...

Update

Code: Select all

;-TOP by mk-soft, v1.01.2, 28.08.2022

EnableExplicit

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  Debug "[" + TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ") + "]"
  String$ ="h w t b o a n e n e r"
  Debug "[" + TrimChars(String$, "-") + "]"
  
CompilerEndIf

Re: Remove double spaces

Posted: Mon Aug 29, 2022 4:29 am
by AZJIO
mk-soft wrote: Sun Aug 28, 2022 12:36 pm I have revised it complete ...
Compared the execution speed, my version works 10 times faster. I checked the execution speed because string operations are slow. I also have string operations, but they are in large fragments. I wanted to replace string copying with memory copying method, but it doesn't work, I don't understand why.

Code: Select all

; ForEach Parts()
; 	tmp$ + PeekS(Parts())
; Next
; ProcedureReturn tmp$

Protected tmp$ = Space(Len2)
Protected *Point, *Result.String = @tmp$

*Point = @*Result\s
ForEach Parts()
	CopyMemoryString(Parts(), @*Point)
; CompareMemory(Parts(), @*Point , StringByteLength(PeekS(Parts())))
Next
String$ = PeekS(@*Point)
For -> While - yes, I'm wrong, I stepped on the same rake again

I also had a variant (see below), but in it each word received a pointer. In the previous version, only a double space caused a pointer to be obtained. That is, the entire text, if there are no double spaces in it, then it was one pointer, so for the sake of speed I developed the previous version. But if you rewrite to the same line, then each character will have to be copied into memory byte by byte, I don’t know how fast it works. I liked listing bytes and now I try to use this method everywhere.

Code: Select all

EnableExplicit

Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected i, j, SavePointer = 1
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""


	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	While *c\c
		*jc.Character = *memChar

		While *jc\c
			If *c\c = *jc\c
				SavePointer = 1
				*c\c = 0
				Break
			EndIf
			*jc + SizeOf(Character)
		Wend

		If *c\c And SavePointer
			SavePointer = 0
			If AddElement(Parts())
				Parts() = *c
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend

	ForEach Parts()
		tmp$ + PeekS(Parts()) + " "
	Next
	ProcedureReturn tmp$
EndProcedure


Define String$

String$ = "       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
Debug "|" + StripChar(String$) + "|"
String$ = "h w t b o a n e n e r"
Debug "|" + StripChar(String$) + "|"
STARGÅTE wrote: Sun Aug 28, 2022 11:29 am However, a regular expression should be easier:
Increases the program by 200 kb. I write 60-80 kb programs, so I try to do it without regular expressions.

Re: Remove double spaces

Posted: Mon Aug 29, 2022 6:32 am
by netmaestro
I don't know, this seems pretty simple unless I missed something:

Code: Select all

string$ = "Hello               World, How         Are    You                                 Today?"
While FindString(string$, "  ")
  string$ = ReplaceString(string$, "  ", " ")
Wend

Debug string$

Re: Remove double spaces

Posted: Mon Aug 29, 2022 6:50 am
by BarryG
I was going to suggest what netmaestro suggested, but then I assumed someone would say it's too slow for large strings, so I didn't bother.

Re: Remove double spaces

Posted: Mon Aug 29, 2022 7:10 am
by jacdelad
I also use FindString, adapted to the need of the inputstring:

Code: Select all

;-TOP by mk-soft, v1.01.2, 28.08.2022

EnableExplicit

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

Procedure.s TrimChars_(String.s)
  ReplaceString(String,#CR$," ",#PB_String_InPlace)
  ReplaceString(String,#LF$," ",#PB_String_InPlace)
  ReplaceString(String,#TAB$," ",#PB_String_InPlace)
  While FindString(String,"  ")
    String=ReplaceString(String,"  "," ")
  Wend
  ProcedureReturn String
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  DisableDebugger
  OpenConsole()
  Define output.s,count,tick.q=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars_(String$)
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  Input()
CompilerEndIf
39.7 vs. 2.9 seconds.

Re: Remove double spaces

Posted: Mon Aug 29, 2022 10:51 am
by mk-soft
AZJIO wrote: Mon Aug 29, 2022 4:29 am Compared the execution speed, my version works 10 times faster. I checked the execution speed because string operations are slow. I also have string operations, but they are in large fragments. I wanted to replace string copying with memory copying method, but it doesn't work, I don't understand why.
First of all, this was about a working function, not about speed.

I don't know how you came up with 10 times faster. There's something wrong with your test.
If you want it to be faster, see TrimCharsShort.

Disable Debugger

Update
- Bugfix

Code: Select all

;-TOP by mk-soft, v1.01.2, 28.08.2022

EnableExplicit

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

Procedure.s TrimCharsShort(String.s, FindChar.s = " ")
  Protected Result.s, FirstFound, Found, cnt
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          *rc\c = ' '
          *rc + SizeOf(character)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure

; ****

; by jacdelad

Procedure.s TrimChars_(String.s)
  ReplaceString(String,#CR$," ",#PB_String_InPlace)
  ReplaceString(String,#LF$," ",#PB_String_InPlace)
  ReplaceString(String,#TAB$," ",#PB_String_InPlace)
  While FindString(String,"  ")
    String=ReplaceString(String,"  "," ")
  Wend
  ProcedureReturn String
EndProcedure

; ----

; by AZJIO

Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected i, j, SavePointer = 1
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""


	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	While *c\c
		*jc.Character = *memChar

		While *jc\c
			If *c\c = *jc\c
				SavePointer = 1
				*c\c = 0
				Break
			EndIf
			*jc + SizeOf(Character)
		Wend

		If *c\c And SavePointer
			SavePointer = 0
			If AddElement(Parts())
				Parts() = *c
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend

	ForEach Parts()
		tmp$ + PeekS(Parts()) + " "
	Next
	ProcedureReturn tmp$
EndProcedure

;-Test

CompilerIf #PB_Compiler_IsMainFile
  
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  DisableDebugger
  OpenConsole()
  Define output.s,count,tick.q
  PrintN("TrimChars")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  PrintN("TrimCharsShort")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimCharsShort(String$, #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  PrintN("TrimChars_")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars_(String$)
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  PrintN("StripChar")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=StripChar(String$, #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  Input()
  
CompilerEndIf
P.S.
At your version missing first space ...

Re: Remove double spaces

Posted: Mon Aug 29, 2022 11:42 am
by mk-soft
jacdelad wrote: Mon Aug 29, 2022 7:10 am I also use FindString, adapted to the need of the inputstring:
...
39.7 vs. 2.9 seconds.
Disable Debugger :!:
TrimChars
1524
[ Hello World! ]
TrimCharsShort
681
[ Hello World! ]
TrimChars_
1270
[ Hello World! ]
StripChar
1190
[Hello World! ]
To test with the debugger, you can also switch off the debugger within the function for runtime-intensive functions.

Code: Select all

Procedure foo()
  DisableDebugger
  ; Do any
  ProcedureReturn 1
  EnableDebugger
EndProcedure

Re: Remove double spaces

Posted: Mon Aug 29, 2022 2:08 pm
by jacdelad
mk-soft wrote: Mon Aug 29, 2022 11:42 am
jacdelad wrote: Mon Aug 29, 2022 7:10 am I also use FindString, adapted to the need of the inputstring:
...
39.7 vs. 2.9 seconds.
Disable Debugger :!:
TrimChars
1524
[ Hello World! ]
TrimCharsShort
681
[ Hello World! ]
TrimChars_
1270
[ Hello World! ]
StripChar
1190
[Hello World! ]
To test with the debugger, you can also switch off the debugger within the function for runtime-intensive functions.

Code: Select all

Procedure foo()
  DisableDebugger
  ; Do any
  ProcedureReturn 1
  EnableDebugger
EndProcedure
...but my debugger was disabled, see the code I posted.

Re: Remove double spaces

Posted: Mon Aug 29, 2022 2:14 pm
by Demivec
AZJIO wrote: Mon Aug 29, 2022 4:29 am I also had a variant (see below), but in it each word received a pointer. In the previous version, only a double space caused a pointer to be obtained. That is, the entire text, if there are no double spaces in it, then it was one pointer, so for the sake of speed I developed the previous version. But if you rewrite to the same line, then each character will have to be copied into memory byte by byte, I don’t know how fast it works. I liked listing bytes and now I try to use this method everywhere.

Code: Select all

EnableExplicit

Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected i, j, SavePointer = 1
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""


	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	While *c\c
		*jc.Character = *memChar

		While *jc\c
			If *c\c = *jc\c
				SavePointer = 1
				*c\c = 0
				Break
			EndIf
			*jc + SizeOf(Character)
		Wend

		If *c\c And SavePointer
			SavePointer = 0
			If AddElement(Parts())
				Parts() = *c
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend

	ForEach Parts()
		tmp$ + PeekS(Parts()) + " "
	Next
	ProcedureReturn tmp$
EndProcedure


Define String$

String$ = "       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
Debug "|" + StripChar(String$) + "|"
String$ = "h w t b o a n e n e r"
Debug "|" + StripChar(String$) + "|"
Here is a faster modified version of mk-soft's version that now removes the string functions for concatenation:

Code: Select all

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

;modification of mk-soft's TrimChars() by Demivec
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c.Character = @String
  *findChar = @FindChar
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  #doublingMax = 5
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    DisableDebugger
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    If doubling < 4
      PrintN("TrimChars")
      tick=ElapsedMilliseconds()
      For count=1 To 1000000
        output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
      Next
      tick=ElapsedMilliseconds()-tick
      PrintN("Time: " + Str(tick))
      PrintN("Output length: " + Str(Len(output)))
      PrintN("[" + output + "]")
    EndIf
    
    PrintN("replaceTrimChars")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf
Here is the test output, notice the times increase for the modified version only double when the string length doubles while the unmodified version's times grow at a much faster rate as the string length grows . I only show execution times for the longer string lengths with the faster subroutine after it is evident that the unmodified version is no longer in the running:

Code: Select all

Input length: 42
TrimChars
Time: 1790
Output length: 14
[ Hello World! ]
replaceTrimChars
Time: 639
Output length: 14
[ Hello World! ]
------------------
Input length: 84
TrimChars
Time: 3686
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars
Time: 1075
Output length: 27
[ Hello World! Hello World! ]
------------------
Input length: 168
TrimChars
Time: 8781
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars
Time: 1920
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 336
TrimChars
Time: 24409
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars
Time: 3615
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 672
replaceTrimChars
Time: 6979
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 1344
replaceTrimChars
Time: 13661
Output length: 417
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
When included in the test with the other procedures (in mk-soft's test code above) it compares well with mk-soft's TrimCharsShort() procedure since they are virtually identical but it still allows specifying a custom replace string:

Code: Select all

TrimChars
1787
[ Hello World! ]
replaceTrimChars
635
[ Hello World! ]
TrimCharsShort
614
[ Hello World! ]
TrimChars_
1262
[ Hello World! ]
StripChar
6729
[Hello World! ]

Re: Remove double spaces

Posted: Mon Aug 29, 2022 4:12 pm
by mk-soft
@Demives,
Works so well as long as you limit the ReplaceString to one character.
Otherwise it can lead to a memory access error.

Better so ..

Code: Select all

;-TOP by mk-soft, v1.03.1, 29.08.2022

Procedure.s TrimCharsToChar(String.s, ReplaceChar.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character, *tc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c = @String
  *tc = @ReplaceChar
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          If *tc\c
            *rc\c = *tc\c
            *rc + SizeOf(character)
          EndIf
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure

; ----

;modification of mk-soft's TrimChars() by Demivec
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c.Character = @String
  *findChar = @FindChar
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack (by mk-soft)
      ;FirstFound = #False
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  #doublingMax = 4
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    DisableDebugger
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    
    PrintN("TrimCharsToChar")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output=TrimCharsToChar(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf

Re: Remove double spaces

Posted: Tue Aug 30, 2022 3:58 am
by Demivec
mk-soft wrote: Mon Aug 29, 2022 4:12 pm @Demives,
Works so well as long as you limit the ReplaceString to one character.
Otherwise it can lead to a memory access error.

Better so ..
Thanks for catching that. Since I originally wanted to offer the ability to replace a single character or group of characters in the FindChar.s parameter with a string of one or more characters I have made one last modification to replaceTrimChars() to include this feature which could be done in the original TrimChar(). There is some time penalty to include this functionality but I think it is a nice option. The single character option is still covered nicely by TrimCharsToChar():

Code: Select all

;-TOP by mk-soft, v1.03.1, 29.08.2022

Procedure.s TrimCharsToChar(String.s, ReplaceChar.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character, *tc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c = @String
  *tc = @ReplaceChar
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          If *tc\c
            *rc\c = *tc\c
            *rc + SizeOf(character)
          EndIf
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure

; ----

;modification of mk-soft's TrimChars() by Demivec
;Replaces all groups of characters in FindChar with the entire ReplaceString.
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  Protected len_String, len_ReplaceString, len_Result, toggle, maxLen_Result
  Dim Result.s(1)
  
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  len_String = Len(String) * 2
  maxLen_Result = len_String
  toggle = 0
  Result(toggle) = Space(maxLen_Result)
  *rc = @Result(toggle)
  *c.Character = @String
  *findChar = @FindChar
  len_ReplaceString = Len(ReplaceString)
  len_Result = 0
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          
          If len_Result + len_ReplaceString > maxLen_Result
            toggle ! 1 ;toggle now indexes alternate string
            maxLen_Result + len_String
            Result(toggle) = Space(maxLen_Result)
            CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
            *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
          EndIf
            
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack (by mk-soft)
      
      If len_Result + SizeOf(character) > maxLen_Result
        toggle ! 1 ;toggle now indexes alternate string
        maxLen_Result + len_String
        Result(toggle) = Space(maxLen_Result)
        CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
        *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
      EndIf
      
      *rc\c = *c\c
      *rc + SizeOf(character)
      len_Result + 1
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result(toggle)
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  #doublingMax = 4
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    DisableDebugger
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    
    PrintN("TrimCharsToChar")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output=TrimCharsToChar(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with single character ' '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with string ' <-----> '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf
output:

Code: Select all

Input length: 42
TrimCharsToChar
Time: 733
Output length: 14
[ Hello World! ]
replaceTrimChars with single character ' '
Time: 746
Output length: 14
[ Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 809
Output length: 38
[ <-----> Hello <-----> World! <-----> ]
------------------
Input length: 84
TrimCharsToChar
Time: 1071
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 1229
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 1323
Output length: 67
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 168
TrimCharsToChar
Time: 1917
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 2171
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 2389
Output length: 125
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 336
TrimCharsToChar
Time: 3579
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 4031
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 4482
Output length: 241
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 672
TrimCharsToChar
Time: 6862
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 7782
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 8431
Output length: 473
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------

Re: Remove double spaces

Posted: Tue Aug 30, 2022 12:32 pm
by mk-soft
Actually, further optimisation does not bring anything more. Only code style and the last nano-seconds.

So here is my last version and thanks to Demives for the ideas.

Update v1.03.3

Code: Select all

;-TOP by mk-soft, v1.03.3, 30.08.2022

Procedure.s TrimCharsToString(String.s, ReplaceString.s = " ", FindChars.s = " ")
  DisableDebugger
  
  Protected result.s, firstFound, found, size_max, size_replace
  Protected *result, *findChars, *c.Character, *fc.Character, *rc.character, *tc.character, *ofs
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  size_max = StringByteLength(String) + SizeOf(Character)
  size_replace = StringByteLength(ReplaceString)
  
  *result = AllocateMemory(size_max + size_replace * 4, #PB_Memory_NoClear)
  *rc = *result
  *c = @String
  *tc = @ReplaceString
  *findChars = @FindChars
  
  While *c\c
    *fc = *findChars
    While *fc\c
      If *c\c = *fc\c
        If Not firstFound
          firstFound = #True
          If *tc\c
            CopyMemory(*tc, *rc, size_replace)
            *rc + size_replace
            If *rc - *result >= size_max
              *ofs = *rc - *result
              size_max = MemorySize(*result)
              *result = ReAllocateMemory(*result, size_max + size_replace * 4, #PB_Memory_NoClear) 
              *rc = *result + *ofs
            EndIf
          EndIf
        EndIf
        found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not found
      If firstFound : firstFound = #False : EndIf
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  *rc\c = 0
  result = PeekS(*result)
  FreeMemory(*result)
  ProcedureReturn result
  
  EnableDebugger
EndProcedure

;modification of mk-soft's TrimChars() by Demivec
;Replaces all groups of characters in FindChar with the entire ReplaceString.

Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  DisableDebugger
  
  Protected FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  Protected len_String, len_ReplaceString, len_Result, toggle, maxLen_Result
  Dim Result.s(1)
  
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  len_String = Len(String) * 2
  maxLen_Result = len_String
  toggle = 0
  Result(toggle) = Space(maxLen_Result)
  *rc = @Result(toggle)
  *c.Character = @String
  *findChar = @FindChar
  len_ReplaceString = Len(ReplaceString)
  len_Result = 0
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          
          If len_Result + len_ReplaceString > maxLen_Result
            toggle ! 1 ;toggle now indexes alternate string
            maxLen_Result + len_String
            Result(toggle) = Space(maxLen_Result)
            CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
            *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
          EndIf
            
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf
      
      If len_Result + SizeOf(character) > maxLen_Result
        toggle ! 1 ;toggle now indexes alternate string
        maxLen_Result + len_String
        Result(toggle) = Space(maxLen_Result)
        CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
        *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
      EndIf
      
      *rc\c = *c\c
      *rc + SizeOf(character)
      len_Result + 1
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result(toggle)
  
  EnableDebugger
EndProcedure

; ****

;-Test

CompilerIf #PB_Compiler_IsMainFile
  
  CompilerIf #PB_Compiler_Debugger
    CompilerWarning "For runtime test disable debugger!"
  CompilerEndIf
  
  #doublingMax = 2
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    
    PrintN("TrimCharsToString with single character ' '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output=TrimCharsToString(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with single character ' '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("TrimCharsToString with string ' <-----> '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = TrimCharsToString(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with string ' <-----> '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf