Remove double spaces

Just starting out? Need help? Post your questions and find answers here.
AZJIO
Addict
Addict
Posts: 2143
Joined: Sun May 14, 2017 1:48 am

Remove double spaces

Post by AZJIO »

While there are problems if single characters

Code: Select all

EnableExplicit

Procedure.s StripChar(String$, flag, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected Len1, Len2, i, j, NotFirstSpace = 1, SavePointer = 1, Added = 0
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""

	Len1 = Len(TrimChar$)
	Len2 = Len(String$)

	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	*jc.Character = *memChar
	For j = 1 To Len1
		If *c\c = *jc\c
			Added = 1
			Break
		EndIf
		*jc + SizeOf(Character)
	Next

	For i = 1 To Len2
		*jc.Character = *memChar

		For j = 1 To Len1
			If *c\c = *jc\c
				If NotFirstSpace
					*c\c = 0
					SavePointer = 1
				Else
					NotFirstSpace = 1
					*c\c = 32
				EndIf
				Break
			EndIf
			*jc + SizeOf(Character)
		Next

		If *c\c
			If SavePointer And NotFirstSpace
				SavePointer = 0
				NotFirstSpace = 0
				If AddElement(Parts())
					Parts() = *c
				EndIf
			EndIf
		EndIf
		*c + SizeOf(Character)
	Next


	tmp$ = ""

	If Added
		tmp$ + " "
	EndIf
	ForEach Parts()
		tmp$ + PeekS(Parts())
	Next
	ProcedureReturn tmp$
EndProcedure


Define String$
String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #CRLF$ + #TAB$ + "    "
Debug "|" + StripChar(String$, 4) + "|"
String$ ="h w t b o a n e n e r"
Debug "|" + StripChar(String$, 4) + "|"
There is an idea to write in the same line, ignoring spaces. When I have free time I will try.

Code: Select all

	*c.Character = @String$
	*new.Character = @String$
User avatar
STARGÅTE
Addict
Addict
Posts: 2227
Joined: Thu Jan 10, 2008 1:30 pm
Location: Germany, Glienicke
Contact:

Re: Remove double spaces

Post by STARGÅTE »

I think you have to set NotFirstSpace = 0 whenever If *c\c is called, not only if SavePointer is true:

Code: Select all

		If *c\c
			If SavePointer And NotFirstSpace
				SavePointer = 0
				NotFirstSpace = 0
				If AddElement(Parts())
					Parts() = *c
				EndIf
			EndIf
			NotFirstSpace = 0
		EndIf
However, a regular expression should be easier:

Code: Select all

Enumeration
	#Regex
EndEnumeration

CreateRegularExpression(#Regex, "(?<=(\s))(\1)+")

Define String$
String$ ="       " + #LF$ + #LF$ + "    Hello      World!   " + #LF$ + #TAB$ + "    "
Debug "|" + ReplaceRegularExpression(#Regex, String$, "") + "|"
String$ ="h w t b o a n e n e r"
Debug "|" + ReplaceRegularExpression(#Regex, String$, "") + "|"
Please keep also in mind, the #CRLF$ is not a single character, it is #CR$ and #LF$.
This means, #CRLF$ + #CRLF$ is not shorten to #CRLF$, but #CR$+#CR$ and #LF$+#LF$
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and moreTypeface - Sprite-based font include/module
User avatar
mk-soft
Always Here
Always Here
Posts: 6207
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Remove double spaces

Post by mk-soft »

Good start,

I have revised it complete ...

Update

Code: Select all

;-TOP by mk-soft, v1.01.2, 28.08.2022

EnableExplicit

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  Debug "[" + TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ") + "]"
  String$ ="h w t b o a n e n e r"
  Debug "[" + TrimChars(String$, "-") + "]"
  
CompilerEndIf
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
AZJIO
Addict
Addict
Posts: 2143
Joined: Sun May 14, 2017 1:48 am

Re: Remove double spaces

Post by AZJIO »

mk-soft wrote: Sun Aug 28, 2022 12:36 pm I have revised it complete ...
Compared the execution speed, my version works 10 times faster. I checked the execution speed because string operations are slow. I also have string operations, but they are in large fragments. I wanted to replace string copying with memory copying method, but it doesn't work, I don't understand why.

Code: Select all

; ForEach Parts()
; 	tmp$ + PeekS(Parts())
; Next
; ProcedureReturn tmp$

Protected tmp$ = Space(Len2)
Protected *Point, *Result.String = @tmp$

*Point = @*Result\s
ForEach Parts()
	CopyMemoryString(Parts(), @*Point)
; CompareMemory(Parts(), @*Point , StringByteLength(PeekS(Parts())))
Next
String$ = PeekS(@*Point)
For -> While - yes, I'm wrong, I stepped on the same rake again

I also had a variant (see below), but in it each word received a pointer. In the previous version, only a double space caused a pointer to be obtained. That is, the entire text, if there are no double spaces in it, then it was one pointer, so for the sake of speed I developed the previous version. But if you rewrite to the same line, then each character will have to be copied into memory byte by byte, I don’t know how fast it works. I liked listing bytes and now I try to use this method everywhere.

Code: Select all

EnableExplicit

Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected i, j, SavePointer = 1
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""


	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	While *c\c
		*jc.Character = *memChar

		While *jc\c
			If *c\c = *jc\c
				SavePointer = 1
				*c\c = 0
				Break
			EndIf
			*jc + SizeOf(Character)
		Wend

		If *c\c And SavePointer
			SavePointer = 0
			If AddElement(Parts())
				Parts() = *c
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend

	ForEach Parts()
		tmp$ + PeekS(Parts()) + " "
	Next
	ProcedureReturn tmp$
EndProcedure


Define String$

String$ = "       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
Debug "|" + StripChar(String$) + "|"
String$ = "h w t b o a n e n e r"
Debug "|" + StripChar(String$) + "|"
STARGÅTE wrote: Sun Aug 28, 2022 11:29 am However, a regular expression should be easier:
Increases the program by 200 kb. I write 60-80 kb programs, so I try to do it without regular expressions.
User avatar
netmaestro
PureBasic Bullfrog
PureBasic Bullfrog
Posts: 8451
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: Remove double spaces

Post by netmaestro »

I don't know, this seems pretty simple unless I missed something:

Code: Select all

string$ = "Hello               World, How         Are    You                                 Today?"
While FindString(string$, "  ")
  string$ = ReplaceString(string$, "  ", " ")
Wend

Debug string$
BERESHEIT
BarryG
Addict
Addict
Posts: 4128
Joined: Thu Apr 18, 2019 8:17 am

Re: Remove double spaces

Post by BarryG »

I was going to suggest what netmaestro suggested, but then I assumed someone would say it's too slow for large strings, so I didn't bother.
User avatar
jacdelad
Addict
Addict
Posts: 1993
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Remove double spaces

Post by jacdelad »

I also use FindString, adapted to the need of the inputstring:

Code: Select all

;-TOP by mk-soft, v1.01.2, 28.08.2022

EnableExplicit

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

Procedure.s TrimChars_(String.s)
  ReplaceString(String,#CR$," ",#PB_String_InPlace)
  ReplaceString(String,#LF$," ",#PB_String_InPlace)
  ReplaceString(String,#TAB$," ",#PB_String_InPlace)
  While FindString(String,"  ")
    String=ReplaceString(String,"  "," ")
  Wend
  ProcedureReturn String
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  DisableDebugger
  OpenConsole()
  Define output.s,count,tick.q=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars_(String$)
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  Input()
CompilerEndIf
39.7 vs. 2.9 seconds.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
mk-soft
Always Here
Always Here
Posts: 6207
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Remove double spaces

Post by mk-soft »

AZJIO wrote: Mon Aug 29, 2022 4:29 am Compared the execution speed, my version works 10 times faster. I checked the execution speed because string operations are slow. I also have string operations, but they are in large fragments. I wanted to replace string copying with memory copying method, but it doesn't work, I don't understand why.
First of all, this was about a working function, not about speed.

I don't know how you came up with 10 times faster. There's something wrong with your test.
If you want it to be faster, see TrimCharsShort.

Disable Debugger

Update
- Bugfix

Code: Select all

;-TOP by mk-soft, v1.01.2, 28.08.2022

EnableExplicit

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

Procedure.s TrimCharsShort(String.s, FindChar.s = " ")
  Protected Result.s, FirstFound, Found, cnt
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          *rc\c = ' '
          *rc + SizeOf(character)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure

; ****

; by jacdelad

Procedure.s TrimChars_(String.s)
  ReplaceString(String,#CR$," ",#PB_String_InPlace)
  ReplaceString(String,#LF$," ",#PB_String_InPlace)
  ReplaceString(String,#TAB$," ",#PB_String_InPlace)
  While FindString(String,"  ")
    String=ReplaceString(String,"  "," ")
  Wend
  ProcedureReturn String
EndProcedure

; ----

; by AZJIO

Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected i, j, SavePointer = 1
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""


	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	While *c\c
		*jc.Character = *memChar

		While *jc\c
			If *c\c = *jc\c
				SavePointer = 1
				*c\c = 0
				Break
			EndIf
			*jc + SizeOf(Character)
		Wend

		If *c\c And SavePointer
			SavePointer = 0
			If AddElement(Parts())
				Parts() = *c
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend

	ForEach Parts()
		tmp$ + PeekS(Parts()) + " "
	Next
	ProcedureReturn tmp$
EndProcedure

;-Test

CompilerIf #PB_Compiler_IsMainFile
  
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  DisableDebugger
  OpenConsole()
  Define output.s,count,tick.q
  PrintN("TrimChars")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  PrintN("TrimCharsShort")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimCharsShort(String$, #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  PrintN("TrimChars_")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=TrimChars_(String$)
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  PrintN("StripChar")
  tick=ElapsedMilliseconds()
  For count=1 To 1000000
    output=StripChar(String$, #CR$ + #LF$ + #TAB$ + " ")
  Next
  tick=ElapsedMilliseconds()-tick
  PrintN(Str(tick))
  PrintN("[" + output + "]")
  
  Input()
  
CompilerEndIf
P.S.
At your version missing first space ...
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
User avatar
mk-soft
Always Here
Always Here
Posts: 6207
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Remove double spaces

Post by mk-soft »

jacdelad wrote: Mon Aug 29, 2022 7:10 am I also use FindString, adapted to the need of the inputstring:
...
39.7 vs. 2.9 seconds.
Disable Debugger :!:
TrimChars
1524
[ Hello World! ]
TrimCharsShort
681
[ Hello World! ]
TrimChars_
1270
[ Hello World! ]
StripChar
1190
[Hello World! ]
To test with the debugger, you can also switch off the debugger within the function for runtime-intensive functions.

Code: Select all

Procedure foo()
  DisableDebugger
  ; Do any
  ProcedureReturn 1
  EnableDebugger
EndProcedure
Last edited by mk-soft on Mon Aug 29, 2022 12:32 pm, edited 1 time in total.
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
User avatar
jacdelad
Addict
Addict
Posts: 1993
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Remove double spaces

Post by jacdelad »

mk-soft wrote: Mon Aug 29, 2022 11:42 am
jacdelad wrote: Mon Aug 29, 2022 7:10 am I also use FindString, adapted to the need of the inputstring:
...
39.7 vs. 2.9 seconds.
Disable Debugger :!:
TrimChars
1524
[ Hello World! ]
TrimCharsShort
681
[ Hello World! ]
TrimChars_
1270
[ Hello World! ]
StripChar
1190
[Hello World! ]
To test with the debugger, you can also switch off the debugger within the function for runtime-intensive functions.

Code: Select all

Procedure foo()
  DisableDebugger
  ; Do any
  ProcedureReturn 1
  EnableDebugger
EndProcedure
...but my debugger was disabled, see the code I posted.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
Demivec
Addict
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Re: Remove double spaces

Post by Demivec »

AZJIO wrote: Mon Aug 29, 2022 4:29 am I also had a variant (see below), but in it each word received a pointer. In the previous version, only a double space caused a pointer to be obtained. That is, the entire text, if there are no double spaces in it, then it was one pointer, so for the sake of speed I developed the previous version. But if you rewrite to the same line, then each character will have to be copied into memory byte by byte, I don’t know how fast it works. I liked listing bytes and now I try to use this method everywhere.

Code: Select all

EnableExplicit

Procedure.s StripChar(String$, TrimChar$ = #CRLF$ + #TAB$ + #FF$ + #VT$ + " ")
	Protected i, j, SavePointer = 1
	Protected *memChar, *c.Character, *jc.Character
	Protected NewList Parts.i()
	Protected tmp$ = ""


	If Not Asc(String$)
		ProcedureReturn ""
	EndIf

	*c.Character = @String$
	*memChar = @TrimChar$

	While *c\c
		*jc.Character = *memChar

		While *jc\c
			If *c\c = *jc\c
				SavePointer = 1
				*c\c = 0
				Break
			EndIf
			*jc + SizeOf(Character)
		Wend

		If *c\c And SavePointer
			SavePointer = 0
			If AddElement(Parts())
				Parts() = *c
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend

	ForEach Parts()
		tmp$ + PeekS(Parts()) + " "
	Next
	ProcedureReturn tmp$
EndProcedure


Define String$

String$ = "       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
Debug "|" + StripChar(String$) + "|"
String$ = "h w t b o a n e n e r"
Debug "|" + StripChar(String$) + "|"
Here is a faster modified version of mk-soft's version that now removes the string functions for concatenation:

Code: Select all

Procedure.s TrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  *c.Character = @String
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          Result + ReplaceString
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      Result + Chr(*c\c)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

;modification of mk-soft's TrimChars() by Demivec
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c.Character = @String
  *findChar = @FindChar
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      FirstFound = #False
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  #doublingMax = 5
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    DisableDebugger
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    If doubling < 4
      PrintN("TrimChars")
      tick=ElapsedMilliseconds()
      For count=1 To 1000000
        output=TrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
      Next
      tick=ElapsedMilliseconds()-tick
      PrintN("Time: " + Str(tick))
      PrintN("Output length: " + Str(Len(output)))
      PrintN("[" + output + "]")
    EndIf
    
    PrintN("replaceTrimChars")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf
Here is the test output, notice the times increase for the modified version only double when the string length doubles while the unmodified version's times grow at a much faster rate as the string length grows . I only show execution times for the longer string lengths with the faster subroutine after it is evident that the unmodified version is no longer in the running:

Code: Select all

Input length: 42
TrimChars
Time: 1790
Output length: 14
[ Hello World! ]
replaceTrimChars
Time: 639
Output length: 14
[ Hello World! ]
------------------
Input length: 84
TrimChars
Time: 3686
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars
Time: 1075
Output length: 27
[ Hello World! Hello World! ]
------------------
Input length: 168
TrimChars
Time: 8781
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars
Time: 1920
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 336
TrimChars
Time: 24409
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars
Time: 3615
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 672
replaceTrimChars
Time: 6979
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
Input length: 1344
replaceTrimChars
Time: 13661
Output length: 417
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
------------------
When included in the test with the other procedures (in mk-soft's test code above) it compares well with mk-soft's TrimCharsShort() procedure since they are virtually identical but it still allows specifying a custom replace string:

Code: Select all

TrimChars
1787
[ Hello World! ]
replaceTrimChars
635
[ Hello World! ]
TrimCharsShort
614
[ Hello World! ]
TrimChars_
1262
[ Hello World! ]
StripChar
6729
[Hello World! ]
User avatar
mk-soft
Always Here
Always Here
Posts: 6207
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Remove double spaces

Post by mk-soft »

@Demives,
Works so well as long as you limit the ReplaceString to one character.
Otherwise it can lead to a memory access error.

Better so ..

Code: Select all

;-TOP by mk-soft, v1.03.1, 29.08.2022

Procedure.s TrimCharsToChar(String.s, ReplaceChar.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character, *tc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c = @String
  *tc = @ReplaceChar
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          If *tc\c
            *rc\c = *tc\c
            *rc + SizeOf(character)
          EndIf
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure

; ----

;modification of mk-soft's TrimChars() by Demivec
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c.Character = @String
  *findChar = @FindChar
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack (by mk-soft)
      ;FirstFound = #False
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  #doublingMax = 4
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    DisableDebugger
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    
    PrintN("TrimCharsToChar")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output=TrimCharsToChar(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
User avatar
Demivec
Addict
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Re: Remove double spaces

Post by Demivec »

mk-soft wrote: Mon Aug 29, 2022 4:12 pm @Demives,
Works so well as long as you limit the ReplaceString to one character.
Otherwise it can lead to a memory access error.

Better so ..
Thanks for catching that. Since I originally wanted to offer the ability to replace a single character or group of characters in the FindChar.s parameter with a string of one or more characters I have made one last modification to replaceTrimChars() to include this feature which could be done in the original TrimChar(). There is some time penalty to include this functionality but I think it is a nice option. The single character option is still covered nicely by TrimCharsToChar():

Code: Select all

;-TOP by mk-soft, v1.03.1, 29.08.2022

Procedure.s TrimCharsToChar(String.s, ReplaceChar.s = " ", FindChar.s = " ")
  Protected Result.s, FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character, *tc.character
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  Result = Space(Len(String))
  *rc = @Result
  *c = @String
  *tc = @ReplaceChar
  *findChar = @FindChar
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          If *tc\c
            *rc\c = *tc\c
            *rc + SizeOf(character)
          EndIf
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Left(Result, (*rc - @Result) / SizeOf(character))
EndProcedure

; ----

;modification of mk-soft's TrimChars() by Demivec
;Replaces all groups of characters in FindChar with the entire ReplaceString.
Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  Protected FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  Protected len_String, len_ReplaceString, len_Result, toggle, maxLen_Result
  Dim Result.s(1)
  
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  len_String = Len(String) * 2
  maxLen_Result = len_String
  toggle = 0
  Result(toggle) = Space(maxLen_Result)
  *rc = @Result(toggle)
  *c.Character = @String
  *findChar = @FindChar
  len_ReplaceString = Len(ReplaceString)
  len_Result = 0
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          
          If len_Result + len_ReplaceString > maxLen_Result
            toggle ! 1 ;toggle now indexes alternate string
            maxLen_Result + len_String
            Result(toggle) = Space(maxLen_Result)
            CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
            *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
          EndIf
            
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf ; <- Small speed hack (by mk-soft)
      
      If len_Result + SizeOf(character) > maxLen_Result
        toggle ! 1 ;toggle now indexes alternate string
        maxLen_Result + len_String
        Result(toggle) = Space(maxLen_Result)
        CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
        *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
      EndIf
      
      *rc\c = *c\c
      *rc + SizeOf(character)
      len_Result + 1
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result(toggle)
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  #doublingMax = 4
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    DisableDebugger
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    
    PrintN("TrimCharsToChar")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output=TrimCharsToChar(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with single character ' '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with string ' <-----> '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf
output:

Code: Select all

Input length: 42
TrimCharsToChar
Time: 733
Output length: 14
[ Hello World! ]
replaceTrimChars with single character ' '
Time: 746
Output length: 14
[ Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 809
Output length: 38
[ <-----> Hello <-----> World! <-----> ]
------------------
Input length: 84
TrimCharsToChar
Time: 1071
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 1229
Output length: 27
[ Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 1323
Output length: 67
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 168
TrimCharsToChar
Time: 1917
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 2171
Output length: 53
[ Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 2389
Output length: 125
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 336
TrimCharsToChar
Time: 3579
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 4031
Output length: 105
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 4482
Output length: 241
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
Input length: 672
TrimCharsToChar
Time: 6862
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with single character ' '
Time: 7782
Output length: 209
[ Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! Hello World! ]
replaceTrimChars with string ' <-----> '
Time: 8431
Output length: 473
[ <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> Hello <-----> World! <-----> ]
------------------
User avatar
mk-soft
Always Here
Always Here
Posts: 6207
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Remove double spaces

Post by mk-soft »

Actually, further optimisation does not bring anything more. Only code style and the last nano-seconds.

So here is my last version and thanks to Demives for the ideas.

Update v1.03.3

Code: Select all

;-TOP by mk-soft, v1.03.3, 30.08.2022

Procedure.s TrimCharsToString(String.s, ReplaceString.s = " ", FindChars.s = " ")
  DisableDebugger
  
  Protected result.s, firstFound, found, size_max, size_replace
  Protected *result, *findChars, *c.Character, *fc.Character, *rc.character, *tc.character, *ofs
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  size_max = StringByteLength(String) + SizeOf(Character)
  size_replace = StringByteLength(ReplaceString)
  
  *result = AllocateMemory(size_max + size_replace * 4, #PB_Memory_NoClear)
  *rc = *result
  *c = @String
  *tc = @ReplaceString
  *findChars = @FindChars
  
  While *c\c
    *fc = *findChars
    While *fc\c
      If *c\c = *fc\c
        If Not firstFound
          firstFound = #True
          If *tc\c
            CopyMemory(*tc, *rc, size_replace)
            *rc + size_replace
            If *rc - *result >= size_max
              *ofs = *rc - *result
              size_max = MemorySize(*result)
              *result = ReAllocateMemory(*result, size_max + size_replace * 4, #PB_Memory_NoClear) 
              *rc = *result + *ofs
            EndIf
          EndIf
        EndIf
        found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not found
      If firstFound : firstFound = #False : EndIf
      *rc\c = *c\c
      *rc + SizeOf(character)
    Else
      found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  *rc\c = 0
  result = PeekS(*result)
  FreeMemory(*result)
  ProcedureReturn result
  
  EnableDebugger
EndProcedure

;modification of mk-soft's TrimChars() by Demivec
;Replaces all groups of characters in FindChar with the entire ReplaceString.

Procedure.s replaceTrimChars(String.s, ReplaceString.s = " ", FindChar.s = " ")
  DisableDebugger
  
  Protected FirstFound, Found
  Protected *findChar, *c.Character, *fc.Character, *rc.character
  Protected len_String, len_ReplaceString, len_Result, toggle, maxLen_Result
  Dim Result.s(1)
  
  
  If Not Bool(String)
    ProcedureReturn ""
  EndIf
  
  len_String = Len(String) * 2
  maxLen_Result = len_String
  toggle = 0
  Result(toggle) = Space(maxLen_Result)
  *rc = @Result(toggle)
  *c.Character = @String
  *findChar = @FindChar
  len_ReplaceString = Len(ReplaceString)
  len_Result = 0
  CopyMemoryString(@"", @*rc)
  
  While *c\c
    *fc.Character = *findChar
    While *fc\c
      If *c\c = *fc\c
        If Not FirstFound
          FirstFound = #True
          
          If len_Result + len_ReplaceString > maxLen_Result
            toggle ! 1 ;toggle now indexes alternate string
            maxLen_Result + len_String
            Result(toggle) = Space(maxLen_Result)
            CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
            *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
          EndIf
            
          CopyMemoryString(@ReplaceString)
        EndIf
        Found = #True
        Break
      EndIf
      *fc + SizeOf(Character)
    Wend
    
    If Not Found
      If FirstFound : FirstFound = #False : EndIf
      
      If len_Result + SizeOf(character) > maxLen_Result
        toggle ! 1 ;toggle now indexes alternate string
        maxLen_Result + len_String
        Result(toggle) = Space(maxLen_Result)
        CopyMemory(@Result(toggle ! 1), @Result(toggle), len_Result)
        *rc = @Result(toggle) + (len_Result + 1) * SizeOf(character)
      EndIf
      
      *rc\c = *c\c
      *rc + SizeOf(character)
      len_Result + 1
    Else
      Found = #False
    EndIf
    *c + SizeOf(Character)
  Wend
  
  ProcedureReturn Result(toggle)
  
  EnableDebugger
EndProcedure

; ****

;-Test

CompilerIf #PB_Compiler_IsMainFile
  
  CompilerIf #PB_Compiler_Debugger
    CompilerWarning "For runtime test disable debugger!"
  CompilerEndIf
  
  #doublingMax = 2
  Define String$
  String$ ="       " + #CRLF$ + #CRLF$ + "    Hello      World!   " + #LFCR$ + #TAB$ + "    "
  
  For doubling = 0 To #doublingMax
     
    OpenConsole()
    Define output.s,count,tick.q
  
    
    PrintN("Input length: " + Str(Len(String$)))
    
    PrintN("TrimCharsToString with single character ' '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output=TrimCharsToString(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with single character ' '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("TrimCharsToString with string ' <-----> '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = TrimCharsToString(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    PrintN("replaceTrimChars with string ' <-----> '")
    tick=ElapsedMilliseconds()
    For count=1 To 1000000
      output = replaceTrimChars(String$, " <-----> ", #CR$ + #LF$ + #TAB$ + " ")
    Next
    tick=ElapsedMilliseconds()-tick
    PrintN("Time: " + Str(tick))
    PrintN("Output length: " + Str(Len(output)))
    PrintN("[" + output + "]")
    
    String$ + String$
    PrintN("------------------")
  Next
  Input()
  
CompilerEndIf
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
Post Reply