StringField

interfind
User
Posts: 26
Joined: Thu Apr 22, 2021 1:41 pm

Post by interfind »

Say I have a string with words separated by spaces,
but the number of spaces between the words varies.

mystring="Mouse  car   Apple    four     purebasic"

How can I separate the words with StringField?
Or is there a better solution?
spikey
Enthusiast
Posts: 769
Joined: Wed Sep 22, 2010 1:17 pm
Location: United Kingdom

Re: StringField

Post by spikey »

Normalise the string first by collapsing every run of spaces to a single space with ReplaceString; then you can use StringField as you planned.

Code: Select all

mystring.S ="Mouse  car   Apple    four     purebasic"

Repeat
  mystring = ReplaceString(mystring, "  ", " ")
Until FindString(mystring, "  ") = 0

Count = CountString(mystring, " ") + 1 
For x = 1 To Count
  Debug StringField(mystring, x, " ")
Next x
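
A possible variation on the loop above, as a minimal sketch: it assumes the input may also carry leading or trailing spaces (the test string here is made up for illustration), which would otherwise leave an empty first or last field. Trim() is applied before the runs are collapsed.

```purebasic
EnableExplicit

; Hypothetical test string with leading and trailing spaces
Define mystring.s = "  Mouse  car   Apple    four     purebasic  "
Define count, x

; Remove leading and trailing spaces so no empty first/last field appears
mystring = Trim(mystring)

; Collapse every run of spaces to a single space
While FindString(mystring, "  ")
  mystring = ReplaceString(mystring, "  ", " ")
Wend

; Skip the loop entirely when nothing is left after trimming
If mystring <> ""
  count = CountString(mystring, " ") + 1
  For x = 1 To count
    Debug StringField(mystring, x, " ")
  Next x
EndIf
```

The guard on the empty string is a design choice: without it, a string made only of spaces would still report one (empty) field.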
mk-soft
Always Here
Posts: 6244
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: StringField

Post by mk-soft »

A little more complicated, using pointers, but faster.
TAB, LF and CR are also recognised as separators.

Updated with descriptive comments.

Update v1.02.1
- Added GetWordsToArray
- Change String$ to Pointer *String

Code: Select all

;-TOP by mk-soft, v1.02.1, 20.06.2024, LGPL

Procedure GetWordsToList(*String, List Result.s())
  Protected *chr.character, *pos1
  
  ClearList(Result())
  If *String
    ; Set pointer chr to first character
    *chr = *String
    ; Set pointer pos1 to same first character
    *pos1 = *chr
    Repeat
      Select *chr\c ; Get character from pointer
        Case 0 ; Zero is end of string
          If *chr > *pos1
            AddElement(Result())
            Result() = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
          EndIf
          Break
        Case ' ', #TAB, #LF, #CR ; Here add separators
          If *chr > *pos1
            AddElement(Result())
            Result() = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
          EndIf
          *pos1 = *chr + SizeOf(Character) ; Size of character is 2 (Unicode)
      EndSelect
      *chr + SizeOf(Character) ; Size of character is 2 (Unicode)    
    ForEver
  EndIf
  ProcedureReturn ListSize(Result())
EndProcedure

Procedure GetWordsToArray(*String, Array Result.s(1))
  Protected *chr.character, *pos1, idx
  
  If *String
    ; Set pointer chr to first character
    *chr = *String
    If *chr\c
      ; Set pointer pos1 to same first character
      *pos1 = *chr
      Dim Result(7)
      idx = -1
      Repeat
        Select *chr\c ; Get character from pointer
          Case 0 ; Zero is end of string
            If *chr > *pos1
              idx + 1
              If ArraySize(Result()) <> idx
                ReDim Result(idx)
              EndIf
              Result(idx) = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
            Else
              ; Skip the ReDim when no word was found (idx = -1)
              If idx >= 0 And ArraySize(Result()) <> idx
                ReDim Result(idx)
              EndIf
            EndIf
            Break
          Case ' ', #TAB, #LF, #CR ; Here add separators
            If *chr > *pos1
              idx + 1
              If ArraySize(Result()) < idx
                ReDim Result(idx+8)
              EndIf
              Result(idx) = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
            EndIf
            *pos1 = *chr + SizeOf(Character) ; Size of character is 2 (Unicode)
        EndSelect
        *chr + SizeOf(Character) ; Size of character is 2 (Unicode)    
      ForEver
      ProcedureReturn idx + 1
    EndIf
  EndIf
  ProcedureReturn 0
EndProcedure

; ********

Define mystring.s


NewList r1.s()

mystring = " Mouse car  Apple " + #LFCR$ + "  four" + #TAB$ + " purebasic  "

cnt = GetWordsToList(@mystring, r1())
Debug "Count: " + cnt
ForEach r1()
  Debug "[" + r1() + "]"
Next

Debug "-----------------"

Dim r2.s(0)

cnt = GetWordsToArray(@mystring, r2())
Debug "Count: " + cnt
For i = 0 To cnt - 1
  Debug "Index " + i +": [" + r2(i) + "]"
Next
Last edited by mk-soft on Thu Jun 20, 2024 6:48 pm, edited 2 times in total.
Marc56us
Addict
Posts: 1600
Joined: Sat Feb 08, 2014 3:26 pm

Re: StringField

Post by Marc56us »

Code: Select all

mystring$ ="Mouse  car   Apple    four     purebasic"

CreateRegularExpression(0, "([^ ])+")

Dim Nb$(0)

Debug ExtractRegularExpression(0, mystring$, Nb$()) - 1
AZJIO
Addict
Posts: 2188
Joined: Sun May 14, 2017 1:48 am

Re: StringField

Post by AZJIO »

Keep in mind that using regular expressions increases the executable size by about 150 KB.

Code: Select all

mystring$ ="Mouse  car   Apple    four     purebasic"

CreateRegularExpression(0, "\S+")

Define Dim Nb$(0)

Debug ExtractRegularExpression(0, mystring$, Nb$())
For i = 0 To ArraySize(Nb$())
	Debug Nb$(i)
Next
interfind
User
Posts: 26
Joined: Thu Apr 22, 2021 1:41 pm

Re: StringField

Post by interfind »

But what if I want to split only where two or more spaces separate the words?

My Car is Red__cool_yes_____the_house_is_blue___dog__food

Result should be:

My Car is Red
cool yes
the house is blue
dog
food
Axolotl
Addict
Posts: 835
Joined: Wed Dec 31, 2008 3:36 pm

Re: StringField

Post by Axolotl »

But how about trying it out for yourself .... (sorry, no offense meant).

It is just like the above examples ....

Code: Select all

EnableExplicit 

;Global test$ = "My Car is Red__cool_yes_____the_house_is_blue___dog__food"   ; LSet("", index, "_")  
Global test$ = "My Car is Red  cool yes     the house is blue   dog  food"   ; Space(index) 
Global newSeparator$ = #CRLF$ 
Global result$ 
Global index 

; not speed optimized, but working .... 
result$ = test$ 
For index = 10 To 2 Step -1 
  result$ = ReplaceString(result$, Space(index), newSeparator$)  
Next index 

Debug result$ 
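
Note that the loop above only covers runs of up to 10 spaces. As a hedged alternative sketch, a regular expression can replace a run of any length in one pass; this assumes regular expression #0 is free and that the extra size of the regular expression library is acceptable.

```purebasic
EnableExplicit

Define test$ = "My Car is Red  cool yes     the house is blue   dog  food"
Define result$

; " {2,}" matches any run of two or more spaces, whatever its length
If CreateRegularExpression(0, " {2,}")
  result$ = ReplaceRegularExpression(0, test$, #CRLF$)
  FreeRegularExpression(0)
  Debug result$
Else
  ; The pattern failed to compile; show the reason
  Debug RegularExpressionError()
EndIf
```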
Marc56us
Addict
Posts: 1600
Joined: Sat Feb 08, 2014 3:26 pm

Re: StringField

Post by Marc56us »

Quick and dirty RegEx %) (RTrim is needed to remove the trailing spaces from each match)

Code: Select all

mystring$ ="My Car is Red  cool yes     the house is blue   dog  food"
If Not CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
    Debug "Bad RegEx :-/" : End
EndIf
Define Dim Nb$(0)
Debug ExtractRegularExpression(0, mystring$, Nb$())
For i = 0 To ArraySize(Nb$())
    Debug ">>>" + RTrim(Nb$(i), " ") + "<<<"
Next

Code: Select all

5
>>>My Car is Red<<<
>>>cool yes<<<
>>>the house is blue<<<
>>>dog<<<
>>>food<<<
:D
interfind
User
Posts: 26
Joined: Thu Apr 22, 2021 1:41 pm

Re: StringField

Post by interfind »

@Axolotl:
Your code checks for runs of 10 spaces down to 2
and replaces each one found with #CRLF$.
It's a simple and very good solution.

@marc56us:
This works as it should, but I can't fully understand it.
Can you explain what the trick is, please?
- - > CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
mk-soft
Always Here
Posts: 6244
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: StringField

Post by mk-soft »

Update v1.02.1
- Added GetWordsToArray
- Change String$ to Pointer *String

Fast mode update: since procedure string parameters are passed by copy, I have changed the parameter to a pointer to the string.
This means that the string is no longer copied unnecessarily.

The updated code is in my earlier post above.
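
As a minimal sketch of the pass-by-pointer idea: a procedure that takes a pointer reads the caller's string in place, and the caller passes the address with @. The procedure name CountChars and the test string are hypothetical and not part of the code in this thread.

```purebasic
EnableExplicit

; Hypothetical helper: counts characters by walking the caller's string in place
Procedure.i CountChars(*String.Character)
  Protected n
  If *String
    While *String\c                  ; stop at the terminating zero
      n + 1
      *String + SizeOf(Character)    ; advance the local pointer by one character
    Wend
  EndIf
  ProcedureReturn n
EndProcedure

Define text$ = "Mouse car"
Debug CountChars(@text$)   ; pass the address of the string, not a copy
```

Only the pointer value is copied into the procedure, so advancing it inside the loop does not affect the caller's variable.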
SMaag
Enthusiast
Posts: 324
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: StringField

Post by SMaag »

Here is my version: I used functions from my PureBasic Framework library.
Removing or keeping chars and Split or Join are frequent questions in the forum.

Code: Select all

  Structure pChar   ; virtual CHAR-ARRAY, used as Pointer to overlay on strings 
    a.a[0]          ; fixed ARRAY Of CHAR Length 0
    c.c[0]          
  EndStructure

  #PbFw_STR_CHAR_TAB         =  9    ; TAB
  #PbFw_STR_CHAR_SPACE       = 32    ; ' '
  #PbFw_STR_CHAR_DoubleQuote = 34    ; "
  #PbFw_STR_CHAR_SingleQuote = 39    ; '
  

  Macro mac_RemoveTabsAndDoubleSpace_KeepChar()
  	If *pWrite <> *pRead          ; if  WritePosition <> ReadPosition
  		*pWrite\c = *pRead\c[0]     ; Copy the Character from ReadPosition to WritePosition => compacting the String
  	EndIf
  	*pWrite + SizeOf(Character)   ; set new Write-Position 
  EndMacro
  
  Procedure.I RemoveTabsAndDoubleSpaceFast(*String)
    ; ============================================================================
    ; NAME: RemoveTabsAndDoubleSpaceFast
    ; DESC: Attention! This is a Pointer-Version! Be sure to call it with a
    ; DESC: correct String-Pointer
    ; DESC: Removes all TABs and all double SPACEs from the String directly
    ; DESC: in memory. The String will be shorter afterwards!
    ; VAR(*String) : Pointer to String
    ; RET.i: *String 
    ; ============================================================================
    
    Protected *pWrite.Character = *String
    Protected *pRead.pChar = *String
         	  
    If Not *String
      ProcedureReturn
    EndIf
    
    ; Trim leading TABs and Spaces
    While *pRead\c[0]
      If *pRead\c[0] = #PbFw_STR_CHAR_SPACE      
      ElseIf *pRead\c[0] = #PbFw_STR_CHAR_TAB       
      Else
         Break
      EndIf    
      *pRead + SizeOf(Character)
    Wend
    
  	While *pRead\c[0]     ; While Not NullChar
  	  
  	  Select *pRead\c[0]
  	       	      
        ; If we check for the most probable Chars first, we speed up the operation
        ; because we minimize the number of checks to do!
        Case #PbFw_STR_CHAR_TAB
          
          If *pRead\c[1] = #PbFw_STR_CHAR_SPACE        
            ; if NextChar = SPACE Then remove   
          ElseIf *pRead\c[1] = #PbFw_STR_CHAR_TAB
            ; if NextChar = TAB Then remove   
          Else
            ; if NextChar <> SPACE And NextChar <> TAB   
            *pRead\c[0] = #PbFw_STR_CHAR_SPACE    ; Change TAB to SPACE
            mac_RemoveTabsAndDoubleSpace_KeepChar()   ; keep the Char
          EndIf
            
        Case #PbFw_STR_CHAR_SPACE
          
          If *pRead\c[1] = #PbFw_STR_CHAR_SPACE        
           ; if NextChar = SPACE Then remove   
          Else
            mac_RemoveTabsAndDoubleSpace_KeepChar()   ; keep the Char
          EndIf          
          
        Default
          mac_RemoveTabsAndDoubleSpace_KeepChar()		; local Macro _KeepChar()
         
      EndSelect
      
      *pRead + SizeOf(Character) ; Set Pointer to NextChar
  		
    Wend
  	
    ; If *pWrite is not at the end of the original *String,
    ; we removed some chars and must write a 0-termination as the new end of string
    If *pRead <> *pWrite
      *pWrite\c = 0
    EndIf

    ; Remove the last char if it is a SPACE => RightTrim
    ; (only if something was written, so we never read before the start of the string)
    If *pWrite > *String
      *pWrite - SizeOf(Character)
      If *pWrite\c = #PbFw_STR_CHAR_SPACE
        *pWrite\c = 0
      EndIf
    EndIf
 	  ProcedureReturn *String
  EndProcedure
  
 ;- --------------------------------------------------
 ;-  Split
 ;- -------------------------------------------------- 
  
  #_ArrayRedimStep = 10                   ; Redim-Step if Arraysize is to small

  Prototype SplitToArray(Array Out.s(1), String$, Separator$)
  Global SplitToArray.SplitToArray
  
  Prototype SplitToList(List Out.s(), String$, Separator$, clrList= #True)
  Global SplitToList.SplitToList

  Procedure.i _SplitToArray(Array Out.s(1), *String, *Separator)
   ; ============================================================================
    ; NAME: SplitToArray
    ; DESC: Split a String into multiple Strings
    ; DESC: 
    ; VAR(Out.s()) : Array to return the Substrings 
    ; VAR(*String) : Pointer to String 
    ; VAR(*Separator) : Pointer to multi Char Separator 
    ; RET.i : No of Substrings
    ; ============================================================================
    
    Protected *ptrString.Character = *String          ; Pointer to String
    Protected *ptrSeperator.Character = *Separator    ; Pointer to Separator
    Protected *Start.Character = *String              ; Pointer to Start of SubString    
    Protected xEqual, lenSep, N, ASize, L
      
    lenSep = MemoryStringLength(*Separator)           ; Length of Separator
     
    ASize = ArraySize(Out())
     
    While *ptrString\c
    ; ----------------------------------------------------------------------
    ;  Outer Loop: Stepping through *String
    ; ----------------------------------------------------------------------
      
      If  *ptrString\c = *ptrSeperator\c ; 1st Character of Seperator in String   
        ; Debug "Equal : " +  Chr(*ptrString\c)
        
        xEqual =#True
        
        While *ptrSeperator\c
        ; ------------------------------------------------------------------
        ;  Inner Loop: Char by Char compare Separator with String
        ; ------------------------------------------------------------------
          If *ptrString\c
            If *ptrString\c <> *ptrSeperator\c
              xEqual = #False
            EndIf
          Else 
            xEqual =#False        ; Not Equal
            Break                 ; Exit While
          EndIf
          *ptrSeperator + SizeOf(Character)  ; NextChar Separator
          *ptrString + SizeOf(Character)     ; NextChar String  
        Wend
        
        ; If we found the complete Separator in String
        If xEqual
          ; Length of the String from start up to separator
          L =  (*ptrString - *Start)/SizeOf(Character) - lenSep 
          Out(N) = PeekS(*Start, L)
          *Start = *ptrString             ; the New Startposition
          ; Debug "Start\c= " + Str(*Start\c) + " : " + Chr(*Start\c)
          *ptrString - SizeOf(Character)  ; go back 1 char to detect consecutive separators like ,,
           N + 1   
           If ASize < N
             ASize + #_ArrayRedimStep
             ReDim Out(ASize)
           EndIf      
        EndIf
        
      EndIf   
      *ptrSeperator = *Separator            ; Reset Pointer of Seperator to 1st Char
      *ptrString + SizeOf(Character)        ; NextChar in String
    Wend
   
    Out(N) = PeekS(*Start)  ; Part after the last Separator
    ProcedureReturn N+1     ; Number of Substrings
        
  EndProcedure
  SplitToArray = @_SplitToArray()   ; Bind ProcedureAddress to Prototype
  
  Procedure.i _SplitToList(List Out.s(), *String, *Separator, clrList= #True)
   ; ============================================================================
    ; NAME: SplitToList
    ; DESC: Split a String into multiple Strings
    ; DESC: 
    ; VAR(Out.s())   : List to return the Substrings 
    ; VAR(*String)   : Pointer to String 
    ; VAR(*Separator): Pointer to Separator String 
    ; VAR(clrList)   : #False: Append Splits to List; #True: Clear List first
    ; RET.i          : No of Substrings
    ; ============================================================================
    
    Protected *ptrString.Character = *String          ; Pointer to String
    Protected *ptrSeperator.Character = *Separator    ; Pointer to Separator
    Protected *Start.Character = *String              ; Pointer to Start of SubString   
    Protected xEqual, lenSep, N, L
      
    lenSep = MemoryStringLength(*Separator)           ; Length of Separator
    
    If clrList
      ClearList(Out())  
    EndIf
    
    While *ptrString\c
    ; ----------------------------------------------------------------------
    ;  Outer Loop: Stepping through *String
    ; ----------------------------------------------------------------------
      
      If  *ptrString\c = *ptrSeperator\c ; 1st Character of Seperator in String   
        ; Debug "Equal : " +  Chr(*ptrString\c)
        xEqual =#True
       
        While *ptrSeperator\c
        ; ------------------------------------------------------------------
        ;  Inner Loop: Char by Char compare Separator with String
        ; ------------------------------------------------------------------
          If *ptrString\c 
            If *ptrString\c <> *ptrSeperator\c
              xEqual = #False
            EndIf
          Else 
            xEqual =#False        ; Not Equal
            Break                 ; Exit While
          EndIf
          *ptrSeperator + SizeOf(Character)  ; NextChar Separator
          *ptrString + SizeOf(Character)     ; NextChar String  
        Wend
        
        ; If we found the complete Separator in String
        If xEqual
          ; Length of the String from Start up to Separator
          L =  (*ptrString - *Start)/SizeOf(Character) - lenSep 
          AddElement(Out())
          Out() = PeekS(*Start, L)
          *Start = *ptrString             ; the New Startposition
          ; Debug "Start\c= " + Str(*Start\c) + " : " + Chr(*Start\c)
          *ptrString - SizeOf(Character)  ; go back 1 char to detect consecutive separators like ,,
           N + 1   
         EndIf
        
      EndIf   
      *ptrSeperator = *Separator            ; Reset Pointer of Seperator to 1st Char
      *ptrString + SizeOf(Character)        ; NextChar in String
    Wend
   
    AddElement(Out())
    Out() = PeekS(*Start)  ; Part after the last Separator
    ProcedureReturn N+1     ; Number of Substrings
        
  EndProcedure
  SplitToList = @_SplitToList()   ; Bind ProcedureAddress to Prototype
  
  ;- --------------------------------------------------
  ;- TestCode
  ;- --------------------------------------------------
  
  Global test$ = "My Car is Red  cool yes     the house is blue   dog  food"   ; Space(index) 
  
  
  Debug test$
  Debug "--- Now remove TABs and double Spaces1 ---"
  RemoveTabsAndDoubleSpaceFast(@test$)
  Debug test$
  
  Define I, N
  
  Debug ""
  Debug "--- Now Split to Array ---"

  Dim Words.s(0)
  
  N = SplitToArray(Words(), test$, " ")
  

  For I = 0 To N-1
    Debug Str(I) + " : " + Words(I)
  Next
  
AZJIO
Addict
Posts: 2188
Joined: Sun May 14, 2017 1:48 am

Re: StringField

Post by AZJIO »

interfind wrote: Thu Jun 20, 2024 6:07 pm but i can't full understand it.
RegExp
Marc56us
Addict
Posts: 1600
Joined: Sat Feb 08, 2014 3:26 pm

Re: StringField

Post by Marc56us »

interfind wrote: Thu Jun 20, 2024 6:07 pm This works like it should, but i can't full understand it.
Can you explain what the Trick is, please.
- - > CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
"Think simple": a RegEx is written the way we would say it out loud.

Code: Select all

(.+?)(?:[ ]{2,}|$)

(		Group (Begin capture)
.		Any char
+		One or more
?		Lazy: as few chars as possible, until the next part of the pattern matches
)		End first group (End capture)
(		Begin next group
?:		No capture this group
[		Begin chars list
		Space
]		End list
{2,}	2 or more
|		Or
$		End of line
)		End group
:wink:
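
To illustrate the capture group from the breakdown above, here is a hedged sketch that reads group 1 directly with ExamineRegularExpression, NextRegularExpressionMatch and RegularExpressionGroup, so the separator spaces should not need an RTrim. It assumes regular expression #0 is free and reuses the test string from earlier in the thread.

```purebasic
EnableExplicit

Define mystring$ = "My Car is Red  cool yes     the house is blue   dog  food"

If CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
  If ExamineRegularExpression(0, mystring$)
    While NextRegularExpressionMatch(0)
      ; Group 1 is the lazy (.+?) capture; the separator spaces sit in the
      ; non-capturing group and are therefore not part of it
      Debug ">>>" + RegularExpressionGroup(0, 1) + "<<<"
    Wend
  EndIf
  FreeRegularExpression(0)
Else
  ; The pattern failed to compile; show the reason
  Debug RegularExpressionError()
EndIf
```

A single trailing space at the very end of the input would still end up inside group 1, so trimming may remain useful for untidy input.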
interfind
User
Posts: 26
Joined: Thu Apr 22, 2021 1:41 pm

Re: StringField

Post by interfind »

Here is a fast and small ASM solution. Give it a try.

Feedback is welcome! :wink:

Code: Select all

;Remove unnecessary spaces in a String
;Note: uses the 32-bit registers esi/edi, so it is written for the x86 (32-bit) compiler

Procedure RemoveSpaces(*string)
  EnableASM
  Protected string = *string

    MOV esi, [p.v_string]
    MOV edi, esi

!firstspaces:
    LODSW
    CMP al, " "
    JE firstspaces
    STOSW
    
!spaces:
    LODSW
    CMP al, " "
    JNE hit
    CMP byte [esi], " "
    JE spaces

!hit:
    STOSW
    CMP al, 0
    JNE spaces
    SUB edi, 2

!lastspaces:
    SUB  edi, 2
    CMP byte [edi], " "
    JE lastspaces
    MOV byte [edi+2], 0

    DisableASM
EndProcedure


Define string.s= "   1    This   was       a     space  test --->       <---  "
Debug "String with spaces: " + string
Debug "Length: " + Len(string)
RemoveSpaces(@string)

Debug ""

Debug "String without unnecessary spaces: " + string
Debug "Length: " + Len(string)

Debug ""

count = CountString(string, " ")
For a=1 To count+1
  Debug "Index " + a + ": " + StringField(string, a, " ")
Next a

End