StringField
Posted: Mon Jun 17, 2024 6:00 pm
by interfind
Say I have a string with words separated by spaces, but the number of spaces between the words varies.
mystring="Mouse   car  Apple    four purebasic"
How can I separate the words with StringField?
Or is there a better solution?
Re: StringField
Posted: Mon Jun 17, 2024 6:51 pm
by spikey
Normalise the string first with ReplaceString, then you can use StringField as you planned.
Code: Select all
mystring.s = "Mouse   car  Apple    four purebasic"
; Collapse runs of spaces: replace two spaces with one until no double space is left
Repeat
  mystring = ReplaceString(mystring, "  ", " ")
Until FindString(mystring, "  ") = 0
Count = CountString(mystring, " ") + 1
For x = 1 To Count
  Debug StringField(mystring, x, " ")
Next x
Re: StringField
Posted: Mon Jun 17, 2024 7:29 pm
by mk-soft
A little more complicated with pointers, but faster.
TAB is also recognised
Update with descriptions
Update v1.02.1
- Added GetWordsToArray
- Change String$ to Pointer *String
Code: Select all
;-TOP by mk-soft, v1.02.1, 20.06.2024, LGPL
Procedure GetWordsToList(*String, List Result.s())
  Protected *chr.character, *pos1
  ClearList(Result())
  If *String
    ; Set pointer chr to first character
    *chr = *String
    ; Set pointer pos1 to same first character
    *pos1 = *chr
    Repeat
      Select *chr\c ; Get character from pointer
        Case 0 ; Zero is end of string
          If *chr > *pos1
            AddElement(Result())
            Result() = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
          EndIf
          Break
        Case ' ', #TAB, #LF, #CR ; Here add separators
          If *chr > *pos1
            AddElement(Result())
            Result() = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
          EndIf
          *pos1 = *chr + SizeOf(Character) ; Size of character is 2 (Unicode)
      EndSelect
      *chr + SizeOf(Character) ; Size of character is 2 (Unicode)
    ForEver
  EndIf
  ProcedureReturn ListSize(Result())
EndProcedure
Procedure GetWordsToArray(*String, Array Result.s(1))
  Protected *chr.character, *pos1, idx
  If *String
    ; Set pointer chr to first character
    *chr = *String
    If *chr\c
      ; Set pointer pos1 to same first character
      *pos1 = *chr
      Dim Result(7)
      idx = -1
      Repeat
        Select *chr\c ; Get character from pointer
          Case 0 ; Zero is end of string
            If *chr > *pos1
              idx + 1
              If ArraySize(Result()) <> idx
                ReDim Result(idx)
              EndIf
              Result(idx) = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
            Else
              If ArraySize(Result()) <> idx
                ReDim Result(idx)
              EndIf
            EndIf
            Break
          Case ' ', #TAB, #LF, #CR ; Here add separators
            If *chr > *pos1
              idx + 1
              If ArraySize(Result()) < idx
                ReDim Result(idx+8)
              EndIf
              Result(idx) = PeekS(*pos1, (*chr - *pos1) / SizeOf(Character)) ; Len is (pointer chr - pointer pos1) / 2 (Unicode)
            EndIf
            *pos1 = *chr + SizeOf(Character) ; Size of character is 2 (Unicode)
        EndSelect
        *chr + SizeOf(Character) ; Size of character is 2 (Unicode)
      ForEver
      ProcedureReturn idx + 1
    EndIf
  EndIf
  ProcedureReturn 0
EndProcedure
; ********
Define mystring.s
NewList r1.s()
mystring = " Mouse car Apple " + #LFCR$ + " four" + #TAB$ + " purebasic "
cnt = GetWordsToList(@mystring, r1())
Debug "Count: " + cnt
ForEach r1()
  Debug "[" + r1() + "]"
Next
Debug "-----------------"
Dim r2.s(0)
cnt = GetWordsToArray(@mystring, r2())
Debug "Count: " + cnt
For i = 0 To cnt - 1
  Debug "Index " + i +": [" + r2(i) + "]"
Next
Re: StringField
Posted: Tue Jun 18, 2024 8:20 pm
by AZJIO
Re: StringField
Posted: Wed Jun 19, 2024 9:00 pm
by Marc56us
Code: Select all
mystring$ = "Mouse   car  Apple    four purebasic"
CreateRegularExpression(0, "([^ ])+")
Dim Nb$(0)
Debug ExtractRegularExpression(0, mystring$, Nb$()) - 1
Re: StringField
Posted: Wed Jun 19, 2024 10:08 pm
by AZJIO
Bear in mind that using the regular expression library increases the file size by 150 KB.
Code: Select all
mystring$ = "Mouse   car  Apple    four purebasic"
CreateRegularExpression(0, "\S+")
Define Dim Nb$(0)
Debug ExtractRegularExpression(0, mystring$, Nb$())
For i = 0 To ArraySize(Nb$())
  Debug Nb$(i)
Next
Re: StringField
Posted: Thu Jun 20, 2024 3:19 pm
by interfind
But what if I want to split only where there are two or more spaces between the words?
My Car is Red__cool_yes_____the_house_is_blue___dog__food
Result should be:
My Car is Red
cool yes
the house is blue
dog
food
Re: StringField
Posted: Thu Jun 20, 2024 3:46 pm
by Axolotl
But how about trying it out for yourself .... (sorry, no offense meant).
It is just like the above examples ....
Code: Select all
EnableExplicit
;Global test$ = "My Car is Red__cool_yes_____the_house_is_blue___dog__food" ; LSet("", index, "_")
Global test$ = "My Car is Red  cool yes     the house is blue   dog  food" ; Space(index)
Global newSeparator$ = #CRLF$
Global result$
Global index
; not speed optimized, but working ....
result$ = test$
For index = 10 To 2 Step -1
  result$ = ReplaceString(result$, Space(index), newSeparator$)
Next index
Debug result$
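The loop above covers runs of 2 to 10 spaces. A minimal sketch for runs of any length, assuming the same test string (text$ is only an illustrative name): first shrink every run of three or more spaces down to exactly two, then replace each remaining double space with the new separator.
Code: Select all
; Sketch only, same input as above; text$ is an illustrative name
Define text$ = "My Car is Red  cool yes     the house is blue   dog  food"
; Shrink every run of 3 or more spaces until only double spaces are left
While FindString(text$, Space(3))
  text$ = ReplaceString(text$, Space(3), Space(2))
Wend
; Every remaining double space marks exactly one split point
text$ = ReplaceString(text$, Space(2), #CRLF$)
Debug text$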
Re: StringField
Posted: Thu Jun 20, 2024 4:24 pm
by Marc56us
Quick and dirty RegEx %) (you need RTrim to remove the trailing spaces)
Code: Select all
mystring$ = "My Car is Red  cool yes     the house is blue   dog  food"
If Not CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
  Debug "Bad RegEx :-/" : End
EndIf
Define Dim Nb$(0)
Debug ExtractRegularExpression(0, mystring$, Nb$())
For i = 0 To ArraySize(Nb$())
  Debug ">>>" + RTrim(Nb$(i), " ") + "<<<"
Next
Code: Select all
5
>>>My Car is Red<<<
>>>cool yes<<<
>>>the house is blue<<<
>>>dog<<<
>>>food<<<

Re: StringField
Posted: Thu Jun 20, 2024 6:07 pm
by interfind
@Axolotl:
Your code checks for runs of 10 spaces down to 2
and replaces each one it finds with #CRLF$.
It's a simple and very good solution.
@Marc56us:
This works as it should, but I can't fully understand it.
Can you explain what the trick is, please?
- - > CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
Re: StringField
Posted: Thu Jun 20, 2024 6:53 pm
by mk-soft
Update v1.02.1
- Added GetWordsToArray
- Change String$ to Pointer *String
Fast mode update. Since procedure string parameters are passed ByCopy, I have changed the parameter to a pointer to the string.
This means that the string is no longer copied unnecessarily.
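To illustrate the ByCopy point, here is a minimal sketch. CountSpaces is a hypothetical helper, not part of the code above. It receives the address of the string via @ and walks the characters in place, so the string itself is not copied for the call.
Code: Select all
; Sketch only: CountSpaces is a hypothetical helper, not part of the code above
Procedure CountSpaces(*String.Character)
  Protected n
  If *String
    While *String\c ; stop at the terminating zero
      If *String\c = ' '
        n + 1
      EndIf
      *String + SizeOf(Character) ; move to the next character
    Wend
  EndIf
  ProcedureReturn n
EndProcedure
Define text$ = "a b  c"
Debug CountSpaces(@text$) ; only the address is passed, the string is not copied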
Re: StringField
Posted: Thu Jun 20, 2024 8:15 pm
by SMaag
Here is my version: I used functions from my PureBasic Framework library.
Removing or keeping chars and Split or Join are frequent questions in the forum.
Code: Select all
Structure pChar ; virtual CHAR-ARRAY, used as Pointer to overlay on strings
a.a[0] ; fixed ARRAY Of CHAR Length 0
c.c[0]
EndStructure
#PbFw_STR_CHAR_TAB = 9 ; TAB
#PbFw_STR_CHAR_SPACE = 32 ; ' '
#PbFw_STR_CHAR_DoubleQuote = 34 ; "
#PbFw_STR_CHAR_SingleQuote = 39 ; '
Macro mac_RemoveTabsAndDoubleSpace_KeepChar()
If *pWrite <> *pRead ; if WritePosition <> ReadPosition
*pWrite\c = *pRead\c[0] ; Copy the Character from ReadPosition to WritePosition => compacting the String
EndIf
*pWrite + SizeOf(Character) ; set new Write-Position
EndMacro
Procedure.I RemoveTabsAndDoubleSpaceFast(*String)
; ============================================================================
; NAME: RemoveTabsAndDoubleSpaceFast
; DESC: Attention! This is a Pointer-Version! Be sure to call it with a
; DESC: correct String-Pointer
; DESC: Removes all TABs and all double SPACEs from the String directly
; DESC: in memory. The String will be shorter afterwards!
; VAR(*String) : Pointer to String
; RET.i: *String
; ============================================================================
Protected *pWrite.Character = *String
Protected *pRead.pChar = *String
If Not *String
ProcedureReturn
EndIf
; Trim leading TABs and Spaces
While *pRead\c[0]
If *pRead\c[0] = #PbFw_STR_CHAR_SPACE
ElseIf *pRead\c[0] = #PbFw_STR_CHAR_TAB
Else
Break
EndIf
*pRead + SizeOf(Character)
Wend
While *pRead\c[0] ; While Not NullChar
Select *pRead\c[0]
; If we check for the most probable Chars first, we speed up the operation
; because we minimize the number of checks to do!
Case #PbFw_STR_CHAR_TAB
If *pRead\c[1] = #PbFw_STR_CHAR_SPACE
; if NextChar = SPACE Then remove
ElseIf *pRead\c[1] = #PbFw_STR_CHAR_TAB
; if NextChar = TAB Then remove
Else
; if NextChar <> SPACE And NextChar <> TAB
*pRead\c[0] = #PbFw_STR_CHAR_SPACE ; Change TAB to SPACE
mac_RemoveTabsAndDoubleSpace_KeepChar() ; keep the Char
EndIf
Case #PbFw_STR_CHAR_SPACE
If *pRead\c[1] = #PbFw_STR_CHAR_SPACE
; if NextChar = SPACE Then remove
Else
mac_RemoveTabsAndDoubleSpace_KeepChar() ; keep the Char
EndIf
Default
mac_RemoveTabsAndDoubleSpace_KeepChar() ; local Macro _KeepChar()
EndSelect
*pRead + SizeOf(Character) ; Set Pointer to NextChar
Wend
; If *pWrite is not at the end of the original *String,
; we removed some chars and must write a 0-Termination as new EndOfString
If *pRead <> *pWrite
*pWrite\c = 0
EndIf
; Remove last Char if it is a SPACE => RightTrim
*pWrite - SizeOf(Character)
If *pWrite\c = #PbFw_STR_CHAR_SPACE
*pWrite\c = 0
EndIf
ProcedureReturn *String
EndProcedure
;- --------------------------------------------------
;- Split
;- --------------------------------------------------
#_ArrayRedimStep = 10 ; Redim-Step if Arraysize is to small
Prototype SplitToArray(Array Out.s(1), String$, Separator$)
Global SplitToArray.SplitToArray
Prototype SplitToList(List Out.s(), String$, Separator$, clrList= #True)
Global SplitToList.SplitToList
Procedure.i _SplitToArray(Array Out.s(1), *String, *Separator)
; ============================================================================
; NAME: SplitToArray
; DESC: Split a String into multiple Strings
; DESC:
; VAR(Out.s()) : Array to return the Substrings
; VAR(*String) : Pointer to String
; VAR(*Separator) : Pointer to multi Char Separator
; RET.i : No of Substrings
; ============================================================================
Protected *ptrString.Character = *String ; Pointer to String
Protected *ptrSeperator.Character = *Separator ; Pointer to Separator
Protected *Start.Character = *String ; Pointer to Start of SubString
Protected xEqual, lenSep, N, ASize, L
lenSep = MemoryStringLength(*Separator) ; Length of Separator
ASize = ArraySize(Out())
While *ptrString\c
; ----------------------------------------------------------------------
; Outer Loop: Stepping through *String
; ----------------------------------------------------------------------
If *ptrString\c = *ptrSeperator\c ; 1st Character of Seperator in String
; Debug "Equal : " + Chr(*ptrString\c)
xEqual =#True
While *ptrSeperator\c
; ------------------------------------------------------------------
; Inner Loop: Char by Char compare Separator with String
; ------------------------------------------------------------------
If *ptrString\c
If *ptrString\c <> *ptrSeperator\c
xEqual = #False
EndIf
Else
xEqual =#False ; Not Equal
Break ; Exit While
EndIf
*ptrSeperator + SizeOf(Character) ; NextChar Separator
*ptrString + SizeOf(Character) ; NextChar String
Wend
; If we found the complete Separator in String
If xEqual
; Length of the String from start up to separator
L = (*ptrString - *Start)/SizeOf(Character) - lenSep
Out(N) = PeekS(*Start, L)
*Start = *ptrString ; the New Startposition
; Debug "Start\c= " + Str(*Start\c) + " : " + Chr(*Start\c)
*ptrString - SizeOf(Character) ; go back 1 char to detect consecutive separators like ,,
N + 1
If ASize < N
ASize + #_ArrayRedimStep
ReDim Out(ASize)
EndIf
EndIf
EndIf
*ptrSeperator = *Separator ; Reset Pointer of Seperator to 1st Char
*ptrString + SizeOf(Character) ; NextChar in String
Wend
Out(N) = PeekS(*Start) ; Part after the last Separator
ProcedureReturn N+1 ; Number of Substrings
EndProcedure
SplitToArray = @_SplitToArray() ; Bind ProcedureAddress to Prototype
Procedure.i _SplitToList(List Out.s(), *String, *Separator, clrList= #True)
; ============================================================================
; NAME: SplitToList
; DESC: Split a String into multiple Strings
; DESC:
; VAR(Out.s()) : List to return the Substrings
; VAR(*String) : Pointer to String
; VAR(*Separator): Pointer to Separator String
; VAR(clrList) : #False: Append Splits to List; #True: Clear List first
; RET.i : No of Substrings
; ============================================================================
Protected *ptrString.Character = *String ; Pointer to String
Protected *ptrSeperator.Character = *Separator ; Pointer to Separator
Protected *Start.Character = *String ; Pointer to Start of SubString
Protected xEqual, lenSep, N, L
lenSep = MemoryStringLength(*Separator) ; Length of Separator
If clrList
ClearList(Out())
EndIf
While *ptrString\c
; ----------------------------------------------------------------------
; Outer Loop: Stepping through *String
; ----------------------------------------------------------------------
If *ptrString\c = *ptrSeperator\c ; 1st Character of Seperator in String
; Debug "Equal : " + Chr(*ptrString\c)
xEqual =#True
While *ptrSeperator\c
; ------------------------------------------------------------------
; Inner Loop: Char by Char compare Separator with String
; ------------------------------------------------------------------
If *ptrString\c
If *ptrString\c <> *ptrSeperator\c
xEqual = #False
EndIf
Else
xEqual =#False ; Not Equal
Break ; Exit While
EndIf
*ptrSeperator + SizeOf(Character) ; NextChar Separator
*ptrString + SizeOf(Character) ; NextChar String
Wend
; If we found the complete Separator in String
If xEqual
; Length of the String from Start up to Separator
L = (*ptrString - *Start)/SizeOf(Character) - lenSep
AddElement(Out())
Out() = PeekS(*Start, L)
*Start = *ptrString ; the New Startposition
; Debug "Start\c= " + Str(*Start\c) + " : " + Chr(*Start\c)
*ptrString - SizeOf(Character) ; go back 1 char to detect consecutive separators like ,,
N + 1
EndIf
EndIf
*ptrSeperator = *Separator ; Reset Pointer of Seperator to 1st Char
*ptrString + SizeOf(Character) ; NextChar in String
Wend
AddElement(Out())
Out() = PeekS(*Start) ; Part after the last Separator
ProcedureReturn N+1 ; Number of Substrings
EndProcedure
SplitToList = @_SplitToList() ; Bind ProcedureAddress to Prototype
;- --------------------------------------------------
;- TestCode
;- --------------------------------------------------
Global test$ = "My Car is Red  cool yes     the house is blue   dog  food" ; Space(index)
Debug test$
Debug "--- Now remove TABs and double Spaces1 ---"
RemoveTabsAndDoubleSpaceFast(@test$)
Debug test$
Define I, N
Debug ""
Debug "--- Now Split to Array ---"
Dim Words.s(0)
N = SplitToArray(Words(), test$, " ")
For I = 0 To N-1
Debug Str(I) + " : " + Words(I)
Next
Re: StringField
Posted: Fri Jun 21, 2024 1:25 am
by AZJIO
interfind wrote: Thu Jun 20, 2024 6:07 pm
but i can't full understand it.
RegExp
Re: StringField
Posted: Fri Jun 21, 2024 6:12 am
by Marc56us
interfind wrote: Thu Jun 20, 2024 6:07 pm
This works like it should, but i can't full understand it.
Can you explain what the Trick is, please.
- - > CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
"Think simple": a RegEx is built the way you would say it.
Code: Select all
(.+?)(?:[ ]{2,}|$)
(       Group (begin capture)
.       Any char
+       One or more
?       Until the next mask matches
)       End first group (end capture)
(       Begin next group
?:      Do not capture this group
[       Begin char list
        Space
]       End list
{2,}    2 or more
|       Or
$       End of line
)       End group
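Because the second group is non-capturing, the trailing spaces belong to the whole match but not to group 1. A minimal sketch, assuming you read group 1 with ExamineRegularExpression, NextRegularExpressionMatch and RegularExpressionGroup instead of ExtractRegularExpression, so RTrim should not be needed for this input:
Code: Select all
; Sketch only: reads capture group 1, so the separating spaces never reach the result
If CreateRegularExpression(0, "(.+?)(?:[ ]{2,}|$)")
  If ExamineRegularExpression(0, "My Car is Red  cool yes     the house is blue   dog  food")
    While NextRegularExpressionMatch(0)
      Debug ">>>" + RegularExpressionGroup(0, 1) + "<<<" ; group 1 = text before the gap
    Wend
  EndIf
  FreeRegularExpression(0)
Else
  Debug RegularExpressionError()
EndIf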

Re: StringField
Posted: Sat Jun 22, 2024 12:19 pm
by interfind
Here is a fast and small ASM solution. Give it a try.
Feedback is welcome!
Code: Select all
;Remove unnecessary spaces in a String
;Note: written for x86 (32-bit); ESI and EDI are used directly and are not saved here
Procedure RemoveSpaces(*string)
EnableASM
Protected string = *string
MOV esi, [p.v_string]
MOV edi, esi
!firstspaces:
LODSW
CMP al, " "
JE firstspaces
STOSW
!spaces:
LODSW
CMP al, " "
JNE hit
CMP byte [esi], " "
JE spaces
!hit:
STOSW
CMP al, 0
JNE spaces
SUB edi, 2
!lastspaces:
SUB edi, 2
CMP byte [edi], " "
JE lastspaces
MOV byte [edi+2], 0
DisableASM
EndProcedure
Define string.s= "   1  This   was a  space    test  --->     <---   "
Debug "String with spaces: " + string
Debug "Length: " + Len(string)
RemoveSpaces(@string)
Debug ""
Debug "String without unnecessary spaces: " + string
Debug "Length: " + Len(string)
Debug ""
count = CountString(string, " ")
For a=1 To count+1
Debug "Index " + a + ": " + StringField(string, a, " ")
Next a
End