[PB4] Split() and Join() commands
Posted: Fri Apr 28, 2006 9:27 pm
by Flype
Code: Select all
; Split() functions, Purebasic 4.0+
Macro CountArray(array)
  ( PeekL( @array-8 ) )
EndMacro

Procedure.l SplitArray(array.s(1), text.s, separator.s = ",") ; String to Array
  Protected index.l, size.l = CountString(text, separator)
  ReDim array.s(size)
  For index = 0 To size
    array(index) = StringField(text, index + 1, separator)
  Next
  ProcedureReturn size
EndProcedure

Procedure.s JoinArray(array.s(1), separator.s = ",") ; Array to String
  Protected index.l, result.s, size.l = CountArray(array()) - 1
  For index = 0 To size
    result + array(index)
    If (index < size)
      result + separator
    EndIf
  Next
  ProcedureReturn result
EndProcedure

Procedure.l SplitList(list.s(), text.s, separator.s = ",") ; String to List
  Protected index.l, size.l = CountString(text, separator)
  For index = 0 To size
    If AddElement(list())
      list() = StringField(text, index + 1, separator)
    EndIf
  Next
  ProcedureReturn size
EndProcedure

Procedure.s JoinList(list.s(), separator.s = ",") ; List to String
  Protected result.s, size.l = CountList(list()) - 1
  ForEach list()
    result + list()
    If (ListIndex(list()) < size)
      result + separator
    EndIf
  Next
  ProcedureReturn result
EndProcedure

; Examples
string.s = "abc,defg,hi,jklmop,qrs,tuv,wxyz"

; string -> array -> string
Dim a.s(0)
size.l = SplitArray(a(), string, ",")
For i = 0 To size
  Debug a(i)
Next
Debug JoinArray(a())

; string -> list -> string
NewList b.s()
If SplitList(b(), string)
  ForEach b()
    Debug b()
  Next
EndIf
Debug JoinList(b())
;--
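Note for newer compilers: the parameter names array and list, and the PeekL( @array-8 ) trick, belong to the PB 4.0 era. Below is a minimal sketch of the same four helpers, assuming PureBasic 5.x or later, where Array and List are parameter keywords and ArraySize() / ListSize() report the size directly. The procedure names (SplitToArray, JoinFromArray, SplitToList, JoinFromList) are illustrative only and do not come from the original post.

```purebasic
; Sketch for PureBasic 5.x+ (assumption: Array/List parameter keywords,
; ArraySize() and ListSize() are available). All names are illustrative.

Procedure.i SplitToArray(Array parts.s(1), text.s, separator.s = ",")
  Protected index.i, last.i = CountString(text, separator)
  ReDim parts(last)
  For index = 0 To last
    parts(index) = StringField(text, index + 1, separator)
  Next
  ProcedureReturn last ; highest valid index, not the element count
EndProcedure

Procedure.s JoinFromArray(Array parts.s(1), separator.s = ",")
  Protected index.i, result.s, last.i = ArraySize(parts())
  For index = 0 To last
    result + parts(index)
    If index < last
      result + separator
    EndIf
  Next
  ProcedureReturn result
EndProcedure

Procedure.i SplitToList(List parts.s(), text.s, separator.s = ",")
  Protected index.i, last.i = CountString(text, separator)
  For index = 0 To last
    If AddElement(parts())
      parts() = StringField(text, index + 1, separator)
    EndIf
  Next
  ProcedureReturn last + 1 ; number of fields found
EndProcedure

Procedure.s JoinFromList(List parts.s(), separator.s = ",")
  Protected result.s, last.i = ListSize(parts()) - 1
  ForEach parts()
    result + parts()
    If ListIndex(parts()) < last
      result + separator
    EndIf
  Next
  ProcedureReturn result
EndProcedure

Dim a.s(0)
SplitToArray(a(), "abc,defg,hi")
Debug JoinFromArray(a(), "|")  ; abc|defg|hi

NewList b.s()
SplitToList(b(), "abc,defg,hi")
Debug JoinFromList(b(), "|")   ; abc|defg|hi
```

SplitToList returns the number of fields rather than the last index, so a string without any separator still counts as one field.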
Posted: Fri Apr 28, 2006 9:40 pm
by SCRJ
Cool, thanks for sharing.

Posted: Fri Apr 28, 2006 11:11 pm
by Flype
Another version, which follows the PHP syntax:
http://php.net/manual/en/function.implode.php
http://php.net/manual/en/function.explode.php
Code: Select all
;
; Join()/Split()
;
; Php Syntax:
;   implode.s( glue.s, pieces.s(1) )
;   explode.l( separator.s, string.s , limit.l = 0 )
;
; http://php.net/manual/en/function.implode.php
; http://php.net/manual/en/function.explode.php
;
Macro CountArray(array)
  ( PeekL( @array-8 ) )
EndMacro

Procedure.l explode_array(array.s(1), separator.s, string.s, limit.l = 0) ; String to Array
  Protected index.l, size.l = CountString(string, separator)
  If (limit > 0)
    size = limit - 1
  ElseIf (limit < 0)
    size + limit
  EndIf
  ReDim array.s(size)
  For index = 0 To size
    array(index) = StringField(string, index + 1, separator)
  Next
  ProcedureReturn size
EndProcedure

Procedure.s implode_array(glue.s, pieces.s(1)) ; Array to String
  Protected index.l, string.s, size.l = CountArray(pieces()) - 1
  For index = 0 To size
    string + pieces(index)
    If (index < size)
      string + glue
    EndIf
  Next
  ProcedureReturn string
EndProcedure

Procedure.l explode_list(list.s(), separator.s, string.s, limit.l = 0) ; String to List
  Protected index.l, size.l = CountString(string, separator)
  If (limit > 0)
    size = limit - 1
  ElseIf (limit < 0)
    size + limit
  EndIf
  For index = 0 To size
    If AddElement(list())
      list() = StringField(string, index + 1, separator)
    EndIf
  Next
  ProcedureReturn size
EndProcedure

Procedure.s implode_list(glue.s, pieces.s()) ; List to String
  Protected string.s, size.l = CountList(pieces()) - 1
  ForEach pieces()
    string + pieces()
    If (ListIndex(pieces()) < size)
      string + glue
    EndIf
  Next
  ProcedureReturn string
EndProcedure

; Examples
input.s = "feel the power of purebasic :-)"

; string -> array -> string
Dim a.s(0)
size.l = explode_array(a(), " ", input, -3)
For i = 0 To size
  Debug a(i)
Next
Debug implode_array(" =|= ", a())

; string -> list -> string
NewList b.s()
If explode_list(b(), " ", input, 5)
  ForEach b()
    Debug b()
  Next
EndIf
Debug "[" + implode_list("][", b()) + "]"
;--
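One difference from PHP is worth noting: with a positive limit, PHP's explode() returns at most limit elements and the last element holds the rest of the string, whereas the procedures above simply stop after limit fields. Here is a rough sketch of the PHP-like behaviour, assuming PureBasic 5.x syntax (List keyword, ListSize(), optional length in Mid()). ExplodeLimit is a hypothetical name, and only the positive-limit case is covered.

```purebasic
; Sketch only: ExplodeLimit is a hypothetical helper, not from the post above.
; Assumes PureBasic 5.x+ and a positive limit.
Procedure.i ExplodeLimit(List pieces.s(), separator.s, text.s, limit.i)
  Protected pos.i, start.i = 1, sepLen.i = Len(separator)
  ClearList(pieces())
  If sepLen = 0 Or limit < 1
    ProcedureReturn 0
  EndIf
  ; Cut off at most limit - 1 leading fields
  While ListSize(pieces()) < limit - 1
    pos = FindString(text, separator, start)
    If pos = 0
      Break ; no separator left
    EndIf
    AddElement(pieces())
    If pos > start
      pieces() = Mid(text, start, pos - start)
    EndIf
    start = pos + sepLen
  Wend
  ; The last element takes the remainder, separators included
  AddElement(pieces())
  pieces() = Mid(text, start)
  ProcedureReturn ListSize(pieces())
EndProcedure

NewList words.s()
ExplodeLimit(words(), " ", "feel the power of purebasic :-)", 3)
ForEach words()
  Debug words() ; "feel", then "the", then "power of purebasic :-)"
Next
```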
Posted: Fri Jun 30, 2006 1:47 pm
by Flype
For those who need a true PHP syntax,
we can 'patch' the built-in command 'StringField()' with a custom macro so that it accepts separators longer than one character.
So, add this code at the top of the source code above and it should work.
Code: Select all
Macro StringField(string, index, separator)
  StringFieldEx(string, index, separator)
EndMacro

Procedure.s StringFieldEx(string.s, index.l, sep.s)
  Protected i.l, pos.l, field.s, lSep.l = Len(sep)
  For i = 1 To index
    pos = FindString(string, sep, 1)
    If pos
      field = Left(string, pos - 1)
      string = Mid(string, pos + lSep, Len(string))
    Else
      field = string ; no separator left: the remainder is the last field
      string = ""    ; any field requested beyond the last one is empty
    EndIf
  Next
  ProcedureReturn field
EndProcedure
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 8:29 am
by wilbert
I know this is an old thread and there are multiple threads about splitting and joining, but I just wanted to share my attempt.
It supports both Unicode and ASCII and was tested with PB 5.11.
Code: Select all
Procedure.i Split(Array StringArray.s(1), StringToSplit.s, Separator.s = " ")
  Protected c = CountString(StringToSplit, Separator)
  Protected i, l = StringByteLength(Separator)
  Protected *p1.Character = @StringToSplit
  Protected *p2.Character = @Separator
  Protected *p = *p1
  ReDim StringArray(c)
  While i < c
    While *p1\c <> *p2\c
      *p1 + SizeOf(Character)
    Wend
    If CompareMemory(*p1, *p2, l)
      CompilerIf #PB_Compiler_Unicode
        StringArray(i) = PeekS(*p, (*p1 - *p) >> 1)
      CompilerElse
        StringArray(i) = PeekS(*p, *p1 - *p)
      CompilerEndIf
      *p1 + l
      *p = *p1
      i + 1
    Else
      *p1 + SizeOf(Character)
    EndIf
  Wend
  StringArray(c) = PeekS(*p)
  ProcedureReturn c
EndProcedure

Procedure.s Join(Array StringArray.s(1), Separator.s = "")
  Protected r.s, i, l, c = ArraySize(StringArray())
  While i <= c
    l + Len(StringArray(i))
    i + 1
  Wend
  r = Space(l + Len(Separator) * c)
  i = 1
  l = @r
  CopyMemoryString(@StringArray(0), @l)
  While i <= c
    CopyMemoryString(@Separator)
    CopyMemoryString(@StringArray(i))
    i + 1
  Wend
  ProcedureReturn r
EndProcedure

; *** test code ***
Dim A.s(0)
S.s = "... ++ This is a test string ++ used to test split and join ++ ..."
For i = 1 To 18
  S + S
Next
t1 = ElapsedMilliseconds()
Split(A(), S, "++")
t2 = ElapsedMilliseconds()
S = Join(A(), "*")
t3 = ElapsedMilliseconds()
MessageRequester("Test result", "Split:" + Str(t2-t1) + " Join:" + Str(t3-t2))
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 9:36 am
by Joris
Hi Wilbert,
I haven't looked closely at the code above, but I used Split and Join quite a lot in GB32, so I had already searched for equivalents in PB.
I found this one but haven't tested it yet:
ExtractRegularExpression(#RegularExpression, String$, Array$()) should be useful for splitting.
It should do the same as the PHP command explode:
http://php.net/manual/en/function.explode.php
I mention it because it isn't used yet in any of the sources above.
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 10:11 am
by wilbert
Joris wrote:I found this one but haven't tested it yet: ExtractRegularExpression(#RegularExpression, String$, Array$()) should be useful for splitting.
It looks interesting but I haven't got a clue what the proper expression would be to split a string.
Also I don't know how fast it is.
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 10:56 am
by Joris
wilbert wrote:Joris wrote:I found this one but haven't tested it yet: ExtractRegularExpression(#RegularExpression, String$, Array$()) should be useful for splitting.
It looks interesting but I haven't got a clue what the proper expression would be to split a string.
Also I don't know how fast it is.
Hm, good question. I got stuck on that...
Now, more on which kind of regular expressions can be used: there are many flavors, and I don't know whether they all share the same definitions. I thought Perl was the language with the most complete regular expression support; which flavor the regular expressions in PB are based on I don't know (we must ask or find out).
Here are some regular expressions I can use in my editor (UltraEdit):
% Matches the start of line - Indicates the search string must be at the beginning of a line but does not include any line terminator characters in the resulting string selected.
$ Matches the end of line - Indicates the search string must be at the end of line but does not include any line terminator characters in the resulting string selected.
? Matches any single character except newline.
* Matches any number of occurrences of any character except newline.
+ Matches one or more of the preceding character/expression. At least one occurrence of the character must be found. Does not match repeated newlines.
++ Matches the preceding character/expression zero or more times. Does not match repeated newlines.
^b Matches a page break.
^p Matches a newline (CR/LF) (paragraph) (DOS Files)
^r Matches a newline (CR Only) (paragraph) (MAC Files)
^n Matches a newline (LF Only) (paragraph) (UNIX Files)
^t Matches a tab character
[xyz] A character set. Matches any characters between brackets.
[~xyz] A negative character set. Matches any characters NOT between brackets including newline characters.
^{A^}^{B^} Matches expression A OR B
^ Overrides the following regular expression character
^(…^) Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression.
The corresponding replacement expression is ^x, for x in the range 1-9. Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would replace it with "folks hello".
I added a few more, so it may be easier to find the equivalents in PB (I already tried some like \w and ... not compatible):
Code: Select all
\ Indicates the next character has a special meaning. "n" on its own matches the character "n". "\n" matches a linefeed or newline character. See examples below (\d, \f, \n etc).
^ Matches/anchors the beginning of line.
$ Matches/anchors the end of line.
* Matches the preceding character zero or more times.
+ Matches the preceding character one or more times. Does not match repeated newlines.
. Matches any single character except a newline character. Does not match repeated newlines.
(expression) Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression.
The corresponding replacement expression is \x, for x in the range 1-9. Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would replace it with "folks hello".
[xyz] A character set. Matches any characters between brackets.
[^xyz] A negative character set. Matches any characters NOT between brackets including newline characters.
\d Matches a digit character. Equivalent to [0-9].
\D Matches a nondigit character. Equivalent to [^0-9].
\f Matches a form-feed character.
\n Matches a linefeed character.
\r Matches a carriage return character.
\s Matches any whitespace including space, tab, form-feed, etc but not newline.
\S Matches any non-whitespace character but not newline.
\t Matches a tab character.
\v Matches a vertical tab character.
\w Matches any word character including underscore.
\W Matches any nonword character.
\p Matches CR/LF (same as \r\n) to match a DOS line terminator.
Sometimes it takes a bit of searching to combine the right ones, but mostly they are very useful.
I just saw that the regular expressions for PB are explained here:
http://www.pcre.org/pcre.txt
Quite a bunch to explore...
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 2:42 pm
by Little John
Just a small remark:
PureBasic now has a built-in function called "SplitList()".
So for more clarity, maybe these functions should rather be called "SplitString()" and "JoinString()" or so.
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 2:53 pm
by Joris
Little John wrote:Just a small remark:
PureBasic now has a built-in function called "SplitList()".
So for more clarity, maybe these functions should rather be called "SplitString()" and "JoinString()" or so.
"Small remark", yeah, as SplitList has a completely different function: it splits a list at a certain element instead of on a 'regular condition'.
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 3:01 pm
by Little John
Joris wrote:SplitList has a completely different function
Yes, of course.
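To make the difference concrete, here is a small sketch of the built-in SplitList(), assuming a PureBasic version that ships it. As far as I know it moves the elements from the current element onwards into a second list and does not look at the contents at all. The list names are made up for this example.

```purebasic
; Assumption: a PureBasic version that provides the built-in SplitList().
NewList source.s()
NewList tail.s()

AddElement(source()) : source() = "abc"
AddElement(source()) : source() = "defg"
AddElement(source()) : source() = "hi"
AddElement(source()) : source() = "jklmop"

SelectElement(source(), 2)   ; index is zero-based, so "hi" becomes current
SplitList(source(), tail())  ; move "hi" and everything after it to tail()

Debug ListSize(source())     ; 2
Debug ListSize(tail())       ; 2
ForEach tail()
  Debug tail()               ; "hi", then "jklmop"
Next
```

So the name clash only matters for the string-to-list helper in the first post, which would need a different name on such versions.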
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 3:53 pm
by Joris
wilbert wrote:It looks interesting but I haven't got a clue what the proper expression would be to split a string.
Also I don't know how fast it is.
@Wilbert, in case you haven't noticed the link below, it has the solution (the speed I don't know yet):
http://www.purebasic.fr/english/viewtop ... 13&t=54089
So, to split a string into words:
Code: Select all
If CreateRegularExpression(0, "\w+")
  Dim Result$(0)
  NbFound = ExtractRegularExpression(0, "abC ABc zbA abc", Result$())
  Debug NbFound
  For k = 0 To NbFound-1
    Debug Result$(k)
  Next
Else
  Debug RegularExpressionError()
EndIf
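The same idea can be turned into a split on a given separator by matching everything that is not the separator. A minimal sketch, assuming a single-character separator (a comma here); the pattern "[^,]+" and the variable names are my own choice. Note that this skips empty fields, so "a,,b" gives two results rather than three.

```purebasic
; Sketch: split on commas by extracting every run of non-comma characters.
; Assumption: single-character separator; empty fields are dropped.
If CreateRegularExpression(0, "[^,]+")
  Dim Part$(0)
  Count = ExtractRegularExpression(0, "abc,defg,hi,jklmop", Part$())
  Debug Count            ; 4
  For k = 0 To Count - 1
    Debug Part$(k)       ; abc, defg, hi, jklmop
  Next
  FreeRegularExpression(0)
Else
  Debug RegularExpressionError()
EndIf
```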
Re: [PB4] Split() and Join() commands
Posted: Mon Mar 25, 2013 3:59 pm
by skywalk
RegularExpressions are 10 to 100 times slower than PB code.
Use them only where speed is not a concern.
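For anyone who wants to measure this on their own machine, here is a rough timing sketch. It assumes the debugger is switched off (timings with the debugger on are not meaningful); the loop count, test string and pattern are arbitrary choices, and the result will depend on the PB version and the input.

```purebasic
; Rough timing sketch, not a rigorous benchmark.
; Assumptions: debugger disabled; #Loops, the test string and the pattern are arbitrary.
#Loops = 100000

Define text.s = "abc,defg,hi,jklmop,qrs,tuv,wxyz"
Define i, k, n, t1, t2, t3
Dim ByField.s(0)
Dim ByRegex.s(0)

If CreateRegularExpression(0, "[^,]+")
  ; Plain PB code: CountString() + StringField()
  t1 = ElapsedMilliseconds()
  For i = 1 To #Loops
    n = CountString(text, ",")
    ReDim ByField(n)
    For k = 0 To n
      ByField(k) = StringField(text, k + 1, ",")
    Next
  Next
  t2 = ElapsedMilliseconds()

  ; Regular expression: one ExtractRegularExpression() call per split
  For i = 1 To #Loops
    ExtractRegularExpression(0, text, ByRegex())
  Next
  t3 = ElapsedMilliseconds()

  MessageRequester("Timing", "StringField: " + Str(t2 - t1) + " ms" + #LF$ + "Regex: " + Str(t3 - t2) + " ms")
  FreeRegularExpression(0)
EndIf
```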

Re: [PB4] Split() and Join() commands
Posted: Sun Jun 25, 2017 9:54 am
by GJ-68
@wilbert
You should test your Split function with a separator of more than one character, in a string where the first character of the separator also occurs without the rest of it.
Example: StringToSplit = "ABCxzDEFxyGHI", Separator = "xy"
Fixed version:
Code: Select all
Procedure.i Split(Array StringArray.s(1), StringToSplit.s, Separator.s = " ")
  Protected c = CountString(StringToSplit, Separator)
  Protected i, l = StringByteLength(Separator)
  Protected *p1.Character = @StringToSplit
  Protected *p2.Character = @Separator
  Protected *p = *p1
  ReDim StringArray(c)
  While i < c
    While *p1\c <> *p2\c
      *p1 + SizeOf(Character)
    Wend
    If CompareMemory(*p1, *p2, l)
      CompilerIf #PB_Compiler_Unicode
        StringArray(i) = PeekS(*p, (*p1 - *p) >> 1)
      CompilerElse
        StringArray(i) = PeekS(*p, *p1 - *p)
      CompilerEndIf
      *p1 + l
      *p = *p1
      i + 1
    Else
      *p1 + SizeOf(Character)
    EndIf
  Wend
  StringArray(c) = PeekS(*p)
  ProcedureReturn c
EndProcedure
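A simple way to cross-check the tricky case is to compare against a slow but obvious splitter built on FindString(). The sketch below is such a reference, assuming PureBasic 5.x syntax and a separator that does not overlap itself; SplitRef is a hypothetical name. For the example string, the lone "xz" must stay inside the first field.

```purebasic
; Reference splitter for cross-checking; SplitRef is a hypothetical name.
; Assumptions: PureBasic 5.x+, separator is not empty and does not overlap itself.
Procedure.i SplitRef(Array Parts.s(1), Text.s, Sep.s)
  Protected n.i = CountString(Text, Sep)
  Protected i.i, pos.i, start.i = 1
  ReDim Parts(n)
  For i = 0 To n - 1
    pos = FindString(Text, Sep, start)      ; next full occurrence of Sep
    Parts(i) = Mid(Text, start, pos - start)
    start = pos + Len(Sep)
  Next
  Parts(n) = Mid(Text, start)               ; remainder after the last separator
  ProcedureReturn n
EndProcedure

Dim R.s(0)
n = SplitRef(R(), "ABCxzDEFxyGHI", "xy")
Debug n     ; 1
Debug R(0)  ; ABCxzDEF
Debug R(1)  ; GHI
```

Running the same input through Split() should give the same two fields.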
Re: [PB4] Split() and Join() commands
Posted: Sun Jun 25, 2017 5:40 pm
by wilbert
GJ-68 wrote:@wilbert
You should test your Split function with a separator of more than one character, in a string where the first character of the separator also occurs without the rest of it.
Example: StringToSplit = "ABCxzDEFxyGHI", Separator = "xy"
Thanks for mentioning it.
I updated my code.
