String tokenisation

Kris_a
User
Posts: 92
Joined: Sun Feb 15, 2004 8:04 pm
Location: Manchester, UK


Post by Kris_a »

Code updated for 5.20+ (same as StringField())

I ported this Blitz function (that I made a while ago) to PB. It splits a string into several pieces (separated by a delimiter of your choice), then returns a particular one. Really useful for things like 'plain English' protocols (HTTP for example).

Pretty fast too (I hope). The test I included does 1000000 iterations in 2.1 seconds. Enjoy :D

Code:

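Procedure.s tok(txt.s, delim.s, index)
  ; Returns the index-th (1-based) piece of txt, split on delim.
  Protected start = 1, found, a
  Protected l = Len(delim)
  For a = 1 To index
    If a > 1
      If found = 0
        ProcedureReturn ""          ; txt has fewer than index pieces
      EndIf
      start = found + l             ; skip past the previous delimiter
    EndIf
    found = FindString(txt, delim, start)
  Next
  If found = 0
    ProcedureReturn Mid(txt, start) ; last piece: runs to the end of txt
  EndIf
  ProcedureReturn Mid(txt, start, found - start)
EndProcedure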

#NUMLOOPS = 1000000

st.s = ""

t1 = GetTickCount_()

For a = 1 To #NUMLOOPS
  st = tok("string tokeniser test", " ", 2)
Next

MessageRequester("Result", Str(#NUMLOOPS) + " in " + Str(GetTickCount_() - t1) + "ms", 0)

PS. This runs about 70% faster in PB than it does in Blitz Basic (BB) 8)
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Re: String tokenisation

Post by PB »

PureBasic has its own string parser, which gives faster results than yours. :)

Code:

#NUMLOOPS = 1000000

st.s = ""

t1 = GetTickCount_()

For a = 1 To #NUMLOOPS
  st = StringField("string tokeniser test", 2, " ")
Next

MessageRequester("Result", Str(#NUMLOOPS) + " in " + Str(GetTickCount_() - t1) + "ms", 0)
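
For the 'plain English' protocol case mentioned in the first post, a minimal sketch of how StringField() might be used on an HTTP request line. The request string and the variable names are made-up sample data for illustration, not something taken from a real client or server.

```purebasic
; Sketch: split an HTTP request line on single spaces with StringField().
; The request line below is hypothetical sample data.
request.s = "GET /index.html HTTP/1.1"

method.s  = StringField(request, 1, " ")   ; first field
path.s    = StringField(request, 2, " ")   ; second field
version.s = StringField(request, 3, " ")   ; third (last) field

Debug method
Debug path
Debug version
```

Run with the debugger enabled, this should list the three parts of the request line in order. It assumes the parts are separated by exactly one space each.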

Post by Kris_a »

oh damn :/

that's what I get for not reading the docs
Iria
User
Posts: 43
Joined: Sat Nov 29, 2003 8:49 pm

Yes but...

Post by Iria »

The PB function does not handle spaces very well though, i.e. if space is a delimiter and you have double spaces between text guess what ... you get null items parsed, so be wary.

Also just noticed that the forum preview gives me this :)

Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 69480 bytes) in /home/apache/p/ph/phpbb.myforums.net/includes/topic_review.php on line 95

WP the forum :)

Re: Yes but...

Post by PB »

> The PB function does not handle spaces very well though, i.e. if space is a
> delimiter and you have double spaces between text guess what ... you get
> null items parsed

That's to be expected, so it's not a bug or anything. The function simply
looks for every space and splits the string, so if you have double spaces
then naturally it'll split them up.
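
If the empty items are unwanted, one possible workaround is to walk over every field and skip the empty ones. The following is only a sketch under that assumption; the sample text and variable names are illustrative, and CountString() is used here just to work out how many fields there are.

```purebasic
; Sketch: list only the non-empty fields of a space-delimited string.
text.s = "string  tokeniser test"        ; note the double space

; Number of fields = number of delimiters + 1
count = CountString(text, " ") + 1

For i = 1 To count
  part.s = StringField(text, i, " ")
  If part <> ""                          ; skip the empty field between the two spaces
    Debug part
  EndIf
Next
```

With the sample text above, StringField() sees four fields, one of them empty, and the loop passes only the three non-empty ones to Debug.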