Line Parser

Post by NoahPhense »

I'm looking for a line parser. I could have sworn that I saw one a few
months back. One that didn't use FindString(). It was small and pretty
fast. Fast enough for me anyhow.

It took a simple line:

The dog jumped over the fence.

And split it up based on the space delimiter.

And you were able to choose the delimiter, something like this:

Parse(string.s, delimiter.s)

- np

Post by LarsG »

don't know if this will help you....

Code: Select all

Procedure.s ExtractWord(txt$, number, separator$)
  ; Returns field "number" of txt$ (zero-based), split on the single-character separator$.
  Protected ReturnString$
  Protected letter$
  Protected cno, x, y
  If number > 0
    ; Skip ahead to the number-th separator.
    For x = 1 To Len(txt$)
      letter$ = Mid(txt$, x, 1)
      If letter$ = separator$
        ; hit!
        cno = cno + 1
        If cno = number
          Break
        EndIf
      EndIf
    Next
  EndIf
  ; x is now the position of the separator in front of the wanted field
  ; (0 for the first field), so copy from there up to the next separator.
  For y = x + 1 To Len(txt$)
    letter$ = Mid(txt$, y, 1)
    If letter$ = separator$
      Break
    EndIf
    ReturnString$ = ReturnString$ + letter$
  Next
  ProcedureReturn ReturnString$
EndProcedure

Post by PB »

It's the StringField command. Example from the docs:

Code: Select all

For k=1 To 6 
  Debug StringField("Hello I am a split string", k, " ") 
Next 

Post by Dare2 »

WooHoo! That's pretty neat.

Is there a better/quicker way than the following to determine how many delimited fields are in the string?

Code: Select all

k=1
w.s=StringField("Hello I am a split string", k, " ") 
While Len(w)>0
  Debug w
  k+1
  w=StringField("Hello I am a split string", k, " ") 
Wend
Debug "Elements = "+Str(k-1)

Post by NoahPhense »

Thanks everyone for the help.. this will work great.

- np

Post by blueznl »

iirc the only problem with stringfield() is that it doesn't handle double spaces well, or am i wrong here?
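
A quick way to check this is to print every field between brackets. This is only a sketch: the test string and the variable names are made up for illustration. As far as I know, StringField() treats each single space as a separator, so the gap between two consecutive spaces should come back as an empty field (the second line of output here should be just "[]").

```purebasic
; Illustration only: "a  b" has two spaces between the letters.
text$ = "a  b"

; One more field than there are delimiters.
fields = CountString(text$, " ") + 1

For k = 1 To fields
  ; The brackets make an empty field visible in the debug window.
  Debug "[" + StringField(text$, k, " ") + "]"
Next
```

This also means a loop that stops at the first empty field will stop counting at the first double space.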

Post by NoahPhense »

blueznl wrote: iirc the only problem with stringfield() is that it doesn't handle double spaces well, or am i wrong here?

I did test that theory.. and yes, StringField sucks when it comes to double spaces.

The solution is to make sure you don't have any double spaces. LOL jk

Here's a temp solution. nothing special..

Code: Select all

; Noah Phense

myString.s = "aaaaaaaaa                       bbbbbbbbbbbb    cccccccccccccccccccc    ddd                eee"

; One more field than there are delimiters; runs of spaces produce empty fields.
fieldCount = CountString(myString, " ") + 1
trueCount = 0

For location = 1 To fieldCount
  nextWord.s = StringField(myString, location, " ")
  
  If Len(nextWord) > 0
    ; Only non-empty fields are real words.
    Debug nextWord
    trueCount = trueCount + 1
  EndIf
Next

Debug "trueCount = " + Str(trueCount)
I'll run it through some speed tests later.. :D

- np

Post by blueznl »

because i do some other parsing as well i converted an old gfa routine of mine to parse whatever... i think i went a bit overboard in this case :-)

Post by NoahPhense »

blueznl wrote: because i do some other parsing as well i converted an old gfa routine of mine to parse whatever... i think i went a bit overboard in this case :-)

lol.. ok man.. you made me do it.. 8O here's a procedure

Code: Select all

; Noah Phense

Procedure.l ParseIt(myString.s)
  Protected fieldCount, location, trueCount, nextWord.s
  ; One more field than there are delimiters; runs of spaces produce empty fields.
  fieldCount = CountString(myString, " ") + 1
  For location = 1 To fieldCount
    nextWord = StringField(myString, location, " ")
    If Len(nextWord) > 0
      ; Only non-empty fields are real words.
      Debug nextWord
      trueCount = trueCount + 1
    EndIf
  Next
  ProcedureReturn trueCount
EndProcedure

Debug ParseIt("This            is          a         test...")
- np

Post by Dare2 »

Another kludge:

Code: Select all

Srce.s="String with  some multiple    spaces inside."

; ---- eliminate extra spaces

While FindString(Srce, "  ", 1)
  Srce = ReplaceString(Srce, "  ", " ", 1, 1) 
Wend

; ---- split

k=1 
w.s=StringField(Srce, k, " ") 
While Len(w)>0 
  Debug w 
  k+1 
  w=StringField(Srce, k, " ") 
Wend 
Debug "Elements = "+Str(k-1) 

Now if only there was a real split command that created an array.

Any suggestions

Post by holyfieldstudios »

Code: Select all

Procedure StrSplit(Expression$, Delimiter$)
  
  ;Define a new zero length array as default result for procedure to return
  Dim res.s(0)
  count.l = CountString(Expression$, Delimiter$)
  
  ;Validate input
  If Len(Expression$) < 1
    ;Return zero length array
    ProcedureReturn res()
  EndIf
  
  If count = 0
    res(0) = Expression$
    ProcedureReturn res()
  EndIf
  
  k=1
  Dim ar.s(count)
  wt.s = " "
  While Len(wt) > 0
    wt = StringField(Expression$, k, Delimiter$)
    ar(k-1) = wt
    k+1
    If k > count+1
      Break
    EndIf
  Wend
  ProcedureReturn ar()
  
EndProcedure
Last edited by holyfieldstudios on Thu May 26, 2005 11:32 am, edited 1 time in total.
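
One suggestion, offered as a sketch rather than a definitive answer: as far as I know a PureBasic procedure cannot hand an array back through ProcedureReturn, so the usual workaround is to let the caller create the array and pass it in. The code below assumes PureBasic 4.x or later (where array parameters are written with the Array keyword) and a single-character delimiter. SplitToArray is a made-up name, not a built-in command.

```purebasic
; Hypothetical helper: fills Result() with the fields of Text$ and returns how many there are.
; Assumes PureBasic 4.x or later (Array parameter syntax) and a one-character delimiter.
Procedure.l SplitToArray(Text$, Delimiter$, Array Result.s(1))
  Protected count.l, i.l

  ; Nothing to split.
  If Len(Text$) = 0 Or Len(Delimiter$) = 0
    ProcedureReturn 0
  EndIf

  ; One more field than there are delimiters.
  count = CountString(Text$, Delimiter$) + 1
  ReDim Result(count - 1)   ; valid indexes are 0 .. count - 1

  For i = 1 To count
    ; StringField is 1-based, the array is 0-based.
    Result(i - 1) = StringField(Text$, i, Delimiter$)
  Next

  ProcedureReturn count
EndProcedure

; Usage: the caller owns the array.
Dim parts.s(0)
n = SplitToArray("The dog jumped over the fence.", " ", parts())
For i = 0 To n - 1
  Debug parts(i)
Next
```

Empty fields from doubled delimiters are kept as empty array elements here, so collapse the extra spaces first if they are not wanted.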

Post by Xombie »

This looked fun :D I better add another solution before someone does an ASM version and spoils it all :cry: I did a quick modification to one of my new unicode functions. It'll take a delimiter (as an ascii value) and ignore multiple spaces.

Code: Select all

Procedure.s ParseIt(inString.s, index.l, Delimiter.b) ; Index is 1 based.
   ;
   Protected HoldString.l
   ;
   Protected ReturnString.s
   ;
   Protected CountSeparator.l
   ;
   Protected PositionCharacter.l
   ;
   Protected HoldCharacter.b
   ;
   Protected *MemPosition.l
   ;
   *MemPosition = @inString
   ;
   Protected FoundField.b
   ;
   Protected LastPosition.l
   ;
   Protected FoundSpace.b
   ;
   Protected StringLength.l
   ;
   StringLength = Len(inString)
   ;
   LastPosition = @inString
   ;
   If index = 0 : index = 1 : EndIf
   ; Our index is 1 based.
   Repeat
      ; Loop through our string.
      HoldCharacter = PeekB(*MemPosition)
      ; Store the current character.
      If (HoldCharacter = Delimiter And Delimiter <> 32) Or (HoldCharacter = 32 And Delimiter = 32 And FoundSpace = #False)
         ; Check if the character is our field divider.
         CountSeparator + 1
         ; Increment our delimiter count.
         If CountSeparator + 1 = index
            ; We're in the field we want to copy.
            FoundField = #True
            ;
         ElseIf CountSeparator + 1 > index
            ;
            If FoundField = #True
               ; We've found a previous separator character so calculate based on that.
               HoldString = AllocateMemory(*MemPosition - LastPosition)
               CopyMemory(LastPosition + 1, HoldString, *MemPosition - LastPosition - 1)
               ;
               ReturnString = PeekS(HoldString)
               ;
               FreeMemory(HoldString)
               ;
            Else
               ; No previous separator character so this is the first field.
               HoldString = AllocateMemory((*MemPosition - LastPosition) + 1)
               ; If we allocate 1 more byte than needed for the string, the last byte will be
               ; the null character (0) automatically, because AllocateMemory zero-fills the block.
               CopyMemory(@inString, HoldString, *MemPosition - LastPosition)
               ; So we're copying into all but the last byte.
               ReturnString = PeekS(HoldString)
               ;
               FreeMemory(HoldString)
               ;
            EndIf
            ;
            ProcedureReturn ReturnString
            ;
         EndIf
         ; Check if we've passed our target field and if so, break out of the loop. 
         LastPosition = *MemPosition
         ; Now update with the last separator position.
      EndIf
      ;
      If HoldCharacter = 32 : FoundSpace = #True : Else : FoundSpace = #False : EndIf
      ;
      If Delimiter = 32 And HoldCharacter = 32 : LastPosition = *MemPosition : EndIf
      ;
      *MemPosition + 1
      ; Increment our position in the string.
   Until *MemPosition - @inString > StringLength
   ; Stop either by breaking out of the loop or when we reach the end.
   If FoundField = #True
      ; We found the field but we reached the end before finding another separator character.
      HoldString = AllocateMemory(*MemPosition - LastPosition)
      CopyMemory(LastPosition + 1, HoldString, *MemPosition - LastPosition - 1)
      ;
   Else
      ; We did not find the field.
      HoldString = AllocateMemory(1) 
      ; And return an empty string.  Initialized to 0 automatically.
   EndIf
   ;
   ReturnString = PeekS(HoldString)
   ;
   FreeMemory(HoldString)
   ;
   ProcedureReturn ReturnString
   ; Return the extracted string.
EndProcedure

Post by okasvi »

Don't StringField and Trim do the job? :| There is CountString too... :|

Post by Trond »

Why is everyone posting codes the length of two ropes tied together in both ends?

StringField works fine once there is exactly one space between the words and none at the ends. On its own it returns an empty field for every extra space. To strip the spaces at the ends, use:

Code: Select all

text$ = Trim(text$)
To count the number of words, all you have to do is count the spaces. If there are, for example, two, three or four spaces between two words, you will get the wrong result. To fix this, make sure there is only one space between the words. This can be done as follows:

Code: Select all

text$ = Trim(text$)
While FindString(text$, "  ", 1)
  text$ = ReplaceString(text$, "  ", " ")
Wend
The code above can handle any number of misplaced spaces between words. First it removes leading and trailing spaces. Then it replaces every double space with a single one, and repeats until no double space is left. One pass for two spaces followed by one pass for three is not enough: a run of four spaces shrinks to two in the first pass, and the second pass never matches it. The result: no more misplaced spaces, and StringField returns exactly one word per field.

After fixing double, leading and trailing spaces, you can find the number of words with this code:

Code: Select all

words = CountString(text$, " ") + 1
Was that so difficult?
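
For anyone who wants it in one piece, the steps can be wrapped in a small procedure. This is a sketch under a few assumptions: CountWords is a made-up name rather than a built-in command, words are separated by spaces only, and the double-space replacement is repeated in a loop so that runs of any length collapse to one space.

```purebasic
; Hypothetical helper: counts space-separated words in text$.
Procedure.l CountWords(text$)
  text$ = Trim(text$)                          ; drop leading and trailing spaces

  While FindString(text$, "  ", 1)             ; repeat until no double space is left
    text$ = ReplaceString(text$, "  ", " ")
  Wend

  If text$ = ""
    ProcedureReturn 0                          ; empty, or nothing but spaces
  EndIf

  ProcedureReturn CountString(text$, " ") + 1  ; n single spaces separate n + 1 words
EndProcedure

Debug CountWords("  This    is   a  test...  ")  ; shows 4
```

The empty-string check matters because CountString() + 1 would otherwise report one word for a string that contains none.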