PluckString command to extract string from another

Got an idea for enhancing PureBasic? New command(s) you'd like to see?
MachineCode
Addict
Posts: 1482
Joined: Tue Feb 22, 2011 1:16 pm

Re: PluckString command to extract string from another

Post by MachineCode »

Still not much of a big deal, because you can just expand the borders a bit. In your examples, what comes after the last ">"? Maybe a #CRLF$? Then include that in the right border, i.e. ">"+#CRLF$. Easy. But the point of the tip is simply to pluck something out from between two borders. It's up to the coder to decide what the borders are.
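To make the border idea concrete, here is a minimal sketch. The name PluckString comes from the thread title, but the signature and the "return an empty string when a border is missing" behavior are assumptions, not an existing PureBasic command.

```purebasic
; Hypothetical PluckString(): returns the text between two borders,
; or "" when either border is missing. The signature is an assumption.
Procedure.s PluckString(Source.s, LeftBorder.s, RightBorder.s)
  Protected StartPos.i, EndPos.i

  StartPos = FindString(Source, LeftBorder, 1)
  If StartPos = 0
    ProcedureReturn ""
  EndIf
  StartPos + Len(LeftBorder)   ; first character after the left border

  EndPos = FindString(Source, RightBorder, StartPos)
  If EndPos = 0
    ProcedureReturn ""
  EndIf

  ProcedureReturn Mid(Source, StartPos, EndPos - StartPos)
EndProcedure

Define Sample.s
Sample = "<a href=x>link</a>" + #CRLF$

Debug PluckString(Sample, "<", ">")           ; stops at the first ">"
Debug PluckString(Sample, "<", ">" + #CRLF$)  ; only the ">" at the line end matches
```

With the plain ">" border the result is the tag contents up to the first ">". With the widened border it runs to the last ">" on the line, which is the effect described above.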
Microsoft Visual Basic only lasted 7 short years: 1991 to 1998.
PureBasic: Born in 1998 and still going strong to this very day!
TomS
Enthusiast
Posts: 342
Joined: Sun Mar 18, 2007 2:26 pm
Location: Munich, Germany

Re: PluckString command to extract string from another

Post by TomS »

@MachineCode: You can't just expand the borders. Some tags may include spaces, some not. Some attributes may be enclosed in double quotes, some in single quotes, and some are not quoted at all.
You can't check for all those variations. That's bad style and very slow.

@Trond: HTML doesn't allow single quotes. It's the browsers that allow them. In HTML, every value that is not numeric (a positive integer) must stand between double quotes.

<table border=1 width="100%">
<table border=true width=100%>
<table border="true">

Just because it works doesn't mean it's correct. That's why there are so many sites that only look good in one browser.
Now one has to decide whether to stick to the rules and ignore bad code, or to try to compensate for every imaginable error.
MachineCode
Addict
Posts: 1482
Joined: Tue Feb 22, 2011 1:16 pm

Re: PluckString command to extract string from another

Post by MachineCode »

TomS wrote: You can't check for all those variations.
I know. But for my use, I would be testing for the same specific borders, so it works for me. And the request for such a command is a good one anyway, for general-purpose non-HTML plucking.
Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Re: PluckString command to extract string from another

Post by Trond »

TomS wrote: @Trond: HTML doesn't allow single quotes. It's the browsers that allow them. In HTML, every value that is not numeric (a positive integer) must stand between double quotes.
Where did you get that information? The standard does not agree with you.

http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.2
TomS
Enthusiast
Posts: 342
Joined: Sun Mar 18, 2007 2:26 pm
Location: Munich, Germany

Re: PluckString command to extract string from another

Post by TomS »

Well, it seems I'm not totally up to date. :oops:
It says here that single quotes are allowed, too: http://www.w3schools.com/html/html_attributes.asp
But I think that's just because all browsers accept it.
It's a trend to use single quotes in PHP, ASP, etc., because programmers don't want to escape the quotes.

echo "this is a <a href='target' >hyperlink</a>";

I'm pretty sure that this was not valid in HTML before 4...

Anyway, I don't like the style. Just as I hate double quotes in PHP, but that's just me.

It also says: "always quote". Only numerical values can stand unquoted.
Although the majority of browsers don't seem to have a problem with that either.

I remember IE accepting height, heigth, and heigt as an element attribute.
Again: Just because it works doesn't mean it's correct.

EDIT: As for the feature request: Have a look at this post and the follow ups ;)
http://purebasic.fr/english/viewtopic.p ... 94#p353094
kenmo
Addict
Posts: 2032
Joined: Tue Dec 23, 2003 3:54 am

Re: PluckString command to extract string from another

Post by kenmo »

Just to add one more similar solution:

Code: Select all

Procedure.s Middle(String.s, Open.s = "", Close.s = "", Start.i = 1, NoCase.i = #True)
  Protected iOpen.i,   iClose.i, opLen.i
  Protected qString.s, qOpen.s,  qClose.s
  Protected Result.s
  
  If (NoCase)
    qString = LCase(String)
    qOpen   = LCase(Open  )
    qClose  = LCase(Close )
  Else
    qString = String
    qOpen   = Open
    qClose  = Close
  EndIf
  
  If (Open)
    opLen = Len(Open)
    iOpen = FindString(qString, qOpen, Start)
    If (iOpen)
      iClose = FindString(qString, qClose, iOpen + opLen)
      If (iClose)
        Result = Mid(String, iOpen + opLen, iClose - iOpen - opLen)
      Else
        Result = Mid(String, iOpen + opLen)
      EndIf
    Else
      Result = ""
    EndIf
  Else
    If (Close)
      iOpen = Start
      iClose   = FindString(qString, qClose, Start)
      If (iClose)
        Result = Mid(String, Start, iClose - Start)  ; honor Start (same as Left() when Start = 1)
      Else
        Result = ""
      EndIf
    Else
      Result = String
    EndIf
  EndIf
  
  ProcedureReturn (Result)
EndProcedure

This one is a little different. If the opening delimiter is found but the closing one isn't, it absorbs everything right up to the end of the string. A delimiter that is left blank stands for the string boundary on that side. So I guess the name "Middle" is sort of a misnomer...

But for anything more complex, like HTML parsing, you can't go wrong with Regular Expressions.
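As a rough illustration of the regular expression route, the sketch below pulls href values out of tags that use double quotes, single quotes, or no quotes at all. The pattern and the sample tags are my own assumptions, and the match-examination commands need PureBasic 5.20 or later. It stops at the first quote, whitespace, or ">", so it is not a full HTML parser.

```purebasic
; Sketch only: requires PureBasic 5.20+ (ExamineRegularExpression and friends).
; Pattern and sample markup are illustrative assumptions.
Define Pattern.s, Html.s, q.s
q = Chr(34)   ; a double quote character

; "href", optional spaces around "=", an optional opening quote of either kind,
; then the value up to the next quote, whitespace or ">" (captured as group 1).
Pattern = "href\s*=\s*[" + q + "']?([^" + q + "'\s>]*)"

Html = "<a href=" + q + "one.html" + q + ">1</a>"
Html + "<a href='two.html' >2</a>"
Html + "<A HREF = three.html>3</A>"

If CreateRegularExpression(0, Pattern, #PB_RegularExpression_NoCase)
  If ExamineRegularExpression(0, Html)
    While NextRegularExpressionMatch(0)
      Debug RegularExpressionGroup(0, 1)   ; the attribute value without quotes
    Wend
  EndIf
  FreeRegularExpression(0)
Else
  Debug RegularExpressionError()
EndIf
```

One pattern covers all three quoting styles discussed earlier in the thread, which is the kind of variation fixed borders cannot handle.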
Vitor_Boss®
User
Posts: 81
Joined: Thu Sep 23, 2010 4:22 am

Re: PluckString command to extract string from another

Post by Vitor_Boss® »

Hi, if I understood correctly, you need to split a string. I wrote some code that does that and returns the result as an array. Enjoy.

Code: Select all

Procedure Split(Array Result.s(1), Expression.s, Delimiter.s, Limit.l=-1)
  Protected.i i, ii, C
  Protected.l Length, Size, Position
  ii = CountString(Expression, Delimiter)
  If ii=0
    ReDim Result(0)
    Result(0) = Expression
  Else
    If Limit > 0 And ii > Limit-1
      ii = Limit-1
    EndIf
    ReDim Result(ii)
    Size = Len(Delimiter)
    Position = 1
    For C = 0 To ii-1
      Length = FindString(Expression, Delimiter, Position) - Position
      Result(C) = Mid(Expression, Position, Length)
      Position + Length + Size
    Next
    Result(C) = Mid(Expression, Position)
  EndIf
EndProcedure
Sorry for my bad English.
HP Pavilion DV6-2155DX: Intel i3-330m 2.13 / 4GB DDR3 / 500GB Sata2 HD / Display 15.6" LED / Win7 Ultimate x64 / PB 4.50 x86 demo.