Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 6:21 pm
by Olli
Wiv'out v'e teev' (long term fupport)

( hash("nuts") = 2*'n' + 3*'u' + 5*'t' + 7*'s' )
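To check the formula by hand, here is a minimal sketch. It assumes plain ASCII codes, and it uses a fixed table of the first four primes in place of the sieve. `WeightedHash` is a made-up name, not part of the code below. The last two lines show that two different words can produce the same sum, so an equal hash hints at a match but does not prove one.

```purebasic
; Sketch only: WeightedHash is a hypothetical name, and a fixed table of the
; first four primes stands in for the sieve used in the post.
Procedure.i WeightedHash(word.s)
  Protected i, r
  Protected Dim prime.i(4)
  prime(1) = 2 : prime(2) = 3 : prime(3) = 5 : prime(4) = 7
  For i = 1 To Len(word)
    If i > 4 : Break : EndIf ; the demo table covers four characters only
    r + prime(i) * Asc(Mid(word, i, 1))
  Next
  ProcedureReturn r
EndProcedure

Debug WeightedHash("nuts") ; 2*110 + 3*117 + 5*116 + 7*115 = 1956
Debug WeightedHash("ad")   ; 2*97 + 3*100 = 494
Debug WeightedHash("db")   ; 2*100 + 3*98 = 494, same value as "ad"
```

If collisions matter, comparing the actual characters after a hash match would settle it.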

Code:

Macro anotherMid(alpha, beta)
    PeekS(alpha\wrd(3, beta), alpha\wrd(2, beta) )
EndMacro

#bpc = SizeOf(character) ; (b)ytes (p)er (c)haracter
#bpi = SizeOf(integer) ; (b)ytes (p)er (i)nteger

;longest word
;x32u; x32a; x64u;   x64a
;114;  1469; ?(big); ?(big)

; (for beta reducing)
#greatestUnsignedCharacter = 1 << (8 * #bpc) - 1 
#greatestSignedInteger = 1 << ((8 * #bpi) - 1) - 1

#cmLim = 1 << (8 * #bpc) - 1 ; (c)haracter (m)ask array (lim)it
#pvLim = 1 << 16 - 1 ; (p)rime (v)alue array (lim)it

Structure charMask
    Array cm.a(#cmLim)
EndStructure

Structure wrd
    Array wrd.i(3, 4095)
    qty.i
EndStructure

Structure primeValue
    Array pv.i(#pvLim)
EndStructure

Procedure cmCreate()
    Define *this.charMask = AllocateMemory(SizeOf(charMask) )
    InitializeStructure(*this, charMask)    
    ProcedureReturn *this
EndProcedure
    
Procedure pvCreate()
    Define *this.primeValue = AllocateMemory(SizeOf(primeValue) )
    Define i, j, sqrPvLim = Sqr(#pvLim)
    InitializeStructure(*this, primeValue)
    With *this
        ; *** 1/3 sieving ******************************************
        i = 2
        Repeat            
            If \pv(i) = 0
                j = i * i
                Repeat
                    \pv(j) = j
                    j + i
                Until j > #pvLim
            EndIf
            i + 1
        Until i > sqrPvLim
        ; *** 2/3 compacting ***************************************
        j = 1
        For i = 2 To #pvLim
            If Not \pv(i)
                \pv(j) = i
                j + 1
            EndIf
        Next
        j - 1
        ; *** 3/3 alpha reducing *****************************************
        \pv(0) = j
        ReDim \pv(j)
    EndWith
    ProcedureReturn *this
EndProcedure

Procedure hash(*a.character, *pv.primeValue)
    While *a\c
        i + 1
        r + *pv\pv(i) * *a\c
        *a + SizeOf(character)      
    Wend
    ProcedureReturn r
EndProcedure

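Procedure SplitFilterAndHash(*a.character, *cm.charMask, *pv.primeValue)
    ; \wrd(0, k) = hash, \wrd(1, k) = 1-based start position,
    ; \wrd(2, k) = length, \wrd(3, k) = address of the word's first character
    *c.wrd = AllocateMemory(SizeOf(wrd) ) ; resulting array
    InitializeStructure(*c, wrd)
    With *c
        While *a\c
            j + 1
            If *cm\cm(*a\c) ; excluded character: close the current word, if any
                If r
                    \wrd(0, k) = r
                    \wrd(2, k) = i
                    k + 1
                    i = 0
                    r = 0
                EndIf
            Else
                If r = 0 ; first character of a new word
                    \wrd(1, k) = j
                    \wrd(3, k) = *a
                EndIf
                i + 1
                r + *pv\pv(i) * *a\c
            EndIf
            *a + #bpc ; advance only after the current character is processed
        Wend
        If r
            \wrd(0, k) = r
            \wrd(2, k) = i
        Else
            k - 1 ; the string ended on an excluded character: no open word
        EndIf
        \qty = k ; index of the last word (-1 when there is none)
        ProcedureReturn *c
    EndWith
EndProcedure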

; Here we go!


Define *pv.primeValue = pvCreate()
Define *cm.charMask = cmCreate()

                   ; WE EXCLUDE :
                   
*cm\cm(9) = 1      ; tabulation char
*cm\cm(10) = 1     ; line feed char
*cm\cm(13) = 1     ; carriage return char
*cm\cm(32) = 1     ; space char
*cm\cm('(') = 1    ; 1st parenthesis char
*cm\cm('+') = 1    ; 'plus' char...

*cm\cm('e') = 1    ; and 'e' char...

*cm\cm('e') = 0    ; ...finally, no: 'e' is not excluded after all...

*cm\cm('♞') = 0    ; We make sure we keep the horse...



a$ = "    monday (tuesday wednesday thursday+ friday"
weSearch = hash(@"wednesday", *pv)

Define *c.wrd = SplitFilterAndHash(@a$, *cm, *pv)
Debug a$
For i = 0 To *c\qty
    Debug PeekS(*c\wrd(3, i), *c\wrd(2, i) )
    If *c\wrd(0, i) = weSearch
        Debug "before " + anotherMid(*c, i) + " there is " + anotherMid(*c, i - 1) + " and before again : " + anotherMid(*c, i - 2)
        Debug "after " + anotherMid(*c, i) + " there is " + anotherMid(*c, i + 1) + " and after again : " + anotherMid(*c, i + 2)
    EndIf
Next

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 8:49 pm
by idle
That's got my vote. Keep the horse 🐎

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 9:46 pm
by ChrisR
I fell off the horse ♞ on monday and friday 🚑

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 9:59 pm
by dcr3
It seems some misunderstood the question in the OP,
or maybe it wasn't clear enough. :oops:



What is the best way to get the word_before and word_after a specific keyword from a string.


That meets these two conditions.

First condition.

1. string.s="blah blah blah word_before keyword word_after blah blah blah"


Second condition.

2. string.s=" blah blah blah (word_before keyword word_after, blah blah blah "
word_before and word_after, as written, do not exist in English; I wrote them that way to convey the meaning of the string.

There is no underscore or any other character between the words.
The only exceptions are an opening parenthesis ( before the word_before and a comma , after the word_after.


But you all have interesting concepts to learn from.

Thank you infratec. As always.

Thank you idle.

Thank you Oso.
Oso wrote: Fri Aug 26, 2022 11:09 pm I'm surprised at the complexity of some of the solutions put forward. I'd be concerned if they needed to be this complex, that the future task of maintaining the code would be difficult, especially as it might be someone else.
I agree.

Thank you olli.

olli, you have a witty and dark sense of humor; you are the winner here.
The horse bit is just on another level. :lol: :lol:

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 10:03 pm
by Oso
idle wrote: Sat Aug 27, 2022 2:27 am @Oso you've just taken 2 steps back, each post was an improvement over the others, my post also simplified and improved the runtime complexity of Infratec's last code, he's a PB guru, he knows his stuff.
I've got ya, my apologies for 'throwing the spanner in the works', as they say :cry:

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 10:45 pm
by idle
Oso wrote: Sat Aug 27, 2022 10:03 pm
idle wrote: Sat Aug 27, 2022 2:27 am @Oso you've just taken 2 steps back, each post was an improvement over the others, my post also simplified and improved the runtime complexity of Infratec's last code, he's a PB guru, he knows his stuff.
I've got ya, my apologies for 'throwing the spanner in the works', as they say :cry:
sorry I didn't mean to sound harsh. Olli had your back and gave me a good kicking!

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 11:04 pm
by Oso
idle wrote: Sat Aug 27, 2022 10:45 pm
Oso wrote: Sat Aug 27, 2022 10:03 pm I've got ya, my apologies for 'throwing the spanner in the works', as they say :cry:
sorry I didn't mean to sound harsh. Olli had your back and gave me a good kicking!
No worries, I was slow to grasp the objective. :D

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sat Aug 27, 2022 11:06 pm
by Mijikai
Could not miss the party 8)

Code:

EnableExplicit

Structure STRING_BA_STRUCT
  before.s
  after.s
EndStructure

Procedure.i StringBeforeAndAfter(*Str,Signature.s,*StrBA.STRING_BA_STRUCT)
  Protected *src.Unicode
  Protected *sig.Unicode
  Protected *a.Unicode
  Protected *b.Unicode
  Protected *c.Unicode
  *src = *str 
  *sig = @Signature
  Repeat 
    If *src\u = *sig\u
      *a = *src
      Repeat
        *src + 2
        *sig + 2
        If *src\u <> *sig\u
          Break
        EndIf
      Until *sig\u = #Null Or *src\u = #Null
      If *sig\u = #Null 
        *b = *src - *a
        *b + *a
        *c = *a - 4
        If *a < *Str Or *b\u = #Null
          ProcedureReturn #False
        EndIf
        *a + Bool(*a\u = ' ') * 4
        *b + Bool(*b\u = ' ') * 2
        *sig = #Null
        Repeat
          *sig + Bool(*c\u & $30 = $30 And *c\u < $3A) 
          *sig + Bool(*c\u & $40 = $40 And *c\u <> $40 And *c\u < $5B)
          *sig + Bool(*c\u & $60 = $60 And *c\u <> $60 And *c\u < $7B)
          *sig + Bool(*c\u = '_')
          If *sig = #Null
            Break
          EndIf
          *c - 2
          *sig = 0 
        Until *c = *Str
        If *sig = 0
          *c + 2
        EndIf
        *a\u = #Null
        *StrBA\before = PeekS(*c,(*a - *c - (Bool(*a\u = ' ') * 2)) >> 1)
        *c = *b
        *sig = #Null
        Repeat
          *sig + Bool(*c\u & $30 = $30 And *c\u < $3A) 
          *sig + Bool(*c\u & $40 = $40 And *c\u <> $40 And *c\u < $5B)
          *sig + Bool(*c\u & $60 = $60 And *c\u <> $60 And *c\u < $7B)
          *sig + Bool(*c\u = '_')
          If *sig = #Null
            Break
          EndIf
          *c + 2
          *sig = 0 
        Until *c\u = #Null
        *StrBA\after = PeekS(*b,(*c - *b) >> 1)
        ProcedureReturn #True
      EndIf
      *sig = @Signature
      *src = *Str
    EndIf  
    *src + 2
  Until *src\u = #Null
  ProcedureReturn #False
EndProcedure

Procedure.i Main()
  Protected str1.s
  Protected str2.s
  Protected str.STRING_BA_STRUCT
  str1 = "blah blah blah word_before keyword word_after blah blah blah"
  str2 = " blah blah blah (word_before keyword word_after, blah blah blah "
  StringBeforeAndAfter(@str1,"keyword",@str)
  Debug str\before
  Debug str\after
  StringBeforeAndAfter(@str2,"keyword",@str)
  Debug str\before
  Debug str\after
  ;works with and without spaces after the keyword!
  str1 = "blah blah blah word_beforekeywordword_after blah blah blah"
  str2 = " blah blah blah (word_beforekeywordword_after, blah blah blah "
  StringBeforeAndAfter(@str1,"keyword",@str)
  Debug str\before
  Debug str\after
  StringBeforeAndAfter(@str2,"keyword",@str)
  Debug str\before
  Debug str\after
  ProcedureReturn #Null
EndProcedure

Main()

End

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sun Aug 28, 2022 4:41 am
by Allen
Hi,

The shortest (slow) version? The idea comes from Olli and Oso. :wink:

Code:

Keyword.s=" keyword "

string.s="blah blah blah word_before keyword word_after blah blah blah"
Debug String
Debug "Word Before : "+LTrim(ReverseString(StringField(ReverseString(StringField(String,1,Keyword)),1," ")),"(")
Debug "Word After : "+RTrim(StringField(StringField(String,2,KeyWord),1," "),",")

string.s=" blah blah blah (word_before keyword word_after, blah blah blah "
Debug String
Debug "Word Before : "+LTrim(ReverseString(StringField(ReverseString(StringField(String,1,Keyword)),1," ")),"(")
Debug "Word After : "+RTrim(StringField(StringField(String,2,KeyWord),1," "),",")

Allen

Edit:
was: Debug "Word After : "+RTrim(StringField(ReverseString(StringField(ReverseString(String),1,ReverseString(KeyWord))),1," "),",")
is : Debug "Word After : "+RTrim(StringField(StringField(String,2,KeyWord),1," "),",")

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sun Aug 28, 2022 6:06 am
by Jeromyal

Code:

; The macros now use their own separator parameter instead of relying on a global sep variable.
Macro WordBefore(string, keyword, separator = " ")
  StringField(StringField(string, 1, keyword), CountString(StringField(string, 1, keyword), separator), separator)
EndMacro

Macro WordAfter(string, keyword, separator = " ")
  StringField(Trim(StringField(string, 2, keyword)), 1, separator)
EndMacro


string.s  = "blah blah blah word_before keyword word_after blah blah blah"
keyword.s = "keyword"
sep.s     = " "

Debug WordBefore(string, keyword, sep)
Debug WordAfter(string, keyword, sep)
Not perfect. Prone to errors.
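
One source of such errors is that StringField() splits on the keyword wherever it occurs, including inside a longer word such as "keywords". Here is a rough sketch of a whole-word variant. `Neighbours` and `NeighbourWords` are made-up names. It assumes words are separated by single or repeated spaces, and that only "(" and "," need stripping, as in the OP's two conditions.

```purebasic
; Sketch only: Neighbours and NeighbourWords are hypothetical names.
Structure Neighbours
  before.s
  after.s
EndStructure

Procedure.i NeighbourWords(text.s, keyword.s, *result.Neighbours)
  Protected i, n, count, word.s
  n = CountString(text, " ") + 1
  Protected Dim token.s(n) ; token(0) stays empty on purpose
  For i = 1 To n
    ; strip the two punctuation characters allowed by the OP
    word = Trim(Trim(StringField(text, i, " "), "("), ",")
    If word <> ""
      count + 1
      token(count) = word
    EndIf
  Next
  For i = 1 To count
    If token(i) = keyword ; whole-word comparison only
      *result\before = token(i - 1) ; empty when the keyword comes first
      *result\after = ""
      If i < count : *result\after = token(i + 1) : EndIf
      ProcedureReturn #True
    EndIf
  Next
  ProcedureReturn #False
EndProcedure

Define r.Neighbours
If NeighbourWords(" blah blah blah (word_before keyword word_after, blah blah blah ", "keyword", @r)
  Debug r\before ; word_before
  Debug r\after  ; word_after
EndIf
Debug NeighbourWords("blah keywords blah", "keyword", @r) ; 0, no whole-word match
```

It is slower than the one-liners, but it reports a missing keyword instead of returning an arbitrary field.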

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Sun Aug 28, 2022 6:51 am
by Jeromyal
A strange way to get the word before...

Code:

; Uses the macro's own Separator parameter instead of a global sep variable.
Macro BeforeWord(String, Keyword, Separator = " ")
  GetExtensionPart(ReplaceString(Trim(StringField(String, 1, Keyword)), Separator, "."))
EndMacro

Okay, enough from me.

Re: How to get the word_before and word_after a specific keyword from a string.

Posted: Wed Aug 31, 2022 10:23 am
by Olli
:lol:

The GetExtensionPart() option is a very inventive way to do this task. I think everybody has won here, each with a different approach.

I say "hello" to Mijikai, who recently showed me this easy *x.character\c method, and who certainly learnt it from someone else in turn, like a human chain.

I thank infratec, who knows the subject perfectly.

I also say hello to ChrisR, whom I often read to get ideas.

And I thank everybody for sharing, again and again.



I think I have another approach. (I still have a few teeth left :mrgreen: :mrgreen: )

I imagine a search where each character is counted and indexed according to its place in the string. It also wastes a lot of memory, but it is quick and usable anywhere in a database.
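
One possible reading of that idea, as a rough sketch only: store, for every character, the list of positions where it occurs, then test the keyword only at the positions of its first character. `CharPositions`, `BuildIndex` and `FindViaIndex` are made-up names, and this is just a guess at the intended design.

```purebasic
; Sketch only: CharPositions, BuildIndex and FindViaIndex are hypothetical names.
Structure CharPositions
  List pos.i() ; 1-based positions at which one character occurs
EndStructure

Procedure BuildIndex(text.s, Map index.CharPositions())
  Protected i, c.s
  For i = 1 To Len(text)
    c = Mid(text, i, 1)
    AddElement(index(c)\pos()) ; index(c) is created on first access
    index(c)\pos() = i
  Next
EndProcedure

Procedure.i FindViaIndex(text.s, keyword.s, Map index.CharPositions())
  ; only positions holding the keyword's first character are tested
  If keyword <> "" And FindMapElement(index(), Left(keyword, 1))
    ForEach index()\pos()
      If Mid(text, index()\pos(), Len(keyword)) = keyword
        ProcedureReturn index()\pos()
      EndIf
    Next
  EndIf
  ProcedureReturn 0 ; not found
EndProcedure

NewMap index.CharPositions()
text.s = "monday tuesday wednesday"
BuildIndex(text, index())
Debug FindViaIndex(text, "wednesday", index()) ; 16
```

The index costs one list entry per character, which matches the remark about memory, and it pays off only when the same text is searched many times.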