Modified StringField() ?

Just starting out? Need help? Post your questions and find answers here.
Otrebor
Enthusiast
Posts: 198
Joined: Mon Mar 17, 2014 1:42 pm
Location: São Paulo, Brasil

Post by Otrebor »

Hi,
I would like to test this version from wilbert (2004), but I get a syntax error at line 72:

Code: Select all

Procedure.l SplitStringByChar(StringToSplit.s, SplitChar.s, ArrayPtr.l)

; free the current array

! mov edx,[esp+8]
! call PB_FreeArray

; duplicate the string to split

! mov edx,[esp]
! lea ecx,[esp+8]
! call SYS_FastAllocateString

; replace all splitchars by 0 and push all pointers to the substrings

! mov edx,[esp+4]
! mov ah,byte [edx]
! xor ecx,ecx
! mov edx,[esp+8]
! _splitstring_loop0:
! inc ecx
! push edx
! _splitstring_loop1:
! mov al,byte [edx]
! inc edx
! And al,al
! jz _splitstring_cont 
! cmp al,ah
! jne _splitstring_loop1
! mov byte [edx-1],0
! jmp _splitstring_loop0

; correct stack if array allocation failed

! _ss_arrayfailed:
! pop ecx
! sal ecx,2
! add esp,ecx
! jmp _ss_exit

; allocate new array

! _splitstring_cont:
! push ecx
! mov ebx,ecx
! mov eax,4
! call PB_AllocateArray
! jz _ss_arrayfailed
! mov dword [eax],8
! pop ecx
! mov edx,ecx
! sal edx,2
! add edx,eax
! add eax,4

; set all pointers

! _splitstring_loop2:
! popd [edx]
! sub edx,4
! loop _splitstring_loop2 
! _ss_exit:
  ProcedureReturn
EndProcedure

; test the procedure

Dim B.s(0)

; supplying two times B() is required to free the old array

B() = SplitStringByChar("Hello, this is a test to see if the splitting of this string will be correct"," ",B())

ArrayLength = PeekL(@B()-8)

For i=0 To ArrayLength-1
 Debug B(i)
Next
I'm using PB 5.44 LTS.
Mijikai
Addict
Posts: 1360
Joined: Sun Sep 11, 2016 2:17 pm

Re: Modified StringField() ?

Post by Mijikai »

Maybe the 'internal' calls are no longer valid...

I hope this helps until someone is able to fix the problem:

Unicode example (PB v. 5.60):

Code: Select all

;by Mijikai
Procedure.s SplitStringByChar(Input.s,SplitChar.s,Index.i);UNICODE ONLY!
  Protected StringStart.i
  Protected StringEnd.i
  Protected StringPos.i
  Protected Offset.i
  Protected *Char.Word
  Protected *Check.Word
  Protected Seek.i
  Protected ArrayIndex.i
  Protected StringArraySize.i
  Static Dim StringArray.s(0)
  If Input 
    If Len(SplitChar) = 1
      Dim StringArray(CountString(Input,SplitChar))
      Input + SplitChar
      *Check = @SplitChar
      StringStart = @Input
      StringEnd = StringStart + StringByteLength(Input)
      For Offset = StringStart To StringEnd Step 2
        *Char = Offset
        If *Char\w = *Check\w
          If Offset = StringEnd - 2
            If Not Seek
              Break
            EndIf
          EndIf
          StringPos = (Offset - StringStart) / 2 + 1
          StringArray(ArrayIndex) = Mid(Input,Seek,StringPos - Seek)
          Seek = StringPos + 1
          ArrayIndex + 1
        EndIf
      Next
      StringArraySize = ArraySize(StringArray())
      If StringArray(0)
        ProcedureReturn Str(StringArraySize)
      EndIf 
    EndIf
  Else
    ProcedureReturn StringArray(Index)
  EndIf
EndProcedure

Entries.s =  SplitStringByChar("This is a TestString :)"," ",#Null)
If Entries 
  For i.i = 0 To Val(Entries)
    Debug SplitStringByChar(#Null$,#Null$,i)
  Next
EndIf
Last edited by Mijikai on Sat Jun 24, 2017 4:43 pm, edited 1 time in total.
skywalk
Addict
Posts: 3972
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Modified StringField() ?

Post by skywalk »

The asm code is x86-only and references internal PB functions that have most likely been renamed.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
Fig
Enthusiast
Posts: 351
Joined: Thu Apr 30, 2009 5:23 pm
Location: Côtes d'Azur, France

Re: Modified StringField() ?

Post by Fig »

Code: Select all

Procedure.l SplitStringByChar(StringToSplit.l, SplitChar.s, ArrayPtr.l)
EnableASM
; replace all split chars by 0
 MOV edx,[p.v_SplitChar]
 MOV ecx,[p.v_ArrayPtr]
 MOV bh,byte [edx]
 XOR eax,eax
 MOV edx,[p.v_StringToSplit]
 MOV dword [ecx],edx
 ADD ecx,4
 ! _splitstring_loop0:
 MOV bl,byte [edx] 
 CMP bl,0
 JZ _fin
 INC edx
 CompilerIf #PB_Compiler_Unicode
    INC edx
 CompilerEndIf
 XOR bl,bh
 JNZ _splitstring_loop0
 CompilerIf #PB_Compiler_Unicode
    MOV byte [edx-2],0
 CompilerElse
    MOV byte [edx-1],0
 CompilerEndIf
 INC eax
 MOV dword [ecx],edx
 ADD ecx,4
 JMP _splitstring_loop0
 ! _fin:
DisableASM
ProcedureReturn
EndProcedure
Dim b.s(30) ;half the len of your string
nb=SplitStringByChar(@"Hello, this is a test to see if the splitting of this string will be correct"," ",B())
ReDim b(nb)
For i=0 To nb
Debug b(i)
Next i
This is not bulletproof and is highly risky.
You have to Dim the array so that it is larger than the number of splits you will make (roughly half the number of characters in your string if separators are never adjacent; up to the full length otherwise).
I have no idea how fast it is.
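For a single-character separator, the number of fields is the separator count plus one, so the array can be sized exactly before the call. A minimal sketch, assuming the built-in CountString() is acceptable for the counting pass (Text and Fields are only illustrative names):

```purebasic
; Size the array from the separator count instead of guessing.
Define Text.s = "Hello, this is a test"
Define Fields = CountString(Text, " ") + 1  ; separators + 1 = number of fields
Dim b.s(Fields - 1)                         ; valid indices are 0 .. Fields-1
Debug ArraySize(b())                        ; highest index: 4 for this text
```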
Last edited by Fig on Sat Jun 24, 2017 5:07 pm, edited 4 times in total.
There are 2 methods to program bugless.
But only the third works fine.

Win10, Pb x64 5.71 LTS
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Modified StringField() ?

Post by wilbert »

It's indeed a risky approach.
I still like asm :wink: but I'm more hesitant now to use internal PB functions.
You can try my code from this post instead
http://www.purebasic.fr/english/viewtop ... 05#p409005
http://www.purebasic.fr/english/viewtopic.php?p=409005#p409005 wrote:

Code: Select all

Procedure.i Split(Array StringArray.s(1), StringToSplit.s, Separator.s = " ")
  Protected c = CountString(StringToSplit, Separator)
  Protected i, l = StringByteLength(Separator)
  Protected *p1.Character = @StringToSplit
  Protected *p2.Character = @Separator
  Protected *p = *p1

  ReDim StringArray(c)
  While i < c
    While *p1\c <> *p2\c
      *p1 + SizeOf(Character)
    Wend
    If CompareMemory(*p1, *p2, l)
      CompilerIf #PB_Compiler_Unicode
        StringArray(i) = PeekS(*p, (*p1 - *p) >> 1)
      CompilerElse
        StringArray(i) = PeekS(*p, *p1 - *p)
      CompilerEndIf
      *p1 + l
      *p = *p1
      i + 1
    Else
      *p1 + SizeOf(Character)
    EndIf
  Wend
  StringArray(c) = PeekS(*p)
  ProcedureReturn c
EndProcedure

Procedure.s Join(Array StringArray.s(1), Separator.s = "")
  Protected r.s, i, l, c = ArraySize(StringArray())
  While i <= c
    l + Len(StringArray(i))
    i + 1  
  Wend
  r = Space(l + Len(Separator) * c)
  i = 1
  l = @r
  CopyMemoryString(@StringArray(0), @l)
  While i <= c
    CopyMemoryString(@Separator)
    CopyMemoryString(@StringArray(i))
    i + 1  
  Wend
  ProcedureReturn r
EndProcedure
It still seems to work and is pretty fast.
Last edited by wilbert on Sun Jun 25, 2017 7:18 pm, edited 2 times in total.
Windows (x64)
Raspberry Pi OS (Arm64)
Fig
Enthusiast
Posts: 351
Joined: Thu Apr 30, 2009 5:23 pm
Location: Côtes d'Azur, France

Re: Modified StringField() ?

Post by Fig »

If you use CountString(), I am not sure I see the optimisation...
It already scans every character of the whole string for separators, no?
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Modified StringField() ?

Post by wilbert »

Fig wrote: If you use CountString(), I am not sure I see the optimisation...
It already scans every character of the whole string for separators, no?
Unfortunately you need to do it to know how big the resulting array will be to ReDim it.
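If the extra counting pass matters, one alternative (not from this thread, and not benchmarked here) is to grow the array in steps during a single pass and trim it at the end. A rough sketch, where SplitGrow and its variables are hypothetical names and FindString() does the searching:

```purebasic
; Hypothetical helper: single-pass split that grows the array on demand.
Procedure.i SplitGrow(Array Out.s(1), Text.s, Sep.s)
  Protected capacity = 15          ; highest valid index, not element count
  Protected count, start = 1, pos
  Protected sepLen = Len(Sep)

  ReDim Out(capacity)
  If sepLen > 0
    pos = FindString(Text, Sep, start)
    While pos > 0
      If pos > start
        Out(count) = Mid(Text, start, pos - start)
      Else
        Out(count) = ""            ; empty field between adjacent separators
      EndIf
      count + 1
      If count > capacity          ; keep room for the next field
        capacity = capacity * 2 + 1
        ReDim Out(capacity)
      EndIf
      start = pos + sepLen
      pos = FindString(Text, Sep, start)
    Wend
  EndIf
  Out(count) = Mid(Text, start)    ; remainder after the last separator
  ReDim Out(count)                 ; trim to the fields actually found
  ProcedureReturn count            ; highest index, like ArraySize()
EndProcedure

Dim Words.s(0)
Define n = SplitGrow(Words(), "one two  three", " ")
Define i
For i = 0 To n
  Debug Words(i)
Next
```

Whether this beats CountString() plus one ReDim depends on how often the array has to grow, so it would need measuring on real data.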
Otrebor
Enthusiast
Posts: 198
Joined: Mon Mar 17, 2014 1:42 pm
Location: São Paulo, Brasil

Re: Modified StringField() ?

Post by Otrebor »

Thank you very much to ALL!!
GJ-68
User
Posts: 32
Joined: Sun Jun 23, 2013 1:00 pm
Location: France (68)

Re: Modified StringField() ?

Post by GJ-68 »

@wilbert

You should test your Split function with a separator of more than one character, on a string where the first character of the separator matches but the rest does not.
Example: StringToSplit = "ABCxzDEFxyGHI", Separator = "xy"
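To check a splitter against cases like this, it can help to compare it with a slow but obvious reference. A minimal sketch using only FindString() and Mid() (SplitReference is a hypothetical name, and a non-empty separator is assumed):

```purebasic
; Hypothetical reference splitter: slow, but easy to verify by eye.
Procedure.i SplitReference(Array Out.s(1), Text.s, Sep.s)
  Protected count, i, pos, start = 1
  Protected sepLen = Len(Sep)

  If sepLen = 0
    ProcedureReturn -1             ; assumption: an empty separator is rejected
  EndIf

  ; first pass: count complete, non-overlapping separators
  pos = FindString(Text, Sep, 1)
  While pos > 0
    count + 1
    pos = FindString(Text, Sep, pos + sepLen)
  Wend

  ; second pass: copy the fields
  ReDim Out(count)
  For i = 0 To count - 1
    pos = FindString(Text, Sep, start)
    If pos > start
      Out(i) = Mid(Text, start, pos - start)
    Else
      Out(i) = ""                  ; empty field
    EndIf
    start = pos + sepLen
  Next
  Out(count) = Mid(Text, start)    ; remainder after the last separator
  ProcedureReturn count
EndProcedure

Dim Parts.s(0)
Define n = SplitReference(Parts(), "ABCxzDEFxyGHI", "xy")
Define i
For i = 0 To n
  Debug Parts(i)
Next
```

For the example above this should give two fields, ABCxzDEF and GHI. A splitter that cuts as soon as the first separator character matches would wrongly split at "xz" as well.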

Fixed version:

Code: Select all

Procedure.i Split(Array StringArray.s(1), StringToSplit.s, Separator.s = " ")
  Protected c = CountString(StringToSplit, Separator)
  Protected i, l = StringByteLength(Separator)
  Protected *p1.Character = @StringToSplit
  Protected *p2.Character = @Separator
  Protected *p = *p1

  ReDim StringArray(c)
  While i < c
    While *p1\c <> *p2\c
      *p1 + SizeOf(Character)
    Wend
    If CompareMemory(*p1, *p2, l)
      CompilerIf #PB_Compiler_Unicode
        StringArray(i) = PeekS(*p, (*p1 - *p) >> 1)
      CompilerElse
        StringArray(i) = PeekS(*p, *p1 - *p)
      CompilerEndIf
      *p1 + l
      *p = *p1
      i + 1
    Else
      *p1 + SizeOf(Character)
    EndIf
  Wend
  StringArray(c) = PeekS(*p)
  ProcedureReturn c
EndProcedure