Faster StringField

Post by Michael Vogel »

Has anyone used StringField with a huge number of fields?
Something like For n=1 To z : Array(n)=StringField(s,n,"|") : Next n ?

I implemented such a loop to copy dropped files into a ListIcon gadget, a string array or a list. Then I realized how slow it becomes when a few hundred files are dropped on the gadget. The problem is that the string has to be scanned from the start for every single field, hundreds of times in total, so something like a StringNextField(s,@position,"|") would be needed for such cases.

So I changed the code to scan the string only once and extract all fields in that pass, which takes just a moment. The following code is not complete, as the drag-and-drop part is missing, but it is simple to adapt to your needs.

Code:

Global Dim File.s(0)
Global FileCount

Procedure DoDroppedFiles()

	CompilerIf #PB_Compiler_Unicode
		#CharByte=2
	CompilerElse
		#CharByte=1
	CompilerEndIf

	Protected n,p,z
	Protected s.s,t.s

	If EventDropAction()=#PB_Drag_Move
		FileCount=0
	EndIf

	s=EventDropFiles()+#LF$
	z=CountString(s,#LF$)

	ReDim File(FileCount+z)

	n=0
	p=@s
	Repeat
		t=""
		While PeekC(p)<>#LF
			t+Chr(PeekC(p))
			p+#CharByte
		Wend
		p+#CharByte

		FileCount+1
		File(FileCount)=t
		n+1
	Until n=z

EndProcedure

Post by Michael Vogel »

Here's a demo file to see the speed difference (and how procedures slow down everything, so macros should be used more often)...

[and don't say you can't do time measuring while debugging is enabled - look at the time differences!]

Code:

Global Dim Files.s(0)
CompilerIf #PB_Compiler_Unicode
	#CharByte=2
CompilerElse
	#CharByte=1
CompilerEndIf

Procedure.s GetNextStringField(s.s,Char,*Offset.Integer)

	Protected t.s
	Protected mem,o

	mem=@s+*Offset\i*#CharByte

	While PeekC(mem)<>Char
		t+Chr(PeekC(mem))
		mem+#CharByte
	Wend

	*Offset\i=(mem-@s)/#CharByte+1

	ProcedureReturn t

EndProcedure
Procedure Demo(mode)

	Protected s.s,f.s
	Protected i,n,p,t

	For i=1 To 25000
		s+Str(i)+"."
	Next i

	n=CountString(s,".")
	ReDim Files(n)

	Debug "Mode "+Str(mode)+": "
	DisableDebugger
	
	t-ElapsedMilliseconds()
	If mode<3
		For i=1 To n
			If mode=1
				Files(i)=StringField(s,i,".")
			Else
				Files(i)=GetNextStringField(s,'.',@p)
			EndIf
		Next i
	Else
		p=@s
		For i=1 To n
			f=""
			While PeekC(p)<>'.'
				f+Chr(PeekC(p))
				p+#CharByte
			Wend
			p+#CharByte
			Files(i)=f
		Next i
	EndIf

	t+ElapsedMilliseconds()

	EnableDebugger
	Debug Files(789)+" = 789"
	Debug Str(t)+"ms"
	Debug ""

EndProcedure

r=Random(2)
For i=1 To 3
	Demo((i+r)%3+1)
Next i

Post by idle »

Michael Vogel wrote: [and don't say you can't do time measuring while debugging is enabled - look at the time differences!]

That's not entirely true. For speed tests you need to compile without the debugger, and use a console if you want to print output.
The overhead of procedures is marginal if you use pointers, and a procedure version can get more or less the same timings as the inline version.
It's a little hacky, though: it relies on calling the procedure once to reset its memory pointer before calling it in a loop.

Code:

Global Dim Files.s(0)

Procedure.s GetNextStringField(*in.string,Char,Reset=0)
   
   Static *mem.Character
   Protected t.s 
   
   If reset  
     *mem = *in 
     ProcedureReturn ""
   EndIf   
      
   While (*mem\c <> Char And *mem\c <> #Null) 
     t + Chr(*mem\c) 
     *mem+SizeOf(Character)
   Wend
      
   If *mem\c <> #Null
     *mem+SizeOf(Character)  
   EndIf 
      
   ProcedureReturn t

EndProcedure

Macro StartStringField(in,char) 
  GetNextStringField(in,Char,1)
EndMacro   
 
Procedure Demo(mode)

   Protected s.s,f.s
   Protected i,n,*p.Character,t

   For i=1 To 25000
      s+Str(i)+"."
   Next i

   n=CountString(s,".")
   ReDim Files(n)

   PrintN("Mode "+Str(mode)+": ")
   DisableDebugger
   
   t-ElapsedMilliseconds()
   If mode = 1 
     For i=1 To n
       Files(i)=StringField(s,i,".")
     Next      
   ElseIf mode = 2 
     StartStringField(@s,'.') 
     For i=1 To n
       Files(i)=GetNextStringField(@s,'.')
     Next  
     
   Else
      *p=@s
      For i=1 To n
         f=""
         While *p\c <>'.'
            f + Chr(*p\c) 
            *p + SizeOf(Character) 
         Wend
         *p + SizeOf(Character) 
         Files(i)=f
      Next i
   EndIf

   t+ElapsedMilliseconds()

   EnableDebugger
   PrintN(Files(789)+" = 789")
   PrintN(Str(t)+"ms")
   PrintN("")

EndProcedure

OpenConsole() 

r=Random(2)
For i=1 To 3
   Demo((i+r)%3+1)
Next i


Global in.s = "Hello I am a split string"
Global out.s

StartStringField(@in,' ')
Repeat 
  out =  GetNextStringField(@in,' ')
  PrintN(out) 
Until out = ""   

in.s = "Hello I am another splitted string"
StartStringField(@in,' ')
Repeat 
  out =  GetNextStringField(@in,' ')
  PrintN(out) 
Until out = "" 

Input() 

Post by wilbert »

If it's about speed, it is probably faster to use the While/Wend loop only to find the length of the field and then use a ProcedureReturn with PeekS to retrieve the result.
Building strings by adding one character at a time is slower.
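
A minimal sketch of that idea, not a tested replacement for the code above. The procedure name NextFieldPeekS and the cursor-passing convention are assumptions made for illustration; only PeekS, CountString and SizeOf are standard PureBasic. The loop merely measures the field, and the field is then copied in one PeekS call.

```purebasic
; Sketch only: NextFieldPeekS is a hypothetical name, not a PureBasic command.
; *Cursor points to an integer holding the address of the next unread character.
Procedure.s NextFieldPeekS(*Cursor.Integer, Delimiter.c)
  Protected *Start.Character = *Cursor\i
  Protected *Scan.Character = *Start
  Protected FieldLength
  Protected Result.s

  ; Only measure the field here; no string is built inside the loop
  While *Scan\c <> Delimiter And *Scan\c <> 0
    *Scan + SizeOf(Character)
  Wend

  ; Copy the whole field with a single PeekS (length is counted in characters)
  FieldLength = (*Scan - *Start) / SizeOf(Character)
  If FieldLength > 0
    Result = PeekS(*Start, FieldLength)
  EndIf

  ; Skip the delimiter, but stay on the terminating null at the end of the string
  If *Scan\c <> 0
    *Scan + SizeOf(Character)
  EndIf
  *Cursor\i = *Scan

  ProcedureReturn Result
EndProcedure

; Usage: the caller owns the cursor, so no Static state or reset call is needed
Define Source.s = "alpha|beta|gamma"
Define Position.i = @Source
Define i
For i = 1 To CountString(Source, "|") + 1
  Debug NextFieldPeekS(@Position, '|')
Next
```

Keeping the cursor in a caller-owned variable is a design choice of this sketch; it avoids the Static pointer and the separate reset call, at the cost of one extra parameter.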

Post by Demivec »

The methods posted so far for GetNextStringField() don't fully replace the functionality of StringField().

StringField() allows multi-character delimiters while GetNextStringField() doesn't.
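
One way to keep the single-pass behaviour while accepting multi-character delimiters is to let FindString() resume from a start position. The following is only a sketch under that assumption: SplitFields is a hypothetical helper name, not a built-in command, and it has not been benchmarked against the versions above.

```purebasic
; Sketch only: SplitFields is a hypothetical helper, not a built-in PureBasic command.
; Fills Result() starting at index 1 (like StringField) and returns the field count.
Procedure SplitFields(Source.s, Delimiter.s, Array Result.s(1))
  Protected Count, Found, FieldLength
  Protected Start = 1
  Protected DelimiterLength = Len(Delimiter)

  If DelimiterLength = 0
    ProcedureReturn 0
  EndIf

  ; One slot per delimiter plus one for the last field (index 0 stays unused)
  ReDim Result(CountString(Source, Delimiter) + 1)

  Repeat
    ; Resume the search at Start instead of rescanning from the beginning
    Found = FindString(Source, Delimiter, Start)
    Count + 1

    If Found
      FieldLength = Found - Start
    Else
      FieldLength = Len(Source) - Start + 1   ; remainder after the last delimiter
    EndIf

    If FieldLength > 0
      Result(Count) = Mid(Source, Start, FieldLength)
    Else
      Result(Count) = ""
    EndIf

    Start = Found + DelimiterLength
  Until Found = 0

  ProcedureReturn Count
EndProcedure

; Usage with a two-character delimiter
Dim Parts.s(0)
Define i, n
n = SplitFields("one<>two<>three", "<>", Parts())
For i = 1 To n
  Debug Parts(i)
Next
```

The source string is passed by value here, so it is copied once per call rather than once per field.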