String

Michael Vogel
Addict
Posts: 2799
Joined: Thu Feb 09, 2006 11:27 pm

String

Post by Michael Vogel »

Need to speed up the following procedure:

Code: Select all

Procedure.s RegularSortBrakes(s.s)

	#CharByte=1<<#PB_Compiler_Unicode

	Protected n,z
	Protected x

	x='>'
	z=Len(s)*#CharByte
	While n<z
		If PeekC(@s+n)='<'
			x!2;	x='>'+'<'-x
			PokeC(@s+n,x)
		EndIf
		n+#CharByte
	Wend

	ProcedureReturn s

EndProcedure
I tried using pointers, but that gave no speed-up. I also did not manage to determine the string length inside the procedure, so it is passed in as a parameter:

Code: Select all

Procedure.s RegularSortBrakes2(*s.string,z)

	#CharByte=1<<#PB_Compiler_Unicode

	Protected n;,z
	Protected x

	x='>'
	;z=Len(*s\s)
	While n<z
		If PeekC(*s)='<'
			x!2;	x='>'+'<'-x
			PokeC(*s,x)
		EndIf
		n+1
		*s+#CharByte
	Wend

	;ProcedureReturn s

EndProcedure
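A possible way around the length problem, as a minimal sketch: PureBasic's MemoryStringLength() returns the character count of a null-terminated string given only its address, so the length does not have to be passed in. The procedure name CountViaPointer is made up for illustration and is not part of the original post.

```purebasic
; Hypothetical helper (not from the original post): obtain the length
; of a string inside a procedure that only receives its address.
Procedure.i CountViaPointer(*s)
  ; MemoryStringLength() counts characters up to the terminating null,
  ; using the string format of the current executable by default.
  ProcedureReturn MemoryStringLength(*s)
EndProcedure

a$ = "test<<A>B"
Debug CountViaPointer(@a$)   ; 9 characters
```

Since such a loop can also simply stop at the terminating null character, the length is not strictly needed at all.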
srod
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Re: String

Post by srod »

For the life of me I don't know what that routine is supposed to do, but:

Code: Select all

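Procedure RegularSortBrakes2(*ptrChar.CHARACTER)
  Protected x = '>'
  While *ptrChar\c                  ; loop until the terminating null character
    If *ptrChar\c = '<'
      x!2;   x='>'+'<'-x
      *ptrChar\c = x                ; every 2nd '<' becomes '>'
    EndIf
    *ptrChar + SizeOf(CHARACTER)    ; advance by one character (1 or 2 bytes)
  Wend
EndProcedure

a$ = "test<<A>B"
RegularSortBrakes2(@a$)             ; the string is modified in place
Debug a$                            ; test<>A>B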
I guess ASM would be your next port of call.
I may look like a mule, but I'm not a complete ass.

Re: String

Post by Michael Vogel »

Thanks for your help, seems to be around 10% faster now :)

The routine is needed to presort regular expression markers (for group names); it is used in a file rename tool I wrote some time ago...
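To make the purpose concrete, here is a minimal sketch of what the routine does to a pattern with named groups. It assumes, going beyond what the post states, that group names are first written with '<' on both sides and that every second '<' should then become '>'. The procedure name FixGroupMarkers and the sample pattern are invented for illustration; the loop itself follows srod's pointer version.

```purebasic
; Hypothetical illustration: every second '<' is turned into '>'.
Procedure FixGroupMarkers(*ptrChar.Character)
  Protected x = '>'              ; 62; toggles between 62 ('>') and 60 ('<')
  While *ptrChar\c               ; stop at the terminating null character
    If *ptrChar\c = '<'
      x ! 2                      ; 62 ! 2 = 60, 60 ! 2 = 62 (same as x = '>' + '<' - x)
      *ptrChar\c = x
    EndIf
    *ptrChar + SizeOf(Character)
  Wend
EndProcedure

pattern$ = "(?<year<\d+)-(?<month<\d+)"   ; assumed input format
FixGroupMarkers(@pattern$)
Debug pattern$   ; (?<year>\d+)-(?<month>\d+)
```

The XOR works because the character codes of '<' (60) and '>' (62) differ only in bit 1.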
Fig
Enthusiast
Posts: 352
Joined: Thu Apr 30, 2009 5:23 pm
Location: Côtes d'Azur, France

Re: String

Post by Fig »

Code: Select all

Procedure.s RegularSortBrakesAsm(*ptrChar)
   EnableASM 
   CompilerIf #PB_Compiler_Unicode
      MOV al,62 ;>
      MOV ebp,dword [p.p_ptrChar]
      DEC ebp
      DEC ebp
      !jump1:
      INC ebp
      INC ebp
      MOV bl,byte [ebp]
      CMP bl,0
      JZ efin1
      CMP bl,60 ;<
      JNE jump1
      XOR al,2
      MOV byte [ebp],al
      JMP jump1
      !efin1:
   CompilerElse
      MOV al,62 ;>
      MOV ebp,dword [p.p_ptrChar]
      DEC ebp
      !jump1:
      INC ebp
      MOV bl,byte [ebp]
      CMP bl,0
      JZ efin1
      CMP bl,60 ;<
      JNE jump1
      XOR al,2
      MOV byte [ebp],al
      JMP jump1
      !efin1:
   CompilerEndIf
   DisableASM
EndProcedure
I didn't test the non-Unicode part, because my PB is Unicode :mrgreen:
It could be even faster if you read 32 or 64 bits from memory at once (instead of a single byte; a double word is fetched anyway on each memory access) and unroll the loop. Let me know if it's really useful...
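The idea of reading more than one character per memory access can be sketched in plain PureBasic as follows. This is only an illustration under stated assumptions, not Fig's code: it assumes a Unicode executable (2 bytes per character) on little-endian x86/x64, takes the length as a parameter so that no read goes past the string, and the name ToggleBracketsPairs is made up. Whether it is actually faster has not been measured here.

```purebasic
; Hypothetical sketch: handle two UTF-16 characters per 32-bit memory read.
Procedure ToggleBracketsPairs(*p, length)
  Protected x = '>'
  Protected v.l, lo, hi

  While length >= 2
    v  = PeekL(*p)               ; one read fetches two characters
    lo = v & $FFFF               ; first character (low word)
    hi = (v >> 16) & $FFFF       ; second character (high word)
    If lo = '<' Or hi = '<'
      If lo = '<' : x ! 2 : lo = x : EndIf
      If hi = '<' : x ! 2 : hi = x : EndIf
      PokeL(*p, (hi << 16) | lo) ; write back only when something changed
    EndIf
    *p + 4
    length - 2
  Wend

  If length = 1 And PeekU(*p) = '<'   ; odd length: one character left
    x ! 2
    PokeU(*p, x)
  EndIf
EndProcedure

s$ = ">>>>>>>>>>>>>>>xXXX<<<<"
ToggleBracketsPairs(@s$, Len(s$))
Debug s$   ; >>>>>>>>>>>>>>>xXXX<><>
```

Passing the length avoids having to test both halves of each 32-bit value for the terminating null.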
There are 2 methods to program bugless.
But only the third works fine.

Win10, Pb x64 5.71 LTS

Re: String

Post by Michael Vogel »

Thanks to all, it seems that both of your codes are quicker than mine :)

I changed them a little to fit my purposes and added the result to my SpeedRenamer tool*...

___
*) Don't be surprised by the filenames shown when starting the program for the first time; they are loaded from the included demo file SpeedRename.lst

Code: Select all


#Code=4

#TokenByte='*'
#Token=Chr(#TokenByte)

CompilerIf #PB_Compiler_Unicode
	#CharByte=2
CompilerElse
	#CharByte=1
CompilerEndIf

Procedure.s RegularSortBrakesV1(s.s)

	Protected n,z
	Protected x

	x='>'
	z=Len(s)
	While n<z
		If PeekC(@s+n*#CharByte)='<'
			x='>'+'<'-x
			PokeC(@s+n*#CharByte,x)
		EndIf
		n+1
	Wend

	ProcedureReturn s

EndProcedure
Procedure.s RegularSortBrakesV2(s.s)

	Protected n,z
	Protected x

	x='>'
	z=Len(s)*#CharByte
	While n<z
		If PeekC(@s+n)='<'
			x!2;	x='>'+'<'-x
			PokeC(@s+n,x)
		EndIf
		n+#CharByte
	Wend

	ProcedureReturn s

EndProcedure
Procedure.s RegularSortBrakesSrod(*ptrChar.CHARACTER)

	Protected x = '>'

	While *ptrChar\c
		If *ptrChar\c = #TokenByte
			x!2;   x='>'+'<'-x
			*ptrChar\c = x
		EndIf
		*ptrChar + SizeOf(CHARACTER)
	Wend

EndProcedure
Procedure.s RegularSortBrakesAsm(*ptrChar)

	EnableASM

	MOV al,62 ;>
	MOV ebp,dword [p.p_ptrChar]

	DEC ebp
	CompilerIf #PB_Compiler_Unicode
		DEC ebp
	CompilerEndIf

	!jump1:

	CompilerIf #PB_Compiler_Unicode
		INC ebp
	CompilerEndIf

	INC ebp
	MOV bl,byte [ebp]
	CMP bl,0
	JZ efin1
	CMP bl,#TokenByte
	JNE jump1

	XOr al,2
	MOV byte [ebp],al
	JMP jump1
	!efin1:

	DisableASM

EndProcedure

Procedure ss(s.s)




EndProcedure

#Test="**********"

DisableDebugger
t-ElapsedMilliseconds()

For i=0 To 999999
	s.s=#Test
	CompilerSelect #Code
	CompilerCase 1
		s=RegularSortBrakesV1(ReplaceString(s,#Token,"<"))
	CompilerCase 2
		ReplaceString(s,#Token,"<",#PB_String_InPlace)
		s=RegularSortBrakesV2(s)
	CompilerCase 3
		RegularSortBrakesSrod(@s)
	CompilerCase 4
		RegularSortBrakesAsm(@s)
	CompilerEndSelect
Next i

t+ElapsedMilliseconds()
EnableDebugger
MessageRequester("Time",s+" = "+Str(t)+"ms")
CELTIC88
Enthusiast
Posts: 154
Joined: Thu Sep 17, 2015 3:39 pm

Re: String

Post by CELTIC88 »

Code: Select all

DisableDebugger
Procedure LODSWandSTOSW(d,l)
  EnableASM
  MOV	ecx, [esp + 8]
  MOV	esi, [esp + 4]
  MOV	edi, [esp + 4]
  !looop:
  LODSW
  CMP AX, '>'
  JNZ skip
  XOR AX,2
  !skip:
  STOSW 
  LOOP looop ;loop ecx
  DisableASM
EndProcedure
EnableDebugger

STRr.s = ">>>>>>>>>>>>>>>xXXX<<<<"
LODSWandSTOSW(@STRr,Len(STRr))
Debug STRr
interested in Cybersecurity..

Re: String

Post by CELTIC88 »

Ascii - unicode version

Code: Select all


DisableDebugger
EnableASM
Procedure LODSandSTOS(d,l) ;Size 22 byte
  
  CompilerIf #PB_Compiler_Unicode
    Macro _SUA:W:EndMacro
  CompilerElse
    Macro _SUA:B:EndMacro
  CompilerEndIf
  
  ;PUSHA
  MOV	ecx, [esp + 8 ]
  MOV	esi, [esp + 4 ]
  MOV	edi, [esp + 4 ]
  !looop:
  LODS#_SUA
  CMP Al, '>'
  JNE .__skip
  XOR Al,2
  !.__skip:
  STOS#_SUA
  LOOP looop ;loop ecx
  ;POPA

EndProcedure

STRr.s = ">>>>>>>>>>>>>>>xXXX<<<<"
LODSandSTOS(@STRr,Len(STRr))
MessageRequester("", STRr)

EnableDebugger

Re: String

Post by Fig »

CELTIC88 wrote:Ascii - unicode version

Nice code!
However, the expected result is:

Code: Select all

>>>>>>>>>>>>>>>xXXX<><>

Re: String

Post by CELTIC88 »

Hi @Fig,
did you test it with a Unicode build of PB?

I tested it with PB 5.51 Unicode:
[screenshot of the result; image not preserved]

Re: String

Post by Fig »

Try Vogel's first code. The result should be ">>>>>>>>>>>>>>>xXXX<><>", not the one you got.
wilbert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: String

Post by wilbert »

Complex instructions like LODSW, STOSW and LOOP are usually slower.
Besides that, you have to preserve all non-volatile registers.
Windows (x64)
Raspberry Pi OS (Arm64)

Re: String

Post by Fig »

Concerning volatile/non-volatile registers: as long as you understand what you are doing with them (i.e. that it prevents access to the stack, and therefore to local variables), I don't see any reason to save and restore them in a pure asm procedure.
I usually use all registers to optimise inner loops in order to reduce memory accesses.

Have you ever observed a problem caused by a procedure changing these registers, and if so, why? I'm curious about it...

Re: String

Post by CELTIC88 »

:lol: Ah, OK, I understand now :P

Code: Select all

EnableASM
Procedure LODSandSTOS(d) ;Size 22 byte
  CompilerIf #PB_Compiler_Unicode
    Macro _SUA:W:EndMacro
  CompilerElse
    Macro _SUA:B:EndMacro
  CompilerEndIf
  
  PUSH esi edi ebx
  
  MOV bl ,'>'
    
  MOV   esi, [esp + 4 + 12]
  MOV   edi, [esp + 4 + 12]
  
  !looop:
  
  LODS#_SUA
  CMP Al, '>'
  JNE .__skip
  
  XOR bl, 2
  MOV Al,bl
  
  !.__skip:
  STOS#_SUA
  
  TEST Al,Al
  JNZ looop
  
  POP ebx edi esi

EndProcedure

STRr.s = ">>>>>>>>>>>>>>>xXXX<<<<"
LODSandSTOS(@STRr)
MessageRequester("", STRr)

EnableDebugger
Last edited by CELTIC88 on Wed Jan 10, 2018 7:12 am, edited 1 time in total.
Olliv
Enthusiast
Posts: 542
Joined: Tue Sep 22, 2009 10:41 pm

Re: String

Post by Olliv »

Hello friends! Happy new year!

Code: Select all

;*******************************************************************
Macro UnicodeLocalVersion(StringName) ; use eax, ecx, edx
! xor eax, eax
! xor ecx, ecx
! mov edx, [v_#StringName] ; in a proc, replace v_ with p.v_
! ulvStarted#MacroExpandedCount:
! add ax, [edx]
! jz ulvFinished#MacroExpandedCount
! cmp ax, '<'
! jnz ulvUnchanged#MacroExpandedCount
! xor ax, cx
! xor cx, 2
! ulvUnchanged#MacroExpandedCount:
! mov [edx], ax
! add edx, 2
! xor eax, eax
! jmp ulvStarted#MacroExpandedCount
! ulvFinished#MacroExpandedCount:
EndMacro

Re: String

Post by wilbert »

Fig wrote:Concerning volatile/non-volatile registers: as long as you understand what you are doing with them (i.e. that it prevents access to the stack, and therefore to local variables), I don't see any reason to save and restore them in a pure asm procedure.
I usually use all registers to optimise inner loops in order to reduce memory accesses.

Have you ever observed a problem caused by a procedure changing these registers, and if so, why? I'm curious about it...
It's just that officially you should preserve registers like ebx, edi, esi and ebp and the PB documentation also mentions that.
If PB depends on the value of one of those registers after calling your procedure, it will cause problems if your procedure changed them.
It may work perfectly fine now but it's not guaranteed.
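One way to sidestep the question is to restrict the loop to registers a procedure may freely overwrite (eax, ecx, edx and their 64-bit counterparts), so nothing has to be saved or restored. The following is a minimal sketch under several assumptions: a Unicode executable, the ASM backend (not the C backend), and made-up names (ToggleBracketsVolatile, the tbv_ labels). It is not code from this thread.

```purebasic
; Hypothetical sketch: toggle loop using only volatile registers.
Procedure ToggleBracketsVolatile(*ptrChar)
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_ptrChar]   ; rdx = address of the current character
    !mov al, '>'              ; al toggles between '>' (62) and '<' (60)
    !tbv_loop:
    !mov cx, [rdx]            ; read one UTF-16 character
    !test cx, cx
    !jz tbv_done              ; terminating null reached
    !cmp cx, '<'
    !jne tbv_next
    !xor al, 2
    !mov [rdx], al            ; low byte only; the high byte is already 0
    !tbv_next:
    !add rdx, 2
    !jmp tbv_loop
    !tbv_done:
  CompilerElse
    !mov edx, [p.p_ptrChar]   ; same loop with 32-bit addressing
    !mov al, '>'
    !tbv_loop:
    !mov cx, [edx]
    !test cx, cx
    !jz tbv_done
    !cmp cx, '<'
    !jne tbv_next
    !xor al, 2
    !mov [edx], al
    !tbv_next:
    !add edx, 2
    !jmp tbv_loop
    !tbv_done:
  CompilerEndIf
EndProcedure

s$ = "test<<A>B"
ToggleBracketsVolatile(@s$)
Debug s$   ; test<>A>B
```

Because ebx, esi, edi and ebp are never touched, the procedure does not depend on how PureBasic uses those registers around the call.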