Page 1 of 2

Faster Mid() with asm

Posted: Fri Oct 13, 2006 9:16 pm
by technicorn
Here's a replacement routine for Mid() that is faster than the build in function:

Code: Select all

! macro EndMacro {}

; Assembler macro to align code on a given boundary,
; aligning loop starts on 8 byte boundaries can speed up the code
; by about 1.5 to 2 times!

! macro calignjmp value
! {
!   local dest
!
!   if ((value - 1) - ((($ - $$) + value - 1) mod value)) > 3
!     makeDest equ 1
!     jmp dest
!   else
!     makeDest equ 0
!   end if
!
!   rept value
!   \{
!     if ($ - $$) mod value
!       nop
!     end if
!   \}
!   dest:
! } EndMacro

; Uncommend the next 3 lines, if you want to replace
; Mid with FMid in all places:
; Macro Mid(string, start, length)
;   FMid(@string,start,length)
; EndMacro

CompilerIf #PB_Compiler_Unicode
Procedure.s FMid(*srcPtr.Word, start.l, length.l)
  !MOV   edx,[p.p_srcPtr]
  !MOV   ecx,[p.v_length]
  !TEST  edx,edx
  !JZ    l_fmid_empty
  !TEST  ecx,ecx
  !JLE   l_fmid_empty
  !MOV   ecx,[p.v_start]
  !CMP   ecx,1
  !JLE   l_fmid_return
  !XOR   eax,eax
  !DEC   ecx
  !calignjmp 4
  fmid_scan:
  !CMP   [edx],ax
  !JE    l_fmid_scanend
  !ADD   edx,2
  !DEC   ecx
  !JNZ    l_fmid_scan
  fmid_scanend:
  !MOV   [p.p_srcPtr],edx
  fmid_return:
  ProcedureReturn PeekS(*srcPtr, length)
  fmid_empty:
  ProcedureReturn ""
EndProcedure
CompilerElse
Procedure.s FMid(*srcPtr.Byte, start.l, length.l)
  !MOV   edx,[p.p_srcPtr]
  !MOV   ecx,[p.v_length]
  !TEST  edx,edx
  !JZ    l_fmid_empty
  !TEST  ecx,ecx
  !JLE   l_fmid_empty
  !MOV   ecx,[p.v_start]
  !CMP   ecx,1
  !JLE   l_fmid_return
  !XOR   eax,eax
  !DEC   ecx
  !calignjmp 4
  fmid_scan:
  !CMP   [edx],al
  !JE    l_fmid_scanend
  !INC   edx
  !DEC   ecx
  !JNZ    l_fmid_scan
  fmid_scanend:
  !MOV   [p.p_srcPtr],edx
  fmid_return:
  ProcedureReturn PeekS(*srcPtr, length)
  fmid_empty:
  ProcedureReturn ""
EndProcedure
CompilerEndIf
And a program for speed testing:

Code: Select all

EnableExplicit
;IncludePath #PBIncludePath
XIncludeFile "FMid.pbi"

OpenConsole()

Global tStart.l, tStart2.l, tDiff.d
Global str1.s, str1Pos.l, str1Len.l = 16384, str1Mid.s
Global i.l, iEnd.l, strCh.l

strCh = 1
For i = 1 To str1Len
  str1 + Chr(strCh)
  strCh + 1
  If strCh > 255: strCh = 1: EndIf
Next i

iEnd = 10000000
str1Pos = 1

While str1Pos <= str1Len
  tStart2 = ElapsedMilliseconds(): Repeat: tStart = ElapsedMilliseconds(): Until tStart <> tStart2
  For i = 1 To iEnd
    str1Mid = Mid(str1, str1Pos, 1)
  Next i
  tDiff = (ElapsedMilliseconds() - tStart) / 1000.0
  If tDiff >= 1: iEnd / tDiff: EndIf
  PrintN(Right("      " + Str(str1Pos), 7) + ": " + StrD(tDiff, 3) + " sec., Ops per sec.: " + StrD(iEnd / tDiff, 3) + "    " + Str(Asc(str1Mid)))
  str1Pos + str1Pos
Wend
PrintN("")

iEnd = 1000000
str1Pos = 1
While str1Pos <= str1Len
  tStart2 = ElapsedMilliseconds(): Repeat: tStart = ElapsedMilliseconds(): Until tStart <> tStart2
  For i = 1 To iEnd
    str1Mid = FMid(@str1, str1Pos, 1)
  Next i
  tDiff = (ElapsedMilliseconds() - tStart) / 1000.0
  If str1Mid <> Mid(str1, str1Pos, 1): PrintN("Error"): EndIf
  If tDiff >= 1: iEnd / tDiff: EndIf
  PrintN(Right("      " + Str(str1Pos), 7) + ": " + StrD(tDiff, 3) + " sec., Ops per sec.: " + StrD(iEnd / tDiff, 3) + "    " + Str(Asc(str1Mid)))
  str1Pos + str1Pos
Wend
PrintN("")

PrintN(#CRLF$ + "Done")
While Inkey() = "": Delay(100): Wend
The PB Mid() is especial slow on extracting parts at the end of long strings,
but FMid() is even faster on extracting from the start of a string,
for longer string the speed gain is between 2-3 times for positions above
64 for ascii strings, and about 1.5 - 2 for unicode strings.

Posted: Fri Oct 13, 2006 9:19 pm
by thamarok
Respect!

This is very useful for me. May I use parts of your sourcecode for AromaBasic? AromaBasic should work like PureBasic, first preprocess the source to ASM and then compile to EXE through FASM. And as this shows a faster Mid() in action, I thought it would be very useful for me since I have not had the time to implement my own.

Thank you!

Posted: Fri Oct 13, 2006 9:33 pm
by Xombie
I'm really excited to try this one out but I'm getting an error...

Code: Select all

PureBasic.asm [108]:
   \{
error:  extra characters on line.
With or without InlineASM enabled. Help! ^_^

Posted: Fri Oct 13, 2006 9:35 pm
by thamarok
Xombie wrote:I'm really excited to try this one out but I'm getting an error...

Code: Select all

PureBasic.asm [108]:
   \{
error:  extra characters on line.
With or without InlineASM enabled. Help! ^_^
I was so aborbed in the subject of this topic that I forgot to test the code :lol:
But yes, I seem to have the same error...

Posted: Fri Oct 13, 2006 9:44 pm
by mskuma
Thanks alot, but I get the same error as Xombie (except mine's on 82 instead of 108).

Posted: Fri Oct 13, 2006 9:56 pm
by technicorn
That seems to be a problem with the FAsm.exe version,
I've installed V1.64.

Please donload the newer version from:
Flat assembler home

Before replacing the Fasm please make a backup of the original on,
although I use it all the time and never had any trouble with the newer
version.

And the source and any part of it is freeware, so have fun with it :D

Especial the calignjmp macro can be handy, unfortunately, it's of no use
before a basic statement like For, thats because of the way the assem code is generated by PB.
But you could use in with /COMMENTED, although reassembling didn't seem to work right now,
tried it and got some unresolved externals :?

I've tested the code on a AMD Athlon XP under Windows XP home.

Posted: Fri Oct 13, 2006 10:50 pm
by technicorn
Did some loop unrolling, no its about 4-5 times faster than PB Mid(),
and the unicode version is nearly as fast as the ascii one:

Code: Select all

! macro EndMacro {} 

; Assembler macro to align code on a given boundary, 
; aligning loop starts on 8 byte boundaries can speed up the code 
; by about 1.5 to 2 times! 

! macro calignjmp value 
! { 
!   local dest 
! 
!   if ((value - 1) - ((($ - $$) + value - 1) mod value)) > 3 
!     makeDest equ 1 
!     jmp dest 
!   else 
!     makeDest equ 0 
!   end if 
! 
!   rept value 
!   \{ 
!     if ($ - $$) mod value 
!       nop 
!     end if 
!   \} 
!   dest: 
! } EndMacro 

; Uncommend the next 3 lines, if you want to replace 
; Mid with FMid in all places: 
; Macro Mid(string, start, length) 
;   FMid2(@string,start,length) 
; EndMacro 

CompilerIf #PB_Compiler_Unicode
Procedure.s FMid2(*srcPtr.Byte, start.l, length.l) 
  !MOV    edx,[p.p_srcPtr] 
  !MOV    ecx,[p.v_length] 
  !TEST   edx,edx 
  !JZ     l_fmid2_empty 
  !TEST   ecx,ecx 
  !JLE    l_fmid2_empty 
  !MOV    ecx,[p.v_start] 
  !CMP    ecx,1 
  !JLE    l_fmid2_return 
  !XOR    eax,eax 
  !DEC    ecx 
  !calignjmp 4 
  fmid2_scan1:
  !SUB    ecx,8
  !JL     l_fmid2_scanend1
  !CMP    [edx],ax
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !CMP    [edx],ax 
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !JMP    l_fmid2_scan1

  fmid2_scanend1:
  !ADD    ecx,8
  !JZ     l_fmid2_scanend2
  fmid2_scan2:
  !CMP    [edx],ax
  !JE     l_fmid2_scanend2
  !ADD    edx,2
  !DEC    ecx
  !JNZ    l_fmid2_scan2
  fmid2_scanend2: 
  !MOV   [p.p_srcPtr],edx 
  fmid2_return: 
  ProcedureReturn PeekS(*srcPtr, length) 
  fmid2_empty: 
  ProcedureReturn "" 
EndProcedure 
CompilerElse
Procedure.s FMid2(*srcPtr.Byte, start.l, length.l) 
  !MOV    edx,[p.p_srcPtr] 
  !MOV    ecx,[p.v_length] 
  !TEST   edx,edx 
  !JZ     l_fmid2_empty 
  !TEST   ecx,ecx 
  !JLE    l_fmid2_empty 
  !MOV    ecx,[p.v_start] 
  !CMP    ecx,1 
  !JLE    l_fmid2_return 
  !XOR    eax,eax 
  !DEC    ecx 
  !calignjmp 4 
  fmid2_scan1:
  !SUB    ecx,8
  !JL     l_fmid2_scanend1
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !JMP    l_fmid2_scan1

  fmid2_scanend1:
  !ADD    ecx,8
  !JZ     l_fmid2_scanend2
  fmid2_scan2:
  !CMP    [edx],al 
  !JE     l_fmid2_scanend2
  !INC    edx 
  !DEC    ecx 
  !JNZ    l_fmid2_scan2
  fmid2_scanend2: 
  !MOV   [p.p_srcPtr],edx 
  fmid2_return: 
  ProcedureReturn PeekS(*srcPtr, length) 
  fmid2_empty: 
  ProcedureReturn "" 
EndProcedure 
CompilerEndIf
And a test program to extract random start and length strings, also
with negatice values, to see that it works and don't crashes:

Code: Select all

EnableExplicit
;IncludePath #PBIncludePath
XIncludeFile "FMid_DL.pbi"

OpenConsole()

Global tStart.l, tStart2.l, tDiff.d
Global str1.s, str1Pos.l, str1Len.l = 4096, str1Mid.s, str1FMid.s
Global i.l, iEnd.l, strCh.l
Global start.l, length.l

strCh = 1
For i = 1 To str1Len
  str1 + Chr(strCh)
  strCh + 1
  If strCh > 255: strCh = 1: EndIf
Next i

iEnd = 10000
For i = 1 To iEnd
  start = Random(10000)-5000
  length = Random(10000)-5000
  str1Mid = Mid(str1, start, length)
  str1FMid = FMid(@str1, start, length)
  If str1Mid <> str1FMid
    PrintN("Error FMid: " + Str(start) + ", " + Str(length))
    CallDebugger
  EndIf
Next i

iEnd = 10000
For i = 1 To iEnd
  start = Random(10000)-5000
  length = Random(10000)-5000
  str1Mid = Mid(str1, start, length)
  str1FMid = FMid2(@str1, start, length)
  If str1Mid <> str1FMid
    PrintN("Error FMid2: " + Str(start) + ", " + Str(length))
    CallDebugger
  EndIf
Next i

PrintN(#CRLF$ + "Done")
While Inkey() = "": Delay(100): Wend
Because the procedure is only included once, it's worth the memory I think.

Posted: Sat Oct 14, 2006 12:07 am
by Xombie
Great! I can't wait to test this out and use it :)

What do you use for reference/study to learn ASM? I'm very impressed and have always wanted to learn more ASM.

Posted: Sat Oct 14, 2006 6:08 pm
by technicorn
Hi Xombie,

learning assembler is quite easy today I think.
For example using PB with inline asm enabled, you can step through all instructions
and watch the register values and program flow.
Beside that have a look at http://www.agner.org/optimize/

And you will need the Intel and/or AMD documentation for the CPU,
and don't get intimidated by the number of pages of thies manuals,
togehter it's about 3000!
But you will mainly need the section about application programming.
Also the optimization docs by Intel and AMD are worth reading.


To all using FMid, please change the "calignjmp 4" -> "calignjmp 8",
uploaded a test version by mistake.

Posted: Sun Oct 15, 2006 2:39 am
by SoulReaper
Nice to see Assembler :)

Reminds me of the Z80/6809E/68000 Assembler days Respect :!: :roll: :wink:

Regards
Kevin :) :wink:

Posted: Wed Oct 18, 2006 1:13 am
by Amundo
Reminds me of the Z80/6809E/68000 Assembler days
Ahh, the 6809....there never was a better hybrid of 8 and 16-bit microprocessor technology....there'll never be a better instruction set...

Posted: Wed Oct 18, 2006 1:27 am
by Joakim Christiansen
Hey Fred, we wont allow this! (people making faster functions than the inbuilt) :P

Posted: Wed Oct 18, 2006 1:41 am
by dracflamloc
On the contrary, its great to speed up processing of DracScript!

Now if someone can convert standard pb functions and parameters which internally use this fast method it'd make life and integration much better!

Posted: Wed Oct 18, 2006 6:07 am
by JCV
nice! :D thx

Posted: Thu Oct 19, 2006 8:17 pm
by technicorn
@dracflamloc:

There are three lines in the sourcecode you can use to make PB use FMid as if it is a normal Mid in all places:

Code: Select all

; Uncommend the next 3 lines, if you want to replace 
; Mid with FMid in all places: 
; Macro Mid(string, start, length) 
;   FMid2(@string,start,length) 
; EndMacro 
so you don't have to change your code to pass string pointers to FMid.