Own API/ASM command for StrLen() !?

Posted: Wed Oct 05, 2005 12:32 am
by va!n
Is there any way to get a command like StrLen() (using the API or inline ASM) without using any PureBasic lib? I think the PureBasic command MemoryStringLength() does exactly what StrLen() does. The question is just how to get such a command without using any lib.

Many thanks in advance! Btw, StrLen_() does not work. Is it only a C/C++ function or a pure API command?



[Edit]
It seems strlen() is only a C/C++ function from the standard library and not an API function. I have searched for an ASM replacement for StrLen() (because I want to write my own lib without using any other lib).

This is supposed to be the fastest StrLen() routine. Can someone help me get it to work? (Or if you have a faster routine, please share it here.) Thanks!

Plain x86 code I found (by lingo12):

Code: Select all

    xor     edx, edx            ; edx = 0 (byte offset); esi = string start
C2_loop:
    mov     eax, [esi+edx]      ; get a dword (buffer is aligned)
    lea     ecx, [eax-1010101h] ; subtract 1 from each byte in eax
    add     edx, 4              ; ready for next dword
    and     ecx, 80808080h      ; keep the sign bit of each byte
    jz      C2_loop             ; no candidate byte, loop again

    test    eax, 000000FFh      ; is byte 0 (al) zero?
    jz      C2_minus4
    test    eax, 0000FF00h      ; is byte 1 (ah) zero?
    jz      C2_minus3
    test    eax, 00FF0000h      ; is byte 2 zero?
    jz      C2_minus2
    test    eax, 0FF000000h     ; is byte 3 zero?
    jnz     C2_loop             ; false alarm (a byte above 80h), loop again
    lea     eax, [edx-1]        ; eax = length of string
    ret
C2_minus2:
    lea     eax, [edx-2]        ; eax = length of string
    ret
C2_minus3:
    lea     eax, [edx-3]        ; eax = length of string
    ret
C2_minus4:
    lea     eax, [edx-4]        ; eax = length of string
    ret
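
For anyone who reads C more easily than ASM, here is a rough sketch of what the routine above does. It is only an illustration, not lingo12's code: the name strlen_dword is made up, and it assumes a little-endian CPU, a 4-byte aligned string and a buffer padded to a multiple of 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: length of a NUL-terminated string, 4 bytes per step.
   Assumptions: little-endian CPU, s is 4-byte aligned, buffer padded to 4 bytes. */
size_t strlen_dword(const char *s)
{
    assert(((uintptr_t)s & 3u) == 0);      /* the ASM assumes an aligned buffer too */
    size_t i = 0;
    for (;;) {
        uint32_t w;
        memcpy(&w, s + i, sizeof w);       /* load one dword */
        i += 4;                            /* ready for the next dword */
        /* Cheap filter: fires for zero bytes, but also for bytes of 0x81 and up. */
        if (((w - 0x01010101u) & 0x80808080u) == 0)
            continue;
        /* Confirm byte by byte, lowest address first. */
        if ((w & 0x000000FFu) == 0) return i - 4;
        if ((w & 0x0000FF00u) == 0) return i - 3;
        if ((w & 0x00FF0000u) == 0) return i - 2;
        if ((w & 0xFF000000u) == 0) return i - 1;
        /* False alarm caused by a high byte: keep scanning. */
    }
}
```

The cheap filter is why the routine needs the four byte tests afterwards: text with many characters above 80h falls through to them on almost every dword.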

Possibly this x86 code is a lot faster (by revolution):

Code: Select all

;edx = string start; on exit ecx = length
        lea     ecx,[edx+4]             ;load and increment pointer
        mov     ebx,[edx]               ;read first 4 bytes
        lea     edx,[edx+7]             ;pointer+7 used in the end
._1:    lea     eax,[ebx-01010101h]     ;subtract 1 from each byte
        xor     ebx,-1                  ;invert all bytes
        and     eax,ebx                 ;and these two
        mov     ebx,[ecx]               ;read next 4 bytes
        add     ecx,4                   ;increment pointer
        and     eax,80808080h           ;test all sign bits
        jz      ._1                     ;no zero bytes, continue loop
        test    eax,00008080h           ;test first two bytes
        jnz     ._2
        shr     eax,16                  ;not in the first 2 bytes
        add     ecx,2
._2:    shl     al,1                    ;use carry flag to avoid a branch
        sbb     ecx,edx                 ;compute length
        lea     edx,[edx-7]             ;restore pointer
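
The extra "invert and AND" step in this version makes the test exact: the lowest flagged byte is always a real zero, so no re-check loop is needed. A minimal C sketch of that idea follows. The names zero_byte_flags and strlen_exact are made up for illustration, it assumes a little-endian CPU with a 4-byte aligned, padded buffer, and it uses plain branches where the ASM uses the carry flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit 7 of a byte is set in the result if that byte of w is zero.
   Flags above the first zero byte may be spurious; the lowest one never is. */
static uint32_t zero_byte_flags(uint32_t w)
{
    return (w - 0x01010101u) & ~w & 0x80808080u;
}

/* Hypothetical helper. Assumptions: little-endian CPU, s is 4-byte aligned,
   buffer padded to a multiple of 4 bytes. */
size_t strlen_exact(const char *s)
{
    assert(((uintptr_t)s & 3u) == 0);
    size_t i = 0;
    uint32_t w, flags;
    for (;;) {
        memcpy(&w, s + i, sizeof w);       /* read 4 bytes */
        flags = zero_byte_flags(w);
        if (flags != 0)
            break;                         /* this dword holds the terminator */
        i += 4;
    }
    if ((flags & 0x00008080u) == 0) {      /* not in the first two bytes */
        flags >>= 16;
        i += 2;
    }
    if ((flags & 0x80u) == 0)              /* not the lower byte of the pair */
        i += 1;
    return i;
}
```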

Here is a very fast MMX version... @fred: maybe you can add this as the first MMX-supported command for v4? ;-)

Code: Select all

; MMX version by Ryan Mack
; Roughly 13 + 3n + BRANCH clocks on a P-II

const unsigned __int64 STRINGTBL[8] = {0, 0xff,
        0xffff, 0xffffff, 0xffffffff, 0xffffffffff,
        0xffffffffffff, 0xffffffffffffff};

/* ... */

    pxor     mm1, mm1
    mov      ecx, eax
    mov      edx, eax
    and      ecx, -8
    and      eax, 7
    movq     mm0, [ecx]
    por      mm0, [STRINGTBL+eax*8]
MAIN:
    add      ecx, 8
    pcmpeqb  mm0, mm1
    packsswb mm0, mm0
    movd     eax, mm0
    movq     mm0, [ecx]
    test     eax, eax
    jz       MAIN

    bsf      eax, eax
    shr      eax, 2
    lea      ecx, [ecx+eax-8]
    sub      ecx, edx
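
The interesting part of the MMX routine is the table: the pointer is rounded down to an 8-byte boundary, and the bytes in front of the string are ORed to 0xFF so they can never look like a terminator. Below is a hedged C sketch of just that trick. It swaps the pcmpeqb/packsswb/bsf sequence for the subtract-and-mask test on a 64-bit integer, the name strlen_qword is made up, and it assumes a little-endian CPU and that the 8-byte words surrounding the string are readable (the ASM makes the same assumption).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* MASK[n] has its n lowest bytes set to 0xFF (same values as STRINGTBL). */
static const uint64_t MASK[8] = {
    0x0ull, 0xffull, 0xffffull, 0xffffffull, 0xffffffffull,
    0xffffffffffull, 0xffffffffffffull, 0xffffffffffffffull
};

/* Hypothetical helper. Assumptions: little-endian CPU; the aligned 8-byte
   words that contain the string are fully readable. */
size_t strlen_qword(const char *s)
{
    uintptr_t addr = (uintptr_t)s;
    const unsigned char *p = (const unsigned char *)(addr & ~(uintptr_t)7);
    uint64_t w;

    assert(s != NULL);
    memcpy(&w, p, sizeof w);               /* first aligned qword */
    w |= MASK[addr & 7];                   /* hide the bytes in front of s */
    for (;;) {
        uint64_t flags = (w - 0x0101010101010101ull) & ~w
                         & 0x8080808080808080ull;
        if (flags != 0) {
            size_t k = 0;                  /* index of the lowest flagged byte */
            while ((flags & 0x80u) == 0) {
                flags >>= 8;
                k++;
            }
            return (size_t)(p + k - (const unsigned char *)s);
        }
        p += 8;
        memcpy(&w, p, sizeof w);           /* next aligned qword */
    }
}
```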

Posted: Wed Oct 05, 2005 5:57 am
by Rescator
You really should add an optional maxlength/buffer length argument,
as those length functions are a bit different from PB's own (PB always knows the length of its own strings).
Otherwise one could easily end up with a loop that runs until it hits the first binary 0,
or until it crashes, most likely at the end of memory.

Look up StringCbLength in the PSDK for a good and safe example of a string length function.
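
To illustrate the max length idea, here is a minimal sketch in C. It is not the PSDK function: bounded_strlen and its parameters are made-up names, and it only loosely follows the same contract (fail instead of scanning past the limit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores the length in *out_len and returns true if a
   terminator is found within the first max_len bytes, otherwise returns false
   and leaves *out_len untouched. It never reads past s[max_len - 1]. */
bool bounded_strlen(const char *s, size_t max_len, size_t *out_len)
{
    assert(out_len != NULL);               /* caller error, not a data error */
    if (s == NULL)
        return false;
    for (size_t i = 0; i < max_len; i++) {
        if (s[i] == '\0') {
            *out_len = i;
            return true;
        }
    }
    return false;                          /* no terminator inside the buffer */
}
```

The caller passes the size of the buffer it actually owns, so an unterminated buffer gives a clean failure instead of a read past its end.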

Posted: Wed Oct 05, 2005 7:40 am
by traumatic
What about these?

Straight PB way:

Code: Select all

Procedure strlen(*str.byte)
  While *str\b
    *str+1
    i+1
  Wend
  ProcedureReturn i
EndProcedure
  
Debug strlen(@"hello world!")

API way:

Code: Select all

kernel32dll.l = OpenLibrary(#PB_Any, "kernel32.dll")
*strlen.l = IsFunction(kernel32dll, "lstrlen")

Debug CallFunctionFast(*strlen, "hello world!")

CloseLibrary(kernel32dll)

EDIT: lol, of course you can call lstrlen_() directly as well... :oops:

Posted: Wed Oct 05, 2005 3:22 pm
by Thalius
Or use PHPStrLen() from Deem's Compiler Lib =)
I personally love that one. It often comes in handy. I never tested the speed, but it was never an issue either.

What's in the Lib?

AddCSlashes
AddSlashes
ChunkSplit
CountChars
Count_CharsS
Explode
Implode
Join
Ord
StrCSpn
StrChr
StrLen
StrPos
StrRChr
StrRPos
StrRepeat
StrRev
StrRot13
StrSpn
StrStr
StrTok
StrTr
SubStr
UCFirst

Where to get?
PureArea.net -> http://www.purearea.net/pb/download/use ... String.zip

Edit: *smack* i was in a hurry ... :roll: anyway... :)

Thalius

Posted: Wed Oct 05, 2005 3:32 pm
by traumatic
Thalius wrote:or use PHPStrLen() from Deem's Compiler Lib =)
va!n wrote:without using any PureBasic lib
;)

Posted: Wed Oct 05, 2005 7:52 pm
by remi_meier
MMX-Version (I think it should be correctly translated):

Code: Select all

; MMX version by Ryan Mack
; Translated to PureBasic by Remi
; Roughly 13 + 3n + BRANCH clocks on a P-II 


DataSection
STRINGTBL:
; 8 quadwords, each stored low dword first (x86 is little-endian)
Data.l 0, 0, $ff, 0
Data.l $ffff, 0, $ffffff, 0
Data.l $ffffffff, 0, $ffffffff, $ff
Data.l $ffffffff, $ffff, $ffffffff, $ffffff
EndDataSection

Procedure.l STRLen(pString.l)
    MOV eax, pString

    !pxor     mm1, mm1 
    !MOV      ecx, eax 
    !MOV      edx, eax 
    !And      ecx, -8 
    !And      eax, 7 
    !movq     mm0, [ecx] 
    !por      mm0, [l_stringtbl+eax*8] 
!@@: 
    !ADD      ecx, 8 
    !pcmpeqb  mm0, mm1 
    !packsswb mm0, mm0 
    !movd     eax, mm0 
    !movq     mm0, [ecx] 
    !TEST     eax, eax 
    !JZ       @b 

    !BSF      eax, eax 
    !SHR      eax, 2 
    !LEA      ecx, [ecx+eax-8] 
    !SUB      ecx, edx
    !MOV eax, ecx
    !emms

    ProcedureReturn 
EndProcedure


s.s = "hallo du da!!klsajlkjasdlfj kljasflöjasioeruüqwupou4023u u0937u09 4903u03í"
Debug STRLen(@s)
Debug Len(s)

Posted: Fri Oct 07, 2005 1:39 am
by Rescator
A note about lstrlen that is also relevant to the various code snippets here. (Yeah, I'm kind of repeating myself, but :P)
Security Alert: Using this function incorrectly can compromise the security of your application. lstrlen assumes that lpString is a NULL-terminated string. If it is not, this could lead to a buffer overrun or a denial of service attack against your application. Consider using one of the following alternatives: StringCbLength or StringCchLength. You should review Security Considerations: Windows User Interface before continuing.
StringCbLength() is in strsafe.lib, which is a shame really.
If I had been an MS dev, I would have made StringCbLength() the "new" lstrlen() with an optional argument for maxlength,
and in the absence of a max argument (older apps using lstrlen(), for example)
I would have enforced an OS/system default max length.

Oh well! Then again I don't work at MS luckily. :P
So lstrlen() and any other string length function or routine out there that does not allow specifying a max length is a severe security issue.

I have no idea how PB's Len() works, but hopefully it has an internal max length limit matching the max string size of PB internally (am I correct, Fred? Freak?),
so that a buffer overrun can't be caused by Len()?

Posted: Sun May 06, 2018 3:12 am
by Poplar
Debug lstrlen_("Hello world!")