Re: Word Count

Posted: Wed Jul 30, 2014 10:05 pm
by spacebuddy
Danilo, I am studying your code, thank you :D

Re: Word Count

Posted: Thu Jul 31, 2014 5:11 am
by wilbert
spacebuddy wrote: Wilbert, I tested this on my machine and it is smoking fast :D
Glad to hear it is working for you :)

Here's an updated version that is both shorter and faster.
It treats every character code below 32 (control characters such as tab, LF and CR) as a space, which normally shouldn't be a problem.

Code:

Procedure.i CountWords(*Text.Character); Requires MMX
  
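  ; init some mmx registers:
  ;   mm3 = 32 in each dword (0 - (-1) = 1, shifted left by 5), the comparison threshold
  ;   mm2 = 0, the running word count
  ;   mm1 = 0, the "previous character was a non-space" mask
  !pcmpeqd mm2, mm2
  !pxor mm3, mm3
  !psubd mm3, mm2
  !pslld mm3, 5
  !pxor mm2, mm2
  !pxor mm1, mm1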
  
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_Text]
  CompilerElse
    !mov edx, [p.p_Text]
  CompilerEndIf
  !jmp countwords_entry
  
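  ; main loop: mm0 holds the current character code
  ;   pcmpgtd: mm0 = -1 if the code is above 32 (non-space), else 0
  ;   pandn:   mm1 = -1 only if the previous character was a space and this one is not
  ;   psubd:   subtracting -1 adds 1 to the word count in mm2
  ;   movq:    keep the current mask as the previous one for the next round
  !countwords_loop:
  !pcmpgtd mm0, mm3
  !pandn mm1, mm0
  !psubd mm2, mm1
  !movq mm1, mm0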
  
  ; entry point for first character
  !countwords_entry:
  CompilerIf #PB_Compiler_Unicode
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, word [rdx]
      !add rdx, 2
    CompilerElse
      !movzx eax, word [edx]
      !add edx, 2
    CompilerEndIf
  CompilerElse
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, byte [rdx]
      !add rdx, 1
    CompilerElse
      !movzx eax, byte [edx]
      !add edx, 1
    CompilerEndIf
  CompilerEndIf
  !movd mm0, eax
  
  ; loop if not end of string
  !and ax, ax
  !jnz countwords_loop
  
  ; set result and empty mmx state
  !movd eax, mm2
  !emms
  ProcedureReturn
  
EndProcedure
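
For anyone who wants to check the assembly against something easier to follow, here is a minimal plain PureBasic sketch of the same counting rule: a word starts wherever a character with a code above 32 follows a character with a code of 32 or below (or the start of the string). The name CountWordsPlain is made up for this illustration and is not part of the posted code. It should return the same counts as the MMX version, just more slowly.

```purebasic
; Hypothetical reference version (not from the original post)
Procedure.i CountWordsPlain(*Text.Character)
  Protected Count.i = 0
  Protected PrevIsWord.i = #False   ; the start of the string behaves like a space

  While *Text\c
    If *Text\c > 32
      If PrevIsWord = #False
        Count + 1                   ; a non-space right after a space starts a new word
      EndIf
      PrevIsWord = #True
    Else
      PrevIsWord = #False           ; codes 32 and below all count as separators
    EndIf
    *Text + SizeOf(Character)       ; 1 byte in ASCII mode, 2 bytes in Unicode mode
  Wend

  ProcedureReturn Count
EndProcedure

Define Text$ = "This is a test string"
Debug CountWordsPlain(@Text$)       ; 5
```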

Re: Word Count

Posted: Tue Nov 18, 2014 8:13 am
by wilbert
Here's an additional procedure that counts words, line feeds (LF) and the string length all in one pass.
It is a bit slower than the previous procedure but faster than doing the three counts separately.

Code:

Structure WordAndLFCount
  WordCount.l
  LFCount.l
  Len.l
EndStructure

Procedure CountWordsAndLF(*Text, *TextInfo.WordAndLFCount); Requires MMX
  
  ; init some mmx registers
  !pcmpeqd mm2, mm2
  !pxor mm3, mm3
  !psubd mm3, mm2
  !pslld mm3, 5
  !pxor mm2, mm2
  !pxor mm1, mm1
  
  ; ecx = LF count
  !xor ecx, ecx
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_Text]
  CompilerElse
    !mov edx, [p.p_Text]
  CompilerEndIf
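  ; labels are unique to this procedure so it can be compiled in the same source as CountWords
  !jmp countwordsandlf_entry
  
  ; main loop
  !countwordsandlf_loop:
  !pcmpgtd mm0, mm3
  !pandn mm1, mm0
  !psubd mm2, mm1
  !movq mm1, mm0
  
  ; entry point for first character
  !countwordsandlf_entry:
  CompilerIf #PB_Compiler_Unicode
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, word [rdx]
      !add rdx, 2
    CompilerElse
      !movzx eax, word [edx]
      !add edx, 2
    CompilerEndIf
  CompilerElse
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, byte [rdx]
      !add rdx, 1
    CompilerElse
      !movzx eax, byte [edx]
      !add edx, 1
    CompilerEndIf
  CompilerEndIf
  !movd mm0, eax
  
  ; only the codes 0, 2, 8 and 10 have no bits set in 0xfff5,
  ; so every other character goes straight back to the loop
  !test ax, 0xfff5
  !jnz countwordsandlf_loop
  ; cmp sets the carry flag when ax < 9; sbb ecx, -1 computes ecx + 1 - carry,
  ; so the LF count only goes up for code 10
  !cmp ax, 9
  !sbb ecx, -1
  ; loop if not end of string
  !and ax, ax
  !jnz countwordsandlf_loop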
  
  ; set result and empty mmx state
  ; (the pointer has advanced one character past the terminating null, hence the dec)
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !sub rdx, [p.p_Text]
    CompilerIf #PB_Compiler_Unicode
      !shr rdx, 1
    CompilerEndIf
    !dec edx
    !mov rax, [p.p_TextInfo]
    !movd [rax], mm2
    !mov [rax + 4], ecx
    !mov [rax + 8], edx
  CompilerElse
    !sub edx, [p.p_Text]
    CompilerIf #PB_Compiler_Unicode
      !shr edx, 1
    CompilerEndIf
    !dec edx
    !mov eax, [p.p_TextInfo]
    !movd [eax], mm2
    !mov [eax + 4], ecx
    !mov [eax + 8], edx
  CompilerEndIf
  !emms
  ProcedureReturn
  
EndProcedure

Usage:

Code:

S.s = "This is a test string" + #LF$
For i = 1 To 15
  S + S
Next

CountWordsAndLF(@S, @TextInfo.WordAndLFCount)

Debug TextInfo\WordCount
Debug TextInfo\LFCount
Debug TextInfo\Len
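
As a sanity check for the usage example: the base string holds 5 words, 1 LF and 22 characters, and 15 doublings give 2^15 = 32768 copies, so the three Debug lines should show 163840, 32768 and 720896. A plain PureBasic sketch of the same single-pass count, handy for comparing results, could look like the following. CountWordsAndLFPlain and the PlainCount structure are hypothetical names chosen for this illustration so that they do not clash with the posted code.

```purebasic
; Hypothetical reference version (not from the original post)
Structure PlainCount
  WordCount.l
  LFCount.l
  Len.l
EndStructure

Procedure CountWordsAndLFPlain(*Text.Character, *Info.PlainCount)
  Protected PrevIsWord.i = #False   ; the start of the string behaves like a space
  Protected *Start = *Text

  *Info\WordCount = 0
  *Info\LFCount = 0

  While *Text\c
    If *Text\c > 32
      If PrevIsWord = #False
        *Info\WordCount + 1         ; a non-space right after a space starts a new word
      EndIf
      PrevIsWord = #True
    Else
      If *Text\c = #LF
        *Info\LFCount + 1           ; only code 10 is counted as a line feed
      EndIf
      PrevIsWord = #False
    EndIf
    *Text + SizeOf(Character)
  Wend

  ; length in characters, not bytes
  *Info\Len = (*Text - *Start) / SizeOf(Character)
EndProcedure

Define T.s = "This is a test string" + #LF$
Define Info.PlainCount
CountWordsAndLFPlain(@T, @Info)
Debug Info\WordCount   ; 5
Debug Info\LFCount     ; 1
Debug Info\Len         ; 22
```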

Re: Word Count

Posted: Wed Nov 19, 2014 12:39 am
by electrochrisso
Nice one Wilbert, it is still supersonic fast. :)

Re: Word Count

Posted: Wed Nov 19, 2014 7:16 am
by davido
@wilbert,

Excellent! Thank you for sharing.

I tested with the following code:

Code:

S.s = "This is a test string" + #LF$
For i = 1 To 23
  S + S
Next
dt = ElapsedMilliseconds()
CountWordsAndLF(@S, @TextInfo.WordAndLFCount)
With TextInfo
  MessageRequester("Time: " + Str(ElapsedMilliseconds() - dt),"WordCount: " + Str(\WordCount) + Chr(10) + "LFCount: " + Str(\LFCount) + Chr(10) + "Len: " + Str(\Len))
EndWith