Word Count

Bare metal programming in PureBasic, for experienced users
spacebuddy
Enthusiast
Posts: 346
Joined: Thu Jul 02, 2009 5:42 am

Re: Word Count

Post by spacebuddy »

Danilo, I am studying your code, thank you :D
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Word Count

Post by wilbert »

spacebuddy wrote: Wilbert, I tested this on my machine and it is smoking fast :D
Glad to hear it is working for you :)

Here's also an updated version that is both shorter and faster.
It treats every character code from 1 to 32 (the space itself and all control characters) as a word separator, which normally shouldn't be a problem.

Code: Select all

Procedure.i CountWords(*Text.Character); Requires MMX
  
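  ; init some mmx registers:
  ;   mm3 = 32 in each dword (separator threshold)
  ;   mm2 = 0 (word counter), mm1 = 0 (mask of the previous character)
  !pcmpeqd mm2, mm2   ; mm2 = -1 in each dword
  !pxor mm3, mm3      ; mm3 = 0
  !psubd mm3, mm2     ; mm3 = 0 - (-1) = 1
  !pslld mm3, 5       ; mm3 = 1 << 5 = 32
  !pxor mm2, mm2      ; word counter = 0
  !pxor mm1, mm1      ; previous mask = 0 (start outside a word)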
  
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_Text]
  CompilerElse
    !mov edx, [p.p_Text]
  CompilerEndIf
  !jmp countwords_entry
  
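  ; main loop (mm0 holds the current character code)
  !countwords_loop:
  !pcmpgtd mm0, mm3   ; mm0 = -1 if code > 32 (word character), else 0
  !pandn mm1, mm0     ; mm1 = (not previous mask) and current mask = -1 at a word start
  !psubd mm2, mm1     ; subtracting -1 adds one to the word counter
  !movq mm1, mm0      ; remember the current mask for the next character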
  
  ; entry point for first character
  !countwords_entry:
  CompilerIf #PB_Compiler_Unicode
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, word [rdx]
      !add rdx, 2
    CompilerElse
      !movzx eax, word [edx]
      !add edx, 2
    CompilerEndIf
  CompilerElse
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, byte [rdx]
      !add rdx, 1
    CompilerElse
      !movzx eax, byte [edx]
      !add edx, 1
    CompilerEndIf
  CompilerEndIf
  !movd mm0, eax
  
  ; loop if not end of string
  !and ax, ax
  !jnz countwords_loop
  
  ; set result and empty mmx state
  !movd eax, mm2
  !emms
  ProcedureReturn
  
EndProcedure
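
To check results, or to run the same rule on a machine without MMX, it can be sketched in plain PureBasic. This is only a reference sketch and not wilbert's code: the name CountWordsPlain and the null pointer guard are assumptions added for illustration, and it will be slower than the MMX version.

```purebasic
; Plain PureBasic reference sketch (hypothetical name, no inline ASM).
; Same rule as the MMX procedure: codes 1..32 separate words,
; a word starts wherever a code > 32 follows a separator or the string start.
Procedure.i CountWordsPlain(*Text.Character)
  Protected count.i = 0
  Protected inWord.i = #False
  
  ; extra guard, not present in the ASM version
  If *Text = 0
    ProcedureReturn 0
  EndIf
  
  While *Text\c
    If *Text\c > 32
      If inWord = #False
        count + 1            ; first character of a new word
        inWord = #True
      EndIf
    Else
      inWord = #False        ; separator ends the current word
    EndIf
    *Text + SizeOf(Character) ; 1 byte in ASCII mode, 2 bytes in Unicode mode
  Wend
  
  ProcedureReturn count
EndProcedure

Debug CountWordsPlain(@"This is  a test string") ; 5
```

Running both procedures on the same string and comparing the results is a quick sanity check before relying on the faster one.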
Windows (x64)
Raspberry Pi OS (Arm64)
wilbert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Word Count

Post by wilbert »

Here is an additional procedure that counts words, line feeds (LF) and the string length in a single pass.
It is a bit slower than the previous procedure, but faster than doing the three counts separately.

Code: Select all

Structure WordAndLFCount
  WordCount.l
  LFCount.l
  Len.l
EndStructure

Procedure CountWordsAndLF(*Text, *TextInfo.WordAndLFCount); Requires MMX
  
  ; init some mmx registers
  !pcmpeqd mm2, mm2
  !pxor mm3, mm3
  !psubd mm3, mm2
  !pslld mm3, 5
  !pxor mm2, mm2
  !pxor mm1, mm1
  
  !xor ecx, ecx
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !mov rdx, [p.p_Text]
  CompilerElse
    !mov edx, [p.p_Text]
  CompilerEndIf
  !jmp countwordslf_entry
  
  ; main loop
  ; (FASM labels are global, so they must differ from those in CountWords)
  !countwordslf_loop:
  !pcmpgtd mm0, mm3
  !pandn mm1, mm0
  !psubd mm2, mm1
  !movq mm1, mm0
  
  ; entry point for first character
  !countwordslf_entry:
  CompilerIf #PB_Compiler_Unicode
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, word [rdx]
      !add rdx, 2
    CompilerElse
      !movzx eax, word [edx]
      !add edx, 2
    CompilerEndIf
  CompilerElse
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      !movzx eax, byte [rdx]
      !add rdx, 1
    CompilerElse
      !movzx eax, byte [edx]
      !add edx, 1
    CompilerEndIf
  CompilerEndIf
  !movd mm0, eax
  
  ; 0xfff5 = not 0x000a, so only codes 0, 2, 8 and 10 fall through here
  !test ax, 0xfff5
  !jnz countwordslf_loop
  ; count LF: cmp sets carry for codes below 9, so ecx grows only for code 10
  !cmp ax, 9
  !sbb ecx, -1
  ; loop if not end of string
  !and ax, ax
  !jnz countwordslf_loop
  
  ; set result and empty mmx state
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    !sub rdx, [p.p_Text]
    CompilerIf #PB_Compiler_Unicode
      !shr rdx, 1
    CompilerEndIf
    !dec edx
    !mov rax, [p.p_TextInfo]
    !movd [rax], mm2
    !mov [rax + 4], ecx
    !mov [rax + 8], edx
  CompilerElse
    !sub edx, [p.p_Text]
    CompilerIf #PB_Compiler_Unicode
      !shr edx, 1
    CompilerEndIf
    !dec edx
    !mov eax, [p.p_TextInfo]
    !movd [eax], mm2
    !mov [eax + 4], ecx
    !mov [eax + 8], edx
  CompilerEndIf
  !emms
  ProcedureReturn
  
EndProcedure
Usage

Code: Select all

S.s = "This is a test string" + #LF$
For i = 1 To 15
  S + S
Next

CountWordsAndLF(@S, @TextInfo.WordAndLFCount)

Debug TextInfo\WordCount
Debug TextInfo\LFCount
Debug TextInfo\Len
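
To cross-check the three values, the same single-pass idea can be sketched in plain PureBasic. Again this is just a reference sketch under assumptions: the names PlainTextInfo and CountAllPlain are made up for illustration, and the structure only mirrors WordAndLFCount.

```purebasic
; Plain PureBasic reference sketch (hypothetical names, no inline ASM).
Structure PlainTextInfo
  WordCount.l
  LFCount.l
  Length.l
EndStructure

Procedure CountAllPlain(*Text.Character, *Info.PlainTextInfo)
  Protected inWord.i = #False
  
  *Info\WordCount = 0
  *Info\LFCount = 0
  *Info\Length = 0
  
  While *Text\c
    If *Text\c > 32
      If inWord = #False
        *Info\WordCount + 1   ; first character of a new word
        inWord = #True
      EndIf
    Else
      inWord = #False
      If *Text\c = 10         ; line feed
        *Info\LFCount + 1
      EndIf
    EndIf
    *Info\Length + 1          ; characters, not bytes; terminator excluded
    *Text + SizeOf(Character)
  Wend
EndProcedure

Define Sample.s = "one two" + #LF$ + "three"
Define Info.PlainTextInfo
CountAllPlain(@Sample, @Info)

Debug Info\WordCount ; 3
Debug Info\LFCount   ; 1
Debug Info\Length    ; 13
```

The length is counted in characters, which matches what the ASM version stores after its shift and decrement in Unicode mode.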
electrochrisso
Addict
Posts: 980
Joined: Mon May 14, 2007 2:13 am
Location: Darling River

Re: Word Count

Post by electrochrisso »

Nice one Wilbert, it is still supersonic fast. :)
PureBasic! Purely one of the best 8)
davido
Addict
Posts: 1890
Joined: Fri Nov 09, 2012 11:04 pm
Location: Uttoxeter, UK

Re: Word Count

Post by davido »

@wilbert,

Excellent! Thank you for sharing.

I tested with the following code:

Code: Select all

S.s = "This is a test string" + #LF$
For i = 1 To 23
  S + S
Next
dt = ElapsedMilliseconds()
CountWordsAndLF(@S, @TextInfo.WordAndLFCount)
With TextInfo
  MessageRequester("Time: " + Str(ElapsedMilliseconds() - dt), "WordCount: " + Str(\WordCount) + Chr(10) + "LFCount: " + Str(\LFCount) + Chr(10) + "Len: " + Str(\Len))
EndWith
DE AA EB