Numbers of lines in a text file

Posted: Sat May 26, 2012 4:48 pm
by Tomi
How can I get the number of lines in a text file by a reliable method? (The simplest way, please.) :?:

Code: Select all

blah blah
blah
blah blah blah
It should return 3 lines for the file above. Of course, I need this method for files of about 12000 to 30000 lines, but I can't create a big sample :D

Re: Numbers of lines in a text file

Posted: Sat May 26, 2012 4:52 pm
by ts-soft
The simple way:

Code: Select all

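If ReadFile(0, "yourfile")
  ReadStringFormat(0)      ; skip a BOM, if the file starts with one
  While Not Eof(0)
    If ReadString(0)       ; true only for a non-empty line, so blank lines are not counted
      count + 1
    EndIf
  Wend
  CloseFile(0)
  Debug count
EndIf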
The simple way is the slow way, but it is reliable.
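
Note that the test If ReadString(0) in the loop above is false for an empty string, so blank lines are skipped. A minimal sketch that counts every line, blank or not, could look like the following. The procedure name CountAllLines is made up for illustration, and the sketch assumes the file uses line endings that ReadString() recognises.

Code: Select all

Procedure.i CountAllLines(file$)
  Protected count.i = 0
  Protected f.i = ReadFile(#PB_Any, file$)
  If f
    ReadStringFormat(f)    ; skip a BOM, if present
    While Not Eof(f)
      ReadString(f)        ; consume one line; the content is not needed
      count + 1            ; count it even if it was empty
    Wend
    CloseFile(f)
  EndIf
  ProcedureReturn count    ; 0 if the file could not be opened
EndProcedure

Debug CountAllLines("yourfile")

Using #PB_Any instead of the fixed file number 0 avoids clashing with a file the caller already has open.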

Re: Numbers of lines in a text file

Posted: Sat May 26, 2012 4:55 pm
by Tomi
Thanks, sir, very good :D

Re: Numbers of lines in a text file

Posted: Sat May 26, 2012 5:16 pm
by skywalk
ts-soft beat me to it, but here is another way...

Code: Select all

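If ReadFile(0, "SomeTextFile.txt")
  Define.s s$ = Space(Lof(0))   ; string buffer of Lof(0) characters (assumes an ASCII build: 1 byte per character)
  ReadData(0, @s$, Lof(0))      ; read the whole file into the string buffer
  CloseFile(0)
  Debug CountString(s$, #CRLF$) ; If you know the type of EOL character. A last line without an EOL is not counted.
EndIf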

Re: Numbers of lines in a text file

Posted: Sat May 26, 2012 5:36 pm
by Tomi
Thanks skywalk, your method is nice too :D

Re: Numbers of lines in a text file

Posted: Sat May 26, 2012 6:46 pm
by netmaestro
I originally considered the second method as an alternative but didn't even bother, because it seemed like it would be too slow, what with reading the whole file in first and then counting strings. But then Skywalk posted it and I thought I might as well see. This code snippet counts the lines in three different ways:

Code: Select all

;
;
; TURN THE DEBUGGER OFF <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
;

ProcedureDLL.l TicksHQ() 
  Static maxfreq.q 
  Protected t.q 
  If maxfreq=0 
    QueryPerformanceFrequency_(@maxfreq) 
    maxfreq=maxfreq/1000
  EndIf 
  QueryPerformanceCounter_(@t) 
  ProcedureReturn t/maxfreq 
EndProcedure

str$ = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"

If CreateFile(0, "c:\test.txt")
  For i=1 To 31234
    thislen = 20+Random(49)
    WriteStringN(0, Left(str$, thislen))
  Next
  FlushFileBuffers(0)
  filesize = Lof(0)
  CloseFile(0)             ; close only if the file was actually created
EndIf

stime1 = TicksHQ()
count1=0
If ReadFile(0, "c:\test.txt")
  ReadStringFormat(0)
  While Not Eof(0)
    If ReadString(0)
      count1 + 1
    EndIf
  Wend
  CloseFile(0)
EndIf
etime1 = TicksHQ()

stime2 = TicksHQ()
ReadFile(0,"c:\test.txt")
Define.s s$ = Space(Lof(0))
ReadData(0, @s$, Lof(0))
CloseFile(0)
count2 = CountString(s$,#CRLF$) ; If you know the type of EOL character.
etime2 = TicksHQ()

EnableASM
  stime3 = TicksHQ()
  ReadFile(0,"c:\test.txt")
  *loc = AllocateMemory(Lof(0)+1) ; +1 spare byte: the word compare at the last position reads 2 bytes
  ReadData(0, *loc, Lof(0))
  CloseFile(0)
  count3=0
  !xor ecx, ecx            ; linecount = 0
  !mov edx, [p_loc]        ; readpointer = *loc
  !mov eax, [v_filesize]   ; remainingbytes = filesize
  !loopstart:              ; While remainingbytes > 0
    !cmp word [edx], 0xA0D ;   If word at readpointer <> #CRLF$
    !jnz skip              ;     GOTO skip
    !inc ecx               ;   Else, increment the linecount
    !skip:                 ;
    !inc edx               ;   readpointer + 1
    !dec eax               ;   remainingbytes - 1
  !jnz loopstart           ; Wend
  !mov [v_count3], ecx
  FreeMemory(*loc)
DisableASM
etime3 = TicksHQ()

MessageRequester("", "First way: "+Str(count1)+" lines reported, time="+ Str(etime1-stime1)+" milliseconds")
MessageRequester("", "Second way: "+Str(count2)+" lines reported, time="+ Str(etime2-stime2)+" milliseconds")
MessageRequester("", "Third way: "+Str(count3)+" lines reported, time="+ Str(etime3-stime3)+" milliseconds")
My results:

ts-soft code: 78 ms.
Skywalk code: 18 ms.
Skywalk code optimized: 11 ms.

I never would have picked that!
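
For reference, the byte scan of the third way can also be written without inline ASM, which keeps it independent of the x86/x64 register names. This is a sketch only, not the benchmarked code: the name CountLinesPlain is made up here, it counts LF bytes (so it covers CRLF and LF files but not CR-only ones), and it has not been timed against the figures above.

Code: Select all

Procedure.i CountLinesPlain(file$)
  Protected count.i = 0, size.i, *mem, *p.Ascii, *last
  Protected f.i = ReadFile(#PB_Any, file$)
  If f
    size = Lof(f)
    If size > 0
      *mem = AllocateMemory(size)
      If *mem
        size = ReadData(f, *mem, size) ; number of bytes actually read
        *p = *mem
        *last = *mem + size            ; one past the final byte
        While *p < *last
          If *p\a = 10                 ; LF ends a line in both CRLF and LF files
            count + 1
          EndIf
          *p + 1                       ; advance the read pointer by one byte
        Wend
        FreeMemory(*mem)
      EndIf
    EndIf
    CloseFile(f)
  EndIf
  ProcedureReturn count
EndProcedure

MessageRequester("", "Plain loop: "+Str(CountLinesPlain("c:\test.txt"))+" lines reported")

Like the CountString() version, this counts line terminators, so a final line without a trailing EOL is not included.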

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 2:24 am
by IdeasVacuum
I found this on the forum some time ago; I do not know whose code it is:

Code: Select all

DataSection
  !CountLinesPcf_eoli:
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
EndDataSection

ProcedureDLL.i TicksHQ()
  Static maxfreq.q
  Protected t.q
  If maxfreq=0
    QueryPerformanceFrequency_(@maxfreq)
    maxfreq=maxfreq/1000
  EndIf
  QueryPerformanceCounter_(@t)
  ProcedureReturn t/maxfreq
EndProcedure

Procedure.i CountLinesPcf(filename.s)
Define fid.i, lcCR.i, lcLF.i, *mem, size.i
fid = ReadFile(#PB_Any, filename)
lcCR = 0
lcLF = 0
If IsFile(fid)
   size = Lof(fid)
   *mem = AllocateMemory(size)
   size = ReadData(fid, *mem, size)
EnableASM
   CompilerSelect #PB_Compiler_Processor
   CompilerCase #PB_Processor_x86
    ;eax, ecx and edx are ok, save rest
    Define.l old_esi
    !MOV dword [p.v_old_esi], esi
    !MOV esi, dword [p.p_mem]
    !MOV ecx, dword [p.v_size]
    !XOR eax, dword eax
    !CountLinesPcf_loop:
     !MOV al, byte [esi]
     !CMP byte [eax+CountLinesPcf_eoli], byte 1
     !JNE CountLinesPcf_noLFCR
      !CMP eax, dword 13 ;CR
      !JNE CountLinesPcf_noCR
       !INC dword [p.v_lcCR]
       !JMP CountLinesPcf_noLF
      !CountLinesPcf_noCR:
       !INC dword [p.v_lcLF]
      !CountLinesPcf_noLF:
     !CountLinesPcf_noLFCR:
     !INC dword esi
     !DEC dword ecx
    !JNZ CountLinesPcf_loop
    !MOV esi, dword [p.v_old_esi]
   CompilerCase #PB_Processor_x64
    ;eax, ecx and edx are ok, save rest
    Define.q old_esi
    !MOV qword [p.v_old_esi], rsi
    !MOV rsi, qword [p.p_mem]
    !MOV rcx, qword [p.v_size]
    !XOR rax, qword rax
    !MOV rdx, qword CountLinesPcf_eoli
    !CountLinesPcf_loop:
     !MOV al, byte [rsi]
     !CMP byte [rax+rdx], byte 1
     !JNE CountLinesPcf_noLFCR
      !CMP eax, dword 13 ;CR
      !JNE CountLinesPcf_noCR
       !INC qword [p.v_lcCR]
       !JMP CountLinesPcf_noLF
      !CountLinesPcf_noCR:
       !INC qword [p.v_lcLF]
      !CountLinesPcf_noLF:
     !CountLinesPcf_noLFCR:
     !INC qword rsi
     !DEC qword rcx
    !JNZ CountLinesPcf_loop
    !MOV rsi, qword [p.v_old_esi]
   CompilerDefault
    CompilerError "unsupported CPU architecture"
   CompilerEndSelect
DisableASM
   FreeMemory(*mem)
  CloseFile(fid)
EndIf
If lcLF > 0
  ProcedureReturn lcLF
EndIf
ProcedureReturn lcCR
EndProcedure

If CreateFile(0, "c:\test.txt")
sline.s = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"
  For i=1 To 31234
    thislen = 20+Random(49)
    WriteStringN(0, Left(sline, thislen))
  Next
  FlushFileBuffers(0)
  filesize = Lof(0)
  CloseFile(0)
EndIf

stime = TicksHQ()
count = CountLinesPcf("c:\test.txt")
etime = TicksHQ()
MessageRequester("", Str(count)+ " lines reported, time= " + Str(etime-stime) + " milliseconds")

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 2:32 am
by netmaestro
There is lots and lots of code there, and it's reporting 15 ms here vs 11 ms for the third version, Stargate optimized.

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 3:43 am
by IdeasVacuum
...it has some nice tidbits though. One thing of interest is the compatibility of the ASM stuff, 32bit and 64bit?

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 3:48 am
by netmaestro
64bit might be quite fast, can't tell as I only have x86 here.

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 12:08 pm
by Tomi
Thanks to all, very nice. On my machine (64-bit), netmaestro takes first place! :D

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 12:21 pm
by IdeasVacuum
...and netmaestro takes 1st place on 32bit too. Definitely the Number 1! :mrgreen:

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 3:24 pm
by IdeasVacuum
...Hit a snag when running as a Procedure:

Code: Select all

;=========== TEST file start ====================================================

If CreateFile(0, "c:\test.txt")

  sLine.s = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"

  For i = 1 To 31234
    ilen = 20 + Random(49)
    WriteStringN(0, Left(sLine, ilen))
  Next

  FlushFileBuffers(0)
  CloseFile(0)
EndIf

;=========== TEST file End =====================================================

Procedure.i CountFileLines(sFile.s)
;----------------------------------
EnableASM
  If ReadFile(0, sFile)
    *loc = AllocateMemory(Lof(0))
    ReadData(0, *loc, Lof(0))
    CloseFile(0)
  EndIf

  cnt = 0
  !xor ecx, ecx            ; linecount = 0
  !mov edx, [p_loc]        ; readpointer = *loc
  !mov eax, [v_filesize]   ; remainingbytes = filesize
  !loopstart:              ; While remainingbytes > 0
    !cmp word [edx], 0xA0D ; If word at readpointer <> #CRLF$
    !jnz skip              ; GOTO skip
    !inc ecx               ; Else, increment the linecount
    !skip:                 ;
    !inc edx               ; readpointer + 1
    !dec eax               ; remainingbytes - 1
  !jnz loopstart           ; Wend
  !mov [v_cnt], ecx

DisableASM
  FreeMemory(*loc)
  ProcedureReturn

EndProcedure

LineCnt.i = CountFileLines("c:\test.txt")
Debug LineCnt
Halts with error:
PureBasic.asm [203]:
MP0
PureBasic.asm [163] MP0 [41]:
mov edx, [p_loc] ; readpointer = *loc
error: undefined symbol 'p_loc'.

Also, a further question Tomi - was your test on 64bit Windows compiled as a 64bit app?

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 3:32 pm
by ts-soft
You destroy the stack with your asm-loop in a procedure :wink:

Re: Numbers of lines in a text file

Posted: Sun May 27, 2012 3:45 pm
by netmaestro
Rules for accessing variables and pointers from inside a procedure are a bit different: local variables are referenced as p.v_name and local pointers as p.p_name. How to write ASM for a procedure can be found in the PB help file under "Inlined ASM". Anyway, this is the procedure version:

Code: Select all

str$ = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"

If CreateFile(0, "c:\test.txt")
  For i=1 To 31234
    thislen = 20+Random(49)
    WriteStringN(0, Left(str$, thislen))
  Next
  FlushFileBuffers(0)
  filesize = Lof(0)
  CloseFile(0)             ; close only if the file was actually created
EndIf

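Procedure CountLines(file$)
  ; x86 only: the loop below keeps the pointer and counters in 32-bit registers
  Protected filesize, *loc, count3
  EnableASM
    If ReadFile(0, file$)
      filesize = Lof(0)
      If filesize > 0
        *loc = AllocateMemory(filesize + 1) ; +1 spare byte for the word compare at the last position
        ReadData(0, *loc, filesize)
      EndIf
      CloseFile(0)
    EndIf
    If *loc
      !xor ecx, ecx            ; linecount = 0
      !mov edx, [p.p_loc]      ; readpointer = *loc
      !mov eax, [p.v_filesize] ; remainingbytes = filesize
      !loopstart:
      !cmp word [edx], 0xA0D   ; CRLF at readpointer?
      !jnz skip
      !inc ecx                 ; yes: increment the linecount
      !skip:
      !inc edx                 ; readpointer + 1
      !dec eax                 ; remainingbytes - 1
      !jnz loopstart
      !mov [p.v_count3], ecx
      FreeMemory(*loc)
    EndIf
  DisableASM
  ProcedureReturn count3
EndProcedure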


MessageRequester("", "CountLines reports "+Str(CountLines("c:\test.txt"))+" lines.")