Numbers of lines in a text file

Tomi
Enthusiast
Posts: 270
Joined: Wed Sep 03, 2008 9:29 am

Numbers of lines in a text file

Post by Tomi »

How can I get the number of lines in a text file by a reliable method? (The simplest way, please.) :?:

Code:

blah blah
blah
blah blah blah
It should return 3 lines for the file above. Of course, I need this method for files of about 12,000 to 30,000 lines, but I can't create a big sample here :D
ts-soft
Always Here
Posts: 5756
Joined: Thu Jun 24, 2004 2:44 pm
Location: Berlin - Germany

Re: Numbers of lines in a text file

Post by ts-soft »

the simple way:

Code:

If ReadFile(0, "yourfile")
  ReadStringFormat(0)    ; skip the byte order mark, if the file has one
  While Not Eof(0)
    If ReadString(0)     ; an empty line returns "" and is not counted
      count + 1
    EndIf
  Wend
  CloseFile(0)
  Debug count
EndIf
The simple way is the slow way, but it is reliable.
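One caveat with the loop above: `If ReadString(0)` is false for an empty line, so blank lines are left out of the count. If blank lines should count too, a minimal sketch could look like the following. It assumes the same placeholder file name "yourfile" and simply discards the string that was read.

```purebasic
; Sketch: count every line, including empty ones.
; "yourfile" is the placeholder name used in the post above.
count = 0
If ReadFile(0, "yourfile")
  ReadStringFormat(0)      ; skip the byte order mark, if the file has one
  While Not Eof(0)
    ReadString(0)          ; read the line and ignore its content
    count + 1              ; count it even when it is empty
  Wend
  CloseFile(0)
  Debug count
EndIf
```

For a file without blank lines, both variants should report the same number.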
PureBasic 5.73 | SpiderBasic 2.30 | Windows 10 Pro (x64) | Linux Mint 20.1 (x64)
Old bugs good, new bugs bad! Updates are evil: might fix old bugs and introduce no new ones.
Tomi
Enthusiast
Posts: 270
Joined: Wed Sep 03, 2008 9:29 am

Re: Numbers of lines in a text file

Post by Tomi »

thx sir, very good :D
skywalk
Addict
Posts: 3972
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Numbers of lines in a text file

Post by skywalk »

ts-soft beat me, but another way...

Code:

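If ReadFile(0, "SomeTextFile.txt")
  Define.s s$ = Space(Lof(0))   ; pre-size the string buffer to the file length
  ReadData(0, @s$, Lof(0))      ; raw bytes: assumes an ASCII compile and an ASCII file
  CloseFile(0)
  Debug CountString(s$, #CRLF$) ; If you know the type of EOL character.
EndIf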
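Reading raw bytes straight into a string only lines up when one character is one byte. In a Unicode compile (the only mode in recent PureBasic versions, as far as I know) the bytes would not match `#CRLF$`. A hedged sketch of the same idea that converts explicitly is shown below. It assumes the file is plain ASCII, and it counts `#LF$` so that both LF and CRLF files are covered; the variable names `*buf` and `text$` are made up for this example.

```purebasic
; Sketch: buffer the whole file, convert it from ASCII, then count line feeds.
; Assumes an ASCII text file; *buf and text$ are illustrative names.
If ReadFile(0, "SomeTextFile.txt")
  size = Lof(0)
  If size > 0
    *buf = AllocateMemory(size)
    If *buf
      size = ReadData(0, *buf, size)        ; number of bytes actually read
      text$ = PeekS(*buf, size, #PB_Ascii)  ; ASCII bytes -> native string
      FreeMemory(*buf)
      Debug CountString(text$, #LF$)        ; matches LF and CRLF line endings
    EndIf
  EndIf
  CloseFile(0)
EndIf
```

The extra conversion costs time, so this is likely slower than the raw version on an ASCII compile.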
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
Tomi
Enthusiast
Posts: 270
Joined: Wed Sep 03, 2008 9:29 am

Re: Numbers of lines in a text file

Post by Tomi »

thx skywalk, your method is nice also :D
netmaestro
PureBasic Bullfrog
Posts: 8425
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: Numbers of lines in a text file

Post by netmaestro »

I originally considered the second method as an alternative but didn't bother, because it seemed it would be too slow, what with reading the whole file in first and then counting strings. But then skywalk posted it and I thought I might as well see. This code snippet counts the lines in three different ways:

Code:

;
;
; TURN THE DEBUGGER OFF <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
;

ProcedureDLL.l TicksHQ() 
  Static maxfreq.q 
  Protected t.q 
  If maxfreq=0 
    QueryPerformanceFrequency_(@maxfreq) 
    maxfreq=maxfreq/1000
  EndIf 
  QueryPerformanceCounter_(@t) 
  ProcedureReturn t/maxfreq 
EndProcedure

str$ = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"

If CreateFile(0, "c:\test.txt")
  For i=1 To 31234
    thislen = 20+Random(49)
    WriteStringN(0, Left(str$, thislen))
  Next
  FlushFileBuffers(0)
  filesize = Lof(0)
  CloseFile(0)             ; close only if the file was actually created
EndIf

stime1 = TicksHQ()
count1=0
If ReadFile(0, "c:\test.txt")
  ReadStringFormat(0)
  While Not Eof(0)
    If ReadString(0)
      count1 + 1
    EndIf
  Wend
  CloseFile(0)
EndIf
etime1 = TicksHQ()

stime2 = TicksHQ()
ReadFile(0,"c:\test.txt")
Define.s s$ = Space(Lof(0))
ReadData(0, @s$, Lof(0))
CloseFile(0)
count2 = CountString(s$,#CRLF$) ; If you know the type of EOL character.
etime2 = TicksHQ()

EnableASM
  stime3 = TicksHQ()
  ReadFile(0,"c:\test.txt")
  *loc = AllocateMemory(Lof(0))
  ReadData(0, *loc, Lof(0))
  CloseFile(0)
  count3=0
  !xor ecx, ecx            ; linecount = 0
  !mov edx, [p_loc]        ; readpointer = *loc
  !mov eax, [v_filesize]   ; remainingbytes = filesize
  !loopstart:              ; While remainingbytes > 0
    !cmp word [edx], 0xA0D ;   If word at readpointer <> #CRLF$
    !jnz skip              ;     GOTO skip
    !inc ecx               ;   Else, increment the linecount
    !skip:                 ;
    !inc edx               ;   readpointer + 1
    !dec eax               ;   remainingbytes - 1
  !jnz loopstart           ; Wend
  !mov [v_count3], ecx
  FreeMemory(*loc)
DisableASM
etime3 = TicksHQ()

MessageRequester("", "First way: "+Str(count1)+" lines reported, time="+ Str(etime1-stime1)+" milliseconds")
MessageRequester("", "Second way: "+Str(count2)+" lines reported, time="+ Str(etime2-stime2)+" milliseconds")
MessageRequester("", "Third way: "+Str(count3)+" lines reported, time="+ Str(etime3-stime3)+" milliseconds")
My results:

ts-soft code: 78 ms.
Skywalk code: 18 ms.
Skywalk code optimized: 11 ms.

I never would have picked that!
BERESHEIT
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: Numbers of lines in a text file

Post by IdeasVacuum »

I found this on the forum some time ago; I do not know whose code it is:

Code:

DataSection
  !CountLinesPcf_eoli:
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  !db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
EndDataSection

ProcedureDLL.i TicksHQ()
  Static maxfreq.q
  Protected t.q
  If maxfreq=0
    QueryPerformanceFrequency_(@maxfreq)
    maxfreq=maxfreq/1000
  EndIf
  QueryPerformanceCounter_(@t)
  ProcedureReturn t/maxfreq
EndProcedure

Procedure.i CountLinesPcf(filename.s)
Define fid.i, lcCR.i, lcLF.i, *mem, size.i
fid = ReadFile(#PB_Any, filename)
lcCR = 0
lcLF = 0
If IsFile(fid)
   size = Lof(fid)
   *mem = AllocateMemory(size)
   size = ReadData(fid, *mem, size)
EnableASM
   CompilerSelect #PB_Compiler_Processor
   CompilerCase #PB_Processor_x86
    ;eax, ecx and edx are ok, save rest
    Define.l old_esi
    !MOV dword [p.v_old_esi], esi
    !MOV esi, dword [p.p_mem]
    !MOV ecx, dword [p.v_size]
    !XOR eax, dword eax
    !CountLinesPcf_loop:
     !MOV al, byte [esi]
     !CMP byte [eax+CountLinesPcf_eoli], byte 1
     !JNE CountLinesPcf_noLFCR
      !CMP eax, dword 13 ;CR
      !JNE CountLinesPcf_noCR
       !INC dword [p.v_lcCR]
       !JMP CountLinesPcf_noLF
      !CountLinesPcf_noCR:
       !INC dword [p.v_lcLF]
      !CountLinesPcf_noLF:
     !CountLinesPcf_noLFCR:
     !INC dword esi
     !DEC dword ecx
    !JNZ CountLinesPcf_loop
    !MOV esi, dword [p.v_old_esi]
   CompilerCase #PB_Processor_x64
    ;eax, ecx and edx are ok, save rest
    Define.q old_esi
    !MOV qword [p.v_old_esi], rsi
    !MOV rsi, qword [p.p_mem]
    !MOV rcx, qword [p.v_size]
    !XOR rax, qword rax
    !MOV rdx, qword CountLinesPcf_eoli
    !CountLinesPcf_loop:
     !MOV al, byte [rsi]
     !CMP byte [rax+rdx], byte 1
     !JNE CountLinesPcf_noLFCR
      !CMP eax, dword 13 ;CR
      !JNE CountLinesPcf_noCR
       !INC qword [p.v_lcCR]
       !JMP CountLinesPcf_noLF
      !CountLinesPcf_noCR:
       !INC qword [p.v_lcLF]
      !CountLinesPcf_noLF:
     !CountLinesPcf_noLFCR:
     !INC qword rsi
     !DEC qword rcx
    !JNZ CountLinesPcf_loop
    !MOV rsi, qword [p.v_old_esi]
   CompilerDefault
    CompilerError "unsupported CPU architecture"
   CompilerEndSelect
DisableASM
   FreeMemory(*mem)
  CloseFile(fid)
EndIf
If lcLF > 0
  ProcedureReturn lcLF
EndIf
ProcedureReturn lcCR
EndProcedure

If CreateFile(0, "c:\test.txt")
sline.s = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"
  For i=1 To 31234
    thislen = 20+Random(49)
    WriteStringN(0, Left(sline, thislen))
  Next
  FlushFileBuffers(0)
  filesize = Lof(0)
  CloseFile(0)
EndIf

stime = TicksHQ()
count = CountLinesPcf("c:\test.txt")
etime = TicksHQ()
MessageRequester("", Str(count)+ " lines reported, time= " + Str(etime-stime) + " milliseconds")
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
netmaestro
PureBasic Bullfrog
Posts: 8425
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: Numbers of lines in a text file

Post by netmaestro »

That's lots and lots of code, and it's reporting 15 ms here vs. 11 ms for the third version, Stargate optimized.
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: Numbers of lines in a text file

Post by IdeasVacuum »

...it has some nice tidbits though. One thing of interest is the compatibility of the ASM stuff, 32bit and 64bit?
netmaestro
PureBasic Bullfrog
Posts: 8425
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: Numbers of lines in a text file

Post by netmaestro »

64bit might be quite fast, can't tell as I only have x86 here.
Tomi
Enthusiast
Posts: 270
Joined: Wed Sep 03, 2008 9:29 am

Re: Numbers of lines in a text file

Post by Tomi »

Thanks to all, very nice. On my machine (64-bit), netmaestro takes first place! :D
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: Numbers of lines in a text file

Post by IdeasVacuum »

...and netmaestro takes 1st place on 32bit too. Definitely the Number 1! :mrgreen:
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: Numbers of lines in a text file

Post by IdeasVacuum »

...Hit a snag when running as a Procedure:

Code:

;=========== TEST file start ====================================================

If CreateFile(0, "c:\test.txt")

  sLine.s = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"

  For i = 1 To 31234

                                   ilen = 20 + Random(49)
       WriteStringN(0, Left(sLine, ilen))

  Next
  FlushFileBuffers(0)
         CloseFile(0)
EndIf

;=========== TEST file End =====================================================

Procedure.i CountFileLines(sFile.s)
;----------------------------------
EnableASM
              If ReadFile(0,sFile)
                                  *loc = AllocateMemory(Lof(0))
                      ReadData(0, *loc, Lof(0))
                     CloseFile(0)
              EndIf

              cnt = 0
              !xor ecx, ecx            ; linecount = 0
              !mov edx, [p_loc]        ; readpointer = *loc
              !mov eax, [v_filesize]   ; remainingbytes = filesize
              !loopstart:              ; While remainingbytes > 0
                !cmp word [edx], 0xA0D ; If word at readpointer <> #CRLF$
                !jnz skip              ; GOTO skip
                !inc ecx               ; Else, increment the linecount
                !skip:                 ;
                !inc edx               ; readpointer + 1
                !dec eax               ; remainingbytes - 1
              !jnz loopstart           ; Wend
              !mov [v_cnt], ecx

DisableASM
              FreeMemory(*loc)
              ProcedureReturn

EndProcedure

LineCnt.i = CountFileLines("c:\test.txt")
Debug LineCnt
Halts with error:
PureBasic.asm [203]:
MP0
PureBasic.asm [163] MP0 [41]:
mov edx, [p_loc] ; readpointer = *loc
error: undefined symbol 'p_loc'.

Also, a further question Tomi - was your test on 64bit Windows compiled as a 64bit app?
ts-soft
Always Here
Posts: 5756
Joined: Thu Jun 24, 2004 2:44 pm
Location: Berlin - Germany

Re: Numbers of lines in a text file

Post by ts-soft »

You destroy the stack with your asm-loop in a procedure :wink:
netmaestro
PureBasic Bullfrog
Posts: 8425
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: Numbers of lines in a text file

Post by netmaestro »

The rules for accessing variables and pointers from inline ASM inside a procedure are a bit different: a local variable is addressed as p.v_name and a local pointer as p.p_name, instead of v_name and p_name. The details can be found in the PB help file under "Inlined ASM". Anyway, this is the procedure version:

Code:

str$ = "abcdefghijklmnopqrstuvwxyz and that is all the letters I can think of"

If CreateFile(0, "c:\test.txt")
  For i=1 To 31234
    thislen = 20+Random(49)
    WriteStringN(0, Left(str$, thislen))
  Next
  FlushFileBuffers(0)
  filesize = Lof(0)
  CloseFile(0)             ; close only if the file was actually created
EndIf

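Procedure CountLines(file$)
  Protected filesize, *loc, count3
  EnableASM
    ReadFile(0, file$)
    filesize = Lof(0)
    *loc = AllocateMemory(filesize)
    ReadData(0, *loc, filesize)
    CloseFile(0)
    count3=0
    !xor ecx, ecx            ; linecount = 0
    !mov edx, [p.p_loc]      ; readpointer = *loc (local pointer: p.p_ prefix)
    !mov eax, [p.v_filesize] ; remainingbytes = filesize (local variable: p.v_ prefix)
    !loopstart:              ; While remainingbytes > 0
    !cmp word [edx], 0xA0D   ;   If word at readpointer <> #CRLF$
    !jnz skip                ;     GOTO skip
    !inc ecx                 ;   Else, increment the linecount
    !skip:                   ;
    !inc edx                 ;   readpointer + 1
    !dec eax                 ;   remainingbytes - 1
    !jnz loopstart           ; Wend
    !mov [p.v_count3], ecx
    FreeMemory(*loc)
  DisableASM
  ProcedureReturn count3
EndProcedure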


MessageRequester("", "CountLines reports "+Str(CountLines("c:\test.txt"))+" lines.")
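The ASM loop above uses 32-bit registers, so it is tied to an x86 compile. For comparison, here is a rough sketch of the same byte scan in plain PureBasic, which should compile unchanged for 32-bit and 64-bit. The procedure name `CountLinesPortable` and its variable names are invented for this example. It counts line feed bytes (10) rather than CR+LF pairs, which gives the same result for the test file written with `WriteStringN` on Windows, and it will most likely be slower than the ASM version.

```purebasic
; Sketch: portable byte scan without inline ASM.
; CountLinesPortable and its locals are illustrative names, not from the thread.
Procedure.i CountLinesPortable(file$)
  Protected file.i, size.i, *buf, i.i, count.i
  file = ReadFile(#PB_Any, file$)
  If file
    size = Lof(file)
    If size > 0
      *buf = AllocateMemory(size)
      If *buf
        size = ReadData(file, *buf, size)   ; number of bytes actually read
        For i = 0 To size - 1
          If PeekA(*buf + i) = 10           ; LF ends a line in LF and CRLF files
            count + 1
          EndIf
        Next
        FreeMemory(*buf)
      EndIf
    EndIf
    CloseFile(file)
  EndIf
  ProcedureReturn count
EndProcedure

MessageRequester("", "CountLinesPortable reports " + Str(CountLinesPortable("c:\test.txt")) + " lines.")
```

Using `#PB_Any` avoids clashing with a file already open under number 0, and the size check keeps an empty file from entering the loop.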
