Another GetDirectorySize() Procedure (PB4.01)

Tranquil
Addict
Posts: 942
Joined: Mon Apr 28, 2003 2:22 pm
Location: Europe


Post by Tranquil »

Code updated for 5.20+

Works even with huge Files and Directories. API free.

Code: Select all

Procedure.q GetDirectorySize(path$,pattern$ = "*.*")
  If Right(path$,1)<>"\":path$+"\":EndIf
  DirID = ExamineDirectory(#PB_Any,path$,pattern$)
  If DirID
    While NextDirectoryEntry(DirID)
      If DirectoryEntryType(DirID) = #PB_DirectoryEntry_Directory
        If DirectoryEntryName(DirID)<>"." And DirectoryEntryName(DirID)<>".."
          TotalSize.q + GetDirectorySize(path$+DirectoryEntryName(DirID)+"\",pattern$)
        EndIf
      EndIf
      If DirectoryEntryType(DirID) = #PB_DirectoryEntry_File
        TotalSize + DirectoryEntrySize(DirID)
      EndIf
    Wend
    FinishDirectory(DirID)
  EndIf
  ProcedureReturn TotalSize.q
EndProcedure
Tranquil
AND51
Addict
Posts: 1040
Joined: Sun Oct 15, 2006 8:56 pm
Location: Germany
Contact:

Post by AND51 »

Shorter (16 lines) and much faster:

Code: Select all

Procedure.q GetDirectorySize(path$, pattern$="")
	Protected dir=ExamineDirectory(#PB_Any, path$, pattern$), size.q
	If dir
		While NextDirectoryEntry(dir)
			If DirectoryEntryType(dir) = #PB_DirectoryEntry_File
				size+DirectoryEntrySize(dir)
				Continue
			ElseIf Not DirectoryEntryName(dir) = "." And Not DirectoryEntryName(dir) = ".."
				size+GetDirectorySize(path$+DirectoryEntryName(dir)+"\", pattern$)
				Continue
			EndIf
		Wend
		FinishDirectory(dir)
	EndIf
	ProcedureReturn size
EndProcedure
My procedure is faster because I use Continue: it makes the program skip the rest of the While loop and jump straight back to While.

Tranquil, you have 4 Ifs, I've got only 2 ... :D
Moreover, you don't need to append a backslash, because in PB you always end a path with a backslash.
Furthermore, you forgot to declare your variables with Protected... Your proc won't work when EnableExplicit is enabled...
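For readers who have not used it, here is a minimal sketch of what Continue does in a PureBasic loop. It is only an illustration with a made-up loop, not part of either procedure above.

```purebasic
; Minimal illustration of Continue (hypothetical loop, not taken from the procedures above)
Define i
For i = 1 To 5
  If i % 2 = 0
    Continue ; even number: skip the rest of the body and go to the next iteration
  EndIf
  Debug i ; reached only for odd values of i
Next
```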
PB 4.30

Code: Select all

onErrorGoto(?Fred)
Tranquil
Addict
Posts: 942
Joined: Mon Apr 28, 2003 2:22 pm
Location: Europe

Post by Tranquil »

AND51 wrote: Shorter (16 lines) and much faster: [...]
Cool, I hoped someone would optimize it again. :) My initial post was for readability only. I have seen so many damn API-bloated threads in this forum, and I don't know why, because it's so easy.
Tranquil
AND51
Addict
Posts: 1040
Joined: Sun Oct 15, 2006 8:56 pm
Location: Germany
Contact:

Post by AND51 »

Well, if you keep all these little tricks like Break/Continue in the back of your mind, programming becomes a bit more efficient/easier (or so I claim).

Honestly: apart from the fact that I had already written this code once before (I rewrote it earlier just for you), I already knew the trick with Continue.

Just a tip from me to you. :D
Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

Have some fun with the following "benchmark" test.

I got:
First GetDirectorySize Evil: 9005391285 bytes, (457ms)
First GetDirectorySize Nice: 9005391285 bytes, (487ms)

Old GetDirectorySize Evil: 9005375925 bytes, (457ms)
Old GetDirectorySize Nice: 9005375925 bytes, (463ms)

New GetDirectorySize Evil: 9005375925 bytes, (474ms)
New GetDirectorySize Nice: 9005385465 bytes, (495ms)
Using my C:\ drive which has over 35000 files and almost 4000 folders. (ugh).

Code: Select all

EnableExplicit

Procedure.q FirstDirectorySizeEvil(path$,pattern$ = "*.*")
 Protected DirID.l,TotalSize.q
 If Right(path$,1)<>"\":path$+"\":EndIf
 DirID = ExamineDirectory(#PB_Any,path$,pattern$)
 If DirID
  While NextDirectoryEntry(DirID)
   If DirectoryEntryType(DirID) = #PB_DirectoryEntry_Directory
    If DirectoryEntryName(DirID)<>"." And DirectoryEntryName(DirID)<>".."
     TotalSize + FirstDirectorySizeEvil(path$+DirectoryEntryName(DirID)+"\",pattern$)
    EndIf
   EndIf
   If DirectoryEntryType(DirID) = #PB_DirectoryEntry_File
    TotalSize + DirectoryEntrySize(DirID)
   EndIf
   ;Delay(0)
  Wend
  FinishDirectory(DirID)
 EndIf
 ProcedureReturn TotalSize.q
EndProcedure

Procedure.q FirstDirectorySize(path$,pattern$ = "*.*")
 Protected DirID.l,TotalSize.q
 If Right(path$,1)<>"\":path$+"\":EndIf
 DirID = ExamineDirectory(#PB_Any,path$,pattern$)
 If DirID
  While NextDirectoryEntry(DirID)
   If DirectoryEntryType(DirID) = #PB_DirectoryEntry_Directory
    If DirectoryEntryName(DirID)<>"." And DirectoryEntryName(DirID)<>".."
     TotalSize.q + FirstDirectorySize(path$+DirectoryEntryName(DirID)+"\",pattern$)
    EndIf
   EndIf
   If DirectoryEntryType(DirID) = #PB_DirectoryEntry_File
    TotalSize + DirectoryEntrySize(DirID)
   EndIf
   Delay(0)
  Wend
  FinishDirectory(DirID)
 EndIf
 ProcedureReturn TotalSize.q
EndProcedure

Procedure.q GetDirectorySizeEvil(path$, pattern$="")
 Protected dir=ExamineDirectory(#PB_Any, path$, pattern$), size.q
 If dir
  While NextDirectoryEntry(dir)
   If DirectoryEntryType(dir) = #PB_DirectoryEntry_File
    size+DirectoryEntrySize(dir)
    Continue
   ElseIf Not DirectoryEntryName(dir) = "." And Not DirectoryEntryName(dir) = ".."
    size+GetDirectorySizeEvil(path$+DirectoryEntryName(dir)+"\", pattern$)
    Continue
   EndIf
   ;Delay(0)
  Wend
  FinishDirectory(dir)
 EndIf
 ProcedureReturn size
EndProcedure

Procedure.q GetDirectorySize(path$, pattern$="")
 Protected dir=ExamineDirectory(#PB_Any, path$, pattern$), size.q
 If dir
  While NextDirectoryEntry(dir)
   If DirectoryEntryType(dir) = #PB_DirectoryEntry_File
    size+DirectoryEntrySize(dir)
    Continue
   ElseIf Not DirectoryEntryName(dir) = "." And Not DirectoryEntryName(dir) = ".."
    size+GetDirectorySize(path$+DirectoryEntryName(dir)+"\", pattern$)
    Continue
   EndIf
   Delay(0)
  Wend
  FinishDirectory(dir)
 EndIf
 ProcedureReturn size
EndProcedure

Procedure.q NewDirectorySizeEvil(path$,pattern$="")
 Protected size.q,NewList folder.s(),dir.l,subpath$
 size=FileSize(path$)
 If size=-2
  size=0
  If Right(path$,1)<>"\"
   path$+"\"
  EndIf
  AddElement(folder())
  folder()=path$
  While CountList(folder())>0
   subpath$=folder()
   dir=ExamineDirectory(#PB_Any,subpath$,pattern$)
   If dir
    While NextDirectoryEntry(dir)
     If DirectoryEntryType(dir)=#PB_DirectoryEntry_File
      size+DirectoryEntrySize(dir)
     Else
      If DirectoryEntryName(dir)<>"." And DirectoryEntryName(dir)<>".."
       LastElement(folder())
       AddElement(folder())
       folder()=subpath$+DirectoryEntryName(dir)+"\"
      EndIf
     EndIf
     ;Delay(0)
    Wend
    FinishDirectory(dir)
   EndIf
   FirstElement(folder())
   DeleteElement(folder(),1)
   ;Delay(0)
  Wend
  ClearList(folder())
 EndIf
 ProcedureReturn size
EndProcedure

Procedure.q NewDirectorySize(path$,pattern$="")
 Protected size.q,NewList folder.s(),dir.l,subpath$
 size=FileSize(path$)
 If size=-2
  size=0
  If Right(path$,1)<>"\"
   path$+"\"
  EndIf
  AddElement(folder())
  folder()=path$
  While CountList(folder())>0
   subpath$=folder()
   dir=ExamineDirectory(#PB_Any,subpath$,pattern$)
   If dir
    While NextDirectoryEntry(dir)
     If DirectoryEntryType(dir)=#PB_DirectoryEntry_File
      size+DirectoryEntrySize(dir)
     Else
      If DirectoryEntryName(dir)<>"." And DirectoryEntryName(dir)<>".."
       LastElement(folder())
       AddElement(folder())
       folder()=subpath$+DirectoryEntryName(dir)+"\"
      EndIf
     EndIf
     Delay(0)
    Wend
    FinishDirectory(dir)
   EndIf
   FirstElement(folder())
   DeleteElement(folder(),1)
   Delay(0)
  Wend
  ClearList(folder())
 EndIf
 ProcedureReturn size
EndProcedure

Procedure.l GetTimerResolution(targetres.l=1) ;Default to 1ms target resolution
 ;TIMECAPS tc;
 Protected tc.TIMECAPS,TimerRes.l=#False
 
 If timeGetDevCaps_(tc,SizeOf(TIMECAPS))=#TIMERR_NOERROR 
  ;We must make sure we use a resolution the system is capable of providing.
  TimerRes=targetres
  If tc\wPeriodMin>TimerRes ;max(TimerRes,tc\wPeriodMin)
   TimerRes=tc\wPeriodMin
  EndIf
  If tc\wPeriodMax<TimerRes ;min(TimerRes,tc\wPeriodMax)
   TimerRes=tc\wPeriodMax
  EndIf
 EndIf
 ProcedureReturn TimerRes
EndProcedure ;return 0/#False if failure


OpenConsole()

;I use Multimedia Timers/functions as these can have up to 1ms resolution
;compared to ElapsedMilliseconds() that can have 10ms resolution or worse.
;These multimedia timing features are available on NT3.1 & W95 or later.
Define TimerRes.l

TimerRes=GetTimerResolution(1) ;We want 1ms resolution if available.
If TimerRes=#False
 PrintN("Oops! Failed to get timer resolution.")
 End
EndIf

timeBeginPeriod_(TimerRes) ;must match timeEndPeriod_()

;Ok, down to business!

Define path$,time1.l,time2.l,size.q
;For a good benchmark choose a big folder with subfolders and many files.
path$="C:\"

;This ensures that disk/system cache is taken advantage of,
;after all we are evaluating the routines not how slow the system is at the first scan.
;This avoids the first run ending up taking several seconds or worse, ugh...
PrintN("Filling/preloading system disk cache")
size=NewDirectorySizeEvil(path$)

PrintN("Ready?")
PrintN("")
PrintN("3")
Delay(1000) ;to avoid program startup affecting timings too much.
PrintN("2")
Delay(1000) ;it also happens to look cool
PrintN("1")
Delay(1000) ;as well as initialize the debug window.
PrintN("")

;There are Nice and Evil variants of all three.
;the nice variants (the advised way to do this)
;Uses Delay(0) this forces a context switch,
;all cpu will still be used, but it will not hog the system,
;this is very important in a multitasking os.
;and I really wish more people did this, especially on big/long running
;loops like these. because it is possible for a cpu to get so busy it will
;take minutes for even taskmanager to start, by being nice and using Delay(0)
;the program will use all available cpu that is "free" (not in use),
;this ensures the system remain stable.

;The main difference between the first/old and new methods is that the old uses
;nested procedures/calls, while effective it's not the nicest way to do things.
;Although I did not do any stack or memory checking, I do believe my "new" variant
;is more system friendly.

;At the very least it serves as a nice example on building a file index,
;and how to use high precision media timing. *laughs*

;Try this with debugger enabled or disabled, my advice would be with debugger off.

;first run
time1=timeGetTime_()
size=FirstDirectorySizeEvil(path$)
time2=timeGetTime_()
PrintN("First GetDirectorySize Evil: "+StrQ(size)+" bytes, ("+Str(time2-time1)+"ms)")

Delay(1000)

;second run
time1=timeGetTime_()
size=FirstDirectorySize(path$)
time2=timeGetTime_()
PrintN("First GetDirectorySize Nice: "+StrQ(size)+" bytes, ("+Str(time2-time1)+"ms)")

PrintN("")
Delay(1000)

;first run old
time1=timeGetTime_()
size=GetDirectorySizeEvil(path$)
time2=timeGetTime_()
PrintN("Old GetDirectorySize Evil: "+StrQ(size)+" bytes, ("+Str(time2-time1)+"ms)")

Delay(1000)

;second run old
time1=timeGetTime_()
size=GetDirectorySize(path$)
time2=timeGetTime_()
PrintN("Old GetDirectorySize Nice: "+StrQ(size)+" bytes, ("+Str(time2-time1)+"ms)")

PrintN("")
Delay(1000)

;first run new
time1=timeGetTime_()
size=NewDirectorySizeEvil(path$)
time2=timeGetTime_()
PrintN("New GetDirectorySize Evil: "+StrQ(size)+" bytes, ("+Str(time2-time1)+"ms)")

Delay(1000)

;second run new
time1=timeGetTime_()
size=NewDirectorySize(path$)
time2=timeGetTime_()
PrintN("New GetDirectorySize Nice: "+StrQ(size)+" bytes, ("+Str(time2-time1)+"ms)")

PrintN("")
PrintN("Done!")

timeEndPeriod_(TimerRes) ;must match timeBeginPeriod_()

PrintN("")
PrintN("Press enter to quit!")
Input()
CloseConsole()
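If you would rather stay API free for the timing as well, the same measurement pattern can be sketched with ElapsedMilliseconds(). This is only a rough alternative: as the comments in the benchmark note, its resolution may be coarser than the multimedia timer, and the Delay(250) is a placeholder for the code being measured.

```purebasic
; Timing sketch without API calls: ElapsedMilliseconds() instead of timeGetTime_()
Define start.q, elapsed.q
start = ElapsedMilliseconds()
Delay(250) ; placeholder for the routine being measured
elapsed = ElapsedMilliseconds() - start
Debug "Elapsed: " + Str(elapsed) + " ms"
```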
thefool
Always Here
Posts: 5883
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

Rescator wrote: Using my C:\ drive which has over 35000 files and almost 4000 folders. (ugh).
ugh?

c: 85 485 files, 7590 folders.
f: 189 444 files, 18 713 folders :!:

(and i just cleaned up yesterday :/)
GeoTrail
Addict
Posts: 2789
Joined: Fri Feb 13, 2004 12:45 am
Location: Bergen, Norway
Contact:

Post by GeoTrail »

Vista drive:
43 757 Files, 4 897 Folders.

XP drive:
85 534 Files, 7 256 Folders.

And I haven't cleaned anything, but I need to.
I Stepped On A Cornflake!!! Now I'm A Cereal Killer!
AND51
Addict
Posts: 1040
Joined: Sun Oct 15, 2006 8:56 pm
Location: Germany
Contact:

Post by AND51 »

Obviously, my procedure was the fastest. 8) :wink:
You wrote: ;Uses Delay(0) this forces a context switch,
;all cpu will still be used, but it will not hog the system,
;this is very important in a multitasking os.
Please give me more information, I am interested in that. I don't know what a "context switch" is, for example.

This sounds very interesting. Furthermore, I even have a multitasking CPU (2 processors).

Have you got more tricks like that?
Joakim Christiansen
Addict
Posts: 2453
Joined: Wed Dec 22, 2004 4:12 pm
Location: Norway
Contact:

Post by Joakim Christiansen »

GeoTrail wrote:Vista drive:
43 757 Files, 4 897 Folders.

XP drive:
85 534 Files, 7 256 Folders.

And I haven't cleaned anything, but I need to.
146 106 Files, 12 345 Folders.
And I clean every day!
But that's 198GB with stuff :P

And yeah, nice code btw! :D
I like logic, hence I dislike humans but love computers.
blueznl
PureBasic Expert
Posts: 6111
Joined: Sat May 17, 2003 11:31 am
Contact:

Post by blueznl »

c:

49207 files in 11817 folders, no data, just installed software etc... :-(

talking about bloat where's my old atari 400? :-)

nice code though!
( PB5.73LTS Win10 x64 Asrock AB350 Pro4 Ryzen 1600X GTX1060 )
( The path to enlightenment and the PureBasic Survival Guide right here... )
thefool
Always Here
Posts: 5883
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

Ha! I win 8)
AND51
Addict
Posts: 1040
Joined: Sun Oct 15, 2006 8:56 pm
Location: Germany
Contact:

Post by AND51 »

What do you win, thefool?
Which code do you mean, blueznl?

Well, Rescator? Where is my information? Please give me some information about what I asked in my previous post.
GeoTrail
Addict
Posts: 2789
Joined: Fri Feb 13, 2004 12:45 am
Location: Bergen, Norway
Contact:

Post by GeoTrail »

thefool wrote:Ha! I win 8)

Code: Select all

For i=0 To 1000000
	CreateFile(0, "C:\Crap\CrapFile"+Str( i )+".txt")
	WriteStringN(0, "This is just a crap file...")
	WriteStringN(0, "This is just a crap file...")
	WriteStringN(0, "This is just a crap file...")
	WriteStringN(0, "This is just a crap file...")
	WriteStringN(0, "This is just a crap file...")
	WriteStringN(0, "This is just a crap file...")
	WriteStringN(0, "This is just a crap file...")
	CloseFile(0)
Next i
I win hehehe
I Stepped On A Cornflake!!! Now I'm A Cereal Killer!
Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

AND51 wrote:What do you win, thefool?
Which code do you mean, blueznl?

Well, Rescator? Where is my information? Please give me some information about what I asked in my previous post.
PB's Delay() is a wrapper for the Win API Sleep function.

Here is a snippet from the PSDK about Sleep.
A value of zero causes the thread to relinquish the remainder of its time slice to any other thread of equal priority that is ready to run. If there are no other threads of equal priority ready to run, the function returns immediately, and the thread continues execution.
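In PureBasic terms, the idea boils down to something like the following rough sketch. The loop and its "work" are made up for illustration; the only point is where the Delay(0) goes.

```purebasic
; Hypothetical long-running loop that offers up its time slice on every iteration
Define i, sum.q
For i = 1 To 100000
  sum + i   ; placeholder for real work
  Delay(0)  ; on Windows this ends up in Sleep(0), as described in the PSDK text above
Next
Debug sum
```

If calling Delay(0) on every single iteration turns out to cost too much, yielding only every few hundred iterations is a possible compromise.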
AND51
Addict
Posts: 1040
Joined: Sun Oct 15, 2006 8:56 pm
Location: Germany
Contact:

Post by AND51 »

Thank you! Very interesting!
So if I create a thread, the thread can start faster if I've got a Delay(0) in my main module? And if there is no thread waiting to be started, my main module wouldn't be influenced, because Delay(0) returns immediately...
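A small sketch of that situation, assuming a made-up Worker() procedure and a simple global flag (both are illustrative names only). Going by the PSDK text, Delay(0) does not so much start the thread faster as hand over the rest of the time slice when another thread of equal priority is ready to run.

```purebasic
Global done = #False ; set by the worker when it has finished

Procedure Worker(*unused)
  Delay(50)          ; placeholder for real background work
  done = #True
EndProcedure

Define spins.q
Define thread = CreateThread(@Worker(), 0)
If thread
  While done = #False
    spins + 1
    Delay(0)         ; returns immediately unless another ready thread can take the slice
  Wend
  WaitThread(thread) ; make sure the worker has really ended
  Debug "Main loop iterations while waiting: " + Str(spins)
EndIf
```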