
GetFolderSize()

Posted: Thu Jun 16, 2016 9:16 am
by Thunder93
I have a procedure that gets the size of a folder by traversing its directory tree.

Here are two versions: one specific to Windows, the other using PB native commands.


{ Windows Specific }

Code: Select all

Structure LARGE_INTEGER_
  StructureUnion
    x.LARGE_INTEGER
    QuadPart.q
  EndStructureUnion
EndStructure

Procedure.q GetFolderSize(Path.s, State.b = 1) 
  Protected _Data.WIN32_FIND_DATA, Path$ = Path + "\*.*"
  Static.q nSize
  
  If State : nSize = 0 : EndIf
  
  hF = FindFirstFile_(Path$, @_Data)
  
  If hF <> #INVALID_HANDLE_VALUE    
    Repeat
      If _Data\dwFileAttributes & #FILE_ATTRIBUTE_DIRECTORY
        
        ;// skip file or directory that has an associated reparse point,
        ;// or a file that is a symbolic link. Basically to avoid potential
        ;// infinitely recursive directory tree.
        If Not _Data\dwFileAttributes & #FILE_ATTRIBUTE_REPARSE_POINT          
          
          ;// make sure we skip "." And ".."
          If PeekS(@_Data\cFileName) <> "." And PeekS(@_Data\cFileName) <> ".."
            
            ;// We found a sub-directory, so get the files in it too
            Path$ = Path + "\" + PeekS(@_Data\cFileName)
            
            ;// recursion here!
            GetFolderSize(Path$, 0) ; Traverse Directory
          EndIf
        EndIf

      Else
        sz.LARGE_INTEGER_
        
        ;// All we want here is the file size.  Since file sizes can be larger
        ;// than 2 gig, the size is reported As two DWORD objects.  Below we
        ;// combine them To make one 64-bit integer.
        sz\x\LowPart = _Data\nFileSizeLow
        sz\x\HighPart = _Data\nFileSizeHigh
        
        nSize + sz\QuadPart        
      EndIf      
    Until FindNextFile_(hF, @_Data) = 0
    FindClose_(hF)    
  EndIf
  
  ProcedureReturn nSize
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  Path.s = #PB_Compiler_Home
  Debug #CRLF$+"GetFolderSize()"
  Debug "Path: " + Path
  Debug "  "+Str(GetFolderSize(Path)) + " Bytes"
CompilerEndIf

{ PB Native commands }

Code: Select all

CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  #PB_FileSystem_Link = #FILE_ATTRIBUTE_REPARSE_POINT
CompilerEndIf

Procedure.q GetFolderSize(Path.s, State.b = 1)
  Protected Path$ = Path
  Static.q nSize
  
  If State : nSize = 0 : EndIf
  
  hF = ExamineDirectory(#PB_Any, Path, "*.*")
  If hF <> #False   
    While NextDirectoryEntry(hF)
      If DirectoryEntryType(hF) = #PB_DirectoryEntry_Directory
        If DirectoryEntryName(hF) <> "." And DirectoryEntryName(hF) <> ".."
          
          If DirectoryEntryAttributes(hF) & #PB_FileSystem_Link
            Continue
          EndIf
          
          Path$ = Path + "\" + DirectoryEntryName(hF)  
          GetFolderSize(Path$, 0) ; Traverse Directory
        EndIf
      Else
        nSize + FileSize(Path + "\"  + DirectoryEntryName(hF))
      EndIf
    Wend
    FinishDirectory(hF)
  EndIf
  
  ProcedureReturn nSize
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  Path.s = #PB_Compiler_Home
  Debug #CRLF$+"GetFolderSize()"
  Debug "Path: " + Path
  Debug "  "+Str(GetFolderSize(Path)) + " Bytes"
CompilerEndIf
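
As a side note on the Windows-specific version: the two halves of the file size could probably also be combined with a shift and a mask instead of the LARGE_INTEGER_ union. A minimal sketch, assuming a helper named MakeQuad() (a hypothetical name, not a PB or WinAPI function) that takes both halves as quads:

```purebasic
; Hypothetical helper: combine the high and low 32-bit halves of a file size
; (e.g. nFileSizeHigh / nFileSizeLow from WIN32_FIND_DATA) into one quad.
Procedure.q MakeQuad(High.q, Low.q)
  ; Low may arrive sign-extended if it came from a signed .l field,
  ; so mask it down to its lower 32 bits before merging.
  ProcedureReturn (High << 32) | (Low & $FFFFFFFF)
EndProcedure

Debug MakeQuad(1, 0)   ; only the lowest bit of the high half is set
Debug MakeQuad(0, -1)  ; low half with all 32 bits set, high half empty
```

Inside the loop this would be used as nSize + MakeQuad(_Data\nFileSizeHigh, _Data\nFileSizeLow). The union does the same job without any arithmetic, so this is only an alternative, not an improvement.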

Re: Recursive Directory-traversals & reparse points, symboli

Posted: Fri Jun 17, 2016 1:01 am
by Thunder93
An old 2004 post by Raymond Chen (MSFT), "You can create an infinitely recursive directory tree": https://blogs.msdn.microsoft.com/oldnew ... /?p=36883/

Re: Recursive Directory-traversals & reparse points, symboli

Posted: Fri Jun 17, 2016 5:28 am
by Lunasole
Thanks for sharing, your WinAPI version looks shorter and more complete than the one I made.

Btw, I guess there should be a way to mix WinAPI calls with PB functions (returning a handle) to recognize symlinks, but I haven't tried yet, as I rarely need to filter those links.

Re: Recursive Directory-traversals & reparse points, symboli

Posted: Fri Jun 17, 2016 11:34 am
by Thunder93
You're welcome. A more complete version would also offer the option of excluding subdirectories. That is easily done, with very little alteration.

{ Windows Specific Procedure }

Code: Select all

Structure LARGE_INTEGER_
  StructureUnion
    x.LARGE_INTEGER
    QuadPart.q
  EndStructureUnion
EndStructure

Procedure.q GetFolderSize(Path.s, Ignore_SubDIRs.b = 0, State.b = 1)
  Protected _Data.WIN32_FIND_DATA, Path$ = Path + "\*.*"
  Static.q nSize
  
  If State : nSize = 0 : EndIf
  
  hF = FindFirstFile_(Path$, @_Data)
  
  If hF <> #INVALID_HANDLE_VALUE    
    Repeat
      If _Data\dwFileAttributes & #FILE_ATTRIBUTE_DIRECTORY
        If Ignore_SubDIRs = 1 : Continue : EndIf
        
        ;// skip file or directory that has an associated reparse point,
        ;// or a file that is a symbolic link. Basically to avoid potential
        ;// infinitely recursive directory tree.
        If Not _Data\dwFileAttributes & #FILE_ATTRIBUTE_REPARSE_POINT
          
          ;// make sure we skip "." and "..".  Compare the whole name here because
          ;// some file names can start with a dot, so just testing for the
          ;// first dot is not sufficient.
          
          If PeekS(@_Data\cFileName) <> "." And PeekS(@_Data\cFileName) <> ".."
            
            ;// We found a sub-directory, so get the files in it too
            Path$ = Path + "\" + PeekS(@_Data\cFileName)
            
            ;// recursion here!
            GetFolderSize(Path$, 0, 0)  ; Traverse Directory
          EndIf
        EndIf

      Else
        sz.LARGE_INTEGER_
        
        ;// All we want here is the file size.  Since file sizes can be larger
        ;// than 2 gig, the size is reported As two DWORD objects.  Below we
        ;// combine them To make one 64-bit integer.
        sz\x\LowPart = _Data\nFileSizeLow
        sz\x\HighPart = _Data\nFileSizeHigh
        
        nSize + sz\QuadPart        
      EndIf      
    Until FindNextFile_(hF, @_Data) = 0
    FindClose_(hF)    
  EndIf
  
  ProcedureReturn nSize
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  Path.s = #PB_Compiler_Home
  Debug "GetFolderSize()"
  Debug "Path: " + Path
  Debug "  "+GetFolderSize(Path, 1) + " Exclude SubDirectories."
  Debug "  "+GetFolderSize(Path) + " Regular, Include SubDirectories."
CompilerEndIf

Lunasole wrote: "Btw, I guess there should be a way to mix WinAPI calls with PB functions (returning a handle) to recognize symlinks, but I haven't tried yet, as I rarely need to filter those links."

Sure you can, but it defeats the purpose of having cross-platform code. Use the GetFileAttributes_() API, not to be confused with the PB native command GetFileAttributes().

e.g.

Code: Select all

Procedure.q GetFolderSize(Path.s, State.b = 1)
  Protected Path$ = Path, dwAttrs.l
  Static.q nSize
  
  If State : nSize = 0 : EndIf
  
  hF = ExamineDirectory(#PB_Any, Path, "*.*")
  If hF <> #False   
    While NextDirectoryEntry(hF)
      If DirectoryEntryType(hF) = #PB_DirectoryEntry_Directory
        If DirectoryEntryName(hF) <> "." And DirectoryEntryName(hF) <> ".."
          Path$ = Path + "\" + DirectoryEntryName(hF)
          
          dwAttrs = GetFileAttributes_(Path$)
          If dwAttrs <> #INVALID_FILE_ATTRIBUTES 
            If Not dwAttrs & #FILE_ATTRIBUTE_REPARSE_POINT
              GetFolderSize(Path$, 0)  ; Traverse Directory
            EndIf
          EndIf
        EndIf
      Else
        nSize + FileSize(Path + "\"  + DirectoryEntryName(hF))
      EndIf
    Wend
    FinishDirectory(hF)
  EndIf
  
  ProcedureReturn nSize
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  Path.s = #PB_Compiler_Home
  Debug "GetFolderSize()"
  Debug "Path: " + Path
  Debug "  "+GetFolderSize(Path)
CompilerEndIf

Even before this alteration, speed testing showed the native version to be a tad slower (2 milliseconds, to be specific) than the Windows-specific version. The trade-off is worth it anyway for the sake of cross-platform code.
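
For anyone who wants to repeat such a comparison, here is a rough timing sketch using ElapsedMilliseconds(). TopLevelSize() is a hypothetical stand-in that only sums the files directly inside the folder; swap in whichever GetFolderSize() version you want to measure. This is not the harness used for the figure above, just one assumed way to do it:

```purebasic
; Hypothetical stand-in for the procedure being measured: it only sums the
; files directly inside Path (no recursion), to keep the sketch short.
Procedure.q TopLevelSize(Path.s)
  Protected Total.q, Dir
  
  Dir = ExamineDirectory(#PB_Any, Path, "*.*")
  If Dir
    While NextDirectoryEntry(Dir)
      If DirectoryEntryType(Dir) = #PB_DirectoryEntry_File
        Total + DirectoryEntrySize(Dir)
      EndIf
    Wend
    FinishDirectory(Dir)
  EndIf
  
  ProcedureReturn Total
EndProcedure

Define.q StartTime, Elapsed, Size

; Take a timestamp before and after the call and report the difference.
StartTime = ElapsedMilliseconds()
Size      = TopLevelSize(#PB_Compiler_Home)
Elapsed   = ElapsedMilliseconds() - StartTime

Debug "Size: " + Str(Size) + " Bytes"
Debug "Time: " + Str(Elapsed) + " ms"
```

A difference of a couple of milliseconds is likely within run-to-run variation (disk cache, debugger overhead), so it is probably best to run each version several times with the debugger off before drawing conclusions.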

Re: Recursive Directory-traversals & reparse points, symboli

Posted: Fri Jun 17, 2016 3:02 pm
by Kukulkan
Nice, this is helpful. :D

Does anyone know if there is any faster cross-platform option to get the size of a folder incl. subfolders? I'm not interested in any other information. Just size. No symlink-support needed.

Kukulkan

Re: Recursive Directory-traversals & reparse points, symboli

Posted: Fri Jun 17, 2016 4:49 pm
by Thunder93
These look like regular folders, but they are essentially shortcuts that link to a folder elsewhere, outside of that location. If that directory or one of its subdirectories contains one pointing back to a folder that was already visited, you could end up in a loop, recalculating the sizes of the same files.

C:\My Music\Top Classics\Frank Hits Us\[<Shortcut>]\

Then it becomes:

C:\My Music\Top Classics\Frank Hits Us\[<Shortcut to C:\My Music>]\
C:\My Music\Top Classics\Frank Hits Us\[<Shortcut to C:\My Music>]\C:\My Music\
C:\My Music\Top Classics\Frank Hits Us\[<Shortcut to C:\My Music>]\C:\My Music\Top Classics\
C:\My Music\Top Classics\Frank Hits Us\[<Shortcut to C:\My Music>]\C:\My Music\Top Classics\Frank Hits Us\
C:\My Music\Top Classics\Frank Hits Us\[<Shortcut to C:\My Music>]\C:\My Music\Top Classics\Frank Hits Us\[<Shortcut to C:\My Music>]\

and this repeats again and again, so the reported size keeps growing.

You don't need anything special in place to run into these. The point is to check each entry's attributes and skip such links, so that your code doesn't fall into this loop.
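
To illustrate the attribute check on its own, here is a minimal Windows-only sketch. IsReparsePoint() is a hypothetical helper name chosen for this example; it only wraps the GetFileAttributes_() call and the #FILE_ATTRIBUTE_REPARSE_POINT test already used in this thread:

```purebasic
; Windows only. IsReparsePoint() is a hypothetical helper, not a PB command.
CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  Procedure.b IsReparsePoint(Path.s)
    Protected Attrs.l
    
    Attrs = GetFileAttributes_(Path)
    If Attrs = #INVALID_FILE_ATTRIBUTES
      ; The path is missing or not accessible: report "not a reparse point".
      ProcedureReturn #False
    EndIf
    
    ProcedureReturn Bool((Attrs & #FILE_ATTRIBUTE_REPARSE_POINT) <> 0)
  EndProcedure
  
  ; Example: test a single folder before deciding whether to descend into it.
  Debug IsReparsePoint(#PB_Compiler_Home)
CompilerEndIf
```

A traversal would call this for each subdirectory and only recurse when it returns #False.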


Here's one way to make it support multiple OSes, though you shouldn't have to resort to this approach:

Code: Select all

; Path separator for the current OS, so the same code builds paths correctly
; on Windows, Linux and MacOS.
CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  #Sep$ = "\"
CompilerElse
  #Sep$ = "/"
CompilerEndIf

Procedure.q GetFolderSize2(Path.s, State.b = 1)
  Protected fName.s, dwAttrs.l, hF, SkipDir
  Static.q nSize
  
  If State : nSize = 0 : EndIf
  
  hF = ExamineDirectory(#PB_Any, Path, "*.*")
  If hF <> #False
    While NextDirectoryEntry(hF)
      If DirectoryEntryType(hF) = #PB_DirectoryEntry_Directory
        If DirectoryEntryName(hF) <> "." And DirectoryEntryName(hF) <> ".."
          fName = Path + #Sep$ + DirectoryEntryName(hF)
          
          ; Assume the directory must be skipped until the OS-specific
          ; check below shows that it is not a link / reparse point.
          SkipDir = #True
          
          CompilerIf #PB_Compiler_OS = #PB_OS_MacOS Or #PB_Compiler_OS = #PB_OS_Linux
            dwAttrs = DirectoryEntryAttributes(hF)
            If Not dwAttrs & #PB_FileSystem_Link
              SkipDir = #False
            EndIf
          CompilerElseIf #PB_Compiler_OS = #PB_OS_Windows
            dwAttrs = GetFileAttributes_(fName)
            If dwAttrs <> #INVALID_FILE_ATTRIBUTES
              If Not dwAttrs & #FILE_ATTRIBUTE_REPARSE_POINT
                SkipDir = #False
              EndIf
            EndIf
          CompilerEndIf
          
          If SkipDir = #False
            GetFolderSize2(fName, 0) ; Traverse Directory
          EndIf
        EndIf
      Else
        nSize + FileSize(Path + #Sep$ + DirectoryEntryName(hF))
      EndIf
    Wend
    FinishDirectory(hF)
  EndIf
  
  ProcedureReturn nSize
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  Path.s = "D:\OutOfTheBox"
  Str$ = "GetFolderSize2()" + #CRLF$ +
         "...Path: " + Path + #CRLF$ +
         "..." + Str(GetFolderSize2(Path))
  Debug Str$
  
  CompilerIf Not #PB_Compiler_Debugger
    MessageRequester("", Str$)
  CompilerEndIf
CompilerEndIf

Re: GetFolderSize()

Posted: Tue Jul 05, 2016 5:06 am
by Thunder93
I've updated the main post. The second code section, which uses the PB native command set, now also handles reparse points and symbolic links, basically by creating the #PB_FileSystem_Link constant for Windows.