Backup regimen suggestions

Keya
Addict
Posts: 1890
Joined: Thu Jun 04, 2015 7:10 am

Re: Backup regimen suggestions

Post by Keya

jack wrote:one thing that concerns me is the use of SHA-1 hash in determining what 64k block to save, however remote the chance of a collision it's still not zero.
not 0 but still only 1 in 1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,975 :) I'm gonna take the risk hehe
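
For scale, that figure is 2^160 - 1, one less than the number of possible SHA-1 digests, and it is the odds for a single pair of blocks. Across a whole backup the birthday bound is the more relevant estimate. A rough worked version, assuming SHA-1 outputs behave as uniformly random and nobody is deliberately crafting colliding blocks (the 1 PiB data size is only an illustrative assumption):

```latex
% Birthday approximation for n blocks hashed to 160 bits
P(\text{any collision among } n \text{ blocks})
  \approx \frac{n(n-1)}{2 \cdot 2^{160}}
  \approx \frac{n^{2}}{2^{161}}

% Example: 1 PiB of data split into 64 KiB blocks
n = \frac{2^{50}}{2^{16}} = 2^{34},
\qquad
P \approx \frac{2^{68}}{2^{161}} = 2^{-93} \approx 10^{-28}
```

So even at that scale an accidental collision stays far below any realistic hardware error rate.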
tj1010
Enthusiast
Posts: 716
Joined: Mon Feb 25, 2013 5:51 pm

Post by tj1010

Keya wrote:
jack wrote:one thing that concerns me is the use of SHA-1 hash in determining what 64k block to save, however remote the chance of a collision it's still not zero.
not 0 but still only 1 in 1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,975 :) I'm gonna take the risk hehe
I still use CRC32 and fall back to MD5, along with file system metrics.
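
One possible reading of that tiered check, as a minimal sketch: compare the cheap file-system metric (size) first, then CRC32, and only compute MD5 when both still agree. `SameFile()` is a hypothetical helper name, and the exact order of the checks is an assumption rather than something stated in the post.

```purebasic
UseCRC32Fingerprint()
UseMD5Fingerprint()

; Hypothetical helper: returns #True when two files appear identical.
Procedure.i SameFile(a$, b$)
  ; Cheapest test first: a missing file or a size mismatch settles it.
  If FileSize(a$) < 0 Or FileSize(a$) <> FileSize(b$)
    ProcedureReturn #False
  EndIf
  ; Fast checksum next.
  If FileFingerprint(a$, #PB_Cipher_CRC32) <> FileFingerprint(b$, #PB_Cipher_CRC32)
    ProcedureReturn #False
  EndIf
  ; CRC32 agreed, so confirm with the stronger (and slower) MD5.
  ProcedureReturn Bool(FileFingerprint(a$, #PB_Cipher_MD5) = FileFingerprint(b$, #PB_Cipher_MD5))
EndProcedure
```

The ordering keeps the common "files differ" case cheap, since most mismatches are caught before any MD5 is computed.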

Quick userland solution: just point it at a drive, volume, or folder on Windows/Linux/OSX (NOTICE: it can only read what the process has privileges for):

Code:

UseCRC32Fingerprint()

Declare crawl(path$)
Declare Verify(path$)

path$=PathRequester("","")
If Len(path$)>0
  Debug "file-path,file-size,CRC32 of file"
  crawl(path$)
EndIf

;path$=OpenFileRequester("","","*.*",0)
; If Len(path$)>0
;   Verify(path$)
; EndIf
End

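; Recursively list every file under path$ (which must end with a path separator)
; as "path,size,CRC32" in the debug output.
Procedure crawl(path$)
  Protected emdir.i ; #PB_Any handles are pointer-sized, so .l could truncate them on x64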
  emdir=ExamineDirectory(#PB_Any,path$,"*.*")
  If emdir
    While NextDirectoryEntry(emdir)
      Delay(2)
      If DirectoryEntryType(emdir)=#PB_DirectoryEntry_Directory
        If DirectoryEntryName(emdir)<>"." And DirectoryEntryName(emdir)<>".." And DirectoryEntryName(emdir)<>"$RECYCLE.BIN"
          CompilerIf #PB_Compiler_OS = #PB_OS_Linux
            crawl(path$+DirectoryEntryName(emdir)+"/")
          CompilerElseIf #PB_Compiler_OS = #PB_OS_Windows
            crawl(path$+DirectoryEntryName(emdir)+"\")
          CompilerElse
            crawl(path$+DirectoryEntryName(emdir)+"/")
          CompilerEndIf
        EndIf
      Else
        Debug path$+DirectoryEntryName(emdir)+","+FileSize(path$+DirectoryEntryName(emdir))+","+FileFingerprint(path$+DirectoryEntryName(emdir),#PB_Cipher_CRC32)
      EndIf
    Wend
    FinishDirectory(emdir)
  EndIf
EndProcedure

; Re-check each "path,size,CRC32" line of a saved listing and report mismatches.
Procedure Verify(path$)
  Protected ss$
  If FileSize(path$)>0
    If ReadFile(0,path$)
      While Eof(0)=0
        Delay(2)
        ss$=ReadString(0)
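        ; Convert Windows separators when verifying on another OS (checked via the compile-time OS constant)
        If #PB_Compiler_OS<>#PB_OS_Windows And CountString(StringField(ss$,1,","),"\")>0 : ss$=ReplaceString(ss$,"\","/") : EndIf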
        If FileSize(StringField(ss$,1,","))<>Val(StringField(ss$,2,",")) Or FileFingerprint(StringField(ss$,1,","),#PB_Cipher_CRC32)<>StringField(ss$,3,",") : Debug StringField(ss$,1,",")+" is corrupt" : EndIf
      Wend
      CloseFile(0)
    EndIf
  EndIf
EndProcedure

I use that on low- to medium-value data on USB drives and folders. I only use CRC32 for speed, not because of its collision statistics. If I wanted something stronger I'd probably go with SHA3-224 or MD5, but at that point I'd just make a hashed volume backup over some fast bus. Adding roughly five lines would make it do the backup as well, but I use firmware for that, except for things like password safes, which I put on encrypted thumb drives and in AES-encrypted 7zip archives on some clouds.
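
The copy step itself is not shown in the post. A minimal sketch of how the crawl might be extended to mirror files as it goes, assuming PureBasic's `CopyFile()` and `CreateDirectory()`; `backupCrawl()` and its `dest$` parameter are hypothetical names, and both paths are assumed to end with a path separator. The destination must not sit inside the source tree, or the recursion would start copying its own output.

```purebasic
UseCRC32Fingerprint()

; Hypothetical sketch: mirror path$ into dest$ and check each copy by CRC32.
Procedure backupCrawl(path$, dest$)
  Protected dir.i, name$, sep$
  CompilerIf #PB_Compiler_OS = #PB_OS_Windows
    sep$ = "\"
  CompilerElse
    sep$ = "/"
  CompilerEndIf
  CreateDirectory(dest$) ; simply returns 0 if the directory already exists
  dir = ExamineDirectory(#PB_Any, path$, "*.*")
  If dir
    While NextDirectoryEntry(dir)
      name$ = DirectoryEntryName(dir)
      If DirectoryEntryType(dir) = #PB_DirectoryEntry_Directory
        If name$ <> "." And name$ <> ".." And name$ <> "$RECYCLE.BIN"
          backupCrawl(path$ + name$ + sep$, dest$ + name$ + sep$)
        EndIf
      ElseIf CopyFile(path$ + name$, dest$ + name$)
        ; Compare source and copy so a bad write is reported right away.
        If FileFingerprint(path$ + name$, #PB_Cipher_CRC32) <> FileFingerprint(dest$ + name$, #PB_Cipher_CRC32)
          Debug dest$ + name$ + " differs from source"
        EndIf
      Else
        Debug "could not copy " + path$ + name$
      EndIf
    Wend
    FinishDirectory(dir)
  EndIf
EndProcedure
```

Something like `backupCrawl(path$, "E:\backup\")` would then replace the `crawl(path$)` call; the target path here is only a placeholder.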