Search for duplicate files
Download: yandex upload.ee
screenshot on linux
1. In the context menu, you can select deletion priority levels.
2. You can save CSV and then use CSV list.
updated
Added unlimited priority level selection
Added removal of an item from the search box
Added group item color (in ini)
Added PseudoHashSize parameter to ini
Added saving results to a file (to compare Linux and Windows results)
Last edited by AZJIO on Fri Jul 01, 2022 9:07 pm, edited 7 times in total.
- Kwai chang caine
- Always Here
- Posts: 5357
- Joined: Sun Nov 05, 2006 11:42 pm
- Location: Lyon - France
Re: Search for duplicate files
To speed up the preliminary comparison of files, I divide the file length into 32 sections and read one byte from each, 32 bytes in total. Now, if a group consists of 200 files of the same size, then instead of calculating the MD5 of large files with a total size of 100 GB, I read 32 bytes from each file. This is about 10 times faster. Only if the preliminary comparison still suggests that the files are identical do I calculate the MD5.
I added the source code with the prefix PseudoHash.
Code: Select all
DisableDebugger
EnableExplicit
UseMD5Fingerprint()

Define Path$, StartTime.q, Res.s, md5$

; Builds a cheap "pseudo-hash" from single bytes sampled at intervals of Shift, plus the last byte.
Procedure.s GetPseudoHash(Path$, Shift.q)
  Protected res$, length.q, file_id ; length is a quad so that files over 2 GB do not overflow on 32-bit builds
  file_id = ReadFile(#PB_Any, Path$)
  If file_id
    length = Lof(file_id)
    FileSeek(file_id, 4, #PB_Relative) ; skip the first 4 bytes
    While Eof(file_id) = 0
      ; pad every byte to two hex digits so that byte boundaries stay unambiguous
      res$ + RSet(Hex(ReadByte(file_id), #PB_Byte), 2, "0")
      FileSeek(file_id, Shift, #PB_Relative)
    Wend
    If length > 0
      FileSeek(file_id, length - 1, #PB_Absolute) ; always include the last byte
      res$ + RSet(Hex(ReadByte(file_id), #PB_Byte), 2, "0")
    EndIf
    CloseFile(file_id)
  EndIf
  ProcedureReturn res$
EndProcedure

; Time the pseudo-hash
Path$ = "path_to_video"
StartTime = ElapsedMilliseconds()
md5$ = GetPseudoHash(Path$, FileSize(Path$) / 31)
Res = "hash time = " + Str(ElapsedMilliseconds() - StartTime) + " ms"
MessageRequester("hash_0", md5$ + #LF$ + #LF$ + Res)

; Time a full MD5 for comparison
Path$ = "path_to_movie_of_the_same_size_but_different_hash"
StartTime = ElapsedMilliseconds()
md5$ = FileFingerprint(Path$, #PB_Cipher_MD5)
Res = "hash time md5 = " + Str(ElapsedMilliseconds() - StartTime) + " ms"
MessageRequester("md5", md5$ + #LF$ + #LF$ + Res)
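The test above only times the two hashes. As a rough sketch of how the two stages could fit together for a group of same-size files, the example below first groups files by sampled bytes and computes MD5 only for files that still collide. This is not the program's actual source: SampleHash, FindDuplicates, Candidates, Md5Groups and the file paths are hypothetical names made up for this illustration, and SampleHash is a simplified variant of GetPseudoHash.

```purebasic
EnableExplicit
UseMD5Fingerprint()

; Hypothetical helper: samples roughly 32 bytes spread evenly over the file.
Procedure.s SampleHash(Path$)
  Protected res$, length.q, shift.q, pos.q, file_id
  file_id = ReadFile(#PB_Any, Path$)
  If file_id
    length = Lof(file_id)
    shift = length / 31
    If shift < 1 : shift = 1 : EndIf
    While pos < length
      FileSeek(file_id, pos) ; absolute position
      res$ + RSet(Hex(ReadByte(file_id), #PB_Byte), 2, "0")
      pos + shift
    Wend
    CloseFile(file_id)
  EndIf
  ProcedureReturn res$
EndProcedure

; Hypothetical helper: Files() is assumed to hold files of the same size.
; Md5Groups() receives MD5 -> paths separated by #LF$.
Procedure FindDuplicates(List Files.s(), Map Md5Groups.s())
  Protected NewMap Candidates.s() ; sampled bytes -> paths separated by #LF$
  Protected key$, path$, md5$, i, n
  ; Stage 1: cheap grouping by sampled bytes
  ForEach Files()
    If FileSize(Files()) >= 0 ; skip missing files and directories
      key$ = SampleHash(Files())
      Candidates(key$) = Candidates(key$) + Files() + #LF$
    EndIf
  Next
  ; Stage 2: full MD5 only where at least two files share the same samples
  ForEach Candidates()
    n = CountString(Candidates(), #LF$)
    If n > 1
      For i = 1 To n
        path$ = StringField(Candidates(), i, #LF$)
        md5$ = FileFingerprint(path$, #PB_Cipher_MD5)
        Md5Groups(md5$) = Md5Groups(md5$) + path$ + #LF$
      Next
    EndIf
  Next
EndProcedure

; Usage with placeholder paths
NewList Files.s()
NewMap Groups.s()
AddElement(Files()) : Files() = "a.bin"
AddElement(Files()) : Files() = "b.bin"
FindDuplicates(Files(), Groups())
ForEach Groups()
  If CountString(Groups(), #LF$) > 1
    Debug "Duplicates (MD5 " + MapKey(Groups()) + "):" + #LF$ + Groups()
  EndIf
Next
```

Files whose sampled bytes are unique never reach the MD5 stage, which is where the time saving comes from. Matching samples alone do not prove that two files are equal, so the full hash stays as the final check.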
Re: Search for duplicate files
I got the warning:
Couldn't download - Virus detected
Can you provide the source only too?
Belive!
<Wrapper>4PB, PB<game>, =QONK=, PetriDish, Movie2Image, PictureManager,...
Re: Search for duplicate files
https://disk.yandex.ru/d/QvQ5oqebC69uZA
Will the antivirus allow you to compile?
Re: Search for duplicate files
AZJIO wrote: Tue Jun 28, 2022 3:59 pm
https://disk.yandex.ru/d/QvQ5oqebC69uZA
Will the antivirus allow you to compile?
Sure. I can stop it.
I see the source and can trust it.
Belive!
<Wrapper>4PB, PB<game>, =QONK=, PetriDish, Movie2Image, PictureManager,...
Re: Search for duplicate files
At the moment, all my projects include the source, even the archive in which you saw the virus. My free Kaspersky antivirus says that there is no virus in the file.
Re: Search for duplicate files
Avast, Trendmicro, Defender
Belive!
<Wrapper>4PB, PB<game>, =QONK=, PetriDish, Movie2Image, PictureManager,...
Re: Search for duplicate files
Update
Added filter/mask for files.
The Windows version does not show checkboxes for groups.