PB v6.11/6.12b2 x64, Windows 7 x64
I created a ZIP archive with PB today. That worked, but I have problems with it.
Directory and file names with umlauts such as "ÜÖÄßüöää" are not restored when unpacking, but are replaced by other characters.
Unpacking with Windows Explorer also works with subdirectories, but 7-Zip v23.01, for example, unpacks all files without subdirectories.
EDIT: Unpacking with 7-Zip now works, I had to use #NPS$ (/) instead of the Windows standard #PS$ (\) in the paths for AddPackFile().
Is this known and normal?
I cannot pass on the ZIP archives in this way. When I create ZIP archives with Windows Explorer, everything works without any problems.
Peter
EDIT 2: Files with umlauts in the ZIP archive cannot be unpacked with PB.
Unpack PB ZIP archive with umlauts and subdirectory
Re: Unpack PB ZIP archive with umlauts and subdirectory
It looks like PB saves the file names in UTF-8 format in the ZIP archive. According to RFC 1952 (the gzip file format specification), file names must be saved in ISO 8859-1 (Latin-1) format. This information comes from the website zlib.net. I assume this lib is used by PB.
Edit: PB apparently uses libarchive v3.6.2, which in turn uses zlib v1.2.12. At least that's what I can read in a compiled PB exe.
Peter
Re: Unpack PB ZIP archive with umlauts and subdirectory
I took a closer look at the ZIP file created with PB. I noticed two things.
1.
The file names are not UTF-8 encoded, but encoded in some other way. I don't know exactly how.
Here is a comparison of the bytes stored in the PB ZIP archive with the output of UTF8("DATEI_Ö_.txt"):
Code:
Original: DATEI_Ö_.txt
D A T E I _ Ö _ . t x t
PB ZIP: DATEI_Ç-_.txt [44 41 54 45 49 5F C7 2D 5F 2E 74 78 74]
UTF8(): DATEI_Ö_.txt [44 41 54 45 49 5F C3 96 5F 2E 74 78 74]
2.
If the file names are encoded in UTF-8, bit 11 must also be set in the 'general purpose bit flag'.
This is not the case with the PB ZIP archive. When I create a ZIP archive with 7-Zip, bit 11 is set and the file names are encoded correctly.
Code:
7z a "T:\Archive.zip" "T:\Archive\*.*" -mcu=on
If I fix both points manually in a PB ZIP archive with a hex editor, Windows Explorer unpacks the archive with the correct file names.
It is best to use compression level 0 (zero), which makes the ZIP archives easier to read in the hex editor.
Peter
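The two checks above can also be done without a hex editor. The following is only a minimal sketch: it assumes the archive starts with a local file header at offset 0 (true for ordinary ZIP files, not for self-extracting ones) and looks at the first entry only. The procedure name InspectFirstZipEntry() is made up for this example; the path is the one from the 7z command line above.

```purebasic
; Sketch: print bit 11 of the 'general purpose bit flag' and the raw
; file name bytes of the FIRST entry in a ZIP archive.
; InspectFirstZipEntry() is a hypothetical helper, not a PB function.
Procedure InspectFirstZipEntry(ZipFile$)
  Protected File, Flags, NameLength, i, Bytes$

  File = ReadFile(#PB_Any, ZipFile$)
  If File = 0
    Debug "Cannot open " + ZipFile$
    ProcedureReturn #False
  EndIf

  ; Local file header signature "PK" 03 04, stored little-endian
  If ReadLong(File) <> $04034B50
    Debug "No local file header at offset 0"
    CloseFile(File)
    ProcedureReturn #False
  EndIf

  FileSeek(File, 6)                    ; offset 6: general purpose bit flag
  Flags = ReadWord(File) & $FFFF

  FileSeek(File, 26)                   ; offset 26: file name length
  NameLength = ReadWord(File) & $FFFF

  FileSeek(File, 30)                   ; offset 30: file name bytes
  For i = 1 To NameLength
    Bytes$ + RSet(Hex(ReadByte(File) & $FF), 2, "0") + " "
  Next

  CloseFile(File)

  Debug "Bit 11 (UTF-8 names) set: " + Str(Bool(Flags & $0800))
  Debug "Name bytes: " + Bytes$
  ProcedureReturn #True
EndProcedure

InspectFirstZipEntry("T:\Archive.zip")
```

One possible reading of the C7 2D bytes, which nothing in this thread confirms: the UTF-8 bytes C3 96 read as Windows-1252 give "Ã–", and in code page 850 "Ã" is C7, while the en dash has no equivalent there and could fall back to a plain hyphen (2D).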
Re: Unpack PB ZIP archive with umlauts and subdirectory
I use this method: https://www.purebasic.fr/english/viewtopic.php?t=83418
I checked BriefLZ, Zip, and Lzma on Linux.
BriefLZ causes the fewest problems.
Zip - I had to cache the PackEntryName(0) result in a variable, because calling it a second time returned an empty string, so the folder's contents did not appear.
Code:
tmp$ = PackEntryName(0)
If UncompressPackFile(0, path + tmp$, tmp$) = -1 And ForceDirectories(GetPathPart(path + tmp$))
  UncompressPackFile(0, path + tmp$, tmp$)
EndIf
Tar - requires caching the file name, as with Zip.
Lzma - When creating an archive, the file names are shown as question marks in diamonds (the replacement character). The archive manager "Engrampa" creates and extracts 7zip archives without any problems.
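For context, here is roughly how the cached name fits into a complete unpack loop. This is a sketch under assumptions, not the code from the linked topic: MakePath() is a hypothetical stand-in for ForceDirectories(), the target folder is arbitrary, and the archive name is borrowed from the test example further down in this thread. It only illustrates the caching; it does not repair the umlaut encoding.

```purebasic
UseZipPacker()

; Hypothetical helper (stand-in for ForceDirectories() from the linked topic):
; creates every missing directory level of Dir$.
Procedure MakePath(Dir$)
  Protected Pos
  Dir$ = ReplaceString(Dir$, "/", #PS$)
  Pos = FindString(Dir$, #PS$)
  While Pos
    CreateDirectory(Left(Dir$, Pos - 1)) ; simply fails if it already exists
    Pos = FindString(Dir$, #PS$, Pos + 1)
  Wend
EndProcedure

; Assumption: unpack into a subfolder of the temp directory
Define Target$ = GetTemporaryDirectory() + "unpacked" + #PS$
Define Entry$, Dest$

If OpenPack(0, "config-archive.zip", #PB_PackerPlugin_Zip)
  If ExaminePack(0)
    While NextPackEntry(0)
      Entry$ = PackEntryName(0)          ; read the name once and reuse it
      Dest$  = Target$ + ReplaceString(Entry$, "/", #PS$)
      MakePath(GetPathPart(Dest$))       ; make sure the folder exists first
      If PackEntryType(0) = #PB_Packer_File
        ; Without a third parameter the current entry is uncompressed
        If UncompressPackFile(0, Dest$) = -1
          Debug "Failed: " + Entry$
        EndIf
      EndIf
    Wend
  EndIf
  ClosePack(0)
EndIf
```

Creating the folder before every entry costs a few extra calls, but it avoids the "try, create the directory, try again" pattern from the snippet above.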
Re: Unpack PB ZIP archive with umlauts and subdirectory
Hello AZJIO,
thanks for your information.
I also tested with the ZIP properties fields for 'version' and 'version made by'. But unpacking a PB ZIP archive with Windows Explorer only worked if the names were encoded in UTF-8 and bit 11 was set.
However, Windows XP cannot unpack these files correctly, as the UTF-8 encoding is not recognized.
A ZIP archive created with 7-Zip without UTF-8 encoding works correctly everywhere under Windows.
Peter
Re: Unpack PB ZIP archive with umlauts and subdirectory
On Linux, however, there may be a problem. In that case you need to specify "/" explicitly.
I made a test example for Windows:
Code:
EnableExplicit

; AZJIO
; https://www.purebasic.fr/english/viewtopic.php?p=566355#p566355

; Collects all files below dir (non-recursive implementation with a stack)
Procedure FileSearch(List Files.s(), dir.s, mask.s = "*", depth = 130)
  Protected Name.s, c
  Protected Dim hDir(depth)
  Protected Dim SearchPath.s(depth)

  If Right(dir, 1) <> #PS$
    dir + #PS$
  EndIf

  SearchPath(c) = dir
  hDir(c) = ExamineDirectory(#PB_Any, dir, mask)
  If Not hDir(c)
    ProcedureReturn
  EndIf

  Repeat
    While NextDirectoryEntry(hDir(c))
      Name = DirectoryEntryName(hDir(c))
      If Name = "." Or Name = ".."
        Continue
      EndIf
      If DirectoryEntryType(hDir(c)) = #PB_DirectoryEntry_Directory
        If c >= depth
          Continue
        EndIf
        ; Descend into the subdirectory
        dir = SearchPath(c)
        c + 1
        SearchPath(c) = dir + Name + #PS$
        hDir(c) = ExamineDirectory(#PB_Any, SearchPath(c), mask)
        If Not hDir(c)
          c - 1
        EndIf
      Else
        If AddElement(Files())
          Files() = SearchPath(c) + Name
        EndIf
      EndIf
    Wend
    FinishDirectory(hDir(c))
    c - 1
  Until c < 0
EndProcedure

UseZipPacker()

Define NewList Files.s()
Define Path$ = "C:\PB\Source\Current\archive\test\"
Define length = Len(Path$) + 1 ; start of the path relative to Path$

FileSearch(Files(), Path$)
; Debug "Count = " + Str(ListSize(Files()))

; Create the archive file
If CreatePack(0, "config-archive.zip", #PB_PackerPlugin_Zip)
  ForEach Files()
    ; Debug Files()
    ; AddPackFile(0, Files(), Mid(Files(), length))
    ; Store the relative path with "/" separators.
    ; No #PB_String_InPlace here: in that mode the return value must not be used.
    AddPackFile(0, Files(), ReplaceString(Mid(Files(), length), "\", "/"))
  Next
  ClosePack(0)
EndIf

Windows:
7zip - ok
BriefLZ - ok
zip - Cyrillic problem in file names
tar - Cyrillic problem in file names