HttpRequest with gzip
Posted: Fri Jan 10, 2025 11:48 am
by Rinzwind
So I can speed things up by using
headers("Accept-Encoding") = "gzip"
However, how do I uncompress the received data from HttpRequest?
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 11:59 am
by NicTheQuick
Isn't `HTTPRequestMemory()` doing that for you already?
Do you have example code that shows the issue?
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 12:08 pm
by Rinzwind
No, it doesn't. You get gzip binary data back. By default, HTTPRequest()/HTTPRequestMemory() does not ask for gzip-compressed data. I can't seem to handle the gzipped HTTP result with PB, which is a shame. It is quite common by now.
"gzip
A format using the Lempel-Ziv coding (LZ77), with a 32-bit CRC. This is the original format of the UNIX gzip program. The HTTP/1.1 standard also recommends that the servers supporting this content-encoding should recognize x-gzip as an alias, for compatibility purposes.
"
BriefLZ: "BriefLZ - small fast Lempel-Ziv"
I hoped that would be compatible, but it doesn't seem to be.
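One way to confirm that a response body really is gzip data, rather than an uncompressed fallback, is to look at its first bytes. The following is a minimal sketch, assuming the standard gzip signature ($1F $8B) followed by the method byte 8 (deflate). LooksLikeGzip is a hypothetical helper name, not part of PureBasic or of the code posted in this thread.

```purebasic
EnableExplicit

; Hypothetical helper: #True when the buffer starts with the gzip
; signature bytes $1F $8B and compression method 8 (deflate).
Procedure.i LooksLikeGzip(*buf, size.i)
  If *buf = 0 Or size < 3
    ProcedureReturn #False
  EndIf
  If PeekA(*buf) = $1F And PeekA(*buf + 1) = $8B And PeekA(*buf + 2) = 8
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

; Sample bytes for illustration only: the start of a gzip stream
; and the start of a plain JSON text ([{"a).
DataSection
  GzipStart:
  Data.a $1F, $8B, $08, $00
  JsonStart:
  Data.a $5B, $7B, $22, $61
EndDataSection

Debug LooksLikeGzip(?GzipStart, 4) ; 1
Debug LooksLikeGzip(?JsonStart, 4) ; 0
```

With a real request, the same check could be applied to the pointer returned by HTTPMemory() together with MemorySize() of that buffer.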
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 12:23 pm
by Fred
Did you try to uncompress the memory with the zip plugin?
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 2:28 pm
by Rinzwind
Doesn't work.
Raw test code:
Code:
EnableExplicit
UseBriefLZPacker()
UseZipPacker()
UseLZMAPacker()
Define t1 = ElapsedMilliseconds()
Define url.s = "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json"
Define NewMap headers.s()
headers("Accept-Encoding") = "gzip"
Debug url
Define req = HTTPRequestMemory(#PB_HTTP_Get, url, 0, 0, 0, headers())
Debug "Timer: " + Str(ElapsedMilliseconds() - t1)
If req
  Define stat.s = HTTPInfo(req, #PB_HTTP_StatusCode)
  Define res.s = HTTPInfo(req, #PB_HTTP_Response)
  Debug Left(res, 100) + "..."
  Define *buf = HTTPMemory(req)
  ; CreateFile(0, "C:\temp\json.gzip")
  ; WriteData(0, *buf, MemorySize(*buf))
  ; CloseFile(0)
  Define res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
  Debug Left(res2, 100) + "..."
  Define *buf2 = AllocateMemory(MemorySize(*buf) * 4)
  ;ShowMemoryViewer(*buf, MemorySize(*buf))
  Debug UncompressMemory(*buf, MemorySize(*buf), *buf2, MemorySize(*buf2), #PB_PackerPlugin_Zip)
  res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
  Debug Left(res2, 100) + "..."
  FreeMemory(*buf)
  FreeMemory(*buf2)
  FinishHTTP(req)
EndIf
PS: Try commenting out the gzip line and see how much slower it can be.
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 4:28 pm
by Fred
Your code is wrong: you used *buf again instead of *buf2 for the second PeekS(). Anyway, it doesn't work because of missing headers for zip, but libcurl supports this natively through a flag, so I guess it could be added:
https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 4:55 pm
by Rinzwind
Fred wrote: Fri Jan 10, 2025 4:28 pm
Your code is wrong: you used *buf again instead of *buf2 for the second PeekS(). Anyway, it doesn't work because of missing headers for zip, but libcurl supports this natively through a flag, so I guess it could be added:
https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
I was already afraid the quick example code would contain an error. But yes, the issue stands. So please make it so; it seems to be a relatively easy improvement.

PS: 7-Zip can open the binary file when it is saved to disk; it is shown as an archive with one file. I could not read it from the file with PB, though (I tried just for fun).
Re: HttpRequest with gzip
Posted: Fri Jan 10, 2025 10:40 pm
by infratec
Example with libcurl:
Code:
EnableExplicit
#LibCurl_ExternalDLL = #True
XIncludeFile "libcurl.pbi"
Define.i curl, res
Define UserData.libcurl_userdata_structure
curl_global_init(#CURL_GLOBAL_DEFAULT)
curl = curl_easy_init()
If curl
  ;curl_easy_setopt_str(curl, #CURLOPT_URL, "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json")
  ;curl_easy_setopt(curl, #CURLOPT_SSL_VERIFYPEER, 0)
  ;curl_easy_setopt(curl, #CURLOPT_SSL_VERIFYHOST, 0)
  curl_easy_setopt_str(curl, #CURLOPT_URL, "http://httpbin.org/gzip")
  curl_easy_setopt_str(curl, #CURLOPT_ACCEPT_ENCODING, "gzip")
  curl_easy_setopt(curl, #CURLOPT_WRITEFUNCTION, @LibCurl_WriteFunction())
  curl_easy_setopt(curl, #CURLOPT_WRITEDATA, @UserData)
  ; to see that the content was sent with gzip (compare Content-Length and the real size)
  curl_easy_setopt(curl, #CURLOPT_HEADER, 1)
  res = curl_easy_perform(curl)
  If res = #CURLE_OK
    If UserData\Memory
      Debug PeekS(UserData\Memory, MemorySize(UserData\Memory), #PB_UTF8|#PB_ByteLength)
      Debug ""
      Debug "unzipped length: " + Str(MemorySize(UserData\Memory))
    EndIf
  Else
    Debug "Error: " + curl_easy_strerror(res)
  EndIf
  curl_easy_cleanup(curl)
EndIf
curl_global_cleanup()
You need the external DLL because the internal libcurl does not include the zip support.
I enabled the header so that you can see that the content was sent as gzip.
Re: HttpRequest with gzip
Posted: Sat Jan 11, 2025 5:14 am
by idle
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Re: HttpRequest with gzip
Posted: Sat Jan 11, 2025 5:24 am
by Rinzwind
idle wrote: Sat Jan 11, 2025 5:14 am
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Because it is about handling multiple, quite large web server REST results, and GitHub only supports gzip (a request for deflate is not honored; the response stays uncompressed). gzip seems to be the common standard anyway.
https://developer.mozilla.org/en-US/doc ... t-Encoding
Re: HttpRequest with gzip
Posted: Sat Jan 11, 2025 6:02 am
by idle
Rinzwind wrote: Sat Jan 11, 2025 5:24 am
idle wrote: Sat Jan 11, 2025 5:14 am
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Because it is about handling a bigger web server REST result, and this one only supports gzip (requesting deflate will not be honored, stays uncompressed) and gzip seems to be the common standard used.
It would be good to have it added natively.
If it's not setting extra data like file name and comment, you can just skip the first 10 bytes and decompress the data minus the last 8 bytes, which are the CRC-32 and the uncompressed length modulo 2^32.
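The layout described above can be sketched as follows. This is only an illustration, assuming the header fields defined in RFC 1952: a fixed 10-byte header, optional extra, name, comment and header-CRC fields selected by the flag byte, and an 8-byte trailer. GzipPayloadOffset and GzipStoredSize are hypothetical helper names. The sketch only locates the raw deflate data and reads the stored length; passing that data to a decompressor is deliberately left out.

```purebasic
EnableExplicit

; gzip header flag bits (RFC 1952)
#GZ_FHCRC    = 2
#GZ_FEXTRA   = 4
#GZ_FNAME    = 8
#GZ_FCOMMENT = 16

; Hypothetical helper: offset of the raw deflate data inside a gzip
; member, or -1 if the buffer does not look like valid gzip.
Procedure.i GzipPayloadOffset(*buf, size.i)
  Protected flags.a, pos.i = 10          ; the fixed header is 10 bytes
  If *buf = 0 Or size < 18               ; 10-byte header + 8-byte trailer
    ProcedureReturn -1
  EndIf
  If PeekA(*buf) <> $1F Or PeekA(*buf + 1) <> $8B Or PeekA(*buf + 2) <> 8
    ProcedureReturn -1
  EndIf
  flags = PeekA(*buf + 3)
  If flags & #GZ_FEXTRA                  ; 2-byte little-endian length, then the data
    pos = pos + 2 + PeekU(*buf + pos)
  EndIf
  If flags & #GZ_FNAME                   ; zero-terminated file name
    While pos < size And PeekA(*buf + pos) <> 0
      pos + 1
    Wend
    pos + 1
  EndIf
  If flags & #GZ_FCOMMENT                ; zero-terminated comment
    While pos < size And PeekA(*buf + pos) <> 0
      pos + 1
    Wend
    pos + 1
  EndIf
  If flags & #GZ_FHCRC                   ; 2-byte header CRC
    pos + 2
  EndIf
  If pos > size - 8                      ; no room left for the trailer
    ProcedureReturn -1
  EndIf
  ProcedureReturn pos
EndProcedure

; Uncompressed length modulo 2^32, stored little-endian in the last 4 bytes.
Procedure.q GzipStoredSize(*buf, size.i)
  ProcedureReturn PeekL(*buf + size - 4) & $FFFFFFFF
EndProcedure

; Sample data for illustration: the 20-byte gzip member for empty input.
DataSection
  EmptyGzip:
  Data.a $1F, $8B, $08, $00, $00, $00, $00, $00, $00, $03 ; header, no flags set
  Data.a $03, $00                                         ; deflate data
  Data.a $00, $00, $00, $00, $00, $00, $00, $00           ; CRC-32, length
EndDataSection

Define off = GzipPayloadOffset(?EmptyGzip, 20)
Debug off                             ; 10
Debug 20 - off - 8                    ; 2 (length of the deflate data)
Debug GzipStoredSize(?EmptyGzip, 20)  ; 0
```

When no flag bits are set, the offset is always 10, which matches the shortcut of skipping the first 10 bytes and dropping the last 8.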
Re: HttpRequest with gzip
Posted: Mon Jan 13, 2025 10:37 pm
by Sergey
Hi Rinzwind,
I adapted Windows code from this forum to your needs;
just test it and adjust the lines as you like.
Please wait for it to finish: on my PC it took about 18 sec. with a buffer of 1024 bytes.
A buffer of 1024 * 1024 bytes (1 MB) took 0.1 sec.
No PB packer code such as ZIP or LZMA is needed.
Code:
EnableExplicit
#Z_BUFFER_SIZE = 1024 * 1024 ;- buffer size
#ZLIB_VERSION = "1.2.8"
#Z_OK = 0
#Z_STREAM_END = 1
#Z_FULL_FLUSH = 3
#Z_FINISH = 4
#ENABLE_GZIP = 16
Structure z_stream Align #PB_Structure_AlignC
  *next_in.BYTE
  avail_in.l
  total_in.l ;uLong
  *next_out.BYTE
  avail_out.l
  total_out.l ;uLong
  *msg.BYTE
  *state
  zalloc.i
  zfree.i
  opaque.i
  data_type.l
  adler.l ;uLong
  reserved.l ;uLong
  ;without this, the inflateInit2() fails with a version error
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    alignment.l
  CompilerEndIf
EndStructure
ImportC "zlib.lib"
  inflateInit2_.l(*stream.z_stream, windowBits.l, *version, streamsize.l)
  inflate.l(*stream.z_stream, flush.l)
  inflateEnd.l(*stream.z_stream)
EndImport
Procedure ungzip(*buf)
  Protected gzip_strm.z_stream, gzip_opaque.l, *gzip_buffer, *gzip_out, gzip_result.l, gzip_unpacked_size.l
  Protected buf_memory_size = MemorySize(*buf)
  Protected *buf2
  If buf_memory_size > 0
    Debug "gzip_packed_size = " + Str(buf_memory_size)
    gzip_strm\next_in = *buf
    gzip_strm\avail_in = buf_memory_size
    gzip_strm\opaque = @gzip_opaque
    ; 15 = maximum window size; adding 16 makes zlib expect a gzip header and trailer
    inflateInit2_(gzip_strm, 15 | #ENABLE_GZIP, #ZLIB_VERSION, SizeOf(z_stream))
    *gzip_buffer = AllocateMemory(#Z_BUFFER_SIZE)
    *gzip_out = AllocateMemory(#Z_BUFFER_SIZE)
    If *gzip_buffer And *gzip_out
      Repeat
        gzip_strm\next_out = *gzip_buffer
        gzip_strm\avail_out = #Z_BUFFER_SIZE
        gzip_result = inflate(gzip_strm, #Z_FULL_FLUSH)
        If gzip_result = #Z_OK Or gzip_result = #Z_STREAM_END Or gzip_strm\avail_in = 0
          ; append the chunk to the end of the growing output buffer
          CopyMemory(*gzip_buffer, *gzip_out + MemorySize(*gzip_out) - #Z_BUFFER_SIZE, #Z_BUFFER_SIZE)
          If gzip_result = #Z_STREAM_END Or gzip_strm\avail_in = 0
            Break
          Else
            *gzip_out = ReAllocateMemory(*gzip_out, MemorySize(*gzip_out) + #Z_BUFFER_SIZE)
          EndIf
        Else
          If gzip_strm\msg
            ; Length -1 reads up to the terminating null; the format is the third parameter
            Debug PeekS(gzip_strm\msg, -1, #PB_UTF8)
          Else
            Debug "gzip_result = " + Str(gzip_result)
          EndIf
          Break
        EndIf
      ForEver
      gzip_unpacked_size = gzip_strm\total_out
      If gzip_unpacked_size > 0
        Debug "gzip_unpacked_size = " + Str(gzip_unpacked_size)
        Debug "compression ratio: " + StrF(buf_memory_size * 100 / gzip_unpacked_size, 1) + "%"
        *buf2 = AllocateMemory(gzip_unpacked_size)
        CopyMemory(*gzip_out, *buf2, gzip_unpacked_size)
      EndIf
      buf_memory_size = gzip_unpacked_size
      FreeMemory(*gzip_out)
      FreeMemory(*gzip_buffer)
      inflateEnd(gzip_strm)
    EndIf
  Else
    Debug "buf_memory_size = 0"
  EndIf
  ProcedureReturn *buf2
EndProcedure
Define t1 = ElapsedMilliseconds()
Define url.s = "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json"
Define NewMap headers.s()
headers("Accept-Encoding") = "gzip"
Debug url
Define req = HTTPRequestMemory(#PB_HTTP_Get, url, 0, 0, 0, headers())
Debug "HTTPRequest Timer: " + Str(ElapsedMilliseconds() - t1)
If req
  Define stat.s = HTTPInfo(req, #PB_HTTP_StatusCode)
  Define res.s = HTTPInfo(req, #PB_HTTP_Response)
  Debug Left(res, 100) + "..."
  Define *buf = HTTPMemory(req)
  FinishHTTP(req)
  If *buf
    Define res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
    Debug Left(res2, 100) + "..."
    Define t2 = ElapsedMilliseconds()
    Define *buf2 = ungzip(*buf)
    Debug "UnGZIP Timer: " + Str(ElapsedMilliseconds() - t2)
    If *buf2
      res2.s = PeekS(*buf2, MemorySize(*buf2), #PB_UTF8 | #PB_ByteLength)
      Debug Left(res2, 100) + "..."
      FreeMemory(*buf2)
    EndIf
    FreeMemory(*buf)
  EndIf
EndIf
Re: HttpRequest with gzip
Posted: Tue Jan 14, 2025 6:58 am
by Rinzwind
Sergey wrote: Mon Jan 13, 2025 10:37 pm
Thanks for the workaround. It seems to work.
Re: HttpRequest with gzip
Posted: Tue Jan 14, 2025 10:08 am
by Fred
It's always amazing to see the alternatives you all come up with to work around PB limitations!
Re: HttpRequest with gzip
Posted: Mon Feb 24, 2025 11:07 pm
by matalog
idle wrote: Sat Jan 11, 2025 5:14 am
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Idle, can I use what you are describing to ungzip a file using PB's own capabilities, without requiring zlib?