HttpRequest with gzip
So I can speed things up by using
headers("Accept-Encoding") = "gzip"
However, how do I uncompress the received data from HttpRequest?
- NicTheQuick
Re: HttpRequest with gzip
Isn't `HTTPRequestMemory()` doing that for you already?
Do you have an example code that shows the issue?
Re: HttpRequest with gzip
No, it doesn't. You get gzip binary data back. By default, HTTPRequest()/HTTPRequestMemory() does not ask for gzip-compressed content. I can't seem to handle the gzip'ed HTTP result with PB, which is a shame. It is quite common by now.
"gzip
A format using the Lempel-Ziv coding (LZ77), with a 32-bit CRC. This is the original format of the UNIX gzip program. The HTTP/1.1 standard also recommends that the servers supporting this content-encoding should recognize x-gzip as an alias, for compatibility purposes.
"
BriefLZ: "BriefLZ - small fast Lempel-Ziv"
I hoped that would be compatible... it doesn't seem to be however.
Re: HttpRequest with gzip
Did you try to uncompress the memory with the zip plugin?
Re: HttpRequest with gzip
Doesn't work.
PS: Try commenting out the gzip line and experience how much slower it can be.
Raw test code:
Code: Select all
EnableExplicit
UseBriefLZPacker()
UseZipPacker()
UseLZMAPacker()

Define t1 = ElapsedMilliseconds()
Define url.s = "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json"
Define NewMap headers.s()
headers("Accept-Encoding") = "gzip"
Debug url
Define req = HTTPRequestMemory(#PB_HTTP_Get, url, 0, 0, 0, headers())
Debug "Timer: " + Str(ElapsedMilliseconds() - t1)
If req
  Define stat.s = HTTPInfo(req, #PB_HTTP_StatusCode)
  Define res.s = HTTPInfo(req, #PB_HTTP_Response)
  Debug Left(res, 100) + "..."
  Define *buf = HTTPMemory(req)
  ; CreateFile(0, "C:\temp\json.gzip")
  ; WriteData(0, *buf, MemorySize(*buf))
  ; CloseFile(0)
  Define res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
  Debug Left(res2, 100) + "..."
  Define *buf2 = AllocateMemory(MemorySize(*buf) * 4)
  ;ShowMemoryViewer(*buf, MemorySize(*buf))
  Debug UncompressMemory(*buf, MemorySize(*buf), *buf2, MemorySize(*buf2), #PB_PackerPlugin_Zip)
  res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
  Debug Left(res2, 100) + "..."
  FreeMemory(*buf)
  FreeMemory(*buf2)
  FinishHTTP(req)
EndIf
Re: HttpRequest with gzip
Your code is wrong: you used *buf again instead of *buf2 for the second PeekS(). Anyway, it doesn't work because of missing headers for zip, but libcurl supports this natively through a flag, so I guess it could be added:
https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
Re: HttpRequest with gzip
Fred wrote: Fri Jan 10, 2025 4:28 pm
Your code is wrong: you used *buf again instead of *buf2 for the second PeekS(). Anyway, it doesn't work because of missing headers for zip, but libcurl supports this natively through a flag, so I guess it could be added:
https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html

I was already afraid the quick example code would contain an error. But yes, the issue stands. So please make it so; it's a relatively easy improvement.
Ps. 7zip can open the binary file when saved to disk. It is shown as an archive with one file. Could not read it from file with PB. Tried just for fun.
Re: HttpRequest with gzip
Example with libcurl:
You need the external DLL, because the internal libcurl does not include the zip stuff.
I enabled the header so that you can see that the content was sent as gzip.
Code: Select all
EnableExplicit
#LibCurl_ExternalDLL = #True
XIncludeFile "libcurl.pbi"

Define.i curl, res
Define UserData.libcurl_userdata_structure

curl_global_init(#CURL_GLOBAL_DEFAULT)
curl = curl_easy_init()
If curl
  ;curl_easy_setopt_str(curl, #CURLOPT_URL, "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json")
  ;curl_easy_setopt(curl, #CURLOPT_SSL_VERIFYPEER, 0)
  ;curl_easy_setopt(curl, #CURLOPT_SSL_VERIFYHOST, 0)
  curl_easy_setopt_str(curl, #CURLOPT_URL, "http://httpbin.org/gzip")
  curl_easy_setopt_str(curl, #CURLOPT_ACCEPT_ENCODING, "gzip")
  curl_easy_setopt(curl, #CURLOPT_WRITEFUNCTION, @LibCurl_WriteFunction())
  curl_easy_setopt(curl, #CURLOPT_WRITEDATA, @UserData)
  ; to see that the content was sent with gzip (compare Content-Length and the real size)
  curl_easy_setopt(curl, #CURLOPT_HEADER, 1)
  res = curl_easy_perform(curl)
  If res = #CURLE_OK
    If UserData\Memory
      Debug PeekS(UserData\Memory, MemorySize(UserData\Memory), #PB_UTF8|#PB_ByteLength)
      Debug ""
      Debug "unzipped length: " + Str(MemorySize(UserData\Memory))
    EndIf
  Else
    Debug "Error: " + curl_easy_strerror(res)
  EndIf
  curl_easy_cleanup(curl)
EndIf
curl_global_cleanup()
Re: HttpRequest with gzip
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Re: HttpRequest with gzip
idle wrote: Sat Jan 11, 2025 5:14 am
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)

Because it is about handling multiple quite large webserver REST results, and github only supports gzip (requesting deflate will not be honored, stays uncompressed) and gzip seems to be the common standard used anyway.
https://developer.mozilla.org/en-US/doc ... t-Encoding
Re: HttpRequest with gzip
Rinzwind wrote: Sat Jan 11, 2025 5:24 am
idle wrote: Sat Jan 11, 2025 5:14 am
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Because it is about handling a bigger webserver REST result, and this one only supports gzip (requesting deflate will not be honored, stays uncompressed) and gzip seems to be the common standard used.

It would be good to have it added natively.
If it's not setting extra data like file name and comment, you can just skip the first 10 bytes and decompress the data minus the last 8 bytes, which hold the CRC and the uncompressed length modulo 2^32.
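To make the layout described above a bit more concrete, here is a minimal sketch that only locates the raw deflate data inside a gzip buffer and reads the size stored in the trailer. The field layout and flag bits follow RFC 1952. The helper name GzipPayloadOffset() and the sample bytes (a 20-byte gzip stream of an empty input) are made up for this illustration and are not part of PB. It does not decompress anything: whether UncompressMemory() accepts the raw deflate data found this way is not established in this thread.

```purebasic
EnableExplicit

; gzip FLG bits (RFC 1952)
#GZ_FHCRC    = 2   ; 2-byte header CRC present
#GZ_FEXTRA   = 4   ; extra field present
#GZ_FNAME    = 8   ; zero-terminated file name present
#GZ_FCOMMENT = 16  ; zero-terminated comment present

; Hypothetical helper (not a PB function): returns the offset of the raw
; deflate data inside a gzip member, or -1 if the buffer is not valid gzip.
Procedure.i GzipPayloadOffset(*buf, size.i)
  Protected flags.i, pos.i = 10               ; the fixed header is 10 bytes

  If *buf = 0 Or size < 18                    ; 10-byte header + 8-byte trailer
    ProcedureReturn -1
  EndIf
  If PeekA(*buf) <> $1F Or PeekA(*buf + 1) <> $8B Or PeekA(*buf + 2) <> 8
    ProcedureReturn -1                        ; wrong magic bytes or not deflate
  EndIf

  flags = PeekA(*buf + 3)

  If flags & #GZ_FEXTRA                       ; 2-byte length, then that many bytes
    pos = pos + 2 + PeekU(*buf + pos)
  EndIf
  If flags & #GZ_FNAME                        ; skip the file name
    While pos < size And PeekA(*buf + pos) <> 0 : pos + 1 : Wend
    pos + 1                                   ; skip the terminating zero
  EndIf
  If flags & #GZ_FCOMMENT                     ; skip the comment
    While pos < size And PeekA(*buf + pos) <> 0 : pos + 1 : Wend
    pos + 1
  EndIf
  If flags & #GZ_FHCRC                        ; skip the header CRC16
    pos + 2
  EndIf

  If pos > size - 8                           ; no room left for the trailer
    ProcedureReturn -1
  EndIf
  ProcedureReturn pos
EndProcedure

Define size.i = 20
Define offset.i = GzipPayloadOffset(?SampleGzip, size)
If offset >= 0
  Debug "deflate data starts at offset " + Str(offset)
  Debug "deflate data length: " + Str(size - offset - 8)
  ; the last 4 bytes hold the uncompressed length modulo 2^32 (little endian)
  Debug "stored unpacked size: " + Str(PeekL(?SampleGzip + size - 4) & $FFFFFFFF)
Else
  Debug "not a gzip buffer"
EndIf

DataSection
  SampleGzip:   ; sample only: gzip stream of an empty input
  Data.a $1F, $8B, $08, $00, $00, $00, $00, $00, $00, $03   ; fixed header
  Data.a $03, $00                                            ; deflate data
  Data.a $00, $00, $00, $00, $00, $00, $00, $00             ; CRC32 + size
EndDataSection
```

With a real response you would pass the pointer returned by HTTPMemory() and its MemorySize() instead of the sample bytes. The stored size is handy for allocating the output buffer up front, as long as the unpacked data is smaller than 4 GB.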
Re: HttpRequest with gzip
Hi, Rinzwind
I adapted Windows code from this forum to your needs; just test it and adjust the lines as you like.
Please wait for it to finish: on my PC it took about 18 sec. with a buffer of 1024 bytes.
With a buffer of 1024 * 1024 bytes (1 MB) it took 0.1 sec.
There is no need for any PB packer code like ZIP or LZMA.
Code: Select all
EnableExplicit

#Z_BUFFER_SIZE = 1024 * 1024 ;- buffer size
#ZLIB_VERSION = "1.2.8"
#Z_OK = 0
#Z_STREAM_END = 1
#Z_FULL_FLUSH = 3
#Z_FINISH = 4
#ENABLE_GZIP = 16 ; added to windowBits so that inflate expects a gzip header

Structure z_stream Align #PB_Structure_AlignC
  *next_in.BYTE
  avail_in.l
  total_in.l ;uLong
  *next_out.BYTE
  avail_out.l
  total_out.l ;uLong
  *msg.BYTE
  *state
  zalloc.i
  zfree.i
  opaque.i
  data_type.l
  adler.l ;uLong
  reserved.l ;uLong
  ;without this, the inflateInit2() fails with a version error
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
    alignment.l
  CompilerEndIf
EndStructure

ImportC "zlib.lib"
  inflateInit2_.l(*stream.z_stream, windowBits.l, *version, streamsize.l)
  inflate.l(*stream.z_stream, flush.l)
  inflateEnd.l(*stream.z_stream)
EndImport

; Unpacks the gzip data in *buf and returns a new buffer, or 0 on failure.
; The caller has to free the returned buffer with FreeMemory().
Procedure ungzip(*buf)
  Protected gzip_strm.z_stream, gzip_opaque.l, *gzip_buffer, *gzip_out, gzip_result.l, gzip_unpacked_size.l
  Protected buf_memory_size = MemorySize(*buf)
  Protected *buf2
  If buf_memory_size > 0
    Debug "gzip_packed_size = " + Str(buf_memory_size)
    gzip_strm.z_stream
    gzip_strm\next_in = *buf
    gzip_strm\avail_in = buf_memory_size
    gzip_strm\opaque = @gzip_opaque
    inflateInit2_(gzip_strm, 15 | #ENABLE_GZIP, #ZLIB_VERSION, SizeOf(z_stream))
    *gzip_buffer = AllocateMemory(#Z_BUFFER_SIZE)
    *gzip_out = AllocateMemory(#Z_BUFFER_SIZE)
    If *gzip_buffer And *gzip_out
      Repeat
        gzip_strm\next_out = *gzip_buffer
        gzip_strm\avail_out = #Z_BUFFER_SIZE
        gzip_result = inflate(gzip_strm, #Z_FULL_FLUSH)
        If gzip_result = #Z_OK Or gzip_result = #Z_STREAM_END Or gzip_strm\avail_in = 0
          ; append the chunk to the end of the output buffer
          CopyMemory(*gzip_buffer, *gzip_out + MemorySize(*gzip_out) - #Z_BUFFER_SIZE, #Z_BUFFER_SIZE)
          If gzip_result = #Z_STREAM_END Or gzip_strm\avail_in = 0
            Break
          Else
            ; more data to come: grow the output buffer by one chunk
            *gzip_out = ReAllocateMemory(*gzip_out, MemorySize(*gzip_out) + #Z_BUFFER_SIZE)
          EndIf
        Else
          If gzip_strm\msg
            ; PeekS() takes the length as second parameter, -1 reads up to the null terminator
            Debug PeekS(gzip_strm\msg, -1, #PB_UTF8)
          Else
            Debug "gzip_result = " + Str(gzip_result)
          EndIf
          Break
        EndIf
      ForEver
      gzip_unpacked_size = gzip_strm\total_out
      If gzip_unpacked_size > 0
        Debug "gzip_unpacked_size = " + Str(gzip_unpacked_size)
        Debug "compression ratio: " + StrF(buf_memory_size * 100 / gzip_unpacked_size, 1) + "%"
        *buf2 = AllocateMemory(gzip_unpacked_size)
        CopyMemory(*gzip_out, *buf2, gzip_unpacked_size)
      EndIf
      buf_memory_size = gzip_unpacked_size
      FreeMemory(*gzip_out)
      FreeMemory(*gzip_buffer)
      inflateEnd(gzip_strm)
    EndIf
  Else
    Debug "buf_memory_size = 0"
  EndIf
  ProcedureReturn *buf2
EndProcedure

Define t1 = ElapsedMilliseconds()
Define url.s = "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json"
Define NewMap headers.s()
headers("Accept-Encoding") = "gzip"
Debug url
Define req = HTTPRequestMemory(#PB_HTTP_Get, url, 0, 0, 0, headers())
Debug "HTTPRequest Timer: " + Str(ElapsedMilliseconds() - t1)
If req
  Define stat.s = HTTPInfo(req, #PB_HTTP_StatusCode)
  Define res.s = HTTPInfo(req, #PB_HTTP_Response)
  Debug Left(res, 100) + "..."
  Define *buf = HTTPMemory(req)
  FinishHTTP(req)
  If *buf
    Define res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
    Debug Left(res2, 100) + "..."
    Define t2 = ElapsedMilliseconds()
    Define *buf2 = ungzip(*buf)
    Debug "UnGZIP Timer: " + Str(ElapsedMilliseconds() - t2)
    If *buf2
      res2.s = PeekS(*buf2, MemorySize(*buf2), #PB_UTF8 | #PB_ByteLength)
      Debug Left(res2, 100) + "..."
      FreeMemory(*buf2)
    EndIf
    FreeMemory(*buf)
  EndIf
EndIf
Re: HttpRequest with gzip
Thanks for a workaround. Seems to work.
Re: HttpRequest with gzip
It's always amazing to see the alternatives you all come up with to work around PB limitations!
Re: HttpRequest with gzip
idle wrote: Sat Jan 11, 2025 5:14 am
gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)

Idle, can I use what you are describing to ungzip a file using PB's own capabilities, without requiring zlib?