HttpRequest with gzip

Rinzwind
Enthusiast
Posts: 679
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

HttpRequest with gzip

Post by Rinzwind »

So I can speed things up by using
headers("Accept-Encoding") = "gzip"

However, how do I uncompress the received data from HttpRequest?
NicTheQuick
Addict
Posts: 1503
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken

Re: HttpRequest with gzip

Post by NicTheQuick »

Isn't `HTTPRequestMemory()` doing that for you already?
Do you have example code that shows the issue?
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
Rinzwind
Enthusiast
Posts: 679
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: HttpRequest with gzip

Post by Rinzwind »

No, it doesn't. You get gzip binary data back. By default, HTTPRequest()/HTTPRequestMemory() does not ask for gzip-compressed data. I can't seem to handle the gzip'ed HTTP result with PB, which is a shame. It is quite common by now.

"gzip

A format using the Lempel-Ziv coding (LZ77), with a 32-bit CRC. This is the original format of the UNIX gzip program. The HTTP/1.1 standard also recommends that the servers supporting this content-encoding should recognize x-gzip as an alias, for compatibility purposes.
"

BriefLZ: "BriefLZ - small fast Lempel-Ziv"

I hoped that would be compatible, but it doesn't seem to be.
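
One way to see why a generic Lempel-Ziv unpacker does not accept this data: a gzip member starts with a fixed 10-byte header defined in RFC 1952 (magic bytes 1F 8B, then a method byte that is 8 for DEFLATE). The sketch below is Python rather than PureBasic, only to illustrate the layout; `describe_gzip_header` is a hypothetical helper name, not part of any library.

```python
import gzip
import struct

def describe_gzip_header(data: bytes) -> dict:
    """Parse the fixed 10-byte gzip member header (RFC 1952)."""
    if len(data) < 10:
        raise ValueError("too short to be a gzip member")
    # ID1, ID2, CM, FLG, MTIME (4 bytes, little-endian), XFL, OS
    id1, id2, method, flags, mtime, xfl, os_id = struct.unpack("<BBBBIBB", data[:10])
    return {
        "is_gzip": (id1, id2) == (0x1F, 0x8B),
        "method": method,  # 8 means DEFLATE
        "flags": flags,    # non-zero means optional fields follow the header
        "mtime": mtime,
        "os": os_id,
    }

# Build a small gzip sample locally instead of fetching one over HTTP.
sample = gzip.compress(b'{"hello": "world"}', mtime=0)
info = describe_gzip_header(sample)
print(info)
```

Checking the first two bytes of the downloaded buffer like this is a cheap way to confirm that the server really answered with gzip before trying to unpack it.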
Fred
Administrator
Posts: 18153
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: HttpRequest with gzip

Post by Fred »

Did you try to uncompress the memory with the zip plugin?
Rinzwind
Enthusiast
Posts: 679
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: HttpRequest with gzip

Post by Rinzwind »

Doesn't work.

Raw test code:

Code:

EnableExplicit

UseBriefLZPacker()
UseZipPacker()
UseLZMAPacker()


Define t1 = ElapsedMilliseconds()
Define url.s = "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json"
Define NewMap headers.s()
headers("Accept-Encoding") = "gzip"
Debug url
Define req = HTTPRequestMemory(#PB_HTTP_Get, url, 0, 0, 0, headers())
Debug "Timer: " + Str(ElapsedMilliseconds() - t1)


If req
  Define stat.s = HTTPInfo(req, #PB_HTTP_StatusCode)
  Define res.s = HTTPInfo(req, #PB_HTTP_Response)
  Debug Left(res, 100) + "..."
  Define *buf = HTTPMemory(req)
  
;   CreateFile(0, "C:\temp\json.gzip")
;   WriteData(0, *buf, MemorySize(*buf))
;   CloseFile(0)


  Define res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
  Debug Left(res2, 100) + "..."
  Define *buf2 = AllocateMemory(MemorySize(*buf) * 4)

  ;ShowMemoryViewer(*buf, MemorySize(*buf))
  Debug UncompressMemory(*buf, MemorySize(*buf), *buf2, MemorySize(*buf2), #PB_PackerPlugin_Zip)
  res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
  Debug Left(res2, 100) + "..."
  
  FreeMemory(*buf)
  FreeMemory(*buf2)
  
  FinishHTTP(req)
EndIf

P.S. Try commenting out the gzip line to see how much slower it can be.
Fred
Administrator
Posts: 18153
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: HttpRequest with gzip

Post by Fred »

Your code is wrong: you used *buf again instead of *buf2 for the second PeekS(). Anyway, it doesn't work because of missing headers for zip, but libcurl supports this natively through a flag, so I guess it could be added:

https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
Rinzwind
Enthusiast
Posts: 679
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: HttpRequest with gzip

Post by Rinzwind »

Fred wrote: Fri Jan 10, 2025 4:28 pm Your code is wrong: you used *buf again instead of *buf2 for the second PeekS(). Anyway, it doesn't work because of missing headers for zip, but libcurl supports this natively through a flag, so I guess it could be added:

https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
I was already afraid the quick example code would contain an error. But yes, the issue stands. So please make it so; it's a relatively easy improvement. 👍🙏

P.S. 7-Zip can open the binary file when it is saved to disk; it is shown as an archive containing one file. I could not read it from the file with PB. Tried just for fun.
infratec
Always Here
Posts: 7576
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: HttpRequest with gzip

Post by infratec »

Example with libcurl:

Code:

EnableExplicit

#LibCurl_ExternalDLL = #True
XIncludeFile "libcurl.pbi"


Define.i curl, res
Define UserData.libcurl_userdata_structure

curl_global_init(#CURL_GLOBAL_DEFAULT)

curl = curl_easy_init();
If curl
  ;curl_easy_setopt_str(curl, #CURLOPT_URL, "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json")
  ;curl_easy_setopt(curl, #CURLOPT_SSL_VERIFYPEER, 0)
  ;curl_easy_setopt(curl, #CURLOPT_SSL_VERIFYHOST, 0)
  
  curl_easy_setopt_str(curl, #CURLOPT_URL, "http://httpbin.org/gzip")
  
  curl_easy_setopt_str(curl, #CURLOPT_ACCEPT_ENCODING, "gzip")
  
  curl_easy_setopt(curl, #CURLOPT_WRITEFUNCTION, @LibCurl_WriteFunction())
  curl_easy_setopt(curl, #CURLOPT_WRITEDATA, @UserData)
  
  ; to see that the content was sent with gzip (compare Content-Length and the real size)
  curl_easy_setopt(curl, #CURLOPT_HEADER, 1)
  
  res = curl_easy_perform(curl)
  If res = #CURLE_OK
    If UserData\Memory
      Debug PeekS(UserData\Memory, MemorySize(UserData\Memory), #PB_UTF8|#PB_ByteLength)
      Debug ""
      Debug "unzipped length: " + Str(MemorySize(UserData\Memory))
    EndIf
  Else
    Debug "Error: " + curl_easy_strerror(res)
  EndIf
  
  curl_easy_cleanup(curl)
EndIf

curl_global_cleanup()
You need the external DLL, because the internal libcurl does not include the zip stuff.

I enabled the header, so that you can see that the content was sent as gzip.
idle
Always Here
Posts: 5835
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: HttpRequest with gzip

Post by idle »

gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Rinzwind
Enthusiast
Posts: 679
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: HttpRequest with gzip

Post by Rinzwind »

idle wrote: Sat Jan 11, 2025 5:14 am gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Because it is about handling multiple, quite large web server REST results, and GitHub only supports gzip (a request for deflate is not honored; the response stays uncompressed), and gzip seems to be the common standard anyway.

https://developer.mozilla.org/en-US/doc ... t-Encoding
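
Since the server decides which encoding it actually applies, the client has to look at the Content-Encoding response header before unpacking. A minimal sketch of that dispatch, in Python for illustration only (`decode_body` is a hypothetical helper; the fallback to raw deflate is an assumption about servers that send "deflate" without the zlib wrapper):

```python
import gzip
import zlib

def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Undo the Content-Encoding the server reported for this response."""
    enc = (content_encoding or "identity").strip().lower()
    if enc in ("gzip", "x-gzip"):  # x-gzip is the alias mentioned in the quote above
        return gzip.decompress(body)
    if enc == "deflate":
        try:
            return zlib.decompress(body)  # zlib-wrapped deflate
        except zlib.error:
            return zlib.decompress(body, -zlib.MAX_WBITS)  # raw deflate
    if enc == "identity":
        return body  # server ignored Accept-Encoding and sent plain data
    raise ValueError("unsupported Content-Encoding: " + enc)

payload = b'{"ok": true}' * 100
assert decode_body(gzip.compress(payload), "gzip") == payload
assert decode_body(payload, "") == payload
```

The "identity" branch matters for the case described above: if the server does not honor the requested encoding, the body arrives uncompressed and must be passed through untouched.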
idle
Always Here
Posts: 5835
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: HttpRequest with gzip

Post by idle »

Rinzwind wrote: Sat Jan 11, 2025 5:24 am
idle wrote: Sat Jan 11, 2025 5:14 am gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Because it is about handling a bigger webserver REST result, and this one only supports gzip (requesting deflate will not be honored, stays uncompressed) and gzip seems to be the common standard used.
It would be good to have it added natively.

If it's not setting extra data like a file name and comment, you can just skip the first 10 bytes and decompress the data minus the last 8 bytes, which hold the CRC and the uncompressed length (len & $ffffffff).
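
The skip-10, drop-8 idea can be sketched as follows. This is Python, shown only to make the byte layout concrete; `gunzip_simple` is a hypothetical name, and whether PB's Zip packer accepts the resulting raw deflate stream is not established in this thread.

```python
import gzip
import struct
import zlib

def gunzip_simple(data: bytes) -> bytes:
    """Decode a gzip member that has no optional header fields (FLG = 0)."""
    if len(data) < 18 or data[:2] != b"\x1f\x8b" or data[2] != 8:
        raise ValueError("not a gzip member using DEFLATE")
    if data[3] != 0:
        # File name, comment, or extra field present: the header is longer
        # than 10 bytes, so the fixed skip does not apply.
        raise ValueError("optional header fields present")
    # Everything between the 10-byte header and the 8-byte trailer is raw deflate.
    raw = zlib.decompress(data[10:-8], -zlib.MAX_WBITS)
    crc, isize = struct.unpack("<II", data[-8:])
    if zlib.crc32(raw) != crc:
        raise ValueError("CRC-32 mismatch")
    if len(raw) & 0xFFFFFFFF != isize:
        raise ValueError("length mismatch")
    return raw

payload = b'{"name": "value"}' * 1000
packed = gzip.compress(payload, mtime=0)
assert gunzip_simple(packed) == payload
```

The negative window-bits value tells zlib to expect a raw deflate stream with no zlib or gzip wrapper, which is exactly what remains after the header and trailer are cut off.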
Sergey
User
Posts: 53
Joined: Wed Jan 12, 2022 2:41 pm

Re: HttpRequest with gzip

Post by Sergey »

Hi, Rinzwind
I adapted some Windows code from this forum to your needs;
just test it and adjust the lines as you like.
Please wait for it to finish: on my PC it took about 18 sec. with a buffer of 1024 bytes.
With a buffer of 1024 * 1024 (1 MB) it took 0.1 sec. 8)
There is no need for any PB packer code like ZIP or LZMA.

Code:

EnableExplicit

#Z_BUFFER_SIZE = 1024 * 1024 ;- buffer size

#ZLIB_VERSION = "1.2.8"

#Z_OK = 0
#Z_STREAM_END = 1
#Z_FULL_FLUSH = 3
#Z_FINISH = 4

#ENABLE_GZIP = 16

Structure z_stream Align #PB_Structure_AlignC
	*next_in.BYTE
	avail_in.l
	total_in.l ;uLong
	
	*next_out.BYTE
	avail_out.l
	total_out.l ;uLong
	
	*msg.BYTE
	*state
	
	zalloc.i
	zfree.i
	opaque.i
	
	data_type.l
	adler.l ;uLong
	reserved.l ;uLong
			   ;without this, the inflateInit2() fails with a version error
	CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
		alignment.l
	CompilerEndIf
EndStructure

ImportC "zlib.lib"
	inflateInit2_.l(*stream.z_stream, windowBits.l, *version, streamsize.l)
	inflate.l(*stream.z_stream, flush.l)
	inflateEnd.l(*stream.z_stream)
EndImport

Procedure ungzip(*buf)	
	Protected gzip_strm.z_stream, gzip_opaque.l, *gzip_buffer, *gzip_out, gzip_result.l, gzip_unpacked_size.l	
	Protected buf_memory_size = MemorySize(*buf)
	Protected *buf2
	
	If buf_memory_size > 0
		Debug "gzip_packed_size = " + Str(buf_memory_size)
		gzip_strm.z_stream
		gzip_strm\next_in = *buf
		gzip_strm\avail_in = buf_memory_size
		gzip_strm\opaque = @gzip_opaque
		
		inflateInit2_(gzip_strm, 15 | #ENABLE_GZIP, #ZLIB_VERSION, SizeOf(z_stream))
		
		*gzip_buffer = AllocateMemory(#Z_BUFFER_SIZE)
		*gzip_out    = AllocateMemory(#Z_BUFFER_SIZE)
		
		If *gzip_buffer And *gzip_out
			Repeat
				gzip_strm\next_out = *gzip_buffer
				gzip_strm\avail_out = #Z_BUFFER_SIZE
				gzip_result = inflate(gzip_strm, #Z_FULL_FLUSH)
				
				If gzip_result = #Z_OK Or gzip_result = #Z_STREAM_END Or gzip_strm\avail_in = 0
					CopyMemory(*gzip_buffer, *gzip_out + MemorySize(*gzip_out) - #Z_BUFFER_SIZE, #Z_BUFFER_SIZE)
					If gzip_result = #Z_STREAM_END Or gzip_strm\avail_in = 0
						Break
					Else
						*gzip_out = ReAllocateMemory(*gzip_out, MemorySize(*gzip_out) + #Z_BUFFER_SIZE)
					EndIf
				Else
					If gzip_strm\msg
						Debug PeekS(gzip_strm\msg, -1, #PB_UTF8) ; second parameter is the length, third is the format
					Else
						Debug "gzip_result = " + Str(gzip_result)
					EndIf
					Break
				EndIf
			ForEver
			
			gzip_unpacked_size = gzip_strm\total_out
			If gzip_unpacked_size > 0
				Debug "gzip_unpacked_size = " + Str(gzip_unpacked_size)
				Debug "compression ratio: " + StrF(buf_memory_size * 100 / gzip_unpacked_size, 1) + "%"
				
				*buf2 = AllocateMemory(gzip_unpacked_size)
				CopyMemory(*gzip_out, *buf2, gzip_unpacked_size)
			EndIf
			
			buf_memory_size = gzip_unpacked_size
			
			FreeMemory(*gzip_out)
			FreeMemory(*gzip_buffer)
			
			inflateEnd(gzip_strm)
		EndIf
		
	Else
		Debug "buf_memory_size = 0"
	EndIf
	
	ProcedureReturn *buf2
EndProcedure

Define t1 = ElapsedMilliseconds()
Define url.s = "https://raw.githubusercontent.com/json-iterator/test-data/refs/heads/master/large-file.json"
Define NewMap headers.s()
headers("Accept-Encoding") = "gzip"
Debug url
Define req = HTTPRequestMemory(#PB_HTTP_Get, url, 0, 0, 0, headers())
Debug "HTTPRequest Timer: " + Str(ElapsedMilliseconds() - t1)
If req
	Define stat.s = HTTPInfo(req, #PB_HTTP_StatusCode)
	Define res.s = HTTPInfo(req, #PB_HTTP_Response)
	Debug Left(res, 100) + "..."
	Define *buf = HTTPMemory(req)
	FinishHTTP(req)	
	
	If *buf
		Define res2.s = PeekS(*buf, MemorySize(*buf), #PB_UTF8 | #PB_ByteLength)
		Debug Left(res2, 100) + "..."
		
		Define t2 = ElapsedMilliseconds()
		Define *buf2 = ungzip(*buf)
		Debug "UnGZIP Timer: " + Str(ElapsedMilliseconds() - t2)
		
		If *buf2
			res2.s = PeekS(*buf2, MemorySize(*buf2), #PB_UTF8 | #PB_ByteLength)
			Debug Left(res2, 100) + "..."
		
			FreeMemory(*buf2)
		EndIf
		
		FreeMemory(*buf)
	EndIf
EndIf
Rinzwind
Enthusiast
Posts: 679
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: HttpRequest with gzip

Post by Rinzwind »

Sergey wrote: Mon Jan 13, 2025 10:37 pm
Thanks for the workaround. It seems to work.
Fred
Administrator
Posts: 18153
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: HttpRequest with gzip

Post by Fred »

It's always amazing to see the alternatives you all come up with to work around PB limitations!
matalog
Enthusiast
Posts: 301
Joined: Tue Sep 05, 2017 10:07 am

Re: HttpRequest with gzip

Post by matalog »

idle wrote: Sat Jan 11, 2025 5:14 am gzip is just deflate with a header and checksum, why not use deflate?
CompressMemory(*input,len,*output,outputlen,#PB_PackerPlugin_Zip)
Idle, can I use what you are describing to ungzip a file using PB's own capabilities, without requiring zlib?