					
				looking for a solution to decode gzip compressed json stream
				Posted: Mon Aug 11, 2014 11:18 pm
				by Glow2k9
				hi,
I'm currently experimenting with server-client stuff. everything works fine, but I'd like to support gzip compressed streams. Is there a solution that works under x86/x64 and if possible without any overhead? I really only need to decompress a string, nothing more. I read somewhere, that the zlib used by pb might be used for that, but I found no working solution.
Can anyone help me here?
			 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Tue Aug 12, 2014 12:59 am
				by idle
				search for gzip in the forums 
you can use uncompress to inflate (decompress) the stream 
 
Code: Select all
CompilerIf #PB_Compiler_OS = #PB_OS_Windows 
  
  ImportC "zlib.lib"
    compress2(*dest,*destlen,*source,sourcelen,level)
    uncompress(*dest,*destlen,*source,sourcelen)
  EndImport 
CompilerElse 
  
  ImportC "-lz"   
    compress2(*dest,*destlen,*source,sourcelen,level)
    uncompress(*dest,*destlen,*source,sourcelen)
  EndImport 
CompilerEndIf 
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Tue Aug 12, 2014 2:08 pm
				by Glow2k9
				hi,
thanks for your answer. however, I'm having trouble getting it to work. here's what I tried so far, but my program crashes at a strange point (outside the procedure that tries to uncompress the data, so it looks like a memory problem):
Code: Select all
cString.s = SOMEDATA ; holds the compressed data
*compressed_data = AllocateMemory(len(cString))
PokeS(*compressed_data, cString)
uncompressed_len = PeekL(*compressed_data)
*uncompressed_data = AllocateMemory(uncompressed_len)
uncompress(*uncompressed_data, @uncompressed_len, *compressed_data, Len(cString))
no luck so far :/ what am I doing wrong?
Edit: the code above is executed in a thread, is there a limitation for zlib?
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 12:41 pm
				by Glow2k9
				anyone? sometimes even the whole debugger crashes, so I can't even check where exactly it crashes..
			 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 1:15 pm
				by Little John
				Glow2k9 wrote: anyone? sometimes even the whole debugger crashes, so I can't even check where exactly it crashes..
Did you try OnErrorCall() and related commands?
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 1:28 pm
				by Danilo
				When using strings in threads, you need to enable the thread-safe compiler option.
Code: Select all
cString.s = SOMEDATA ; holds the compressed data
What is SOMEDATA? Why put SOMEDATA into a string and not use a data buffer?
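A small sketch of why a string is a risky container for compressed data: PureBasic strings end at the first zero byte, and gzip or zlib output usually contains zero bytes. The eight sample bytes are made up for illustration (they only resemble the start of a gzip header).

```purebasic
; Eight bytes of "binary" data; AllocateMemory() zero-fills the block by default.
*raw = AllocateMemory(8)
PokeA(*raw + 0, $1F)
PokeA(*raw + 1, $8B)
PokeA(*raw + 2, $08)
PokeA(*raw + 7, $03)   ; a non-zero byte that sits after several zero bytes

; Reading the block as a string stops at the first zero byte (offset 3).
asString.s = PeekS(*raw, 8, #PB_Ascii)

Debug Len(asString)      ; 3: everything after the first zero byte is lost
Debug MemorySize(*raw)   ; 8: the real size of the data

FreeMemory(*raw)
```

Keeping the server response in a memory buffer together with its byte count, and never passing it through a string, avoids this kind of silent truncation.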
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 1:29 pm
				by Tranquil
				Are you using unicode?
LEN returns the length of characters of a string. That does not mean that this is the length you need in memory. Use StringByteLength() instead.
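To illustrate the difference between character count and byte count, here is a minimal sketch with a made-up sample string. Note also that PokeS() writes a terminating null character, so a buffer sized with Len() alone is at least one byte too small, which may corrupt memory and cause crashes in unrelated places.

```purebasic
text.s = "gzip"

Debug Len(text)                             ; 4 characters
Debug StringByteLength(text, #PB_Ascii)     ; 4 bytes
Debug StringByteLength(text, #PB_Unicode)   ; 8 bytes (2 bytes per character)

; Size the buffer in bytes and leave room for the terminating null
; character that PokeS() appends (2 bytes in the unicode format).
*buf = AllocateMemory(StringByteLength(text, #PB_Unicode) + 2)
If *buf
  PokeS(*buf, text, -1, #PB_Unicode)
  Debug PeekS(*buf, -1, #PB_Unicode)        ; gzip
  FreeMemory(*buf)
EndIf
```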
Anyway, this is how I have done it some days ago. It was just for testing and also includes encryption:
What does it do? It creates a List with two elements, compresses it, encrypts it, and then reverts all of this.
Code: Select all
UseLZMAPacker()
Structure Command
  id.i
  Parameters.s[5]
EndStructure
NewList Command.Command()
AddElement(command())
Command()\id = 1
Command()\Parameters[0] = "A"
Command()\Parameters[1] = "B"
AddElement(command())
Command()\id = 2
Command()\Parameters[0] = "C"
Command()\Parameters[1] = "D"
CreateJSON(0)
InsertJSONList(JSONValue(0),Command())
Debug ComposeJSON(0, #PB_JSON_PrettyPrint)
; Copy jSON Structure to workbuffer
jSONSize.i = ExportJSONSize(0)
*jSONmem = AllocateMemory(jSONSize)
ExportJSON(0,*jSONmem,jSONSize.i)
; Compress Workbuffer
*Compress = AllocateMemory(jSONSize)
res = CompressMemory(*jSONmem,jSONSize.i,*Compress,jSONSize,#PB_PackerPlugin_LZMA)
If res = 0
  Debug "Failed to compress"
Else
  Debug "Compressed to "+Str(res)+" Bytes, original Size: "+Str(jSONSize)
  CompressedSize = Res
EndIf
; Cipher
*Cipher = AllocateMemory(CompressedSize)
res = AESEncoder(*Compress,*Cipher,CompressedSize,?Key,128,?InitializationVector)
 If res = 0
  Debug "Cipher failed"
Else
  Debug "Cipher finished"
EndIf                
;- Way back
; DeCipher: the input is the encrypted buffer, the output goes back into *Compress
res = AESDecoder(*Cipher,*Compress,CompressedSize,?Key,128,?InitializationVector)
 If res = 0
  Debug "DeCipher failed"
Else
  Debug "DeCipher finished"
EndIf  
; Unpack
*unpack = AllocateMemory(jSONSize)
res = UncompressMemory(*Compress,CompressedSize,*unpack,jSONSize,#PB_PackerPlugin_LZMA)
 If res = 0
  Debug "Unpack failed"
Else
  Debug "Unpack finished"
EndIf
res = CatchJSON(2,*unpack,jSONSize)
Debug ComposeJSON(2, #PB_JSON_PrettyPrint)
  DataSection
    Key:
      Data.b $06, $a9, $21, $40, $36, $b8, $a1, $5b, $51, $2e, $03, $d5, $34, $12, $00, $06
  
    InitializationVector:
      Data.b $3d, $af, $ba, $42, $9d, $9e, $b4, $30, $b4, $22, $da, $80, $2c, $9f, $ac, $41
    EndDataSection
    
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 2:22 pm
				by Glow2k9
				hi,
first, thanks for all your comments.
@Little John: I tried that, but nothing shows up :/
@Danilo: I had thread-safe enabled, but I enabled it while the include file was open (the procedure is in a separate file), not while the main file had the focus. After enabling it for the main file as well, the random crashes are gone. SOMEDATA is just the response I get from the server. When I disable gzip support, it's just a plain JSON string, and everything works ok. However, some of the servers I'm querying send a pretty large JSON response, which is why I need gzip compression there.
@Tranquil: It's not unicode. When I enable unicode in the compiler, the response is translated to chinese chars...lol...dunno why ^^
@all: after enabling thread-safe in the main file, the crashes are gone. however, it still does not work. I checked a little further, and *compressed_data definitely holds the compressed JSON stuff (well...I can't tell if it's complete/non-corrupted since it just looks like...well...compressed data). So I think the problem is either:
1. uncompressed_len is wrong (I saw in an example that the first long from the compressed data holds the length of the uncompressed data, but maybe that was wrong?)
2. something goes wrong during the de-compression (how can I check that?)
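One way to check both points is to look at the value uncompress() returns and at the result of every AllocateMemory() call. The sketch below is a round trip on made-up sample text; the helper ZlibResultText() is hypothetical, the numeric codes are the ones defined in zlib.h. The imports declare a .l return type because zlib returns a 32-bit int, and without it a negative code may not be recognised as negative in a 64-bit program.

```purebasic
CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  ImportC "zlib.lib"
    compress2.l(*dest, *destLen, *source, sourceLen, level)
    uncompress.l(*dest, *destLen, *source, sourceLen)
  EndImport
CompilerElse
  ImportC "-lz"
    compress2.l(*dest, *destLen, *source, sourceLen, level)
    uncompress.l(*dest, *destLen, *source, sourceLen)
  EndImport
CompilerEndIf

; Hypothetical helper: translate zlib result codes into readable text.
Procedure.s ZlibResultText(rc)
  Select rc
    Case 0  : ProcedureReturn "Z_OK"
    Case -3 : ProcedureReturn "Z_DATA_ERROR (input corrupted or not in zlib format)"
    Case -4 : ProcedureReturn "Z_MEM_ERROR (not enough memory)"
    Case -5 : ProcedureReturn "Z_BUF_ERROR (output buffer too small)"
  EndSelect
  ProcedureReturn "unexpected code " + Str(rc)
EndProcedure

text.s = "hello hello hello hello hello"
srcLen = StringByteLength(text, #PB_Ascii)
*src = AllocateMemory(srcLen + 1)          ; +1 for the null PokeS() appends
PokeS(*src, text, -1, #PB_Ascii)

; The length variables are .i so they are at least as wide as zlib's uLong.
packedLen.i = srcLen + 64                  ; generous bound for tiny inputs
*packed = AllocateMemory(packedLen)
Debug "compress2:  " + ZlibResultText(compress2(*packed, @packedLen, *src, srcLen, 9))

*out = AllocateMemory(srcLen)
If *out = 0
  Debug "AllocateMemory failed (a negative or huge size has the same effect)"
  End
EndIf

outLen.i = 4                               ; deliberately too small
Debug "too small:  " + ZlibResultText(uncompress(*out, @outLen, *packed, packedLen))

outLen = srcLen                            ; the real capacity of *out
rc = uncompress(*out, @outLen, *packed, packedLen)
Debug "right size: " + ZlibResultText(rc)
If rc = 0
  Debug PeekS(*out, outLen, #PB_Ascii)
EndIf
```

If uncompressed_len is read from the wrong place, either the allocation fails (a null pointer is then passed to uncompress) or zlib reports an error code, so logging both narrows the problem down quickly.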
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 2:57 pm
				by said
				
			 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Wed Aug 13, 2014 3:23 pm
				by Glow2k9
				
Hi,
I already saw this, but I prefer a minimalistic solution. If using zlib from PB really does the job with a few lines of code, that would be perfect for me.
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Thu Aug 14, 2014 3:16 am
				by Glow2k9
				update: I made a little progress. I gz compressed a string and saved it to a gz file using my webserver (using gzcompress which also uses zlib). then I opened the file in PB, used ReadData() to get the content and manually set the uncompressed_len var to the length of the original string. The string was decoded just fine.
So far so good, it looks like my problem is to read the size of the decompressed data from the compressed stream. I read somewhere that the last 4 bytes in a gzip encoded server response are the length of the original, uncompressed data. That would mean that:
Code: Select all
uncompressed_len = PeekL(*compressed_data)
is wrong. So I tried:
Code: Select all
uncompressed_len = PeekL(*compressed_data + len(cString) - 4)
but that gave me a negative value. My test string was only 14 bytes long; the first code resulted in some 1000000 bytes, and the second one, as already mentioned, in a negative value. So how can I solve this? I'm now pretty sure that my problem is the size of the uncompressed data.
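A possible explanation, offered as a sketch rather than a diagnosis: data produced by gzcompress() carries the zlib wrapper (RFC 1950), whose last four bytes are a big-endian Adler-32 checksum, not a length. Only the gzip wrapper (RFC 1952) ends with the uncompressed size. Len(cString) can also be smaller than the real byte count if the data contains zero bytes. The hypothetical helper DetectWrapper() tells the two wrappers apart by their first two bytes.

```purebasic
; Hypothetical helper: guess the wrapper from the first two bytes of a buffer.
Procedure.s DetectWrapper(*buf, size)
  Protected b0, b1, header
  If size < 2
    ProcedureReturn "too short"
  EndIf
  b0 = PeekA(*buf)
  b1 = PeekA(*buf + 1)
  ; gzip streams always start with the magic bytes $1F $8B
  If b0 = $1F And b1 = $8B
    ProcedureReturn "gzip (RFC 1952): last 4 bytes hold the uncompressed size"
  EndIf
  ; zlib streams: low nibble of byte 0 is 8 (deflate) and the
  ; 16-bit big-endian header value is a multiple of 31
  header = (b0 << 8) | b1
  If (b0 & $0F) = 8 And header % 31 = 0
    ProcedureReturn "zlib (RFC 1950): last 4 bytes hold an Adler-32 checksum"
  EndIf
  ProcedureReturn "unknown (raw deflate or not compressed)"
EndProcedure

*z = AllocateMemory(2) : PokeA(*z, $78) : PokeA(*z + 1, $9C)
*g = AllocateMemory(2) : PokeA(*g, $1F) : PokeA(*g + 1, $8B)
Debug DetectWrapper(*z, 2)   ; zlib (RFC 1950): ...
Debug DetectWrapper(*g, 2)   ; gzip (RFC 1952): ...
FreeMemory(*z) : FreeMemory(*g)
```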
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Thu Aug 14, 2014 3:23 am
				by JHPJHP
				Hi Glow2k9,
Awhile ago Thunder93 and myself worked on a project that partly utilized zlib, see if it can be of some help.
NB*: I've kept the package "PureBasic Interface to WinDivert" updated to the latest release, and in working order.
- see the example: wd_inflate.pb as well as the includes
			 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Thu Aug 14, 2014 12:43 pm
				by Glow2k9
				hi,
thanks, but I already checked parts of this code. 
http://www.purebasic.fr/english/viewtop ... C+zlib.lib+ was where I had the code to get the size of the decompressed data from.
However that code is not correct. I finally understood why ^^ there are different ways to encode content, in PHP for example, you can use gzcompress() which seems to be the way webservers send data when gzip compression is requested. Then, there is gzencode() which will generate a .gz compatible stream with all the extra info. using such a file/stream, you can get the uncompressed data size with:
Code: Select all
uncompressed_len = PeekL(*compressed_data + MemorySize(*compressed_data) - 4)
but it looks like the uncompress function imported from zlib in PB can't decode that data, while it can decode data streams encoded with gzcompress(). The missing point was that the server response is not only gzip encoded, but also chunked. So it looks like I need to decode each chunk first and append it to the decompression buffer. This might take a while to figure out, lol ^^
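uncompress() only accepts the zlib wrapper. To decode the gzip wrapper, zlib's streaming API can be used with inflateInit2_() and a windowBits value of 16 + MAX_WBITS. The following is a minimal sketch, not a tested implementation: it assumes Windows (where zlib's uLong is 32 bits, which determines the z_stream layout) and assumes that PureBasic's zlib.lib exports these functions. GunzipMemory() and the hand-built sample stream (a stored block containing "123456789") are made up for illustration.

```purebasic
#Z_OK = 0 : #Z_STREAM_END = 1 : #Z_MEM_ERROR = -4
#Z_NO_FLUSH = 0
#MAX_WBITS = 15

CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
  #Z_STREAM_SIZE = 88   ; sizeof(z_stream) in C on 64-bit Windows
CompilerElse
  #Z_STREAM_SIZE = 56   ; sizeof(z_stream) in C on 32-bit Windows
CompilerEndIf

Structure z_stream      ; Windows layout (assumption: uLong = 32 bits)
  *next_in
  avail_in.l
  total_in.l
  *next_out
  avail_out.l
  total_out.l
  *msg
  *state
  *zalloc
  *zfree
  *opaque
  data_type.l
  adler.l
  reserved.l
EndStructure

ImportC "zlib.lib"
  zlibVersion()
  inflateInit2_.l(*strm, windowBits.l, *version, streamSize.l)
  inflate.l(*strm, flush.l)
  inflateEnd.l(*strm)
EndImport

; Returns a new buffer with the inflated data (caller frees it) or 0 on failure.
Procedure.i GunzipMemory(*src, srcLen, *outLen.Integer)
  Protected *strm.z_stream, *out, *tmp, outCap, rc, produced
  outCap = 4096
  *strm = AllocateMemory(#Z_STREAM_SIZE)   ; zero-filled, as zlib expects
  *out = AllocateMemory(outCap)
  If *strm = 0 Or *out = 0
    If *strm : FreeMemory(*strm) : EndIf
    If *out : FreeMemory(*out) : EndIf
    ProcedureReturn 0
  EndIf
  *strm\next_in = *src
  *strm\avail_in = srcLen
  ; 16 + MAX_WBITS: expect a gzip header and trailer instead of the zlib wrapper
  rc = inflateInit2_(*strm, 16 + #MAX_WBITS, zlibVersion(), #Z_STREAM_SIZE)
  If rc = #Z_OK
    Repeat
      If *strm\total_out = outCap          ; output buffer full: double it
        outCap = outCap * 2
        *tmp = ReAllocateMemory(*out, outCap)
        If *tmp = 0 : rc = #Z_MEM_ERROR : Break : EndIf
        *out = *tmp
      EndIf
      *strm\next_out = *out + *strm\total_out
      *strm\avail_out = outCap - *strm\total_out
      rc = inflate(*strm, #Z_NO_FLUSH)
    Until rc <> #Z_OK
    produced = *strm\total_out
    inflateEnd(*strm)
  EndIf
  FreeMemory(*strm)
  If rc = #Z_STREAM_END
    *outLen\i = produced
    ProcedureReturn *out
  EndIf
  FreeMemory(*out)
  ProcedureReturn 0
EndProcedure

DataSection
  GzSample:
  Data.a $1F, $8B, $08, $00, $00, $00, $00, $00, $00, $FF   ; gzip header
  Data.a $01, $09, $00, $F6, $FF                            ; stored block, 9 bytes
  Data.a $31, $32, $33, $34, $35, $36, $37, $38, $39        ; "123456789"
  Data.a $26, $39, $F4, $CB, $09, $00, $00, $00             ; CRC-32 and size
  GzSampleEnd:
EndDataSection

Define outLen.i
*result = GunzipMemory(?GzSample, ?GzSampleEnd - ?GzSample, @outLen)
If *result
  Debug PeekS(*result, outLen, #PB_Ascii)   ; should show 123456789
  FreeMemory(*result)
Else
  Debug "inflate failed"
EndIf
```

Growing the output buffer inside the loop means the uncompressed size does not have to be known in advance, so the last-four-bytes trick is not needed at all.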
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Thu Aug 14, 2014 3:36 pm
				by JHPJHP
				Hi Glow2k9,
I believe they are all the same functions, just different wrappers...
http://stackoverflow.com/questions/1502 ... e-machines
You can use any of gzcompress, gzdeflate, or gzencode to produce compressed data that can be portably decompressed anywhere. Those functions only differ in the wrapper around the deflate data (RFC 1951). gzcompress has a zlib wrapper (RFC 1950), gzdeflate has no wrapper, and gzencode has a gzip wrapper (RFC 1952).
You may want to try the code from the download I provided (not the old post), and test using a different "flush" parameter.  Where my script presumed that all data could fit into a single buffer (#Z_FINISH), you probably need to incorporate a loop testing for #Z_OK using a different "flush" buffer.
http://www.zlib.net/manual.html
...
If the parameter flush is set to Z_FINISH, pending input is processed, pending output is flushed and deflate returns with Z_STREAM_END if there was enough output space; if deflate returns with Z_OK, this function must be called again with Z_FINISH and more output space (updated avail_out) but no more input data, until it returns with Z_STREAM_END or an error. After deflate has returned Z_STREAM_END, the only possible operations on the stream are deflateReset or deflateEnd. 
...
NB*: The example wd_inflate.pb returns a PHP encoded page with no issue.
 
			
					
				Re: looking for a solution to decode gzip compressed json st
				Posted: Fri Aug 15, 2014 2:37 am
				by Glow2k9
				what download do you mean? I searched for "PureBasic Interface to WinDivert", but found nothing. however, I'm pretty sure the problem is because the data is chunked. As far as I understand, the data should be:
Code: Select all
chunk-len
CRLF
chunked data
CRLF
chunk-len
....
the data my buffer holds does not look like that (i checked it with a hex editor). I'm currently not sure why, I'm using winapi (WinInet) to communicate with the server. When I have Fiddler running, I can clearly see that the data is chunked, and the chunk-len is there. But it might be, that, if I just put everything I get into a buffer, the data somehow gets corrupted (missing CRLF/chunk-len etc.).
I think I will first need to make sure, I get all the data I need, and the buffer is in a valid format before I can continue trying to decompress it 
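For reference, removing the chunk framing described above could look roughly like this. It is a sketch with a hypothetical DechunkMemory() helper and a made-up sample body; it ignores trailers and treats an unparsable size line as the end of the body. As far as I know, WinInet usually strips the chunk framing itself before InternetReadFile() returns data, which may be why the buffer shows no chunk-len lines, but treat that as an assumption to verify.

```purebasic
; Hypothetical helper: copies the chunk payloads from *src to *dst.
; *dst must be at least srcLen bytes. Returns the body size or -1 if malformed.
Procedure.i DechunkMemory(*src, srcLen, *dst)
  Protected pos, outLen, lineEnd, chunkLen, sizeLine.s
  Repeat
    ; find the CRLF that ends the chunk-size line
    lineEnd = pos
    While lineEnd + 1 < srcLen
      If PeekA(*src + lineEnd) = 13 And PeekA(*src + lineEnd + 1) = 10
        Break
      EndIf
      lineEnd + 1
    Wend
    If lineEnd + 1 >= srcLen
      ProcedureReturn -1                       ; no CRLF found
    EndIf
    ; the size is hexadecimal; anything after ";" is a chunk extension
    sizeLine = PeekS(*src + pos, lineEnd - pos, #PB_Ascii)
    chunkLen = Val("$" + Trim(StringField(sizeLine, 1, ";")))
    pos = lineEnd + 2
    If chunkLen = 0
      Break                                    ; last chunk
    EndIf
    If chunkLen < 0 Or pos + chunkLen + 2 > srcLen
      ProcedureReturn -1                       ; truncated chunk
    EndIf
    CopyMemory(*src + pos, *dst + outLen, chunkLen)
    outLen + chunkLen
    pos + chunkLen + 2                         ; skip the data and its CRLF
  ForEver
  ProcedureReturn outLen
EndProcedure

chunked.s = "4" + #CRLF$ + "Wiki" + #CRLF$ + "5" + #CRLF$ + "pedia" + #CRLF$ + "0" + #CRLF$ + #CRLF$
srcLen = StringByteLength(chunked, #PB_Ascii)
*src = AllocateMemory(srcLen + 1)              ; +1 for the null PokeS() appends
PokeS(*src, chunked, -1, #PB_Ascii)
*dst = AllocateMemory(srcLen)

n = DechunkMemory(*src, srcLen, *dst)
If n >= 0
  Debug PeekS(*dst, n, #PB_Ascii)              ; Wikipedia
EndIf
```

The de-chunked bytes form one continuous gzip stream, so decompression should only start after all chunks have been joined (or be fed incrementally through a streaming inflate).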
