looking for a solution to decode gzip compressed json stream

Glow2k9
New User
Posts: 9
Joined: Mon Aug 11, 2014 10:17 pm

Post by Glow2k9 »

hi,

I'm currently experimenting with server-client stuff. Everything works fine, but I'd like to support gzip-compressed streams. Is there a solution that works on x86/x64 and, if possible, without any overhead? I really only need to decompress a string, nothing more. I read somewhere that the zlib used by PB might be usable for that, but I found no working solution.

Can anyone help me here?
idle
Always Here
Posts: 6026
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Post by idle »

search for gzip in the forums

you can use uncompress to inflate the stream

Code: Select all

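; Import zlib's one-shot functions; both return 0 (Z_OK) on success.
; *destlen points to a variable that holds the buffer size on input
; and receives the number of bytes written on output.
CompilerIf #PB_Compiler_OS = #PB_OS_Windows

  ImportC "zlib.lib"
    compress2(*dest,*destlen,*source,sourcelen,level)
    uncompress(*dest,*destlen,*source,sourcelen)
  EndImport
CompilerElse

  ImportC "-lz"   ; link against the system zlib
    compress2(*dest,*destlen,*source,sourcelen,level)
    uncompress(*dest,*destlen,*source,sourcelen)
  EndImport
CompilerEndIf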
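
A minimal round-trip sketch of how these two imports might be used. It assumes the import links in your PB version, that the payload is plain ASCII, and that 64 bytes of headroom is enough for such a tiny input; the variable names are made up for illustration. Note that uncompress expects the zlib wrapper that compress2 produces.

```purebasic
; Sketch only: the import above is repeated so the block stands alone.
CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  ImportC "zlib.lib"
    compress2(*dest, *destlen, *source, sourcelen, level)
    uncompress(*dest, *destlen, *source, sourcelen)
  EndImport
CompilerElse
  ImportC "-lz"
    compress2(*dest, *destlen, *source, sourcelen, level)
    uncompress(*dest, *destlen, *source, sourcelen)
  EndImport
CompilerEndIf

Text$  = "hypothetical test payload, plain ASCII"
SrcLen = StringByteLength(Text$, #PB_Ascii)
*Src   = AllocateMemory(SrcLen + 1)            ; +1 for the terminating zero
PokeS(*Src, Text$, -1, #PB_Ascii)

PackedLen.i = SrcLen + 64                      ; assumed headroom for a tiny input
*Packed     = AllocateMemory(PackedLen)

If compress2(*Packed, @PackedLen, *Src, SrcLen, 9) = 0   ; 0 = Z_OK
  PlainLen.i = SrcLen                          ; in: buffer size, out: bytes written
  *Plain     = AllocateMemory(PlainLen + 1)    ; zero-filled, so the text stays terminated
  If uncompress(*Plain, @PlainLen, *Packed, PackedLen) = 0
    Debug PeekS(*Plain, -1, #PB_Ascii)
  Else
    Debug "uncompress failed"
  EndIf
Else
  Debug "compress2 failed"
EndIf
```

The length variables are passed by address because zlib updates them with the number of bytes actually written.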

Post by Glow2k9 »

hi,

thanks for your answer. However, I'm having trouble getting it to work. Here's what I tried so far, but my program crashes at a strange point (outside the procedure which tries to uncompress the data, so it appears to be a memory problem):

Code: Select all

cString.s = SOMEDATA ; holds the compressed data

*compressed_data = AllocateMemory(len(cString))
PokeS(*compressed_data, cString)
uncompressed_len = PeekL(*compressed_data)
*uncompressed_data = AllocateMemory(uncompressed_len)

uncompress(*uncompressed_data, @uncompressed_len, *compressed_data, Len(cString))
no luck so far :/ what am I doing wrong?

Edit: the code above is executed in a thread, is there a limitation for zlib?

Post by Glow2k9 »

anyone? sometimes even the whole debugger crashes, so I can't even check where exactly it crashes..
Little John
Addict
Posts: 4802
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Post by Little John »

Glow2k9 wrote: anyone? sometimes even the whole debugger crashes, so I can't even check where exactly it crashes..
Did you try OnErrorCall() and related commands?
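
A rough sketch of what that could look like, assuming the program is run without the debugger and with the "Enable OnError lines support" compiler option (otherwise the file and line information is not available). ErrorHandler is a made-up name, and the invalid write is only there to trigger the handler.

```purebasic
; Sketch: run with the debugger disabled, otherwise the debugger handles the error first.
Procedure ErrorHandler()
  MessageRequester("Crash", ErrorMessage() + " in " + ErrorFile() + ", line " + Str(ErrorLine()))
EndProcedure

OnErrorCall(@ErrorHandler())

PokeL(10, 0) ; deliberately invalid write, only to trigger the handler
```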
Danilo
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Post by Danilo »

When using strings in threads, you need to enable the thread-safe compiler option.
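
One way to catch a missing setting at compile time is a small guard near the top of the threaded include file. This is only a sketch; the message text is made up.

```purebasic
; Stop compilation early when the thread-safe option is off for this build.
CompilerIf #PB_Compiler_Thread = 0
  CompilerError "Enable 'Create threadsafe executable' in the compiler options of the main file"
CompilerEndIf
```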

Code: Select all

cString.s = SOMEDATA ; holds the compressed data
What is SOMEDATA? Why put SOMEDATA into a string and not use a data buffer?
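
To illustrate why a string is a risky container for compressed data, here is a small sketch with a hypothetical 6-byte fragment: binary data may contain zero bytes, and a string ends at the first one.

```purebasic
; Hypothetical 6-byte fragment that starts like a gzip stream (note the zero at offset 3).
*Buf = AllocateMemory(6)
PokeA(*Buf + 0, $1F)
PokeA(*Buf + 1, $8B)
PokeA(*Buf + 2, $08)
PokeA(*Buf + 3, $00)
PokeA(*Buf + 4, $AA)
PokeA(*Buf + 5, $BB)

AsString$ = PeekS(*Buf, 6, #PB_Ascii)   ; reading stops at the zero byte

Debug MemorySize(*Buf)    ; size of the buffer in bytes
Debug Len(AsString$)      ; fewer characters than the buffer has bytes

FreeMemory(*Buf)
```

Keeping the response in a memory buffer and tracking its byte count separately avoids this truncation.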
Tranquil
Addict
Posts: 952
Joined: Mon Apr 28, 2003 2:22 pm
Location: Europe

Post by Tranquil »

Are you using unicode?

Len() returns the number of characters in a string. That is not necessarily the number of bytes the string needs in memory. Use StringByteLength() instead.
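
A short sketch of the difference, using a made-up text. The buffer is sized in bytes, with two extra bytes assumed for the UTF-16 terminator.

```purebasic
Text$ = "hypothetical text"

Debug Len(Text$)                            ; characters
Debug StringByteLength(Text$, #PB_Ascii)    ; bytes when stored as ASCII
Debug StringByteLength(Text$, #PB_Unicode)  ; bytes when stored as UTF-16

; Size the buffer in bytes, plus room for the terminating zero character.
Bytes = StringByteLength(Text$, #PB_Unicode)
*Mem  = AllocateMemory(Bytes + 2)
PokeS(*Mem, Text$, -1, #PB_Unicode)
FreeMemory(*Mem)
```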

Anyway, this is how I did it some days ago. It was just for testing and also includes encryption:

What does it do? It creates a list with two elements, compresses it, encrypts it, and then reverts all of this.

Code: Select all

UseLZMAPacker()

Structure Command
  id.i
  Parameters.s[5]
EndStructure

NewList Command.Command()

AddElement(command())
Command()\id = 1
Command()\Parameters[0] = "A"
Command()\Parameters[1] = "B"


AddElement(command())
Command()\id = 2
Command()\Parameters[0] = "C"
Command()\Parameters[1] = "D"


CreateJSON(0)
InsertJSONList(JSONValue(0),Command())
Debug ComposeJSON(0, #PB_JSON_PrettyPrint)

; Copy jSON Structure to workbuffer
jSONSize.i = ExportJSONSize(0)
*jSONmem = AllocateMemory(jSONSize)
ExportJSON(0,*jSONmem,jSONSize.i)

; Compress Workbuffer
*Compress = AllocateMemory(jSONSize)

res = CompressMemory(*jSONmem,jSONSize.i,*Compress,jSONSize,#PB_PackerPlugin_LZMA)
If res = 0
  Debug "Failed to compress"
Else
  Debug "Compressed to "+Str(res)+" Bytes, original Size: "+Str(jSONSize)
  CompressedSize = Res
EndIf


; Cipher
*Cipher = AllocateMemory(CompressedSize)
res = AESEncoder(*Compress,*Cipher,CompressedSize,?Key,128,?InitializationVector)

 If res = 0
  Debug "Cipher failed"
Else
  Debug "Cipher finished"
EndIf                

;- Way back

; DeCipher: decrypt *Cipher into a separate buffer so the round trip is really exercised
*DeCipher = AllocateMemory(CompressedSize)
res = AESDecoder(*Cipher,*DeCipher,CompressedSize,?Key,128,?InitializationVector)
 If res = 0
  Debug "DeCipher failed"
Else
  Debug "DeCipher finished"
EndIf  

; Unpack the decrypted data
*unpack = AllocateMemory(jSONSize)

res = UncompressMemory(*DeCipher,CompressedSize,*unpack,jSONSize,#PB_PackerPlugin_LZMA)

If res <= 0 ; treat both 0 and -1 as failure
  Debug "Unpack failed"
Else
  Debug "Unpack finished"
EndIf

res = CatchJSON(2,*unpack,jSONSize)

Debug ComposeJSON(2, #PB_JSON_PrettyPrint)





  DataSection
    Key:
      Data.b $06, $a9, $21, $40, $36, $b8, $a1, $5b, $51, $2e, $03, $d5, $34, $12, $00, $06
  
    InitializationVector:
      Data.b $3d, $af, $ba, $42, $9d, $9e, $b4, $30, $b4, $22, $da, $80, $2c, $9f, $ac, $41
    EndDataSection
    

Post by Glow2k9 »

hi,

first, thanks for all your comments.

@Little John: I tried that, but nothing shows up :/

@Danilo: I had thread-safe enabled, but I enabled it while the include file was open (the procedure is in a separate file), not while the main file had the focus. After enabling it for the main file as well, the random crashes are gone. SOMEDATA is just the response I get from the server. When I disable gzip support, it's just a plain JSON string, and everything works OK. However, some of the servers I'm querying send a pretty large JSON response, which is why I need gzip compression there.

@Tranquil: It's not unicode. When I enable unicode in the compiler, the response is translated to Chinese chars...lol...dunno why ^^

@all: after enabling thread-safe in the main file, the crashes are gone. However, it still does not work. I checked a little further, and *compressed_data definitely holds the compressed JSON stuff (well...I can't tell if it's complete/non-corrupted since it just looks like...well...compressed data :D). So I think the problem is either:

1. uncompressed_len is wrong (I saw in an example that the first long from the compressed data holds the length of the uncompressed data, but maybe that was wrong?)
2. something goes wrong during the de-compression (how can I check that?)
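
For the second point, one option is to look at the value uncompress returns. The sketch below maps the documented return codes to text; ZlibResultText is a hypothetical helper, not part of zlib or PB, and the constant values are taken from zlib.h.

```purebasic
; Return codes as defined in zlib.h
#Z_OK         =  0
#Z_DATA_ERROR = -3
#Z_MEM_ERROR  = -4
#Z_BUF_ERROR  = -5

; ZlibResultText() is a hypothetical helper, not part of zlib or PB.
Procedure.s ZlibResultText(Result.l)
  Select Result
    Case #Z_OK
      ProcedureReturn "Z_OK: data was decompressed"
    Case #Z_BUF_ERROR
      ProcedureReturn "Z_BUF_ERROR: destination buffer too small"
    Case #Z_DATA_ERROR
      ProcedureReturn "Z_DATA_ERROR: input corrupted, incomplete or not a zlib stream"
    Case #Z_MEM_ERROR
      ProcedureReturn "Z_MEM_ERROR: not enough memory"
  EndSelect
  ProcedureReturn "unexpected code " + Str(Result)
EndProcedure

Debug ZlibResultText(-5)
```

When calling uncompress, it is probably safest to store its result in a `.l` variable first, since zlib returns a 32-bit int and negative codes could otherwise be misread on x64.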
said
Enthusiast
Posts: 342
Joined: Thu Apr 14, 2011 6:07 pm

Post by said »


Post by Glow2k9 »

Hi,

I already saw this, but I prefer a minimalistic solution. If using zlib from PB really does the job with a few lines of code, that would be perfect for me.

Post by Glow2k9 »

update: I made a little progress. I gz-compressed a string and saved it to a gz file using my webserver (using gzcompress, which also uses zlib). Then I opened the file in PB, used ReadData() to get the content and manually set the uncompressed_len var to the length of the original string. The string was decoded just fine.

So far so good; it looks like my problem is reading the size of the decompressed data from the compressed stream. I read somewhere that the last 4 bytes of a gzip-encoded server response are the length of the original, uncompressed data. That would mean that:

Code: Select all

uncompressed_len = PeekL(*compressed_data)
is wrong. So I tried:

Code: Select all

uncompressed_len = PeekL(*compressed_data + len(cString) - 4)
but that gave me a negative value. My test string was only 14 bytes long, and the first code resulted in some 1000000 bytes, the second one, as already mentioned, in a negative value. So how can I solve this? I'm now pretty sure that my problem is the size of the uncompressed data.
JHPJHP
Addict
Posts: 2266
Joined: Sat Oct 09, 2010 3:47 am

Post by JHPJHP »

Hi Glow2k9,

A while ago Thunder93 and I worked on a project that partly used zlib; see if it can be of some help.

NB*: I've kept the package "PureBasic Interface to WinDivert" updated to the latest release, and in working order.
- see the example: wd_inflate.pb as well as the includes


Post by Glow2k9 »

hi,

thanks, but I already checked parts of this code. http://www.purebasic.fr/english/viewtop ... C+zlib.lib+ was where I got the code to read the size of the decompressed data from.
However, that code is not correct, and I finally understood why ^^ There are different ways to encode content. In PHP, for example, you can use gzcompress(), which seems to be the way webservers send data when gzip compression is requested. Then there is gzencode(), which generates a .gz-compatible stream with all the extra info. Using such a file/stream, you can get the uncompressed data size with:

Code: Select all

uncompressed_len = PeekL(*compressed_data + MemorySize(*compressed_data) - 4)
but it looks like the uncompress function imported from zlib in PB can't decode that data, while it can decode data streams encoded with gzcompress(). The missing point was that the server response is not only gzip-encoded, but also chunked. So it looks like I need to decode each chunk first and append it to the decompression buffer. This might take a while to figure out, lol ^^
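
A sketch of how the two formats might be told apart before decoding, and how the size field could be read without getting a negative number. Only the gzip format (RFC 1952) ends with a 4-byte little-endian size; a zlib stream (RFC 1950) ends with an Adler-32 checksum instead, so reading its last four bytes as a length gives garbage. GuessWrapper and the 6-byte fragment are hypothetical and only exercise the two reads.

```purebasic
; GuessWrapper() is a hypothetical helper that looks at the first two bytes only.
Procedure.s GuessWrapper(*Buf, BufSize)
  Protected b0, b1
  If BufSize < 2 : ProcedureReturn "too short" : EndIf
  b0 = PeekA(*Buf)
  b1 = PeekA(*Buf + 1)
  If b0 = $1F And b1 = $8B
    ProcedureReturn "gzip (RFC 1952)"
  EndIf
  If (b0 & $0F) = 8 And (((b0 << 8) | b1) % 31) = 0
    ProcedureReturn "zlib (RFC 1950)"
  EndIf
  ProcedureReturn "raw deflate or unknown"
EndProcedure

; Hypothetical fragment: gzip magic bytes plus a 4-byte trailer saying "14 bytes".
; It is not a complete stream and only serves to exercise the two reads.
BufSize = 6
*Buf = AllocateMemory(BufSize)
PokeA(*Buf + 0, $1F)
PokeA(*Buf + 1, $8B)
PokeL(*Buf + 2, 14)               ; the size field is stored little-endian

Debug GuessWrapper(*Buf, BufSize)

; PeekL() is signed, so mask the value to read the size as unsigned.
ISize.q = PeekL(*Buf + BufSize - 4) & $FFFFFFFF
Debug ISize

FreeMemory(*Buf)
```

The offset has to be computed from the byte count of the buffer, not from Len() of a string, and the stored size is only the original length modulo 2^32.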

Post by JHPJHP »

Hi Glow2k9,

I believe they are all the same functions, just different wrappers...

http://stackoverflow.com/questions/1502 ... e-machines
You can use any of gzcompress, gzdeflate, or gzencode to produce compressed data that can be portably decompressed anywhere. Those functions only differ in the wrapper around the deflate data (RFC 1951). gzcompress has a zlib wrapper (RFC 1950), gzdeflate has no wrapper, and gzencode has a gzip wrapper (RFC 1952).
You may want to try the code from the download I provided (not the old post), and test using a different "flush" parameter. Where my script presumed that all data could fit into a single buffer (#Z_FINISH), you probably need to incorporate a loop that tests for #Z_OK and uses a different "flush" value.

http://www.zlib.net/manual.html
...
If the parameter flush is set to Z_FINISH, pending input is processed, pending output is flushed and deflate returns with Z_STREAM_END if there was enough output space; if deflate returns with Z_OK, this function must be called again with Z_FINISH and more output space (updated avail_out) but no more input data, until it returns with Z_STREAM_END or an error. After deflate has returned Z_STREAM_END, the only possible operations on the stream are deflateReset or deflateEnd.
...
NB*: The example wd_inflate.pb returns a PHP encoded page with no issue.


Post by Glow2k9 »

What download do you mean? I searched for "PureBasic Interface to WinDivert", but found nothing. However, I'm pretty sure the problem is that the data is chunked. As far as I understand, the data should be:

Code: Select all

chunk-len
CRLF
chunked data
CRLF
chunk-len
....
The data my buffer holds does not look like that (I checked it with a hex editor). I'm currently not sure why; I'm using the WinAPI (WinInet) to communicate with the server. When I have Fiddler running, I can clearly see that the data is chunked and the chunk-len is there. But it might be that, if I just put everything I get into a buffer, the data somehow gets corrupted (missing CRLF/chunk-len etc.).
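
For the case where the buffer really does contain the chunk framing, a decoder along these lines might work. It is a sketch under assumptions: DecodeChunked is a made-up name, the whole body is already in memory, the chunk length is hexadecimal as in HTTP/1.1, and trailers after the last chunk are ignored. Some HTTP client libraries remove the framing themselves, in which case no decoding is needed.

```purebasic
; Decodes a chunked body in *Src (SrcSize bytes) into *Dst.
; Returns the number of decoded bytes, or -1 if the framing looks malformed.
; *Dst must be at least SrcSize bytes (the decoded body is always smaller).
Procedure DecodeChunked(*Src, SrcSize, *Dst)
  Protected Pos = 0, Written = 0, LineEnd, ChunkLen, i, SizeLine$

  While Pos < SrcSize
    ; Find the CRLF that ends the chunk-length line.
    LineEnd = -1
    For i = Pos To SrcSize - 2
      If PeekA(*Src + i) = 13 And PeekA(*Src + i + 1) = 10
        LineEnd = i
        Break
      EndIf
    Next
    If LineEnd <= Pos : ProcedureReturn -1 : EndIf

    ; The length is hexadecimal; anything after ";" is a chunk extension.
    SizeLine$ = StringField(PeekS(*Src + Pos, LineEnd - Pos, #PB_Ascii), 1, ";")
    ChunkLen  = Val("$" + Trim(SizeLine$))
    Pos = LineEnd + 2

    If ChunkLen = 0 : Break : EndIf                 ; last chunk
    If Pos + ChunkLen > SrcSize : ProcedureReturn -1 : EndIf

    CopyMemory(*Src + Pos, *Dst + Written, ChunkLen)
    Written + ChunkLen
    Pos + ChunkLen + 2                              ; skip the CRLF after the data
  Wend

  ProcedureReturn Written
EndProcedure

; Hypothetical two-chunk body, plain text instead of compressed data.
Body$ = "4" + #CRLF$ + "Wiki" + #CRLF$ + "5" + #CRLF$ + "pedia" + #CRLF$ + "0" + #CRLF$ + #CRLF$
Size  = StringByteLength(Body$, #PB_Ascii)
*In   = AllocateMemory(Size + 1)
PokeS(*In, Body$, -1, #PB_Ascii)
*Out  = AllocateMemory(Size)

Count = DecodeChunked(*In, Size, *Out)
If Count >= 0
  Debug PeekS(*Out, Count, #PB_Ascii)               ; prints "Wikipedia"
EndIf
```

The decoded bytes would then be the input for the decompression step.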

I think I will first need to make sure I get all the data I need and the buffer is in a valid format before I can continue trying to decompress it :(