HTTPGetFromWeb() 2.0 - use WinInet API

Posted: Sun Nov 04, 2007 1:11 am
by luis
[Windows, PureBasic 5.00, HTTP]

Code:

;* DESC
;* 	Download a file from Internet using the HTTP protocol.
;*
;* IN
;*  *tHTTP ; See the description on the T_PBL_HTTP_GET_FROM_WEB structure for the usage of the input / output fields.
;* 	 
;* OUT
;*  *tHTTP ; See the description on the T_PBL_HTTP_GET_FROM_WEB structure for the usage of the input / output fields.
;*
;* RET
;*  #WEB_OK                             ; If successful.
;*  #WEB_ERR_INVALID_PARAMETERS         ; If parameters are missing or invalid.
;*  #WEB_ERR_OUT_OF_MEMORY              ; If there is an error while allocating memory for the data buffers.
;*  #WEB_ERR_FILE_CREATION              ; If there is an error while creating the file on the disk.
;*  #WEB_ERR_FILE_IO                    ; If there is an error while writing the file to the disk.
;*  #WEB_ERR_USER_ABORT                 ; If the callback returns #False, the download was aborted on request.
;*  #WEB_ERR_HTTP_STATUS                ; If the HTTP status is different from OK (200); the actual value is returned in *tHTTP\iErrorCodeEx.
;*  #WEB_ERR_HTTP_OPEN                  ; Error in the API call to open.
;*  #WEB_ERR_HTTP_CONNECT               ; Error in the API call to connect.
;*  #WEB_ERR_HTTP_SET_PROXY             ; Error in the API call to set proxy.
;*  #WEB_ERR_HTTP_BASIC_AUTH            ; Error in the API call to basic auth.
;*  #WEB_ERR_HTTP_HTTP_OPEN             ; Error in the API call to HTTP open.
;*  #WEB_ERR_HTTP_HTTP_SEND             ; Error in the API call to HTTP send.
;*  #WEB_ERR_HTTP_HTTP_QUERY_STATUS     ; Error in the API call to query status.
;*  #WEB_ERR_HTTP_READ                  ; Error in the API call to read.
;*  #WEB_ERR_HTTP_CLOSE                 ; Error in the API call to close.
;*
;* EXAMPLE
;*  See the sample programs HTTPGetFromWeb_1.pb, HTTPGetFromWeb_2.pb.
;*
;*  The simplest form: download bar.zip in memory.
;*  The pointer to the allocated memory area will be returned in tHTTP\*DestBuffer.
;*  
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iDestination = #WEB_WRITE_TO_MEMORY
;*  HTTPGetFromWeb (@tHTTP)
;*  
;*  Same as above, loaded entirely in memory and then copied to the specified file.
;*  The memory in this case will be automatically freed by the procedure after
;*  saving the file to disk.
;*  
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iDestination = #WEB_WRITE_TO_FILE
;*  tHTTP\FullFileName$ = "c:\download\bar.zip"
;*  HTTPGetFromWeb (@tHTTP)
;*  
;*  Again we download bar.zip, but this time we don't load it entirely in memory.
;*  We try to load it in "chunks" of 16384 bytes, the proc will automatically save a chunk to
;*  file and then repeat the process until the download is completed.
;*  In this call we use a callback procedure as well.
;*  The callback procedure (see the prototype definition) will receive
;*  - a pointer to the tHTTP structure
;*  - the number of bytes downloaded up to this point
;*  - the total size of the download (if available, else 0)
;*  - the time passed in seconds from the start of the download.
;*  
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iDestination = #WEB_WRITE_TO_FILE
;*  tHTTP\FullFileName$ = "c:\download\bar.zip"
;*  tHTTP\iChunkSize = 16384
;*  tHTTP\fpCB_Working = @MyCallBack()
;*  HTTPGetFromWeb (@tHTTP)
;*  
;*  Same as above but using a proxy without authentication required.
;*  
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iAccess = #INTERNET_OPEN_TYPE_PROXY
;*  tHTTP\ProxyAndPort$ = "192.168.1.1:8080"
;*  tHTTP\iDestination = #WEB_WRITE_TO_FILE
;*  tHTTP\FullFileName$ = "c:\download\bar.zip"
;*  tHTTP\iChunkSize = 16384
;*  tHTTP\fpCB_Working = @MyCallBack()
;*  HTTPGetFromWeb (@tHTTP)
;*  
;*  Same as above but using a proxy requiring authentication.
;*  
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iAccess = #INTERNET_OPEN_TYPE_PROXY
;*  tHTTP\ProxyAndPort$ = "192.168.1.1:8080"
;*  tHTTP\ProxyUsername$ = "username"
;*  tHTTP\ProxyPassword$ = "password"
;*  tHTTP\iDestination = #WEB_WRITE_TO_FILE
;*  tHTTP\FullFileName$ = "c:\download\bar.zip"
;*  tHTTP\iChunkSize = 16384
;*  tHTTP\fpCB_Working = @MyCallBack()
;*  HTTPGetFromWeb (@tHTTP)
;*  
;*  Accessing an HTTP resource requiring basic authentication
;*  
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://username:password@www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iDestination = #WEB_WRITE_TO_FILE
;*  tHTTP\FullFileName$ = "c:\download\bar.zip"
;*  tHTTP\iChunkSize = 16384
;*  tHTTP\fpCB_Working = @MyCallBack()
;*  HTTPGetFromWeb (@tHTTP)
;*  
;*  Accessing an HTTP resource requiring basic authentication through a
;*  proxy requiring authentication
;*
;*  tHTTP\iThreadID = 0
;*  tHTTP\URL$ = "http://username:password@www.somedomain.com/foo/bar.zip"   
;*  tHTTP\iAccess = #INTERNET_OPEN_TYPE_PROXY
;*  tHTTP\ProxyAndPort$ = "192.168.1.1:8080"
;*  tHTTP\ProxyUsername$ = "username"
;*  tHTTP\ProxyPassword$ = "password"
;*  tHTTP\iDestination = #WEB_WRITE_TO_FILE
;*  tHTTP\FullFileName$ = "c:\download\bar.zip"
;*  tHTTP\iChunkSize = 16384
;*  tHTTP\fpCB_Working = @MyCallBack()
;*  HTTPGetFromWeb (@tHTTP) 
;*  
;* NOTES
;*  The procedure falls back to the chunk method, even if "all in one block" is requested, when the server doesn't return the 
;*  total size of the file we are about to download.
;*  This happens, for example, when downloading stock quotes from the Yahoo site.
;*
;*  The main callback procedure can optionally abort a download (if in chunk mode) when a user defined custom condition arises, 
;*  by returning #False instead of #True.
;*
;*  Valid values for *tHTTP\iAccess 
;* 
;*   #INTERNET_OPEN_TYPE_DIRECT ; Try a direct connection to the Internet.
;*   #INTERNET_OPEN_TYPE_PROXY  ; Pass the requests to the proxy specified in ProxyAndPort$.
;*
;*  Valid values for *tHTTP\iFlags 
;*
;*   #INTERNET_FLAG_NO_UI               ; Disables the cookie dialog box if cookies received.
;*
;*   #INTERNET_FLAG_RELOAD              ; Forces a download of the requested file from the server, not from the cache.
;*
;*   #INTERNET_FLAG_NO_COOKIES          ; Does not automatically add cookie headers to requests, and does not automatically add 
;*                                      ; returned cookies to the cookie database.
;*
;*   #INTERNET_FLAG_NO_AUTO_REDIRECT    ; Does not automatically handle redirection in HttpSendRequest.

Download [... from someone who got a copy at the time].
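
A minimal sketch of a working callback, assuming the four values arrive in the order listed in the header comments. The parameter names, their types, and the 60 second limit are illustrative assumptions, not taken from the include file; check the prototype definition there for the real signature. Only the "#True to continue, #False to abort" behaviour comes from the notes.

```purebasic
; Hypothetical callback for tHTTP\fpCB_Working.
; Parameter names and types are assumptions; see the real prototype in the include.
Procedure.i MyCallBack(*tHTTP, iBytesSoFar.i, iTotalBytes.i, iElapsedSeconds.i)
  If iTotalBytes > 0
    ; Total size known: report a percentage (float math avoids integer overflow).
    Debug Str(Int(100.0 * iBytesSoFar / iTotalBytes)) + "% after " + Str(iElapsedSeconds) + " s"
  Else
    ; Total size not returned by the server (passed as 0): report raw bytes.
    Debug Str(iBytesSoFar) + " bytes after " + Str(iElapsedSeconds) + " s"
  EndIf

  ; Example of a user defined abort condition (only honoured in chunk mode).
  If iElapsedSeconds > 60
    ProcedureReturn #False ; the download should end with #WEB_ERR_USER_ABORT
  EndIf

  ProcedureReturn #True ; keep downloading
EndProcedure
```

The callback is then passed with tHTTP\fpCB_Working = @MyCallBack(), as in the examples in the header.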

Posted: Sun Nov 04, 2007 1:17 am
by Thalius
haha! :lol:

Nice one ;) Now you made me make a crossplatform GUI example for mine too =)

Cheers,
Thalius

Posted: Sun Nov 04, 2007 6:20 am
by Dare
Thanks!

Posted: Sun Nov 04, 2007 6:40 am
by Fangbeast
I second what shorty above me said. This has possibilities.

Posted: Sun Nov 04, 2007 1:33 pm
by abc123
Thanks! :D

Posted: Sun Nov 04, 2007 1:54 pm
by ar-s
That is just Great :D

Posted: Sun Nov 04, 2007 2:03 pm
by srod
Bloody awesome! 8)

Thanks.

Posted: Sat Nov 10, 2007 1:24 am
by luis
Updated to 2.01, and this should be the final version because it fits my humble needs ...

Code:

; 2.01 (November 7, 2007)

; ! Solved the double authentication problem. Now a connection to a HTTP site requiring
;   authentication through a proxy server also requiring authentication is possible.

; + If the file to download is bigger than 10 MB the routine will automatically switch
;   to chunk mode using the default chunk size (65535) even if tHTTP\lChunkSize is not 
;   specified or it is equal to 0.

; + Added the param hRequest to the HTTPGetFromWeb_Start() callback, to access the opened
;   HTTP request handle to use custom HttpQueryInfo() calls (see example in sample1.pb).

As always, if you enhance it please let me know.
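
The rule for choosing between "all in one block" and chunk mode, as described in the changelog above and in the notes of the first post, could be sketched roughly like this. The procedure name, the constant names, reading "10 MB" as 10 * 1024 * 1024 bytes, and using the default chunk size when the server reports no length are all assumptions for illustration; the actual routine may differ.

```purebasic
#DEFAULT_CHUNK_SIZE   = 65535             ; default chunk size named in the changelog
#AUTO_CHUNK_THRESHOLD = 10 * 1024 * 1024  ; assumption: "10 MB" read as 10 MiB

; Hypothetical helper: returns the chunk size to use, or 0 for "all in one block".
Procedure.i EffectiveChunkSize(qContentLength.q, iRequestedChunkSize.i)
  If iRequestedChunkSize > 0
    ProcedureReturn iRequestedChunkSize ; the caller asked for chunk mode explicitly
  EndIf
  If qContentLength <= 0
    ProcedureReturn #DEFAULT_CHUNK_SIZE ; size unknown: fall back to chunk mode
  EndIf
  If qContentLength > #AUTO_CHUNK_THRESHOLD
    ProcedureReturn #DEFAULT_CHUNK_SIZE ; big file: switch to chunk mode automatically
  EndIf
  ProcedureReturn 0                     ; small file of known size: one block
EndProcedure

Debug EffectiveChunkSize(1024, 0)              ; 0 (all in one block)
Debug EffectiveChunkSize(20 * 1024 * 1024, 0)  ; 65535
Debug EffectiveChunkSize(0, 0)                 ; 65535
```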

Bye!

Luis

Posted: Sun Dec 02, 2007 4:18 am
by Tranquil
Hi Luis!

First, thanks for giving us such a nice piece of code.
I tested it a lot today and may have found a bug, but at the moment I cannot narrow it down in your source.

This is what I've tried:

Code:

     tHTTP.T_PBL_HTTP_GET_FROM_WEB 
     tHTTP\sURL = "http://stromberg.bei.onlineglueck.de/1278627main.html"
     tHTTP\lThreadID = 0
     tHTTP\lDestination = #PBL_WRITE_TO_MEMORY  

If I try to download this file, the download never completes. I don't know why, but some data gets lost somewhere. Can someone test it? Tomorrow I will take a closer look at the source.

Posted: Sun Dec 02, 2007 4:37 pm
by luis
Tranquil wrote: If I try to download this file, the download never completes. I don't know why, but some data gets lost somewhere.
I can confirm there is something strange happening... I'll look into it ASAP and report back.

Bye!

EDIT: yup, I believe you found a bug. It happens when the site does not return the length of the file to be retrieved. Normally everything works correctly even in that case (at least with the other sites I've encountered so far), but with this particular site InternetReadFile() does not fill the available space in the memory buffer in one call, even when the file is smaller than the buffer, so the chunk method I use doesn't work as expected. I'm implementing a workaround and I'll post it when done.

EDIT 2: OK, it should be fixed. I changed the way I keep track of the download progress, so now it should work in any case. I still have to test it a little more to be sure everything else works as expected. I'll probably upload the fixed version tomorrow or the day after; I don't have the time right now. Please check here later. Thank you.
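
One way to track progress that holds up when a server delivers data in variable sized pieces is to count the bytes InternetReadFile() actually reports, and to stop only when a successful call returns zero bytes. The sketch below shows that idea; it is not necessarily what the routine does internally, and the procedure name and its parameters are made up for illustration. It assumes hRequest is an already opened WinInet request handle and iFile is a file opened for writing.

```purebasic
; Hypothetical helper: drains an open WinInet request handle into a file.
; Returns the number of bytes received, or -1 on a memory, read or write error.
Procedure.q ReadAllToFile(hRequest.i, iFile.i, iChunkSize.i)
  Protected *Chunk = AllocateMemory(iChunkSize)
  Protected iBytesRead.l, qTotal.q

  If *Chunk = 0
    ProcedureReturn -1
  EndIf

  Repeat
    ; A call may return fewer bytes than requested, so never assume a full buffer.
    If InternetReadFile_(hRequest, *Chunk, iChunkSize, @iBytesRead) = 0
      qTotal = -1 ; the API call failed
      Break
    EndIf
    If iBytesRead > 0
      If WriteData(iFile, *Chunk, iBytesRead) <> iBytesRead
        qTotal = -1 ; short write
        Break
      EndIf
      qTotal + iBytesRead ; progress is based on what was really received
    EndIf
  Until iBytesRead = 0 ; success with zero bytes read means end of data

  FreeMemory(*Chunk)
  ProcedureReturn qTotal
EndProcedure
```

Counting received bytes this way does not depend on the server sending a content length, which is the case that triggered the problem reported above.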

Posted: Tue Dec 04, 2007 12:39 am
by luis
Updated to 2.02 (download from 1st post)

Please let me know if it is working for you now.

Code:

; 2.02 (December 3, 2007)
; ! Changed the way the download progress is tracked internally, to correctly work 
;   with sites returning data in variable chunk sizes.
;   The problem was reported by Tranquil. Thank you !

Posted: Tue Dec 04, 2007 2:41 pm
by Tranquil
Works like a charm!! Thanks for fixing!

Best regards
Mike

Posted: Fri Apr 18, 2008 7:39 am
by RTEK
Just found this code - I have some VB programs that download webpages and extract info from them. They can now be converted to PureBasic. This is really helpful and useful stuff.

Thanks a lot!!! :D

Posted: Fri Apr 18, 2008 12:35 pm
by luis
RTEK wrote:Just found this code ... Thanks a lot!!! :D
Thank you for letting me know ...

:-)

Luis

Posted: Sat Apr 19, 2008 9:31 pm
by CherokeeStalker
RTEK wrote: Just found this code - I have some VB programs that download webpages and extract info from them. They can now be converted to PureBasic. This is really helpful and useful stuff.

Thanks a lot!!! :D
Will you be sharing this ? Sounds interesting . . .