HTTPGetFromWeb() 2.0 - use WinInet API

KoopaOne
New User
Posts: 9
Joined: Thu May 08, 2008 3:04 pm

Post by KoopaOne »

As luis requested in a PM, I'll ask the question here again:

Is this code able to handle downloads from the free file hoster www.adrive.com?
URLs to the shared files look like this one:
http://www.adrive.com/public/a5278606d5 ... 303d2.html

thx in advance!
luis
Addict
Posts: 3893
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Post by luis »

If you use HTTPGetFromWeb with the above url

tHTTP\sURL = "http://www.adrive.com/public/a5278606d5 ... 303d2.html"

it won't work.

The link above points to a simple HTML page, so my code will download that HTML page.

A browser, on the other hand, would execute the JavaScript code inside that page:

<script>
Event.observe( window, 'load', function() {
frames['view'].location.href = "/public/view/a5278606d52fe6bd13f5a417347a6512ad81e8a3ec1da262e7521a71fb3303d2.html";
});
</script>

The code above is what launches the download.

My routine is (obviously) not a browser, so it cannot execute the JavaScript code; it just downloads the resource specified in the URL (an HTML page in this case).

In this *particular* case, you can see the file is hosted at almost the same URL (the second URL replies to the browser with a binary stream, an attachment).

You only need to insert a "view/" in the original URL.

So if you pre-process the string, you'll have:

before

tHTTP\sURL = "http://www.adrive.com/public/a5278606d5 ... 303d2.html"

and after processing it

tHTTP\sURL = "http://www.adrive.com/public/view/a5278 ... 303d2.html"

The last one can be downloaded using my code.

So you can download from this site this way, *if* all the URLs on this site follow the pattern above.
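
The pre-processing step could be sketched like this in PureBasic. AddViewSegment() is a hypothetical helper (it is not part of HTTPGetFromWeb), FILEID is a placeholder for the real file id, and the sketch assumes every share link contains "/public/" as in the pattern above.

```purebasic
; Sketch only: AddViewSegment() is a hypothetical helper, not part of HTTPGetFromWeb().
; It assumes the share URL contains "/public/" once, as in the example above.
Procedure.s AddViewSegment(sURL.s)
  ; leave the URL alone if it already points at the "view" page
  If FindString(sURL, "/public/view/", 1) = 0
    sURL = ReplaceString(sURL, "/public/", "/public/view/")
  EndIf
  ProcedureReturn sURL
EndProcedure

; FILEID is a placeholder, not a real file id
Debug AddViewSegment("http://www.adrive.com/public/FILEID.html")
; shows http://www.adrive.com/public/view/FILEID.html in the debug window
```

The result would then be assigned to tHTTP\sURL before calling HTTPGetFromWeb().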

Hope this helps.

Luis


EDIT: keep in mind they probably WANT to show that page before the download starts, and the second, direct URL you compose yourself can be considered deep linking, so they could decide to change the current system to stop that type of access.

Post by KoopaOne »

And what would the code for this look like, luis? When I try it, it still only downloads the 3 KB HTML file and not the embedded (in this case) RAR.

thx in advance

Post by luis »

Now it's not working anymore. Yesterday it was.

They were fast :-)

What I told you about at the end of my previous post has happened.

Look at the HTML you just downloaded; today it is different.

"<div class="error-msg">You are trying to visit a page without visiting the download page first</div>"

So, I'm afraid they don't want this to happen.

Luis

Post by KoopaOne »

thanks for your tips, i found a way to do it ... :)

Post by KoopaOne »

Is it possible to include REFERRER into your code?
thx in advance

Post by luis »

KoopaOne wrote: Is it possible to include REFERRER into your code?
thx in advance

It is certainly possible, but I've not investigated the subject.

You could try to modify the code yourself; I can give you a starting point.

You probably (?) need to play with the HttpOpenRequest API function.


http://msdn.microsoft.com/en-us/library ... S.85).aspx


5th parameter
lpszReferer
A pointer to a null-terminated string that specifies the URL of the document from which the URL in the request (lpszObjectName) was obtained. If this parameter is NULL, no referrer is specified.

You could modify the structure I use to communicate with HTTPGetFromWeb() and add a "referer" parameter. Then tweak my call to HttpOpenRequest to include this parameter, if not empty.

Anyway, even if their check is now based on the referer (the easiest way to implement it), they can block you in other ways.
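
As a rough starting point, the tweak could look like the sketch below. OpenRequestWithReferer() and the sReferer parameter are hypothetical names, not part of HTTPGetFromWeb(). The sketch only shows how an optional referer string could be handed to the 5th parameter of HttpOpenRequest, passing NULL when no referer is wanted.

```purebasic
; Sketch only (Windows, WinInet): OpenRequestWithReferer() and sReferer are
; hypothetical names, not part of HTTPGetFromWeb().
Procedure OpenRequestWithReferer(hInetCon, sVerb.s, sPath.s, sReferer.s, lFlags)
  Protected *pReferer = #Null ; NULL means "no referrer is specified"

  If Len(sReferer)
    *pReferer = @sReferer     ; pointer to the null-terminated referer string
  EndIf

  ; the 5th parameter of HttpOpenRequest is lpszReferer
  ProcedureReturn HttpOpenRequest_(hInetCon, sVerb, sPath, #Null, *pReferer, #Null, lFlags, 0)
EndProcedure
```

Inside the routine this would take the place of the existing HttpOpenRequest_() call, with the referer coming from a new string field added to the structure (again an assumption, no such field exists yet).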

Post by KoopaOne »

Thanks for your help, luis.
That one hoster was just an example, since I'm trying to code a multi-purpose downloader ... if they change the code, I don't actually mind *gg*

I will look into that particular API call. Thanks again!
mback2k
Enthusiast
Posts: 257
Joined: Sun Dec 02, 2007 12:11 pm
Location: Germany

Post by mback2k »

Hello everyone,

I would like to give back to luis and the community my improvements for this include:

Code:

EnableExplicit

;****************************************************************************************
; HTTPGetFromWeb() 3.00 by Luis
; First release November 4, 2007
; Free to be used in freeware and commercial programs.
;****************************************************************************************

; 3.00 (December 24, 2008)
; + Added HTTPS support
; + Added #PBL_WRITE_TO_MEMORY_HEADER to read the HTTP(S) header
; + Added HeaderData and OptionalData support for HTTP(S) requests
; + It now "pings" the target HTTP server before requests are sent
; ! Fixed memory heap crash with Goto inside loops
;   It now uses Break and jumps to the Exit label afterwards

; 2.02 (December 3, 2007)
; ! Changed the way to track the download progression internally, to correctly work
;   with sites returning data in variable chunk sizes.
;   The problem was reported by Tranquil. Thank you !

; 2.01 (November 7, 2007)

; ! Solved the double authentication problem. Now a connection to a HTTP site requiring
;   authentication through a proxy server also requiring authentication is possible.

; + If the file to download is bigger than 10 MB the routine will automatically switch
;   to chunk mode using the default chunk size (65536) even if tHTTP\lChunkSize is not
;   specified or it is equal to 0.

; + Added the param hRequest to the HTTPGetFromWeb_Start() callback, to access the opened
;   HTTP request handle to use custom HttpQueryInfo() calls (see example in sample1.pb).


; Some features:

; - Based on the WinInet API

; - HTTP download to MEMORY and to FILE

; - Download in a single block or in chunk mode

; - Full callback communication in chunk mode, partial in single block mode

; - Supports three different callback procedures for start download, end/abort download
;   and processing (new in 2.0)

; - In chunk mode download can be aborted at any time

; - Able to connect through a proxy server with/without authentication

; - Able to connect to a web server using basic authentication (new in 2.0);
;   as of 3.00 pass the credentials in tHTTP\sUsername / tHTTP\sPassword, not in the URL

; - Can be easily used for multithreaded download (new in 2.0)



; Included:

; sample1.pb - basic single threaded usage example
; sample2.pb - multi threaded usage with very simple gui framework for reference



; Some usage examples:

; The simplest form: download bar.zip in memory.
; The pointer to the allocated memory area will be returned in tHTTP\*DestBuffer
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\lDestination = #PBL_WRITE_TO_MEMORY
; HTTPGetFromWeb (@tHTTP)


; Same as above, loaded entirely in memory and then copied to the specified file.
; The memory in this case will be automatically freed by the procedure after
; saving the file to disk.
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\lDestination = #PBL_WRITE_TO_FILE
; tHTTP\sFullFileName = "c:\download\bar.zip"
; HTTPGetFromWeb (@tHTTP)


; Again we download bar.zip, but this time we don't load it entirely in memory.
; We try to load it in "chunks" of 16384 bytes, the proc will automatically save a chunk to
; file and then repeat the process until the download is completed.
; In this call we use a callback procedure as well.
; The callback procedure (see the prototype definition) will receive
; - a pointer to the tHTTP structure
; - the number of bytes downloaded up to this point
; - the total size of the download (if available, else 0)
; - the time passed in seconds from the start of the download
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\lDestination = #PBL_WRITE_TO_FILE
; tHTTP\sFullFileName = "c:\download\bar.zip"
; tHTTP\lChunkSize = 16384
; tHTTP\fpCallBack = @MyCallBack()
; HTTPGetFromWeb (@tHTTP)


; Same as above but using a proxy without authentication required.
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\lAccess = #INTERNET_OPEN_TYPE_PROXY
; tHTTP\sProxyAndPort = "192.168.1.1:8080"
; tHTTP\lDestination = #PBL_WRITE_TO_FILE
; tHTTP\sFullFileName = "c:\download\bar.zip"
; tHTTP\lChunkSize = 16384
; tHTTP\fpCallBack = @MyCallBack()
; HTTPGetFromWeb (@tHTTP)


; Same as above but using a proxy requiring authentication
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\lAccess = #INTERNET_OPEN_TYPE_PROXY
; tHTTP\sProxyAndPort = "192.168.1.1:8080"
; tHTTP\sProxyUsername = "username"
; tHTTP\sProxyPassword = "password"
; tHTTP\lDestination = #PBL_WRITE_TO_FILE
; tHTTP\sFullFileName = "c:\download\bar.zip"
; tHTTP\lChunkSize = 16384
; tHTTP\fpCallBack = @MyCallBack()
; HTTPGetFromWeb (@tHTTP)


; Accessing an HTTP resource requiring basic authentication
; (3.00 does not extract the credentials from the URL, so pass them in the structure)
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\sUsername = "username"
; tHTTP\sPassword = "password"
; tHTTP\lDestination = #PBL_WRITE_TO_FILE
; tHTTP\sFullFileName = "c:\download\bar.zip"
; tHTTP\lChunkSize = 16384
; tHTTP\fpCallBack = @MyCallBack()
; HTTPGetFromWeb (@tHTTP)


; Accessing an HTTP resource requiring basic authentication through a
; proxy requiring authentication
;
; tHTTP\lThreadID = 0
; tHTTP\sURL = "http://www.somedomain.com/foo/bar.zip"
; tHTTP\lAccess = #INTERNET_OPEN_TYPE_PROXY
; tHTTP\sUsername = "username"
; tHTTP\sPassword = "password"
; tHTTP\sProxyAndPort = "192.168.1.1:8080"
; tHTTP\sProxyUsername = "username"
; tHTTP\sProxyPassword = "password"
; tHTTP\lDestination = #PBL_WRITE_TO_FILE
; tHTTP\sFullFileName = "c:\download\bar.zip"
; tHTTP\lChunkSize = 16384
; tHTTP\fpCallBack = @MyCallBack()
; HTTPGetFromWeb (@tHTTP)


;**********************************************************
; Converted from WININET.H
;**********************************************************
Enumeration
  #HTTP_STATUS_OK                         = 200
  
  #HTTP_QUERY_FLAG_NUMBER                 = $20000000
  
  #INTERNET_FLAG_RELOAD                   = $80000000
  #INTERNET_FLAG_RAW_DATA                 = $40000000
  #INTERNET_FLAG_EXISTING_CONNECT         = $20000000
  #INTERNET_FLAG_ASYNC                    = $10000000
  #INTERNET_FLAG_PASSIVE                  = $08000000
  #INTERNET_FLAG_NO_CACHE_WRITE           = $04000000
  #INTERNET_FLAG_MAKE_PERSISTENT          = $02000000
  #INTERNET_FLAG_FROM_CACHE               = $01000000
  #INTERNET_FLAG_SECURE                   = $00800000
  #INTERNET_FLAG_KEEP_CONNECTION          = $00400000
  #INTERNET_FLAG_NO_AUTO_REDIRECT         = $00200000
  #INTERNET_FLAG_READ_PREFETCH            = $00100000
  #INTERNET_FLAG_NO_COOKIES               = $00080000
  #INTERNET_FLAG_NO_AUTH                  = $00040000
  #INTERNET_FLAG_CACHE_IF_NET_FAIL        = $00010000
  #INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP  = $00008000
  #INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS = $00004000
  #INTERNET_FLAG_IGNORE_CERT_DATE_INVALID = $00002000
  #INTERNET_FLAG_IGNORE_CERT_CN_INVALID   = $00001000
  #INTERNET_FLAG_RESYNCHRONIZE            = $00000800
  #INTERNET_FLAG_HYPERLINK                = $00000400
  #INTERNET_FLAG_NO_UI                    = $00000200
  #INTERNET_FLAG_PRAGMA_NOCACHE           = $00000100
  #INTERNET_FLAG_CACHE_ASYNC              = $00000080
  #INTERNET_FLAG_FORMS_SUBMIT             = $00000040
  #INTERNET_FLAG_NEED_FILE                = $00000010
  
  #INTERNET_CONNECTION_MODEM              = $1
  #INTERNET_CONNECTION_LAN                = $2
  #INTERNET_CONNECTION_PROXY              = $4
  #INTERNET_CONNECTION_MODEM_BUSY         = $8
  #INTERNET_CONNECTION_OFFLINE            = $20
  #INTERNET_CONNECTION_CONFIGURED         = $40
  #INTERNET_RAS_INSTALLED                 = $10
  
  #HTTP_QUERY_CONTENT_TYPE                = 1
  #HTTP_QUERY_CONTENT_LENGTH              = 5
  #HTTP_QUERY_STATUS_CODE                 = 19
  #HTTP_QUERY_STATUS_TEXT                 = 20
  #HTTP_QUERY_RAW_HEADERS                 = 21
  #HTTP_QUERY_RAW_HEADERS_CRLF            = 22
  
  #INTERNET_OPTION_USERNAME               = 28
  #INTERNET_OPTION_PASSWORD               = 29
  #INTERNET_OPTION_SECURITY_FLAGS         = 31
  #INTERNET_OPTION_PROXY_USERNAME         = 43
  #INTERNET_OPTION_PROXY_PASSWORD         = 44
  
  
  #INTERNET_INVALID_PORT_NUMBER           = 0
  #INTERNET_DEFAULT_FTP_PORT              = 21
  #INTERNET_DEFAULT_GOPHER_PORT           = 70
  #INTERNET_DEFAULT_HTTP_PORT             = 80
  #INTERNET_DEFAULT_HTTPS_PORT            = 443
  #INTERNET_DEFAULT_SOCKS_PORT            = 1080
  
  #INTERNET_SERVICE_URL                   = 0
  #INTERNET_SERVICE_FTP                   = 1
  #INTERNET_SERVICE_GOPHER                = 2
  #INTERNET_SERVICE_HTTP                  = 3
  
  #INTERNET_OPEN_TYPE_PRECONFIG           = 0
  #INTERNET_OPEN_TYPE_DIRECT              = 1
  #INTERNET_OPEN_TYPE_PROXY               = 3
  #INTERNET_OPEN_TYPE_PRECONFIG_WITH_NO_AUTOPROXY = 4
  
  #SECURITY_FLAG_IGNORE_REVOCATION        = $80
  #SECURITY_FLAG_IGNORE_UNKNOWN_CA        = $100
  #SECURITY_FLAG_IGNORE_WRONG_USAGE       = $200
  #SECURITY_FLAG_IGNORE_CERT_CN_INVALID   = $1000
  #SECURITY_FLAG_IGNORE_CERT_DATE_INVALID = $2000
EndEnumeration

Enumeration ; PBL constants
  #PBL_OK                        = $0001
  
  #PBL_WRITE_TO_FILE
  #PBL_WRITE_TO_MEMORY
  #PBL_WRITE_TO_MEMORY_HEADER
  
  ;******************************************************************************
  ;*** ERROR CONSTANTS **********************************************************
  ;******************************************************************************
  
  #PBL_ERR_CALL_FAILED           = $8000
  #PBL_ERR_INVALID_PARAMETERS
  #PBL_ERR_OUT_OF_MEMORY
  #PBL_ERR_ABORT_REQUESTED
  #PBL_ERR_NOT_FOUND
  
  #PBL_ERR_FILE_IO
  #PBL_ERR_FILE_CREATION
  
  #PBL_ERR_HTTP_STATUS
  
  #PBL_ERR_INTERNET_OPEN
  #PBL_ERR_INTERNET_CONNECT
  #PBL_ERR_INTERNET_SET_PROXY
  #PBL_ERR_INTERNET_BASIC_AUTH
  #PBL_ERR_INTERNET_HTTP_OPEN
  #PBL_ERR_INTERNET_HTTP_SEND
  #PBL_ERR_INTERNET_HTTP_QUERY_STATUS
  #PBL_ERR_INTERNET_READ
  #PBL_ERR_INTERNET_CLOSE
  #PBL_ERR_INTERNET_CHECK
EndEnumeration

Prototype   HTTPGetFromWeb_Start    (*tHTTP, hRequest)
Prototype   HTTPGetFromWeb_End      (*tHTTP, lRetVal, lBytesReceived, lSize, lElapsedTime)
Prototype   HTTPGetFromWeb_CallBack (*tHTTP, lBytesReceived, lSize, lElapsedTime)

Structure T_PBL_HTTP_GET_FROM_WEB ; used by HTTPGetFromWeb()
  
  lThreadID.l        ; *must* contain a number >= 0 to identify the calling thread in the callback function
  ; if you are not calling the proc in multithreading you can safely set this to 0
  
  sURL.s             ; *must* contain the full URL (eg: http://www.domain.com/foo/bar/file.zip)
  lDestination.l     ; *must* contain #PBL_WRITE_TO_FILE, #PBL_WRITE_TO_MEMORY or #PBL_WRITE_TO_MEMORY_HEADER
  lChunkSize.l       ; *must* contain the size of the data chunk we want to read for any iteration, or 0 if we want all in one big block
  
  sUserAgent.s       ; defaults to "PureBasic for Windows (HTTPGetFromWeb)" if empty
  sRequestType.s     ; defaults to "GET" if empty
  sUsername.s        ; defaults to an empty string
  sPassword.s        ; defaults to an empty string
  sHeaderData.s      ; defaults to an empty string
  lHeaderDataLen.i   ; defaults to 0
  sOptionalData.s    ; defaults to an empty string
  lOptionalDataLen.i ; defaults to 0
  
  lPortNumber.l      ; defaults to 80 if not specified (if = 0)
  lSecureSocket.l    ; defaults to 0, uses HTTPS if 1
  
  lAccess.l          ; defaults to #INTERNET_OPEN_TYPE_PRECONFIG if 0 (see the procedure body for other values)
  
  lFlags.l           ; defaults to #INTERNET_FLAG_NO_UI | #INTERNET_FLAG_NO_CACHE_WRITE | #INTERNET_FLAG_RELOAD | #INTERNET_FLAG_PRAGMA_NOCACHE if 0
  ; (see the procedure body for more values and description)
  
  sFullFileName.s    ; if lDestination = #PBL_WRITE_TO_FILE *must* contain the full pathname of the destination file
  
  sProxyAndPort.s    ; if lAccess = #INTERNET_OPEN_TYPE_PROXY you *must* specify a proxy here (eg: "192.168.1.1:80")
  sProxyUsername.s   ; if lAccess = #INTERNET_OPEN_TYPE_PROXY and authentication is required *must* contain the username
  sProxyPassword.s   ; if lAccess = #INTERNET_OPEN_TYPE_PROXY and authentication is required *must* contain the password
  
  lTotBytesRead.l    ; if return code = #PBL_OK will contain the total number of bytes read
  ; else it will contain 0
  
  lErrorCodeEx.l     ; if return code = #PBL_ERR_HTTP_STATUS will contain the actual HTTP status code
  ; else it will contain 0
  
  ; some common status codes you can encounter are:
  
  ; 301 moved permanently
  ; 302 moved temporarily
  ; 307 temporary redirect
  ; 400 bad request
  ; 401 unauthorized
  ; 403 forbidden
  ; 404 not found
  ; 407 proxy authorization required
  ; 408 request timeout
  ; 410 resource is gone
  ; 500 internal server error
  ; 502 bad gateway
  ; 503 service unavailable
  ; 504 gateway timeout
  
  *DestBuffer        ; if lDestination = #PBL_WRITE_TO_MEMORY will contain the address of the memory buffer allocated by the proc
  ; else it will contain 0
  
  lElapsedTime.l
  
  fpCallBack.HTTPGetFromWeb_CallBack ; if not #Null, the callback procedure will be called for any iteration
  ; or one time only if lChunkSize = 0
  
  fpCallStart.HTTPGetFromWeb_Start   ; if not #Null, the callback procedure will be called after params validation
  ; and when download is about to start
  
  fpCallEnd.HTTPGetFromWeb_End       ; if not #Null, the callback procedure will be called on exiting
  ; in every case (success or failure)
EndStructure

Procedure HTTPGetFromWeb(*tHTTP.T_PBL_HTTP_GET_FROM_WEB)
  ; [in / out] *tHTTP
  ;
  ; See the definition above of structure T_PBL_HTTP_GET_FROM_WEB for the usage of the input / output fields
  
  ; [return]
  
  ; #PBL_OK                               if successful
  ; #PBL_ERR_INVALID_PARAMETERS           if parameters are missing or invalid
  ; #PBL_ERR_OUT_OF_MEMORY                if there is an error when allocating memory for the data buffers
  ; #PBL_ERR_FILE_CREATION                if there is an error while creating the file to disk
  ; #PBL_ERR_FILE_IO                      if there is an error while writing the file to disk
  ; #PBL_ERR_ABORT_REQUESTED              if the callback returned #False, i.e. the download was aborted by request
  
  ; #PBL_ERR_HTTP_STATUS                  if different from HTTP OK (200), the value will be returned in *tHTTP\lErrorCodeEx
  
  ; #PBL_ERR_INTERNET_OPEN
  ; #PBL_ERR_INTERNET_CONNECT
  ; #PBL_ERR_INTERNET_SET_PROXY
  ; #PBL_ERR_INTERNET_BASIC_AUTH
  ; #PBL_ERR_INTERNET_HTTP_OPEN
  ; #PBL_ERR_INTERNET_HTTP_SEND
  ; #PBL_ERR_INTERNET_HTTP_QUERY_STATUS
  ; #PBL_ERR_INTERNET_READ
  ; #PBL_ERR_INTERNET_CLOSE
  ; #PBL_ERR_INTERNET_CHECK
  
  
  ; [notes]
  
  ; The procedure will fall back to the chunk method even if "all in one block" is requested when the server doesn't
  ; return the total size of the file we are about to download.
  ; This happens, for example, when downloading stock quotes from the Yahoo site.
  
  ; The callback procedure can optionally abort a download (if in chunk mode) when a user-defined custom condition arises,
  ; by returning #False instead of #True.
  
  ; *tHTTP\lAccess values
  ;
  ; #INTERNET_OPEN_TYPE_DIRECT        = direct connection to the Internet
  ; #INTERNET_OPEN_TYPE_PROXY         = passes requests to the proxy specified in sProxyAndPort
  ; #INTERNET_OPEN_TYPE_PRECONFIG     = uses the default configuration for Internet access
  ;
  ; *tHTTP\lFlags values
  ;
  ; #INTERNET_FLAG_NO_UI              = disables the cookie dialog box if cookies received
  ; #INTERNET_FLAG_RELOAD             = forces a download of the requested file, object, or directory listing from the server, not from the cache
  ; #INTERNET_FLAG_NO_COOKIES         = does not automatically add cookie headers to requests, and does not automatically add returned cookies to the cookie database
  ; #INTERNET_FLAG_NO_AUTO_REDIRECT   = does not automatically handle redirection in HttpSendRequest
  
  Protected hInet, hPing, hInetCon, hReq
  Protected sDomain$, sPath$, lpUrlComponents.URL_COMPONENTS\dwStructSize = SizeOf(URL_COMPONENTS)
  Protected lBytesRead, lReadUntilNow, lBufSize, lStartTime, lStatusCode.l, lContentLen.l, lLongSize = SizeOf(Long)
  Protected nFileNum, flgEOF = #False
  Protected flgBasicAuthTried = #False
  Protected lDefaultChunkSize = 65536 ; 64 KBytes - used when chunk size is not specified (0) and a fallback to chunk mode is needed
  Protected lRetVal, lIntRetVal
  Protected dwFlags, dwBuffLen
  
  ; clean up
  
  *tHTTP\lTotBytesRead = 0
  *tHTTP\lErrorCodeEx = 0
  
  lStartTime =  ElapsedMilliseconds()
  
  ; extract domain and path
  
  lpUrlComponents\dwHostNameLength = #True
  lpUrlComponents\dwUrlPathLength = #True
  
  If Not InternetCrackUrl_(*tHTTP\sURL, #Null, #Null, @lpUrlComponents)
    lRetVal = #PBL_ERR_INVALID_PARAMETERS
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  sDomain$ = PeekS(lpUrlComponents\lpszHostName, lpUrlComponents\dwHostNameLength)
  sPath$ = PeekS(lpUrlComponents\lpszUrlPath, lpUrlComponents\dwUrlPathLength)
  
  ; sanity checks
  
  Select *tHTTP\lDestination
      
    Case #PBL_WRITE_TO_FILE
      If Len(*tHTTP\sFullFileName) = 0 ; no filename specified
        lRetVal = #PBL_ERR_INVALID_PARAMETERS
        Goto lbl_HTTPGetFromWeb_Exit
      EndIf
      
    Case #PBL_WRITE_TO_MEMORY
      *tHTTP\DestBuffer = #Null
      
    Case #PBL_WRITE_TO_MEMORY_HEADER
      *tHTTP\DestBuffer = #Null
      
    Default
      lRetVal = #PBL_ERR_INVALID_PARAMETERS
      Goto lbl_HTTPGetFromWeb_Exit
  EndSelect
  
  If *tHTTP\lAccess = #INTERNET_OPEN_TYPE_PROXY
    If Len(*tHTTP\sProxyAndPort) = 0
      lRetVal = #PBL_ERR_INVALID_PARAMETERS
      Goto lbl_HTTPGetFromWeb_Exit
    EndIf
  EndIf
  
  If *tHTTP\lChunkSize < 0 ; paranoid
    *tHTTP\lChunkSize = 0
  EndIf
  
  If *tHTTP\lThreadID < 0 ; paranoid
    *tHTTP\lThreadID = 0
  EndIf
  
  ; defaults
  
  If *tHTTP\lAccess = 0 ; if access type not specified
    *tHTTP\lAccess = #INTERNET_OPEN_TYPE_PRECONFIG
  EndIf
  
  If Len(*tHTTP\sUserAgent) = 0 ; if user agent not specified
    *tHTTP\sUserAgent = "PureBasic for Windows (HTTPGetFromWeb)"
  EndIf
  
  If Len(*tHTTP\sRequestType) = 0 ; if request type not specified
    *tHTTP\sRequestType = "GET"
  EndIf
  
  If *tHTTP\lFlags = 0 ; if flags not specified
    ; no popup for cookies + load from Internet (not cache)
    *tHTTP\lFlags = #INTERNET_FLAG_NO_UI | #INTERNET_FLAG_NO_CACHE_WRITE | #INTERNET_FLAG_RELOAD | #INTERNET_FLAG_PRAGMA_NOCACHE
  EndIf
  
  If *tHTTP\lPortNumber = 0 ; if HTTP port not specified
    *tHTTP\lPortNumber = 80
  EndIf
  
  If *tHTTP\lSecureSocket
    *tHTTP\lFlags | #INTERNET_FLAG_SECURE | #INTERNET_FLAG_IGNORE_CERT_CN_INVALID | #INTERNET_FLAG_IGNORE_CERT_DATE_INVALID
  EndIf
  
  If *tHTTP\lHeaderDataLen <= 0
    *tHTTP\lHeaderDataLen = StringByteLength(*tHTTP\sHeaderData)
  EndIf
  If *tHTTP\lOptionalDataLen <= 0
    *tHTTP\lOptionalDataLen = StringByteLength(*tHTTP\sOptionalData)+1
  EndIf
  
  ; *** Internet Open ***
  
  hInet = InternetOpen_(*tHTTP\sUserAgent, *tHTTP\lAccess, *tHTTP\sProxyAndPort, #Null, 0)
  
  If hInet = #Null
    lRetVal = #PBL_ERR_INTERNET_OPEN
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  ; *** Internet Ping ***
  
  hPing = InternetCheckConnection_(*tHTTP\sURL, 1, 0)
  
  If hPing = #Null
    lRetVal = #PBL_ERR_INTERNET_CHECK
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  ; *** Internet Connect ***
  
  hInetCon = InternetConnect_(hInet, sDomain$, *tHTTP\lPortNumber, #Null, #Null, #INTERNET_SERVICE_HTTP, 0, 0)
  
  If hInetCon = #Null
    lRetVal = #PBL_ERR_INTERNET_CONNECT
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  ; *** proxy authentication required ? ***
  
  If *tHTTP\lAccess = #INTERNET_OPEN_TYPE_PROXY
    If Len(*tHTTP\sProxyUsername)
      If InternetSetOption_(hInetCon, #INTERNET_OPTION_PROXY_USERNAME, *tHTTP\sProxyUsername, Len(*tHTTP\sProxyUsername)) = #False
        lRetVal = #PBL_ERR_INTERNET_SET_PROXY
        Goto lbl_HTTPGetFromWeb_Exit
      EndIf
    EndIf
    
    If Len(*tHTTP\sProxyPassword)
      If InternetSetOption_(hInetCon, #INTERNET_OPTION_PROXY_PASSWORD, *tHTTP\sProxyPassword, Len(*tHTTP\sProxyPassword)) = #False
        lRetVal = #PBL_ERR_INTERNET_SET_PROXY
        Goto lbl_HTTPGetFromWeb_Exit
      EndIf
    EndIf
  EndIf
  
  ; *** Open Request ***
  
  hReq = HttpOpenRequest_(hInetCon, *tHTTP\sRequestType, sPath$, #Null, #Null, #Null, *tHTTP\lFlags, 0)
  
  If hReq = #Null
    lRetVal = #PBL_ERR_INTERNET_HTTP_OPEN
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  If *tHTTP\lSecureSocket
    ;
    ; *******************************************************************************************
    ; Common error anticipation/correction
    ; Often unnecessary but should be included for best reliability
    dwBuffLen = SizeOf(dwFlags)
    InternetQueryOption_(hReq, #INTERNET_OPTION_SECURITY_FLAGS, @dwFlags, @dwBuffLen)
    dwFlags | #SECURITY_FLAG_IGNORE_CERT_CN_INVALID | #SECURITY_FLAG_IGNORE_CERT_DATE_INVALID | #SECURITY_FLAG_IGNORE_REVOCATION | #SECURITY_FLAG_IGNORE_UNKNOWN_CA | #SECURITY_FLAG_IGNORE_WRONG_USAGE
    InternetSetOption_(hReq, #INTERNET_OPTION_SECURITY_FLAGS, @dwFlags, dwBuffLen)
    ; *******************************************************************************************
    ;
  EndIf
  
  Repeat
    ; *** Send Request ***
    
    If Not HttpSendRequest_(hReq, @*tHTTP\sHeaderData, *tHTTP\lHeaderDataLen, @*tHTTP\sOptionalData, *tHTTP\lOptionalDataLen)
      lIntRetVal = #PBL_ERR_INTERNET_HTTP_SEND
      Break
    EndIf
    
    ; *** Query Info (status code) ***
    
    If Not HttpQueryInfo_(hReq, #HTTP_QUERY_FLAG_NUMBER | #HTTP_QUERY_STATUS_CODE, @lStatusCode, @lLongSize, #Null)
      lIntRetVal = #PBL_ERR_INTERNET_HTTP_QUERY_STATUS
      Break
    EndIf
    
    ; *** basic authentication required ? ***
    
    If lStatusCode = 401 And flgBasicAuthTried = #False ; we try one time only
      If Len(*tHTTP\sUsername)
        If InternetSetOption_(hInetCon, #INTERNET_OPTION_USERNAME, @*tHTTP\sUsername, Len(*tHTTP\sUsername)) = #False
          lIntRetVal = #PBL_ERR_INTERNET_BASIC_AUTH
          Break
        EndIf
      EndIf
      
      If Len(*tHTTP\sPassword)
        If InternetSetOption_(hInetCon, #INTERNET_OPTION_PASSWORD, @*tHTTP\sPassword, Len(*tHTTP\sPassword)) = #False
          lIntRetVal = #PBL_ERR_INTERNET_BASIC_AUTH
          Break
        EndIf
      EndIf
      
      flgBasicAuthTried = #True
    Else
      Break ; exit this loop
    EndIf
  ForEver
  
  If lIntRetVal
    lRetVal = lIntRetVal
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  ; *** check status code ***
  
  If lStatusCode <> #HTTP_STATUS_OK And *tHTTP\lDestination <> #PBL_WRITE_TO_MEMORY_HEADER
    *tHTTP\lErrorCodeEx =  lStatusCode
    lRetVal = #PBL_ERR_HTTP_STATUS
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  ; *** Query Info (content length) ***
  
  If Not HttpQueryInfo_(hReq, #HTTP_QUERY_FLAG_NUMBER | #HTTP_QUERY_CONTENT_LENGTH, @lContentLen, @lLongSize, #Null)
    lContentLen = 0 ; if failed, we fall back to set content length = 0
  EndIf
  
  If *tHTTP\lChunkSize = 0
    If lContentLen > 0
      ; we want to read it all at once AND we know the total size
      If lContentLen > 10*1024*1024 ; if > 10 MB
        lBufSize = lDefaultChunkSize ; fall back to chunk mode with default size
      Else
        lBufSize = lContentLen ; ok, it is acceptable ... one big chunk
      EndIf
    Else
      ; we want to read it all at once BUT we don't know the total size
      ; fall back to chunk mode with default size
      lBufSize = lDefaultChunkSize
    EndIf
  Else
    ; we want to read it in chunk mode
    lBufSize = *tHTTP\lChunkSize
  EndIf
  
  ; we want to read the full header (check the destination, not the buffer size)
  If *tHTTP\lDestination = #PBL_WRITE_TO_MEMORY_HEADER
    lBufSize = lDefaultChunkSize/10
  EndIf
  
  ; allocate memory buffer
  *tHTTP\DestBuffer = AllocateMemory (lBufSize)
  If (*tHTTP\DestBuffer = 0)
    lRetVal = #PBL_ERR_OUT_OF_MEMORY
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  ; create destination file if needed
  If *tHTTP\lDestination = #PBL_WRITE_TO_FILE
    nFileNum = CreateFile (#PB_Any, *tHTTP\sFullFileName)
    If nFileNum = 0
      lRetVal = #PBL_ERR_FILE_CREATION
      Goto lbl_HTTPGetFromWeb_Exit
    EndIf
  EndIf
  
  lReadUntilNow = 0
  
  ; *** download is about to start ***
  
  If *tHTTP\fpCallStart
    *tHTTP\fpCallStart (*tHTTP, hReq)
  EndIf
  
  
  Repeat
    
    Select *tHTTP\lDestination
        
      Case #PBL_WRITE_TO_MEMORY_HEADER
        
        HttpQueryInfo_(hReq, #HTTP_QUERY_RAW_HEADERS_CRLF, *tHTTP\DestBuffer, @lBufSize, #Null)
        If lBufSize = 0
          lIntRetVal = #PBL_ERR_INTERNET_READ
          Break
        EndIf
        
        lReadUntilNow + lBufSize
        
      Case #PBL_WRITE_TO_MEMORY
        If (InternetReadFile_(hReq, *tHTTP\DestBuffer + lReadUntilNow, lBufSize, @lBytesRead)) = #False
          lIntRetVal = #PBL_ERR_INTERNET_READ
          Break
        EndIf
        
      Case #PBL_WRITE_TO_FILE
        If (InternetReadFile_(hReq, *tHTTP\DestBuffer, lBufSize, @lBytesRead)) = #False
          lIntRetVal = #PBL_ERR_INTERNET_READ
          Break
        EndIf
        
        If (lBytesRead) : WriteData (nFileNum, *tHTTP\DestBuffer, lBytesRead) : EndIf
        
        ; check if it's going well
        If Lof(nFileNum) <> lReadUntilNow + lBytesRead
          lIntRetVal = #PBL_ERR_FILE_IO
          Break
        EndIf
        
    EndSelect
    
    lReadUntilNow + lBytesRead
    
    If *tHTTP\fpCallBack And lBytesRead
      If *tHTTP\fpCallBack (*tHTTP, lReadUntilNow, lContentLen, (ElapsedMilliseconds() - lStartTime) / 1000) = #False
        
        ; abort requested by the callback
        lIntRetVal = #PBL_ERR_ABORT_REQUESTED
        Break
      EndIf
    EndIf
    
    If (lBytesRead); probably more to come
      
      If *tHTTP\lDestination = #PBL_WRITE_TO_MEMORY
        *tHTTP\DestBuffer = ReAllocateMemory (*tHTTP\DestBuffer, lReadUntilNow + lBufSize)
      EndIf
      
    Else ; last chunk
      
      If (*tHTTP\lDestination = #PBL_WRITE_TO_MEMORY)
        *tHTTP\DestBuffer = ReAllocateMemory (*tHTTP\DestBuffer, lReadUntilNow)
      EndIf
      
      flgEOF = #True
    EndIf
    
    If (*tHTTP\DestBuffer = 0)
      lIntRetVal = #PBL_ERR_OUT_OF_MEMORY
      Break
    EndIf
    
  Until flgEOF
  
  If lIntRetVal
    lRetVal = lIntRetVal
    Goto lbl_HTTPGetFromWeb_Exit
  EndIf
  
  
  If *tHTTP\lDestination = #PBL_WRITE_TO_FILE
    FlushFileBuffers(nFileNum)
    
    If Lof(nFileNum) <> lReadUntilNow
      lRetVal = #PBL_ERR_FILE_IO
      Goto lbl_HTTPGetFromWeb_Exit
    EndIf
    
    CloseFile(nFileNum)
  EndIf
  
  *tHTTP\lTotBytesRead = lReadUntilNow
  
  lRetVal = #PBL_OK
  
  lbl_HTTPGetFromWeb_Exit:
  
  ; *** close request ***
  If hReq
    If Not InternetCloseHandle_(hReq)
      lRetVal = #PBL_ERR_INTERNET_CLOSE
    EndIf
  EndIf
  
  ; *** close connection ***
  If hInetCon
    If Not InternetCloseHandle_(hInetCon)
      lRetVal = #PBL_ERR_INTERNET_CLOSE
    EndIf
  EndIf
  
  ; *** close Internet ***
  If hInet
    If Not InternetCloseHandle_(hInet)
      lRetVal = #PBL_ERR_INTERNET_CLOSE
    EndIf
  EndIf
  
  If *tHTTP\fpCallEnd
    *tHTTP\fpCallEnd (*tHTTP, lRetVal, lReadUntilNow, lContentLen, (ElapsedMilliseconds() - lStartTime) / 1000)
  EndIf
  
  If (*tHTTP\lDestination = #PBL_WRITE_TO_FILE)
    ; free up the memory buffer
    If (*tHTTP\DestBuffer)
      FreeMemory(*tHTTP\DestBuffer)
      *tHTTP\DestBuffer = 0
    EndIf
    
    ; if an error occurred, the file should still be open
    If (lRetVal <> #PBL_OK) And IsFile(nFileNum)
      ; clean up the partial file
      CloseFile(nFileNum)
      DeleteFile(*tHTTP\sFullFileName)
    EndIf
  EndIf
  
  ProcedureReturn lRetVal
  
EndProcedure
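
For anyone reading the loop above: the memory destination grows by one chunk after every successful read and is trimmed to the exact size once a read returns zero bytes. Here is a minimal, self-contained sketch of that grow-then-trim pattern. It is not the code from the include: CollectChunks() and its parameters are hypothetical names, and a plain memory copy stands in for the WinInet read. It also keeps the old pointer until the resize has succeeded, since as far as I know ReAllocateMemory() leaves the old block allocated when it fails.

```purebasic
EnableExplicit

; Hypothetical helper: copies srcLen bytes from *src in chunks of chunkSize
; (must be > 0) into a growing buffer. Returns the buffer (caller frees it)
; or 0 on failure. *outLen receives the number of bytes collected.
Procedure.i CollectChunks(*src, srcLen.i, chunkSize.i, *outLen.Integer)
  Protected *buf, *tmp, used.i, n.i

  *buf = AllocateMemory(chunkSize)
  If *buf = 0
    ProcedureReturn 0
  EndIf

  Repeat
    ; "read" the next chunk (stand-in for the real network read)
    n = srcLen - used
    If n > chunkSize : n = chunkSize : EndIf

    If n > 0
      CopyMemory(*src + used, *buf + used, n)
      used + n
      *tmp = ReAllocateMemory(*buf, used + chunkSize) ; room for one more chunk
    ElseIf used > 0
      *tmp = ReAllocateMemory(*buf, used)             ; last pass: trim to exact size
    Else
      *tmp = *buf                                     ; nothing read, keep the initial block
    EndIf

    If *tmp = 0
      FreeMemory(*buf) ; resize failed, so release the old block ourselves
      ProcedureReturn 0
    EndIf
    *buf = *tmp
  Until n = 0

  *outLen\i = used
  ProcedureReturn *buf
EndProcedure

; Usage: collect 10 bytes in chunks of 4
Define *src, *result, i.i, size.Integer
*src = AllocateMemory(10)
If *src
  For i = 0 To 9
    PokeB(*src + i, 65 + i)
  Next
  *result = CollectChunks(*src, 10, 4, @size)
  If *result
    Debug size\i ; number of bytes collected
    FreeMemory(*result)
  EndIf
  FreeMemory(*src)
EndIf
```

Growing by a fixed chunk keeps the logic simple, at the cost of one reallocation per read.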
Changes:
  • + Added HTTPS support
  • + Added #PBL_WRITE_TO_MEMORY_HEADER to read the HTTP(S) header
  • + Added HeaderData and OptionalData support for HTTP(S) requests
  • + It now "pings" the target HTTP server before requests are sent
  • ! Fixed memory heap crash caused by Goto inside loops; the code now uses Break and jumps to the exit label afterwards
And it does not extract the username and password for HTTP auth from the URL, you need to pass it inside the structure!
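
The Break-then-jump change from the list above can be sketched like this. It is only an illustration of the pattern, not code from the include: CountProcessed() and its label are made-up names. The loop records the error and leaves with Break, and the jump to the exit label happens only after the loop has ended.

```purebasic
EnableExplicit

; Hypothetical example: processes items 1..10 and fails at item failAt.
; Returns the number of processed items, or -1 on error.
Procedure.i CountProcessed(failAt.i)
  Protected i.i, done.i, errCode.i, result.i

  For i = 1 To 10
    If i = failAt
      errCode = 1 ; remember the error and leave the loop normally
      Break
    EndIf
    done + 1
  Next

  ; we are outside the loop now, so jumping to the exit label is safe
  If errCode
    result = -1
    Goto lbl_CountProcessed_Exit
  EndIf

  result = done

  lbl_CountProcessed_Exit:
  ; shared cleanup (closing handles, freeing buffers) would go here
  ProcedureReturn result
EndProcedure

Debug CountProcessed(4) ; error path
Debug CountProcessed(0) ; no item fails
```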

Requires at least PB 4.30 :)

I hope that PB will include the same amount of features natively sometime..

Best regards and Merry Christmas!
jesperbrannmark
Enthusiast
Posts: 536
Joined: Mon Feb 16, 2009 10:42 am
Location: sweden

Error 32785 ($8011)

Post by jesperbrannmark »

I get an error on about 50% of the Windows machines running this. It doesn't happen straight away, but after a few hours.
Here is the scenario:
I download a file every minute, with some changes. The file can be anything from 1 KB to 3 MB. The download is threaded.
After a while on these machines I get error 32785 (which is $8011 in hex). After that, no download works with either v2.03 or v3.00 of HTTPGetFromWeb.
I am in a bad situation:
ReceiveHTTPFile works fine on the Mac but fails from time to time on Windows.
The URLDownloadToFile_ API call always works, but doesn't allow threading.

Does anyone know what error 32785 ($8011) in HTTPGetFromWeb means? I can't find out what it's about.
jesperbrannmark
Enthusiast
Posts: 536
Joined: Mon Feb 16, 2009 10:42 am
Location: sweden

Re: HTTPGetFromWeb() 2.0 - use WinInet API

Post by jesperbrannmark »

I also get $800D (32781). I can't find these two error codes in the source...
luis
Addict
Posts: 3893
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: HTTPGetFromWeb() 2.0 - use WinInet API

Post by luis »


;******************************************************************************
;*** ERROR CONSTANTS **********************************************************
;******************************************************************************

Enumeration
 #PBL_ERR_CALL_FAILED = $8000        ; $8000
 #PBL_ERR_INVALID_PARAMETERS         ; $8001
 #PBL_ERR_OUT_OF_MEMORY              ; $8002
 #PBL_ERR_ABORT_REQUESTED            ; $8003
 #PBL_ERR_NOT_FOUND                  ; $8004

 #PBL_ERR_FILE_IO                    ; $8005
 #PBL_ERR_FILE_CREATION              ; $8006

 #PBL_ERR_HTTP_STATUS                ; $8007

 #PBL_ERR_INTERNET_OPEN              ; $8008
 #PBL_ERR_INTERNET_CONNECT           ; $8009
 #PBL_ERR_INTERNET_SET_PROXY         ; $800A
 #PBL_ERR_INTERNET_BASIC_AUTH        ; $800B
 #PBL_ERR_INTERNET_HTTP_OPEN         ; $800C
 #PBL_ERR_INTERNET_HTTP_SEND         ; $800D
 #PBL_ERR_INTERNET_HTTP_QUERY_STATUS ; $800E
 #PBL_ERR_INTERNET_READ              ; $800F
 #PBL_ERR_INTERNET_CLOSE             ; $8010
EndEnumeration

The error codes returned by HTTPGetFromWeb() go from $8000 (32768) to $8010 (32784).

So 32781 ($800D) is #PBL_ERR_INTERNET_HTTP_SEND. It should never happen unless there is a problem with the connection.
There is not much anyone can do about it except retry the operation.
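
A retry could look roughly like the sketch below. Everything in it is an assumption made for illustration: FakeDownload() is a stand-in that fails twice and then succeeds (replace it with your real HTTPGetFromWeb() call), DownloadWithRetry() is a made-up name, and the value 0 for #PBL_OK is assumed here, so use the constants from the include in real code.

```purebasic
EnableExplicit

#PBL_OK = 0                         ; assumed value, take the real one from the include
#PBL_ERR_INTERNET_HTTP_SEND = $800D ; as computed from the constants above

Global gCalls.i

; Hypothetical stand-in for the download call: fails twice, then succeeds.
Procedure.i FakeDownload()
  gCalls + 1
  If gCalls < 3
    ProcedureReturn #PBL_ERR_INTERNET_HTTP_SEND
  EndIf
  ProcedureReturn #PBL_OK
EndProcedure

; Retries only on the send error. Any other result is returned right away.
Procedure.i DownloadWithRetry(maxAttempts.i, delayMs.i)
  Protected attempt.i, result.i

  For attempt = 1 To maxAttempts
    result = FakeDownload()
    If result <> #PBL_ERR_INTERNET_HTTP_SEND
      Break ; success, or an error that a retry will not fix
    EndIf
    If attempt < maxAttempts
      Delay(delayMs) ; give the connection some time before the next attempt
    EndIf
  Next

  ProcedureReturn result
EndProcedure

Define result.i = DownloadWithRetry(5, 100)
Debug result
Debug gCalls
```

Limiting the number of attempts matters in a threaded downloader, otherwise a dead connection keeps the thread busy forever.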

As for 32785, I don't see how it can be returned. It's not defined, and if you look at the code, lRetVal always returns one of the above constants or #PBL_OK, so this is quite suspicious. As a first step I would try to find out where that return value comes from.

I cannot help you in a meaningful way without code that exhibits the problem. Are you sure the problem isn't elsewhere? Can you try without threads? Did you try enabling the purifier to see if something got corrupted? I'm suggesting this because of the undefined error code being returned.
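
To make a check like this easier, a small lookup along these lines turns a return value into the constant name and flags anything outside the defined range. It is just a sketch: PBLErrorName() and the two range constants are hypothetical names, and it assumes the constants are numbered consecutively in exactly the order listed above.

```purebasic
EnableExplicit

#PBL_ERR_FIRST = $8000 ; #PBL_ERR_CALL_FAILED
#PBL_ERR_LAST  = $8010 ; #PBL_ERR_INTERNET_CLOSE

; Hypothetical helper: maps an error code to the name of its constant.
Procedure.s PBLErrorName(code.i)
  Protected names.s

  ; same order as the enumeration
  names = "CALL_FAILED,INVALID_PARAMETERS,OUT_OF_MEMORY,ABORT_REQUESTED,NOT_FOUND,"
  names + "FILE_IO,FILE_CREATION,HTTP_STATUS,"
  names + "INTERNET_OPEN,INTERNET_CONNECT,INTERNET_SET_PROXY,INTERNET_BASIC_AUTH,"
  names + "INTERNET_HTTP_OPEN,INTERNET_HTTP_SEND,INTERNET_HTTP_QUERY_STATUS,"
  names + "INTERNET_READ,INTERNET_CLOSE"

  If code < #PBL_ERR_FIRST Or code > #PBL_ERR_LAST
    ProcedureReturn "undefined ($" + Hex(code) + ")"
  EndIf

  ProcedureReturn "#PBL_ERR_" + StringField(names, code - #PBL_ERR_FIRST + 1, ",")
EndProcedure

Debug PBLErrorName(32781) ; #PBL_ERR_INTERNET_HTTP_SEND
Debug PBLErrorName(32785) ; undefined ($8011)
```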
"Have you tried turning it off and on again ?"
A little PureBasic review
PureGuy
Enthusiast
Posts: 102
Joined: Mon Aug 30, 2010 11:51 am

Re: HTTPGetFromWeb() 2.0 - use WinInet API

Post by PureGuy »

Does it work with PB 5.00?

It seems to do nothing, but uses full CPU. :shock:
luis
Addict
Posts: 3893
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: HTTPGetFromWeb() 2.0 - use WinInet API

Post by luis »

Yes, it works (tried with 5.10B).

Anyway, to be sure, I've just uploaded the last version I used some time ago (I made some changes).
PureGuy
Enthusiast
Posts: 102
Joined: Mon Aug 30, 2010 11:51 am

Re: HTTPGetFromWeb() 2.0 - use WinInet API

Post by PureGuy »

Thank you, luis.

This new one works much better.
The only problem: only 2 downloads start, not 3, and at the end one thread takes ages.
I guess I can kill the threads without a problem?