
Testing for active or dead websites

Posted: Fri Nov 27, 2015 3:40 am
by DeanH
I am using GetHTTPHeader(URL$) to determine if a particular URL is still active or not. The function is in a loop which can check hundreds or thousands of websites contained in a school's library management system's database. I'm checking for 'good values' in the first line - e.g. 200, 301, 302, etc.
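
The 'good values' check described above could look roughly like the sketch below. IsGoodStatus() is a hypothetical helper name, not something from PureBasic or from this thread, and the accepted codes are just the three mentioned here. It assumes the header text starts with a normal status line such as "HTTP/1.1 200 OK".

```purebasic
; Hypothetical helper (an assumption for illustration): returns #True when
; the status line of a header block carries one of the accepted codes.
Procedure.i IsGoodStatus(Header$)
  Protected StatusLine$, Code.i

  ; First line of the header, e.g. "HTTP/1.1 200 OK"
  StatusLine$ = StringField(Header$, 1, #LF$)

  ; The second space-separated field is the numeric status code.
  ; An empty or malformed line yields 0, which is rejected below.
  Code = Val(StringField(StatusLine$, 2, " "))

  Select Code
    Case 200, 301, 302   ; extend with whatever counts as "good"
      ProcedureReturn #True
  EndSelect

  ProcedureReturn #False
EndProcedure

; Small self-check with hand-written header text (no network needed)
Debug IsGoodStatus("HTTP/1.1 200 OK" + #CRLF$ + "Server: test" + #CRLF$)
Debug IsGoodStatus("HTTP/1.1 404 Not Found" + #CRLF$)
Debug IsGoodStatus("")
```

Keeping the check in one procedure means the list of accepted codes only has to be changed in one place, whichever function ends up fetching the header.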

I have run into a problem with one website and there may be others. When I test the URL http://classroomantarctica.aad.gov.au/, the function hangs. If I try to go to the website in a browser, it also hangs.

Is there a way to set a timeout?
Is there a better way to test for good URLs?

Re: Testing for active or dead websites

Posted: Fri Nov 27, 2015 8:21 am
by infratec
Hi,

a timeout for GetHTTPHeader() is a needed feature and worth a feature request.

As a temporary solution you can use this:


Procedure.s GetHTTPHeaderWithTimeout(URL$, Timeout.i=0)
  
  Protected Header$, Get$, Para$, Server$, Connection.i, Port.i, Result$, Received.i, *Buffer
  
  Get$ = "/" + GetURLPart(URL$, #PB_URL_Path)
  Para$ = GetURLPart(URL$, #PB_URL_Parameters) 
  If Len(Para$)
    Get$ + "?" + Para$
  EndIf
  
  Server$ = GetURLPart(URL$, #PB_URL_Site)
  Port = Val(GetURLPart(URL$, #PB_URL_Port))
  If Port = 0
    Port = 80
  EndIf
  
  ;Result$ = "HTTP/1.1 503 OK"
  
  Connection = OpenNetworkConnection(Server$, Port, #PB_Network_TCP, Timeout)
  If Connection
    Header$ = "HEAD " + Get$ + " HTTP/1.1" + #CRLF$
    Header$ + "Host: " + Server$ + #CRLF$
    Header$ + #CRLF$
    
    SendNetworkString(Connection, Header$, #PB_UTF8)
    
    Timeout = 100 ; reused as a poll counter: 100 x Delay(10) = roughly 1 second to wait for the reply
    Repeat
      Select NetworkClientEvent(Connection)
        Case #PB_NetworkEvent_None
          Delay(10)
          Timeout - 1
          
        Case #PB_NetworkEvent_Data
          *Buffer = AllocateMemory(2048)
          If *Buffer
            Received = ReceiveNetworkData(Connection, *Buffer, MemorySize(*Buffer))
            If Received > 0 ; ReceiveNetworkData() returns -1 on error
              Result$ = PeekS(*Buffer, Received, #PB_UTF8)
            EndIf
            FreeMemory(*Buffer)
          EndIf
          
        Case #PB_NetworkEvent_Disconnect
          Break
          
      EndSelect
    Until Timeout = 0
    
    CloseNetworkConnection(Connection)
  EndIf
  
  ProcedureReturn Result$
  
EndProcedure

InitNetwork()

Debug GetHTTPHeaderWithTimeout("http://classroomantarctica.aad.gov.au/", 1000)
If it is easier for your code, uncomment the 'Result$ = ' line, so that a failed connection or a missing reply returns a 503 status line instead of an empty string.

Bernd

Re: Testing for active or dead websites

Posted: Fri Nov 27, 2015 9:05 pm
by DeanH
Fantastic, Bernd, that works great. It will be easy to pop into my code. Thank you very much.

Re: Testing for active or dead websites

Posted: Sun Nov 29, 2015 11:07 pm
by RichAlgeni
You will not be burned when following the advice of Bernd.