Testing for active or dead websites

Post by DeanH »

I am using GetHTTPHeader(URL$) to determine if a particular URL is still active or not. The function is in a loop which can check hundreds or thousands of websites contained in a school's library management system's database. I'm checking for 'good values' in the first line - e.g. 200, 301, 302, etc.

I have run into a problem with one website and there may be others. When I test the URL http://classroomantarctica.aad.gov.au/, the function hangs. If I try to go to the website in a browser, it also hangs.

Is there a way to set a timeout?
Is there a better way to test for good URLs?

Post by infratec »

Hi,

this is a much-needed feature request.

As a temporary solution, you can use this:

Procedure.s GetHTTPHeaderWithTimeout(URL$, Timeout.i=0)
  
  Protected Header$, Get$, Para$, Server$, Connection.i, Port.i, Result$, Received.i, *Buffer
  
  Get$ = "/" + GetURLPart(URL$, #PB_URL_Path)
  Para$ = GetURLPart(URL$, #PB_URL_Parameters) 
  If Len(Para$)
    Get$ + "?" + Para$
  EndIf
  
  Server$ = GetURLPart(URL$, #PB_URL_Site)
  Port = Val(GetURLPart(URL$, #PB_URL_Port))
  If Port = 0
    Port = 80
  EndIf
  
  ;Result$ = "HTTP/1.1 503 OK"
  
  Connection = OpenNetworkConnection(Server$, Port, #PB_Network_TCP, Timeout)
  If Connection
    Header$ = "HEAD " + Get$ + " HTTP/1.1" + #CRLF$
    Header$ + "Host: " + Server$ + #CRLF$
    Header$ + #CRLF$
    
    SendNetworkString(Connection, Header$, #PB_UTF8)
    
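    ; Wait up to about 1 second (100 x 10 ms) for the reply.
    ; Timeout is reused here as a loop counter, independent of the connect timeout passed in.
    Timeout = 100
    Repeat
      Select NetworkClientEvent(Connection)
        Case #PB_NetworkEvent_None
          Delay(10)
          Timeout - 1
          
        Case #PB_NetworkEvent_Data
          *Buffer = AllocateMemory(2048)
          If *Buffer
            Received = ReceiveNetworkData(Connection, *Buffer, MemorySize(*Buffer))
            If Received > 0 ; ReceiveNetworkData() returns -1 on error
              Result$ = PeekS(*Buffer, Received, #PB_UTF8)
            EndIf
            FreeMemory(*Buffer)
          EndIf
          Break ; the status line is in the first chunk, so stop waiting
          
        Case #PB_NetworkEvent_Disconnect
          Break
          
      EndSelect
    Until Timeout <= 0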
    
    CloseNetworkConnection(Connection)
  EndIf
  
  ProcedureReturn Result$
  
EndProcedure

InitNetwork()

Debug GetHTTPHeaderWithTimeout("http://classroomantarctica.aad.gov.au/", 1000)

If it is easier for your code, uncomment the 'Result$ = ' line so that a failed request still returns a status line.
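
To check for the 'good values' in the first line, something like the following could be applied to the returned header. This is only a minimal sketch: HTTPStatusCode() and IsLiveStatus() are hypothetical helper names, not part of PureBasic or of the procedure above, and treating every 2xx and 3xx code as 'live' is an assumption you may want to narrow.

```purebasic
; Hypothetical helper (not from the thread): pulls the numeric status code
; out of a reply such as "HTTP/1.1 200 OK".
; Returns 0 when the header is empty or does not look like an HTTP reply.
Procedure.i HTTPStatusCode(Header$)
  If Left(Header$, 5) <> "HTTP/"
    ProcedureReturn 0
  EndIf
  
  ; The second space-separated field of the status line is the code
  ProcedureReturn Val(StringField(Header$, 2, " "))
EndProcedure

; Hypothetical helper: assumes 2xx and 3xx replies mean the site is alive
Procedure.i IsLiveStatus(Code.i)
  If Code >= 200 And Code < 400
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

Debug HTTPStatusCode("HTTP/1.1 301 Moved Permanently" + #CRLF$ + "Location: /new" + #CRLF$) ; shows 301
Debug IsLiveStatus(HTTPStatusCode("")) ; shows 0, an empty result counts as dead
```

An empty string from GetHTTPHeaderWithTimeout() (no connection, or no reply within the timeout) ends up as status 0 here, so a hanging site is reported as dead instead of blocking the loop.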

Bernd

Post by DeanH »

Fantastic, Bernd, that works great. It will be easy to pop into my code. Thank you very much.

Post by RichAlgeni »

You will not be burned when following the advice of Bernd.