
Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 12:56 am
by jassing
Google.com works fine.
roosales.com does not.
Any ideas?

Code:

#url$ = "https://roosales.com" ; /index.php doesn't help/work either. (also tried www.roosales.com)
;#url$ = "https://google.com"

Debug ReceiveHTTPMemory(#url$)
Debug ReceiveHTTPFile(#url$,"c:\temp\page.txt")

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 1:12 am
by Kiffi
Works here without any problems (Win 10 pro, PB V6.10)

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 2:34 am
by jassing
Kiffi wrote: Sun Apr 14, 2024 1:12 am Works here without any problems (Win 10 pro, PB V6.10)
Curious indeed...
It seems PB's ReceiveHTTP...() is unreliable, so I'll just shell out to wget.exe instead.
Thanks
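
A minimal sketch of the "shell out to wget.exe" workaround mentioned above, assuming wget.exe is on the PATH. The procedure name FetchWithWget and the output file name are hypothetical; -q (quiet) and -O (output file) are standard wget options.

```purebasic
; Hypothetical helper: runs wget.exe and returns its exit code,
; or -1 if the program could not be started.
Procedure.i FetchWithWget(Url$, OutFile$)
  Protected Program, ExitCode = -1
  Protected Params$ = "-q -O " + #DQUOTE$ + OutFile$ + #DQUOTE$ + " " + #DQUOTE$ + Url$ + #DQUOTE$

  Program = RunProgram("wget.exe", Params$, "", #PB_Program_Open | #PB_Program_Hide)
  If Program
    WaitProgram(Program)                ; block until wget has finished
    ExitCode = ProgramExitCode(Program) ; wget returns 0 on success
    CloseProgram(Program)
  EndIf

  ProcedureReturn ExitCode
EndProcedure

Debug "wget exit code: " + Str(FetchWithWget("https://roosales.com", GetTemporaryDirectory() + "page.txt"))
```

Checking the exit code at least gives a failure signal, which the plain Debug of ReceiveHTTPFile() does not.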

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 5:31 am
by idle
Maybe it's a TLS version issue. I had a similar issue with HTTPRequest, but with the flag #PB_HTTP_WeakSSL it worked.
PB uses curl, which is TLS only, and it probably defaults to TLS 1.2 or 1.3, so the flag should let it fall back to TLS 1.1.

This works on Win 11 x64:

Code:

  NewMap Header$()
  Header$("Content-Type") = "plaintext"
  Header$("User-Agent") = "Firefox 54.0"
  
  HttpRequest = HTTPRequest(#PB_HTTP_Get, "https://roosales.com", "", #PB_HTTP_WeakSSL, Header$())
  If HttpRequest
    Debug "StatusCode: " + HTTPInfo(HTTPRequest, #PB_HTTP_StatusCode)
    Debug "Response: " + HTTPInfo(HTTPRequest, #PB_HTTP_Response)
    
    FinishHTTP(HTTPRequest)
  Else
    Debug "Request creation failed"
  EndIf


Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 9:46 am
by jassing
idle wrote: Sun Apr 14, 2024 5:31 am Maybe it's a TLS version issue. I had a similar issue with HTTPRequest, but with the flag #PB_HTTP_WeakSSL it worked.
Sadly, this didn't help. It also completed in 177 ms, so I doubt it fetched a page.
Plus, if it were a TLS issue, wouldn't others have this problem?
Other HTTPS sites work...
[01:42:45] StatusCode: 0
[01:42:45] Response:

Code:

#url$ = "https://roosales.com"
;#url$ = "https://google.com"

ElapsedMilliseconds()
Debug #url$
Debug " "+ReceiveHTTPMemory(#url$) + " "+GetLastError_()+" "+Str(ElapsedMilliseconds())
Debug " "+ReceiveHTTPFile(#url$,"c:\temp\page.txt") + " " + GetLastError_()+" "+Str(ElapsedMilliseconds())
For roosales it takes only milliseconds, so it's not really even trying...
[01:44:40] https://roosales.com
[01:44:40] 0 0 226
[01:44:40] 0 0 419
For Google, it does take longer:
[01:45:38] https://google.com
[01:45:39] 318355079300 0 1224
[01:45:40] 1 0 2051
So my guess is that it's not even doing any network I/O.
I don't have a sniffer installed, but I will install one and see whether there is any traffic...
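
Note that the figures above are cumulative times since program start, because the first ElapsedMilliseconds() call discards its result. A minimal sketch that times each call separately and frees the received buffer, assuming PB 6.x (where no explicit network initialisation is needed); the variable names and the temp-file location are illustrative only:

```purebasic
#url$ = "https://roosales.com"

Define Start.q, *Buffer, Result

; Time ReceiveHTTPMemory() on its own
Start = ElapsedMilliseconds()
*Buffer = ReceiveHTTPMemory(#url$)
Debug "ReceiveHTTPMemory: " + Str(ElapsedMilliseconds() - Start) + " ms"
If *Buffer
  Debug "Bytes received: " + Str(MemorySize(*Buffer))
  FreeMemory(*Buffer) ; the buffer must be released by the caller
Else
  Debug "No data returned"
EndIf

; Time ReceiveHTTPFile() on its own
Start = ElapsedMilliseconds()
Result = ReceiveHTTPFile(#url$, GetTemporaryDirectory() + "page.txt")
Debug "ReceiveHTTPFile: " + Str(ElapsedMilliseconds() - Start) + " ms, result " + Str(Result)
```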

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 10:24 am
by idle
So what was the result from the code I posted?
ReceiveHTTPMemory predates HTTPRequest, and there may be a difference in how they're implemented.
Or maybe it's a DNS filter blocking the site.

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 11:27 am
by jassing
idle wrote: Sun Apr 14, 2024 10:24 am So what was the result from the code I posted?
ReceiveHTTPMemory predates HTTPRequest, and there may be a difference in how they're implemented.
I posted it; I guess it wasn't obvious. It took less than 1/4 second on this last run. I even tried it with #PB_HTTP_NoSSLCheck, with the same results.

[01:42:45] StatusCode: 0
[01:42:45] Response:
idle wrote: Sun Apr 14, 2024 10:24 am Or maybe it's a DNS filter blocking the site.
If it were a DNS filter or a similar issue, wget, Chrome, Firefox, etc. wouldn't be able to get it either...

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 10:01 pm
by vmars316
My internet provider is Spectrum.
With Chrome, I get the following:
Suspicious Site Blocked
This site was blocked because it may contain unsafe content that can harm your device or compromise your personal info.

Re: Can't fetch a specific webpage.

Posted: Sun Apr 14, 2024 10:22 pm
by BarryG
jassing wrote: Sun Apr 14, 2024 2:34 am It seems PB's receivehttp is unreliable
It'd more likely be the website's server setup; otherwise no website would work with ReceiveHTTP...(). Some websites compress their data, which can make it return "gibberish". I've posted about this before.
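
One way to test the compression theory is to ask the server for an uncompressed body. A minimal sketch, assuming the server honours the standard HTTP request header "Accept-Encoding: identity"; whether that is relevant to this particular site is not confirmed anywhere in this thread:

```purebasic
NewMap Header$()
Header$("Accept-Encoding") = "identity" ; ask for an uncompressed response body

HttpRequest = HTTPRequest(#PB_HTTP_Get, "https://roosales.com", "", 0, Header$())
If HttpRequest
  Debug "StatusCode: " + HTTPInfo(HttpRequest, #PB_HTTP_StatusCode)
  Debug "Response: " + HTTPInfo(HttpRequest, #PB_HTTP_Response)
  FinishHTTP(HttpRequest)
Else
  Debug "Request creation failed"
EndIf
```

If the response is readable with this header but "gibberish" without it, compression is the likely cause.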

Re: Can't fetch a specific webpage.

Posted: Mon Apr 15, 2024 1:33 am
by jassing
BarryG wrote: Sun Apr 14, 2024 10:22 pm
jassing wrote: Sun Apr 14, 2024 2:34 am It seems PB's receivehttp is unreliable
It'd more likely be the website's server setup; otherwise no website would work with ReceiveHTTP...(). Some websites compress their data, which can make it return "gibberish". I've posted about this before.
Disappointing it can't handle compressed data. It's fairly common -- but, since wget works w/o issue -- it's not that big of a deal, just disappointing.

Re: Can't fetch a specific webpage.

Posted: Tue Apr 16, 2024 4:54 am
by jassing
I used a WebGadget and got an error. It would be nice if one received errors when using the HTTP functions...

Anyway, I know this is outside of PureBasic at this point, but is there any way to find out what is causing this?
webgadget wrote: Turn on TLS 1.0, TLS 1.1, and TLS 1.2 in Advanced settings and try connecting to https://roosales.com again. If this error persists, it is possible that this site uses an unsupported protocol or cipher suite such as RC4 (link for the details), which is not considered secure. Please contact your site administrator.
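
On getting errors out of the HTTP functions: HTTPInfo() can also be asked for an error string. A minimal sketch, assuming the installed PB version supports #PB_HTTP_ErrorMessage (it is listed for HTTPInfo() in recent releases); the StatusCode$ variable is illustrative only:

```purebasic
HttpRequest = HTTPRequest(#PB_HTTP_Get, "https://roosales.com")
If HttpRequest
  StatusCode$ = HTTPInfo(HttpRequest, #PB_HTTP_StatusCode)
  Debug "StatusCode: " + StatusCode$

  ; A status code of 0 means no HTTP response was received at all,
  ; so print whatever error text the library reports.
  If Val(StatusCode$) = 0
    Debug "Error: " + HTTPInfo(HttpRequest, #PB_HTTP_ErrorMessage)
  EndIf

  FinishHTTP(HttpRequest)
Else
  Debug "Request creation failed"
EndIf
```

A TLS handshake or cipher failure would be expected to show up in that message, if it is populated.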

Re: Can't fetch a specific webpage.

Posted: Tue Apr 16, 2024 6:15 am
by TassyJim
When I connect using the code in the first post, it is using TLSv1.2
It works without any issues. W11 PB6.10

To see what is happening, I use Wireshark. It (or something similar) is a must for this type of problem.

Re: Can't fetch a specific webpage.

Posted: Tue Apr 16, 2024 6:55 am
by jassing
No errors in Wireshark, wget, Chrome, Firefox, etc. (the page opens normally).