ChatGPT user request bot is blocked!

Rinzwind
Enthusiast
Posts: 690
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Post by Rinzwind »

After some research I figured out the request headers ChatGPT sends when you ask it a question like "What is https://www.purebasic.fr/french/viewtopic.php?t=19404 about?"

However, it cannot retrieve the information. Fred says he does not block ChatGPT, but I have proof that it is indeed blocked somewhere.

This curl command (copy and paste into cmd on Windows):

Code: Select all

curl -L -v -o page.html.gzip ^
  -H "X-Envoy-Expected-Rq-Timeout-Ms: 15000" ^
  -H "X-Request-Id: 3ebf99a3-4a26-4649-8f26-e2513cf97261" ^
  -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" ^
  -H "Accept-Encoding: gzip, deflate, br" ^
  -H "Accept-Language: en-US,en;q=0.9" ^
  -H "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible" ^
  -H "X-Forwarded-For-Anon: 4.151.0.0" ^
  -H "X-Forwarded-For: 4.151.241.245" ^
  -H "X-Forwarded-Port: 443" ^
  -H "X-Forwarded-Proto: https" ^
  -H "Host: www.purebasic.fr" ^
  -H "Connection: close" ^
  "https://www.purebasic.fr/french/viewtopic.php?t=19404"
"c:\Program Files\7-Zip\7z.exe" x page.html.gzip -so > page.html
del page.html.gzip
start page.html

Works fine. page.html has content.

This curl command (the only difference is the user agent sent):

Code: Select all

curl -L -v -o page.html.gzip ^
  -H "X-Envoy-Expected-Rq-Timeout-Ms: 15000" ^
  -H "X-Request-Id: 3ebf99a3-4a26-4649-8f26-e2513cf97261" ^
  -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" ^
  -H "Accept-Encoding: gzip, deflate, br" ^
  -H "Accept-Language: en-US,en;q=0.9" ^
  -H "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot" ^
  -H "X-Forwarded-For-Anon: 4.151.0.0" ^
  -H "X-Forwarded-For: 4.151.241.245" ^
  -H "X-Forwarded-Port: 443" ^
  -H "X-Forwarded-Proto: https" ^
  -H "Host: www.purebasic.fr" ^
  -H "Connection: close" ^
  "https://www.purebasic.fr/french/viewtopic.php?t=19404"
"c:\Program Files\7-Zip\7z.exe" x page.html.gzip -so > page.html
del page.html.gzip
start page.html

Gets nothing back; page.html is completely empty, which is the result of a user-agent block somewhere. I get similar behavior at a different phpBB forum that actually blocks ChatGPT on purpose in its robots.txt.

Is robots.txt cached? Far-fetched, but to make sure it is never cached:
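
Whether a given robots.txt excludes the ChatGPT-User token can be checked offline with Python's standard urllib.robotparser. This is only a minimal sketch: the robots.txt text and the forum.example URL below are made-up examples, not the rules of any real forum.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out only the ChatGPT-User token.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: *
Disallow:
"""

# Hypothetical topic URL, used only for illustration.
URL = "https://forum.example/viewtopic.php?t=1"


def allowed(robots_txt, agent_token, url):
    """Return True if the given robots.txt text lets agent_token fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the bare product token: robotparser only looks at the part of
    # the agent string before the first "/".
    return parser.can_fetch(agent_token, url)


print(allowed(ROBOTS_TXT, "ChatGPT-User", URL))
print(allowed(ROBOTS_TXT, "SomeBrowser", URL))
```

Keep in mind that curl never reads robots.txt itself, so this only tells you what a well-behaved crawler would decide on its own.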

Code: Select all

# Apache config (httpd.conf or vhost)
<Files "robots.txt">
    Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
    Header set Pragma "no-cache"
    Header set Expires "0"
</Files>

Code: Select all

# .htaccess alternative
<IfModule mod_headers.c>
  <Files "robots.txt">
      Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
      Header set Pragma "no-cache"
      Header set Expires "0"
  </Files>
</IfModule>
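
To confirm that such directives actually reach the client, one could look at the response headers of robots.txt. A minimal sketch, assuming the headers have already been collected into a dict; the helper name is_never_stored is made up for this example.

```python
def is_never_stored(headers):
    """Return True if the Cache-Control header carries the no-store directive."""
    value = ""
    for name, content in headers.items():
        if name.lower() == "cache-control":  # header names are case-insensitive
            value = content
    directives = {part.strip().lower() for part in value.split(",")}
    return "no-store" in directives


# Headers as the configuration above would set them.
configured = {
    "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
    "Pragma": "no-cache",
    "Expires": "0",
}
print(is_never_stored(configured))
print(is_never_stored({"cache-control": "max-age=3600"}))
```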
Olli
Addict
Posts: 1240
Joined: Wed May 27, 2020 12:26 pm

Re: ChatGPT user request bot is blocked!

Post by Olli »

That thread is from our French friend metalOS, who asks whether anybody can help him create a user gadget.

He adds that he has been pointed towards the canvas gadget, but he would like advice from people with experience on whether a canvas is the best choice for making a user gadget.
Rinzwind
Enthusiast
Posts: 690
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: ChatGPT user request bot is blocked!

Post by Rinzwind »

Any update? User-agent blocks are still in place for sure.

Claude.ai also can't access any topic. Worse, none of this was communicated, and I blamed the other party because of an incomplete previous answer saying that 'no robots.txt blocking' is in place.
Fred
Administrator
Posts: 18207
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: ChatGPT user request bot is blocked!

Post by Fred »

I checked, but there is no user-agent blocking anymore. I don't know exactly what your curl request is really doing, but a simple one shows that it works as expected:

Code: Select all

curl "https://www.purebasic.fr/french/viewtopic.php?t=19404" -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot"
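
The same check can be scripted so that several user agents are compared in one go. A rough sketch using only Python's standard library; the helper names build_request and status_and_size are made up here, and the fetch needs network access.

```python
import urllib.error
import urllib.request

CHATGPT_UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; "
              "ChatGPT-User/1.0; +https://openai.com/bot")


def build_request(url, user_agent):
    """Create a GET request that carries user_agent as its User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_and_size(url, user_agent, timeout=15):
    """Fetch url and return (HTTP status, body length in bytes)."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers raise; report them instead of crashing.
        return err.code, len(err.read())

# Example call (needs network access):
# status_and_size("https://www.purebasic.fr/french/viewtopic.php?t=19404", CHATGPT_UA)
```

A blocked user agent would be expected to show up as an error status or a zero-length body, while an allowed one returns the page.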
Fred
Administrator
Posts: 18207
Joined: Fri May 17, 2002 4:39 pm
Location: France

Re: ChatGPT user request bot is blocked!

Post by Fred »

Ha, indeed, your curl request is wrong: you are missing the "User-Agent: " prefix. With it, it works:

Code: Select all

curl -L -v -o page.html.gzip ^
  -H "X-Envoy-Expected-Rq-Timeout-Ms: 15000" ^
  -H "X-Request-Id: 3ebf99a3-4a26-4649-8f26-e2513cf97261" ^
  -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" ^
  -H "Accept-Encoding: gzip, deflate, br" ^
  -H "Accept-Language: en-US,en;q=0.9" ^
  -H "User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot" ^
  -H "X-Forwarded-For-Anon: 4.151.0.0" ^
  -H "X-Forwarded-For: 4.151.241.245" ^
  -H "X-Forwarded-Port: 443" ^
  -H "X-Forwarded-Proto: https" ^
  -H "Host: www.purebasic.fr" ^
  -H "Connection: close" ^
  "https://www.purebasic.fr/english/viewtopic.php?t=87348"
"c:\Program Files\7-Zip\7z.exe" x page.html.gzip -so > page.html
del page.html.gzip
start page.html
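
A missing header name like this is easy to overlook in a long command. As a sketch, a few lines of Python can flag every -H value that does not have the "Name: value" shape; the function name malformed_headers is made up for this example, and the name check follows the token characters HTTP allows in field names.

```python
import re

# Characters permitted in an HTTP field name (the "token" rule).
TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")


def malformed_headers(headers):
    """Return the -H values that are not of the form 'Name: value'."""
    bad = []
    for header in headers:
        name, colon, _value = header.partition(":")
        # No colon at all, or a "name" containing spaces and the like.
        if not colon or not TOKEN.match(name):
            bad.append(header)
    return bad


headers = [
    "Accept-Language: en-US,en;q=0.9",
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; "
    "ChatGPT-User/1.0; +https://openai.com/bot",
    "X-Forwarded-Proto: https",
]
for header in malformed_headers(headers):
    print("suspicious -H value:", header)
```

Note that the ChatGPT string does contain a colon (in "https:"), so a plain "is there a colon" test would not catch it; checking the part before the colon does.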
Rinzwind
Enthusiast
Posts: 690
Joined: Wed Mar 11, 2009 4:06 pm
Location: NL

Re: ChatGPT user request bot is blocked!

Post by Rinzwind »

I am clueless as to how that typo ended up there. But yes, that works. In the ChatGPT web client itself it is still a no-go, so back to zero. It can retrieve content from my own domain, so maybe it is careful about phpBB forums with a lot of user content or whatever. But it does work with a random other phpBB forum I know nothing about, other than having picked it as a test: https://www.conquerclub.com/forum/viewt ... =6&t=46658
what is page https://www.conquerclub.com/forum/viewt ... =6&t=46658 about?
ChatGPT said:
...
That page is a forum thread titled “General Congratulations” on Conquer Club’s Community Discussion board. It’s basically a catch‑all thread from March 17, 2008, where users post tongue-in-cheek congratulatory messages—anything from real accomplishments to absurd, self‑congratulatory jokes. Some sample posts:
As opposed to:
What is viewtopic.php?t=87334 about?
ChatGPT said:
...
I’m hitting a wall—looks like that specific thread on the PureBasic forums won’t load. No search results are popping up either. That usually means the thread may have been deleted, hidden behind login restrictions, or simply restricted.