However, it cannot retrieve the information. Fred says he does not block ChatGPT, but I have proof that somewhere it is indeed blocked.
These commands (copy and paste into cmd on Windows):
curl -L -v -o page.html.gzip ^
-H "X-Envoy-Expected-Rq-Timeout-Ms: 15000" ^
-H "X-Request-Id: 3ebf99a3-4a26-4649-8f26-e2513cf97261" ^
-H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" ^
-H "Accept-Encoding: gzip, deflate, br" ^
-H "Accept-Language: en-US,en;q=0.9" ^
-H "User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible" ^
-H "X-Forwarded-For-Anon: 4.151.0.0" ^
-H "X-Forwarded-For: 4.151.241.245" ^
-H "X-Forwarded-Port: 443" ^
-H "X-Forwarded-Proto: https" ^
-H "Host: www.purebasic.fr" ^
-H "Connection: close" ^
"https://www.purebasic.fr/french/viewtopic.php?t=19404"
"c:\Program Files\7-Zip\7z.exe" x page.html.gzip -so > page.html
del page.html.gzip
start page.html
And this curl command (the only difference is the User-Agent sent):
curl -L -v -o page.html.gzip ^
-H "X-Envoy-Expected-Rq-Timeout-Ms: 15000" ^
-H "X-Request-Id: 3ebf99a3-4a26-4649-8f26-e2513cf97261" ^
-H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" ^
-H "Accept-Encoding: gzip, deflate, br" ^
-H "Accept-Language: en-US,en;q=0.9" ^
-H "User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot" ^
-H "X-Forwarded-For-Anon: 4.151.0.0" ^
-H "X-Forwarded-For: 4.151.241.245" ^
-H "X-Forwarded-Port: 443" ^
-H "X-Forwarded-Proto: https" ^
-H "Host: www.purebasic.fr" ^
-H "Connection: close" ^
"https://www.purebasic.fr/french/viewtopic.php?t=19404"
"c:\Program Files\7-Zip\7z.exe" x page.html.gzip -so > page.html
del page.html.gzip
start page.html
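Since the two requests differ only in the User-Agent, the comparison can also be scripted. Below is a minimal sketch in Python (standard library only), not a replacement for the curl commands above. It assumes a block would show up as a different HTTP status code or a very different body length. The names `build_headers`, `fetch`, and `compare` are illustrative; the Accept header is shortened and Accept-Encoding is left out so the body arrives uncompressed.

```python
import urllib.error
import urllib.request

URL = "https://www.purebasic.fr/french/viewtopic.php?t=19404"
BASE_UA = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible"
CHATGPT_UA = BASE_UA + "; ChatGPT-User/1.0; +https://openai.com/bot"


def build_headers(user_agent):
    """Return the request headers; only the User-Agent varies between runs."""
    return {
        # Shortened Accept header (assumption: the exact value does not matter here).
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        "User-Agent": user_agent,
    }


def fetch(url, user_agent, timeout=15):
    """Fetch url and return (status, body_length).

    HTTP error statuses (403, 404, ...) are returned rather than raised.
    Connection-level failures (timeouts, resets) still raise URLError.
    """
    request = urllib.request.Request(url, headers=build_headers(user_agent))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as error:
        return error.code, len(error.read())


def compare(url, fetcher=fetch):
    """Fetch url with both User-Agents and report whether the statuses match."""
    plain = fetcher(url, BASE_UA)
    bot = fetcher(url, CHATGPT_UA)
    return {"plain": plain, "chatgpt": bot, "same_status": plain[0] == bot[0]}
```

Calling `print(compare(URL))` runs both requests. If `same_status` comes back false, or the two body lengths are far apart, something between the client and the forum is treating the ChatGPT-User agent differently.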
Could a cached robots.txt be the cause? That seems far-fetched, but to make sure it is never cached:
# Apache config (httpd.conf or vhost)
<Files "robots.txt">
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
Header set Pragma "no-cache"
Header set Expires "0"
</Files>
# .htaccess alternative
<IfModule mod_headers.c>
<Files "robots.txt">
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
Header set Pragma "no-cache"
Header set Expires "0"
</Files>
</IfModule>
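To confirm that the directives are actually being sent, the response headers of robots.txt can be inspected. This is a rough sketch in Python (standard library only). `forbids_caching` and `robots_cache_control` are hypothetical helper names, and the check only looks at the `no-store`, `no-cache`, and `max-age=0` directives of Cache-Control, not at Pragma or Expires.

```python
import urllib.request


def forbids_caching(cache_control):
    """Return True if a Cache-Control value forbids reuse without revalidation."""
    directives = {
        part.strip().lower() for part in cache_control.split(",") if part.strip()
    }
    return bool({"no-store", "no-cache", "max-age=0"} & directives)


def robots_cache_control(base_url, timeout=15):
    """Return the Cache-Control header served with robots.txt ('' if absent)."""
    # HEAD is enough here: only the response headers are of interest.
    request = urllib.request.Request(
        base_url.rstrip("/") + "/robots.txt", method="HEAD"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Cache-Control", "")
```

For example, `forbids_caching(robots_cache_control("https://www.purebasic.fr"))` should return True once either of the configurations above is active. Note that urllib sends its own default User-Agent, so a server that filters by User-Agent may answer this request differently.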